0. (2.15) N −1 q
||d b f || L q (V ) ≤
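From (2.13) and (2.14), we get for any $a$, $b$, $q$ as above the following inequality:
\[
\|d^{b} f\|_{L^{q}(V)} \le C_1 \|d^{a}\nabla f\|_{L^{1}(V)}, \tag{2.16}
\]
where
\[
C_1 := \frac{N(q-1)}{q}\,\frac{1}{S_N}\left(\frac{a}{a-c_0\delta_0}+1\right) + \frac{q-N(q-1)}{q}\,\frac{1}{a-c_0\delta_0}.
\]
Let us now apply inequality (2.16) to $|f|^{s}$ instead of $f$, for $s := \frac{Q}{2}+1$, $q := \frac{Q}{s}$, $b := Bs$. Due to (2.15) we have $a = b+1-\frac{q-1}{q}N = \frac{BQ}{2}+A$, where $A := B+1-\frac{Q-2}{2Q}N$. In this way we obtain
\[
\left(\int_V d^{BQ}|f|^{Q}\,dy\right)^{\frac12+\frac1Q}
\le C_1\left(\frac{Q}{2}+1\right)\int_V d^{\frac{BQ}{2}+A}|f|^{\frac{Q}{2}}|\nabla f|\,dy
\le C_2\left(\int_V d^{BQ}|f|^{Q}\,dy\right)^{\frac12}\left(\int_V d^{2A}|\nabla f|^{2}\,dy\right)^{\frac12};
\]
where $C_2 := C_1\left(\frac{Q}{2}+1\right)$.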
After simplifying we see that we have proved the following: there exists $R = R(\frac{BQ}{2}+A, \Omega)$ such that for all $0 < r < R$ and all $x \in \Omega$ with $d(x) < \gamma r$, there holds
\[
\left(\int_{B(x,r)\cap\Omega} d^{BQ}|f|^{Q}\,dy\right)^{\frac{2}{Q}} \le C_3 \int_{B(x,r)\cap\Omega} d^{2A}(y)|\nabla f|^{2}\,dy,
\]
for any $N \ge 2$ and any $f \in C_0^{\infty}(B(x,r))$ under the following conditions: $A := B+1-\frac{Q-2}{2Q}N$, $\frac{BQ}{2}+A > 0$, $2 < Q < \infty$ if $N = 2$, $2 < Q \le \frac{2N}{N-2}$ if $N \ge 3$; here $C_3 = C_2^2 = C_3(N, Q, B, c_0, \delta_0)$. Taking $A = \frac{\alpha}{2}$, $Q := \frac{2(N+\alpha)}{N+\alpha-2}$ and $B := \frac{\alpha}{Q}$ we deduce the local weighted Sobolev inequality (2.12) with $C_S = C_S(N, \alpha, c_0, \delta_0)$ and this completes the proof of Theorem 2.6.

Remark 2.7. Note that the upper bound for the length of the "balls" in the local weighted Moser inequality, denoted by $R(\alpha, \Omega)$, goes to zero as $\alpha$ tends to zero.
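The parameter choice at the end of the proof of Theorem 2.6 can be checked by direct substitution. The following computation is a sketch that uses only the relations stated above; it confirms that $A = \frac{\alpha}{2}$, $Q = \frac{2(N+\alpha)}{N+\alpha-2}$, $B = \frac{\alpha}{Q}$ satisfy the constraint $A = B+1-\frac{Q-2}{2Q}N$ and that both weights reduce to $d^{\alpha}$.

```latex
\begin{align*}
\frac{1}{Q} &= \frac{N+\alpha-2}{2(N+\alpha)}, \qquad
\frac{Q-2}{2Q} = \frac12 - \frac1Q = \frac{1}{N+\alpha},\\
B + 1 - \frac{Q-2}{2Q}\,N
 &= \frac{\alpha(N+\alpha-2)}{2(N+\alpha)} + \frac{\alpha}{N+\alpha}
  = \frac{\alpha(N+\alpha)}{2(N+\alpha)} = \frac{\alpha}{2} = A,\\
BQ &= \alpha, \qquad 2A = \alpha, \qquad \frac{BQ}{2} + A = \alpha .
\end{align*}
```

In particular the radius $R(\frac{BQ}{2}+A,\Omega)$ becomes $R(\alpha,\Omega)$, which is the quantity referred to in Remark 2.7.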
S. Filippas, L. Moschini, A. Tertikas
Remark 2.8. Let us note that when $N = 1$, the corresponding analogue of the local weighted Sobolev inequality (2.12) when $\Omega = (-1, 1)$ is the following one:
\[
\left(\int_{\max\{-1,x-r\}}^{\min\{1,x+r\}} (1-|y|)^{\alpha}|f|^{q}(y)\,dy\right)^{\frac1q}
\le C_S\, r^{\frac{\alpha+1}{q}+\frac{1-\alpha}{2}}
\left(\int_{\max\{-1,x-r\}}^{\min\{1,x+r\}} (1-|y|)^{\alpha}|f'|^{2}(y)\,dy\right)^{\frac12},
\]
for any $f \in C_0^{\infty}(x-r, x+r)$, and any $q > 2$ if $0 < \alpha \le 1$ and $2 < q \le \frac{2(\alpha+1)}{\alpha-1}$ if $\alpha > 1$. Consequently Theorem 1.5 as well as its consequences can be also stated for $N = 1$; see [KO].

From the results within this subsection, we will now deduce a new parabolic Harnack inequality up to the boundary for the doubly degenerate elliptic operator $L^{\lambda}_{\alpha}$ defined in (2.1). To this end let us first make precise the notion of a weak solution.

Definition 2.9. By a solution $v(y,t)$ to $v_t = -L^{\lambda}_{\alpha} v$ in $Q := \{B(x,r)\cap\Omega\}\times(0,r^2)$, we mean a function $v \in C^{1}((0,r^2); L^{2}(B(x,r)\cap\Omega, |y|^{\lambda}d^{\alpha}(y)dy)) \cap C^{0}((0,r^2); H^{1}(B(x,r)\cap\Omega, |y|^{\lambda}d^{\alpha}(y)dy))$ such that for any $\Phi \in C^{0}((0,r^2); C_0^{\infty}(B(x,r)\cap\Omega))$ and any $0 < t_1 < t_2 < r^2$ we have
\[
\int_{t_1}^{t_2}\int_{B(x,r)\cap\Omega} \{|y|^{\lambda}d^{\alpha}(y)\,v_t\,\Phi + |y|^{\lambda}d^{\alpha}(y)\,\nabla v\nabla\Phi\}\,dy\,dt = 0. \tag{2.17}
\]
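Then we have

Theorem 2.10. Let $\alpha \ge 1$, $N \ge 2$, $\lambda \in [2-N, 0]$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist positive constants $C_H$ and $R = R(\Omega)$ such that for $x \in \Omega$, $0 < r < R$ and for any positive solution $v(y,t)$ of $\frac{\partial v}{\partial t} = \frac{1}{|y|^{\lambda}d^{\alpha}(y)}\operatorname{div}(|y|^{\lambda}d^{\alpha}(y)\nabla v)$ in $\{B(x,r)\cap\Omega\}\times(0,r^2)$, the following estimate holds true:
\[
\operatorname*{ess\,sup}_{(y,t)\in\{B(x,\frac r2)\cap\Omega\}\times(\frac{r^2}{4},\frac{r^2}{2})} v(y,t)
\le C_H \operatorname*{ess\,inf}_{(y,t)\in\{B(x,\frac r2)\cap\Omega\}\times(\frac34 r^2,\,r^2)} v(y,t).
\]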
In order to prove the parabolic Harnack inequality in Theorem 2.10 we use the Moser iteration technique as adapted to degenerate elliptic operators in [FKS, CS] as well as [GSC]. In this approach one inserts in the weak form of the equation $v_t = -L^{\lambda}_{\alpha}v$ suitable test functions $\Phi$. One of the key ideas is to use test functions of the form $\eta^2 v^q$, where $v$ is the weak solution of the equation, $\eta$ is a cut off function and $q \in \mathbb{R}$. To this end one has to check that $\eta^2 v^q$ is in the right space of test functions. In this direction the following density theorem is crucial.

Theorem 2.11. Let $N \ge 2$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain. Then for any $\alpha \ge 1$, $H^{1}(\Omega, d^{\alpha}(y)dy) = H_0^{1}(\Omega, d^{\alpha}(y)dy)$. In particular for any $\alpha \ge 1$, the set $C_0^{\infty}(\Omega)$ is dense in $H^{1}(\Omega, d^{\alpha}(y)dy)$.
Sharp Two–Sided Heat Kernel Estimates for Critical Schrödinger Operators
Here $H^{1}(\Omega, d^{\alpha}(y)dy)$ denotes the set $\{v = v(y) : \int_{\Omega} d^{\alpha}(y)(v^2 + |\nabla v|^2)\,dy < \infty\}$, the corresponding norm being defined in (1.12). We are now ready to prove the density theorem.
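Proof. Let us prove here the result when $\alpha = 1$. We refer to Proposition 9.10 in [K] for the case $\alpha > 1$, even though our proof with some minor changes can also cover this range. First of all from Theorem 7.2 in [K] it is known that the set $C^{\infty}(\overline{\Omega})$ is dense in $H^{1}(\Omega, d(y)dy)$. Thus for any $v \in H^{1}(\Omega, d(y)dy)$ there exists $v_m \in C^{\infty}(\overline{\Omega})$ such that for any $\varepsilon > 0$ we have $\|v - v_m\|_{H^{1}} \le \varepsilon$ if $m \ge m(\varepsilon)$. Let us choose $w := v_{m(\varepsilon)}$ and let us define, for $k \ge 1$, the following function:
\[
\varphi_k(x) =
\begin{cases}
0 & \text{if } d(x) \le \frac{1}{k^2},\\[2pt]
1 + \dfrac{\ln(k\,d(x))}{\ln(k)} & \text{if } \frac{1}{k^2} < d(x) < \frac1k,\\[6pt]
1 & \text{if } d(x) \ge \frac1k.
\end{cases}
\]
Then $w_k := w\varphi_k \in C_0^{0,1}(\Omega)$, moreover we have
\[
\|w - w_k\|_{H^{1}}^2 = \|w(1-\varphi_k)\|_{H^{1}}^2
\le 2\int_{\Omega}(w^2+|\nabla w|^2)(1-\varphi_k)^2 d(y)\,dy + 2\int_{\Omega} w^2|\nabla\varphi_k|^2 d(y)\,dy
\]
\[
\le 2\int_{\{d(y)<\frac1k\}}(w^2+|\nabla w|^2)\,d(y)\,dy + 2\int_{\{\frac{1}{k^2}<d(y)<\frac1k\}}\frac{w^2}{d(y)(\ln(k))^2}\,dy.
\]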
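To see why the last integral above is small, one can use that $w$ is smooth up to the boundary, hence bounded, together with the fact that in a smooth bounded domain the volume of a thin boundary layer is comparable to its thickness. The following sketch makes this explicit; the constant $c_{\Omega}$ and the sup norm $\|w\|_{\infty}$ are notation introduced here for illustration and do not appear in the original text.

```latex
% Assumption (illustration only): for small s, the layer {s < d(y) < s + ds}
% has volume at most c_\Omega ds.
\begin{align*}
\int_{\{\frac{1}{k^2} < d(y) < \frac1k\}} \frac{w^2}{d(y)\,(\ln k)^2}\,dy
 &\le \frac{\|w\|_\infty^2\, c_\Omega}{(\ln k)^2} \int_{1/k^2}^{1/k} \frac{ds}{s}
  = \frac{\|w\|_\infty^2\, c_\Omega}{(\ln k)^2}\,\bigl(2\ln k - \ln k\bigr)
  = \frac{c_\Omega\,\|w\|_\infty^2}{\ln k} .
\end{align*}
```

By comparison, under the same assumption a linear cutoff $\min\{k\,d(y), 1\}$ would contribute roughly $c_{\Omega}\|w\|_\infty^2\, k^2\int_0^{1/k} s\,ds = \frac12 c_{\Omega}\|w\|_\infty^2$, which does not vanish as $k \to \infty$; this suggests why a logarithmic cutoff is the natural choice in the borderline case $\alpha = 1$.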
Now as $k \to \infty$ the right hand side goes to zero, this proves the theorem.

The above theorem allows us to take the cut off function $\eta$ in $C_0^{\infty}(B(x,r))$ instead of taking it as usual in $C_0^{\infty}(B(x,r)\cap\Omega)$. Clearly the two function spaces differ only if the "ball" intersects the boundary of $\Omega$. To explain what are the appropriate modifications of the standard iteration argument by Moser, we now present in detail the first step, which is the $L^2$ mean value inequality for any positive local subsolution of the equation $v_t = -L^{\lambda}_{\alpha}v$.

Theorem 2.12. Let $\alpha \ge 1$, $N \ge 2$, $\lambda \in [2-N, 0]$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist positive constants $C$ and $R(\Omega)$ such that for $x \in \Omega$, $0 < r < R(\Omega)$ and for any positive subsolution $v(y,t)$ of $v_t - \frac{1}{|y|^{\lambda}d^{\alpha}(y)}\operatorname{div}(|y|^{\lambda}d^{\alpha}(y)\nabla v) = 0$ in $\{B(x,r)\cap\Omega\}\times(0,r^2)$ we have the estimate
\[
\operatorname*{ess\,sup}_{(y,t)\in\{B(x,\frac r2)\cap\Omega\}\times(\frac{r^2}{2},\,r^2)} v^2(y,t)
\le \frac{C}{r^2\,V(x,r)}\int_{\{B(x,r)\cap\Omega\}\times(0,r^2)} |y|^{\lambda}d^{\alpha}(y)\,v^2(y,t)\,dy\,dt.
\]
Proof. We will only prove the result in the non-standard case in which the "ball" $B(x,r)$ intersects the boundary of $\Omega$; we refer to [MT2] as well as to [GSC] for details in the other case. Similarly to Definition 2.9 we define a subsolution $v(y,t)$ to be a function in $C^{1}((0,r^2); L^{2}(B(x,r)\cap\Omega, |y|^{\lambda}d^{\alpha}(y)dy)) \cap C^{0}((0,r^2); H^{1}(B(x,r)\cap\Omega, |y|^{\lambda}d^{\alpha}(y)dy))$ such that the following holds true:
\[
\int_{0}^{r^2}\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,v_t\,\Phi + |y|^{\lambda}d^{\alpha}(y)\,\nabla v\nabla\Phi\}\,dy\,dt \le 0,
\qquad \forall\,\Phi \in C^{0}((0,r^2); C_0^{\infty}(B(x,r)\cap\Omega)),\ \Phi \ge 0.
\]
Hence in particular we have also
\[
\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,v_t\,\Phi + |y|^{\lambda}d^{\alpha}(y)\,\nabla v\nabla\Phi\}\,dy \le 0,
\qquad \forall\,\Phi \in C_0^{\infty}(B(x,r)\cap\Omega),\ \Phi \ge 0.
\]
Let us define for any $q, M \ge 1$ the following functions: $G(z) = z^{q}$ if $z \le M$ and $G(z) = M^{q} + q(z-M)M^{q-1}$ if $z > M$, and $H(z) \ge 0$ by $H'(z) = \sqrt{G'(z)}$, $H(0) = 0$; note that $G(z) \le zG'(z)$ as well as $H(z) \le zH'(z)$. Due to Theorem 2.11 there exists a sequence of functions $v_m$ in $C^{\infty}(B(x,r)\cap\Omega)$ having compact support in $\Omega$ such that $v_m \to v$ in $H^{1}(B(x,r)\cap\Omega, d^{\alpha}(y)dy)$ as $m \to +\infty$; whence due to (2.4) also in $H^{1}(B(x,r)\cap\Omega, |y|^{\lambda}d^{\alpha}(y)dy)$. Hence for any $\eta \in C_0^{\infty}(B(x,r))$ and $m \ge 1$ the function $\Phi := \eta^2 G(v_m)$ is an admissible test function, that is the following holds true:
\[
\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,\eta^2 G(v_m)\,v_t + |y|^{\lambda}d^{\alpha}(y)\,\nabla v\nabla(\eta^2 G(v_m))\}\,dy \le 0.
\]
Passing to the limit as $m \to +\infty$ we get
\[
\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,\eta^2 G(v)\,v_t + |y|^{\lambda}d^{\alpha}(y)\,\nabla v\nabla(\eta^2 G(v))\}\,dy \le 0,
\qquad \forall\,\eta \in C_0^{\infty}(B(x,r)).
\]
This is the standard starting point in the Moser iteration technique apart from the fact that the cut off function $\eta$ is not necessarily zero on $\partial\Omega$; this is crucial. Then by the Schwarz inequality we get
\[
\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,\eta^2 G(v)\,v_t + |y|^{\lambda}d^{\alpha}(y)\,|\nabla v|^2 G'(v)\,\eta^2\}\,dy
\le C\int_{B(x,r)\cap\Omega}|y|^{\lambda}d^{\alpha}(y)\,|\nabla\eta|^2 v^2 G'(v)\,dy,
\]
thus also that
\[
\int_{B(x,r)\cap\Omega}\{|y|^{\lambda}d^{\alpha}(y)\,\eta^2 G(v)\,v_t + |y|^{\lambda}d^{\alpha}(y)\,|\nabla(\eta H(v))|^2\}\,dy
\le C\int_{B(x,r)\cap\Omega}|y|^{\lambda}d^{\alpha}(y)\,|\nabla\eta|^2 v^2 G'(v)\,dy.
\]
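The two elementary properties $G(z) \le zG'(z)$ and $H(z) \le zH'(z)$ noted in the definition of $G$ and $H$ follow from the explicit form of $G$. A short verification, given here as a sketch:

```latex
\begin{align*}
z \le M:&\quad zG'(z) = q\,z^{q} \ge z^{q} = G(z) \qquad (q \ge 1),\\
z > M:&\quad zG'(z) - G(z) = q z M^{q-1} - M^{q} - q(z-M)M^{q-1} = (q-1)M^{q} \ge 0,\\
&\quad H(z) = \int_0^{z} H'(\tau)\,d\tau \le z\,H'(z),
 \quad \text{since } H' = \sqrt{G'} \text{ is nondecreasing.}
\end{align*}
```

The second property is what allows the term $|\nabla\eta|^2 H(v)^2$, which appears when expanding $|\nabla(\eta H(v))|^2$, to be bounded by $|\nabla\eta|^2 v^2 G'(v)$.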
For any smooth function $\chi$ of the time variable $t$, we easily get
\[
\frac{d}{dt}\int_{B(x,r)\cap\Omega}|y|^{\lambda}d^{\alpha}(y)(\eta\chi F(v))^2\,dy + \chi^2\int_{B(x,r)\cap\Omega}|y|^{\lambda}d^{\alpha}(y)|\nabla(\eta H(v))|^2\,dy
\]
\[
\le C\chi\left(\chi\|\nabla\eta\|^2_{L^{\infty}(\mathbb{R}^n)} + \|\chi'\|_{L^{\infty}(\mathbb{R})}\right)\int_{\operatorname{supp}\eta\,\cap\,\Omega}|y|^{\lambda}d^{\alpha}(y)\,v^2 G'(v)\,dy;
\]
here $F(z)$ is such that $2F(z)F'(z) = G(z)$. For $\frac12 \le s < s' < 1$ we choose as usual $\chi$ such that $0 \le \chi \le 1$, $\chi = 0$ in $(-\infty, r^2(1-s'))$, $\chi = 1$ in $(r^2(1-s), \infty)$; moreover if $\xi \in C_0^{\infty}(0,1)$ is a nonnegative non-increasing function such that $\xi(z) = 1$ if $z \le s$ and $\xi(z) = 0$ if $z \ge s'$, we define, making use of local coordinates, the following cut off function
\[
\eta(y) := \xi\!\left(\frac{|y'-x'|}{r}\right)\xi\!\left(\frac{|a(y')-y_N-d(x)|}{r}\right).
\]
Then clearly $\|\nabla\eta\|_{L^{\infty}(\mathbb{R}^n)} \le \frac{C}{r(s'-s)}$ and $\|\chi'\|_{L^{\infty}(\mathbb{R})} \le \frac{C}{r^2(s'-s)}$.

Integrating our inequality over $(0,t)$, with $t \in (r^2(1-s), r^2)$ we obtain
\[
\sup_{t\in J}\int_{B(x,r)\cap\Omega}|y|^{\lambda}d^{\alpha}(y)(\eta F(v))^2\,dy + \int_{\{B(x,r)\cap\Omega\}\times(r^2(1-s),r^2)}|y|^{\lambda}d^{\alpha}(y)|\nabla(\eta H(v))|^2\,dy\,dt
\]
\[
\le \frac{C}{r^2(s'-s)^2}\int_{\{B(x,s'r)\cap\Omega\}\times(r^2(1-s'),r^2)}|y|^{\lambda}d^{\alpha}(y)\,v^2 G'(v)\,dy\,dt.
\]
Making once again use of Theorem 2.11 we note that we can apply the local weighted Moser inequality in Theorem 2.6 to the function $f := \eta H(v)$ thus obtaining
\[
\int_{\{B(x,r)\cap\Omega\}\times(r^2(1-s),r^2)}|y|^{\lambda}d^{\alpha}(y)\,(\eta H(v))^{2\left(1+\frac{2}{N+\alpha}\right)}\,dy\,dt
\]
≤
C 2 r (s − s)2
{B(x,s r )∩}×(r 2 (1−s ),r 2 )
Let us now denote by $\tilde\gamma := 1 + \frac{2}{N+\alpha}$,
{B(x,sr )∩}×(r 2 (1−s),r 2 )
C ≤ 2 r (s − s)2
s s − s
2
1+
2 N +α
.
thus as M tends to infinity we have for p := q + 1,
|y|λ d α (y)v pγ˜ dydt
λ α
{B(x,s r )∩}×(r 2 (1−s ),r 2 )
Thus due to Lemma 2.2 also that −1 2 −1 V (x, sr ) (r s) ≤C
|y|λ d α (y)v 2 G (v)dydt
{B(x,sr )∩}×(r 2 (1−s),r 2 )
V (x, s r )−1 (r 2 s )−1
|y| d (y)v dydt p
γ˜
.
|y|λ d α (y)v pγ˜ dydt ≤
{B(x,s r )∩}×(r 2 (1−s ),r 2 )
|y|λ d α (y)v p dydt
γ˜
.
Take now $p = p_i := 2\tilde\gamma^{\,i}$, $s = \theta_{i+1}$ and $s' = \theta_i$, where $\theta_i := \frac{i+2}{2(i+1)}$; then if we denote by
\[
I(i) := \left(V(x,\theta_i r)^{-1}(r^2\theta_i)^{-1}\int_{\{B(x,\theta_i r)\cap\Omega\}\times(r^2(1-\theta_i),\,r^2)}|y|^{\lambda}d^{\alpha}(y)\,v^{p_i}\,dy\,dt\right)^{\frac{1}{p_i}}
\]
the above inequality can be restated as follows: $I(i+1) \le C(i)I(i)$. Thus since one can show that the product of $C(i)$ for all $i \ge 0$ is finite, we obtain $I(\infty) \le \prod_{i=0}^{\infty}C(i)\,I(0)$; this completes the proof of the proposition. To this end the choice $R(\Omega) := \min\{\beta, R(1,\Omega)\}$ can be made; here $\beta$ and $R(1,\Omega)$ are the constants appearing respectively in the local representation of $\partial\Omega$ and in Theorem 2.6 when $\alpha := 1$.
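The bookkeeping behind the iteration can be made explicit. The sketch below records where the parameters $\theta_i$ and $p_i$ lead, and why a bound of the form $C(i) \le (C_0 (i+1)^{m})^{1/p_{i+1}}$ would make the infinite product finite. This particular form of $C(i)$, with constants $C_0 \ge 1$ and $m \ge 0$, is an assumption made here for illustration only; the text asserts just that the product converges.

```latex
\begin{align*}
\theta_i &= \frac{i+2}{2(i+1)}: \qquad \theta_0 = 1, \qquad \theta_i \downarrow \tfrac12, \qquad
\theta_i - \theta_{i+1} = \frac{1}{2(i+1)(i+2)},\\
p_i &= 2\tilde\gamma^{\,i} \to \infty \quad (\tilde\gamma > 1), \qquad
I(0) = \Bigl(\frac{1}{r^2\,V(x,r)} \int_{\{B(x,r)\cap\Omega\}\times(0,r^2)} |y|^{\lambda} d^{\alpha}(y)\, v^{2}\,dy\,dt\Bigr)^{\frac12},\\
I(\infty) &= \operatorname*{ess\,sup}_{\{B(x,\frac r2)\cap\Omega\}\times(\frac{r^2}{2},\,r^2)} v,\\
\log \prod_{i\ge 0} C(i) &\le \sum_{i \ge 0} \frac{\log C_0 + m\log(i+1)}{2\,\tilde\gamma^{\,i+1}} < \infty
 \qquad \text{(under the assumed form of } C(i)\text{)} .
\end{align*}
```

Squaring $I(\infty) \le \prod_{i} C(i)\, I(0)$ then gives an estimate of the shape stated in Theorem 2.12.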
Theorem 2.6 corresponds to the local weighted Moser inequality needed in the proof of the parabolic Harnack inequality up to the boundary stated in Theorem 1.5. The local weighted Moser inequality involved in the proof of Theorem 2.10 differs from Theorem 2.6 only if $d(x) \ge \gamma r$, $N \ge 3$, $\lambda \ne 0$, and in this case it reads as follows.

Theorem 2.13. Let $N \ge 3$, $\lambda \in [2-N, 0)$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exists a positive constant $C_M$ such that for any $\nu \ge N$, $x \in \Omega$, $r > 0$ and $f \in C_0^{\infty}(B(x,r))$ we have
\[
\int_{B(x,r)}|y|^{\lambda}|f(y)|^{2\left(1+\frac{2}{\nu}\right)}\,dy
\le C_M\, r^2\left(r^{N}(|x|+r)^{\lambda}\right)^{-\frac{2}{\nu}}
\left(\int_{B(x,r)}|y|^{\lambda}|\nabla f|^2\,dy\right)\left(\int_{B(x,r)}|y|^{\lambda}|f|^2\,dy\right)^{\frac{2}{\nu}}.
\]
Proof. By Hölder inequality the result easily follows with $C_M := C_S$ as soon as the following local weighted Sobolev inequality holds true:
\[
\left(\int_{B(x,r)}|y|^{\lambda}|f(y)|^{\frac{2N}{N-2}}\,dy\right)^{\frac{N-2}{N}}
\le C_S\,(|x|+r)^{\frac{2|\lambda|}{N}}\int_{B(x,r)}|y|^{\lambda}|\nabla f|^2\,dy \tag{2.18}
\]
(we refer to the proof of Theorem 2.6 where a similar argument is used). Let us first prove the above inequality for any $\lambda \in (2-N, 0)$. As a consequence of the Caffarelli Kohn Nirenberg inequality (e.g. see Corollary 2 in Sect. 2.1.6 of [M]), the following holds true:
\[
\left(\int_{B(x,r)}|f|^{\frac{2N}{N-2}}|y|^{\frac{N\lambda}{N-2}}\,dy\right)^{\frac{N-2}{N}}
\le C\int_{B(x,r)}|\nabla f|^2|y|^{\lambda}\,dy, \qquad \forall\, f \in C_0^{\infty}(B(x,r)),
\]
and for some positive constant $C$ independent of $x$ and $r$. Whence also
\[
\left(\int_{B(x,r)}|f|^{\frac{2N}{N-2}}|y|^{\lambda}\,dy\right)^{\frac{N-2}{N}}
\le C\sup_{y\in B(x,r)}|y|^{\frac{2|\lambda|}{N}}\int_{B(x,r)}|\nabla f|^2|y|^{\lambda}\,dy
\le C(|x|+r)^{\frac{2|\lambda|}{N}}\int_{B(x,r)}|\nabla f|^2|y|^{\lambda}\,dy.
\]
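For orientation, the Hölder step invoked at the beginning of this proof looks as follows in the borderline case $\nu = N$. This is a sketch written for this exposition; the case $\nu > N$ needs an additional interpolation that is not reproduced here. The shorthand $d\mu := |y|^{\lambda}dy$ is introduced only for this display.

```latex
\begin{align*}
\int_{B(x,r)} |f|^{2(1+\frac{2}{N})}\,d\mu
 &= \int_{B(x,r)} |f|^{2}\,|f|^{\frac{4}{N}}\,d\mu
 \le \Bigl(\int_{B(x,r)} |f|^{\frac{2N}{N-2}}\,d\mu\Bigr)^{\frac{N-2}{N}}
     \Bigl(\int_{B(x,r)} |f|^{2}\,d\mu\Bigr)^{\frac{2}{N}} \\
 &\le C_S\,(|x|+r)^{\frac{2|\lambda|}{N}}
     \Bigl(\int_{B(x,r)} |\nabla f|^{2}\,d\mu\Bigr)
     \Bigl(\int_{B(x,r)} |f|^{2}\,d\mu\Bigr)^{\frac{2}{N}},\\
r^{2}\bigl(r^{N}(|x|+r)^{\lambda}\bigr)^{-\frac{2}{N}}
 &= (|x|+r)^{-\frac{2\lambda}{N}} = (|x|+r)^{\frac{2|\lambda|}{N}} \qquad (\lambda \le 0).
\end{align*}
```

The first line is Hölder's inequality with exponents $\frac{N}{N-2}$ and $\frac{N}{2}$, the second is (2.18), and the last identity shows that the prefactor agrees with the one in Theorem 2.13 when $\nu = N$.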
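Let us now prove the result for $\lambda = 2-N$. To this end let us apply Proposition 3.1 to $\Omega_1 = B(0,1)$ with $D = e^{\frac{1}{N-2}}$. Then there exists a positive constant $C$ such that
\[
\int_{B(0,1)}|\nabla v|^2|x|^{2-N}\,dx \ge C\left(\int_{B(0,1)}|v|^{\frac{2N}{N-2}}|x|^{-N}X^{\frac{2(N-1)}{N-2}}\!\left(\frac{|x|}{D}\right)dx\right)^{\frac{N-2}{N}},
\qquad \forall\, v \in C_0^{\infty}(B(0,1));
\]
here $X(t) = \frac{1}{1-\ln t}$, $t \in (0,1]$. Now let us take $v(x) := f(Rx)$ for any $f \in C_0^{\infty}(B(0,R))$, then from above we have
\[
\int_{B(0,R)}|\nabla f|^2|y|^{2-N}\,dy \ge C\left(\int_{B(0,R)}|f|^{\frac{2N}{N-2}}|y|^{-N}X^{\frac{2(N-1)}{N-2}}\!\left(\frac{|y|}{DR}\right)dy\right)^{\frac{N-2}{N}}.
\]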
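Then if $y \in B(x,r)$ clearly $y \in B(0,|x|+r)$, thus if we take $R := |x|+r$ and $f \in C_0^{\infty}(B(x,r))$ from above we have
\[
\int_{B(x,r)}|\nabla f|^2|y|^{2-N}\,dy \ge C\left(\int_{B(x,r)}|f|^{\frac{2N}{N-2}}|y|^{-N}X^{\frac{2(N-1)}{N-2}}\!\left(\frac{|y|}{DR}\right)dy\right)^{\frac{N-2}{N}}
\]
\[
\ge C\left(\int_{B(x,r)}|f|^{\frac{2N}{N-2}}|y|^{2-N}\,dy\right)^{\frac{N-2}{N}}
\inf_{y\in B(x,r)}\left(|y|^{-2}X^{\frac{2(N-1)}{N-2}}\!\left(\frac{|y|}{DR}\right)\right)^{\frac{N-2}{N}}.
\]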
Whence the claim easily follows as soon as we prove that
\[
\sup_{y\in B(x,r)}\left(|y|\,X^{-\frac{N-1}{N-2}}\!\left(\frac{|y|}{DR}\right)\right)^{\frac{2(N-2)}{N}} \le C_S\,(|x|+r)^{\frac{2(N-2)}{N}}.
\]
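The bound just stated rests on the monotonicity of $\varphi(t) = t\left(1-\ln\frac{t}{DR}\right)^{\frac{N-1}{N-2}}$ on $[0,R]$. A direct verification, given as a sketch with the shorthand $a := \frac{N-1}{N-2}$ and $L(t) := 1-\ln\frac{t}{DR}$ introduced here:

```latex
\begin{align*}
\varphi(t) &= t\,L(t)^{a}, \qquad L(t) := 1 - \ln\frac{t}{DR}, \qquad a := \frac{N-1}{N-2},\\
\varphi'(t) &= L(t)^{a} + t\,a\,L(t)^{a-1}\Bigl(-\frac1t\Bigr) = L(t)^{a-1}\bigl(L(t) - a\bigr),\\
L(t) \ge a &\iff \ln\frac{t}{DR} \le 1 - a = -\frac{1}{N-2}
 \iff t \le DR\,e^{-\frac{1}{N-2}} = R \qquad \text{for } D = e^{\frac{1}{N-2}} .
\end{align*}
```

So with the specific value $D = e^{\frac{1}{N-2}}$ the function $\varphi$ is nondecreasing exactly up to $t = R$, and the supremum over $|y| \le |x|+r = R$ is attained at $|y| = |x|+r$.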
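This is indeed the case, in fact we have
\[
\sup_{y\in B(x,r)}|y|\,X^{-\frac{N-1}{N-2}}\!\left(\frac{|y|}{DR}\right) \le \sup_{0\le|y|\le|x|+r}|y|\left(1-\ln\frac{|y|}{DR}\right)^{\frac{N-1}{N-2}} =
\]
(thus using the fact that the function $\varphi(t) = t\left(1-\ln\frac{t}{DR}\right)^{\frac{N-1}{N-2}}$ is an increasing function for $t \in [0,R]$ if $D$ and $R$ are as above)
\[
= (|x|+r)\left(1-\ln\frac{|x|+r}{DR}\right)^{\frac{N-1}{N-2}} = (|x|+r)(1+\ln(D))^{\frac{N-1}{N-2}} = (|x|+r)\left(\frac{N-1}{N-2}\right)^{\frac{N-1}{N-2}}.
\]
This completes the proof of Theorem 2.13.

To state the heat kernel estimates following from Theorem 2.10 we introduce some notation. The operator $L^{\lambda}_{\alpha}$ is defined for $\alpha \ge 1$ and $\lambda \in [2-N, 0]$ in $L^{2}(\Omega, |x|^{\lambda}d^{\alpha}(x)dx)$ as the generator of the symmetric form
\[
\mathcal{L}^{\lambda}_{\alpha}[v_1,v_2] := \int_{\Omega}|x|^{\lambda}d^{\alpha}(x)\,\nabla v_1\nabla v_2\,dx,
\]
namely
\[
D(L^{\lambda}_{\alpha}) := \left\{v \in H_0^{1}(\Omega, |x|^{\lambda}d^{\alpha}(x)dx) : -\frac{1}{|x|^{\lambda}d^{\alpha}(x)}\operatorname{div}(|x|^{\lambda}d^{\alpha}(x)\nabla v) \in L^{2}(\Omega, |x|^{\lambda}d^{\alpha}(x)dx)\right\},
\]
\[
L^{\lambda}_{\alpha}v := -\frac{1}{|x|^{\lambda}d^{\alpha}(x)}\operatorname{div}(|x|^{\lambda}d^{\alpha}(x)\nabla v) \quad \text{for any } v \in D(L^{\lambda}_{\alpha}),
\]
where $H_0^{1}(\Omega, |x|^{\lambda}d^{\alpha}(x)dx)$ denotes the closure of $C_0^{\infty}(\Omega)$ in the norm
\[
v \mapsto \|v\|_{H^{1}_{\alpha,\lambda}} := \left(\int_{\Omega}|x|^{\lambda}d^{\alpha}(x)\left(|\nabla v|^2 + v^2\right)dx\right)^{\frac12}. \tag{2.19}
\]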
Then $L^{\lambda}_{\alpha}$ is a nonnegative self-adjoint operator on $L^{2}(\Omega, |y|^{\lambda}d^{\alpha}(y)dy)$ such that for every $t > 0$, $e^{-L^{\lambda}_{\alpha}t}$ has an integral kernel, that is $e^{-L^{\lambda}_{\alpha}t}v_0(x) := \int_{\Omega} l^{\lambda}_{\alpha}(t,x,y)\,v_0(y)\,|y|^{\lambda}d^{\alpha}(y)\,dy$; here $l^{\lambda}_{\alpha}(t,x,y)$ is called the heat kernel of $L^{\lambda}_{\alpha}$. The existence of $l^{\lambda}_{\alpha}(t,x,y)$ can be proved arguing as in [DS1]; that is, using a global Sobolev inequality on $\Omega$, which can be easily deduced from its local version (2.12) as well as (2.18), by means of the partition of unity as in [K]. Then, from the parabolic Harnack inequality in Theorem 2.10, the following sharp two-sided heat kernel estimate for small time can be easily deduced:

Theorem 2.14. Let $\alpha \ge 1$, $N \ge 2$, $\lambda \in [2-N, 0]$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist positive constants $C_1$, $C_2$, with $C_1 \le C_2$, and $T > 0$ depending on $\Omega$ such that
\[
C_1\min\left\{\frac{(|x|+\sqrt t)^{\frac{|\lambda|}{2}}(|y|+\sqrt t)^{\frac{|\lambda|}{2}}}{d^{\frac{\alpha}{2}}(x)\,d^{\frac{\alpha}{2}}(y)},\ \frac{1}{t^{\frac{\alpha}{2}}}\right\}t^{-\frac N2}e^{-C_2\frac{|x-y|^2}{t}}
\le l^{\lambda}_{\alpha}(t,x,y) \le
C_2\min\left\{\frac{(|x|+\sqrt t)^{\frac{|\lambda|}{2}}(|y|+\sqrt t)^{\frac{|\lambda|}{2}}}{d^{\frac{\alpha}{2}}(x)\,d^{\frac{\alpha}{2}}(y)},\ \frac{1}{t^{\frac{\alpha}{2}}}\right\}t^{-\frac N2}e^{-C_1\frac{|x-y|^2}{t}}
\]
for all $x, y \in \Omega$ and $0 < t \le T$.

Proof of Theorem 2.14. Using the mean value estimate for subsolutions as in Theorem 2.12 and the parabolic Harnack inequality of Theorem 2.10 and arguing as in Theorems 5.2.10, 5.4.10 and 5.4.11 in [SC2] we are led to the following Li-Yau type estimate:
\[
\frac{C_1\,e^{-C_2\frac{|x-y|^2}{t}}}{V(x,\sqrt t)^{\frac12}\,V(y,\sqrt t)^{\frac12}} \le l^{\lambda}_{\alpha}(t,x,y) \le \frac{C_2\,e^{-C_1\frac{|x-y|^2}{t}}}{V(x,\sqrt t)^{\frac12}\,V(y,\sqrt t)^{\frac12}},
\]
for all $x, y \in \Omega$ and $0 < t \le T$; where $C_1$, $C_2$ are two positive constants with $C_1 \le C_2$, and $T > 0$ depends on $\Omega$. From this the result follows using the volume estimate in Lemma 2.2.

Using the machinery we have produced in this section we can handle more general operators than the one in Theorems 2.10 and 2.14. Thus, consider the operator
\[
\tilde L^{\lambda}_{\alpha} := -\frac{1}{|x|^{\lambda}d^{\alpha}(x)}\sum_{i,j=1}^{N}\frac{\partial}{\partial x_i}\left(a_{i,j}(x)\,|x|^{\lambda}d^{\alpha}(x)\,\frac{\partial}{\partial x_j}\right), \tag{2.20}
\]
where $\left(a_{i,j}(x)\right)_{N\times N}$ is a measurable symmetric uniformly elliptic matrix. The operator $\tilde L^{\lambda}_{\alpha}$ is defined for $\alpha \ge 1$ and $\lambda \in [2-N, 0]$ in $L^{2}(\Omega, |x|^{\lambda}d^{\alpha}(x)dx)$ as the generator of the symmetric form
\[
\tilde{\mathcal{L}}^{\lambda}_{\alpha}[v_1,v_2] := \sum_{i,j=1}^{N}\int_{\Omega}|x|^{\lambda}d^{\alpha}(x)\,a_{i,j}(x)\,\frac{\partial v_1}{\partial x_i}\frac{\partial v_2}{\partial x_j}\,dx.
\]
Then the existence of a heat kernel $\tilde l^{\lambda}_{\alpha}(t,x,y)$ follows as in [DS1], and we have
Theorem 2.15. Let $\alpha \ge 1$, $N \ge 2$, $\lambda \in [2-N, 0]$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist positive constants $C_1$, $C_2$, with $C_1 \le C_2$, and $T > 0$ depending on $\Omega$ such that
\[
C_1\min\left\{\frac{(|x|+\sqrt t)^{\frac{|\lambda|}{2}}(|y|+\sqrt t)^{\frac{|\lambda|}{2}}}{d^{\frac{\alpha}{2}}(x)\,d^{\frac{\alpha}{2}}(y)},\ \frac{1}{t^{\frac{\alpha}{2}}}\right\}t^{-\frac N2}e^{-C_2\frac{|x-y|^2}{t}}
\le \tilde l^{\lambda}_{\alpha}(t,x,y) \le
C_2\min\left\{\frac{(|x|+\sqrt t)^{\frac{|\lambda|}{2}}(|y|+\sqrt t)^{\frac{|\lambda|}{2}}}{d^{\frac{\alpha}{2}}(x)\,d^{\frac{\alpha}{2}}(y)},\ \frac{1}{t^{\frac{\alpha}{2}}}\right\}t^{-\frac N2}e^{-C_1\frac{|x-y|^2}{t}}
\]
for all $x, y \in \Omega$ and $0 < t \le T$.

Remark 2.16. A parabolic Harnack inequality up to the boundary similar to the one of Theorem 2.10 can be stated under the same assumptions of Theorem 2.15 for the more general operator $\tilde L^{\lambda}_{\alpha}$.

3. Critical Point Singularity

In this section we establish a new Improved Hardy inequality (Theorem 3.2) and then we give the proofs of Theorem 1.1 and Theorem 1.2. The structure of this section is as follows. In Subsect. 3.1 we first deduce the improved Hardy inequality and then the global in time pointwise upper bound for the heat kernel of the Schrödinger operator $-\Delta - ((N-2)^2/4)|x|^{-2}$, which is sharp when $x$ and $y$ are close to the boundary (see Theorem 3.4); then, due to an argument contained in [D1], we complete the proof of Theorem 1.2 proving the sharp lower bound for time large enough. The proof of Theorem 1.1 is finally completed in Subsect. 3.2, using the parabolic Harnack inequality up to the boundary of Theorem 2.10.

3.1. Boundary upper bounds and complete sharp description of the heat kernel for large values of time.

We first recall the following improved Hardy-Sobolev inequality stated in Theorem A in [FT] (see also inequality (3.3) in [BFT2]).

Proposition 3.1. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin and $D \ge \sup_{x\in\Omega}|x|$. Then there exists a positive constant $C$ such that
\[
\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx \ge C\left(\int_{\Omega}|v|^{\frac{2N}{N-2}}|x|^{-N}X^{\frac{2(N-1)}{N-2}}\!\left(\frac{|x|}{D}\right)dx\right)^{\frac{N-2}{N}},
\]
for any $v \in C_0^{\infty}(\Omega)$; here $X(t) = \frac{1}{1-\ln t}$, $t \in (0,1]$.

We next state a new result, the proof of which will be given later on.

Theorem 3.2 (Improved Hardy inequality). Let $\Omega \subset \mathbb{R}^N$, $N \ge 3$, be a smooth bounded domain containing the origin. Then there exists a constant $C = C(\Omega) \in (0, \frac14]$ such that
\[
\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx \ge C(\Omega)\int_{\Omega}\frac{u^2}{d^2(x)}\,dx, \qquad \forall\, u \in C_0^{\infty}(\Omega). \tag{3.1}
\]
The positive constant $C(\Omega)$ can be taken to be exactly $\frac14$ for all domains satisfying the following condition:
\[
-\operatorname{div}(|x|^{2-N}\nabla d(x)) \ge 0 \quad \text{a.e. in } \Omega. \tag{3.2}
\]
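Condition (3.2) is easy to test on a ball centered at the origin. The following computation is a sketch for $\Omega = B(0,R)$, where $d(x) = R - |x|$; it is carried out away from the origin, where $d$ is smooth.

```latex
\begin{align*}
\nabla d(x) &= -\frac{x}{|x|}, \qquad
|x|^{2-N}\,\nabla d(x) = -|x|^{1-N}\,x,\\
-\operatorname{div}\bigl(|x|^{2-N}\nabla d\bigr)
 &= \operatorname{div}\bigl(|x|^{1-N}x\bigr)
  = N\,|x|^{1-N} + (1-N)\,|x|^{-1-N}\,x\cdot x
  = |x|^{1-N} \;\ge\; 0 .
\end{align*}
```

The sign comes out nonnegative for every radius $R > 0$, since $R$ drops out of the gradient.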
For example when $\Omega \equiv B(0,R)$, for arbitrary $R > 0$, condition (3.2) is satisfied. Consequently, in this case the Hardy inequality involving the Schrödinger operator having critical singularity at the origin can be improved exactly by the inverse-square potential having critical singularity at the boundary. As a consequence of Proposition 3.1 and of the improved Hardy inequality of Theorem 3.2, the following logarithmic Sobolev inequality can be easily obtained:

Theorem 3.3 (Logarithmic Hardy Sobolev inequality). For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then for any $u \in C_0^{\infty}(\Omega\setminus\{0\})$, $u \ge 0$, and any $\varepsilon > 0$ we have
\[
\int_{\Omega}u^2\log\left(\frac{u}{\|u\|_2\,|x|^{\frac{2-N}{2}}\,d(x)}\right)dx
\le \varepsilon\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx + \left(K_3 - \frac{N+2}{4}\log\varepsilon\right)\|u\|_2^2; \tag{3.3}
\]
here $K_3$ is a positive constant independent of $\varepsilon$ and $\|u\|_2 := \left(\int_{\Omega}u^2\,dx\right)^{\frac12}$.
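The constant $K_3$ and the exponent $\frac{N+2}{4}$ in (3.3) come from adding the two logarithmic inequalities (3.4) and (3.5) established in the proof of Theorem 3.3 and then rescaling $\varepsilon$. A sketch of this bookkeeping, with the shorthand $\mathcal{E}(u) := \int_{\Omega}\bigl(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\bigr)dx$ and the auxiliary parameter $\varepsilon'$ introduced here:

```latex
\begin{align*}
\int_\Omega u^2\log\frac{u}{\|u\|_2\,|x|^{\frac{2-N}{2}}\,d(x)}\,dx
 &= \int_\Omega u^2\,(-\log d(x))\,dx + \int_\Omega u^2\log\frac{u}{\|u\|_2\,|x|^{\frac{2-N}{2}}}\,dx\\
 &\le 2\varepsilon'\,\mathcal{E}(u) + \Bigl(K_1 + K_2 - \frac{N+2}{4}\log\varepsilon'\Bigr)\|u\|_2^2
 \qquad \text{(by (3.4) and (3.5) with } \varepsilon'\text{)},\\
 &= \varepsilon\,\mathcal{E}(u) + \Bigl(K_1 + K_2 + \frac{N+2}{4}\log 2 - \frac{N+2}{4}\log\varepsilon\Bigr)\|u\|_2^2
 \qquad \Bigl(\varepsilon' = \frac{\varepsilon}{2}\Bigr).
\end{align*}
```

Here $\frac12 + \frac N4 = \frac{N+2}{4}$ accounts for the exponent, and the extra term $\frac{N+2}{4}\log 2$ is exactly the one appearing in the definition of $K_3$ at the end of the proof.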
Then using the Gross theorem of logarithmic Sobolev inequalities, as adapted by Davies and Simon (see Theorem 2.2.7 in [D4]), we will show the following global in time pointwise upper bound for the heat kernel:

Theorem 3.4. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exists a positive constant $C$ such that
\[
k(t,x,y) \le C\,\frac{d(x)d(y)}{t}\,(|x||y|)^{\frac{2-N}{2}}\,t^{-\frac N2}\,e^{-\lambda_1 t}, \qquad \forall\, x, y \in \Omega,\ t > 0.
\]
Let us first prove the logarithmic Hardy Sobolev inequality (3.3).

Proof of Theorem 3.3. As a first step we claim that the following logarithmic Hardy Sobolev inequality holds true:
\[
\int_{\Omega}u^2(-\log d(x))\,dx \le \varepsilon\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx + \left(K_1 - \frac12\log\varepsilon\right)\|u\|_2^2, \tag{3.4}
\]
for any $u \in C_0^{\infty}(\Omega)$, $u \ge 0$, and any $\varepsilon > 0$; here $K_1$ is a positive constant independent of $\varepsilon$. To see this let us first suppose that the nonnegative function $u \in C_0^{\infty}(\Omega)$ is such that $\|u\|_2 = 1$. We then have
\[
\int_{\Omega}u^2(-\log d(x))\,dx = \frac12\int_{\Omega}u^2\left(\log d(x)^{-2}\right)dx \le \frac12\log\left(\int_{\Omega}\frac{u^2}{d^2(x)}\,dx\right)
\le \frac12\log\left(C^{-1}\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx\right);
\]
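here we have used first Jensen's inequality and then the improved Hardy inequality (3.1). For a general nonnegative $u \in C_0^{\infty}(\Omega)$ we apply the above inequality to the function $\frac{u}{\|u\|_2}$, to get
\[
\int_{\Omega}u^2(-\log d(x))\,dx \le \frac{\|u\|_2^2}{2}\log\left(\frac{C^{-1}}{\|u\|_2^2}\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx\right).
\]
Since $\log z \le z$ for any $z > 0$, then also $\log y \le 2C\varepsilon y - \log(2C\varepsilon)$, for any $\varepsilon > 0$; whence from this we deduce (3.4), with $K_1 := \frac12\log\left(\frac{1}{2C}\right)$.

We will next show the following logarithmic Hardy Sobolev inequality:
\[
\int_{\Omega}u^2\log\left(\frac{u}{\|u\|_2\,|x|^{\frac{2-N}{2}}}\right)dx \le \varepsilon\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2\right)dx + \left(K_2 - \frac N4\log\varepsilon\right)\|u\|_2^2, \tag{3.5}
\]
for any $u \in C_0^{\infty}(\Omega\setminus\{0\})$, $u \ge 0$, and any $\varepsilon > 0$; here $K_2$ is a positive constant independent of $\varepsilon$. By Proposition 3.1 it follows easily that there exists a positive constant $C$ such that
\[
\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx \ge C\left(\int_{\Omega}|v|^{\frac{2N}{N-2}}|x|^{2-N}\,dx\right)^{\frac{N-2}{N}}, \tag{3.6}
\]
for any $v \in C_0^{\infty}(\Omega)$ (this is inequality (4.12) in [BFT2]). Whence we claim that the following logarithmic Sobolev inequality holds true:
\[
\int_{\Omega}v^2\log\left(\frac{v}{\|v\|_2}\right)|x|^{2-N}\,dx \le \varepsilon\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx + \left(K_2 - \frac N4\log\varepsilon\right)\|v\|_2^2, \tag{3.7}
\]
for any $v \in C_0^{\infty}(\Omega)$, $v \ge 0$, and any $\varepsilon > 0$; here $K_2$ is a positive constant independent of $\varepsilon$ and $\|v\|_2 := \left(\int_{\Omega}v^2|x|^{2-N}\,dx\right)^{\frac12}$. To see this let us first suppose that the nonnegative function $v \in C_0^{\infty}(\Omega)$ is such that $\|v\|_2 = 1$. We then have
\[
\int_{\Omega}v^2\log(v)\,|x|^{2-N}\,dx = \frac{N-2}{4}\int_{\Omega}v^2\log\left(v^{\frac{4}{N-2}}\right)|x|^{2-N}\,dx
\le \frac{N-2}{4}\log\left(\int_{\Omega}v^{\frac{4}{N-2}+2}|x|^{2-N}\,dx\right)
\]
\[
= \frac{N-2}{4}\log\left(\int_{\Omega}v^{\frac{2N}{N-2}}|x|^{2-N}\,dx\right) \le \frac N4\log\left(C^{-1}\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx\right);
\]
here we have used first Jensen's inequality and then the improved Hardy-Sobolev inequality (3.6). For a general nonnegative $v \in C_0^{\infty}(\Omega)$ we apply the above inequality to the function $\frac{v}{\|v\|_2}$, to get
\[
\int_{\Omega}v^2\log\left(\frac{v}{\|v\|_2}\right)|x|^{2-N}\,dx \le \frac N4\,\|v\|_2^2\,\log\left(\frac{C^{-1}}{\|v\|_2^2}\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx\right).
\]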
Since $\log z \le z$ for any $z > 0$, then also $\log y \le \frac{4C\varepsilon}{N}\,y - \log\left(\frac{4C\varepsilon}{N}\right)$, for any $\varepsilon > 0$; whence from this we deduce (3.7) with $K_2 := \frac N4\log\left(\frac{N}{4C}\right)$.
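The passage from the logarithm to a linear bound, used here and in the proof of (3.4), can be spelled out. The sketch below uses the shorthand $J := \int_{\Omega}|\nabla v|^2|x|^{2-N}dx$, introduced only for this display, and applies $\log z \le z$ with $z = \frac{4C\varepsilon}{N}\,y$ and $y = \frac{C^{-1}J}{\|v\|_2^2}$.

```latex
\begin{align*}
\frac{N}{4}\,\|v\|_2^2\,\log y
 &\le \frac{N}{4}\,\|v\|_2^2\Bigl(\frac{4C\varepsilon}{N}\,y - \log\frac{4C\varepsilon}{N}\Bigr)\\
 &= \varepsilon\,J + \Bigl(\frac{N}{4}\log\frac{N}{4C} - \frac{N}{4}\log\varepsilon\Bigr)\|v\|_2^2,
 \qquad y = \frac{C^{-1}J}{\|v\|_2^2} .
\end{align*}
```

The same computation with the prefactor $\frac N4$ replaced by $\frac12$ and $\frac{4C}{N}$ replaced by $2C$ gives the value $K_1 = \frac12\log\frac{1}{2C}$ in (3.4).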
Inequality (3.7) implies (3.5) via the following change of variables $u := v|x|^{\frac{2-N}{2}}$. Finally from (3.4) and (3.5), the logarithmic Hardy Sobolev inequality (3.3) easily follows with constant $K_3 := K_1 + K_2 + \frac{N+2}{4}\log 2$.

We are now ready to give the proof of Theorem 3.4.

Proof of Theorem 3.4. Let us define, as in Sect. 2 of [D2], the operator $\tilde K := U^{-1}(K - \lambda_1)U$, $U : L^{2}(\Omega, \varphi_1^2dx) \to L^{2}(\Omega)$ being the unitary operator $Uw := \varphi_1 w$, thus $\tilde K := -\frac{1}{\varphi_1^2}\operatorname{div}(\varphi_1^2\nabla)$. Here $\varphi_1 > 0$ denotes the first eigenfunction and $\lambda_1 > 0$ the first eigenvalue corresponding to the Dirichlet problem $-\Delta\varphi_1 - \frac{(N-2)^2}{4|x|^2}\varphi_1 = \lambda_1\varphi_1$ in $\Omega$, $\varphi_1 = 0$ on $\partial\Omega$, normalized in such a way that $\int_{\Omega}\varphi_1^2(x)\,dx = 1$. Due to the results in Lemma 7 in [DD] and using Theorem 7.1 in [DS1] on one hand and elliptic regularity on the other, there exist two positive constants $c_1$, $c_2$ such that
\[
c_1|x|^{\frac{2-N}{2}}d(x) \le \varphi_1(x) \le c_2|x|^{\frac{2-N}{2}}d(x), \qquad \forall\, x \in \Omega. \tag{3.8}
\]
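From this and (3.3) we deduce the following logarithmic Sobolev inequality:
\[
\int_{\Omega}w^2\log\left(\frac{w}{\|w\|_2}\right)\varphi_1^2\,dx \le \varepsilon\,\langle\tilde Kw, w\rangle_{L^{2}(\Omega,\varphi_1^2dx)} + \left(K_4 - \frac{N+2}{4}\log\varepsilon\right)\|w\|_2^2, \tag{3.9}
\]
for any $w \in C_0^{\infty}(\Omega\setminus\{0\})$, $w \ge 0$, and any $\varepsilon > 0$; where $K_4 := K_3 + \lambda_1 - \log c_1$ and $\|w\|_2 := \left(\int_{\Omega}w^2\varphi_1^2\,dx\right)^{\frac12}$. Let us remark that only the lower bound in estimate (3.8) was used. From now on one can use the standard approach of [D4] to complete the proof of the theorem. Here are some details for the convenience of the reader. As a first step we claim that the following $L^p$ logarithmic Sobolev inequalities hold true:
\[
\int_{\Omega}w^p\log\left(\frac{w}{\|w\|_p}\right)\varphi_1^2\,dx \le \varepsilon\,\langle\tilde Kw, w^{p-1}\rangle_{L^{2}(\Omega,\varphi_1^2dx)} + \frac2p\left(K_4 - \frac{N+2}{4}\log\varepsilon\right)\|w\|_p^p \tag{3.10}
\]
for any $w \in C_0^{\infty}(\Omega\setminus\{0\})$, $w \ge 0$, and any $\varepsilon > 0$, $p > 2$. To see this we apply inequality (3.9) to $w^{\frac p2}$; whence due to the fact that
\[
\int_{\Omega}|\nabla w^{\frac p2}|^2\varphi_1^2\,dx = \frac{p^2}{4}\int_{\Omega}w^{p-2}|\nabla w|^2\varphi_1^2\,dx = \frac{p^2}{4(p-1)}\langle\nabla w, \nabla w^{p-1}\rangle_{L^{2}(\Omega,\varphi_1^2dx)} \le \frac p2\,\langle\tilde Kw, w^{p-1}\rangle_{L^{2}(\Omega,\varphi_1^2dx)},
\]
since $\frac{p}{2(p-1)} \le 1$ if $p \ge 2$; the claim follows.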
Let $H_0^{1}(\Omega, \varphi_1^2dx)$ be the closure of $C_0^{\infty}(\Omega)$ with respect to the norm
\[
\|w\|_{H^{1}_{0,\varphi_1^2}} := \left(\int_{\Omega}\left(|\nabla w|^2\varphi_1^2 + w^2\varphi_1^2\right)dx\right)^{\frac12};
\]
as one can easily prove this is also the closure of $C_0^{\infty}(\Omega\setminus\{0\})$ with respect to the same norm. Then the operator $\tilde K$ defined in the domain $D(\tilde K) = \{w \in H_0^{1}(\Omega, \varphi_1^2dx) : \tilde Kw \in L^{2}(\Omega, \varphi_1^2dx)\}$ is naturally associated with the bilinear symmetric form defined as follows: $\tilde{\mathcal K}[w_1,w_2] := \langle\tilde Kw_1, w_2\rangle_{L^{2}(\Omega,\varphi_1^2dx)} = \int_{\Omega}\nabla w_1\nabla w_2\,\varphi_1^2\,dx$, which is a Dirichlet form. Whence Lemma 1.3.4 and Theorems 1.3.2 and 1.3.3 in [D4] imply that $e^{-\tilde Kt}$, which is an analytic contraction semigroup in $L^{2}(\Omega, \varphi_1^2dx)$, is also positivity preserving and a contraction semigroup in $L^{p}(\Omega, \varphi_1^2dx)$ for any $1 \le p \le \infty$. As a consequence for any $t > 0$ and any $p \ge 2$,
\[
e^{-\tilde Kt}\left[L^{2}(\Omega, \varphi_1^2dx)\cap L^{\infty}(\Omega)\right]_{+} \subset \left[H_0^{1}(\Omega, \varphi_1^2dx)\cap L^{p}(\Omega, \varphi_1^2dx)\cap L^{\infty}(\Omega)\right]_{+};
\]
where we denote by $[E]_{+}$ the subset of positive functions in the space $E$. Thus by a density argument the $L^p$ logarithmic Sobolev inequality (3.10) more generally applies to any function in $\cup_{t>0}\,e^{-\tilde Kt}\left[L^{2}(\Omega, \varphi_1^2dx)\cap L^{\infty}(\Omega)\right]_{+}$. This means that Theorem 2.2.7 in [D4] can be applied, in the same way as in Corollary 2.2.8 in [D4], to the operator $\tilde K$; whence obtaining that
\[
\|e^{-\tilde Kt}\|_{2\to\infty} \le C\,t^{-\frac{N+2}{4}},
\]
and by duality that
\[
\|e^{-\tilde Kt}\|_{1\to2} \le C\,t^{-\frac{N+2}{4}},
\]
that is
\[
\|e^{-\tilde Kt}\|_{1\to\infty} \le C\,t^{-\frac{N+2}{2}}.
\]
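The last bound follows from the two preceding ones by the semigroup property. A sketch of the standard argument, splitting the time interval in half (the constant $C$ on the right is the one from the two bounds above, so the final constant differs from it by an explicit factor):

```latex
\begin{align*}
\|e^{-\tilde K t}\|_{1\to\infty}
 = \|e^{-\tilde K t/2}\,e^{-\tilde K t/2}\|_{1\to\infty}
 \le \|e^{-\tilde K t/2}\|_{2\to\infty}\,\|e^{-\tilde K t/2}\|_{1\to 2}
 \le \Bigl(C\,(t/2)^{-\frac{N+2}{4}}\Bigr)^{2}
 = 2^{\frac{N+2}{2}}\,C^{2}\,t^{-\frac{N+2}{2}} .
\end{align*}
```

Note that $t^{-\frac{N+2}{2}} = \frac1t\,t^{-\frac N2}$, which is the form in which the bound enters the pointwise estimate for the kernel of $e^{-\tilde K t}$.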
Here we use the following notation:
\[
\|e^{-\tilde Kt}\|_{q\to p} := \sup_{0<\|f\|_q\le1}\frac{\|e^{-\tilde Kt}f(x)\|_p}{\|f(x)\|_q},
\]
where $\|f\|_q := \left(\int_{\Omega}|f|^{q}\varphi_1^2\,dx\right)^{\frac1q}$. This implies, by the Dunford-Pettis theorem, that the semigroup $e^{-\tilde Kt}$ is indeed a semigroup of integral operators; that is, a heat kernel $\tilde k(t,x,y)$ associated to the semigroup $e^{-\tilde Kt}$ is well defined and satisfies the following pointwise upper bound: $\tilde k(t,x,y) \le C\,\frac1t\,t^{-\frac N2}$, for any $x, y \in \Omega$ and any $t > 0$. Theorem 3.4 then follows, due to the upper bound in (3.8) and to the fact that, as a consequence of the unitary operator $U$, the heat kernels $k(t,x,y)$ and $\tilde k(t,x,y)$, corresponding respectively to $K$ and $\tilde K$, satisfy the following equivalence:
\[
k(t,x,y) \equiv \varphi_1(x)\varphi_1(y)\,\tilde k(t,x,y)\,e^{-\lambda_1 t}. \tag{3.11}
\]
Remark 3.5. Applying Davies's method of exponential perturbation to the operator $\tilde K$ (see Sect. 2 in [D3] for details), the upper bound in Theorem 3.4 can be improved by adding a factor $c_{\delta}\,e^{-\frac{|x-y|^2}{4(1+\delta)t}}$.

Let us now deduce from the upper bound in Theorem 3.4 an analogous lower bound for time large enough, thus completing the proof of Theorem 1.2. We argue as in Theorem 6 of [D1] (see also Prop. 4 of [D2]); we give the details here for the convenience of the reader.

Proof of Theorem 1.2. Making use of the same notation as in the proof of Theorem 3.4, the lower bound we want to prove corresponds to the statement $\tilde k(t,x,y) \ge C$ for any $x, y \in \Omega$ if $t$ is large enough, $C$ being some positive constant. For any $f \in L^{1}(\Omega, \varphi_1^2dx)$, we clearly have $f = \langle f, 1\rangle 1 + g$, where $\langle f, 1\rangle := \langle f, 1\rangle_{L^{2}(\Omega,\varphi_1^2dx)}$, and $\langle g, 1\rangle = 0$, since $\int_{\Omega}\varphi_1^2(x)\,dx = 1$. Thus, making use of the fact that by definition $\tilde K1 = 0$ we have
\[
e^{-\tilde Kt}f = \langle f, 1\rangle 1 + e^{-\tilde Kt}g,
\]
that is the semigroup $e^{-At}f := e^{-\tilde Kt}f - \langle f, 1\rangle 1$, to which the heat kernel $\tilde k(t,x,y) - 1$ is clearly associated, is such that for any $f \in L^{1}(\Omega, \varphi_1^2dx)$
\[
e^{-At}f \equiv e^{-\tilde Kt}g,
\]
where $g = g(f)$ is a function in $L^{1}(\Omega, \varphi_1^2dx)$ such that $\langle g, 1\rangle = 0$. Thus, due to Theorem 3.4,
\[
\|e^{-At}\|_{1\to\infty} \le \|e^{-\tilde Kt}\|_{1\to\infty} \le C\,t^{-\frac{N+2}{2}},
\]
here $C$ is some positive constant; this is equivalent to saying that
\[
|\tilde k(t,x,y) - 1| \le C\,t^{-\frac{N+2}{2}},
\]
from which the claim easily follows for $t$ large enough.

In the sequel we will give the proof of Theorem 3.2. We will use the following lemma whose proof will be postponed until the end of this subsection.

Lemma 3.6. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exists $\delta_0 > 0$, such that
\[
\inf_{f\in C_0^{\infty}(\Omega_{\delta})}\frac{\int_{\Omega_{\delta}}|x|^{2-N}|\nabla f|^2\,dx}{\int_{\Omega_{\delta}}|x|^{2-N}\frac{f^2}{d^2(x)}\,dx} = \frac14,
\]
for all $0 < \delta \le \delta_0$; here $\Omega_{\delta} := \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) \le \delta\}$.
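We are now ready to prove the improved Hardy inequality.

Proof of Theorem 3.2. (i) Let us first prove the claim on any domain $\Omega$ satisfying condition (3.2). To this end let us define for any $u \in C_0^{\infty}(\Omega)$ as a new variable $w := |x|^{\frac{N-2}{2}}d^{-\frac12}(x)\,u$, obviously $w \in H_0^{1}(\Omega)$. By direct computations we have
\[
\nabla u = \frac{2-N}{2}|x|^{-\frac N2-1}x\,d^{\frac12}w + \frac12|x|^{\frac{2-N}{2}}d^{-\frac12}\nabla d\,w + |x|^{\frac{2-N}{2}}d^{\frac12}\nabla w,
\]
thus
\[
\int_{\Omega}|\nabla u|^2\,dx = \int_{\Omega}\left(\frac{(N-2)^2}{4}|x|^{-N}d\,w^2 + \frac14|x|^{2-N}d^{-1}w^2 + |x|^{2-N}d\,|\nabla w|^2\right)dx
\]
\[
+ \int_{\Omega}\left(-\frac{N-2}{2}|x|^{-N}w^2\,x\nabla d - (N-2)|x|^{-N}d\,w\,x\nabla w + \nabla d\nabla w\,w\,|x|^{2-N}\right)dx.
\]
Whence
\[
\int_{\Omega}\left(|\nabla u|^2 - \frac{(N-2)^2}{4|x|^2}u^2 - \frac{1}{4d^2}u^2\right)dx
\]
\[
= \int_{\Omega}\left(|\nabla w|^2d\,|x|^{2-N} - \frac{N-2}{2}|x|^{-N}w^2\,x\nabla d - \frac{N-2}{2}|x|^{-N}d\,x\nabla w^2 + \frac12\nabla d\nabla w^2\,|x|^{2-N}\right)dx
\]
\[
= \int_{\Omega}\left(|\nabla w|^2d\,|x|^{2-N} - \frac{N-2}{2}|x|^{-N}w^2\,x\nabla d + \frac{N-2}{2}\operatorname{div}(|x|^{-N}d\,x)\,w^2 - \frac12\operatorname{div}(|x|^{2-N}\nabla d)\,w^2\right)dx
\]
\[
= \int_{\Omega}\left(|\nabla w|^2d\,|x|^{2-N} - \frac12\operatorname{div}(|x|^{2-N}\nabla d)\,w^2\right)dx \ge 0,
\]
due to condition (3.2) on $\Omega$. Thus inequality (3.1) is proved with constant $C(\Omega) \equiv \frac14$ in any domain $\Omega$ satisfying condition (3.2).

(ii) Let us prove indirectly the claim in the remaining case. To this end let us denote by $H_0^{1}(\Omega, |x|^{2-N}dx)$ the closure of $C_0^{\infty}(\Omega)$ in the norm
\[
\|f\|_{H^{1}_{2-N}} := \left(\int_{\Omega}(|\nabla f|^2 + f^2)\,|x|^{2-N}\,dx\right)^{\frac12}. \tag{3.12}
\]
The improved Hardy inequality (3.1) we are going to prove, in the new variable $v := |x|^{\frac{N-2}{2}}u$, reads as follows:
\[
\int_{\Omega}|\nabla v|^2|x|^{2-N}\,dx \ge C\int_{\Omega}|x|^{2-N}\frac{v^2}{d^2}\,dx.
\]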
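Let us suppose that the improved Hardy inequality (3.1) is false; whence let us suppose that the following holds true:
\[
\inf_{\left\{\int_{\Omega}|x|^{2-N}\frac{v^2}{d^2}\,dx = 1\right\}}\int_{\Omega}|x|^{2-N}|\nabla v|^2\,dx = 0;
\]
thus there exists a sequence $\{v_j\}_{j\ge0}$ in $H_0^{1}(\Omega, |x|^{2-N}dx)$ such that
\[
\int_{\Omega}|x|^{2-N}\frac{v_j^2}{d^2}\,dx = 1, \quad\text{and}\quad \int_{\Omega}|x|^{2-N}|\nabla v_j|^2\,dx \to 0, \quad\text{as } j \to \infty. \tag{3.13}
\]
For any arbitrary function $\varphi \in C_0^{\infty}(\Omega)$, such that $\varphi \equiv 1$ in a neighborhood of the origin, we also have
\[
\int_{\Omega}|x|^{2-N}|\nabla(\varphi v_j)|^2\,dx \le 2\int_{\Omega}|x|^{2-N}\left(|\nabla v_j|^2\varphi^2 + |\nabla\varphi|^2v_j^2\right)dx
\le C\int_{\Omega}|x|^{2-N}\left(|\nabla v_j|^2 + v_j^2\right)dx
\le C\int_{\Omega}|x|^{2-N}|\nabla v_j|^2\,dx \to 0 \quad\text{as } j \to \infty. \tag{3.14}
\]
Here we use the fact that the following inequality holds true:
\[
\int_{\Omega}|x|^{2-N}f^2\,dx \le C\int_{\Omega}|x|^{2-N}|\nabla f|^2\,dx, \qquad \forall\, f \in H_0^{1}(\Omega, |x|^{2-N}dx). \tag{3.15}
\]
Inequality (3.15) for example follows easily from inequality (3.6) by the Hölder inequality. From estimate (3.14) and inequality (3.15) (applied to $f := \varphi v_j$) we easily deduce that
\[
\int_{\Omega}|x|^{2-N}\varphi^2v_j^2\,dx \to 0, \quad\text{as } j \to \infty,
\]
or similarly (due to the fact that $\varphi$ has compact support inside $\Omega$) that
\[
\int_{\Omega}|x|^{2-N}\varphi^2\frac{v_j^2}{d^2}\,dx \to 0, \quad\text{as } j \to \infty. \tag{3.16}
\]
We then compute
\[
1 = \int_{\Omega}|x|^{2-N}\frac{v_j^2}{d^2}\,dx = \int_{\Omega}|x|^{2-N}\frac{(\varphi v_j + (1-\varphi)v_j)^2}{d^2}\,dx
\]
\[
= \int_{\Omega}|x|^{2-N}\varphi^2\frac{v_j^2}{d^2}\,dx + 2\int_{\Omega}|x|^{2-N}\varphi(1-\varphi)\frac{v_j^2}{d^2}\,dx + \int_{\Omega}|x|^{2-N}(1-\varphi)^2\frac{v_j^2}{d^2}\,dx.
\]
We observe that the first two terms in the last line tend to zero as $j$ tends to infinity and therefore we obtain that
\[
\int_{\Omega}|x|^{2-N}(1-\varphi)^2\frac{v_j^2}{d^2}\,dx = 1 + o(1), \quad\text{as } j \to \infty. \tag{3.17}
\]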
On the other hand we have that |x|2−N |∇[(1 − ϕ)v j ]|2 d x ≤ 2 |x|2−N |∇v j |2 d x + 2 |x|2−N |∇(ϕv j )|2 d x,
both terms in the right-hand side going to zero as j tends to infinity due to (3.13) and (3.14); whence we deduce that
|x|2−N |∇[(1 − ϕ)v j ]|2 d x → 0,
as j → ∞.
(3.18)
Since for any $j \ge 0$ the function $f := (1-\varphi) v_j$ is an element of $H_0^1(\Omega_\delta)$ for a suitable choice of the function $\varphi$ (take it identically one in a subset containing $\Omega \setminus \Omega_\delta$), by means of (3.17) and (3.18) we reach a contradiction with Lemma 3.6, thus proving the improved Hardy inequality.

A similar improved Hardy inequality for a potential behaving like $((N-2)^2/4)|x|^{-2}$ near the origin and exactly like $(1/4)d^{-2}(x)$ near the boundary is also shown without any geometric assumption on the domain (see Theorem 3.10 below).

We next prove Lemma 3.6. One can consider it as a consequence of the following more general result.

Lemma 3.7. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain. Then there exists a positive constant $\delta_0 = \delta_0(\Omega)$ such that for any $V \in L^{N/2}(\Omega_{\delta_0})$ and $0 < \delta \le \delta_0$ we have the following estimate:
\[
\int_{\Omega_\delta} \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx \ge c \int_{\Omega_\delta} V u^2 \, dx, \quad \forall\, u \in C_0^\infty(\Omega_\delta);
\]
here $c = c(\delta) \to \infty$ as $\delta \to 0$ and $\Omega_\delta := \{x \in \Omega : \mathrm{dist}(x, \partial\Omega) \le \delta\}$.

Proof of Lemma 3.6. Let us choose $V(x) := \frac{(N-2)^2}{4|x|^2}$ in Lemma 3.7 above and let us choose $\delta$ small enough such that $c(\delta) \ge 1$ and $0 < \delta \le \delta_0$; thus we have
\[
\int_{\Omega_\delta} \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx \ge \int_{\Omega_\delta} \frac{(N-2)^2}{4|x|^2} u^2 \, dx, \tag{3.19}
\]
for any $u \in C_0^\infty(\Omega_\delta)$. For any $f \in C_0^\infty(\Omega_\delta)$, $u := f |x|^{\frac{2-N}{2}}$ will be in $C_0^\infty(\Omega_\delta)$; moreover, by easy computations we have
\[
\int_{\Omega_\delta} \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx = \int_{\Omega_\delta} \left( |\nabla f|^2 + \frac{(N-2)^2}{4|x|^2} f^2 - \frac{1}{4d^2} f^2 \right) |x|^{2-N} \, dx,
\]
thus (3.19) can be restated as follows:
\[
\int_{\Omega_\delta} \left( |\nabla f|^2 - \frac{1}{4d^2} f^2 \right) |x|^{2-N} \, dx \ge 0;
\]
this proves the claim.
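The "easy computations" in the proof of Lemma 3.6 can be sketched as follows. This is our own expansion of the change of variables $u = f|x|^{(2-N)/2}$, written under the assumption that $\delta$ is small enough for the origin to lie outside $\Omega_\delta$, so that $|x|^{-N}x$ is smooth and divergence-free on the support of $f$.

```latex
% Sketch (not taken from the paper): u = f |x|^{(2-N)/2}, f \in C_0^\infty(\Omega_\delta), 0 \notin \Omega_\delta.
\begin{align*}
\nabla u &= |x|^{\frac{2-N}{2}} \nabla f + \tfrac{2-N}{2}\, f\, |x|^{-\frac{N}{2}-1}\, x, \\
|\nabla u|^2 &= |x|^{2-N} |\nabla f|^2
  + \tfrac{2-N}{2}\, |x|^{-N}\, x \cdot \nabla (f^2)
  + \tfrac{(N-2)^2}{4}\, |x|^{-N} f^2 .
\end{align*}
% The cross term integrates to zero, because f has compact support and
% \operatorname{div}\big(|x|^{-N} x\big) = N |x|^{-N} - N |x|^{-N-2} |x|^2 = 0 for x \neq 0:
\begin{align*}
\int_{\Omega_\delta} |x|^{-N}\, x \cdot \nabla (f^2)\, dx
  &= - \int_{\Omega_\delta} f^2 \operatorname{div}\big(|x|^{-N} x\big)\, dx = 0, \\
\int_{\Omega_\delta} |\nabla u|^2\, dx
  &= \int_{\Omega_\delta} \Big( |\nabla f|^2 + \frac{(N-2)^2}{4|x|^2} f^2 \Big) |x|^{2-N}\, dx .
\end{align*}
```

Subtracting $\int_{\Omega_\delta} \frac{u^2}{4d^2}\,dx = \int_{\Omega_\delta} \frac{f^2}{4d^2}|x|^{2-N}\,dx$ from both sides gives the identity used in the proof.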
Whence it only remains to prove Lemma 3.7. Before doing so let us observe that inequality (3.19) simply says that the improved Hardy inequality (3.1) indeed holds true with constant $C(\Omega) = \frac14$ whenever the support of the functions considered is contained in a neighborhood of the boundary. The proof of Lemma 3.7 makes use of the following improved Hardy-Sobolev inequality near the boundary, stated in Theorem 3 of [FMT1]; we recall it here for the convenience of the reader.

Proposition 3.8. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain. Then there exist positive constants $\delta_0 = \delta_0(\Omega)$ and $C = C(N)$ such that
\[
\int_{\Omega_\delta} \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx \ge C \left( \int_{\Omega_\delta} |u|^{\frac{2N}{N-2}} \, dx \right)^{\frac{N-2}{N}}, \quad \forall\, u \in C_0^\infty(\Omega_\delta),
\]
and any $0 < \delta \le \delta_0$; here $\Omega_\delta := \{x \in \Omega : \mathrm{dist}(x, \partial\Omega) \le \delta\}$.

Let us focus here on the fact that in Proposition 3.8 no convexity assumption on the domain is made; this is due to the fact that we only consider functions whose supports are contained in a neighborhood of the boundary.

Proof of Lemma 3.7. By the Hölder inequality we have
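\[
\begin{aligned}
\int_{\Omega_\delta} V u^2 \, dx &\le \left( \int_{\Omega_\delta} |V|^{\frac{N}{2}} \, dx \right)^{\frac{2}{N}} \left( \int_{\Omega_\delta} |u|^{\frac{2N}{N-2}} \, dx \right)^{\frac{N-2}{N}} \\
&\le \left( \int_{\Omega_\delta} |V|^{\frac{N}{2}} \, dx \right)^{\frac{2}{N}} C(N)^{-1} \int_{\Omega_\delta} \left( |\nabla u|^2 - \frac{1}{4} \frac{u^2}{d^2} \right) dx,
\end{aligned}
\]
the last step being due to Proposition 3.8. This proves the claim with constant
\[
c(\delta) := \left( \int_{\Omega_\delta} |V|^{\frac{N}{2}} \, dx \right)^{-\frac{2}{N}} C(N),
\]
which tends to infinity as $\delta$ tends to zero due to the integrability assumption on $V$.

With some minor changes in the proof of Theorem 3.2 one can indeed prove the following improved Hardy inequality, which does not a priori require the bounded domain to be smooth.

Theorem 3.9. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a bounded domain containing the origin such that
\[
\int_\Omega |\nabla u|^2 \, dx \ge C \int_\Omega \frac{u^2}{d^2} \, dx, \quad \forall\, u \in C_0^\infty(\Omega),
\]
for some positive constant $C$. Then there exists a positive constant $\tilde{C}$ such that
\[
\int_\Omega \left( |\nabla u|^2 - \frac{(N-2)^2}{4|x|^2} u^2 \right) dx \ge \tilde{C} \int_\Omega \frac{u^2}{d^2} \, dx, \quad \forall\, u \in C_0^\infty(\Omega).
\]

We finally mention the following related new Hardy inequality, which we think is of independent interest.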
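Theorem 3.10. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin, and define for $\varepsilon > 0$,
\[
V(x) = \begin{cases} \dfrac{(N-2)^2}{4|x|^2} & \text{if } x \in \{x \in \Omega : d(x) \ge \varepsilon\}, \\[2mm] \dfrac{1}{4d^2(x)} & \text{if } x \in \{x \in \Omega : d(x) < \varepsilon\}. \end{cases}
\]
Then there exists $\varepsilon_0 = \varepsilon_0(\Omega)$ such that for all $0 < \varepsilon \le \varepsilon_0$ and $u \in C_0^\infty(\Omega)$, we have
\[
\int_\Omega |\nabla u|^2 \, dx \ge \int_\Omega V(x) u^2 \, dx.
\]

Proof. We will only sketch it. Let $\Omega_1 = \{x \in \Omega : d(x) \ge \varepsilon\}$. Then using the change of variable $u := |x|^{\frac{2-N}{2}} v$, one can prove the following inequality:
\[
\int_{\Omega_1} \left( |\nabla u|^2 - \frac{(N-2)^2}{4|x|^2} u^2 \right) dx \ge \frac{2-N}{2} \int_{\partial\Omega_1} \frac{u^2}{|x|^2} \, x \cdot \nu \, dS_x.
\]
Similarly, using the change of variable $u := d^{\frac12}(x) X^{-\frac12}(d(x)) v$ with $X(t) = (1 - \ln t)^{-1}$, one can prove the following inequality:
\[
\int_{\Omega \setminus \Omega_1} \left( |\nabla u|^2 - \frac{1}{4d^2(x)} u^2 \right) dx \ge -\frac{1}{4} \int_{\partial\Omega_1} \frac{u^2}{d(x)} \nabla d \cdot \nu \, dS_x,
\]
for any $0 < \varepsilon \le \min\{e^{-1}, \varepsilon_1\}$, where $\varepsilon_1 > 0$ is such that $d^{-1} X(d) + 2 \Delta d \,(\ln d) \ge 0$ for $d \le \varepsilon_1$. The result then follows by showing that for $0 < \varepsilon \le \varepsilon_0 = \min\{e^{-1}, \varepsilon_1, \frac{R}{2N-3}\}$ we have
\[
\left( \frac{2(2-N)}{|x|^2} x - \frac{1}{d(x)} \nabla d \right) \cdot \nu \ge 0,
\]
since $\nu := -\nabla d$ on $\partial\Omega_1$; here $R$ denotes a positive constant such that $B(0, R) \subset \Omega$, which exists due to the assumption on $\Omega$.

3.2. Complete sharp description of the heat kernel for small values of time. In this section we prove the two-sided sharp estimate on the heat kernel $k(t, x, y)$ stated for small time in Theorem 1.1.

Proof of Theorem 1.1. Since for any $x \in \Omega$ and for some positive constants $c_1$, $c_2$ we have the estimate $c_1 |x|^{\frac{\lambda}{2}} d^{\frac{\alpha}{2}}(x) \le \varphi_1(x) \le c_2 |x|^{\frac{\lambda}{2}} d^{\frac{\alpha}{2}}(x)$ for $\alpha = 2$ and $\lambda = 2 - N$, we can apply the result of Theorem 2.15 to the operator $\tilde{K} = -\frac{1}{\varphi_1^2(x)} \mathrm{div}(\varphi_1^2(x) \nabla)$. Hence due to (3.11) the result follows immediately.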
Let us finally make some remarks concerning Schrödinger operators having potential $V(x) = c|x|^{-2}$. Arguing as in Lemma 7 in [DD] one can easily prove that the first Dirichlet eigenfunction for the Schrödinger operator $-\Delta - \frac{c}{|x|^2}$, $0 < c < \frac{(N-2)^2}{4}$, behaves like $|x|^{\frac{\lambda}{2}} d(x)$ on all $\Omega$, where $\lambda := 2 - N + \sqrt{(N-2)^2 - 4c}$. Then we have

Theorem 3.11. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist positive constants $C_1$, $C_2$, with $C_1 \le C_2$, and $T > 0$ depending on $\Omega$ such that
\[
\begin{aligned}
& C_1 \min\left\{ (|x| + \sqrt{t})^{\frac{|\lambda|}{2}} (|y| + \sqrt{t})^{\frac{|\lambda|}{2}}, \frac{d(x) d(y)}{t} \right\} (|x||y|)^{\frac{\lambda}{2}} \, t^{-\frac{N}{2}} e^{-C_2 \frac{|x-y|^2}{t}} \le k_c(t, x, y) \\
& \qquad \le C_2 \min\left\{ (|x| + \sqrt{t})^{\frac{|\lambda|}{2}} (|y| + \sqrt{t})^{\frac{|\lambda|}{2}}, \frac{d(x) d(y)}{t} \right\} (|x||y|)^{\frac{\lambda}{2}} \, t^{-\frac{N}{2}} e^{-C_1 \frac{|x-y|^2}{t}},
\end{aligned}
\]
for all $x, y \in \Omega$ and $0 < t \le T$; here $k_c(t, x, y)$ denotes the heat kernel associated to the operator $-\Delta - \frac{c}{|x|^2}$ in $\Omega$ under Dirichlet boundary conditions for $0 < c < \frac{(N-2)^2}{4}$, and $\lambda := 2 - N + \sqrt{(N-2)^2 - 4c}$.

Theorem 3.12. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain containing the origin. Then there exist two positive constants $C_1$, $C_2$, with $C_1 \le C_2$, such that
\[
C_1 \, d(x) \, d(y) \, (|x||y|)^{\frac{\lambda}{2}} e^{-\lambda_1 t} \le k_c(t, x, y) \le C_2 \, d(x) \, d(y) \, (|x||y|)^{\frac{\lambda}{2}} e^{-\lambda_1 t},
\]
for all $x, y \in \Omega$ and $t > 0$ large enough; here $k_c(t, x, y)$ denotes the heat kernel associated to the operator $-\Delta - \frac{c}{|x|^2}$ in $\Omega$ under Dirichlet boundary conditions for $0 < c < \frac{(N-2)^2}{4}$, $\lambda_1$ its (positive) elliptic first eigenvalue and $\lambda := 2 - N + \sqrt{(N-2)^2 - 4c}$.
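The exponent $\lambda$ in Theorems 3.11 and 3.12 can be explored numerically. The following is a minimal sketch: the function names `hardy_exponent` and `indicial_residual` are ours, not the paper's. It also checks the elementary fact that $\gamma = \lambda/2$ is a root of $\gamma^2 + (N-2)\gamma + c = 0$, which is the condition for $|x|^{\gamma}$ to solve $-\Delta\varphi - c|x|^{-2}\varphi = 0$ away from the origin.

```python
import math


def hardy_exponent(c: float, n: int) -> float:
    """Return lambda = 2 - N + sqrt((N-2)^2 - 4c) for 0 < c < (N-2)^2 / 4."""
    critical = (n - 2) ** 2 / 4
    if n < 3 or not (0 < c < critical):
        raise ValueError("need N >= 3 and 0 < c < (N-2)^2 / 4")
    return 2 - n + math.sqrt((n - 2) ** 2 - 4 * c)


def indicial_residual(c: float, n: int) -> float:
    """Residual of gamma^2 + (N-2)*gamma + c at gamma = lambda / 2.

    A zero residual means |x|^gamma is annihilated by -Laplacian - c/|x|^2
    for x != 0 (radial computation, our own sanity check).
    """
    gamma = hardy_exponent(c, n) / 2
    return gamma * gamma + (n - 2) * gamma + c


if __name__ == "__main__":
    for c in (0.05, 0.1, 3 / 16, 0.24):
        print(c, hardy_exponent(c, 3), indicial_residual(c, 3))
```

Note that $\lambda$ ranges over $(2-N, 0)$ as $c$ ranges over $(0, (N-2)^2/4)$, so the factor $(|x||y|)^{\lambda/2}$ is singular at the origin.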
4. Critical Boundary Singularity

In this section we prove Theorems 1.3 and 1.4 as well as a new Hardy-Moser inequality (Theorem 4.3). The structure of this section is as follows. In Subsect. 4.1 we first prove the improved Hardy-Moser inequality. Then in Subsect. 4.2 we get the global in time pointwise upper bound for the heat kernel of the Schrödinger operator $-\Delta - (1/4)d^{-2}(x)$, which is sharp when $x$ and $y$ are close to the boundary (see Theorem 4.4). Then arguing as in [D1], we deduce the sharp heat kernel lower bound for time large enough, thus completing the proof of Theorem 1.4. The proof of Theorem 1.3 is finally completed in Subsect. 4.3, using the parabolic Harnack inequality up to the boundary stated in Theorem 1.5.
4.1. The improved Hardy-Moser inequality. Here we will prove a new improved Hardy-Moser inequality which we think is of independent interest. The proof is based on an auxiliary Hardy-Sobolev inequality, which we will show here, as well as on the following improved Hardy inequality stated in Theorem A in [BFT1].

Proposition 4.1. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exists $D_0$ positive such that for all $D \ge D_0$,
\[
\int_\Omega \left( |\nabla u|^2 - \frac{1}{4} \frac{u^2}{d^2(x)} \right) dx \ge \frac{1}{4} \int_\Omega \frac{X^2\!\left( \frac{d(x)}{D} \right)}{d^2(x)} u^2 \, dx,
\]
for any $u \in C_0^\infty(\Omega)$; here $X(t) := \frac{1}{1 - \ln t}$, $t \in (0, 1]$.
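Let us now state the auxiliary Hardy-Sobolev inequality we will use in the sequel.

Lemma 4.2. Let $\alpha > 0$, $N \ge 2$ and $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain. Then there exist $\delta_0 > 0$ and $C = C(\alpha, \delta_0) > 0$ such that
\[
\int_\Omega d^\alpha(x) |\nabla v| \, dx + \int_{\Omega \setminus \Omega_\delta} d^{\alpha-1}(x) |v| \, dx \ge C \left( \int_\Omega d^{\frac{\alpha N}{N-1}}(x) |v|^{\frac{N}{N-1}} \, dx \right)^{\frac{N-1}{N}},
\]
for any $v \in C_0^\infty(\Omega)$ and any $0 < \delta \le \delta_0$; here $\Omega_\delta := \{x \in \Omega : \mathrm{dist}(x, \partial\Omega) \le \delta\}$.

Proof. We will follow closely the argument of [FMT2]. Our starting point is the following Gagliardo-Nirenberg inequality (see p. 189 in [M]):
\[
S_N \|f\|_{L^{\frac{N}{N-1}}(\Omega)} \le \|\nabla f\|_{L^1(\Omega)}, \quad \forall\, f \in C_0^\infty(\Omega),
\]
where $S_N$ is a positive constant depending only on $N$. For any $v \in C_0^\infty(\Omega)$ let us apply the above inequality to the function $f := d^\alpha v$. Hence we obtain
\[
S_N \|d^\alpha v\|_{L^{\frac{N}{N-1}}(\Omega)} \le \int_\Omega d^\alpha(x) |\nabla v| \, dx + \alpha \int_\Omega d^{\alpha-1}(x) |v| \, dx.
\]
We next estimate the last term above. Let $\varphi_\delta \in C_0^\infty(\Omega_{2\delta})$, $0 \le \varphi_\delta \le 1$, be a cut-off function which is identically one in $\Omega_\delta$ and identically zero in $\mathbb{R}^N \setminus \Omega_{2\delta}$. Clearly $v = \varphi_\delta v + (1 - \varphi_\delta) v$. Then we have
\[
\begin{aligned}
\alpha \int_\Omega d^{\alpha-1}(x) |v| \, dx &\le \alpha \int_\Omega d^{\alpha-1}(x) |\varphi_\delta v| \, dx + \alpha \int_\Omega d^{\alpha-1}(x) (1 - \varphi_\delta) |v| \, dx \\
&\le \alpha \int_\Omega d^{\alpha-1}(x) |\varphi_\delta v| \, dx + \alpha \int_{\Omega \setminus \Omega_\delta} d^{\alpha-1}(x) |v| \, dx.
\end{aligned}
\]
Concerning the first term on the right-hand side we have
\[
\begin{aligned}
\alpha \int_\Omega d^{\alpha-1}(x) |\varphi_\delta v| \, dx &= \int_\Omega \nabla d^\alpha \cdot \nabla d \, |\varphi_\delta v| \, dx \\
&= -\int_\Omega d^\alpha(x) \nabla d \cdot \nabla |\varphi_\delta v| \, dx - \int_\Omega d^\alpha(x) \Delta d \, |\varphi_\delta v| \, dx \\
&\le \int_\Omega d^\alpha(x) |\nabla(\varphi_\delta v)| \, dx + c_0 \delta \int_\Omega d^{\alpha-1}(x) |\varphi_\delta v| \, dx;
\end{aligned}
\]
here we used the smoothness assumption on $\Omega$, which implies that $|d \Delta d| \le c_0 \delta$ in $\Omega_\delta$ for $\delta$ small, say $0 < \delta \le \delta_0$, and for some positive constant $c_0$ independent of $\delta$ ($\delta_0$, $c_0$ depending on $\Omega$). Thus we have for any $0 < \delta \le \delta_0$,
\[
\begin{aligned}
\alpha \int_\Omega d^{\alpha-1}(x) |\varphi_\delta v| \, dx &\le \frac{\alpha}{\alpha - c_0 \delta_0} \int_\Omega d^\alpha(x) |\nabla(\varphi_\delta v)| \, dx \\
&\le C \int_\Omega d^\alpha(x) |\nabla v| \varphi_\delta \, dx + \frac{C}{\delta} \int_{\Omega_{2\delta} \setminus \Omega_\delta} d^\alpha(x) |v| \, dx \\
&\le C \int_\Omega d^\alpha(x) |\nabla v| \, dx + C \int_{\Omega_{2\delta} \setminus \Omega_\delta} d^{\alpha-1}(x) |v| \, dx,
\end{aligned}
\]
from which the result follows.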
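We next state the new improved Hardy-Moser inequality.

Theorem 4.3. (Improved Hardy-Moser inequality) For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exists a positive constant $C$ such that
\[
\int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx \left( \int_\Omega u^2 \, dx \right)^{\frac{2}{N}} \ge C \int_\Omega |u|^{2(1 + \frac{2}{N})} \, dx, \quad \forall\, u \in C_0^\infty(\Omega).
\]

Proof. Changing variables by $v := u d^{-\frac12}$, we get
\[
\int_\Omega u^{\frac{2(N+2)}{N}} \, dx = \int_\Omega d^{\frac{N+2}{N}} v^{\frac{2(N+2)}{N}} \, dx = \int_\Omega d^{\frac{\alpha N}{N-1}} (v^{2\alpha})^{\frac{N}{N-1}} \, dx,
\]
with $\alpha := \frac{(N+2)(N-1)}{N^2}$. Applying Lemma 4.2 to the function $v^{2\alpha}$ we have
\[
\begin{aligned}
\int_\Omega d^{\frac{\alpha N}{N-1}} (v^{2\alpha})^{\frac{N}{N-1}} \, dx
&\le C \left( \int_\Omega d^\alpha |\nabla v^{2\alpha}| \, dx + \int_{\Omega \setminus \Omega_\delta} d^{\alpha-1} v^{2\alpha} \, dx \right)^{\frac{N}{N-1}} \\
&\le C \left( 2\alpha \int_\Omega d^\alpha |\nabla v| |v|^{2\alpha-1} \, dx + \int_{\Omega \setminus \Omega_\delta} d^{\alpha-1} v^{2\alpha} \, dx \right)^{\frac{N}{N-1}} \\
&\le C \left\{ \left( \int_\Omega d(x) |\nabla v|^2 \, dx \right)^{\frac12} \left( \int_\Omega d^{2\alpha-1} v^{2(2\alpha-1)} \, dx \right)^{\frac12} \right. \\
&\qquad \left. + \left( \int_{\Omega \setminus \Omega_\delta} \frac{v^2}{d} \, dx \right)^{\frac12} \left( \int_{\Omega \setminus \Omega_\delta} d^{2\alpha-1} v^{2(2\alpha-1)} \, dx \right)^{\frac12} \right\}^{\frac{N}{N-1}} \\
&\le C \left( \int_\Omega d^{2\alpha-1} v^{2(2\alpha-1)} \, dx \right)^{\frac{N}{2(N-1)}} \times \left( \int_\Omega \left( |\nabla v|^2 d - \frac12 \Delta d \, v^2 \right) dx \right)^{\frac{N}{2(N-1)}};
\end{aligned}
\]
here we used Proposition 4.1 (observe that $\frac14 X^2(\frac{d}{D}) \ge \frac14 X^2(\frac{\delta}{D})$ if $x \in \Omega \setminus \Omega_\delta$) and standard estimates. Returning to the original variable $u$, we obtain
\[
\int_\Omega u^{\frac{2(N+2)}{N}} \, dx \le C \left( \int_\Omega u^{2(2\alpha-1)} \, dx \right)^{\frac{N}{2(N-1)}} \left( \int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx \right)^{\frac{N}{2(N-1)}},
\]
that is,
\[
\left( \int_\Omega u^{\frac{2(N+2)}{N}} \, dx \right)^{\frac{2(N-1)}{N}} \le C \int_\Omega u^{2(2\alpha-1)} \, dx \int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx.
\]
If $N = 2$ we have that $\alpha = 1$, thus the above inequality becomes
\[
\int_\Omega u^4 \, dx \le C \int_\Omega u^2 \, dx \int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx,
\]
which is the sought for estimate. For $N \ge 3$, we use the Hölder inequality to obtain
\[
\left( \int_\Omega u^{\frac{2(N+2)}{N}} \, dx \right)^{\frac{2(N-1)}{N}} \le C \left( \int_\Omega u^2 \, dx \right)^{\frac{2}{N}} \left( \int_\Omega u^{\frac{2(N+2)}{N}} \, dx \right)^{\frac{N-2}{N}} \times \int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx,
\]
from which
\[
\int_\Omega u^{\frac{2(N+2)}{N}} \, dx \le C \left( \int_\Omega u^2 \, dx \right)^{\frac{2}{N}} \int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2} u^2 \right) dx;
\]
and this completes the proof of Theorem 4.3.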
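The exponent bookkeeping in the proof of Theorem 4.3 is easy to get wrong, so here is a small sketch that checks it with exact rational arithmetic. The helper names (`moser_exponents`, `exponents_are_consistent`) and the symbols `p`, `mid`, `theta` are ours; only $\alpha = (N+2)(N-1)/N^2$ and the exponents themselves come from the text.

```python
from fractions import Fraction


def moser_exponents(n: int) -> dict:
    """Exponents appearing in the proof of Theorem 4.3, as exact fractions."""
    if n < 2:
        raise ValueError("the theorem is stated for N >= 2")
    N = Fraction(n)
    alpha = (N + 2) * (N - 1) / N**2
    return {
        "alpha": alpha,
        "p": 2 * (N + 2) / N,        # target exponent 2(1 + 2/N)
        "mid": 2 * (2 * alpha - 1),  # intermediate exponent 2(2*alpha - 1)
        "theta": 2 / N,              # Hoelder weight carried by the L^2 norm
    }


def exponents_are_consistent(n: int) -> bool:
    """Check the three exponent identities used in the proof."""
    N = Fraction(n)
    e = moser_exponents(n)
    # u = d^{1/2} v: powers of v and of d must both match (v^{2 alpha})^{N/(N-1)}.
    change_of_variables = (
        2 * e["alpha"] * N / (N - 1) == e["p"]
        and e["alpha"] * N / (N - 1) == e["p"] / 2
    )
    # Hoelder step: u^{mid} = u^{2 theta} * u^{p (1 - theta)}.
    interpolation = e["mid"] == 2 * e["theta"] + e["p"] * (1 - e["theta"])
    # Dividing out (int u^p)^{(N-2)/N} leaves the first power of int u^p.
    final_power = 2 * (N - 1) / N - (N - 2) / N == 1
    return change_of_variables and interpolation and final_power


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, moser_exponents(n), exponents_are_consistent(n))
```

For $N = 2$ the weight `theta` equals 1, which matches the remark in the proof that no Hölder step is needed in that case.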
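4.2. Boundary upper bounds and complete sharp description of the heat kernel for large values of time. Here we will first prove the following:

Theorem 4.4. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exists a positive constant $C$ such that
\[
h(t, x, y) \le C \, \frac{d^{\frac12}(x) \, d^{\frac12}(y)}{t^{\frac12}} \, t^{-\frac{N}{2}}, \quad \forall\, x, y \in \Omega, \ t > 0.
\]

To this end we need the following estimate of [FMT2]:

Proposition 4.5. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exists a positive constant $C$ such that
\[
\int_\Omega \left( |\nabla u|^2 - \frac{1}{4d^2(x)} u^2 \right) dx \ge C \left( \int_\Omega d^{\frac{q}{2}(N-2) - N}(x) |u|^q \, dx \right)^{\frac{2}{q}}, \tag{4.1}
\]
for any $u \in C_0^\infty(\Omega)$ and any $2 < q \le \frac{2N}{N-2}$ if $N \ge 3$ or any $2 < q < \infty$ if $N = 2$.

Using (4.1) the following logarithmic Sobolev inequality can be easily obtained:
\[
\int_\Omega v^2 \log\!\left( \frac{v}{\|v\|_2} \right) d \, dx \le \epsilon \int_\Omega \left( |\nabla v|^2 d - \frac12 \Delta d \, v^2 \right) dx + \left( K_1 - \frac{N+1}{4} \log \epsilon \right) \|v\|_2^2, \tag{4.2}
\]
for all $v \in C_0^\infty(\Omega)$, $v \ge 0$, and any $\epsilon > 0$; here $K_1$ is a positive constant independent of $\epsilon$ and $\|v\|_2 := \left( \int_\Omega |v|^2 d \, dx \right)^{\frac12}$.

To obtain (4.2) we apply (4.1) to $v := u d^{-\frac12}$ to get for any $v \in C_0^\infty(\Omega)$,
\[
\int_\Omega \left( |\nabla v|^2 d - \frac12 \Delta d \, v^2 \right) dx \ge C \left( \int_\Omega |v|^q d^{\frac{q}{2}(N-2) - N + \frac{q}{2}} \, dx \right)^{\frac{2}{q}}.
\]
Taking $q := \frac{2(N+1)}{N-1}$ we have
\[
\int_\Omega \left( |\nabla v|^2 d - \frac12 \Delta d \, v^2 \right) dx \ge C \left( \int_\Omega |v|^{\frac{2(N+1)}{N-1}} d \, dx \right)^{\frac{N-1}{N+1}}. \tag{4.3}
\]
Then arguing in a quite similar way as in the proof of (3.7) in Subsect. 3.1 we obtain (4.2) with $K_1 := \frac{N+1}{4} \log \frac{N+1}{4C}$.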
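The choice $q = 2(N+1)/(N-1)$ leading to (4.3) can be double-checked with exact fractions. The sketch below uses names of our own (`log_sobolev_exponents`, `effective_dimension`). The last entry records the observation, which is our gloss rather than a statement of the paper, that $q$ is the classical Sobolev exponent $2n/(n-2)$ in dimension $n = N+1$, consistent with the factor $\frac{N+1}{4}$ in (4.2).

```python
from fractions import Fraction


def log_sobolev_exponents(n: int) -> dict:
    """Exponents obtained from (4.1) with q = 2(N+1)/(N-1), as exact fractions."""
    if n < 2:
        raise ValueError("the estimate is stated for N >= 2")
    N = Fraction(n)
    q = 2 * (N + 1) / (N - 1)
    return {
        "q": q,
        # power of d in the weighted L^q norm after setting u = d^{1/2} v
        "weight_power": q / 2 * (N - 2) - N + q / 2,
        # outer exponent 2/q on the right-hand side of (4.3)
        "outer_power": 2 / q,
        # dimension n for which q equals 2n/(n-2) (our interpretation)
        "effective_dimension": N + 1,
    }


if __name__ == "__main__":
    for n in range(2, 7):
        e = log_sobolev_exponents(n)
        print(n, e["q"], e["weight_power"], e["outer_power"])
```

The weight power comes out equal to 1 for every $N$, so the measure on the right-hand side of (4.3) is exactly $d\,dx$, the measure in which the semigroup is studied below.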
Proof of Theorem 4.4. Let $H_0^1(\Omega, d \, dx)$ be the closure of $C_0^\infty(\Omega)$ with respect to the norm
\[
\|v\|_{H^1_{0,d}} := \left( \int_\Omega \left( |\nabla v|^2 d + \frac12 (-\Delta d) v^2 \right) dx \right)^{\frac12}.
\]
Let $\bar{H} := U^{-1} H U$, $U : L^2(\Omega, d \, dx) \to L^2(\Omega)$ being the unitary operator $U v := d^{\frac12} v$, thus $\bar{H} := -\frac{1}{d} \mathrm{div}(d \nabla) - \frac12 \frac{\Delta d}{d}$. To the operator $\bar{H}$ defined in the domain $D(\bar{H}) = \{v \in H_0^1(\Omega, d \, dx) : \bar{H} v \in L^2(\Omega, d \, dx)\}$ is naturally associated the bilinear symmetric form defined as follows: $\bar{H}[v_1, v_2] := \langle \bar{H} v_1, v_2 \rangle_{L^2(\Omega, d \, dx)} = \langle v_1, v_2 \rangle_{H_0^1(\Omega, d \, dx)}$, which is a Dirichlet form. Whence Lemma 1.3.4 and Theorems 1.3.2 and 1.3.3 in [D4] imply that $e^{-\bar{H} t}$, which is an analytic contraction semigroup in $L^2(\Omega, d \, dx)$, is also positivity preserving and a contraction semigroup in $L^p(\Omega, d \, dx)$ for any $1 \le p \le \infty$. As a consequence, for any $t > 0$ and any $p \ge 2$,
\[
e^{-\bar{H} t} [L^2(\Omega, d \, dx) \cap L^\infty(\Omega)]^+ \subset [H_0^1(\Omega, d \, dx) \cap L^p(\Omega, d \, dx) \cap L^\infty(\Omega)]^+;
\]
thus by a density argument the $L^p$ logarithmic Sobolev inequality, which can be deduced as usual from the $L^2$ logarithmic Sobolev inequality (4.2) (see Subsect. 3.1 where a similar argument is used), more generally applies to any function in $\cup_{t>0} \, e^{-\bar{H} t} [L^2(\Omega, d \, dx) \cap L^\infty(\Omega)]^+$. This means that Theorem 2.2.7 in [D4] can be applied, as in Corollary 2.2.8 in [D4], to the operator $\bar{H}$; whence obtaining that
\[
\|e^{-\bar{H} t}\|_{2 \to \infty} \le C t^{-\frac{N+1}{4}},
\]
and by duality that
\[
\|e^{-\bar{H} t}\|_{1 \to 2} \le C t^{-\frac{N+1}{4}},
\]
that is
\[
\|e^{-\bar{H} t}\|_{1 \to \infty} \le C t^{-\frac{N+1}{2}}.
\]
Here we use the following notation:
\[
\|e^{-\bar{H} t}\|_{q \to p} := \sup_{0 < \|f\|_q \le 1} \frac{\|e^{-\bar{H} t} f(x)\|_p}{\|f(x)\|_q},
\]
where $\|f\|_q := \left( \int_\Omega |f|^q d \, dx \right)^{\frac1q}$. This implies, by the Dunford-Pettis theorem, that the semigroup $e^{-\bar{H} t}$ is indeed a semigroup of integral operators; that is, a heat kernel $\bar{h}(t, x, y)$ associated to the semigroup $e^{-\bar{H} t}$ is well defined and satisfies the following pointwise upper bound: $\bar{h}(t, x, y) \le C \frac{1}{t^{1/2}} t^{-\frac{N}{2}}$, for any $x, y \in \Omega$ and any $t > 0$. Theorem 4.4 then follows, due to the fact that, as a consequence of the unitary operator $U$, the heat kernels $h(t, x, y)$ and $\bar{h}(t, x, y)$, corresponding respectively to $H$ and $\bar{H}$, satisfy the following equivalence: $h(t, x, y) \equiv d^{\frac12}(x) d^{\frac12}(y) \, \bar{h}(t, x, y)$.

Remark 4.6. Applying Davies's method of exponential perturbation to the operator $\bar{H}$ (see Sect. 2 in [D3] for details) the upper bound in Theorem 4.4 can be improved by adding a factor $c_\delta \, e^{-\frac{|x-y|^2}{4(1+\delta)t}}$.
Let us now give the sketch of the proof of Theorem 1.4.

Proof of Theorem 1.4. Let us first improve by an exponential-decreasing-in-time factor the global in time upper bound stated in Theorem 4.4. To this end let us define, as in Sect. 2 of [D2], the operator $\tilde{H} := U^{-1}(H - \lambda_1) U$, $U : L^2(\Omega, \varphi_1^2 \, dx) \to L^2(\Omega)$ being the unitary operator $U w := \varphi_1 w$, thus $\tilde{H} := -\frac{1}{\varphi_1^2} \mathrm{div}(\varphi_1^2 \nabla)$. Here $\varphi_1 > 0$ denotes the first eigenfunction and $\lambda_1 > 0$ the first eigenvalue corresponding to the Dirichlet problem $-\Delta \varphi_1 - \frac{1}{4d^2(x)} \varphi_1 = \lambda_1 \varphi_1$ in $\Omega$, $\varphi_1 = 0$ on $\partial\Omega$, normalized in such a way that $\int_\Omega \varphi_1^2(x) \, dx = 1$. Since it is known that there exist two positive constants $c_1$, $c_2$ such that
\[
c_1 d^{\frac12}(x) \le \varphi_1(x) \le c_2 d^{\frac12}(x), \quad \forall\, x \in \Omega \tag{4.4}
\]
(as a consequence of Lemma 7 in [DD]), logarithmic Sobolev inequalities analogous to (4.2) also hold true if we replace $\bar{H}$ by $\tilde{H}$. Thus as a consequence of the Gross theorem, exactly as in the proof of Theorem 4.4, the corresponding heat kernel $\tilde{h}(t, x, y)$ satisfies the same pointwise upper bound as $\bar{h}(t, x, y)$, that is $\tilde{h}(t, x, y) \le C \frac{1}{t^{1/2}} t^{-\frac{N}{2}}$ for any $x, y \in \Omega$ and any $t > 0$. From the definition of $U$, it follows that
\[
h(t, x, y) \equiv \varphi_1(x) \varphi_1(y) \, \tilde{h}(t, x, y) \, e^{-\lambda_1 t}, \tag{4.5}
\]
thus we get that $h(t, x, y) \le C \frac{d^{1/2}(x) d^{1/2}(y)}{t^{1/2}} t^{-\frac{N}{2}} e^{-\lambda_1 t}$ for any $x, y \in \Omega$ and any $t > 0$.

Finally, arguing as in Theorem 6 of [D1], an analogous lower estimate can be easily deduced (we refer to the proof of Theorem 1.2 where a similar argument is used).

4.3. Complete sharp description of the heat kernel for small values of time. In this section we prove the two-sided sharp estimate on the heat kernel $h(t, x, y)$ stated for small time in Theorem 1.3. Before doing so let us observe that Theorem 1.5 entails also the following parabolic Harnack inequality for the Schrödinger operator having critical singularity at the boundary.

Theorem 4.7. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exist positive constants $C_H$ and $R = R(\Omega)$ such that for $x \in \Omega$, $0 < r < R$ and for any positive solution $u(y, t)$ of $\frac{\partial u}{\partial t} = \Delta u + \frac{1}{4d^2(y)} u$ in $\{B(x, r) \cap \Omega\} \times (0, r^2)$ we have the estimate
\[
\operatorname*{ess\,sup}_{(y,t) \in \{B(x, \frac{r}{2}) \cap \Omega\} \times (\frac{r^2}{4}, \frac{r^2}{2})} u(y, t) \, d^{-\frac12}(y) \le C_H \operatorname*{ess\,inf}_{(y,t) \in \{B(x, \frac{r}{2}) \cap \Omega\} \times (\frac34 r^2, r^2)} u(y, t) \, d^{-\frac12}(y).
\]

Proof. As a first step let us observe that if $u$ satisfies $u_t = -H u$ then $v(y, t) := e^{\lambda_1 t} \varphi_1(y)^{-1} u(y, t)$ satisfies $v_t = -\tilde{H} v$. Whence as a consequence of (4.4), due to Remark 2.16, $v$ satisfies Theorem 1.5 for $\alpha = 1$. From this the claim can be easily deduced.

The proof of Corollary 1.7 is similar to the above proof of Theorem 4.7, thus we omit the details. We are now ready to prove Theorem 1.3.
Proof of Theorem 1.3. Since for any $x \in \Omega$ and for some positive constants $c_1$, $c_2$ we have the estimate $c_1 d^{\frac{\alpha}{2}}(x) \le \varphi_1(x) \le c_2 d^{\frac{\alpha}{2}}(x)$ for $\alpha = 1$, we can apply the result of Theorem 2.15 to the operator $\tilde{H} = -\frac{1}{\varphi_1^2(x)} \mathrm{div}(\varphi_1^2(x) \nabla)$. Hence due to (4.5) the result follows immediately.
The proof of Corollary 1.8 is similar to the above proofs of Theorems 1.3 and 1.4, thus we omit the details.

Let us finally make some remarks concerning Schrödinger operators having potential $V(x) = c \, d^{-2}(x)$. Arguing as in Lemma 7 in [DD] one can easily prove that the first Dirichlet eigenfunction for the Schrödinger operator $-\Delta - \frac{c}{d^2(x)}$, $0 < c < \frac14$, behaves like $d^{\frac{\alpha}{2}}$ on all $\Omega$, for $\alpha := 1 + \sqrt{1 - 4c}$. Then we have:

Theorem 4.8. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exist positive constants $C_1$, $C_2$, with $C_1 \le C_2$, and $T > 0$ depending on $\Omega$ such that
\[
C_1 \min\left\{ 1, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{t^{\frac{\alpha}{2}}} \right\} t^{-\frac{N}{2}} e^{-C_2 \frac{|x-y|^2}{t}} \le h_c(t, x, y) \le C_2 \min\left\{ 1, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{t^{\frac{\alpha}{2}}} \right\} t^{-\frac{N}{2}} e^{-C_1 \frac{|x-y|^2}{t}}
\]
for all $x, y \in \Omega$ and $0 < t \le T$; where $h_c(t, x, y)$ denotes the heat kernel associated to the operator $-\Delta - \frac{c}{d^2(x)}$ in $\Omega$ under Dirichlet boundary conditions, for any $0 < c < \frac14$ and $\alpha := 1 + \sqrt{1 - 4c}$.

Theorem 4.9. For $N \ge 2$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded and convex domain. Then there exist two positive constants $C_1$, $C_2$, with $C_1 \le C_2$, such that
\[
C_1 \, d^{\frac{\alpha}{2}}(x) \, d^{\frac{\alpha}{2}}(y) \, e^{-\lambda_1 t} \le h_c(t, x, y) \le C_2 \, d^{\frac{\alpha}{2}}(x) \, d^{\frac{\alpha}{2}}(y) \, e^{-\lambda_1 t}
\]
for all $x, y \in \Omega$ and $t > 0$ large enough; where $h_c(t, x, y)$ denotes the heat kernel associated to the operator $-\Delta - \frac{c}{d^2(x)}$ in $\Omega$ under Dirichlet boundary conditions, $\lambda_1$ its (positive) elliptic first eigenvalue, for any $0 < c < \frac14$ and $\alpha := 1 + \sqrt{1 - 4c}$.

Remark 4.10. Let us at this point remark that Theorems 1.3 and 4.8, as well as Theorem 4.7, concerning respectively sharp asymptotics for small time and the parabolic Harnack inequality up to the boundary for the Schrödinger operator having potential $V(x) = c \, d^{-2}(x)$, hold true also without any convexity assumption on the domain under consideration.
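To get a feel for the shape of the small-time bounds in Theorem 4.8, the common profile on both sides can be evaluated directly. This is only an illustrative sketch: the theorem asserts that suitable constants $C_1 \le C_2$ exist but gives no values, so the parameter `gauss_const` below is a hypothetical placeholder, and the function names are ours.

```python
import math


def boundary_exponent(c: float) -> float:
    """alpha = 1 + sqrt(1 - 4c) for the potential c / d^2, 0 < c < 1/4."""
    if not (0 < c < 0.25):
        raise ValueError("need 0 < c < 1/4")
    return 1 + math.sqrt(1 - 4 * c)


def small_time_profile(dx: float, dy: float, dist: float, t: float,
                       alpha: float, n: int, gauss_const: float = 1.0) -> float:
    """min{1, d(x)^{a/2} d(y)^{a/2} / t^{a/2}} * t^{-N/2} * exp(-K |x-y|^2 / t).

    dx, dy are the distances of x and y to the boundary, dist is |x - y|,
    and gauss_const (K) stands in for the unspecified constants C_1, C_2.
    """
    if t <= 0 or dx < 0 or dy < 0:
        raise ValueError("need t > 0 and non-negative boundary distances")
    boundary_factor = min(1.0, (dx * dy / t) ** (alpha / 2))
    return boundary_factor * t ** (-n / 2) * math.exp(-gauss_const * dist ** 2 / t)


if __name__ == "__main__":
    a = boundary_exponent(3 / 16)
    # Far from the boundary (d >= sqrt(t)) the boundary factor saturates at 1.
    print(small_time_profile(0.5, 0.5, 0.0, 0.01, a, 3))
    # Close to the boundary the profile decays like d(x)^{a/2} d(y)^{a/2}.
    print(small_time_profile(0.001, 0.001, 0.0, 0.01, a, 3))
```

The two regimes visible here (interior Gaussian behavior versus boundary decay governed by $\alpha$) are exactly what the minimum in the theorem encodes.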
4.4. On Davies conjecture. In this subsection we consider Davies conjecture. For this we suppose that $\hat{E}$ denotes the self-adjoint operator associated with the closure of the positive quadratic form
\[
Q(f) = \int_\Omega \left( \sum_{i,j=1}^N a_{i,j}(x) \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} - V f^2 \right) dx,
\]
initially defined on $C_0^\infty(\Omega)$; where $(a_{i,j}(x))_{N \times N}$ is a measurable symmetric uniformly elliptic matrix such that
\[
\sum_{i,j=1}^N a_{i,j}(x) \xi_i \xi_j \ge |\xi|^2
\]
and $V$ is a potential on $\Omega$ such that
\[
V(x) = V_1(x) + V_2(x), \tag{4.6}
\]
where
\[
|V_1(x)| \le \frac{1}{4d^2(x)}, \quad V_2(x) \in L^p(\Omega), \quad p > \frac{N}{2}. \tag{4.7}
\]
We also suppose that
\[
\lambda_1 := \inf_{0 \ne \varphi \in C_0^\infty(\Omega)} \frac{Q(\varphi)}{\int_\Omega \varphi^2 \, dx} > 0, \tag{4.8}
\]
and that to $\lambda_1$ there corresponds a positive eigenfunction $\varphi_1$ satisfying for all $x \in \Omega$ the following estimate,
\[
c_1 d^{\frac{\alpha}{2}}(x) \le \varphi_1(x) \le c_2 d^{\frac{\alpha}{2}}(x), \quad \text{for some } \alpha \ge 1 \tag{4.9}
\]
and for $c_1$, $c_2$ two positive constants.

Thus $\hat{E}$ is defined on the closure of $C_0^\infty(\Omega)$ with respect to the norm defined by the quadratic form $Q$. Then as before it can be shown that $\hat{E}$ is a well defined nonnegative self-adjoint operator on $L^2(\Omega)$ such that for every $t > 0$, $e^{-\hat{E} t}$ has an integral kernel, that is $e^{-\hat{E} t} u_0(x) := \int_\Omega \hat{e}(t, x, y) u_0(y) \, dy$, and if $N \ge 3$ a Green function $G_{\hat{E}}(x, y) = \int_0^\infty \hat{e}(t, x, y) \, dt$ denoting the kernel of $\hat{E}^{-1}$.

Theorem 4.11. For $N \ge 3$, let $\Omega \subset \mathbb{R}^N$ be a smooth bounded domain. Suppose that (4.6), (4.7), (4.8) and (4.9) are satisfied. Then there exist two positive constants $C_1$, $C_2$, with $C_1 \le C_2$, such that for any $x, y \in \Omega$,
\[
C_1 \min\left\{ \frac{1}{|x-y|^{N-2}}, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{|x-y|^{N+\alpha-2}} \right\} \le G_{\hat{E}}(x, y) \le C_2 \min\left\{ \frac{1}{|x-y|^{N-2}}, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{|x-y|^{N+\alpha-2}} \right\}.
\]

Davies conjecture is stated under slightly stronger assumptions on $V$ than (4.7) and on $\varphi_1$ (only when $\alpha = 2$).

Proof. We note that we have $E_1 := -\frac{1}{\varphi_1^2} \mathrm{div}(\varphi_1^2 \nabla) \equiv U^{-1}(\hat{E} - \lambda_1) U$, $U : L^2(\Omega, \varphi_1^2 \, dx) \to L^2(\Omega)$ being the unitary operator $U w := \varphi_1 w$; hence we have the following relationship between heat kernels:
\[
\hat{e}(t, x, y) = \varphi_1(x) \varphi_1(y) \, e_1(t, x, y) \, e^{-\lambda_1 t}. \tag{4.10}
\]
Due to (4.9) we can apply Theorem 2.15 to the operator $E_1$. Hence due to (4.10), for two positive constants $C_1 \le C_2$, we have for small time
\[
C_1 \min\left\{ 1, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{t^{\frac{\alpha}{2}}} \right\} t^{-\frac{N}{2}} e^{-C_2 \frac{|x-y|^2}{t}} \le \hat{e}(t, x, y) \le C_2 \min\left\{ 1, \frac{d^{\frac{\alpha}{2}}(x) d^{\frac{\alpha}{2}}(y)}{t^{\frac{\alpha}{2}}} \right\} t^{-\frac{N}{2}} e^{-C_1 \frac{|x-y|^2}{t}}. \tag{4.11}
\]
On the other hand, for large time
\[
C_1 \, d^{\frac{\alpha}{2}}(x) \, d^{\frac{\alpha}{2}}(y) \, e^{-\lambda_1 t} \le \hat{e}(t, x, y) \le C_2 \, d^{\frac{\alpha}{2}}(x) \, d^{\frac{\alpha}{2}}(y) \, e^{-\lambda_1 t}, \tag{4.12}
\]
for all $x, y \in \Omega$. To obtain this estimate we need to prove a global Sobolev inequality on $\Omega$, which can be easily deduced from its local version (2.12) as well as (2.18) with $\lambda = 0$ there, by means of a partition of unity as in [K]. Then the result follows by integrating $\hat{e}(t, x, y)$ in the time variable.

Acknowledgement. This work was largely done whilst the second author was visiting the University of Crete and FORTH in Heraklion, the hospitality of which is acknowledged. This research has been partially supported by the RTN European network Fronts–Singularities, HPRN-CT-2002-00274.
References

[A] Aronson, D.G.: Bounds for the fundamental solution of a parabolic equation. Bull. Amer. Math. Soc. 73, 890–896 (1967)
[BG] Baras, P., Goldstein, J.: The heat equation with a singular potential. Trans. Amer. Math. Soc. 284, 121–139 (1984)
[BFT1] Barbatis, G., Filippas, S., Tertikas, A.: A unified approach to improved L^p Hardy inequalities with best constants. Trans. Amer. Math. Soc. 356(6), 2169–2196 (2003)
[BFT2] Barbatis, G., Filippas, S., Tertikas, A.: Critical heat kernel estimates for Schrödinger operators via Hardy-Sobolev inequalities. J. Funct. Anal. 208, 1–30 (2004)
[BM] Brezis, H., Marcus, M.: Hardy's inequalities revised. Dedicated to Ennio De Giorgi. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 25(1–2), 217–237 (1997)
[BV] Brezis, H., Vazquez, J.L.: Blow-up solutions of some nonlinear elliptic problems. Rev. Mat. Univ. Complut. Madrid 10(2), 443–469 (1997)
[CM] Cabré, X., Martel, Y.: Existence versus explosion instantanée pour des équations de la chaleur linéaires avec potentiel singulier. C.R. Acad. Sci. Paris Ser. I Math. 329(11), 973–978 (1999)
[CS] Chiarenza, F.M., Serapioni, R.P.: A remark on a Harnack inequality for degenerate parabolic equations. Rend. Sem. Mat. Univ. Padova 73, 179–190 (1985)
[D1] Davies, E.B.: Perturbations of ultracontractive semigroups. Quart. J. Math. Oxford 2(37), 167–176 (1986)
[D2] Davies, E.B.: The equivalence of certain heat kernel and Green function bounds. J. Funct. Anal. 71, 88–103 (1987)
[D3] Davies, E.B.: Explicit constants for Gaussian upper bounds on heat kernels. Amer. J. of Math. 109, 319–334 (1987)
[D4] Davies, E.B.: Heat kernels and spectral theory. Cambridge: Cambridge University Press, 1989
[DS1] Davies, E.B., Simon, B.: Ultracontractivity and the heat kernels for Schrödinger operators and Dirichlet Laplacians. J. Funct. Anal. 59, 335–395 (1984)
[DS2] Davies, E.B., Simon, B.: L^p norms of non-critical Schrödinger semigroups. J. Funct. Anal. 102, 95–115 (1991)
[DD] Dávila, J., Dupaigne, L.: Hardy type inequalities. J. Eur. Math. Soc. 6(3), 335–365 (2004)
[DG] De Giorgi, E.: Sulla differenziabilitá e l'analiticitá delle estremali degli integrali multipli regolari. Mem. Accad. Sci. Torino Cl. Sci. Fis. Mat. Nat. (3), n. 3, 25–43 (1957)
[FGS] Fabes, E.B., Garofalo, N., Salsa, S.: A backward Harnack inequality and Fatou Theorem for nonnegative solutions of parabolic equations. Ill. J. Math. 30(4), 536–565 (1986)
[FKJ] Fabes, E.B., Kenig, C.E., Jerison, D.: Boundary behavior of solutions to degenerate elliptic equations. Conference on harmonic analysis in honor of Antoni Zygmund, Vol. I, II (Chicago, Ill., 1981), Wadsworth Math. Ser., Belmont, CA: Wadsworth, 1983, pp. 577–589
[FKS] Fabes, E.B., Kenig, C.E., Serapioni, R.P.: The local regularity of solutions of degenerate elliptic equations. Comm. Part. Diff. Eq. 7, 77–116 (1982)
[FS] Fabes, E.B., Stroock, D.W.: A new proof of Moser's parabolic Harnack inequality via the old ideas of Nash. Arch. Rat. Mech. Anal. 96, 327–338 (1986)
[FMT1] Filippas, S., Maz'ya, V.G., Tertikas, A.: Sharp Hardy-Sobolev inequalities. C.R. Math. Acad. Sci. Paris 339(7), 483–486 (2004)
[FMT2] Filippas, S., Maz'ya, V.G., Tertikas, A.: Critical Hardy-Sobolev inequalities. J. Math. Pures Appl. 87, 37–56 (2007)
[FT] Filippas, S., Tertikas, A.: Optimizing Improved Hardy inequalities. J. Funct. Anal. 192, 186–233 (2002)
[Ga] Gao, P.: The boundary Harnack principle for some degenerate elliptic operators. Comm. Partial Differ. Eq. 18(12), 2001–2022 (1993)
[G1] Grigoryan, A.: The heat equation on non-compact Riemannian manifolds (in Russian). Matem. Sbornik 182(1), 55–87 (1991). Engl. Transl.: Math. USSR Sb. 72(1), 47–77 (1992)
[G2] Grigoryan, A.: Heat kernels on weighted manifolds and applications. Cont. Math. 398, 93–191 (2006)
[GSC] Grigoryan, A., Saloff-Coste, L.: Stability results for Harnack inequalities. Ann. Inst. Fourier, Grenoble 55(3), 825–890 (2005)
[K] Kufner, A.: Weighted Sobolev spaces. Teubner-Texte zur Mathematik, 31, Stuttgart: Teubner, 1981
[KO] Kufner, A., Opic, B.: Hardy type inequalities. Pitman Research Notes in Math Series 219, London: Pitman, 1990
[LY] Li, P., Yau, S.-T.: On the parabolic kernel of the Schrödinger operator. Acta Math. 156(3–4), 153–201 (1986)
[M] Mazya, V.G.: Sobolev spaces. Berlin-Heidelberg-New York: Springer-Verlag, 1985
[MS] Milman, P.D., Semenov, Yu.A.: Global heat kernel bounds via desingularizing weights. J. Funct. Anal. 212, 373–398 (2004)
[MT1] Moschini, L., Tesei, A.: Harnack inequality and heat kernel estimates for the Schrödinger operator with Hardy potential. Rend. Mat. Acc. Lincei 16, 171–180 (2005)
[MT2] Moschini, L., Tesei, A.: Parabolic Harnack inequality for the heat equation with inverse-square potential. To appear in Forum Mathematicum
[Mo1] Moser, J.: On Harnack's theorem for elliptic differential equations. Comm. Pure Appl. Math. 14, 577–591 (1961)
[Mo2] Moser, J.: A Harnack inequality for parabolic differential equations. Comm. Pure Appl. Math. 17, 101–134 (1964); Correction: 20, 231–236 (1967)
[N] Nash, J.: Continuity of solutions of parabolic and elliptic equations. Amer. J. Math. 80, 931–954 (1958)
[SC1] Saloff-Coste, L.: A note on Poincaré, Sobolev, and Harnack inequalities. Internat. Math. Res. Notices 2, 27–38 (1992)
[SC2] Saloff-Coste, L.: Aspects of Sobolev-type inequalities. London Math. Soc. Lecture Notes Series 289, Cambridge: Cambridge University Press, 2002
[VZ] Vázquez, J.L., Zuazua, E.: The Hardy inequality and the asymptotic behaviour of the heat equation with an inverse-square potential. J. Funct. Anal. 173, 103–153 (2000)
[Z] Zhang, Qi. S.: The boundary behaviour of heat kernels of Dirichlet Laplacians. J. Diff. Eq. 182, 416–430 (2002)

Communicated by B. Simon
Commun. Math. Phys. 273, 283–304 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0226-2
Communications in
Mathematical Physics
Design of Hyperbolic Billiards Maciej P. Wojtkowski Department of Mathematics and Physics, University of Szczecin, ul. Wielkopolska 15, 70-451 Szczecin, Poland. E-mail: [email protected] Received: 27 February 2006 / Accepted: 8 September 2006 Published online: 11 May 2007 – © M. P. Wojtkowski 2007
Reproduction of the entire article for non-commercial purposes is permitted without charge.

Abstract: We formulate a general framework for the construction of hyperbolic billiards. Spherical symmetry is exploited for a simple treatment of billiards with spherical caps and soft billiards in higher dimensions. Other examples include the Papenbrock stadium.

1. Introduction

The purpose of this paper is to present a general framework for the construction of hyperbolic billiards, especially with some convex pieces in the boundary, and also "soft" billiards. Most of the examples of hyperbolic billiards that have been constructed to date can be understood in this framework, with the notable exception of systems studied in [W5, W6].

Billiards are a class of dynamical systems with an appealingly simple description. A point particle moves with constant velocity in a box of arbitrary dimension ("the billiard table") and reflects elastically from the boundary (the component of velocity perpendicular to the boundary is reversed and the parallel component is preserved). Mathematically it is a class of Hamiltonian systems with collisions defined by symplectic maps on the boundary of the phase space, [W1]. Such systems are also called hybrid systems, being a concatenation of continuous time and discrete time dynamics.

The billiard dynamics defines a one parameter group of maps $\Phi^t$ of the phase space which preserve the Lebesgue measure, and are in general only measurable due to discontinuities. The boundaries of the box are made up of pieces: concave, convex and/or flat. Discontinuities occur in particular at the orbits tangent to concave pieces of the boundary of the box. The orbits hitting two adjacent pieces ("corners") have two natural continuations, which is another source of discontinuity. These singularities are not too severe so that the flow has well defined Lyapunov exponents and Pesin structural theory
is applicable, [K-S]. A billiard system is called hyperbolic if it has nonzero Lyapunov exponents almost everywhere (or at least on a subset of positive Lebesgue measure). It is called completely hyperbolic if all of its Lyapunov exponents are nonzero almost everywhere, except for one zero exponent in the direction of the flow. Billiards in smooth strictly convex domains have no singularities, but no such systems are known to be hyperbolic. In dimension 2 Lazutkin showed that near the boundary of such domains the system is near integrable. Applying the KAM theory he proved that for these “grazing orbits” there is always a family of invariant curves with positive total measure in the phase space, and with zero Lyapunov exponents. In general billiards exhibit mixed behavior just like other hamiltonian systems; there are invariant tori intertwined with the “chaotic sea”. In hyperbolic billiards stable behavior is excluded by the choice of the pieces in the boundary of the box, arbitrary concave pieces and special convex ones, and their particular placement, usually separation. Thus hyperbolicity is achieved by design, as in optical instruments. Hyperbolicity is the universal mechanism for random behavior in deterministic dynamical systems. Under additional assumptions it leads to ergodicity, mixing, K-property, Bernoulli property, decay of correlations, central limit theorem, and other stochastic properties. Hyperbolic billiards provide a natural class of examples for which these properties were extensively studied. In this article we restrict ourselves to hyperbolicity itself. The most prominent example of a hyperbolic billiard is the gas of hard spheres. This way of looking at the system was developed in the groundbreaking papers of Sinai, see [Ch-S] for an exhaustive list of references. The excellent collection of papers, [H], contains more up to date information. An important source on hyperbolic billiards is the book by Chernov and Markarian, [Ch-M]. 
The books by Kozlov and Treschev [K-T], and by Tabachnikov [T] provide broad surveys of billiards from different perspectives.

2. Jacobi Fields and Monotonicity

The key to understanding hyperbolicity in billiards lies in two essentially equivalent descriptions of infinitesimal families of trajectories. The basic notion is that of a Jacobi field along a billiard trajectory. Let $\gamma(t, u)$ be a family of billiard trajectories, where $t$ is time and $u$ is a parameter, $|u| < \epsilon$, for some $\epsilon > 0$. A Jacobi field $J(t)$ along $\gamma(t) = \gamma(t, 0)$ is defined by $J(t) = \frac{\partial\gamma}{\partial u}\big|_{u=0}$. Jacobi fields form a finite dimensional vector space that can be naturally identified with the tangent space of the phase space at any point on the trajectory. Jacobi fields contain the same information as the derivatives of the billiard flow $D\Phi^t$. Indeed, if we treat $(J(0), J'(0))$ and $(J(t), J'(t))$ as tangent vectors at $(\gamma(0), \gamma'(0))$ and $(\gamma(t), \gamma'(t))$ respectively, then $D\Phi^t (J(0), J'(0)) = (J(t), J'(t))$. In particular the Lyapunov exponents are the exponential rates of growth of Jacobi fields.

Jacobi fields split naturally into parallel and perpendicular components to the trajectory, each of them a Jacobi field in its own right. The parallel Jacobi field carries the zero Lyapunov exponent. In the following we discuss only the perpendicular Jacobi fields until Sects. 6, 7 and 8, where we are forced to consider general Jacobi fields. They form a codimension 1 subspace in the tangent to the unit tangent bundle, i.e., the phase space restricted by the condition that velocity has length one.

Since the billiard trajectories are geodesics of the Euclidean metric, the Jacobi fields satisfy between collisions the differential equation $J'' = 0$, and hence
\[
J(t) = J(0) + t J'(0). \tag{1}
\]
Design of Hyperbolic Billiards
At a collision a Jacobi field undergoes a change by the map

$$J(t_c^+) = R J(t_c^-), \qquad J'(t_c^+) = R J'(t_c^-) + P^* K P J(t_c^+), \tag{2}$$
where $J(t_c^-)$ and $J(t_c^+)$ are Jacobi fields immediately before and after collision, $K$ is the shape operator of the piece of the boundary ($K = \nabla n$, $n$ is the inside unit normal to the boundary), and $P$ is the projection along the velocity vector from the hyperplane perpendicular to the orbit to the hyperplane tangent to the boundary. Finally $R$ is the orthogonal reflection in the hyperplane tangent to the boundary.

To measure the growth/decay of Jacobi fields we introduce a quadratic form in the tangent spaces, or equivalently on Jacobi fields, $Q(J, J') = \langle J, J' \rangle$. Evaluation of $Q$ on a Jacobi field is a function of time $Q(t)$.

Definition 1. A billiard trajectory $\gamma(t)$ is monotone on a Jacobi field $J$, between its two points $q_0 = \gamma(0)$ and $q_1 = \gamma(t_1)$, or equivalently between time $0$ and time $t_1$, if $Q(t_1) \ge Q(0)$; it is strictly monotone on $J$ if $Q(t_1) > Q(0)$. A billiard trajectory $\gamma(t)$ is called (strictly) monotone between its two points $q_0 = \gamma(0)$ and $q_1 = \gamma(t_1)$ (or between time $0$ and time $t_1$), if it is (strictly) monotone on any nonzero Jacobi field $J(t)$ between $q_0$ and $q_1$. A nonzero Jacobi field is called parabolic between time $0$ and time $t_1$ if $J'(0) = 0$ and $J'(t_1) = 0$. A billiard trajectory $\gamma(t)$ is called parabolic between its two points $q_0 = \gamma(0)$ and $q_1 = \gamma(t_1)$, if it has a Jacobi field $J(t)$ which is parabolic between time $0$ and time $t_1$ (i.e., $J'(0) = J'(t_1) = 0$). It is called completely parabolic if for every Jacobi field $J'(0) = 0$ implies $J'(t_1) = 0$.

Clearly any trajectory which is strictly monotone between $q_0$ and $q_1$ cannot be parabolic between $q_0$ and $q_1$. Due to the reversibility of the billiard motion monotonicity is the property of a trajectory in the configuration space with chosen points $q_0$ and $q_1$. Indeed, if it holds for the trajectory traversed from $q_0$ to $q_1$ then it also holds for the reversed trajectory from $q_1$ to $q_0$. This is the subject of the following

Lemma 2.
If a trajectory is (strictly) monotone between its two points $q_0$ and $q_1$ then the reversed trajectory is also (strictly) monotone between $q_1$ and $q_0$.

Proof. Let us consider a nonzero Jacobi field $J$ along the orbit $\gamma(t)$, $0 \le t \le T$. The orbit $\tilde\gamma(t) = \gamma(T - t)$ is the reversed orbit and $\tilde J(t) = J(T - t)$ is a Jacobi field along the orbit $\tilde\gamma(t)$. We get further $\tilde J'(t) = -J'(T - t)$, so that $\langle \tilde J(t), \tilde J'(t) \rangle = -Q(T - t)$, and the changes of $Q$ on $J$ along the orbit $\gamma$ and on $\tilde J$ along the orbit $\tilde\gamma$ are the same.

It follows from (1) that if there are no collisions between two points on a trajectory then the trajectory is monotone between the points. Indeed $Q(t) - Q(0) = t|J'(0)|^2$. We have further

Lemma 3. If a trajectory is monotone between two noncollision points $q_0$ and $q_1$ then it is also monotone between a point $\tilde q_0$ earlier than $q_0$ ($\tilde q_0 < q_0$) and a point $\tilde q_1$ later than $q_1$ ($q_1 < \tilde q_1$), provided that there are no collisions between $\tilde q_0$ and $q_0$, and $q_1$ and $\tilde q_1$, respectively. Moreover for two such points $\tilde q_0 < q_0$ and $\tilde q_1 > q_1$ either the orbit is strictly monotone between $\tilde q_0$ and $\tilde q_1$ or it is parabolic between the points.
M. P. Wojtkowski
Proof. The first part of the lemma is obvious. To prove the second part let us consider a nonzero Jacobi field $J$. If $J'$ does not vanish at $q_0$ then the form $Q$ increases strictly on $J$ between $\tilde q_0$ and $q_0$. Similarly if $J'$ does not vanish at $q_1$ then the form $Q$ increases strictly between $q_1$ and $\tilde q_1$.

It turns out that a sufficiently long “free flight” ensures monotonicity for most trajectories.

Proposition 4. If a trajectory is not parabolic between two noncollision points $q_0$ and $q_1$ then it is strictly monotone between a point $\tilde q_0$ sufficiently earlier than $q_0$ and a point $\tilde q_1$ sufficiently later than $q_1$, provided that there are no collisions between $\tilde q_0$ and $q_0$, and $q_1$ and $\tilde q_1$, respectively. If a trajectory is completely parabolic between two noncollision points $q_0$ and $q_1$ then it is monotone between a point $\tilde q_0$ sufficiently earlier than $q_0$ and a point $\tilde q_1$ sufficiently later than $q_1$, provided that there are no collisions between $\tilde q_0$ and $q_0$, and $q_1$ and $\tilde q_1$, respectively.

More precisely we extend the segments of the trajectory containing $q_0$ and $q_1$ into rays, which allows us to go arbitrarily far in the past and/or arbitrarily far in the future without any new collisions. Note also that in dimension 2 we can apply this result to any trajectory since then any parabolic trajectory is obviously completely parabolic. However for trajectories close to parabolic the necessary interval of free flight may be unbounded.

Proof. Let $q_0 = \gamma(0)$ and $q_1 = \gamma(t_1)$. We are seeking $T > 0$ so large that $Q(t_1 + T) > Q(-T)$ for any nonzero Jacobi field $J$. In other words we want the quadratic form $Q(t_1 + T) - Q(-T)$ on perpendicular Jacobi fields to be positive definite. We have

$$Q(t_1 + T) - Q(-T) = \bigl(Q(t_1 + T) - Q(t_1)\bigr) + \bigl(Q(t_1) - Q(0)\bigr) + \bigl(Q(0) - Q(-T)\bigr) = T\bigl(|J'(0)|^2 + |J'(t_1)|^2\bigr) + Q(t_1) - Q(0). \tag{3}$$
If a trajectory is not parabolic between $q_0$ and $q_1$ then $|J'(0)|^2 + |J'(t_1)|^2$ is a positive definite quadratic form on perpendicular Jacobi fields. It follows that for sufficiently large $T$ the quadratic form (3) is also positive definite. This proves the first part of the proposition. To prove the second part let us recall that due to the symplectic nature of the billiard dynamics for trajectories completely parabolic between time $0$ and time $t_1$ we have $J'(t_1) = A J'(0)$, $J(t_1) = A^{*-1} J(0) + B J'(0)$, for some linear operators $A$ and $B$ such that $A^* B$ is symmetric, see for example [W1]. Using (3) we get

$$Q(t_1 + T) - Q(-T) = T\bigl(|J'(0)|^2 + |A J'(0)|^2\bigr) + \langle A^* B J'(0), J'(0) \rangle.$$

Clearly the quadratic form is positive semidefinite for sufficiently large $T > 0$, and hence the trajectory is monotone, but not strictly monotone.

By (2) the monotonicity of a trajectory at a collision, i.e., $Q(t_c^+) \ge Q(t_c^-)$, is equivalent to the positive semidefiniteness of the shape operator, $K \ge 0$; it holds for concave pieces of the boundary. Billiards with only concave and flat pieces of the boundary are called semidispersing. If $K > 0$ at a point of collision then by Lemma 3 we have
strict monotonicity between a point before the collision and a point after the collision. In semidispersing billiards, where $K \ge 0$, $K \ne 0$, strict monotonicity may still occur after sufficiently many reflections.

Definition 5. We say that a billiard system is eventually strictly monotone (ESM) on a subset $X$ of positive Lebesgue measure in the phase space, if for almost every trajectory beginning in $X$ there is a return time $t_1$ to $X$ such that the trajectory is strictly monotone between time $0$ and time $t_1$.

The role of monotonicity is revealed in the following

Theorem 6. [W1] If a billiard system is ESM on a subset $X$ of the phase space then for almost every orbit passing through $X$ all Lyapunov exponents are different from zero.

Theorem 6 is formulated here for billiard systems. However it can be generalized and applied to other systems, not even hamiltonian (see [W2] for precise formulations, references and the history of this idea).

3. Wave Fronts and Monotonicity

There is a geometric formulation of monotonicity (which historically preceded the one given above). Let us consider a local wavefront, i.e., a local hypersurface $W(0)$ perpendicular to a trajectory $\gamma(t)$ at $t = 0$. Let us consider further all billiard trajectories perpendicular to $W(0)$. The points on these trajectories at time $t$ form a local hypersurface $W(t)$ perpendicular again to the trajectory (warning: in general at exceptional moments of time the wavefront $W(t)$ is singular). Infinitesimally wavefronts are described by the shape operator $U = \nabla n$, where $n$ is the unit normal field. $U$ is a symmetric operator on the hyperplane tangent to the wavefront (and perpendicular to the trajectory $\gamma(t)$). The evolution of infinitesimal wavefronts is given by the formulas

$$U(t) = \bigl(tI + U(0)^{-1}\bigr)^{-1} \ \text{without collisions}, \qquad U(t_c^+) = R\,U(t_c^-)\,R^{-1} + P^* K P \ \text{at a collision}. \tag{4}$$
It follows that between collisions a wavefront that is initially convex (i.e., diverging, or $U > 0$) will stay convex. Moreover any wavefront after a sufficiently long run without collisions will become convex (after which the normal curvatures of the wavefront will be decreasing). The second part of (4) shows that after a reflection in a strictly concave boundary a convex wavefront becomes strictly convex (and its normal curvatures increase). These properties are equivalent to (strict) monotonicity as formulated in Definition 1. Indeed in the language of Jacobi fields an infinitesimal wavefront represents a linear subspace in the space of perpendicular Jacobi fields (i.e., the tangent space). Moreover it is a Lagrangian subspace with respect to the standard symplectic form. We can follow individual Jacobi fields or whole subspaces of them. It explains the parallel of (1), (2) and (4). The form $Q$ allows the introduction of positive and negative Jacobi fields and positive and negative Lagrangian subspaces. An infinitesimal convex wavefront represents a positive Lagrangian subspace. Monotonicity is equivalent to the property that for every positive Lagrangian subspace at time $0$ its image under the derivative of the flow $D\Phi^t$ is also positive. It may seem that there is loss of information in formulas (4) compared to (1) and (2). However the symplectic nature of the dynamics makes them actually equivalent, [W1].
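The free-flight part of (4) can be illustrated in dimension 2, where the shape operator of a wavefront reduces to a single signed curvature. The following is a minimal sketch under that assumption; the function name `free_flight_curvature` and the sample values are hypothetical and not taken from the paper. It uses the scalar identity $(t + 1/\kappa_0)^{-1} = \kappa_0/(1 + t\kappa_0)$, which also covers the flat front $\kappa_0 = 0$.

```python
import math


def free_flight_curvature(kappa0: float, t: float) -> float:
    """Scalar (dimension 2) case of U(t) = (t I + U(0)^{-1})^{-1}.

    Rewritten as kappa0 / (1 + t * kappa0) so that a flat front
    (kappa0 = 0) needs no special treatment.
    kappa0 > 0: diverging (convex) front, kappa0 < 0: converging front.
    Returns math.inf at the focusing moment, where the front is singular.
    """
    denom = 1.0 + t * kappa0
    if denom == 0.0:
        return math.inf
    return kappa0 / denom


if __name__ == "__main__":
    # Illustrative values only: a diverging, a flat and a converging front.
    for kappa0 in (2.0, 0.0, -1.0):
        values = [free_flight_curvature(kappa0, t) for t in (0.5, 2.0, 10.0)]
        print(kappa0, values)
```

In this sketch a diverging front stays diverging with decreasing curvature, while a converging front with $\kappa_0 < 0$ passes through a focus at $t = -1/\kappa_0$ and is diverging afterwards, in line with the two statements made in the text about free flight.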
4. Design of Hyperbolic Billiards

In view of (4) it seems that a convex piece in the boundary ($K < 0$) excludes monotonicity. There are two ways around this apparent obstacle to hyperbolicity. First we could change the quadratic form $Q$ at the convex boundary and consider monotonicity with respect to the modified form $Q$. We follow here another path. We treat convex pieces as “black boxes” and look only at incoming and outgoing trajectories. The first strategy is presented in [W1]. Although the second strategy seems more restrictive, the examples of hyperbolic billiards constructed to date fit the black box scenario with few exceptions, [W5, W6].

To introduce this approach let us consider a billiard table with flat pieces of the boundary and exactly one convex piece. A trajectory in such a billiard experiences visits to the convex piece separated by arbitrarily long sequences of collisions in flat pieces, which do not affect the geometry of a wavefront at all. Hence whatever is the geometry of a wavefront emerging from the curved piece it will become convex and very flat by the time it comes back to the curved piece of the boundary again. Hence it follows, at least heuristically, that we must study the complete passage through the convex piece of the boundary, regarding its effect on convex, and especially flat, wavefronts.

An important difference between convex and concave pieces is that a trajectory has usually several consecutive collisions in the same convex piece, moreover the number of such collisions is unbounded. A finite billiard trajectory is called complete if it contains reflections in one and the same piece of the boundary, and it is preceded and followed by reflections in other pieces. We can now formulate two principles for the design of hyperbolic billiards.

1. No parabolic trajectories. Convex pieces may have no complete trajectories which are parabolic.
2. Separation. There must be sufficient separation, in space or time, between complete trajectories.
The relevance of these two principles can be seen in Proposition 4. For non-parabolic trajectories if there is enough separation we get strict monotonicity and Theorem 6 is applicable.

Definition 7. A complete trajectory is (strictly) $z$-monotone (on a Jacobi field $J$) for some $z \ge 0$, if it is (strictly) monotone (on the Jacobi field $J$) between the point at the distance $z$ before the first reflection and the point at the distance $z$ after the last reflection. A convex piece of the boundary is (strictly) monotone if almost every complete trajectory is (strictly) $z$-monotone for some $z \ge 0$. Additionally the piece of the boundary is called finitely (strictly) monotone if the value of $z$ is uniformly bounded for almost all complete trajectories in the piece.

In the language of wavefronts a complete trajectory is $z$-monotone if every diverging wavefront at a distance at least $z$ from the first reflection becomes diverging at the distance $z$ after the last reflection, or earlier. Clearly a strictly concave piece is strictly monotone. Every complete trajectory has only one reflection and it is strictly $\epsilon$-monotone for any small $\epsilon > 0$, the property we call 0-monotone. It follows from Theorem 6 that we get a completely hyperbolic billiard if we put together curved strictly monotone pieces of the boundary and some flat pieces, in such a way that for every two consecutive complete trajectories, which are $z_1$-monotone and
$z_2$-monotone respectively the distance from the last reflection in the first trajectory to the first reflection in the second one is bigger than $z_1 + z_2$. Indeed we can consider the subset $X$ of the phase space containing appropriate midpoints of trajectories leaving one curved piece and hitting another one. We obtain immediately the property ESM on $X$. This construction seems unlikely to succeed if there is no uniform bound on the distances $z$ at which complete orbits are monotone, so that no separation of the pieces will be sufficient. However in the case of spherical caps, studied by Bunimovich and Rehacek, [B-R], we find a geometric scenario that works without the uniform bound on the value of $z$. We will discuss it in Sect. 6.

5. Hyperbolic Billiards in Dimension 2

In all of the examples of hyperbolic billiards constructed so far the convex pieces of the boundary have no parabolic complete trajectories. Checking this property is nontrivial due to the unbounded number of reflections in complete trajectories close to tangency. It was accomplished so far only in integrable, or near integrable examples, with one exception described in the following.

Billiards in dimension 2 are understood best. First of all there is yet another way of describing infinitesimal families of trajectories. Every infinitesimal family of lines in the plane has a point of focusing (in linear approximation), possibly at infinity. This point of focusing contains the same information as the curvature of the infinitesimal wavefront (it is the center of curvature, rather than curvature itself) and it has the advantage that it does not change in free flight. The change in the focusing point after a reflection is described by the familiar mirror equation of geometric optics

$$-\frac{1}{f_0} + \frac{1}{f_1} = \frac{2}{d}, \tag{5}$$
where $f_0$, $f_1$ are the signed distances of the points of focusing to the reflection point, $d = r\cos\theta$, $r$ is the radius of curvature of the boundary piece ($r > 0$ for a strictly convex piece) and $\theta$ is the angle of incidence. The mirror equation is just the two dimensional version of (4). We can see that $f_1$ is a fractional linear function of $f_0$ as it should be, since the focusing point $f$ is a projective coordinate in the projectivization of the two dimensional space of perpendicular Jacobi fields, [W3]. This fractional linear function gives us a mapping of the line extending the billiard segment, to the next one; the lines become topological circles with the addition of the points at infinity. These circles have natural orientations given by the direction of the billiard trajectory. Fractional linear maps given by the mirror equation preserve this orientation. Indeed the perpendicular Jacobi fields form a plane which has the canonical orientation defined by the symplectic form (the form does not vanish on the plane). This orientation, like the symplectic form, is invariant under the dynamics and it induces an orientation of the projectivization (the circle of focusing points with projective coordinate $f$).

By Proposition 4 in dimension 2 every convex piece is monotone. However in general it is not finitely monotone, i.e., the value of $z$ may be unbounded. To examine this issue let us consider an incoming trajectory before the first collision and the outgoing trajectory after the last collision. The minimal value of $z$ for which this trajectory is $z$-monotone can be obtained in terms of the linear fractional map that shows the dependence of the focusing point before the first reflection and the focusing point after the last reflection. Let the focusing points before the first reflection and after the last reflection be denoted by $f_0$ and $f_1$ respectively. $f_0$ and $f_1$ are signed distances to the respective
reflection points measured in the direction of the motion. There are two cases, parabolic and non-parabolic trajectory. If the complete trajectory is parabolic then for the parabolic Jacobi field $J(t)$ the focusing after the last reflection occurs at infinity and we have $f_1 = a f_0 + b$, for some $a > 0$ and $b$. We obtain that this parabolic trajectory is $z$-monotone for $z = \max\{0, \frac{b}{a+1}\}$. Indeed it is the minimal $z \ge 0$ such that if $f_0 \le -z$ then $f_1 \le z$.

In the case of a nonparabolic complete trajectory let the Jacobi field that is initially focused at infinity ($J'(0) = 0$) be focused after the last reflection at finite $f_1 = c_1$, and the Jacobi field focused at infinity after the last reflection be focused at some finite $f_0 = c_0$ before the first reflection. We get then that

$$f_1 = \frac{c_1 f_0 + b}{f_0 - c_0}, \qquad c_0 c_1 + b < 0. \tag{6}$$

Let us note that the condition $c_0 c_1 + b < 0$ is equivalent to the orientation preservation of the fractional linear map since $\frac{df_1}{df_0} = -\frac{c_0 c_1 + b}{(f_0 - c_0)^2} > 0$.
Lemma 8. The trajectory is $z$-monotone for

$$z = \max\Bigl\{0, \tfrac{1}{2}\Bigl(c_1 - c_0 + \sqrt{(c_0 + c_1)^2 - 4(c_0 c_1 + b)}\Bigr)\Bigr\}.$$

Proof. By direct calculations we obtain two values $f_\pm = \tfrac{1}{2}\bigl(c_0 - c_1 \pm \sqrt{(c_0 + c_1)^2 - 4(c_0 c_1 + b)}\bigr)$ such that if $f_0 = f_\pm$ then $f_1 = -f_\pm$. We have also that $f_- < c_0 < f_+$ and $-f_+ < c_1 < -f_-$. If $f_- \le 0$ then the trajectory is $z$-monotone for $z = -f_-$. If $f_- > 0$ then the trajectory is 0-monotone (in the sense that it is $\epsilon$-monotone for arbitrarily small $\epsilon > 0$).

The problem with the application of these formulas is that in general we cannot get the explicit dependence of $f_1$ on $f_0$ for complete trajectories with a large number of reflections. The exception is provided by integrable billiard tables, the disk and the ellipse.

Billiard in a disk is integrable due to its rotational symmetry. Let $J$ be a Jacobi field obtained by the rotation of a trajectory. This family of trajectories (“the rotational family”) is focused exactly in the middle between two consecutive reflections (that is where $J$ vanishes). It follows further from the mirror equation (5) that a parallel family of orbits is focused at a distance $\frac{d}{2}$ after the reflection, and any family focusing somewhere between the parallel family and the rotational family will focus at a distance somewhere between $\frac{d}{2}$ and $d$, not only after the first reflection, but also after an arbitrarily long sequence of reflections. Hence any complete trajectory in an arc of a circle is $z$-monotone, where $2z$ is the length of a segment of the trajectory, and by Lemma 3 it is strictly $z'$-monotone, for any $z' > z$. Two arcs of a circle separated by parallel segments form the stadium of Bunimovich, [B1].

Lazutkin showed that billiards in smooth strictly convex domains are near integrable close to the boundary of the table, [L].
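The mirror equation (5), the statement about the disk, and the formula of Lemma 8 can be checked numerically. The sketch below is only an illustration under stated assumptions: all function names are hypothetical, `math.inf` is used to encode a parallel family (focus at infinity), and in the disk every chord is assumed to have length $2d$ with $d = r\cos\theta$, so that a focus at signed distance $f_1$ after one reflection sits at signed distance $f_1 - 2d$ from the next reflection point.

```python
import math


def mirror_map(f0: float, d: float) -> float:
    """Solve the mirror equation (5), -1/f0 + 1/f1 = 2/d, for f1."""
    if f0 == 0.0:
        return 0.0  # a family focused on the boundary stays focused there
    inv_f0 = 0.0 if math.isinf(f0) else 1.0 / f0
    inv_f1 = 2.0 / d + inv_f0
    return math.inf if inv_f1 == 0.0 else 1.0 / inv_f1


def disk_focusing_distances(f0: float, d: float, reflections: int) -> list:
    """Focusing distances after consecutive reflections in a disk.

    Assumption: all chords have length 2*d, so the same focusing point
    is at f1 - 2*d when measured from the next reflection point.
    """
    distances = []
    f = f0
    for _ in range(reflections):
        f1 = mirror_map(f, d)
        distances.append(f1)
        f = f1 - 2.0 * d
    return distances


def fractional_map(f0: float, c0: float, c1: float, b: float) -> float:
    """The fractional linear map (6): f1 = (c1*f0 + b) / (f0 - c0)."""
    return (c1 * f0 + b) / (f0 - c0)


def lemma8_z(c0: float, c1: float, b: float) -> float:
    """The value of z from Lemma 8; requires c0*c1 + b < 0 as in (6)."""
    if c0 * c1 + b >= 0.0:
        raise ValueError("orientation condition c0*c1 + b < 0 is violated")
    disc = (c0 + c1) ** 2 - 4.0 * (c0 * c1 + b)
    return max(0.0, 0.5 * (c1 - c0 + math.sqrt(disc)))


if __name__ == "__main__":
    # Parallel family entering a disk with d = 1 (illustrative value).
    print(disk_focusing_distances(math.inf, 1.0, 5))
    # The two sign patterns of (c0, c1) with b = 0 (illustrative values).
    print(lemma8_z(-1.0, 1.0, 0.0), lemma8_z(1.0, -1.0, 0.0))
```

In this sketch the parallel family stays focused between $\frac{d}{2}$ and $d$ after every reflection and the rotational family ($f_0 = -d$) is mapped to $f_1 = d$, as the text asserts. The second call to `lemma8_z` is an instance of the case $c_0 > 0 > c_1$ in which the maximum in Lemma 8 is attained at $0$.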
Donnay applied Lazutkin’s coordinates to establish that for an arbitrary strictly convex arc the situation near the boundary is similar to that in a circle, i.e., in our language complete near tangent trajectories are z-monotone, where z is of the order of the length of a single segment. This crucial calculation shows that if a strictly convex arc is strictly monotone then any sufficiently small perturbation of this arc is strictly monotone. In view of Proposition 4 the only obstacle to using a convex arc in the boundary of a hyperbolic billiard is the presence of parabolic orbits. Such orbits are not a problem in
themselves since they are still $z$-monotone, but nearby orbits may have $c_0 \ll -1$ and $c_1 \gg 1$ which by Lemma 8 will result in a large $z$ value. The case when $c_0 \gg 1$ and $c_1 \ll -1$ is safe, resulting in $z = 0$. However no examples of such “safe” parabolic orbits are known to us. Bunimovich [B3] introduced the concept of absolutely focusing arcs which in our language means that parabolic trajectories are excluded and additionally for every complete trajectory $c_0 < 0$ and $c_1 > 0$. We do not need the last assumptions for our argument, but it is conceivable that our generalization is superficial, i.e., if there are no parabolic orbits then by necessity $c_0 < 0$ and $c_1 > 0$ for all complete trajectories. The following proposition reformulates in our language the result of Donnay [D]:

Proposition 9. Any convex arc without complete parabolic orbits is finitely strictly monotone (and hence can be used in designing completely hyperbolic billiards).

Proof. In view of Donnay’s analysis of near tangent orbits we have to consider only compact families of complete trajectories with the number of reflections not exceeding a certain fixed number. Under the assumption of the absence of parabolic complete trajectories, the values of $|c_0|$ and $|c_1|$ are uniformly bounded on these compact families. It follows from Lemma 8 that the $z$-value is uniformly bounded for these trajectories.

It was also observed by Donnay [D] and Markarian [M] that for any strictly convex arc, its sufficiently short piece does not have parabolic orbits. Indeed, if the arc is very short, then any complete trajectory is either near tangent or it has only one reflection. No near tangent trajectory is parabolic and the mirror equation (5) does not allow a trajectory with one reflection to be parabolic.

In general checking that parabolic orbits are absent cannot be done by a direct calculation. An arc satisfying $\frac{d^2 r}{ds^2} < 0$, where $r$ is the radius of curvature as a function of the arc length $s$, is strictly monotone, [W3].
Such an arc is called convex scattering. More precisely we have that any complete trajectory in a convex scattering arc is strictly $z$-monotone with $z = \max d$, the maximum of the values of $d$ from the mirror equations (5) for the first and the last segment of the trajectory. This property leads to examples of hyperbolic billiards with one convex piece of the boundary, like the domain bounded by the cardioid. Let us note that the convex scattering property stands out in not being associated with integrability or near integrability.

Integrability of the elliptic billiard allows one to establish finite strict monotonicity of the semi-ellipse with endpoints on the longer axis, [W3]. Donnay, [D], showed that the other semi-ellipse is also finitely strictly monotone provided that $a \le \sqrt{2}\,b$, where $a \ge b$ are the semiaxes.

6. Systems with Local Spherical Symmetry

Let us consider two segments of the same orbit lying in one plane $\mathbb{R}^2 \subset \mathbb{R}^n$, with mirror symmetry in the plane $\mathbb{R}^2$ reversing the direction of time, Fig. 1. We further assume that there is the center of symmetry which we use as the origin of the coordinate system and that small rotations around it (of $\mathbb{R}^n$) take the two segments into two segments of another orbit, with the preservation of time. In such a case we say that our system has local spherical symmetry, on the orbit in question. Examples of systems with local spherical symmetry are furnished by billiards with spherical caps, and by soft billiards with spherical scatterers, [B2, W4, B-R, D-L, B-T].
Fig. 1. [Figure not reproduced; labels in the figure: $e_1$, $e_2$, $m_0$, $q_0$, $m_1$, $q_1$]
Let $\gamma(t)$ be the time parameterization of the segments, where $q_0 = \gamma(0)$ belongs to the first segment and $q_1 = \gamma(t_1)$ belongs to the second segment. It follows from the local spherical symmetry that one parameter groups of rotations produce families of orbits. Let $Z \in o(n)$ be an infinitesimal rotation (i.e. $Z$ is an anti-symmetric matrix). We get the family of orbits $\gamma(t,u) = e^{uZ}\gamma(t)$ and the respective Jacobi field $J_Z(t) = Z\gamma(t)$, $J_Z'(t) = Z\gamma'(t)$. This Jacobi field is known to us only on the two segments, but it is enough to check monotonicity between $q_0$ and $q_1$. We will call such Jacobi fields spherical.

We choose an orthonormal basis $e_1, e_2$ in the plane of the orbit so that its axis of symmetry has the direction of $e_2$, Fig. 1. We will be checking monotonicity of our orbit between $\gamma(0) = q_0 = a e_1 + b e_2$ and $\gamma(t_1) = q_1 = -a e_1 + b e_2$. We have $\gamma'(0) = \sin\alpha\, e_1 + \cos\alpha\, e_2$ with $-\frac{\pi}{2} \le \alpha \le \frac{\pi}{2}$, and $\gamma'(t_1) = \sin\alpha\, e_1 - \cos\alpha\, e_2$. It follows that for the spherical Jacobi field $J_Z$,

$$\begin{aligned} J_Z(0) &= a Z e_1 + b Z e_2, & J_Z'(0) &= \sin\alpha\, Z e_1 + \cos\alpha\, Z e_2, \\ J_Z(t_1) &= -a Z e_1 + b Z e_2, & J_Z'(t_1) &= \sin\alpha\, Z e_1 - \cos\alpha\, Z e_2. \end{aligned} \tag{7}$$
We will say that the orbit segments are in general position if their extensions do not contain the origin. Equivalently the segments are in general position if $a\cos\alpha - b\sin\alpha \ne 0$. For orbit segments in general position there are many spherical Jacobi fields. More precisely we have the following

Lemma 10. If the orbit segments are in general position then spherical Jacobi fields form a linear subspace of dimension $2n - 3$. Nonzero spherical Jacobi fields are not necessarily perpendicular but none of them is parallel.

Proof. Let us consider the linear map $Z \mapsto J_Z$. It follows from the condition $a\cos\alpha - b\sin\alpha \ne 0$ that the kernel of the map coincides with antisymmetric matrices such that $Z e_1 = Z e_2 = 0$. Hence the kernel of the map has the dimension $\frac{1}{2}(n-2)(n-3)$ while the space of all matrices $Z$ has the dimension $\frac{1}{2}n(n-1)$. This gives us the dimension of the space of spherical Jacobi fields. To prove the second part of the lemma let us consider a spherical Jacobi field such that $J_Z(0)$ and $J_Z'(0)$ are parallel to $\gamma'(0)$, and hence linearly dependent. It follows that $Z e_1$ and $Z e_2$ must be perpendicular to $e_1$ and $e_2$, and further that $J_Z(0)$ and $J_Z'(0)$ are both parallel and perpendicular to $\gamma'(0)$. Only the zero Jacobi field can satisfy it.
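The dimension count in the proof of Lemma 10 can be spelled out as follows. This is only the arithmetic behind the stated result; identifying the kernel with $o(n-2)$ is the observation that an antisymmetric $Z$ with $Z e_1 = Z e_2 = 0$ acts only on the orthogonal complement of the plane spanned by $e_1$ and $e_2$.

```latex
\dim\{J_Z\} \;=\; \dim o(n) - \dim o(n-2)
\;=\; \tfrac{1}{2}\,n(n-1) - \tfrac{1}{2}\,(n-2)(n-3)
\;=\; \tfrac{1}{2}\bigl[(n^2 - n) - (n^2 - 5n + 6)\bigr]
\;=\; 2n - 3 .
```

For example, for $n = 3$ the kernel is trivial and all three infinitesimal rotations give independent spherical Jacobi fields, in agreement with $2n - 3 = 3$.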
We will consider only orbit segments in general position. It is enough for the study of hyperbolicity because the orbits for which this condition fails form a subset of the phase space of dimension $n$ and can be safely ignored. It follows from (7) and the condition $a\cos\alpha - b\sin\alpha \ne 0$ that by an appropriate choice of a skew-symmetric matrix $Z$ we can get arbitrary vectors perpendicular to the plane spanned by $e_1$ and $e_2$ as the values of $J(0)$ and $J'(0)$. Let us call such spherical Jacobi fields transverse. Transverse Jacobi fields form a linear subspace of the space of spherical Jacobi fields of dimension $2n - 4$.

At this stage we need to invoke the symplectic nature of the dynamics. Jacobi fields form the tangent space to the phase space at any point on the orbit. Hence we get an identification of all of these tangent spaces. This identification amounts to the action of the derivative of the flow. The tangent spaces are equipped with the canonical symplectic form and hence the space of Jacobi fields is a linear symplectic space with the canonical symplectic form $\omega(J_1, J_2) = \langle J_1, J_2' \rangle - \langle J_1', J_2 \rangle$ (‘Wronskian’), where the scalar products are evaluated at any point on the orbit segments (in particular we get the same value independent of the point). It follows from this formula that for any Jacobi field $J$ skew-orthogonal to the space of transverse Jacobi fields the values of $J$ and $J'$ at any point are in the plane spanned by $e_1$ and $e_2$. We will call such Jacobi fields planar. The space of perpendicular Jacobi fields contains the 2 dimensional subspace of planar Jacobi fields. Further we have the unique splitting of any Jacobi field $J = J_p + J_t$ into a planar, $J_p$, and a transverse, $J_t$, Jacobi field. Moreover it follows from the definition of the form $Q$ that $Q(J) = Q(J_p + J_t) = Q(J_p) + Q(J_t)$. Hence, if we establish monotonicity separately for planar and for transverse Jacobi fields then we get monotonicity for all Jacobi fields.
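The claim that the Wronskian does not depend on the point of evaluation can be verified directly from (1) and (2). The short computation below is a sketch that assumes the sign convention $\omega(J_1, J_2) = \langle J_1, J_2' \rangle - \langle J_1', J_2 \rangle$ and uses only that $R$ is orthogonal and that $S := P^* K P$ is symmetric (because the shape operator $K$ is symmetric); the abbreviation $S$ is introduced here for convenience.

```latex
% Free flight, by (1): J_i(t) = J_i(0) + t J_i'(0), J_i'(t) = J_i'(0).
\langle J_1(0) + tJ_1'(0),\, J_2'(0)\rangle - \langle J_1'(0),\, J_2(0) + tJ_2'(0)\rangle
  = \langle J_1(0), J_2'(0)\rangle - \langle J_1'(0), J_2(0)\rangle .

% Collision, by (2): J_i^+ = R J_i^-, \quad (J_i')^+ = R (J_i')^- + S R J_i^-.
\langle R J_1^-,\, R (J_2')^- + S R J_2^- \rangle - \langle R (J_1')^- + S R J_1^-,\, R J_2^- \rangle
  = \langle J_1^-, (J_2')^- \rangle - \langle (J_1')^-, J_2^- \rangle
    + \underbrace{\langle R J_1^-, S R J_2^- \rangle - \langle S R J_1^-, R J_2^- \rangle}_{=\,0} .
```

The terms proportional to $t$ cancel in free flight, and the extra terms at a collision cancel by the symmetry of $S$, so $\omega$ is constant along the orbit under either sign convention.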
Note that we have used the symplectic formalism to obtain the splitting. In the two classes of examples with spherical symmetry we get it directly from an additional symmetry.

Monotonicity between the symmetric points $q_0$ and $q_1$ for spherical Jacobi fields can be analyzed by direct calculation:

$$Q(t_1) - Q(0) = \langle J_Z(t_1), J_Z'(t_1) \rangle - \langle J_Z(0), J_Z'(0) \rangle = -2a\sin\alpha\, \langle Z e_1, Z e_1 \rangle - 2b\cos\alpha\, \langle Z e_2, Z e_2 \rangle. \tag{8}$$

Hence we get monotonicity if and only if $b\cos\alpha \le 0$ and $a\sin\alpha \le 0$.

Let us assume that there is monotonicity on all spherical Jacobi fields between the points $q_0 = \gamma(0)$ and $q_1 = \gamma(t_1)$ and that monotonicity fails on some of these fields between $\gamma(\epsilon)$ and $\gamma(t_1 - \epsilon)$ for arbitrarily small $\epsilon > 0$. In the same way as in formula (3) we obtain from (8),

$$Q(t_1 - \epsilon) - Q(\epsilon) = (-2a\sin\alpha - 2\epsilon\sin^2\alpha)\, \langle Z e_1, Z e_1 \rangle + (-2b\cos\alpha - 2\epsilon\cos^2\alpha)\, \langle Z e_2, Z e_2 \rangle.$$

Our assumptions lead to the conditions $b = 0$, $a\sin\alpha < 0$ or $a = 0$, $b\cos\alpha < 0$. In this way we arrive at two generic configurations, the configuration A where $b = 0$, $a\sin\alpha < 0$, and the configuration B where $a = 0$, $b\cos\alpha < 0$. We also have two singular configurations, the configuration As with $\alpha = \pm\frac{\pi}{2}$, $a = 0$, and the configuration Bs with $\alpha = 0$, $b = 0$, Fig. 2 and Fig. 3.

In all cases we get monotonicity between $q_0$ and $q_1$ on all spherical Jacobi fields, but the points $q_0$ and $q_1$ are positioned differently in different configurations. In configuration A and Bs the points $q_0$ and $q_1$ lie on the $e_1$ axis and in configuration B and As they
Fig. 2. Configurations A (left) and As (right) [Figure not reproduced; labels in the figure: $q_0$, $q_1$, $e_1$, $e_2$]

Fig. 3. Configurations B (left) and Bs (right) [Figure not reproduced; labels in the figure: $q_0$, $q_1$, $e_1$, $e_2$]
coincide geometrically with one point on the axis of symmetry, i.e., the $e_2$ axis. In the singular configuration Bs both segments are vertical and in the singular configuration As both segments lie on the same horizontal line. In any of the configurations the points are optimal in the sense that monotonicity fails for points past $q_0$ and before $q_1$.

The difference between a configuration A and a configuration B is in the location of the point of intersection of the line extensions of the segments. This point lies on the axis of symmetry, above the origin for a configuration A, and below the origin for a configuration B. (The terms ‘below’ and ‘above’ seem arbitrary. To remove this ambiguity we note that the two orbit segments are ordered by time. The first segment in a generic configuration allows us to orient canonically the line of symmetry, which was hidden earlier in the condition that $-\frac{\pi}{2} \le \alpha \le \frac{\pi}{2}$.)

We have thus established that monotonicity on spherical Jacobi fields depends only on the geometry of the incoming and outgoing segment and it is not affected by the dynamics. The generic configurations have no parabolic spherical Jacobi fields but the singular configurations do. However even for the singular configurations we get monotonicity between appropriate points, i.e., the conclusions of Proposition 4 hold, even though we do not have a completely parabolic orbit. It follows from the splitting of an arbitrary perpendicular Jacobi field into a transverse field and a planar one. All transverse fields are spherical and hence are covered by the above analysis. The planar fields form a two
dimensional invariant subspace and hence the proof of the completely parabolic part of Proposition 4 applies to them. Having monotonicity separately for transverse and planar fields is equivalent to monotonicity.

Let us analyze monotonicity on planar Jacobi fields in more detail. We will compare the Jacobi fields at the points $\gamma(0) = m_0$ and $\gamma(t_1) = m_1$ which are closest to the origin in the line extensions of the respective orbit segments, Fig. 1. (If the initial segments are too short to contain the points their status is somewhat abstract; they may or may not be actual orbit points. However it will not affect our analysis.)

Lemma 11. The fractional linear representation of the dynamics (6) on the planar perpendicular Jacobi fields between $m_0$ and $m_1$ has the form

$$f_1 = \frac{-c f_0}{f_0 - c},$$
where c is the unique value for which there is a planar perpendicular Jacobi field J with J (0) = 0 and J (t1 − c) = 0 (or a field with J (c) = 0 and J (t1 ) = 0). Monotonicity holds for the planar Jacobi fields between γ (−z) and γ (t1 + z) for z = |c| − c. In the limit case of c → ∞ we get f 1 = f 0 and then there is monotonicity for z = 0. Proof. The space of planar perpendicular Jacobi fields is 2 dimensional. The spherical planar Jacobi field J Z generated by the infinitesimal rotation Z with Z e1 = e2 , Z e2 = −e1 is not perpendicular but it has the nonzero perpendicular component which we will denote by Jr . By the choice of the points m 0 and m 1 we have that Jr (0) = 0 and Jr (t1 ) = 0. Moreover introducing compatible orthonormal frames v0 = γ (0),v0⊥ and v1 = γ (0), v1⊥ at m 0 and m 1 respectively, we can calculate that Jr (0), v0⊥ = J Z (0), v0⊥ = 1 and Jr (t1 ), v1⊥ = J Z (t1 ), v1⊥ = 1. Now we use J (0), v0⊥ , J (0), v0⊥ and J (t1 ), v1⊥ , J (t1 ), v1⊥ as coordinates in the 2 dimensional space of planar perpendicular Jacobi fields, at m 0 and m 1 respectively. In these coordinates Jr is the second basic vector both at m 0 and at m 1 . Hence dynamics between m 0 and m 1 the 10 is described in these coordinates by the matrix . Since the focusing distance ∗1
J,v ⊥
f i = − J ,vi⊥ , i = 0, 1, we obtain the lemma from Lemma 8 by direct calculation. i
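The fractional linear map of Lemma 11 and the margin z = |c| − c can be sketched numerically. This is only an illustration under stated assumptions: the function names are hypothetical, and the conventions that an infinite focusing distance stands for a field with vanishing derivative and that c = ∞ gives the identity map are read off from the statement of the lemma.

```python
import math

def focusing_map(f0, c):
    """Focusing distance f1 at m1 from f0 at m0: f1 = -c*f0 / (f0 - c).

    Assumption: math.inf encodes an infinite focusing distance.
    """
    if math.isinf(c):
        return f0            # limit case c -> infinity: f1 = f0
    if math.isinf(f0):
        return -c            # limit of -c*f0/(f0 - c) as f0 -> infinity
    if f0 == c:
        return math.inf      # the denominator vanishes
    return -c * f0 / (f0 - c)

def monotonicity_margin(c):
    """Margin z = |c| - c: zero for c > 0, equal to -2c for c < 0."""
    if math.isinf(c):
        return 0.0
    return abs(c) - c
```

For example, `monotonicity_margin(2.0)` is 0, while a negative value such as c = −1.5 gives a margin of 3.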
Note that the effort in the above proof is to show that c0 = −c1 = c in (6). We get it from local spherical symmetry alone. It is quite obvious in the two classes of examples due to the additional reversible symmetry. We have thus established that monotonicity of a complete trajectory in a system with spherical symmetry depends only on the geometry of the incoming and outgoing segments and the value of c from Lemma 11, which is the only information we need to extract from the dynamics. In the case of spherical caps of radius R any complete trajectory lies in a plane passing through the center. Moreover the planar Jacobi fields are just the Jacobi fields of the trajectory in the billiard in the disk of radius R. It was observed in Sect. 5 that in such a case c is always positive and hence z = 0. The analysis of monotonicity in the case of spherical caps is thus complete and can be summarized in the following proposition which was essentially stated in [W4].
Proposition 12. Any complete trajectory, in general position, in a spherical cap is monotone between $\min(q_0, m_0)$ and $\max(q_1, m_1)$. (The min and max are understood in the sense of the temporal ordering of the trajectory.) Moreover only the trajectories in singular configurations are parabolic.

Proof. It remains to analyze parabolic trajectories, i.e., we are looking for a perpendicular Jacobi field such that $J'(0) = 0$, $J'(t_1) = 0$. We know that there are no such nonzero planar Jacobi fields. It remains to check the transverse (and hence spherical) Jacobi fields. It follows from (7) that if $J_Z'(0) = \sin\alpha\, Ze_1 + \cos\alpha\, Ze_2 = 0$ and $J_Z'(t_1) = \sin\alpha\, Ze_1 - \cos\alpha\, Ze_2 = 0$ then either $\alpha = 0$ and the trajectory is in the configuration $B_s$, or $\alpha = \pm\frac{\pi}{2}$ and the trajectory is in configuration $A_s$.

We can now apply this analysis to specific examples of billiards with spherical caps, and to soft billiards with spherical scatterers. The first construction of a three dimensional hyperbolic billiard with spherical caps was obtained in [B-R]. We are in a position to recover this construction easily, to see what the obstacles are and how to overcome them. We want to attach spherical caps to a box. We need separation of the caps so that it takes a long time from when an orbit leaves a cap until it reaches another one (or the same one after reflecting in flat pieces). By similarity considerations, instead of separating the caps we may fix the rectangular box and decrease the radius of the sphere. The first observation is that configurations $B_s$ are disastrous, because if they are present then in the same plane we will also have configurations B with the point $q_0$ arbitrarily far away. By elementary geometry we get

Proposition 13. If the angle at which a piece S of a sphere is seen from the center is less than $\frac{\pi}{2}$ then all complete trajectories in S are in configuration A; in particular there are no trajectories in configuration $B_s$.

We will call such pieces small spherical caps.
One may get the impression that configurations As may pose a similar difficulty because the points q0 and q1 may go to infinity. Indeed if we consider a small spherical cap and a plane of our complete orbit that cuts the edge of the cap then our complete orbit may have the points q0 and q1 far away. What saves the construction is that the points must stay on the line through the center, and hence they have a bounded displacement in one direction. So now the prescription for the design of the billiard with spherical caps is to place the small caps only at the bottom and the top of the box. Such a billiard is equivalent to the billiard between two parallel hyperplanes (the top and the bottom) with small spherical caps attached. It is clear that if the hyperplanes are sufficiently far apart then the configurations A do not pose any difficulty. The exact separation is such that the horizontal hyperplanes through the centers of the spheres of the top spherical caps should be above those for the bottom spherical caps. Clearly more complicated designs can also be produced. One finds several of them in [B-R]. 7. Soft Billiards The analysis in Sect. 6 can be readily applied to soft billiards. These are systems with a point particle moving in a rectangular box, or a torus, with spherical scatterers. However the point particle does not collide elastically with the scatterer, but enters into it and is subjected to a field of force with a spherically symmetric potential. In the 2 dimensional case, after the work of Knauf, [K1, K2] Donnay and Liverani, [D-L], gave general
conditions on the potential that guarantee, in our present language, that all complete trajectories are z-monotone with uniformly bounded z. A complete trajectory through a scatterer is the piece of a trajectory from entering a scatterer to leaving it. This allowed them to construct a variety of completely hyperbolic soft billiards. The case of higher dimensions remained open for 15 years. Recently Balint and Toth, [B-T], obtained additional conditions on the potential that guarantee complete hyperbolicity in arbitrary dimension. Our condition that no complete trajectory is parabolic is fully equivalent to those of [B-T]. Moreover our approach results in fairly explicit conditions on the required separation, while such conditions are absent both from [D-L] and [B-T].

The orbit of the point particle inside a scatterer is not in general a straight segment. However we restrict our attention to the incoming and outgoing segments of our trajectory which we denote by $\gamma(t)$, with $\gamma(0)$ being a point before the entrance into the scatterer and $\gamma(t_1)$ a point after the exit. For a family of trajectories $\gamma(t, u)$ we consider the Jacobi field $J(t) = \frac{\partial\gamma}{\partial u}\big|_{u=0}$. For Jacobi fields which are not spherical we cannot claim that if $J(0)$ is perpendicular to the trajectory then $J(t_1)$ is also perpendicular. However, since $J'(t)$ is perpendicular to the trajectory outside of the scatterer (because the point particle has unit velocity there), the values of $Q(0)$ and $Q(t_1)$ depend only on the perpendicular component of $J(t)$. We can then consider these perpendicular components in place of perpendicular fields and the analysis of Sect. 6 is perfectly valid. (What happens here is that an invariant codimension one subspace in the tangent bundle of the phase space is not a priori available and we have to work with a quotient space rather than a subspace. Perpendicular components of Jacobi fields form the quotient space, see [W1, W2].)

Definition 14.
The halo of a scatterer in a soft billiard is a closed concentric ball of minimal radius, containing the scatterer and such that almost any complete trajectory through the scatterer is monotone between two points outside of the ball.

Our goal is to establish the existence of the halo for a given scatterer, and to determine its radius. By Theorem 6, if each scatterer in a soft billiard has a halo and the halos are mutually disjoint then the soft billiard is completely hyperbolic. In dimension 3 and above, if the spherical potential $V = V(r)$ is continuous (i.e., in particular it vanishes at the boundary) and attractive, $V'(r) > 0$, then there is no halo. Indeed let us consider a straight line tangent to the scatterer. Perturbing it we obtain a complete trajectory “grazing” the scatterer. By necessity it is in configuration A with the points $q_0$ and $q_1$ at large distance from the center. Since this distance goes to infinity as the trajectory approaches the tangent line and the points must belong to the halo, we conclude that there is no halo for our scatterer. Hence scatterers with continuous attractive potentials are not allowed in the design of a hyperbolic soft billiard in dimension ≥ 3. This was already observed by Balint and Toth, [B-T].

The passage through a scatterer is completely described by the rotation angle $\Delta = \Delta(\varphi)$, $0 \le \varphi \le \frac{\pi}{2}$, [B-T]. For a given angle of incidence $\varphi$ the angle $\Delta$ is the angular difference between the entrance and the exit points on the complete trajectory that enters the scatterer with the incidence angle $\varphi$. With a fixed orientation of the circle the value of $\Delta$ differs by the sign when we switch the incoming and outgoing lines. For simplicity we consider only the case depicted in Fig. 4, with the counterclockwise orientation of the boundary of the scatterer. It is convenient to introduce the angle $-\frac{\pi}{2} \le \eta \le \frac{\pi}{2}$ between the perpendicular axis of symmetry of the line of the incoming segment and the axis of symmetry of the configuration, Fig. 4.
[Fig. 4. Labels recoverable from the figure: $\Delta/2$, $\eta$, $m_0$, $\varphi$.]
In configuration A we have $0 < \eta < \frac{\pi}{2}$ and in configuration B we have $-\frac{\pi}{2} < \eta < 0$. Moreover $\Delta = 2\eta - 2\varphi + \pi$. To find the radius of the halo of a scatterer we need to find the distance of the points $q_0$, $q_1$ to the center of symmetry for any complete trajectory passing through the scatterer. By simple geometric considerations we obtain that this distance is equal to
$$R\,\frac{\sin\varphi}{\sin\eta} \ \text{ in configuration A and } \ R\,\frac{\sin\varphi}{\cos\eta} \ \text{ in configuration B}, \qquad (9)$$
where $R$ denotes the radius of the scatterer, and $\varphi$ is the angle of incidence for our complete trajectory. Hence the point $q_0$ is outside of the scatterer when $\eta < \varphi$ in configuration A, and when $\eta < -\frac{\pi}{2} + \varphi$ in configuration B. These formulas allow the direct calculation of the halo of a scatterer in examples. It will be large if for some configurations $\eta$ is small and positive or close to $-\frac{\pi}{2}$. It is guaranteed to be finite if there are no singular configurations ($\eta = 0$ or $\eta = \pm\frac{\pi}{2}$).

There is also a contribution to the halo from the planar Jacobi fields. More specifically, we need to calculate the constant $c$ in Lemma 11. It is sufficient to obtain one additional planar Jacobi field, not focused at $m_0$ (as in the rotational field). For that purpose let us consider the family of trajectories $\gamma(t, \varphi)$ entering a scatterer at one point, $\gamma(0, \varphi) = (R, 0)$, $\gamma'(0, \varphi) = -\cos\varphi\, e_1 + \sin\varphi\, e_2$. At the exit time $t_1 = t_1(\varphi)$ we get $\gamma(t_1(\varphi), \varphi) = (R\cos\Delta, R\sin\Delta)$, $\gamma'(t_1(\varphi), \varphi) = \cos(\Delta + \varphi)e_1 + \sin(\Delta + \varphi)e_2$. Hence we get the Jacobi field $J$,
$$J(0) = 0, \qquad J(t_1) = R\Delta'\,(-\sin\Delta\, e_1 + \cos\Delta\, e_2),$$
$$J'(0) = \sin\varphi\, e_1 + \cos\varphi\, e_2, \qquad J'(t_1) = (\Delta' + 1)\,(-\sin(\Delta + \varphi)e_1 + \cos(\Delta + \varphi)e_2).$$
By direct calculation we now obtain $c = -\frac{R\cos\varphi}{\Delta' + 2}$. Hence by Lemma 11 if $\Delta' + 2 < 0$ we get $z = 0$ and there is no contribution from planar Jacobi fields to the scatterer’s halo. If however $\Delta' + 2 > 0$ then $z = -2c$, which translates to the halo of radius
$$h = R\sqrt{p^2\cos^2\varphi + \sin^2\varphi}, \quad \text{where } p = \left(\frac{\Delta'}{2} + 1\right)^{-1}. \qquad (10)$$
The application of these formulas, in obtaining explicit separation of scatterers for hyperbolicity, hinges on the representation of the rotation function $\Delta$ in terms of the potential, which is somewhat cumbersome. We have nothing new to add on this subject compared to the papers [D-L] and [B-T], where the reader can find a detailed discussion. We will consider here only the simple case of a constant potential $V = V_0 < E = \frac{1}{2}$. In the two dimensional case it was studied by Baldwin, [Ba], and Knauf, [K2], who arrived at sharp conditions for complete hyperbolicity. It was shown in [B-T] that there is always sufficient separation of scatterers that will guarantee complete hyperbolicity also in the multidimensional case, without providing specific bounds. It turns out that the two dimensional conditions are also sufficient in higher dimension.

Soft billiards with constant potential are systems where the crossing of scatterers is governed by a version of the law of refraction. One needs to distinguish the case of the positive potential $0 < 2V_0 < 1$ and the negative potential $V_0 < 0$. We have
$$\frac{\Delta}{2} = \arccos\frac{\sin\varphi}{\nu}, \qquad \nu = \sqrt{1 - 2V_0},$$
where in the case of a positive potential ($\nu < 1$) the formula is valid for $0 \le \varphi \le \varphi_0$ and $\varphi_0$ is such that $\sin\varphi_0 = \nu$, and in the case of a negative potential ($\nu > 1$) the formula is valid for all $0 \le \varphi \le \frac{\pi}{2}$. It follows immediately that
$$\frac{\Delta'(\varphi)}{2} = -\frac{1}{\sqrt{1 + \frac{\nu^2 - 1}{\cos^2\varphi}}}.$$
Hence the derivative $\Delta'$ is a decreasing function in the case of a positive potential ($\nu < 1$) and an increasing function in the case of a negative potential. It follows that in the case of positive potential $\frac{\Delta'(\varphi)}{2} \le -\frac{1}{\nu} < -1$, and there is no contribution to the halo from the planar Jacobi fields. In the case of a negative potential ($\nu > 1$) $\frac{\Delta'(\varphi)}{2} \ge -\frac{1}{\nu} > -1$ and we get, using (10), that the halo radius is $h = \frac{\nu R}{\nu - 1}$ (the maximal value of $h$ is assumed at $\varphi = 0$ because $p$ is a decreasing function of $\varphi$). Further in the positive case we have only configurations B with $\eta < \varphi$ so that there is no contribution from them to the halo. In the negative case we have only configurations A and using (9) we arrive readily at the same halo radius $h = \frac{\nu R}{\nu - 1}$ (it is again assumed at $\varphi = 0$). The explanation for this coincidence is that the halo is determined by trajectories in the limit of the incidence angle $\varphi \to 0$. In that limit the distinction between planar and transverse Jacobi fields is lost.

To summarize, no separation of scatterers is necessary for hyperbolicity in the case of a positive potential, and in the case of a negative potential the scatterers should have non-intersecting halos with the radius $h = \frac{\nu R}{\nu - 1}$, $\nu = \sqrt{1 - 2V_0}$. Baldwin, [Ba], showed that with the violation of these conditions one can construct systems with elliptic periodic orbits. The same can be claimed in higher dimensional systems. Hence our conditions are sharp.
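The planar contribution to the halo in (10) can be sketched as a small function. This is a hedged illustration, not the author's procedure: the function name is hypothetical, and the geometric step (the point at distance z from m0 along the incoming line, with m0 at distance R sin ϕ from the center) is an assumption used to recover the square root in (10).

```python
import math

def planar_halo(R, phi, d_delta):
    """Return (c, z, h) for scatterer radius R, incidence angle phi and
    the derivative d_delta of the rotation angle at phi.

    c = -R cos(phi) / (d_delta + 2) and z = |c| - c, as in Lemma 11.
    h is the distance from the center of the point lying at distance z
    from m0 on the incoming line (assumption: |m0| = R sin(phi)).
    """
    denom = d_delta + 2.0
    if denom == 0.0:
        raise ValueError("d_delta + 2 must be nonzero")
    c = -R * math.cos(phi) / denom
    z = abs(c) - c
    h = math.hypot(z, R * math.sin(phi))
    return c, z, h
```

When the derivative plus 2 is negative the function returns z = 0, so h reduces to R sin ϕ, which lies inside the scatterer; when it is positive, h agrees with the closed form in (10).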
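For the constant negative potential, the coincidence of the two halo radii can be checked numerically. The sketch below assumes the formulas as reconstructed from the text, namely Δ/2 = arccos(sin ϕ/ν), the expression for Δ′/2, formula (10), and formula (9) with η = Δ/2 + ϕ − π/2 (from Δ = 2η − 2ϕ + π); all function names are hypothetical.

```python
import math

def delta(phi, nu):
    """Rotation angle for a constant potential: Delta = 2 arccos(sin(phi)/nu)."""
    return 2.0 * math.acos(math.sin(phi) / nu)

def d_delta(phi, nu):
    """Derivative of the rotation angle with respect to phi (phi < pi/2)."""
    return -2.0 / math.sqrt(1.0 + (nu ** 2 - 1.0) / math.cos(phi) ** 2)

def planar_radius(R, phi, nu):
    """Formula (10), meaningful when d_delta + 2 > 0 (negative potential)."""
    p = 1.0 / (d_delta(phi, nu) / 2.0 + 1.0)
    return R * math.sqrt(p * p * math.cos(phi) ** 2 + math.sin(phi) ** 2)

def transverse_radius(R, phi, nu):
    """Formula (9) in configuration A, for phi > 0 and nu > 1."""
    eta = delta(phi, nu) / 2.0 + phi - math.pi / 2.0
    return R * math.sin(phi) / math.sin(eta)

def halo_radius(R, V0):
    """Closed form nu R / (nu - 1) with nu = sqrt(1 - 2 V0), for V0 < 0."""
    nu = math.sqrt(1.0 - 2.0 * V0)
    return nu * R / (nu - 1.0)
```

With V0 = −1.5 (so ν = 2) and R = 1, both `planar_radius` at ϕ = 0 and `transverse_radius` at small ϕ approach the closed-form value given by `halo_radius`.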
8. Twisted Cartesian Products

We will describe here a construction of higher dimensional hyperbolic systems which generalizes the Papenbrock stadium, [P], and can be understood in the language of monotonicity as developed in this paper. Let us consider two billiard systems, system 1 and system 2, and their cartesian product. Given monotone trajectories $\gamma_1$ and $\gamma_2$ in systems 1 and 2 respectively, we address monotonicity of the trajectory $(\gamma_1(t), \gamma_2(t))$ in the cartesian product. We are faced with the basic difficulty that the moments of time between which there is monotonicity may be different for $\gamma_1$ and $\gamma_2$. This difficulty disappears if one of the systems has all trajectories monotone between any two points. Examples of such systems are semidispersing billiards and closely related geodesic flows on manifolds of nonpositive sectional curvature. The simplest example is the motion of a point particle in a segment. We will call such systems universally monotone.

Another new element in our construction is monotonicity in the full phase space; so far we have discussed monotonicity of systems on one energy level. This restriction was somewhat hidden in the fact that all our Jacobi fields $J$ satisfied $\langle J', \gamma'\rangle = 0$. When we allow all energy levels we have more Jacobi fields and a trajectory could fail to be monotone on some of the additional fields. Monotonicity on all Jacobi fields will be called ambient. In the cartesian product the kinetic energy is split arbitrarily between system 1 and system 2. In other words $\gamma_1$ may be traversed fast while $\gamma_2$ is traversed slowly. In a pure cartesian product this is not an issue because both kinetic energies are first integrals of motion. However we are going to modify the cartesian product to obtain a hyperbolic system and such modifications are bound to destroy the first integrals; only the total kinetic energy remains constant. We need to consider each of the systems in all of the phase space and check for the ambient monotonicity.
More precisely we need to allow more Jacobi fields by considering families of trajectories $\gamma(t, u)$ (cf. Sect. 2) in which $\left\|\frac{\partial}{\partial t}\gamma(t, u)\right\|$ depends on $u$. It turns out that for billiard systems (and geodesic flows) ambient monotonicity follows automatically from monotonicity. Indeed in such systems the same trajectory can be traversed at different speeds. Hence in particular for any trajectory $\gamma(t)$ there is a constant $a > 0$ such that $\gamma(as)$ is its arc length parameterization. In other words a trajectory on an arbitrary energy level is a reparametrization of a trajectory with unit velocity. We have
$$\gamma(t, u) = \gamma(a(u)s, u), \qquad J(s) = Y(t) + a' s\, w,$$
$$Y(t) = \frac{\partial}{\partial u}\gamma(t, u)\Big|_{u=0}, \qquad J(s) = \frac{\partial}{\partial u}\gamma(a(u)s, u)\Big|_{u=0},$$
$$\frac{d}{dt}Y = \frac{1}{a}\frac{d}{ds}J - \frac{a'}{a}\,w, \quad \text{where } w = \frac{\partial}{\partial t}\gamma(t, 0),\ a = a(0),\ a' = a'(0).$$
Adopting the convention that $'$ is $\frac{d}{dt}$, $\frac{d}{ds}$ or $\frac{d}{du}$, as appropriate, and using $\langle J', w\rangle = 0$, we get
$$Q(Y) = \langle Y, Y'\rangle = \frac{1}{a}\langle J(s), J'\rangle - \frac{a'}{a}\langle J(s), w\rangle + \frac{(a')^2}{a^3}\,s.$$
Once we remember that $\langle J(s), w\rangle$ does not change along a trajectory (because of the invariant split of Jacobi fields into perpendicular and parallel Jacobi fields) we conclude that
$$Q(Y)(t_1) - Q(Y)(t_0) = \frac{1}{a}\bigl(Q(J)(s_1) - Q(J)(s_0)\bigr) + \frac{(a')^2}{a^3}(s_1 - s_0), \quad \text{where } t_i = a s_i,\ i = 0, 1. \qquad (11)$$
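Identity (11) can be sanity-checked on the simplest case of free motion in the plane with no collisions between the two times. The sketch below is an assumption-laden illustration: the family γ(t, u) = (x + uξ) + t v(u), with v(u) a vector of length 1/a(u) in direction θ(u), and every name in the code are hypothetical choices made only for this check.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def check_identity_11(a, a_prime, theta, theta_prime, xi, t0, t1):
    """Return both sides of (11) for a family of straight lines.

    a, a_prime         : a(0) and a'(0); the speed at u = 0 is 1/a
    theta, theta_prime : direction angle of the velocity and its u-derivative
    xi                 : u-derivative of the initial position
    """
    e = (math.cos(theta), math.sin(theta))
    e_perp = (-math.sin(theta), math.cos(theta))
    # Y' is the u-derivative of the velocity e(theta(u)) / a(u) at u = 0.
    Y_prime = tuple(-a_prime / a ** 2 * e[k] + theta_prime / a * e_perp[k]
                    for k in range(2))
    # J' is the u-derivative of the unit velocity a(u) v(u) = e(theta(u)).
    J_prime = tuple(theta_prime * e_perp[k] for k in range(2))

    def Q_Y(t):
        Y = tuple(xi[k] + t * Y_prime[k] for k in range(2))
        return dot(Y, Y_prime)

    def Q_J(s):
        J = tuple(xi[k] + s * J_prime[k] for k in range(2))
        return dot(J, J_prime)

    s0, s1 = t0 / a, t1 / a          # t_i = a s_i
    lhs = Q_Y(t1) - Q_Y(t0)
    rhs = (Q_J(s1) - Q_J(s0)) / a + a_prime ** 2 / a ** 3 * (s1 - s0)
    return lhs, rhs
```

In this free-motion case both sides reduce to (t1 − t0) times the squared length of Y′, which is why the extra term with (a′)² can only increase Q.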
We summarize the consequences of (11) in the following

Proposition 15. If a trajectory traversed with speed one is monotone between two points then the same trajectory traversed at an arbitrary speed satisfies ambient monotonicity between the points. If a trajectory is strictly monotone in the restricted phase space between two points then in the full phase space the only Jacobi fields $Y$ on which $Q$ is not increased are parallel to the velocity and satisfy $Y' = 0$.

Proof. The first part follows immediately from (11). To prove the second part let us observe that by (11) if $\gamma(t)$ is monotone between $\gamma(t_0)$ and $\gamma(t_1)$ and there is no increase of $Q$ on a Jacobi field $Y$ then $a' = 0$, which means that the Jacobi field may be obtained from a family of trajectories in one energy level. Now since the trajectory is assumed to be strictly monotone, the Jacobi field $Y$ must be parallel, i.e., $Y = \mathrm{const}\,\gamma'$, $Y' = 0$.

We are ready to proceed with the construction. We consider a euclidean space with coordinates $(x_0, x_1, \dots, x_k) = (x_0, x)$. Our system 1 is a billiard (or a geodesic flow) in a domain $D$ in the upper halfspace $\{x_0 > 0\}$ with some boundary $D_0$ at $\{x_0 = 0\}$. We will remove the boundary $D_0$ and allow trajectories to enter freely into the lower halfspace $\{x_0 < 0\}$. We assume that in the system 1 any trajectory is strictly monotone between two consecutive visits to $D_0$. Our system 2 is a universally monotone system in the configuration space $E$ which is isometric to $D_0$. It is clear that the product system has all trajectories monotone between consecutive visits of the first component to $D_0$, regardless of where the second component is in $E$. We will now examine the Jacobi fields on which our trajectory is parabolic. Let the euclidean coordinates in $E$ be denoted by $y = (y_1, \dots, y_k)$ and let the coordinate identity map $y = x$ furnish the isometry from $D_0$ to $E$.
Let $\gamma(t) = (\gamma_1(t), \gamma_2(t)) \in D \times E$ be a trajectory of the cartesian product of our systems and let $\gamma(t_i) \in D_0 \times E$, $i = 0, 1$, be two consecutive visits to $D_0 \times E$. We consider a family of trajectories $\gamma(t, u) = (\gamma_1(t, u), \gamma_2(t, u)) = (x_0(t, u), x(t, u), y(t, u))$ including $\gamma(t) = \gamma(t, 0)$ and generating the Jacobi field $Y = \left(\frac{\partial}{\partial u}\gamma_1(t, 0), \frac{\partial}{\partial u}\gamma_2(t, 0)\right)$. Proposition 15 leads immediately to the following

Lemma 16. If there is no increase in $Q$ on a Jacobi field $Y$ between $t_0$ and $t_1$ then $Y' = 0$ and $\frac{\partial}{\partial u}\gamma_1(t, 0) = \mathrm{const}\,\frac{\partial}{\partial t}\gamma_1(t, 0)$.

To finish the construction of our system we consider another system in the domain $\tilde D$ in the lower halfspace $\{x_0 < 0\}$ such that the part of the boundary of $\tilde D$ at $\{x_0 = 0\}$ is the same $D_0$, and with strict monotonicity between consecutive visits to $D_0$. The simplest example would be the reflection of our system 1 into the lower halfspace $\{x_0 < 0\}$. We now glue the domains $D \times E$ and $\tilde D \times E$ along the common boundary $D_0 \times E$ by the isometric map $G(x_0, x, y) = (x_0, -y, x)$. When a trajectory in $D \times E$ reaches $\{x_0 = 0\}$ it is continued into $\tilde D \times E$ after the change of position and velocity by the map $G$. We will call such a system a twisted cartesian product.

Theorem 17. A trajectory in the twisted cartesian product is strictly monotone between every second visit to $\{x_0 = 0\}$. The twisted cartesian product is completely hyperbolic.
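The gluing map G(x0, x, y) = (x0, −y, x) is linear and orthogonal, so it preserves the form Q(Y) = ⟨Y, Y′⟩ when applied to both a field and its derivative. A minimal sketch, with hypothetical function names and integer test vectors so that the comparison is exact:

```python
def glue(point):
    """The gluing map G(x0, x, y) = (x0, -y, x); x and y are k-tuples."""
    x0, x, y = point
    return (x0, tuple(-c for c in y), tuple(x))

def flatten(point):
    x0, x, y = point
    return (x0, *x, *y)

def Q(Y, Y_prime):
    """Euclidean pairing <Y, Y'> on vectors of the form (x0, x, y)."""
    return sum(p * q for p, q in zip(flatten(Y), flatten(Y_prime)))
```

Applying `glue` twice gives (x0, −x, −y) rather than the identity, so the twist is only undone after four applications.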
Proof. We can see that the gluing map $G$ preserves the form $Q$. Hence in the twisted cartesian product trajectories are monotone between visits to $\{x_0 = 0\}$. We need to examine the Jacobi fields on which there is no increase in the value of the form $Q$ between a first visit to $\{x_0 = 0\}$ and the third. Let $Y(t) = (Y_0(t), Y_1(t), Y_2(t))$ be such a field with the second visit to $\{x_0 = 0\}$ at $t = 0$. By Lemma 16 if there is no increase of $Q$ on $Y$ between the first and the second visit then $(Y_0(-0), Y_1(-0)) = c_1\gamma_1'(-0)$, where the values at $-0$ ($+0$) denote the limits at 0 over negative (positive) $t$. Further we get that if there is no increase of $Q$ on the Jacobi field $Y$ between the second and third visit then $(Y_0(+0), Y_1(+0)) = c_2\gamma_1'(+0)$. Taking into account the gluing map $G$ applied at $t = 0$ we have $(Y_0(+0), Y_1(+0), Y_2(+0)) = (Y_0(-0), -Y_2(-0), Y_1(-0))$ and we can conclude that $c_1 = c_2 = c$ and $Y(t) = c\gamma'(t)$ both for $t < 0$ and $t > 0$, i.e., $Y$ is a parallel Jacobi field in the product system. That means that our trajectory is actually strictly monotone between the first and the third visit. It follows that our twisted cartesian product on one energy level is eventually strictly monotone and hence completely hyperbolic. Indeed if we consider the Poincaré section of our flow $\{x_0 = 0, \frac{dx_0}{dt} > 0\}$ we see that consecutive visits to the section are separated by exactly one more visit to $\{x_0 = 0\}$.

Let us consider specific examples of systems with the required properties and their twisted cartesian products.

Example 1. Let the system 1 be the Sinai billiard with one convex scatterer in a square $D$ with $D_0$ being one of the sides of the square and the system 2 be the motion of a point particle in the segment $E = D_0$. The resulting twisted cartesian product is a billiard system in a rectangular box in three dimensions with two cylindrical scatterers having perpendicular directions.
Such systems were introduced by Simanyi and Szasz, [S-S], in a more general case of cylinders with arbitrary directions. Strictly speaking our analysis of strict monotonicity is incomplete for such a system since in the Sinai billiard the trajectories which do not collide with the scatterer between consecutive visits to $D_0$ are parabolic. However it follows easily from our analysis that every trajectory is strictly monotone if it encounters both scatterers. To establish complete hyperbolicity of such a system it remains to show that the trajectories that encounter at most one scatterer form a set of zero Lebesgue measure. This was established in the paper [S-S], in a more general case.

Example 2. Let the system 1 be the billiard in a convex domain $D$ without corners with the curved part of the boundary satisfying the property of convex scattering ($\frac{d^2 r}{ds^2} < 0$), and $D_0$ being the flat part of the boundary (Fig. 5a). We have that any trajectory in $D$ is strictly monotone between consecutive visits to $D_0$, [W3]. The system 2 is again the motion of a point particle in the segment $E = D_0$. By Theorem 17 the twisted cartesian product is completely hyperbolic. If instead the system 1 is “half of a Bunimovich stadium” as in Fig. 5b, then the twisted cartesian product is the Papenbrock stadium. The first proof that the Papenbrock stadium is completely hyperbolic was obtained by Bunimovich and del Magno, [B-M].

Example 3. We can take as system 1 a rectangular box in 3 dimensions with a spherical cap on one side and the square $D_0$ on the other, that is essentially the Bunimovich-Rehacek system discussed in Sect. 6, [B-R]. The system 2 is the uniform motion of a point particle in the square $E = D_0$. Theorem 17 is again not immediately applicable
[Fig. 5. Panels (a) and (b); labels recoverable from the figure: $D_0$ and $E$ in each panel.]
because there are orbits in D that visit D0 twice without entering into the spherical cap. However the proof of Theorem 17 gives us strict monotonicity on trajectories that enter both caps, in {x0 > 0} and in {x0 < 0}. The proof that almost all trajectories have this property is straightforward but cumbersome and we omit the details. Let us finally observe that while the billiard domain in Example 1 can be modified with the preservation of strict monotonicity, as shown in [S-S], the billiard systems of Examples 2 and 3 are rigid in the sense that a typical perturbation of the billiard domain destroys the arguments used in Theorem 17. These arguments are based on partial integrability of cartesian products. Thus again we are confronted with the fragility of complete monotonicity in billiards in higher dimensions (≥ 3) with convex pieces of the boundary. It is an open problem to produce a more robust construction, or to explain why it cannot be done. Acknowledgements. Much of the work on this paper was done while the author visited the Institute of Mathematics of the Polish Academy of Sciences in Warsaw and the Institute Henri Poincaré in Paris. I am grateful for the warm hospitality and excellent working conditions that I enjoyed in Warsaw and Paris in the Spring of 2005. Supported in part by the NSF grant DMS-0406074. I am also grateful to Paul Wright and the anonymous referees for their valuable remarks.
References

[Ba] Baldwin, P.R.: Soft billiard systems. Physica D 29, 321–342 (1988)
[B1] Bunimovich, L.A.: On the ergodic properties of nowhere dispersing billiards. Commun. Math. Phys. 65, 295–312 (1979)
[B2] Bunimovich, L.A.: Many-dimensional nowhere dispersing billiards with chaotic behavior. Physica D 33, 58–64 (1988)
[B3] Bunimovich, L.A.: On absolutely focusing mirrors. In: Ergodic Theory and related topics, III, Gustrow 1990, U. Krengel (ed), Lecture Notes in Math. 1514, Berlin-Heidelberg-New York: Springer, 1992, pp. 62–82
[B-D] Bunimovich, L.A., Del Magno, G.: Semi-focusing billiards: hyperbolicity. Commun. Math. Phys. 262, 17–32 (2006)
[B-R] Bunimovich, L.A., Rehacek, J.: How high-dimensional stadia look like. Commun. Math. Phys. 197, 277–301 (1998)
[B-T] Balint, P., Toth, I.P.: Hyperbolicity in multi-dimensional hamiltonian systems with applications to soft billiards. Disc. Cont. Dyn. Syst. 15, 37–59 (2006)
[Ch-S] Chernov, N.I., Sinai, Ya.G.: Ergodic properties of some systems of 2-dimensional discs and 3-dimensional spheres. Russ. Math. Surv. 42, 181–207 (1987)
[Ch-M] Chernov, N.I., Markarian, R.: Billiards. Providence, RI: Amer. Math. Soc. 2005
[D] Donnay, V.: Using integrability to produce chaos: billiards with positive entropy. Commun. Math. Phys. 141, 225–257 (1991)
[H] Hard ball systems and the Lorentz gas, ed. D. Szasz, Berlin-Heidelberg-New York: Springer-Verlag, 2000
[K1] Knauf, A.: Ergodic and topological properties of coulombic periodic potentials. Commun. Math. Phys. 110, 89–112 (1987)
[K2] Knauf, A.: On soft billiard system. Physica D 36, 259–262 (1989)
[K-S] Katok, A., Strelcyn, J.-M.: with the collaboration of F. Ledrappier, F. Przytycki: Invariant manifolds, entropy and billiards; smooth maps with singularities. Lecture Notes in Math. 1222, Berlin-Heidelberg-New York: Springer-Verlag 1986
[K-T] Kozlov, V.V., Treschev, D.V.: Billiards. A genetic introduction to the dynamics of systems with impacts. Providence, RI: Amer. Math. Soc. 1990
[L] Lazutkin, V.F.: On the existence of caustics for the billiard ball problem in a convex domain. Math. USSR Izv. 7, 185–215 (1973)
[M] Markarian, R.: Billiards with Pesin region of measure one. Commun. Math. Phys. 118, 87–97 (1988)
[P] Papenbrock, T.: Numerical study of a three dimensional generalized stadium billiard. Phys. Rev. E 61, 4626–4628 (2000)
[S-S] Simanyi, N., Szasz, D.: Nonintegrability of cylindric billiards and transitive Lie group actions. Erg. Th. Dyn. Sys. 20, 593–610 (2000)
[T] Tabachnikov, S.: Billiards. Soc. Math. France 1995
[W1] Wojtkowski, M.P.: Systems of classical interacting particles with nonvanishing Lyapunov exponents. In: Lyapunov Exponents, Proceedings, Oberwolfach 1990, L. Arnold, H. Crauel, J.-P. Eckmann (eds.), Lecture Notes in Math. 1486, Berlin-Heidelberg-New York: Springer, 1991, pp. 243–262
[W2] Wojtkowski, M.P.: Monotonicity, J-algebra of Potapov and Lyapunov exponents. Smooth Ergodic Theory and Its Applications, Proc. Symp. Pure Math. 69, Providence, RI: Amer. Math. Soc. (2001) pp. 499–521
[W3] Wojtkowski, M.P.: Principles for the design of billiards with nonvanishing Lyapunov exponents. Commun. Math. Phys. 105, 391–414 (1986)
[W4] Wojtkowski, M.P.: Linearly stable orbits in 3-dimensional billiards. Commun. Math. Phys. 129, 319–327 (1990)
[W5] Wojtkowski, M.P.: Hamiltonian systems with linear potential and elastic constraints. Fundamenta Mathematicae 157, 305–341 (1998)
[W6] Wojtkowski, M.P.: Complete hyperbolicity in hamiltonian systems with linear potential and elastic collisions. Rep. Math. Phys. 44, 301–312 (1999)
Communicated by G. Gallavotti
Commun. Math. Phys. 273, 305–315 (2007)
Digital Object Identifier (DOI) 10.1007/s00220-007-0231-5

Counting Regions with Bounded Surface Area

P. N. Balister, B. Bollobás
Department of Mathematical Science, University of Memphis, Memphis, TN 38152-3240, USA. E-mail: [email protected]
Received: 2 March 2006 / Accepted: 28 July 2006
Published online: 12 May 2007 – © Springer-Verlag 2007
Abstract: Define a cubical complex to be a collection of integer-aligned unit cubes in $d$ dimensions. Lebowitz and Mazel (1998) proved that there are between $(C_1 d)^{n/2d}$ and $(C_2 d)^{64n/d}$ complexes containing a fixed cube with connected boundary of $(d-1)$-volume $n$. In this paper we narrow these bounds to between $(C_3 d)^{n/d}$ and $(C_4 d)^{2n/d}$. We also show that there are $n^{n/(2d(d-1)) + o(1)}$ connected complexes containing a fixed cube with (not necessarily connected) boundary of volume $n$.

1. Introduction

Define an $r$-cube $C$ to be an $r$-dimensional unit cube in $\mathbb{R}^d$ with vertices in $\mathbb{Z}^d$. In other words, a set of the following form:
$$C = C(a, I) = \{x \in \mathbb{R}^d : x_i = a_i \text{ for } i \notin I,\ a_i \le x_i \le a_i + 1 \text{ for } i \in I\},$$
where $a = (a_1, \dots, a_d) \in \mathbb{Z}^d$ and $I$ is a subset of $\{1, \dots, d\}$ of size $r$. Define an $r$-dimensional cubical complex (or $r$-complex) $B$ to be a finite union of $r$-cubes in $\mathbb{R}^d$. We shall call a complex rooted if it contains the cube $C_r = C(0, \{1, \dots, r\})$. Define the volume $|B|$ of $B$ to be the number of $r$-cubes in $B$. We shall define the boundary $\partial C$ of a cube $C$ to be the $(r-1)$-complex which is the union of the $r$ pairs of faces $C((a_1, \dots, a_i, \dots, a_d), I \setminus \{i\})$ and $C((a_1, \dots, a_i + 1, \dots, a_d), I \setminus \{i\})$ for $i \in I$. We shall also define the boundary $\partial B$ of the complex $B = \bigcup_{i=1}^{n} C_i$ to be the $(r-1)$-complex which contains each $(r-1)$-cube that occurs in an odd number of boundaries $\partial C_i$. (We shall avoid issues of orientation in this paper.) We say $B$ is closed if $\partial B = \emptyset$. Define the surface area of $B$ to be the volume of the boundary $|\partial B|$. We say that an $r$-complex $B$ is connected if it is connected via its $(r-1)$-dimensional faces. More formally, let $G$ be the graph with vertices equal to the component $r$-cubes of $B$ and two vertices joined by an edge when these cubes share a common $(r-1)$-dimensional face. Then $B$ is connected precisely when $G$ is connected.
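The definitions of cubes, boundaries and connectivity above translate directly into a few lines of code. The sketch below is illustrative only: it represents a cube C(a, I) as a pair of a tuple and a frozenset, uses 0-based coordinate indices (an assumption; the text indexes coordinates from 1), and all function names are hypothetical.

```python
def cube(a, I):
    """The cube C(a, I): corner a in Z^d, set I of coordinates with positive extent."""
    return (tuple(a), frozenset(I))

def faces(c):
    """The 2r faces of an r-cube, as (r-1)-cubes."""
    a, I = c
    result = []
    for i in I:
        J = I - {i}
        result.append((a, J))                 # face at x_i = a_i
        b = list(a)
        b[i] += 1
        result.append((tuple(b), J))          # face at x_i = a_i + 1
    return result

def boundary(B):
    """Faces occurring in an odd number of cube boundaries (mod 2 sum)."""
    result = set()
    for c in B:
        for f in faces(c):
            result ^= {f}
    return result

def is_connected(B):
    """Connectivity of the graph whose edges join cubes sharing a face."""
    B = set(B)
    if not B:
        return True
    start = next(iter(B))
    seen, stack = {start}, [start]
    while stack:
        c = stack.pop()
        fc = set(faces(c))
        for d in B - seen:
            if fc & set(faces(d)):
                seen.add(d)
                stack.append(d)
    return seen == B
```

For instance, a single unit square in the plane has surface area 4 and its boundary is closed, while two squares sharing an edge have surface area 6 because the common edge cancels.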
Fig. 1. Examples of contours in 2 dimensions
The number of $d$-dimensional cubical complexes with a given volume or surface area is interesting in its own right; however, it also has applications to the Ising model in $d$ dimensions, where the convergence of the low temperature expansion is dependent on the number of Peierls contours, i.e., the number of connected boundaries of rooted cubical complexes (see Lebowitz and Mazel [3]). Following the notation of [3], we define a contour to be the boundary of some rooted $d$-complex, provided that this boundary is itself a connected $(d-1)$-complex. A contour is primitive if it is minimal, i.e., it is not a disjoint union of two non-empty contours. Note that, in general, if $\partial B$ is a contour, then the cubes of $B$ need only be connected via $(d-2)$-dimensional cubes. On the other hand, if we insist that $B$ is itself connected, it does not follow that $\partial B$ is a contour, since $\partial B$ may not be connected, and even if it is, it is not necessarily primitive (see Fig. 1 for some examples with $d = 2$). However, if $\partial B$ is primitive then $B$ must be connected (since $\partial B$ is the disjoint union of the boundaries of the components of $B$).

Let $\mathcal{B}_d$ be the set of rooted $d$-complexes in $\mathbb{R}^d$ with primitive boundaries, $\mathcal{B}'_d$ the rooted $d$-complexes (possibly disconnected) with connected boundary, and $\mathcal{B}''_d$ the connected rooted $d$-complexes (possibly with disconnected boundary, see Fig. 1). Write $S_d(n)$ (respectively $S'_d(n)$, $S''_d(n)$) for the number of elements of $\mathcal{B}_d$ (respectively $\mathcal{B}'_d$, $\mathcal{B}''_d$) with surface area $n$. Write $V_d(n)$ for the number of connected rooted $d$-complexes with volume $n$. Note that all these quantities are finite. In this note we shall give upper and lower bounds for all of these quantities.

2. Preliminary Results

For two $r$-complexes, $B_1$ and $B_2$, define $B_1 \oplus B_2$ to be the complex formed from all $r$-cubes that are in either $B_1$ or $B_2$ but not both. Note that $\partial(B_1 \oplus B_2) = \partial B_1 \oplus \partial B_2$.
Also, for each complex $B$ and for $1 \le i \le d$, define $B_i^{=}$ to be the subcomplex of all $r$-cubes of $B$ that have zero extent in dimension $i$, i.e., that are contained in some hyperplane $x_i = c$. Define $B_i^{\perp}$ to be the subcomplex consisting of all the $r$-cubes of $B$ which have positive extent in dimension $i$, so that $B = B_i^{\perp} \oplus B_i^{=}$. The cubes in $B^{\perp}$ will be called vertical cubes, and the cubes in $B^{=}$ will be called horizontal cubes. The following slightly technical lemma will be useful.

Lemma 1. Assume $B = B_d^{\perp}$ and $\partial B \subseteq \mathbb{R}^{d-1}$, where $\mathbb{R}^{d-1}$ is identified with the hyperplane $x_d = 0$ in $\mathbb{R}^d$. Then $B = \emptyset$.

Proof. Assume $B \ne \emptyset$ and let $a \in \mathbb{Z}$ be the maximum integer such that $B$ meets the hyperplane $x_d = a$. Then $B$ contains some $r$-cube $C \times [a-1, a]$ with $C \subseteq \mathbb{R}^{d-1}$. The face $C \times \{a\}$ of this $r$-cube is a face of precisely two $r$-cubes in $\mathbb{R}^d$ with positive extent in dimension $d$. One of these is $C \times [a-1, a]$, the other is $C \times [a, a+1]$. Only the first of these is in $B$, so $C \times \{a\} \subseteq \partial B \subseteq \mathbb{R}^{d-1}$. Hence $a = 0$ and $B \subseteq \mathbb{R}^{d-1} \times (-\infty, 0]$. A similar argument holds for the minimal $a$ and shows that $B \subseteq \mathbb{R}^{d-1} \times [0, \infty)$. Thus $B \subseteq \mathbb{R}^{d-1} \times \{0\}$, contradicting the assumption that every cube of $B$ has positive extent in dimension $d$.
Lemma 2. An $r$-complex $B$ is closed if and only if $B = \partial B'$ for some $B'$.

Proof. Since each $(r-2)$-dimensional subcube of an $r$-cube $C$ is contained in precisely two faces of $C$, $\partial\partial C = \emptyset$. Hence $\partial B$ is closed for all $B$. We now prove the converse. For each cube $C \times \{a\}$ in $B_d^{=}$ with $a \ne 0$, construct the stack $C \times [0, a]$ (or $C \times [a, 0]$ if $a < 0$). The $\oplus$-sum of all these stacks is a complex $E$ with $(\partial E)_d^{=}$ agreeing with $B_d^{=}$ outside $\mathbb{R}^{d-1}$. Let $F = B \oplus \partial E$. Then $F_d^{=} \subseteq \mathbb{R}^{d-1}$. Now $F$ is closed, so $\partial(F_d^{\perp}) = \partial(F_d^{=}) \subseteq \mathbb{R}^{d-1}$. Hence, by Lemma 1, $F_d^{\perp} = \emptyset$ and $F = F_d^{=} \subseteq \mathbb{R}^{d-1}$ is a closed complex in $\mathbb{R}^{d-1}$. By induction on $d$ it is equal to $\partial F'$ for some $F'$. Now $B = \partial(E \oplus F')$ as required.
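Lemma 3. If $B$ and $B'$ are two $d$-complexes and $(\partial B)_d^{=} = (\partial B')_d^{=}$, then $B = B'$.

Proof. Let $E = B \oplus B'$. Then $(\partial E)_d^{=} = \emptyset$, so $\partial E = (\partial E)_d^{\perp}$. Also $\partial\partial E = \emptyset \subseteq \mathbb{R}^{d-1}$, so by Lemma 1, $\partial E = \emptyset$. But $E_d^{\perp} = E$ since every $d$-cube has positive extent in dimension $d$. Hence by Lemma 1 again, $E = \emptyset$, and so $B = B'$.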
Lemma 3 implies that the boundary $\partial B$ of a $d$-complex determines the complex $B$. Hence counting contours is equivalent to counting elements of $\mathcal{B}'_d$, while counting primitive contours is equivalent to counting elements of $\mathcal{B}_d$.

Following [3], we construct a floor-stack (multi-)graph $G$ of the boundary $B$ of a $d$-complex as follows. Decompose $B_d^{=}$ as a union of connected components or floors $F_i$. Decompose $B_d^{\perp}$ into a union of complexes of the form $E_j = C \times [a, b]$, where $C$ is a $(d-2)$-cube in $\mathbb{R}^{d-1}$ and $a, b \in \mathbb{Z}$ with $b - a$ maximal. In other words, we group together the component cubes of $B_d^{\perp}$ as maximal stacks of cubes in the $d$th dimension. Since $\partial B = \emptyset$, $C \times \{a\}$ and $C \times \{b\}$ must lie in $\partial(B_d^{=}) = \partial(B_d^{\perp})$, and hence in some $\partial F_i$ and $\partial F_{i'}$. In other words, $E_j$ joins $F_i$ and $F_{i'}$. Let the vertices of $G$ be the floors $F_i$ and join $F_i$ and $F_{i'}$ whenever there is a stack $E_j$ joining $F_i$ and $F_{i'}$.

Lemma 4. If $B \in \mathcal{B}_d$ then the floor-stack graph of $\partial B$ is a connected graph.

Proof. Let $E$ be the union of the floors $F_i$ and stacks $E_j$ in one component of the graph $G$. Let $C$ be a horizontal $(d-2)$-cube of $\partial E$. Now $C$ lies in the boundary of four $(d-1)$-cubes, two horizontal and two vertical. If $C$ lies in $\partial E_j$, then precisely one of these vertical $(d-1)$-cubes lies in $\partial B$. But then one of the horizontal $(d-1)$-cubes must also lie in $\partial B$, since otherwise $C$ would lie in $\partial\partial B = \emptyset$. Thus $C$ lies in the boundary of some $F_i$. But this $F_i$ is then an endvertex of $E_j$, so also lies in the chosen component of $G$. But then $C \notin \partial E$, a contradiction. Similarly, if $C$ lies in the boundary of an $F_i$, then it lies at the end of a stack $E_j$ in the same component of $G$, once again leading to a contradiction. Hence $(\partial E)_d^{=} = \emptyset$, and thus $\partial E = (\partial E)_d^{\perp}$. Since $\partial\partial E = \emptyset \subseteq \mathbb{R}^{d-1}$, Lemma 1 implies $\partial E = \emptyset$. Thus $E$ is a contour that is contained in $\partial B$. Since $\partial B$ is primitive, $E = \partial B$ and so $G$ is connected.
Note that the floor-stack graph may be disconnected for non-primitive contours. See Fig. 2 for an example in 3 dimensions.
Fig. 2. Example of contour with disconnected floor-stack graph
3. Bounds for $V_d(n)$

We start with the easiest quantity to estimate, namely $V_d(n)$, since this illustrates some of the techniques that we shall use for the other quantities.

Theorem 5. For all $n \ge 1$,
$$d^{\,n-1} \le V_d(n) \le \frac{1}{2d-1}(2ed)^n.$$

Proof. The cube $C_d$ is connected to $2d$ other $d$-cubes. For each of these choose an affine transformation that maps $C_d$ onto this $d$-cube. We can construct any connected rooted $d$-complex by gluing smaller complexes onto $C_d$ at some or all of the adjacent $d$-cubes via their root cubes using the affine transformations defined above. If we define the polynomial $f_L(X)$ inductively by $f_0(X) = X$ and $f_{L+1}(X) = X(1 + f_L(X))^{2d}$, then the coefficient $a_{n,L}$ of $X^n$ in $f_L$ is an upper bound on the number of complexes of volume $n$ that can be constructed by the above process in at most $L$ steps (i.e., complexes for which every cube is within graph-distance $L$ of the root cube). As $L$ increases $a_{n,L}$ increases, and for $L \ge n$, $a_{n,L}$ is constant, say $a_{n,L} = a_n$. Thus $f_L(X)$ increases monotonically to $f(X) = \sum a_n X^n$ provided $X$ is within the radius of convergence of this limiting series. Hence the number of complexes of volume $n$ is bounded above by the coefficient $a_n$ in the generating function $f(X) = \sum_{n=1}^{\infty} a_n X^n$, where $f(X)$ satisfies the equation
$$f(X) = X(1 + f(X))^{2d}. \tag{1}$$
Rewrite Eq. (1) as $X = f(1 + f)^{-2d}$ and maximize $X$. If the maximal $X = X_c$ occurs at $f = f_c$ then one sees inductively that $f_L(X_c) \le f_c$ for all $L$. Hence the generating function $f(X)$ converges for all $X \le X_c$. By logarithmic differentiation,
$$\frac{1}{X}\frac{dX}{df} = \frac{1}{f} - \frac{2d}{1+f},$$
so at $f = f_c$, $\frac{1}{f_c} = \frac{2d}{1+f_c}$. Thus $f_c = \frac{1}{2d-1}$ and
$$X_c = (2d-1)^{2d-1}(2d)^{-2d} = \Big(1 + \frac{1}{2d-1}\Big)^{-(2d-1)}(2d)^{-1} \ge (2ed)^{-1}.$$
Therefore
$$V_d(n) \le \sum_{i=1}^{n} a_i \le f(X_c)X_c^{-n} \le \frac{(2ed)^n}{2d-1}.$$
For the lower bound, note that for each sequence $(d_2, \dots, d_n)$ with $d_i \in \{1, 2, \dots, d\}$ we can construct a complex by taking a sequence of $d$-cubes with the $i$th cube located one step in the positive $d_i$th direction from the $(i-1)$st cube. This gives $d^{\,n-1}$ distinct connected complexes.
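Theorem 5 can be checked numerically for small cases. The sketch below is an illustration under stated assumptions rather than the authors' computation: `series_coeffs` iterates the recursion $f \mapsto X(1+f)^{2d}$ on truncated integer polynomials (starting from $0$, which has the same limit), and `count_rooted` enumerates connected rooted $d$-complexes by brute force. Both function names are hypothetical, and cubes are identified with their integer corner.

```python
import math


def mul(p, q, n_max):
    """Product of two polynomials (coefficient lists), truncated after X^n_max."""
    out = [0] * (n_max + 1)
    for i, pi in enumerate(p):
        if pi == 0:
            continue
        for j, qj in enumerate(q):
            if i + j > n_max:
                break
            out[i + j] += pi * qj
    return out


def series_coeffs(d, n_max):
    """Coefficients a_1..a_n_max of the solution of f = X (1 + f)^(2d)."""
    f = [0] * (n_max + 1)
    for _step in range(n_max):          # each pass fixes one more coefficient
        base = [1] + f[1:]              # the polynomial 1 + f
        power = [1] + [0] * n_max
        for _factor in range(2 * d):
            power = mul(power, base, n_max)
        f = [0] + power[:n_max]         # multiply by X
    return f[1:]


def count_rooted(d, n):
    """Number of face-connected sets of n unit d-cubes containing the root cube."""
    root = (0,) * d
    level = {frozenset([root])}
    for _ in range(n - 1):
        grown = set()
        for shape in level:
            for c in shape:
                for i in range(d):
                    for step in (-1, 1):
                        nb = c[:i] + (c[i] + step,) + c[i + 1:]
                        if nb not in shape:
                            grown.add(shape | {nb})
        level = grown
    return len(level)


d = 2
a = series_coeffs(d, 4)
for n in range(1, 5):
    v = count_rooted(d, n)
    upper = (2 * math.e * d) ** n / (2 * d - 1)
    print(n, d ** (n - 1), v, a[n - 1], round(upper, 1))
```

For $d = 2$ the columns show $d^{\,n-1} \le V_d(n) \le a_n \le (2ed)^n/(2d-1)$ for $n \le 4$; the brute-force count is only feasible for very small $n$.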
4. Bounds for $S_d(n)$

We start with an upper bound for the number of primitive contours with given $(d-1)$-dimensional volume. Note that this volume is always even since the surface area of each cube is even.

Theorem 6. For all $d \ge 2$ and even $n \ge 2d$,
$$S_d(n) \le \frac{n}{8d^3}(8e^2d^2)^{n/d} \le (8d)^{2n/d}.$$

Proof. Let $B = \partial B'$ be a primitive contour. Then, by Lemma 4, the floor-stack graph $G$ of $B$ is connected. Fix a spanning tree of $G$. Then we can reconstruct the floors by specifying each floor as a rooted $(d-1)$-complex together with connecting stacks. We can obtain an upper bound for the number of primitive contours containing the cube $C_{d-1}$ by alternately growing floors and stacks. We shall define a generating function $g(X, Y) = \sum a_{r,s} X^r Y^s$, where $a_{r,s}$ will bound the number of possible spanning trees with total stack size $s$ and total floor volume $r$. We define $g$ by
$$g(X, Y) = X(1 + \kappa g(X, Y))^{2(d-1)}, \tag{2}$$
where
$$\kappa = \frac{4Y}{1-Y} + 1. \tag{3}$$
To see that this gives an upper bound, consider growing a complex starting with $C_{d-1}$. For each of the $2(d-1)$ faces of $C_{d-1}$ we can either attach nothing, attach the neighboring horizontal $(d-1)$-cube (extending the current floor), or attach a stack, together with a horizontal $(d-1)$-cube at the other end of the stack. Note that we can never attach two stacks (since together they would form a single stack) or a stack and a horizontal cube (since then the stack would not lie in the boundary of the floor). In the cases when we attach cubes, we continue building the complex from the new horizontal $(d-1)$-cube. If we attach a stack, then the stack can go in one of two directions (up or down) and the horizontal $(d-1)$-cube at the other end of the stack can be attached in one of two positions. The stack can be any positive integral length. Hence we get a factor of $4(Y + Y^2 + Y^3 + \dots) = 4Y/(1-Y)$. Adding one to include the possibility of extending the floor, we get a factor of $\kappa g$ for each face of $C_{d-1}$ that we add something to.

If we define $g_0(X, Y) = 0$ and $g_{L+1} = X(1 + \kappa g_L)^{2(d-1)}$ then $g_L$ is a polynomial in $X$ with each coefficient a polynomial in $Y$ divided by some power of $1 - Y$. If $0 < Y < 1$ then the coefficients increase and stabilize at the corresponding coefficients of $g$. Hence, as before, $g$ converges provided $X \le X_c$, where $X_c$ is the maximum value of $g/(1 + \kappa g)^{2(d-1)}$. This maximum occurs at $g = g_c = 1/((2d-3)\kappa)$ with
$$X_c = g_c(1 + \kappa g_c)^{-2(d-1)} = \kappa^{-1}(2d-3)^{2d-3}(2d-2)^{-(2d-2)} \ge (2(d-1)\kappa e)^{-1}. \tag{4}$$

Next, we bound the number of contours containing a fixed vertical $(d-1)$-cube as the root. In this case, we grow the spanning tree of $G$ starting with a stack. Since the root may lie in the middle of a stack, and there are floors at each end of the stack, the generating function for these is bounded by
$$\tilde g(X, Y) = (Y + 2Y^2 + 3Y^3 + \dots)(2g(X, Y))^2 = \frac{4Y g(X, Y)^2}{(1-Y)^2}.$$
The term $kY^k(2g(X, Y))^2$ comes from choosing the stack of length $k$ (with $k$ possible choices for the root). We then grow the contour starting with two floors, each of which starts in one of two directions.

Let $h(X)$ be the generating function for the number of primitive contours containing the cube $C_{d-1}$. Fix such a contour $B$. Then $B$ contributes a term of the form $X^r Y^s$ to $g(X, Y)$, where $r = |B_d^{=}|$ and $s$ is the sum of the stack lengths, in particular $s \le n = |B|$. Indeed, $B$ contributes many such terms, one for each spanning tree of the floor-stack graph. Now consider the above construction, except that instead of taking dimension $d$ as vertical, take dimension $i$ as vertical for each $i = 1, \dots, d$. Then $B$ contributes at least $\sum_{i=1}^{d} X^{r_i} Y^{s_i}$ to the generating function $g(X, Y) + (d-1)\tilde g(X, Y)$. (The root $C_{d-1}$ is vertical in $(d-1)$ dimensions and horizontal in only one.) Since $r_i = |B_i^{=}|$, $\sum r_i = n$ and $s_i \le n$. Thus the AM-GM inequality and the fact that $Y < 1$ gives
$$\sum_{i=1}^{d} X^{r_i} Y^{s_i} \ge d\,X^{n/d}\,Y^{\sum s_i/d} \ge d\,X^{n/d}\,Y^{n}.$$
Hence, for any $0 < Y < 1$ and $0 < X < X_c = X_c(Y)$ we have
$$g(X, Y) + (d-1)\tilde g(X, Y) \ge d\,h(X^{1/d}Y).$$
We wish to maximize $X^{1/d}Y$ subject to remaining inside the domain of convergence of $g$ and $\tilde g$. A reasonably good choice is $Y = Y_0 = \frac{d}{d+1}$ and $X_0 = \frac{1}{8ed^2}$. Then $\kappa = 4d + 1$, $X_c \ge (2e(d-1)(4d+1))^{-1} \ge X_0$, and
$$X_0 Y_0^d \ge \frac{1}{8ed^2(1 + 1/d)^d} \ge X_1 = \frac{1}{8e^2d^2}.$$
Now
$$\Big(1 + \frac{\kappa}{8d^2}\Big)^{2(d-1)} \le e^{2(4d+1)(d-1)/8d^2} \le e,$$
so
$$\frac{1}{8d^2}\Big(1 + \frac{\kappa}{8d^2}\Big)^{-2(d-1)} \ge \frac{1}{8ed^2} = X_0.$$
Hence $g_0 = g(X_0, Y_0) \le \frac{1}{8d^2}$. Also $\tilde g_0 = \tilde g(X_0, Y_0) = 4d(d+1)g_0^2 \le \frac{1}{8d^2}$. Thus the number of primitive contours of size $n$ containing $C_{d-1}$ is at most
$$h(X_1^{1/d})X_1^{-n/d} \le \frac{1}{d}\big(g_0 + (d-1)\tilde g_0\big)X_1^{-n/d} \le \frac{1}{8d^2}(8e^2d^2)^{n/d}.$$
Each contour surrounding $C_d$ must contain a vertical translate of $C_{d-1}$ at a vertical distance less than $n/(2(d-1)) \le n/d$ below the hyperplane $x_d = 0$. Thus
$$S_d(n) \le \frac{n}{8d^3}(8e^2d^2)^{n/d}.$$
Finally for $d \ge 2$,
$$\frac{n}{8d^3} \le e^{n/8d^3} \le e^{0.04n/d},$$
so
$$\frac{n}{8d^3}(8e^2d^2)^{n/d} \le (8e^{2.04}d^2)^{n/d} \le (8d)^{2n/d}.$$
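The numerical choices in the proof of Theorem 6 ($Y_0 = d/(d+1)$, $X_0 = 1/(8ed^2)$) can be sanity-checked in floating point. The sketch below is an illustration only: the helper name `check_constants` is hypothetical, and $g(X_0, Y_0)$ is approximated by iterating $g_{L+1} = X_0(1 + \kappa g_L)^{2(d-1)}$ from $g_0 = 0$ a fixed number of times, which gives a lower approximation of the limit.

```python
import math


def check_constants(d, iterations=200):
    """Return (kappa, X_c, X_0, g_0, g~_0) at the point Y_0 = d/(d+1), X_0 = 1/(8 e d^2)."""
    Y0 = d / (d + 1)
    X0 = 1 / (8 * math.e * d * d)
    kappa = 4 * Y0 / (1 - Y0) + 1            # Eq. (3); should equal 4d + 1
    m = 2 * (d - 1)
    # Eq. (4): X_c = kappa^{-1} (2d-3)^{2d-3} (2d-2)^{-(2d-2)}, computed via logarithms
    log_Xc = -math.log(kappa) + (m - 1) * math.log(m - 1) - m * math.log(m)
    Xc = math.exp(log_Xc)
    g = 0.0
    for _ in range(iterations):              # iterate Eq. (2) at (X_0, Y_0)
        g = X0 * (1 + kappa * g) ** m
    g_tilde = 4 * Y0 / (1 - Y0) ** 2 * g * g
    return kappa, Xc, X0, g, g_tilde


for d in (2, 3, 5, 10):
    kappa, Xc, X0, g0, gt0 = check_constants(d)
    print(d, round(kappa, 6), Xc >= X0, g0 <= 1 / (8 * d * d), gt0 <= 1 / (8 * d * d))
```

Each printed row reports $\kappa$ followed by three booleans for the inequalities $X_c \ge X_0$, $g_0 \le 1/(8d^2)$ and $\tilde g_0 \le 1/(8d^2)$ used in the proof.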
Now, we turn to a lower bound on $S_d(n)$.

Theorem 7. For all $d \ge 2$ and all even $n \ge 4d^2$ we have
$$S_d(n) \ge d^{\frac{n-4d^2}{2(d-1)}} \ge (Cd)^{n/2d}.$$

Proof. Let us use the procedure defined in Theorem 5 that for each sequence $(d_2, \dots, d_{k+1})$ with $d_i \in \{1, 2, \dots, d\}$ builds a complex by taking a sequence of $d$-cubes with the $i$th cube located one step in the positive $d_i$th direction from the $(i-1)$st cube. This gives $d^k$ distinct connected complexes with surface area $2(k+1)(d-1) + 2$. To get an arbitrary even surface area, add a $2 \times j \times 1 \times \dots \times 1$ block in one of the negative directions. This increases the surface area by $j(4d-6) + 4 - 2$ (the $-2$ is due to the loss of the joining face). Thus we obtain a complex with surface area $2(k+1+2j)(d-1) + 4 - 2j$. If we choose $j$ so that $4 - 2j \equiv n \bmod 2(d-1)$ then one can solve $n = 2(k+1+2j)(d-1) + 4 - 2j$ for $k$. We can choose $j$ so that $1 \le j \le d-1$, so $n \le 2k(d-1) + 4(d-1)^2 + 4 \le 2k(d-1) + 4d^2$. The result follows.
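The surface area $2(k+1)(d-1)+2$ of the staircase complexes in this proof is easy to confirm by direct counting. In the sketch below, which is an illustration and not the authors' code, a $d$-cube is identified with its integer corner and the surface area of a $d$-complex is obtained by counting cube faces that have no neighbouring cube; `staircase` and `surface_area` are hypothetical names and directions are 0-based.

```python
from itertools import product


def surface_area(cubes):
    """Number of (d-1)-faces lying on exactly one d-cube of the complex."""
    cubes = set(cubes)
    area = 0
    for c in cubes:
        for i in range(len(c)):
            for step in (-1, 1):
                nb = c[:i] + (c[i] + step,) + c[i + 1:]
                if nb not in cubes:
                    area += 1
    return area


def staircase(d, directions):
    """Cubes of the path that starts at the root and steps in the given positive directions."""
    pos = (0,) * d
    cubes = [pos]
    for i in directions:
        pos = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
        cubes.append(pos)
    return cubes


d, k = 3, 5
path = staircase(d, [0, 1, 1, 2, 0])
print(surface_area(path), 2 * (k + 1) * (d - 1) + 2)   # both equal 26

# All d^k direction sequences give distinct complexes.
shapes = {frozenset(staircase(2, seq)) for seq in product(range(2), repeat=3)}
print(len(shapes))                                     # 2^3 = 8
```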
Combining the upper and lower bounds we see that $(C_1 d)^{n/2d} \le S_d(n) \le (C_2 d)^{2n/d}$ for sufficiently large even $n$.

5. Bounds for $S'_d(n)$

Now we extend the results of the previous section to count all contours, rather than just primitive ones.

Theorem 8. For all $d \ge 2$ and large even $n$,
$$S'_d(n) \le \frac{n}{8d^3}(8e^{17/8}d^2)^{n/d} \le (9d)^{2n/d}.$$
Proof. As in Theorem 6, let $h(X)$ be the generating function for the number of primitive contours of volume $n$ containing $C_{d-1}$. Fix a primitive contour containing $n$ $(d-1)$-cubes. Then there are a total of (at most) $(d-1)n$ common $(d-2)$-cubes, since each $(d-2)$-cube occurs as the face of (at least) two $(d-1)$-cubes, and each $(d-1)$-cube has $2(d-1)$ faces. An arbitrary contour can be obtained by attaching a contour to some of the $(d-2)$-cube boundaries of the component cubes of some primitive contour. The way in which this attachment is done is essentially unique, since there are only two possible other $(d-1)$-cubes that meet this $(d-2)$-cube, and both must be in the attached contour. By a suitable ordering of all $(d-1)$-cubes in $\mathbb{R}^d$, we can fix one of these as the root of the added contour. Thus, the number of contours with volume $n$ is bounded above by the coefficient of $X^n$ in
$$f(X) = h\big(X(1 + f(X))^{d-1}\big).$$
From the previous section we know that
$$h\big((8e^2d^2)^{-1/d}\big) \le \frac{1}{8d^2}.$$
Since
$$(8e^2d^2)^{-1/d}\Big/\Big(1 + \frac{1}{8d^2}\Big)^{d-1} = \Big(8e^2d^2\Big(1 + \frac{1}{8d^2}\Big)^{d(d-1)}\Big)^{-1/d} \ge (8e^{17/8}d^2)^{-1/d},$$
if we set $X_2 = (8e^{17/8}d^2)^{-1/d}$ then $f(X_2)$ converges and $f(X_2) \le \frac{1}{8d^2}$. Thus the number of contours of surface area (at most) $n$ containing $C_{d-1}$ is bounded by
$$f(X_2)X_2^{-n} \le \frac{1}{8d^2}(8e^{17/8}d^2)^{n/d}.$$
As before, any contour surrounding $C_d$ must contain one of $n/d$ translates of $C_{d-1}$, so
$$S'_d(n) \le \frac{n}{8d^3}(8e^{17/8}d^2)^{n/d}.$$
Finally for $d \ge 2$, $\frac{n}{8d^3} \le e^{0.04n/d}$, so
$$\frac{n}{8d^3}(8e^{17/8}d^2)^{n/d} \le (8e^{2.165}d^2)^{n/d} < (9d)^{2n/d}.$$
Our lower bound on $S'_d(n)$ is even easier to prove.

Theorem 9. For all $d \ge 2$ and even $n \ge 2d^2$ we have
$$S'_d(n) \ge \binom{d}{2}^{(n-d^2)/2d} \ge (Cd)^{n/d}.$$

Proof. We can write $n = 2dk + 2(d-j)$ for some $k > j$, $0 \le j \le d-1$. Fix a sequence $(d_1, \dots, d_j)$ with $d_i \in \{1, \dots, d\}$ and a sequence $(p_{j+1}, \dots, p_k)$, where each $p_i$ is an unordered pair $(d_{i,1}, d_{i,2})$, $d_{i,s} \in \{1, \dots, d\}$. Construct a complex by starting with the root $C_d$ and adding cubes so that the $(i+1)$st cube is located one step in the positive $d_i$th direction from the $i$th cube when $i \le j$ and is one step in both the positive $d_{i,1}$ and $d_{i,2}$ directions when $i > j$. The boundary of this complex is a contour surrounding $C_d$ with surface area $2d(k+1) - 2j = n$. There are $d^{\,j}\binom{d}{2}^{k-j}$ such contours. Thus
$$S'_d(n) \ge d^{\,j}\binom{d}{2}^{k-j} \ge \binom{d}{2}^{k-j/2} \ge \binom{d}{2}^{(n-d^2)/2d}.$$
Combining the upper and lower bounds we find that $(C_1 d)^{n/d} \le S'_d(n) \le (C_2 d)^{2n/d}$ for sufficiently large even $n$. Note that these bounds are 'closer' than for $S_d(n)$ since the lower bound is much larger.
6. Bounds for $S''_d(n)$

It turns out that this quantity is much larger than $S_d(n)$ or $S'_d(n)$, i.e., there are many more connected rooted complexes with a given surface area than there are contours.

Theorem 10. For fixed $d \ge 2$ and all sufficiently large even $n$,
$$S''_d(n) = n^{\frac{n}{2d(d-1)}(1+o(1))}.$$

Proof. For a lower bound, consider the large cube $[0, N+2]^d$ consisting of $(N+2)^d$ $d$-cubes. Remove $k$ of the $N^d$ central cubes which are of the form $C((a_1, \dots, a_d))$ with $\sum_{i=1}^{d} a_i \equiv s \bmod 3$. For some choice of $s \in \{0, 1, 2\}$ there will be at least $N^d/3$ choices for these cubes, and hence at least $\binom{N^d/3}{k} \ge (N^d/3ek)^k$ possible resulting $d$-complexes. The restrictions on the cubes that are removed ensure that the resulting complex will always be connected, and will have surface area exactly $2d(N+2)^{d-1} + 2dk$. Assuming $N > 2d$, we can add cubes of the form $C((-1, i, 0, 0, \dots, 0))$, $i = 2, 4, \dots, 2j$, increasing the surface area by $(2d-2)j$. Assume $n$ is large and even. In particular, assume $n \ge 2d(N+2)^{d-1} + 2d^2$. One can choose $j$ and $k$ so that $2dk + (2d-2)j = n - 2d(N+2)^{d-1}$. To do this, first choose $j$ so that $2j \equiv -n \bmod 2d$, $0 \le j < d$. Then solve for $k = k(N)$. Finally, choose $N$ so that $n \approx 2d(\log N)N^{d-1}$. Then $k \approx (\log N - 1)N^{d-1}$ and $(N^d/3ek)^k = n^{\frac{n}{2d(d-1)}(1+o(1))}$. This shows that $S''_d(n)$ is at least as large as claimed.

It is somewhat harder to prove a good upper bound. Note that $\mathbb{R}^d \setminus B$ has precisely one infinite component, and the boundary of this component is a contour $\partial B'$ with $B \subseteq B'$. Thus $\partial B$ is obtained by fixing a contour $\partial B'$ and then adding some contours inside $B'$. Given $n_0 = |\partial B'| \le n$, there are at most $(9d)^{2n_0/d}$ choices for $B'$, and each such $B'$ has volume at most $v = (n_0/2d)^{d/(d-1)}$, with equality if $B'$ is a large cube. We shall add $k$ contours, each of which must surround some cube of $B'$. There are $\binom{v+k-1}{k}$ choices for these root cubes (we allow a root cube to be chosen more than once). The added contours will be of sizes $n_1, \dots, n_k$, where $n_1 + \dots + n_k = n - n_0$. Since each $n_i \ge 2d$, there are $\binom{n-n_0-2dk+k-1}{k-1}$ choices for the $n_i$, $i > 0$ (we need to partition the 'excess' $r = n - n_0 - 2dk$ as the sum of $k$ numbers). Now each contour can be chosen in at most $(9d)^{2n_i/d}$ ways. Thus
$$S''_d(n) \le \sum_{k, n_0 :\, 2dk + n_0 \le n} \binom{v+k-1}{k}\binom{n-n_0-2dk+k-1}{k-1}(9d)^{2(n_0+n_1+\dots+n_k)/d}.$$
Now $(9d)^{2(n_0+\dots+n_k)/d} = (9d)^{2n/d} = n^{o(n)}$, $\binom{m}{r} \le (em/r)^r$ (the resulting factor $e^{2k}$ is again $n^{o(n)}$), and there are at most $n^2 = n^{o(n)}$ choices for $(k, n_0)$. Hence
$$S''_d(n) \le \sum_{k, n_0 :\, 2dk + n_0 \le n} \binom{v+k}{k}\binom{n-n_0-2dk+k}{k} n^{o(n)} \le \max_{2dk + n_0 \le n}\Big(\frac{(v+k)(n-n_0-2dk+k)}{k^2}\Big)^{k} n^{o(n)}.$$
We bound $v = (n_0/2d)^{d/(d-1)} \le n^{1/(d-1)}n_0$, and $n_0 \le n - 2dk$. Thus $v + k \le n^{1/(d-1)}(n - 2dk) + k \le n^{1/(d-1)}(n - 2dk + k)$ and so
$$(v+k)(n-n_0-2dk+k) \le n^{1/(d-1)}(n-2dk+k)^2.$$
This shows that
$$S''_d(n) \le \max_{2dk \le n}\Big(\frac{n^{1/(d-1)}(n-2dk+k)^2}{k^2}\Big)^{k} n^{o(n)}.$$
Now we maximize over $k$. Taking logarithms, we need to maximize
$$\frac{k\log n}{d-1} + 2k\log(n-2dk+k) - 2k\log k.$$
Differentiating with respect to $k$ gives
$$\frac{\log n}{d-1} + 2\log\frac{n-2dk+k}{k} - \frac{2(2d-1)k}{n-2dk+k} - 2.$$
But $k \le n - 2dk + k$, so for sufficiently large $n$ this is always positive. Hence the maximum is attained for the maximum possible $k = n/2d$. Substituting this we get $S''_d(n) \le n^{k/(d-1)+o(n)} = n^{n/(2d(d-1))+o(n)}$.
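The phrase "for sufficiently large $n$" in the last step can be made concrete. At the endpoint $k = n/2d$ one has $n - 2dk + k = k$, so the derivative reduces to $\frac{\log n}{d-1} - 4d$, which is positive once $n > e^{4d(d-1)}$. The sketch below evaluates the derivative numerically; `dphi` is a hypothetical helper name, and the sample values of $n$ and $d$ are chosen only for illustration.

```python
import math


def dphi(n, d, k):
    """Derivative in k of  k log n/(d-1) + 2k log(n-2dk+k) - 2k log k."""
    m = n - (2 * d - 1) * k          # this is n - 2dk + k
    return math.log(n) / (d - 1) + 2 * math.log(m / k) - 2 * (2 * d - 1) * k / m - 2


n = 10 ** 6

# d = 2: n exceeds e^{4d(d-1)} = e^8, and the derivative is positive on the whole range.
k_max = n // 4
print(min(dphi(n, 2, k) for k in range(1, k_max + 1)) > 0)   # True

# d = 3: n is below e^{24}, and the derivative is negative near k = n/2d.
print(dphi(n, 3, n // 6) < 0)                                # True
```

So for $d = 3$ the monotonicity only sets in for much larger $n$, consistent with the asymptotic nature of Theorem 10.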
7. Polymer Expansion for the Ising Model

One application of our bounds is to the convergence of the low temperature expansion of the $d$-dimensional Ising model in terms of Peierls contours (see [3]). The general result of Kotecký and Preiss [2] (see also Dobrushin [1] and Scott and Sokal [4]) about the convergence of cluster expansion implies the following assertion (see also Lemma 2.1 of [3]).

Lemma 11. The polymer expansion constructed for the Ising model in terms of Peierls contours is convergent at inverse temperature $\beta$ if there exists a positive function $a(\gamma)$ such that, for any contour $\gamma$,
$$\sum_{\gamma'} e^{-\beta|\gamma'| + a(\gamma')} \le a(\gamma),$$
where the sum is taken over all contours $\gamma'$ that intersect $\gamma$.

Using our results on $S'_d(n)$ we can improve considerably the Lebowitz-Mazel bound on $\beta$ implying the convergence of the polymer expansion.

Theorem 12. The polymer expansion constructed for the Ising model in terms of Peierls contours converges at inverse temperature $\beta$ for all $\beta \ge \frac{2}{d}\log(11d)$.

Proof. Each $\gamma'$ that intersects $\gamma$ must have some common $(d-2)$-cube with $\gamma$. Fixing this $(d-2)$-cube $C$, $\gamma'$ is forced to contain at least one of the four $(d-1)$-cubes meeting $C$. Since there are (at most) $(d-1)|\gamma|$ $(d-2)$-cubes in $\gamma$, it is enough to show
$$4(d-1)\sum_{\gamma'} e^{(\alpha-\beta)|\gamma'|} \le \alpha,$$
where we have chosen $a(\gamma) = \alpha|\gamma|$ and the sum is over all contours containing a fixed $(d-1)$-cube. In other words, we need to show that
$$4(d-1)\sum_{n=1}^{\infty} c_n e^{(\alpha-\beta)n} \le \alpha,$$
where $c_n$ is the number of rooted contours with surface area $n$. If $e^{\alpha-\beta} \le X_2$, then from the proof of Theorem 8, $\sum_{n=1}^{\infty} c_n e^{(\alpha-\beta)n} \le \frac{1}{8d^2}$. Thus we can take $\alpha = \frac{1}{2d}$, provided $\beta$ is at least
$$\alpha - \log X_2 = \frac{1}{2d} + \frac{1}{d}\log(8e^{17/8}d^2) = \frac{1}{d}\log(8e^{21/8}d^2) \le \frac{2}{d}\log(11d).$$
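The arithmetic at the end of this proof can be verified numerically. The sketch below is an illustration with hypothetical function names: it computes the threshold $\alpha - \log X_2$ with $\alpha = 1/(2d)$ and $X_2 = (8e^{17/8}d^2)^{-1/d}$, compares it with the stated bound $\frac{2}{d}\log(11d)$, and checks that $\alpha = 1/(2d)$ satisfies $4(d-1)\cdot\frac{1}{8d^2} \le \alpha$.

```python
import math


def beta_threshold(d):
    """alpha - log X_2 with alpha = 1/(2d) and X_2 = (8 e^{17/8} d^2)^{-1/d}."""
    alpha = 1 / (2 * d)
    log_X2 = -(math.log(8 * d * d) + 17 / 8) / d
    return alpha - log_X2


def stated_bound(d):
    """The bound (2/d) log(11 d) from Theorem 12."""
    return 2 / d * math.log(11 * d)


for d in (2, 3, 10, 100):
    print(d, round(beta_threshold(d), 4), round(stated_bound(d), 4))
```

The comparison reduces to $8e^{21/8} \le 121$, so it holds uniformly in $d$, with a gap of order $1/d$.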
Acknowledgements. We are grateful to Roman Kotecký for drawing our attention to the problems discussed in this note.
References
1. Dobrushin, R.L.: Estimates of semi-invariants for the Ising model at low temperatures. In: Topics in Theoretical and Statistical Physics, edited by R.L. Dobrushin, R.A. Minlos, M.A. Shubin and A.M. Vershik, Providence, RI: Amer. Math. Soc., 1996, pp. 59–81
2. Kotecký, R., Preiss, D.: Cluster expansion for abstract polymer models. Commun. Math. Phys. 103, 491–498 (1986)
3. Lebowitz, J.L., Mazel, A.E.: Improved Peierls argument for high-dimensional Ising models. J. Stat. Phys. 90, 1051–1059 (1998)
4. Scott, A.D., Sokal, A.D.: The repulsive lattice gas, the independent-set polynomial, and the Lovász local lemma. J. Stat. Phys. 118, 1151–1261 (2005)

Communicated by J.L. Lebowitz
Commun. Math. Phys. 273, 317–355 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0232-4
Large N Expansion of q-Deformed Two-Dimensional Yang-Mills Theory and Hecke Algebras
Sebastian de Haro¹, Sanjaye Ramgoolam², Alessandro Torrielli³
1 Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, 14476 Golm, Germany.
E-mail: [email protected]; [email protected]
2 Department of Physics, Queen Mary, University of London, Mile End Road, London E1 4NS, UK.
E-mail: [email protected]
3 Institut für Physik, Humboldt Universität zu Berlin, Newtonstraße 15, D-12489 Berlin, Germany.
E-mail: [email protected] Received: 21 March 2006 / Accepted: 13 December 2006 Published online: 17 May 2007 – © Springer-Verlag 2007
Abstract: We derive the q-deformation of the chiral Gross-Taylor holomorphic string large N expansion of two dimensional SU (N ) Yang-Mills theory. Delta functions on symmetric group algebras are replaced by the corresponding objects (canonical trace functions) for Hecke algebras. The role of the Schur-Weyl duality between unitary groups and symmetric groups is now played by q-deformed Schur-Weyl duality of quantum groups. The appearance of Euler characters of configuration spaces of Riemann surfaces in the expansion persists. We discuss the geometrical meaning of these formulae.
Contents
1. Introduction and Summary of the Results
2. Hecke Algebras and the Chiral Expansion of q-Deformed 2dYM
2.1 Review of the Gross-Taylor expansion
2.2 Hecke algebras and Schur-Weyl duality
2.3 A Hecke formula for the q-dimension
2.4 Hecke q-generalization of sums over symmetric groups of 2d Yang Mills
3. Manifolds with Boundary
4. Chiral Large N Expansion for Wilson Loops
5. On the Role of Quantum Characters in q-Deformed 2d YM
5.1 Consistency of Wilson loops
5.2 Gauge invariance of Wilson loops
6. Discussion and Outlook
A. Central Elements
A.1 Centrality of q-deformed conjugation sum
A.2 Centrality of q-deformed commutator sum
A.3 The elements D and E of $H_n(q)$
B. Quantum Dimensions
C. Projectors
C.1 $H_3$
C.2 $H_4$
C.3 The construction for $H_n$
D. q-Schur-Weyl Duality and q-Characters
D.1 $U_q(su(2))$ conventions
D.2 Schur-Weyl duality in spin-one
D.3 Quantum characters in spin-one representation
References
1. Introduction and Summary of the Results

Two-dimensional Yang-Mills theory, on a Riemann surface $\Sigma_G$ of genus $G$ and of area $A$, can be solved exactly. The partition function is
$$Z_{YM}(\Sigma_G, A) = \sum_R (\dim(R))^{2-2G} e^{-g_{YM}^2 A\, C_2(R)}. \tag{1.1}$$
This result was first obtained using the lattice formulation, followed by a continuum limit [1]. The sum is over all irreducible representations of the gauge group; the cases $U(N)$ or $SU(N)$ will be of interest here. Gross and Taylor [2–4] studied the large $N$ expansion of two-dimensional Yang-Mills theory with gauge group $U(N)$ and $SU(N)$ and showed that it is equivalent to a string theory. They showed that the large $N$ expansion is given by a non-chiral expansion, which is a sum involving chiral and anti-chiral factors. The chiral expansion of (1.1)¹ is given by
$$Z^{+}_{YM}(\Sigma_G) = \sum_{n=0}^{\infty} \frac{1}{n!}\,\frac{1}{N^{(2G-2)n}} \sum_{s_i, t_i \in S_n} \delta\Big(\Omega_n^{2-2G} \prod_{i=1}^{G} s_i t_i s_i^{-1} t_i^{-1}\Big). \tag{1.2}$$
It is a sum consisting of delta functions over symmetric groups, which count homomorphisms from the fundamental group of punctured Riemann surfaces to the symmetric groups. These homomorphisms are known to count branched covers of $\Sigma_G$. It was shown in [5, 6] that the chiral sum actually computes an Euler character of moduli spaces of holomorphic maps with fixed target space. This was done by expanding the $\Omega$ factors, and recognising that the coefficients in the expansion are Euler characters of configuration spaces of (branch) points on $\Sigma_G$. Topological string theory constructions were then used to derive a path integral which localizes to an integral of the Euler class on the moduli space of holomorphic maps. For simplicity we are discussing only the chiral part of the partition function here, but there is an analogous expansion for the full partition function. A different string action involving harmonic maps was proposed in [7].

Two-dimensional Yang-Mills has recently found a surprising new application in connection with topological strings on a non-compact Calabi-Yau and black hole entropy [8]. The q-deformation of two-dimensional Yang-Mills has also found an application in this context [9, 10]. The partition function of q-deformed Yang-Mills has been obtained by replacing the scalar dual field of the Yang-Mills field strength by a compact scalar.

¹ In this paper we work at zero area. The computations can be generalized to the case of finite area along the lines of [3, 4].
Such a compact scalar is natural from the point of view of the worldvolume of D4-branes wrapping a 4-cycle of the non-compact Calabi-Yau. New connections with Turaev invariants have also been suggested [11]. The q-deformation of two-dimensional Yang-Mills theory was studied earlier [12, 13] (see also [14]). The q-deformation of the zero area partition function of two-dimensional Yang-Mills is
$$Z_{qYM}(\Sigma_G) = \sum_R (\dim_q R)^{2-2G}. \tag{1.3}$$
In the context of [10] this is the limit where the degree $p$ of one of the line bundles is zero. In the q-deformed Yang-Mills, the universal enveloping algebra of $U(N)$ is replaced by $U_q(u(N))$. The exact partition function for a closed Riemann surface, which is expressed in terms of dimensions of irreducible representations of $U(N)$, is now expressed in terms of q-dimensions of $U_q(u(N))$ representations. The same remarks apply to $U_q(su(N))$. The underlying algebraic relation which leads to the relation between the sum over $U(N)$ representations in (1.1) and the delta functions over symmetric groups in (1.2) is Schur-Weyl duality, which we describe further in Sect. 2. The q-deformation of the Schur-Weyl duality between $U(N)$ and $S_n$ is known [15]. In this q-deformation, the role of the group algebra of $S_n$ (denoted by $\mathbb{C}S_n$) is played by the Hecke algebra $H_n(q)$. In this paper, we show that the large $N$, chiral Gross-Taylor expansion in terms of symmetric group data can be q-deformed to give an expansion in terms of Hecke algebra data. In this case we find the following result:
$$Z_{qYM}(\Sigma_G) = \sum_{n=0}^{\infty} [N]^{(2-2G)n}\,\frac{1}{g} \sum_{s_i, t_i \in S_n} \delta\Big(D\,\Omega_n^{2-2G} \prod_{i=1}^{G} q^{-l(s_i)-l(t_i)}\, h(s_i)h(t_i)h(s_i^{-1})h(t_i^{-1})\Big). \tag{1.4}$$
Here, $h(s) \in H_n$ is the Hecke algebra element associated to $s \in S_n$. That such an expansion is possible at all in the quantum case is highly non-trivial and very much suggestive of a geometric interpretation in terms of deformations of maps, on which we comment in Sect. 6. The possibility of the expansion (1.4) depends crucially on the existence of suitable central elements of the Hecke algebra (like $D$ and $\Omega_n$, to be defined later). These central elements play an important role in that they also determine the data on manifolds with closed boundary:
$$Z(\Sigma_G; C_1, \dots, C_B) = [N]^{(2-2G-B)n}\,\frac{1}{g} \sum_{s_i, t_i} \delta\Big(D^{1-B}\,\Omega_n^{2-2G-B} \prod_{i=1}^{G} q^{-l(s_i)-l(t_i)}\, h(s_i)h(t_i)h(s_i^{-1})h(t_i^{-1}) \prod_{j=1}^{B} C_j\Big). \tag{1.5}$$
In this formula, the central elements of the Hecke algebra take over the role of the holonomies of the gauge field around the $B$ boundaries of $\Sigma_G$. We also work out the case of non-intersecting Wilson loops. We develop an analog of the Verlinde formula for the tensor product multiplicity coefficients of $SU(N)$ in terms of characters of the Hecke algebra. To our knowledge, this formula has not appeared in
S. de Haro, S. Ramgoolam, A. Torrielli
the literature. Expectation values of Wilson loops can now again be written as Hecke delta functions, which are natural deformations of the symmetric group delta functions. In four appendices we give some of the facts and proofs about Hecke algebras that we use in the main text. To our knowledge, some of the formulas proven in these appendices have not previously appeared in the mathematical literature.

2. Hecke Algebras and the Chiral Expansion of q-Deformed 2dYM

2.1. Review of the Gross-Taylor expansion. Before we treat the q-deformed case, we review the main tools used in the derivation of the partition function of 2d Yang-Mills as a topological theory counting branched covers of the Riemann surface. For full details we refer to [5]. For simplicity, we discuss the case of zero area and no Wilson loops in this section. We start by writing out the partition function as a sum over Young tableaux:
\[ Z_{2dYM}(\Sigma_G; A) = \sum_{R} (\dim R)^{2-2G} = \sum_{n=0}^{\infty} \sum_{Y \in Y_n^N} (\dim R(Y))^{2-2G}, \tag{2.1} \]
where we sum over the set $Y_n^N$ of SU(N) Young diagrams with n boxes and number of rows less than N. Of course, we also sum over diagrams with an arbitrary number of boxes. The chiral expansion is derived by dropping the constraint on the number of rows. Next we use Schur-Weyl duality to derive the following formula:
\[ \dim(R) = \frac{N^n}{n!}\, \chi_R(\Omega_n). \tag{2.2} \]
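We are using a notation where R = R(Y) denotes both the SU(N) and the $S_n$ representation corresponding to a Young tableau Y with n boxes. $\chi_R$ is a character of the symmetric group, and $\Omega_n$ is a particular central element in $\mathbb{C}S_n$ given in [3, 4]. The chiral Gross-Taylor expansion is obtained as
\[ \begin{aligned} Z_{2dYM}(\Sigma_G; A) &= \sum_{n=0}^{\infty} \sum_{R} N^{(2-2G)n} \Big( \frac{d_R}{n!} \Big)^{2-2G} \frac{1}{d_R}\, \chi_R\big(\Omega_n^{2-2G}\big) \\ &= \sum_{n=0}^{\infty} N^{(2-2G)n} \sum_{s_i, t_i \in S_n} \frac{1}{n!}\, \delta\Big( \Omega_n^{2-2G} \prod_{i=1}^{G} s_i t_i s_i^{-1} t_i^{-1} \Big). \end{aligned} \tag{2.3} \]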
The fact that $\Omega_n$ is a central element in the group algebra $\mathbb{C}S_n$ is important. This is explained in more detail and generalized to the q-deformed case in Sect. 2.3. Another important identity which enters (2.3) is
\[ \sum_{s,t \in S_n} \frac{1}{d_R}\, \chi_R(s t s^{-1} t^{-1}) = \Big( \frac{n!}{d_R} \Big)^{2}, \tag{2.4} \]
where it is easy to see that $\sum_{s,t} s t s^{-1} t^{-1}$ is a central element of $\mathbb{C}S_n$. We find (2.50), which gives the q-deformation of this equation, and we prove related centrality properties for $H_n(q)$ in Appendix A.
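As a sanity check of the identity (2.4), the sum over commutators can be evaluated by brute force for $S_3$. The sketch below is only an illustration: the character table of $S_3$ is typed in by hand (keyed by cycle type), and the helper names are ours, not part of the paper.

```python
from fractions import Fraction
from itertools import permutations


def compose(a, b):
    """Return the permutation a∘b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(a)))


def inverse(a):
    """Return the inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def cycle_type(a):
    """Return the cycle lengths of a, sorted in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(a)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = a[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


# Hand-typed character table of S_3, indexed by cycle type (an input assumption).
S3_CHARACTERS = {
    "trivial": {(1, 1, 1): 1, (2, 1): 1, (3,): 1},
    "sign": {(1, 1, 1): 1, (2, 1): -1, (3,): 1},
    "standard": {(1, 1, 1): 2, (2, 1): 0, (3,): -1},
}


def commutator_sum(chi, n=3):
    """Left-hand side of (2.4): sum over s, t of chi_R(s t s^-1 t^-1) / d_R."""
    group = list(permutations(range(n)))
    d_R = chi[(1,) * n]
    total = 0
    for s in group:
        for t in group:
            comm = compose(compose(s, t), compose(inverse(s), inverse(t)))
            total += chi[cycle_type(comm)]
    return Fraction(total, d_R)


for name, chi in S3_CHARACTERS.items():
    d_R = chi[(1, 1, 1)]
    # Right-hand side of (2.4) is (n!/d_R)^2 with n! = 6.
    assert commutator_sum(chi) == Fraction(6, d_R) ** 2, name
```

For the two-dimensional representation both sides equal 9, and for the two one-dimensional representations both sides equal 36.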
Large N Expansion of q-Deformed 2-D Yang-Mills Theory and Hecke Algebras
2.2. Hecke algebras and Schur-Weyl duality. There is a natural generalization of the previous formulas using Hecke algebras. In this subsection we review basic facts about Hecke algebras and derive some formulas that we will use in what follows. The symmetric group $S_n$ can be defined in terms of generators $s_i$ (i = 1, ..., n − 1), which obey the relations
\[ \begin{aligned} s_i^2 &= 1 && \text{for } i = 1, \dots, n-1, \\ s_i s_{i+1} s_i &= s_{i+1} s_i s_{i+1} && \text{for } i = 1, \dots, n-2, \\ s_i s_j &= s_j s_i && \text{for } |i-j| \ge 2. \end{aligned} \tag{2.5} \]
The minimal length of a word in the $s_i$ which is equal to a permutation σ is called the length of the permutation and is denoted as l(σ). The Hecke algebra $H_n(q)$ is defined in terms of generators $g_i$ which obey [16]
\[ \begin{aligned} g_i^2 &= (q-1)\, g_i + q && \text{for } i = 1, \dots, n-1, \\ g_i g_{i+1} g_i &= g_{i+1} g_i g_{i+1} && \text{for } i = 1, \dots, n-2, \\ g_i g_j &= g_j g_i && \text{for } |i-j| \ge 2. \end{aligned} \tag{2.6} \]
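The length l(σ) enters most of the formulas below through the weights $q^{-l(\sigma)}$. A minimal sketch, using the standard fact that the length of a permutation equals its number of inversions, compares that count with a breadth-first search over words in the adjacent transpositions; the function names are illustrative only.

```python
from collections import Counter, deque
from itertools import permutations


def inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])


def lengths_by_bfs(n):
    """Shortest word length in s_1, ..., s_{n-1} for every element of S_n."""
    start = tuple(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        for i in range(n - 1):
            # Swapping adjacent entries multiplies p by the generator s_{i+1}.
            nxt = list(p)
            nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
            nxt = tuple(nxt)
            if nxt not in dist:
                dist[nxt] = dist[p] + 1
                queue.append(nxt)
    return dist


def length_distribution(n):
    """Count how many permutations of S_n have each length."""
    return Counter(inversions(p) for p in permutations(range(n)))


dist = lengths_by_bfs(4)
assert all(dist[p] == inversions(p) for p in dist)
print(sorted(length_distribution(3).items()))
```

For $S_3$ the distribution gives the coefficients of $1 + 2q + 2q^2 + q^3 = (1+q)(1+q+q^2)$, which is the generating function $\sum_\sigma q^{l(\sigma)}$.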
The Hecke algebra has, as a vector space, a basis h(σ) labelled by the elements σ of $S_n$. This is often called the "standard basis" in the literature. These elements h(σ) are obtained by expressing σ as a minimal length word in the $s_i$ and then replacing the $s_i$ by $g_i$. These Hecke algebras arise as the algebra of operators on $V^{\otimes n}$, the n-fold tensor product of the fundamental representation of U(N) or SU(N), which commute with the action of $U_q(u(N))$ or $U_q(su(N))$, the q-deformation of the universal enveloping algebra of u(N) or su(N), respectively. The action of the q-deformed enveloping algebras on V ⊗ V is given by the co-product $\Delta$. This obeys the following relations with respect to the R-matrix:
\[ R\, \Delta = \Delta'\, R, \qquad (PR)\, \Delta = \Delta\, (PR). \tag{2.7} \]
For $h \in U_q$, if we write $\Delta(h) = h_1 \otimes h_2$, then $\Delta'(h) = h_2 \otimes h_1$. P is the permutation operator. PR is also commonly denoted by $\check R$. For the R-matrix we will use the conventions of [17]. To make that explicit, we write $R_{FRT}$. The Hecke algebra is related to the algebra of the $\check R_{FRT}$ as:
\[ g = \sqrt{q}\, \check R_{FRT}, \qquad q_{FRT} = \sqrt{q}. \tag{2.8} \]
$g_i$ corresponds to $\check R_{FRT}$ acting in the tensor product $V_i \otimes V_{i+1}$ and is sometimes called a braid operator. Since the centralizer of $U_q$ is the Hecke algebra, we can construct the projectors for irreducible representations of $U_q$ in terms of words in the $g_i$. $g_1$ acts on the product space $V_1 \otimes V_2$, therefore there are two possible projectors that we can construct [17]:
\[ \begin{aligned} P_{(2)} &= \frac{q^{-1}}{q + q^{-1}}\, (1 + q \check R) = \frac{1}{1+q}\, (1 + g), \\ P_{(1,1)} &= \frac{q}{q + q^{-1}}\, (1 - q^{-1} \check R) = \frac{q}{1+q}\, (1 - q^{-1} g), \end{aligned} \tag{2.9} \]
where the subscripts (2) and (1,1) stand for the one-row and one-column Young diagrams with two boxes,
which project onto the totally symmetric and antisymmetric tensor products of the fundamental representation, respectively. Using (2.6), one easily checks that they satisfy
\[ P_R^2 = P_R. \tag{2.10} \]
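The check of (2.10) for the two-box projectors can be made explicit with a few lines of exact arithmetic. The sketch below assumes only the quadratic relation in (2.6) and the expressions of the projectors in terms of g in (2.9); an element a + b g of $H_2(q)$ is stored as the pair (a, b), which is our own representation choice.

```python
from fractions import Fraction


def h2_mul(x, y, q):
    """Multiply x = a + b*g and y = c + d*g in H_2(q), using g^2 = (q-1) g + q."""
    a, b = x
    c, d = y
    return (a * c + b * d * q, a * d + b * c + b * d * (q - 1))


def projectors(q):
    """Symmetric and antisymmetric projectors written in terms of g, as in (2.9)."""
    p_sym = (1 / (1 + q), 1 / (1 + q))      # (1 + g) / (1 + q)
    p_asym = (q / (1 + q), -1 / (1 + q))    # q (1 - g/q) / (1 + q)
    return p_sym, p_asym


q = Fraction(3, 2)  # any value with 1 + q != 0 works
p_sym, p_asym = projectors(q)

assert h2_mul(p_sym, p_sym, q) == p_sym          # idempotent
assert h2_mul(p_asym, p_asym, q) == p_asym       # idempotent
assert h2_mul(p_sym, p_asym, q) == (0, 0)        # mutually orthogonal
assert (p_sym[0] + p_asym[0], p_sym[1] + p_asym[1]) == (1, 0)  # they sum to 1
```

Besides idempotency, the same computation shows that the two projectors are orthogonal and add up to the identity.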
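The symmetric projector is illustrated in Appendix D in terms of properties of Clebsch-Gordan coefficients of $U_q(su(2))$. Projectors are useful to compute characters in a particular representation in terms of lower-dimensional representations. For example, taking the trace of the above,
\[ \begin{aligned} \mathrm{Tr}_{(2)}\, U &= \frac{q^{-1}}{q + q^{-1}} \Big[ (\mathrm{tr}\, U)^2 + q\, \mathrm{tr} \otimes \mathrm{tr}\, \check R\, (U \otimes 1)(1 \otimes U) \Big], \\ \mathrm{Tr}_{(1,1)}\, U &= \frac{q}{q + q^{-1}} \Big[ (\mathrm{tr}\, U)^2 - q^{-1}\, \mathrm{tr} \otimes \mathrm{tr}\, \check R\, (U \otimes 1)(1 \otimes U) \Big], \end{aligned} \tag{2.11} \]
where the subscripts (2) and (1,1) label the symmetric and antisymmetric two-box Young diagrams, and the traces on the right-hand side are taken in the fundamental representation, $\mathrm{Tr}_{\square} = \mathrm{tr}_V = \mathrm{tr}$. From now on we will indicate such traces by $\mathrm{tr}_n = \mathrm{tr}_{V^{\otimes n}} = \mathrm{tr} \otimes \dots \otimes \mathrm{tr}$. The U's in (2.11), which are matrix elements of representations of $U_q$, generate the dual algebra to $U_q$ denoted by $\mathrm{Fun}_q(SU(N))$ or $\mathrm{Fun}_q(U(N))$ (see for example [18, 19, 13]). Using known facts about Hecke algebras and the q-deformation of the Schur-Weyl duality between U(N) and $S_n$, we will now derive the generalization for arbitrary irreducible representations:
\[ P_R = \frac{d_R(q)}{g} \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma^{-1}))\, h(\sigma), \tag{2.12} \]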
where l(σ) is the length of the permutation, i.e. the number of elements in the minimal presentation of the permutation as a product of simple transpositions. The character is taken in the Hecke algebra $H_n$. Without danger of confusion, we will denote $H_n$ and $\mathrm{Fun}_q(SU(N))$ characters with the same symbol. The characters for low values of n can be read off from the tables in [16, 20]. $d_R(q)$ is the q-deformation of the dimension of a representation of the symmetric group, and g reduces to n! in the classical limit:
\[ d_R(q) = \frac{(q-1)(q^2-1) \cdots (q^n-1)\, \prod_{i<j} (q^{l_i} - q^{l_j})}{q^{\frac{m(m-1)(m-2)}{6}}\, \prod_{i=1}^{m} (q-1)(q^2-1) \cdots (q^{l_i}-1)}, \qquad g = \frac{(1-q)(1-q^2) \cdots (1-q^n)}{(1-q)^n}, \tag{2.13} \]
where $l_i = \lambda_i + m - i$ and $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0$ are the row lengths of the Young diagram, and m is the number of non-zero λ's. In order to derive (2.12), recall the familiar relation in the q = 1 case:
\[ \chi_R(U) = \frac{1}{n!} \sum_{\sigma} \chi_R(\sigma)\, \mathrm{tr}_n(\sigma U). \tag{2.14} \]
Here R is both the U(N) representation corresponding to a Young diagram and the $S_n$ representation corresponding to the same diagram. The trace on the right-hand side is taken in $V^{\otimes n}$, that is, U acts as U ⊗ U ⊗ ... ⊗ U and σ acts by permuting the vectors of the tensor product.
The above is obtained from the fact that, if V is the fundamental representation of U(N) or the universal enveloping algebra U(u(N)), then $V^{\otimes n}$ can be decomposed in terms of the product group $U(N) \times S_n$ as
\[ V^{\otimes n} = \bigoplus_{R} V_R^{U(N)} \otimes V_R^{S_n}. \tag{2.15} \]
The sum is over Young diagrams of $S_n$; $V_R^{S_n}$ is the irrep of $S_n$ corresponding to the Young diagram R, while $V_R^{U(N)}$ is the irrep of U(N) corresponding to the same Young diagram. Similar relations hold when U(N) is replaced by SU(N). An immediate consequence of the above expansion is
\[ \mathrm{tr}_n(\sigma U) = \sum_{R} \chi_R(\sigma)\, \chi_R(U). \tag{2.16} \]
Then we can use orthogonality of characters of $S_n$,
\[ \sum_{\sigma} \chi_R(\sigma)\, \chi_S(\sigma^{-1}) = n!\, \delta_{RS}, \tag{2.17} \]
to obtain (2.14). From (2.15) it also follows that
\[ d_R\, \chi_R(U) = \mathrm{tr}_n(P_R\, U), \tag{2.18} \]
hence we can read off
\[ P_R = \frac{d_R}{n!} \sum_{\sigma} \chi_R(\sigma^{-1})\, \sigma. \tag{2.19} \]
The decomposition analogous to (2.15) holds for $U_q(u(N))$, when $\mathbb{C}S_n$ is replaced by the Hecke algebra $H_n(q)$ [15]:
\[ V^{\otimes n} = \bigoplus_{R} V_R^{U_q} \otimes V_R^{H_n}. \tag{2.20} \]
Here $V_R^{U_q}$ is the irrep of $U_q(u(N))$ corresponding to the Young diagram R and $V_R^{H_n}$ is the representation of $H_n$ corresponding to the same Young diagram. It follows from (2.20) that
\[ \mathrm{tr}_n(h(\sigma)\, U) = \sum_{R} \chi_R(h(\sigma))\, \chi_R(U). \tag{2.21} \]
U lives in the deformed algebra of functions on U(N) denoted as $\mathrm{Fun}_q(U(N))$. This can be defined as the dual to $U_q(U(N))$. For further discussion on the duality see for example [18, 19, 13]. In (2.21) U acts as
\[ (U \otimes 1 \otimes 1 \otimes \cdots)(1 \otimes U \otimes 1 \otimes \cdots)(1 \otimes 1 \otimes U \otimes 1 \otimes \cdots) \cdots (1 \otimes 1 \otimes \cdots \otimes 1 \otimes U). \tag{2.22} \]
This product of n U's is dual to the co-product which defines the action of $U_q$ on $V^{\otimes n}$. As will be explained in Sect. 5 (see also Appendix D), quantum traces contain the u-element associated to the Hopf algebra $U_q(su(N))$. We get the quantum trace if we take a trace of the action of u U on the left-hand side of (2.20) to get
\[ \mathrm{tr}_n(h(\sigma)\, \rho_n(u\, U)) = \sum_{R} \chi_R(h(\sigma))\, \chi_R(u\, U). \tag{2.23} \]
Here $\rho_n(u) = u^{\otimes n}$ and U acts as above. For the case of diagonal U, the formula (2.21) is used in [20]. Multiplying the left- and right-hand side of (2.21) with $q^{-l(\sigma)} \chi_S(h(\sigma^{-1}))$, and using the orthogonality relation [21] for Hecke characters
\[ \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma))\, \chi_S(h(\sigma^{-1})) = g\, \frac{d_R(1)}{d_R(q)}\, \delta_{RS}, \tag{2.24} \]
we get
\[ \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma^{-1}))\, \mathrm{tr}_n(h(\sigma)\, U) = g\, \frac{d_R(1)}{d_R(q)}\, \chi_R(U). \tag{2.25} \]
This means that the character can be expressed as
\[ \chi_R(U) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma^{-1}))\, \mathrm{tr}_n(h(\sigma)\, U). \tag{2.26} \]
This equation can be interpreted as giving us the projection on a fixed Young diagram from the sum in (2.15). Indeed, note that (2.20) implies, by projecting on a fixed Young diagram:
\[ d_R(1)\, \chi_R(U) = \mathrm{tr}_n(P_R\, U). \tag{2.27} \]
Comparing with (2.26) we see that the projector is
\[ P_R = \frac{1}{g}\, d_R(q) \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma^{-1}))\, h(\sigma), \tag{2.28} \]
as claimed above. In the appendix we check that it satisfies (2.10). If we use orthogonality starting from (2.23) rather than (2.21), then we get
\[ \chi_R^{(q)}(U) \equiv \chi_R(u\, U) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \sum_{\sigma} q^{-l(\sigma)}\, \chi_R(h(\sigma^{-1}))\, \mathrm{tr}_n(h(\sigma)(u\, U)). \tag{2.29} \]
Note that $u^{\otimes n}$ commutes with h(σ). We will specialize to U = 1 in order to get a new formula for the q-dimension in Sect. 2.3.

2.3. A Hecke formula for the q-dimension. Recall that in the case q = 1 there is a very useful formula for the dimension of SU(N) reps which follows from Schur-Weyl duality [5]. This formula can be obtained by specializing (2.14) to U = 1. To that end we need to compute the trace of a permutation acting on $V^{\otimes n}$. If σ = 1, we just get $N^n$. If σ = (12)(3)(4)..(n), we get $N^{n-1}$. In general we get one factor of N for each cycle in the permutation. If the permutation has cycles of length i occurring with multiplicity $k_i$, the power of N is $N^{\sum_i k_i}$. In the 2d Yang-Mills literature this is also denoted as $N^{K_\sigma}$. So the useful formula for the dimension in 2d Yang-Mills [3, 4] is
\[ \dim(R) = \frac{1}{n!} \sum_{\sigma} \chi_R(\sigma)\, N^{K_\sigma} = \frac{N^n}{n!} \sum_{\sigma} \chi_R(\sigma)\, N^{-n + \sum_i k_i(\sigma)} \tag{2.30} \]
\[ \phantom{\dim(R)} = \frac{N^n}{n!}\, \chi_R(\Omega_n). \tag{2.31} \]
The last line defines the element $\Omega_n$. It is convenient to write this as a sum over conjugacy classes. Let T be a conjugacy class, which is given by specification of the cycle decomposition of the permutations involved. We will write $C_T = \sum_{\sigma \in T} \sigma$. Note this is a central element of the group algebra $\mathbb{C}S_n$, i.e. it commutes with all the elements of $S_n$. So the above can be rewritten as
\[ \dim(R) = \frac{1}{n!} \sum_{T} \chi_R(C_T)\, N^{\sum_i k_i(T)}. \tag{2.32} \]
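The classical dimension formula (2.30) is easy to test for n = 3. The following sketch sums $\chi_R(\sigma) N^{K_\sigma}$ over $S_3$ with a hand-typed character table (an input assumption) and compares with the familiar dimensions N(N+1)(N+2)/6, N(N²−1)/3 and N(N−1)(N−2)/6 of the three-box representations.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cycle_type(p):
    """Cycle lengths of the permutation p, sorted in decreasing order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


# Hand-typed S_3 character table, indexed by cycle type (an input assumption).
S3_CHARACTERS = {
    "symmetric": {(1, 1, 1): 1, (2, 1): 1, (3,): 1},
    "mixed": {(1, 1, 1): 2, (2, 1): 0, (3,): -1},
    "antisymmetric": {(1, 1, 1): 1, (2, 1): -1, (3,): 1},
}


def dimension(chi, N, n=3):
    """dim(R) = (1/n!) * sum over sigma of chi_R(sigma) * N^(number of cycles)."""
    total = 0
    for p in permutations(range(n)):
        ct = cycle_type(p)
        total += chi[ct] * N ** len(ct)
    return Fraction(total, factorial(n))


for N in range(1, 7):
    assert dimension(S3_CHARACTERS["symmetric"], N) == Fraction(N * (N + 1) * (N + 2), 6)
    assert dimension(S3_CHARACTERS["mixed"], N) == Fraction(N * (N * N - 1), 3)
    assert dimension(S3_CHARACTERS["antisymmetric"], N) == Fraction(N * (N - 1) * (N - 2), 6)
```

Grouping the permutations by cycle type in the loop reproduces the class-sum form (2.32).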
We can now find the q-generalization of this formula by setting U = 1 in (2.29), to obtain
\[ \dim_q(R) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \sum_{\sigma \in S_n} q^{-l(\sigma)}\, \chi_R\big(h(\sigma^{-1})\big)\, \mathrm{tr}_n(h(\sigma)\, u). \tag{2.33} \]
We can manipulate the above sum, using cyclicity of the trace and the Hecke relations, to reduce it to a sum over conjugacy classes T in $S_n$, with the only terms appearing inside $\mathrm{tr}_n$ being the $\mathrm{tr}_n(u\, h(m_T))$. The $m_T$ are permutations in the conjugacy class T which have minimal length when expressed in terms of generators. They are the minimal words in [16]. For n = 3, the $m_T$ are 1, $g_1$, $g_1 g_2$ for the 3 conjugacy classes. We prove in Appendix B (B.7),
\[ \mathrm{tr}_n(u\, h(m_T)) = q^{\frac{N+1}{2}\, l(T)}\, [N]^{\sum_i k_i}, \tag{2.34} \]
where the q-number is
\[ [N] = \frac{q^{N/2} - q^{-N/2}}{q^{1/2} - q^{-1/2}}, \tag{2.35} \]
and l(T) is the length of the permutation $m_T$. We will explain below that the Hecke algebra elements $C_T$ appearing as the coefficients of $q^{-l(T)}\, \mathrm{tr}(u\, h(m_T))$ are central. Hence the formula for the q-dimension becomes
\[ \dim_q(R) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \sum_{T} \chi_R(C_T(q))\, [N]^{\sum_i k_i(T)}\, q^{\frac{N-1}{2}\, l(T)}. \tag{2.36} \]
Examples of this formula are described in Appendix B, along with checks against the standard formula in terms of a product of q-numbers over the cells of the Young diagram. We now explain the centrality property of $C_T$. Starting from the formula for the projector (2.12) we can express it in a reduced form using cyclicity and Hecke relations, where we only have the characters of the minimal words in each conjugacy class:
\[ P_R = \frac{1}{g}\, d_R(q) \sum_{T} \chi_R(h(m_T))\, C_T. \tag{2.37} \]
Here T runs over conjugacy classes, and m T are the minimal words. For the formulas up to n = 4, see Appendix C. We can get the projector to the form (2.37) because, by using cyclicity of χ R and the Hecke relations, the Hecke characters can be expressed in terms of these basic characters [16]. Now for every R, PR is a central element of the Hecke algebra since it is a projector for the irreducible representation R. There are as many conjugacy classes T as irreducible representations R. Hence C T must be central
elements. When we calculate the q-dimension we get (2.33). When we manipulate the expression to express it in terms of $q^{-l(T)}\, \mathrm{tr}_n(u\, h(m_T))$, we are using the same Hecke relations and cyclicity (of $\mathrm{tr}_n$ this time):
\[ \dim_q(R) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \sum_{T} q^{-l(T)}\, \chi_R(C_T)\, \mathrm{tr}_n(h(m_T)\, u). \tag{2.38} \]
This immediately leads to (2.36). Incidentally, (2.37) seems to give a relatively efficient way of calculating the central class elements compared to the ones we are aware of in the mathematical literature. Some interesting papers with explicit formulae for Hecke central elements, which we found useful, are [22, 23].
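For n = 2 the Hecke formula for the q-dimension can be checked numerically against the product formulas $[N][N+1]/[2]$ and $[N][N-1]/[2]$ for the symmetric and antisymmetric representations. The sketch below evaluates (2.33) with the trace (2.34), using the n = 2 data $g = 1 + q$, $d_R(q) \in \{1, q\}$ and the Hecke characters of $g_1$ equal to q and −1 respectively (all typed in by hand; the function names are ours).

```python
import math


def q_number(x, q):
    """The q-number [x] of (2.35)."""
    return (q ** (x / 2) - q ** (-x / 2)) / (q ** 0.5 - q ** -0.5)


def qdim_two_boxes(rep, N, q):
    """Evaluate (2.33) for n = 2, inserting the traces from (2.34)."""
    g = 1 + q
    d_q = {"sym": 1.0, "antisym": q}[rep]        # d_R(q); d_R(1) = 1 for both
    chi_g1 = {"sym": q, "antisym": -1.0}[rep]    # Hecke character of g_1
    bracket_N = q_number(N, q)
    # sigma = identity: l = 0, two cycles, trace = [N]^2
    identity_term = bracket_N ** 2
    # sigma = s_1: l = 1, one cycle, trace = q^{(N+1)/2} [N]
    transposition_term = q ** -1 * chi_g1 * q ** ((N + 1) / 2) * bracket_N
    return d_q / g * (identity_term + transposition_term)


q = 1.7
for N in (2, 3, 5):
    sym_expected = q_number(N, q) * q_number(N + 1, q) / q_number(2, q)
    asym_expected = q_number(N, q) * q_number(N - 1, q) / q_number(2, q)
    assert math.isclose(qdim_two_boxes("sym", N, q), sym_expected)
    assert math.isclose(qdim_two_boxes("antisym", N, q), asym_expected)
```

The value q = 1.7 is arbitrary; any positive q different from 1 can be used, since $[x]$ is singular only at q = 1 in this floating-point form.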
2.4. Hecke q-generalization of sums over symmetric groups of 2d Yang Mills. The string theory interpretation of 2d Yang Mills at q = 1 is centred on formulae derived from Schur-Weyl duality. The character relations following from Schur-Weyl give rise to a formula for dimensions of SU (N ) reps in terms of Sn reps. Then some group theory manipulations lead to an expression of the chiral partition function in terms of delta functions over the symmetric group. The delta function is defined over the symmetric group or, more generally, over the group algebra of the symmetric group: δ(σ ) = 1 if σ = 1, δ(σ ) = 0 otherwise .
(2.39)
A useful property of this delta function is that it can be expressed in terms of characters,
\[ n!\, \delta(\sigma) = \sum_{R} d_R\, \chi_R(\sigma). \tag{2.40} \]
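The character expansion (2.40) of the delta function can be verified directly for $S_3$. This is a minimal sketch with a hand-typed character table keyed by cycle type (an input assumption).

```python
from fractions import Fraction

# Hand-typed S_3 character table, indexed by cycle type (an input assumption).
S3_CHARACTERS = {
    "trivial": {(1, 1, 1): 1, (2, 1): 1, (3,): 1},
    "sign": {(1, 1, 1): 1, (2, 1): -1, (3,): 1},
    "standard": {(1, 1, 1): 2, (2, 1): 0, (3,): -1},
}


def delta_from_characters(cycle_type, n_factorial=6):
    """delta(sigma) = (1/n!) * sum over R of d_R * chi_R(sigma), as in (2.40)."""
    total = sum(chi[(1, 1, 1)] * chi[cycle_type] for chi in S3_CHARACTERS.values())
    return Fraction(total, n_factorial)


assert delta_from_characters((1, 1, 1)) == 1   # identity
assert delta_from_characters((2, 1)) == 0      # transpositions
assert delta_from_characters((3,)) == 0        # 3-cycles
```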
The expressions arising in the 2d Yang-Mills string take the form δ (σ1 σ2 · · · σk ) ,
(2.41)
and the weights depend on the genus G and on N in precisely such a way that the chiral partition function can be expressed in terms of a sum of Euler characters of moduli spaces of holomorphic maps (see Sect. 7 of [5]). Now we will describe a q-generalization of this story, where the Hecke algebra will replace the group algebra of the symmetric group. A q-analog of the delta function on the symmetric group is known in the theory of Hecke algebras [21]. It is defined as: δ(h(σ )) = 1 if σ = 1, δ(h(σ )) = 0 otherwise .
(2.42)
Our δ(h(σ)) is $\frac{1}{g}\, \mathrm{tr}(h(\sigma))$ in the notation of [21] for the canonical trace function tr(h(σ)). This q-deformed delta function reduces exactly to the delta function on the symmetric group defined above when q → 1. It can be expressed as
\[ g\, \delta(h(\sigma)) = \sum_{R} d_R(q)\, \chi_R(h(\sigma)), \tag{2.43} \]
where R runs over partitions of n or Young diagrams with n boxes, and g is given in (2.13). An important fact we will use in what follows is that for C a central element of the Hecke algebra, and for arbitrary σ ∈ Sn , χ R (C) χ R (h(σ )) = d R (1) χ R (C h(σ )) ,
(2.44)
which follows simply from Schur's lemma applied to the Hecke algebra. We now have all the elements we need in order to rewrite the quantum dimensions in terms of central elements of the Hecke algebra. Using (2.36), we can write
\[ \dim_q(R) = \frac{[N]^n}{g}\, \frac{d_R(q)}{d_R(1)}\, \chi_R(\Omega_n). \tag{2.45} \]
In the quantum case the Ω's are expressed as
\[ \Omega_n = \sum_{T} [N]^{K_T - n}\, q^{\frac{N-1}{2}\, l(T)}\, C_T = 1 + {\sum_{T}}'\, [N]^{K_T - n}\, q^{\frac{N-1}{2}\, l(T)}\, C_T \equiv 1 + \Omega_n', \tag{2.46} \]
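where the unprimed sum runs over the central elements of $H_n$. The restricted sum (denoted by the prime) runs over all central elements associated with conjugacy classes of $S_n$ which are not the identity. The last line is a definition of $\Omega_n'$. Making repeated use of (2.44), we find that for a central element we have:
\[ \Big( \frac{\chi_R(C)}{d_R(1)} \Big)^{m} = \frac{\chi_R(C^m)}{d_R(1)}. \tag{2.47} \]
It now follows from (2.45) that
\[ \begin{aligned} (\dim_q(R))^m &= \Big( \frac{[N]^n\, d_R(q)}{g} \Big)^{m}\, \frac{\chi_R(\Omega_n^m)}{d_R(1)} \\ &= \Big( \frac{[N]^n\, d_R(q)}{g} \Big)^{m} \sum_{\ell=0}^{\infty} d(m, \ell)\, {\sum_{T_1, \dots, T_\ell}}' \frac{1}{d_R(1)}\, \chi_R\Big( \prod_{i=1}^{\ell} C_{T_i} \Big)\, [N]^{\sum_i (K_{T_i} - n)}\, q^{\frac{N-1}{2} \sum_i l(T_i)}, \end{aligned} \tag{2.48} \]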
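where $d(m, \ell) = \frac{\Gamma(m+1)}{\Gamma(\ell+1)\, \Gamma(m-\ell+1)}$, and we wrote out the definition of $\Omega'$. Let us develop the q-deformed chiral Gross-Taylor expansion
\[ Z = \sum_{n=0}^{\infty} \sum_{R \in Y_n} \big( \dim_q(R) \big)^{2-2G} = \sum_{n=0}^{\infty} \sum_{R \in Y_n} [N]^{(2-2G)n} \Big( \frac{d_R(q)}{g} \Big)^{2-2G} \frac{1}{d_R(1)}\, \chi_R\big( \Omega^{2-2G} \big). \tag{2.49} \]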
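The coefficients d(m, ℓ) are the expansion coefficients of $(1 + \Omega')^m$, and for the exponent m = 2 − 2G they are needed at zero and negative integer m as well. A minimal sketch, assuming that the Gamma-function ratio is to be read as the generalized binomial coefficient m(m−1)···(m−ℓ+1)/ℓ! (our reading, which agrees with the ratio wherever the latter is finite):

```python
from fractions import Fraction


def d(m, ell):
    """Generalized binomial coefficient m (m-1) ... (m-ell+1) / ell!."""
    result = Fraction(1)
    for k in range(ell):
        result *= Fraction(m - k, k + 1)
    return result


# Sphere (G = 0, m = 2): the expansion of (1 + x)^2 terminates.
assert [d(2, ell) for ell in range(4)] == [1, 2, 1, 0]

# Torus (G = 1, m = 0): only the ell = 0 term survives.
assert [d(0, ell) for ell in range(3)] == [1, 0, 0]

# Genus 2 (m = -2): coefficients of (1 + x)^(-2) are (-1)^ell (ell + 1).
assert [d(-2, ell) for ell in range(4)] == [1, -2, 3, -4]
```

Exact fractions are used so that the same function also works for non-integer m if that is ever needed.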
Now we can show (see Appendix A) that
\[ \Big( \frac{g}{d_R(q)} \Big)^{2} = \sum_{s,t \in S_n} q^{-l(s)-l(t)}\, \frac{1}{d_R(1)}\, \chi_R\big( h(s) h(t) h(s^{-1}) h(t^{-1}) \big). \tag{2.50} \]
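The identity (2.50) can be checked by hand for n = 2, where $H_2(q)$ is spanned by 1 and $g_1$. The sketch below is an illustration under that restriction: it builds the sum over s, t as an element a + b g (stored as the pair (a, b), our own convention) and evaluates the two one-dimensional Hecke characters, $g_1 \mapsto q$ and $g_1 \mapsto -1$.

```python
from fractions import Fraction


def h2_mul(x, y, q):
    """Multiply x = a + b*g and y = c + d*g in H_2(q), using g^2 = (q-1) g + q."""
    a, b = x
    c, d = y
    return (a * c + b * d * q, a * d + b * c + b * d * (q - 1))


def commutator_element(q):
    """Sum over s, t in S_2 of q^{-l(s)-l(t)} h(s) h(t) h(s^-1) h(t^-1)."""
    one = (Fraction(1), Fraction(0))
    gen = (Fraction(0), Fraction(1))
    basis = [(0, one), (1, gen)]  # (length, h(sigma)); sigma^-1 = sigma in S_2
    total = (Fraction(0), Fraction(0))
    for l_s, h_s in basis:
        for l_t, h_t in basis:
            st = h2_mul(h_s, h_t, q)
            term = h2_mul(st, st, q)  # h(s) h(t) h(s) h(t)
            weight = q ** (-(l_s + l_t))
            total = (total[0] + weight * term[0], total[1] + weight * term[1])
    return total


def chi_symmetric(x, q):
    """One-dimensional character with g -> q."""
    return x[0] + x[1] * q


def chi_antisymmetric(x, q):
    """One-dimensional character with g -> -1."""
    return x[0] - x[1]


q = Fraction(3, 2)
poincare = 1 + q  # the normalization g of (2.13) for n = 2
element = commutator_element(q)
assert chi_symmetric(element, q) == (poincare / 1) ** 2      # d_R(q) = 1
assert chi_antisymmetric(element, q) == (poincare / q) ** 2  # d_R(q) = q
```

Since $d_R(1) = 1$ for both representations, the factor $1/d_R(1)$ in (2.50) does not appear explicitly here.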
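We also show in the appendix that the element
\[ \sum_{s,t \in S_n} q^{-l(s)-l(t)}\, h(s) h(t) h(s^{-1}) h(t^{-1}) \tag{2.51} \]
is central in $H_n$. Hence we have
\[ \Big( \frac{g}{d_R(q)} \Big)^{2G} = \frac{1}{d_R(1)}\, \chi_R\Big( \sum_{s_1, t_1, \dots, s_G, t_G} q^{-\sum_i (l(s_i) + l(t_i))} \prod_{i=1}^{G} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big). \tag{2.52} \]
Now we employ this equation in (2.49) to get
\[ \begin{aligned} Z &= \sum_{n=0}^{\infty} \sum_{R \in Y_n} \sum_{s_i, t_i} q^{-\sum_i (l(s_i) + l(t_i))}\, [N]^{(2-2G)n} \Big( \frac{d_R(q)}{g\, d_R(1)} \Big)^{2} \chi_R\big( \Omega^{2-2G} \big)\, \chi_R\Big( \prod_{i=1}^{G} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big) \\ &= \sum_{n=0}^{\infty} \sum_{R \in Y_n} \sum_{s_i, t_i} [N]^{(2-2G)n} \Big( \frac{d_R(q)}{g} \Big)^{2} \frac{q^{-\sum_i (l(s_i) + l(t_i))}}{d_R(1)}\, \chi_R\Big( \Omega^{2-2G} \prod_{i=1}^{G} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big), \end{aligned} \tag{2.53} \]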
where we sum over $S_n$ permutations $s_1, t_1, \dots, s_G, t_G$. At this point the manipulations performed in the classical case do not generalize straightforwardly to the quantum case because of the different powers of $d_R(q)$ and $d_R(1)$. We need to introduce an element D of the Hecke algebra with the property
\[ \chi_R(D) = d_R(q). \tag{2.54} \]
The existence of this element is proven in Appendix A, where an explicit expression is given for it in terms of an infinite sum. Let us find it explicitly for low values of n. For n = 2, 3, we can solve the above equation explicitly. We find for n = 2,
\[ D = \frac{1 + q^2}{1 + q} + \frac{1 - q}{1 + q}\, g_1, \]
and for n = 3,
\[ D = \frac{1 + q^2 + 2q^3 + q^4 + q^6}{(1+q)(1+q+q^2)} + \frac{(1-q)(2 + 2q + q^2 + 2q^3 + 2q^4)}{(1+q)(1+q+q^2)}\, g_1 + \frac{(1+q)(1-q)^2}{1+q+q^2}\, g_1 g_2. \tag{2.55} \]
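The defining property (2.54) can be tested against the explicit expressions for n = 2 and n = 3. The sketch below computes $d_R(q)$ from the closed formula (2.13) and evaluates $\chi_R(D)$ using the Hecke characters on the minimal words 1, $g_1$, $g_1 g_2$; these character values are typed in by hand and are an input assumption of the example, as are the function names.

```python
from fractions import Fraction


def generic_degree(partition, q):
    """d_R(q) from (2.13) for a partition given as a tuple of row lengths."""
    lam = [part for part in partition if part > 0]
    m, n = len(lam), sum(lam)
    l = [lam[i] + m - (i + 1) for i in range(m)]

    def q_factorial(k):
        """(q - 1)(q^2 - 1) ... (q^k - 1)."""
        out = Fraction(1)
        for j in range(1, k + 1):
            out *= q ** j - 1
        return out

    numerator = q_factorial(n)
    for i in range(m):
        for j in range(i + 1, m):
            numerator *= q ** l[i] - q ** l[j]
    denominator = q ** (m * (m - 1) * (m - 2) // 6)
    for l_i in l:
        denominator *= q_factorial(l_i)
    return numerator / denominator


def chi_D_n2(R, q):
    """chi_R(D) for n = 2; characters listed on the words (1, g_1)."""
    characters = {(2,): (1, q), (1, 1): (1, -1)}
    coeffs = ((1 + q ** 2) / (1 + q), (1 - q) / (1 + q))
    return sum(c * x for c, x in zip(coeffs, characters[R]))


def chi_D_n3(R, q):
    """chi_R(D) for n = 3; characters listed on the words (1, g_1, g_1 g_2)."""
    characters = {
        (3,): (1, q, q ** 2),
        (2, 1): (2, q - 1, -q),
        (1, 1, 1): (1, -1, 1),
    }
    den = (1 + q) * (1 + q + q ** 2)
    coeffs = (
        (1 + q ** 2 + 2 * q ** 3 + q ** 4 + q ** 6) / den,
        (1 - q) * (2 + 2 * q + q ** 2 + 2 * q ** 3 + 2 * q ** 4) / den,
        (1 + q) * (1 - q) ** 2 / (1 + q + q ** 2),
    )
    return sum(c * x for c, x in zip(coeffs, characters[R]))


q = Fraction(2)
for R in [(2,), (1, 1)]:
    assert chi_D_n2(R, q) == generic_degree(R, q)
for R in [(3,), (2, 1), (1, 1, 1)]:
    assert chi_D_n3(R, q) == generic_degree(R, q)
```

At q = 2 the three-box values of $d_R(q)$ are 1, 6 and 8, i.e. 1, q(1+q) and q³.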
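We note that D → 1 in the classical limit. Using the form of D in the appendix we can write $\frac{d_R(q)^2}{d_R(1)} = d_R(q)\, \frac{\chi_R(D)}{d_R(1)}$, which allows us to rewrite (2.53),
\[ Z = \sum_{n=0}^{\infty} \sum_{s_i, t_i} \frac{1}{g}\, [N]^{(2-2G)n}\, \delta\Big( D\, \Omega^{2-2G} \prod_{i=1}^{G} q^{-l(s_i)-l(t_i)}\, h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big). \tag{2.56} \]
In the last step we used (2.43). This is the q-analog of the Gross-Taylor expansion. We can expand the Ω-factors as follows:
\[ \begin{aligned} Z = \sum_{n=0}^{\infty} \sum_{s_i, t_i} \sum_{\ell=0}^{\infty} {\sum_{T_1, \dots, T_\ell}}' &\frac{1}{g}\, [N]^{(2-2G)n + \sum_i (K(T_i) - n)}\, q^{\frac{N-1}{2} \sum_{i=1}^{\ell} l(T_i)}\, q^{-\sum_i (l(s_i) + l(t_i))} \\ &\times d(2-2G, \ell)\, \delta\Big( D\, C_{T_1} \cdots C_{T_\ell} \prod_{i=1}^{G} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big). \end{aligned} \tag{2.57} \]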
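As explained in [5], the factor of $d(2-2G, \ell)$ is the Euler character of the configuration space of points on $\Sigma_G$, denoted as $\chi(G, \ell)$. Hence we can write
\[ \begin{aligned} Z = \sum_{n=0}^{\infty} \sum_{s_i, t_i} \sum_{\ell=0}^{\infty} {\sum_{T_1, \dots, T_\ell}}' &\frac{1}{g}\, [N]^{(2-2G)n + \sum_{j=1}^{\ell} (K(T_j) - n)}\, q^{\frac{N-1}{2} \sum_{j=1}^{\ell} l(T_j)} \\ &\times \chi(G, \ell)\, \delta\Big( D\, C_{T_1} \cdots C_{T_\ell} \prod_{i=1}^{G} q^{-(l(s_i) + l(t_i))}\, h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \Big). \end{aligned} \tag{2.58} \]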
3. Manifolds with Boundary

We now describe the chiral large [N] expansion of q-deformed 2d Yang-Mills theory on manifolds with boundary, in terms of Hecke algebras. We recall the classical case first. For a Riemann surface of genus G with B boundaries and boundary holonomies $U_1, \dots, U_B$ in SU(N), the partition function is
\[ Z_{YM}(G, B; U_1, \dots, U_B) = \sum_{R} (\dim R)^{2-2G-B}\, \chi_R(U_1)\, \chi_R(U_2) \cdots \chi_R(U_B). \tag{3.1} \]
It is useful in that case to multiply by $\big( \frac{1}{n!} \big)^{B}\, \mathrm{tr}_n(T_1 U_1^{\dagger})\, \mathrm{tr}_n(T_2 U_2^{\dagger}) \cdots \mathrm{tr}_n(T_B U_B^{\dagger})$ and integrate over the holonomies, where $T_1, \dots, T_B$ are sums of permutations in fixed conjugacy classes in $S_n$. Then the chiral Gross-Taylor expansion becomes
\[ Z_{YM}(G, B; T_1, \dots, T_B) = \sum_{s_i, t_i} \frac{1}{n!}\, N^{n(2-2G-B)}\, \delta\Big( T_1 \cdots T_B\, \Omega_n^{2-2G-B} \prod_{i=1}^{G} s_i t_i s_i^{-1} t_i^{-1} \Big). \tag{3.2} \]
This is basically a Fourier transformation, and the derivation is explained in [24].
For q-deformed 2d Yang-Mills, the holonomies along the boundaries are specified by the quantum characters [13, 12] of $U_q(SU(N))$:
\[ Z_{qYM}(G, B; U_1, \dots, U_B) = \sum_{R} (\dim_q R)^{2-2G-B}\, \chi_R(U_1)\, \chi_R(U_2) \cdots \chi_R(U_B). \tag{3.3} \]
Now we can insert $\big( \frac{1}{g} \big)^{B}\, \mathrm{tr}_n(C_{T_1} U_1^{\dagger})\, \mathrm{tr}_n(C_{T_2} U_2^{\dagger}) \cdots \mathrm{tr}(C_{T_B} U_B^{\dagger})$. In this case, $C_{T_1}, \dots, C_{T_B}$ are central elements in $H_n(q)$ which approach the class sums $T_1, T_2, \dots, T_B$ in the limit q → 1. They have appeared in the formulae for the q-dimension earlier. We use the expansion
\[ \mathrm{tr}(C_T\, U^{\dagger}) = \sum_{S} \chi_S(C_T)\, \chi_S(U^{\dagger}), \tag{3.4} \]
where $\chi_S(C_T)$ is the Hecke algebra character in the representation S. Then we integrate the quantum group elements $U_1, \dots, U_B$, and use the orthogonality [13, 12]
\[ \int dU\, \chi_R(U)\, \chi_S(U^{\dagger}) = \delta_{RS}. \tag{3.5} \]
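The result is
\[ \begin{aligned} Z_{qYM}(G, B; C_{T_1}, \dots, C_{T_B}) &= \sum_{R \in Y_n} \big( \dim_q R \big)^{2-2G-B} \prod_{j=1}^{B} \frac{\chi_R(C_{T_j})}{g} \\ &= \sum_{R} [N]^{(2-2G-B)n} \Big( \frac{d_R(q)\, \chi_R(\Omega)}{g\, d_R(1)} \Big)^{2-2G-B} \prod_{j=1}^{B} \frac{\chi_R(C_{T_j})}{g} \\ &= \frac{1}{g}\, [N]^{(2-2G-B)n} \sum_{s_i, t_i} \delta\Big( \Big( \frac{E}{g} \Big)^{B-1} \Omega^{2-2G-B} \prod_{i=1}^{G} q^{-l(s_i)-l(t_i)}\, h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}) \prod_{j=1}^{B} C_{T_j} \Big). \end{aligned} \tag{3.6} \]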
In the second line we used (2.48), and in the last line we employed (2.52). The element E is defined in (A.14). As in manipulations of the partition function we repeatedly used (2.44) to combine products of characters. Finally, to obtain the delta function from the Hecke characters, we used (2.43). In the q = 1 limit (3.6) reduces to a delta function over the group algebra of $S_n$, counting maps with specified conjugacy classes of permutations at the boundaries. There is now some deformation of this geometry, involving central elements of the Hecke algebra $H_n(q)$ associated with the boundaries. It is very interesting that for B = 1 we do not have the $\frac{E}{g}$ factors. Recall also that $\frac{E}{g} = 1$ in the q = 1 limit. In the q-deformed theory there is a notion of a delta function over the quantum-group-valued holonomies [13]. It is the partition function on the disk, therefore the case G = 0,
B = 1 of the above. We compute directly:
\[ \begin{aligned} \delta(U, 1) &= \sum_{n;\, \sigma \in S_n} \frac{1}{g}\, [N]^n\, q^{-l(\sigma)}\, \delta(D\, h(\sigma^{-1}))\, \mathrm{tr}_n(h(\sigma)\, u\, U) \\ &= \sum_{n;\, \sigma \in S_n} \frac{1}{g}\, [N]^n\, Q_\sigma\, \mathrm{tr}_n(h(\sigma)\, u\, U), \end{aligned} \tag{3.7} \]
where we defined $D = \sum_{\sigma} Q_\sigma\, h(\sigma)$. Using (3.5), we can integrate this expression against any test function to obtain a form that depends purely on the Hecke algebra. In particular, the above gives another expression for the quantum dimensions. Thus, in the q-deformed theory the partition function on a disk of zero area continues to be associated to a flat connection, in the quantum group sense [13].
4. Chiral Large N Expansion for Wilson Loops

After having computed the partition function on closed Riemann surfaces and Riemann surfaces with boundaries, we should now discuss the chiral expansion of Wilson loops. For simplicity, we will consider non-intersecting Wilson loops in this section. The basic objects we need to take into account are the SU(N) tensor multiplicity coefficients [13, 12]. Indeed, consider a surface of genus $G = G_1 + G_2$ with a Wilson loop in representation S, where $G_1$ and $G_2$ are the genera of the inner and outer faces of the Wilson loop. The expectation value of this Wilson loop is
\[ W_S(\Sigma_G) = \sum_{R_1 R_2} \big( \dim_q R_1 \big)^{1-2G_1} \big( \dim_q R_2 \big)^{1-2G_2} \int dU\, \chi_{R_1}(U)\, \chi_S(U)\, \chi_{R_2}(U^{\dagger}), \tag{4.1} \]
where $R_1$ and $R_2$ are the representations of the inner and outer faces, respectively. Since we are discussing the case of q not a root of unity, the result of the above quantum integral is the usual SU(N) tensor multiplicity coefficients (Littlewood-Richardson coefficients). Thus we are set to compute
\[ W_S(\Sigma_G) = \sum_{R_1 R_2} (\dim_q R_1)^{1-2G_1}\, (\dim_q R_2)^{1-2G_2}\, N_{R_1 S}^{R_2}. \tag{4.2} \]
Our next task is to look for an expression for the Littlewood-Richardson coefficients that we can interpret as a deformation of the Riemann surface. Thus, we want to write them as delta functions on the Hecke algebra. We start from the definition:
\[ N_{R_1 R_2}^{R_3} = \int dU\, \chi_{R_1}(U)\, \chi_{R_2}(U)\, \chi_{R_3}(U^{\dagger}), \tag{4.3} \]
and observe that the above is a trace of the following operator acting in $R_1 \otimes R_2$:
\[ \int dU\, \chi_{R_3}(U^{\dagger})\, \rho_{R_1 \otimes R_2}(U). \tag{4.4} \]
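Now $R_1$ can be realized in $V^{\otimes n_1}$ with multiplicity $d_{R_1}(1)$ when we project on the given Young diagram, and likewise for $R_2$. It is also useful to note that the above operator is proportional to a projector for the representation $R_3$,
\[ \int dU\, \chi_{R_3}(U^{\dagger})\, \rho_{R_1 \otimes R_2}(U) = \frac{1}{\dim_q R_3}\, \rho_{R_1 \otimes R_2}(P_{R_3}). \tag{4.5} \]
Using the expression for the projectors for $R_1$ and $R_2$ in terms of the Hecke algebra, we obtain
\[ \begin{aligned} N_{R_1 R_2}^{R_3} = \frac{1}{g_1 g_2 g_3}\, \frac{d_{R_1}(q)\, d_{R_2}(q)}{d_{R_1}(1)\, d_{R_2}(1)} &\sum_{\sigma_1 \sigma_2} q^{-l(\sigma_1)-l(\sigma_2)}\, \chi_{R_1}(h(\sigma_1^{-1}))\, \chi_{R_2}(h(\sigma_2^{-1})) \\ &\times \frac{1}{\dim_q R_3}\, \mathrm{tr}_{V^{\otimes n_1} \otimes V^{\otimes n_2}}\Big( (h(\sigma_1) \cdot h(\sigma_2))\, P_{R_3} \Big). \end{aligned} \tag{4.6} \]
Here and in what follows we take $\sigma_i \in S_{n_i}$ for i = 1, 2, 3. Writing out the projector (2.28), we get
\[ \begin{aligned} N_{R_1 R_2}^{R_3} = \frac{1}{g_1 g_2 g_3}\, \frac{d_{R_1}(q)\, d_{R_2}(q)}{d_{R_1}(1)\, d_{R_2}(1)} &\sum_{\sigma_1 \sigma_2 \sigma_3} q^{-l(\sigma_1)-l(\sigma_2)-l(\sigma_3)}\, \chi_{R_1}(h(\sigma_1^{-1}))\, \chi_{R_2}(h(\sigma_2^{-1}))\, \chi_{R_3}(h(\sigma_3^{-1})) \\ &\times \frac{d_{R_3}(q)}{\dim_q R_3}\, \mathrm{tr}_{V^{\otimes n_1} \otimes V^{\otimes n_2}}\Big( (h(\sigma_1) \cdot h(\sigma_2))\, h(\sigma_3) \Big), \end{aligned} \tag{4.7} \]
and expanding the trace in a basis of Young tableaux with $n_1 + n_2$ boxes, we get
\[ \begin{aligned} N_{R_1 R_2}^{R_3} = \frac{1}{g_1 g_2 g_3}\, \frac{d_{R_1}(q)\, d_{R_2}(q)}{d_{R_1}(1)\, d_{R_2}(1)} &\sum_{\sigma_1 \sigma_2 \sigma_3} q^{-l(\sigma_1)-l(\sigma_2)-l(\sigma_3)}\, \chi_{R_1}(h(\sigma_1^{-1}))\, \chi_{R_2}(h(\sigma_2^{-1}))\, \chi_{R_3}(h(\sigma_3^{-1})) \\ &\times \frac{d_{R_3}(q)}{\dim_q R_3} \sum_{S \in Y_{n_1+n_2}} \chi_S\Big( (h(\sigma_1) \cdot h(\sigma_2))\, h(\sigma_3) \Big)\, \dim_q S. \end{aligned} \tag{4.8} \]
If we now use the projector property
\[ \chi_S(P_{R_3}\, h(\sigma)) = \delta_{R_3 S}\, \chi_{R_3}(h(\sigma)) \tag{4.9} \]
and the explicit form of the projector in (2.12) then we have the useful orthogonality relation
\[ \frac{d_{R_3}(q)}{g_3} \sum_{\sigma_3} q^{-l(\sigma_3)}\, \chi_{R_3}(h(\sigma_3^{-1}))\, \chi_S\Big( h(\sigma_3)\, (h(\sigma_1) \cdot h(\sigma_2)) \Big) = \chi_{R_3}(h(\sigma_1) \cdot h(\sigma_2))\, \delta_{R_3 S}. \tag{4.10} \]
This can be used to simplify the expression (4.8) further to
\[ N_{R_1, R_2}^{R_3} = \frac{1}{g_1 g_2}\, \frac{d_{R_1}(q)\, d_{R_2}(q)}{d_{R_1}(1)\, d_{R_2}(1)} \sum_{\sigma_1 \sigma_2} q^{-l(\sigma_1)-l(\sigma_2)}\, \chi_{R_1}(h(\sigma_1^{-1}))\, \chi_{R_2}(h(\sigma_2^{-1}))\, \chi_{R_3}(h(\sigma_1) \cdot h(\sigma_2)). \tag{4.11} \]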
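A small worked instance of (4.11) is the product of a two-box representation with the fundamental one, i.e. $n_1 = 2$, $n_2 = 1$. In that case $g_2 = 1$, $\sigma_2$ is the identity, and $h(\sigma_1) \cdot h(\sigma_2)$ is just $h(\sigma_1) \in \{1, g_1\}$ viewed inside $H_3(q)$. The sketch below uses hand-typed Hecke characters of 1 and $g_1$ (an input assumption, as are the labels and function name) and recovers the expected multiplicities 0 or 1.

```python
from fractions import Fraction


def fusion_with_fundamental(R1, R3, q):
    """N^{R3}_{R1, box} from (4.11) for a two-box R1 and a three-box R3."""
    g1 = 1 + q                                   # normalization g for n_1 = 2
    d_q = {"(2)": Fraction(1), "(1,1)": q}       # d_{R1}(q); d_{R1}(1) = 1
    chi_R1 = {"(2)": (1, q), "(1,1)": (1, -1)}   # characters on (1, g_1) in H_2
    chi_R3 = {                                   # characters on (1, g_1) in H_3
        "(3)": (1, q),
        "(2,1)": (2, q - 1),
        "(1,1,1)": (1, -1),
    }
    # sigma_1 runs over S_2; index 0 is the identity (l = 0), index 1 is s_1 (l = 1).
    total = sum(q ** (-l) * chi_R1[R1][l] * chi_R3[R3][l] for l in (0, 1))
    return d_q[R1] / g1 * total


q = Fraction(5, 3)
table = {
    (R1, R3): fusion_with_fundamental(R1, R3, q)
    for R1 in ("(2)", "(1,1)")
    for R3 in ("(3)", "(2,1)", "(1,1,1)")
}
assert table[("(2)", "(3)")] == 1 and table[("(2)", "(2,1)")] == 1
assert table[("(2)", "(1,1,1)")] == 0
assert table[("(1,1)", "(3)")] == 0
assert table[("(1,1)", "(2,1)")] == 1 and table[("(1,1)", "(1,1,1)")] == 1
```

The q-dependence cancels exactly, as it must for q not a root of unity, and the result is the usual rule for adding one box to a Young diagram.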
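This formula is reminiscent of the Verlinde formula for the fusion coefficients of orbifold conformal field theories [25], or alternatively of Chern-Simons theory with finite groups [26, 27]. It would be interesting to understand the connection. If we go from the character basis to the basis in terms of central elements of the Hecke algebra, and using the above, we get
\[ \begin{aligned} \frac{1}{g_1 g_2 g_3} &\sum_{R_1 R_2 R_3} N_{R_1 R_2}^{R_3}\, \chi_{R_1}(C_1)\, \chi_{R_2}(C_2)\, \chi_{R_3}(C_3) \\ &= \frac{1}{g_1 g_2} \sum_{\sigma_1 \sigma_2} \delta(h(\sigma_1^{-1}) C_1)\, \delta(h(\sigma_2^{-1}) C_2)\, q^{-l(\sigma_1)-l(\sigma_2)}\, \frac{1}{g_3} \sum_{R_3} d_{R_3}(1)\, \chi_{R_3}\big( (h(\sigma_1) \cdot h(\sigma_2))\, C_3 \big) \\ &= \frac{1}{g_1 g_2} \sum_{\sigma_1 \sigma_2} \delta(h(\sigma_1^{-1}) C_1)\, \delta(h(\sigma_2^{-1}) C_2)\, q^{-l(\sigma_1)-l(\sigma_2)}\, \frac{1}{g_3}\, \delta\big( E\, C_3\, (h(\sigma_1) \cdot h(\sigma_2)) \big) \\ &= \frac{1}{g_1 g_2 g_3} \sum_{\sigma_1 \sigma_2} C_1^{\sigma_1}\, C_2^{\sigma_2}\, \delta\big( E\, C_3\, (h(\sigma_1) \cdot h(\sigma_2)) \big). \end{aligned} \tag{4.12} \]
E is the element defined in (A.14) of Appendix A. We have denoted by $C^{\sigma}$ the coefficients which appear in the expansion of the central element C,
\[ C = \sum_{\sigma} C^{\sigma}\, h(\sigma), \tag{4.13} \]
and we used the following property of the trace [21]:
\[ \delta(h(\sigma)\, h(\sigma')) = q^{l(\sigma)} \ \text{ if } \sigma \sigma' = 1, \qquad \delta(h(\sigma)\, h(\sigma')) = 0 \ \text{ otherwise.} \tag{4.14} \]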
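Consider now the computation of a simple Wilson loop, in the representation S, separating a region with $G_1$ handles from another region with $G_2$ handles,
\[ \begin{aligned} W_S &= \sum_{n_1, n_2} \sum_{R_1 R_2} (\dim_q R_1)^{1-2G_1}\, (\dim_q R_2)^{1-2G_2}\, N_{R_1 S}^{R_2} \\ &= \sum_{R_1} [N]^{n_1 (1-2G_1)} \Big( \frac{d_{R_1}(q)}{g_1\, d_{R_1}(1)} \Big)^{1-2G_1} \big( \chi_{R_1}(\Omega) \big)^{1-2G_1} \\ &\quad \times \sum_{R_2} [N]^{n_2 (1-2G_2)} \Big( \frac{d_{R_2}(q)}{g_2\, d_{R_2}(1)} \Big)^{1-2G_2} \big( \chi_{R_2}(\Omega) \big)^{1-2G_2}\, N_{R_1 S}^{R_2}. \end{aligned} \tag{4.15} \]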
We now use (4.11) with the fusion coefficient, multiply by the character of some central element C in $H_{n_S}(q)$ and sum over S:
\[ W(C, G_1, G_2) = \sum_{S} \frac{\chi_S(C)}{g_S}\, W_S. \tag{4.16} \]
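Collecting all S dependences we have
\[ \sum_{S} \frac{1}{g_S}\, \frac{d_S(q)}{d_S(1)}\, \chi_S(C)\, \chi_S(h(\sigma_2^{-1})) = \delta(C\, h(\sigma_2^{-1})). \tag{4.17} \]
Hence we obtain
\[ \begin{aligned} W(C; G_1, G_2) = \sum_{n_1, n_2} \sum_{\sigma_1 \sigma_2} &\frac{1}{g_1 g_2}\, \delta_{n_1 + n_S,\, n_2}\, [N]^{n_1 (1-2G_1) + n_2 (1-2G_2)}\, q^{-l(\sigma_1)-l(\sigma_2)} \\ &\times \delta(C\, h(\sigma_2^{-1}))\, \delta\big( D\, \Pi_1^{G_1}\, \Omega^{1-2G_1}\, h(\sigma_1^{-1}) \big) \\ &\times \delta\big( \Pi_1^{G_2}\, \Omega^{1-2G_2}\, (h(\sigma_1) \cdot h(\sigma_2)) \big). \end{aligned} \tag{4.18} \]
The factors of [N] are as above. We have defined
\[ \begin{aligned} \Pi_1^{G_1} &= \sum_{s_1, t_1, \dots, s_{G_1}, t_{G_1}} q^{-\sum_i (l(s_i) + l(t_i))} \prod_{i=1}^{G_1} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}), \\ \Pi_1^{G_2} &= \sum_{s_1, t_1, \dots, s_{G_2}, t_{G_2}} q^{-\sum_i (l(s_i) + l(t_i))} \prod_{i=1}^{G_2} h(s_i) h(t_i) h(s_i^{-1}) h(t_i^{-1}). \end{aligned} \tag{4.19} \]
Expanding
\[ C = \sum_{\sigma} C^{\sigma}\, h(\sigma), \qquad P = D\, \Omega^{1-2G_1}\, \Pi_1^{G_1} = \sum_{\sigma} P^{\sigma}\, h(\sigma), \tag{4.20} \]
we finally get
\[ W(C; G_1, G_2) = \sum_{n_1 = 0}^{\infty} \sum_{\sigma \sigma'} \frac{1}{g_1 g_2}\, [N]^{\gamma}\, P^{\sigma}\, C^{\sigma'}\, \delta\big( \Omega^{1-2G_2}\, \Pi_1^{G_2}\, (h(\sigma) \cdot h(\sigma')) \big). \tag{4.21} \]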
We defined $\gamma = n_1 + n_2 - 2(n_1 G_1 + n_2 G_2) = (2 - 2G)\, n_1 + n_S (1 - 2G_2)$, where we used $n_2 = n_1 + n_S$.

5. On the Role of Quantum Characters in q-Deformed 2d YM

In this paper we have used quantum, $U_q(SU(N))$, characters rather than classical SU(N) characters. For the computations in [10] it seemed enough to consider classical SU(N) characters. So one can ask: does one need to compute with quantum characters, or do the classical ones suffice? In this section we argue that quantum characters are needed in the generic situation; in fact, they are extremely natural and they provide the simplest solution to the problem of crossings and gluing along open lines. Our arguments are consistent with [10], where the dimensions appearing in the partition function (1.3) were quantum dimensions but the characters associated with boundaries and Wilson loops were classical SU(N) characters. In particular, this paper did not consider crossings on the surface, and gluing constructions involved closed curves only. In the absence
of crossing points, both the classical and the quantum characters lead to a topologically invariant theory. It is a well-known fact from Chern-Simons theory that one can do without R-matrices or other quantum group structure as long as one considers simple Wilson loops, for example toric ones, whose expectation value follows from surgery. In the 2d Yang-Mills case, the basic gluing formula along circles is (3.5), which is valid both for classical and quantum characters, and ensures topological invariance of the gluing construction along circles. More precisely, the need for quantum characters in qYM can be seen: 1) in the presence of Wilson loops with non-trivial crossings; 2) when gluing along open lines. The original definition of qYM is well-known [13, 12] and it involves quantum characters. In the following subsections we collect several arguments that show the need for quantum characters.

5.1. Consistency of Wilson loops. One of the basic consistency conditions to be imposed on a Wilson loop is that, if the charge of the particle is zero, the expectation value of the Wilson loop should be that of the unit operator; in other words, it should give back the partition function of the theory. In our case, if $W_R(\Sigma_G; C)$ is the Wilson loop operator in representation R around the curve C on the Riemann surface of genus G, consistency requires
\[ \langle W_{R=\rho}(\Sigma_G; C) \rangle = \langle 1 \rangle = Z_{qYM}(\Sigma_G), \tag{5.1} \]
where ρ is the Weyl vector labeling the trivial representation. Thus, we should reproduce:
\[ \langle W_\rho(\Sigma_G; C) \rangle = \sum_{S} (\dim_q S)^{2-2g}\, q^{-\frac{1}{2} A\, C_2(S)}. \tag{5.2} \]
We will check whether quantum dimensions and classical characters are consistent with this for a Wilson loop with crossings. Consider the expectation value of the Wilson loop $W_R(\Sigma_G; C)$ in Fig. 1. In this case we have $A = A_1 + A_2 + A_3$, where $A_1$ is the area of the outer face, which has genus G. We get:
\[ \begin{aligned} \langle W_R(\Sigma_G; C) \rangle = \sum_{R_1 R_2 R_3} &(\dim_q(R_1))^{1-2g}\, \dim_q(R_2)\, \dim_q(R_3)\, q^{-\frac{1}{2} (A_1 C_2(R_1) + A_2 C_2(R_2) + A_3 C_2(R_3))} \\ &\times \int dU\, dV\, \chi_{R_1}(U^{-1} V^{-1})\, \chi_{R_2}(U)\, \chi_{R_3}(V)\, \chi_R(U V^{-1}), \end{aligned} \tag{5.3} \]
where, since we are dealing with classical characters, dU is the Haar measure. Let us compute this in the trivial case: R = ρ. We can compute the integrals using the character formula
\[ \int dU\, \chi_{R_2}(U)\, \chi_{R_3}(U^{-1} V) = \delta_{R_2 R_3}\, \frac{\chi_{R_2}(V)}{\dim(R_2)}. \tag{5.4} \]
We get
\[ \sum_{S} \frac{(\dim_q(S))^{3-2g}}{\dim(S)}\, q^{-\frac{1}{2} A\, C_2(S)}, \tag{5.5} \]
S. de Haro, S. Ramgoolam, A. Torrielli
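Fig. 1. A Wilson loop with a crossing. (Labels in the figure: holonomies U, V; loop representation R; faces of area A1, A2, A3 carrying representations R1, R2, R3.)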
which disagrees with (5.2). The reason that the dimensions do not come out right is that we were forced to use formula (5.4). We conclude that this procedure is not consistent. On the other hand, the same computation can be carried out with quantum characters, and in that case we do get the quantum dimension in (5.4).
5.2. Gauge invariance of Wilson loops. There is a short proof of gauge invariance for the Wilson loops and boundary elements we have discussed in previous sections. Let $U \in \mathrm{Fun}_q(SU(N))$ (for more details on this see Appendices B and D), and consider the ad-action of $\mathrm{Fun}_q(SU(N))$ on itself:
$$\mathrm{ad}: U \to h\, U\, S(h), \qquad (5.6)$$
where we are considering $\mathrm{Fun}_q(SU(N))$ as a Hopf algebra with antipode $S$ [17]. It is easy to see that the quantum trace
$$\mathrm{Tr}(u\, U) \qquad (5.7)$$
is left invariant under this action (for the definition of the u-element, see Appendix B). We get:
$$\mathrm{Tr}(u\, h\, U\, S(h)) = \mathrm{Tr}(S^2(h)\, u\, U\, S(h)) = (S^2(h))_{ij}\, (uU)_{jk}\, (S(h))_{ki} = (S^2(h))_{ij}\, (S(h))_{ki}\, (uU)_{jk} = S\big(h_{ki}\, (S(h))_{ij}\big)\, (uU)_{jk} = \mathrm{Tr}(u\, U), \qquad (5.8)$$
where we used $\epsilon(h) = 1$, and the fact that $u$ satisfies
$$u\, x = S^2(x)\, u \qquad (5.9)$$
for any $x \in \mathrm{Fun}_q(SU(N))$. Thus, gauge invariance in $\mathrm{Fun}_q(SU(N))$ is ensured provided we include the u-element.
We have proven that the triple
$$(\text{Migdal gluing},\ \text{quantum dimensions},\ \text{classical characters}) \qquad (5.10)$$
is inconsistent in the generic case. To get a consistent theory, we need to modify one of the above. If instead of quantum dimensions we use classical dimensions, we of course get back the usual 2d Yang-Mills. If we want the dimensions to be quantum, we either need quantum characters, or a modification of the gluing rules. The possibility to have quantum characters has been discussed at length in this paper, and it has been shown to be consistent in [13]. In particular, the theory is gauge invariant and independent of the triangulation. We do not exclude that there might be a complicated modification of the gluing rules that would allow one to keep quantum dimensions and classical characters even in the presence of crossings.

Additional features of the quantum characters are the following. The natural expansion of the quantum dimensions is in terms of quantum characters, which are most easily expressed in terms of a Hecke algebra, as we have shown. This gives a natural deformation of the symmetric group description of covering maps of the Riemann surface. Also in the case with boundaries, the use of quantum characters was essential for this. Finally, q-deformed 2d Yang-Mills computes invariants of knots in Seifert manifolds [28, 11]. This is also expected from open-closed string duality in the A-model with branes. This relation will however only work if on the qYM side we deform the gauge symmetry as well so as to get quantum characters, since only that will give the quantum 6j-symbols that appear in the Reshetikhin-Turaev invariant relevant for knots in Chern-Simons [11].

6. Discussion and Outlook

We have shown that the chiral large N expansion (note that the q-number [N] appears as the natural expansion parameter) for q-deformed Yang-Mills can be described by Hecke algebras. The full large N expansion is expected to be given by a coupled product of chiral and anti-chiral contributions.
We expect that the techniques of this paper can be extended to give a precise description of this non-chiral expansion in terms of Hecke algebras. The string interpretation of q-deformed 2d Yang-Mills on $\Sigma_G$ has been developed in [9, 10]. The leading order terms in the expansion, obtained by setting the $\Omega$ factors to 1, were shown to compute Gromov-Witten invariants of a Calabi-Yau space X which is a direct sum of line bundles $L_p \oplus L_{2g-2-p}$ fibered over $\Sigma_G$. The sub-leading terms, due to the $\Omega$ factors, were interpreted in terms of D-brane insertions at 2G − 2 points. This picture develops the Gross-Taylor interpretation (at q = 1) of the $\Omega$ factors in terms of fixed points on the Riemann surface [3, 4]. An alternative interpretation of the $\Omega$ factors underlies the topological string theory developed in [5, 6] for q = 1. The latter topological string is different from the standard one. It has been labelled a balanced topological string and has been observed to be an example of a general class of balanced topological field theories naturally related to Euler characters of moduli spaces [29]. It integrates over the moduli space of holomorphic maps the Euler class of the tangent bundle to that moduli space. The concrete connection between Euler characters and the large N expansion of two dimensional Yang-Mills is manifest when one expands the $\Omega$ factors and recognizes the binomial coefficients as Euler characters of configuration spaces of points on the Riemann surface $\Sigma_G$ [5]. Our treatment of the $\Omega$ factors in the q-deformed case, which
has expressed it in terms of central elements of the Hecke algebra, naturally lends itself to this interpretation. Euler characters of configuration spaces continue to appear in the expansion for the same reasons as at q = 1. This suggests that a closed topological string interpretation exists for the large N expansion of q-deformed two-dimensional Yang-Mills in terms of a balanced topological string. The simplest proposal along these lines is that the balanced topological string with target space X would give a closed string interpretation for the all orders expansion of q-deformed two-dimensional Yang-Mills. The relation of such a picture to the D-brane insertions of [10] would involve an interesting incarnation of open-string/closed-string duality. Developing these relations requires a clearer understanding of the coupling between holomorphic and anti-holomorphic sectors in the context of the balanced topological string. The connection between the Gross-Taylor expansion and the Gromov-Witten invariants appearing in [9, 10] has also been discussed in [30, 31].

Given the rather simple Hecke q-deformation we have uncovered of the sums over symmetric group delta functions related to the classical Hurwitz counting of branched covers, it is also natural to speculate that there is an intrinsically two-dimensional picture which would account for the Hecke delta functions, without appealing to the Calabi-Yau X. One possibility is that we have q-deformed Riemann surfaces and maps between such Riemann surfaces. In fact q-deformed planes, known as Manin planes, have been studied and holomorphy has been discussed (see for example [32]). One could construct Riemann surfaces which, in some sense, locally look like Manin planes, and consider holomorphic maps between them. As far as we are aware, such a theory of Hurwitz spaces for q-deformed Riemann surfaces has not yet been developed.
While Hecke algebras are more familiar to mathematical physicists as centralizers of quantum groups acting in tensor spaces, they have another pure mathematical origin (see for example [33]). $H_n(q)$ is an algebra of double cosets $B_n(\mathbb{F}_q) \backslash GL_n(\mathbb{F}_q) / B_n(\mathbb{F}_q)$. Here $\mathbb{F}_q$ is the finite field with q elements, where q is a power of a prime p. (If q = p then $\mathbb{F}_q$ is just the field of residue classes modulo p.) $GL_n(\mathbb{F}_q)$ is the group of invertible n × n matrices with entries in $\mathbb{F}_q$. $B_n(\mathbb{F}_q)$ is the subgroup of the upper triangular matrices. This generalises the fact that $S_n$ appears from double cosets $B_n(\mathbb{C}) \backslash GL_n(\mathbb{C}) / B_n(\mathbb{C})$. Hence the deformation of $\mathbb{C}S_n$ to the Hecke algebra $H_n(q)$ corresponds to going from $\mathbb{C}$ to $\mathbb{F}_q$. This suggests that, at least for q equal to a power of a prime, our Hecke-q-deformed Hurwitz counting problem might be related to Riemann surfaces over $\mathbb{F}_q$. It is interesting that, in this context, fundamental groups can be defined and they still take the form
$$a_1 b_1 a_1^{-1} b_1^{-1}\, a_2 b_2 a_2^{-1} b_2^{-1} \cdots a_G b_G a_G^{-1} b_G^{-1}\, u_1 \cdots u_B = 1. \qquad (6.1)$$
There are also results on the moduli spaces of branched covers in this set-up, generalizing properties of classical Hurwitz space [34]. An interesting direction for the future is to determine if there is a relation between Hecke algebras $H_n(q)$ and these moduli spaces, and if such a relation provides the geometrical meaning for the q-deformed Hecke counting problems in (2.56), (2.57). Classical and q-deformed 2d Yang-Mills are closely connected to Chern-Simons theory on Seifert manifolds [35, 11, 10, 36, 37]. On the other hand, some of the formulas in this paper, such as (4.11), are suggestive of some connection of the chiral large N expansion of q-deformed 2d Yang-Mills and orbifold conformal field theories [25] or Chern-Simons theory for finite gauge groups [26, 27]. It is known that the Chung-Fukuma-Shapere three-dimensional topological field theory [38] is the absolute value squared of the partition function of the Dijkgraaf-Witten theory. It seems very likely that the chiral expansion in terms of Hecke characters worked out in this paper can
be formulated in the two-dimensional topological field theory framework of [39, 38] with additional insertions coming from the branch points. It would be interesting to see in detail to what extent the chiral q-deformed 2d Yang-Mills theory is related to the Dijkgraaf-Witten theory. In view of the connection to Chern-Simons theory, it will be interesting to explore the q-deformed chiral theory as well when q approaches roots of unity. q-Schur-Weyl duality at roots of unity has been discussed in [40, 41].

Acknowledgements. We thank Mina Aganagic, Luca Griguolo, Costis Papageorgakis, Gabriele Travaglini, and Ivan Todorov for useful discussions and correspondence. SR is supported by a PPARC Advanced Fellowship. The research of SR and SdH is in part supported by the EC Marie Curie Research Training Network MRTN-CT-2004-512194. The work of AT is supported by DFG (Deutsche Forschungsgemeinschaft) within the “Schwerpunktprogramm Stringtheorie 1096”. We have used the package NCAlgebra for some of our computations with the Hecke algebra.
A. Central Elements

A.1. Centrality of q-deformed conjugation sum. We want to show that
$$\sum_s q^{-l(s)}\, h(s)\, h(t)\, h(s^{-1}) \qquad (A.1)$$
is central in $H_q(n)$. Since $H_q(n)$ is generated by $g_1, \ldots, g_{n-1}$, it suffices to show that the above element commutes with these generators. We will first show it for $g_1$, and it will be clear the same proof can be repeated for $g_2$, etc. First recall how this works in the case q = 1. We write
$$s_1 \sum_s s\, t\, s^{-1} = \sum_s \tilde s\, t\, \tilde s^{-1} s_1 = \sum_{\tilde s} \tilde s\, t\, \tilde s^{-1} s_1,$$
where we defined $\tilde s = s_1 s$. The cancellation only uses a pair of terms at a time. For a fixed s, $s_1\, s t s^{-1} = \tilde s t \tilde s^{-1}\, s_1$ and $s_1\, \tilde s t \tilde s^{-1} = s t s^{-1}\, s_1$, which means that
$$[s_1, s t s^{-1}] + [s_1, \tilde s t \tilde s^{-1}] = 0. \qquad (A.2)$$
It turns out that the same pairwise cancellation works for q ≠ 1. It is instructive to check it explicitly for n = 3, 4. Below we give the general argument. Suppose s is of the form $s_1 u$, where u is a word in the generators. Now recall that before applying the map h to s we must express it in reduced form. This means that if $s = s_1 u$, the leftmost term in u is not $s_1$. The following can be derived easily:
$$h(s) = g_1 h(u), \qquad l(s) = l(u) + 1, \qquad h(s^{-1}) = h(u^{-1})\, g_1.$$
Then $\tilde s = s_1 s = u$. Now we write the pair of elements from (A.1) for the fixed $s$, $\tilde s$:
$$q^{-l(s)} h(s) h(t) h(s^{-1}) = q^{-l(u)-1}\, g_1 h(u) h(t) h(u^{-1}) g_1, \qquad q^{-l(\tilde s)} h(\tilde s) h(t) h(\tilde s^{-1}) = q^{-l(u)}\, h(u) h(t) h(u^{-1}). \qquad (A.3)$$
The commutator with the first term is
$$[g_1, q^{-l(s)} h(s) h(t) h(s^{-1})] = q^{-l(u)} h(u) h(t) h(u^{-1}) g_1 + q^{-l(u)-1} (q-1)\, g_1 h(u) h(t) h(u^{-1}) g_1 - q^{-l(u)} g_1 h(u) h(t) h(u^{-1}) - q^{-l(u)-1} (q-1)\, g_1 h(u) h(t) h(u^{-1}) g_1. \qquad (A.4)$$
The commutator with the second term in (A.3) is
$$[g_1, q^{-l(u)} h(u) h(t) h(u^{-1})] = q^{-l(u)} g_1 h(u) h(t) h(u^{-1}) - q^{-l(u)} h(u) h(t) h(u^{-1}) g_1. \qquad (A.5)$$
Combining the terms in (A.4) and (A.5) we see that the terms proportional to a power of q cancel between the two equations (as they must for this to work at q = 1). The terms containing a factor q − 1 cancel within (A.4). This proves that the sum (A.1) commutes with $g_1$. It has been done by decomposing the sum over $S_n$ into a sum over left coset elements by the subgroup $S_2$ generated by $s_1$, and a sum over representatives in each coset. The vanishing of the commutator with $g_1$ works within the sum over representatives in each coset. To prove that it commutes with $g_2, \ldots, g_{n-1}$ we similarly decompose with respect to left cosets of $s_2, \ldots, s_{n-1}$. Hence (A.1) is central in $H_q(n)$. It follows that its matrix representation in any irreducible representation must be diagonal. Using the matrices given in [16], we have checked this explicitly up to n = 4. A special case of (A.1) is given by the choice t = 1. Based on evidence described below, we conjecture that its character in an irreducible representation is
$$\sum_s q^{-l(s)}\, \frac{\chi_R}{d_R(1)}\big(h(s^{-1}) h(s)\big) = \frac{g\, d_R(1)}{d_R(q)}, \qquad (A.6)$$
with $d_R(q)$ and $g$ as given in (2.13). Since the Hecke element in the character is central (after summation over s), it suffices to calculate it on one state in the irrep. We have checked this for general completely symmetric reps and completely antisymmetric reps, as well as for all representations up to n = 4, using the explicit matrices given in [16]. Another check of this formula is to multiply by $d_R(q)\, d_R(1)$ and sum over Young diagrams R with n boxes. Using (2.43), the LHS becomes
$$g\, \delta\Big(\sum_s q^{-l(s)}\, h(s^{-1}) h(s)\Big). \qquad (A.7)$$
But from [21] $\delta(h(s^{-1}) h(s)) = q^{l(s)}$. Hence the LHS is equal to $(g\, n!)$. On the RHS we have $g \sum_R (d_R(1))^2 = (g\, n!)$. This gives a consistency check of (A.6) for any n.
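The centrality of (A.1) and the trace property $\delta(h(s^{-1})h(s)) = q^{l(s)}$ quoted above can be spot-checked by machine for small n. The following is a minimal sketch, not the authors' NCAlgebra computation: it assumes the standard basis $h(w)$, $w \in S_n$, with $g_i^2 = (q-1) g_i + q$, represents permutations as tuples, and works at a fixed rational value of q (so it is a numerical check rather than a proof). All function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def length(w):
    """Coxeter length l(w) = number of inversions of the permutation w."""
    n = len(w)
    return sum(1 for a in range(n) for b in range(a + 1, n) if w[a] > w[b])

def inverse(w):
    inv = [0] * len(w)
    for position, image in enumerate(w):
        inv[image] = position
    return tuple(inv)

def reduced_word(w):
    """Indices (i1, ..., ik), k = l(w), such that h(w) = g_i1 ... g_ik."""
    w, swaps = list(w), []
    i = 0
    while i < len(w) - 1:
        if w[i] > w[i + 1]:          # descent: remove it by an adjacent swap
            w[i], w[i + 1] = w[i + 1], w[i]
            swaps.append(i)
            i = 0
        else:
            i += 1
    return swaps[::-1]

def times_generator(x, i, q):
    """Right-multiply x = {w: coeff} by g_i, using g_i^2 = (q - 1) g_i + q."""
    out = {}
    for w, c in x.items():
        ws = list(w)
        ws[i], ws[i + 1] = ws[i + 1], ws[i]
        ws = tuple(ws)
        if w[i] < w[i + 1]:          # l(w s_i) = l(w) + 1
            out[ws] = out.get(ws, 0) + c
        else:                        # h(w) = h(w s_i) g_i, so reduce g_i^2
            out[w] = out.get(w, 0) + (q - 1) * c
            out[ws] = out.get(ws, 0) + q * c
    return out

def mul(x, y, q):
    """Product x * y of two Hecke algebra elements."""
    out = {}
    for v, cv in y.items():
        term = {w: c * cv for w, c in x.items()}
        for i in reduced_word(v):
            term = times_generator(term, i, q)
        for w, c in term.items():
            out[w] = out.get(w, 0) + c
    return {w: c for w, c in out.items() if c != 0}

def h(w):
    return {tuple(w): Fraction(1)}

def conjugation_sum(t, n, q):
    """The element (A.1): sum over s of q^(-l(s)) h(s) h(t) h(s^-1)."""
    total = {}
    for s in permutations(range(n)):
        term = mul(mul(h(s), h(t), q), h(inverse(s)), q)
        for w, c in term.items():
            total[w] = total.get(w, 0) + c / q ** length(s)
    return {w: c for w, c in total.items() if c != 0}

def is_central(x, n, q):
    """True if x commutes with every generator g_1, ..., g_{n-1}."""
    identity = tuple(range(n))
    for i in range(n - 1):
        g_i = times_generator(h(identity), i, q)
        if mul(g_i, x, q) != mul(x, g_i, q):
            return False
    return True
```

For example, `is_central(conjugation_sum(t, 3, Fraction(3)), 3, Fraction(3))` can be evaluated for each `t` in `permutations(range(3))`, while a single generator such as `h((1, 0, 2))` fails the same test.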
Using (A.6) and (2.44),
$$\sum_s q^{-l(s)}\, \frac{\chi_R}{d_R(1)}\big(h(s) h(t) h(s^{-1})\big) = \sum_s q^{-l(s)}\, \frac{\chi_R}{d_R(1)}\big(h(s) h(s^{-1}) h(t)\big) = \frac{g\, d_R(1)}{d_R(q)}\, \frac{\chi_R(h(t))}{d_R(1)} = \frac{g}{d_R(q)}\, \chi_R(h(t)). \qquad (A.8)$$
Hence
$$\sum_{s,t} q^{-l(t)-l(s)}\, \frac{\chi_R}{d_R(1)}\big(h(s) h(t) h(s^{-1}) h(t^{-1})\big) = \sum_{s,t} q^{-l(t)} q^{-l(s)}\, \frac{\chi_R}{d_R(1)}\big(h(s) h(t) h(s^{-1})\big)\, \frac{\chi_R}{d_R(1)}\big(h(t^{-1})\big)$$
$$= \frac{g}{d_R(q)} \sum_t q^{-l(t)}\, \chi_R(h(t))\, \frac{\chi_R}{d_R(1)}\big(h(t^{-1})\big) = \frac{g}{d_R(q)\, d_R(1)} \sum_t q^{-l(t)}\, \chi_R(h(t))\, \chi_R(h(t^{-1}))$$
$$= \frac{g}{d_R(q)\, d_R(1)}\, \frac{g\, d_R(1)}{d_R(q)} = \left(\frac{g}{d_R(q)}\right)^2. \qquad (A.9)$$
The last sum over characters was done by using orthogonality (2.24). This shows the desired identity (2.50).

A.2. Centrality of q-deformed commutator sum. We prove that the element
$$C \equiv \sum_{s,t} q^{-l(s)-l(t)}\, h(s)\, h(t)\, h(s^{-1})\, h(t^{-1}) \qquad (A.10)$$
of the Hecke algebra $H_n(q)$ is central. In the q = 1 limit, this is $\sum_{s,t} s t s^{-1} t^{-1}$, a sum of commutators of all group elements. Hence C is a q-deformed sum of commutators. Since $H_n(q)$ is generated by $g_1, \ldots, g_{n-1}$ it suffices to prove that $g_i C = C g_i$ for any $g_i$. We will start with $g_1$ and it will be clear how to generalize to the other generators. Given the centrality of the q-deformed conjugation sum (A.1) we can write
$$\Delta_1 \equiv g_1 C - C g_1 = \sum_{s,t} q^{-l(s)-l(t)}\, h(s) h(t) h(s^{-1})\, g_1\, h(t^{-1}) - \sum_{s,t} q^{-l(s)-l(t)}\, h(s)\, g_1\, h(t) h(s^{-1}) h(t^{-1}). \qquad (A.11)$$
We want to prove $\Delta_1 = 0$. For q = 1 this can be proved as follows. If we define $t = \hat t s_1$, $s = \hat s s_1$, we can write
$$\sum_{s,t} s\, s_1\, t\, s^{-1} t^{-1} = \sum_{\hat s, \hat t} \hat s\, \hat t\, \hat s^{-1}\, s_1\, \hat t^{-1}. \qquad (A.12)$$
This shows that it is useful to think about the sums over $S_n$ in terms of the cosets $S_n / S_2$, where the $S_2$ is generated by $s_1$. Let us choose expressions for the elements of $S_n$ in terms of words of minimal length in $s_1, \ldots, s_{n-1}$. Let $S_+$ be the set of words not ending with $s_1$ on the right, and $S_-$ the set of elements of the form $\hat s s_1$. Clearly $\hat s$ does not end with $s_1$: if it did, s would not be in reduced form. Hence $\hat s \in S_+$. For such $s = \hat s s_1$, it is easy to see that
$$h(s) = h(\hat s)\, g_1, \qquad l(s) = l(\hat s) + 1, \qquad h(s^{-1}) = g_1\, h(\hat s^{-1}). \qquad (A.13)$$
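We can write $\Delta_1$ as
$$\begin{aligned}
\Delta_1 ={}& \Big(\sum_{s \in S_+} + \sum_{s = \hat s s_1 \in S_-;\ \hat s \in S_+}\Big) \Big(\sum_{t \in S_+} + \sum_{t = \hat t s_1 \in S_-;\ \hat t \in S_+}\Big)\, h(s) h(t) h(s^{-1})\, g_1\, h(t^{-1})\, q^{-l(s)-l(t)} \\
&- \Big(\sum_{s \in S_+} + \sum_{s = \hat s s_1 \in S_-;\ \hat s \in S_+}\Big) \Big(\sum_{t \in S_+} + \sum_{t = \hat t s_1 \in S_-;\ \hat t \in S_+}\Big)\, h(s)\, g_1\, h(t) h(s^{-1}) h(t^{-1})\, q^{-l(s)-l(t)} \\
={}& \sum_{s,t \in S_+} q^{-l(s)-l(t)}\, h(s) h(t) h(s^{-1})\, g_1\, h(t^{-1})
+ \sum_{\hat s, t \in S_+} q^{-l(\hat s)-l(t)-1}\, h(\hat s)\, g_1\, h(t)\, g_1\, h(\hat s^{-1})\, g_1\, h(t^{-1}) \\
&+ \sum_{s, \hat t \in S_+} q^{-l(s)-l(\hat t)-1}\, h(s) h(\hat t)\, g_1\, h(s^{-1})\, g_1^2\, h(\hat t^{-1})
+ \sum_{\hat s, \hat t \in S_+} q^{-l(\hat s)-l(\hat t)-2}\, h(\hat s)\, g_1\, h(\hat t)\, g_1^2\, h(\hat s^{-1})\, g_1^2\, h(\hat t^{-1}) \\
&- \sum_{s,t \in S_+} q^{-l(s)-l(t)}\, h(s)\, g_1\, h(t) h(s^{-1}) h(t^{-1})
- \sum_{s, \hat t \in S_+} q^{-l(s)-l(\hat t)-1}\, h(s)\, g_1\, h(\hat t)\, g_1\, h(s^{-1})\, g_1\, h(\hat t^{-1}) \\
&- \sum_{\hat s, t \in S_+} q^{-l(\hat s)-l(t)-1}\, h(\hat s)\, g_1^2\, h(t)\, g_1\, h(\hat s^{-1}) h(t^{-1})
- \sum_{\hat s, \hat t \in S_+} q^{-l(\hat s)-l(\hat t)-2}\, h(\hat s)\, g_1^2\, h(\hat t)\, g_1^2\, h(\hat s^{-1})\, g_1\, h(\hat t^{-1}).
\end{aligned}$$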
This can be simplified by using $g_1^2 = (q-1) g_1 + q$. We get terms with powers $q^{-l(s)-l(t)}$ in the summand but without powers of (q − 1), terms proportional to (q − 1) and terms proportional to $(q-1)^2$. The terms without powers of (q − 1) cancel pairwise among the 8 terms. The other terms can be written out explicitly, and seen to cancel. This proves that $[g_1, C] = 0$. When checking for commutation with $g_i$, we organise the sums over $S_n$ according to cosets of the $S_2$ subgroup generated by $s_i$. Then the same argument as above applies to show that any of the generating $g_i$ commute with C. Hence C is central.

A.3. The elements D and E of $H_n(q)$. Equation (A.6) also allows us to give an expression for D defined in (2.54). Let us write
$$E = \sum_s q^{-l(s)}\, h(s^{-1}) h(s) = 1 + {\sum_s}'\, q^{-l(s)}\, h(s^{-1}) h(s) \equiv 1 + E'.$$
The primed sum extends over elements in $S_n$ excluding the identity. Then we can write
$$\frac{\chi_R(E)}{d_R(1)} = \frac{g\, d_R(1)}{d_R(q)}. \qquad (A.14)$$
Using that E is central,
$$\frac{\chi_R(E^m)}{d_R(1)} = \left(\frac{\chi_R(E)}{d_R(1)}\right)^m = \left(\frac{g\, d_R(1)}{d_R(q)}\right)^m. \qquad (A.15)$$
Now let m = −1 to get
$$\chi_R(E^{-1}) = \frac{d_R(q)}{g}. \qquad (A.16)$$
Hence
$$D = g\, E^{-1} = g \sum_{k=0}^{\infty} (-1)^k (E')^k = g \sum_{k=0}^{\infty} (-1)^k\, {\sum_{u_1, u_2, \ldots, u_k}}'\, q^{-l(u_1)-l(u_2)-\cdots-l(u_k)}\, h(u_1^{-1}) h(u_1) \cdots h(u_k^{-1}) h(u_k). \qquad (A.17)$$
B. Quantum Dimensions

The irreducible representations R of $U_q(U(N))$ can be realized as subspaces of $V^{\otimes n}$, where V is the fundamental representation. The matrix elements of the fundamental representation are denoted by U, with entries $U_i^j$ (see Appendix D for explicit expressions), and the algebra generated by the U's is dual to $U_q(U(N))$ and is denoted by $\mathrm{Fun}_q(U(N))$. The commutation relations of the $U_i^j$ are given in terms of the R-matrix in the reference we denote as FRT [17] (we are using U for T of this reference). We first derive formula (2.34), used in Sect. 2 to obtain a Hecke formula for the quantum dimensions. Thus we need to compute the trace $\mathrm{tr}_n(u\, h(m_T))$ that comes from the quantum character expression. The element u is:
$$u = q^{\,2 \sum_{i=1}^{N} \left(\frac{N+1}{2} - i\right) E_{ii}}. \qquad (B.1)$$
The $E_{ij}$ act on the fundamental representation in the usual way
$$E_{ij}\, v_k = \delta_{jk}\, v_i. \qquad (B.2)$$
Now we can use the FRT formula for the R-matrix to show that
$$(\mathrm{tr} \otimes 1)(u \otimes 1)\, P R = q^N\, 1, \qquad (B.3)$$
and $\mathrm{tr}(u) = \frac{q^N - q^{-N}}{q - q^{-1}}$. This means that
$$(\mathrm{tr} \otimes \mathrm{tr})(u \otimes u)\, P R = q^N\, \frac{q^N - q^{-N}}{q - q^{-1}}. \qquad (B.4)$$
Going back to the Hecke algebra conventions using (2.8) ($q \to \sqrt{q}$), we get
$$(\mathrm{tr} \otimes 1)(u \otimes 1)\, g_1 = q^{\frac{N+1}{2}}\, 1, \qquad (\mathrm{tr} \otimes \mathrm{tr})(u \otimes u)\, g_1 = q^{\frac{N+1}{2}}\, [N]. \qquad (B.5)$$
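As a quick sanity check of the value of $\mathrm{tr}(u)$ quoted with (B.3), one can sum the diagonal entries of u directly. The sketch below assumes the reading of (B.1) in which u is diagonal on the fundamental representation with entries $q^{N+1-2i}$, $i = 1, \ldots, N$ (before the rescaling $q \to \sqrt{q}$); the function names are illustrative only.

```python
from fractions import Fraction

def quantum_trace_of_u(N, q):
    """tr(u) for a diagonal u with entries q^(N + 1 - 2i), i = 1..N (assumed reading of (B.1))."""
    return sum(q ** (N + 1 - 2 * i) for i in range(1, N + 1))

def closed_form(N, q):
    """The closed form (q^N - q^-N) / (q - q^-1) quoted in the text."""
    return (q ** N - q ** (-N)) / (q - 1 / q)
```

With an exact rational such as `q = Fraction(3, 2)`, the two functions can be compared for several values of N without rounding error.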
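More generally, tensor products of traces act on $u\, h(m_T)$ as
$$(\underbrace{\mathrm{tr} \otimes \mathrm{tr} \otimes \cdots \otimes \mathrm{tr}}_{i})(\underbrace{u \otimes u \otimes \cdots \otimes u}_{i})(g_1 g_2 \cdots g_{i-1}) = q^{(i-1)\frac{N+1}{2}}\, [N]. \qquad (B.6)$$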
We now need to find out how to build $h(m_T)$ out of the $g_i$'s. Consider a conjugacy class in $S_n$, denoted by T, made of permutations which have $K_i$ cycles of length i. When expressed in terms of the generators $s_i$, the minimal length permutation in this conjugacy class, denoted by $m_T$, has length $\sum_i (i-1) K_i$. The minimal permutations are given in terms of words of the form $g_i g_{i+1} \cdots g_{i+j}$, such as the one appearing in (B.6). For such minimal words, we can use (B.6) to obtain
$$\mathrm{tr}_n(u\, h(m_T)) \equiv \mathrm{tr}^{\otimes n}\big(u^{\otimes n}\, h(m_T)\big) = q^{\frac{N+1}{2} \sum_i (i-1) K_i}\, [N]^{\sum_i K_i} = q^{\frac{N+1}{2}\, l(m_T)}\, [N]^{\sum_i K_i}. \qquad (B.7)$$
This is the formula (2.34) used in the derivation of the q-dimension formula in Sect. 2. We now show explicitly how formula (2.29) works in some examples, and that it leads to a q-dimension formula in terms of central elements (2.36). For q-traces in $V^{\otimes 3}$, i.e. traces with $u^{\otimes 3}$ inserted, we have
$$\mathrm{tr}_q(1) = [N]^3, \qquad \mathrm{tr}_q(g_1) = [N]^2\, q^{\frac{N+1}{2}}, \qquad \mathrm{tr}_q(g_1 g_2) = [N]\, q^{N+1}, \qquad (B.8)$$
therefore
$$\mathrm{tr}_q(g_2 g_1 g_2) = (q-1)\, \mathrm{tr}_q(g_2 g_1) + q\, \mathrm{tr}_q(g_1) = (q-1)\, q^{N+1}\, [N] + q\, q^{\frac{N+1}{2}}\, [N]^2. \qquad (B.9)$$
Now (2.29) gives for the q-dimension
$$\dim_q(R) = \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \Big[ [N]^3 \chi_R(1) + 2 q^{-1} q^{\frac{N+1}{2}} [N]^2 \chi_R(g_1) + q^{-3} \chi_R(g_1 g_2 g_1)\, \mathrm{tr}_q(g_2 g_1 g_2) + q^{-2} \chi_R(g_1 g_2)\, q^{N+1} [N] + q^{-2} \chi_R(g_2 g_1)\, q^{N+1} [N] \Big]. \qquad (B.10)$$
Filling in the above, we finally find
$$\begin{aligned}
\dim_q(R) &= \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \Big[ [N]^3 \chi_R(1) + q^{\frac{N-1}{2}} [N]^2\, \chi_R\big(g_1 + g_2 + q^{-1} g_1 g_2 g_1\big) + q^{N-1} [N]\, \chi_R\big(g_1 g_2 + g_2 g_1 + q^{-1}(q-1)\, g_1 g_2 g_1\big) \Big] \\
&= \frac{1}{g}\, \frac{d_R(q)}{d_R(1)} \Big[ [N]^3 \chi_R(1) + q^{\frac{N-1}{2}} [N]^2\, \chi_R\big(C_{T_{(2,1)}}\big) + q^{N-1} [N]\, \chi_R\big(C_{T_{(3)}}\big) \Big]. \qquad (B.11)
\end{aligned}$$
The final expression contains central elements $C_T$ associated to conjugacy classes of $S_n$. There is the trivial conjugacy class containing the identity element, for which $C_T(q) = 1$. There is $C_{(2,1)}(q) = g_1 + g_2 + q^{-1} g_1 g_2 g_1$, for the conjugacy class corresponding to a
single transposition. Finally there is $C_{(3)}(q) = g_1 g_2 + g_2 g_1 + \frac{(q-1)}{q}\, g_1 g_2 g_1$. It is easy to check that these elements commute with $g_1$, $g_2$. The above central elements and their generalizations are described in [43, 44]. They approach the correct classical limit of a sum of permutations in the appropriate conjugacy class. Using the Hecke characters given in [16, 20], we have checked that the above is consistent with the standard formula for the q-dimension as a product over the cells of the Young diagram:
$$\dim_q R = \prod_{1 \le i < j \le N} \frac{q^{(\lambda_i - \lambda_j + j - i)/2} - q^{-(\lambda_i - \lambda_j + j - i)/2}}{q^{(j-i)/2} - q^{-(j-i)/2}}, \qquad (B.12)$$
where $\lambda_1, \ldots, \lambda_N$ are the lengths of the rows of the Young tableau, and, for SU(N), $\lambda_N = 0$. For n = 2, an easy check gives for the symmetric representation:
$$\frac{[N][N+1]}{[2]}, \qquad (B.13)$$
and for the antisymmetric representation:
$$\frac{[N][N-1]}{[2]}. \qquad (B.14)$$
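The n = 2 check of (B.12) against (B.13) and (B.14) is easy to reproduce numerically. The sketch below assumes the q-number convention $[x] = (q^{x/2} - q^{-x/2})/(q^{1/2} - q^{-1/2})$, which is what the rescaling $q \to \sqrt{q}$ in (B.5) suggests for (2.35); the function names are illustrative.

```python
import math

def qnumber(x, q):
    """[x] = (q^(x/2) - q^(-x/2)) / (q^(1/2) - q^(-1/2)); assumed form of (2.35)."""
    return (q ** (x / 2) - q ** (-x / 2)) / (q ** 0.5 - q ** -0.5)

def dim_q(row_lengths, N, q):
    """Product formula (B.12); row_lengths is padded with zeros up to N rows."""
    lam = list(row_lengths) + [0] * (N - len(row_lengths))
    result = 1.0
    for i in range(N):
        for j in range(i + 1, N):
            # the common normalisation of the two q-numbers cancels in the ratio
            result *= qnumber(lam[i] - lam[j] + j - i, q) / qnumber(j - i, q)
    return result
```

For instance, `dim_q([2], N, q)` can be compared with `qnumber(N, q) * qnumber(N + 1, q) / qnumber(2, q)`, and `dim_q([1, 1], N, q)` with `qnumber(N, q) * qnumber(N - 1, q) / qnumber(2, q)`, at any q > 0 with q ≠ 1.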
We have also obtained, by the above manipulations, explicit formulae for central elements for n = 4 which agree with those given in [22]. We have also checked that our formula for the quantum dimensions (2.36) agrees with the standard formula (B.12) for all representations up to n = 4.

C. Projectors

Below we give explicit checks that (2.28) indeed defines projectors. We do this for n = 3 and n = 4, that is, for the Hecke algebras $H_3$ and $H_4$, and outline the method of [21] for the general case.

C.1. $H_3$. We work out the projector for a general representation of $H_3$. It contains 3! = 6 independent terms, corresponding to the six basis elements of $H_3$. Using (B.9), we get
$$P_R = \frac{1}{c_R} \Big[ \chi_R(1) + \frac{1}{q}\, \chi_R(g_1)\, (g_1 + g_2) + \Big( \frac{q-1}{q^3}\, \chi_R(g_1 g_2) + \frac{1}{q^2}\, \chi_R(g_1) \Big)\, g_1 g_2 g_1 + \frac{1}{q^2}\, \chi_R(g_1 g_2)\, (g_1 g_2 + g_2 g_1) \Big], \qquad (C.1)$$
where the term $g_1 g_2 g_1$ corresponds to the (13) permutation. Notice that in $S_n$, $s_1 s_2 s_1$ is in the same conjugacy class as $s_1$. Indeed, in the classical case where q = 1 the first term in (B.9) is absent and $\mathrm{tr}(g_1 g_2 g_1) = \mathrm{tr}(g_1)$. In the quantum case, $g_1 g_2 g_1$ has contributions from both $\chi(g_1 g_2)$ and $\chi(g_1)$. This implies that it contributes to two different class elements.
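Using the expressions for the characters in [16], we get for the three $H_3$ representations (labelled here by the row lengths of their Young diagrams):
$$\begin{aligned}
P_{(3)} &= \frac{1}{c_{(3)}}\, \big(1 + g_1 + g_2 + g_1 g_2 + g_2 g_1 + g_1 g_2 g_1\big), \\
P_{(2,1)} &= \frac{1}{c_{(2,1)}}\, \Big(2 + \frac{q-1}{q}\, (g_1 + g_2) - \frac{1}{q}\, (g_1 g_2 + g_2 g_1)\Big), \\
P_{(1,1,1)} &= \frac{1}{c_{(1,1,1)}}\, \Big(1 - \frac{1}{q}\, (g_1 + g_2) + \frac{1}{q^2}\, (g_1 g_2 + g_2 g_1) - \frac{1}{q^3}\, g_1 g_2 g_1\Big). \qquad (C.2)
\end{aligned}$$
We have checked by explicit computation that they satisfy the projection equation (2.10) provided
$$c_{(3)} = (q+1)(q^2+q+1), \qquad c_{(2,1)} = \frac{q^2+q+1}{q}, \qquad c_{(1,1,1)} = \frac{(q+1)(q^2+q+1)}{q^3}. \qquad (C.3)$$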
This agrees exactly with the values given in [21], Eq. (C.7) below.

C.2. $H_4$. For $H_4$, the projector contains 4! = 24 independent terms. The projector is:
$$\begin{aligned}
c_R P_R ={}& a + b\, (g_1 + g_2 + g_3) + c\, (g_1 g_2 + g_2 g_3 + g_2 g_1 + g_3 g_2) + d\, g_1 g_3 \\
&+ f\, (g_1 g_2 g_3 + g_1 g_3 g_2 + g_2 g_1 g_3 + g_3 g_2 g_1) + h\, (g_1 g_2 g_1 + g_2 g_3 g_2) \\
&+ k\, (g_1 g_2 g_1 g_3 + g_1 g_2 g_3 g_2 + g_1 g_3 g_2 g_1 + g_2 g_3 g_2 g_1) + l\, g_2 g_1 g_3 g_2 \\
&+ m\, (g_1 g_2 g_1 g_3 g_2 + g_2 g_1 g_3 g_2 g_1) + n\, g_1 g_2 g_3 g_2 g_1 + p\, g_2 g_1 g_3 g_2 g_1 g_3. \qquad (C.4)
\end{aligned}$$
The coefficients a, b, c, d, f, h, k, l, m, n, p depend on the representation. They are characters of $q^{-l(\sigma)} \chi(h(\sigma^{-1}))$ which can be simplified, using cyclicity and the Hecke relations, to
$$\begin{aligned}
a &= \chi(1), \qquad b = q^{-1} \chi(g_1), \qquad c = q^{-2} \chi(g_1 g_2), \\
d &= q^{-2} \chi(g_1 g_3), \qquad f = q^{-3} \chi(g_1 g_2 g_3), \\
h &= q^{-3}\, [(q-1)\, \chi(g_1 g_2) + q\, \chi(g_1)], \\
k &= q^{-4}\, [(q-1)\, \chi(g_1 g_2 g_3) + q\, \chi(g_1 g_2)], \\
l &= q^{-4}\, [(q-1)\, \chi(g_1 g_2 g_3) + q\, \chi(g_1 g_3)], \\
m &= q^{-5}\, [(q^2 - q + 1)\, \chi(g_1 g_2 g_3) + q(q-1)\, \chi(g_2 g_3)], \\
n &= q^{-5}\, [(q-1)^2\, \chi(g_1 g_2 g_3) + 2q(q-1)\, \chi(g_1 g_2) + q^2\, \chi(g_1)], \\
p &= q^{-6}\, [(q-1)(q^2+1)\, \chi(g_1 g_2 g_3) + q(q-1)^2\, \chi(g_1 g_2) + q^2\, \chi(g_1 g_3)]. \qquad (C.5)
\end{aligned}$$
Again, the mixing between different terms comes from using formulas like (B.9) and is related to the contribution of a single term to different central elements. In the limit q = 1, each of the a, d, ..., p depends on a single character, the one corresponding to the conjugacy class of the element that a, d, ..., p multiply in the projector.
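The projection property of the three $H_3$ projectors (C.2) with the normalizations (C.3) can be spot-checked with a small Hecke algebra implementation. This is a sketch under stated assumptions, not the computation used by the authors: it uses the basis $h(w)$ with $g_i^2 = (q-1) g_i + q$, evaluates at a fixed rational q, and labels the representations by the row lengths of their Young diagrams, (3), (2,1), (1,1,1). All function names are illustrative.

```python
from fractions import Fraction

def reduced_word(w):
    """0-based indices (i1, ..., ik) with h(w) = g_i1 ... g_ik, k = l(w)."""
    w, swaps = list(w), []
    i = 0
    while i < len(w) - 1:
        if w[i] > w[i + 1]:
            w[i], w[i + 1] = w[i + 1], w[i]
            swaps.append(i)
            i = 0
        else:
            i += 1
    return swaps[::-1]

def times_generator(x, i, q):
    """Right-multiply x = {w: coeff} by g_i (0-based), using g_i^2 = (q - 1) g_i + q."""
    out = {}
    for w, c in x.items():
        ws = list(w)
        ws[i], ws[i + 1] = ws[i + 1], ws[i]
        ws = tuple(ws)
        if w[i] < w[i + 1]:
            out[ws] = out.get(ws, 0) + c
        else:
            out[w] = out.get(w, 0) + (q - 1) * c
            out[ws] = out.get(ws, 0) + q * c
    return out

def mul(x, y, q):
    out = {}
    for v, cv in y.items():
        term = {w: c * cv for w, c in x.items()}
        for i in reduced_word(v):
            term = times_generator(term, i, q)
        for w, c in term.items():
            out[w] = out.get(w, 0) + c
    return {w: c for w, c in out.items() if c != 0}

def combine(terms, n, q):
    """Linear combination of words; generator labels are 1-based as in the text."""
    out = {}
    for coeff, indices in terms:
        x = {tuple(range(n)): Fraction(1)}
        for i in indices:
            x = times_generator(x, i - 1, q)
        for w, c in x.items():
            out[w] = out.get(w, 0) + coeff * c
    return {w: c for w, c in out.items() if c != 0}

def h3_projectors(q):
    """Projectors (C.2) divided by the constants (C.3)."""
    g1, g2, g12, g21, g121 = (1,), (2,), (1, 2), (2, 1), (1, 2, 1)
    raw = {
        "(3)": [(1, ()), (1, g1), (1, g2), (1, g12), (1, g21), (1, g121)],
        "(2,1)": [(2, ()), ((q - 1) / q, g1), ((q - 1) / q, g2),
                  (-1 / q, g12), (-1 / q, g21)],
        "(1,1,1)": [(1, ()), (-1 / q, g1), (-1 / q, g2),
                    (1 / q**2, g12), (1 / q**2, g21), (-1 / q**3, g121)],
    }
    norm = {
        "(3)": (q + 1) * (q**2 + q + 1),
        "(2,1)": (q**2 + q + 1) / q,
        "(1,1,1)": (q + 1) * (q**2 + q + 1) / q**3,
    }
    return {k: combine([(c / norm[k], w) for c, w in terms], 3, q)
            for k, terms in raw.items()}
```

With `P = h3_projectors(Fraction(2))`, each `mul(P[k], P[k], q)` can be compared with `P[k]`, and the three projectors can be added to check that they resolve the identity.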
Using computer algebra, we have checked that the above are projectors for the five n = 4 representations (labelled here by the row lengths of their Young diagrams), provided
$$\begin{aligned}
c_{(4)} &= (q+1)(q^2+q+1)(q^3+q^2+q+1), \\
c_{(3,1)} &= \frac{(q+1)(q^3+q^2+q+1)}{q}, \\
c_{(2,2)} &= \frac{(q+1)^2 (q^2+q+1)}{q^2}, \\
c_{(2,1,1)} &= \frac{(q+1)(q^3+q^2+q+1)}{q^3}, \\
c_{(1,1,1,1)} &= \frac{(q+1)(q^2+q+1)(q^3+q^2+q+1)}{q^6}. \qquad (C.6)
\end{aligned}$$
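The normalizations (C.3) and (C.6) can be compared with Gyoja's closed formula, quoted as (C.7) in Sect. C.3 below. The sketch assumes that formula reads $c_R = \prod_{i=1}^{m} \prod_{k=1}^{l_i} (q^k - 1) \big/ \prod_{i<j} (q^{l_i} - q^{l_j}) \cdot q^{m(m-1)(m-2)/6}\, (q-1)^{-n}$ with $l_i = \lambda_i + m - i$, and that representations are labelled by their row lengths; exact rational arithmetic at a fixed q avoids rounding. The function name is illustrative.

```python
from fractions import Fraction

def gyoja_c(row_lengths, q):
    """Normalisation c_R from the assumed reading of (C.7); q should be a Fraction."""
    m, n = len(row_lengths), sum(row_lengths)
    l = [lam + m - i for i, lam in enumerate(row_lengths, start=1)]
    numerator = Fraction(1)
    for li in l:
        for k in range(1, li + 1):
            numerator *= q ** k - 1
    denominator = Fraction(1)
    for i in range(m):
        for j in range(i + 1, m):
            denominator *= q ** l[i] - q ** l[j]
    # m(m-1)(m-2) is a product of three consecutive integers, hence divisible by 6
    power = m * (m - 1) * (m - 2) // 6
    return numerator / denominator * q ** power / (q - 1) ** n
```

For example, `gyoja_c([2, 1], q)` can be compared with `(q**2 + q + 1) / q`, and `gyoja_c([2, 2], q)` with `(q + 1)**2 * (q**2 + q + 1) / q**2`.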
C.3. The construction for $H_n$. In the general case, the projector contains n! elements. Gyoja has given a formula for the coefficients $c_R$ (footnote 2: we corrected a typo in the formula in [21]):
$$c_R = \frac{\prod_{i=1}^{m} (q-1)(q^2-1) \cdots (q^{\lambda_i + m - i} - 1)}{\prod_{1 \le i < j \le m} \big(q^{\lambda_i + m - i} - q^{\lambda_j + m - j}\big)}\; q^{\frac{1}{6} m(m-1)(m-2)}\, (q-1)^{-n}. \qquad (C.7)$$
From (C.3) and (C.6) we easily see that the coefficients satisfy
$$c_R(q^{-1}) = c_{R^T}(q), \qquad (C.8)$$
where $R^T$ is the representation with transposed Young tableau. This is indeed a general property of the projectors (C.7) [45]. It is easy to see that in the classical limit,
$$c_R(q = 1) = \frac{\prod_{i=1}^{m} l_i!}{\prod_{1 \le i < j \le m} (l_i - l_j)}, \qquad (C.9)$$
where $l_i = \lambda_i + m - i$. This is the coefficient of the Young symmetrizer, and is given by the hook formula. Also the quantum coefficients (C.7) can be expressed in terms of a hook formula. For high n, it is tedious to check idempotency of the projector. Also, it relies on having explicit formulas for the characters of the Hecke algebra. Gyoja [21] has given a construction to compute projectors in general without recourse to characters. In this construction, to associate a projector to a particular representation R, we first associate a projector to every state of the representation. Every state is represented by a standard tableau T. A standard tableau is a tableau where the entries (numbered with elements from {1, ..., n}) are increasing across each row and down each column. The number of states in a given representation is $d_R(1)$. Thus, $P_R$ will be a sum of $d_R(1)$ primitive projectors, which we call $E_T$, where T is the standard tableau they correspond to. The construction proceeds by defining two special tableaux, $T_+$ and $T_-$. These are the tableaux where the entries of the tableaux are numbered from 1 to n successively across the first row (column), then the second, third, etc. $I_+$ and $I_-$ are the subgroups of $S_n$ that
preserve the rows (columns) of $T_+$ ($T_-$). We associate to them parabolic subgroups $W_\pm$ of $S_n$ and define
$$e_+ = \sum_{w \in W_+} h(w), \qquad e_- = \sum_{w \in W_-} (-q)^{-l(w)}\, h(w). \qquad (C.10)$$
The primitive projector (up to normalization) associated to T is then
$$E(T) = h_-\, e_-\, h_-^{-1}\, h_+\, e_+\, h_+^{-1}, \qquad (C.11)$$
where $h_+ = h_+(T)$ and $h_- = h_-(T)$ are the elements of the Hecke algebra corresponding to the permutation that transforms $T_+$ (resp. $T_-$) to the standard tableau T. Gyoja showed that the E's are idempotents. The projector is then the sum of the orthogonal primitive idempotents:
$$P_R = \frac{1}{c_R} \sum_{i=1}^{d_R(1)} E(T_i), \qquad (C.12)$$
where $c_R$ was given before (see footnote 3). We checked the previously constructed projectors for n up to 4 using this construction. The first non-trivial case for n = 3 is the representation with Young diagram (2,1). There are two standard tableaux: $T_+ = [\{1, 2\}\{3\}]$ and $T_- = [\{1, 3\}\{2\}]$. The permutation relating both is (23), which is $h((23)) = g_2$. In this case the parabolic subgroups are $W_+ = W_- = \{1, s_1\}$, and
$$e_+ = 1 + g_1, \qquad e_- = 1 - \frac{1}{q}\, g_1. \qquad (C.13)$$
We further have $h_+(T_+) = 1$, $h_-(T_+) = g_2$, therefore
$$E(T_+) = 1 + g_1 + \frac{q-1}{q}\, g_2 - \frac{1}{q}\, g_1 g_2 + \frac{q-1}{q}\, g_2 g_1 - \frac{1}{q}\, g_1 g_2 g_1. \qquad (C.14)$$
For $E(T_-)$, $h_+ = g_2$ and $h_- = 1$, so
$$E(T_-) = 1 - \frac{1}{q}\, g_1 - g_2 g_1 + \frac{1}{q}\, g_1 g_2 g_1. \qquad (C.15)$$
The primitive idempotents are automatically orthogonal. We get
$$P_{(2,1)} = \frac{q}{q^2+q+1}\, \big(E(T_+) + E(T_-)\big) = \frac{q}{q^2+q+1}\, \Big(2 + \frac{q-1}{q}\, (g_1 + g_2) - \frac{1}{q}\, (g_1 g_2 + g_2 g_1)\Big), \qquad (C.16)$$
3 Gyoja showed that the primitive idempotents are orthogonal using a certain ordering. In order for (C.12) to be a projector, they must be orthogonal independently of the ordering. This can be done defining new primitive idempotents in terms of the old ones, see Theorem 4.5 in [21]. For n up to 4, however, we found that the primitive projectors are automatically orthogonal.
in agreement with the formula obtained earlier. As another example, we do n = 4 for the representation with Young diagram (3,1). There are three standard tableaux: $T_+ = [\{1, 2, 3\}, \{4\}]$, $T_- = [\{1, 3, 4\}, \{2\}]$, and $T_3 = [\{1, 2, 4\}, \{3\}]$. We have $I_+ = \{1, s_1, s_2\}$ and $I_- = \{1, s_1\}$, so
$$e_+ = 1 + g_1 + g_2 + g_1 g_2 + g_2 g_1 + g_1 g_2 g_1, \qquad e_- = 1 - \frac{1}{q}\, g_1. \qquad (C.17)$$
In this case $h_+(T_+) = 1$, $h_-(T_+) = g_3 g_2$. Thus:
$$E(T_+) = g_3 g_2\, e_-(T)\, g_2^{-1} g_3^{-1}\, e_+(T), \qquad (C.18)$$
which we worked out with the help of computer algebra. In the same way we have $h_-(T_-) = 1$, $h_+(T_-) = g_2 g_3$, so
$$E(T_-) = e_-(T)\, g_2 g_3\, e_+(T)\, g_3^{-1} g_2^{-1}. \qquad (C.19)$$
For $T_3$, $h_-(T_3) = g_2$, $h_+(T_3) = g_3$, hence
$$E(T_3) = g_2\, e_-(T)\, g_2^{-1}\, g_3\, e_+(T)\, g_3^{-1}. \qquad (C.20)$$
The projector is the sum of the three, with the appropriate coefficient, and it agrees with the one computed directly. Notice that the primitive idempotents were automatically orthogonal in this case as well.

D. q-Schur-Weyl Duality and q-Characters

In this appendix, we explain concretely the relation between quantum characters of the q-deformed SU(N) and the symmetric group, in the special case of SU(2). We will use the quantum group conventions of [46] and [47]. We will use the formulae for matrix elements of spin-one representations from [46] in terms of spin-half representations and show that they are consistent with expressing the characters in spin-one in terms of the characters of spin half, using the Hecke algebra generators, or R-matrices. For the R-matrix we will use the notation of [47].

D.1. $U_q(su(2))$ conventions. We first summarize some of the formulas of [46, 47] that we will use later. The $U_q(su(2))$ algebra and coproduct are [46]:
$$H e - e H = 2e, \qquad H f - f H = -2f, \qquad e f - f e = \frac{q^{H/2} - q^{-H/2}}{q^{1/2} - q^{-1/2}}, \qquad \Delta(e) = e \otimes q^{H/4} + q^{-H/4} \otimes e. \qquad (D.1)$$
For later convenience, we note that the map to the notation of [47] is
$$q \to q, \qquad e \to X^+, \qquad f \to X^-, \qquad H \to H. \qquad (D.2)$$
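The universal R-matrix in this basis is [47]
$$R = q^{\frac{H \otimes H}{4}} \sum_{n=0}^{\infty} \frac{(1 - q^{-1})^n}{[n]!}\, \big(q^{H/4} X^+\big)^n \otimes \big(q^{-H/4} X^-\big)^n, \qquad (D.3)$$
where [n] is as in (2.35). Together with the action of the generators on spin-half states,
$$e\, |\tfrac{1}{2}, -\tfrac{1}{2}\rangle = |\tfrac{1}{2}, \tfrac{1}{2}\rangle, \qquad f\, |\tfrac{1}{2}, \tfrac{1}{2}\rangle = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle, \qquad H\, |\tfrac{1}{2}, \pm\tfrac{1}{2}\rangle = \pm\, |\tfrac{1}{2}, \pm\tfrac{1}{2}\rangle, \qquad (D.4)$$
this determines the R-matrix as follows:
$$\begin{aligned}
R^{\frac{1}{2}, \frac{1}{2}}_{\frac{1}{2}, \frac{1}{2}} &= R^{-\frac{1}{2}, -\frac{1}{2}}_{-\frac{1}{2}, -\frac{1}{2}} = q^{1/4}, \\
R^{\frac{1}{2}, -\frac{1}{2}}_{\frac{1}{2}, -\frac{1}{2}} &= R^{-\frac{1}{2}, \frac{1}{2}}_{-\frac{1}{2}, \frac{1}{2}} = q^{-1/4}, \\
R^{\frac{1}{2}, -\frac{1}{2}}_{-\frac{1}{2}, \frac{1}{2}} &= q^{-1/4}\, \big(q^{1/2} - q^{-1/2}\big). \qquad (D.5)
\end{aligned}$$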
D.2. Schur-Weyl duality in spin-one. As in the classical case, the q-characters in higher representations can be written in terms of q-characters of lower representations. Consider for concreteness the case of spin-one, which is contained in the tensor product of two spin-half representations $V$. There is a projector (2.9) acting on $V \otimes V$ that leads to the symmetric representation. In the classical case it is just $\tfrac12 (1 + P)$, where $P$ is the permutation of the two tensor factors. In the quantum case $P$ does not commute with the coproduct, but $P R \equiv \check R$ does:
\[ \Delta\, \check R = \check R\, \Delta . \tag{D.6} \]
When $\check R$ acts on the tensor product of two spin-half irreps, it satisfies a relation of the form
\[ \check R^2 = q^{-1/4}\,(q^{1/2} - q^{-1/2})\, \check R + q^{-1/2} . \tag{D.7} \]
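As a sanity check on (D.3)-(D.7), the following minimal sketch builds the spin-half R-matrix from (D.3) and (D.4) in exact rational arithmetic and tests the quadratic relation. It assumes the basis order $|+,+\rangle, |+,-\rangle, |-,+\rangle, |-,-\rangle$ and parametrizes $q = t^4$ with a rational $t = q^{1/4}$; the helper names (`r_matrix`, `r_check`, and so on) are illustrative and not taken from [46, 47].

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def kron(A, B):
    """Kronecker product of two square matrices."""
    m = len(B)
    size = len(A) * m
    return [[A[r // m][c // m] * B[r % m][c % m] for c in range(size)]
            for r in range(size)]


def scaled_identity(n, s):
    """Return s times the n x n identity matrix."""
    return [[s if i == j else 0 for j in range(n)] for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(s, A):
    return [[s * a for a in row] for row in A]


def r_matrix(t):
    """R of (D.3) on spin-1/2 (x) spin-1/2, with t = q^(1/4) (assumed rational)."""
    q = t ** 4
    x_plus = [[0, 1], [0, 0]]       # X^+ in the basis |+1/2>, |-1/2>
    x_minus = [[0, 0], [1, 0]]      # X^-
    q_h4 = [[t, 0], [0, 1 / t]]     # q^{H/4}, since H = diag(1, -1)
    q_mh4 = [[1 / t, 0], [0, t]]    # q^{-H/4}
    a = matmul(q_h4, x_plus)
    b = matmul(q_mh4, x_minus)
    # Only n = 0 and n = 1 contribute: (q^{H/4} X^+)^2 = 0 on spin 1/2, and [1]! = 1.
    series = add(scaled_identity(4, 1), scale(1 - 1 / q, kron(a, b)))
    # q^{H (x) H / 4} is diagonal with entries q^{+-1/4}.
    prefactor = [[t, 0, 0, 0], [0, 1 / t, 0, 0], [0, 0, 1 / t, 0], [0, 0, 0, t]]
    return matmul(prefactor, series)


def r_check(t):
    """R-check = P R, with P the permutation of the two tensor factors."""
    swap = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
    return matmul(swap, r_matrix(t))


if __name__ == "__main__":
    t = Fraction(3, 2)
    c = (t ** 2 - 1 / t ** 2) / t   # q^{-1/4} (q^{1/2} - q^{-1/2})
    rc = r_check(t)
    lhs = matmul(rc, rc)
    rhs = add(scale(c, rc), scaled_identity(4, 1 / t ** 2))
    print("relation (D.7) holds:", lhs == rhs)
```

With the same matrices one can also check that $g = q^{3/4}\check R$ obeys $g^2 = (q-1)\,g + q$, which is the sense in which the rescaling leads to a standard Hecke relation.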
A rescaling $g = q^{3/4} \check R$ can be done to map to the standard form of the Hecke algebra used in the main text. A matrix element of some element $h$ of $U_q(su(2))$ in the spin-one representation can be written in terms of a product of spin-half representations by using the Clebsch-Gordan coefficients. Consider now the following matrix element in the spin-one representation:
\[ \langle j = 1, n|\, h\, | j = 1, m\rangle = d^{\,n}_{j=1;m}(h) = \langle h, d^{\,n}_{j=1;m}\rangle . \tag{D.8} \]
Here $d^{\,n}_{j;m}$ is the representation matrix in representation $j$ with indices $n, m$, and $d^{\,m}_{j;m}$ its trace. In the last equation we have expressed the fact that the matrix elements can be viewed as living in the dual space of $U_q(su(2))$, denoted by $\mathrm{Fun}_q(SU(2))$. For more details on this duality see for example [18, 19, 13].
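We now express this in terms of matrix elements of the fundamental representation. They generate $\mathrm{Fun}_q(SU(2))$, the deformed algebra of functions on $SU(2)$. Using the Clebsch-Gordan coefficients, we can rewrite the above as follows:
\[
\begin{aligned}
\langle j = 1, n|\, h\, | j = 1, m\rangle
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, \langle j = \tfrac12, n_1| \otimes \langle j = \tfrac12, n_2|\, (h_1 \otimes h_2)\, | j = \tfrac12, m_1\rangle \otimes | j = \tfrac12, m_2\rangle \\
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, d^{\,n_1}_{j=\frac12; m_1}(h_1)\, d^{\,n_2}_{j=\frac12; m_2}(h_2) \\
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, \langle h,\, d^{\,n_1}_{j=\frac12; m_1}\, d^{\,n_2}_{j=\frac12; m_2} \rangle .
\end{aligned}
\tag{D.9}
\]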
In the first equality, the co-product $\Delta(h) = h_1 \otimes h_2$ gives the action of $h$ on the tensor product $V \otimes V$. In the last equality, we used the fact that the dual pairing of a product of two elements in $\mathrm{Fun}_q(SU(2))$ is given by the co-product. Now we can sum over $m$ and use the identity between Clebsch-Gordan coefficients and projectors (see for example [42])
\[ \sum_{m} C^{m}_{n_1 n_2}\, C_{m}^{m_1 m_2} = P^{m_1 m_2}_{n_1 n_2}\big(\tfrac12, \tfrac12; 1\big) . \tag{D.10} \]
The projector is a linear combination of the identity and $\check R$. For $j = 1$, the projector is in the tensor product of two spin-half representations. It has to be a linear combination of $1$ and $\check R$ since the Hecke algebra generates the centralizer of the quantum group action in the tensor product:
\[ P\big(\tfrac12, \tfrac12; 1\big) = a + b\, \check R, \tag{D.11} \]
and for the matrix elements we have
\[ d^{\,n}_{j=1;m} = \sum_{m_1, m_2; n_1, n_2} d^{\,n_1}_{j=\frac12; m_1}\, d^{\,n_2}_{j=\frac12; m_2}\, C_{m}^{m_1 m_2}\, C^{n}_{n_1 n_2} . \tag{D.12} \]
To compute the character, we want the trace of this equation. Using (D.10), and expanding the projector in terms of the R-matrix as in (D.11), we get:
\[ \mathrm{tr}_1\, d = a\, (\mathrm{tr}_{\frac12}\, d)\, (\mathrm{tr}_{\frac12}\, d) + b\, \mathrm{tr}_{\frac12}\big( \check R\, (d_{\frac12} \otimes 1)(1 \otimes d_{\frac12}) \big), \tag{D.13} \]
which, written out in indices, reads:
\[ \sum_{m} d^{\,m}_{j=1;m} = \sum_{m_1, m_2; n_1, n_2} \big( a\, \delta^{m_1}_{n_1} \delta^{m_2}_{n_2} + b\, R^{m_1 m_2}_{n_2 n_1} \big)\, d^{\,n_1}_{\frac12; m_1}\, d^{\,n_2}_{\frac12; m_2} . \tag{D.14} \]
We will show that the above equation can indeed be solved for constants $a, b$. The left-hand side can be calculated to give:
\[ \sum_{m} d^{\,m}_{j=1;m} = x^2 + (x y + \sqrt{q}\, u v) + y^2 = x^2 + y^2 + x y\, (1 + q) - q, \tag{D.15} \]
where we have used Eqs. (36-40) of [46] (recalling that $x, y, u, v$ are the matrix entries of $d$ in the fundamental representation). For the right-hand side of (D.14) we get
\[ (a + b\, q^{1/4})(x^2 + y^2) + a\, x y + 2 b\, u v\, q^{-1/4} + \big(a + b\, q^{-1/4} (\sqrt{q} - 1/\sqrt{q})\big)\, y x . \tag{D.16} \]
Using the relations
\[ y x = (1 - q) + q\, x y, \qquad u v = q^{1/2} (x y - 1), \tag{D.17} \]
we can rewrite (D.16) in terms of $x^2, y^2, x y, 1$. Comparing with (D.15) and considering the coefficient of $x^2 + y^2$ we immediately see that
\[ a + b\, q^{1/4} = 1 . \tag{D.18} \]
With this condition the coefficient of $x y$ becomes $(q + 1)$ as desired. Comparing coefficients of the constant term then determines
\[ a = \frac{1}{1+q}, \qquad b = \frac{q^{3/4}}{1+q} . \tag{D.19} \]
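The coefficient matching behind (D.18) and (D.19) can be replayed mechanically. The sketch below substitutes the relations (D.17) into (D.16) and returns the coefficients of $x^2 + y^2$, $xy$ and $1$, which should equal $1$, $1+q$ and $-q$ according to (D.15). It works with $t = q^{1/4}$ taken rational so that the arithmetic is exact; the function names are illustrative only.

```python
from fractions import Fraction


def reduced_coefficients(a, b, t):
    """Coefficients of (x^2 + y^2, xy, 1) in (D.16) after applying (D.17).

    t stands for q^(1/4), so q = t**4 and sqrt(q) = t**2.
    """
    q = t ** 4
    uv_coeff = 2 * b / t                           # 2 b q^{-1/4}
    yx_coeff = a + b * (t ** 2 - 1 / t ** 2) / t   # a + b q^{-1/4}(sqrt(q) - 1/sqrt(q))
    # Substitute yx = (1 - q) + q xy and uv = q^{1/2} (xy - 1).
    coeff_squares = a + b * t
    coeff_xy = a + uv_coeff * t ** 2 + yx_coeff * q
    coeff_const = -uv_coeff * t ** 2 + yx_coeff * (1 - q)
    return coeff_squares, coeff_xy, coeff_const


def solution_d19(t):
    """The values of a and b quoted in (D.19), written in terms of t = q^(1/4)."""
    q = t ** 4
    return 1 / (1 + q), t ** 3 / (1 + q)


if __name__ == "__main__":
    t = Fraction(3, 2)
    a, b = solution_d19(t)
    print(reduced_coefficients(a, b, t), "expected", (1, 1 + t ** 4, -t ** 4))
```

Running the check for several rational values of $t$ is a quick way to confirm that the three conditions are compatible, even though only two unknowns are being fixed.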
Putting everything together, and going back to the notation used in the main text, we get:
\[ \mathrm{tr}_1\, U = \frac{1}{1+q}\, \mathrm{tr}\, U\, \mathrm{tr}\, U + \frac{q^{3/4}}{1+q}\, \mathrm{tr} \otimes \mathrm{tr}\, \big( \check R\, (U \otimes 1)(1 \otimes U) \big), \tag{D.20} \]
which is q-Schur-Weyl duality (2.20) for $n = 1$. By comparing (D.7) with the first of (2.6) we can see that we can define $g = q^{3/4} \check R$. Then the projector can be read off from the above,
\[ P = \frac{1}{1+q}\, (1 + g), \tag{D.21} \]
and agrees with (2.9) and the general form (2.28).

D.3. Quantum characters in spin-one representation. The quantum characters can be obtained from the above by including the u-element (B.1) in the trace, which is basically $q^{-H}$. In fact, we will do a slightly more general computation of the trace with an insertion of $q^{AH}$. Thus, we consider the matrix element in the spin-one representation of $h\, q^{AH}$, where $A$ is an arbitrary number and $h$ is an arbitrary element of $U_q(su(2))$:
\[ \langle j = 1, n|\, h\, q^{AH}\, | j = 1, m\rangle = q^{A m}\, d^{\,n}_{j=1;m}(h) . \tag{D.22} \]
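As before, we now rewrite this in spin-half matrix coefficients using the Clebsch-Gordan coefficients:
\[
\begin{aligned}
\langle j = 1, n|\, h\, q^{AH}\, | j = 1, m\rangle
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, \langle j = \tfrac12, n_1| \otimes \langle j = \tfrac12, n_2|\, (h_1 \otimes h_2)(q^{AH} \otimes q^{AH})\, | j = \tfrac12, m_1\rangle \otimes | j = \tfrac12, m_2\rangle \\
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, d^{\,n_1}_{j=\frac12; m_1}(h_1)\, d^{\,n_2}_{j=\frac12; m_2}(h_2)\, q^{A m_1 + A m_2} \\
&= \sum_{m_1, m_2; n_1, n_2} C^{n}_{n_1 n_2}\, C_{m}^{m_1 m_2}\, q^{A m_1 + A m_2}\, \langle h,\, d^{\,n_1}_{j=\frac12; m_1}\, d^{\,n_2}_{j=\frac12; m_2} \rangle .
\end{aligned}
\]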
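Again, we can sum over $m = n$ to take the trace (notice the presence of $q^{AH}$, so this gives the quantum trace) and use (D.10) to get
\[ \sum_{m} q^{A m}\, d^{\,m}_{j=1;m} = \sum_{m_1, m_2; n_1, n_2} q^{A (m_1 + m_2)}\, \big( a\, \delta^{m_1}_{n_1} \delta^{m_2}_{n_2} + b\, R^{m_1 m_2}_{n_2 n_1} \big)\, d^{\,n_1}_{\frac12; m_1}\, d^{\,n_2}_{\frac12; m_2} . \tag{D.23} \]
For comparison to the classical formulae, it is useful to rewrite it as
\[ \mathrm{tr}_1 (q^{AH} U) = a\, \mathrm{tr}\,(q^{AH} U)\, \mathrm{tr}\,(q^{AH} U) + b\, \mathrm{tr} \otimes \mathrm{tr}\, \big( q^{AH} \check R\, (U \otimes 1)(1 \otimes U) \big) . \tag{D.24} \]
In the $q \to 1$ limit, $\check R$ goes to the permutation $P$ and the second term becomes $\tfrac12\, \mathrm{tr}(U^2)$. We still need to compute the constants $a, b$ in this case. Writing out the traces using [46], we get:
\[
\begin{aligned}
\mathrm{tr}_1 (q^{AH} U) &= q^{A} x^2 + q^{-A} y^2 + 1 + (q^{1/2} + q^{-1/2})\, u v, \\
\mathrm{tr} \otimes \mathrm{tr}\, \big( q^{A\, H \otimes H} (U \otimes 1)(1 \otimes U) \big) &= \big(\mathrm{tr}\,(q^{AH} U)\big)^2 = q^{A} x^2 + q^{-A} y^2 + 2 + (q^{1/2} + q^{-1/2})\, u v, \\
\mathrm{tr} \otimes \mathrm{tr}\, \big( q^{A\, H \otimes H} \check R\, (U \otimes 1)(1 \otimes U) \big) &= q^{1/4} \big[ q^{A} x^2 + q^{-A} y^2 + 1 - q^{-1} + (q^{1/2} + q^{-1/2})\, u v \big],
\end{aligned}
\tag{D.25}
\]
where $A$ denotes an arbitrary power. It is now easy to see that the values of $a, b$ in (D.19) are still the same, independently of the value of $A$. From the explicit computation (D.25) we also get the special $N = 2$ relations, with the representations labelled by their partitions,
\[ \mathrm{Tr}_{(1,1)}\, U = 1, \qquad \mathrm{Tr}_{(2)}\, U = (\mathrm{tr}\, U)^2 - 1 . \tag{D.26} \]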
References

1. Migdal, A.A.: Recursion Equations In Gauge Field Theories. Sov. Phys. JETP 42, 413 (1975) [Zh. Eksp. Teor. Fiz. 69, 810 (1975)]
2. Gross, D.J.: Two-dimensional QCD as a string theory. Nucl. Phys. B 400, 161 (1993)
3. Gross, D.J., Taylor, W.I.: Two-dimensional QCD is a string theory. Nucl. Phys. B 400, 181 (1993)
4. Gross, D.J., Taylor, W.I.: Twists and Wilson loops in the string theory of two-dimensional QCD. Nucl. Phys. B 403, 395 (1993)
5. Cordes, S., Moore, G.W., Ramgoolam, S.: Lectures on 2-d Yang-Mills theory, equivariant cohomology and topological field theories. Nucl. Phys. Proc. Suppl. 41, 184 (1995)
6. Cordes, S., Moore, G.W., Ramgoolam, S.: Large N 2-D Yang-Mills theory and topological string theory. Commun. Math. Phys. 185, 543 (1997)
7. Horava, P.: Topological strings and QCD in two-dimensions. http://arxiv.org/list/hep-th/9311156, 1993
8. Vafa, C.: Two dimensional Yang-Mills, black holes and topological strings. http://arxiv.org/list/hep-th/0406058, 2004
9. Bryan, J., Pandharipande, R.: The local Gromov-Witten theory of curves. http://arxiv.org/list/math.ag/0411037, 2004
10. Aganagic, M., Ooguri, H., Saulina, N., Vafa, C.: Black holes, q-deformed 2d Yang-Mills, and non-perturbative topological strings. Nucl. Phys. B 715, 304 (2005)
11. Haro, S. de: A note on knot invariants and q-deformed 2d Yang-Mills. Phys. Lett. B 634, 78 (2006)
12. Boulatov, D.V.: q deformed lattice gauge theory and three manifold invariants. Int. J. Mod. Phys. A 8, 3139 (1993)
13. Buffenoir, E., Roche, P.: Two-dimensional lattice gauge theory based on a quantum group. Commun. Math. Phys. 170, 669 (1995)
14. Klimcik, C.: The formulae of Kontsevich and Verlinde from the perspective of the Drinfeld double. Commun. Math. Phys. 217, 203 (2001)
15. Jimbo, M.: A q-analog of U(gl(N + 1)), Hecke algebras and the Yang-Baxter equation. Lett. Math. Phys. 11, 247–252 (1986)
16. King, R.C., Wybourne, B.G.: Representations and traces of the Hecke algebras $H_n(q)$ of type $A_{n-1}$. J. Math. Phys. 33(1), 4 (1992)
17. Faddeev, L.D., Reshetikhin, N.Y., Takhtajan, L.A.: Quantization Of Lie Groups And Lie Algebras. Leningrad Math. J. 1, 193 (1990)
18. Majid, S.: Foundations of quantum group theory. Cambridge: Cambridge Univ. Press, 1995
19. Coquereaux, R., Schieber, G.E.: Action of a finite quantum group on the algebra of complex N × N matrices. AIP Conf. Proc. 453, Melville, NY: Amer. Inst. of Physics, 1998, pp. 9–23
20. Ram, A.: A Frobenius formula for characters of the Hecke algebra. Invent. Math. 106, 461–488 (1991)
21. Gyoja, A.: A q-analogue of Young Symmetrizer. Osaka J. Math. 23, 841–852 (1986)
22. Francis, A.: The Minimal Basis for the Centre of an Iwahori-Hecke Algebra. J. Algebra 221, 1–28 (1999)
23. Francis, A., Jones, L.: On bases of centres of Iwahori-Hecke algebras of the symmetric group. J. Algebra 289(1), 42–69 (2005)
24. Ramgoolam, S.: Wilson loops in 2-D Yang-Mills: Euler characters and loop equations. Int. J. Mod. Phys. A 11, 3885 (1996)
25. Dijkgraaf, R., Vafa, C., Verlinde, E.P., Verlinde, H.L.: The Operator Algebra Of Orbifold Models. Commun. Math. Phys. 123, 485 (1989)
26. Dijkgraaf, R., Witten, E.: Topological Gauge Theories And Group Cohomology. Commun. Math. Phys. 129, 393 (1990)
27. Freed, D.S., Quinn, F.: Chern-Simons theory with finite gauge group. Commun. Math. Phys. 156, 435 (1993)
28. Haro, S. de: Chern-Simons theory, 2d Yang-Mills, and lie algebra wanderers. Nucl. Phys. B 730, 312 (2005)
29. Dijkgraaf, R., Moore, G.W.: Balanced topological field theories. Commun. Math. Phys. 185, 411 (1997)
30. Caporaso, N., Cirafici, M., Griguolo, L., Pasquetti, S., Seminara, D., Szabo, R.J.: Topological strings and large N phase transitions. I: Nonchiral expansion of q-deformed Yang-Mills theory. JHEP 0601, 035 (2006)
31. Caporaso, N., Cirafici, M., Griguolo, L., Pasquetti, S., Seminara, D., Szabo, R.J.: Topological strings and large N phase transitions. II: Chiral expansion of q-deformed Yang-Mills theory. JHEP 0601, 036 (2006)
32. Brzezinski, T., Dabrowski, H., Rembielinski, J.: On the quantum differential calculus and the quantum holomorphicity. J. Math. Phys. 33(1), 19–24 (1992)
33. Krieg, A.: Hecke Algebras. Memoirs of the American Mathematical Society 87, 435. Providence, RI: Amer. Math. Soc., 1990
34. Fulton, W.: Hurwitz schemes and irreducibility of the moduli spaces of algebraic curves. Ann. of Math. (2) 90, 542–575 (1969)
35. Haro, S. de: Chern-Simons theory in lens spaces from 2d Yang-Mills on the cylinder. JHEP 0408, 041 (2004)
36. Beasley, C., Witten, E.: Non-abelian localization for Chern-Simons theory. J. Diff. Geom. 70, 183–323 (2005)
37. Blau, M., Thompson, G.: Chern-Simons theory on S**1-bundles: Abelianisation and q-deformed Yang-Mills theory. JHEP 0605, 003 (2006)
38. Chung, S.W., Fukuma, M., Shapere, A.D.: Structure of topological lattice field theories in three-dimensions. Int. J. Mod. Phys. A 9, 1305 (1994)
39. Fukuma, M., Hosono, S., Kawai, H.: Lattice topological field theory in two-dimensions. Commun. Math. Phys. 161, 157 (1994)
40. Martin, P.M.: On Schur-Weyl duality, $A_n$ Hecke algebras and quantum sl(N) on $\otimes^{n+1} \mathbb{C}^N$. Infinite analysis, Part A, B (Kyoto, 1991). Adv. Ser. Math. Phys. Vol. 16. River Edge, NJ: World Sci. Publ., 1992, pp. 645–673
41. Beĭlinson, A.A., Lusztig, G., MacPherson, R.D.: Duke Math. J. 61(2), 655–677 (1990)
42. LeClair, A., Ludwig, A., Mussardo, G.: Integrability of coupled conformal field theories. Nucl. Phys. B 512, 523 (1998)
43. Katriel, J., Abdelassam, B., Chakrabarti, A.: The fundamental invariant of the Hecke algebra $H_n(q)$ characterizes the representations of $H_n(q)$, $S_n$, $SU_q(N)$ and $SU(N)$. J. Math. Phys. 36, 5139–5158 (1995)
44. Dipper, R., James, G.D.: Blocks and idempotents of Hecke algebras of general linear groups. Proc. Lon. Math. Soc. 3(54), 57 (1987)
45. Ogievetsky, O., Pyatov, P.: Lecture on Hecke algebra. Based on lectures at the International School “Symmetries and Integrable systems” (Dubna, 8–11 June, 1999). Dubna: JINR Publ. Dept., 2000
46. Nomura, M.: Representation functions $d^{\,j}_{mk}$ of $U[sl_q(2)]$ as wavefunctions of “Quantum symmetric tops” and Relationship to Braiding matrices. J. Phys. Soc. Japan 59(12), 4260–4271 (1990)
47. Kirillov, A.N., Reshetikhin, N.Yu.: Representations of the algebra $U_q(sl(2))$, q orthogonal polynomials and invariants of links. Adv. Series in Math. Phys. 7, 285–339 (1989)

Communicated by M.R. Douglas
Commun. Math. Phys. 273, 357–378 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0237-z
Looped Cotangent Virasoro Algebra and Non-Linear Integrable Systems in Dimension 2 + 1

V. Ovsienko, C. Roger

Institut Camille Jordan, Université Claude Bernard Lyon 1, 21 Avenue Claude Bernard, 69622 Villeurbanne Cedex, France

Received: 26 April 2006 / Accepted: 31 October 2006
Published online: 11 May 2007 – © Springer-Verlag 2007
Abstract: We consider a Lie algebra generalizing the Virasoro algebra to the case of two space variables. We study its coadjoint representation and calculate the corresponding Euler equations. In particular, we obtain a bi-Hamiltonian system that leads to an integrable non-linear partial differential equation. This equation is an analogue of the Kadomtsev–Petviashvili (of type B) equation.

1. Introduction and Motivations

This work was initiated by a question posed to one of the authors by T. Ratiu: can the Kadomtsev–Petviashvili equation be realized as an Euler equation on some infinite-dimensional Lie group? In order to make the above question understandable, let us start with some basic definitions.

1.1. The KP and BKP equations as two generalizations of KdV. The famous Korteweg–de Vries (in short KdV) equation
\[ u_t = 3\, u_x u + c\, u_{xxx}, \tag{1.1} \]
where $u(t, x)$ is a smooth (complex valued) function, is the most classic example of an integrable infinite-dimensional Hamiltonian system. The Kadomtsev–Petviashvili (KP) equation is a “two space variable” generalization of KdV. A function $u(t, x, y)$ on $\mathbb{R}^3$ of time $t$ and two space variables $x, y$ satisfies KP if one has
\[ u_t = 3\, u_x u + c_1\, u_{xxx} + c_2\, \partial_x^{-1} u_{yy}, \tag{1.2} \]
where, as usual, the partial derivatives are denoted by the corresponding variables as lower indices, $c_1, c_2 \in \mathbb{C}$ are arbitrary constants and $\partial_x^{-1}$ denotes the indefinite integral.
Of course, there is some ambiguity in this definition, so that one can prefer to use the following form:
\[ u_{tx} = 3\, u_{xx} u + 3\, u_x^2 + c_1\, u_{xxxx} + c_2\, u_{yy} . \]
The constants $c_1, c_2$ are called central charges for reasons that will be clarified below. Let us notice that, if $u$ does not depend explicitly on $y$, then (1.2) reduces to the KdV with $c = c_1$. Another example is the more recent version of the KP equation, namely the so-called KP of type B (BKP) (see [18, 5]). The dispersionless BKP is of the form
\[ u_t = \alpha\, u_x u^2 + \beta \left( u_y u + u_x\, \partial_x^{-1} u_y \right) + c\, \partial_x^{-1} u_{yy}, \tag{1.3} \]
where α, β and c are some constants. Equations (1.1)-(1.3) are infinite-dimensional integrable (in a weak algebraic sense, cf. [30]) systems. They correspond to infinite hierarchies of conservation laws and an infinite series of commuting evolution equations on the space of functions. They are interesting both for mathematics and theoretical physics. Remark 1.1. The classic KP equation (1.2) should not be confused with the KP hierarchy that became more popular than the original KP equation itself. The KP hierarchy is a family of integrable P.D.E. obtained by an inductive algebraic construction; the KdV equation appears as the first term in this family while the “classic” KP corresponds to the third term (see [30] and [1]). For more details on the KP and BKP hierarchies see, e.g., [13].
1.2. Euler equations. The notion of Euler equation means in our context some group-theoretic generalizations of the classic Euler equation for the rigid solid motion. Let $G$ be a Lie group, $\mathfrak{g}$ the corresponding Lie algebra and $\mathfrak{g}^*$ the dual of $\mathfrak{g}$ equipped with the canonical linear Poisson structure (or the Kirillov-Kostant-Souriau bracket). Consider a linear map $I : \mathfrak{g} \to \mathfrak{g}^*$ called the inertia operator.

Definition 1.2. The Euler equation is the Hamiltonian vector field on $\mathfrak{g}^*$ with the quadratic Hamiltonian function
\[ H(m) = \frac12\, \langle I^{-1}(m), m \rangle \tag{1.4} \]
for $m \in \mathfrak{g}^*$ and $\langle\, ,\, \rangle$ being the natural pairing between $\mathfrak{g}$ and $\mathfrak{g}^*$. The well-known formula for this vector field is
\[ m_t = \{H, m\} = -\mathrm{ad}^*_{d_m H}\, m = -\mathrm{ad}^*_{I^{-1}(m)}\, m, \tag{1.5} \]
where $\{\, ,\, \}$ is the canonical Poisson bracket on $\mathfrak{g}^*$ and $\mathrm{ad}^*$ is the coadjoint action of $\mathfrak{g}$ on $\mathfrak{g}^*$ and $d_m H = I^{-1}(m)$ is the differential of the function (1.4) (see, e.g., [2, 3]).
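For orientation, formula (1.5) can be made concrete in the classic rigid body case. The sketch below is an illustration under stated assumptions, not part of the construction in the text: it takes $\mathfrak{g} = \mathfrak{so}(3) \cong \mathbb{R}^3$ with the cross product as bracket, identifies $\mathfrak{g}^*$ with $\mathbb{R}^3$ through the dot product (so that $\mathrm{ad}^*_x u = x \times u$), and uses a diagonal inertia operator. It then checks that the Hamiltonian (1.4) and the Casimir $|m|^2$ have vanishing time derivative along the flow.

```python
from fractions import Fraction


def cross(a, b):
    """Cross product in R^3, playing the role of the so(3) bracket."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inverse_inertia(m, inertia):
    """I^{-1}(m) for a diagonal inertia operator (assumed, for illustration)."""
    return tuple(Fraction(mi) / Ii for mi, Ii in zip(m, inertia))


def euler_rhs(m, inertia):
    """Right-hand side of (1.5): m_t = -ad*_{I^{-1}(m)} m, with ad*_x u = x x u."""
    omega = inverse_inertia(m, inertia)
    return tuple(-c for c in cross(omega, m))


if __name__ == "__main__":
    m = (1, 2, 3)
    inertia = (1, 2, 4)
    m_t = euler_rhs(m, inertia)
    # dH/dt = <I^{-1}(m), m_t> and d|m|^2/dt = 2 <m, m_t> should both vanish.
    print(m_t, dot(inverse_inertia(m, inertia), m_t), dot(m, m_t))
```

The resulting equation $m_t = m \times I^{-1}(m)$ is the usual rigid body equation; the sign convention for $\mathrm{ad}^*$ is the assumption that fixes the overall sign.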
1.3. The role of central extensions. In many cases, one is naturally led to consider central extensions $\hat{\mathfrak{g}}$ of $\mathfrak{g}$. A central extension is given by a set of non-trivial 2-cocycles $\mu_i \in Z^2(\mathfrak{g}; \mathbb{C})$ with $i = 1, \ldots, k$. As a vector space, $\hat{\mathfrak{g}} \cong \mathfrak{g} \oplus \mathbb{C}^k$, where the second summand belongs to the center. Consider the dual space $\hat{\mathfrak{g}}^* \cong \mathfrak{g}^* \oplus \mathbb{C}^k$ and fix arbitrary values $(c_1, \ldots, c_k) \in \mathbb{C}^k$; the affine subspace $\hat{\mathfrak{g}}^*_{(c_1, \ldots, c_k)} \subset \hat{\mathfrak{g}}^*$ is stable with respect to the coadjoint action of $\hat{\mathfrak{g}}$. Identifying the affine subspace $\hat{\mathfrak{g}}^*_{(c_1, \ldots, c_k)}$ with $\mathfrak{g}^*$ and restricting the coadjoint action of $\hat{\mathfrak{g}}$ to $\mathfrak{g}$ (since the center acts trivially), one obtains a $k$-parameter family of actions of $\mathfrak{g}$ on $\mathfrak{g}^*$:
\[ \widehat{\mathrm{ad}}^*_x = \mathrm{ad}^*_x + \sum_{i=1}^{k} c_i\, S_i(x) \quad \text{for} \quad x \in \mathfrak{g}, \tag{1.6} \]
where $S_i : \mathfrak{g} \to \mathfrak{g}^*$ are 1-cocycles on $\mathfrak{g}$ with values in the coadjoint representation. More precisely, the above 1-cocycles $S_i$ are related with the 2-cocycles $\mu_i$ via
\[ \mu_i(x, y) = \langle S_i(x), y \rangle - \langle S_i(y), x \rangle \quad \text{for} \quad (x, y) \in \mathfrak{g}^2 . \]
Formula (1.6) is due to Kirillov (see [14] for the details); the 1-cocycles $S_i$ are sometimes called the Souriau cocycles associated with $\mu_i$. With the above modifications of the coadjoint action and the corresponding Poisson structure, the Euler equation becomes
\[ x_t = -\mathrm{ad}^*_{I^{-1}(x)}\, x + \sum_{i=1}^{k} c_i\, S_i\big(I^{-1}(x)\big), \tag{1.7} \]
so that one adds to the original equation, which is usually quadratic, some extra linear terms.

1.4. The main results. Our work consists of two parts:
• we introduce an infinite-dimensional Lie algebra which seems to be a nice generalization of the Virasoro algebra with two space variables (the kinematics);
• we find the relevant inertia operators and Euler equations (the dynamics); in particular, we are interested in the bi-Hamiltonian Euler equations.

We consider the loop algebra over the semidirect product of the Virasoro algebra and its dual and classify its non-trivial central extensions. It turns out that one of these central extensions is related simultaneously to the Virasoro and Kac-Moody algebras. We then study the coadjoint representation of this Lie algebra. We also introduce a Lie superalgebra extending the constructed Lie algebra. This superalgebra is a generalization of the Neveu-Schwarz algebra.

We compute several Euler equations corresponding to our Lie algebra. The first example is very similar to the KP equation (1.2), yet different from it. In fact, this equation is nothing but KP with supplementary terms and coupled with another equation. This equation cannot be reduced to KP; however, it reduces to KdV. We do not know if this Hamiltonian system is integrable in any sense. The second Euler equation we obtained in this paper leads to the following differential equation
\[ g_t = g_x\, \partial_x^{-1} g_y - g_y\, g + c\, \partial_x^{-1} g_{yy}, \tag{1.8} \]
where $g = g(x, y, t)$ and $c \in \mathbb{C}$. To avoid non-local expressions, one can rewrite this equation as a system:
\[ g_t + g_x\, h - h_x\, g + c\, h_y = 0, \qquad g_y + h_x = 0 . \]
We prove integrability of (1.8) in the following sense. There exists an infinite hierarchy of vector fields commuting with (1.8) and with commuting flows. The first commuting fields are: gt = gx and gt = g y ; one more higher field is provided by Example 5.9. We use the bi-Hamiltonian technique. More precisely, we obtain Eq. (1.8) coupled together with another differential equation (see formula (5.7)) so that the system of two equations is a bi-Hamiltonian vector field. Equation (1.8) was studied in [8] (see formula (33)) and [9]. It has also been considered in differential geometry [7]. In a more general setting, this equation is the second term of the so-called universal hierarchy [23]. Although Eq. (1.8) resembles KP and especially BKP (1.3), it is different: there are no cubic terms and, foremost, the sign in the quadratic term is different. This may be important, especially if one works over R (rather than over C). 1.5. Historical overview. The most classic case is related to the Lie group G = SO(3) and the inertia operator I given by a symmetric tensor in S 2 g (the usual inertia tensor on the rigid solid). One gets the genuine Euler equation. The first generalization was obtained by V.I. Arnold (1966) for the hydrodynamical Euler equation of an incompressible fluid, hence the name of Euler-Arnold is sometimes granted to these equations. The relevant group in this case is the group SDiff(D) of volume-preserving diffeomorphisms of a domain D ⊂ R2 or R3 (see the books [2, 3]). Some interesting examples, such as the Landau-Lifchitz equation, correspond to the Euler equations on the Kac-Moody groups (see [15]). Another example has already been mentioned: the KdV equation is an Euler equation on the Virasoro-Bott group (see [17]). This group is defined as the unique (up to isomorphism) non-trivial central extension of the group Diff(S 1 ) of all diffeomorphisms of S 1 . The inertia operator is given by the standard L 2 -metric on S 1 . 
In [24] and [16] different choices of the metric on Diff(S 1 ) (and thus of the inertia operator) were considered in order to get some other equations than KdV, such as the Hunter-Saxon and the Camassa-Holm equations. Let us also mention that there is a huge literature containing different generalizations of KdV (in the super case, matrix versions, etc.) in the dimension 1 + 1. All these generalizations are related to some extensions of the Virasoro algebra. Let us finally stress that the property of an evolution equation to be an Euler equation associated with some Lie group (or Lie algebra) is important for the following reason: it allows one to deal with the equation as with the geodesic flow and to apply a wide spectrum of methods specific for differential geometry (see, e.g., [25]). 2. The Virasoro Algebra and its Loop Algebra In this section we introduce the preliminary examples of infinite-dimensional Lie algebras that we will consider. We define the Virasoro algebra and show how to obtain the KdV equation as an Euler equation on it. We then consider the loop group L Diff(S 1 ) . We classify non-trivial central extensions of the corresponding Lie algebra.
2.1. Reminder: the Virasoro algebra and KdV equation. The Virasoro algebra is defined as the central extension of $\mathrm{Vect}(S^1)$ given by the 2-cocycle
\[ \mu\left( f(x) \frac{\partial}{\partial x},\, g(x) \frac{\partial}{\partial x} \right) = \int_{S^1} f\, g_{xxx}\, dx . \tag{2.1} \]
This cocycle was found in [12] and is known as the Gelfand-Fuchs cocycle. The cohomology group $H^2(\mathrm{Vect}(S^1); \mathbb{C})$ is one-dimensional so that the cocycle (2.1) defines the unique (up to isomorphism) non-trivial central extension of $\mathrm{Vect}(S^1)$.

Consider a natural family of modules over $\mathrm{Vect}(S^1)$ (and therefore over the Virasoro algebra with the trivial action of the center). Let $\mathcal{F}_\lambda$ be the space of $\lambda$-densities (or the space of tensor densities of degree $\lambda$) on $S^1$,
\[ \mathcal{F}_\lambda = \left\{ \varphi(x)\, dx^\lambda \mid \varphi(x) \in C^\infty(S^1) \right\}, \]
where $\lambda \in \mathbb{C}$. The action of $\mathrm{Vect}(S^1)$ on the space $\mathcal{F}_\lambda$ is given by the first-order differential operator
\[ L^\lambda_f \left( \varphi(x)\, dx^\lambda \right) = ( f \varphi_x + \lambda f_x \varphi )\, dx^\lambda, \tag{2.2} \]
which is nothing but the Lie derivative along the vector field $f = f(x) \frac{\partial}{\partial x}$.

Let us calculate the coadjoint action of the Virasoro algebra. The dual space, $\mathrm{Vect}(S^1)^*$, corresponds to the space of all distributions on $S^1$. Following [15], we will consider only the regular part of this dual space that consists of differentiable 2-densities, that is $\mathrm{Vect}(S^1)^*_{\mathrm{reg}} = \mathcal{F}_2$ with the natural pairing
\[ \left\langle f(x) \frac{\partial}{\partial x},\, u(x)\, dx^2 \right\rangle = \int_{S^1} f(x)\, u(x)\, dx . \]
The coadjoint action of $\mathrm{Vect}(S^1)$ coincides with the $\mathrm{Vect}(S^1)$-action on $\mathcal{F}_2$. The Souriau cocycle on $\mathrm{Vect}(S^1)$ corresponding to the Gelfand-Fuchs cocycle is
\[ S\left( f(x) \frac{\partial}{\partial x} \right) = f_{xxx}\, dx^2 \]
and one finally obtains the coadjoint action of the Virasoro algebra:
\[ \widehat{\mathrm{ad}}^*_{f(x) \frac{\partial}{\partial x}} \left( u(x)\, dx^2 \right) = ( f u_x + 2 f_x u + c f_{xxx} )\, dx^2 . \tag{2.3} \]
Consider the simplest quadratic Hamiltonian function on $\mathrm{Vect}(S^1)^*$
\[ H\left( u(x)\, dx^2 \right) = \frac12 \int_{S^1} u(x)^2\, dx \]
corresponding to the inertia operator $I( f(x) \frac{\partial}{\partial x} ) = f(x)\, dx^2$. The following result was obtained in [17].

Proposition 2.1. The Euler equation on the Virasoro algebra corresponding to the Hamiltonian $H$ is precisely the KdV equation (1.1).

Proof. Immediately follows from formulæ (1.5) and (2.3).
For more details about the Virasoro algebra, its modules and its cohomology see [10] and [11].
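The cocycle property of (2.1) can be checked directly on Fourier modes. As a minimal sketch, assume the basis $L_m = e^{imx} \frac{\partial}{\partial x}$ on $S^1 = \mathbb{R}/2\pi\mathbb{Z}$ (this basis and the function names below are choices made for the illustration). Then $[L_m, L_n] = i(n - m) L_{m+n}$ and $\mu(L_m, L_n) = -2\pi i\, n^3\, \delta_{m+n,0}$, so after dropping the common factor $2\pi$ the 2-cocycle condition becomes an identity between integers.

```python
def bracket_coeff(m, n):
    """[L_m, L_n] = i (n - m) L_{m+n}; returns the integer factor n - m."""
    return n - m


def gf_coeff(m, n):
    """mu(L_m, L_n) = -2 pi i n^3 delta_{m+n,0}; returns n^3 delta_{m+n,0}."""
    return n ** 3 if m + n == 0 else 0


def cocycle_defect(m, n, p):
    """mu([L_m,L_n],L_p) + cyclic permutations, up to the common factor 2 pi."""
    return (bracket_coeff(m, n) * gf_coeff(m + n, p)
            + bracket_coeff(n, p) * gf_coeff(n + p, m)
            + bracket_coeff(p, m) * gf_coeff(p + m, n))


if __name__ == "__main__":
    modes = range(-6, 7)
    closed = all(cocycle_defect(m, n, p) == 0
                 for m in modes for n in modes for p in modes)
    skew = all(gf_coeff(m, n) + gf_coeff(n, m) == 0
               for m in modes for n in modes)
    print("cocycle condition:", closed, "antisymmetry:", skew)
```

The check is only over a finite window of modes, so it illustrates the identity $(n-m)p^3 + (p-n)m^3 + (m-p)n^3 = 0$ for $m + n + p = 0$ rather than proving it.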
2.2. The loop group on $\mathrm{Diff}(S^1)$ and the loop algebra on $\mathrm{Vect}(S^1)$. We wish to extend the Virasoro algebra to the case of two space variables. A natural way to do this is to consider the loops on it. One defines the loop group on $\mathrm{Diff}(S^1)$ as follows:
\[ L\big(\mathrm{Diff}(S^1)\big) = \left\{ \varphi : S^1 \to \mathrm{Diff}(S^1) \mid \varphi \text{ is differentiable} \right\}, \]
the group law being given by
\[ (\varphi \circ \psi)(y) = \varphi(y) \circ \psi(y), \qquad y \in S^1 . \]
Remark 2.2. Let us stress that there are no difficulties in defining differentiable maps with values in $\mathrm{Diff}(S^1)$. Indeed, $L(\mathrm{Diff}(S^1))$ is naturally embedded into the space of $C^\infty$-maps on $S^1 \times S^1$ with values in $S^1$.

In a similar way, we construct the Lie algebra $L(\mathrm{Vect}(S^1))$ consisting of vector fields on $S^1$ depending on one more independent variable $y \in S^1$. The loop variable is thus denoted by $y$ and the variable on the “target” copy of $S^1$ by $x$. The elements of $L(\mathrm{Vect}(S^1))$ are of the form $f(x, y) \frac{\partial}{\partial x}$, where $f \in C^\infty(S^1 \times S^1)$, and the Lie bracket reads as follows:
\[ \left[ f(x, y) \frac{\partial}{\partial x},\, g(x, y) \frac{\partial}{\partial x} \right] = \big( f(x, y)\, g_x(x, y) - f_x(x, y)\, g(x, y) \big) \frac{\partial}{\partial x} . \]
It is easy to convince oneself that $L(\mathrm{Vect}(S^1))$ is the Lie algebra of $L(\mathrm{Diff}(S^1))$ in the usual weak sense for the infinite-dimensional case; a one-parameter group argument gives an identification between the tangent space to $L(\mathrm{Diff}(S^1))$ at the identity and $L(\mathrm{Vect}(S^1))$, equipped with its Lie bracket. We will now classify non-trivial central extensions of the Lie algebra $L(\mathrm{Vect}(S^1))$ and therefore calculate $H^2(L(\mathrm{Vect}(S^1)); \mathbb{C})$. This result can be deduced from a more general one that we will need later.

2.3. Central extensions of tensor products. Following the work of Zusmanovich [32], one can calculate the cohomology group $H^2(\mathfrak{g} \otimes A)$ for a Lie algebra $\mathfrak{g}$ and a commutative algebra $A$ over a field $k$, the Lie bracket on $\mathfrak{g} \otimes A$ being defined by
\[ [x_1 \otimes a_1, x_2 \otimes a_2] = [x_1, x_2] \otimes a_1 a_2, \qquad x_i \in \mathfrak{g},\ a_i \in A,\ i = 1, 2 . \]
From the results of [32], one can easily deduce the following

Proposition 2.3. If $\mathfrak{g} = [\mathfrak{g}, \mathfrak{g}]$, then
\[ H^2(\mathfrak{g} \otimes A; k) = \big( H^2(\mathfrak{g}; k) \otimes A' \big) \oplus \big( \mathrm{Inv}_{\mathfrak{g}}(S^2(\mathfrak{g}^*)) \otimes HC^1(A) \big), \]
where $HC^1(A)$ is the first group of cyclic cohomology of the $k$-algebra $A$ and $\mathrm{Inv}_{\mathfrak{g}}(S^2(\mathfrak{g}^*))$ is the space of $\mathfrak{g}$-invariant symmetric bilinear maps from $\mathfrak{g}$ into $k$, while $A' = \mathrm{Hom}_k(A, k)$ represents the dual of $A$. One can give explicit formulæ for the cohomology classes.
• Given $\mu \in Z^2(\mathfrak{g}; k)$ and $\lambda \in A'$, one gets $\mu_\lambda \in Z^2(\mathfrak{g} \otimes A; k)$ defined by
\[ \mu_\lambda(x_1 \otimes a_1, x_2 \otimes a_2) = \mu(x_1, x_2)\, \lambda(a_1 a_2) . \tag{2.4} \]
• Given $K \in \mathrm{Inv}_{\mathfrak{g}}(S^2(\mathfrak{g}^*))$ and $\omega \in HC^1(A)$, one gets $K_\omega \in Z^2(\mathfrak{g} \otimes A; k)$ defined by
\[ K_\omega(x_1 \otimes a_1, x_2 \otimes a_2) = K(x_1, x_2)\, \omega(a_1\, da_2), \tag{2.5} \]
where $d$ is the Kähler derivative. For the general results on cyclic homology and cohomology see [19].

Example 2.4. If $\mathfrak{g}$ is a finite-dimensional semisimple Lie algebra and $A = C^\infty(S^1)$, formula (2.5) defines the Kac-Moody cocycle
\[ KM(x_1 \otimes a_1, x_2 \otimes a_2) = K(x_1, x_2) \int_{S^1} a_1\, da_2, \]
where $K$ is the Killing form.

Remark 2.5. In the general situation, one can call the cohomology classes of the cocycles (2.5) the classes “of Kac-Moody type”. For instance, such a class on the loop algebra over the algebra of pseudodifferential symbols has already been used in [30] in order to obtain the KP equation as a Hamiltonian system.

2.4. Central extensions of $L(\mathrm{Vect}(S^1))$. In our case, $\mathfrak{g} = \mathrm{Vect}(S^1)$ and $A = C^\infty(S^1)$, one has the following well-known statement: $\mathrm{Inv}_{\mathfrak{g}}(S^2(\mathfrak{g}^*)) = 0$, that is, there is no invariant bilinear symmetric form (“Killing form”) on $\mathrm{Vect}(S^1)$ (see, e.g., [10]). Proposition 2.3 then implies the following result.

Proposition 2.6. One has $H^2(L(\mathrm{Vect}(S^1)); \mathbb{C}) = C^\infty(S^1)'$.

To a distribution $\lambda \in C^\infty(S^1)'$ one associates a 2-cocycle $\mu_\lambda \in Z^2(L(\mathrm{Vect}(S^1)); \mathbb{C})$ given by formula (2.4) with $\mu$ being the Gelfand-Fuchs cocycle (2.1). Similar results were obtained in [29] in a slightly different context.

Proposition 2.6 provides a classification of non-trivial central extensions of the loop algebra $L(\mathrm{Vect}(S^1))$. This is rather a “negative result” for us since it implies that all these central extensions are of Virasoro type. The KP equation (1.2) contains two different central charges, $c_1$ and $c_2$, and the second one does not belong to the Virasoro type but to the Kac-Moody one. It is clear then that KP-type (and BKP-type) equations cannot be obtained as Euler equations associated with the group $L(\mathrm{Diff}(S^1))$. One therefore needs to introduce another group with richer second cohomology. We will discuss further generalizations of $L(\mathrm{Vect}(S^1))$ in the Appendix. However, this Lie algebra is not the one we are interested in.
3. The Cotangent Virasoro Algebra and its Loop Algebra

In this section we introduce the main object of our study. We consider the Lie algebra of loops associated with the cotangent Lie algebra $T^*\mathrm{Vect}(S^1)$ and calculate its central extensions. Unlike the loop Lie algebra $L(\mathrm{Vect}(S^1))$, the constructed Lie algebra simultaneously has non-trivial central extensions of the Virasoro and the Kac-Moody types.

3.1. General setting: the cotangent group and its Lie algebra. The cotangent space $T^*G$ of a Lie group $G$ is naturally identified with the semi-direct product $G \ltimes \mathfrak{g}^*$, where $G$ acts on $\mathfrak{g}^*$ by the coadjoint action $\mathrm{Ad}^*$. The space $T^*G$ is then a Lie group with the product
\[ (g_1, u_1) \cdot (g_2, u_2) = (g_1 g_2,\, u_1 + \mathrm{Ad}^*_{g_1} u_2) . \]
The corresponding Lie algebra $T^*\mathfrak{g}$ is the semi-direct product $\mathfrak{g} \ltimes \mathfrak{g}^*$ equipped with the commutator
\[ [(x_1, u_1), (x_2, u_2)] = \big( [x_1, x_2],\, \mathrm{ad}^*_{x_1} u_2 - \mathrm{ad}^*_{x_2} u_1 \big) . \tag{3.1} \]
The evaluation map gives a natural symmetric bilinear form on $T^*\mathfrak{g}$, namely
\[ K\big( (x_1, u_1), (x_2, u_2) \big) = \langle u_1, x_2 \rangle + \langle u_2, x_1 \rangle . \tag{3.2} \]
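To see (3.1) and (3.2) at work in a finite-dimensional toy case, the following sketch takes $\mathfrak{g} = \mathfrak{so}(3) \cong \mathbb{R}^3$ with the cross product as bracket and identifies $\mathfrak{g}^*$ with $\mathbb{R}^3$ through the dot product, so that $\mathrm{ad}^*_x u = x \times u$. Both the choice of $\mathfrak{so}(3)$ and the helper names are assumptions made for illustration; the text itself works with $\mathrm{Vect}(S^1)$. The code checks the Jacobi identity for the commutator (3.1) and the invariance of the form (3.2) on sample integer vectors.

```python
def cross(a, b):
    """Cross product in R^3, used both as the so(3) bracket and as ad*."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vec_add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def vec_sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def bracket(a, b):
    """Commutator (3.1) on pairs (x, u), with ad*_x u = x x u (assumed convention)."""
    (x1, u1), (x2, u2) = a, b
    return (cross(x1, x2), vec_sub(cross(x1, u2), cross(x2, u1)))


def pairing(a, b):
    """Symmetric bilinear form (3.2)."""
    (x1, u1), (x2, u2) = a, b
    return dot(u1, x2) + dot(u2, x1)


def jacobi_defect(a, b, c):
    """[[a,b],c] + [[b,c],a] + [[c,a],b], which should be the zero element."""
    terms = [bracket(bracket(a, b), c),
             bracket(bracket(b, c), a),
             bracket(bracket(c, a), b)]
    x = u = (0, 0, 0)
    for tx, tu in terms:
        x, u = vec_add(x, tx), vec_add(u, tu)
    return (x, u)


if __name__ == "__main__":
    a = ((1, 2, 3), (0, -1, 4))
    b = ((2, 0, -1), (5, 1, 1))
    c = ((-3, 1, 2), (2, 2, -7))
    print(jacobi_defect(a, b, c))
    print(pairing(bracket(a, b), c) + pairing(b, bracket(a, c)))
```

Invariance here means $K([a, b], c) + K(b, [a, c]) = 0$, which is the property that lets (3.2) play the role of a Killing type form.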
Furthermore, this form is non-degenerate so that one has a Killing type form on this Lie algebra.

Remark 3.1. Let us mention that the semi-direct product $\mathfrak{g} \ltimes \mathfrak{g}^*$ is the simplest case of the so-called Drinfeld double, which corresponds to the trivial Lie-Poisson structure.

3.2. The cotangent loop group and algebra. We will consider the cotangent group $G = T^*\mathrm{Diff}(S^1)$ and we will be particularly interested in the associated loop group. We will use the notation $\widetilde{G} = L(T^*\mathrm{Diff}(S^1))$ for short. One has
\[ \widetilde{G} = L\big( \mathrm{Diff}(S^1) \ltimes \mathcal{F}_2 \big) = L\big(\mathrm{Diff}(S^1)\big) \ltimes L(\mathcal{F}_2) . \]
Consider the semidirect product $\mathfrak{g} = \mathrm{Vect}(S^1) \ltimes \mathcal{F}_2$. The Lie algebra corresponding to the group $\widetilde{G}$ is
\[ \widetilde{\mathfrak{g}} = L\big( \mathrm{Vect}(S^1) \ltimes \mathcal{F}_2 \big) = L\big(\mathrm{Vect}(S^1)\big) \ltimes L(\mathcal{F}_2) . \tag{3.3} \]
An element of $\widetilde{\mathfrak{g}}$ is a couple $(f, u)$, where $f$ and $u$ are $C^\infty$-tensor fields on $S^1 \times S^1$ of the following form:
\[ (f, u) = f(x, y) \frac{\partial}{\partial x} + u(x, y)\, dx^2 . \]
The commutator (Lie bracket) is defined according to (3.1).

Remark 3.2. It is easy to check that, with the right convention on the duality, one has $L(\mathcal{F}_2) = L(\mathrm{Vect}(S^1)^*) = L(\mathrm{Vect}(S^1))^*$; the chosen form $L(\mathcal{F}_2)$ will be more suitable for our computations.
Virasoro Algebra and Non-Linear Integrable Systems in Dimension 2 + 1
3.3. Central extensions of $\widetilde{\mathfrak{g}}$. Let us calculate the cohomology space $H^2(\widetilde{\mathfrak{g}}; \mathbb{C})$ using Proposition 2.3. The following result is a classification of central extensions of the Lie algebra $\widetilde{\mathfrak{g}}$.

Theorem 3.3. One has $H^2(\widetilde{\mathfrak{g}}; \mathbb{C}) = C^\infty(S^1)' \oplus \mathbb{C}$.

Proof. From the classical results on the cohomology of the Virasoro algebra and its representations, one has $H^2(\mathfrak{g}; \mathbb{C}) = \mathbb{C}$, where, as above, $\mathfrak{g} = \mathrm{Vect}(S^1) \ltimes \mathcal{F}_2$, cf. [10] and [12]. The non-trivial cohomology class is again generated by the Gelfand-Fuchs cocycle. More precisely, $\mu((f, u), (g, v)) = \mu(f, g)$, where $\mu$ is as in (2.1). Furthermore, one has $\mathrm{Inv}_{\mathfrak{g}}\,S^2(\mathfrak{g}^*) = \mathbb{C}$, where the generator is provided by the 2-form (3.2), and it is easy to check that there are no other generators. Finally, $HC^1(A) = \mathbb{C}$ and is generated by the volume form (see, e.g., [19]).
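Let us give the explicit formulæ of non-trivial 2-cocycles on $\widetilde{\mathfrak{g}}$. A distribution $\lambda \in C^\infty(S^1)'$ corresponds to a 2-cocycle of the first class (2.4) given by
$$\mu_\lambda\bigl((f, u), (g, v)\bigr) = \lambda\Bigl(\int_{S^1} f\,g_{xxx}\,dx\Bigr);$$
these are the Virasoro type extensions. For the particular case where $\lambda(a(y)) = \int_{S^1} a(y)\,dy$, such a 2-cocycle will be denoted by $\mu_1$ so that one has
$$\mu_1\bigl((f, u), (g, v)\bigr) = \int_{S^1\times S^1} f\,g_{xxx}\,dx\,dy. \qquad (3.4)$$
Another non-trivial cohomology class is provided by the 2-cocycle
$$\mu_2\bigl((f, u), (g, v)\bigr) = \int_{S^1\times S^1} (f\,v_y - g\,u_y)\,dx\,dy \qquad (3.5)$$
of Kac-Moody type (2.5).

3.4. The Lie algebra $\widehat{\mathfrak{g}}$. We define the Lie algebra $\widehat{\mathfrak{g}}$ as the two-dimensional central extension of $\widetilde{\mathfrak{g}}$ given by the cocycles $\mu_1$ and $\mu_2$. As a vector space, $\widehat{\mathfrak{g}} = \widetilde{\mathfrak{g}} \oplus \mathbb{C}^2$, where the summand $\mathbb{C}^2$ is the center of $\widehat{\mathfrak{g}}$. The commutator in $\widehat{\mathfrak{g}}$ is given by the following explicit expression, which readily follows from the above formulæ:
$$\begin{aligned}
\Bigl[f\frac{\partial}{\partial x} + u\,dx^2,\ g\frac{\partial}{\partial x} + v\,dx^2\Bigr]
={}& (f g_x - f_x g)\frac{\partial}{\partial x} + (f v_x + 2 f_x v - g u_x - 2 g_x u)\,dx^2 \\
&+ \Bigl(\int_{S^1\times S^1} f\,g_{xxx}\,dx\,dy,\ \int_{S^1\times S^1} (f\,v_y - g\,u_y)\,dx\,dy\Bigr),
\end{aligned} \qquad (3.6)$$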
where the last term is an element of the center of $\widehat{\mathfrak{g}}$. (Note that we did not write the central elements in the left-hand side since they do not enter into the commutator.) The Lie algebra $\widehat{\mathfrak{g}}$ and its coadjoint representation are the main object of our study. This algebra is a natural two-dimensional generalization of the Virasoro algebra. Further generalizations will be described in the Appendix.

4. The Coadjoint Representation of $\widehat{\mathfrak{g}}$

According to the general viewpoint of symplectic geometry and mechanics, the coadjoint representation of a Lie algebra plays a very special role for all sorts of applications. It was observed by Kirillov [15] and Segal [31] that the coadjoint action of the Virasoro group and algebra coincides with the natural action of $\mathrm{Diff}(S^1)$ and $\mathrm{Vect}(S^1)$, respectively, on the space of Sturm-Liouville operators. The Casimir functions (i.e., the invariants of the coadjoint action) are then expressed in terms of the monodromy operator. In this section we recall the Kirillov-Segal result and generalize it to the case of the Lie algebra $\widehat{\mathfrak{g}}$.

4.1. Computing the coadjoint action. Let us start with the explicit expression for the coadjoint action of $\widehat{\mathfrak{g}}$. As usual, in the case of a semidirect product of a Lie algebra and its dual, the (regular) dual space $\widetilde{\mathfrak{g}}^*$ is identified with $\widetilde{\mathfrak{g}}$, the natural pairing being given by
$$\Bigl\langle f\frac{\partial}{\partial x} + u\,dx^2,\ g\frac{\partial}{\partial x} + v\,dx^2\Bigr\rangle = \int_{S^1\times S^1} (f\,v + g\,u)\,dx\,dy, \qquad (4.1)$$
which is nothing but a specialization of the form (3.2). Furthermore, one immediately obtains the Souriau 1-cocycles on $\widetilde{\mathfrak{g}}$:
$$S_1\Bigl(f\frac{\partial}{\partial x} + u\,dx^2\Bigr) = f_{xxx}\,dx^2, \qquad
S_2\Bigl(f\frac{\partial}{\partial x} + u\,dx^2\Bigr) = f_y\,\frac{\partial}{\partial x} + u_y\,dx^2, \qquad (4.2)$$
corresponding to $\mu_1$ and $\mu_2$, respectively. We are ready to give the expression of the coadjoint action of the “extended” Lie algebra $\widehat{\mathfrak{g}}$ on the regular dual space $\widehat{\mathfrak{g}}^* \cong \widehat{\mathfrak{g}}$.

Proposition 4.1. The coadjoint action of the Lie algebra $\widehat{\mathfrak{g}}$ is given by
$$\begin{aligned}
\widehat{\mathrm{ad}}^*_{(f,u)}\Bigl(g\frac{\partial}{\partial x} + v\,dx^2\Bigr)
={}& (f g_x - f_x g + c_2 f_y)\frac{\partial}{\partial x} \\
&+ (f v_x + 2 f_x v - u_x g - 2 u g_x + c_1 f_{xxx} + c_2 u_y)\,dx^2,
\end{aligned} \qquad (4.3)$$
while the center acts trivially.

Proof. According to formula (1.6), one has to calculate the coadjoint action of $\widetilde{\mathfrak{g}}$, which coincides with the adjoint one (cf. formula (3.6) with no central terms) and then to add the Souriau cocycles.
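The proof can be unpacked as a one-line assembly. The following is a sketch, assuming that the coadjoint action of the non-extended algebra is identified with the bracket (3.6) without its central terms, and that the central charges $c_1$, $c_2$ enter as the coefficients of the Souriau cocycles (4.2); the shorthand $\partial_x = \partial/\partial x$ is introduced here for brevity.

```latex
\begin{align*}
\widehat{\mathrm{ad}}^*_{(f,u)}(g,v)
 &= [(f,u),(g,v)] + c_1\,S_1(f,u) + c_2\,S_2(f,u) \\
 &= (f g_x - f_x g)\,\partial_x
    + (f v_x + 2 f_x v - g u_x - 2 g_x u)\,dx^2 \\
 &\quad + c_1 f_{xxx}\,dx^2
    + c_2\bigl(f_y\,\partial_x + u_y\,dx^2\bigr) \\
 &= (f g_x - f_x g + c_2 f_y)\,\partial_x
    + (f v_x + 2 f_x v - u_x g - 2 u g_x + c_1 f_{xxx} + c_2 u_y)\,dx^2 .
\end{align*}
```

Collecting the coefficients of $\partial_x$ and $dx^2$ in the last line reproduces the two components of (4.3) term by term.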
Our next task is to investigate a “geometric meaning” of the coadjoint action (4.3). However, this result will not be relevant for our main task: computing the Euler equations on g. Sections 4.2–4.6 can be omitted at the first reading.
4.2. The Virasoro coadjoint action and Sturm-Liouville operators. Let us recall the Kirillov-Segal result on the coadjoint action of the Virasoro algebra (cf. [15, 31, 28]). Consider the space of second-order linear differential operators
$$A = 2c\,\frac{d^2}{dx^2} + u(x), \qquad (4.4)$$
where $u(x) \in C^\infty(S^1)$. This is a Sturm-Liouville operator with periodic potential (also called a Hill operator). Define an action of $\mathrm{Vect}(S^1)$ on this space by the formula
$$L_f(A) := L_f^{3/2} \circ A - A \circ L_f^{-1/2}, \qquad (4.5)$$
where $f = f(x)\frac{d}{dx}$ and where $L_f^\lambda$ is the first-order differential operator
$$L_f^\lambda = f\,\frac{\partial}{\partial x} + \lambda f_x$$
corresponding to the Lie derivative on the space $\mathcal{F}_\lambda$ of $\lambda$-densities (2.2). An elementary computation shows that the result of the action (4.5) is a differential operator of order 0 (that is, an operator of multiplication by a function). More precisely, one has
$$L_f(A) = f u_x + 2 f_x u + c f_{xxx}.$$
This formula coincides with formula (2.3) of the coadjoint action of the Virasoro algebra. We obtained the following result: the space of operators (4.4) equipped with the $\mathrm{Vect}(S^1)$-action is isomorphic to the coadjoint representation of the Virasoro algebra.

Remark 4.2. The operator $A$ is understood as a differential operator on tensor densities on $S^1$, namely $A : \mathcal{F}_{-1/2} \to \mathcal{F}_{3/2}$. The expression (2.2) is the Lie derivative on the space $\mathcal{F}_\lambda$ and thus the action (4.5) is natural. This geometric meaning of the Sturm-Liouville operator was already known to the classics.

This remarkable coincidence is just a simple observation, but it has important consequences. It relates the Virasoro algebra with projective differential geometry (we refer to [28] and references therein for more details). For instance, one can now interpret the monodromy operator associated with $A$ as an invariant of the coadjoint action and eventually obtain a classification of the coadjoint orbits of the Virasoro algebra. Sturm-Liouville operators are also closely related to the KdV hierarchy. Let us mention that another geometric approach to the study of the Virasoro coadjoint orbits can be found in [4].

4.3. Relations with the Neveu-Schwarz superalgebra. In order to understand the origin of the Sturm-Liouville operator in the context of the Virasoro algebra, we will apply Kirillov’s sophisticated method [14] that uses Lie superalgebras. Let us note that this is quite an unusual situation, in which a superalgebra helps to better understand the usual (“non-super”) situation. Let us recall the definition of the Neveu-Schwarz Lie superalgebra. Consider the direct sum $\mathfrak{k} = \mathrm{Vect}(S^1) \oplus \mathcal{F}_{-1/2}$
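and define the structure of a Lie superalgebra on the space $\mathfrak{k}$ by
$$\begin{aligned}
\Bigl[f\frac{\partial}{\partial x} + \varphi\,dx^{-1/2},\ g\frac{\partial}{\partial x} + \psi\,dx^{-1/2}\Bigr]
={}& (f g_x - f_x g + \varphi\psi)\frac{\partial}{\partial x} \\
&+ \Bigl(f\psi_x - \tfrac12 f_x\psi - g\varphi_x + \tfrac12 g_x\varphi\Bigr)\,dx^{-1/2},
\end{aligned}$$
which is symmetric on the odd part $\mathfrak{k}_1 = \mathcal{F}_{-1/2}$. The Jacobi (super)identity can be easily checked. Furthermore, the Gelfand-Fuchs cocycle (2.1) can be extended to the superalgebra $\mathfrak{k}$:
$$\mu\Bigl(f\frac{\partial}{\partial x} + \varphi\,dx^{-1/2},\ g\frac{\partial}{\partial x} + \psi\,dx^{-1/2}\Bigr) = \int_{S^1} (f\,g_{xxx} + 2\,\varphi\,\psi_{xx})\,dx. \qquad (4.6)$$
This 2-cocycle defines a central extension $\widehat{\mathfrak{k}}$ of $\mathfrak{k}$ called the Neveu-Schwarz algebra. The dual space of this superalgebra is $\widehat{\mathfrak{k}}^* = \mathcal{F}_2 \oplus \mathcal{F}_{3/2}$, since $\mathcal{F}^*_{-1/2} \cong \mathcal{F}_{3/2}$. The coadjoint action of $\widehat{\mathfrak{k}}$ can be easily calculated:
$$\begin{aligned}
\widehat{\mathrm{ad}}^*_{f\frac{\partial}{\partial x} + \varphi\,dx^{-1/2}}\bigl(u\,dx^2 + \alpha\,dx^{3/2}\bigr)
={}& \Bigl(f u_x + 2 f_x u + c f_{xxx} + \tfrac12\varphi\,\alpha_x + \tfrac32\varphi_x\,\alpha\Bigr)\,dx^2 \\
&+ \Bigl(f\alpha_x + \tfrac32 f_x\alpha + u\,\varphi + 2c\,\varphi_{xx}\Bigr)\,dx^{3/2}.
\end{aligned} \qquad (4.7)$$
The operator (4.4) is already present in this formula: it gives the action of the odd part of $\widehat{\mathfrak{k}}$ on the even part of $\widehat{\mathfrak{k}}^*$, namely
$$\widehat{\mathrm{ad}}^*_{\varphi\,dx^{-1/2}}\bigl(u\,dx^2\bigr) = A(\varphi),$$
where $A$ is as in (4.4). This way, the Sturm-Liouville operator naturally appears in the Virasoro context. The Kirillov-Segal result can now be deduced simply from the Jacobi identity for the superalgebra $\widehat{\mathfrak{k}}$.

4.4. Coadjoint action of $\widehat{\mathfrak{g}}$ and matrix differential operators. It turns out that the coadjoint action of $\widehat{\mathfrak{g}}$ given by the formula (4.3) can also be realized as an action of the non-extended algebra $\widetilde{\mathfrak{g}}$ on some space of differential operators. We introduce the space of $2\times 2$-matrix differential operators
$$A = \begin{pmatrix}
-c_2\dfrac{\partial}{\partial y} + g\dfrac{\partial}{\partial x} - \dfrac12 g_x & 0 \\[2ex]
2c_1\Bigl(\dfrac{\partial}{\partial x}\Bigr)^2 + v & -c_2\dfrac{\partial}{\partial y} + g\dfrac{\partial}{\partial x} + \dfrac32 g_x
\end{pmatrix}, \qquad (4.8)$$
where $g$ and $v$ are functions in $(x, y)$, that is $g, v \in C^\infty(S^1 \times S^1)$. Let us define an action of the Lie algebra $\widetilde{\mathfrak{g}}$ on the space of operators (4.8).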
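Let us now denote by $\widetilde{\mathcal{F}}_\lambda$ the space of loops in $\mathcal{F}_\lambda$, i.e., of tensor fields on $S^1 \times S^1$ of the form $\varphi = \varphi(x, y)\,dx^\lambda$. The direct sum $\widetilde{\mathcal{F}}_{-1/2} \oplus \widetilde{\mathcal{F}}_{3/2}$ is equipped with a $\widetilde{\mathfrak{g}}$-action
$$L_{f\frac{\partial}{\partial x} + u\,dx^2}\begin{pmatrix} \varphi\,dx^{-1/2} \\ \psi\,dx^{3/2} \end{pmatrix}
= \begin{pmatrix} \bigl(f\varphi_x - \tfrac12 f_x\varphi\bigr)\,dx^{-1/2} \\ \bigl(f\psi_x + \tfrac32 f_x\psi + u\,\varphi\bigr)\,dx^{3/2} \end{pmatrix}. \qquad (4.9)$$
Assume that the operator $A$ acts on the above space of tensor densities:
$$A : \widetilde{\mathcal{F}}_{-1/2} \oplus \widetilde{\mathcal{F}}_{3/2} \to \widetilde{\mathcal{F}}_{-1/2} \oplus \widetilde{\mathcal{F}}_{3/2}.$$
Then the $\widetilde{\mathfrak{g}}$-action is naturally defined by the commutator with the action (4.9):
$$L_{f\frac{\partial}{\partial x} + u\,dx^2}(A) := \Bigl[L_{f\frac{\partial}{\partial x} + u\,dx^2},\ A\Bigr]. \qquad (4.10)$$
The following statement is a generalization of the Kirillov-Segal result.

Theorem 4.3. The action (4.10) of $\widetilde{\mathfrak{g}}$ on the space of differential operators (4.8) coincides with the coadjoint action (4.3) of the Lie algebra $\widehat{\mathfrak{g}}$.

Proof. This statement can be checked by a straightforward computation.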
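The straightforward computation can be organized entry by entry. The sketch below assumes the shorthand $L^\lambda_f = f\partial_x + \lambda f_x$ of Sect. 4.2, applied pointwise in $y$, and writes $a$, $b$, $d$ for the entries of (4.8); these three letters are ad hoc names introduced only for this illustration.

```latex
% Ad hoc names for the entries of (4.8):
%   a = -c_2 \partial_y + L^{-1/2}_g,  b = 2 c_1 \partial_x^2 + v,  d = -c_2 \partial_y + L^{3/2}_g.
\[
L_{(f,u)} = \begin{pmatrix} L^{-1/2}_f & 0 \\ u & L^{3/2}_f \end{pmatrix},
\quad
A = \begin{pmatrix} a & 0 \\ b & d \end{pmatrix},
\quad
[L_{(f,u)}, A] =
\begin{pmatrix}
 [L^{-1/2}_f, a] & 0 \\
 u\,a - d\,u + L^{3/2}_f\,b - b\,L^{-1/2}_f & [L^{3/2}_f, d]
\end{pmatrix}.
\]
% Diagonal entries, using [L^\lambda_f, L^\lambda_g] = L^\lambda_{f g_x - f_x g}
% and [\partial_y, L^\lambda_f] = L^\lambda_{f_y}:
\[
[L^{\lambda}_f,\ -c_2\partial_y + L^{\lambda}_g]
 = L^{\lambda}_{\,f g_x - f_x g + c_2 f_y},
 \qquad \lambda = -\tfrac12,\ \tfrac32 .
\]
% Lower-left entry: two multiplication operators.
\begin{align*}
u\,a - d\,u &= c_2 u_y - g u_x - 2 g_x u, \\
L^{3/2}_f\,b - b\,L^{-1/2}_f &= f v_x + 2 f_x v + c_1 f_{xxx}.
\end{align*}
```

Read this way, the diagonal entries change $g$ into $f g_x - f_x g + c_2 f_y$ and the lower-left entry changes $v$ into $f v_x + 2 f_x v - g u_x - 2 g_x u + c_1 f_{xxx} + c_2 u_y$, which are the two components of (4.3). The second identity in the last display is the Kirillov-Segal computation of Sect. 4.2 with $c$ replaced by $c_1$ and the potential $u$ replaced by $v$.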
In Sect. 4.6 we will give another, conceptual proof of this theorem. Let us also mention that similar results were obtained in [22, 21] for different examples of Lie algebras generalizing Virasoro. Theorem 4.3 implies that the invariants of the operators (4.8) are now invariants of the coadjoint orbits. It would be interesting to investigate these invariants, for instance, whether there is an analogue of the monodromy operator of $A$.

4.5. Poisson bracket of tensor densities. Many of the explicit formulæ that we calculated in this paper (see, for instance, (3.6), (4.9) and (4.7)) use the bilinear operation on tensor densities
$$\widetilde{\mathcal{F}}_\lambda \otimes \widetilde{\mathcal{F}}_\mu \to \widetilde{\mathcal{F}}_{\lambda+\mu+1}$$
given by
$$\bigl\{\varphi\,dx^\lambda,\ \psi\,dx^\mu\bigr\} = (\lambda\,\varphi\,\psi_x - \mu\,\varphi_x\,\psi)\,dx^{\lambda+\mu+1}. \qquad (4.11)$$
It is easy to check that the bilinear maps (4.11) are invariant. We will call the operation (4.11) the Poisson bracket of tensor densities. Let us rewrite some of the main formulæ using the Poisson bracket. The commutator in $\widetilde{\mathfrak{g}} = \widetilde{\mathcal{F}}_{-1} \oplus \widetilde{\mathcal{F}}_2$ is simply
$$[(f, u), (g, v)] = \bigl(\{f, g\},\ \{f, v\} - \{g, u\}\bigr), \qquad (4.12)$$
while the formula (4.9) of the $\widetilde{\mathfrak{g}}$-action on $\widetilde{\mathcal{F}}_{-1/2} \oplus \widetilde{\mathcal{F}}_{3/2}$ reads in invariant terms as follows:
$$L_{(f,u)}\begin{pmatrix} \varphi \\ \psi \end{pmatrix} = \begin{pmatrix} \{f, \varphi\} \\ \{f, \psi\} + u\,\varphi \end{pmatrix}. \qquad (4.13)$$
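4.6. A Lie superalgebra extending $\widehat{\mathfrak{g}}$. Although the proof of Theorem 4.3 is, indeed, straightforward, it does not clarify the origin of the operators (4.8). The Kirillov method using Lie superalgebras proved to be universal (see [27] for the details) and will be useful in our case. The Lie superalgebra we define in this section generalizes the Neveu-Schwarz algebra in the same sense as $\widehat{\mathfrak{g}}$ generalizes the Virasoro algebra. It can be called the looped cotangent Neveu-Schwarz algebra.

Consider the $\widetilde{\mathfrak{g}}$-module $\widetilde{\mathcal{F}}_{-1/2} \oplus \widetilde{\mathcal{F}}_{3/2}$ and denote it by $\widetilde{\mathfrak{g}}_1$. We will define the Lie superalgebra structure on $\widetilde{\mathcal{G}} = \widetilde{\mathfrak{g}} \oplus \widetilde{\mathfrak{g}}_1$. Since we already know the $\widetilde{\mathfrak{g}}$-action (4.9) on $\widetilde{\mathfrak{g}}_1$, it remains to define the symmetric operation $\widetilde{\mathfrak{g}}_1 \otimes \widetilde{\mathfrak{g}}_1 \to \widetilde{\mathfrak{g}}$, i.e., the “anticommutator”. Similarly to (4.12) and (4.13), let us set
$$[(\varphi, \alpha), (\psi, \beta)] = \bigl(\varphi\,\psi,\ \{\varphi, \beta\} + \{\psi, \alpha\}\bigr). \qquad (4.14)$$
One immediately obtains the following

Proposition 4.4. Formula (4.14) defines a structure of a Lie superalgebra on $\widetilde{\mathcal{G}}$.

The cocycles (3.4) and (3.5) can be extended from $\widetilde{\mathfrak{g}}$ to $\widetilde{\mathcal{G}}$ as even cocycles such that
$$\mu_1\bigl((\varphi, \alpha), (\psi, \beta)\bigr) = 2\int_{S^1\times S^1} \varphi\,\psi_{xx}\,dx\,dy \qquad (4.15)$$
(cf. (4.6)) and
$$\mu_2\bigl((\varphi, \alpha), (\psi, \beta)\bigr) = \int_{S^1\times S^1} (\varphi\,\beta_y + \psi\,\alpha_y)\,dx\,dy. \qquad (4.16)$$
This defines a two-dimensional central extension $\widehat{\mathcal{G}}$. Let us calculate the coadjoint action of $\widehat{\mathcal{G}}$. First, observe that the (regular) dual space $\widehat{\mathcal{G}}^*$ is isomorphic to $\widehat{\mathcal{G}}$. The corresponding Souriau-type cocycles extending (4.2) to $\widehat{\mathcal{G}}$ can also be easily calculated:
$$S_1(\varphi, \alpha) = 2\,\varphi_{xx}\,dx^{3/2}, \qquad S_2(\varphi, \alpha) = -\varphi_y\,dx^{-1/2} - \alpha_y\,dx^{3/2}, \qquad (4.17)$$
so that one can write down the explicit formula of the coadjoint action $\widehat{\mathrm{ad}}^*$ of $\widehat{\mathcal{G}}$. Finally, let us consider only the restriction of $\widehat{\mathrm{ad}}^*$ to the odd part $\widetilde{\mathfrak{g}}_1$ and apply it to the even part of $\widehat{\mathcal{G}}^*$, which is of course isomorphic to $\widehat{\mathfrak{g}}$. More precisely, one obtains the map $\widehat{\mathrm{ad}}^* : \widetilde{\mathfrak{g}}_1 \to \mathrm{End}(\widehat{\mathfrak{g}})$. One obtains the following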
Proposition 4.5. One has
$$\widehat{\mathrm{ad}}^*_{(\varphi, \alpha)}(g, v) = A\begin{pmatrix} \varphi \\ \alpha \end{pmatrix},$$
where $A$ is the operator (4.8).

This statement clarifies the origin of the linear differential operators (4.8). Proposition 4.5 also implies Theorem 4.3. The “kinematic” part of our work is complete; we now turn to the “dynamics”.

5. Euler Equations on $\widehat{\mathfrak{g}}^*$

In this section we calculate Euler equations associated with the Lie algebra $\widehat{\mathfrak{g}}$. We are particularly interested in the Euler equations which are bi-Hamiltonian.

5.1. Hamiltonian formalism on $\widehat{\mathfrak{g}}^*$. Let us recall very briefly the explicit expression of the Hamiltonian vector fields on the dual space of a Lie algebra in the infinite-dimensional functional case (see, e.g., [6, 11] for more details). Consider a functional $H$ on $\widehat{\mathfrak{g}}^*$ which is a (pseudo)differential polynomial:
$$H(g, v) = \int_{S^1\times S^1} h\bigl(g, v, g_x, v_x, g_y, v_y, \partial_x^{-1} g, \partial_x^{-1} v, \partial_y^{-1} g, \partial_y^{-1} v, g_{xy}, v_{xy}, \ldots\bigr)\,dx\,dy,$$
where $h$ is a polynomial in an infinite set of variables. The Hamiltonian vector field (1.5) with the Hamiltonian $H$ reads
$$(g, v)_t = -\widehat{\mathrm{ad}}^*_{\left(\frac{\delta H}{\delta v},\ \frac{\delta H}{\delta g}\right)}(g, v), \qquad (5.1)$$
where $\frac{\delta H}{\delta v}$ and $\frac{\delta H}{\delta g}$ are the standard variational derivatives given by the (generalized) Euler-Lagrange equations. For instance,
$$\begin{aligned}
\frac{\delta H}{\delta v} ={}& h_v - \frac{\partial}{\partial x} h_{v_x} - \frac{\partial}{\partial y} h_{v_y} - \partial_x^{-1} h_{\partial_x^{-1} v} - \partial_y^{-1} h_{\partial_y^{-1} v} \\
&+ \frac{\partial^2}{\partial x^2} h_{v_{xx}} + \frac{\partial^2}{\partial x\,\partial y} h_{v_{xy}} + \frac{\partial^2}{\partial y^2} h_{v_{yy}} \pm \cdots,
\end{aligned}$$
where, as usual, $h_v$ means the partial derivative $\frac{\partial h}{\partial v}$; similarly $h_{v_x} = \frac{\partial h}{\partial v_x}$.
5.2. An Euler equation on $\widehat{\mathfrak{g}}^*$ generalizing KP. Our first example is very close to the classic KP equation (1.2). Consider the following quadratic Hamiltonian on $\widehat{\mathfrak{g}}^*$:
$$H(g, v) = \int_{S^1\times S^1} \Bigl(\frac{v^2}{2} + v\,\partial_x^{-1} g_y\Bigr)\,dx\,dy. \qquad (5.2)$$
The variational derivatives of $H$ can be easily computed:
$$\frac{\delta H}{\delta v}(g, v) = \bigl(v + \partial_x^{-1} g_y\bigr)\frac{\partial}{\partial x}, \qquad \frac{\delta H}{\delta g}(g, v) = \partial_x^{-1} v_y\,dx^2.$$
We then use formula (5.1) and apply formula (4.3) of the coadjoint action to obtain the following result.

Proposition 5.1. The Euler equation associated with the Lie algebra $\widehat{\mathfrak{g}}$ and the Hamiltonian $H$ is the following system:
$$\begin{aligned}
g_t &= v g_x - v_x g - g_y g + g_x\,\partial_x^{-1} g_y + c_2 v_y + c_2\,\partial_x^{-1} g_{yy}, \\
v_t &= 3 v_x v + c_1 v_{xxx} + c_2\,\partial_x^{-1} v_{yy} + 2 v g_y - v_y g + v_x\,\partial_x^{-1} g_y - 2 g_x\,\partial_x^{-1} v_y + c_1 g_{xxy},
\end{aligned} \qquad (5.3)$$
with two indeterminates, $g(t, x, y)$ and $v(t, x, y)$.

Note that the second equation in (5.3) can be written as
$$v_t = 3 v_x v + c_1 v_{xxx} + c_2\,\partial_x^{-1} v_{yy} + (\text{linear terms in } g),$$
and this is nothing but the KP equation with some extra terms in $g$. In this sense, one can speak of the system (5.3) as a “generalized KP equation”.

Remark 5.2. If one sets $g \equiv 0$ in (5.3), then the first equation gives $v_y = 0$. The function $v$ is thus independent of $y$ and the second equation coincides with KdV (trivially “looped” by an extra variable $y$ which does not intervene in the derivatives).

Unfortunately, we do not know whether Eq. (5.3) is bi-Hamiltonian and have no information regarding its integrability.

5.3. “Bi-Hamiltonian formalism” on the dual of a Lie algebra. The notion of integrability of Hamiltonian systems on infinite-dimensional (functional) spaces can be understood in a number of different ways. A quite popular way to define integrability is related to the notion of bi-Hamiltonian systems that goes back to F. Magri [20]. The best known infinite-dimensional example is the KdV equation. Let us recall the standard way to obtain bi-Hamiltonian vector fields on the dual of a Lie algebra.

Given a Lie algebra $\mathfrak{a}$, the canonical linear Poisson structure on $\mathfrak{a}^*$ is given by
$$\{F, G\}(m) = \langle [d_m F, d_m G],\ m\rangle,$$
where the differentials $d_m F$ and $d_m G$ at a point $m \in \mathfrak{a}^*$ are understood as elements of the Lie algebra $\mathfrak{a} \cong (\mathfrak{a}^*)^*$. Consider a constant Poisson structure: fix a point $m_0$ of the dual space and set
$$\{F, G\}_0(m) = \langle [d_m F, d_m G],\ m_0\rangle.$$
It is easy to check that the above Poisson structures are compatible (or form a Poisson pair), i.e., their linear combination
$$\{\ ,\ \}_\lambda = \{\ ,\ \}_0 - \lambda\,\{\ ,\ \} \qquad (5.4)$$
is again a Poisson structure for all $\lambda \in \mathbb{C}$. A function $F$ on $\mathfrak{a}^*$ defines now two Hamiltonian vector fields associated with $F$:
$$m_t = \mathrm{ad}^*_{d_m F}\,m \qquad\text{and}\qquad m_t = \mathrm{ad}^*_{d_m F}\,m_0,$$
corresponding to the first and the second Poisson structure, respectively.

Definition 5.3. A vector field $X$ on $\mathfrak{a}^*$ is called bi-Hamiltonian if there are two functions, $H$ and $H'$, such that $X$ is a Hamiltonian vector field of $H$ with respect to the Poisson structure $\{\ ,\ \}$ and is a Hamiltonian vector field of $H'$ with respect to $\{\ ,\ \}_0$.

Bi-Hamiltonian vector fields provide a rich source of integrable systems. Let $H$ be a function on $\mathfrak{a}^*$ which is a Casimir function of the Poisson structure (5.4). This means that, for every function $F$, one has
$$\{H, F\}_\lambda = 0. \qquad (5.5)$$
Assume that $H$ is written in the form of a series
$$H = H_0 + \lambda H_1 + \lambda^2 H_2 + \cdots. \qquad (5.6)$$
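Substituting the series (5.6) into the Casimir condition (5.5) and collecting powers of $\lambda$ makes the resulting recursion explicit. This is a formal sketch: it treats (5.6) as a formal power series in $\lambda$ and uses only the bilinearity of the two brackets in (5.4).

```latex
\begin{align*}
0 = \{H, F\}_\lambda
  &= \{H, F\}_0 - \lambda\,\{H, F\} \\
  &= \{H_0, F\}_0
     + \sum_{k \ge 0} \lambda^{k+1}\bigl(\{H_{k+1}, F\}_0 - \{H_k, F\}\bigr).
\end{align*}
% Order \lambda^0:      \{H_0, F\}_0 = 0 for every F, so H_0 is a Casimir of \{\,,\}_0.
% Order \lambda^{k+1}:  \{H_{k+1}, F\}_0 = \{H_k, F\} for every F and every k \ge 0.
```

Since $F$ is arbitrary, the second family of identities says that the Hamiltonian field of $H_{k+1}$ for the constant structure equals the Hamiltonian field of $H_k$ for the linear one.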
One immediately obtains the following

Proposition 5.4. The condition (5.5) is equivalent to the following two conditions:
(i) $H_0$ is a Casimir function of $\{\ ,\ \}_0$.
(ii) For all $k$, the Hamiltonian vector field of $H_{k+1}$ with respect to $\{\ ,\ \}_0$ coincides with the Hamiltonian vector field of $H_k$ with respect to $\{\ ,\ \}$.
Furthermore, all the Hamiltonians $H_k$ are in involution with respect to both Poisson structures: $\{H_k, H_\ell\} = \{H_k, H_\ell\}_0 = 0$, and the corresponding Hamiltonian vector fields commute with each other.

Indeed, if $k \ge \ell$, then one has $\{H_k, H_\ell\}_0 = \{H_{k-1}, H_\ell\} = \{H_{k-1}, H_{\ell+1}\}_0$, until one obtains an expression of the form $\{H_s, H_s\}$ or $\{H_s, H_s\}_0$, which is identically zero.

In practice, to construct an integrable hierarchy, one chooses a Casimir function $H_0$ of the Poisson structure $\{\ ,\ \}_0$ and then considers its Hamiltonian vector field with respect to $\{\ ,\ \}$. Thanks to the compatibility condition (5.4), this field is Hamiltonian also with respect to the Poisson structure $\{\ ,\ \}_0$ with some Hamiltonian $H_1$. Then one iterates the procedure. The above method has been successfully applied to the KdV equation viewed as a Hamiltonian field on the dual of the Virasoro algebra.

5.4. Bi-Hamiltonian Euler equation on $\widehat{\mathfrak{g}}^*$. There exists an Euler equation on $\widehat{\mathfrak{g}}^*$ which is bi-Hamiltonian. This Euler equation is closely related to the BKP equation (1.3) in the special case $\alpha = 0$.

Theorem 5.5. The following Hamiltonian
$$H(g, v) = \int_{S^1\times S^1} g\,\Bigl(\partial_x^{-1} v_y - \frac12\,\partial_x^{-1} g_y + \frac12\,\frac{c_1}{c_2}\,g_{xx}\Bigr)\,dx\,dy \qquad (5.7)$$
defines a bi-Hamiltonian system on $\widehat{\mathfrak{g}}^*$.
Proof. Let us fix the following point of $\widehat{\mathfrak{g}}^*$:
$$(g, u, c_1, c_2)_0 = \Bigl(-\frac{\partial}{\partial x},\ dx^2,\ 0,\ 0\Bigr),$$
and show that the Euler vector field (5.7) is also Hamiltonian with respect to the constant Poisson structure $\{\ ,\ \}_0$. Consider the following Casimir function of the constant Poisson structure $\{\ ,\ \}_0$:
$$H_0 = \int_{S^1\times S^1} (v - g)\,dx\,dy.$$
Its Hamiltonian vector field with respect to the linear structure $\{\ ,\ \}$ is
$$\begin{cases} g_t = g_x, \\ v_t = 2 g_x + v_x. \end{cases}$$
The compatibility condition (5.4) guarantees that this vector field is Hamiltonian with respect to the constant structure $\{\ ,\ \}_0$. Its Hamiltonian function can be easily computed:
$$H_1 = \int_{S^1\times S^1} v\,g\,dx\,dy.$$
Iterating this procedure, consider its Hamiltonian field with respect to the linear structure $\{\ ,\ \}$:
$$\begin{cases} g_t = c_2\,g_y, \\ v_t = c_1\,g_{xxx} + c_2\,v_y. \end{cases}$$
Its Hamiltonian with respect to the constant structure $\{\ ,\ \}_0$ is proportional to the function (5.7), namely $H_2 = c_2 H$. We proved that the Hamiltonian $H$ belongs to the hierarchy (5.6).
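The first two steps of the proof can be checked directly from (4.3). The sketch below assumes that the Hamiltonian field of a functional $F$ is obtained, up to the overall sign convention of (5.1), by applying (4.3) with $f = \delta F/\delta v$ and $u = \delta F/\delta g$, and that the constant structure uses the same formula evaluated at the fixed point, written in coordinates as $(g, v, c_1, c_2)_0 = (-1, 1, 0, 0)$.

```latex
% Constant structure: (4.3) at the fixed point (g, v, c_1, c_2)_0 = (-1, 1, 0, 0) reduces to
\[
g_t = f_x, \qquad v_t = 2 f_x + u_x .
\]
% H_0 = \int (v - g)\,dx\,dy gives f = 1, u = -1:
\begin{align*}
\text{constant structure:}\quad & (g_t, v_t) = (0,\ 0), \\
\text{linear structure:}\quad   & (g_t, v_t) = (g_x,\ 2 g_x + v_x).
\end{align*}
% H_1 = \int v\,g\,dx\,dy gives f = g, u = v:
\begin{align*}
\text{constant structure:}\quad & (g_t, v_t) = (g_x,\ 2 g_x + v_x), \\
\text{linear structure:}\quad   & (g_t, v_t) = (c_2\,g_y,\ c_1\,g_{xxx} + c_2\,v_y).
\end{align*}
```

The constant-structure field of $H_0$ vanishes, as expected of a Casimir function, and the constant-structure field of $H_1$ agrees with the linear-structure field of $H_0$, which is condition (ii) of Proposition 5.4 for $k = 0$.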
Remark 5.6. Note that the Hamiltonian $H_1$ is nothing but the quadratic form (4.1). In the case of the Lie algebra $\widetilde{\mathfrak{g}}$ this is the Casimir function with identically zero Hamiltonian vector field. This is why, in the case of $\widehat{\mathfrak{g}}$, the corresponding Hamiltonian vector field depends linearly on the central charges $c_1$, $c_2$.

Let us now calculate the explicit formula of the Euler equation.

Proposition 5.7. The Euler equation with the Hamiltonian (5.7) is of the form:
$$\begin{aligned}
g_t ={}& g_x\,\partial_x^{-1} g_y - g_y\,g + c_2\,\partial_x^{-1} g_{yy}, \\
v_t ={}& 2 v\,g_y - v_y\,g + v_x\,\partial_x^{-1} g_y - 2 g_x\,\partial_x^{-1} v_y + g_y\,g + 2 g_x\,\partial_x^{-1} g_y \\
& - \frac{c_1}{c_2}\,(g_{xxx}\,g + 2 g_{xx}\,g_x) + 2 c_1\,g_{xxy} + c_2\,\partial_x^{-1} v_{yy} + \partial_x^{-1} g_{yy}.
\end{aligned} \qquad (5.8)$$
Proof. We compute the variational derivatives of $H$:
$$\frac{\delta H}{\delta v}(g, v) = \partial_x^{-1} g_y\,\frac{\partial}{\partial x}, \qquad
\frac{\delta H}{\delta g}(g, v) = \Bigl(\partial_x^{-1} v_y - \partial_x^{-1} g_y + \frac{c_1}{c_2}\,g_{xx}\Bigr)\,dx^2,$$
and then use formula (5.1) for the Euler equation together with formula (4.3) for the coadjoint action.
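As an illustration of the proof, here is the computation for the $g$-component only. It is a sketch under the assumptions that $\partial_x^{-1}$ and $\partial_y$ are skew-adjoint for the $L^2$ pairing on $S^1 \times S^1$ and commute with each other, and that the field is read off from the $\partial_x$-component of (4.3) with $f = \delta H/\delta v$, the overall sign convention being that of (5.1).

```latex
% Variational derivatives of (5.7), integrating by parts twice in each term:
\begin{align*}
\int g\,\partial_x^{-1}\partial_y(\delta v)\,dx\,dy
  &= \int \bigl(\partial_x^{-1} g_y\bigr)\,\delta v\,dx\,dy, \\
\delta_g \int \Bigl(-\tfrac12\,g\,\partial_x^{-1} g_y
      + \tfrac12\,\tfrac{c_1}{c_2}\,g\,g_{xx}\Bigr)\,dx\,dy
  &= \int \Bigl(-\partial_x^{-1} g_y + \tfrac{c_1}{c_2}\,g_{xx}\Bigr)\,\delta g\,dx\,dy,
\end{align*}
% hence
\[
\frac{\delta H}{\delta v} = \partial_x^{-1} g_y, \qquad
\frac{\delta H}{\delta g} = \partial_x^{-1} v_y - \partial_x^{-1} g_y + \frac{c_1}{c_2}\,g_{xx}.
\]
% \partial_x-component of (4.3) with f = \partial_x^{-1} g_y, so that f_x = g_y, f_y = \partial_x^{-1} g_{yy}:
\[
g_t = f\,g_x - f_x\,g + c_2\,f_y
    = g_x\,\partial_x^{-1} g_y - g_y\,g + c_2\,\partial_x^{-1} g_{yy}.
\]
```

The last line is the first equation of (5.8); it involves $g$ alone, which is what allows it to be studied separately from the equation for $v$.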
The first equation in (5.8) is precisely Eq. (1.8) already defined in the Introduction. In the complex case, it is equivalent to BKP (1.3) with $\alpha = 0$. It is an amazing fact that the KP equation (in the preceding section) and the BKP equation naturally appear in our context on mutually “dual functions”, namely on $v$ and $g$. Let us now show how the bi-Hamiltonian technique implies the existence of an infinite hierarchy of commuting flows.

5.5. Integrability of Equation (1.8). A corollary of Theorem 5.5 is the existence of an infinite series of first integrals in involution for the field (5.8). It turns out that the corresponding Hamiltonians are of a particular form.

Proposition 5.8. Each Hamiltonian $H_k$ of the constructed hierarchy is an affine functional in $v$, that is, $H_k(g, v) = H_k'(g, v) + H_k''(g)$, where $H_k'$ is linear in $v$.

Proof. Let us show that the variational derivative $\frac{\delta H_k}{\delta v}$ does not depend on $v$. By construction, the expression for the Hamiltonian vector field of $H_k$ with respect to the Poisson structure $\{\ ,\ \}$ gives
$$g_t = \frac{\delta H_k}{\delta v}\,g_x - \Bigl(\frac{\delta H_k}{\delta v}\Bigr)_x g + c_2\,\Bigl(\frac{\delta H_k}{\delta v}\Bigr)_y. \qquad (5.9)$$
On the other hand, the same vector field is the Hamiltonian field of $H_{k+1}$ with respect to the Poisson structure $\{\ ,\ \}_0$. One has
$$g_t = \Bigl(\frac{\delta H_{k+1}}{\delta v}\Bigr)_x,$$
which expresses the variational derivative $\frac{\delta H_{k+1}}{\delta v}$ in terms of the function $\frac{\delta H_k}{\delta v}$. One then proceeds by induction.

It follows that Eqs. (5.9) never depend on $v$. The flows of these vector fields commute with each other since the corresponding Hamiltonian fields on $\widehat{\mathfrak{g}}^*$ commute.

Example 5.9. The next vector field of our hierarchy is already quite complicated:
$$\begin{aligned}
g_t ={}& g\,\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr) - g_x\,\partial_x^{-1}\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr) \\
& - c_2\,\Bigl(g\,\partial_x^{-1} g_{yy} - g_x\,\partial_x^{-2} g_{yy} + \partial_x^{-1}\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr)_y\Bigr) + c_2^2\,\partial_x^{-2} g_{yyy}.
\end{aligned}$$
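Example 5.9 can be recovered from the recursion used in the proof of Proposition 5.8. The sketch below writes $F_k = \delta H_k/\delta v$ (a shorthand introduced here), starts from $F_1 = g$ for $H_1 = \int v\,g\,dx\,dy$, assumes that $\partial_x^{-1}$ inverts $\partial_x$ on the functions involved, and divides the final flow by an overall factor $c_2$.

```latex
% Recursion obtained by equating (5.9) with g_t = (F_{k+1})_x:
%   (F_{k+1})_x = g_x F_k - (F_k)_x g + c_2 (F_k)_y,   F_1 = g.
\begin{align*}
F_2 &= c_2\,\partial_x^{-1} g_y, \\
F_3 &= -c_2\,\partial_x^{-1}\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr)
       + c_2^2\,\partial_x^{-2} g_{yy}.
\end{align*}
% Flow (5.9) with k = 3, divided by c_2:
\begin{align*}
g_t &= g\,\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr)
      - g_x\,\partial_x^{-1}\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr) \\
    &\quad - c_2\,\Bigl(g\,\partial_x^{-1} g_{yy} - g_x\,\partial_x^{-2} g_{yy}
      + \partial_x^{-1}\bigl(g\,g_y - g_x\,\partial_x^{-1} g_y\bigr)_y\Bigr)
      + c_2^2\,\partial_x^{-2} g_{yyy}.
\end{align*}
```

The intermediate step $F_2 = c_2\,\partial_x^{-1} g_y$ is consistent with $H_2 = c_2 H$ and $\delta H/\delta v = \partial_x^{-1} g_y$ from the proof of Proposition 5.7.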
This is the first higher order equation in the hierarchy of (1.8). We can now give a partial answer to the question of T. Ratiu. The dispersionless BKP equation can be realized as an Euler equation on the dual of the looped cotangent Virasoro algebra. However, the problem remains open in the case of the classic KP equation. Appendix All the Lie algebras considered in this paper are generalizations of the Virasoro algebra with two space variables. However, these algebras themselves have interesting generalizations. These structures seem to be quite rich and deserve further study.
Natural generalizations of $\mathcal{L}(\mathrm{Vect}(S^1))$. Consider the 2-torus $\mathbb{T}^2 = S^1 \times S^1$ parameterized with variables $x$ and $y$. Let $\mathrm{Vect}(\mathbb{T}^2)$ be the Lie algebra of tangent vector fields on $\mathbb{T}^2$. The Lie algebra $\mathcal{L}(\mathrm{Vect}(S^1))$ is naturally embedded into $\mathrm{Vect}(\mathbb{T}^2)$ as the Lie subalgebra of vector fields tangent to the constant field $X = \frac{\partial}{\partial x}$. This fact suggests the following generalization.

Let $V$ be a compact orientable manifold with a fixed volume form $\omega$ and let $X \in \mathrm{Vect}(V)$ be a non-vanishing vector field on $V$ such that $\mathrm{div}\,X = 0$. We will denote by $\mathcal{A}_X$ the Lie algebra of vector fields collinear to $X$; the Lie bracket of $\mathcal{A}_X$ can be written as follows:
$$[f X, g X] = \bigl(f\,L_X(g) - g\,L_X(f)\bigr)\,X.$$
This is clearly a Lie subalgebra of $\mathrm{Vect}(V)$ generalizing $\mathcal{L}(\mathrm{Vect}(S^1))$. Some particular cases, such as the iterated loop algebra, were considered in [29]. One can easily construct a generalization of the Gelfand-Fuchs cocycle (2.1) on $\mathcal{A}_X$:
$$c(f X, g X) = \int_V f\,(L_X)^3 g\ \omega.$$
It is easy to check that this cocycle is non-trivial so that $H^2(\mathcal{A}_X; \mathbb{C})$ is not trivial. However, we have no further information about this cohomology group. The geometry of coadjoint orbits of $\mathcal{A}_X$, as well as possible applications to dynamical systems, also remains an interesting open problem.

Remark 5.10. The condition $\mathrm{div}\,X = 0$ is assumed here mainly for technical reasons (it makes the formulæ nicer); however, one may think of dropping this condition as well as the condition on $X$ to be non-vanishing.

A 2-parameter deformation of $\widetilde{\mathfrak{g}}$. Let us describe a 2-parameter family of Lie algebras which can be obtained as a deformation of $\widetilde{\mathfrak{g}}$. In [26] we classified the non-central extensions of $\mathrm{Vect}(S^1)$ by the space of quadratic differentials. The result is as follows. There are exactly two (up to isomorphism) non-trivial extensions of $\mathrm{Vect}(S^1)$ by $\mathcal{F}_2$ defined by the following 2-cocycles:
$$\rho_1\Bigl(f\frac{\partial}{\partial x},\ g\frac{\partial}{\partial x}\Bigr) = (f_{xxx}\,g - f\,g_{xxx})\,dx^2, \qquad
\rho_2\Bigl(f\frac{\partial}{\partial x},\ g\frac{\partial}{\partial x}\Bigr) = (f_{xxx}\,g_x - f_x\,g_{xxx})\,dx^2$$
from $\mathrm{Vect}(S^1)$ to $\mathcal{F}_2$. The 2-cocycles $\rho_1$, $\rho_2$ give rise to the following modification of the Lie algebra law (3.6). We set
$$\begin{aligned}
\Bigl[f\frac{\partial}{\partial x} + u\,dx^2,\ g\frac{\partial}{\partial x} + v\,dx^2\Bigr]_{(\kappa_1, \kappa_2)}
={}& \Bigl[f\frac{\partial}{\partial x} + u\,dx^2,\ g\frac{\partial}{\partial x} + v\,dx^2\Bigr] \\
&+ \kappa_1\,\rho_1\Bigl(f\frac{\partial}{\partial x},\ g\frac{\partial}{\partial x}\Bigr) + \kappa_2\,\rho_2\Bigl(f\frac{\partial}{\partial x},\ g\frac{\partial}{\partial x}\Bigr),
\end{aligned}$$
where $\kappa_1, \kappa_2 \in \mathbb{C}$ are parameters. This deformed commutator satisfies the Jacobi identity and provides an interesting Lie algebra structure.
Acknowledgements. We are grateful to T. Ratiu for his interest in this work. We also wish to thank B. Khesin and A. Reiman for enlightening discussions.
References

1. Ablowitz, M.J., Clarkson, P.A.: Solitons, nonlinear evolution equations and inverse scattering. London Mathematical Society Lecture Note Series 149, Cambridge: Cambridge University Press, 1991
2. Arnold, V.I.: Mathematical methods of classical mechanics. Third edition, Moscow: Nauka, 1989
3. Arnold, V.I., Khesin, B.A.: Topological methods in hydrodynamics. Applied Mathematical Sciences 125. New York: Springer-Verlag, 1998
4. Balog, J., Fehér, L., Palla, L.: Coadjoint orbits of the Virasoro algebra and the global Liouville equation. Internat. J. Mod. Phys. A 13(2), 315–362 (1998)
5. Bogdanov, L.V., Konopelchenko, B.G.: On dispersionless BKP hierarchy and its reductions. J. Nonlinear Math. Phys. 12, Suppl. 1, 64–73 (2005)
6. Dickey, L.A.: Soliton equations and Hamiltonian systems. Second edition, Adv. Ser. in Math. Phys. 26, River Edge, NJ: World Scientific, 2003
7. Dunajski, M.: A class of Einstein-Weyl spaces associated to an integrable system of hydrodynamic type. J. Geom. Phys. 51(1), 126–137 (2004)
8. Ferapontov, E.V., Khusnutdinova, K.R.: On the integrability of (2 + 1)-dimensional quasilinear systems. Commun. Math. Phys. 248(1), 187–206 (2004)
9. Ferapontov, E.V., Khusnutdinova, K.R.: Hydrodynamic reductions of multi-dimensional dispersionless PDEs: the test for integrability. J. Math. Phys. 45(6), 2365–2377 (2004)
10. Fuks, D.B.: Cohomology of infinite-dimensional Lie algebras. New York: Consultants Bureau, 1986
11. Guieu, L., Roger, C.: L’Algèbre et le Groupe de Virasoro: aspects géométriques et algébriques, généralisations. To appear
12. Gelfand, I.M., Fuks, D.B.: Cohomologies of the Lie algebra of vector fields on the circle. Func. Anal. Appl. 2(4), 92–93 (1968)
13. Hirota, R.: The direct method in soliton theory. Cambridge Tracts in Mathematics 155, Cambridge: Cambridge University Press, 2004
14. Kirillov, A.: The orbits of the group of diffeomorphisms of the circle, and local Lie superalgebras. Func. Anal. Appl. 15(2), 75–76 (1981)
15. Kirillov, A.: Infinite-dimensional Lie groups: their orbits, invariants and representations. The geometry of moments. Lecture Notes in Math. 970, Berlin: Springer, 1982, pp. 101–123
16. Khesin, B., Misiolek, G.: Euler equations on homogeneous spaces and Virasoro orbits. Adv. Math. 176(1), 116–144 (2003)
17. Khesin, B., Ovsienko, V.: The super Korteweg-de Vries equation as an Euler equation. Func. Anal. Appl. 21(4), 81–82 (1987)
18. Konopelchenko, B., Martinez Alonso, L.: Dispersionless scalar integrable hierarchies, Whitham hierarchy, and the quasiclassical ∂-dressing method. J. Math. Phys. 43(7), 3807–3823 (2002)
19. Loday, J.-L.: Cyclic homology. Second edition. Berlin: Springer-Verlag, 1998
20. Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19(5), 1156–1162 (1978)
21. Marcel, P.: Generalizations of the Virasoro algebra and matrix Sturm-Liouville operators. J. Geom. Phys. 36(3–4), 211–222 (2000)
22. Marcel, P., Ovsienko, V., Roger, C.: Extension of the Virasoro and Neveu-Schwarz algebras and generalized Sturm-Liouville operators. Lett. Math. Phys. 40(1), 31–39 (1997)
23. Martinez Alonso, L., Shabat, A.B.: Towards a theory of differential constraints of a hydrodynamic hierarchy. J. Nonlinear Math. Phys. 10(2), 229–242 (2003)
24. Misiolek, G.: A shallow water equation as a geodesic flow on the Bott-Virasoro group. J. Geom. Phys. 24(3), 203–208 (1998)
25. Misiolek, G.: Conjugate points in the Bott-Virasoro group and the KdV equation. Proc. Amer. Math. Soc. 125(3), 935–940 (1997)
26. Ovsienko, V., Roger, C.: Generalizations of Virasoro group and Virasoro algebra through extensions by modules of tensor-densities on S^1. Indag. Math. (N.S.) 9(2), 277–288 (1998)
27. Ovsienko, V.: Coadjoint representation of Virasoro-type Lie algebras and differential operators on tensor-densities. DMV Sem. 31, Basel: Birkhäuser, 2001, pp. 231–255
28. Ovsienko, V., Tabachnikov, S.: Projective differential geometry old and new. From the Schwarzian derivative to the cohomology of diffeomorphism groups. Cambridge Tracts in Math. 165, Cambridge: Cambridge University Press, 2005
29. Ramos, E., Sah, C.-H., Shrock, R.: Algebras of diffeomorphisms of the N-torus. J. Math. Phys. 31(8), 1805–1816 (1990)
30. Reiman, A., Semenov-Tyan-Shanskii, M.: Hamiltonian structure of equations of Kadomtsev-Petviashvili type. Differential geometry, Lie groups and mechanics, VI. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 133, 212–227 (1984)
31. Segal, G.: Unitary representations of some infinite-dimensional groups. Commun. Math. Phys. 80(3), 301–342 (1981)
32. Zusmanovich, P.: The second homology group of current Lie algebras. In: K-theory (Strasbourg, 1992), Astérisque No. 226(11), 435–452 (1994)

Communicated by L. Takhtajan
Commun. Math. Phys. 273, 379–394 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0242-2
$t^{1/3}$ Superdiffusivity of Finite-Range Asymmetric Exclusion Processes on $\mathbb{Z}$

Jeremy Quastel, Benedek Valkó

Departments of Mathematics and Statistics, University of Toronto, Toronto, Ontario MSS 1L2, Canada. E-mail: [email protected]; [email protected]

Supported by the Natural Sciences and Engineering Research Council of Canada.
Partially supported by the Hungarian Scientific Research Fund grants T37685 and K60708.

Received: 2 June 2006 / Accepted: 31 October 2006
Published online: 15 May 2007 – © Springer-Verlag 2007

Abstract: We consider finite-range asymmetric exclusion processes on $\mathbb{Z}$ with non-zero drift. The diffusivity $D(t)$ is expected to be of $O(t^{1/3})$. We prove that $D(t) \ge C t^{1/3}$ in the weak (Tauberian) sense that $\int_0^\infty e^{-\lambda t}\,t\,D(t)\,dt \ge C\lambda^{-7/3}$ as $\lambda \to 0$. The proof employs the resolvent method to make a direct comparison with the totally asymmetric simple exclusion process, for which the result is a consequence of the scaling limit for the two-point function recently obtained by Ferrari and Spohn. In the nearest neighbor case, we show further that $t D(t)$ is monotone, and hence we can conclude that $D(t) \ge C t^{1/3} (\log t)^{-7/3}$ in the usual sense.

1. Introduction

A finite-range exclusion process on the integer lattice $\mathbb{Z}$ is a system of continuous time, rate one random walks with finite-range jump law $p(\cdot)$, i.e. $p(z) \ge 0$, $p(z) = 0$ for $|z| > R$ for some $R < \infty$, and $\sum_z p(z) = 1$, interacting via exclusion: attempted jumps to occupied sites are suppressed. We will always assume in this article that $p(\cdot)$ has a non-zero drift,
$$\sum_z z\,p(z) = b \ne 0. \qquad (1.1)$$
In particular, $p(\cdot)$ is asymmetric and we will refer to the process as the asymmetric exclusion process (AEP). The state space of the process is $\{0, 1\}^{\mathbb{Z}}$ and it is traditional to call configurations $\eta$, with $\eta_x \in \{0, 1\}$ indicating the absence, or presence, of a particle at $x \in \mathbb{Z}$. The infinitesimal generator of the process is given by
$$L f(\eta) = \sum_{x, z \in \mathbb{Z}} p(z)\,\eta_x (1 - \eta_{x+z})\bigl(f(\eta^{x, x+z}) - f(\eta)\bigr), \qquad (1.2)$$
where $\eta^{x,y}$ denotes the configuration obtained from $\eta$ by interchanging the occupation variables at $x$ and $y$. The Bernoulli product measures $\pi_\rho$, $\rho \in [0,1]$, with $\pi_\rho(\eta_x = 1) = \rho$ form a one-parameter family of invariant measures for the process. In fact, there exist other invariant measures [BM], but they will not be relevant for our discussion. The processes started from $\pi_0$ and $\pi_1$ are trivial, so we consider the stationary process obtained by starting with $\pi_\rho$ for some $\rho \in (0,1)$. Let
$$\hat\eta_x = \frac{\eta_x - \rho}{\sqrt{\rho(1-\rho)}}, \qquad \hat\eta_A = \prod_{x\in A}\hat\eta_x \tag{1.3}$$
for any finite nonempty set $A \subset \mathbb{Z}$. The collection $\{\hat\eta_A\}$, where $A$ ranges over the finite subsets of $\mathbb{Z}$, is an orthonormal basis of $L^2(\pi_\rho)$ with its natural inner product
$$\langle f, g\rangle = \int_{\{0,1\}^{\mathbb{Z}}} f g\, d\pi_\rho. \tag{1.4}$$
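The dynamics encoded by the generator (1.2) can be sketched in a few lines of code. The following is only an illustrative simulation under stated assumptions: the lattice $\mathbb{Z}$ is replaced by a finite ring with periodic boundary conditions, and the function name `simulate_aep_ring`, the jump law `p` and the system size are hypothetical choices made for the example, not taken from the article.

```python
import random


def simulate_aep_ring(eta, p, t_max, rng):
    """Run a finite-range exclusion process on a ring up to time t_max.

    eta : list of 0/1 occupation variables (modified in place)
    p   : dict {z: p(z)} giving the jump law, with probabilities summing to 1
    The ring (periodic boundary) is an assumption made for this sketch.
    """
    n_sites = len(eta)
    particles = [x for x in range(n_sites) if eta[x] == 1]
    if not particles:
        return eta
    jumps, weights = zip(*sorted(p.items()))
    t = 0.0
    while True:
        # Every particle carries a rate-one clock, so the waiting time until
        # the next attempted jump is exponential with rate = number of particles.
        t += rng.expovariate(len(particles))
        if t > t_max:
            return eta
        i = rng.randrange(len(particles))
        x = particles[i]
        z = rng.choices(jumps, weights=weights)[0]
        y = (x + z) % n_sites
        # Exclusion rule: a jump onto an occupied site is suppressed.
        if eta[y] == 0:
            eta[x], eta[y] = 0, 1
            particles[i] = y


rng = random.Random(0)
rho = 0.5
eta0 = [1 if rng.random() < rho else 0 for _ in range(200)]
p = {1: 0.6, -1: 0.1, 2: 0.3}  # hypothetical jump law with drift b = 1.1
eta_t = simulate_aep_ring(list(eta0), p, t_max=5.0, rng=rng)
```

Since particles are neither created nor destroyed, the number of particles is conserved, which gives a simple sanity check on the exclusion rule.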
Then $L^2(\pi_\rho)$ can naturally be thought of as the direct sum of subspaces $H_1, H_2, \ldots$, where $H_n$ is the linear span of $\{\hat\eta_A\}$, $|A| = n$. It is natural to think of $H_1$ as being linear functions, $H_2$ as quadratic functions, etc. From a physical point of view, the most basic quantity is the two-point function,
$$S(x,t) = E[(\eta_x(t) - \rho)(\eta_0(0) - \rho)]. \tag{1.5}$$
The expectation is with respect to the stationary process obtained by starting from one of the invariant measures $\pi_\rho$. It is easy to show (see [PS]) that $S(x,t)$ satisfies the sum rules
$$\sum_x S(x,t) = \rho(1-\rho) = \chi, \qquad \frac{1}{\chi}\sum_x x\,S(x,t) = (1-2\rho)bt. \tag{1.6}$$
Note that one should not expect to be able to actually compute $S(x,t)$ but one does hope to find its large scale structure. The next most basic quantity, the diffusivity $D(t)$, is already unknown. It is defined as
$$D(t) = (\chi t)^{-1}\sum_{x\in\mathbb{Z}} (x - (1-2\rho)bt)^2\, S(x,t). \tag{1.7}$$
Using coupling (see [L]), the diffusivity can be rewritten in terms of the variance of a second class particle. Suppose one starts with two configurations $\eta'$ and $\eta$ which are ordered in the sense that $\eta'_x \ge \eta_x$ for each $x \in \mathbb{Z}$. One can couple the two exclusions by having them jump together whenever possible and one observes that at later times the ordering is preserved. If we write $\eta' = \eta + \eta''$, then the "particles" of $\eta''$ move according to the second class particle dynamics. Among themselves they move with the standard exclusion rule, the other (first class) particles move without noticing them, and if a first class particle attempts to jump to a site occupied by a second class particle, the two exchange positions. Note that
$$\begin{aligned}\chi^{-1} S(x,t) &= P(\eta_x(t) = 1 \mid \eta_0(0) = 1) - P(\eta_x(t) = 1 \mid \eta_0(0) = 0)\\ &= P(\eta''_x(t) = 1 \mid \eta''(0) = \delta_0) = P(X(t) = x \mid X(0) = 0)\end{aligned} \tag{1.8}$$
are the transition probabilities of a single second class particle $X(t)$ starting at the origin. Here $\delta_0$ is the configuration with only one particle at 0 and $P$ is the coupled measure. The diffusivity is then given by
$$D(t) = t^{-1}\,\mathrm{Var}(X(t)). \tag{1.9}$$
We can alternately write the dynamics as a stochastic differential equation
$$\mathfrak{d}\hat\eta_x = (\nabla w_x + \Delta\hat\eta_x)\,dt + dM_x, \tag{1.10}$$
where $\mathfrak{d}$ is a microscopic convective derivative,
$$\mathfrak{d}\hat\eta_x = d\hat\eta_x + \frac{(1-2\rho)}{2}\sum_z p(z)(\hat\eta_{x+z} - \hat\eta_{x-z}), \tag{1.11}$$
$\nabla$ and $\Delta$ are microscopic analogues of first and second spatial derivatives,
$$\nabla w_x = \chi^{1/2}\sum_z p(z)\big(\hat\eta_{\{x,x+z\}} - \hat\eta_{\{x-z,x\}}\big), \tag{1.12}$$
$$\Delta\hat\eta_x = \frac{1}{2}\sum_z p(z)(\hat\eta_{x+z} + \hat\eta_{x-z} - 2\hat\eta_x), \tag{1.13}$$
and $M_x(t)$ are martingales with
$$E\Big[\Big(\int_0^t \sum_x \phi_x\, dM_x\Big)^2\Big] = \int_0^t \sum_{x,z} p(z)(\phi_{x+z} - \phi_x)^2\, ds. \tag{1.14}$$
The current $w_x = \tau_x w$, where the specific quadratic function $w$ is given by
$$w = \sum_z z\,p(z)\,\hat\eta_{\{0,z\}}. \tag{1.15}$$
In this sense, AEP is a natural discretisation of the stochastic Burgers equation,
$$\partial_t u = \partial_x u^2 + \partial_x^2 u + \partial_x \dot W \tag{1.16}$$
for a function $u(t,x)$ of $x \in \mathbb{R}$ and $t > 0$, where $\dot W$ is a space-time white noise. White noise is supposed to be an invariant measure. Letting $\partial_x U = u$ one obtains the Kardar-Parisi-Zhang equation for surface growth,
$$\partial_t U = (\partial_x U)^2 + \partial_x^2 U + \dot W. \tag{1.17}$$
We are interested in the large scale behaviour and the only rescalings of $u$ which preserve the initial white noise are
$$u_\epsilon(t,x) = \epsilon^{-1/2}\, u(\epsilon^{-z} t, \epsilon^{-1} x). \tag{1.18}$$
The stochastic Burgers equation (1.16) transforms to
$$\partial_t u_\epsilon = \epsilon^{\frac{3}{2}-z}\,\partial_x u_\epsilon^2 + \epsilon^{2-z}\,\partial_x^2 u_\epsilon + \epsilon^{1-\frac{z}{2}}\,\partial_x \dot W, \tag{1.19}$$
which suggests that the dynamical exponent is $z = 3/2$ and that the diffusion and random forcing terms become irrelevant in the limit.
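The exponents in (1.19) can be checked by elementary bookkeeping. The short sketch below (the helper name `rescaled_exponents` is ours) evaluates the three powers of the scaling parameter $\epsilon$ for a given $z$ using exact rational arithmetic. At $z = 3/2$ the nonlinear term is of order one, while the diffusion and noise terms carry the positive powers $\epsilon^{1/2}$ and $\epsilon^{1/4}$ and so vanish as $\epsilon \to 0$.

```python
from fractions import Fraction


def rescaled_exponents(z):
    """Powers of epsilon multiplying the three terms on the right of (1.19)."""
    z = Fraction(z)
    return {
        "nonlinear": Fraction(3, 2) - z,  # coefficient of d_x u^2
        "diffusion": 2 - z,               # coefficient of d_x^2 u
        "noise": 1 - z / 2,               # coefficient of d_x W-dot
    }


exps = rescaled_exponents(Fraction(3, 2))
# exps == {"nonlinear": 0, "diffusion": 1/2, "noise": 1/4}
```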
The exponent $z = 3/2$ was first predicted for (1.16) by Forster, Nelson and Stephen [FNS], then for AEP by van Beijeren, Kutner and Spohn [BKS] and then for (1.17) by Kardar, Parisi and Zhang [KPZ]. Note that at a rigorous level we are very far from understanding this for either (1.16) or (1.17). At the present time the mathematical problem there is just to make sense of the equation (see [BG]). So it makes sense to consider exclusion processes, which are clearly well defined, yet are supposed to have the same large scale behaviour. The scaling prediction for $u$ suggests that on large scales
$$S(x,t) \simeq t^{-2/3}\,\Phi\big(t^{-2/3}(x - (1-2\rho)bt)\big) \tag{1.20}$$
for some scaling function $\Phi$, and in particular one conjectures that
$$D(t) \simeq Ct^{1/3}. \tag{1.21}$$
Note that the case of asymmetric exclusion with mean-zero jump law is different and there one has as usual that $D(t) \to D$ as $t \to \infty$ (see [V]). The diffusivity can be related to the time integral of current-current correlation functions by the Green-Kubo formula,
$$D(t) = \sum_z z^2 p(z) + 2\chi t^{-1}\int_0^t\!\!\int_0^s \langle\langle w, e^{uL} w\rangle\rangle\, du\, ds. \tag{1.22}$$
It uses a special inner product defined for local functions by
$$\langle\langle \phi, \psi\rangle\rangle = \sum_x \langle \phi, \tau_x \psi\rangle. \tag{1.23}$$
Equation (1.22) is proved in [LOY] (in the special case $p(1) = 1$, but the proof for general AEP is the same). A useful variant is obtained by taking the Laplace transform,
$$\int_0^\infty e^{-\lambda t}\, t D(t)\, dt = \lambda^{-2}\Big(\sum_z z^2 p(z) + 2\chi\,|||w|||^2_{-1,\lambda}\Big), \tag{1.24}$$
where the $H_{-1}$ norm corresponding to $L$ is defined for local functions by
$$|||\phi|||_{-1,\lambda} = \langle\langle \phi, (\lambda - L)^{-1}\phi\rangle\rangle^{1/2}. \tag{1.25}$$
We say that $D(t) \simeq t^{\rho}$, $\rho > 0$, in the weak (Tauberian) sense if $\int_0^\infty e^{-\lambda t}\, t D(t)\, dt \simeq \lambda^{-(2+\rho)}$. Hence the weak (Tauberian) version of the conjecture (1.21) is
$$|||w|||^2_{-1,\lambda} \simeq \lambda^{-1/3}. \tag{1.26}$$
One of the key advantages of this resolvent approach is that there is a variational formula (see [LQSY]),
$$|||w|||^2_{-1,\lambda} = \sup_f \big\{ 2\langle\langle w, f\rangle\rangle - \langle\langle f, (\lambda - S) f\rangle\rangle - \langle\langle A f, (\lambda - S)^{-1} A f\rangle\rangle \big\}, \tag{1.27}$$
where $S = \frac{1}{2}(L + L^*)$ and $A = \frac{1}{2}(L - L^*)$ are the symmetric and antisymmetric parts of the generator $L$. $S$ is nothing but the generator of the symmetric exclusion process with $\bar p(z) = \frac{1}{2}(p(z) + p(-z))$. It has the special property that it maps the subspaces $H_n$
into themselves, and on each is nothing but the generator of a symmetric random walk. Hence one can hope to obtain non-trivial information from (1.27) by carefully choosing test functions $f$. This idea was used in [LQSY] to obtain $D(t) \ge Ct^{1/4}$ in $d = 1$ and $D(t) \ge C(\log t)^{1/2}$ in $d = 2$, which was improved to $D(t) \simeq C(\log t)^{2/3}$ in [Y]. All of these are in the weak (Tauberian) sense.

The special case of jump law $p(1) = 1$, $p(z) = 0$, $z \ne 1$ is called the totally asymmetric simple exclusion process (TASEP). Simple refers to the nearest-neighbour jumps of the underlying random walk. It is very remarkable that after about 20 years of intense study, TASEP has succumbed to a combination of sophisticated techniques from analysis, combinatorics and random matrix theory (see [FS] and references therein). We now state the main result of Ferrari and Spohn [FS]. Define the height function
$$h_t(x) = 2N_t - M_t(x), \tag{1.28}$$
$t \ge 0$, where $N_t$ counts the number of jumps from site 0 to site 1 up to time $t$ and
$$M_t(x) = \begin{cases} \sum_{i=1}^{x} (2\eta_i(t) - 1) & \text{if } x > 0,\\ 0 & \text{if } x = 0,\\ -\sum_{i=x+1}^{0} (2\eta_i(t) - 1) & \text{if } x < 0. \end{cases} \tag{1.29}$$
Note that $E[h_t(x)] = 2\chi t + (1-2\rho)x$. Let
$$v(x,t) = \mathrm{Var}(h_t(x)). \tag{1.30}$$
Since $h_t(x+1) - h_t(x) = 1 - 2\eta_{x+1}(t)$, it is not hard to check that
$$8S(x,t) = v(x+1,t) - 2v(x,t) + v(x-1,t). \tag{1.31}$$
See [PS] for a detailed proof. We have
$$D(t) = (4\chi t)^{-1}\sum_{x\in\mathbb{Z}} \big(\mathrm{Var}(h_t(x)) - 4\chi\,|x - (1-2\rho)t|\big) \tag{1.32}$$
(see Sect. 4). Now consider a normalised version of $h_t$:
$$\hat h_t(x) = \chi^{-2/3}\, t^{-1/3}\,(h_t(x) - E[h_t(x)]), \tag{1.33}$$
and for each fixed $t > 0$ and $\omega \in \mathbb{R}$ let $F_{\omega,t}$ be the cumulative distribution function of $-\hat h_t((1-2\rho)t + 2\omega\chi^{1/3} t^{2/3})$;
$$F_{\omega,t}(s) = P\big(-\hat h_t((1-2\rho)t + 2\omega\chi^{1/3} t^{2/3}) \le s\big). \tag{1.34}$$
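The passage from (1.31) to (1.32) stated above is a double summation by parts (carried out in Sect. 4). As an illustrative check only, the following sketch verifies the resulting identity exactly on synthetic data: the correction `g`, the density $\rho = 1/4$ and the time $t = 6$ are hypothetical choices, $b = 1$ as for TASEP, and $(1-2\rho)t$ is taken to be an integer so that the kink of $|x - (1-2\rho)t|$ sits on a lattice site.

```python
from fractions import Fraction

rho = Fraction(1, 4)       # illustrative density
t = 6                      # illustrative time
chi = rho * (1 - rho)      # chi = rho (1 - rho)
m = (1 - 2 * rho) * t      # (1 - 2 rho) t, equal to 3 here


def g(x):
    """Hypothetical finitely supported correction v(x) - 4 chi |x - m|."""
    y = x - m
    return max(0, 5 - abs(y)) ** 2 + max(0, 3 - abs(y - 2))


def v(x):
    """Synthetic stand-in for Var(h_t(x))."""
    return 4 * chi * abs(x - m) + g(x)


def S(x):
    """Two-point function recovered from v through (1.31)."""
    return (v(x + 1) - 2 * v(x) + v(x - 1)) / 8


xs = range(int(m) - 20, int(m) + 21)  # wide enough to contain the support of g
lhs = sum((x - m) ** 2 * S(x) for x in xs)              # chi * t * D(t) via (1.7)
rhs = sum(v(x) - 4 * chi * abs(x - m) for x in xs) / 4  # chi * t * D(t) via (1.32)
```

The two sums agree exactly because the second difference of $(x-m)^2$ is the constant 2 and the second difference of $4\chi|x-m|$ is supported at $x = m$, where the weight $(x-m)^2$ vanishes.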
The main result of Ferrari and Spohn concerning TASEP is that $dF_{\omega,t}$ converge weakly as probability measures, as $t$ tends to infinity, to $dF_\omega$, where
$$F_\omega(s) = \frac{\partial}{\partial s}\Big(F_{GUE}(s + \omega^2)\, g(s + \omega^2, \omega)\Big), \tag{1.35}$$
where $F_{GUE}$ is the Tracy-Widom distribution and $g$ is a scaling function defined through the Airy kernel (see [FS] for details). Note that the convergence stated in [FS] is that for any $c_1 < c_2$,
$$\lim_{t\to\infty}\int_{c_1}^{c_2} F_{\omega,t}(s)\, ds = \int_{c_1}^{c_2} F_\omega(s)\, ds. \tag{1.36}$$
In fact, this is the same as weak convergence. For by monotonicity, if $\epsilon > 0$,
$$\epsilon^{-1}\int_{s-\epsilon}^{s} F_{\omega,t}(u)\, du \le F_{\omega,t}(s) \le \epsilon^{-1}\int_{s}^{s+\epsilon} F_{\omega,t}(u)\, du. \tag{1.37}$$
Taking the limit in $t$ and using (1.36) we see that $\lim_{t\to\infty} F_{\omega,t}(s) = F_\omega(s)$ at any continuity point $s$ of the limit function (in this case all $s \in \mathbb{R}$), and this is equivalent to weak convergence.

The proof of Ferrari and Spohn is through a direct mapping between TASEP and a particular last passage percolation problem. Such a mapping is not available except for the case of TASEP. So although one expects analogous results for general AEP in one dimension, different techniques will be required. Our main motivation here is to confirm, at least in part, the predicted universality (see Sect. 6 of [PS] for a nice description) by showing that these results for TASEP imply some bounds for general AEP. From (1.32) and (1.36) one expects
$$D^{TASEP}(t) \simeq c^{TASEP}\,\chi^{2/3}\, t^{1/3}, \tag{1.38}$$
where
$$c^{TASEP} = \int d\omega \int s^2\, dF_\omega(s) = 2\int d\omega \int ds\, F_{GUE}(s + \omega^2)\, g(s + \omega^2, \omega). \tag{1.39}$$
Here, and throughout this article, we will use the superscript $TASEP$ to denote the values taken by TASEP of quantities defined for general AEP. Unfortunately, the necessary estimates for the upper bound appear to be missing at this time. However, from the weak convergence we have immediately that

Corollary 1.
$$\liminf_{t\to\infty}\, t^{-1/3} D^{TASEP}(t) \ge c^{TASEP}\,\chi^{2/3}. \tag{1.40}$$
Remark. Another way to see the strict positivity of the left-hand side without computing $c^{TASEP}$ is that by Schwarz's inequality and (1.7),
$$D(t) \ge t^{-1}\Big(\chi^{-1}\sum_{x\in\mathbb{Z}} |x - (1-2\rho)t|\, S(x,t)\Big)^2. \tag{1.41}$$
We have
$$\sum_{x\in\mathbb{Z}} |x - (1-2\rho)t|\, S(x,t) = 2\,\mathrm{Var}(h_t((1-2\rho)t)) \tag{1.42}$$
(see Sect. 4) and from the weak convergence we have
$$\liminf_{t\to\infty}\, t^{-2/3}\,\mathrm{Var}(h_t((1-2\rho)t)) \ge \chi^{4/3}\int s^2\, dF_\omega(s).$$
Since by (1.6)
$$\sum_{x\in\mathbb{Z}} \big(|x - (1-2\rho)t| - |x - [(1-2\rho)t]|\big)\, S(x,t) \le \chi, \tag{1.43}$$
the positive lower bound on $\liminf_{t\to\infty} t^{-1/3} D^{TASEP}(t)$ follows.
The main result of the present article is a comparison between the diffusivity of AEP and that of TASEP:

Theorem 1. Let $D(t)$ be the diffusivity of a finite range exclusion process in $d = 1$ with non-zero drift. Let $D^{TASEP}(t)$ be the diffusivity of the totally asymmetric simple exclusion process. There exist $0 < K, C < \infty$ such that
$$C^{-1}\int_0^\infty e^{-\lambda K^{-1} t}\, t D^{TASEP}(t)\, dt \le \int_0^\infty e^{-\lambda t}\, t D(t)\, dt \le C\int_0^\infty e^{-\lambda K t}\, t D^{TASEP}(t)\, dt. \tag{1.44}$$

Combined with (1.40) this gives

Theorem 2. For any finite range exclusion process in $d = 1$ with non-zero drift, $D(t) \ge Ct^{1/3}$ in the weak (Tauberian) sense: There exists $C > 0$ such that
$$\int_0^\infty e^{-\lambda t}\, t D(t)\, dt \ge C\lambda^{-7/3}. \tag{1.45}$$
We now make some comments on obtaining strict versions of the estimates, as opposed to weak (Tauberian) versions. In [LY] it is shown that
$$t^{-1}\, E\Big[\int_0^t w(s)\, ds\, \sum_x \int_0^t \tau_x w(s)\, ds\Big] \le |||w|||^2_{-1,t^{-1}}, \tag{1.46}$$
and hence an upper weak (Tauberian) bound implies a strict upper bound in time on the diffusivity. There is no analogous fact for lower bounds. However, it is easy to show the following:

Proposition 1. Suppose that $v(t) \ge 0$ is a nondecreasing function and $\beta > 0$.
1. Suppose there exist $c_1 < \infty$ and $\lambda_0 > 0$ such that for $0 < \lambda < \lambda_0$,
$$\int_0^\infty e^{-\lambda t}\, v(t)\, dt \le c_1\lambda^{-(1+\beta)}, \tag{1.47}$$
then there exist $c_2 < \infty$ and $t_0$ such that for all $t > t_0$,
$$v(t) \le c_2 t^{\beta}. \tag{1.48}$$
2. Suppose $v(t) \le c_2 t^{\alpha}$ for some $\alpha \ge \beta$ and $t > t_0$, and for some $c_3 > 0$, for $0 < \lambda < \lambda_0$,
$$\int_0^\infty e^{-\lambda t}\, v(t)\, dt \ge c_3\lambda^{-(1+\beta)}. \tag{1.49}$$
Then there exist $c_4 > 0$ and $t_1 < \infty$ such that for $t > t_1$,
$$v(t) \ge \begin{cases} c_4 t^{\beta} & \text{if } \alpha = \beta;\\ c_4 t^{\beta} (\log t)^{-(1+\beta)} & \text{if } \alpha > \beta. \end{cases} \tag{1.50}$$
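To illustrate the correspondence of exponents in Proposition 1, one can take the model case $v(t) = t^{\beta}$, for which the Laplace transform is exactly $\Gamma(1+\beta)\lambda^{-(1+\beta)}$. The sketch below checks this numerically for $\beta = 4/3$, the exponent relevant to $v(t) = tD(t)$ in Theorem 2. The quadrature routine `laplace_transform`, its step count and its cutoff are illustrative assumptions, not part of the argument.

```python
import math


def laplace_transform(v, lam, n_steps=200_000, cutoff=60.0):
    """Midpoint-rule approximation of int_0^infty exp(-lam * t) v(t) dt.

    The integral is truncated at t = cutoff / lam, where exp(-lam * t)
    is already of size exp(-cutoff).
    """
    t_max = cutoff / lam
    h = t_max / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += math.exp(-lam * t) * v(t)
    return total * h


beta = 4.0 / 3.0


def v(t):
    return t ** beta


# Rescaled transforms: each should be close to Gamma(1 + beta), independently of lam.
ratios = [laplace_transform(v, lam) * lam ** (1 + beta) for lam in (1.0, 0.1, 0.01)]
```

That the rescaled values do not depend on $\lambda$ is exactly the statement that growth like $t^{4/3}$ in time corresponds to $\lambda^{-7/3}$ for the transform.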
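Proof. 1. Since $v$ is monotone nondecreasing we have, for $t > \lambda_0^{-1}$,
$$e^{-1} v(t) = \int_1^\infty e^{-s}\, v(t)\, ds \le \int_0^\infty e^{-s}\, v(ts)\, ds \le c_1 t^{\beta}, \tag{1.51}$$
where the last inequality is (1.47) with $\lambda = t^{-1}$ after the change of variables $u = ts$.

2. Because $v(t)$ is non-decreasing, $\int_0^t e^{-\lambda s}\, v(s)\, ds \le t\,v(t)$, and if $v(t) \le c_2 t^{\alpha}$ we have $\int_t^\infty e^{-\lambda s}\, v(s)\, ds \le c_2\lambda^{-1} e^{-\lambda t} t^{\alpha}$ for $t > t_1$. Hence
$$c_3\lambda^{-(1+\beta)} \le t\,v(t) + c_2\lambda^{-1} e^{-\lambda t} t^{\alpha}. \tag{1.52}$$
Choosing $\lambda = t^{-1}(1 + (\alpha - \beta)(\log t + c\log\log t))$ gives the result.

Note that the bound
$$\int_0^\infty e^{-\lambda t}\, t D(t)\, dt \le C\lambda^{-5/2} \tag{1.53}$$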
can be derived easily from the variational formula (1.27) (see the proof of Proposition 3 for a similar computation). Certainly one expects $tD(t)$ to be nondecreasing in general. We will show in Lemma 2 that
$$\partial_t(tD(t)) = \sum_z z^2 p(z) - 2\rho\sum_{z>0} z\,(p(z) - p(-z))\, E[\bar X(t) \mid \eta_z(0) = 1], \tag{1.54}$$
where
$$\bar X(t) = X(t) - (1-2\rho)bt. \tag{1.55}$$
What one expects is that $b\,E[\bar X(t) \mid \eta_z(0) = 1] \le 0$. If $p(z) \ge p(-z)$ for all $z > 0$ (or for all $z < 0$) this would imply that $tD(t)$ is increasing. We have only been able to prove this in the special case of the simple (nearest neighbor) exclusion (see Proposition 4). Hence for this class of AEP we can make the following statement:

Theorem 3. Let $D(t)$ be the diffusivity of a nearest neighbor ($p(z) = 0$, $|z| \ne 1$) asymmetric exclusion.
1. There exists $c_0 > 0$ such that
$$D(t) \ge c_0\, t^{1/3} (\log t)^{-7/3}. \tag{1.56}$$
2. Suppose that there exists $c_1 < \infty$ such that
$$D^{TASEP}(t) \le c_1 t^{1/3}. \tag{1.57}$$
Then there exists $c_2 < \infty$ such that
$$c_2^{-1} t^{1/3} \le D(t) \le c_2 t^{1/3}. \tag{1.58}$$
Remarks. 1. Note that in Theorems 1 and 2 we have not made any assumptions about the irreducibility of $p(\cdot)$. Let
$$\kappa = \gcd(y \in \mathbb{Z} : p(y) > 0). \tag{1.59}$$
If $\kappa > 1$ then our AEP is the same as $\kappa$ independent copies of the AEP with jump law $\tilde p(y) = p(\kappa y)$ on the sublattices $\kappa\mathbb{Z} + i$ ($i = 0, 1, \ldots, \kappa - 1$). Using this simple observation it is easy to extend all our proofs from $\kappa = 1$ to $\kappa > 1$, so we can assume without loss of generality in the proofs that $p(\cdot)$ is irreducible.
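The reduction to the irreducible case can be made concrete. In the sketch below a jump law is stored as a dictionary `{z: p(z)}` (a representation chosen here for illustration), `sublattice_period` computes $\kappa$ from (1.59), and `reduced_jump_law` returns $\tilde p(y) = p(\kappa y)$. The sample law is hypothetical.

```python
from functools import reduce
from math import gcd


def sublattice_period(p):
    """kappa = gcd of the support of the jump law p, given as {z: p(z)}."""
    support = [abs(z) for z, weight in p.items() if weight > 0]
    return reduce(gcd, support)


def reduced_jump_law(p):
    """Jump law p~(y) = p(kappa * y) seen by each of the kappa independent copies."""
    kappa = sublattice_period(p)
    return {z // kappa: weight for z, weight in p.items() if weight > 0}


p = {2: 0.5, -4: 0.2, 6: 0.3}   # hypothetical jump law supported on 2Z
kappa = sublattice_period(p)    # 2
p_tilde = reduced_jump_law(p)   # {1: 0.5, -2: 0.2, 3: 0.3}
```

Here the process splits into two independent copies living on the even and odd sites, each driven by the irreducible law `p_tilde`.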
2. Analogous methods to the ones presented here could in principle be applied to other functionals of AEP. For example, the variance of the occupation time of the origin,
$$\int_0^t \eta_s(0)\, ds, \tag{1.60}$$
is also expected to be $O(t^{4/3})$. In [B] a lower bound of the form $Ct^{5/4}$ is obtained. This variance is again given by the $H_{-1}$ norm of a certain function and direct comparisons between its value for TASEP and general AEP can be obtained in a straightforward way. Hence asymptotic order of growth bounds for this variance under TASEP would imply the same for AEP. Unfortunately, at the present time no such bounds are available, though it is plausible they could be derived from the machinery that has been developed for TASEP.

2. Comparison of $H_{-1}$ Norms

The first proposition adapts results of Sethuraman [S] to the present context.

Proposition 2. There exist $\alpha, \beta \in (0, \infty)$ depending only on $p(\cdot)$ such that
$$\alpha^{-1}\,|||\phi|||^{TASEP}_{-1,\beta^{-1}\lambda} \le |||\phi|||_{-1,\lambda} \le \alpha\,|||\phi|||^{TASEP}_{-1,\beta\lambda}. \tag{2.1}$$

Proof. We can also define $H_{-1}$ norms based on the standard inner product $\langle\cdot,\cdot\rangle$:
$$\|\phi\|_{-1,\lambda} = \langle \phi, (\lambda - L)^{-1}\phi\rangle^{1/2}. \tag{2.2}$$
From [S] we have that
$$\alpha^{-1}\,\|\phi\|^{TASEP}_{-1,\beta^{-1}\lambda} \le \|\phi\|_{-1,\lambda} \le \alpha\,\|\phi\|^{TASEP}_{-1,\beta\lambda}. \tag{2.3}$$
From the translation invariance of the generators,
$$\begin{aligned} |||\phi|||^2_{-1,\lambda} &= \sum_x \langle \tau_x\phi, (\lambda - L)^{-1}\phi\rangle\\ &= \lim_{n\to\infty}\frac{1}{2n}\Big\langle \sum_{x=-n}^{n}\tau_x\phi,\ (\lambda - L)^{-1}\sum_{x=-n}^{n}\tau_x\phi\Big\rangle\\ &= \lim_{n\to\infty}\frac{1}{2n}\Big\|\sum_{x=-n}^{n}\tau_x\phi\Big\|^2_{-1,\lambda}. \end{aligned}$$
The proposition follows.
Proposition 3. Let $w$ be the current corresponding to a general AEP as in (1.15) and $w^{TASEP}$ be the current for TASEP. Then there exists $C < \infty$ such that for $0 < \lambda < 1$,
$$|||w - b\,w^{TASEP}|||_{-1,\lambda} \le C. \tag{2.4}$$

Remarks. 1. In the theorem one can use either $L$ or $L^{TASEP}$ to define $|||\cdot|||_{-1,\lambda}$ since the results are equivalent.
2. This is similar to, but not the same as, results in [SX], because of the special norm $|||\cdot|||_{-1,\lambda}$.
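Proof. Since
$$w - b\,w^{TASEP} = \sum_x x\,p(x)\big(\hat\eta_{\{0,x\}} - \hat\eta_{\{0,1\}}\big),$$
it is enough to show that
$$|||\hat\eta_{\{0,1\}} - \hat\eta_{\{0,k+1\}}|||_{-1,\lambda} \le C \tag{2.5}$$
for each $k > 0$, where $C$ is a constant depending on $p(\cdot)$ and $k$, and $|||\cdot|||_{-1,\lambda}$ is defined using the generator $L^{TASEP}$. Call $V = \hat\eta_{\{0,1\}} - \hat\eta_{\{0,k+1\}}$. Dropping the third term in the variational formula (1.27) we have
$$|||V|||^2_{-1,\lambda} \le \langle\langle V, (\lambda - S)^{-1} V\rangle\rangle. \tag{2.6}$$
We now show that the right-hand side is bounded independent of $\lambda$. The computation is done using the fact that $S$ maps $H_2$ to itself. In particular, if $f, g \in H_2$ with $f = \sum_{x<y} f(x,y)\,\hat\eta_{\{x,y\}}$ and $g = \sum_{x<y} g(x,y)\,\hat\eta_{\{x,y\}}$, then
$$\langle\langle f, g\rangle\rangle = \sum_z \sum_{x<y} f(x+z, y+z)\, g(x,y) = \sum_{x=0}^{\infty} \bar f(x)\,\bar g(x), \tag{2.7}$$
where
$$\bar f(x) = \sum_y f(y, y+x+1), \tag{2.8}$$
and $S f = \sum_{x<y} \hat S f(x,y)\,\hat\eta_{\{x,y\}}$ with
$$\hat S f(x,y) = \frac{1}{2}\Big[ f(x,y+1) + f(x-1,y) - 2f(x,y) + 1_{\{y-x>1\}}\big(f(x,y-1) + f(x+1,y) - 2f(x,y)\big)\Big]. \tag{2.9}$$
Moreover
$$\overline{\hat S f}(x) = (S\bar f)(x) = \bar f(x+1) - \bar f(x) + 1_{\{x>0\}}\big(\bar f(x-1) - \bar f(x)\big). \tag{2.10}$$
Our $V$ belongs to $H_2$, so
$$(\lambda - S)^{-1} V = \sum_{x<y} h(x,y)\,\hat\eta_{\{x,y\}} \tag{2.11}$$
for some $h$. Then
$$\begin{aligned}\langle\langle V, (\lambda - S)^{-1} V\rangle\rangle &= \sum_x h(x, x+1) - \sum_x h(x, x+k+1)\\ &= \bar h(0) - \bar h(k) = (\lambda - S)^{-1}\bar V(0) - (\lambda - S)^{-1}\bar V(k),\end{aligned} \tag{2.12}$$
where $\bar V(x) = 1_{\{x=0\}} - 1_{\{x=k\}}$. An explicit computation shows that
$$q(x) := \frac{\gamma^x}{\lambda + 1 - \gamma} = (\lambda - S)^{-1} 1_{\{x=0\}}(x), \tag{2.13}$$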
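where $\gamma = \gamma(\lambda)$ is the solution of the equation
$$\lambda + 2 = \gamma^{-1} + \gamma \tag{2.14}$$
with $0 < \gamma < 1$. This is easy to check: if $x > 0$ then
$$\big((\lambda - S)q\big)(x) = \big(\lambda - (\gamma - 1) - (\gamma^{-1} - 1)\big)\frac{\gamma^x}{\lambda + 1 - \gamma} = 0$$
and
$$\big((\lambda - S)q\big)(0) = (\lambda - (\gamma - 1))\frac{1}{\lambda + 1 - \gamma} = 1.$$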
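The verification of (2.13) is easy to repeat numerically. The following sketch assumes the reduced operator of (2.10) acting on functions of $x \ge 0$ and checks that $q$ solves $(\lambda - S)q = 1_{\{x=0\}}$ up to rounding error. The function names and the value $\lambda = 0.05$ are illustrative choices.

```python
import math


def gamma_of(lam):
    """Root in (0, 1) of lam + 2 = 1/gamma + gamma, cf. (2.14)."""
    return (lam + 2 - math.sqrt(lam * (lam + 4))) / 2


def q(x, lam):
    """Candidate resolvent kernel from (2.13)."""
    g = gamma_of(lam)
    return g ** x / (lam + 1 - g)


def apply_S(f, x):
    """Reduced operator of (2.10), acting on functions of x = 0, 1, 2, ..."""
    out = f(x + 1) - f(x)
    if x > 0:
        out += f(x - 1) - f(x)
    return out


lam = 0.05
# Residual of (lam - S) q = 1_{x = 0} at the sites x = 0, ..., 49.
residuals = [
    lam * q(x, lam) - apply_S(lambda y: q(y, lam), x) - (1.0 if x == 0 else 0.0)
    for x in range(50)
]
max_residual = max(abs(r) for r in residuals)
```

The same helper also makes the small-$\lambda$ behaviour of $\gamma$ visible: `gamma_of(lam)` differs from $1 - \lambda^{1/2}$ by a term of order $\lambda$.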
A similar calculation shows that if $k > 0$ then one can find constants $c_1, c_2$ (depending on $k$ and $\lambda$) such that
$$(\lambda - S)^{-1} 1_{\{x=k\}}(x) = \begin{cases} \frac{1}{2}\big(c_1\gamma^{k-x} + c_2\gamma^{x-k}\big) & \text{if } 0 \le x < k,\\ \frac{1}{2}(c_1 + c_2)\,\gamma^{x-k} & \text{if } k \le x, \end{cases}$$
and that there is a $C < \infty$ such that
$$|c_i - \lambda^{-1/2}| \le C, \qquad i = 1, 2. \tag{2.15}$$
So
$$(\lambda - S)^{-1}\bar V(0) - (\lambda - S)^{-1}\bar V(k) = \frac{1 - \gamma^k}{\lambda + 1 - \gamma} + \frac{1}{2}\big(c_1(1 - \gamma^k) + c_2(1 - \gamma^{-k})\big). \tag{2.16}$$
Since $\gamma \simeq 1 - \lambda^{1/2}$ as $\lambda \to 0$, it is not hard to check that the right-hand side is bounded for $0 < \lambda < 1$.

3. Monotonicity of $tD(t)$

Let $X(t)$ be the position of a second class particle at time $t$ started at the origin and $\bar X(t) = X(t) - (1-2\rho)bt$.

Lemma 1. For any AEP,
$$(1-\rho)\,E[\bar X(t) \mid \eta_y(0) = 0] + \rho\,E[\bar X(t) \mid \eta_y(0) = 1] = 0 \tag{3.17}$$
and
$$E[\bar X(t) \mid \eta_y(0) = 1] = E[\bar X(t) \mid \eta_{-y}(0) = 1]. \tag{3.18}$$
Proof. Equation (3.17) is straightforward from $E[X(t)] = (1-2\rho)tb$. To prove (3.18) we first write the difference as
$$\begin{aligned}&\sum_x x\big(P(X(t) = x \mid \eta_y(0) = 1) - P(X(t) = x \mid \eta_{-y}(0) = 1)\big)\\ &\quad= \sum_x x\,P(X(t) = x \mid \eta_y(0) = 1) - \sum_x (x+y)\,P(X(t) = x \mid \eta_{-y}(0) = 1) + y.\end{aligned} \tag{3.19}$$
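We can write
$$P(X(t) = x \mid \eta_y(0) = 1) = E[\eta_x(t) \mid \eta_0(0) = 1, \eta_y(0) = 1] - E[\eta_x(t) \mid \eta_0(0) = 0, \eta_y(0) = 1], \tag{3.20}$$
and by the translation invariance
$$\begin{aligned} P(X(t) = x \mid \eta_{-y}(0) = 1) &= E[\eta_x(t) \mid \eta_0(0) = 1, \eta_{-y}(0) = 1] - E[\eta_x(t) \mid \eta_0(0) = 0, \eta_{-y}(0) = 1]\\ &= E[\eta_{x+y}(t) \mid \eta_0(0) = 1, \eta_y(0) = 1] - E[\eta_{x+y}(t) \mid \eta_0(0) = 1, \eta_y(0) = 0]. \end{aligned} \tag{3.21}$$
Substituting these into the previous equation we get
$$\begin{aligned} &E[\bar X(t) \mid \eta_y(0) = 1] - E[\bar X(t) \mid \eta_{-y}(0) = 1]\\ &\quad= \sum_x x\big\{E[\eta_x(t) \mid \eta_0(0) = 1, \eta_y(0) = 0] - E[\eta_x(t) \mid \eta_0(0) = 0, \eta_y(0) = 1]\big\} + y\\ &\quad= \chi^{-1}\sum_x x\,E[\eta_x(t)(\eta_0(0) - \eta_y(0))] + y\\ &\quad= 0 \end{aligned} \tag{3.22}$$
by (1.6).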
Lemma 2. For any AEP,
$$\partial_t(tD(t)) = \sum_z z^2 p(z) - 2\rho\sum_{z>0} z\,(p(z) - p(-z))\, E[\bar X(t) \mid \eta_z(0) = 1]. \tag{3.23}$$

Proof. We compute
$$\partial_t\sum_x x^2 S(x,t) = \sum_{x,z} x^2 p(z)\,\big\langle \eta_{x-z}(t)(1 - \eta_x(t)) - \eta_x(t)(1 - \eta_{x+z}(t)),\ \eta_0(0)\big\rangle. \tag{3.24}$$
Summing by parts, using the translation invariance, reversing space and time, we can rewrite (3.24) as
$$\sum_{x,z} (-2xz + z^2)\, p(-z)\,\big\langle \eta_0(0)(1 - \eta_z(0)),\ \eta_x(t) - \rho\big\rangle. \tag{3.25}$$
Again, by explicit computation $\langle \eta_0(1 - \eta_z)(0), \eta_x(t) - \rho\rangle$ is given by
$$\chi\big[\rho(1-\rho)(e_{11} - e_{01}) + (1-\rho)^2(e_{10} - e_{00}) - \rho\,(e_{11} - e_{10})\big], \tag{3.26}$$
where
$$e_{ij} = E[\eta_x(t) \mid \eta_0(0) = i, \eta_z(0) = j]. \tag{3.27}$$
Equation (3.26) can be rewritten in terms of the second class particle (see (3.21)) as
$$\chi\big((1-\rho)\,P(X(t) = x) - \rho\,P(X(t) = x - z \mid \eta_{-z}(0) = 1)\big). \tag{3.28}$$
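Substituting this into (3.25) and using (1.1), (1.6):
$$\begin{aligned} \partial_t\sum_x x^2 S(x,t) &= \chi(1-\rho)\sum_z \big(-2E[X(t)]\,z + z^2\big)\, p(-z)\\ &\quad - \chi\rho\sum_z \big(-2E[X(t) \mid \eta_{-z}(0) = 1]\,z - z^2\big)\, p(-z)\\ &= \chi\Big(\sum_z z^2 p(z) + 2b^2 t\,(1-2\rho)^2\Big) - 2\chi\rho\sum_z E[\bar X(t) \mid \eta_z(0) = 1]\, z\,p(z). \end{aligned} \tag{3.29}$$
Using (3.18) and the definition of $D(t)$ completes the proof.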
Proposition 4. Suppose that $p(z) = 0$ for $|z| \ne 1$ (nearest neighbor). Then $tD(t)$ is non-decreasing in $t$.

Proof. We can assume $p(1) \ge p(-1)$. In this case we will show
$$E[\bar X(t) \mid \eta_1(0) = 1] \le (1-\rho). \tag{3.30}$$
By the previous lemma, this implies
$$\partial_t(tD(t)) \ge (1 - 2\rho(1-\rho))\,p(1) + (1 + 2\rho(1-\rho))\,p(-1) \ge 0. \tag{3.31}$$
Consider a configuration where at site 1 we have a second class particle, at site 0 we have a third class particle and at all the other sites the distribution of particles is independent Bernoulli with probability $\rho$. The ordinary particles don't see the second or third class particles (i.e. they see them as empty sites) and the second class particle doesn't see the third class particle. Let the process evolve according to the AEP dynamics, and denote the position of the third and second class particle with $A(t)$ and $B(t)$, respectively. It is not hard to see that the law of $A(t)$ is the same as the law of $X(t)$ conditioned on the event $\{\eta_1(0) = 1\}$ and we have to prove
$$E[A(t)] \le (1-\rho) + (1-2\rho)bt. \tag{3.32}$$
Also, the law of $B(t)$ is the same as the law of $X(t) + 1$ conditioned on the event $\{\eta_{-1}(0) = 0\}$. By Lemma 1 we have $E[\rho A(t) + (1-\rho)B(t)] = (1-\rho) + (1-2\rho)bt$, thus it is enough to prove that
$$E[A(t)] \le E[B(t)]. \tag{3.33}$$
Define the variable $Z(t)$ in the following way: $Z(t) = 0$ if $A(t) < B(t)$ and $Z(t) = 1$ otherwise. Consider a possible joint trajectory $(x_1(t), x_2(t))$ for $\big(\min(A(t), B(t)), \max(A(t), B(t))\big)$. Conditioned on $\{(x_1(s), x_2(s)), 0 \le s \le t\}$, $Z(t)$ is a continuous time Markov process on $\{0, 1\}$ with rate $p(-1)1_{\{x_2(t) - x_1(t) = 1\}}$ for the transition $0 \to 1$ and $p(1)1_{\{x_2(t) - x_1(t) = 1\}}$ for the transition $1 \to 0$. This uses the fact that our process is nearest neighbor, and thus $Z(t)$ can change only if the second and third class particles switch places. We
can now calculate $P(Z(t) = 0 \mid (x_1(s), x_2(s)), 0 \le s \le t)$ explicitly. Let $T(t) = |\{s : x_2(s) - x_1(s) = 1,\ 0 \le s \le t\}|$ be the time spent by the two particles up to time $t$ with distance 1 between them, then (using $P(Z(0) = 0) = 1$)
$$P(Z(t) = 0 \mid (x_1(s), x_2(s)), 0 \le s \le t) = \frac{p(-1)\,e^{-T(t)(p(-1) + p(1))} + p(1)}{p(-1) + p(1)}. \tag{3.34}$$
Since $p(1) \ge p(-1)$, this is always at least 1/2. This means
$$E\big[A(t) \mid \{(x_1(s), x_2(s)), 0 \le s \le t\}\big] \le E\big[B(t) \mid \{(x_1(s), x_2(s)), 0 \le s \le t\}\big],$$
from which (3.33) and the proposition follow.
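Formula (3.34) is the standard transition probability of a two-state Markov chain run for the effective time $T(t)$. As a sanity check under that reading, the sketch below compares the closed form with a numerical solution of the forward equation $\frac{d}{dT}P_0 = -p(-1)P_0 + p(1)(1 - P_0)$, $P_0(0) = 1$. The rates 0.3 and 0.7 and the helper names are hypothetical.

```python
import math


def prob_state0_closed_form(T, a, b):
    """Right-hand side of (3.34) with a = p(-1), b = p(1) and T = T(t)."""
    return (a * math.exp(-T * (a + b)) + b) / (a + b)


def prob_state0_ode(T, a, b, n_steps=10_000):
    """Integrate dP0/dT = -a * P0 + b * (1 - P0), P0(0) = 1, by classical RK4."""
    def f(p0):
        return -a * p0 + b * (1.0 - p0)

    h = T / n_steps
    p0 = 1.0
    for _ in range(n_steps):
        k1 = f(p0)
        k2 = f(p0 + 0.5 * h * k1)
        k3 = f(p0 + 0.5 * h * k2)
        k4 = f(p0 + h * k3)
        p0 += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p0


a, b = 0.3, 0.7  # hypothetical rates with p(1) >= p(-1)
checks = [
    (T, prob_state0_closed_form(T, a, b), prob_state0_ode(T, a, b))
    for T in (0.1, 1.0, 5.0)
]
```

For these rates the probability decreases from 1 towards $p(1)/(p(-1)+p(1)) = 0.7$, so it stays above 1/2, in line with the remark following (3.34).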
4. Summation by Parts

In this section we will prove identities (1.32) and (1.42). They hold for general finite range exclusions, but we only need them in the case of TASEP so we will only give the proofs in that special case. Note $b = 1$ here. The identities are a simple consequence of (1.31) and summation by parts, once one knows the precise behaviour of $v(x,t)$ as $|x| \to \infty$. They are not new; see, for example, [FF] for a proof of (1.42). But we could not find a reference for (1.32), so we include the proof here. For $x \in \mathbb{Z}$, $t \ge 0$ denote by $N_t(x)$ the number of jumps from site $x$ to site $x+1$ up to time $t$.

Lemma 3.
$$v(x,t) = 4\chi|x| + 4\,\mathrm{Cov}(N_t(0), N_t(x)) - 4\,\mathrm{sgn}(x)\,\mathrm{Cov}\Big(N_t(0), \sum_{y=-|x|+1}^{|x|}\eta_y(t)\Big).$$

Proof. We will assume $x \ge 0$; the case $x < 0$ is analogous. Recalling the definition (1.29) of $M_t(x)$ and $N_t(x)$ we have
$$N_t(0) - N_t(x) = \frac{1}{2}(M_t(x) - M_0(x)). \tag{4.35}$$
It is easy to compute $\mathrm{Var}(M_t(x)) = 4\chi|x|$, and by the definition of $v(x,t)$ we have
$$v(x,t) = \mathrm{Var}(2N_t(0) - M_t(x)) = 4\,\mathrm{Var}(N_t(0)) + 4\chi x - 4\,\mathrm{Cov}(N_t(0), M_t(x)). \tag{4.36}$$
Using the identity (4.35) and the translation invariance we get
$$\begin{aligned} \mathrm{Cov}(N_t(0), M_t(x)) &= E[N_t(0)M_t(x)] - E[N_t(0)]\,E[M_t(x)]\\ &= E\big[\big(N_t(x) + \tfrac{1}{2}(M_t(x) - M_0(x))\big)M_t(x)\big] - E[N_t(0)]\,E[M_t(x)]\\ &= E\big[\big(\tfrac{1}{2}(M_t(x) - M_0(x))\big)^2\big] + E[N_t(x)M_t(x)] - E[N_t(x)]\,E[M_t(x)]\\ &= E[(N_t(x) - N_t(0))^2] + E[N_t(x)M_t(x)] - E[N_t(x)]\,E[M_t(x)]\\ &= 2\,\mathrm{Var}(N_t(0)) - 2\,\mathrm{Cov}(N_t(x), N_t(0)) + \mathrm{Cov}(N_t(x), M_t(x)). \end{aligned} \tag{4.37}$$
We will substitute this into (4.36) to get
$$v(x,t) = 4\chi x + 4\,\mathrm{Cov}(N_t(x), N_t(0)) - 2\,\mathrm{Cov}(N_t(0), M_t(x)) - 2\,\mathrm{Cov}(N_t(x), M_t(x)). \tag{4.38}$$
By translation invariance, and because of the sign convention in the definition (1.29) of $M_t(x)$,
$$\mathrm{Cov}(N_t(x), M_t(x)) = -\mathrm{Cov}(N_t(0), M_t(-x)), \tag{4.39}$$
and the lemma follows.
Lemma 4. For each $t > 0$, there exist $C_1 < \infty$ and $C_2 > 0$ such that
$$\mathrm{Cov}(N_t(0), \eta_x(t)) \le C_1\exp\{-C_2|x|\}, \qquad \mathrm{Cov}(N_t(0), N_t(x)) \le C_1\exp\{-C_2|x|\}. \tag{4.40}$$

Proof. The lemma is standard, but we could not find an exact reference, so for completeness, we give a sketch of the proof. Consider two copies $(\eta(t), \tilde\eta(t))$ of TASEP, coupled as in the preamble to (1.8), starting with initial data $(\eta_y,\ \tilde\eta_y = \eta_y 1_{\{y \in [-x/3, x/3] \cup [2x/3, 4x/3]\}})$, $y \in \mathbb{Z}$, where $\eta$ is distributed according to $\pi_\rho$. Discrepancies perform nearest neighbor random walks, and the rate of jumping left or right is always at most 1. Let
$$A = \{\eta_0(s) = \tilde\eta_0(s) \text{ and } \eta_x(s) = \tilde\eta_x(s) \text{ for all } s \in [0,t]\}. \tag{4.41}$$
$A^C$ is contained in the event that an initial discrepancy reaches 0 or $x$ during the time interval $[0,t]$. Because of the preservation of order, there are just 4 candidates and hence $P(A^C) \le 4P(\mathrm{Poisson}(t) > x/3)$, which is exponentially small in $x$. On $A$, $N_t(0) = \tilde N_t(0)$ and $N_t(x) = \tilde N_t(x)$, where $\tilde N_t(\cdot)$ are the currents in $\tilde\eta(t)$. Both $N_t(x)$ and $\tilde N_t(x)$ are stochastically dominated by $\mathrm{Poisson}(t)$ random variables and hence, for any fixed $t$, their moments are bounded. Breaking up the respective expectations onto $A$ and $A^C$ and applying Schwarz's inequality we see that both $|\mathrm{Cov}(N_t(0), \eta_x(t)) - \mathrm{Cov}(\tilde N_t(0), \tilde\eta_x(t))|$ and $|\mathrm{Cov}(N_t(0), N_t(x)) - \mathrm{Cov}(\tilde N_t(0), \tilde N_t(x))|$ are exponentially small in $x$. Hence it suffices to prove the lemma for the second process. Consider a third process $\bar\eta$ with the same initial conditions as $\tilde\eta$, but disallowing jumps between $[2x/3]$ and $[2x/3] - 1$. Using the same argument as above, it is enough to prove the lemma for $\bar\eta$. Now $\bar N_t(0)$ and $\bar\eta_x(t)$ (and $\bar N_t(0)$ and $\bar N_t(x)$) are independent, so the covariances vanish.

Once one has Lemma 4, it follows from Lemma 3 that for each fixed $t \ge 0$,
$$\big|v(x,t) - 4\chi|x|\big| \le C_3\exp\{-C_4|x|\} \tag{4.42}$$
for some $C_3 < \infty$ and $C_4 > 0$. Now (1.32) and (1.42) follow by taking partial summations, applying (1.31), summing by parts, and noting that the boundary terms are exponentially small from (4.42).

Acknowledgement. The authors would like to thank the referee for pointing out an error in an earlier version of the manuscript.
References

[B] Bernardin, C.: Fluctuations in the occupation time of a site in the asymmetric simple exclusion process. Ann. Probab. 32(1B), 855–879 (2004)
[BG] Bertini, L., Giacomin, G.: Stochastic Burgers and KPZ equations from particle systems. Commun. Math. Phys. 183(3), 571–607 (1997)
[BKS] van Beijeren, H., Kutner, R., Spohn, H.: Excess noise for driven diffusive systems. Phys. Rev. Lett. 54, 2026–2029 (1985)
[BM] Bramson, M., Mountford, T.: Stationary blocking measures for one-dimensional nonzero mean exclusion processes. Ann. Probab. 30(3), 1082–1130 (2002)
[FF] Ferrari, P.A., Fontes, L.R.G.: Current fluctuations for the asymmetric simple exclusion process. Ann. Probab. 22(2), 820–832 (1994)
[FS] Ferrari, P.L., Spohn, H.: Scaling limit for the space-time covariance of the stationary totally asymmetric simple exclusion process. Commun. Math. Phys. 265(1), 1–44 (2006)
[FNS] Forster, D., Nelson, D., Stephen, M.J.: Large-distance and long time properties of a randomly stirred fluid. Phys. Rev. A 16, 732–749 (1977)
[KPZ] Kardar, M., Parisi, G., Zhang, Y.-C.: Dynamic scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
[L] Liggett, T.M.: Interacting particle systems. Grundlehren der Mathematischen Wissenschaften 276. New York: Springer-Verlag, 1985
[LOY] Landim, C., Olla, S., Yau, H.-T.: Some properties of the diffusion coefficient for asymmetric simple exclusion processes. Ann. Probab. 24(4), 1779–1808 (1996)
[LQSY] Landim, C., Quastel, J., Salmhofer, M., Yau, H.-T.: Superdiffusivity of asymmetric exclusion process in dimensions one and two. Commun. Math. Phys. 244(3), 455–481 (2004)
[LY] Landim, C., Yau, H.-T.: Fluctuation-dissipation equation of asymmetric simple exclusion processes. Probab. Theory Related Fields 108(3), 321–356 (1997)
[PS] Prähofer, M., Spohn, H.: Current fluctuations for the totally asymmetric simple exclusion process. In: In and out of equilibrium (Mambucaba, 2000), Progr. Probab. 51, Boston, MA: Birkhäuser Boston, 2002, pp. 185–204
[S] Sethuraman, S.: An equivalence of $H_{-1}$ norms for the simple exclusion process. Ann. Probab. 31(1), 35–62 (2003)
[SX] Sethuraman, S., Xu, L.: A central limit theorem for reversible exclusion and zero-range particle systems. Ann. Probab. 24(4), 1842–1870 (1996)
[V] Varadhan, S.R.S.: Lectures on hydrodynamic scaling. In: Hydrodynamic limits and related topics (Toronto, ON, 1998), Fields Inst. Commun. 27, Providence, RI: Amer. Math. Soc., 2000, pp. 3–40
[Y] Yau, H.-T.: $(\log t)^{2/3}$ law of the two dimensional asymmetric simple exclusion process. Ann. of Math. (2) 159(1), 377–405 (2004)

Communicated by H.-T. Yau
Commun. Math. Phys. 273, 395–414 (2007)
Digital Object Identifier (DOI) 10.1007/s00220-007-0250-2
Communications in Mathematical Physics

An Algebra of Deformation Quantization for Star-Exponentials on Complex Symplectic Manifolds

Giuseppe Dito (1), Pierre Schapira (2)
1 Institut de Mathématiques de Bourgogne, Université de Bourgogne, B.P. 47870, 21078 Dijon Cedex, France. E-mail: [email protected]
2 Institut de Mathématiques, Université Pierre et Marie Curie, 175, rue du Chevaleret, 75013 Paris, France. E-mail: [email protected]

Received: 10 July 2006 / Accepted: 4 January 2007
Published online: 4 May 2007 – © Springer-Verlag 2007

Abstract: The cotangent bundle $T^*X$ to a complex manifold $X$ is classically endowed with the sheaf of $\mathbf{k}$-algebras $\mathcal{W}_{T^*X}$ of deformation quantization, where $\mathbf{k} := \mathcal{W}_{\{\mathrm{pt}\}}$ is a subfield of $\mathbb{C}[[\hbar, \hbar^{-1}]$. Here, we construct a new sheaf of $\mathbf{k}$-algebras $\mathcal{W}^t_{T^*X}$ which contains $\mathcal{W}_{T^*X}$ as a subalgebra and an extra central parameter $t$. We give the symbol calculus for this algebra and prove that quantized symplectic transformations operate on it. If $P$ is any section of order zero of $\mathcal{W}_{T^*X}$, we show that $\exp(t\hbar^{-1}P)$ is well defined in $\mathcal{W}^t_{T^*X}$.

Introduction

The cotangent bundle $T^*X$ to a complex manifold $X$ is endowed with the sheaf of filtered $\mathbb{C}$-algebras $\mathcal{E}_{T^*X}$ constructed functorially by Sato-Kashiwara-Kawai in [9] and called the sheaf of microdifferential operators. This sheaf is conic and is associated with the homogeneous symplectic structure of $T^*X$. Another no more conic sheaf of filtered algebras on $T^*X$, denoted here by $\widehat{\mathcal{W}}_{T^*X}$ and defined over $\mathbb{C}[[\hbar, \hbar^{-1}]$, has been constructed in the framework of formal associative deformations by many authors after [1]. (This construction has been extended to Poisson manifolds in [7].) Its analytic counterpart $\mathcal{W}_{T^*X}$ is constructed in [8]. The sheaf $\mathcal{W}_{T^*X}$ is similar to the sheaf $\mathcal{E}_{T^*X}$ of microdifferential operators of [9], but with an extra central parameter $\hbar$, a substitute to the lack of homogeneity (footnote 1). Here $\hbar$ belongs to the field $\mathbf{k} := \mathcal{W}_{\{\mathrm{pt}\}}$, a subfield of $\mathbb{C}[[\hbar, \hbar^{-1}]$. (Note that the notation $\tau = \hbar^{-1}$ is used in [8].) When $X$ is affine and one denotes by $(x; u)$ a point of $T^*X$, a section $P$ of this sheaf on an open subset $U \subset T^*X$ is represented by its total symbol $\sigma_{tot}(P) = \sum_{-\infty < j \le m} p_j(x; u)\,\hbar^{-j}$, with $m \in \mathbb{Z}$, $p_j \in \mathcal{O}_{T^*X}(U)$, the $p_j$'s satisfying suitable inequalities and the product, denoted here by $\star$, being given by the Leibniz formula.

Footnote 1: In this paper, we write $\mathcal{E}_{T^*X}$ and $\mathcal{W}_{T^*X}$ instead of the classical notations $\mathcal{E}_X$ and $\mathcal{W}_X$.
396
G. Dito, P. Schapira
A fundamental tool for spectral analysis in deformation quantization is the star-exponential of the Hamiltonian $H$ (see [1]):
$$\exp_\star(t\hbar^{-1}H) = \sum_{n \ge 0} \frac{(t\hbar^{-1}H)^{\star n}}{n!}.$$
However, at the formal level, the star-exponential does not make sense as a formal series in $\hbar$ and $\hbar^{-1}$. The goal of this article is to construct a new sheaf of algebras on the cotangent bundle $T^*X$ to a complex manifold $X$ in which the star-exponential has a meaning and such that quantized symplectic transformations operate on such algebras. More precisely, we construct a new sheaf of $\mathbf{k}$-algebras $\mathcal{W}^t_{T^*X}$, with an extra central holomorphic parameter $t$ defined in a neighborhood of $t = 0$, with the property that complex symplectic transformations may be locally quantized as isomorphisms of algebras and there are natural morphisms of $\mathbf{k}$-algebras
$$\mathcal{W}_{T^*X} \xrightarrow{\ \iota\ } \mathcal{W}^t_{T^*X} \xrightarrow{\ \mathrm{res}\ } \mathcal{W}_{T^*X}$$
whose composition is the identity on $\mathcal{W}_{T^*X}$. We give the symbol calculus on $\mathcal{W}^t_{T^*X}$, which extends naturally that of $\mathcal{W}_{T^*X}$ (however, now we get series in $\hbar^{j}$ with $-\infty < j < \infty$), and finally we show that, if $P$ is a section of $\mathcal{W}_{T^*X}$ of order 0, then $\exp(t\hbar^{-1}P)$ is well defined in $\mathcal{W}^t_{T^*X}$. We also briefly discuss the case where $T^*X$ is replaced with a general symplectic manifold.

Our construction is as follows. First, we add a central holomorphic parameter $s \in \mathbb{C}$ and consider the sheaf $\mathcal{W}_{\mathbb{C} \times T^*X}$, the subsheaf of $\mathcal{W}_{T^*(\mathbb{C} \times X)}$ consisting of sections not depending on $\partial_s$. Denoting by $a : \mathbb{C} \times T^*X \to T^*X$ the projection, we first define an algebra $\mathcal{W}^s_{T^*X} := R^1 a_! \mathcal{W}_{\mathbb{C} \times T^*X}$. The algebra structure with respect to the $s$-variable is given by convolution, as in the case of the space $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$. In order to replace this convolution product by a usual product, we define the sheaf $\mathcal{W}^t_{T^*X}$ as the "formal" Laplace transform with respect to the variable $s\hbar^{-1}$ of the algebra $\mathcal{W}^s_{T^*X}$.

In a deformation quantization context, the existence of $\exp(t\hbar^{-1}P)$ in $\mathcal{W}^t_{T^*X}$ gives a precise meaning to the star-exponential [1] of $P$, which is heuristically related to the Feynman Path Integral of $P$.

1. Symbols

The fields $\widehat{\mathbf{k}}$ and $\mathbf{k}$. We set $\widehat{\mathbf{k}} := \mathbb{C}[[\hbar, \hbar^{-1}]$. Hence, an element $a \in \widehat{\mathbf{k}}$ is a series
$$a = \sum_{-\infty < j \le m} a_j \hbar^{-j}, \qquad a_j \in \mathbb{C},\ m \in \mathbb{Z}.$$
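Consider the following condition on $a$:
$$\text{there exist positive constants } C, \varepsilon \text{ such that } |a_j| \le C \varepsilon^{-j} (-j)! \text{ for all } j < 0. \tag{1.1}$$
We denote by $\mathbf{k}$ the subfield of $\widehat{\mathbf{k}}$ consisting of series satisfying (1.1).

Convention. We endow $\widehat{\mathbf{k}}$, hence $\mathbf{k}$, with the filtration associated to
$$\mathrm{ord}(\hbar) = -1. \tag{1.2}$$
The fields $\widehat{\mathbf{k}}$ and $\mathbf{k}$ are $\mathbb{Z}$-filtered² and contain the subrings $\widehat{\mathbf{k}}(0)$ and $\mathbf{k}(0)$, respectively. Note that $\widehat{\mathbf{k}}(0) = \mathbb{C}[[\hbar]]$ and $\mathbf{k}(0) = \mathbf{k} \cap \widehat{\mathbf{k}}(0)$.

Remark 1.1. $\widehat{\mathbf{k}}$ is flat over $\widehat{\mathbf{k}}(0)$ and $\mathbf{k}$ is flat over $\mathbf{k}(0)$.

² In the sequel, we shall say "filtered" instead of "$\mathbb{Z}$-filtered".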
Algebra of Deformation Quantization for Star-Exponentials
397
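The sheaves $\widehat{\mathcal{O}}^\hbar_X$ and $\mathcal{O}^\hbar_X$. Let $(X, \mathcal{O}_X)$ be a complex manifold.

Definition 1.2. (i) We denote by $\widehat{\mathcal{O}}^\hbar_X$ the sheaf $\mathcal{O}_X[[\hbar, \hbar^{-1}]$. In other words, $\widehat{\mathcal{O}}^\hbar_X$ is the filtered $\widehat{\mathbf{k}}$-algebra defined as follows: a section $f(x, \hbar)$ of $\widehat{\mathcal{O}}^\hbar_X$ of order $\le m$ ($m \in \mathbb{Z}$) on an open set $U$ of $X$ is a series
$$f(x, \hbar) = \sum_{-\infty < j \le m} f_j(x)\,\hbar^{-j}, \tag{1.3}$$
with $f_j \in \mathcal{O}_X(U)$.
(ii) We denote by $\mathcal{O}^\hbar_X$ the filtered $\mathbf{k}$-subalgebra of $\widehat{\mathcal{O}}^\hbar_X$ consisting of sections $f(x, \hbar)$ as above satisfying:
$$\text{for any compact subset } K \text{ of } U \text{ there exist positive constants } C, \varepsilon \text{ such that } \sup_K |f_j| \le C \varepsilon^{-j} (-j)! \text{ for all } j < 0. \tag{1.4}$$
Note that
$$\widehat{\mathcal{O}}^\hbar_X \simeq \widehat{\mathcal{O}}^\hbar_X(0) \otimes_{\widehat{\mathbf{k}}(0)} \widehat{\mathbf{k}}, \qquad \mathcal{O}^\hbar_X \simeq \mathcal{O}^\hbar_X(0) \otimes_{\mathbf{k}(0)} \mathbf{k}. \tag{1.5}$$
(To be correct, we should have written $\mathbf{k}_X$, the constant sheaf with values in $\mathbf{k}$, instead of $\mathbf{k}$ in these formulas, and similarly for $\widehat{\mathbf{k}}(0)$, $\mathbf{k}(0)$ and $\widehat{\mathbf{k}}$.) Also note that there exist isomorphisms of sheaves (not of algebras)
$$\widehat{\mathcal{O}}^\hbar_X(0) \simeq \mathcal{O}_{X \times \mathbb{C}}\,\widehat{|}_{X \times \{0\}}, \tag{1.6}$$
$$\mathcal{O}^\hbar_X(0) \simeq \mathcal{O}_{X \times \mathbb{C}}|_{X \times \{0\}}, \tag{1.7}$$
where $\mathcal{O}_{X \times \mathbb{C}}\,\widehat{|}_{X \times \{0\}}$ is the formal completion of $\mathcal{O}_{X \times \mathbb{C}}$ along the hypersurface $X \times \{0\}$ of $X \times \mathbb{C}$ and $\mathcal{O}_{X \times \mathbb{C}}|_{X \times \{0\}}$ is the restriction of $\mathcal{O}_{X \times \mathbb{C}}$ to $X \times \{0\}$. Denoting by $t$ the coordinate on $\mathbb{C}$, the isomorphism (1.7) is given by the map
$$\mathcal{O}^\hbar_X(0) \ni \sum_{j \le 0} f_j \hbar^{-j} \ \mapsto\ \sum_{j \ge 0} f_{-j} \frac{t^j}{j!} \in \mathcal{O}_{X \times \mathbb{C}}|_{X \times \{0\}}.$$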
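The role of the growth condition in the map realizing (1.7) can be illustrated numerically. The following Python sketch is only an illustration and not part of the original construction: it assumes the constant coefficients $a_j = (-j)!$ for $j \le 0$, which satisfy (1.1) with $C = \varepsilon = 1$, and checks that the image series $\sum_{j \ge 0} a_{-j} t^j / j! = \sum_{j \ge 0} t^j$ converges to $1/(1-t)$ for a sample $|t| < 1$. The helper name `borel_image_partial_sum` is hypothetical.

```python
from math import factorial

def borel_image_partial_sum(a, t, terms):
    """Partial sum of sum_{j>=0} a(-j) * t**j / j!,
    the image of sum_{j<=0} a(j) * hbar**(-j) under the map given for (1.7)."""
    return sum(a(-j) * t**j / factorial(j) for j in range(terms))

# Assumed sample coefficients a_j = (-j)! for j <= 0.
# They satisfy |a_j| <= C * eps**(-j) * (-j)! with C = eps = 1.
a = lambda j: factorial(-j)

t = 0.3
approx = borel_image_partial_sum(a, t, 60)
exact = 1.0 / (1.0 - t)   # the image series is the geometric series
print(abs(approx - exact))
```

The factorial growth allowed by (1.1) is exactly compensated by the division by $j!$, which is why the image is a convergent germ at $t = 0$.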
The convolution algebra $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$. The results of this subsection are well known and elementary. We recall them for the reader's convenience. We consider the complex line $\mathbb{C}$ endowed with a holomorphic coordinate $s$. Using this coordinate, we identify the sheaf $\mathcal{O}_{\mathbb{C}}$ of holomorphic functions on $\mathbb{C}$ and the sheaf $\Omega_{\mathbb{C}}$ of holomorphic forms on $\mathbb{C}$. The space $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$ is endowed with a structure of an algebra by
$$H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}}) \times H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}}) \to H^2_c(\mathbb{C}^2; \mathcal{O}_{\mathbb{C}^2}) \to H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}}),$$
where the first arrow is the cup product and the second arrow is the integration along the fibers of the map $\mathbb{C}^2 \to \mathbb{C}$, $(s, s') \mapsto s + s'$. When representing the cohomology classes by holomorphic functions, the convolution product is described as follows.
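For a compact subset $K$ of $\mathbb{C}$, we identify the vector space $H^1_K(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$ with the quotient space $\Gamma(\mathbb{C} \setminus K; \mathcal{O}_{\mathbb{C}}) / \Gamma(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$ and, if $f \in \Gamma(\mathbb{C} \setminus K; \mathcal{O}_{\mathbb{C}})$, we still denote by $f$ its image in $H^1_K(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$ or in $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$. Let $K$ and $L$ be compact subsets of $\mathbb{C}$, let $f \in \Gamma(\mathbb{C} \setminus K; \mathcal{O}_{\mathbb{C}})$ and $g \in \Gamma(\mathbb{C} \setminus L; \mathcal{O}_{\mathbb{C}})$. The convolution product $f * g$ is given by
$$f * g(z) = \frac{1}{2i\pi} \int_\gamma f(z - w)\, g(w)\, dw, \tag{1.8}$$
where $\gamma$ is a counter-clockwise oriented circle which contains $L$ and $|z|$ is chosen big enough so that $z + K$ is outside of the disc bounded by $\gamma$. It is an easy exercise to show that this definition does not depend on the representatives $f$ and $g$, and that interchanging the roles of $f$ and $g$ in formula (1.8) modifies the result by a function defined on all of $\mathbb{C}$, hence gives the same result in $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$. Therefore, we obtain a commutative algebra structure on $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$.

Example 1.3.
$$\frac{1}{z^{n+1}} * \frac{1}{z^{m+1}} = \frac{(n + m)!}{n!\,m!}\, \frac{1}{z^{n+m+1}}.$$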
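Example 1.3 can be checked numerically by evaluating the contour integral (1.8) directly. The sketch below is an illustration only: it assumes the sample exponents $n = 2$, $m = 3$, a unit circle for $\gamma$ and a sample point $z$ with $|z| > 1$, and approximates the integral with the trapezoidal rule on the circle. The function name `convolve_at` is hypothetical.

```python
import cmath
from math import comb, pi

def convolve_at(f, g, z, radius, samples=4000):
    """Approximate (1/(2*pi*i)) * integral of f(z - w) * g(w) dw over |w| = radius,
    oriented counter-clockwise, using the trapezoidal rule."""
    total = 0j
    for k in range(samples):
        theta = 2 * pi * k / samples
        w = radius * cmath.exp(1j * theta)
        dw = 1j * w * (2 * pi / samples)      # dw = i * w * dtheta on the circle
        total += f(z - w) * g(w) * dw
    return total / (2j * pi)

n, m = 2, 3
f = lambda s: s ** (-(n + 1))   # singular only at 0, so K = {0}
g = lambda s: s ** (-(m + 1))   # singular only at 0, so L = {0}

z = 5.0 + 1.0j                  # |z| is larger than the radius of gamma
value = convolve_at(f, g, z, radius=1.0)
expected = comb(n + m, n) / z ** (n + m + 1)
print(abs(value - expected))
```

The integrand is holomorphic in $w$ on an annulus around the circle, so the trapezoidal rule converges very fast here; the binomial factor is the residue at $w = 0$.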
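The sheaf $\mathcal{O}^{s,\hbar}_X$. From now on, we shall concentrate our study on $\mathcal{O}^\hbar_X$.

Notation 1.4. We shall often denote by $\mathbb{C}_s$ the complex line $\mathbb{C}$ endowed with the coordinate $s$.

Lemma 1.5. Let $Y$ be a complex manifold and $Z$ a Stein submanifold of $Y$. Then $H^j(Z; \mathcal{O}^\hbar_Y(0)|_Z)$ vanishes for $j \ne 0$.

Proof. Using the isomorphism (1.7), we may replace the sheaf $\mathcal{O}^\hbar_Y(0)$ with the sheaf $\mathcal{O}_{Y \times \mathbb{C}_t}|_{t=0}$. By a theorem of Siu [11], $Z \times \{0\}$ admits a fundamental system of open Stein neighborhoods in $Y \times \mathbb{C}_t$ and the result follows.

Let $X$ be a complex manifold. The manifold $\mathbb{C}_s \times X$ is thus endowed with the $\mathbf{k}$-filtered sheaf $\mathcal{O}^\hbar_{\mathbb{C}_s \times X}$. Let $a : \mathbb{C}_s \times X \to X$ denote the projection.

Lemma 1.6. (i) One has the isomorphism $R^j a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X} \simeq R^j a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0) \otimes_{\mathbf{k}(0)} \mathbf{k}$.
(ii) $R^j a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0) \simeq 0$ for $j \ne 1$.
(iii) Let $U \subset\subset V \subset\subset W$ be three open subsets of $X$ and assume that $W$ is Stein. Then the natural morphism $\Gamma(W; R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}) \to \Gamma(U; R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X})$ factorizes through
$$\varinjlim_{K \subset \mathbb{C}_s} \Gamma((\mathbb{C}_s \setminus K) \times V; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}) / \Gamma(\mathbb{C}_s \times V; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}),$$
where $K$ ranges over the family of compact subsets of $\mathbb{C}$.

Proof. (i) follows from the projection formula for sheaves (i.e., $R a_!(F \overset{L}{\otimes} a^{-1} G) \simeq R a_! F \overset{L}{\otimes} G$) and (1.5), since $\mathbf{k}$ is flat over $\mathbf{k}(0)$.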
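(ii) For $x \in X$, we have
$$H^j(R a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0))_x \simeq \varinjlim_K H^j_K(\mathbb{C}_s \times \{x\}; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)|_{\mathbb{C}_s \times \{x\}}).$$
Applying the distinguished triangle of functors
$$R\Gamma_K(\mathbb{C}_s \times \{x\}; \,\cdot\,) \to R\Gamma(\mathbb{C}_s \times \{x\}; \,\cdot\,) \to R\Gamma((\mathbb{C}_s \setminus K) \times \{x\}; \,\cdot\,) \xrightarrow{+1}$$
to the sheaf $\mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)|_{\mathbb{C}_s \times \{x\}}$ we get the result by Lemma 1.5 for $j > 1$, and the case $j = 0$ follows from the principle of analytic continuation.

(iii) Recall first that if $W$ is a Stein manifold and if $W_1 \subset\subset W$ is open, there exists a Stein open subset $W_2$ of $W$ with $W_1 \subset\subset W_2 \subset\subset W$. For a compact subset $L$ of $X$,
$$\Gamma(L; R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}) \simeq \Gamma(L; R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)) \otimes_{\mathbf{k}(0)} \mathbf{k}.$$
Hence, it is enough to prove the result for $\mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)$.

By Lemma 1.5, $H^j(D \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0))$ vanishes for $D$ open in $\mathbb{C}_s$, $U$ Stein open in $X$ and $j \ne 0$. Therefore, $H^j_{K \times U}(\mathbb{C}_s \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0))$ vanishes for $j \ne 1$ and we get the exact sequence:
$$0 \to \Gamma(\mathbb{C}_s \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)) \to \Gamma((\mathbb{C}_s \setminus K) \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)) \to H^1_{K \times U}(\mathbb{C}_s \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0)) \to 0.$$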
Definition 1.7. We set $\mathcal{O}^{s,\hbar}_X := R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}$.

Clearly, $\mathcal{O}^{s,\hbar}_X$ is a sheaf of filtered $\mathbf{k}$-modules. By Lemma 1.6, a section $f(s, x, \hbar)$ of order $m$ of the sheaf $\mathcal{O}^{s,\hbar}_X$ on a Stein open subset $W$ of $X$ may be written on any relatively compact open subset $U$ of $W$ as a series
$$f(s, x, \hbar) = \sum_{-\infty < j \le m} f_j(s, x)\,\hbar^{-j},$$
where $f_j(s, x)$ is a holomorphic function on $(\mathbb{C}_s \setminus K_0) \times U$ for a compact set $K_0$ not depending on $j$ and the $f_j$'s satisfy an estimate (1.4) on each compact subset $K$ of $(\mathbb{C}_s \setminus K_0) \times U$.

We shall extend the product (1.8) to $\mathcal{O}^{s,\hbar}_X$ as follows. For two sections $f(s, x, \hbar) = \sum_{-\infty < j \le m} f_j(s, x)\,\hbar^{-j}$ and $g(s, x, \hbar) = \sum_{-\infty < j \le m'} g_j(s, x)\,\hbar^{-j}$ of $\mathcal{O}^{s,\hbar}_X$, we set:
$$f(s, x, \hbar) * g(s, x, \hbar) = \sum_{-\infty < k \le m + m'} h_k(s, x)\,\hbar^{-k}, \qquad h_k(s, x) = \sum_{i + j = k} \frac{1}{2i\pi} \int_\gamma f_i(s - w, x)\, g_j(w, x)\, dw. \tag{1.9}$$

Proposition 1.8. The sheaf $\mathcal{O}^{s,\hbar}_X$ has a structure of a filtered commutative $\mathbf{k}$-algebra.
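Proof. It is easily checked that multiplication by $\hbar^{-1}$ induces an isomorphism of sheaves of $\mathbf{k}$-modules $\mathcal{O}^{s,\hbar}_X(m) \xrightarrow{\sim} \mathcal{O}^{s,\hbar}_X(m + 1)$. Hence we just need to check that the product of two sections of order 0 is a section of order 0. Let $f(s, x, \hbar) = \sum_{-\infty < j \le 0} f_j(s, x)\,\hbar^{-j}$ [...] $R$ big enough so that $s + K_0$ does not meet $\gamma$. Then for $w \in \gamma$ and $x \in K \cap (\mathbb{C}_s \setminus K_0) \times U$, we have:
$$\sum_{i + j = k,\ i, j \le 0} |f_i(s - w, x)\, g_j(w, x)| \le C^2 (-k)! \sum_{i + j = k,\ i, j \le 0} \varepsilon^{-i-j}\, \frac{(-i)!\,(-j)!}{(-k)!} \le 3 C^2 \varepsilon^{-k} (-k)!.$$
Hence $h(s, x, \hbar) = \sum_{-\infty < k \le 0} h_k(s, x)\,\hbar^{-k}$ defined by (1.9) is in $\mathcal{O}^{s,\hbar}_X(0)$.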
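The constant 3 in the last estimate comes from the elementary bound $\sum_{a + b = n} a!\,b!/n! = \sum_{a=0}^{n} \binom{n}{a}^{-1} \le 3$. The following Python sketch checks this with exact rational arithmetic over an assumed finite range of $n$; it is a sanity check, not a proof, and the function name `reciprocal_binomial_sum` is hypothetical.

```python
from fractions import Fraction
from math import comb

def reciprocal_binomial_sum(n):
    """Exact value of sum_{a+b=n} a! b! / n! = sum_{a=0}^{n} 1 / C(n, a)."""
    return sum(Fraction(1, comb(n, a)) for a in range(n + 1))

# Assumed test range 0 <= n < 200.
sums = [reciprocal_binomial_sum(n) for n in range(200)]
largest = max(sums)          # attained at n = 3 and n = 4
print(largest, all(s < 3 for s in sums))
```

On this range the largest value is $8/3$, so the factor 3 used in the proof is not sharp but is sufficient.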
The Laplace transform and the algebra $\mathcal{O}^{t,\hbar}_X$. In order to replace the convolution product in the $s$-variable with the ordinary product, we shall apply a kind of Laplace transform to $\mathcal{O}^{s,\hbar}_X$.

Definition 1.9. On a complex manifold $X$, we denote by $\mathcal{O}^{t,\hbar}_X$ the filtered sheaf of $\mathbf{k}$-modules defined as follows. A section $f(t, x, \hbar)$ of $\mathcal{O}^{t,\hbar}_X(m)$ (i.e., a section of order $m$) on an open set $U$ of $X$ is a series
$$f(t, x, \hbar) = \sum_{-\infty < j < \infty} f_j(t, x)\,\hbar^{-j}, \qquad f_j \in \Gamma(U; \mathcal{O}_{\mathbb{C} \times X}|_{t=0}), \tag{1.10}$$
with the condition that for any compact subset $K$ of $U$ there exists $\eta > 0$ such that $f_j(t, x)$ is holomorphic in a neighborhood of $\{|t| \le \eta\} \times K$ and satisfies
$$\text{there exist positive constants } C, \varepsilon \text{ such that } \sup_{x \in K,\ |t| \le \eta} |f_j(t, x)| \le C \cdot \varepsilon^{-j} (-j)! \text{ for all } j < 0, \tag{1.11}$$
$$\text{there exist positive constants } M \text{ and } R \text{ such that } \sup_{x \in K} |f_j(t, x)| \le M \frac{R^{j-m}}{(j - m)!} |t|^{j-m} \text{ for } |t| \le \eta \text{ and all } j \ge m. \tag{1.12}$$
Let $f(t, x, \hbar) = \sum_{-\infty < j < \infty} f_j(t, x)\,\hbar^{-j}$ and $g(t, x, \hbar) = \sum_{-\infty < j < \infty} g_j(t, x)\,\hbar^{-j}$ be two sections of $\mathcal{O}^{t,\hbar}_X$ of order $m$ and $m'$ respectively. Define formally
$$h(t, x, \hbar) = \sum_{-\infty < k < \infty} h_k(t, x)\,\hbar^{-k}, \qquad h_k(t, x) = \sum_{i + j = k} f_i(t, x)\, g_j(t, x). \tag{1.13}$$

Lemma 1.10. (i) Multiplication by $\hbar^{-1}$ induces an isomorphism of sheaves of $\mathbf{k}(0)$-modules $\mathcal{O}^{t,\hbar}_X(m) \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_X(m + 1)$.
(ii) The product (1.13) of a section $f(t, x, \hbar) \in \mathcal{O}^{t,\hbar}_X(m)$ and a section $g(t, x, \hbar) \in \mathcal{O}^{t,\hbar}_X(m')$ is well defined and belongs to $\mathcal{O}^{t,\hbar}_X(m + m')$.
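Proof. (i) (a) Let $f(t, x, \hbar) = \sum_{-\infty < j < \infty} f_j(t, x)\,\hbar^{-j} \in \mathcal{O}^{t,\hbar}_X(m)$. Then $\hbar^{-1} f(t, x, \hbar) = \sum_{-\infty < j < \infty} \tilde{f}_j(t, x)\,\hbar^{-j}$, with $\tilde{f}_j = f_{j-1}$. For any integer $j < 0$, we have:
$$\sup_{x \in K,\ |t| \le \eta} |\tilde{f}_j(t, x)| = \sup_{x \in K,\ |t| \le \eta} |f_{j-1}(t, x)| \le C \varepsilon^{-j+1} (-j + 1)! \le (C\varepsilon)(\varepsilon e)^{-j} (-j)!.$$
Hence Condition (1.11) is satisfied. For $j \ge m + 1$, we have:
$$\sup_{x \in K} |\tilde{f}_j(t, x)| = \sup_{x \in K} |f_{j-1}(t, x)| \le M \frac{R^{j-m-1}}{(j - m - 1)!} |t|^{j-m-1},$$
which is simply Condition (1.12) for $m + 1$, and $\hbar^{-1} f(t, x, \hbar) \in \mathcal{O}^{t,\hbar}_X(m + 1)$.

(b) Let $\hbar f(t, x, \hbar) = \sum_{-\infty < j < \infty} \tilde{f}_j(t, x)\,\hbar^{-j}$, with $\tilde{f}_j = f_{j+1}$. For any integer $j < -1$, we have:
$$\sup_{x \in K,\ |t| \le \eta} |\tilde{f}_j(t, x)| = \sup_{x \in K,\ |t| \le \eta} |f_{j+1}(t, x)| \le C \varepsilon^{-j-1} (-j - 1)! \le \frac{C}{\varepsilon}\, \varepsilon^{-j} (-j)!.$$
For $j = -1$, we have:
$$\sup_{x \in K,\ |t| \le \eta} |\tilde{f}_{-1}(t, x)| = \sup_{x \in K,\ |t| \le \eta} |f_0(t, x)| = A \ge 0,$$
since $f_0(t, x)$ is holomorphic in a neighborhood of $\{|t| \le \eta\} \times K$. Set $C' = \max\{A/\varepsilon, C/\varepsilon\}$; then for all integers $j < 0$, we have:
$$\sup_{x \in K,\ |t| \le \eta} |\tilde{f}_j(t, x)| \le C' \varepsilon^{-j} (-j)!,$$
and Condition (1.11) is satisfied. For $j \ge m - 1$, we have:
$$\sup_{x \in K} |\tilde{f}_j(t, x)| = \sup_{x \in K} |f_{j+1}(t, x)| \le M \frac{R^{j-m+1}}{(j - m + 1)!} |t|^{j-m+1},$$
which is Condition (1.12) for $m - 1$, and $\hbar f(t, x, \hbar) \in \mathcal{O}^{t,\hbar}_X(m - 1)$. Therefore, multiplication by $\hbar^{-1}$ induces an isomorphism $\mathcal{O}^{t,\hbar}_X(m) \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_X(m + 1)$.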
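(ii) By (i), we may assume $m = m' = 0$. Let $f = \sum_{-\infty < i < \infty} f_i(t, x)\,\hbar^{-i}$ and $g = \sum_{-\infty < j < \infty} g_j(t, x)\,\hbar^{-j}$ be two sections of $\mathcal{O}^{t,\hbar}_X(0)$, let $K$ be a compact subset and let $\eta > 0$ be such that $f_i(t, x)$ and $g_j(t, x)$ are holomorphic in a neighborhood of $\{|t| \le \eta\} \times K$. Conditions (1.11) and (1.12) guarantee the existence of the positive constants $C_1$, $\varepsilon_1$, $M_1$ and $R_1$ for the $f_i$'s, and $C_2$, $\varepsilon_2$, $M_2$ and $R_2$ for the $g_j$'s. We set $C = \max\{C_1, C_2\}$, $\varepsilon = \max\{\varepsilon_1, \varepsilon_2\}$, $M = \max\{M_1, M_2\}$ and $R = \max\{R_1, R_2\}$. We shall show that the product (1.13) is well defined. Let $h_k(t, x) = \sum_{i + j = k} f_i(t, x)\, g_j(t, x)$.

(a) Consider the case $k < 0$. The sum defining $h_k$ can be divided into three parts:
$$h_k = \sum_{k < i < 0} f_i\, g_{k-i} + \sum_{i \ge 0} f_i\, g_{k-i} + \sum_{j \ge 0} f_{k-j}\, g_j. \tag{1.14}$$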
The first sum is finite and defines a holomorphic function in a neighborhood of $\{|t| \le \eta\} \times K$. In the second sum, $k - i$ is strictly negative and, for each term in this sum, Conditions (1.11) and (1.12) give the following estimates when $x \in K$ and $|t| \le \eta$:
$$|f_i(t, x)\, g_{k-i}(t, x)| \le M \frac{(R\eta)^i}{i!}\, C \varepsilon^{i-k} (i - k)! \le C M \varepsilon^{-k} (-k)!\, (R\eta\varepsilon)^i \binom{i - k}{i}.$$
Recall that $\sum_{i \ge 0} \alpha^i \binom{n + i}{i} = \frac{1}{(1 - \alpha)^{n+1}}$ for $|\alpha| < 1$. When $R\eta\varepsilon < 1$, $(R\eta\varepsilon)^i \binom{i - k}{i}$ is the general term of an absolutely convergent series. Let $\tilde{\eta} = \min\{\eta, \frac{1}{2R\varepsilon}\}$. Then the second sum in (1.14) converges uniformly on $\{|t| \le \tilde{\eta}\} \times K$. The third sum is handled in a similar way and one gets the estimate:
$$|f_{k-j}(t, x)\, g_j(t, x)| \le C M \varepsilon^{-k} (-k)!\, (R\eta\varepsilon)^j \binom{j - k}{j}.$$
It follows that, for $k < 0$, $h_k$ is a holomorphic function in a neighborhood of $\{|t| \le \tilde{\eta}\} \times K$. Let us show that $h_k$ satisfies Condition (1.11). For $x \in K$ and $|t| \le \tilde{\eta}$, the first sum in (1.14) is bounded by:
$$\sum_{k < i < 0} |f_i(t, x)\, g_{k-i}(t, x)| \le C^2 \sum_{k < i < 0} \varepsilon^{-i} \varepsilon^{i-k} (-i)!\,(i - k)! \le C^2 \varepsilon^{-k} (-k)!.$$
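The two elementary facts used in this part of the argument, the binomial series $\sum_{i \ge 0} \alpha^i \binom{n+i}{i} = (1 - \alpha)^{-(n+1)}$ and the bound $\sum_{k < i < 0} (-i)!\,(i - k)! \le (-k)!$, can be sanity-checked numerically. The Python sketch below assumes the sample value $\alpha = 1/2$ (the largest value of $R\tilde{\eta}\varepsilon$ allowed by the choice of $\tilde{\eta}$) and finite test ranges; the function names are hypothetical.

```python
from math import comb, factorial

def negative_binomial_series(alpha, n, terms=200):
    """Partial sum of sum_{i>=0} alpha**i * C(n + i, i)."""
    return sum(alpha**i * comb(n + i, i) for i in range(terms))

def interior_factorial_sum(n):
    """sum over k < i < 0 of (-i)! (i - k)! with n = -k,
    i.e. sum_{a=1}^{n-1} a! (n - a)!."""
    return sum(factorial(a) * factorial(n - a) for a in range(1, n))

alpha = 0.5   # plays the role of R * eta * eps, at most 1/2 when eta <= 1/(2 R eps)

# Relative gap between the truncated series and the closed form, for n = 0..5.
worst_gap = max(
    abs(negative_binomial_series(alpha, n) * (1.0 - alpha) ** (n + 1) - 1.0)
    for n in range(6)
)

# The factorial bound behind the estimate of the first sum, for -k = 1..59.
bound_holds = all(interior_factorial_sum(n) <= factorial(n) for n in range(1, 60))
print(worst_gap, bound_holds)
```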
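For the second and third sums we have:
$$\Big|\sum_{i \ge 0} f_i(t, x)\, g_{k-i}(t, x)\Big| \le C M \varepsilon^{-k} (-k)!\, \frac{1}{(1 - R\tilde{\eta}\varepsilon)^{-k+1}}, \qquad \Big|\sum_{j \ge 0} f_{k-j}(t, x)\, g_j(t, x)\Big| \le C M \varepsilon^{-k} (-k)!\, \frac{1}{(1 - R\tilde{\eta}\varepsilon)^{-k+1}}.$$
Let $\tilde{\varepsilon} = \max\{\varepsilon, \frac{\varepsilon}{1 - R\tilde{\eta}\varepsilon}\}$. For $x \in K$ and $|t| \le \tilde{\eta}$, we find that:
$$|h_k(t, x)| \le \Big(C^2 + \frac{2 C M}{1 - R\tilde{\eta}\varepsilon}\Big)\, \tilde{\varepsilon}^{-k} (-k)!.$$
Hence $h_k$ satisfies Condition (1.11).

(b) The case $k \ge 0$. We again split the sum defining $h_k$ into three parts:
$$h_k = \sum_{0 \le i \le k} f_i\, g_{k-i} + \sum_{i < 0} f_i\, g_{k-i} + \sum_{j < 0} f_{k-j}\, g_j. \tag{1.15}$$
The first sum is a holomorphic function in a neighborhood of $\{|t| \le \eta\} \times K$. For each term in the second sum, we have the following estimates when $x \in K$ and $|t| \le \eta$:
$$|f_i(t, x)\, g_{k-i}(t, x)| \le C \varepsilon^{-i} (-i)!\, M \frac{R^{k-i}}{(k - i)!} |t|^{k-i} \le C M (\varepsilon R \eta)^{-i}\, \frac{R^k}{k!} |t|^k.$$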
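Since $(\varepsilon R \eta)^{-i}$ is the general term of the geometric series, the second sum in (1.15) defines a holomorphic function in a neighborhood of $\{|t| \le \tilde{\eta}\} \times K$, where $\tilde{\eta} = \min\{\eta, \frac{1}{2R\varepsilon}\}$. Similarly, for the third sum we have:
$$|f_{k-j}(t, x)\, g_j(t, x)| \le C M (\varepsilon R \eta)^{-j}\, \frac{R^k}{k!} |t|^k.$$
Therefore, for $k \ge 0$, $h_k$ is a holomorphic function in a neighborhood of $\{|t| \le \tilde{\eta}\} \times K$. Let us show that $h_k$ satisfies Condition (1.12) with $m = 0$. For $x \in K$ and $|t| \le \tilde{\eta}$, the first sum in (1.15) is bounded by:
$$\sum_{0 \le i \le k} |f_i(t, x)\, g_{k-i}(t, x)| \le M^2 |t|^k \frac{R^k}{k!} \sum_{0 \le i \le k} \binom{k}{i} \le M^2 \frac{(2R)^k}{k!} |t|^k.$$
For the second and third sums we find:
$$\Big|\sum_{i < 0} f_i(t, x)\, g_{k-i}(t, x)\Big| \le C M \frac{R\tilde{\eta}\varepsilon}{1 - R\tilde{\eta}\varepsilon}\, \frac{R^k}{k!} |t|^k, \qquad \Big|\sum_{j < 0} f_{k-j}(t, x)\, g_j(t, x)\Big| \le C M \frac{R\tilde{\eta}\varepsilon}{1 - R\tilde{\eta}\varepsilon}\, \frac{R^k}{k!} |t|^k.$$
For $x \in K$ and $|t| \le \tilde{\eta}$, we have:
$$|h_k(t, x)| \le \Big(M^2 + 2 C M \frac{R\tilde{\eta}\varepsilon}{1 - R\tilde{\eta}\varepsilon}\Big) \frac{(2R)^k}{k!} |t|^k.$$
Hence $h_k$ satisfies Condition (1.12) with $m = 0$. The product of $f \in \mathcal{O}^{t,\hbar}_X(0)$ and $g \in \mathcal{O}^{t,\hbar}_X(0)$ is well defined and $f g \in \mathcal{O}^{t,\hbar}_X(0)$. Therefore:

Proposition 1.11. The sheaf $\mathcal{O}^{t,\hbar}_X$ is naturally endowed with a structure of a commutative filtered $\mathbf{k}$-algebra.

Let $U$ be an open subset of $X$ and let $f(s, x, \hbar) \in \Gamma((\mathbb{C}_s \setminus K) \times U; \mathcal{O}^\hbar_{\mathbb{C}_s \times X})$. We formally define the Laplace transform $\mathcal{L}(f)$ of $f$ by
$$\mathcal{L}(f)(t, x, \hbar) = \frac{1}{2i\pi} \int_\gamma f(s, x, \hbar)\, \exp(s t \hbar^{-1})\, ds,$$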
where $\gamma$ is a counter-clockwise oriented circle centered at 0 with radius $R \gg 0$.

Example 1.12.
$$\mathcal{L}(s^{-n-1}) = \hbar^{-n}\, \frac{t^n}{n!}, \qquad \mathcal{L}\Big(\frac{1}{s - 1}\Big) = \exp(t\hbar^{-1}).$$
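Both formulas of Example 1.12 are residue computations and can be checked numerically. In the Python sketch below, $\hbar$ is treated as an ordinary nonzero number, which is an assumption made only for the purpose of the check (in the text $\hbar$ is a formal parameter); the sample values of $t$, $\hbar$, $n$ and the radius of $\gamma$ are arbitrary, and the function name `laplace` is hypothetical.

```python
import cmath
from math import factorial, pi

def laplace(f, t, hbar, radius, samples=4000):
    """Approximate (1/(2*pi*i)) * integral over |s| = radius of
    f(s) * exp(s * t / hbar) ds, counter-clockwise, by the trapezoidal rule."""
    total = 0j
    for k in range(samples):
        s = radius * cmath.exp(2j * pi * k / samples)
        ds = 1j * s * (2 * pi / samples)
        total += f(s) * cmath.exp(s * t / hbar) * ds
    return total / (2j * pi)

t, hbar = 0.4, 0.5      # sample numerical values (assumption: hbar is a number here)
n = 3

monomial = laplace(lambda s: s ** (-(n + 1)), t, hbar, radius=2.0)
pole = laplace(lambda s: 1 / (s - 1), t, hbar, radius=2.0)

print(abs(monomial - (t / hbar) ** n / factorial(n)))
print(abs(pole - cmath.exp(t / hbar)))
```

The circle of radius 2 encloses both singularities used here ($s = 0$ and $s = 1$), in line with the requirement that the radius of $\gamma$ be large.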
Lemma 1.13. The Laplace transform induces a $\widehat{\mathbf{k}}$-linear monomorphism
$$s^{-1} \cdot \mathcal{O}_X[[s^{-1}]][[\hbar, \hbar^{-1}] \hookrightarrow \mathcal{O}_X[[t]][[\hbar, \hbar^{-1}]].$$
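Proof. One notices that the Laplace transform is given by:
$$\sum_{-\infty < j \le m} \sum_{n \ge 0} a_{n,j}\, s^{-n-1} \hbar^{-j} \ \mapsto\ \sum_{-\infty < j \le m} \sum_{n \ge 0} a_{n,j}\, \frac{t^n}{n!}\, \hbar^{-n-j},$$
and the result follows.

Theorem 1.14. The Laplace transform induces a $\mathbf{k}$-linear isomorphism of filtered $\mathbf{k}$-algebras
$$\mathcal{L} : \mathcal{O}^{s,\hbar}_X \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_X. \tag{1.16}$$

Proof. (i) By Lemma 1.10, it is enough to check that $\mathcal{L}$ induces an isomorphism $\mathcal{O}^{s,\hbar}_X(0) \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_X(0)$.

(ii) Let $W$ be a Stein open subset of $X$ and let $U$ be a relatively compact open subset of $W$. Let us develop a section $f(s, x, \hbar)$ of $\Gamma(W; R^1 a_! \mathcal{O}^\hbar_{\mathbb{C}_s \times X}(0))$ with respect to $s^{-1}$ for $|s| > R$. We get
$$\tilde{f}(s, x, \hbar) = \sum_{-\infty < j \le 0} f_j(s, x)\,\hbar^{-j} = \sum_{-\infty < j \le 0} \sum_{n \ge 0} f_{j,n}(x)\, s^{-n-1} \hbar^{-j}$$
with the following Cauchy estimates: for any compact subset $K$ of $U$ there exist positive constants $C$, $\varepsilon$, $R$ such that
$$\sup_{x \in K} |f_{j,n}(x)| \le C \varepsilon^{-j} (-j)!\, R^n.$$
Applying the Laplace transform to $\tilde{f}(s, x, \hbar)$ means replacing $s^{-n-1}$ with $\hbar^{-n} \frac{t^n}{n!}$. Hence, we find
$$\mathcal{L}(\tilde{f})(t, x, \hbar) = \sum_{-\infty < j < \infty} f_j(t, x)\,\hbar^{-j} = \sum_{-\infty < j \le 0} \sum_{n \ge 0} f_{j,n}(x)\, \frac{t^n}{n!}\, \hbar^{-j-n},$$
where
$$f_j(t, x) = \sum_{j \le n,\ 0 \le n} f_{j-n,n}(x)\, \frac{t^n}{n!}$$
satisfies
$$|f_j(t, x)| \le C \sum_{j \le n,\ 0 \le n} \varepsilon^{n-j}\, \frac{(n - j)!}{n!}\, (|t| R)^n.$$
Let $\eta < (\varepsilon R)^{-1}$. It follows that $f_j(t, x)$ is holomorphic in a neighborhood of $\{|t| \le \eta\} \times K$. Assume $j < 0$, $|t| \le \eta$ and $x \in K$. We get
$$|f_j(t, x)| \le C \varepsilon^{-j} (-j)! \sum_{0 \le n} \frac{(n - j)!}{(-j)!\, n!}\, (\eta\varepsilon R)^n \le \frac{C}{1 - \eta\varepsilon R} \Big(\frac{\varepsilon}{1 - \eta\varepsilon R}\Big)^{-j} (-j)!.$$
Hence Condition (1.11) is satisfied.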
Assume $j \ge 0$. We get for $|t| \le \eta$ and $x \in K$,
$$|f_j(t, x)| \le C\, \frac{\varepsilon^{-j}}{j!} \sum_{j \le n} \frac{j!\,(n - j)!}{n!}\, (|t| R \varepsilon)^n \le \frac{C}{1 - \eta\varepsilon R}\, \frac{R^j}{j!}\, |t|^j.$$
Hence Condition (1.12) for $m = 0$ is satisfied and $\mathcal{L}(f)(t, x, \hbar)$ is in $\mathcal{O}^{t,\hbar}_X(0)$.

(iii) Conversely, let $f(t, x, \hbar)$ be a section of $\mathcal{O}^{t,\hbar}_X(0)$. We develop $f$ as
$$f(t, x, \hbar) = \sum_{-\infty < j < \infty} f_j(t, x)\,\hbar^{-j} = \sum_{-\infty < j < \infty} \sum_{n \ge 0} n!\, f_{j,n}(x)\, \frac{t^n \hbar^{-n}}{n!}\, \hbar^{-j+n}. \tag{1.17}$$
For any compact set $K$, there exists $\eta > 0$ such that $f_j(t, x)$ is holomorphic in a neighborhood of $\{|t| \le \eta\} \times K$. Conditions (1.11) and (1.12) give the Cauchy estimates
$$|f_{j,n}(x)| \le C \varepsilon^{-j} (-j)!\, \eta^{-n} \text{ for } j < 0, \qquad |f_{j,n}(x)| \le M \frac{R^j}{j!}\, \eta^{j-n} \text{ for } j \ge 0.$$
Notice that Condition (1.12) for $j > 0$ implies that
$$f_j(0, x) = \frac{\partial f_j}{\partial t}(0, x) = \cdots = \frac{\partial^{j-1} f_j}{\partial t^{j-1}}(0, x) = 0, \tag{1.18}$$
or $f_{j,n}(x) = 0$ for $0 \le n \le j - 1$.

The inverse Laplace transform consists formally in replacing $\hbar^{-n} \frac{t^n}{n!}$ by $s^{-n-1}$ in (1.17). We then get
$$\mathcal{L}^{-1}(f)(s, x, \hbar) = \tilde{f}(s, x, \hbar) = \sum_{-\infty < j < \infty} \sum_{n \ge 0} n!\, f_{j+n,n}(x)\, s^{-n-1} \hbar^{-j}.$$
Writing $\tilde{f}(s, x, \hbar) = \sum_{-\infty < j < \infty} \tilde{f}_j(s, x)\,\hbar^{-j}$, (1.18) implies that $\tilde{f}_j(s, x) = 0$ for $j \ge 1$. Let $R_1 > R$ be large enough so that $(\eta R_1)^{-1} \le 1$. We shall check that the sum $\tilde{f}_j(s, x) = \sum_{n \ge 0} n!\, f_{j+n,n}(x)\, s^{-n-1}$ defines a holomorphic function in a neighborhood of $\{|s| \ge R_1\} \times K$ for any $j \le 0$. For $j \le 0$, let us split the sum $\tilde{f}_j(s, x)$ as
$$\tilde{f}_j(s, x) = \sum_{n \ge -j} n!\, f_{j+n,n}(x)\, s^{-n-1} + \sum_{0 \le n < -j} n!\, f_{j+n,n}(x)\, s^{-n-1}. \tag{1.19}$$
In the first sum we have $n + j \ge 0$ and for $|s| \ge R_1$ and $x \in K$, we get from the Cauchy estimates
$$|n!\, f_{j+n,n}(x)\, s^{-n-1}| \le n!\, M \frac{R^{n+j}}{(n + j)!}\, \eta^j |s|^{-n-1} \le \frac{M}{R_1} (\eta R)^j (-j)! \binom{n}{-j} \Big(\frac{R}{R_1}\Big)^n. \tag{1.20}$$
The right-hand side is the general term of a convergent series since $R_1 > R$ and we get the result by noticing that the second sum in (1.19) is finite.
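Finally we shall show that $\tilde{f}_j(s, x)$ satisfies the required estimates. From (1.20), the first sum in (1.19) is bounded by
$$\Big|\sum_{n \ge -j} n!\, f_{j+n,n}(x)\, s^{-n-1}\Big| \le \frac{M}{R_1} (\eta R)^j (-j)! \sum_{n \ge -j} \binom{n}{-j} \Big(\frac{R}{R_1}\Big)^n \le \frac{M}{R_1 - R} \Big(\frac{1}{\eta (R_1 - R)}\Big)^{-j} (-j)!.$$
Similarly, for the second sum we have
$$\Big|\sum_{0 \le n < -j} n!\, f_{j+n,n}(x)\, s^{-n-1}\Big| \le \frac{C}{R_1}\, \varepsilon^{-j} (-j)! \sum_{0 \le n < -j} \frac{1}{\binom{-j}{n}} \Big(\frac{1}{\eta R_1}\Big)^n \le \frac{2C}{R_1}\, \varepsilon^{-j} (-j)!,$$
where the last inequality follows from $(\eta R_1)^{-1} \le 1$ and $\sum_{0 \le n < -j} \binom{-j}{n}^{-1} \le 2$. Combining these estimates we get for $j \le 0$,
$$|\tilde{f}_j(s, x)| \le \tilde{C}\, \tilde{\varepsilon}^{-j} (-j)!,$$
with $\tilde{C} = \max\{\frac{M}{R_1 - R}, \frac{2C}{R_1}\}$ and $\tilde{\varepsilon} = \max\{\varepsilon, \frac{1}{\eta (R_1 - R)}\}$. Therefore $\tilde{f}(s, x, \hbar) = \sum_{j \le 0} \tilde{f}_j(s, x)\,\hbar^{-j}$ is a section of $\mathcal{O}^{s,\hbar}_X(0)$ and $\mathcal{L}(\tilde{f})(t, x, \hbar) = f(t, x, \hbar)$.

(iv) The fact that $\mathcal{L}$ is a morphism of algebras follows easily from Example 1.3.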
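Step (iv) can be made concrete on finite combinations of the monomials $s^{-n-1}$. The Python sketch below is an illustration under simplifying assumptions, not the proof: an element $\sum_n c_n s^{-n-1}$ with integer coefficients is stored as a dict `{n: c_n}`, the convolution uses the rule of Example 1.3, and the Laplace transform uses the rule $s^{-n-1} \mapsto (t\hbar^{-1})^n / n!$ of Example 1.12, stored as a polynomial in the single variable $t\hbar^{-1}$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def convolve(f, g):
    """Convolution of sums of s**(-n-1), using s**(-n-1) * s**(-m-1) = C(n+m, n) s**(-n-m-1)."""
    out = {}
    for n, a in f.items():
        for m, b in g.items():
            out[n + m] = out.get(n + m, 0) + a * b * comb(n + m, n)
    return out

def laplace(f):
    """Laplace transform on monomials: s**(-n-1) -> (t/hbar)**n / n!."""
    return {n: Fraction(c, factorial(n)) for n, c in f.items()}

def multiply(p, q):
    """Ordinary product of polynomials in the variable t/hbar."""
    out = {}
    for n, a in p.items():
        for m, b in q.items():
            out[n + m] = out.get(n + m, 0) + a * b
    return out

# Sample elements 2/s - 1/s**4 and 5/s**2 + 7/s**3 (arbitrary choices).
f = {0: 2, 3: -1}
g = {1: 5, 2: 7}

lhs = laplace(convolve(f, g))              # L(f * g)
rhs = multiply(laplace(f), laplace(g))     # L(f) . L(g)
print(lhs == rhs)
```

The equality reduces to the identity $\binom{n+m}{n} / (n+m)! = 1/(n!\,m!)$, which is exactly what turns convolution in $s$ into the ordinary product in $t$.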
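The ring $\mathrm{gr}\, \mathcal{O}^{t,\hbar}_X$. If $\mathcal{A}$ is a filtered sheaf of rings, we denote as usual by $\mathrm{gr}\, \mathcal{A}$ the associated graded ring. Let $\mathbb{C}_u$ be the complex line endowed with the coordinate $u$ and denote by $b : X \times \mathbb{C}_u \to X$ the projection.

Definition 1.15. (i) One denotes by $\mathcal{O}^{\exp u}_X$ the subsheaf of $\mathbb{C}$-algebras on $X$ of the sheaf $b_* \mathcal{O}_{X \times \mathbb{C}_u}$ whose sections on an open set $U \subset X$ are the holomorphic functions $f(x, u)$ on $U \times \mathbb{C}_u$ satisfying: for any compact subset $K$ of $U$ there exist positive constants $C$, $R$ such that $\sup_{x \in K} |f(x, u)| \le C \exp(R|u|)$.
(ii) One sets $\mathcal{O}^{\exp t\hbar^{-1}}_X[\hbar, \hbar^{-1}] = \mathcal{O}^{\exp t\hbar^{-1}}_X \otimes_{\mathbb{C}} \mathbb{C}[\hbar, \hbar^{-1}]$.

Proposition 1.16. There is a natural isomorphism of graded sheaves of rings
$$\mathrm{gr}\, \mathcal{O}^{t,\hbar}_X \simeq \mathcal{O}^{\exp t\hbar^{-1}}_X[\hbar, \hbar^{-1}].$$

Proof. First note the isomorphism
$$\mathcal{O}^{s,\hbar}_X(0) / \mathcal{O}^{s,\hbar}_X(-1) \simeq R^1 a_! \mathcal{O}_{\mathbb{C}_s \times X},$$
from which we deduce the isomorphism
$$\mathrm{gr}\, \mathcal{O}^{s,\hbar}_X \simeq R^1 a_! \mathcal{O}_{\mathbb{C}_s \times X} \otimes_{\mathbb{C}} \mathbb{C}[\hbar, \hbar^{-1}].$$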
The classical Paley-Wiener theorem says that the Laplace transform induces an isomorphism between $H^1_c(\mathbb{C}; \mathcal{O}_{\mathbb{C}})$ and the space of entire functions of exponential type. An extension of this result with holomorphic parameters provides an isomorphism
$$\mathcal{L} : R^1 a_! \mathcal{O}_{\mathbb{C}_s \times X} \xrightarrow{\sim} \mathcal{O}^{\exp t\hbar^{-1}}_X,$$
and the result follows.

The formal case. It is possible to replace $\mathcal{O}^\hbar_X$ with $\widehat{\mathcal{O}}^\hbar_X$ in the preceding constructions and to set
$$\widehat{\mathcal{O}}^{s,\hbar}_X := R^1 a_! \widehat{\mathcal{O}}^\hbar_{\mathbb{C}_s \times X}. \tag{1.21}$$
However the Laplace transform of $\widehat{\mathcal{O}}^{s,\hbar}_X$ does not seem to have an easy description. Indeed, its sections are no longer germs of holomorphic functions with respect to $t$, as shown in the next example.

Example 1.17. Consider a sequence $\{c_j\}_{j \le 0}$ of complex numbers and the section $f$ of $\widehat{\mathcal{O}}^{s,\hbar}_X$ given by
$$f(s, \hbar) = \sum_{j \le 0} \frac{c_j}{s - 1}\, \hbar^{-j}.$$
Then, formally, the Laplace transform of $f$ is given by
$$\mathcal{L}(f)(t, \hbar) = \sum_{j \le 0} \sum_{n \ge 0} c_j\, \frac{t^n}{n!}\, \hbar^{-n-j},$$
and the coefficient of $\hbar^0$ is $\sum_{n \ge 0} c_{-n} \frac{t^n}{n!}$, which does not belong to $\mathcal{O}_{\mathbb{C}_t}|_{t=0}$ in general.
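A concrete instance of Example 1.17 may help. The Python sketch below assumes the hypothetical choice $c_j = ((-j)!)^2$ for $j \le 0$, which violates the growth condition (1.1); the coefficient of $\hbar^0$ is then $\sum_{n \ge 0} n!\, t^n$, whose radius of convergence is zero. The code checks, with exact rational arithmetic, that the ratio of consecutive terms at the sample point $t = 1/100$ is $(n+1)/100$, hence eventually larger than 1.

```python
from fractions import Fraction
from math import factorial

def coefficient(c, n):
    """Coefficient of t**n in sum_{n>=0} c(-n) * t**n / n!."""
    return Fraction(c(-n), factorial(n))

# Hypothetical choice c_j = ((-j)!)**2 for j <= 0; it does not satisfy (1.1).
c = lambda j: factorial(-j) ** 2

t = Fraction(1, 100)                                   # a small sample value of t
terms = [coefficient(c, n) * t**n for n in range(301)]
ratios = [terms[n + 1] / terms[n] for n in range(300)]

# The terms first decrease, then grow without bound once n + 1 > 100.
print(float(terms[100]), float(terms[300]))
```

The same computation with any smaller $t \neq 0$ only delays the index at which the terms start growing, which is the meaning of a zero radius of convergence.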
2. The Algebra $\mathcal{W}_{T^*X}$

Let $(X, \mathcal{O}_X)$ be a complex manifold. The cotangent bundle $T^*X$ is a homogeneous symplectic manifold endowed with the $\mathbb{C}^\times$-conic sheaf of rings $\mathcal{E}_{T^*X}$ of finite-order microdifferential operators. This ring is filtered and contains in particular the subring $\mathcal{E}_{T^*X}(0)$ of operators of order $\le 0$. This ring is constructed in [9] and we assume that the reader is familiar with this theory, referring to [5] or [10] for an exposition.

On the symplectic manifold $T^*X$ there exists another (no longer conic) useful sheaf of rings constructed as follows (see [8]). Let $\mathbb{C}$ be the complex line endowed with the coordinate $t$ and $(t; \tau)$ the associated coordinates on $T^*\mathbb{C}$. Set $T^*_{\{\tau \ne 0\}}(X \times \mathbb{C}) = \{(x, t; \xi, \tau);\ \tau \ne 0\}$ and consider the map
$$\rho : T^*_{\{\tau \ne 0\}}(X \times \mathbb{C}) \to T^*X, \qquad (x, t; \xi, \tau) \mapsto (x; \xi/\tau). \tag{2.1}$$
Set
$$\mathcal{E}_{T^*(X \times \mathbb{C}), \hat{t}} = \{P \in \mathcal{E}_{T^*(X \times \mathbb{C})};\ [P, \partial/\partial t] = 0\}. \tag{2.2}$$
The ring $\mathcal{W}_{T^*X}$ on $T^*X$ is given by $\mathcal{W}_{T^*X} := \rho_*(\mathcal{E}_{T^*(X \times \mathbb{C}), \hat{t}})$.
In the sequel we set
$$\hbar := \tau^{-1}. \tag{2.3}$$
The ring $\mathcal{W}_{T^*X}$ is filtered and we denote by $\mathcal{W}_{T^*X}(j)$ the subsheaf of $\mathcal{W}_{T^*X}$ consisting of sections of order less than or equal to $j$. The following result was obtained in [8].

Theorem 2.1. (i) The sheaf $\mathcal{W}_{T^*X}$ is naturally endowed with a structure of a filtered $\mathbf{k}$-algebra and $\mathrm{gr}\, \mathcal{W}_{T^*X} \simeq \mathcal{O}_{T^*X}[\hbar, \hbar^{-1}]$.
(ii) Consider two complex manifolds $X$ and $Y$, two open subsets $U_X \subset T^*X$ and $U_Y \subset T^*Y$ and a symplectic isomorphism $\psi : U_X \xrightarrow{\sim} U_Y$. Then, locally, $\psi$ may be quantized as an isomorphism of filtered $\mathbf{k}$-algebras $\Psi : \mathcal{W}_{T^*X} \xrightarrow{\sim} \mathcal{W}_{T^*Y}$ such that the isomorphism induced on the graded algebras coincides with the isomorphism $\mathcal{O}_{T^*X}[\hbar, \hbar^{-1}] \xrightarrow{\sim} \mathcal{O}_{T^*Y}[\hbar, \hbar^{-1}]$ induced by $\psi$.

Total symbols. Assume that $X$ is affine of dimension $n$, that is, $X$ is open in some $\mathbb{C}$-vector space $V$ of dimension $n$.

Theorem 2.2. Assume $X$ is affine. There is an isomorphism of filtered sheaves of $\mathbf{k}$-modules (not of algebras), called the "total symbol" morphism:
$$\sigma_{tot} : \mathcal{W}_{T^*X} \xrightarrow{\sim} \mathcal{O}^\hbar_{T^*X}. \tag{2.4}$$
The total symbol of a product is given by the Leibniz formula. Denote by $(x)$ a local coordinate system on $X$ and denote by $(x, u)$ the associated local symplectic coordinate system on $T^*X$. If $Q$ is an operator of total symbol $\sigma_{tot}(Q)$, then
$$\sigma_{tot}(P \circ Q) = \sum_{\alpha \in \mathbb{N}^n} \frac{\hbar^{|\alpha|}}{\alpha!}\, \partial^\alpha_u \sigma_{tot}(P) \cdot \partial^\alpha_x \sigma_{tot}(Q). \tag{2.5}$$
The total symbol of a section $P \in \mathcal{W}_{T^*X}(U)$ is thus written as a formal series:
$$\sigma_{tot}(P) = \sum_{-\infty < j \le m} p_j(x; u)\,\hbar^{-j}, \qquad m \in \mathbb{Z},\ p_j \in \mathcal{O}_{T^*X}(U), \tag{2.6}$$
with the condition (1.4). Note that (2.5) does not depend on the choice of a local coordinate system on $X$ but only on the affine structure of $V$. Indeed, (2.5) may be rewritten as
$$\sigma_{tot}(P \circ Q) = \big(\exp(\hbar \langle d_u, d_y \rangle)\, \sigma_{tot}(P)(x, u)\, \sigma_{tot}(Q)(y, v)\big)\big|_{x = y,\ u = v},$$
where $\langle d_u, d_y \rangle = \sum_{i=1}^n \partial_{u_i} \partial_{y_i}$ does not depend on the affine coordinate system.
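For polynomial symbols in one variable ($n = 1$) the Leibniz formula (2.5) is a finite sum and can be computed exactly. The Python sketch below is an illustration under that restriction, not the general calculus: a symbol is stored as a dict `{(a, b, c): coeff}` standing for $\sum \mathrm{coeff} \cdot x^a u^b \hbar^c$, and all function names (`add`, `mul`, `diff`, `star`) are hypothetical.

```python
from fractions import Fraction
from math import factorial

def add(p, q):
    """Sum of two polynomial symbols."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
        if out[key] == 0:
            del out[key]
    return out

def mul(p, q):
    """Ordinary commutative product of two polynomial symbols."""
    out = {}
    for (a1, b1, c1), v1 in p.items():
        for (a2, b2, c2), v2 in q.items():
            key = (a1 + a2, b1 + b2, c1 + c2)
            out[key] = out.get(key, 0) + v1 * v2
    return {k: v for k, v in out.items() if v != 0}

def diff(p, var, order):
    """Partial derivative of the given order; var = 0 for x, var = 1 for u."""
    out = {}
    for key, v in p.items():
        e = key[var]
        if e < order:
            continue
        new = list(key)
        new[var] = e - order
        factor = factorial(e) // factorial(e - order)
        out[tuple(new)] = out.get(tuple(new), 0) + v * factor
    return out

def star(p, q):
    """Leibniz formula (2.5) for n = 1: sum_k hbar**k / k! * d_u^k p * d_x^k q."""
    max_order = max((b for (_, b, _) in p), default=0)
    result = {}
    for k in range(max_order + 1):
        scale = {(0, 0, k): Fraction(1, factorial(k))}
        result = add(result, mul(scale, mul(diff(p, 1, k), diff(q, 0, k))))
    return result

X = {(1, 0, 0): Fraction(1)}   # the symbol x
U = {(0, 1, 0): Fraction(1)}   # the symbol u

print(star(U, X))   # x*u + hbar
print(star(X, U))   # x*u
```

With these conventions $u \star x - x \star u = \hbar$, and the product is associative on the samples tested below, as expected from a composition law of operators.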
Remark 2.3. Let us identify $X$ with the zero section of $T^*X$. Then the sheaf $\mathcal{O}^\hbar_X$ (see Def. 1.2) is isomorphic to the left coherent $\mathcal{W}_{T^*X}$-module obtained as the quotient of $\mathcal{W}_{T^*X}$ by the left ideal generated by the vector fields on $X$.
3. The Algebra $\mathcal{W}^s_{T^*X}$

Operations on $\mathcal{W}$. Let $S$ be a complex manifold of complex dimension $d_S$. One defines the sheaf $\mathcal{W}_{S \times T^*X}$ on $S \times T^*X$ as the subsheaf of $\mathcal{W}_{T^*(S \times X)}$ consisting of sections which commute with the holomorphic functions on $S$. Heuristically, $\mathcal{W}_{S \times T^*X}$ is the sheaf $\mathcal{W}_{T^*X}$ with holomorphic parameters on $S$. For a morphism of complex manifolds $f : S \to Z$ we shall still denote by $f$ the map $S \times X \to Z \times X$, as well as the map $S \times T^*X \to Z \times T^*X$. One denotes as usual by $\Omega_S$ the sheaf of holomorphic forms of maximal degree and one sets for short:
$$\mathcal{W}^{(d_S)}_{S \times T^*X} = \mathcal{W}_{S \times T^*X} \otimes_{\mathcal{O}_S} \Omega_S. \tag{3.1}$$
Let us recall well-known operations of the theory of microdifferential operators. Although these results do not seem to be explicitly written in the literature, their proofs are straightforward and will not be given here. Let $f : S \to Z$ be a morphism of complex manifolds. The usual operations of inverse image $f^* : f^{-1} \mathcal{O}_Z \to \mathcal{O}_S$ and of direct image $\int_f : R f_! \Omega_S[d_S] \to \Omega_Z[d_Z]$ extend to $\mathcal{W}_{S \times T^*X}$. More precisely, there exist morphisms of sheaves of $\mathbf{k}$-modules (the second morphism holds in the derived category $D^b(\mathbf{k}_{Z \times T^*X})$):
$$f^* : f^{-1} \mathcal{W}_{Z \times T^*X} \to \mathcal{W}_{S \times T^*X}, \tag{3.2}$$
$$\int_f : R f_!(\mathcal{W}^{(d_S)}_{S \times T^*X}[d_S]) \to \mathcal{W}^{(d_Z)}_{Z \times T^*X}[d_Z], \tag{3.3}$$
these morphisms having the following properties:
• they are functorial with respect to $f$, that is, for a morphism of complex manifolds $g : Z \to W$, one has $(g \circ f)^* \simeq f^* \circ g^*$ and $\int_{g \circ f} = \int_g \circ \int_f$, and moreover the inverse (resp. direct) image of the identity morphism is the identity,
• when $X$ is affine, $f^*$ and $\int_f$ commute with the total symbol morphism (2.4).

As a convention, we choose the morphism in (3.3) so that the integral of $\frac{ds}{s} \in H^1_c(\mathbb{C}_s; \Omega_{\mathbb{C}_s})$ is 1. In other words,
$$\int_a \frac{ds}{s} = \frac{1}{2i\pi} \int_\gamma \frac{ds}{s},$$
where $\gamma$ is a counter-clockwise oriented circle around the origin.

The algebra $\mathcal{W}^s_{T^*X}$. Denote by
$$a : \mathbb{C}_s \times T^*X \to T^*X \tag{3.4}$$
the projection. Then, after identifying the sheaves $\mathcal{O}_{\mathbb{C}_s}$ and $\Omega_{\mathbb{C}_s}$ by $f(s) \mapsto f(s)\,ds$, the sheaf $R^1 a_! \mathcal{W}_{\mathbb{C}_s \times T^*X}$ is endowed with a structure of a filtered $\mathbf{k}$-algebra by
$$H^1_c(\mathbb{C}_s \times T^*X; \mathcal{W}_{\mathbb{C}_s \times T^*X}) \times H^1_c(\mathbb{C}_s \times T^*X; \mathcal{W}_{\mathbb{C}_s \times T^*X}) \to H^2_c(\mathbb{C}^2_{s,s'} \times T^*X; \mathcal{W}_{\mathbb{C}^2_{s,s'} \times T^*X}) \to H^1_c(\mathbb{C}_s; \mathcal{W}_{\mathbb{C}_s \times T^*X}),$$
where the first arrow is the cup product and the second arrow is the integration along the fibers of the map $\mathbb{C}^2 \to \mathbb{C}$, $(s, s') \mapsto s + s'$.
Definition 3.1. The sheaf $\mathcal{W}^s_{T^*X}$ of $\mathbf{k}$-modules on $T^*X$ is given by
$$\mathcal{W}^s_{T^*X} = R^1 a_!(\mathcal{W}_{\mathbb{C}_s \times T^*X}). \tag{3.5}$$
After identifying the holomorphic function $\frac{1}{s}$ with the cohomology class it defines in $H^1_c(\mathbb{C}_s; \mathcal{O}_{\mathbb{C}_s})$, we define the morphism of sheaves
$$\iota : \mathcal{W}_{T^*X} \to \mathcal{W}^s_{T^*X}, \qquad P \mapsto \frac{1}{s} P. \tag{3.6}$$
Clearly, the morphism (3.6) is a monomorphism of sheaves of $\mathbf{k}$-algebras. We define the morphism of sheaves
$$\mathrm{res} : \mathcal{W}^s_{T^*X} \to \mathcal{W}_{T^*X} \tag{3.7}$$
by the integration morphism (3.3) associated to the map (3.4). Clearly, the morphism (3.7) is a morphism of sheaves of $\mathbf{k}$-algebras. Hence:

Theorem 3.2. (i) The sheaf $\mathcal{W}^s_{T^*X}$ is naturally endowed with a structure of a filtered $\mathbf{k}$-algebra and $\mathrm{gr}\, \mathcal{W}^s_{T^*X} \simeq R^1 a_! \mathcal{O}_{\mathbb{C}_s \times T^*X}[\hbar, \hbar^{-1}]$.
(ii) The monomorphism $\iota$ in (3.6) is a morphism of filtered $\mathbf{k}$-algebras, the integration morphism res in (3.7) is a morphism of filtered $\mathbf{k}$-algebras and the composition $\mathrm{res} \circ \iota : \mathcal{W}_{T^*X} \to \mathcal{W}^s_{T^*X} \to \mathcal{W}_{T^*X}$ is the identity.
(iii) Consider two complex manifolds $X$ and $Y$, two open subsets $U_X \subset T^*X$ and $U_Y \subset T^*Y$ and a symplectic isomorphism $\psi : U_X \xrightarrow{\sim} U_Y$. Then, locally, $\psi$ may be quantized as an isomorphism of filtered $\mathbf{k}$-algebras $\Psi : \mathcal{W}^s_{T^*X} \xrightarrow{\sim} \mathcal{W}^s_{T^*Y}$ such that the isomorphism induced on the graded algebras coincides with the isomorphism $R^1 a_! \mathcal{O}_{\mathbb{C}_s \times T^*X}[\hbar, \hbar^{-1}] \xrightarrow{\sim} R^1 a_! \mathcal{O}_{\mathbb{C}_s \times T^*Y}[\hbar, \hbar^{-1}]$ induced by $\psi$.
(iv) Assume $X$ is affine. There is an isomorphism of filtered sheaves of $\mathbf{k}$-modules (not of algebras), called the "total symbol" morphism:
$$\sigma_{tot} : \mathcal{W}^s_{T^*X} \xrightarrow{\sim} \mathcal{O}^{s,\hbar}_{T^*X}. \tag{3.8}$$
The total symbol of a product is given by the Leibniz formula with a convolution product in the $s$ variable (see (3.10)).

Proof. These results follow immediately from Theorem 2.1.

Assume that $X$ is affine. For each Stein open subset $W$ of $T^*X$ and each relatively compact open subset $U \subset\subset W$, a section $P$ of $\mathcal{W}^s_{T^*X}$ on $W$ admits a total symbol
$$\sigma_{tot}(P)(s, x, u) = \sum_{-\infty < j \le m} p_j(s, x; u)\,\hbar^{-j}, \qquad m \in \mathbb{Z}, \tag{3.9}$$
where $p_j$ belongs to $\Gamma((\mathbb{C}_s \setminus K_0) \times U; \mathcal{O}_{\mathbb{C}_s \times T^*X})$, for a compact subset $K_0$ of $\mathbb{C}_s$ which depends only on $P$ and $U$, and the $p_j$'s satisfy an estimate as in (1.4) on each compact subset $K$ of $(\mathbb{C}_s \setminus K_0) \times U$. Consider now two sections $P$ and $Q$ of $\mathcal{W}^s_{T^*X}$ on a Stein open set $W$ with total symbols as in (3.9) (replacing $p_j$ with $q_j$ and $m$ with $m'$ for $Q$). Then the total symbol of $P \circ Q$ is given by the Leibniz formula:
$$\sigma_{tot}(P \circ Q) = \sum_{\alpha \in \mathbb{N}^n} \frac{\hbar^{|\alpha|}}{\alpha!}\, \partial^\alpha_u \sigma_{tot}(P) * \partial^\alpha_x \sigma_{tot}(Q), \tag{3.10}$$
where, setting $f(s, x, u) = \partial^\alpha_u \sigma_{tot}(P)(s, x; u)$ and $g(s, x, u) = \partial^\alpha_x \sigma_{tot}(Q)(s, x; u)$, the product $f * g$ is given by (1.9).
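4. The Laplace Transform and the Algebra $\mathcal{W}^t_{T^*X}$

The filtered $\mathbf{k}$-algebra $\mathcal{W}^t_{T^*X}$ on $T^*X$ is the algebra $\mathcal{W}^s_{T^*X}$, but with a different symbol calculus.

Definition 4.1. We set $\mathcal{W}^t_{T^*X} := \mathcal{W}^s_{T^*X}$. For $X$ affine, the total symbol morphism of $\mathbf{k}$-modules (not of algebras)
$$\sigma_{tot} : \mathcal{W}^t_{T^*X} \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_{T^*X} \tag{4.1}$$
is the composition $\mathcal{W}^s_{T^*X} \xrightarrow[\sigma_{tot}]{\sim} \mathcal{O}^{s,\hbar}_{T^*X} \xrightarrow[\mathcal{L}]{\sim} \mathcal{O}^{t,\hbar}_{T^*X}$.

For $P$ a section of $\mathcal{W}^t_{T^*X}$ on a Stein open subset $V$ of $T^*X$ and an open subset $U \subset\subset V$, $\sigma_{tot}(P)$ is written as a series
$$\sigma_{tot}(P)(t, x, u, \hbar) = \sum_{-\infty < j < \infty} p_j(t, x, u)\,\hbar^{-j}, \qquad p_j \in \mathcal{O}_{\mathbb{C} \times T^*X}|_{t=0}(U),$$
satisfying (1.11) and (1.12). Applying Theorem 3.2, we get:

Theorem 4.2. (i) $\mathcal{W}^t_{T^*X}$ is a filtered $\mathbf{k}$-algebra and $\mathrm{gr}\, \mathcal{W}^t_{T^*X} \simeq \mathcal{O}^{\exp t\hbar^{-1}}_{T^*X}[\hbar, \hbar^{-1}]$ (see Definition 1.15).
(ii) The morphism $\iota$ in (3.6) induces a monomorphism of filtered $\mathbf{k}$-algebras $\iota : \mathcal{W}_{T^*X} \hookrightarrow \mathcal{W}^t_{T^*X}$, the morphism res in (3.7) induces a morphism of filtered $\mathbf{k}$-algebras $\mathrm{res} : \mathcal{W}^t_{T^*X} \to \mathcal{W}_{T^*X}$ and the composition $\mathcal{W}_{T^*X} \to \mathcal{W}^t_{T^*X} \to \mathcal{W}_{T^*X}$ is the identity.
(iii) Consider two complex manifolds $X$ and $Y$, two open subsets $U_X \subset T^*X$ and $U_Y \subset T^*Y$ and a symplectic isomorphism $\psi : U_X \xrightarrow{\sim} U_Y$. Then, locally, $\psi$ may be quantized as an isomorphism of filtered $\mathbf{k}$-algebras $\Psi : \mathcal{W}^t_{T^*X} \xrightarrow{\sim} \mathcal{W}^t_{T^*Y}$ such that the isomorphism induced on the graded algebras coincides with the isomorphism $\mathcal{O}^{\exp t\hbar^{-1}}_{T^*X}[\hbar, \hbar^{-1}] \xrightarrow{\sim} \mathcal{O}^{\exp t\hbar^{-1}}_{T^*Y}[\hbar, \hbar^{-1}]$ induced by $\psi$.
(iv) Assume $X$ is affine. There is an isomorphism of filtered sheaves of $\mathbf{k}$-modules (not of algebras), called the "total symbol" morphism:
$$\sigma_{tot} : \mathcal{W}^t_{T^*X} \xrightarrow{\sim} \mathcal{O}^{t,\hbar}_{T^*X}. \tag{4.2}$$
The total symbol of a product is given by the Leibniz formula.

For $P$ and $Q$ two sections of $\mathcal{W}^t_{T^*X}$ on an open subset $U$ of $T^*X$, with $X$ affine, the total symbol of $P \circ Q$ is thus given by the formula:
$$\sigma_{tot}(P \circ Q) = \sum_{\alpha \in \mathbb{N}^n} \frac{\hbar^{|\alpha|}}{\alpha!}\, \partial^\alpha_u \sigma_{tot}(P) \cdot \partial^\alpha_x \sigma_{tot}(Q), \tag{4.3}$$
where the product $\partial^\alpha_u \sigma_{tot}(P) \cdot \partial^\alpha_x \sigma_{tot}(Q)$ is given by the usual commutative algebra structure of $\mathcal{O}^{t,\hbar}_{T^*X}$ of Lemma 1.10.

Remark 4.3. In Theorem 4.2, the monomorphism $\mathcal{W}_{T^*X} \to \mathcal{W}^t_{T^*X}$ is given on symbols by $\sigma_{tot}(P) \mapsto \sigma_{tot}(P)$ and the morphism $\mathcal{W}^t_{T^*X} \to \mathcal{W}_{T^*X}$ is given on symbols by $\sigma_{tot}(P)(t, x; u, \hbar) \mapsto \sigma_{tot}(P)(0, x; u, \hbar)$.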
The formal case. The above constructions also work when replacing the sheaf $\mathcal{W}_{T^*X}$ with its formal counterpart, the sheaf $\widehat{\mathcal{W}}_{T^*X}$. Let us briefly explain it.

Let $X$ be a complex manifold, as above. Replacing the sheaf of rings $\mathcal{E}_{T^*X}$ on $T^*X$ with the sheaf of rings $\widehat{\mathcal{E}}_{T^*X}$ of formal microdifferential operators and proceeding as for $\mathcal{W}_{T^*X}$, we get the sheaf of rings $\widehat{\mathcal{W}}_{T^*X}$ of finite-order formal WKB-operators on $T^*X$. It is defined by
$$\widehat{\mathcal{W}}_{T^*X} := \rho_*(\widehat{\mathcal{E}}_{T^*(X \times \mathbb{C}), \hat{t}}).$$
When $X$ is affine of dimension $n$, the total symbol morphism induces an isomorphism of $\widehat{\mathbf{k}}$-modules
$$\sigma_{tot} : \widehat{\mathcal{W}}_{T^*X} \xrightarrow{\sim} \widehat{\mathcal{O}}^\hbar_{T^*X},$$
and the symbol $\sigma_{tot}(P \circ Q)$ is given by the Leibniz formula (2.5). Then by a similar construction as for $\mathcal{W}^s_{T^*X}$ we construct the filtered sheaf of $\widehat{\mathbf{k}}$-algebras $\widehat{\mathcal{W}}^s_{T^*X}$. Namely, we set
$$\widehat{\mathcal{W}}^s_{T^*X} := R^1 a_! \widehat{\mathcal{W}}_{\mathbb{C} \times T^*X}.$$
If $X$ is affine, the total symbol morphism induces an isomorphism of $\widehat{\mathbf{k}}$-modules $\widehat{\mathcal{W}}^s_{T^*X} \xrightarrow{\sim} \widehat{\mathcal{O}}^{s,\hbar}_{T^*X}$ and the product is again given by the Leibniz formula (3.10). However, as already noticed, the Laplace transform does not seem to behave as well for the formal case as for the analytic case, and we shall not construct the Laplace transform of $\widehat{\mathcal{O}}^{s,\hbar}_{T^*X}$.

5. Remark: The Algebra $\mathcal{W}^s_{\mathfrak{X}}$ on a Symplectic Manifold $\mathfrak{X}$

The complex case. Consider a complex symplectic manifold $\mathfrak{X}$. There exists an open covering $\mathfrak{X} = \bigcup_i U_i$ and complex symplectic isomorphisms $\varphi_i : U_i \xrightarrow{\sim} V_i$ where the $V_i$'s are open in some cotangent bundles $T^*X_i$ of complex manifolds $X_i$. Set $\mathcal{W}_{U_i} := \varphi_i^{-1} \mathcal{W}_{T^*X_i}|_{V_i}$. In general, the $\mathcal{W}_{U_i}$'s do not glue in order to give a globally defined sheaf of algebras $\mathcal{W}_{\mathfrak{X}}$ on $\mathfrak{X}$. However the prestack $\mathfrak{S}$ on $\mathfrak{X}$ (roughly speaking, a prestack is a sheaf of categories) $U_i \mapsto \mathrm{Mod}(\mathcal{W}_{U_i})$ is a stack and the category $\mathrm{Mod}(\mathcal{W}_{\mathfrak{X}}) := \mathfrak{S}(\mathfrak{X})$ is well defined. Moreover, one can give a precise meaning to $\mathcal{W}_{\mathfrak{X}}$ by replacing the notion of a sheaf of algebras with that of an algebroid. We refer to [4] for the construction of (an analogue of) this stack in the contact complex case and to [6] in the symplectic complex case for $\widehat{\mathcal{W}}_{\mathfrak{X}}$ and for the definition of an algebroid. See also [8] for a construction of $\mathcal{W}_{\mathfrak{X}}$ (by a different method). By adapting the construction of [8], one easily constructs the algebroid $\mathcal{W}^s_{\mathfrak{X}}$ associated with the locally defined sheaves of algebras $\mathcal{W}^s_{U_i}$. Details are left to the reader.

The real case. Let $M$ be a real analytic manifold, $X$ a complexification of $M$ and denote by $\omega_X$ the canonical 2-form on $T^*X$. The conormal bundle $T^*_M X$ is Lagrangian for $\mathrm{Re}\, \omega_X$ and symplectic for $\mathrm{Im}\, \omega_X$. In particular, the real manifold $T^*_M X$ is symplectic. For an open subset $U$ of $T^*_M X$, we set $\mathcal{W}_U := \mathcal{W}_{T^*X}|_U$. Now, consider a real analytic symplectic manifold $\mathfrak{M}$. It is well known that it is possible to construct a globally defined sheaf of algebras $\mathcal{W}_{\mathfrak{M}}$ on $\mathfrak{M}$ such that:
• there exists an open covering $M = \bigcup_{i \in I} U_i$ and real symplectic isomorphisms $\varphi_i : U_i \xrightarrow{\sim} V_i$, where the $V_i$'s are open in the conormal bundles $T^*_{M_i} X_i$ for some real manifolds $M_i$ with complexification $X_i$,
• $W_M|_{V_i} \simeq \varphi_i^{-1} W_{V_i}$ for all $i \in I$.

Replacing $M$ with $\mathbb{C}_s \times M$, one easily constructs the sheaf of algebras $W_{\mathbb{C}_s \times M}$ of sections with holomorphic parameter $s \in \mathbb{C}_s$. Setting

$$W^s_M := R^1 a_!\, W_{\mathbb{C}_s \times M}$$

we get a filtered k-algebra similar to the algebra $W^s_{T^*X}$ of Definition 3.1. Then, if $P$ belongs to $W_M$ and has order 0, the section $\frac{1}{s-P}$ is well defined in $W^s_M$.

6. Applications

As an application, let us construct the exponential of sections of order 0 of $W_{T^*X}$. Consider a section $P$ of $W_{T^*X}(0)$ on an open subset $U$ of $T^*X$. For each compact subset $K$ of $U$, there exists $R > 0$ such that the section $s - P$ of $W^s_{T^*X}$ defined on $\mathbb{C}_s \times U$ is invertible on $(\mathbb{C}_s \setminus D(0, R)) \times K$, where $D(0, R)$ denotes the closed disc centered at 0 with radius $R$. Therefore $\frac{1}{s-P}$ defines an element of $H^1_c(\mathbb{C}_s \times U; W_{\mathbb{C}_s \times T^*X})$, hence an element of $\Gamma(U; W^s_{T^*X})$. We still denote this section of $W^s_{T^*X}$ on $U$ by $\frac{1}{s-P}$. By developing $\frac{1}{s-P}$ as $\sum_{n \ge 0} \frac{P^n}{s^{n+1}}$ and applying the Laplace transform, we get formally: $\mathcal{L}(\frac{1}{s-P}) = \exp(t\hbar^{-1}P)$.

Notation 6.1. We denote by $\exp(t\hbar^{-1}P)$ the image in $W^t_{T^*X}$ of the section $\frac{1}{s-P}$ of $W^s_{T^*X}$.
Proposition 6.2. For $P \in W_{T^*X}(0)$, there is a section $\exp(t\hbar^{-1}P) \in W^t_{T^*X}$ such that, when $X$ is affine:

$$\sigma_{tot}(\exp(t\hbar^{-1}P)) = \sum_{n \ge 0} \frac{(t\hbar^{-1}\sigma_{tot}(P))^{\star n}}{n!},$$

where the star-product $f^{\star n}$ means the product given by the Leibniz formula (2.5).

Remark 6.3. The Leibniz formula (2.5) is nothing but the standard or normal or Wick star-product and Proposition 6.2 tells us that the star-exponential [1] of $P$ makes sense in $W^t_{T^*X}$. In a holomorphic deformation quantization context, the star-exponential of $P$ is heuristically related to the Feynman Path Integral FPI($P$) of $P$. Indeed, the Feynman Path Integral of a Hamiltonian $H$ is the symbol of the evolution operator associated to $H$, the precise relation being given (see [2]) by

$$\exp(-xu\hbar^{-1})\,\mathrm{FPI}(P) = \sigma_{tot}(\exp(t\hbar^{-1}P)).$$
Example 6.4. As a simple example, take $X = \mathbb{C}$ and $P \in W_{T^*X}(0)$ with $\sigma_{tot}(P) = p_0(t, x; u) = \theta x u$, $\theta \in \mathbb{C}$. Up to a change of holomorphic symplectic coordinates, $\sigma_{tot}(P)$ represents the Hamiltonian of the harmonic oscillator in the holomorphic representation. Clearly $P$ is in $W_{\mathbb{C}}(0)$, and the total symbol of $\exp(t\hbar^{-1}P)$ is easily computed:

$$\frac{\partial}{\partial t}\sigma_{tot}(\exp(t\hbar^{-1}P)) = \sigma_{tot}(\hbar^{-1}P \circ \exp(t\hbar^{-1}P))$$
$$= \hbar^{-1}\sigma_{tot}(P)\,\sigma_{tot}(\exp(t\hbar^{-1}P)) + \frac{\partial}{\partial u}\sigma_{tot}(P)\,\frac{\partial}{\partial x}\sigma_{tot}(\exp(t\hbar^{-1}P))$$
$$= \hbar^{-1}\theta u x\,\sigma_{tot}(\exp(t\hbar^{-1}P)) + \theta x\,\frac{\partial}{\partial x}\sigma_{tot}(\exp(t\hbar^{-1}P)).$$

Since $\sigma_{tot}(\exp(t\hbar^{-1}P))|_{t=0} = 1$, the solution to the preceding equation is:

$$\sigma_{tot}(\exp(t\hbar^{-1}P)) = \exp\big((\exp(\theta t) - 1)\,x u \hbar^{-1}\big).$$

The Feynman Path Integral for the harmonic oscillator is well known in the Physics literature and is given by $\exp\big(\exp(\theta t)\,x u \hbar^{-1}\big)$ [3].

Acknowledgement. We would like to thank Masaki Kashiwara for extremely useful conversations and helpful insights. The first named author thanks Yoshiaki Maeda for warm hospitality at Keio University where this work was finalized, and the JSPS for financial support.
References
1. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation theory and quantization I, II. Ann. Phys. 111, 61–110, 111–151 (1978)
2. Dito, J.: Star product approach to quantum field theory: The free scalar field. Lett. Math. Phys. 20, 125–134 (1990)
3. Faddeev, L.D., Slavnov, A.A.: Gauge Fields. Introduction to Quantum Theory. Reading, MA: Benjamin Cummings Publishing, 1980
4. Kashiwara, M.: Quantization of contact manifolds. Publ. RIMS, Kyoto Univ. 32, 1–5 (1996)
5. Kashiwara, M.: D-modules and Microlocal Calculus. Translations of Mathematical Monographs 217. Providence, RI: American Math. Soc., 2003
6. Kontsevich, M.: Deformation quantization of algebraic varieties. In: EuroConférence Moshé Flato, Part III (Dijon, 2000). Lett. Math. Phys. 56, 271–294 (2001)
7. Kontsevich, M.: Deformation quantization of Poisson manifolds. Lett. Math. Phys. 66, 157–216 (2003)
8. Polesello, P., Schapira, P.: Stacks of quantization-deformation modules over complex symplectic manifolds. Int. Math. Res. Notices 49, 2637–2664 (2004)
9. Sato, M., Kawai, T., Kashiwara, M.: Microfunctions and pseudo-differential equations. In: Komatsu, H. (ed.), Hyperfunctions and pseudo-differential equations. Proceedings, Katata 1971. Lecture Notes in Math. 287. New York: Springer-Verlag, 1973, pp. 265–529
10. Schapira, P.: Microdifferential Systems in the Complex Domain. Grundlehren der Math. Wiss. 269. Berlin: Springer-Verlag, 1985
11. Siu, Y.T.: Every Stein subvariety admits a Stein neighborhood. Invent. Math. 38, 89–100 (1976/77)

Communicated by L. Takhtajan
Commun. Math. Phys. 273, 415–443 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0194-6
Communications in
Mathematical Physics
Heat Kernel and Number Theory on NC-Torus

V. Gayral (1), B. Iochum (2,3), D. V. Vassilevich (4,5)

1 Matematisk Afdeling, Københavns Universitet, 2100 København, Denmark. E-mail: [email protected]
2 UMR 6207, Unité Mixte de Recherche du CNRS et des Universités Aix-Marseille I, Aix-Marseille II et de l’Université du Sud Toulon-Var, Laboratoire affilié à la FRUMAM – FR 2291, Luminy Case 907, F-13288 Marseille Cedex, France. E-mail: [email protected]
3 Université de Provence, Marseille, France
4 Institut für Theoretische Physik, Universität Leipzig, Postfach 100920, D-04099 Leipzig, Germany. E-mail: [email protected]
5 V. A. Fock Institute of Physics, St. Petersburg University, St. Petersburg 198504, Russia

Received: 12 July 2006 / Accepted: 11 September 2006
Published online: 13 March 2007 – © Springer-Verlag 2007
Abstract: The heat trace asymptotics on the noncommutative torus, where generalized Laplacians are made out of left and right regular representations, is fully determined. It turns out that this question is very sensitive to the number-theoretical aspect of the deformation parameters. The central condition we use is of a Diophantine type. More generally, the importance of number theory is made explicit in a few examples. We apply the results to the spectral action computation and revisit the UV/IR mixing phenomenon for a scalar theory. Although we find non-local counterterms in the NC $\varphi^4$ theory on $\mathbb{T}^4$, we show that this theory can be made renormalizable at least at one loop, and maybe even beyond.

1. Introduction

The importance of heat kernel techniques in spectral analysis (see [26, 32]) or in quantum field theory has been known for a long time (see for instance the references in [40]). This type of expansion is particularly useful for the control of anomalies and loop divergences. Naturally, its extension to noncommutative theories, using for instance the Moyal product instead of the pointwise one, was also begun a long time ago (see reviews [18, 37, 39] and also [10]). The idea, originally due to Heisenberg, behind this generalization is that it could help to suppress some divergences. Unfortunately, a consequence of this idea is that the situation is as difficult as in the classical setting, or even worse since some UV/IR mixing can occur, except in some peculiar cases where the renormalisability of the model is proved [29].

Meanwhile, the noncommutative geometry (NCG) pioneered by Alain Connes [12] has shown its capacity to cover isospectral deformations like the deformation of a classical torus into the celebrated noncommutative torus (nc-torus). While many physical ideas coming from string theory have justified a systematic study of noncommutative quantum field theory, the interest of NCG stems also from its mathematical roots.
In particular, the spectral action introduced by Chamseddine–Connes refers to a spectral triple $(\mathcal{A}, \mathcal{H}, D)$ of an algebra $\mathcal{A}$ acting on a Hilbert space $\mathcal{H}$ and a given Dirac operator $D$ which generates the inner fluctuations corresponding to gauge potentials. This spectral action is simply $\mathrm{Tr}\,\Phi(D_A/\Lambda)$, where $\Phi$ is a positive even function, $\Lambda$ is a mass scale parameter, $D_A = D + A$ and $A$ is a one-form.

This torus depends on parameters through a deformation matrix $\Theta$ and it appears that the heat asymptotics are very sensitive to it. In particular, some Diophantine conditions, see (14), (16), are necessary to control the number-theoretic deviation from rational numbers. We also investigate a situation beyond this condition which yields precise aspects of number theory.

To control the heat trace asymptotic we apply a two-step procedure. First, we define a trace (cf. (11)), which is indeed proportional to the Dixmier trace, and calculate it through the Fourier coefficients (cf. (18)). Then we prove that, being expressed in terms of this trace, the heat trace asymptotics for generalized Laplacians look precisely the same as in the commutative case.

We first apply the control of the heat trace asymptotic to the spectral action computation. This action has been partially computed in [24], but here we pay attention to the natural existing real structure $J$ of the triple ([13]): now $D_A = D + A + JAJ^{-1}$, so we are in the most difficult situation where simultaneously left and right regular representations exist (see also [27] for the further physical motivations). We show here that this full spectral action is the expected one for a 4-dimensional nc-torus. The amazing fact is that unlike for arbitrary generalized Laplacians, where non-standard terms appear in the heat kernel expansion (typically of the form ‘product of traces’), for the square of the covariant Dirac operator, such weird terms are absent. Thus the formula (57) we obtain is the expected one, up to some numerical coefficients.

Then, we apply our results to the study of a scalar field on an nc-4-torus.
We show that the divergent part of the effective action does not reproduce the structure of the classical one, that is, the divergences cannot be cancelled by a proper redefinition of the couplings. Nevertheless, the theory can be made renormalizable at one loop by adding to the classical action a non-local term, which perfectly fits in with the philosophy of [29]. We conjecture that the modified theory is renormalizable to all orders in perturbation theory.

The paper is organized as follows: we recall in Sect. 2 some useful facts on nc-tori and study a trace, in fact a Dixmier trace, applied to operators like $L(a)R(b)$, $a, b \in \mathcal{A}$, where $L$ (resp. $R$) is the left (resp. right) multiplication, giving a full asymptotic of $\mathrm{Tr}\big(L(a)R(b)e^{-tP}\big)$ as $t \to 0$ for a generalized Laplacian $P$. Section 3 touches on toric noncommutative manifolds, not necessarily compact. The spectral action is computed in Sect. 4 and the last section is devoted to the study of divergences of a scalar field theory. Since the proof of the asymptotics of the heat trace is technical, it is postponed to Appendix A while some consequences of number theory in this setting are developed in Appendix B.

2. Heat Trace Asymptotic on NC-Torus

2.1. Traces and number theory. Let $C^\infty(\mathbb{T}^n_\Theta)$ be the smooth noncommutative n-torus associated to a skewsymmetric deformation matrix $\Theta \in M_n(\mathbb{R})$ (see [11, 35]). This means that $C^\infty(\mathbb{T}^n_\Theta)$ is the algebra generated by $n$ unitaries $u_i$, $i = 1, \dots, n$, subject to the relations

$$u_i u_j = e^{i\Theta_{ij}}\, u_j u_i, \tag{1}$$

and with Schwartz coefficients. In the following, for $k, q \in \mathbb{Z}^n$, we denote $k.q := g_{\mu\nu} k^\mu q^\nu$ and $|k|^2 := k.k$, with $g_{\mu\nu}$ a constant metric on $\mathbb{T}^n$. Naturally, the matrix $\Theta$ satisfies $k.\Theta q = -\Theta k.q$.
Using the Weyl elements $U_k := e^{-\frac{i}{2} k.\chi k}\, u_1^{k_1} \cdots u_n^{k_n}$, $k \in \mathbb{Z}^n$, where $\chi$ is the matrix restriction of $\Theta$ to its upper triangular part, the relation (1) reads

$$U_k U_q = e^{-\frac{i}{2} k.\Theta q}\, U_{k+q}. \tag{2}$$

Thus the unitary operators $U_k$ satisfy $U_k^* = U_{-k}$ and a typical element $a \in C^\infty(\mathbb{T}^n_\Theta)$ can be written as $a = (2\pi)^{-n/2} \sum_{k \in \mathbb{Z}^n} a_k U_k$, where $\{a_k\} \in \mathcal{S}(\mathbb{Z}^n)$. We use this non-standard normalization in order to simplify upcoming formulas.

Let $\tau$ be the (unique) normalized faithful trace on $C^\infty(\mathbb{T}^n_\Theta)$ defined by $\tau(a) := (2\pi)^{-n/2} a_0$ and $\mathcal{H}_\tau$ be the GNS Hilbert space obtained by completion of $C^\infty(\mathbb{T}^n_\Theta)$ with respect to the norm induced by the scalar product $\langle a, b \rangle := \tau(a^* b)$. On $\mathcal{H}_\tau$, we consider the left and right regular representations of $C^\infty(\mathbb{T}^n_\Theta)$ by bounded operators, that we denote respectively by $L(.)$ and $R(.)$. An easy consequence of the associativity of the algebra is the commutativity of these two representations, namely $L(a)R(b) = R(b)L(a)$, for all $a, b \in C^\infty(\mathbb{T}^n_\Theta)$.

Let also $\delta_\mu$, $\mu = 1, \dots, n$, be the $n$ (pairwise commuting) canonical derivations, defined by

$$\delta_\mu(U_k) := i k_\mu U_k. \tag{3}$$
They extend to unbounded operators on $\mathcal{H}_\tau$ (with suitable domain), and let $\Delta := -g^{\mu\nu}\delta_\mu\delta_\nu \ge 0$ be the associated Laplacian (with constant metric).

There is (at least) one analogous description of $C^\infty(\mathbb{T}^n_\Theta)$ given in terms of the Rieffel star-product [36]. If $\alpha$ denotes the (periodic) action of $\mathbb{R}^n$ on $\mathbb{T}^n$, then $C^\infty(\mathbb{T}^n_\Theta) \simeq \big(C^\infty(\mathbb{T}^n), \star_\Theta\big)$, where the star-product is defined by the following oscillatory integrals:

$$f \star_\Theta h := (2\pi)^{-n} \int_{\mathbb{R}^n \times \mathbb{R}^n} d^n y\, d^n z\; e^{-i y z}\, \alpha_{\frac{1}{2}\Theta y}(f)\, \alpha_{-z}(h), \qquad f, h \in C^\infty(\mathbb{T}^n), \tag{4}$$

yielding relation (2) on the Fourier modes,

$$e^{ikx} \star_\Theta e^{iqx} = e^{-\frac{i}{2} k.\Theta q}\, e^{i(k+q)x}.$$

A counterpart of Eq. (3) reads $\partial_\mu e^{ikx} = i k_\mu e^{ikx}$. In this description, the above defined Laplacian is nothing else but the ordinary one (associated to the constant metric $g$) and the trace $\tau$ is the normalized integral,

$$\tau(f) := (2\pi)^{-n} \int_{\mathbb{T}^n} d^n x\, f(x) = (\mathrm{vol}\, \mathbb{T}^n)^{-1} \int_{\mathbb{T}^n} d^n x\, \sqrt{g}\, f(x), \qquad f \in C^\infty(\mathbb{T}^n).$$

We can consider a non-flat metric, but we need (for later use) a severe restriction on it: the $\mathbb{R}^n$-action must be isometric. Thus only constant metrics are allowed.(1)

(1) There are very few attempts to deal with metrics which are not constant in the non-commutative directions. So far, one was able to obtain expressions for the heat trace asymptotics as formal power series in deviations of the metric from the flat one only [42]. Similar difficulties appear if the metric is matrix-valued [2].
The purpose of this section is to establish the small-t asymptotics of the function $t \mapsto \mathrm{Tr}\big(L(l)R(r)e^{-tP}\big)$, where $P$ is a generalized Laplacian, i.e. $P$ has the form

$$P := -(g^{\mu\nu}\nabla_\mu\nabla_\nu + E), \tag{5}$$

where

$$\nabla_\mu := \delta_\mu + \omega_\mu := \delta_\mu + L(\lambda_\mu) - R(\rho_\mu), \tag{6}$$
$$E := L(l_1) - R(r_1) + L(l_2)R(r_2), \tag{7}$$

and $l, r, \lambda_\mu, \rho_\mu, l_i, r_i \in C^\infty(\mathbb{T}^n_\Theta)$. The arbitrary choice of the sign $-R$ will be justified in (49). One can also take more general forms of $E$ and $\omega$. For example, $\omega_\mu$ can contain a term like $R(\rho'_\mu)L(l')$ with some smooth $\rho'_\mu$ and $l'$. Such modifications change very little in our considerations below.

For that, the central asymptotic to compute is the one of the function

$$t \mapsto \mathrm{Tr}\big(L(l)\,R(r)\,e^{-t\Delta}\big), \qquad l, r \in C^\infty(\mathbb{T}^n_\Theta). \tag{8}$$

Indeed, after an expansion of the semi-group $e^{-tP}$, viewed as an unbounded perturbation of the heat operator $e^{-t\Delta}$, the only other asymptotics we need are

$$t \mapsto \mathrm{Tr}\big(L(l)\,e^{-t\Delta}\big), \qquad t \mapsto \mathrm{Tr}\big(R(r)\,e^{-t\Delta}\big), \tag{9}$$

but we have shown in [24, 41] that they have the same asymptotics as their commutative ($\Theta = 0$) counterparts.

Note that the heat semi-group $e^{-t\Delta}$ is trace-class, since it is diagonal in the orthonormal basis $\{U_k\}_{k \in \mathbb{Z}^n}$, with eigenvalues $e^{-t|k|^2}$. Of course, the same property holds for $e^{-tP}$, as shown in the next lemma, based on a simple application of the Duhamel expansion.

Lemma 2.1. For any $r, l, \lambda_\mu, \rho_\mu, l_i, r_i \in C^\infty(\mathbb{T}^n_\Theta)$, the operator $e^{-tP}$ is trace-class for $t > 0$.

Proof. We are going to use Duhamel’s expansion for the semi-group generated by $P$, viewed as an unbounded perturbation of $\Delta$. We write $P = \Delta - B$, where $B = 2g^{\mu\nu}\omega_\mu\delta_\nu + C$ with $C = g^{\mu\nu}(\omega_\mu\omega_\nu + \omega_{\nu,\mu}) + E$ and $\omega_{\nu,\mu} = L(\delta_\mu\lambda_\nu) - R(\delta_\mu\rho_\nu)$. From the Duhamel principle

$$e^{-t(A+B)} = e^{-tA} - t \int_0^1 e^{-st(A+B)}\, B\, e^{-(1-s)tA}\, ds,$$
we first formally write

$$e^{-tP} = \sum_{j=0}^{\infty} (-t)^j E_j(t), \tag{10}$$

where $E_0(t) := e^{-t\Delta}$ and

$$E_j(t) := \int_{\Delta_j} e^{-s_1 t\Delta}\, B\, e^{-(s_2 - s_1)t\Delta} \cdots B\, e^{-(1 - s_j)t\Delta}\, d^j s.$$

Here $\Delta_j$ denotes the ordinary j-simplex:

$$\Delta_j := \{ s \in \mathbb{R}^j ;\ 0 \le s_1 \le \cdots \le s_j \le 1 \} \simeq \Big\{ s \in \mathbb{R}^{j+1} ;\ s_i \ge 0,\ \sum_{i=0}^{j} s_i = 1 \Big\}.$$
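We prove convergence of the expansion (10) in the trace-norm and for reasonably small $t$: from the Hölder inequality for Schatten classes, we have

$$\|E_j(t)\|_1 \le \int_{\Delta_j} \|e^{-s_0 t\Delta}\|_{s_0^{-1}}\, \|B\, e^{-s_1 t\Delta}\|_{s_1^{-1}} \cdots \|B\, e^{-s_j t\Delta}\|_{s_j^{-1}}\, d^j s,$$

where $B = 2\omega^\mu\delta_\mu + C$ and $\omega_\mu$, $C$ are bounded. By functional calculus,

$$\|\delta_\mu e^{-s_i t\Delta}\|_{s_i^{-1}} \le \|\delta_\mu e^{-s_i t\Delta/2}\|\, \|e^{-s_i t\Delta/2}\|_{s_i^{-1}} \le c(g)\, (e s_i t)^{-1/2}\, (\mathrm{Tr}\, e^{-t\Delta/2})^{s_i},$$

using the inequality $\|f(\delta_1, \cdots, \delta_n)\|_{op} \le \|f\|_\infty$, where $f(x) = x_\mu e^{-x.x}$, which follows from $f(\delta_1, \cdots, \delta_n)U_k = f(ik)U_k$. So

$$\|E_j(t)\|_1 \le \int_{\Delta_j} (\mathrm{Tr}\, e^{-t\Delta})^{s_0} \Big( \|C\| (\mathrm{Tr}\, e^{-t\Delta})^{s_1} + 2c(g) \sum_\mu \|\omega_\mu\| (e s_1 t)^{-1/2} (\mathrm{Tr}\, e^{-t\Delta/2})^{s_1} \Big) \cdots$$
$$\cdots \Big( \|C\| (\mathrm{Tr}\, e^{-t\Delta})^{s_j} + 2c(g) \sum_\mu \|\omega_\mu\| (e s_j t)^{-1/2} (\mathrm{Tr}\, e^{-t\Delta/2})^{s_j} \Big)\, d^j s$$
$$\le \mathrm{Tr}\, e^{-t\Delta/2} \int_{\Delta_j} \Big( \|C\| + 2c(g) \sum_\mu \|\omega_\mu\| (e s_1 t)^{-1/2} \Big) \cdots \Big( \|C\| + 2c(g) \sum_\mu \|\omega_\mu\| (e s_j t)^{-1/2} \Big)\, d^j s.$$

Using

$$\int_{\Delta_j} \prod_{i=1}^{j} s_i^{-1/2}\, d^j s \le 2^{j-1},$$

the last expression can be estimated for $t \le e^{-1}$ (since $s_j \le 1$) by

$$t^{-(j-1)/2}\, e^{-(j-1)/2}\, 2^{j-1} \Big( \|C\| + 2c(g) \sum_\mu \|\omega_\mu\| \Big)^{j-1}\, \mathrm{Tr}\, e^{-t\Delta/2}.$$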
Thus

$$\Big\| \sum_{j=0}^{\infty} (-t)^j E_j(t) \Big\|_1 \le \sqrt{t}\; \mathrm{Tr}\, e^{-t\Delta/2} \sum_{j=0}^{\infty} \Big( \tfrac{2}{\sqrt{e}} \big( \|C\| + 2c(g) \sum_\mu \|\omega_\mu\| \big) \sqrt{t} \Big)^{j-1},$$

which is finite for

$$0 < t < \frac{e}{4\big( \|C\| + 2c(g) \sum_\mu \|\omega_\mu\| \big)^2} =: t_0.$$

Finally, for $t_0 \le t \le 2t_0$, note that

$$\|e^{-tP}\|_1 \le \|e^{-(t - t_0)P}\|_1\, \|e^{-t_0 P}\|,$$

and the result follows inductively.

One can probably prove this lemma also by using some estimates involving Sobolev spaces, cf. [41]. However, the Duhamel principle is quite important in its own right for physical applications. We shall use (10) below in Sect. 5. The convergence of the Duhamel expansion is necessary to construct the covariant perturbation series in the approach of Barvinsky and Vilkovisky [4]. Note also that the Duhamel expansion has been used to compute one-loop divergences in a more general framework of NCQFT, namely on the Moyal plane with degenerate but non-constant $\Theta$ [23].

Let us define the functional Sp on $\mathcal{L}(\mathcal{H}_\tau)$, given for a bounded operator $A$ by

$$\mathrm{Sp}(A) := \lim_{t \to 0^+} (4\pi t)^{n/2}\, \mathrm{Tr}\big( A\, e^{-t\Delta} \big). \tag{11}$$
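As a quick numerical illustration of definition (11), one can take $A$ to be the identity. The following is a minimal sketch, assuming the flat metric $g_{\mu\nu} = \delta_{\mu\nu}$ and $n = 2$, so that $\mathrm{Tr}\, e^{-t\Delta} = \sum_{k \in \mathbb{Z}^2} e^{-t|k|^2}$; the function names are illustrative and do not come from the paper. The rescaled trace should approach $(2\pi)^n$, the volume of the torus.

```python
import math

def heat_trace_flat_torus(t, n=2, kmax=200):
    """Tr e^{-t Delta} = sum over k in Z^n of exp(-t |k|^2), flat metric.

    The n-dimensional lattice sum factorizes into n identical 1D sums.
    """
    one_dim = sum(math.exp(-t * k * k) for k in range(-kmax, kmax + 1))
    return one_dim ** n

def sp_of_identity(t, n=2):
    """(4 pi t)^{n/2} Tr e^{-t Delta}; its t -> 0+ limit is Sp(1) in (11)."""
    return (4 * math.pi * t) ** (n / 2) * heat_trace_flat_torus(t, n)

# The rescaled trace is compared with (2 pi)^n for decreasing t.
for t in (0.5, 0.1, 0.02):
    print(t, sp_of_identity(t), (2 * math.pi) ** 2)
```

The deviation from $(2\pi)^2$ is of order $e^{-\pi^2/t}$, which is an instance of the exponentially small terms discussed in this section.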
Note that this definition is not sensitive to the kernel of $\Delta$ (which is $\mathbb{C}U_0$), so we may replace $\Delta$ by $\Delta + 1$ or assume that $\Delta$ is invertible. We will show that $\mathrm{Sp}\big(L(.)R(.)\big)$, as a functional on $C^\infty(\mathbb{T}^n_\Theta) \times C^\infty(\mathbb{T}^n_\Theta)$, is indeed a finite and faithful trace in each argument, namely it vanishes whenever one of its arguments is a commutator, see Corollary 2.4. This should not be surprising once one knows that $\mathrm{Sp}(A)$ is a multiple of the Dixmier trace of $A(1 + \Delta)^{-n/2}$. Indeed, from the knowledge of the eigenvalues of $\Delta$ and the boundedness of $A$, one has $A(1 + \Delta)^{-n/2} \in \mathcal{L}^{(1,\infty)}(\mathcal{H}_\tau)$, i.e., $\sum_{k=1}^{N} \mu_k\big(A(1 + \Delta)^{-n/2}\big) = O(\ln N)$, where $\mu_k(X)$ are the ordered singular values of $X$. It follows then (see [14, p. 236] with immediate modifications) that

$$\mathrm{Tr}_\omega\big( A(1 + \Delta)^{-n/2} \big) = \frac{1}{\Gamma(n/2 + 1)} \lim_{t \to 0} t^{n/2}\, \mathrm{Tr}\big(A\, e^{-t\Delta}\big) = \frac{1}{(4\pi)^{n/2}\, \Gamma(n/2 + 1)}\, \mathrm{Sp}(A).$$
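However, we leave this feature now since our goal is to find an algorithm to obtain analytic expressions for the heat coefficients. For that, Dixmier-trace technology is not so helpful.

From relations (1) and using the orthonormal basis $\{U_k\}_{k \in \mathbb{Z}^n}$ to compute the trace, we find for $l = (2\pi)^{-n/2} \sum_{q_1} l_{q_1} U_{q_1}$, $r = (2\pi)^{-n/2} \sum_{q_2} r_{q_2} U_{q_2}$ in $C^\infty(\mathbb{T}^n_\Theta)$,

$$\mathrm{Tr}\big( L(l)\, R(r)\, e^{-t\Delta} \big) = \sum_{k \in \mathbb{Z}^n} \tau\big( U_k^*\, L(l)\, R(r)\, e^{-t\Delta}\, U_k \big) = \sum_{k \in \mathbb{Z}^n} e^{-t|k|^2}\, \tau\big( U_k^*\, l\, U_k\, r \big)$$
$$= (2\pi)^{-n} \sum_{k, q_1, q_2 \in \mathbb{Z}^n} e^{-t|k|^2}\, l_{q_1} r_{q_2}\, \tau\big( U_{-k} U_{q_1} U_k U_{q_2} \big)$$
$$= (2\pi)^{-n} \sum_{q, k \in \mathbb{Z}^n} l_q\, r_{-q}\, e^{-ik.\Theta q}\, e^{-t|k|^2},$$

which after Poisson resummation reads

$$\mathrm{Tr}\big( L(l)\, R(r)\, e^{-t\Delta} \big) = \sqrt{g}\, (4\pi t)^{-n/2} \sum_{q, k \in \mathbb{Z}^n} l_q\, r_{-q}\, e^{-|\Theta q - 2\pi k|^2/4t} = \sqrt{g} \sum_{q, k \in \mathbb{Z}^n} l_q\, r_{-q}\, K(t, \Theta q - 2\pi k), \tag{12}$$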
where $K(t, x) := (4\pi t)^{-n/2} e^{-|x|^2/4t}$ is the heat kernel of $\mathbb{R}^n$ with metric $g^{\mu\nu}$.

To proceed further, we need to impose some restrictions on the matrix $\Theta$. Whereas it does not for the asymptotics of (9), the number-theoretical aspect of $\Theta$ has huge consequences for the asymptotics of (8). To explain what this is about, let us review what happens in the (nondegenerate) two-dimensional case, where $\Theta = \theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, with $\theta$ an arbitrary real number. (Actually, up to an isomorphism, the range of $\theta$ can be reduced to the interval $[0, \frac{1}{2}]$.) In this case, there are two distinct situations to consider [21]. When $\theta$ is rational (relative to $2\pi$), then in the sum (12), only terms with $q$ a multiple of the $\theta$-denominator will contribute to the small-t asymptotics. When $\theta$ is irrational (again relative to $2\pi$), one can guess that only the zero-mode will contribute to the small-t asymptotics of $\mathrm{Sp}\big(L(l)R(r)\big)$. This is indeed true, provided one has a control on the sum

$$\sum_{k \in \mathbb{Z}^2} \sum_{0 \ne q \in \mathbb{Z}^2} l_q\, r_{-q}\, \frac{e^{-|\theta q - 2\pi k|^2/4t}}{(4\pi t)^{n/2}}, \tag{13}$$
i.e., provided one can measure how far from rationals $\theta$ is, since $|\theta q - 2\pi k|$ could be in principle arbitrarily close to 0. This control is precisely given by a Diophantine condition.

Definition 2.2. A number $\theta$ is said to satisfy a Diophantine condition (relative to $2\pi$) if there exist two constants $C > 0$, $\beta \ge 0$ such that for all $q \in \mathbb{Z}^*$,

$$\|\theta q\|_{\mathbb{T}} := \inf_{k \in \mathbb{Z}} |\theta q - 2\pi k| \ge \frac{C}{|q|^{1+\beta}} \iff |1 - \cos(\theta q)| \ge \frac{2C^2}{|q|^{1 + 2\beta + \beta^2}}. \tag{14}$$
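The torus norm in (14) is easy to probe numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the choice $\theta = 2\pi \cdot \frac{1+\sqrt{5}}{2}$ is an example of an angle that is irrational relative to $2\pi$, and a finite scan over $q$ is numerical evidence rather than a proof that a Diophantine condition holds.

```python
import math

TWO_PI = 2 * math.pi

def torus_norm(x):
    """Distance from x to the nearest integer multiple of 2*pi."""
    r = abs(math.fmod(x, TWO_PI))
    return min(r, TWO_PI - r)

def diophantine_constant(theta, beta, qmax):
    """Smallest value of |q|^(1+beta) * ||theta q||_T over 1 <= q <= qmax.

    Negative q give the same values, so scanning q > 0 is enough.
    """
    return min(q ** (1 + beta) * torus_norm(theta * q)
               for q in range(1, qmax + 1))

golden = TWO_PI * (1 + math.sqrt(5)) / 2   # irrational relative to 2*pi
rational = TWO_PI / 3                      # rational relative to 2*pi

# For the golden-ratio angle the scan stays bounded away from zero with
# beta = 0; for the rational angle it collapses at q = 3.
print(diophantine_constant(golden, 0.0, 2000))
print(diophantine_constant(rational, 0.0, 2000))
```

In the rational case no constant $C > 0$ can work, which is why the rational and irrational parts of the deformation matrix are treated separately in the text.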
In fact, $\inf_{p \in \mathbb{Z}} |\theta q - p| \le |\sin(\pi\theta q)|$: for a given $q$, there exists an integer $p_0$ such that $|\theta q - p_0| \le \frac{1}{2}$ and, since we work under modulus, we may assume that $0 \le \theta q - p_0 \le \frac{1}{2}$. Since $(\sin x)/x$ is decreasing on $[0, \frac{\pi}{2}]$, we get $\sin(\pi(\theta q - p_0)) \ge 2(\theta q - p_0)$ and the above equivalence by taking the square of the inequality with $\sin^2(\frac{\theta q}{2}) = \frac{1}{2}(1 - \cos(\theta q))$.

In other words, this condition states that the inverse torus norm of $\theta q$ is a temperate distribution over $\mathbb{Z}^*$. This is exactly what we need since in Eq. (13), the complex coefficients $l_q$, $r_q$ are of Schwartz-class by the smoothness assumption. Note that this condition is not too restrictive since the set of irrational numbers satisfying a Diophantine condition is of full Lebesgue measure.

In the general Poisson case, one can always change the coordinates on $\mathbb{T}^n$, $y = Bx$ with a constant matrix $B$, so that the new coordinates $y$ are $2\pi$-periodic again, and the Poisson matrix $\bar\Theta = B^T \Theta B$ becomes

$$\bar\Theta = 0_l \oplus \bigoplus_{\{i\,:\,\theta_i \in 2\pi\mathbb{Q}^*\}} \theta_i \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \oplus \bigoplus_{\{j\,:\,\theta_j \in \mathbb{R} \setminus 2\pi\mathbb{Q}\}} \theta_j \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{15}$$
where $0_l$ is the zero matrix of size $l$ and $n = l + m_1 + m_2$. Here $m_1$ is the size of the rational part (with $\theta_i = 2\pi p_i/q_i$, $i = 1, \cdots, m_1$) and $m_2$ of the irrational one. We define then $\mathcal{Z} = \mathbb{Z}^l \times q_1\mathbb{Z} \times \cdots \times q_{m_1}\mathbb{Z} \times \{(0, \cdots, 0) \in \mathbb{Z}^{m_2}\}$. The rest of $\mathbb{Z}^n$ is denoted by $\mathcal{K}$ and is split into two parts, $\mathcal{K}_{per} \simeq \{(k_1, \cdots, k_{m_1}) \in \mathbb{N}^{m_1},\ 1 \le k_i \le q_i - 1,\ i = 1, \cdots, m_1\}$ and $\mathcal{K}_{inf} \simeq \mathbb{Z}^{m_2} \setminus \{0\}$. $\mathcal{K}_{per}$ is a finite set which lies in an $\mathbb{R}$-linear space generated by $\mathcal{Z}$. We shall omit the bar over $\Theta$ in what follows.

One can avoid the use of a particular coordinate system. Then $\mathcal{Z}$ is defined as the set of $q \in \mathbb{Z}^n$ such that $(2\pi)^{-1}\Theta q \in \mathbb{Z}^n$. This definition reveals the actual meaning of this set. One can also figure out how to give coordinate independent definitions of $\mathcal{K}_{per}$ and $\mathcal{K}_{inf}$.

Now we are ready to formulate our restriction on $\Theta$. We assume the following: The matrix $\Theta$ satisfies a Diophantine condition with respect to $\mathcal{K}_{inf}$, i.e., there are two positive constants $C$ and $\beta$ such that

$$\inf_{k \in \mathbb{Z}^n} |\Theta q - 2\pi k| \ge \frac{C}{|q|^{1+\beta}} \quad \text{for all } q \in \mathcal{K}_{inf}. \tag{16}$$
In the standard definition of a Diophantine condition for an $l$-tuple of real numbers, one assumes that $1 + \beta \ge l$. Our consideration is valid in a more general case for any positive $1 + \beta$, so we do not need this restriction as long as such constants exist; note, for instance, that there exists a set of full Lebesgue measure of $l$-tuples of Roth type, that is of $\alpha \in \mathbb{R}^l$ such that for all $\epsilon > 0$, there exists $C_\epsilon$ with $\inf_{k \in \mathbb{Z}^l} |\alpha q - 2\pi k| \ge \frac{C_\epsilon}{|q|^{n + \epsilon}}$, see [31].

We remark that it does not matter whether one uses the metric $g^{\mu\nu}$ or the normalized diagonal metric in norm-values of (16) since it can be absorbed in the constant $C$.

With this restriction on $\Theta$ we can prove the following formula which governs the asymptotic behavior of the trace at $t \to 0^+$:

$$\mathrm{Tr}\big( L(l)R(r)e^{-t\Delta} \big) = \frac{\sqrt{g}}{(4\pi t)^{n/2}} \sum_{q \in \mathcal{Z}} l_q\, r_{-q} + \text{e.s.t.}, \tag{17}$$
where e.s.t. denotes some exponentially small terms in $t$, i.e., the terms which vanish faster than any power of $t$ as $t \to 0^+$. The proof, which is somewhat technical, is postponed to Appendix A. Equation (17) immediately yields the following theorem.

Theorem 2.3. Assume $\Theta$ satisfies condition (16). Then for any $l, r \in C^\infty(\mathbb{T}^n_\Theta)$,

$$\mathrm{Sp}\big( L(l)R(r) \big) = \sqrt{g} \sum_{q \in \mathcal{Z}} l_q\, r_{-q}. \tag{18}$$
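The mechanism behind (17) and (18) can be checked numerically in the simplest rational case. The sketch below assumes $n = 2$, the flat metric, and $\Theta = \theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ with $\theta = 2\pi/3$; all function names are illustrative. For a fixed mode $q$ it evaluates the weight $(4\pi t)^{n/2} (2\pi)^{-n} \sum_k e^{-ik.\Theta q} e^{-t|k|^2}$ that multiplies $l_q r_{-q}$ in (12), and it also compares the two sides of the one-dimensional Poisson resummation used to pass to the Gaussian form.

```python
import math

def theta_sum_1d(x, t, kmax=200):
    """(2 pi)^{-1} * sum_{k in Z} exp(-i k x - t k^2); real by k -> -k symmetry."""
    return sum(math.cos(k * x) * math.exp(-t * k * k)
               for k in range(-kmax, kmax + 1)) / (2 * math.pi)

def gaussian_sum_1d(x, t, mmax=50):
    """(4 pi t)^{-1/2} * sum_{m in Z} exp(-(x - 2 pi m)^2 / (4 t))."""
    return sum(math.exp(-(x - 2 * math.pi * m) ** 2 / (4 * t))
               for m in range(-mmax, mmax + 1)) / math.sqrt(4 * math.pi * t)

def mode_weight(q, theta, t):
    """(4 pi t) (2 pi)^{-2} sum_{k in Z^2} exp(-i k.(Theta q) - t |k|^2), n = 2."""
    # Theta q for Theta = theta * [[0, -1], [1, 0]]
    x = (-theta * q[1], theta * q[0])
    return 4 * math.pi * t * theta_sum_1d(x[0], t) * theta_sum_1d(x[1], t)

theta = 2 * math.pi / 3
t = 0.05
# q = (3, 0) satisfies Theta q in 2 pi Z^2, so its weight tends to 1;
# q = (1, 0) does not, and its weight is exponentially small in t.
print(mode_weight((3, 0), theta, t))
print(mode_weight((1, 0), theta, t))
# Poisson resummation in one dimension: both sides agree.
print(theta_sum_1d(1.0, 0.1), gaussian_sum_1d(1.0, 0.1))
```

Only the modes with $(2\pi)^{-1}\Theta q \in \mathbb{Z}^n$ keep a weight of order one as $t \to 0^+$, in agreement with the sum over $\mathcal{Z}$ in (18). In the irrational case the same weights are small only if $|\Theta q - 2\pi k|$ does not approach zero too fast, which is what condition (16) provides.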
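The explicit expression (18) makes it possible to show rather directly that (11) indeed defines a trace in each variable: it vanishes whenever $l$ or $r$ is a commutator:

Corollary 2.4. Let $l, r, s \in C^\infty(\mathbb{T}^n_\Theta)$. Then, $\mathrm{Sp}\big( L(l)R([r, s]) \big) = 0$.

Proof. From commutation relations (2), we find

$$[r, s] = -2i\, (2\pi)^{-n} \sum_{k, q \in \mathbb{Z}^n} r_k\, s_q\, \sin(\tfrac{1}{2} k.\Theta q)\, U_{k+q},$$

and thus

$$[r, s]_k = 2i\, (2\pi)^{-n/2} \sum_{q \in \mathbb{Z}^n} r_{k-q}\, s_q\, \sin(\tfrac{1}{2} q.\Theta k),$$

which is zero whenever $k \in \mathcal{Z}$ since it is equivalent to $(2\pi)^{-1}\Theta k \in \mathbb{Z}^n$.

Remark 2.5. We have the following relations between the functional Sp and the trace $\tau$: $\mathrm{Sp}\big(L(a)\big) = \mathrm{Sp}\big(R(a)\big) = (\mathrm{vol}\, \mathbb{T}^n)\, \tau(a)$, for all $a \in C^\infty(\mathbb{T}^n_\Theta)$ and any $\Theta$, and when $\mathcal{Z} = \{0\}$ (i.e. pure Diophantine case), then $\mathrm{Sp}\big(L(l)R(r)\big) = (\mathrm{vol}\, \mathbb{T}^n)\, \tau(l)\, \tau(r)$, which makes transparent the statement of the corollary.

This completes our study of the trace (11). Below we present several relations similar to (17) which will be used in the next section. One can show (see Appendix A) that

$$\mathrm{Tr}\big( [L(l)R(r)]^{\mu}\, \delta_\mu\, e^{-t\Delta} \big) = 0 + \text{e.s.t.}, \tag{19}$$

where the notation $[L(l)R(r)]^{\mu_1 \dots \mu_m}$ means that the vector indices are distributed between $l$ and $r$. For higher derivatives we have

$$\mathrm{Tr}\big( [L(l)R(r)]^{\mu_1 \dots \mu_m}\, \delta_{\mu_1} \cdots \delta_{\mu_m}\, e^{-t\Delta} \big) = \sum_{k \in \mathbb{Z}^n} \tau\big( U_k^*\, [L(l)R(r)]^{\mu_1 \dots \mu_m}\, \delta_{\mu_1} \cdots \delta_{\mu_m}\, e^{-t\Delta}\, U_k \big)$$
$$= i^m \sum_{k \in \mathbb{Z}^n} k_{\mu_1} \cdots k_{\mu_m}\, \tau\big( U_k^*\, [L(l)R(r)]^{\mu_1 \dots \mu_m}\, e^{-t\Delta}\, U_k \big)$$
$$=: i^m\, G^{(m)}_{\mu_1 \dots \mu_m}\, \mathrm{Tr}\big( [L(l)R(r)]^{\mu_1 \dots \mu_m}\, e^{-t\Delta} \big). \tag{20}$$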
One can calculate the tensors $G^{(m)}$ by varying (17) or (19) with respect to the metric $g^{\mu\nu}$ (this is a standard way to include derivatives in the heat trace expansion, cf. [6]). All $G^{(2j+1)}$ are exponentially small and can be neglected in our analysis. For even $m$, corresponding tensors $G^{(m)}$ are obtained from the following recursion relation:

$$G^{(2p+2)}_{\mu\nu\mu_1 \dots \mu_{2p}} = -\frac{1}{t}\, \frac{\delta}{\delta g^{\mu\nu}}\, G^{(2p)}_{\mu_1 \dots \mu_{2p}}, \tag{21}$$

with $G^{(0)} = \sqrt{g}$. One has to take into account that $g^{\mu\nu}$ is symmetric, so that not all of the components are indeed independent, and

$$\frac{\delta}{\delta g^{\mu\nu}}\, g_{\rho\sigma} = -\frac{1}{2}\, (g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho}). \tag{22}$$

For example,

$$G^{(2)}_{\mu\nu} = \frac{\sqrt{g}}{2t}\, g_{\mu\nu}, \qquad G^{(4)}_{\mu\nu\rho\sigma} = \frac{\sqrt{g}}{4t^2}\, (g_{\mu\nu} g_{\rho\sigma} + g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho}). \tag{23}$$
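The tensors in (23) are Gaussian moments of the lattice momenta, which can be checked numerically in one dimension. The sketch below is an illustration under stated assumptions: $n = 1$, inverse metric $g^{11} = a$ (so $|k|^2 = a k^2$ and $g_{11} = 1/a$), and hypothetical function names. It compares the ratios $G^{(2)}/G^{(0)}$ and $G^{(4)}/G^{(0)}$ read off from (23) with normalized lattice sums, which avoids any dependence on the overall normalization.

```python
import math

def lattice_moment(power, t, a=1.0, kmax=400):
    """sum_{k in Z} k^power * exp(-t a k^2), with g^{11} = a in one dimension."""
    return sum(k ** power * math.exp(-t * a * k * k)
               for k in range(-kmax, kmax + 1))

t, a = 0.05, 2.0
g_lower = 1.0 / a                        # g_{11}, the inverse of g^{11}

m0 = lattice_moment(0, t, a)
ratio1 = lattice_moment(1, t, a) / m0    # odd moment, vanishes by symmetry
ratio2 = lattice_moment(2, t, a) / m0    # compare with g_{11} / (2 t)
ratio4 = lattice_moment(4, t, a) / m0    # compare with 3 g_{11}^2 / (4 t^2)

print(ratio1)
print(ratio2, g_lower / (2 * t))
print(ratio4, 3 * g_lower ** 2 / (4 * t ** 2))
```

With all indices equal, the three terms in $G^{(4)}$ add up to $3 g_{11}^2$, which is the factor 3 in the last comparison. The agreement holds up to corrections of order $e^{-\pi^2/(ta)}$.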
2.2. Heat trace asymptotics for generalized Laplacians. From expression (5) of the generalized Laplacian $P = -g^{\mu\nu}\nabla_\mu\nabla_\nu - E$, we need also to define the associated curvature:

$$\Omega_{\mu\nu} := \nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu = L(\Omega^L_{\mu\nu}) - R(\Omega^R_{\mu\nu}),$$

and higher “covariant derivatives” of $E$ and $\Omega$ as repeated commutators with $\nabla$. For example, $E_{;\mu} := [\nabla_\mu, E]$, $E_{;\mu\nu} := [\nabla_\nu, E_{;\mu}]$.

Actually, there are two covariant derivatives, $\nabla^L_\mu := \delta_\mu + L(\lambda_\mu)$ and $\nabla^R_\mu := \delta_\mu - R(\rho_\mu)$, and two gauge symmetries in the problem. One of these symmetries acts on “left” fields $\lambda_\mu, l_1, l_2$, the other acts on “right” fields $\rho_\mu, r_1, r_2$. The gauge group has a direct product structure. Explicit expressions for the symmetry transformations can be found in [43]. The functional Sp of any polynomial of $E$, $\Omega$ and derivatives is gauge invariant. Full invariance will mean that also all vector indices are contracted in pairs. This is precisely the class of invariants which will appear in the heat trace asymptotics. Due to the product structure of the gauge group there are gauge invariants which do not belong to the class we have just described. For example, $\mathrm{Sp}\big(L(l_1)\big)$ is such an invariant. We will see that in the spectral action picture, there is only one gauge group, namely the automorphism group of the algebra but lifted to the spinor bundle via the charge conjugation operator. In such an application, the representation really looks like the adjoint one (see Sect. 4).

We shall need the notion of canonical mass dimension. We assign canonical mass dimension 2 to $E$ and $\Omega$, and canonical mass dimension 1 to each derivative. Canonical mass dimension of any monomial is the sum of canonical mass dimensions of all factors. The heat trace asymptotic is then given by the following

Theorem 2.6. Let $P$ be as defined above. Then

i) There is a full asymptotic expansion of the heat trace

$$\mathrm{Tr}\big( L(l)R(r)e^{-tP} \big) \underset{t \to 0^+}{\sim} \sum_{k=0}^{\infty} a_k(l, r; P)\, t^{(k-n)/2}. \tag{24}$$

ii) The coefficients $a_k$ can be expressed as

$$a_k(l, r; P) = \sum_\alpha b_\alpha\, \mathrm{Sp}\big( L(l)R(r)A_\alpha \big), \tag{25}$$

where $A_\alpha$ are independent invariant free polynomials of canonical mass dimension $k$ of $E$, $\Omega$ and their covariant derivatives. The numbers $b_\alpha$ are constants. The odd-numbered coefficients $a_{2j+1}$ vanish.

iii) Values of $b_\alpha$ are uniquely defined by considering “pure left” ($\rho_\mu = r_1 = r_2 = 0$, $r = 1$) or “pure right” ($\lambda_\mu = l_1 = l_2 = 0$, $l = 1$) cases. In particular,

$$a_0(l, r, P) = (4\pi)^{-n/2}\, \mathrm{Sp}\big( L(l)R(r) \big), \tag{26}$$
$$a_2(l, r, P) = (4\pi)^{-n/2}\, \mathrm{Sp}\big( L(l)R(r)E \big), \tag{27}$$
$$a_4(l, r, P) = (4\pi)^{-n/2}\, \frac{1}{12}\, \mathrm{Sp}\big( L(l)R(r)(6E^2 + 2E_{;\mu}{}^{\mu} + \Omega_{\mu\nu}\Omega^{\mu\nu}) \big). \tag{28}$$
Proof. Existence of the trace follows from Lemma 2.1. The proof of the second statement (below) can be considered as a constructive proof for the existence of the asymptotic expansion.

To evaluate the asymptotic behavior of the trace on the left-hand side of (24) we use the canonical basis $\{U_k\}$ of $\mathcal{H}_\tau$. This is a standard procedure used in quantum field theory for a long time (cf. [33]), which was recently applied to noncommutative theories [41, 43]: one first factors out a global $e^{-t|k|^2}$ term,

$$\mathrm{Tr}\big( L(l)R(r)e^{-tP} \big) = \sum_k \tau\big( U_k^*\, L(l)R(r)e^{-tP}\, U_k \big)$$
$$= \sum_k e^{-t|k|^2}\, \tau\big( U_k^*\, L(l)R(r)\, e^{t\left( (\nabla^\mu - ik^\mu)(\nabla_\mu - ik_\mu) + 2ik^\mu(\nabla_\mu - ik_\mu) + E \right)}\, U_k \big). \tag{29}$$

Then, one expands the exponential in (29) as a power series in $E$ and $(\nabla - ik)$. As a result, one gets a sum of monomials of the form

$$\sum_k e^{-t|k|^2}\, k_{\mu_1} \cdots k_{\mu_m}\, \tau\big( U_k^*\, L(l)\, R(r)\, F(E, (\nabla - ik))^{\mu_1 \cdots \mu_m}\, U_k \big). \tag{30}$$
We stress that it is important that each $\nabla$ appear in the combination $(\nabla - ik)$. We take in $F$ all $(\nabla - ik)$ one by one, starting with the rightmost $(\nabla - ik)$, and push them to the right. Being commuted through an operator of the type $L(f)R(h)$, $(\nabla - ik)$ replaces the functions $f, h$ in this operator by their derivatives, e.g., $(\nabla - ik)L(f) = L(f)(\nabla - ik) + L(\nabla^L f)$, where only the Leibniz rule satisfied by the derivations $\delta_\mu$ has been used. When $(\nabla - ik)$ hits $U_k$, it becomes an operator of (left and right) multiplication by the connection, i.e., $(\nabla_\mu - ik_\mu)\, U_k = \big( L(\lambda_\mu) - R(\rho_\mu) \big)\, U_k$. In this way, one can remove all derivative operators (which are replaced by left and right multiplication operators) and all momenta $k$ from the expression inside the NC-torus trace in (30). Therefore, one can apply (20) to obtain instead of (30) the following expression

$$G^{(m)}_{\mu_1 \dots \mu_m} \sum_k e^{-t|k|^2}\, \tau\big( U_k^*\, L(l)\, R(r)\, F(E, (\nabla - ik))^{\mu_1 \cdots \mu_m}\, U_k \big). \tag{31}$$
Let us now evaluate the power of $t$ corresponding to each monomial. If $F$ contains an $N_E$-th power of $E$, an $N_P$-th power of $(\nabla - ik)^2$ and an $N_K$-th power of $2ik^\mu(\nabla_\mu - ik_\mu)$, then $F$ itself contains $t^{N_E + N_P + N_K}$. Explicit multipliers $k_\mu$ which we put in front of $F$ in (30) come from $2ik^\mu(\nabla_\mu - ik_\mu)$ only. Consequently, $m = N_K$. For odd $m$, the tensors $G^{(m)}$ vanish up to exponentially small terms, while for an even $m$ the tensor $G^{(m)}$ is proportional to $t^{-m/2}$. The sum over $k$ brings another $t^{-n/2}$. Altogether, we have $t^{N_E + N_P + N_K/2 - n/2}$. This means that such monomials contribute to the coefficient $a_p(l, r, P)$ with $p = 2N_E + 2N_P + N_K$, which are precisely the canonical mass dimensions of the monomials as defined above. Besides, $N_K$ should be even. Consequently, odd-numbered heat kernel coefficients $a_{2j+1}$ vanish. Now we have to prove that the heat kernel coefficients are of the form declared in the theorem, i.e., that they are invariant polynomials constructed from $E$, $\Omega_{\mu\nu}$ and their derivatives. We have already proved that the expression inside the trace $\tau$ in (31) is in fact a multiplication operator (i.e., a combination of left and right regular representation operators) which does not contain $k$. Together with gauge invariance of the heat trace this could have been enough to get the statement. However, since the gauge group has a product structure, there are more gauge invariants than we expect to find in the heat trace asymptotics. Let us collect all monomials $F_a(E, (\nabla - ik))^{\mu_1 \ldots \mu_m}$ of a given (even)
V. Gayral, B. Iochum, D. V. Vassilevich
canonical mass dimension $p$ which appear in the expansion of the exponential of (29), and consider the sum
\[ \sum_a G^{(m)}_{\mu_1 \ldots \mu_m}\, \tau\Big(U_k^* L(l) R(r)\, F_a(E, (\nabla - ik))^{\mu_1 \ldots \mu_m}\, U_k\Big). \tag{32} \]
As we have demonstrated above, $F_a$ are free polynomials of $E$, $\omega$ and their derivatives $\delta_\mu E$, $\delta_\mu \omega_\nu$, etc. The same procedure as above can be carried out for an arbitrary Laplace type operator $\bar{P}$ acting on smooth sections of an arbitrary (non-abelian) vector bundle over the commutative torus $\mathbb{T}^n$. The free polynomials of the endomorphism $\bar{E}$, the connection $\bar{\omega}_\mu$, which characterize $\bar{P}$, and their derivatives are in one-to-one correspondence with the polynomials in (32). In the case of $\bar{P}$ we know that all terms can be recombined into covariant derivatives and field strengths thus giving standard heat kernel coefficients. This is a purely combinatorial statement, which does not depend on the nature of $\nabla$ (or $\bar{\nabla}$) and $E$ (or $\bar{E}$). Therefore, the same recombination can be done also in the noncommutative case considered here. This completes the proof of the second assertion. The third point is easy: it simply means that independent invariants remain independent when reduced to “pure left” or “pure right” cases. The coefficients in front of these invariants can therefore be read off from “pure left” heat kernel coefficients [41] on the torus. This includes (26), (27), (28) and even $a_6$, which is not given explicitly in the present work. The interested reader can also calculate higher terms in the heat trace asymptotics by using the expressions for $a_8$ [1] and $a_{10}$ [38] obtained in the commutative case.

3. Toward the Asymptotics for Toric Noncommutative Manifolds

This section is devoted to the study of the asymptotic (17) in a more general setting of noncommutative spaces. We will concentrate here on toric noncommutative manifolds, $C^\infty(M_\Theta)$, also called periodic isospectral deformations (the aperiodic case [25], akin to the Moyal plane, will be studied elsewhere). This class of quantum spaces can be thought of as a curved space generalization of NC-tori.
They were originally defined by Connes and Landi [15] (from cohomological considerations) within a twisted product approach (that we will follow here) and later by Connes and Dubois-Violette [16] in a more intrinsic way via fixed-point algebra techniques. We first recall the definition of $C_c^\infty(M_\Theta)$: let $(M, g)$ be a Riemannian (compact or not) $n$-dimensional manifold without boundary. Consider $\alpha : \mathbb{T}^l \to \mathrm{Isom}(M, g)$, a smooth isometric action of an $l$-torus on $M$, typically given by the maximal abelian subgroup of the isometry group of the manifold (the interesting class is $l \geq 2$). This action induces a spectral (Peter–Weyl) decomposition of any smooth function with compact support $f \in C_c^\infty(M)$,
\[ f = \sum_{r \in \mathbb{Z}^l} f_r, \quad \text{such that} \quad \alpha_z(f_r) = e^{-i r.z} f_r, \quad \forall z \in \mathbb{T}^l, \]
where the action by automorphism of the $l$-torus on $C_c^\infty(M)$ (also denoted $\alpha$) is given by $(\alpha_z f)(p) := f(\alpha_{-z}(p))$. It is important to notice that this expansion is convergent in the sup-norm $\|\cdot\|_\infty$ (in fact $\{\|f_r\|_\infty\}$ is a Schwartz sequence for $f \in C_c^\infty(M)$).
By analogy with the noncommutative torus, given a skewsymmetric $l \times l$ matrix $\Theta$, one can deform the algebra $C_c^\infty(M)$ to a noncommutative one $C_c^\infty(M_\Theta)$, defining the following twisted product on pairs of homogeneous elements
\[ f_r \star_\Theta g_s = e^{-\frac{i}{2}\, r.\Theta s}\, f_r . g_s. \tag{33} \]
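The twisted product on homogeneous elements can be explored numerically. The following is a minimal sketch, not the construction used in the paper: it assumes the simplest situation in which every homogeneous component is a constant multiple of a torus character, so that an element is just a finite table of coefficients indexed by modes in $\mathbb{Z}^l$. The function names and the numerical value of the deformation matrix are illustrative assumptions. The sketch checks associativity and the commutation relation of two generators.

```python
import cmath
from itertools import product


def twisted_product(f, g, theta):
    """Twisted product of two finite sums of homogeneous elements.

    f, g  : dicts mapping a mode r (tuple of ints) to a complex coefficient.
    theta : skew-symmetric l x l matrix given as a list of lists.
    The pair (r, s) contributes exp(-(i/2) r.Theta s) f_r g_s to mode r + s.
    """
    result = {}
    for (r, f_r), (s, g_s) in product(f.items(), g.items()):
        r_theta_s = sum(r[a] * theta[a][b] * s[b]
                        for a in range(len(r)) for b in range(len(s)))
        mode = tuple(ra + sa for ra, sa in zip(r, s))
        phase = cmath.exp(-0.5j * r_theta_s)
        result[mode] = result.get(mode, 0j) + phase * f_r * g_s
    return result


def max_difference(f, g):
    """Largest coefficient-wise difference between two elements."""
    return max(abs(f.get(k, 0j) - g.get(k, 0j)) for k in set(f) | set(g))


theta = [[0.0, 0.7], [-0.7, 0.0]]  # illustrative value only
f = {(1, 0): 1.0 + 0j, (0, 2): 0.5j}
g = {(0, 1): 2.0 + 0j, (-1, 1): 1.0 - 1.0j}
h = {(1, 1): 0.3 + 0j, (2, -1): -1.0j}

# Associativity: (f * g) * h versus f * (g * h).
assoc_defect = max_difference(
    twisted_product(twisted_product(f, g, theta), h, theta),
    twisted_product(f, twisted_product(g, h, theta), theta),
)

# Commutation relation of two generators: u * v = exp(-i Theta_12) v * u.
u, v = {(1, 0): 1.0 + 0j}, {(0, 1): 1.0 + 0j}
uv = twisted_product(u, v, theta)[(1, 1)]
vu = twisted_product(v, u, theta)[(1, 1)]
commutation_defect = abs(uv - cmath.exp(-1j * theta[0][1]) * vu)
```

Associativity holds because the phase is bilinear in the pair of modes, so both bracketings pick up the same total phase.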
It should be clear that this product can also be realized via the Rieffel star-product (4) associated to the action $\alpha$ [36]. In this setting, the natural trace $\tau$ of this algebra is the integral with Riemannian volume form $\mu_g$,
\[ \tau(\,.\,) = \int_M (\,.\,)\, \mu_g, \tag{34} \]
and all first-order differential operators which commute with the action $\alpha$ form a Lie algebra of derivations. On the Hilbert space $H$ of square integrable functions on $M$ with Riemannian volume form $\mu_g$, one defines left and right twisted multiplication operators $L(l)$, $R(r)$, $l, r \in C_c^\infty(M)$, by
\[ L(l)\psi = l \star_\Theta \psi, \qquad R(r)\psi = \psi \star_\Theta r, \qquad \text{for all } \psi \in H. \]
Because the Peter–Weyl expansion is sup-norm convergent, those operators are bounded for (at least) smooth compactly supported functions. For any isometric $\mathbb{R}^l$-action on a Riemannian manifold, from the expression of the kernel of the operators $L(l)$, $R(r)$ (see for instance [21]) and using the heat kernel $K_t(p, p')$ of the scalar Laplacian $\Delta$ associated with the metric $g$ and its volume form $\mu_g$, one can compute
\[ \mathrm{Tr}\big(L(l)\, R(r)\, e^{-t\Delta}\big) = (2\pi)^{-l} \int_M \mu_g(p) \int d^l y\, d^l z\;\, l(p)\, r(\alpha_z(p))\, K_t\big(\alpha_{-\Theta y}(p), p\big). \]
After a Peter–Weyl expansion of the functions $l$, $r$, this reads
\[ \mathrm{Tr}\big(L(l)\, R(r)\, e^{-t\Delta}\big) = \sum_{q \in \mathbb{Z}^l} \int_M \mu_g(p)\; l_q(p)\, r_{-q}(p)\, K_t\big(\alpha_{\Theta q}(p), p\big). \tag{35} \]
For a large class of manifolds (see [17, 25] for a review of sufficient conditions), the behavior of the off-diagonal heat kernel is controlled by the geodesic distance function:
\[ \frac{1}{(4\pi t)^{n/2}}\, e^{-d_g^2(p, p')/4t} \;\leq\; K_t(p, p') \;\leq\; \frac{C}{(4\pi t)^{n/2}}\, e^{-d_g^2(p, p')/4(1+c)t}, \tag{36} \]
where dg is the geodesic distance and C, c are positive constants. This estimate and the fact that the metric on the orbits of the torus action is constant (since Tl -action is isometric) show that the previous discussion on the role of the arithmetic nature of the deformation parameters applies also in this setting (since the geodesic distance on the orbits is the torus one). But in this framework this is not the end of the story. Indeed, such an action is not necessarily free (for example it is for NC-torus but not for Connes–Landi spheres), that is, there may exist fixed or rather singular points for the action. For instance, on a neighborhood of a fixed point, we see that in the integral
(35), we are left with the heat kernel on the diagonal (i.e., the damping factor of the exponential of geodesic distance disappears). This has certainly some consequences for the power-$t$ expansion. In view of (36), note that the lack of freedom for the action can be rephrased in terms of non-local integrability of the function $p \mapsto d_g^{-2}(\alpha_y(p), p)$ for certain $0 \neq y \in \mathbb{T}^l$, in the neighborhood of singular points. At this level of generality, we are only able to treat the free torus-action case, where one can easily derive the asymptotic of (35). From previous techniques, the estimate (36) and under the Diophantine assumption (16), one gets
\[ \mathrm{Tr}\big(L(l)R(r)e^{-t\Delta}\big) = \frac{1}{(4\pi t)^{n/2}} \sum_{q \in \mathcal{Z}} \int_M \mu_g(p)\; l_q(p)\, r_{-q}(p)\, k_t(p, p) + \text{e.s.t.} = \frac{1}{(4\pi t)^{n/2}} \sum_{k=0}^{\infty} t^k \sum_{q \in \mathcal{Z}} \int_M \mu_g(p)\; l_q(p)\, r_{-q}(p)\, a_{2k}(p) + \text{e.s.t.}, \tag{37} \]
where $a_{2k}(p)$ are the local heat kernel coefficients for the scalar Riemannian Laplacian. For a non-free torus action, it seems to be difficult to outstrip the qualitative level in general, i.e., to express the asymptotic of (35) in terms of geometric invariants. We will instead treat the (quite simple but non-trivial) example of the ambient space of the Connes–Landi 3-sphere $S^3_\theta$ [15]. One standard way to construct this ambient space goes as follows. One parameterizes $\mathbb{R}^4$ (with standard metric) in spherical $(\phi_1, \phi_2, \psi)$, $\phi_i \in \mathbb{T}$, $\psi \in [0, \pi/2]$ (with nontrivial boundary conditions) and radial $R \in [0, +\infty[$ coordinates. That is to say, in terms of Cartesian coordinates:
\[ x_1 = R\cos\psi\cos\phi_1, \quad x_2 = R\cos\psi\sin\phi_1, \quad x_3 = R\sin\psi\cos\phi_2, \quad x_4 = R\sin\psi\sin\phi_2. \]
Then, one twists the product via the $\mathbb{T}^2$-action
\[ y.(R, \psi, \phi_1, \phi_2) = (R, \psi, \phi_1 + y_1 \bmod 2\pi, \phi_2 + y_2 \bmod 2\pi), \qquad y \in \mathbb{R}^2. \]
In other words, we are mapping the commutative generators $u_i = e^{2i\pi\phi_i}$, $i = 1, 2$, to those of the noncommutative 2-torus (of course $S^3_\theta$ is obtained by imposing the sphere relation on the generators). For the question of the asymptotic of (35), it is more convenient to move to another coordinate system. It allows one to identify the ambient space of $S^3_\theta$ with the ambient space of $\mathbb{T}^2_\theta$. This is achieved by setting $r_1 = R\cos\psi$, $r_2 = R\sin\psi$, which leads to a parameterization of $\mathbb{R}^4$ in double polar coordinates $(r_1, \phi_1; r_2, \phi_2)$. Thus, it corresponds to twisting the product of the commutative algebra $\mathcal{S}(\mathbb{R}^4)$ via the action of $\mathbb{T}^2$ given by the two $SO(2)$-rotations (which generate the maximal compact Abelian subgroup of the isometry group of $\mathbb{R}^4$). In such a case, the only interesting situation is when
\[ \Theta = \theta \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \text{with } \theta \text{ irrational}. \]
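The two parameterizations and the effect of the torus action can be checked numerically. The sketch below uses hypothetical helper names and arbitrary sample values; it verifies that the spherical and double polar descriptions give the same Cartesian point, and that a point and its image under the rotation by angles $(y_1, y_2)$ are separated by the squared Euclidean distance $2r_1^2(1 - \cos y_1) + 2r_2^2(1 - \cos y_2)$. This squared distance is the quantity entering the Gaussian factor $e^{-|p - p'|^2/4t}$ of the flat heat kernel on $\mathbb{R}^4$.

```python
import math


def spherical_to_cartesian(R, psi, phi1, phi2):
    """Cartesian coordinates of the point (R, psi, phi1, phi2) in R^4."""
    return (R * math.cos(psi) * math.cos(phi1),
            R * math.cos(psi) * math.sin(phi1),
            R * math.sin(psi) * math.cos(phi2),
            R * math.sin(psi) * math.sin(phi2))


def double_polar_to_cartesian(r1, phi1, r2, phi2):
    """Cartesian coordinates from double polar coordinates (r1, phi1; r2, phi2)."""
    return (r1 * math.cos(phi1), r1 * math.sin(phi1),
            r2 * math.cos(phi2), r2 * math.sin(phi2))


def squared_distance(p, q):
    """Squared Euclidean distance between two points of R^4."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Arbitrary sample point and rotation angles (illustrative values only).
R, psi, phi1, phi2 = 1.7, 0.4, 0.3, 2.1
y1, y2 = 0.9, -1.3
r1, r2 = R * math.cos(psi), R * math.sin(psi)

p = spherical_to_cartesian(R, psi, phi1, phi2)
p_double_polar = double_polar_to_cartesian(r1, phi1, r2, phi2)
p_moved = spherical_to_cartesian(R, psi, phi1 + y1, phi2 + y2)

# Both parameterizations describe the same point.
coordinate_defect = squared_distance(p, p_double_polar)

# Chord length between p and its rotated image.
expected = 2 * r1 ** 2 * (1 - math.cos(y1)) + 2 * r2 ** 2 * (1 - math.cos(y2))
chord_defect = abs(squared_distance(p, p_moved) - expected)
```

The radii $r_1$, $r_2$ are invariant under the action, which is why the distance depends on the angles only through the two cosine factors.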
We make the Fourier transform in the two angular directions and leave the two radial coordinates $r_1$, $r_2$ as they are. From the expression of the heat kernel of $\mathbb{R}^4$ parameterized by $(r_1\cos\phi_1, r_1\sin\phi_1, r_2\cos\phi_2, r_2\sin\phi_2)$ and action $(\phi_1', \phi_2').(\mathbf{r}, \phi_1, \phi_2) := (\mathbf{r}, \phi_1 + \phi_1' \bmod 2\pi, \phi_2 + \phi_2' \bmod 2\pi)$, we have
\[ K_t\big(\alpha_{-\Theta q}(p), p\big) = \frac{1}{(4\pi t)^2}\, e^{-\left(r_1^2(1 - \cos\theta q_2) + r_2^2(1 - \cos\theta q_1)\right)/2t}, \]
and we obtain, up to an irrelevant numerical constant $c$, for (35)
\[ \mathrm{Tr}\big(L(l)R(r)e^{-t\Delta}\big) = \frac{c}{t^2} \sum_{q \in \mathbb{Z}^2} \int_{\mathbb{R}_+ \times \mathbb{R}_+} d^2\mathbf{r}\;\, r_1 r_2\, l_q(\mathbf{r})\, r_{-q}(\mathbf{r})\, e^{-\left(r_1^2(1 - \cos\theta q_2) + r_2^2(1 - \cos\theta q_1)\right)/2t}, \tag{38} \]
where $l_q(\mathbf{r})$ (resp. $r_q(\mathbf{r})$) are the $\mathbf{r}$-dependent coefficients in the Fourier expansion of $l$ (resp. $r$) in terms of $e^{i(\phi_1 q_1 + \phi_2 q_2)}$, as opposed to the terms in the spectral decomposition of $l$ (resp. $r$). The term with $q = 0$ in (38) gives our “standard” result. Other terms give (in the asymptotics) contributions from the “singularities” at $r_1 = 0$, $r_2 = 0$. There is a relation between oscillations of a smooth function in the angular directions and its behavior near the origin of the coordinate system $r = 0$. Consider a smooth complex function $\psi$ on $\mathbb{R}^2$. Let us restrict it to the unit disc in $\mathbb{R}^2$. We are going to expand $\psi$ in a series of eigenfunctions of the Laplace operator on the disc. Consider a polar coordinate system $(r, \phi)$ centered at the origin. Let $\psi^0$ be the restriction of $\psi$ to the boundary of the disc, and $\psi^0(\phi) = \sum_l \psi_l^0 e^{il\phi}$. Then $\psi_r^0(\phi) := \sum_l \psi_l^0 e^{il\phi} r^{|l|}$ is a (smooth) zero mode of the Laplacian, and $\tilde{\psi}(r, \phi) := \psi(r, \phi) - \psi_r^0(\phi)$ is a smooth function satisfying Dirichlet boundary conditions on the boundary of the disc. $\tilde{\psi}$ can be expanded in a sum of non-zero eigenfunctions of the Laplacian, which are $e^{il\phi} J_{|l|}(r\lambda)$, where $J_{|l|}$ is the Bessel function, and the eigenvalues $\lambda$ are defined by the boundary condition $J_{|l|}(\lambda) = 0$. The Taylor expansion for $J_{|l|}$ around $r = 0$ starts with $r^{|l|}$ and contains the powers $r^{|l|+2k}$, $k \in \mathbb{N}_0$. Since $\psi$ is smooth, its harmonic expansion is rapidly convergent, and we can conclude that each Fourier mode $\psi_l$ behaves near $r = 0$ as
\[ \psi_l(r) = r^{|l|}\big(\psi_l^{(0)} + r^2 \psi_l^{(1)} + \ldots\big). \tag{39} \]
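The small-$r$ behavior of the Bessel functions behind (39) is easy to confirm numerically. The sketch below is an illustration only, with a hypothetical helper name: it sums the standard power series $J_n(x) = \sum_{k \geq 0} \frac{(-1)^k}{k!\,(n+k)!} (x/2)^{2k+n}$ for integer $n \geq 0$, divided by $x^n$, and checks that the result is a series in $x^2$ whose first two coefficients are $1/(2^n n!)$ and $-1/(2^{n+2}(n+1)!)$.

```python
import math


def bessel_over_power(n, x, terms=25):
    """J_n(x) / x**n for integer n >= 0, from the truncated power series."""
    return sum((-1) ** k * x ** (2 * k)
               / (math.factorial(k) * math.factorial(k + n) * 2 ** (2 * k + n))
               for k in range(terms))


n, x = 3, 0.01

# Leading coefficient: J_n(x) starts with x**n / (2**n n!).
leading = 1.0 / (2 ** n * math.factorial(n))

# Next coefficient multiplies x**(n + 2); there is no x**(n + 1) term.
subleading = -1.0 / (2 ** (n + 2) * math.factorial(n + 1))
estimated_subleading = (bessel_over_power(n, x) - leading) / x ** 2

# Sanity check of the series against the tabulated value J_0(1) = 0.7651976866...
j0_at_one = bessel_over_power(0, 1.0)
```

Replacing `x` by `-x` leaves `bessel_over_power(n, x)` unchanged, which is the statement that only the powers $r^{|l|+2k}$ occur.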
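Let us see in detail what happens near $r_1 = 0$. For that we have to look at the sum over $q_1 = 0$, $q_2 \neq 0$. The corresponding terms in (38) read
\[ \frac{c}{t^2} \sum_{q_2 \neq 0} \int d^2\mathbf{r}\;\, r_1 r_2\, f_{q_2}(\mathbf{r})\, e^{-r_1^2(1 - \cos\theta q_2)/2t}, \tag{40} \]
where $f_{q_2}(\mathbf{r}) := l_{0,q_2}(\mathbf{r})\, r_{0,-q_2}(\mathbf{r})$. Fix $\epsilon_1 > 0$. It is easy to see that the integral $\int_{\epsilon_1}^{\infty} dr_1$ gives an exponentially small term. For $r_1 \leq \epsilon_1$ we use the Taylor expansion $f_{0,q_2}(\mathbf{r}) = f^{(0)}_{0,q_2}(r_2) + r_1^2 f^{(1)}_{0,q_2}(r_2) + \ldots$. Then (40) takes the form (up to higher order terms):
\[ \frac{c}{t^2} \sum_{q_2 \neq 0} \int dr_2\; r_2\, \frac{1}{2} \left( \frac{2t}{1 - \cos\theta q_2}\, f^{(0)}_{0,q_2}(r_2) + \frac{4t^2}{(1 - \cos\theta q_2)^2}\, f^{(1)}_{0,q_2}(r_2) + \cdots \right). \tag{41} \]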
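The passage from (40) to (41) rests only on elementary Gaussian moments. As a sketch, write $b := 2t/(1 - \cos\theta q_2)$ (a shorthand introduced here for convenience, not used elsewhere in the text) and extend the $r_1$-integration to $[0, \infty)$, which changes the result by exponentially small terms only. The substitution $u = r_1^2/b$ then gives:

```latex
\int_0^\infty dr_1\, r_1^{2k+1}\, e^{-r_1^2/b}
  = \frac{b^{k+1}}{2} \int_0^\infty du\, u^{k} e^{-u}
  = \frac{k!}{2}\, b^{k+1},
  \qquad k \in \mathbb{N}_0 ,
\]
so that
\[
k = 0:\quad \frac{1}{2}\,\frac{2t}{1-\cos\theta q_2},
\qquad
k = 1:\quad \frac{1}{2}\,\frac{4t^2}{(1-\cos\theta q_2)^2}.
```

These are the two coefficients multiplying the Taylor coefficients of $f_{0,q_2}$ in (41); the $k$-th Taylor coefficient is accompanied by $t^{k+1}$, hence by $t^{k-1}$ after the overall factor $c/t^2$.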
The sum and the integral are convergent if $\theta$ satisfies a Diophantine condition. Recall the control of $(1 - \cos\theta q_i)^{-1}$ by (14). It is interesting to note that already the $1/t$ term receives a contribution from the singularity. Next, let $q_1 \neq 0$, $q_2 \neq 0$. The contributions from the integrals $\int_{\epsilon_1}^{\infty} dr_1$ and $\int_{\epsilon_2}^{\infty} dr_2$ are exponentially small. Therefore, we restrict ourselves to the integral $\int_0^{\epsilon_1} \int_0^{\epsilon_2} dr_1\, dr_2$, where we use again the Taylor expansion in the radii. The Taylor expansion of any smooth function starts with $r_1^{q_1} r_2^{q_2}$ and the Taylor expansion for $l_q \cdot r_{-q}$ starts with $r_1^{2q_1} r_2^{2q_2}$. The corresponding terms contribute to the heat kernel coefficients with
\[ \frac{1}{t^2} \left( \frac{2t}{1 - \cos\theta q_2} \right)^{q_1 + 1} \left( \frac{2t}{1 - \cos\theta q_1} \right)^{q_2 + 1} \tag{42} \]
so that the modifications start with $t^2$. One can easily evaluate the corresponding terms (which describe the effect of the singularity at $r_1 = r_2 = 0$). The asymptotic we obtain strongly depends on the functions $l$ and $r$ through their Taylor coefficients in a neighborhood of singular points. This is typical for the heat trace asymptotics if boundaries or singularities are present, cf. [26, 40].

4. Spectral Action for NC-Tori

Within noncommutative geometry, the spectral action introduced by Chamseddine–Connes plays an important role [8]. More precisely, given a spectral triple $(\mathcal{A}, H, D)$, where $\mathcal{A}$ is an algebra acting on the Hilbert space $H$ and $D$ is a Dirac-like operator (see [12, 28]), they proposed a physical action depending only on the spectrum of the covariant Dirac operator
\[ D_A := D + A + \epsilon\, J A J^{-1}, \tag{43} \]
where $A$ is a one-form represented on $H$, i.e. it is of the form
\[ A = \sum_i a_i [D, b_i], \tag{44} \]
where $a_i, b_i \in \mathcal{A}$, $J$ is a real structure on the triple corresponding to charge conjugation and $\epsilon \in \{1, -1\}$ depends on the dimension of this triple and comes from the commutation relation
\[ J D = \epsilon\, D J. \tag{45} \]
This action is
\[ S(D_A, \Phi) := \mathrm{Tr}\big(\Phi(D_A^2/\Lambda^2)\big), \tag{46} \]
where $\Phi$ is any positive function viewed as a cut-off which could be replaced by a step function up to some mathematical difficulties surmounted in [20]. This means that $S$ counts the spectral values of $|D_A|$ less than the mass scale $\Lambda$ (note that the resolvent of $D_A$ is compact since, by assumption, the same is true for $D$). In [24], the spectral action on NC-tori has been computed only for operators of the form $D + A$. Thanks to our previous result, we can fill the gap and compute (46) in full generality.
We need to fix notations: Let $\mathcal{A}_\Theta := C^\infty(\mathbb{T}^n_\Theta)$ act on $H := H_\tau \otimes \mathbb{C}^{2^m}$ with $n = 2m$ or $n = 2m + 1$ (i.e., $m = \lfloor n/2 \rfloor$ is the integer part of $n/2$), the square integrable sections of the trivial spin bundle over $\mathbb{T}^n$. Each element of $\mathcal{A}_\Theta$ is represented on $H$ as $L(a) \otimes 1_{2^m}$. The Tomita conjugation $J_0(a) := a^*$ satisfies $[J_0, \partial_\mu] = 0$ since $J_0 \partial_\mu U_k = J_0 (ik_\mu) U_k = -ik_\mu U_{-k} = \partial_\mu U_{-k} = \partial_\mu J_0 U_k$. Besides, it induces the analogous operator on $H$, $J := J_0 \otimes C_0$, where $C_0$ is an operator on $\mathbb{C}^{2^m}$. The Dirac operator is defined by
\[ D := -i\, e^\mu_a\, \delta_\mu \otimes \gamma^a = -i\, \delta_\mu \otimes \gamma^\mu, \]
where we use hermitian Dirac matrices $\gamma$. This implies $C_0 \gamma^\mu = -\epsilon\, \gamma^\mu C_0$ since $J D = (J_0 \otimes C_0)(-i\partial_\mu \otimes \gamma^\mu) = J_0 (-i)\partial_\mu \otimes C_0 \gamma^\mu = i\partial_\mu J_0 \otimes C_0 \gamma^\mu$, which by (45) is equal to $\epsilon\, (-i\partial_\mu) J_0 \otimes \gamma^\mu C_0$. Moreover, $C_0^2 = \pm 1_{2^m}$ depending on the parity of $m$. Finally, one introduces the chirality (which in the even case is $\chi := \mathrm{id} \otimes (-i)^m \gamma^1 \cdots \gamma^n$) and this yields that $(\mathcal{A}_\Theta, H, D, J, \chi)$ satisfies all axioms of a spectral triple, see [12, 28]. The unitary elements $u$ of $\mathcal{A}_\Theta$ (or of its generated $C^*$-algebra) play an important role since they reflect the inner automorphisms of $\mathcal{A}_\Theta$. For instance $U_u := (u \otimes 1_{2^m}) J (u \otimes 1_{2^m}) J^{-1}$ is a unitary on $H$ (with $U_u^* = (u^* \otimes 1_{2^m}) J (u^* \otimes 1_{2^m}) J^{-1}$) such that
\[ U_u D U_u^* = D + u \otimes 1_{2^m}\, [D, u^* \otimes 1_{2^m}] + \epsilon\, J \big( u \otimes 1_{2^m}\, [D, u^* \otimes 1_{2^m}] \big) J^{-1}, \]
explaining the construction of one-forms $A$ in (44) thus satisfying $U_u A U_u^* = (u \otimes 1_{2^m})\, A\, (u^* \otimes 1_{2^m})$. These properties follow from the axioms: for all $a, b \in \mathcal{A}_\Theta$, $[a \otimes 1_{2^m}, J\, b \otimes 1_{2^m}\, J^{-1}] = 0$ and $\big[[D, a \otimes 1_{2^m}], J\, b \otimes 1_{2^m}\, J^{-1}\big] = 0$. In conclusion, the fact that the perturbed Dirac operator must satisfy condition (45) (which is equivalent to $H$ being endowed with a structure of $\mathcal{A}_\Theta$-bimodule: for $a, b \in \mathcal{A}_\Theta$ and $\psi \in H_\tau$, $a.\psi.b := (a \otimes 1_{2^m}) J (b^* \otimes 1_{2^m}) J^{-1} \psi$), yields the necessity of a symmetrized covariant Dirac operator: $D_A = D + A + \epsilon\, J A J^{-1}$. Note that for $a \in \mathcal{A}_\Theta$, using $J_0 L(a) J_0^{-1} = R(a^*)$,
\[ J \big( L(a) \otimes \gamma^\mu \big) J^{-1} = R(a^*) \otimes C_0 \gamma^\mu C_0^{-1} = -\epsilon\, R(a^*) \otimes \gamma^\mu, \tag{47} \]
and that the representation $L$ and the antirepresentation $R$ are $\mathbb{C}$-linear, commute and satisfy $[\delta_\mu, L(a)] = L(\delta_\mu a)$, $[\delta_\mu, R(a)] = R(\delta_\mu a)$. Choosing an arbitrary selfadjoint one-form $A$, it can be written as
\[ A = L(-iA_\mu) \otimes \gamma^\mu, \qquad A_\mu = -A_\mu^* \in \mathcal{A}_\Theta, \tag{48} \]
and using (47)
\[ D_A = -i \big( \delta_\mu + L(A_\mu) - R(A_\mu) \big) \otimes \gamma^\mu. \tag{49} \]
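Defining $\tilde{A}_\mu := L(A_\mu) - R(A_\mu)$, we get $D_A^2 = -g^{\mu\nu} (\delta_\mu + \tilde{A}_\mu)(\delta_\nu + \tilde{A}_\nu) \otimes 1_{2^m} - \frac{1}{2}\, \Omega_{\mu\nu} \otimes \gamma^{\mu\nu}$, where $\gamma^{\mu\nu} := \frac{1}{2}(\gamma^\mu \gamma^\nu - \gamma^\nu \gamma^\mu)$ and $\Omega_{\mu\nu} := [\delta_\mu + \tilde{A}_\mu, \delta_\nu + \tilde{A}_\nu] = L(F_{\mu\nu}) - R(F_{\mu\nu})$, where $F_{\mu\nu} := \delta_\mu(A_\nu) - \delta_\nu(A_\mu) + [A_\mu, A_\nu]$. Gathering all results,
\[ D_A^2 = -g^{\mu\nu} \big( \delta_\mu + L(A_\mu) - R(A_\mu) \big) \big( \delta_\nu + L(A_\nu) - R(A_\nu) \big) \otimes 1_{2^m} - \frac{1}{2} \big( L(F_{\mu\nu}) - R(F_{\mu\nu}) \big) \otimes \gamma^{\mu\nu}. \tag{50} \]
Now, comparing (50) and (5), we can apply the previous result with the following replacement in (6) and (7)
\[ \begin{cases} L(\lambda_\mu) \to L(A_\mu) \otimes 1_{2^m}, \\ R(\rho_\mu) \to R(A_\mu) \otimes 1_{2^m}, \\ L(l_1) \to -\frac{1}{2} L(F_{\mu\nu}) \otimes \gamma^{\mu\nu}, \\ R(r_1) \to -\frac{1}{2} R(F_{\mu\nu}) \otimes \gamma^{\mu\nu}, \\ L(l_2) \to 0, \\ R(r_2) \to 0, \end{cases} \tag{51} \]
or in Theorem 2.6,
\[ \begin{cases} \nabla_\mu \to \big( \delta_\mu + L(A_\mu) - R(A_\mu) \big) \otimes 1_{2^m}, \\ E \to -\frac{1}{2} \big( L(F_{\mu\nu}) - R(F_{\mu\nu}) \big) \otimes \gamma^{\mu\nu}, \\ \Omega_{\mu\nu} \to \big( L(F_{\mu\nu}) - R(F_{\mu\nu}) \big) \otimes 1_{2^m}, \\ L(l) \to 1, \\ R(r) \to 1. \end{cases} \tag{52} \]
It is interesting to note that, in the commutative case when $\Theta = 0$, $L = R$ thus $D_A = D$ for any selfadjoint one-form $A$: the Dirac operator does not fluctuate. We will derive the spectral action by Laplace transform techniques such as in [34], see [44] for details on the Laplace transform (alternatively one can follow [20]). We assume that the function $\Phi$ has the following property:
\[ \Phi \in C^\infty(\mathbb{R}_+) \text{ is the Laplace transform of } \hat{\psi} \in \mathcal{S}(\mathbb{R}_+) := \{ g \in \mathcal{S}(\mathbb{R}) : g(x) = 0,\ x \leq 0 \}. \tag{53} \]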
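Thus, any function with this property necessarily has an analytic extension to the right half of the complex plane and is a Laplace transform. Consequently, any $m$-differentiable function $\psi$ such that $\psi^{(m)} = \Phi$ is the Laplace transform of a function $\hat{\psi}$ and by differentiation, it satisfies
\[ \Phi(z) = \psi^{(m)}(z) = (-1)^m \int_0^\infty e^{-tz}\, t^m\, \hat{\psi}(t)\, dt, \qquad \Re z > 0. \]
One can invoke dominated convergence (see [24]), to obtain:
\[ \mathrm{Tr}\, \Phi(D_A^2/\Lambda^2) = (-1)^m \int_0^\infty \mathrm{Tr}\big( e^{-t D_A^2/\Lambda^2} \big)\, t^m\, \hat{\psi}(t)\, dt \]
\[ = (-1)^m \int_0^\infty \sum_{k=0}^{m} \Lambda^{n-2k}\, \tilde{a}_{2k}\, t^{m+k-n/2}\, \hat{\psi}(t)\, dt + O(\Lambda^{n-2(m+1)}) \]
\[ = \sum_{k=0}^{m} \Lambda^{n-2k}\, \Phi_{2k}\, \tilde{a}_{2k} + O(\Lambda^{n-2(m+1)}), \]
where $\Phi_{2k}$ is defined by
\[ \Phi_{2k} := (-1)^m \int_0^\infty t^{m+k-n/2}\, \hat{\psi}(t)\, dt, \tag{54} \]
and
\[ \tilde{a}_{2k} := a_{2k}(1, 1; D_A^2), \tag{55} \]
obtained from (25) with replacement (51). When $n = 2m$ is even, $\Phi_{2k}$ has the more familiar form:
\[ \Phi_{2k} = \begin{cases} \dfrac{1}{\Gamma(m-k)} \displaystyle\int_0^\infty \Phi(t)\, t^{m-1-k}\, dt, & \text{for } k = 0, \cdots, m-1, \\[2mm] (-1)^k\, \Phi^{(k-m)}(0), & \text{for } k = m, \cdots, n. \end{cases} \tag{56} \]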
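The first line of (56) can be recovered from definition (54) in a few steps. The following is a sketch for $n = 2m$ and $0 \leq k \leq m - 1$, assuming the Laplace representation $\Phi(z) = (-1)^m \int_0^\infty e^{-tz} t^m \hat{\psi}(t)\, dt$ and that the order of the two integrations may be exchanged (plausible since $\hat{\psi}$ is a Schwartz function vanishing on the negative half-line). It uses the elementary identity $\int_0^\infty z^{s-1} e^{-tz}\, dz = \Gamma(s)\, t^{-s}$ for $s, t > 0$:

```latex
\int_0^\infty \Phi(z)\, z^{m-1-k}\, dz
  = (-1)^m \int_0^\infty dt\; t^m\, \hat\psi(t) \int_0^\infty dz\; z^{m-k-1}\, e^{-tz}
  = (-1)^m\, \Gamma(m-k) \int_0^\infty t^{k}\, \hat\psi(t)\, dt
  = \Gamma(m-k)\, \Phi_{2k},
```

where the last equality is (54) with $m + k - n/2 = k$. Dividing by $\Gamma(m-k)$ gives the stated formula; the restriction $k \leq m - 1$ is what keeps $\Gamma(m-k)$ finite.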
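For $n$ odd, the coefficients $\Phi_{2k}$ have less explicit forms because they involve fractional derivatives of $\Phi$, so in this case, it is better to stick with definition (54). Let us summarize:

Theorem 4.1. Let $D_A = D + A + \epsilon\, J A J^{-1}$ and $A = L(-iA_\mu) \otimes \gamma^\mu$ a hermitian one-form where $A_\mu^* = -A_\mu \in \mathcal{A}_\Theta = C^\infty(\mathbb{T}^n_\Theta)$. Let $\Phi \in C^\infty(\mathbb{R}_+)$ be a positive function satisfying condition (53). Then the following expansion of the spectral action holds:
\[ S(D_A, \Phi) = \sum_{k=0}^{\lfloor n/2 \rfloor} \Lambda^{n-2k}\, \Phi_{2k}\, \tilde{a}_{2k} + O(\Lambda^{n-2(\lfloor n/2 \rfloor + 1)}), \]
where the $\Phi_{2k}$ are defined in (54) or (56) depending on the dimension and the $\tilde{a}_{2k}$ are defined in (55) with replacement (51). More precisely,
\[ \tilde{a}_0 = 2^{\lfloor n/2 \rfloor} \pi^{n/2}, \qquad \tilde{a}_2 = 0, \qquad \tilde{a}_4 = (4\pi)^{-n/2} \Big( \tfrac{1}{2}\, \mathrm{Sp}(E^2) + \tfrac{1}{12}\, \mathrm{Sp}(\Omega^{\mu\nu}\Omega_{\mu\nu}) \Big). \]
Moreover, all terms in $\tilde{a}_{2k}$ linear in $l_1$, $r_1$ are zero.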
Proof. The last assertions follow from $\tilde{a}_0 = (4\pi)^{-n/2}\, \mathrm{Sp}(1 \otimes 1_{2^m})$, $\tilde{a}_2 = (4\pi)^{-n/2}\, \mathrm{Sp}(E)$, $\tilde{a}_4 = (4\pi)^{-n/2} \big( \tfrac{1}{2}\, \mathrm{Sp}(E^2) + \tfrac{1}{6}\, \mathrm{Sp}(g^{\mu\nu} E_{;\mu\nu}) + \tfrac{1}{12}\, \mathrm{Sp}(\Omega^{\mu\nu}\Omega_{\mu\nu}) \big)$ and $\mathrm{Tr}(\gamma^{\nu\mu}) = 0$, so all linear terms in $E$ are of zero trace.

Now comes an amazing fact: in four dimensions, the non-standard terms (those with products of traces) simply disappear. Indeed, when $n = 4$, with $g = \mathrm{diag}(1, 1, 1, 1)$ and $\Theta$ a direct sum of two “Diophantine” matrices, we find, up to negative powers of $\Lambda$,
\[ S(D_A, \Phi) = (4\pi)^{-n/2}\, 4 \Big( \Phi_0 \Lambda^4\, \mathrm{Sp}(1) - \frac{\Phi(0)}{6}\, \mathrm{Sp}\big( L(F^{\mu\nu}F_{\mu\nu}) + R(F^{\mu\nu}F_{\mu\nu}) - 2 L(F^{\mu\nu}) R(F_{\mu\nu}) \big) \Big) \]
\[ = 4\pi^{n/2} \Big( \Phi_0 \Lambda^4 - \frac{\Phi(0)}{3} \big( \tau(F^{\mu\nu}F_{\mu\nu}) - \tau(F^{\mu\nu})\, \tau(F_{\mu\nu}) \big) \Big). \]
But $F^{\mu\nu}$ is a sum of derivatives plus a commutator, so is of zero trace. Thus, the spectral action has the standard form:
\[ S(D_A, \Phi) = 4\pi^{n/2} \Big( \Phi_0 \Lambda^4 - \frac{\Phi(0)}{3}\, \tau(F^{\mu\nu}F_{\mu\nu}) \Big) + O(\Lambda^{-2}). \tag{57} \]
The only difference appears in the numerical value of the coefficients. What happens for generic compact toric NC manifolds (we add the assumption that $M$ carries a spin structure as well)? Even lacking a trace asymptotic expansion of the semi-group generated by a generalized Laplacian (i.e., an analogue of Theorem 2.6), we are able to finish in the 4-d pure Diophantine case, using the symmetries we have at our disposal. First, it should be clear from the examples treated in the previous section that the supplementary terms coming from the ‘singular points’ actually do not appear. Indeed, they should appear only in the sub-leading order of a given term, but here the only one we have is the ‘Yang–Mills’ one, which is already the last with non-negative power of $\Lambda$ (i.e., such correction terms appear in 4-d with negative powers of $\Lambda$). In summary, the torus action being free or not has no serious implication for the structure of the spectral action in dimension four or less. One should emphasize that this is no longer true in higher dimensions. What the previous examples also show is that, up to sub-leading order terms,
\[ \mathrm{Tr}\big( L(l)\, R(r)\, e^{-t D^2} \big) = \int_M \mu_g(p)\; l_0(p)\, r_0(p)\, \widetilde{K}_t(p, p), \]
where $\widetilde{K}_t$ is the on-diagonal kernel of $e^{-t D^2}$. However, $a_0 = 0$ whenever $a$ is either a commutator or a derivative (with respect to the action $\alpha$) in $C^\infty(M_\Theta)$. Thus the same conclusion holds, namely, that the asymptotics of $\mathrm{Tr}\big[ \Phi\big( (D + A + \epsilon J A J^{-1})^2/\Lambda^2 \big) \big]$ can be easily derived from those of $\mathrm{Tr}\big[ \Phi\big( (D + A)^2/\Lambda^2 \big) \big]$, in 4-d with Diophantine deformation matrix. Note that the latter is easily computable from the classical asymptotics of the kernel $\widetilde{K}_t(p, p)$. To compute the spectral action, it can also be convenient to use the full force of [9].
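The numerical coefficient of the Yang–Mills term in the four-dimensional spectral action can be traced through a short computation. The following is a sketch under stated assumptions: hermitian $\gamma$-matrices with $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ and $g = \mathrm{diag}(1,1,1,1)$, the shorthand $\Omega_{\mu\nu} := L(F_{\mu\nu}) - R(F_{\mu\nu})$, and $\mathrm{tr}$ denoting the trace over spinor indices only (so $\mathrm{tr}\, 1 = 4$). The overall sign of $E$ plays no role since only $E^2$ enters.

```latex
\operatorname{tr}\big(\gamma^{\mu\nu}\gamma^{\rho\sigma}\big)
  = 4\big(g^{\mu\sigma}g^{\nu\rho} - g^{\mu\rho}g^{\nu\sigma}\big),
\qquad
E = \mp\tfrac12\,\Omega_{\mu\nu}\otimes\gamma^{\mu\nu}
\;\Longrightarrow\;
\operatorname{tr}(E^2)
  = \tfrac14\,\Omega_{\mu\nu}\Omega_{\rho\sigma}\,
    \operatorname{tr}\big(\gamma^{\mu\nu}\gamma^{\rho\sigma}\big)
  = -2\,\Omega^{\mu\nu}\Omega_{\mu\nu},
\]
\[
\tfrac12\operatorname{tr}(E^2) + \tfrac1{12}\operatorname{tr}(1)\,\Omega^{\mu\nu}\Omega_{\mu\nu}
  = \Big(-1 + \tfrac13\Big)\Omega^{\mu\nu}\Omega_{\mu\nu}
  = 4\cdot\Big(-\tfrac16\Big)\,\Omega^{\mu\nu}\Omega_{\mu\nu},
\]
\[
\Omega^{\mu\nu}\Omega_{\mu\nu}
  = L(F^{\mu\nu}F_{\mu\nu}) + R(F^{\mu\nu}F_{\mu\nu}) - 2\,L(F^{\mu\nu})\,R(F_{\mu\nu}).
```

The last line uses that $L$ is a representation, $R$ an antirepresentation, and that the two commute. Multiplying by $\Phi_4 = \Phi(0)$ reproduces the factor $-\Phi(0)/6$ inside the overall factor $4$, and the mixed term $L(F^{\mu\nu})R(F_{\mu\nu})$ is the origin of the product of traces $\tau(F^{\mu\nu})\tau(F_{\mu\nu})$.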
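5. NC-QFT: Structure of Divergences for a Scalar Field Theory

Let us consider a real scalar field $\phi$ in a four-dimensional NC torus with the classical action
\[ S[\phi] = \int d^4x\, \sqrt{g}\, \Big( \tfrac{1}{2} (\partial_\mu \phi)^2 + \tfrac{1}{2} m^2 \phi^2 + \tfrac{\lambda}{24}\, \phi \star \phi \star \phi \star \phi \Big), \tag{58} \]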
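where $\lambda$ is a coupling constant. Here we change to the notations which are more common in quantum field theory and write the star products (4) or (33) and partial derivatives. To calculate the effective action in this theory, we split $\phi = \varphi - \delta\phi$ into a background part $\varphi$ and quantum fluctuations $\delta\phi$. Then one expands $S[\phi]$ about the background. The first term, $S[\varphi]$, simply gives the classical approximation to the effective action. The second term, which is proportional to the first derivative of $S[\varphi]$, is canceled by external sources. The quadratic term can be rewritten as
\[ S^{(2)}[\varphi, \delta\phi] = \frac{1}{2} \int d^4x\, \sqrt{g}\; (\delta\phi)\, P\, (\delta\phi), \]
where (cf. [22])
\[ P = -g^{\mu\nu} \partial_\mu \partial_\nu + \frac{\lambda}{6} \big[ R(\varphi \star \varphi) + L(\varphi \star \varphi) + L(\varphi) R(\varphi) \big] + m^2. \tag{59} \]
Note that $P > 0$ for $\lambda > 0$, since $-g^{\mu\nu} \partial_\mu \partial_\nu \geq 0$ and $\varphi^* = \varphi$ so that
\[ R(\varphi \star \varphi) + L(\varphi \star \varphi) + L(\varphi) R(\varphi) = \tfrac{1}{2} \big( L(\varphi) + R(\varphi) \big)^* \big( L(\varphi) + R(\varphi) \big) + \tfrac{1}{2} L(\varphi)^* L(\varphi) + \tfrac{1}{2} R(\varphi)^* R(\varphi). \]
This operator corresponds to the following choice in (5–7):
\[ \lambda_\mu = \rho_\mu = 0, \qquad l_1 = -r_1 = -\frac{\lambda}{6}\, \varphi \star \varphi - \frac{m^2}{2}, \qquad l_2 = -r_2 = \frac{\lambda}{6}\, \varphi. \]
Formally, the one-loop effective action reads
\[ W = \frac{1}{2} \ln \det P. \]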
This expression has to be regularized. We shall use the zeta-function regularization [19, 30]. The zeta function can then be defined as an $L^2$-trace, $\zeta(s, P) = \mathrm{Tr}(P^{-s})$. The pole structure of the zeta function is determined by the asymptotic properties of the heat trace at $t \to 0$ (see, e.g. [26]). Due to Theorem 2.6, the zeta function of $P$ has only simple poles and is regular at $s = 0$. There is a useful relation
\[ a_k(P) = \mathrm{Res}_{s = \frac{n-k}{2}} \big( \Gamma(s)\, \zeta(s, P) \big). \]
In particular, $a_n = \zeta(0, P)$. The regularized effective action is defined as
\[ W(s) = -\tfrac{1}{2}\, \mu^{2s}\, \Gamma(s)\, \zeta(s, P). \]
The regularization is removed in the limit $s \to 0$. $\mu$ is a dimensional constant introduced to keep the regularized effective action dimensionless. Because of a pole of the $\Gamma$-function, $W(s)$ is divergent at $s \to 0$. The divergent part of the effective action reads
\[ W_{\mathrm{div}}(s) = -\frac{1}{2s}\, \zeta(0, P) = -\frac{1}{2s}\, a_4(P). \]
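The relation between heat kernel coefficients and residues used above follows from a standard Mellin transform argument. The following is a sketch, assuming $P > 0$ and a small-$t$ expansion of the heat trace of the form $\mathrm{Tr}\, e^{-tP} \sim \sum_k t^{(k-n)/2} a_k(P)$:

```latex
\Gamma(s)\,\zeta(s,P) = \int_0^\infty dt\; t^{s-1}\, \operatorname{Tr} e^{-tP},
\qquad
\int_0^1 dt\; t^{s-1}\, t^{(k-n)/2}\, a_k(P)
  = \frac{a_k(P)}{s - \frac{n-k}{2}} .
```

The integral over $[1, \infty)$ is an entire function of $s$ because $\mathrm{Tr}\, e^{-tP}$ decays exponentially for $P > 0$, so each term of the expansion contributes a simple pole of $\Gamma(s)\zeta(s, P)$ at $s = (n-k)/2$ with residue $a_k(P)$. For $k = n$ the pole sits at $s = 0$, where $\Gamma(s)$ itself has a simple pole with residue $1$; since $\zeta(s, P)$ is regular there, the residue equals $\zeta(0, P)$, which gives $a_n = \zeta(0, P)$.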
Let us assume that $\Theta$ satisfies a Diophantine condition. Then, by Eq. (28),
\[ a_4(P) = \frac{1}{32\pi^2} \bigg[ v m^4 - \frac{\lambda m^2}{3} \Big( 2 \int d^4x \sqrt{g}\, \varphi_\star^2 + v^{-1} \Big( \int d^4x \sqrt{g}\, \varphi \Big)^2 \Big) + \frac{\lambda^2}{36} \Big( 2 \int d^4x \sqrt{g}\, \varphi_\star^4 + 3 v^{-1} \Big( \int d^4x \sqrt{g}\, \varphi_\star^2 \Big)^2 + 4 v^{-1} \int d^4x \sqrt{g}\, \varphi_\star^3 \int d^4x \sqrt{g}\, \varphi \Big) \bigg]. \tag{60} \]
Here $v = \mathrm{vol}\, \mathbb{T}^4 = (2\pi)^4 \sqrt{g}$, and $\varphi_\star^k$ denotes the $k$-th star-power of $\varphi$. For example, $\varphi_\star^3 := \varphi \star \varphi \star \varphi$. The theory is called form-renormalizable if all divergences in the effective action can be compensated by redefinitions of couplings in the classical action, i.e., if the divergent part of the effective action repeats the structure of the classical action. The term $m^4$ in (60) causes no problem as it does not depend on the field (it is said that such terms are removed by a renormalization of the cosmological constant). The terms with $m^2 \varphi^2$ and $\varphi^4$ can be removed by suitable renormalization of $m^2$ and $\lambda$ in (58). The remaining non-local terms cannot be renormalized away. Therefore, the model (58) is not form-renormalizable. It is instructive to consider an infinite-volume limit of (60). Let us introduce normalized coordinates $y^a = e^a_\mu x^\mu$, where $e^a_\mu$ is a constant vierbein, $e^a_\mu e^a_\nu = g_{\mu\nu}$, and let us assume that $\varphi(y^a)$ is kept constant as $v \to \infty$. Then all nonlocal terms vanish in the limit $v \to \infty$. This is consistent with the conclusion of [43] that the counterterms for $\phi^4$ on $\mathbb{R}^4$ are local in the zeta function regularization if $\Theta$ is nondegenerate. Note that, for a degenerate $\Theta$, this is no longer true [22]. This is because the IR divergence developed in the non-planar sector of the 2-point function (proportional to $|\xi|^{-2}$ in momentum space) turns out to be non-locally integrable and thus the associated Green function does not define a tempered distribution. Of course, one can add several terms to the classical action which repeat the structure of the one-loop divergences. However, such terms change, in general, also the divergent part of the effective action and bring up new structures. Can this process be closed after several steps?
Below we show that, at least in one-loop approximation, the answer is affirmative. First of all, we impose antiperiodic conditions in one of the coordinates, say $x^1$, on the field $\phi$ (and also on the background field $\varphi$, and on quantum fluctuations $\delta\phi$):
\[ \phi(x^1, x^2, x^3, x^4) = -\phi(x^1 + 2\pi, x^2, x^3, x^4). \tag{61} \]
In the language of NC-QFT, this corresponds to considering a field theory on a finite projective module over the noncommutative torus. This Hermitian module is well defined. One first lifts the torus action on the commutative vector bundle and then defines the module structure on the smooth sections via a star product made out of the lifted torus action (see [16]). This anti-periodic condition cancels all terms with linear integral of the field, $\int d^4x\, \varphi = 0$. By condition (61) also the Laplacian spectrum is changed, $\Delta \sim p_\mu p^\mu$, $p \in \mathbb{Z}^n_{(1/2)} := \mathbb{Z}^n + (1/2, 0, 0, 0)$. However, as $t \to 0$,
\[ \sum_{k \in \mathbb{Z}} e^{-(k+1/2)^2 t} = \sqrt{\frac{\pi}{t}} + \text{e.s.t.} \]
and all asymptotic relations derived above remain true after obvious modifications. Since quantum fluctuations are anti-periodic, one has to take in (12) $k \in \mathbb{Z}^n_{(1/2)}$. The fields $l$ and $r$ are powers of the background field $\varphi$. Therefore, $r$ and $l$ may be both periodic and anti-periodic. Thus, $q \in \mathbb{Z}^n \cup \mathbb{Z}^n_{(1/2)}$ in that equation (but only half of the Fourier coefficients may be non-zero in each particular case). The Diophantine condition also should be understood with respect to $k \in \mathbb{Z}^n_{(1/2)}$ and $q \in \mathbb{Z}^n \cup \mathbb{Z}^n_{(1/2)}$. To deal with the remaining non-local terms in (60) we add to the classical action a non-local part
\[ \delta S[\phi] = \frac{\tilde{\lambda}}{2\sqrt{g}} \Big( \int d^4x\, \sqrt{g}\, \phi^2 \Big)^2, \tag{62} \]
where $\tilde{\lambda}$ is a new coupling constant. Renormalization of $\tilde{\lambda}$ allows one to remove all existing one-loop divergences. It is important to make sure that no new types of divergences appear. Because of (62), the quadratic form of the action receives the contribution
\[ \delta S^{(2)}[\varphi, \delta\phi] = \frac{2\tilde{\lambda}}{\sqrt{g}} \Big( \int d^4x\, \sqrt{g}\; \varphi \cdot (\delta\phi) \Big)^2 + \frac{\tilde{\lambda}}{\sqrt{g}} \int d^4x\, \sqrt{g}\, \varphi^2 \int d^4x\, \sqrt{g}\, (\delta\phi)^2. \tag{63} \]
The second term in (63) is harmless. Due to that, the term $m^2$ in (59) is replaced by a background-dependent but still coordinate-independent mass term:
\[ m^2 \to \bar{m}^2 = m^2 + 2\tilde{\lambda} \int d^4x\, \varphi^2. \tag{64} \]
Let us assume $\tilde{\lambda} > 0$ so that after this replacement the spectrum of $P$ remains positive. Next we substitute $\bar{m}^2$ for $m^2$ in (60) and take into account that $\int d^4x\, \varphi = 0$ to see that the divergent part of the effective action does not receive any new structure beyond those which are already present in $S + \delta S$. The first term on the right hand side of (63) does not contribute to the divergences at all. Let us denote by $\bar{P}$ the operator (59) with the replacement (64). Then, the operator acting on quantum fluctuations reads
\[ P = \bar{P} + B, \qquad B = 4\tilde{\lambda}\, g^{-1/2}\, |\varphi\rangle\langle\varphi|, \]
where $|\varphi\rangle\langle\varphi|$ is a rank one operator (with suggestive notations). $B$ being proportional to a projector to a one-dimensional subspace of the Hilbert space, it is clear that the insertions of $B$ improve ultraviolet behavior of quantum amplitudes. To make this argument more precise we use the Duhamel principle (10):
\[ e^{-t(\bar{P} + B)} = \sum_{j=0}^{\infty} (-t)^j \beta_j. \]
For $j \geq 1$,
\[ \mathrm{Tr}\, \beta_j = \int_{\Delta_j} d^j s\; \mathrm{Tr}\Big( B\, e^{-(s_2 - s_1) t \bar{P}}\, B \cdots e^{-(1 - s_j + s_1) t \bar{P}} \Big) = \frac{(4\tilde{\lambda})^j}{g^{j/2}} \int_{\Delta_j} d^j s\; \langle\varphi| e^{-(s_2 - s_1) t \bar{P}} |\varphi\rangle \cdots \langle\varphi| e^{-(1 - s_j + s_1) t \bar{P}} |\varphi\rangle. \]
Since under our assumptions the operator $\bar{P}$ is positive, $|\langle\varphi| e^{-s\bar{P}} |\varphi\rangle| \leq \|\varphi\|^2_{L^2}$ for $s \geq 0$. Thus
\[ |\mathrm{Tr}\, \beta_j| \leq \frac{|4\tilde{\lambda}|^j\, \|\varphi\|^{2j}_{L^2}}{j!\, g^{j/2}}, \]
and this means that the series expansion
\[ \mathrm{Tr}\big( e^{-t(\bar{P} + B)} \big) - \mathrm{Tr}\big( e^{-t\bar{P}} \big) = \sum_{j=1}^{\infty} (-t)^j\, \mathrm{Tr}\, \beta_j \]
is absolutely convergent in the UV regime, and
\[ \mathrm{Tr}\big( e^{-t(\bar{P} + B)} \big) - \mathrm{Tr}\big( e^{-t\bar{P}} \big) = O(t), \]
i.e., the operator $B$ does not contribute to the coefficient $a_4(\bar{P} + B)$ or to the one-loop divergences. The action $S + \delta S$ with anti-periodic boundary conditions is indeed renormalizable at one loop. The canonical mass dimension of the coupling constant $\tilde{\lambda}$ is $+4$. Standard power-counting arguments show that the insertions of the interactions with $\tilde{\lambda}$ can only improve the ultraviolet behavior of corresponding Feynman diagrams (since in the momentum cut-off regularization each positive power of $\tilde{\lambda}$ should be accompanied by negative powers of the cut-off momentum). Although this power-counting may break down for noncommutative theories, it is nevertheless natural to conjecture that the theory with $S + \delta S$ will remain renormalizable also at higher orders of the loop expansion. The need to add non-standard terms to the NC action in order to achieve renormalizability is not really surprising (see [29]). Note that in the approach of [29], the Diophantine condition does not play any role. The difference probably comes from the fact that we are working with compact noncommutative directions right from the beginning, while in [29] the “compactification” appears dynamically due to the presence of an oscillator potential. Physical consequences of adding the non-local term (62) to the action are still unclear to us. Since we consider the case of a compact Euclidean manifold (a torus) there are no immediate problems with causality (note that on $\mathbb{R}^n$, no terms like (62) are required for the one-loop renormalization [43]). Alternatively, noncommutative theories may be viewed as effective low-energy theories, so that renormalizability is not required. In this case one needs a self-consistent subtraction scheme only. An example of such a scheme is given by the large-mass subtraction [5], which also uses the asymptotic expansion of the heat trace.
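As a numerical aside on the antiperiodic sector introduced with (61): the statement that the half-integer Gaussian sum equals $\sqrt{\pi/t}$ up to exponentially small terms can be checked directly. The sketch below uses hypothetical function names and a truncation chosen for illustration; it compares the truncated sum with $\sqrt{\pi/t}$ and with the leading correction $-2\sqrt{\pi/t}\, e^{-\pi^2/t}$ predicted by Poisson resummation, $\sum_{k \in \mathbb{Z}} e^{-(k+1/2)^2 t} = \sqrt{\pi/t} \sum_{m \in \mathbb{Z}} (-1)^m e^{-\pi^2 m^2/t}$.

```python
import math


def half_integer_gaussian_sum(t, cutoff=200):
    """Sum over k in Z of exp(-(k + 1/2)**2 * t), truncated to |k| <= cutoff."""
    return sum(math.exp(-((k + 0.5) ** 2) * t) for k in range(-cutoff, cutoff + 1))


def relative_deviation(t):
    """Relative deviation of the sum from its leading small-t behavior sqrt(pi / t)."""
    leading = math.sqrt(math.pi / t)
    return abs(half_integer_gaussian_sum(t) - leading) / leading


# The deviation shrinks like exp(-pi**2 / t), i.e. faster than any power of t.
deviation_at_1 = relative_deviation(1.0)
deviation_at_half = relative_deviation(0.5)

# Leading exponentially small correction from Poisson resummation (terms m = +1, -1).
predicted_at_1 = 2.0 * math.exp(-math.pi ** 2)
```

For $t$ much smaller than $1$ the correction drops below double precision, so the comparison with `predicted_at_1` is only meaningful for moderate $t$.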
The technical tools developed in the previous sections are sufficient to analyze other fields (spinors, gauge fields, etc.). In this paper, we restricted ourselves to abelian gauge fields. An extension to non-abelian gauge fields can be done rather straightforwardly. Probably, even an extension to superfields can be achieved since the technique used in superspace [3] is similar to the one presented here.

Acknowledgements. We thank Yann Bugeaud and Christian Mauduit for their help with number theory and Harald Grosse for helpful comments and discussions. We are indebted to José Gracia–Bondía and Joseph Várilly for philological advice and constructive comments. The work of DV was supported in part by the DFG project BO 1112/13-1 (Germany) and by the grant RNP 2.1.1.1112 (Russia).
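Appendix A. Proof of Asymptotic Formulae

Here we prove Eq. (17). Starting from (12),
\[
\operatorname{Tr}\bigl(L(l)R(r)e^{-t\Delta}\bigr) = \sqrt{g}\,(4\pi t)^{-n/2} \sum_{q\in\mathbb{Z}^n}\sum_{k\in\mathbb{Z}^n} l_q r_{-q}\, e^{-|\Theta q-2\pi k|^2/4t},
\]
we split the sum over $q$ as
\[
\sum_{q\in\mathbb{Z}^n} = \sum_{q\in\mathcal{Z}} + \sum_{q\in\mathcal{K}}. \tag{65}
\]
Consider the sum over $\mathcal{Z}$ first. Since for $q\in\mathcal{Z}$ the vector $\Theta q/2\pi$ belongs to $\mathbb{Z}^n$, one can shift $k$ in the subsequent sum by $\Theta q/2\pi$. This yields
\[
\sqrt{g}\,(4\pi t)^{-n/2}\Bigl(\sum_{q\in\mathcal{Z}} l_q r_{-q} + \sum_{q\in\mathcal{Z}}\sum_{0\neq k\in\mathbb{Z}^n} l_q r_{-q}\, e^{-\pi^2|k|^2/t}\Bigr) = \sqrt{g}\,(4\pi t)^{-n/2}\sum_{q\in\mathcal{Z}} l_q r_{-q} + \text{e.s.t.}
\]
since
\[
\Bigl|\sum_{q\in\mathcal{Z}}\sum_{0\neq k\in\mathbb{Z}^n} l_q r_{-q}\, e^{-\pi^2|k|^2/t}\Bigr| \le 2\, e^{-\pi^2/t} \sum_{q\in\mathcal{Z}} |l_q r_{-q}| \sum_{k\in\mathbb{N}^n} e^{-\pi^2(|k|^2+2k)/t} \le C\, e^{-\pi^2/t},
\]
because $\sum_{k\in\mathbb{N}^n} e^{-\pi^2(|k|^2+2k)/t}$ is uniformly bounded in $t$, and $\{l_q r_{-q}\}$ is a Schwartz sequence.

Next we have to consider the second sum in (65). For each $q\in\mathcal{K}$ we choose $k_0(q)\in\mathbb{Z}^n$ which minimizes the distance to $\Theta q/(2\pi)$ (if there are several such $k$, we can take any one of them). For $k\neq k_0$, one can estimate $|\Theta q-2\pi k|^2 \ge 4\pi^2(c_1+c_2|k-k_0|^2)$. From now on, the $c_i$ denote some positive constants. Therefore, the terms with $k\neq k_0$ give only exponentially small terms in the second sum of (65). It remains to evaluate the sum
\[
S = \sqrt{g}\,(4\pi t)^{-n/2}\sum_{q\in\mathcal{K}} l_q r_{-q}\, e^{-|\Theta q-2\pi k_0(q)|^2/4t}. \tag{66}
\]
Note that $\Theta q-2\pi k_0(q)\neq 0$ in (66). We split the sum in (66) into the sums over $\mathcal{K}_{\mathrm{per}}$ and over $\mathcal{K}_{\mathrm{inf}}$. The first sum consists of a finite number of exponentially small terms. Consequently, it is exponentially small itself. For $q\in\mathcal{K}_{\mathrm{inf}}$ we use the Diophantine condition (16) to obtain
\[
|S| \le \sqrt{g}\,(4\pi t)^{-n/2}\sum_{q\in\mathcal{K}_{\mathrm{inf}}} |l_q r_{-q}|\, e^{-C^2/(4|q|^{2(1+\beta)}t)} + \text{e.s.t.}
\]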
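The role of the Diophantine condition in this bound can be illustrated numerically. The sketch below is not part of the paper. It assumes, as the exponent in the last display suggests, that (16) has the form $|\Theta q-2\pi k|\ge C|q|^{-(1+\beta)}$, and it works in a one-dimensional analogue where a single real number `alpha` plays the role of $\Theta/(2\pi)$. The function names are hypothetical. It computes the smallest value of $q^{1+\beta}\,\mathrm{dist}(q\alpha,\mathbb{Z})$ over a finite range of $q$, once for the golden ratio and once for a truncated Liouville-type number.

```python
import math
from fractions import Fraction


def dist_to_integers(x):
    """Distance from x (float or Fraction) to the nearest integer."""
    frac = x - math.floor(x)
    return min(frac, 1 - frac)


def worst_scaled_gap(alpha, q_values, beta=0.0):
    """Smallest value of q**(1 + beta) * dist(q * alpha, Z) over q_values.

    In this 1-D analogue (an assumption), a Diophantine condition with
    exponent beta asks that this quantity stay above some C > 0 for all q >= 1.
    """
    return min(float(dist_to_integers(q * alpha)) * q ** (1 + beta)
               for q in q_values)


# Golden ratio: badly approximable, so the scaled gap stays bounded below.
golden = (1 + math.sqrt(5)) / 2
golden_gap = worst_scaled_gap(golden, range(1, 2001))

# Liouville-type number sum_k 10**(-k!) (truncated, exact rational arithmetic):
# it is approximated extremely well by fractions with denominator 10**(k!).
liouville = sum(Fraction(1, 10 ** math.factorial(k)) for k in range(1, 5))
liouville_gap = worst_scaled_gap(liouville, [10, 100, 10 ** 6])

print(golden_gap, liouville_gap)
```

For the golden ratio the scaled gap never drops below its value at $q=1$, whereas for the Liouville-type number it collapses at $q=10^6$. Numbers of the second kind are the ones for which the exponential suppression in the bound on $|S|$ is lost.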
Let us again divide the sum into two parts. The first one (which we denote $S_\le$) is taken over a cube $|q_\mu|\le Q$ except for $q=0$. The rest is denoted $S_>$. We estimate $S_>$ first. We would like to show that as $t\to 0$ this sum vanishes faster than $t^{p-n/2}$ for arbitrary positive $p$ (for convenience $n/2$ is extracted explicitly). Since $l$ and $r$ are smooth, their Fourier coefficients are of rapid decay. This means that the partial sums of $|l_q r_{-q}|$ outside of the cube vanish faster than any power of $Q$, i.e., if $Q$ is sufficiently large,
\[
\sum_{|q_\mu|>Q} |l_q r_{-q}| \le c_3\, Q^{-m} \tag{67}
\]
for any chosen $m$. For reasons which will become clear later, we choose $Q = c_4\, t^{-1/(4(1+\beta))}$, $m = 4(1+\beta)p+1$. The inequality (67) holds for a sufficiently large $c_4$ and a sufficiently small $t$. Now we estimate
\[
|S_>|\, t^{-p+n/2} \le c_4\,(4\pi)^{-n/2}\, t^{-p}\, Q^{-m} = c_5\, t^{1/(4(1+\beta))},
\]
so that the left hand side vanishes for any $p$ as $t\to 0$. We now turn to $S_\le$. This is a finite sum with at most $Q^n$ terms. The Fourier coefficients entering this sum are bounded by a constant, $|l_{-q} r_q|\le c_6$. We also have
\[
-\frac{C^2}{4|q|^{2(1+\beta)}\,t} \le -\frac{C^2}{4|Q|^{2(1+\beta)}\,t} = -\frac{c_7}{t^{1/2}}.
\]
Therefore,
\[
|S_\le|\, t^{-p+n/2} \le c_6\,(4\pi)^{-n/2}\, t^{-p}\,(2Q)^n\, e^{-c_7/t^{1/2}}.
\]
This expression vanishes at $t\to 0$ because of the exponential damping. This completes the proof of (17).

The proof of (19) goes in the same way. The main observation is that the sum over $\mathcal{K}\setminus\{0\}$ remains exponentially small, while the “main” terms (produced by $q\in\mathcal{Z}$) are zero since the first derivative of the geodesic distance vanishes in the coincidence limit.

B. Beyond the Diophantine Condition

This section is an attempt to understand what happens if $\Theta$ is “in between” rational numbers and “Diophantine numbers”. Consider the simplest case: $\mathbb{T}^2$ with
\[
\Theta^{\mu\nu} = \theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\]
and $g = \mathrm{diag}(1,1)$. To proceed, we need some results from number theory [7]. Let $f:\mathbb{R}_{\ge 1}\to\mathbb{R}_{>0}$ be a continuous function such that $x\mapsto x^2 f(x)$ is non-increasing. Consider the set
\[
\mathcal{F}(f) := \Bigl\{\theta\in\mathbb{R} : |\theta q-p| < q f(q) \text{ for infinitely many rational numbers } \tfrac{p}{q}\Bigr\}.
\]
The elements of $\mathcal{F}(f)$ are termed $f$-approximable. Note that we cannot expect the above estimate to be valid for all rational numbers $\frac{p}{q}$ since for all irrational numbers $\theta$, the set of fractional values of $(\theta q)_{q\ge 1}$ is dense in $[0,1]$. Then, there exists an uncountable set of real numbers $\theta/(2\pi)$ which are $f$-approximable but not $cf$-approximable for any $0<c<1$, see [7, Exercise 1.5]. Let us choose
\[
f(x) = (2\pi x)^{-1} e^{-c_8 x}, \tag{68}
\]
$c_8>1$, and fix a constant $c_9<1$. Let us pick a $\theta$ which is $f$-approximable, but not $c_9 f$-approximable. We restrict our attention to the functions $l$ and $r$ which do not depend on the second coordinate $x^2$ and satisfy $l_q r_{-q} = l_{-q} r_q$. Then, according to (12),
\[
\operatorname{Tr}\bigl(L(l)R(r)e^{-t\Delta}\bigr) = \frac{1}{4\pi t}\Bigl( l_0 r_0 + 2\sum_{q_1\ge 1}\sum_{k_2\in\mathbb{Z}} l_{q_1,0}\, r_{-q_1,0}\, e^{-(\theta q_1-2\pi k_2)^2/4t}\Bigr)
\]
up to exponentially small terms. The first term in parentheses is our standard result. Let us consider the “correction” term only. Let $k_2^{(0)}(q_1)$ be an integer which minimizes the distance to $\theta q_1/2\pi$. The sum over $k_2\neq k_2^{(0)}(q_1)$ is exponentially small. Consequently, the correction term becomes
\[
T(t) := \frac{1}{2\pi t}\sum_{q_1\ge 1} l_{q_1,0}\, r_{-q_1,0}\, e^{-(\theta q_1-2\pi k_2^{(0)}(q_1))^2/4t} + \text{e.s.t.} \tag{69}
\]
For a Diophantine $\theta$, the whole correction term (69) is exponentially small. For $\theta/(2\pi)\in\mathbb{Q}$ this term is $O(1/t)$. Below we work out two explicit examples and show that for the values of $\theta$ which we consider in this section the correction term is, in general, neither one nor the other.

Example 1. Let us take $l_{q_1,0}\, r_{-q_1,0} = e^{-\alpha|q_1|}$. According to our assumption, $\theta/(2\pi)$ is not $c_9 f$-approximable. Consequently, for all but a finite number of $q_1\in\mathbb{N}$, $|\theta q_1-2\pi k_2^{(0)}(q_1)| > c_9\, e^{-c_8 q_1}$. Then we can estimate (69) as
\[
T(t) \le \frac{1}{2\pi t}\sum_{q_1=0}^{\infty} e^{-\alpha|q_1|}\exp\Bigl(-\frac{c_9^2\, e^{-2c_8 q_1}}{4t}\Bigr) + \text{e.s.t.}
\]
(adding or removing any finite number of terms in this sum does not change it up to e.s.t.). Now we use the Euler–Maclaurin formula to transform this sum to an integral (with exponentially small correction terms):
\[
\frac{1}{2\pi t}\int_0^{\infty} dq\; e^{-\alpha q}\, e^{-c_9^2 e^{-2c_8 q}/4t}.
\]
This integral can be easily evaluated:
\[
T(t) \le (2\pi c_8)^{-1}\,\Gamma\Bigl(\frac{\alpha}{2c_8}\Bigr)\, c_9^{-\alpha/c_8}\cdot t^{-1+\frac{\alpha}{2c_8}} + \text{e.s.t.} \tag{70}
\]
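As a hedged numerical check of the power law in (70), the sketch below evaluates the integral $\frac{1}{2\pi t}\int_0^\infty dq\, e^{-\alpha q}\, e^{-c_9^2 e^{-2c_8 q}/4t}$ from Example 1 with a plain trapezoidal rule and estimates its scaling exponent in $t$. The parameter values, the cutoff, the step count and the function names are illustrative assumptions, not taken from the paper. Only the exponent $-1+\alpha/(2c_8)$ is tested, not the prefactor.

```python
import math


def correction_integral(t, alpha, c8, c9, n_steps=50_000):
    """Trapezoidal estimate of
    (1/(2*pi*t)) * int_0^inf exp(-alpha*q) * exp(-c9**2 * exp(-2*c8*q) / (4*t)) dq.
    """
    a = c9 ** 2 / (4.0 * t)
    # The integrand switches on near q* = log(a)/(2*c8) and then decays like
    # exp(-alpha*q); the cutoff below (an ad hoc choice) leaves a negligible tail.
    q_max = max(math.log(a), 0.0) / (2.0 * c8) + 40.0 / alpha
    h = q_max / n_steps

    def integrand(q):
        return math.exp(-alpha * q - a * math.exp(-2.0 * c8 * q))

    total = 0.5 * (integrand(0.0) + integrand(q_max))
    for i in range(1, n_steps):
        total += integrand(i * h)
    return total * h / (2.0 * math.pi * t)


def scaling_exponent(t1, t2, **params):
    """Slope of log T versus log t between two small values of t."""
    T1 = correction_integral(t1, **params)
    T2 = correction_integral(t2, **params)
    return math.log(T1 / T2) / math.log(t1 / t2)


# Sample values with c8 > 1 and c9 < 1, as required in the text.
params = dict(alpha=1.0, c8=2.0, c9=0.5)
measured = scaling_exponent(1e-4, 1e-6, **params)
predicted = -1.0 + params["alpha"] / (2.0 * params["c8"])
print(measured, predicted)
```

With these sample values the predicted exponent is $-3/4$, which lies strictly between the exponentially small Diophantine behaviour and the $O(1/t)$ rational behaviour.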
Example 2. In our second example we shall obtain a lower bound on $T(t)$, though for a different choice of $l$ and $r$. According to our assumption, $\theta/(2\pi)$ is $f$-approximable with $f$ given by (68). Therefore, for an infinite set of $q_j$, $j=1,2,\dots$,
\[
0 < |\theta q_j-2\pi k^{(0)}(q_j)| < e^{-c_8 q_j}. \tag{71}
\]
We suppose that the $q_j$ are ordered, $q_j<q_{j+1}$. Consequently $j\le q_j$ and (71) yields
\[
|\theta q_j-2\pi k^{(0)}(q_j)| < e^{-c_8 j}.
\]
Let us now take $l_{q,0}\, r_{-q,0} = \delta_{q,q_j}\, e^{-\alpha j}$. By repeating the arguments from the previous example, one obtains
\[
T(t) \ge (2\pi c_8)^{-1}\,\Gamma\Bigl(\frac{\alpha}{2c_8}\Bigr)\cdot t^{-1+\frac{\alpha}{2c_8}} + \text{e.s.t.} \tag{72}
\]
The two estimates (70), (72) suggest that for non-Diophantine irrational $\theta/(2\pi)$ one has to expect power-law corrections to the asymptotics (17). These power-law corrections are unstable in the sense that they crucially depend on the asymptotic behavior of $l_q r_{-q}$ for large $q$.

References

1. Avramidi, I.G.: A covariant technique for the calculation of the one-loop effective action. Nucl. Phys. B 355, 712 (1991); Erratum-ibid. B 509, 557 (1998)
2. Avramidi, I.G.: Matrix general relativity: A new look at old problems. Class. Quant. Grav. 21, 103 (2004)
3. Azorkina, O.D., Banin, A.T., Buchbinder, I.L., Pletnev, N.G.: One-loop effective potential in N = 1/2 generic chiral superfield model. Phys. Lett. B 635, 50–55 (2006)
4. Barvinsky, A.O., Vilkovisky, G.A.: Beyond the Schwinger-Dewitt technique: Converting loops into trees and in-in currents. Nucl. Phys. B 282, 163–188 (1987)
5. Bordag, M., Kirsten, K., Vassilevich, D.: On the ground state energy for a penetrable sphere and for a dielectric ball. Phys. Rev. D 59, 085011 (1999)
6. Branson, T.P., Gilkey, P.B., Vassilevich, D.V.: Vacuum expectation value asymptotics for second order differential operators on manifolds with boundary. J. Math. Phys. 39, 1040 (1998); Erratum-ibid. 41, 3301 (2000)
7. Bugeaud, Y.: Approximation by Algebraic Numbers. Cambridge: Cambridge Univ. Press, 2004
8. Chamseddine, A., Connes, A.: The spectral action principle. Commun. Math. Phys. 186, 731–750 (1997)
9. Chamseddine, A., Connes, A.: Inner fluctuations of the spectral action. J. Geom. Phys. 57, 1–21 (2006)
10. Connes, A., Douglas, M.R., Schwarz, A.: Noncommutative geometry and Matrix theory: Compactification on tori. J. High Energy Phys. 02, 003 (1998)
11. Connes, A.: C*-algèbres et géométrie différentielle. C.R. Acad. Sci. Paris 290, 599–604 (1980)
12. Connes, A.: Noncommutative Geometry. London and San Diego: Academic Press, 1994
13. Connes, A.: Noncommutative geometry and reality. J. Math. Phys. 36, 6194–6231 (1995)
14. Connes, A., Moscovici, H.: The local index formula in noncommutative geometry. Geom. Funct. Anal. 5, 174–243 (1995)
15. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. Commun. Math. Phys. 221, 141–159 (2001)
16. Connes, A., Dubois-Violette, M.: Noncommutative finite-dimensional manifolds. I. Spherical manifolds and related examples. Commun. Math. Phys. 230, 539–579 (2002)
17. Davies, E.B.: Heat Kernels and Spectral Theory. Cambridge: Cambridge University Press, 1989
18. Douglas, M.R., Nekrasov, N.A.: Noncommutative field theory. Rev. Mod. Phys. 73, 977–1029 (2001)
19. Dowker, J.S., Critchley, R.: Effective Lagrangian and energy momentum tensor in de Sitter space. Phys. Rev. D 13, 3224 (1976)
20. Estrada, R., Gracia-Bondía, J.M., Várilly, J.C.: On summability of distributions and spectral geometry. Commun. Math. Phys. 191, 219–248 (1998)
21. Gayral, V.: Heat-kernel approach to UV/IR mixing on isospectral deformation manifolds. Ann. Henri Poincaré 6, 991–1023 (2005)
22. Gayral, V., Gracia-Bondía, J.M., Ruiz, F.R.: Trouble with space-like noncommutative field theory. Phys. Lett. B 610, 141–146 (2005)
23. Gayral, V., Gracia-Bondía, J.M., Ruiz, F.R.: Position-dependent noncommutative products: Classical construction and field theory. Nucl. Phys. B 727, 513–536 (2005)
24. Gayral, V., Iochum, B.: The spectral action for Moyal plane. J. Math. Phys. 46(4), 043503 (2005)
25. Gayral, V., Iochum, B., Várilly, J.C.: Dixmier traces on noncompact isospectral deformations. J. Funct. Anal. 237, 507–539 (2006)
26. Gilkey, P.B.: Asymptotic Formulae in Spectral Geometry. Boca Raton, FL: Chapman & Hall/CRC, 2004
27. Gracia-Bondía, J.M., Iochum, B., Schücker, T.: The standard model in noncommutative geometry and fermion doubling. Phys. Lett. B 416, 123–128 (1998)
28. Gracia-Bondía, J.M., Várilly, J.C., Figueroa, H.: Elements of Noncommutative Geometry. Birkhäuser Advanced Texts, Boston: Birkhäuser, 2001
29. Grosse, H., Wulkenhaar, R.: Renormalisation of $\phi^4$ theory on noncommutative $\mathbb{R}^4$ in the matrix base. Commun. Math. Phys. 256, 305 (2005)
30. Hawking, S.W.: Zeta function regularization of path integrals in curved space-time. Commun. Math. Phys. 55, 133 (1977)
31. Herman, M.: Sur la conjugaison différentiable des difféomorphismes du cercle à des rotations. Pub. Math. de l’I.H.É.S. 49, 5–233 (1979)
32. Kirsten, K.: Spectral Functions in Mathematics and Physics. Boca Raton, FL: Chapman & Hall/CRC, 2001
33. Nepomechie, R.I.: Calculating heat kernels. Phys. Rev. D 31, 3291 (1985)
34. Nest, R., Vogt, E., Werner, W.: Spectral action and the Connes–Chamseddine model. In: Noncommutative Geometry and the Standard Model of Elementary Particle Physics. Scheck, F., Upmeier, H., Werner, W. (eds.). Lecture Notes in Phys. 596, Berlin: Springer, 2002, pp. 109–132
35. Rieffel, M.A.: C*-algebras associated with irrational rotations. Pac. J. Math. 93, 415–429 (1981)
36. Rieffel, M.A.: Deformation Quantization for Actions of $\mathbb{R}^d$. Memoirs Amer. Soc. 506, Providence, RI: Amer. Math. Soc., 1993
37. Szabo, R.J.: Quantum field theory on noncommutative spaces. Phys. Rept. 378, 207–299 (2003)
38. Ven, A.E.M. van de: Index-free heat kernel coefficients. Class. Quant. Grav. 15, 2311 (1998)
39. Varilly, J.C.: An Introduction to Noncommutative Geometry. EMS Lecture Notes 4, European Mathematical Society, Zürich, 2006
40. Vassilevich, D.V.: Heat kernel expansion: User’s manual. Phys. Rept. 388, 279–360 (2003)
41. Vassilevich, D.V.: Non-commutative heat kernel. Lett. Math. Phys. 67, 185–194 (2004)
42. Vassilevich, D.V.: Quantum noncommutative gravity in two dimensions. Nucl. Phys. B 715, 695–712 (2005)
43. Vassilevich, D.V.: Heat kernel, effective action and anomalies in noncommutative theories. JHEP 0508, 085 (2005)
44. Widder, D.V.: The Laplace Transform. Princeton, NJ: Princeton University Press, 1946

Communicated by A. Connes
Commun. Math. Phys. 273, 445–471 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0246-y
Communications in
Mathematical Physics
Contour Dynamics of Incompressible 3-D Fluids in a Porous Medium with Different Densities

Diego Córdoba, Francisco Gancedo

Instituto de Matemáticas y Fisica Fundamental, Consejo Superior de Investigaciones Cientificas, Serrano 123, 28006 Madrid, Spain. E-mail: [email protected]

Received: 14 July 2006 / Accepted: 1 November 2006
Published online: 25 April 2007 – © Springer-Verlag 2007
Abstract: We consider the problem of the evolution of the interface given by two incompressible fluids through a porous medium, which is known as the Muskat problem and which in two dimensions is mathematically analogous to the two-phase Hele–Shaw cell. We focus on a fluid interface given by a jump of densities, the evolution equation being obtained using Darcy’s law. We prove local well-posedness when the smaller density is above (stable case) and in the unstable case we show ill-posedness.

1. Introduction

The evolution of a fluid in a porous medium is an important and interesting topic of fluid mechanics (see [3]). This phenomenon is based on an experimental physical principle given by H. Darcy in 1856. Darcy’s law for a 3-D fluid is given by the momentum equation
\[
\frac{\mu}{\kappa}\, v = -\nabla p - (0,0,g\rho),
\]
where $v$ is the incompressible velocity, $p$ is the pressure, $\mu$ is the dynamic viscosity, $\kappa$ is the permeability of the medium, $\rho$ is the liquid density and $g$ is the acceleration due to gravity.

A different problem is the motion of a 2-D fluid in a Hele–Shaw cell (see [12]). In this case the fluid is set between two fixed parallel plates. These plates are close enough that the mean velocity is described by
\[
\frac{12\mu}{b^2}\, v = -\nabla p - (0,g\rho),
\]
where $b$ denotes the distance between the plates. If the fluid in the porous medium is assumed to move in two directions only, suppressing one of the variables in the horizontal plane, these two different physical phenomena of fluid dynamics become mathematically analogous once we identify the permeability of the medium $\kappa$ with the constant $b^2/12$.
The Muskat problem (see [14]) and the two-phase Hele–Shaw flow (see [17]) model the evolution of an interface between two fluids (in a porous medium and in a Hele–Shaw cell respectively) with different viscosities and densities. A lot of information can be found in the literature about both problems (see references in [9] and [13]). These free boundary problems are considered with surface tension using the Laplace–Young condition and also without surface tension, in which case the pressures are equal on the interface. With surface tension, in the two-dimensional case it has been proven that the problems have classical solutions (see [11]). Without surface tension, Siegel, Caflisch and Howison [18] proved ill-posedness in an unstable 2-D case, namely when the higher-viscosity fluid contracts, and they showed global-in-time existence for small initial data in the stable case when the higher-viscosity fluid expands. The results rely on the assumption that the Atwood number
\[
A_\mu = \frac{\mu_1-\mu_2}{\mu_1+\mu_2}
\]
is nonzero, where $\mu_1$ and $\mu_2$ are the viscosities of the fluids. In the same year, Ambrose [1] treated the 2-D problem with initial data fulfilling
\[
(\rho_2-\rho_1)\, g\cos(\theta(\alpha,0)) + 2A_\mu U(\alpha,0) > 0,
\]
and the following condition
\[
\frac{(x(\alpha,0)-x(\alpha',0))^2 + (y(\alpha,0)-y(\alpha',0))^2}{(\alpha-\alpha')^2} > 0, \tag{1}
\]
where the interface is the curve $(x(\alpha,t),y(\alpha,t))$, $\rho_1$ and $\rho_2$ are the densities of the fluids, $\theta$ is the angle that the tangent to the curve forms with the horizontal and $U$ is the normal velocity (given by the Birkhoff–Rott integral).

We are interested in the case $A_\mu=0$, which describes the evolution of the interface for different densities. This case, for example, models moist and dry regions in a porous medium. While the work of Ambrose is based on the arclength and tangent angle formulation used by Hou, Lowengrub and Shelley [13], the particular form of the vorticity in the case $A_\mu=0$ allows us to parameterize the curve in the two-dimensional problem so that condition (1) holds for any time (see Eq. (16)).

Free boundary problems given by fluids with different densities have been intensely studied. Notice the classical paper of Taylor [22] and the works of Wu [23] and [24], where the full water wave problem is solved considering the water with positive density and the air with zero density. A study of the two-dimensional case can be found in [2], due to Ambrose and Masmoudi.

In order to simplify the notation, we consider $\mu/\kappa = 12\mu/b^2 = 1$ and $g=1$. Thus, the 3-D system is written as
\[
v(x_1,x_2,x_3,t) = -\nabla p(x_1,x_2,x_3,t) - (0,0,\rho(x_1,x_2,x_3,t)), \tag{2}
\]
where $(x_1,x_2,x_3)\in\mathbb{R}^3$ are the spatial variables and $t\ge 0$ denotes the time. Here $\rho$ is defined by
\[
\rho(x_1,x_2,x_3,t) = \begin{cases} \rho_1 & \text{in } \Omega_1(t),\\ \rho_2 & \text{in } \Omega_2(t), \end{cases}
\]
with $\rho_1,\rho_2\ge 0$ constants and $\rho_1\neq\rho_2$.
[Figure: the regions $\Omega_1(t)$ and $\Omega_2(t)$, drawn with coordinate axes $X_1$, $X_2$, $X_3$.]
We show in Sect. 2 that in this case it is not necessary to assume any condition on the pressure along the interface to obtain the contour equation. Furthermore, we illustrate below that the solutions to this model are weak solutions to the following conservation of mass equation:
\[
\frac{D\rho}{Dt} = \rho_t + v\cdot\nabla\rho = 0, \tag{3}
\]
where $\operatorname{div} v = 0$.

We notice the similarity with the 2-D vortex patch problem given by the two-dimensional Euler equation, where the vorticity is conserved along trajectories in a weak sense. The vorticity is considered to be a characteristic function of a domain. Chemin [8] proved global-in-time existence using paradifferential calculus. A simpler proof can be found in [4], due to Bertozzi and Constantin. In Sect. 2 we show that due to (2) the velocity can be determined from the density by singular integral operators (see [21]). This makes the equation more singular than the 2-D vortex patch problem, where the velocity is given by the Biot–Savart law.

A singular problem, more analogous to (3), is the evolution of the 2-D quasi-geostrophic (QG) equation for sharp fronts. The QG equation models the dynamics of cold and hot air and the formation of fronts. Here the temperature $\theta$ is conserved along particle trajectories and the velocity is given by singular integral operators in the following form: $v = (-R_2\theta, R_1\theta)$, where $R_1$ and $R_2$ are the Riesz transforms (see [10] for more details of the QG equation). Rodrigo [19] proposed the contour equation of the sharp fronts, where the temperature is concentrated in a domain, and proved local existence and uniqueness.

The paper is organized as follows. In Sect. 2 we derive the contour equation. We show that this equation fulfills the conservation of mass equation in Sect. 3. In Sect. 4 we prove local existence and uniqueness of the stable case. In Sect. 5 we get a family of global solutions of the 2-D stable case with small initial data. Finally, as a consequence of the previous section, in Sect. 6 we prove ill-posedness for the 3-D unstable case.
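2. The Contour Equation

We consider the equation with $(x_1,x_2,x_3)\in\mathbb{R}^3$ where the fluid has different densities, that is, $\rho$ is represented by
\[
\rho(x_1,x_2,x_3,t) = \begin{cases} \rho_1, & \{x_3 > f(x_1,x_2,t)\},\\ \rho_2, & \{x_3 < f(x_1,x_2,t)\}, \end{cases} \tag{4}
\]
$f$ being the interface. Using Darcy’s law we get
\[
\operatorname{curl}\operatorname{curl} v = \bigl(-\partial_{x_1}\partial_{x_3}\rho,\; -\partial_{x_2}\partial_{x_3}\rho,\; (\partial_{x_1}^2+\partial_{x_2}^2)\rho\bigr).
\]
Since $\operatorname{div} v = 0$ we have $\operatorname{curl}\operatorname{curl} v = -\Delta v$, therefore it follows that
\[
v = \bigl(\partial_{x_1}\Delta^{-1}\partial_{x_3}\rho,\; \partial_{x_2}\Delta^{-1}\partial_{x_3}\rho,\; -(\partial_{x_1}^2+\partial_{x_2}^2)\Delta^{-1}\rho\bigr). \tag{5}
\]
The integral operators $\partial_{x_1}\Delta^{-1}$ and $\partial_{x_2}\Delta^{-1}$ are given by the kernels
\[
K_1(x_1,x_2,x_3) = \frac{1}{4\pi}\frac{x_1}{(x_1^2+x_2^2+x_3^2)^{3/2}}, \qquad K_2(x_1,x_2,x_3) = \frac{1}{4\pi}\frac{x_2}{(x_1^2+x_2^2+x_3^2)^{3/2}},
\]
respectively, thus the velocity can be expressed by
\[
v = \bigl(K_1*\partial_{x_3}\rho,\; K_2*\partial_{x_3}\rho,\; -K_1*\partial_{x_1}\rho - K_2*\partial_{x_2}\rho\bigr). \tag{6}
\]
Since $\rho$ satisfies (4) we have
\[
\nabla\rho = (\rho_2-\rho_1)\bigl(\partial_{x_1}f(x_1,x_2,t),\, \partial_{x_2}f(x_1,x_2,t),\, -1\bigr)\,\delta(x_3-f(x_1,x_2,t)), \tag{7}
\]
where $\delta$ is the Dirac distribution. Using (6) we obtain
\[
v(x_1,x_2,x_3,t) = -\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{(y_1,\, y_2,\, \nabla f(x-y,t)\cdot y)}{[|y|^2+(x_3-f(x-y,t))^2]^{3/2}}\, dy, \tag{8}
\]
where we write $x=(x_1,x_2)$, $y=(y_1,y_2)$ and $\nabla f(x-y,t)\cdot y = \partial_{x_1}f(x-y,t)\,y_1 + \partial_{x_2}f(x-y,t)\,y_2$. In (8) $x_3\neq f(x,t)$ and the principal value is taken at infinity (see [21]).

When $x_3$ approaches $f(x,t)$ in the normal direction, we get a discontinuity in the velocity due to the fact that the vorticity is concentrated on the interface. Thus, for $\varepsilon>0$ we define
\[
v^1(x,f(x,t),t) = \lim_{\varepsilon\to 0} v\bigl(x_1-\varepsilon\partial_{x_1}f(x,t),\, x_2-\varepsilon\partial_{x_2}f(x,t),\, f(x,t)+\varepsilon,\, t\bigr),
\]
and
\[
v^2(x,f(x,t),t) = \lim_{\varepsilon\to 0} v\bigl(x_1+\varepsilon\partial_{x_1}f(x,t),\, x_2+\varepsilon\partial_{x_2}f(x,t),\, f(x,t)-\varepsilon,\, t\bigr).
\]
It follows that
\[
\begin{aligned}
v^1(x,f(x,t),t) = {}& -\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{(y_1,\, y_2,\, \nabla f(x-y,t)\cdot y)}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\\
& + \frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_1}f(x,t)\,(1,0,\partial_{x_1}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2}\\
& + \frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_2}f(x,t)\,(0,1,\partial_{x_2}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2},
\end{aligned} \tag{9}
\]
\[
\begin{aligned}
v^2(x,f(x,t),t) = {}& -\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{(y_1,\, y_2,\, \nabla f(x-y,t)\cdot y)}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\\
& - \frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_1}f(x,t)\,(1,0,\partial_{x_1}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2}\\
& - \frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_2}f(x,t)\,(0,1,\partial_{x_2}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2}.
\end{aligned} \tag{10}
\]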
The velocity in the tangential directions only moves the particles on the surface $f(x,t)$; i.e., if we rewrite the velocity in the tangential directions, we only make a change in the parametrization and do not alter the shape of the interface. Thus, it follows that
\[
v(x,f(x,t),t) = -\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{(y_1,\, y_2,\, \nabla f(x-y,t)\cdot y)}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy, \tag{11}
\]
due to the fact that the terms
\[
\pm\frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_1}f(x,t)\,(1,0,\partial_{x_1}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2}, \qquad
\pm\frac{\rho_2-\rho_1}{2}\,\frac{\partial_{x_2}f(x,t)\,(0,1,\partial_{x_2}f(x,t))}{1+(\partial_{x_1}f(x,t))^2+(\partial_{x_2}f(x,t))^2},
\]
are in the tangential directions. Moreover, if we add the following tangential terms to (11):
\[
\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{y_1}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\;(1,0,\partial_{x_1}f(x,t)),
\]
\[
\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{y_2}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\;(0,1,\partial_{x_2}f(x,t)),
\]
we obtain
\[
v(x,f(x,t),t) = \frac{\rho_2-\rho_1}{4\pi}\Bigl(0,\,0,\; PV\!\int_{\mathbb{R}^2} \frac{(\nabla f(x,t)-\nabla f(x-y,t))\cdot y}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\Bigr). \tag{12}
\]
Finally we have the contour equation given by
\[
\begin{aligned}
\frac{df}{dt}(x,t) &= \frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{R}^2} \frac{(\nabla f(x,t)-\nabla f(x-y,t))\cdot y}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy,\\
f(x,0) &= f_0(x).
\end{aligned} \tag{13}
\]
In the periodic case, we can obtain an equation equivalent to (13) because the integral operators $\partial_{x_1}\Delta^{-1}$ and $\partial_{x_2}\Delta^{-1}$ can be represented by the kernels
\[
K_1^p(x_1,x_2,x_3) = \frac{1}{4\pi}\Bigl(\frac{x_1}{(x_1^2+x_2^2+x_3^2)^{3/2}}\, L(x_1,x_2,x_3) + M(x_1,x_2,x_3)\Bigr),
\]
\[
K_2^p(x_1,x_2,x_3) = \frac{1}{4\pi}\Bigl(\frac{x_2}{(x_1^2+x_2^2+x_3^2)^{3/2}}\, L(x_1,x_2,x_3) + M(x_1,x_2,x_3)\Bigr),
\]
respectively, for $(x_1,x_2,x_3)\in\mathbb{T}^2\times\mathbb{R}$ with $\mathbb{T}^2=[-\pi,\pi]^2$ and functions $L, M\in C^\infty(\mathbb{T}^2\times\mathbb{R})$ (see [20] for the kernel of the Riesz potentials on the $n$-torus). Adding an appropriate function to the singular part of $K_1^p$ and $K_2^p$, we can choose
\[
L\in C_c^\infty(\mathbb{T}^2\times\mathbb{R}), \quad L\ge 0, \quad \operatorname{supp} L\subset\{x_1^2+x_2^2+x_3^2\le 4\}, \tag{14}
\]
$L=1$ in $\{x_1^2+x_2^2+x_3^2\le 1\}$ and $L(-x_1,-x_2,-x_3)=L(x_1,x_2,x_3)$. The function $M$ belongs to $C_b^\infty(\mathbb{T}^2\times\mathbb{R})$ and $M(0,0,0)=0$. The velocity can be expressed by
\[
v = \bigl(K_1^p*\partial_{x_3}\rho,\; K_2^p*\partial_{x_3}\rho,\; -K_1^p*\partial_{x_1}\rho - K_2^p*\partial_{x_2}\rho\bigr),
\]
and due to (7) it follows (suppressing the dependence on $t$) that
\[
\begin{aligned}
v(x_1,x_2,x_3) = {}& -\frac{\rho_2-\rho_1}{4\pi}\, PV\!\int_{\mathbb{T}^2} \frac{(y_1,\, y_2,\, \nabla f(x-y)\cdot y)}{[|y|^2+(x_3-f(x-y))^2]^{3/2}}\, L(y,x_3-f(x-y))\, dy\\
& -\frac{\rho_2-\rho_1}{4\pi}\int_{\mathbb{T}^2} \bigl(1,\,1,\,\nabla f(x-y)\cdot(1,1)\bigr)\, M(y,x_3-f(x-y))\, dy,
\end{aligned}
\]
if $x_3\neq f(x)$. Adding a term in the tangential direction we obtain
\[
\begin{aligned}
v(x,f(x)) = \frac{\rho_2-\rho_1}{4\pi}\Bigl(0,\,0,\; & \int_{\mathbb{T}^2} \frac{(\nabla f(x)-\nabla f(x-y))\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, L(y,f(x)-f(x-y))\, dy\\
& + \int_{\mathbb{T}^2} (\nabla f(x)-\nabla f(x-y))\cdot(1,1)\, M(y,f(x)-f(x-y))\, dy\Bigr).
\end{aligned}
\]
Finally we have the contour equation in the periodic case given by
\[
\begin{aligned}
\frac{df}{dt}(x,t) = {}& \frac{\rho_2-\rho_1}{4\pi}\int_{\mathbb{T}^2} \frac{(\nabla f(x,t)-\nabla f(x-y,t))\cdot y}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, L(y,f(x,t)-f(x-y,t))\, dy\\
& + \frac{\rho_2-\rho_1}{4\pi}\int_{\mathbb{T}^2} (\nabla f(x,t)-\nabla f(x-y,t))\cdot(1,1)\, M(y,f(x,t)-f(x-y,t))\, dy,\\
f(x,0) = {}& f_0(x).
\end{aligned} \tag{15}
\]
We use both formulations throughout the paper.

Suppose that the function $f(x)$ only depends on $x_1$ in Eq. (13). Then the contour equation in the 2-D case (with a 1-D interface) follows:
\[
\begin{aligned}
\frac{df}{dt}(x,t) &= \frac{\rho_2-\rho_1}{2\pi}\, PV\!\int_{\mathbb{R}} \frac{(\partial_x f(x,t)-\partial_x f(x-\alpha,t))\,\alpha}{\alpha^2+(f(x,t)-f(x-\alpha,t))^2}\, d\alpha,\\
f(x,0) &= f_0(x); \quad x\in\mathbb{R}.
\end{aligned} \tag{16}
\]
This equation can be obtained in a similar way to (13) using the stream function. Performing a two-dimensional analysis using the stream function, we obtain an equation equivalent to (16) in the two-dimensional periodic case as follows:
\[
\begin{aligned}
\frac{df}{dt}(x,t) = {}& \frac{\rho_2-\rho_1}{2\pi}\int_{\mathbb{T}} \frac{(\partial_x f(x,t)-\partial_x f(x-\alpha,t))\,\alpha}{\alpha^2+(f(x,t)-f(x-\alpha,t))^2}\, P(\alpha,f(x,t)-f(x-\alpha,t))\, d\alpha\\
& + \frac{\rho_2-\rho_1}{2\pi}\int_{\mathbb{T}} (\partial_x f(x,t)-\partial_x f(x-\alpha,t))\, Q(\alpha,f(x,t)-f(x-\alpha,t))\, d\alpha,\\
f(x,0) = {}& f_0(x),
\end{aligned} \tag{17}
\]
with $P(x_1,x_2)\in C_c^\infty(\mathbb{T}\times\mathbb{R})$, $P\ge 0$, $\operatorname{supp} P\subset\{x_1^2+x_2^2\le 4\}$, $P=1$ in $\{x_1^2+x_2^2\le 1\}$ and $P(-x_1,-x_2)=P(x_1,x_2)$.
The function $Q(x_1,x_2)$ belongs to $C_b^\infty(\mathbb{T}\times\mathbb{R})$ and $Q(0,0)=0$.

If we consider the linearized equation of the motion, we obtain a dissipative equation when $\rho_1<\rho_2$ (the greater density is below) and an unstable equation when $\rho_1>\rho_2$. The unstable linearized equation presents an instability similar to the Kelvin–Helmholtz instability (see [6]). As usual, we denote the Riesz transforms in $\mathbb{R}^2$ (see [21]) by
\[
R_1 f(x) = \frac{1}{2\pi}\, P.V.\!\int_{\mathbb{R}^2} \frac{y_1}{|y|^3}\, f(x-y)\, dy, \qquad
R_2 f(x) = \frac{1}{2\pi}\, P.V.\!\int_{\mathbb{R}^2} \frac{y_2}{|y|^3}\, f(x-y)\, dy,
\]
and the operator $\Lambda^s f$ is defined through the Fourier transform by $\widehat{\Lambda^s f}(\xi) = |\xi|^s \hat f(\xi)$. Suppose that $f(x)$ is uniformly small and that we can neglect the terms of order greater than one in (13). Then it reduces to the following linear equation:
\[
\begin{aligned}
f_t &= \frac{\rho_1-\rho_2}{2}\,(R_1\partial_{x_1}f + R_2\partial_{x_2}f) = \frac{\rho_1-\rho_2}{2}\,\Lambda f,\\
f(x,0) &= f_0(x).
\end{aligned} \tag{18}
\]
Applying the Fourier transform we get
\[
\hat f(\xi) = \hat f_0(\xi)\, e^{\frac{\rho_1-\rho_2}{2}|\xi| t},
\]
and therefore (18) is a dissipative equation when $\rho_1<\rho_2$ and an ill-posed problem in the case $\rho_1>\rho_2$ for general initial data in the Schwartz class. We need analytic initial data in order to get a well-posed problem for $\rho_1>\rho_2$.

3. The Conservation of Mass Equation

We show that if $\rho$ is defined by (4) and $f(x,t)$ is convected by the velocity (12), then $\rho$ is a weak solution of the conservation of mass Eq. (3), and conversely. From now on, $\Omega$ is equal to $\mathbb{R}^2$ or $\mathbb{T}^2$ and $\vec x=(x_1,x_2,x_3)$.
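Returning briefly to the linearized equation (18) of Sect. 2: since each Fourier mode evolves independently, the contrast between the stable and unstable cases can be made concrete in a few lines. The sketch below is illustrative only; the function names and the two sample initial data are assumptions, not part of the paper. It evaluates $|\hat f(\xi,t)| = |\hat f_0(\xi)|\, e^{\frac{\rho_1-\rho_2}{2}|\xi|t}$ and compares data with $|\hat f_0(\xi)| = e^{-\sqrt{|\xi|}}$ (decaying faster than any power, but not exponentially) against analytic data with $|\hat f_0(\xi)| = e^{-\sigma|\xi|}$.

```python
import math


def mode_amplitude(f0_hat, xi, t, rho1, rho2):
    """|f_hat(xi, t)| for the linearized equation (18):
    f_hat(xi, t) = f0_hat(xi) * exp((rho1 - rho2) / 2 * |xi| * t)."""
    return abs(f0_hat) * math.exp(0.5 * (rho1 - rho2) * abs(xi) * t)


def smooth_data(xi):
    """Sample datum (assumption): decays faster than any power of |xi|."""
    return math.exp(-math.sqrt(abs(xi)))


def analytic_data(xi, sigma=1.0):
    """Sample datum (assumption): exponential decay in |xi|."""
    return math.exp(-sigma * abs(xi))


frequencies = [1, 10, 100, 400]
t = 0.5

# Stable case rho1 < rho2: every mode is damped.
stable = [mode_amplitude(smooth_data(xi), xi, t, rho1=1.0, rho2=2.0)
          for xi in frequencies]

# Unstable case rho1 > rho2: the factor exp(|xi| * t / 2) eventually
# overwhelms the decay exp(-sqrt(|xi|)) of the initial datum.
unstable = [mode_amplitude(smooth_data(xi), xi, t, rho1=2.0, rho2=1.0)
            for xi in frequencies]

# Analytic data stay bounded in the unstable case while
# t < 2 * sigma / (rho1 - rho2), which equals 2 for these sample values.
unstable_analytic = [mode_amplitude(analytic_data(xi), xi, t, rho1=2.0, rho2=1.0)
                     for xi in frequencies]

print(stable, unstable, unstable_analytic, sep="\n")
```

In the unstable case the high-frequency amplitudes of the first datum blow up for any fixed $t>0$, while the analytic datum remains controlled for a short time, which is the mode-by-mode content of the remark on analytic initial data above.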
Definition 3.1. The density $\rho$ is a weak solution of the conservation of mass equation if for any $\varphi\in C^\infty(\Omega\times\mathbb{R}\times(0,T))$, $\varphi$ with compact support in the real case and periodic in $(x_1,x_2)$ otherwise, we have
\[
\int_0^T\!\!\int_{\Omega\times\mathbb{R}} \bigl(\rho(\vec x,t)\,\partial_t\varphi(\vec x,t) + v(\vec x,t)\,\rho(\vec x,t)\cdot\nabla\varphi(\vec x,t)\bigr)\, d\vec x\, dt = 0, \tag{19}
\]
where the incompressible velocity $v$ is given by Darcy’s law.

Then the following holds.

Proposition 3.2. If $f(x,t)$ satisfies (13) and $\rho(\vec x,t)$ is defined by (4), then $\rho$ is a weak solution of the conservation of mass equation. Furthermore, if $\rho$ is a weak solution of the conservation of mass equation given by (4), then $f(x,t)$ satisfies (13).

Proof. Let $\rho$ be a weak solution of (3) defined by (4). Integrating by parts we have
\[
\begin{aligned}
I &= \int_0^T\!\!\int_{\Omega\times\mathbb{R}} \rho\,\partial_t\varphi\, d\vec x\, dt = \rho_1\int_0^T\!\!\int_{\{x_3>f\}} \partial_t\varphi\, d\vec x\, dt + \rho_2\int_0^T\!\!\int_{\{x_3<f\}} \partial_t\varphi\, d\vec x\, dt\\
&= (\rho_1-\rho_2)\int_0^T\!\!\int_{\Omega} \varphi(x,f(x,t),t)\,\partial_t f(x,t)\, dx\, dt.
\end{aligned}
\]
On the other hand, due to (9) and (10) we obtain
\[
\begin{aligned}
J &= \int_0^T\!\!\int_{\Omega\times\mathbb{R}} \rho\, v\cdot\nabla\varphi\, d\vec x\, dt = \rho_1\int_0^T\!\!\int_{\{x_3>f\}} v\cdot\nabla\varphi\, d\vec x\, dt + \rho_2\int_0^T\!\!\int_{\{x_3<f\}} v\cdot\nabla\varphi\, d\vec x\, dt\\
&= \int_0^T\!\!\int_{\Omega} \varphi(x,f(x,t),t)\,\bigl(\rho_1 v^1(x,f(x,t),t) - \rho_2 v^2(x,f(x,t),t)\bigr)\cdot\bigl(\partial_{x_1}f(x,t),\partial_{x_2}f(x,t),-1\bigr)\, dx\, dt\\
&= (\rho_1-\rho_2)\int_0^T\!\!\int_{\Omega} \varphi(x,f(x,t),t)\, v(x,f(x,t),t)\cdot\bigl(\partial_{x_1}f(x,t),\partial_{x_2}f(x,t),-1\bigr)\, dx\, dt,
\end{aligned}
\]
where $v(x,f(x,t),t)$ is given by (11). We get
\[
J = \frac{(\rho_1-\rho_2)^2}{4\pi}\int_0^T\!\!\int_{\Omega} \varphi(x,f(x,t),t)\; PV\!\int_{\mathbb{R}^2} \frac{(\nabla f(x,t)-\nabla f(x-y,t))\cdot y}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy\, dx\, dt.
\]
Then $I+J=0$ due to (19). Thus, if we choose $\varphi(\vec x,t)=\varphi(x,t)$ for $x_3\in[-\|f\|_{L^\infty},\|f\|_{L^\infty}]$, it follows that $f(x,t)$ fulfills (13). Following the same arguments it is easy to check that if $f(x,t)$ satisfies (13), then $\rho$ is a weak solution given by (4).

Remark 3.3. Note that due to (5), the velocity satisfies
\[
v = \bigl(R_1(R_3\rho),\; R_2(R_3\rho),\; -(R_1^2+R_2^2)(\rho)\bigr),
\]
where the operators $R_1$, $R_2$ and $R_3$ are the Riesz transforms in three dimensions (see [21]). Since $\rho\in L^\infty(\Omega\times\mathbb{R})$, $v$ belongs to $BMO$ (bounded mean oscillation) and therefore $v$ is in $L^2(\Omega\times\mathbb{R})$ locally (see [21] for the definitions and properties of the $BMO$ space).
4. Local Well-Posedness for the Stable Case

In this section we prove local existence and uniqueness for the stable case using energy estimates. First we study the case $\Omega=\mathbb{R}^2$ and at the end of the section we give the main differences with the periodic domain. We denote the Sobolev spaces by $H^k$, the Hölder spaces by $C^{k,\delta}$ with Hölder exponent $0\le\delta<1$, and the Hessian matrix of a function $f(x)$ by $\nabla^2 f(x)$. The norms of $H^k$ and $C^{k,\delta}$ are defined as follows:
\[
\|f\|_{H^k}^2 = \|f\|_{L^2}^2 + \|\Lambda^k f\|_{L^2}^2,
\]
\[
\|f\|_{C^{k,\delta}} = \|f\|_{C^k} + \max_{i+j=k}\,\max_{x\neq y} \frac{|\partial_{x_1}^i\partial_{x_2}^j f(x) - \partial_{x_1}^i\partial_{x_2}^j f(y)|}{|x-y|^\delta}.
\]

4.1. Case $\Omega=\mathbb{R}^2$. The main theorem in this section is the following.

Theorem 4.1. Let $f_0(x)\in H^k(\mathbb{R}^2)$ for $k\ge 4$ and $\rho_2>\rho_1$. Then there exists a time $T>0$ so that there is a unique solution to (13) in $C^1([0,T];H^k(\mathbb{R}^2))$ with $f(x,0)=f_0(x)$.

We choose $\rho_2-\rho_1=4\pi$ without loss of generality, so that
\[
\begin{aligned}
\frac{df}{dt}(x,t) &= PV\!\int_{\mathbb{R}^2} \frac{(\nabla f(x,t)-\nabla f(x-y,t))\cdot y}{[|y|^2+(f(x,t)-f(x-y,t))^2]^{3/2}}\, dy,\\
f(x,0) &= f_0(x).
\end{aligned} \tag{20}
\]
We give the proof for $k=4$; the case $k>4$ is analogous. We apply energy methods (see [5] for more details). Then
\[
\begin{aligned}
\frac12\frac{d}{dt}\|f\|_{L^2}^2(t) &= \int_{\mathbb{R}^2} f(x)\, PV\!\int_{\mathbb{R}^2} \frac{(\nabla f(x)-\nabla f(x-y))\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, dy\, dx\\
&= \int_{\mathbb{R}^2}\int_{|y|<1} f(x)\,\frac{(\nabla f(x)-\nabla f(x-y))\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, dy\, dx\\
&\quad + \int_{\mathbb{R}^2} f(x)\, PV\!\int_{|y|>1} \frac{\nabla f(x)\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, dy\, dx\\
&\quad - \int_{\mathbb{R}^2} f(x)\, PV\!\int_{|y|>1} \frac{\nabla f(x-y)\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, dy\, dx\\
&= I_1+I_2+I_3.
\end{aligned}
\]
The identity
\[
\partial_{x_i}f(x)-\partial_{x_i}f(x-y) = \int_0^1 \nabla\partial_{x_i}f(x+(s-1)y)\cdot y\, ds
\]
yields
\[
\begin{aligned}
I_1 &\le C\int_0^1 ds\int_{|y|<1}|y|^{-1}\int_{\mathbb{R}^2} \frac{|f(x)|\,|\nabla^2 f(x+(s-1)y)|}{[1+(f(x)-f(x-y))^2|y|^{-2}]^{3/2}}\, dx\, dy\\
&\le C\int_0^1 ds\int_{|y|<1}|y|^{-1}\, dy \sum_{i+j=2}\|f\|_{L^2}\,\|\partial_{x_1}^i\partial_{x_2}^j f\|_{L^2} \le C\|f\|_{H^2}^2.
\end{aligned}
\]
Integrating by parts, the term $I_2$ is written
\[
\begin{aligned}
I_2 &= \frac32\int_{|y|>1}\int_{\mathbb{R}^2} |f(x)|^2\,\frac{(f(x)-f(x-y))(\nabla f(x)-\nabla f(x-y))\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{5/2}}\, dx\, dy\\
&\le C\int_{|y|>1}|y|^{-3}\int_{\mathbb{R}^2} |f(x)|^2\,\frac{|f(x)-f(x-y)|\,|y|^{-1}\,|\nabla f(x)-\nabla f(x-y)|}{[1+(f(x)-f(x-y))^2|y|^{-2}]^{5/2}}\, dx\, dy\\
&\le C\|f\|_{L^\infty}\|f\|_{H^1}^2.
\end{aligned}
\]
Integrating by parts in $I_3$, it follows that
\[
\begin{aligned}
I_3 &= \int_{|y|>1}\int_{\mathbb{R}^2} f(x)f(x-y)\,\frac{|y|^2-2(f(x)-f(x-y))^2}{[|y|^2+(f(x)-f(x-y))^2]^{5/2}}\, dx\, dy\\
&\quad + 3\int_{|y|>1}\int_{\mathbb{R}^2} f(x)f(x-y)\,\frac{(f(x)-f(x-y))\,\nabla f(x-y)\cdot y}{[|y|^2+(f(x)-f(x-y))^2]^{5/2}}\, dx\, dy\\
&\quad - \int_{|y|=1}\int_{\mathbb{R}^2} f(x)f(x-y)\,\frac{|y|^2}{[|y|^2+(f(x)-f(x-y))^2]^{3/2}}\, dx\, d\sigma(y)\\
&\le C(\|f\|_{L^\infty}+1)\|f\|_{H^1}^2.
\end{aligned}
\]
Using Sobolev inequalities, we finally get
\[
\frac{d}{dt}\|f\|_{L^2}^2(t) \le C\bigl(\|f\|_{H^2}^3(t)+1\bigr).
\]
We consider the quantity
\[
\frac12\frac{d}{dt}\|\partial_{x_1}^4 f\|_{L^2}^2(t) = I_4+I_5+I_6+I_7+I_8,
\]
where
(∇∂x41 f (x) − ∇∂x41 f (x − y)) · y ∂x41 f (x)P V d yd x, 2 2 3/2 R2 R2 [|y| + ( f (x) − f (x − y)) ] =4 ∂x41 f (x) (∇∂x31 f (x) − ∇∂x31 f (x − y)) · y ∂x1 A(x, y)d yd x, R2 R2 4 =6 ∂x1 f (x) (∇∂x21 f (x) − ∇∂x21 f (x − y)) · y ∂x21 A(x, y)d yd x, R2 R2 =4 ∂x41 f (x) (∇∂x1 f (x) − ∇∂x1 f (x − y)) · y ∂x31 A(x, y)d yd x, 2 2 R R 4 = ∂x1 f (x) (∇ f (x) − ∇ f (x − y)) · y ∂x41 A(x, y)d yd x,
I4 = I5 I6 I7 I8
R2
R2
and A(x, y) = [|y|2 + ( f (x) − f (x − y))2 ]−3/2 .
(21)
The most singular term is $I_4$. In order to estimate it, we write

$$I_4 = \int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{R}^2}\frac{\nabla\partial_{x_1}^4 f(x,t)\cdot y}{[|y|^2 + (f(x,t) - f(x-y,t))^2]^{3/2}}\,dy\,dx$$
$$\quad - \int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{R}^2}\frac{\nabla\partial_{x_1}^4 f(y,t)\cdot(x-y)}{[|x-y|^2 + (f(x,t) - f(y,t))^2]^{3/2}}\,dy\,dx = J_1 + J_2.$$

Integrating by parts,

$$J_1 = \frac{3}{2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,PV\!\int_{\mathbb{R}^2}\frac{(f(x) - f(x-y))(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,dy\,dx$$
$$= \frac{3}{2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\Big(\int_{|y|>1}dy + PV\!\int_{|y|<1}dy\Big)dx$$
$$\le \frac{3}{2}\|f\|_{C^1}\|\partial_{x_1}^4 f\|_{L^2}^2 + \frac{3}{2}M(f)\|\partial_{x_1}^4 f\|_{L^2}^2, \tag{22}$$

where

$$M(f) = \max_x\left|PV\!\int_{|y|<1}\frac{(f(x) - f(x-y))(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,dy\right|.$$

We estimate this maximum in the following form:

$$M(f) \le \max_x\left|\int_{|y|<1}\frac{(f(x) - f(x-y) - \nabla f(x)\cdot y)(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,dy\right|$$
$$\quad + \max_x\left|\int_{|y|<1}\frac{(\nabla f(x)\cdot y)\big((\nabla f(x) - \nabla f(x-y))\cdot y - y\cdot\nabla^2 f(x)\cdot y\big)}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,dy\right|$$
$$\quad + \max_x\left|\int_{|y|<1}(\nabla f(x)\cdot y)(y\cdot\nabla^2 f(x)\cdot y)\,(B(x,y) - C(x,y))\,dy\right|$$
$$\quad + \max_x\left|PV\!\int_{|y|<1}\frac{(\nabla f(x)\cdot y)(y\cdot\nabla^2 f(x)\cdot y)}{[|y|^2 + (\nabla f(x)\cdot y)^2]^{5/2}}\,dy\right|, \tag{23}$$

where

$$B(x,y) = [|y|^2 + (f(x) - f(x-y))^2]^{-5/2}, \qquad C(x,y) = [|y|^2 + (\nabla f(x)\cdot y)^2]^{-5/2}.$$

Making the change of variables $y = -z$, we obtain that the last integral in (23) is null, and then we can estimate $M(f)$ by

$$M(f) \le \|f\|_{C^2}^2\max_x\int_{|y|<1}\frac{|y|^{-1}}{[1 + ((f(x) - f(x-y))|y|^{-1})^2]^{5/2}}\,dy + \|f\|_{C^1}\|f\|_{C^{2,\delta}}\int_{|y|<1}|y|^{-2+\delta}\,dy + \|f\|_{C^1}^2\|f\|_{C^2}^2\int_{|y|<1}|y|^{-1}\,dy$$
$$\le C\big(\|f\|_{C^2}^2 + \|f\|_{C^1}\|f\|_{C^{2,\delta}} + \|f\|_{C^1}^2\|f\|_{C^2}^2\big),$$

with $0 < \delta < 1$, having finally

$$J_1 \le C(\|f\|_{C^{2,\delta}}^4 + 1)\|\partial_{x_1}^4 f\|_{L^2}^2. \tag{24}$$
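The vanishing of the last integral in (23) can be checked directly from the parity of the integrand. The following is only a sketch: the shorthand $p = \nabla f(x)$, $H = \nabla^2 f(x)$ and $\phi$ is introduced here for illustration, with $x$ held fixed.

```latex
% Assumed shorthand (not in the original text): p = \nabla f(x), H = \nabla^2 f(x), x fixed.
\[
  \phi(y) = \frac{(p\cdot y)\,(y\cdot H\cdot y)}{\bigl[|y|^2 + (p\cdot y)^2\bigr]^{5/2}},
  \qquad
  \phi(-y) = \frac{(-\,p\cdot y)\,(y\cdot H\cdot y)}{\bigl[|y|^2 + (p\cdot y)^2\bigr]^{5/2}} = -\phi(y).
\]
% The annulus \varepsilon < |y| < 1 is invariant under y -> -y, so with y = -z:
\[
  \int_{\varepsilon<|y|<1}\phi(y)\,dy
  = \int_{\varepsilon<|z|<1}\phi(-z)\,dz
  = -\int_{\varepsilon<|z|<1}\phi(z)\,dz
  \quad\Longrightarrow\quad
  \int_{\varepsilon<|y|<1}\phi(y)\,dy = 0 .
\]
% Every truncated integral vanishes, hence so does the principal value as \varepsilon -> 0.
```

Since $\phi$ is homogeneous of degree $-2$ in $y$, it is not absolutely integrable near the origin in $\mathbb{R}^2$, which is why the cancellation has to be taken in the principal value sense.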
In order to estimate $J_2$, we integrate by parts getting

$$J_2 = \int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{R}^2}\frac{\nabla_y(\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y))\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,dy\,dx = K_1 + K_2,$$

with

$$K_1 = -\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{R}^2}\frac{\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,dy\,dx,$$

and

$$K_2 = \int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\int_{\mathbb{R}^2}(\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y))\,\frac{3(f(x) - f(y))(f(x) - f(y) - \nabla f(y)\cdot(x-y))}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}\,dy\,dx.$$

Making a change of variables we can obtain

$$K_1 = -PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\frac{\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,dy\,dx$$
$$= PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(y)\,\frac{\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,dy\,dx$$
$$= -\frac{1}{2}\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\frac{(\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y))^2}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,dy\,dx \le 0.$$

Here we observe the main difference with the unstable case, in which we obtain the opposite sign. Now we consider $K_2 = L_1 + L_2 + L_3$, where

$$L_1 = 3\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,PV\!\int_{\mathbb{R}^2}\frac{(f(x) - f(y))(f(x) - f(y) - \nabla f(y)\cdot(x-y))}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}\,dy\,dx,$$
$$L_2 = -\frac{3}{4}\,PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\partial_{x_1}^4 f(y)\,\frac{(f(x) - f(y))\,(x-y)\cdot(\nabla^2 f(x) + \nabla^2 f(y))\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}\,dy\,dx,$$
$$L_3 = -3\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\partial_{x_1}^4 f(y)\,(f(x) - f(y))\,D(x,y)\,dy\,dx,$$

with

$$D(x,y) = \frac{f(x) - f(y) - \nabla f(y)\cdot(x-y) - \frac{1}{4}(x-y)\cdot(\nabla^2 f(x) + \nabla^2 f(y))\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}.$$
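The term $L_1$ can be estimated like $J_1$ in (22), and one finds that

$$L_1 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|\partial_{x_1}^4 f\|_{L^2}^2.$$

Exchanging $x$ for $y$ we obtain that $L_2 = 0$. For the last term $L_3$, it follows that

$$L_3 \le C\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,|f(x) - f(y)|\,|D(x,y)|\,dy\,dx + C\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(y)|^2\,|f(x) - f(y)|\,|D(x,y)|\,dx\,dy$$
$$\le C\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,|f(x) - f(x-y)|\,|D(x,x-y)|\,dy\,dx + C\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(y)|^2\,|f(x+y) - f(y)|\,|D(x+y,y)|\,dx\,dy$$
$$\le C\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,dx\Big(\int_{|y|<1}dy + \int_{|y|>1}dy\Big) + C\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(y)|^2\,dy\Big(\int_{|x|<1}dx + \int_{|x|>1}dx\Big)$$
$$\le C\|f\|_{C^1}\|f\|_{C^{2,\delta}}\|\partial_{x_1}^4 f\|_{L^2}^2.$$

Finally

$$J_2 = K_1 + K_2 \le K_2 = L_1 + L_2 + L_3 = L_1 + L_3 \le C(\|f\|_{C^{2,\delta}}^4 + 1)\|\partial_{x_1}^4 f\|_{L^2}^2,$$

and due to (24) we obtain

$$I_4 \le C(\|f\|_{C^{2,\delta}}^4 + 1)\|\partial_{x_1}^4 f\|_{L^2}^2.$$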
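The sign of $K_1$, which is what distinguishes the stable case, comes from a symmetrization in $x$ and $y$. The step can be sketched as follows; the shorthand $g$ and $k(x,y)$ is introduced here only for illustration.

```latex
% Assumed shorthand: g = \partial_{x_1}^4 f and
% k(x,y) = [|x-y|^2 + (f(x)-f(y))^2]^{-3/2}, so that k > 0 and k(x,y) = k(y,x).
\begin{align*}
K_1 &= -\,PV\!\iint g(x)\,\bigl(g(x)-g(y)\bigr)\,k(x,y)\,dy\,dx \\
    &= -\,PV\!\iint g(y)\,\bigl(g(y)-g(x)\bigr)\,k(x,y)\,dy\,dx
       && \text{(relabel } x \leftrightarrow y\text{, using the symmetry of } k\text{)} \\
    &= \phantom{-}\,PV\!\iint g(y)\,\bigl(g(x)-g(y)\bigr)\,k(x,y)\,dy\,dx .
\end{align*}
% Averaging the first and the last expression:
\[
K_1 = \frac12\iint \bigl[-g(x)+g(y)\bigr]\bigl(g(x)-g(y)\bigr)\,k(x,y)\,dy\,dx
    = -\frac12\iint \bigl(g(x)-g(y)\bigr)^2\,k(x,y)\,dy\,dx \;\le\; 0 .
\]
```

In the unstable case the evolution equation carries the opposite sign, so the same computation produces a nonnegative term that cannot be discarded.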
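Now we estimate the integral $I_5$. We have $I_5 = J_3 + J_4$, where

$$J_3 = 4\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\nabla\partial_{x_1}^3 f(x)\cdot PV\!\int_{\mathbb{R}^2}\frac{(f(x) - f(x-y))(\partial_{x_1} f(x) - \partial_{x_1} f(x-y))}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,y\,dy\,dx,$$

and

$$J_4 = -4\,PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\nabla\partial_{x_1}^3 f(y)\cdot(x-y)\,\frac{(f(x) - f(y))(\partial_{x_1} f(x) - \partial_{x_1} f(y))}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}\,dy\,dx$$
$$= -4\,PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\partial_{x_2}\partial_{x_1}^3 f(y)\,(x_2 - y_2)\,\frac{(f(x) - f(y))(\partial_{x_1} f(x) - \partial_{x_1} f(y))}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}\,dy\,dx. \tag{25}$$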
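The way to estimate $J_3$ is similar to the term $J_1$ in (22), and we find that

$$J_3 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2.$$

We decompose the term $J_4 = K_3 + K_4 + K_5 + K_6$ as follows:

$$K_3 = -4\,PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\partial_{x_2}\partial_{x_1}^3 f(x-y)\,y_2\,E(x,y)\,dy\,dx,$$
$$K_4 = -4\,PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,\partial_{x_2}\partial_{x_1}^3 f(x-y)\,y_2\,F(x,y)\,dy\,dx,$$
$$K_5 = -4\,PV\!\int_{\mathbb{R}^2}\int_{|y|<1}\partial_{x_1}^4 f(x)\,\partial_{x_2}\partial_{x_1}^3 f(x-y)\,y_2\,(\nabla f(x)\cdot y)(\nabla\partial_{x_1} f(x)\cdot y)\,(B(x,y) - C(x,y))\,dy\,dx,$$

and

$$K_6 = -4\,PV\!\int_{\mathbb{R}^2}\int_{|y|<1}\partial_{x_1}^4 f(x)\,\partial_{x_2}\partial_{x_1}^3 f(x-y)\,y_2\,\frac{(\nabla f(x)\cdot y)(\nabla\partial_{x_1} f(x)\cdot y)}{[|y|^2 + (\nabla f(x)\cdot y)^2]^{5/2}}\,dy\,dx,$$

where

$$E(x,y) = \frac{(f(x) - f(x-y) - \nabla f(x)\cdot y)(\partial_{x_1} f(x) - \partial_{x_1} f(x-y))}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}},$$

and

$$F(x,y) = \frac{(\nabla f(x)\cdot y)\big(\partial_{x_1} f(x) - \partial_{x_1} f(x-y) - \nabla\partial_{x_1} f(x)\cdot y\,\chi_{\{|y|<1\}}\big)}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}.$$

The terms $K_3$, $K_4$, and $K_5$ are estimated as $J_1$. Then we find that

$$K_3 \le C(1 + \|f\|_{C^2}^2)\|f\|_{H^4}^2, \qquad K_4 \le C(1 + \|f\|_{C^1}\|f\|_{C^{2,\delta}})\|f\|_{H^4}^2,$$

and

$$K_5 \le C(1 + \|f\|_{C^1}^2\|f\|_{C^2}^2)\|f\|_{H^4}^2.$$

We rewrite $K_6$ and we get

$$K_6 = -4\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,S(\partial_{x_2}\partial_{x_1}^3 f)(x)\,dx,$$

with the operator $S$ defined by

$$S(g)(x) = PV\!\int_{|y|<1}\frac{\Sigma(x,y)}{|y|^2}\,g(x-y)\,dy,$$

and

$$\Sigma(x,y) = \frac{\frac{y_2}{|y|}\big(\nabla f(x)\cdot\frac{y}{|y|}\big)\big(\nabla\partial_{x_1} f(x)\cdot\frac{y}{|y|}\big)}{\big[1 + \big(\nabla f(x)\cdot\frac{y}{|y|}\big)^2\big]^{5/2}}. \tag{26}$$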
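The function $\Sigma(x,y)$ satisfies

(i) $\Sigma(x,\lambda y) = \Sigma(x,y)$ for all $\lambda > 0$,
(ii) $\Sigma(x,-y) = -\Sigma(x,y)$,
(iii) $\sup_x|\Sigma(x,y)| \le \|\nabla\partial_{x_1} f\|_{L^\infty}$,

and therefore $S$ is a bounded linear map on $L^p(\mathbb{R}^2)$ for $1 < p < \infty$ and $\|S\|_p \le C\|\nabla\partial_{x_1} f\|_{L^\infty}$ (see [21] and references therein for more details). Then

$$K_6 \le C\|f\|_{C^2}\|\partial_{x_1}^4 f\|_{L^2}\|\partial_{x_2}\partial_{x_1}^3 f\|_{L^2}.$$

We finally obtain

$$I_5 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2.$$

In order to estimate the term $I_6$ we take

$$I_6 = 6\int_0^1 ds\int_{\mathbb{R}^2}dy\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\;y\cdot(\nabla^2\partial_{x_1}^2 f(x + (s-1)y))\cdot y\;\partial_{x_1}^2 A(x,y)\,dx$$
$$\le 6\int_0^1 ds\Big(\int_{|y|<1}dy + \int_{|y|>1}dy\Big)\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|\,|\nabla^2\partial_{x_1}^2 f(x + (s-1)y)|\,|\partial_{x_1}^2 A(x,y)|\,|y|^2\,dx$$
$$\le C\Big(\int_{|y|<1}|y|^{-2+\delta}\,dy + \int_{|y|>1}|y|^{-3}\,dy\Big)(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2.$$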
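The most singular term of $I_7$ is $K_7$,

$$K_7 = -12\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,(\nabla\partial_{x_1} f(x) - \nabla\partial_{x_1} f(x-y))\cdot y\;G(x,y)\,dy\,dx,$$

where

$$G(x,y) = \frac{(f(x) - f(x-y))(\partial_{x_1}^3 f(x) - \partial_{x_1}^3 f(x-y))}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}.$$

Due to $|\nabla\partial_{x_1} f(x) - \nabla\partial_{x_1} f(x-y)| \le \|f\|_{C^{2,\delta}}|y|^\delta$ and writing

$$\partial_{x_1}^3 f(x) - \partial_{x_1}^3 f(x-y) = \int_0^1\nabla\partial_{x_1}^3 f(x + (s-1)y)\cdot y\,ds,$$

we obtain $K_7 \le C(\|f\|_{C^{2,\delta}}^2 + 1)\|f\|_{H^4}^2$ and $I_7 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2$. The most singular term of $I_8$ is $K_8$,

$$K_8 = -12\int_{\mathbb{R}^2}\int_{\mathbb{R}^2}\partial_{x_1}^4 f(x)\,(\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(x-y))\,H(x,y)\,dy\,dx,$$

where

$$H(x,y) = \frac{(f(x) - f(x-y))(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}.$$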
Then the term

$$K_8 = -12\int_{\mathbb{R}^2}|\partial_{x_1}^4 f(x)|^2\,PV\!\int_{\mathbb{R}^2}\frac{(f(x) - f(x-y))(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}\,dy\,dx$$

is controlled as before. We obtain $K_8 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|\partial_{x_1}^4 f\|_{L^2}^2$ and $I_8 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2$. Finally, we have

$$\frac{d}{dt}\|\partial_{x_1}^4 f\|_{L^2}^2(t) \le C(1 + \|f\|_{C^{2,\delta}}^4(t))\|f\|_{H^4}^2(t),$$

and using Sobolev inequalities we get

$$\frac{d}{dt}\|\partial_{x_1}^4 f\|_{L^2}^2(t) \le C(\|f\|_{H^4}^6(t) + 1). \tag{27}$$

In a similar way we obtain

$$\frac{d}{dt}\|\partial_{x_2}^4 f\|_{L^2}^2(t) \le C(\|f\|_{H^4}^6(t) + 1), \tag{28}$$

and since we can define $\|f\|_{H^4}^2 = \|f\|_{L^2}^2 + \|\partial_{x_1}^4 f\|_{L^2}^2 + \|\partial_{x_2}^4 f\|_{L^2}^2$, due to (21), (27) and (28) it follows that

$$\frac{d}{dt}\|f\|_{H^4}(t) \le C(\|f\|_{H^4}^5(t) + 1).$$

Using Gronwall's inequality we get that the quantity $\|f\|_{H^4}$ is bounded up to a time $T = T(\|f_0\|_{H^4})$. Then, applying energy methods, the local existence result follows.

Let the functions $f_1(x,t)$, $f_2(x,t)$ be two solutions of Eq. (13) with $f_1(x,0) = f_2(x,0) = f_0(x)$, and $f = f_1 - f_2$. Then

$$\frac{d}{dt}\|f\|_{L^2}^2(t) = I_9 + I_{10} + I_{11},$$

with

$$I_9 = \int_{\mathbb{R}^2} f(x)\,\nabla f(x)\cdot PV\!\int_{\mathbb{R}^2} y\,[|y|^2 + (f_1(x) - f_1(x-y))^2]^{-3/2}\,dy\,dx,$$
$$I_{10} = -\int_{\mathbb{R}^2} f(x)\,PV\!\int_{\mathbb{R}^2}\nabla f(y,t)\cdot(x-y)\,[|x-y|^2 + (f_1(x) - f_1(y))^2]^{-3/2}\,dy\,dx,$$

and

$$I_{11} = \int_{\mathbb{R}^2} f(x)\,PV\!\int_{\mathbb{R}^2}(\nabla f_2(x) - \nabla f_2(x-y))\cdot y\;N(x,y)\,dy\,dx,$$

with

$$N(x,y) = [|y|^2 + (f_1(x) - f_1(x-y))^2]^{-3/2} - [|y|^2 + (f_2(x) - f_2(x-y))^2]^{-3/2}.$$
Integrating by parts in $I_9$ we have $I_9 \le C(\|f_1\|_{H^4})\|f\|_{L^2}^2$, and $I_{11} \le C(\|f_1\|_{H^4}, \|f_2\|_{H^4})\|f\|_{L^2}^2$. The term

$$I_{10} = -\int_{\mathbb{R}^2} f(x)\,PV\!\int_{\mathbb{R}^2}\nabla_y(f(y) - f(x))\cdot(x-y)\,[|x-y|^2 + (f_1(x) - f_1(y))^2]^{-3/2}\,dy\,dx$$
$$= -PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} f(x)(f(x) - f(y))\,[|x-y|^2 + (f_1(x) - f_1(y))^2]^{-3/2}\,dy\,dx$$
$$\quad + PV\!\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} f(x)(f(x) - f(y))\,\frac{3(f_1(x) - f_1(y))(f_1(x) - f_1(y) - \nabla f_1(y)\cdot(x-y))}{[|x-y|^2 + (f_1(x) - f_1(y))^2]^{5/2}}\,dy\,dx.$$

The first term is nonpositive by the same symmetrization as for $K_1$. Then we have that $I_{10} \le J_5 + J_6$, where

$$J_5 = \int_{\mathbb{R}^2}|f(x)|^2\,PV\!\int_{\mathbb{R}^2}\frac{3(f_1(x) - f_1(y))(f_1(x) - f_1(y) - \nabla f_1(y)\cdot(x-y))}{[|x-y|^2 + (f_1(x) - f_1(y))^2]^{5/2}}\,dy\,dx,$$

and

$$J_6 = -\int_{\mathbb{R}^2} f(x)\,PV\!\int_{\mathbb{R}^2} f(x-y)\,\frac{3(f_1(x) - f_1(x-y))(f_1(x) - f_1(x-y) - \nabla f_1(x-y)\cdot y)}{[|y|^2 + (f_1(x) - f_1(x-y))^2]^{5/2}}\,dy\,dx.$$

The term $J_5$ is estimated as $J_1$, obtaining $J_5 \le C(\|f_1\|_{H^4})\|f\|_{L^2}^2$. The term $J_6$ can be expressed as $J_6 = K_9 + K_{10}$ with

$$K_9 = -3\int_{\mathbb{R}^2}\int_{\mathbb{R}^2} f(x)f(x-y)\,\frac{(f_1(x) - f_1(x-y))\,G(x,y)}{[|y|^2 + (f_1(x) - f_1(x-y))^2]^{5/2}}\,dy\,dx,$$

where the function $G(x,y)$ is given by

$$G(x,y) = f_1(x) - f_1(x-y) - \nabla f_1(x-y)\cdot y - \frac{1}{4}\,y\cdot(\nabla^2 f_1(x) + \nabla^2 f_1(x-y))\cdot y.$$

One finds that the principal value $K_{10}$ is null. Therefore, we finally obtain $J_6 \le C(\|f_1\|_{H^4})\|f\|_{L^2}^2$. Applying Gronwall's inequality we get uniqueness.

4.2. Case $\Omega = \mathbb{T}^2$. In the periodic case we give the theorem of local well-posedness and the differences with $\Omega = \mathbb{R}^2$.

Theorem 4.2. Let $f_0(x) \in H^k(\mathbb{T}^2)$ for $k \ge 4$ and $\rho_2 > \rho_1$. Then there exists a time $T > 0$ so that there is a unique solution to (15) in $C^1([0,T]; H^k(\mathbb{T}^2))$ with $f(x,0) = f_0(x)$.
The proof is similar to that of Theorem 4.1, but we must use the properties of the function $L$ in (14). We consider without loss of generality $\rho_2 - \rho_1 = 4\pi$. In order to control the evolution of the quantity $\|\partial_{x_1}^4 f\|_{L^2}$, the most singular term is

$$I = \int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\int_{\mathbb{T}^2}\frac{(\nabla\partial_{x_1}^4 f(x) - \nabla\partial_{x_1}^4 f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{3/2}}\,L(y, f(x) - f(x-y))\,dy\,dx$$
$$= \int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,\nabla\partial_{x_1}^4 f(x)\cdot PV\!\int_{\mathbb{T}^2}\frac{L(y, f(x) - f(x-y))}{[|y|^2 + (f(x) - f(x-y))^2]^{3/2}}\,y\,dy\,dx$$
$$\quad - \int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{T}^2}\frac{\nabla\partial_{x_1}^4 f(y)\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,L(x-y, f(x) - f(y))\,dy\,dx = J_1 + J_2.$$

Integrating by parts,

$$J_1 = \frac{3}{2}\int_{\mathbb{T}^2}|\partial_{x_1}^4 f(x)|^2\,PV\!\int_{\mathbb{T}^2} A(x,y)\,L(y, f(x) - f(x-y))\,dy\,dx$$
$$\quad - \frac{1}{2}\int_{\mathbb{T}^2}|\partial_{x_1}^4 f(x)|^2\,PV\!\int_{\mathbb{T}^2}\frac{L_{x_3}(y, f(x) - f(x-y))\,(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{3/2}}\,dy\,dx = K_1 + K_2, \tag{29}$$

where

$$A(x,y) = \frac{(f(x) - f(x-y))(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{5/2}}$$

and $L_{x_3}(x_1,x_2,x_3) = \partial_{x_3}L(x_1,x_2,x_3)$. Due to $|L(x_1,x_2,x_3) - 1| \le C|(x_1,x_2,x_3)|$ we have

$$K_1 \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2 + C\|\partial_{x_1}^4 f\|_{L^2}^2\max_x\left|PV\!\int_{\mathbb{T}^2}\frac{(\nabla f(x)\cdot y)(y\cdot\nabla^2 f(x)\cdot y)}{[|y|^2 + (\nabla f(x)\cdot y)^2]^{5/2}}\,dy\right| \le C(1 + \|f\|_{C^{2,\delta}}^4)\|f\|_{H^4}^2.$$

Using that $|f(x) - f(x-y)| \le \|f\|_{C^1}|y|$ and $L_{x_3} = 0$ in $\{x_1^2 + x_2^2 + x_3^2 \le 4\}$, we have that

$$K_2 = -\frac{1}{2}\int_{\mathbb{T}^2}|\partial_{x_1}^4 f(x)|^2\int_{|y| > \frac{2}{1 + \|f\|_{C^1}}}\frac{L_{x_3}(y, f(x) - f(x-y))\,(\nabla f(x) - \nabla f(x-y))\cdot y}{[|y|^2 + (f(x) - f(x-y))^2]^{3/2}}\,dy\,dx \le C(\|f\|_{C^{2,\delta}}^4 + 1)\|f\|_{H^4}^2.$$

In order to estimate $J_2$, we integrate by parts getting

$$J_2 = -\int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{T}^2}\frac{\nabla_y(\partial_{x_1}^4 f(y) - \partial_{x_1}^4 f(x))\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,L(x-y, f(x) - f(y))\,dy\,dx = K_3 + K_4 + K_5 + K_6,$$

with

$$K_3 = -\int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,PV\!\int_{\mathbb{T}^2}\frac{\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}}\,L(x-y, f(x) - f(y))\,dy\,dx,$$
$$K_4 = \int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\int_{\mathbb{T}^2}(\partial_{x_1}^4 f(x) - \partial_{x_1}^4 f(y))\,B(x,y)\,L(x-y, f(x) - f(y))\,dy\,dx,$$
$$K_5 = \int_{\mathbb{T}^2}\int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,(\partial_{x_1}^4 f(y) - \partial_{x_1}^4 f(x))\,C(x,y)\,L_{x_3}(x-y, f(x) - f(y))\,dy\,dx,$$
$$K_6 = \int_{\mathbb{T}^2}\int_{\mathbb{T}^2}\partial_{x_1}^4 f(x)\,(\partial_{x_1}^4 f(y) - \partial_{x_1}^4 f(x))\,D(x,y)\,dy\,dx,$$

and

$$B(x,y) = \frac{3(f(x) - f(y))(f(x) - f(y) - \nabla f(y)\cdot(x-y))}{[|x-y|^2 + (f(x) - f(y))^2]^{5/2}}, \qquad C(x,y) = -\frac{\nabla f(y)\cdot(x-y)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}},$$
$$D(x,y) = -\frac{L_{x_1}(x-y, f(x) - f(y))\,(x_1 - y_1) + L_{x_2}(x-y, f(x) - f(y))\,(x_2 - y_2)}{[|x-y|^2 + (f(x) - f(y))^2]^{3/2}},$$
$$L_{x_1}(x_1,x_2,x_3) = \partial_{x_1}L(x_1,x_2,x_3), \qquad L_{x_2}(x_1,x_2,x_3) = \partial_{x_2}L(x_1,x_2,x_3).$$

Exchanging the variables $x$ and $y$ we can obtain $K_3 \le 0$. The terms $K_4$, $K_5$ and $K_6$ can be estimated in a similar way as $K_1$. Therefore, we obtain

$$\frac{d}{dt}\|\partial_{x_1}^4 f\|_{L^2}(t) \le C(\|f\|_{H^4}^5(t) + 1),$$

and analogously

$$\frac{d}{dt}\|\partial_{x_2}^4 f\|_{L^2}(t) \le C(\|f\|_{H^4}^5(t) + 1).$$

This proves local existence. The proof of uniqueness is similar to the case $\Omega = \mathbb{R}^2$.

4.3. 2-D case. Using Eq. (16) in $\mathbb{R}$ and Eq. (17) in the periodic case we obtain the following theorem.

Theorem 4.3. Let $f_0(x) \in H^k$ for $k \ge 3$ and $\rho_2 > \rho_1$. Then there exists a time $T > 0$ so that there is a unique solution to (16) in $C^1([0,T]; H^k)$ with $f(x,0) = f_0(x)$.

The proof is similar to that of Theorems 4.1 and 4.2.
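The differential inequality $\frac{d}{dt}\|f\|_{H^4} \le C(\|f\|_{H^4}^5 + 1)$ obtained in the proof of Theorem 4.1 gives an explicit lower bound for the existence time. The computation below is a sketch under that inequality alone; the names $y(t)$, $y_0$ and $T^*$ are introduced here for illustration, and $C$ is the (unspecified) constant from the estimate.

```latex
% Assumed notation: y(t) = \|f\|_{H^4}(t) \ge 0, y_0 = y(0), and y' \le C (y^5 + 1).
\[
  y' \le C\,(y^5 + 1) \le C\,(1+y)^5
  \quad\Longrightarrow\quad
  \frac{d}{dt}\,(1+y)^{-4} = -4\,(1+y)^{-5}\,y' \;\ge\; -4C .
\]
% Integrating from 0 to t:
\[
  (1+y(t))^{-4} \;\ge\; (1+y_0)^{-4} - 4Ct
  \quad\Longrightarrow\quad
  y(t) \;\le\; \bigl[(1+y_0)^{-4} - 4Ct\bigr]^{-1/4} - 1
  \qquad\text{for } 0 \le t < T^* := \frac{1}{4C\,(1+y_0)^{4}} .
\]
```

In this sketch the guaranteed time $T^*$ depends only on $C$ and on $\|f_0\|_{H^4}$, and it shrinks like $\|f_0\|_{H^4}^{-4}$ for large data, consistent with $T = T(\|f_0\|_{H^4})$ in the text.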
5. Global Solution for the 2-D Stable Case

In this section we obtain a family of global solutions for the 2-D stable case with small initial data with respect to a fixed norm. Indeed, we can get the result with initial data having the property $\|f_0\|_{H^s} = \infty$ for $s > 3/2$. In this section we consider $x \in \mathbb{R}$ and

$$\|f\|_a = \sum_k|\hat f(k)|\,e^{a|k|}.$$

For $a > 0$, if $\|f\|_a < \infty$, then the function $f$ can be extended analytically on the strip $|\Im z| < a$. Furthermore

$$\|\partial_x f\|_a \le C\,\frac{\|f\|_b}{b - a}, \tag{30}$$

for $b > a$. The main result of this section is the following.

Theorem 5.1. Let $f_0(x)$ be a function such that $\int_{\mathbb{T}} f_0(x)\,dx = 0$, $\|\partial_x f_0\|_0 \le \varepsilon$ for $\varepsilon$ small enough and

$$\|\partial_x^2 f_0\|_{b(t)} \le \varepsilon\,e^{b(t)}(1 + |b(t)|^{\gamma-1}), \tag{31}$$

with $0 < \gamma < 1$, $b(t) = a - (\rho_2 - \rho_1)t/2$, $\rho_2 > \rho_1$ and $a \le (\rho_2 - \rho_1)t/2$. Then, there exists a unique solution of (16) with $f(x,0) = f_0(x)$ and $\rho_2 > \rho_1$ satisfying

$$\|\partial_x f\|_a(t) \le C(\varepsilon)\exp\big((2\sigma a - (\rho_2 - \rho_1)t)/4\big), \tag{32}$$

and

$$\|\partial_x^2 f\|_a(t) \le C(\varepsilon)\Big(1 + \Big|\sigma a - \frac{\rho_2 - \rho_1}{2}t\Big|^{\gamma-1}\Big)\exp\big((2\sigma a - (\rho_2 - \rho_1)t)/4\big), \tag{33}$$

for $a \le \frac{\rho_2 - \rho_1}{2\sigma}t$, $\sigma = 1 + \delta$ and $0 < \delta < 1$.

The condition (31) can be satisfied for example if $\|\Lambda^{1+\gamma} f_0\|_0 < \varepsilon$ and $\hat f_0(0) = \hat f_0(1) = \hat f_0(-1) = 0$, since

$$\|\partial_x^2 f_0\|_{b(t)} \le e^{b(t)}\,\|\Lambda^{1+\gamma} f_0\|_0\,\max_{k\ge 2}|k|^{1-\gamma}e^{b(t)(|k|-1)}.$$

In order to prove the theorem, we use the Cauchy-Kowalewski method (see [15] and [16]) in a similar way as Caflisch and Orellana [7] and Siegel, Caflisch and Howison [18]. We show the proof with $\rho_2 - \rho_1 = 2$ without loss of generality. Let $g(x,t)$ and $h(x,t)$ be functions satisfying

$$g_t = -\Lambda g, \quad g(x,0) = f_0(x), \qquad h_t = -\Lambda h + T(g + h), \quad h(x,0) = 0, \tag{34}$$

with

$$T(f) = -\pi^{-1}\int_{\mathbb{R}}\frac{\partial_x f(x) - \partial_x f(x-\alpha)}{\alpha}\;\frac{\Big(\frac{f(x) - f(x-\alpha)}{\alpha}\Big)^2}{1 + \Big(\frac{f(x) - f(x-\alpha)}{\alpha}\Big)^2}\,d\alpha. \tag{35}$$

Then the function $f(x,t) = g(x,t) + h(x,t)$ is a solution of (16). First, we show some properties of the nonlinear operator $T$.
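Lemma 5.2. If $\|\partial_x f\|_a, \|\partial_x g\|_a < 1$ for $a \ge 0$ then

$$\widehat{T(f)}(0) = 0, \tag{36}$$
$$\|\partial_x T(f)\|_a \le C_1\|\partial_x^2 f\|_a\|\partial_x f\|_a, \tag{37}$$

and

$$\|\partial_x T(f) - \partial_x T(g)\|_a \le C_2(\|\partial_x^2 f\|_a + \|\partial_x^2 g\|_a)\|\partial_x f - \partial_x g\|_a + C_2(\|\partial_x f\|_a + \|\partial_x g\|_a)\|\partial_x^2 f - \partial_x^2 g\|_a, \tag{38}$$

with $C_1 = 4(1 - \|\partial_x f\|_a^2)^{-2}$ and $C_2 = 4(1 - \|\partial_x f\|_a^2)^{-2} + (1 - \|\partial_x g\|_a^2)^{-2}$.

Proof of the lemma. Due to the inequality $|\partial_x f(x)| \le \|\partial_x f\|_a < 1$ and by (35) we obtain

$$T(f) = \pi^{-1}\int_{\mathbb{R}}\frac{\partial_x f(x) - \partial_x f(x-\alpha)}{\alpha}\sum_{n\ge 1}(-1)^n\Big(\frac{f(x) - f(x-\alpha)}{\alpha}\Big)^{2n}d\alpha, \tag{39}$$

and

$$T(f) = \pi^{-1}\,\partial_x\int_{\mathbb{R}}\sum_{n\ge 1}\frac{(-1)^n}{2n+1}\Big(\frac{f(x) - f(x-\alpha)}{\alpha}\Big)^{2n+1}d\alpha.$$

Thus $\widehat{T(f)}(0) = 0$. Using (39),

$$\widehat{T(f)}(k) = \pi^{-1}\sum_{n\ge 1}(-1)^n\int_{\mathbb{R}}\sum_{k_0,\dots,k_{2n}}\delta\Big(\sum_{j=0}^{2n}k_j, k\Big)\,ik_0\prod_{j=0}^{2n}\hat f(k_j)\,\frac{1 - e^{-i\alpha k_j}}{\alpha}\,d\alpha$$
$$= \sum_{n\ge 1}(-1)^n\sum_{k_0,\dots,k_{2n}}\delta\Big(\sum_{j=0}^{2n}k_j, k\Big)\,M_n(k_0,\dots,k_{2n})\,ik_0\prod_{j=0}^{2n}\hat f(k_j),$$

where

$$M_n(k_0,\dots,k_{2n}) = \pi^{-1}\int_{\mathbb{R}}\prod_{j=0}^{2n}\frac{1 - e^{-i\alpha k_j}}{\alpha}\,d\alpha.$$

We get

$$M_n(k_0,\dots,k_{2n}) = (-1)^n\,m_n(k_0,\dots,k_{2n})\prod_{j=1}^{2n}k_j, \tag{40}$$

with

$$m_n(k_0,\dots,k_{2n}) = \pi^{-1}\int_0^1 ds_1\cdots\int_0^1 ds_{2n}\,PV\!\int_{\mathbb{R}}\frac{1 - e^{-i\alpha k_0}}{\alpha}\exp\Big(i\alpha\sum_{j=1}^{2n}(s_j - 1)k_j\Big)d\alpha$$
$$= \pi^{-1}\int_0^1 ds_1\cdots\int_0^1 ds_{2n}\,PV\!\int_{\mathbb{R}}\exp\Big(i\alpha\sum_{j=1}^{2n}(s_j - 1)k_j\Big)\frac{d\alpha}{\alpha} - \pi^{-1}\int_0^1 ds_1\cdots\int_0^1 ds_{2n}\,PV\!\int_{\mathbb{R}}\exp\Big(-i\alpha k_0 + i\alpha\sum_{j=1}^{2n}(s_j - 1)k_j\Big)\frac{d\alpha}{\alpha}$$
$$= i\int_0^1 ds_1\cdots\int_0^1 ds_{2n}\,(\operatorname{sign}A - \operatorname{sign}B),$$

and

$$A = \sum_{j=1}^{2n}(s_j - 1)k_j, \qquad B = -k + \sum_{j=1}^{2n}s_jk_j.$$

It follows that

$$\widehat{T(f)}(k) = \sum_{n\ge 1}\sum_{k_0,\dots,k_{2n}}\delta\Big(\sum_{j=0}^{2n}k_j, k\Big)\,m_n(k_0,\dots,k_{2n})\prod_{j=0}^{2n}k_j\hat f(k_j),$$

with $|m_n(k_0,\dots,k_{2n})| \le 2$. We have

$$\sum_k e^{a|k|}|k|\,|\widehat{T(f)}(k)| \le 2\sum_k\sum_{n\ge 1}\sum_{k_0,\dots,k_{2n}}e^{a|k|}|k|\,\delta\Big(\sum_{j=0}^{2n}k_j, k\Big)\prod_{j=0}^{2n}|k_j|\,|\hat f(k_j)|$$
$$\le 2\sum_{n\ge 1}(2n+1)\sum_{k_0,\dots,k_{2n}}e^{a|k_0|}|k_0|^2|\hat f(k_0)|\prod_{j=1}^{2n}e^{a|k_j|}|k_j|\,|\hat f(k_j)|,$$

and therefore

$$\|\partial_x T(f)\|_a \le 2\|\partial_x^2 f\|_a\sum_{n\ge 1}(2n+1)\|\partial_x f\|_a^{2n} = 2\|\partial_x^2 f\|_a\,\frac{3\|\partial_x f\|_a^2 - \|\partial_x f\|_a^4}{(1 - \|\partial_x f\|_a^2)^2}.$$

We get (37) for $\|\partial_x f\|_a < 1$. In a similar way we obtain (38).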
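The closed form of the series in the last step, and the passage from it to the constant $C_1$ in (37), can be verified as follows. This is a sketch; the abbreviation $x = \|\partial_x f\|_a$ is used here only for readability.

```latex
% Assumed abbreviation: x = \|\partial_x f\|_a with 0 \le x < 1.
\[
  \sum_{n\ge 0}(2n+1)\,x^{2n}
  = \frac{d}{dx}\sum_{n\ge 0}x^{2n+1}
  = \frac{d}{dx}\,\frac{x}{1-x^2}
  = \frac{1+x^2}{(1-x^2)^2},
\]
\[
  \sum_{n\ge 1}(2n+1)\,x^{2n}
  = \frac{1+x^2}{(1-x^2)^2} - 1
  = \frac{3x^2 - x^4}{(1-x^2)^2}.
\]
% On [0,1] the function 3x - x^3 is increasing with value 2 at x = 1, hence
\[
  2\,(3x^2 - x^4) = 2x\,(3x - x^3) \le 4x
  \quad\Longrightarrow\quad
  \|\partial_x T(f)\|_a \le \frac{4}{(1-x^2)^2}\,\|\partial_x^2 f\|_a\,x ,
\]
% which is (37) with C_1 = 4 (1 - x^2)^{-2}.
```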
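From (34) $g$ can be expressed as follows: $\hat g(k,t) = e^{-|k|t}\hat f_0(k)$, and by the hypothesis on the initial data we have

$$\|\partial_x g\|_a(t) \le \varepsilon e^{a-t}, \tag{41}$$
$$\|\partial_x^2 g\|_a(t) \le \varepsilon e^{a-t}(1 + (t-a)^{\gamma-1}), \tag{42}$$

for $t \ge a$. We will prove the existence of $h$ by an induction argument on the iterative equation

$$\partial_t h^{n+1} = -\Lambda h^{n+1} + T(g + h^n), \qquad h^{n+1}(x,0) = 0, \qquad h^0 = 0,$$

or

$$\hat h^{n+1}(k,t) = \int_0^t e^{-|k|(t-s)}\,\widehat{T(g + h^n)}(k,s)\,ds, \qquad h^0 = 0.$$

For $h^1$ we obtain the following estimates:

$$\|\partial_x h^1\|_a(t) \le \int_0^t\|\partial_x T(g)\|_{a+s-t}(s)\,ds = \int_0^{t-a} + \int_{t-a}^t = I_1 + I_2.$$

Using (37), (41) and (42) we get

$$I_1 \le e^{a-t}\int_0^{t-a}e^s\|\partial_x T(g)\|_0(s)\,ds \le Ce^{a-t}\int_0^{t-a}e^s\|\partial_x^2 g\|_0(s)\|\partial_x g\|_0(s)\,ds \le C\varepsilon^2e^{a-t}\int_0^{t-a}e^{-s}(1 + s^{\gamma-1})\,ds \le \frac{C\varepsilon^2(1 + 2\gamma)}{\gamma}\,e^{a-t}.$$

By (41) and (42) we have

$$I_2 \le C\int_{t-a}^t\|\partial_x^2 g\|_{a+s-t}(s)\,\|\partial_x g\|_{a+s-t}(s)\,ds \le C\varepsilon^2e^{2(a-t)}\,a\,(1 + (t-a)^{\gamma-1}) \le \frac{2C\varepsilon^2}{\delta}\,e^{a-t},$$

due to the inequalities $(a\delta)^{\gamma-1} > (t-a)^{\gamma-1}$ and $ae^{a-t} \le \delta^{-1}$ for $\sigma a < t$. Then

$$\|\partial_x h^1\|_a(t) \le \frac{5C\varepsilon^2}{\delta\gamma}\,e^{a-t}.$$

Choosing $b = a + s - t + \frac{t-a}{2}$ we have

$$\|\partial_x^2 h^1\|_a(t) \le \int_0^t\|\partial_x^2 T(g)\|_{a+s-t}(s)\,ds \le \int_0^t\frac{\|\partial_x T(g)\|_b(s)}{b - (a+s-t)}\,ds \le \frac{2}{t-a}\int_0^t\|\partial_x T(g)\|_b\,ds \le \frac{2}{t-a}\Big(\int_0^{\frac{t-a}{2}} + \int_{\frac{t-a}{2}}^t\Big) = I_3 + I_4,$$

where

$$I_3 \le \frac{2Ce^{\frac{a-t}{2}}}{t-a}\int_0^{\frac{t-a}{2}}e^s\|\partial_x^2 g\|_0(s)\|\partial_x g\|_0(s)\,ds \le \frac{2C\varepsilon^2e^{\frac{a-t}{2}}}{t-a}\int_0^{\frac{t-a}{2}}e^{-s}(1 + s^{\gamma-1})\,ds \le \frac{2C\varepsilon^2}{\gamma}\,e^{\frac{a-t}{2}}(1 + (t-a)^{\gamma-1}),$$

and

$$I_4 \le \frac{2C}{t-a}\int_{\frac{t-a}{2}}^t\|\partial_x^2 g\|_b(s)\|\partial_x g\|_b(s)\,ds \le \frac{2C\varepsilon^2}{t-a}\,e^{a-t}\Big(1 + \Big(\frac{t-a}{2}\Big)^{\gamma-1}\Big)\Big(\frac{t}{2} + \frac{a}{2}\Big) \le \frac{3C\varepsilon^2}{\delta}\,e^{a-t}(1 + (t-a)^{\gamma-1}).$$

Therefore

$$\|\partial_x h^1\|_a(t) \le \frac{5C\varepsilon^2}{\delta\gamma}\,e^{a-t}, \tag{43}$$
$$\|\partial_x^2 h^1\|_a(t) \le \frac{5C\varepsilon^2}{\delta\gamma}\,e^{\frac{a-t}{2}}(1 + (t-a)^{\gamma-1}). \tag{44}$$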
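The shifted index $a + s - t$ in the time integrals above, and the prefactor $e^{a-t}e^{s}$ in $I_1$, both come from the Duhamel formula for $\hat h^{n+1}$. The following sketch assumes that $\|u\|_a = \sum_k|\hat u(k)|e^{a|k|}$ with integer frequencies $k$ and that the same formula is used when the index is negative; $F$ is a shorthand introduced here.

```latex
% Assumed shorthand: F = T(g + h^n), with \hat F(0,s) = 0 by (36).
% Assumed norm: \|u\|_a = \sum_k |\hat u(k)| e^{a|k|}, also for negative a.
\[
  \|\partial_x h^{n+1}\|_a(t)
  \le \int_0^t \sum_{k\ne 0} |k|\,|\hat F(k,s)|\,e^{a|k|}\,e^{-|k|(t-s)}\,ds
  = \int_0^t \|\partial_x F\|_{a+s-t}(s)\,ds .
\]
% For s < t - a the index a + s - t is negative. Since |k| \ge 1 on the support of \hat F,
\[
  e^{(a+s-t)|k|} \le e^{a+s-t}
  \quad\Longrightarrow\quad
  \|\partial_x F\|_{a+s-t}(s) \le e^{a-t}\,e^{s}\,\|\partial_x F\|_0(s),
  \qquad 0 \le s \le t-a ,
\]
% which is the factor appearing in the bound for I_1.
```

The same observation explains why the time integral is split at $s = t - a$: below that point the exponential damping of the linear semigroup is used, and above it the analyticity radius is simply reduced to $a + s - t \ge 0$.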
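Define $r^{n+1} = h^{n+1} - h^n$,

$$R_n = \sup_{\substack{0\le a<\infty\\ \sigma a<t}}\Big(\|\partial_x r^n\|_a + \frac{\|\partial_x^2 r^n\|_a}{1 + (t-\sigma a)^{\gamma-1}}\Big)e^{\frac{t-\sigma a}{2}},$$

and

$$M_n = \sup_{\substack{0\le a<\infty\\ \sigma a<t}}\Big(\|\partial_x h^n\|_a + \frac{\|\partial_x^2 h^n\|_a}{1 + (t-\sigma a)^{\gamma-1}}\Big)e^{\frac{t-\sigma a}{2}}.$$

Take $M_1 = R_1 \le \frac{5C\varepsilon^2}{\delta\gamma} \le \frac{\varepsilon_0}{2}$ and suppose that $M_j, R_j \le \frac{\varepsilon_0}{2}$ for any $j = 2,\dots,n$; then

$$\|\partial_x r^{n+1}\|_a \le \int_0^t\|\partial_x T(g + h^n) - \partial_x T(g + h^{n-1})\|_{a+s-t}(s)\,ds = \int_0^{t-a} + \int_{t-a}^t = I_7 + I_8.$$

Using (38) we have

$$I_7 \le Ce^{a-t}\int_0^{t-a}e^s\|\partial_x r^n\|_0(s)\big(\|\partial_x^2 g + \partial_x^2 h^n\|_0(s) + \|\partial_x^2 g + \partial_x^2 h^{n-1}\|_0(s)\big)\,ds$$
$$\quad + Ce^{a-t}\int_0^{t-a}e^s\|\partial_x^2 r^n\|_0(s)\big(\|\partial_x g + \partial_x h^n\|_0(s) + \|\partial_x g + \partial_x h^{n-1}\|_0(s)\big)\,ds$$
$$\le 2C\varepsilon_0R_ne^{a-t}\int_0^{t-a}(1 + s^{\gamma-1})\,ds \le \frac{2C\varepsilon_0R_n}{\gamma}\,e^{\frac{\sigma a-t}{2}},$$

and

$$I_8 \le C\int_{t-a}^t\|\partial_x r^n\|_{a+s-t}(s)\big(\|\partial_x^2 g + \partial_x^2 h^n\|_{a+s-t}(s) + \|\partial_x^2 g + \partial_x^2 h^{n-1}\|_{a+s-t}(s)\big)\,ds$$
$$\quad + C\int_{t-a}^t\|\partial_x^2 r^n\|_{a+s-t}(s)\big(\|\partial_x g + \partial_x h^n\|_{a+s-t}(s) + \|\partial_x g + \partial_x h^{n-1}\|_{a+s-t}(s)\big)\,ds$$
$$\le 2C\varepsilon_0R_n\int_{t-a}^te^{\delta s - \sigma(t-a)}\big(1 + (\sigma(t-a) - \delta s)^{\gamma-1}\big)\,ds \le \frac{2C\varepsilon_0R_n}{\delta}\int_{t-\sigma a}^{t-a}e^{-x}(1 + x^{\gamma-1})\,dx \le \frac{6C\varepsilon_0R_n}{\gamma\delta}\,e^{\sigma a-t}.$$

We obtain for $b = a + s - t + \frac{\sigma(t-a) - \delta s}{2\sigma}$,

$$\|\partial_x^2 r^{n+1}\|_a(t) \le \int_0^t\|\partial_x^2(T(g + h^n) - T(g + h^{n-1}))\|_{a+s-t}(s)\,ds \le \int_0^t\frac{\|\partial_x(T(g + h^n) - T(g + h^{n-1}))\|_b(s)}{b - (a+s-t)}\,ds$$
$$\le 2\sigma\int_0^t\frac{\|\partial_x(T(g + h^n) - T(g + h^{n-1}))\|_b}{\sigma(t-a) - \delta s}\,ds \le \int_0^{\frac{\sigma}{\sigma+1}(t-a)} + \int_{\frac{\sigma}{\sigma+1}(t-a)}^t = I_9 + I_{10}.$$

We have $\sigma(t-a) - \delta s \ge \frac{2\sigma}{\sigma+1}(t-a)$ for $0 \le s \le \frac{\sigma}{\sigma+1}(t-a)$, and therefore we obtain

$$I_9 \le \frac{(\sigma+1)C}{t-a}\int_0^{\frac{\sigma}{\sigma+1}(t-a)}e^b\|\partial_x r^n\|_0(s)\big(\|\partial_x^2 g + \partial_x^2 h^n\|_0(s) + \|\partial_x^2 g + \partial_x^2 h^{n-1}\|_0(s)\big)\,ds$$
$$\quad + \frac{(\sigma+1)C}{t-a}\int_0^{\frac{\sigma}{\sigma+1}(t-a)}e^b\|\partial_x^2 r^n\|_0(s)\big(\|\partial_x g + \partial_x h^n\|_0(s) + \|\partial_x g + \partial_x h^{n-1}\|_0(s)\big)\,ds$$
$$\le \frac{2(\sigma+1)C\varepsilon_0}{t-a}\,R_n\int_0^{\frac{\sigma}{\sigma+1}(t-a)}e^{b-s}(1 + s^{\gamma-1})\,ds \le \frac{4\sigma C\varepsilon_0}{\gamma}\,R_n\,e^{\frac{a-t}{2}}(1 + (t-a)^{\gamma-1}).$$

Using (38) and the induction hypothesis we get

$$I_{10} \le 4\sigma C\varepsilon_0R_n\int_{\frac{\sigma}{\sigma+1}(t-a)}^te^{\sigma b - s}\,\frac{1 + (s - \sigma b)^{\gamma-1}}{\sigma(t-a) - \delta s}\,ds \le 4\sigma C\varepsilon_0R_n\int_{\frac{\sigma}{\sigma+1}(t-a)}^te^{\frac{\delta s - \sigma(t-a)}{2}}\,\frac{1 + \big(\frac{\sigma(t-a) - \delta s}{2}\big)^{\gamma-1}}{\sigma(t-a) - \delta s}\,ds$$
$$\le \frac{4\sigma C\varepsilon_0}{\delta}\,R_n\int_{\frac{t-\sigma a}{2}}^{\frac{\sigma}{\sigma+1}(t-a)}e^{-x}(x^{-1} + x^{\gamma-2})\,dx \le \frac{8\sigma C\varepsilon_0}{\delta(1-\gamma)}\,R_n\,e^{\frac{\sigma a-t}{2}}(1 + (t-\sigma a)^{\gamma-1}).$$

Due to the estimates for $I_7$, $I_8$, $I_9$ and $I_{10}$ we obtain

$$R_{n+1} \le \frac{C\sigma\varepsilon_0}{\delta\gamma(1-\gamma)}\,R_n. \tag{45}$$

Choosing $\varepsilon_0$ small enough we get

$$R_{n+1} \le \frac{1}{2}R_n \le \dots \le \frac{1}{2^n}R_1 \le \frac{\varepsilon_0}{2^{n+1}},$$

and

$$M_{n+1} \le \sum_{j=1}^{n+1}R_j \le \varepsilon_0.$$

Therefore, we obtain the function $h = \lim_{n\to\infty}h^n$ satisfying

$$\|\partial_x h\|_a(t) \le \sum_nR_n\,e^{\frac{\sigma a-t}{2}} \le \varepsilon_0\,e^{\frac{\sigma a-t}{2}}.$$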
Taking $f(x,t) = g(x,t) + h(x,t)$, we get (32) for $\rho_2 - \rho_1 = 2$. In order to show the uniqueness, we write Eq. (13) for $\rho_2 - \rho_1 = 2$ in the following form:

$$f_t = -\Lambda f + T(f), \qquad f(x,0) = f_0(x).$$

Suppose that there exist two solutions $f_1$ and $f_2$ with $f_1(x,0) = f_2(x,0)$. Define $R$ by

$$R = \sup_{\substack{0\le a<\infty\\ \sigma a<t}}\Big(\|\partial_x f_1 - \partial_x f_2\|_a + \frac{\|\partial_x^2 f_1 - \partial_x^2 f_2\|_a}{1 + (t-\sigma a)^{\gamma-1}}\Big)e^{\frac{t-\sigma a}{2}}.$$

It follows that $R \le \frac{C(\varepsilon)\sigma}{\delta\gamma(1-\gamma)}R$, and for $\varepsilon$ small enough it yields $\frac{C(\varepsilon)\sigma}{\delta\gamma(1-\gamma)} < 1$ and therefore $f_1 = f_2$.
6. Ill-Posedness for the Unstable Case

Here we show ill-posedness for the unstable case $\rho_1 > \rho_2$. We use the global solution for the 2-D stable case $f(x_1,t)$ satisfying (32) with $\|\Lambda^{1+\gamma}f_0\|_0 < C$ and $\|\Lambda^{1+\gamma+\zeta}f_0\|_0 = \infty$ for $\gamma, \zeta > 0$. Making a change of variables, we define $f_\lambda(x_1,t) = \lambda^{-1}f(\lambda x_1, -\lambda t + \lambda^{1/2})$, obtaining a family $\{f_\lambda\}_{\lambda>0}$ of solutions to the unstable case. Using (32), it follows that

$$\|f_\lambda\|_{H^s}(0) = |\lambda|^{s-\frac{3}{2}}\|f\|_{H^s}(\lambda^{1/2}) \le C|\lambda|^{s-\frac{3}{2}}\|f\|_1(\lambda^{1/2}) \le C|\lambda|^{s-\frac{3}{2}}e^{-\frac{|\rho_2-\rho_1|}{4}\lambda^{1/2}},$$

and

$$\|f_\lambda\|_{H^s}(\lambda^{-1/2}) = |\lambda|^{s-\frac{3}{2}}\|f\|_{H^s}(0) \ge |\lambda|^{s-\frac{3}{2}}\,C\sum_k|k|^{1+\gamma+\zeta}|\hat f_0(k)| = \infty,$$

for $s > 3/2$ and $\gamma, \zeta$ small enough. We obtain an ill-posed problem for $s > 3/2$.

Theorem 6.1. Let $s > 3/2$. Then for any $\varepsilon > 0$ there exists a solution $f$ of (16) with $\rho_1 > \rho_2$ and $0 < \delta < \varepsilon$ such that $\|f\|_{H^s}(0) \le \varepsilon$ and $\|f\|_{H^s}(\delta) = \infty$.

Remark 6.2. If one considers a solution of the 3-D problem satisfying $f(x_1,x_2,t) = f(x_1,t)$, from Eq. (13) one obtains a solution of (16). This shows that solutions of the 2-D case are solutions of the 3-D problem and therefore, using the above theorem, one obtains ill-posedness for the 3-D case with $\rho_1 > \rho_2$.

References

1. Ambrose, D.: Well-posedness of Two-phase Hele-Shaw Flow without Surface Tension. Euro. J. Appl. Math. 15, 597–607 (2004)
2. Ambrose, D., Masmoudi, N.: The zero surface tension limit of two-dimensional water waves. Commun. Pure Appl. Math. 58, 1287–1315 (2005)
3. Bear, J.: Dynamics of Fluids in Porous Media. New York: American Elsevier, 1972
4. Bertozzi, A.L., Constantin, P.: Global regularity for vortex patches. Commun. Math. Phys. 152(1), 19–28 (1993)
5. Bertozzi, A.L., Majda, A.J.: Vorticity and the Mathematical Theory of Incompressible Fluid Flow. Cambridge: Cambridge Press, 2002
6. Birkhoff, G.: Helmholtz and Taylor instability. In: Hydrodynamics Instability, Proc. Symp. Appl. Math. XII, Providence, RI: Amer. Math. Soc., 1962, pp. 55–76
7. Caflisch, R., Orellana, O.: Singular solutions and ill-posedness for the evolution of vortex sheets. SIAM J. Math. Anal. 20(2), 293–307 (1989)
8. Chemin, J.Y.: Persistence of geometric structures in two-dimensional incompressible fluids. Ann. Sci. Ecole. Norm. Sup. 26(4), 517–542 (1993)
9. Constantin, P., Dupont, T.F., Goldstein, R.E., Kadanoff, L.P., Shelley, M.J., Zhou, S.M.: Droplet breakup in a model of the Hele-Shaw cell. Physical Review E 47, 4169–4181 (1993)
10. Constantin, P., Majda, A.J., Tabak, E.: Formation of strong fronts in the 2-D quasigeostrophic thermal active scalar. Nonlinearity 7, 1495–1533 (1994)
11. Escher, J., Simonett, G.: Classical solutions for Hele-Shaw models with surface tension. Adv. Differ. Eqs. 2, 619–642 (1997)
12. Hele-Shaw, H.S.: Nature 58, 34 (1898)
13. Hou, T.Y., Lowengrub, J.S., Shelley, M.J.: Removing the Stiffness from Interfacial Flows with Surface Tension. J. Comput. Phys. 114, 312–338 (1994)
14. Muskat, M.: The flow of homogeneous fluids through porous media. New York: Springer, 1982
15. Nirenberg, L.: An abstract form of the nonlinear Cauchy-Kowalewski theorem. J. Differ. Geom. 6, 561–576 (1972)
16. Nishida, T.: A note on a theorem of Nirenberg. J. Differ. Geom. 12, 629–633 (1977)
17. Saffman, P.G., Taylor, G.: The penetration of a fluid into a porous medium or Hele-Shaw cell containing a more viscous liquid. Proc. R. Soc. London, Ser. A 245, 312–329 (1958)
18. Siegel, M., Caflisch, R., Howison, S.: Global Existence, Singular Solutions, and Ill-Posedness for the Muskat Problem. Comm. Pure and Appl. Math. 57, 1374–1411 (2004)
19. Rodrigo, J.L.: On the Evolution of Sharp Fronts for the Quasi-Geostrophic Equation. Comm. Pure and Appl. Math. 58, 0821–0866 (2005)
20. Stein, E., Weiss, G.: Introduction to Fourier Analysis on Euclidean spaces. Princeton, NJ: Princeton University Press, 1971
21. Stein, E.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993
22. Taylor, G.: The instability of liquid surfaces when accelerated in a direction perpendicular to their planes. I. Proc. Roy. Soc. London. Ser. A. 201, 192–196 (1950)
23. Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 2-D. Invent. Math. 130, 39–72 (1997)
24. Wu, S.: Well-posedness in Sobolev spaces of the full water wave problem in 3-D. J. Amer. Math. Soc. 12, 445–495 (1999)

Communicated by P. Constantin
Commun. Math. Phys. 273, 473–498 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0189-3
One-and-a-Half Quantum de Finetti Theorems

Matthias Christandl, Robert König, Graeme Mitchison, Renato Renner

Centre for Quantum Computation, DAMTP, University of Cambridge, Cambridge CB3 0WA, UK.

Received: 19 July 2006 / Accepted: 20 September 2006
Published online: 13 March 2007 – © Springer-Verlag 2007
Abstract: When $n - k$ systems of an $n$-partite permutation-invariant state are traced out, the resulting state can be approximated by a convex combination of tensor product states. This is the quantum de Finetti theorem. In this paper, we show that an upper bound on the trace distance of this approximation is given by $2\frac{kd^2}{n}$, where $d$ is the dimension of the individual system, thereby improving previously known bounds. Our result follows from a more general approximation theorem for representations of the unitary group. Consider a pure state that lies in the irreducible representation $U_{\mu+\nu} \subset U_\mu \otimes U_\nu$ of the unitary group $U(d)$, for highest weights $\mu$, $\nu$ and $\mu + \nu$. Let $\xi_\mu$ be the state obtained by tracing out $U_\nu$. Then $\xi_\mu$ is close to a convex combination of the coherent states $U_\mu(g)|v_\mu\rangle$, where $g \in U(d)$ and $|v_\mu\rangle$ is the highest weight vector in $U_\mu$. For the class of symmetric Werner states, which are invariant under both the permutation and unitary groups, we give a second de Finetti-style theorem (our "half" theorem). It arises from a combinatorial formula for the distance of certain special symmetric Werner states to states of fixed spectrum, making a connection to the recently defined shifted Schur functions [1]. This formula also provides us with useful examples that allow us to conclude that finite quantum de Finetti theorems (unlike their classical counterparts) must depend on the dimension $d$. The last part of this paper analyses the structure of the set of symmetric Werner states and shows that the product states in this set do not form a polytope in general.
I. Introduction

There is a famous theorem about classical probability distributions, the de Finetti theorem [2], whose quantum analogue has stirred up some interest recently. The original theorem states that a symmetric probability distribution of $k$ random variables, $P^{(k)}_{X_1\cdots X_k}$, that is infinitely exchangeable, i.e. can be extended to an $n$-partite symmetric distribution for all $n > k$, can be written as a convex combination of identical product distributions, i.e. for all $x_1,\dots,x_k$,

$$P^{(k)}_{X_1\cdots X_k}(x_1,\dots,x_k) = \int P_X(x_1)\cdots P_X(x_k)\,d\mu(P_X), \tag{1}$$

where $\mu$ is a measure on the set of probability distributions, $P_X$, of one variable. In the quantum analogue [3–8] a state $\rho^k$ on $\mathcal{H}^{\otimes k}$ is said to be infinitely exchangeable if it is symmetric (or permutation-invariant), i.e. $\pi\rho^k\pi^\dagger = \rho^k$ for all $\pi \in S_k$, and, for all $n > k$, there is a symmetric state $\rho^n$ on $\mathcal{H}^{\otimes n}$ with $\rho^k = \mathrm{tr}_{n-k}\,\rho^n$. The theorem then states that

$$\rho^k = \int\sigma^{\otimes k}\,dm(\sigma) \tag{2}$$

for a measure $m$ on the set of states on $\mathcal{H}$. However, the versions of this theorem that have the greatest promise for applications relax the strong assumption of infinite exchangeability [9, 10]. For instance, one can assume that $\rho^k$ is $n$-exchangeable for some specific $n > k$, viz. that $\rho^k = \mathrm{tr}_{n-k}\,\rho^n$ for some symmetric state $\rho^n$. In that case, the exact statement in Eq. (2) is replaced by an approximation

$$\rho^k \approx \int\sigma^{\otimes k}\,dm(\sigma), \tag{3}$$

as proved in [9], where it was shown that the error is bounded by an expression proportional to $\frac{kd^6}{\sqrt{n-k}}$.

Our paper is structured as follows. In Sect. II we derive an approximation theorem for states in spaces of irreducible representations of the unitary group. Our main application of this theorem is an improvement of the error bound in the approximation in (3) to $2\frac{kd}{n}$ for Bose-symmetric states and to $2\frac{kd^2}{n}$ for arbitrary permutation-invariant states. The last step from Bose-symmetry to permutation-invariance is achieved by embedding permutation-invariant states into the symmetric subspace, a technique which might be of independent interest. We conclude this section with a discussion of the optimality of our bounds and explain how our results can be generalised to permutation-symmetry with respect to an additional system. In Sect. III, we prove the "half" theorem of our title. This refers to a de Finetti theorem for a particular class of states, the symmetric Werner states [11], which are invariant under the action on the tensor product space of both the unitary and symmetric groups.
In order to prove our result we derive an exact combinatorial expression for the distance of extremal n-exchangeable Werner states to product states of fixed spectrum. This has some mathematical interest because of the connection it makes with shifted Schur functions [1]. It also provides us with a rich supply of examples that can be used to test the tightness of the bounds of the error in Eq. (3) and, in Sect. IV, to explore the structure of the set of convex combinations of tensor product states. II. On Coherent States and the de Finetti theorem A. Approximation by coherent states. In order to state our result we need to introduce some notation from Lie group theory [12]. Let U(d) be the unitary group and fix a basis d−1 {|i}i=0 of Cd in order to distinguish the diagonal matrices with respect to this basis as the Cartan subgroup H(d) of U(d). A weight vector with weight λ = (λ1 , . . . , λd ),
One-and-a-Half Quantum de Finetti Theorems
475
where each $\lambda_i$ is an integer, is a vector $|v\rangle$ in the representation $U$ of $U(d)$ satisfying $U(h)|v\rangle = \prod_i h_i^{\lambda_i}|v\rangle$, where $h_1, \ldots, h_d$ are the diagonal entries of $h \in H(d)$. We can equip the set of weights with an ordering: $\lambda$ is said to be (lexicographically) higher than $\lambda'$ if $\lambda_i > \lambda'_i$ for the smallest $i$ with $\lambda_i \neq \lambda'_i$. It is a fundamental fact of representation theory that every irreducible representation of $U(d)$ has a unique highest weight vector (up to scaling); the corresponding weights must be dominant, i.e. $\lambda_i \geq \lambda_{i+1}$. Two irreducible representations are equivalent if and only if they have identical highest weights. It is therefore convenient to label irreducible representations by their highest weights and write $U_\lambda$ for the irreducible representation of $U(d)$ with highest weight $\lambda$. It will also be convenient to choose the normalisation of the highest weight vector $|v_\lambda\rangle$ to be $\langle v_\lambda|v_\lambda\rangle = 1$ in order to be able to view $|v_\lambda\rangle\langle v_\lambda|$ as a quantum state.

Given two irreducible representations $U_\mu$ and $U_\nu$ with corresponding spaces $\mathcal{U}_\mu$ and $\mathcal{U}_\nu$ we can define the tensor product representation $U_\mu \otimes U_\nu$ acting on $\mathcal{U}_\mu \otimes \mathcal{U}_\nu$ by $(U_\mu \otimes U_\nu)(g) = U_\mu(g) \otimes U_\nu(g)$, for any $g \in U(d)$. In general this representation is reducible and decomposes as

$$U_\mu \otimes U_\nu \cong \bigoplus_\lambda c^\lambda_{\mu\nu}\, U_\lambda .$$

The multiplicities $c^\lambda_{\mu\nu}$ are known as Littlewood-Richardson coefficients. It follows from the definition of the tensor product that $|v_\mu\rangle \otimes |v_\nu\rangle$ is a vector of weight $\mu + \nu$, where $(\mu + \nu)_i = \mu_i + \nu_i$. By the ordering of the weights, $\mu + \nu$ is the highest weight in $U_\mu \otimes U_\nu$ and $|v_\mu\rangle \otimes |v_\nu\rangle$ is the only vector with this weight. We therefore identify $|v_{\mu+\nu}\rangle$ with $|v_\mu\rangle \otimes |v_\nu\rangle$ and remark that $U_{\mu+\nu}$ appears exactly once in $U_\mu \otimes U_\nu$.

Our first result is an approximation theorem for states in the spaces of irreducible representations of $U(d)$. Consider a normalised vector $|\Psi\rangle$ in the space $\mathcal{U}_{\mu+\nu}$ of the irreducible representation $U_{\mu+\nu}$. By the above discussion we can embed $\mathcal{U}_{\mu+\nu}$ uniquely into the tensor product representation $\mathcal{U}_\mu \otimes \mathcal{U}_\nu$. This allows us to define the reduced state of $|\Psi\rangle$ on $\mathcal{U}_\mu$ by $\xi_\mu = \mathrm{tr}_\nu |\Psi\rangle\langle\Psi|$. We shall prove that the reduced state on $\mathcal{U}_\mu$ is approximated by convex combinations of rotated highest weight states:
Definition II.1. For $g \in U(d)$, let $|v_\mu^g\rangle := U_\mu(g)|v_\mu\rangle$ be the rotated highest weight vector in $\mathcal{U}_\mu$. Let $P_\mu(\mathbb{C}^d)$ be the set of states of the form $\int |v_\mu^g\rangle\langle v_\mu^g|\, dm(g)$, where $m$ is a probability measure on $U(d)$.
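Here, the states $|v_\mu^g\rangle$, with $g \in U(d)$, are coherent states in the sense of [13]. For $d = 2$ and $\mu = (k, 0) \equiv (k)$, these states are the well-known SU(2)-coherent states. In the following theorem, we use the trace distance, which is induced by the trace norm $\|A\| := \frac{1}{2}\mathrm{tr}|A|$ on the set of hermitian operators.

Theorem II.2 (Approximation by coherent states). Let $|\Psi\rangle$ be in $\mathcal{U}_{\mu+\nu}$, which we consider to be embedded into $\mathcal{U}_\mu \otimes \mathcal{U}_\nu$ as described above. Then $\xi_\mu = \mathrm{tr}_\nu |\Psi\rangle\langle\Psi|$ is $\varepsilon$-close to $P_\mu(\mathbb{C}^d)$, where $\varepsilon := 2\left(1 - \frac{\dim \mathcal{U}_\nu^d}{\dim \mathcal{U}_{\mu+\nu}^d}\right)$. That is, there exists a probability measure $m$ on $U(d)$ such that

$$\left\| \xi_\mu - \int |v_\mu^g\rangle\langle v_\mu^g|\, dm(g) \right\| \leq \varepsilon .$$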
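The error term of Theorem II.2 depends only on two dimensions, so it can be evaluated directly. The following is a minimal sketch, assuming the standard Weyl dimension formula $\dim \mathcal{U}_\lambda = \prod_{i<j} \frac{\lambda_i - \lambda_j + j - i}{j - i}$ for dominant integer weights; the function names `weyl_dim` and `epsilon` are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import comb


def weyl_dim(weight):
    """Dimension of the U(d) irrep with dominant highest weight `weight`.

    Uses the Weyl dimension formula; `weight` must be a weakly
    decreasing tuple of integers of length d.
    """
    d = len(weight)
    dim = Fraction(1)
    for i in range(d):
        for j in range(i + 1, d):
            dim *= Fraction(weight[i] - weight[j] + j - i, j - i)
    return int(dim)  # the product is an integer for dominant weights


def epsilon(mu, nu):
    """Error term 2 * (1 - dim U_nu / dim U_{mu+nu}) of Theorem II.2."""
    total = tuple(a + b for a, b in zip(mu, nu))
    return 2 * (1 - Fraction(weyl_dim(nu), weyl_dim(total)))


if __name__ == "__main__":
    # Symmetric case mu = (k), nu = (n - k) with d = 2, k = 1, n = 3.
    print(epsilon((1, 0), (2, 0)))
```

For symmetric weights $(l, 0, \ldots, 0)$ the formula reduces to the binomial coefficient $\binom{l+d-1}{l}$, which is the case used in Corollary II.3.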
M. Christandl, R. König, G. Mitchison, R. Renner
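Proof. By the definition of $|v_\tau^g\rangle$ and Schur’s lemma, the operators $E_\tau^g := \dim \mathcal{U}_\tau\, |v_\tau^g\rangle\langle v_\tau^g|$, $g \in U(d)$, together with the normalised uniform (Haar) measure $dg$ on $U(d)$ form a POVM on $\mathcal{U}_\tau$, i.e.,

$$\int E_\tau^g\, dg = \mathbb{1}_{\mathcal{U}_\tau} . \tag{4}$$

This allows us to write

$$\xi_\mu = \int w_g\, \xi_\mu^g\, dg , \tag{5}$$

where $\xi_\mu^g$ is the residual state on $\mathcal{U}_\mu$ obtained when applying $\{E_\nu^g\}$ to $|\Psi\rangle$, i.e., $w_g \xi_\mu^g = \mathrm{tr}_\nu\big((\mathbb{1}_{\mathcal{U}_\mu} \otimes E_\nu^g)|\Psi\rangle\langle\Psi|\big)$, where $w_g\, dg$ determines the probability of outcomes.

We claim that $\xi_\mu$ is close to a convex combination of the states $|v_\mu^g\rangle$, with coefficients corresponding to the outcome probabilities when measuring $|\Psi\rangle$ with $\{E_{\mu+\nu}^g\}$. That is, we show that the probability measure $m$ on $U(d)$ in the statement of the theorem can be defined as $dm(g) := \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|)\, dg$. Our goal is thus to estimate

$$\xi_\mu - \int \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|)\, |v_\mu^g\rangle\langle v_\mu^g|\, dg = S - \delta ,$$

where, using (5),

$$S := \int w_g\, \xi_\mu^g\, dg - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}} \int \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|)\, |v_\mu^g\rangle\langle v_\mu^g|\, dg ,$$

$$\delta := \left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right) \int \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|)\, |v_\mu^g\rangle\langle v_\mu^g|\, dg .$$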
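Because $\|\delta\| = \frac{1}{2}\left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right)$, it suffices to show that $\|S\| \leq \frac{3}{2}\left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right)$. Since $\mathcal{U}_{\mu+\nu} \subset \mathcal{U}_\mu \otimes \mathcal{U}_\nu$ and $|v_\mu\rangle \otimes |v_\nu\rangle = |v_{\mu+\nu}\rangle$, we have

$$\frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\, \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|) = \langle v_\mu^g|\, \mathrm{tr}_\nu\big((\mathbb{1}_\mu \otimes E_\nu^g)|\Psi\rangle\langle\Psi|\big)\, |v_\mu^g\rangle = w_g \langle v_\mu^g| \xi_\mu^g |v_\mu^g\rangle .$$

So

$$S = \int w_g \big(\xi_\mu^g - |v_\mu^g\rangle\langle v_\mu^g|\, \xi_\mu^g\, |v_\mu^g\rangle\langle v_\mu^g|\big)\, dg . \tag{6}$$

Now, for all operators $A$, $B$, we have

$$A - BAB = (A - BA) + (A - AB) - (\mathbb{1} - B)A(\mathbb{1} - B) ,$$

so putting $A = \xi_\mu^g$ and $B = |v_\mu^g\rangle\langle v_\mu^g|$ in (6), we have $S = \alpha + \beta - \gamma$,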
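where

$$\alpha := \int w_g \big(\xi_\mu^g - |v_\mu^g\rangle\langle v_\mu^g|\, \xi_\mu^g\big)\, dg ,$$

$$\beta := \int w_g \big(\xi_\mu^g - \xi_\mu^g\, |v_\mu^g\rangle\langle v_\mu^g|\big)\, dg ,$$

$$\gamma := \int w_g \big(\mathbb{1}_{\mathcal{U}_\mu} - |v_\mu^g\rangle\langle v_\mu^g|\big)\, \xi_\mu^g\, \big(\mathbb{1}_{\mathcal{U}_\mu} - |v_\mu^g\rangle\langle v_\mu^g|\big)\, dg .$$

Combining

$$w_g\, |v_\mu^g\rangle\langle v_\mu^g|\, \xi_\mu^g = \mathrm{tr}_\nu\big((|v_\mu^g\rangle\langle v_\mu^g| \otimes E_\nu^g)|\Psi\rangle\langle\Psi|\big) = \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\, \mathrm{tr}_\nu\big(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|\big)$$

with (4) and (5), we get

$$\alpha = \left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right) \mathrm{tr}_\nu |\Psi\rangle\langle\Psi| .$$

Similarly,

$$\beta = \left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right) \mathrm{tr}_\nu |\Psi\rangle\langle\Psi| ,$$

and hence

$$\|\alpha\| = \|\beta\| = \frac{1}{2}\left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right) .$$

Note that for a projector $P$ and a state $\xi$ on $\mathcal{H}$, we have

$$\frac{1}{2}\mathrm{tr}(P\xi) = \|P \xi P\| ,$$

as a consequence of the cyclicity of the trace and the fact that the operator $P\xi P = (\sqrt{\xi}P)^\dagger(\sqrt{\xi}P)$ is nonnegative. This identity together with the convexity of the trace distance applied to the projectors $\mathbb{1}_{\mathcal{U}_\mu} - |v_\mu^g\rangle\langle v_\mu^g|$ gives

$$\|\gamma\| \leq \int w_g \big\| (\mathbb{1}_{\mathcal{U}_\mu} - |v_\mu^g\rangle\langle v_\mu^g|)\, \xi_\mu^g\, (\mathbb{1}_{\mathcal{U}_\mu} - |v_\mu^g\rangle\langle v_\mu^g|) \big\|\, dg = \frac{1}{2}\mathrm{tr} \int w_g \big(\xi_\mu^g - |v_\mu^g\rangle\langle v_\mu^g|\, \xi_\mu^g\big)\, dg = \|\alpha\| .$$

This concludes the proof because

$$\left\| \xi_\mu - \int \mathrm{tr}(E_{\mu+\nu}^g |\Psi\rangle\langle\Psi|)\, |v_\mu^g\rangle\langle v_\mu^g|\, dg \right\| \leq \|S\| + \|\delta\| \leq \|\alpha\| + \|\beta\| + \|\gamma\| + \|\delta\| ,$$

and each of the quantities in the sum on the r.h.s. is upper bounded by $\frac{1}{2}\left(1 - \frac{\dim \mathcal{U}_\nu}{\dim \mathcal{U}_{\mu+\nu}}\right)$.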
An important special case of Theorem II.2 is the case where $\mu = (k) \equiv (k, \underbrace{0, \ldots, 0}_{d-1})$ and $\mu + \nu = (n)$. In this case, $\mathcal{U}_{(k)} \cong \mathrm{Sym}^k(\mathbb{C}^d)$ (and likewise for $\mathcal{U}_{(n)}$) is the symmetric subspace of $(\mathbb{C}^d)^{\otimes k}$. Its importance stems from the fact that any $n$-exchangeable density operator has a symmetric purification, and this leads to a new de Finetti theorem for general mixed symmetric states (cf. Sect. IIB).

Corollary II.3. Let $|\Psi\rangle \in \mathrm{Sym}^n(\mathbb{C}^d)$ be a symmetric state and let $\xi^k := \mathrm{tr}_{n-k}|\Psi\rangle\langle\Psi|$, $k \leq n$, be the state obtained by tracing out $n - k$ systems. Then $\xi^k$ is $\varepsilon$-close to $P_{(k)}(\mathbb{C}^d)$, where $\varepsilon := 2\frac{dk}{n}$. Equivalently, there exists a probability measure $m$ on pure states on $\mathbb{C}^d$ such that

$$\left\| \xi^k - \int |\varphi\rangle\langle\varphi|^{\otimes k}\, dm(\varphi) \right\| \leq \varepsilon .$$

Proof. Put $\mu = (k)$, $\nu = (n - k)$ in Theorem II.2. Then $|\Psi\rangle \in \mathcal{U}_{(n)} = \mathrm{Sym}^n(\mathbb{C}^d)$ is a symmetric state, the highest weight vector of $U_\mu$ is just the product $|0\rangle^{\otimes k}$, and tracing out $\mathcal{U}_\nu$ corresponds to tracing out $(\mathbb{C}^d)^{\otimes n-k}$. Since $U_\mu(g)|v_\mu\rangle = (g|0\rangle)^{\otimes k}$, an arbitrary state $|\varphi\rangle \in \mathbb{C}^d$ can be written as $g|0\rangle$ for some $g \in U(d)$. For the symmetric representation $U_{(l)}$, $\dim \mathcal{U}_{(l)} = \binom{l+d-1}{l}$, so the error in the theorem is $\varepsilon := 2\left(1 - \binom{n-k+d-1}{n-k} \big/ \binom{n+d-1}{n}\right)$, and

$$\frac{\binom{n-k+d-1}{n-k}}{\binom{n+d-1}{n}} = \frac{(n-k+d-1)!}{(n-k)!\,(d-1)!} \cdot \frac{n!\,(d-1)!}{(n+d-1)!} = \frac{n-k+1}{n+1} \cdots \frac{n-k+d-1}{n+d-1} \geq \left(\frac{n-k+1}{n+1}\right)^{d-1} = \left(1 - \frac{k}{n+1}\right)^{d-1} \geq 1 - \frac{(d-1)k}{n+1} \geq 1 - \frac{dk}{n} .$$

The first inequality here follows from $\frac{n+i}{n+k+i} \leq \frac{n+j}{n+k+j}$, which holds for all $i \leq j$, and the second to last inequality is also known as the ‘union bound’ in probability theory.
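The chain of inequalities in the proof of Corollary II.3 can be checked numerically. The sketch below, using exact rational arithmetic, compares the exact error term from Theorem II.2 with the intermediate bounds and with the simplified bound $2dk/n$; the helper names (`sym_dim`, `exact_error`, `bound_chain`) are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from math import comb


def sym_dim(l, d):
    """Dimension of Sym^l(C^d), i.e. binom(l + d - 1, l)."""
    return comb(l + d - 1, l)


def exact_error(n, k, d):
    """Exact error term 2 * (1 - dim Sym^{n-k} / dim Sym^n)."""
    return 2 * (1 - Fraction(sym_dim(n - k, d), sym_dim(n, d)))


def bound_chain(n, k, d):
    """Successively weaker upper bounds on exact_error, as in the proof."""
    step1 = 2 * (1 - (1 - Fraction(k, n + 1)) ** (d - 1))
    step2 = 2 * Fraction((d - 1) * k, n + 1)
    step3 = 2 * Fraction(d * k, n)
    return step1, step2, step3


if __name__ == "__main__":
    n, k, d = 100, 5, 3
    print(float(exact_error(n, k, d)), [float(b) for b in bound_chain(n, k, d)])
```

The simplified bound is convenient to state but can be noticeably looser than the exact binomial ratio when $d$ is large.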
Example II.4. To get some feel for the more general case, where $U_{\mu+\nu}$ is not the symmetric representation, let $1 \leq p \leq d$ and consider $\mu = (j^p) \equiv (\underbrace{j, \ldots, j}_{p})$, $\nu = ((m - j)^p)$ and $\mu + \nu = (m^p)$. We can consider the representation $U_{\mu+\nu}$ given by the Weyl tensorial construction [14], with the tableau numbering running from 1 to $p$ down the first column, $p + 1$ to $2p$ down the second, and so on. Then the embedding $\mathcal{U}_{\mu+\nu} \subset \mathcal{U}_\mu \otimes \mathcal{U}_\nu$
corresponds to the factoring of tensors in $(\mathbb{C}^d)^{\otimes n} = (\mathbb{C}^d)^{\otimes k} \otimes (\mathbb{C}^d)^{\otimes n-k}$, where $k = jp$ and $n = mp$. The fact that the Young projector is obtained by symmetrising over rows and antisymmetrising over columns implies that

$$\mathcal{U}_{\mu+\nu} \subset \mathrm{Sym}^m\big(\wedge^p(\mathbb{C}^d)\big),$$

where $\wedge^p$ is the antisymmetric subspace on $p$ systems corresponding to a column in the diagram. States in $\mathcal{U}_{\mu+\nu}$ can thus be regarded as symmetric states of $m$ systems of dimension $q = \dim \wedge^p(\mathbb{C}^d)$, and one can apply Corollary II.3 to deduce that $\xi_\mu$ is close to $P_{(j)}(\mathbb{C}^q)$. However, Theorem II.2 makes the assertion that $\xi_\mu$ is close to $P_\mu(\mathbb{C}^d)$. This statement is stronger in certain cases. For instance, when $p = 2$, the highest weight vector $|v_\mu\rangle$ is $\big(\frac{1}{\sqrt{2}}|01 - 10\rangle\big)^{\otimes k/2}$ and Theorem II.2 says that $\xi_\mu$ is close to a convex combination of states $|\varphi\rangle\langle\varphi|^{\otimes k/2}$, where $|\varphi\rangle$ is of the form $(g \otimes g)\frac{1}{\sqrt{2}}|01 - 10\rangle$ with $g \in U(d)$. Note that the single-system reduced density operator of every such $|\varphi\rangle$ has rank 2. By contrast, Corollary II.3 allows the $|\varphi\rangle$’s to lie in $\wedge^2(\mathbb{C}^d)$, i.e. in the span of the basis elements $\frac{1}{\sqrt{2}}|i_1 i_2 - i_2 i_1\rangle$, for $1 \leq i_1 < i_2 \leq d$. This includes $|\varphi\rangle$’s whose reduced density operator has rank larger than 2, if $d > 3$.

B. Symmetry and purification. We now show how the symmetric-state version of our de Finetti theorem, Corollary II.3, can be generalised to prove a de Finetti theorem for arbitrary (not necessarily pure) $n$-exchangeable states $\rho^k$ on $\mathcal{H}^{\otimes k}$. We say a (mixed) state $\xi^n$ on $\mathcal{H}^{\otimes n}$ is permutation-invariant or symmetric if $\pi \xi^n \pi^\dagger = \xi^n$, for any permutation $\pi \in S_n$. Here, the symmetric group $S_n$ acts on $\mathcal{H}^{\otimes n}$ by permuting the $n$ subsystems, i.e. every permutation $\pi \in S_n$ gives a unitary $\pi$ on $\mathcal{H}^{\otimes n}$ defined by

$$\pi\, |e_{i_1}\rangle \otimes \cdots \otimes |e_{i_n}\rangle = |e_{i_{\pi^{-1}(1)}}\rangle \otimes \cdots \otimes |e_{i_{\pi^{-1}(n)}}\rangle \tag{7}$$

for an orthonormal basis $\{|e_i\rangle\}_{i=1}^d$ of $\mathcal{H}$. Note that, as a unitary operator, $\pi^\dagger$ corresponds to the action of $\pi^{-1} \in S_n$.
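The index convention in (7) is easy to get backwards, so here is a small sketch of the action on basis labels. It assumes 0-based positions and represents a permutation as a tuple `perm` with `perm[j]` equal to $\pi(j)$; the functions `act`, `compose` and `inverse` are hypothetical helpers introduced only for this illustration.

```python
from itertools import permutations, product


def inverse(perm):
    """Inverse permutation: inverse(perm)[perm[j]] == j."""
    inv = [0] * len(perm)
    for j, pj in enumerate(perm):
        inv[pj] = j
    return tuple(inv)


def compose(p, q):
    """Composition p after q, i.e. (p o q)(j) = p(q(j))."""
    return tuple(p[q[j]] for j in range(len(q)))


def act(perm, word):
    """Labels of pi |e_{i_1}> ... |e_{i_n}> according to Eq. (7).

    The factor at output position `pos` is the input factor that sat at
    position pi^{-1}(pos); equivalently, factor j moves to slot pi(j).
    """
    inv = inverse(perm)
    return tuple(word[inv[pos]] for pos in range(len(word)))


if __name__ == "__main__":
    # The cycle 0 -> 1 -> 2 -> 0 moves the factor 'a' from slot 0 to slot 1.
    print(act((1, 2, 0), ("a", "b", "c")))
```

With this convention the map is a left action, so applying $\pi'$ and then $\pi$ agrees with applying the composition $\pi \circ \pi'$, and the inverse permutation undoes the action, matching the remark that $\pi^\dagger$ corresponds to $\pi^{-1}$.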
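Lemma II.5. Let $\xi$ be a permutation-invariant state on $\mathcal{H}^{\otimes n}$. Then there exists a purification of $\xi$ in $\mathrm{Sym}^n(\mathcal{K} \otimes \mathcal{H})$ with $\mathcal{K} \cong \mathcal{H}$.

Proof. Let $A$ be the set of eigenvalues of $\xi$ and let $\mathcal{H}_a$, for $a \in A$, be the eigenspace of $\xi$, so $\xi|\phi\rangle = a|\phi\rangle$, for any $|\phi\rangle \in \mathcal{H}_a$. Because $\xi$ is invariant under permutations, we have $\pi^\dagger \xi \pi |\phi\rangle = a|\phi\rangle$, for any $|\phi\rangle \in \mathcal{H}_a$ and $\pi \in S_n$. Applying the unitary operation $\pi$ to both sides of this equality gives $\xi \pi|\phi\rangle = a\pi|\phi\rangle$; so $\pi|\phi\rangle \in \mathcal{H}_a$. This proves that the eigenspaces $\mathcal{H}_a$ of $\xi$ are invariant under permutations. Since the eigenspaces of $\sqrt{\xi}$ are identical to those of $\xi$, $\sqrt{\xi}$ is invariant under permutations, too. We now show how this symmetry carries over to the vector

$$|\Psi_\xi\rangle := (\mathbb{1} \otimes \sqrt{\xi})|\Psi\rangle ,$$

where $|\Psi\rangle = \big(\sum_i |e_i\rangle \otimes |e_i\rangle\big)^{\otimes n} \in (\mathcal{K} \otimes \mathcal{H})^{\otimes n}$ for an orthonormal basis $\{|e_i\rangle\}_{i=1}^d$ of $\mathcal{K} \cong \mathcal{H}$. Observe that $|\Psi\rangle$ is invariant under permutations, i.e. $(\pi \otimes \pi)|\Psi\rangle = |\Psi\rangle$. Using this fact and the permutation invariance of $\sqrt{\xi}$ we find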
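$$(\pi \otimes \pi)(\mathbb{1} \otimes \sqrt{\xi})|\Psi\rangle = \big(\mathbb{1} \otimes \pi\sqrt{\xi}\pi^\dagger\big)(\pi \otimes \pi)|\Psi\rangle = (\mathbb{1} \otimes \sqrt{\xi})|\Psi\rangle ,$$

so $|\Psi_\xi\rangle$ is invariant under permutations, and hence an element of $\mathrm{Sym}^n(\mathcal{K} \otimes \mathcal{H})$. Computing the partial trace over $\mathcal{K}^{\otimes n}$ gives

$$\mathrm{tr}_{\mathcal{K}^{\otimes n}}\big((\mathbb{1} \otimes \sqrt{\xi})|\Psi\rangle\langle\Psi|(\mathbb{1} \otimes \sqrt{\xi})^\dagger\big) = \sqrt{\xi}\, \mathbb{1}\, \sqrt{\xi}^\dagger = \xi ,$$

which shows that $|\Psi_\xi\rangle$ is a symmetric purification of $\xi$.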
Definition II.6. Let $P^k = P^k(\mathcal{H})$ be the set of states of the form $\int \sigma^{\otimes k}\, dm(\sigma)$, where $m$ is a probability measure on the set of (mixed) states on $\mathcal{H}$.
Theorem II.7 (Approximation of symmetric states by product states). Let $\xi^n$ be a permutation-invariant density operator on $(\mathbb{C}^d)^{\otimes n}$ and $k \leq n$. Then $\xi^k := \mathrm{tr}_{n-k}(\xi^n)$ is $\varepsilon$-close to $P^k(\mathbb{C}^d)$ for $\varepsilon := 2\frac{d^2 k}{n}$.

Proof. By Lemma II.5, there is a purification $|\Psi\rangle \in \mathrm{Sym}^n(\mathbb{C}^d \otimes \mathbb{C}^d)$ of $\xi^n$, and the partial trace $\mathrm{tr}_{n-k}|\Psi\rangle\langle\Psi|$ is $\varepsilon$-close to $P_{(k)}(\mathbb{C}^{d^2})$ by Corollary II.3. The claim then is a consequence of the fact that the trace distance does not increase when systems are traced out.

We close this section by looking at a stronger notion of symmetry than permutation-invariance. This is Bose-symmetry, defined by the condition that $\pi \xi^n = \xi^n$ for every $\pi \in S_n$. Bose-exchangeability is then defined in the obvious way. In the course of their paper proving an infinite-exchangeability de Finetti theorem, Hudson and Moody [4] also showed that if $\xi^k$ is infinitely Bose-exchangeable, then $\xi^k$ is in $P_{(k)}(\mathbb{C}^d)$. We now show that this result holds (approximately) for Bose-$n$-exchangeable states.

Theorem II.8 (Approximation of Bose symmetric states by pure product states). Let $\xi^n$ be a Bose-symmetric state on $(\mathbb{C}^d)^{\otimes n}$, and let $\xi^k := \mathrm{tr}_{n-k}(\xi^n)$, $k \leq n$. Then $\xi^k$ is $\varepsilon$-close to $P_{(k)}(\mathbb{C}^d)$, for $\varepsilon := 2\frac{dk}{n}$.

Proof. We can decompose $\xi^n$ as

$$\xi^n = \sum_i a_i\, |\psi_i\rangle\langle\psi_i| ,$$

where $|\psi_i\rangle$ is a set of orthonormal eigenvectors of $\xi^n$ with strictly positive eigenvalues $a_i$. For all $\pi \in S_n$ we have

$$\pi|\psi_i\rangle = \frac{1}{a_i}\pi\xi^n|\psi_i\rangle = \frac{1}{a_i}\xi^n|\psi_i\rangle = |\psi_i\rangle ,$$

making use of the assumption $\pi\xi^n = \xi^n$. This shows that all $|\psi_i\rangle$ are elements of $\mathrm{Sym}^n(\mathbb{C}^d)$. By Corollary II.3, every $\xi^k_{\psi_i} = \mathrm{tr}_{n-k}|\psi_i\rangle\langle\psi_i|$ is $\varepsilon$-close to a state $\sigma^k_{\psi_i}$ that is in $P_{(k)}(\mathbb{C}^d)$. This leads to

$$\Big\| \sum_i a_i \xi^k_{\psi_i} - \sum_i a_i \sigma^k_{\psi_i} \Big\| \leq \sum_i a_i \big\| \xi^k_{\psi_i} - \sigma^k_{\psi_i} \big\| \leq \varepsilon ,$$

and concludes the proof.
C. Optimality. The error bound we obtain in Theorem II.7 is of size $\frac{d^2 k}{n}$, which is tighter than the $\frac{d^6 k}{\sqrt{n-k}}$ bound obtained in [9]. Is there scope for further improvement? For classical probability distributions, Diaconis and Freedman [15] showed that, for $n$-exchangeable distributions, the error, measured by the trace distance, is bounded by $\min\{\frac{dk}{n}, \frac{k(k-1)}{2n}\}$, where $d$ is the alphabet size. This implies that there is a bound, $\frac{k(k-1)}{2n}$, that is independent of $d$. The following example shows that there cannot be an analogous dimension-independent bound for a quantum de Finetti theorem.

Example II.9. Suppose $n = d$, and define a permutation-invariant state on $(\mathbb{C}^n)^{\otimes n}$ by

$$\xi^n = \frac{1}{n!} \sum_{\pi, \pi'} \mathrm{sign}(\pi)\,\mathrm{sign}(\pi')\, \pi|12\cdots n\rangle\langle 12\cdots n|\pi'^\dagger ,$$

where $\{|i\rangle\}_{i=1}^n$ is an orthonormal basis of $\mathbb{C}^n$. This is just the normalised projector onto $\wedge^n(\mathbb{C}^n)$. Tracing out $n - 2$ systems gives the projector onto $\wedge^2(\mathbb{C}^n)$, i.e. the state

$$\xi^2 = \frac{2}{n(n-1)} \sum_{1 \leq i < j \leq n} |ij - ji\rangle\langle ij - ji| , \tag{8}$$

which has trace distance at least 1/2 from $P^2(\mathbb{C}^n)$, as will be shown by Corollary III.9 and Example IV.3.

We must therefore expect our quantum de Finetti error bound to depend on $d$, as is indeed the case for the error term $\frac{kd^2}{n}$ in Theorem II.7. By generalising this example, we will show in Corollary III.9 that the error term must be at least $\frac{d}{2n}\big(1 - \frac{1}{d^2}\big)$.

This example shows that some aspects of the de Finetti theorem cannot be carried over from probability distributions to quantum states. The following argument shows that probability distributions can, however, be used to find lower bounds for the quantum case. Given an $n$-partite probability distribution $P_X = P_{X_1 \cdots X_n}$ on $\mathcal{X}^n$, define a state

$$|\Psi\rangle := \sum_{x \in \mathcal{X}^n} \sqrt{P_X(x)}\, |x_1\rangle \otimes \cdots \otimes |x_n\rangle \in \mathcal{H}^{\otimes n} ,$$
where $\{|x\rangle\}_{x \in \mathcal{X}}$ is an orthonormal basis of $\mathcal{H}$. Applying the von Neumann measurement $\mathcal{M}$ defined by this basis to every system of $\xi^k := \mathrm{tr}_{n-k}(|\Psi\rangle\langle\Psi|)$ gives a measurement outcome distributed according to $\mathcal{M}^{\otimes k}(\xi^k) = P_{X_1 \cdots X_k}$. If $m$ is a normalised measure on the set of states on $\mathcal{H}$, then measuring $\int \sigma^{\otimes k}\, dm(\sigma)$ gives a distribution of the form $\mathcal{M}^{\otimes k}\big(\int \sigma^{\otimes k}\, dm(\sigma)\big) = \int P_X^k\, d\mu(P_X)$. Because the trace distance of the distributions obtained by applying the same measurement is a lower bound on the distance between two states, this implies that

$$\inf_\mu \Big\| P_{X_1 \cdots X_k} - \int P_X^k\, d\mu(P_X) \Big\| \leq \Big\| \xi^k - \int \sigma^{\otimes k}\, dm(\sigma) \Big\| , \tag{9}$$

where the infimum is over all normalised measures $\mu$ on the set of probability distributions on $\mathcal{X}$. If $P_X$ is permutation-invariant, that is, if $P_X(x_1, \ldots, x_n) = P_X(x_{\pi^{-1}(1)}, \ldots, x_{\pi^{-1}(n)})$ for all $(x_1, \ldots, x_n) \in \mathcal{X}^n$ and $\pi \in S_n$, then $|\Psi\rangle \in \mathrm{Sym}^n(\mathcal{H})$. Applying this to a distribution $P_X$ studied by Diaconis and Freedman [15], and using their lower bound on the quantity on the l.h.s. of (9) gives the following result.
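Theorem II.10. There is a state $|\Psi\rangle \in \mathrm{Sym}^n(\mathbb{C}^2)$ such that the distance of $\xi^k = \mathrm{tr}_{n-k}|\Psi\rangle\langle\Psi|$ to $P^k$ is lower bounded by

$$\begin{cases} \dfrac{1}{\sqrt{2\pi e}} \cdot \dfrac{k}{n} + o\!\left(\dfrac{k}{n}\right) & \text{if } n \to \infty \text{ and } k = o(n), \\[2mm] \phi(\alpha) + o(1) & \text{if } n \to \infty \text{ and } k/n \to \alpha \in\, ]0, 1/2[, \end{cases}$$

where $\phi(\alpha) := \frac{1}{2\sqrt{2\pi}} \int \big|1 - (1 - \alpha)^{\frac{1}{2}} e^{\alpha u^2/2}\big|\, e^{-u^2/2}\, du$.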
For a fixed dimension and up to a multiplicative factor, the dependence on $k$ and $n$ in Corollary II.3 and Theorem II.7 is therefore tight.

D. De Finetti representations relative to an additional system. A state $\xi^{An}$ on $\mathcal{H}_A \otimes \mathcal{H}^{\otimes n}$ is called permutation-invariant or symmetric relative to $\mathcal{H}_A$ if $(\mathbb{1}_A \otimes \pi)\xi^{An}(\mathbb{1}_A \otimes \pi^\dagger) = \xi^{An}$, for any permutation $\pi \in S_n$ (see [16, 17, 9]). This property is strictly stronger than symmetry of the partial state $\xi^n := \mathrm{tr}_A(\xi^{An})$, since symmetry of $\xi^n$ does not necessarily imply symmetry of $\xi^{An}$ relative to $\mathcal{H}_A$, as the pure state $\frac{1}{\sqrt{2}}(|001\rangle + |110\rangle) \in \mathbb{C}^2 \otimes (\mathbb{C}^2)^{\otimes 2}$ illustrates. Taking a broader view where $\xi^n$ is part of a state on a larger Hilbert space thus gives rise to additional structure. As we shall see, this stronger notion of symmetry also yields stronger de Finetti style statements. These are useful in applications, for instance those related to separability problems (cf. [18] and [19], where an alternative extended de Finetti-type theorem has been proposed).

More precisely, symmetry of a state $\xi^{An}$ on $\mathcal{H}_A \otimes \mathcal{H}^{\otimes n}$ relative to $\mathcal{H}_A$ implies that the partial state $\xi^{Ak} := \mathrm{tr}_{n-k}(\xi^{An})$ is close to a convex combination of states where the part on $\mathcal{H}^{\otimes k}$ has product form and, in addition, is independent of the part on $\mathcal{H}_A$. In particular, $\xi^{Ak}$ is close to being separable with respect to the bipartition $\mathcal{H}_A$ versus $\mathcal{H}^{\otimes k}$. This property is formalised by the following definition, which generalises Definition II.6.

Definition II.6′. Let $P^k(\mathcal{H}_A, \mathcal{H})$ be the set of states of the form $\int \xi_\sigma^A \otimes \sigma^{\otimes k}\, dm(\sigma)$, where $m$ is a probability measure on the set of (mixed) states on $\mathcal{H}$ and where $\{\xi_\sigma^A\}_\sigma$ is a family of states on $\mathcal{H}_A$ parameterised by states on $\mathcal{H}$.

The main results of Sect. IIB can be extended as follows.

Theorem II.7′ (Approximation of symmetric states by product states). Let $\xi^{An}$ be a density operator on $\mathcal{H}_A \otimes (\mathbb{C}^d)^{\otimes n}$ which is symmetric relative to $\mathcal{H}_A$ and let $k \leq n$. Then $\xi^{Ak} := \mathrm{tr}_{n-k}(\xi^{An})$ is $\varepsilon$-close to $P^k(\mathcal{H}_A, \mathbb{C}^d)$ for $\varepsilon := 2\frac{d^2 k}{n}$.

A state $\xi^{An}$ on $\mathcal{H}_A \otimes \mathcal{H}^{\otimes n}$ is called Bose-symmetric relative to $\mathcal{H}_A$ if $(\mathbb{1}_A \otimes \pi)\xi^{An} = \xi^{An}$, for any $\pi \in S_n$.

Theorem II.8′ (Approximation of Bose symmetric states by product states). Let $\xi^{An}$ be a state on $\mathcal{H}_A \otimes (\mathbb{C}^d)^{\otimes n}$ which is Bose-symmetric relative to $\mathcal{H}_A$, and let $\xi^{Ak} := \mathrm{tr}_{n-k}(\xi^{An})$, $k \leq n$. Then $\xi^{Ak}$ is $\varepsilon$-close to $P^k(\mathcal{H}_A \otimes \mathbb{C}^d)$, for $\varepsilon := 2\frac{dk}{n}$.

The proofs of these theorems are obtained by a simple modification of the arguments used for the derivation of the corresponding statements of Sect. IIB. The main ingredients are straightforward generalisations of Theorem II.2 and Lemma II.5.
Theorem II.2′ (Approximation by coherent states). Let $|\Psi\rangle$ be in $\mathcal{H}_A \otimes \mathcal{U}_{\mu+\nu}$ and define $\xi_\mu := \mathrm{tr}_\nu |\Psi\rangle\langle\Psi|$. Then there exists a probability measure $m$ on $U(d)$ and a family $\{\tau_g\}_{g \in U(d)}$ of states on $\mathcal{H}_A$ such that

$$\Big\| \xi_\mu - \int \tau_g \otimes |v_\mu^g\rangle\langle v_\mu^g|\, dm(g) \Big\| \leq 2\left(1 - \frac{\dim \mathcal{U}_\nu^d}{\dim \mathcal{U}_{\mu+\nu}^d}\right) .$$

Lemma II.5′. Let $\xi$ be a state on $\mathcal{H}_A \otimes \mathcal{H}^{\otimes n}$ which is permutation-invariant relative to $\mathcal{H}_A$. Then there exists a purification of $\xi$ in $\mathcal{H}_A \otimes \mathcal{K}_A \otimes \mathrm{Sym}^n(\mathcal{K} \otimes \mathcal{H})$ with $\mathcal{H}_A \cong \mathcal{K}_A$ and $\mathcal{K} \cong \mathcal{H}$.

III. On Werner States and the de Finetti theorem

A. Symmetric Werner states. We now consider a more restricted class of states, the Werner states [11]. Their defining property is that they are invariant under the action of the unitary group given by Eq. (11). Werner states are an interesting class of states because they exhibit many types of phenomena, for example different kinds of entanglement, but have a simple structure that makes them easy to analyse. One reason for narrowing our focus to these special states is that a de Finetti theorem can be proved for them using entirely different methods from the proof of Theorem II.2. We also obtain a rich supply of examples that give insight into the structure of exchangeable states and provide us with an $O(\frac{d}{n})$ lower bound for Theorem II.7.

Schur-Weyl duality gives a decomposition

$$(\mathbb{C}^d)^{\otimes k} \cong \bigoplus_{\lambda \in \mathrm{Par}(k,d)} \mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda , \tag{10}$$

with respect to the action of the symmetric group $S_k$ given by (7) and the action of the unitary group $U(d)$ on $(\mathbb{C}^d)^{\otimes k}$ given by

$$|\psi\rangle \mapsto g^{\otimes k}|\psi\rangle , \tag{11}$$

for $g \in U(d)$ and $|\psi\rangle \in (\mathbb{C}^d)^{\otimes k}$. Here $\mathrm{Par}(k, d)$ denotes the set of Young diagrams with $k$ boxes and at most $d$ rows, $\mathcal{U}_\lambda^d$ is the irreducible representation of $U(d)$ with highest weight $\lambda$, and $\mathcal{V}_\lambda$ is the corresponding irreducible representation of $S_k$. Let $\rho^k$ be a symmetric Werner state on $(\mathbb{C}^d)^{\otimes k}$. Schur’s lemma tells us that $\rho^k$ must be proportional to the identity on each irreducible component $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$, so

$$\rho^k = \sum_\lambda w_\lambda\, \rho_\lambda^k , \tag{12}$$

where $\rho_\lambda^k = P_\lambda / (\dim \mathcal{U}_\lambda^d \dim \mathcal{V}_\lambda)$, with $P_\lambda$ the projector onto $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$, and $w_\lambda \geq 0$ for all $\lambda$, with $\sum_\lambda w_\lambda = 1$. Let $T^k(\rho^k)$ denote the state obtained by “twirling” a state $\rho^k$ on $(\mathbb{C}^d)^{\otimes k}$, i.e.,

$$T^k(\rho^k) := \int g^{\otimes k} \rho^k (g^{\otimes k})^\dagger\, dg ,$$

where the Haar measure on $U(d)$ with normalisation $\int dg = 1$ is used. A state of the form $T^k(\sigma^{\otimes k})$ is a symmetric Werner state since its product structure ensures symmetry
and twirling makes it invariant under unitary action. We call such a state a “twirled product state”. Any two states with the same spectra are equivalent under twirling, so $\sigma \mapsto T^k(\sigma^{\otimes k})$ defines a map $f^k : \mathrm{Spec}^d \to W^k$, where $\mathrm{Spec}^d$ is the set of possible $d$-dimensional spectra and $W^k$ the set of symmetric Werner states on $(\mathbb{C}^d)^{\otimes k}$. The map $f^k$ can be characterised as follows:

Lemma III.1. Given $r = (r_1, \ldots, r_d) \in \mathrm{Spec}^d$, the twirled product state $f^k(r)$ on $(\mathbb{C}^d)^{\otimes k}$ satisfies

$$f^k(r) = \sum_{\lambda \in \mathrm{Par}(k,d)} w_\lambda(r)\, \rho_\lambda^k ,$$

where $w_\lambda(r) = \dim \mathcal{V}_\lambda\, s_\lambda(r)$ and $s_\lambda(r)$ is the Schur function (cf. Eq. (16)).

Proof. Since $f^k(r)$ is a symmetric Werner state, Eq. (12) shows that it has the required form and it remains to compute the coefficients $w_\lambda(r)$. Since the states $\rho_\lambda^k$ are supported on orthogonal subspaces, $w_\lambda(r) = \mathrm{tr}\big(P_\lambda f^k(r)\big)$, where $P_\lambda$ is the projector onto the component $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$ of the Schur-Weyl decomposition of $(\mathbb{C}^d)^{\otimes k}$. Let $\sigma = \mathrm{diag}(r)$ be a state with spectrum $r$. By the linearity and cyclicity of the trace,

$$\mathrm{tr}(P\, T^k(Q)) = \mathrm{tr}(T^k(P)\, Q) \tag{13}$$

for all operators $P$ and $Q$ on $(\mathbb{C}^d)^{\otimes k}$, hence we obtain

$$w_\lambda(r) = \mathrm{tr}\big(P_\lambda T^k(\sigma^{\otimes k})\big) = \mathrm{tr}\big(T^k(P_\lambda)\sigma^{\otimes k}\big) = \mathrm{tr}\big(P_\lambda \sigma^{\otimes k}\big) .$$

In the last step, we used the fact that $P_\lambda$ is invariant under the action (11). Note that $P_\lambda$ projects onto the isotypic subspace of the irreducible representation $\mathcal{U}_\lambda^d$ in the $k$-fold tensor product representation of $U(d)$. On the one hand, this shows that $\mathrm{tr}(P_\lambda \sigma^{\otimes k})$ is the character of the representation $\tilde{\sigma} \mapsto P_\lambda \tilde{\sigma}^{\otimes k} P_\lambda$, evaluated at $\tilde{\sigma} = \sigma$. On the other hand this representation is equivalent to $\dim \mathcal{V}_\lambda$ copies of $\mathcal{U}_\lambda^d$, whose character equals $s_\lambda(r)$. Hence, $w_\lambda(r) = \dim \mathcal{V}_\lambda\, s_\lambda(r)$.

B. A combinatorial formula. We know from Eq. (12) that the states $\rho_\lambda^n$ with $\lambda \in \mathrm{Par}(n, d)$ are the extreme points of the set of symmetric Werner states. A de Finetti theorem for the $n$-exchangeable states
$$\mathrm{tr}_{n-k}\, \rho_\lambda^n , \quad \text{for } \lambda \in \mathrm{Par}(n, d) , \tag{14}$$
therefore implies a de Finetti theorem for arbitrary $n$-exchangeable Werner states by the convexity of the trace distance. Note further that a de Finetti-type statement about all states of the form (14) applies to general $n$-exchangeable Werner states, that is, to states $\rho^k \in W^k$ such that there is some symmetric state $\tau^n$ on $(\mathbb{C}^d)^{\otimes n}$ with $\rho^k = \mathrm{tr}_{n-k}\tau^n$. This is because we can assume that $\tau^n$ is a Werner state as $\rho^k = \mathrm{tr}_{n-k} T^n(\tau^n)$ and $T^n(\tau^n) \in W^n$.

Our main step in the derivation of a de Finetti theorem for symmetric Werner states is a combinatorial formula for the distance of $\mathrm{tr}_{n-k}\rho_\lambda^n$ and the symmetric Werner state $f^k(r)$. Note that for every $r \in \mathrm{Spec}^d$, the state $f^k(r)$ is a convex combination of $k$-fold product states with spectrum $r$, since

$$f^k(r) = \int (g\, \mathrm{diag}(r)\, g^\dagger)^{\otimes k}\, dg . \tag{15}$$

In order to present our formula for $\|\mathrm{tr}_{n-k}\rho_\lambda^n - f^k(r)\|$, we need to introduce the well-known Schur functions and also the more recently defined shifted Schur functions. We first recall the combinatorial description of the Schur function $s_\mu$ by

$$s_\mu(\lambda_1, \ldots, \lambda_d) = \sum_T \prod_{\alpha \in \mu} \lambda_{T(\alpha)} , \tag{16}$$

where the sum is over all semi-standard tableaux $T$ of shape $\mu$ with entries between 1 and $d$. A semi-standard (Young) tableau of shape $\mu$ is a Young frame filled with numbers weakly increasing to the right and strictly increasing downwards. The product is over all boxes $\alpha$ of $\mu$ and $T(\alpha)$ denotes the entry of box $\alpha$ in tableau $T$. Note that $s_\mu(\lambda)$ is homogeneous of degree $k$, where $k$ is the number of boxes in $\mu$. It is easy to see that the sum over semi-standard tableaux in (16) can be replaced by a sum over all reverse tableaux $T$ of shape $\mu$, where, in a reverse tableau, the entries decrease left to right along each row (weakly) and down each column (strictly). In the sequel, all the sums will be over reverse tableaux.

The shifted Schur functions are given by the following combinatorial formula [1, Theorem (11.1)]:

$$s_\mu^*(\lambda_1, \ldots, \lambda_d) = \sum_T \prod_{\alpha \in \mu} \big(\lambda_{T(\alpha)} - c(\alpha)\big) , \tag{17}$$

where $c(\alpha)$ is independent of $T$ and is defined by $c(\alpha) = j - i$ if $\alpha = (i, j)$ is the box in the $i$th row and $j$th column of $\mu$.

Theorem III.2 (Distance to a twirled product state). Let $\lambda \in \mathrm{Par}(n, d)$ and $r \in \mathrm{Spec}^d$. Let $f^k(r)$ be the twirled product state defined in (15). The distance between the partial trace $\mathrm{tr}_{n-k}\rho_\lambda^n$ of the symmetric Werner state $\rho_\lambda^n$ and $f^k(r)$ is given by
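$$\big\| \mathrm{tr}_{n-k}\rho_\lambda^n - f^k(r) \big\| = \frac{1}{2} \sum_{\mu \in \mathrm{Par}(k,d)} \dim \mathcal{V}_\mu \left| \frac{s_\mu^*(\lambda)}{(n \downharpoonright k)} - s_\mu(r) \right| , \tag{18}$$

where the falling factorial $(n \downharpoonright k)$ is defined to be $n(n - 1) \cdots (n - k + 1)$ if $k > 0$ and 1 if $k = 0$.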
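Formula (18) is fully combinatorial, so it can be evaluated by brute force for small diagrams. The sketch below enumerates reverse tableaux to compute $s_\mu$ and $s_\mu^*$ from (16) and (17), takes $\dim \mathcal{V}_\mu$ from the hook length formula (a standard fact that is assumed here rather than stated in the text), and sums the terms of (18). All function names are illustrative; indices are 0-based, so the content is still $c(\alpha) = j - i$. The enumeration is exponential in $k$ and only meant for small examples.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def partitions(k, d):
    """Young diagrams with k boxes and at most d rows, zero-padded to length d."""
    def gen(remaining, max_part, rows_left):
        if rows_left == 0:
            if remaining == 0:
                yield ()
            return
        for first in range(min(remaining, max_part), -1, -1):
            for rest in gen(remaining - first, first, rows_left - 1):
                yield (first,) + rest
    return list(gen(k, k, d))


def boxes(shape):
    return [(i, j) for i, row in enumerate(shape) for j in range(row)]


def reverse_tableaux(shape, d):
    """Fillings with entries 0..d-1, weakly decreasing along rows and
    strictly decreasing down columns (brute-force enumeration)."""
    cells = boxes(shape)
    for filling in product(range(d), repeat=len(cells)):
        T = dict(zip(cells, filling))
        rows_ok = all(T[(i, j)] >= T[(i, j + 1)] for (i, j) in cells if (i, j + 1) in T)
        cols_ok = all(T[(i, j)] > T[(i + 1, j)] for (i, j) in cells if (i + 1, j) in T)
        if rows_ok and cols_ok:
            yield T


def schur(shape, x, shifted=False):
    """Schur function (16), or shifted Schur function (17) if shifted=True."""
    total = Fraction(0)
    for T in reverse_tableaux(shape, len(x)):
        term = Fraction(1)
        for (i, j), entry in T.items():
            content = j - i if shifted else 0
            term *= Fraction(x[entry]) - content
        total += term
    return total


def dim_V(shape):
    """Dimension of the S_k irrep of the given shape (hook length formula)."""
    hooks = 1
    for (i, j) in boxes(shape):
        arm = shape[i] - j - 1
        leg = sum(1 for r in range(i + 1, len(shape)) if shape[r] > j)
        hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks


def falling(n, k):
    out = 1
    for t in range(k):
        out *= n - t
    return out


def distance(lam, r, k):
    """Right-hand side of Eq. (18) for lam in Par(n, d) and spectrum r."""
    n, d = sum(lam), len(lam)
    total = Fraction(0)
    for mu in partitions(k, d):
        total += dim_V(mu) * abs(schur(mu, lam, shifted=True) / falling(n, k) - schur(mu, r))
    return total / 2


if __name__ == "__main__":
    third = Fraction(1, 3)
    # Antisymmetric diagram with n = d = 3, k = 2, against the uniform spectrum.
    print(distance((1, 1, 1), (third, third, third), 2))
```

For $\lambda = (n)$ and $r = (1, 0, \ldots, 0)$ the computed distance vanishes, which matches the observation in Example III.8 that the reduced state then already has product form.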
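In order to prove the theorem we will need a number of lemmas. Our first step is to express the coefficients in $\mathrm{tr}_{n-k}\rho_\lambda^n$ in terms of Littlewood-Richardson coefficients.

Lemma III.3. Let $\lambda \in \mathrm{Par}(n, d)$ and let $P_\lambda$ be the projector onto $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$ embedded in $(\mathbb{C}^d)^{\otimes n}$. Then

$$\mathrm{tr}\big((P_\mu \otimes P_\nu)P_\lambda\big) = c^\lambda_{\mu\nu}\, \dim \mathcal{U}_\lambda^d\, \dim \mathcal{V}_\mu\, \dim \mathcal{V}_\nu$$

for all $\mu \in \mathrm{Par}(k, d)$ and $\nu \in \mathrm{Par}(n - k, d)$, where $c^\lambda_{\mu\nu}$ is the Littlewood-Richardson coefficient.

Proof. The Littlewood-Richardson coefficient $c^\lambda_{\mu\nu}$ is the multiplicity of the irreducible representation $\mathcal{U}_\lambda^d$ in the decomposition of the tensor product representation $\mathcal{U}_\mu^d \otimes \mathcal{U}_\nu^d$ of $U(d)$, i.e.,

$$\mathcal{U}_\mu^d \otimes \mathcal{U}_\nu^d \cong \bigoplus_\lambda c^\lambda_{\mu\nu}\, \mathcal{U}_\lambda^d . \tag{19}$$

This implies that the image of $P_\mu \otimes P_\nu$ in $(\mathbb{C}^d)^{\otimes n}$ is isomorphic to

$$\bigoplus_\lambda \underbrace{\Big( \bigoplus_{i=1}^{c^\lambda_{\mu\nu}} \mathcal{U}^d_{\lambda,i} \otimes (\mathcal{V}_{\mu,i} \otimes \mathcal{V}_{\nu,i}) \Big)} ,$$

as a representation of $U(d) \times S_n$ where, for each $\lambda$, the underbraced part consists of $c^\lambda_{\mu\nu} \dim \mathcal{V}_\mu \dim \mathcal{V}_\nu$ copies of $\mathcal{U}_\lambda^d$ and is contained in the component $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$ of the Schur-Weyl decomposition of $(\mathbb{C}^d)^{\otimes n}$. The conclusion follows from this.

Lemma III.3 allows us to compute the partial trace of the projector $P_\lambda$.

Lemma III.4. Let $\lambda \in \mathrm{Par}(n, d)$ and let $P_\lambda$ be the projector onto $\mathcal{U}_\lambda^d \otimes \mathcal{V}_\lambda$ embedded in $(\mathbb{C}^d)^{\otimes n}$. Then

$$\mathrm{tr}_{n-k} P_\lambda = \dim \mathcal{U}_\lambda^d \sum_{\mu, \nu} c^\lambda_{\mu\nu}\, \frac{\dim \mathcal{V}_\nu}{\dim \mathcal{U}_\mu^d}\, P_\mu ,$$

where the sum extends over all $\mu \in \mathrm{Par}(k, d)$ and $\nu \in \mathrm{Par}(n - k, d)$.

Proof. Since $\mathrm{tr}_{n-k}P_\lambda$ is symmetric and invariant under the action of $U(d)$, it has the form (cf. (12))

$$\mathrm{tr}_{n-k}P_\lambda = \sum_\mu \alpha_\mu P_\mu .$$

The claim then immediately follows from

$$\dim \mathcal{U}_\mu^d\, \dim \mathcal{V}_\mu\, \alpha_\mu = \mathrm{tr}(P_\mu\, \mathrm{tr}_{n-k}P_\lambda) = \mathrm{tr}\big((P_\mu \otimes \mathbb{1}^{\otimes n-k})P_\lambda\big) = \sum_\nu \mathrm{tr}\big((P_\mu \otimes P_\nu)P_\lambda\big)$$

and Lemma III.3.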
In the special case where $n = k + 1$ we obtain a statement that has recently been derived by Audenaert [20, Prop. 4]. We now show how the expression for $\mathrm{tr}_{n-k}P_\lambda$ in Lemma III.4 can be rewritten in terms of shifted Schur functions. To do so we use the following result expressing $\dim \lambda/\mu$, the number of standard tableaux of shape $\lambda/\mu$, in terms of shifted Schur functions.

Theorem III.5 ([1, Theorem 8.1]). Let $\lambda \in \mathrm{Par}(n, d)$, $\mu \in \mathrm{Par}(k, d)$ be such that $\mu_i \leq \lambda_i$ for all $i$. Then

$$\frac{\dim \lambda/\mu}{\dim \mathcal{V}_\lambda} = \frac{s_\mu^*(\lambda)}{(n \downharpoonright k)} .$$

Okounkov and Olshanski give a number of proofs for this theorem, the second of which only uses elementary representation theory. The shifted Schur functions allow us to express partial traces of Werner states in a form analogous to Lemma III.1.

Lemma III.6. Let $\lambda \in \mathrm{Par}(n, d)$. The partial trace of the symmetric Werner state $\rho_\lambda^n$ on $(\mathbb{C}^d)^{\otimes n}$ satisfies

$$\mathrm{tr}_{n-k}\rho_\lambda^n = \sum_{\mu \in \mathrm{Par}(k,d)} \alpha_\mu^\lambda\, \rho_\mu^k ,$$

where $\alpha_\mu^\lambda = \dim \mathcal{V}_\mu\, \frac{s_\mu^*(\lambda)}{(n \downharpoonright k)}$.

Proof. Lemma III.4 gives

$$\alpha_\mu^\lambda = \dim \mathcal{V}_\mu \sum_{\nu \in \mathrm{Par}(n-k,d)} c^\lambda_{\mu\nu}\, \frac{\dim \mathcal{V}_\nu}{\dim \mathcal{V}_\lambda} .$$

Note that $c^\lambda_{\mu\nu} = 0$ (by the Littlewood-Richardson rule) and $s_\mu^*(\lambda) = 0$ (by [1, Theorem 3.1]) unless $\mu_i \leq \lambda_i$ for all $i$. The claim therefore follows from Theorem III.5 and the identity (see [21, p. 67])

$$\dim \lambda/\mu = \sum_{\nu \in \mathrm{Par}(n-k,d)} c^\lambda_{\mu\nu} \dim \mathcal{V}_\nu .$$
We are now ready to give the proof of the combinatorial formula.

Proof of Theorem III.2. This is an immediate consequence of Lemmas III.1 and III.6, since

$$\big\| \mathrm{tr}_{n-k}\rho_\lambda^n - f^k(r) \big\| = \Big\| \sum_\mu \alpha_\mu^\lambda \rho_\mu^k - \sum_\mu w_\mu(r) \rho_\mu^k \Big\| = \frac{1}{2} \sum_\mu |\alpha_\mu^\lambda - w_\mu(r)| ,$$

where we used the fact that the supports of the $\rho_\mu^k$’s are mutually orthogonal.
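C. A de Finetti theorem for Werner states. The following de Finetti style theorem is a consequence of Theorem III.2. We call it “half a theorem” as it is a quantum de Finetti theorem for a restricted class of quantum states, the Werner states.

Theorem III.7 (Approximation by twirled products). Let $\lambda \in \mathrm{Par}(n, d)$ and define $\bar{\lambda} := (\frac{\lambda_1}{n}, \ldots, \frac{\lambda_d}{n}) \in \mathrm{Spec}^d$. Let $f^k(\bar{\lambda})$ be defined as in (15). Then the partial trace $\mathrm{tr}_{n-k}\rho_\lambda^n$ of the symmetric Werner state $\rho_\lambda^n$ satisfies

$$\big\| \mathrm{tr}_{n-k}\rho_\lambda^n - f^k(\bar{\lambda}) \big\| \leq \frac{3}{4} \cdot \frac{k(k-1)}{\lambda_\ell} + O\!\left(\frac{k^4}{\lambda_\ell^2}\right) ,$$

where $\lambda_\ell$ is the smallest non-zero row of $\lambda$. The dimension $d$ does not appear explicitly in this bound, nor in the order term $O(\cdot)$.

Proof. First note that we can restrict the sum to diagrams $\mu$ with no more than $\ell$ rows, since by definition of $\ell$, $\lambda_q = 0$ for $q > \ell$, and $s_\mu(\lambda_1, \ldots, \lambda_\ell, 0, \ldots, 0) = s_\mu^*(\lambda_1, \ldots, \lambda_\ell, 0, \ldots, 0) = 0$ for $\mu_{\ell+1} > 0$. Furthermore, Schur as well as shifted Schur functions satisfy the stability condition [1]

$$s_\mu(\lambda_1, \ldots, \lambda_\ell, 0, \ldots, 0) = s_\mu(\lambda_1, \ldots, \lambda_\ell), \qquad s_\mu^*(\lambda_1, \ldots, \lambda_\ell, 0, \ldots, 0) = s_\mu^*(\lambda_1, \ldots, \lambda_\ell) ,$$

so that we can safely assume that $\lambda$ has $\ell$ (non-vanishing) rows and that the tableaux are numbered from 1 to $\ell$ only. Note that

$$\frac{1}{(n \downharpoonright k)} = n^{-k}\left(1 + \frac{k(k-1)}{2n} + O\!\left(\frac{k^4}{n^2}\right)\right) , \tag{20}$$

and

$$n^{-k} s_\mu^*(\lambda) = \sum_T \prod_\alpha \left(\bar{\lambda}_{T(\alpha)} - \frac{c(\alpha)}{n}\right) = \sum_T \prod_\beta \bar{\lambda}_{T(\beta)} \left(1 - \sum_\alpha \frac{c(\alpha)}{\lambda_{T(\alpha)}} + \frac{1}{2} \sum_{\alpha \neq \alpha'} \frac{c(\alpha)c(\alpha')}{\lambda_{T(\alpha)}\lambda_{T(\alpha')}} + \cdots \right) ,$$

where we have made use of (17) in the first line. Using (16), the bound $|c(\alpha)| \leq k - 1$ and the fact that $\alpha$ enumerates $k$ boxes, we find the bounds

$$\big| n^{-k} s_\mu^*(\lambda) - s_\mu(\bar{\lambda}) \big| \leq s_\mu(\bar{\lambda}) \left( \frac{k(k-1)}{\lambda_\ell} + O\!\left(\frac{k^4}{\lambda_\ell^2}\right) \right) .$$

Combining this with the estimate (20) we obtain

$$\frac{1}{2} \left| \frac{s_\mu^*(\lambda)}{(n \downharpoonright k)\, s_\mu(\bar{\lambda})} - 1 \right| \leq \frac{3}{4} \frac{k(k-1)}{\lambda_\ell} + O\!\left(\frac{k^4}{\lambda_\ell^2}\right) , \tag{21}$$

where we have used $\lambda_\ell \leq n$. Since $\|\mathrm{tr}_{n-k}\rho_\lambda^n - f^k(\bar{\lambda})\|$ is a convex combination with weights $\mathrm{tr}(P_\mu f^k(\bar{\lambda})) = \dim \mathcal{V}_\mu\, s_\mu(\bar{\lambda})$ of the terms on the l.h.s. of (21), this concludes the proof.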
Example III.8. Three special cases may be noted:
• Fix $\bar{\lambda}$ and consider $\lambda = n\bar{\lambda}$ for an integer $n$. The bound then turns into $O\big(\frac{k^2}{n}\big)$, just as in the classical case. Thus when one restricts attention to a particular diagram shape $\bar{\lambda}$, one obtains the same type of dimension-independent bound as Diaconis and Freedman [15]. (This does not contradict Example II.9, where we focus on a single diagram with $\lambda_\ell = 1$. The bound of Theorem III.7 gives no information here.)
• For $\lambda = (\sqrt{n}, \ldots, \sqrt{n})$ we have an error of order $O\big(\frac{k^2}{\sqrt{n}}\big)$.
• Finally, $\lambda = (n)$: In this case, $\mathrm{tr}_{n-k}\rho_\lambda^n = f^k(1, 0, \ldots, 0)$, which means that $\mathrm{tr}_{n-k}\rho_\lambda^n$ has a product form and an application of Theorem III.7 is not needed.

Note that in Theorem III.7 we only kept the dependence on the last nonzero row $\lambda_\ell$ of $\lambda$. For specific applications (or for cases such as $\lambda = (\lambda_1, \ldots, \lambda_{\ell-1}, 1)$) one may want to derive bounds that depend on more details of $\lambda$.

By the (infinite) quantum de Finetti theorem, convex combinations of tensor product states are the same thing as infinitely exchangeable states. In this light, a finite de Finetti theorem says how close $n$-exchangeable states are to $\infty$-exchangeable states, and one can generalise the notion of a de Finetti theorem, and ask: How well can $n$-exchangeable states be approximated by $m$-exchangeable states, where $m \geq n$? In the realm of symmetric Werner states, this amounts to bounding the distance $\big\| \mathrm{tr}_{n-k}\rho^n_{n\bar{\lambda}} - \mathrm{tr}_{m-k}\rho^m_{m\bar{\lambda}} \big\|$, which is

$$\frac{1}{2} \sum_{\mu \in \mathrm{Par}(k,d)} \dim \mathcal{V}_\mu \left| \frac{s_\mu^*(n\bar{\lambda})}{(n \downharpoonright k)} - \frac{s_\mu^*(m\bar{\lambda})}{(m \downharpoonright k)} \right| .$$
A straightforward calculation very similar to the proof of Theorem III.7 leads to an interpolation between the trivial case where $m$ equals $n$ and the case where $m \to \infty$ which we have considered in Theorem III.7.

D. Necessity of d-dependence. We end this section with a lower bound, which is a direct corollary to Theorem III.2.

Corollary III.9. Let $k < d$ and let $\lambda = (m^d)$ be the diagram consisting of $d$ rows of length $m$. Then the distance of $\mathrm{tr}_{n-k}\rho_\lambda^n$ to $P^k$ is lower bounded by $\frac{d}{2(n-1)}\big(1 - \frac{1}{d^2}\big)$, where $n = md$.

Note that this bound can be seen as a generalisation of Example II.9, where we set $d = n$. It implies that any quantum de Finetti theorem can only give an interesting statement if $d$ is small compared to $n$.
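The lower bound of Corollary III.9 comes from the single antisymmetric diagram $\mu = (1^2)$, for which both functions have simple closed forms: $s_\mu(r) = \sum_{i<j} r_i r_j$ by (16) and $s_\mu^*(\lambda) = \sum_{i<j} (\lambda_i + 1)\lambda_j$ by (17) with contents $0$ and $-1$. The following sketch evaluates the gap $\frac{s_\mu^*(\lambda)}{(n \downharpoonright 2)} - s_\mu(\bar{\lambda})$ for $\lambda = (m^d)$ with exact fractions and compares it with the stated bound; the function names are illustrative only.

```python
from fractions import Fraction


def e2(x):
    """s_{(1,1)}(x): second elementary symmetric polynomial."""
    return sum(x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x)))


def shifted_e2(lam):
    """s*_{(1,1)}(lam) = sum over i < j of (lam_i + 1) * lam_j."""
    return sum((lam[i] + 1) * lam[j]
               for i in range(len(lam)) for j in range(i + 1, len(lam)))


def antisymmetric_gap(m, d):
    """s*_mu(lam)/(n(n-1)) - s_mu(lam_bar) for lam = (m^d), mu = (1,1)."""
    n = m * d
    lam = [m] * d
    lam_bar = [Fraction(1, d)] * d
    return Fraction(shifted_e2(lam), n * (n - 1)) - e2(lam_bar)


def closed_form(m, d):
    """Lower bound d / (2(n-1)) * (1 - 1/d^2) with n = m * d."""
    n = m * d
    return Fraction(d, 2 * (n - 1)) * (1 - Fraction(1, d * d))


if __name__ == "__main__":
    for m, d in [(1, 3), (2, 3), (10, 4)]:
        print(m, d, antisymmetric_gap(m, d), closed_form(m, d))
```

For $m = 1$ and $d = n$ this reproduces the situation of Example II.9, where the gap stays bounded away from zero however large $n$ is.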
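Proof. Note first that the functions $s_\mu(\bar{\lambda})$ and $s_\mu^*(\lambda)$ take a particularly simple form for the diagram $\lambda$ under consideration. From Eq. (16),

$$s_\mu(\bar{\lambda}) = d^{-k} \dim \mathcal{U}_\mu^k , \tag{22}$$

since $\dim \mathcal{U}_\mu^k$ is equal to the number of semi-standard tableaux $T$ of shape $\mu$, and from Eq. (17),

$$n^{-k} s_\mu^*(\lambda) = d^{-k} \dim \mathcal{U}_\mu^k \prod_\alpha \left(1 - \frac{c(\alpha)\, d}{n}\right) . \tag{23}$$

Because the trace distance does not increase when tracing out systems, and $\mathrm{tr}_{k-2}\tau^k \in P^2$ for every $\tau^k \in P^k$, we can bound the distance of $\mathrm{tr}_{n-k}\rho_\lambda^n$ to $P^k$ as follows:

$$\min_{\tau^k \in P^k} \big\| \mathrm{tr}_{n-k}\rho_\lambda^n - \tau^k \big\| \geq \min_{\tau^2 \in P^2} \big\| \mathrm{tr}_{n-2}\rho_\lambda^n - \tau^2 \big\| .$$

Let $\mu = (1^2)$. We show below that

$$\max_r s_\mu(r) = s_\mu(\bar{\lambda}) , \tag{24}$$

where the maximisation ranges over all spectra. With $\dim \mathcal{V}_\mu = 1$, this gives for every $\tau^2 \in P^2$,

$$\big\| \mathrm{tr}_{n-2}\rho_\lambda^n - \tau^2 \big\| \geq \mathrm{tr}\big(P_\mu(\mathrm{tr}_{n-2}\rho_\lambda^n - \tau^2)\big) \geq \mathrm{tr}(P_\mu\, \mathrm{tr}_{n-2}\rho_\lambda^n) - \max_\sigma \mathrm{tr}(P_\mu \sigma^{\otimes 2}) \geq \frac{s_\mu^*(\lambda)}{(n \downharpoonright 2)} - s_\mu(\bar{\lambda}) ,$$

by Lemma III.6 and Lemma III.1. Equation (23) implies

$$n^{-2} s_\mu^*(\lambda) = d^{-2} \dim \mathcal{U}_\mu^2 \left(1 + \frac{d}{n}\right) . \tag{25}$$

We thus obtain

$$\frac{s_\mu^*(\lambda)}{(n \downharpoonright 2)} - s_\mu(\bar{\lambda}) = d^{-2} \dim \mathcal{U}_\mu^2 \left( \frac{1}{1 - \frac{1}{n}} \left(1 + \frac{d}{n}\right) - 1 \right) = \dim \mathcal{U}_\mu^2\, \frac{d + 1}{(n - 1)d^2} \tag{26}$$

by (22) and (25). The claim then immediately follows from $\dim \mathcal{U}_\mu^2 = \binom{d}{2}$.

It remains to prove (24). According to definition (16), for $\mu = (1^2)$,

$$s_\mu(r_1, \ldots, r_d) = \sum_{i_1 < i_2} r_{i_1} r_{i_2} ,$$

where the sum is over all indices $i_1, i_2 \in \{1, \ldots, d\}$. We claim that

$$s_\mu(r_1, \ldots, r_d) \leq s_\mu\Big(\frac{r_1 + r_2}{2}, \frac{r_1 + r_2}{2}, r_3, \ldots, r_d\Big) . \tag{27}$$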
This follows from the fact that we can write
$$s_\mu(r) = r_1 r_2 + (r_1 + r_2) \sum_{i \ge 3} r_i + \sum_{3 \le i_1 < i_2} r_{i_1} r_{i_2}$$
and the inequality $\sqrt{r_1 r_2} \le \frac{r_1 + r_2}{2}$ relating the geometric and the arithmetic mean of $r_1, r_2$. Inequality (27) and the symmetry of $s_\mu$ imply (24).

IV. The structure of $P^k$ for Werner states

We focus now on the set $T_k(P^k) = P^k \cap W^k$ of Werner states that are convex combinations of product states. Theorem III.2 approximates elements of $W^k$ by elements of $P^k \cap W^k$. We can ask whether it is possible for a symmetric Werner state to be closer to $P^k$ than to the set $P^k \cap W^k$. The negative answer is given by the following lemma.

Lemma IV.1. The closest state $\tau^k \in P^k$ to a symmetric Werner state $\rho^k \in W^k$ is itself a Werner state, i.e., an element of $P^k \cap W^k$.

Proof. Suppose $\tau^k \in P^k$ is the nearest (not necessarily Werner) product state, so $\|\rho^k - \tau^k\|$ is minimal. Then, using the convexity of the distance,
$$\|\rho^k - T_k(\tau^k)\| = \|T_k(\rho^k - \tau^k)\| \le \int \left\| g^{\otimes k} (\rho^k - \tau^k) (g^\dagger)^{\otimes k} \right\| dg = \|\rho^k - \tau^k\| ,$$
so the Werner state $T_k(\tau^k)$ is at least as close to $\rho^k$ as $\tau^k$ (and in fact the triangle inequality is strict unless $\tau^k$ is U(d)-invariant). This means that the closest state $\tau^k$ is an element of $P^k \cap W^k$.

A symmetric Werner state $\rho^k_\lambda$, for $\lambda \in \mathrm{Par}(k, d)$, has the following optimality property:

Lemma IV.2. Let $\lambda \in \mathrm{Par}(k, d)$. The state $\rho^k_\lambda \in W^k$ is closer to $P^k$ than any other state $\rho^k$ with support on $U^d_\lambda \otimes V_\lambda$.

Proof. Let $\rho^k$ be a state with support on $U^d_\lambda \otimes V_\lambda$, and let $\tau^k \in P^k$ be the state that is closest to $\rho^k$. By Schur's Lemma, $\rho^k_\lambda = T_k(\mathcal{S}(\rho^k))$, where $\mathcal{S}(\rho^k) := \frac{1}{k!} \sum_{\pi \in S_k} \pi \rho^k \pi^\dagger$. Thus, using the triangle inequality and the unitary invariance of the trace norm,
$$\|\rho^k_\lambda - T_k(\tau^k)\| = \|T_k(\mathcal{S}(\rho^k)) - T_k(\mathcal{S}(\tau^k))\| \le \|\rho^k - \tau^k\| .$$
Fig. 1. This shows schematically the maps underlying Theorem III.7. From the Young diagram $\lambda$ on n systems, one can go to $W^k$ by taking the state $\rho^n_\lambda$ and tracing out n − k systems, or one can go to $\Delta(d)$ by normalising the row lengths of $\lambda$. From the latter space, the map $f_k$ takes one to $W^k$, and the two routes approximately end up at the same point
The set $T_k(P^k)$ is the convex hull of all twirled tensor products $T_k(\sigma^{\otimes k})$, which is the convex hull of $f_k(\mathrm{Spec}^d)$. Since $f_k(\pi r) = f_k(r)$ for any permutation $\pi$ of $r_1, \ldots, r_d$, we can restrict $f_k$ to the simplex $\Delta(d) = \mathrm{Spec}^d / S_d$. The vertices of $\Delta(d)$ are the points $x^q \in \mathrm{Spec}^d$ whose first q coordinates are 1/q and the remainder zero, for $q = 1, \ldots, d$. Thus $f_k(x^1)$ is just the twirl of $|0\rangle\langle 0|^{\otimes k}$, which is the projector onto $\mathrm{Sym}^k(\mathcal{H})$, and $f_k(x^d) = (\mathbb{1}/d)^{\otimes k}$ is the fully mixed state. The set of Werner states in $P^k$ is thus the convex hull of $f_k(\Delta(d))$ (see Fig. 1). What does this set look like?

Example IV.3. Let us look first at the case where k = 2 and d is arbitrary. By Lemma III.1, the point r in $\Delta(d)$ is mapped to $s_{(2)}(r)\rho_{(2)} + s_{(1^2)}(r)\rho_{(1^2)}$, and it is easy to check that $s_{(1^2)}(r) = \sum_{i<j} r_i r_j$ is maximised by $r = x^d$, giving $s_{(1^2)}(r) = \frac{1}{2}(1 - 1/d)$. The states in $f_2(\Delta(d))$ are therefore those of the form $a\rho_{(2)} + b\rho_{(1^2)}$ with $b \le \frac{1}{2}(1 - 1/d)$. Thus $P^2 \cap W^2$ has a rather trivial polytope structure, being a line segment. It follows that the state $a\rho_{(2)} + b\rho_{(1^2)}$ lies at a distance $\max(0, b - \frac{1}{2}(1 - 1/d))$ from $P^2 \cap W^2$. By Lemma IV.1, this is also the minimum distance to $P^2$. This result implies that the state $\xi^2 = \rho_{(1^2)}$ considered in Example II.9, Eq. (8), has distance at least 1/2 from $P^2$, showing the impossibility of a dimension-free bound on the error of a quantum de Finetti theorem (see remarks following Corollary III.9). In fact, Lemma IV.2 implies that any symmetric state with support on $\wedge^2(\mathbb{C}^d)$ has distance at least $\frac{1}{2}$ to $P^2$.

Example IV.4. Consider next the case d = 3, k = 3. We will henceforth regard the set of Werner states $W^3$ as a subset of $\mathbb{R}^3$ by identifying a state $\rho = \sum_\lambda v_\lambda \rho_\lambda$ with the vector $v = (v_{(3)}, v_{(2,1)}, v_{(1^3)})$. If $\sigma = r_1 |1\rangle\langle 1| + r_2 |2\rangle\langle 2| + r_3 |3\rangle\langle 3|$, Lemma III.1 tells us that
$$f_3(r) = \left(s_{(3)}(r),\, 2 s_{(2,1)}(r),\, s_{(1^3)}(r)\right) = \left( \sum_i r_i^3 + \sum_{i \ne j} r_i^2 r_j + r_1 r_2 r_3 ,\;\; 2 \sum_{i \ne j} r_i^2 r_j + 4 r_1 r_2 r_3 ,\;\; r_1 r_2 r_3 \right) . \tag{28}$$
Fig. 2. Left: the image $f_k(\Delta(d))$ for d = 3, k = 3, projected onto the coordinates $\rho_{(1^3)}$ and $\rho_{(2,1)}$. The convex hull of $f_k(\Delta(d))$ is a polytope (see Example IV.4). Right: the image $f_k(\Delta(d))$ for d = 3, k = 4 projected onto $\rho_{(4)}$ and $\rho_{(2,2)}$. The convex hull of this figure is not a polytope; since it is equal to the projection of the convex hull of $f_k(\Delta(d))$, the latter set, $P^4 \cap W^4$, cannot be a polytope
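The computations in Example IV.4 can be checked in exact rational arithmetic. The sketch below evaluates Eq. (28) at the three vertices of the simplex for d = 3 and confirms, at one sample spectrum, that $f_3(r)$ is a convex combination of the vertex images with the polynomial weights quoted in the example. The helper names (`symmetric_sums`, `f3`, `convex_weights`) and the sample point are illustrative assumptions; a single sample point is of course not a proof of non-negativity.

```python
from fractions import Fraction as F
from itertools import permutations

def symmetric_sums(r):
    """Return (sum r_i^3, sum_{i != j} r_i^2 r_j, r_1 r_2 r_3) for a 3-component spectrum."""
    cubes = sum(x ** 3 for x in r)
    mixed = sum(r[i] ** 2 * r[j] for i, j in permutations(range(3), 2))
    prod = r[0] * r[1] * r[2]
    return cubes, mixed, prod

def f3(r):
    """Eq. (28): coordinates (s_(3), 2 s_(2,1), s_(1^3)) of the twirled state."""
    cubes, mixed, prod = symmetric_sums(r)
    return (cubes + mixed + prod, 2 * mixed + 4 * prod, prod)

def convex_weights(r):
    """Weights of f3(x^1), f3(x^2), f3(x^3) in the decomposition of f3(r)."""
    cubes, mixed, prod = symmetric_sums(r)
    return (cubes - mixed + 3 * prod, 4 * mixed - 24 * prod, 27 * prod)

# Vertices x^q of the simplex: first q coordinates equal 1/q, the rest zero.
X1 = (F(1), F(0), F(0))
X2 = (F(1, 2), F(1, 2), F(0))
X3 = (F(1, 3), F(1, 3), F(1, 3))

if __name__ == "__main__":
    r = (F(1, 2), F(3, 10), F(1, 5))       # sample ordered spectrum (assumption)
    w = convex_weights(r)
    combo = tuple(sum(wi * v[c] for wi, v in zip(w, (f3(X1), f3(X2), f3(X3))))
                  for c in range(3))
    print(f3(X1), f3(X2), f3(X3))
    print(w, combo == f3(r))
```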
The vertices of $\Delta(d)$ are mapped to
$$f_3(x^1) = (1, 0, 0), \qquad f_3(x^2) = \left(\tfrac{1}{2}, \tfrac{1}{2}, 0\right), \qquad f_3(x^3) = \left(\tfrac{10}{27}, \tfrac{16}{27}, \tfrac{1}{27}\right),$$
and comparison of Eq. (28) and the coordinates of the vertices gives
$$f_3(r) = \left( \sum_i r_i^3 - \sum_{i \ne j} r_i^2 r_j + 3 r_1 r_2 r_3 \right) f_3(x^1) + \left( 4 \sum_{i \ne j} r_i^2 r_j - 24 r_1 r_2 r_3 \right) f_3(x^2) + (27 r_1 r_2 r_3)\, f_3(x^3),$$
and one can show that the polynomial coefficients are positive. So $f_3(r)$ lies in the convex span of $\{f_3(x^1), f_3(x^2), f_3(x^3)\}$. Note that $f_3(\Delta(d))$ is a subset of the set of triseparable Werner states studied in [22]. Thus for d = 3, $P^3 \cap W^3$ is a polytope (see Fig. 2), as in the previous example.

However, if the number of diagrams with a given value of k and d exceeds d, the situation is different:

Theorem IV.5. Let k, d be such that $|\mathrm{Par}(k, d)| > d$. Then the set $T_k(P^k)$ is not a polytope.

Proof. Let X denote the subspace spanned by $f_k(x^q)$ for $q = 1, \ldots, d$, where we identify $W^k$ with a subset of $\mathbb{R}^{|\mathrm{Par}(k,d)|}$, as in Example IV.4. Since $|\mathrm{Par}(k, d)| > d$, there is a non-zero vector v in $\mathbb{R}^{|\mathrm{Par}(k,d)|}$ that is orthogonal to X with respect to the Euclidean scalar product in $\mathbb{R}^{|\mathrm{Par}(k,d)|}$. Suppose $f_k(r)$ lies in X for all $r \in \Delta(d)$. Then $f_k(r) \cdot v = 0$ for all r, so from Lemma III.1 we have for all $r \in \Delta(d)$,
$$\sum_{\lambda \in \mathrm{Par}(k,d)} (v_\lambda \dim V_\lambda)\, s_\lambda(r) = 0 , \tag{29}$$
where $v = \sum_\lambda v_\lambda \rho_\lambda$. Since the Schur polynomials are homogeneous, Eq. (29) extends from $\Delta(d)$ to all r with non-negative components, and therefore all derivatives of the polynomial on the l.h.s. of this equation are zero at the origin. Since every coefficient of this polynomial is proportional to one of these derivatives, it must be identically zero. But the Schur functions $s_\lambda$ form a basis for the space of homogeneous symmetric polynomials of degree k in d variables, and therefore no such relationship can hold. Therefore $T_k(P^k)$ includes a point outside X.

If $T_k(P^k)$ is a polytope, it has a vertex w not in X. Since $T_k(P^k)$ is the convex hull of $f_k(\Delta(d))$, w has the form $w = f_k(a)$. As w is not in X, a is not a vertex of $\Delta(d)$, which implies that there is a line segment in $\Delta(d)$ passing through a. Because $f_k$ is smooth, the image under $f_k$ of the line segment $t \mapsto a + t\xi$ has a tangent vector at the vertex w. If this tangent vector does not vanish, then we have a contradiction, since then the curve must contain points outside the polytope $T_k(P^k)$ in any neighbourhood of w, however small.

It remains to show that, for any point $a \in \Delta(d)$ that is not a vertex, there is a vector $\xi \in \mathbb{R}^d$ such that
1. the line segment $t \mapsto a + t\xi$ lies within $\Delta(d)$ for sufficiently small absolute values of the real parameter t, and
2. the derivative of $f_k$ in the direction $\xi$ at the point a has non-vanishing tangent vector, i.e. $\left.\frac{\partial f_k(a + t\xi)}{\partial t}\right|_{t=0} \ne 0$.

It is enough to show that the component of this tangent vector in some direction $\tau \in \mathbb{R}^{|\mathrm{Par}(k,d)|}$ is non-vanishing, i.e. that
$$\left.\frac{\partial (\tau \cdot f_k(a + t\xi))}{\partial t}\right|_{t=0} = \left.\xi \cdot (\nabla_r (\tau \cdot f_k(r)))\right|_{r=a} \ne 0 . \tag{30}$$
We choose $\xi$ as follows: Suppose a lies in the convex hull of the h vertices $x^{q_1}, \ldots, x^{q_h}$ of $\Delta(d)$, arranged in increasing size of the index $q_i$, with $2 \le h \le d$. Thus
$$a = \sum_{i=1}^{h} u_i x^{q_i}, \quad \text{with } 0 < u_i < 1 \text{ for } 1 \le i \le h. \tag{31}$$
Define
$$\xi = \frac{q_1 q_2}{q_2 - q_1} \left( x^{q_1} - x^{q_2} \right) = (\underbrace{1, \ldots, 1}_{q_1}, \underbrace{\beta, \ldots, \beta}_{q_2 - q_1}, 0, \ldots, 0) \in \mathbb{R}^d , \tag{32}$$
where $\beta = -\frac{q_1}{q_2 - q_1}$. Then $a + t\xi$ lies within the convex hull of $x^{q_1}, \ldots, x^{q_h}$, and hence in $\Delta(d)$, for small enough values of |t|.

To define $\tau$, we use the fact that the monomial symmetric functions $m_\lambda$, for $\lambda \in \mathrm{Par}(k, d)$, also form a basis of the homogeneous symmetric polynomials of degree k in d variables. In particular,
$$m_{(d)}(r) = \sum_i r_i^d = \sum_\lambda \kappa_{\lambda,(d)}\, s_\lambda(r) ,$$
where the coefficients $\kappa_{\lambda\mu}$ constitute the transition matrix, which is given by the inverse of the matrix of Kostka numbers [23]. We now take
$$\tau = \sum_\lambda \frac{\kappa_{\lambda,(d)}}{\dim V_\lambda}\, \rho_\lambda ,$$
which implies that
$$\tau \cdot f_k(r) = \sum_i r_i^d .$$
From (30) and (32) therefore
$$\left.\xi \cdot (\nabla_r (\tau \cdot f_k(r)))\right|_{r=a} = \sum_i \xi_i \left.\frac{\partial \sum_j r_j^d}{\partial r_i}\right|_{r=a} = d \sum_{i=1}^{q_1} a_i^{d-1} - \frac{d\, q_1}{q_2 - q_1} \sum_{i=q_1+1}^{q_2} a_i^{d-1} > 0 ,$$
the last inequality holding because Eq. (31) implies $a_1 = \cdots = a_{q_1} > a_{q_1+1} = \cdots = a_{q_2}$. The tangent vector at a in the direction $\xi$ is therefore non-vanishing, which completes the proof.

Figure 2 shows an example where d = 3, k = 4 and $|\mathrm{Par}(k, d)| = 4 > d$. One might wonder whether Theorem IV.5 is tight, in the sense that, for $|\mathrm{Par}(k, d)| \le d$, the set $T_k(P^k)$ is a polytope. For k = 3, d = 3, where $|\mathrm{Par}(k, d)| = d$, we have seen that this is true. However, for k = 4, d = 5, which also gives $|\mathrm{Par}(k, d)| = d$, empirical evidence suggests that $T_k(P^k)$ is not a polytope, having a convex boundary. This is shown in Fig. 3, which also plots the images of traced-out states $\mathrm{tr}_{n-k}\,\rho^n_\lambda$ with n = 10 and n = 60 and shows how the approximation to $T_k(P^k)$ improves as more systems are traced out; it also reveals some intriguing striations in the case n = 60, corresponding to diagrams whose top rows are the same length. Thus the characterisation of the set $P^k \cap W^k$ seems to be quite subtle, and Werner states again uphold their reputation for exhibiting an interesting variety of phenomena.
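The hypothesis of Theorem IV.5 only involves the number of Young diagrams with k boxes and at most d rows. A minimal sketch for computing $|\mathrm{Par}(k, d)|$, assuming (as the surrounding text suggests) that $\mathrm{Par}(k, d)$ denotes the partitions of k into at most d parts, is given below; the function name is illustrative.

```python
def count_partitions(k, d, largest=None):
    """Number of partitions of k into at most d parts, each part <= largest."""
    if largest is None:
        largest = k
    if k == 0:
        return 1          # the empty partition
    if d == 0:
        return 0          # boxes left over but no rows available
    # Choose the first (largest) row p, then fill the remaining rows with parts <= p.
    return sum(count_partitions(k - p, d - 1, p)
               for p in range(1, min(k, largest) + 1))

if __name__ == "__main__":
    for k, d in [(3, 3), (4, 3), (4, 5)]:
        print(k, d, count_partitions(k, d))
```

The three cases printed are the ones discussed in the text: k = 3, d = 3 and k = 4, d = 5 sit exactly at the threshold $|\mathrm{Par}(k, d)| = d$, while k = 4, d = 3 exceeds it.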
Fig. 3. The figures show the image $f_k(\Delta(d))$ (shaded region) for d = 5, k = 4, projected onto the coordinates $\rho_{(4)}$ and $\rho_{(2,2)}$. The image has a smooth convex boundary, so $P^4 \cap W^4$ cannot be a polytope. Also shown are the points obtained by tracing out n − k systems from states in $W^n$. Each point corresponds to a diagram with n = 10 boxes (top figure), n = 20 (centre figure) and n = 60 boxes (bottom figure); the line segments demarcate the convex hull of all the points. As expected, $f_k(\Delta(d))$ is approximated more closely as n increases

V. Conclusions

Although the quantum de Finetti theorem is usually thought of as a theorem about symmetric states, the unitary group shares the limelight in the results described here. Our highest weight version of the de Finetti theorem (Theorem II.2) generalises the usual symmetric-state version, but the extra generality almost comes free; indeed, one could argue that the structure of the proof is made clearer by taking the broader viewpoint. One can regard a highest weight vector as the state in a representation that is as unentangled as possible; this point of view has been taken by Klyachko [24]. It is therefore natural to regard highest weight vectors as analogues of product states, which is the role they have in our theorem.

In the special case of symmetric states, our Theorem II.7 gives bounds for the distance between the n-exchangeable state $\rho^k$ and the set $P^k$ of convex combinations of products $\sigma^{\otimes k}$; these bounds are optimal in their dependence on n and k, the theorem giving an upper bound of order k/n and there being examples of states that achieve this bound (see Theorem II.10). The dependence of the bound on the dimension d is less clear, the theorem giving a factor of $d^2$ whereas in the classical case Diaconis and Freedman [15] obtained a bound with a dimension factor of order d.

Diaconis and Freedman also obtained a bound, $\frac{k(k-1)}{2n}$, that is independent of the dimension. No such bound can exist for quantum states, as Example II.9 shows; one can find a state $\rho^n$ with the property that $\rho^2$, obtained by tracing out all but two of the systems, lies at a distance at least 1/2 from $P^2$. This example is a Werner state, in fact the fully antisymmetric state on d = n systems, and it is an illustration of the usefulness of this family of states in giving information about $P^k$.

Lemma III.6 shows that the shifted Schur functions [1] are closely connected with partial traces of Werner states. The meaning of this connection needs to be further explored: does the algebra of shifted symmetric functions have a quantum-informational significance?

Another intriguing connection is with the theorem of Keyl and Werner [25]. They show that the spectrum of a state $\rho$ can be measured by carrying out a von Neumann measurement of $\rho^{\otimes n}$ on the subspaces $U_\lambda \otimes V_\lambda$ in the Schur-Weyl decomposition of $(\mathbb{C}^d)^{\otimes n}$ (Eq. (10)); if $\lambda$ is obtained, then $\bar\lambda = \left(\frac{\lambda_1}{n}, \ldots, \frac{\lambda_d}{n}\right)$ approximates the spectrum of $\rho$. Our theorem tells us that $\rho^k = \mathrm{tr}_{n-k}\,\rho^n_\lambda$ can be approximated by the twirled product $\sigma^{\otimes k}$, where $\sigma$ has spectrum $\bar\lambda$. By the Keyl-Werner theorem, the state $\mathrm{tr}_{n-k}\,\rho^n_\lambda$ must therefore project predominantly into subspaces $U_\mu \otimes V_\mu$ with $\mu$ close to $\lambda$ in shape (but rescaled by k/n). In this sense, tracing out a Werner state approximately 'preserves the shape' of its diagram. We can get an intuition for why this should be by iterating the special case of Lemma III.4 where one box is removed (cf. [20, Prop. 4]). This shows that tracing out is approximately equivalent, for large n, to a process that selects a row of a diagram with probability proportional to the length of that row and then removes a box from the end of the row.
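The heuristic box-removal process described at the end of the previous paragraph is easy to simulate. The sketch below is only an illustration of that heuristic, not of the exact partial trace: it repeatedly picks a row with probability proportional to its length and removes one box, and at the end sorts the rows so that the result is again a diagram (the re-sorting is an assumption made here for convenience). The function name, the seed parameter and the example diagram are all hypothetical.

```python
import random

def remove_boxes(shape, k, seed=0):
    """Shrink a diagram with rows `shape` down to k boxes by random box removal.

    Each step selects a row with probability proportional to its current
    length and removes one box from it. Rows are re-sorted at the end
    (an assumption of this sketch) so the output is a valid diagram.
    """
    rng = random.Random(seed)
    rows = list(shape)
    for _ in range(sum(shape) - k):
        i = rng.choices(range(len(rows)), weights=rows)[0]
        rows[i] -= 1
    return sorted(rows, reverse=True)

if __name__ == "__main__":
    shape = (30, 20, 10)                     # example diagram with n = 60 boxes
    reduced = remove_boxes(shape, 12, seed=1)
    n, k = sum(shape), sum(reduced)
    print(reduced)
    print([x / n for x in shape], [x / k for x in reduced])
```

Comparing the two normalised shapes printed at the end gives a rough feel for the statement that tracing out approximately preserves the shape of the diagram.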
There have been many applications of the de Finetti theorem to topics including foundational issues [7, 26], mathematical physics [17, 27] and quantum information theory [10, 18, 28–31]; there have also been various generalisations [3–7, 9, 10, 15–17]. We have taken one-and-a-half footsteps along this route. Acknowledgements. We thank Aram Harrow and Andreas Winter for helpful discussions, and Ignacio Cirac and Frank Verstraete for raising the question of how to approximate n-exchangeable states by m-exchangeable states (see end of Sect. IIIC). We also thank the anonymous reviewers for their helpful comments. This work was supported by the EU project RESQ (IST-2001-37559) and the European Commission through the FP6-FET Integrated Project SCALA, CT-015714. MC acknowledges the support of an EPSRC Postdoctoral Fellowship and a Nevile Research Fellowship, which he holds at Magdalene College Cambridge. GM acknowledges support from the project PROSECCO (IST-2001-39227) of the IST-FET programme of the EC. RR was supported by Hewlett Packard Labs, Bristol.
References
1. Okounkov, A., Olshanski, G.: Alg. i Anal. 9, no. 2, 13–146 (1997) (Russian); Eng. in St. Petersburg Math. J. 9, no. 2 (1998)
2. de Finetti, B.: Ann. Inst. H. Poincaré 7, 1 (1937)
3. Størmer, E.: J. Funct. Anal. 3, 48 (1969)
4. Hudson, R.L., Moody, G.R.: Z. Wahrschein. verw. Geb. 33, 343 (1976)
5. Petz, D.: Prob. Th. Rel. Fields 85, 1–11 (1990)
6. Caves, C.M., Fuchs, C.A., Schack, R.: J. Math. Phys. 43, 4537 (2002)
7. Fuchs, C.A., Schack, R.: In: Quantum Estimation Theory, M.G.A. Paris, J. Řeháček (eds), Berlin: Springer, 2004
8. Fuchs, C.A., Schack, R., Scudo, P.F.: Phys. Rev. A 69, 062305 (2004)
9. König, R., Renner, R.: J. Math. Phys. 46, 122108 (2005)
10. Renner, R.: Security of Quantum Key Distribution. PhD thesis, ETH Zurich, 2005, available at http://arxiv.org/list/quant-ph/0512258, 2005
11. Werner, R.F.: Phys. Rev. A 40, 4277 (1989)
12. Carter, R., Segal, G., Macdonald, I.: Lectures on Lie Groups and Lie Algebras. London Mathematical Society Student Texts vol. 32, 1st ed, Cambridge: Cambridge Univ. Press, 1995
13. Perelomov, A.: Generalized coherent states and their application. Texts and Monographs in Physics, Berlin: Springer-Verlag, 1986
14. Weyl, H.: The Theory of Groups and Quantum Mechanics. New York: Dover Publications, Inc., 1950
15. Diaconis, P., Freedman, D.: The Annals of Probability 8, 745 (1980)
16. Fannes, M., Lewis, J.T., Verbeure, A.: Lett. Math. Phys. 15, 255 (1988)
17. Raggio, G.A., Werner, R.F.: Helv. Phys. Acta 62, 980 (1989)
18. Ioannou, L.M.: Deterministic computational complexity of the quantum separability problem. http://arxiv.org/list/quant-ph/0603199; 2006, to appear in QIP, 2006
19. Doherty, A.: Personal communication, 2006
20. Audenaert, K.: Available at http://qols.ph.ic.ac.uk/~kauden/QITNotes_files/irreps.pdf, 2004
21. Fulton, W.F.: Young Tableaux. Cambridge: Cambridge University Press, 1997
22. Eggeling, T., Werner, R.F.: Phys. Rev. A 63, 042111 (2001)
23. Macdonald, I.G.: Symmetric functions and Hall polynomials. Oxford: Clarendon Press, 1979
24. Klyachko, A.: http://arxiv.org/list/quant-ph/0206012, 2002
25. Keyl, M., Werner, R.F.: Phys. Rev. A 64, 052311 (2001)
26. Hudson, R.L.: Found. Phys. 11, 805 (1981)
27. Fannes, M., Spohn, H., Verbeure, A.: J. Math. Phys. 21, 355 (1980)
28. Brun, T.A., Caves, C.M., Schack, R.: Phys. Rev. A 63, 042309 (2001)
29. Doherty, A.C., Parrilo, P.A., Spedalieri, F.M.: Phys. Rev. A 69, 022308 (2004)
30. Audenaert, K.M.R.: In: Proceedings of MTNS2004 (2004), available at http://arxiv.org/list/quant-ph/0402076, 2004
31. Terhal, B.M., Doherty, A.C., Schwab, D.: Phys. Rev. Lett. 90, 157903 (2003)

Communicated by M.B. Ruskai
Commun. Math. Phys. 273, 499–532 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0256-9
Communications in
Mathematical Physics
Universality of a Double Scaling Limit near Singular Edge Points in Random Matrix Models

T. Claeys, M. Vanlessen

Department of Mathematics, Katholieke Universiteit Leuven, Celestijnenlaan 200B, B-3001 Leuven, Belgium. E-mail: [email protected]

Received: 21 July 2006 / Accepted: 9 November 2006
Published online: 8 May 2007 – © Springer-Verlag 2007

Abstract: We consider unitary random matrix ensembles $Z_{n,s,t}^{-1} e^{-n\,\mathrm{tr}\,V_{s,t}(M)}\,dM$ on the space of Hermitian n × n matrices M, where the confining potential $V_{s,t}$ is such that the limiting mean density of eigenvalues (as n → ∞ and s, t → 0) vanishes like a power 5/2 at a (singular) endpoint of its support. The main purpose of this paper is to prove universality of the eigenvalue correlation kernel in a double scaling limit. The limiting kernel is built out of functions associated with a special solution of the $P_I^2$ equation, which is a fourth order analogue of the Painlevé I equation. In order to prove our result, we use the well-known connection between the eigenvalue correlation kernel and the Riemann-Hilbert (RH) problem for orthogonal polynomials, together with the Deift/Zhou steepest descent method to analyze the RH problem asymptotically. The key step in the asymptotic analysis will be the construction of a parametrix near the singular endpoint, for which we use the model RH problem for the special solution of the $P_I^2$ equation. In addition, the RH method allows us to determine the asymptotics (in a double scaling limit) of the recurrence coefficients of the orthogonal polynomials with respect to the varying weights $e^{-nV_{s,t}}$ on $\mathbb{R}$. The special solution of the $P_I^2$ equation pops up in the $n^{-2/7}$-term of the asymptotics.
1. Introduction and Statement of Results

1.1. Unitary random matrix ensembles. On the space $\mathcal{H}_n$ of Hermitian n × n matrices M, we consider for $n \in \mathbb{N}$ and $s, t \in \mathbb{R}$ the unitary random matrix ensemble,
$$\frac{1}{Z_{n,s,t}}\, e^{-n\,\mathrm{tr}\,V_{s,t}(M)}\,dM. \tag{1.1}$$
Here, $Z_{n,s,t}$ is a normalization constant and the confining potential $V_{s,t}$ is a real analytic function, depending on two parameters $s, t \in \mathbb{R}$, satisfying the asymptotic condition,
$$\lim_{x \to \pm\infty} \frac{V_{s,t}(x)}{\log(x^2 + 1)} = +\infty, \quad \text{uniformly for } s, t \in [-\delta_0, \delta_0] \text{ for some } \delta_0 > 0. \tag{1.2}$$
Then,
$$Z_{n,s,t} = \int_{\mathcal{H}_n} e^{-n\,\mathrm{tr}\,V_{s,t}(M)}\,dM$$
is convergent as n → ∞ so that the random matrix model is well-defined. It is well-known, see e.g. [25], that an important role in the study of the unitary random matrix ensemble (1.1) is played by the following scalar 2-point (correlation) kernel,
$$K_n^{(s,t)}(x, y) = e^{-\frac{n}{2} V_{s,t}(x)}\, e^{-\frac{n}{2} V_{s,t}(y)} \sum_{k=0}^{n-1} p_k^{(n,s,t)}(x)\, p_k^{(n,s,t)}(y), \tag{1.3}$$
constructed out of the orthonormal polynomials
$$p_k^{(n,s,t)}(x) = \kappa_k^{(n,s,t)} x^k + \cdots, \qquad \kappa_k^{(n,s,t)} > 0,$$
with respect to the varying weights $e^{-nV_{s,t}}$ on $\mathbb{R}$. Indeed, the correlations between the eigenvalues of M can be written in terms of the correlation kernel. More precisely, the m-point correlation function $\mathcal{R}_{n,m}^{(s,t)}$ satisfies [25],
$$\mathcal{R}_{n,m}^{(s,t)}(x_1, \ldots, x_m) = \det\left( K_n^{(s,t)}(x_i, x_j) \right)_{1 \le i, j \le m}. \tag{1.4}$$
Further, the limiting mean eigenvalue distribution $\mu_{s,t}$ has a density $\rho_{s,t}$ which can be retrieved from the correlation kernel as follows:
$$\rho_{s,t}(x) = \lim_{n \to \infty} \frac{1}{n} K_n^{(s,t)}(x, x). \tag{1.5}$$
The limiting mean eigenvalue distribution $\mu_{s,t}$ equals [10] the equilibrium measure in external field $V_{s,t}$. This is the unique measure minimizing the logarithmic energy [28]
$$I_{V_{s,t}}(\mu) = \iint \log\frac{1}{|x - y|}\,d\mu(x)\,d\mu(y) + \int V_{s,t}(y)\,d\mu(y), \tag{1.6}$$
among all probability measures $\mu$ on $\mathbb{R}$. Furthermore, there exists a real analytic function $q_{s,t}$, such that [9],
$$\rho_{s,t}(x) = \frac{1}{\pi} \sqrt{q_{s,t}^-(x)}, \tag{1.7}$$
where $q_{s,t}^-$ denotes the negative part of $q_{s,t}$, i.e. $q_{s,t} = q_{s,t}^+ - q_{s,t}^-$, with $q_{s,t}^\pm \ge 0$ and $q_{s,t}^+ q_{s,t}^- = 0$. Due to condition (1.2) we have that $q_{s,t}(x) \to +\infty$ as $x \to \pm\infty$, so that $\mu_{s,t}$ is supported on a finite union of intervals, which we denote by $S_{s,t}$. It is known
[28] that the equilibrium measure $\mu_{s,t}$ satisfies the following Euler-Lagrange variational conditions: there exists a constant $\kappa_{s,t} \in \mathbb{R}$ such that
$$2 \int \log|x - u|\,d\mu_{s,t}(u) - V_{s,t}(x) = \kappa_{s,t}, \quad \text{for } x \in S_{s,t}, \tag{1.8}$$
$$2 \int \log|x - u|\,d\mu_{s,t}(u) - V_{s,t}(x) \le \kappa_{s,t}, \quad \text{for } x \in \mathbb{R} \setminus S_{s,t}. \tag{1.9}$$
The external field $V_{s,t}$ is called regular if strict inequality in (1.9) holds, if the density $\rho_{s,t}$ does not vanish in the interior of the support $S_{s,t}$, and if $q_{s,t}$ has a simple zero at each of the endpoints of the support $S_{s,t}$. If one of these conditions is not valid, $V_{s,t}$ is called singular. The singular points $x^*$ are classified as follows, see [10, 21]:
(i) $x^* \in \mathbb{R} \setminus S_{s,t}$ is a type I singular point if equality in (1.9) holds. Then, $x^*$ is a zero of $q_{s,t}^+$ of multiplicity 4m with $m \in \mathbb{N}$.
(ii) $x^* \in S_{s,t}$ is a type II singular point if it is an interior point of $S_{s,t}$ where the equilibrium density $\rho_{s,t}$ vanishes. Then, $x^*$ is a zero of $q_{s,t}^-$ of multiplicity 4m.
(iii) $x^*$ is a type III singular point if it is an endpoint of the support $S_{s,t}$ and a zero of $q_{s,t}$ of multiplicity larger than one. Then, $x^*$ is a zero of $q_{s,t}$ of multiplicity 4m + 1, which means that $\rho_{s,t}(x) \sim c|x - x^*|^{(4m+1)/2}$.

In this paper, we consider external fields $V_{s,t}$ which are such that in the critical case s = t = 0, $V_0 = V_{0,0}$ has a type III singular (edge) point $x^*$ with m = 1, i.e.
$$\rho_{0,0}(x) \sim c|x - x^*|^{5/2}, \quad \text{as } x \to x^*. \tag{1.10}$$
Further, we take $V_{s,t}$ of the special form,
$$V_{s,t} = V_0 + sV_1 + tV_2, \tag{1.11}$$
where $V_1$ is an arbitrary real analytic function, while $V_2$ is real analytic and in addition satisfies some critical condition which we will specify in Sect. 1.4 below.

1.2. Universality in random matrix theory. Consider for now unitary random matrix ensembles $Z_n^{-1} e^{-n\,\mathrm{tr}\,V(M)}\,dM$ on the space of Hermitian n × n matrices M. Scaling limits of the associated correlation kernel $K_n$ show universal behavior. Near regular points, universality results have been established in [1, 8, 10, 11, 27]. For example, if $x^*$ lies in the bulk of the spectrum (i.e. $x^*$ is such that it lies in the interior of the support S of the equilibrium measure in external field V, and such that the equilibrium density $\rho$ does not vanish at $x^*$) there is a constant c such that
$$\lim_{n \to \infty} \frac{1}{cn} K_n\left( x^* + \frac{u}{cn},\, x^* + \frac{v}{cn} \right) = \frac{\sin \pi(u - v)}{\pi(u - v)}. \tag{1.12}$$
On the other hand, if $x^*$ is a regular edge point of the spectrum (i.e. $x^*$ is an endpoint of S and $\rho$ vanishes like a square root at $x^*$), there is a constant c such that
$$\lim_{n \to \infty} \frac{1}{cn^{2/3}} K_n\left( x^* + \frac{u}{cn^{2/3}},\, x^* + \frac{v}{cn^{2/3}} \right) = \frac{\mathrm{Ai}(u)\mathrm{Ai}'(v) - \mathrm{Ai}(v)\mathrm{Ai}'(u)}{u - v}, \tag{1.13}$$
where Ai is the Airy function.
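To make the bulk limit (1.12) concrete, the following sketch evaluates the sine kernel on the right-hand side of (1.12) and the corresponding limiting 2-point function obtained from the determinantal formula (1.4) with m = 2. The function names and the small-argument cutoff are choices made for this illustration; the Airy kernel in (1.13) is not implemented because it would need a special-function library.

```python
import math

def sine_kernel(u, v):
    """Right-hand side of (1.12): sin(pi (u - v)) / (pi (u - v))."""
    x = math.pi * (u - v)
    if abs(x) < 1e-8:
        # Taylor expansion avoids 0/0 on the diagonal.
        return 1.0 - x * x / 6.0
    return math.sin(x) / x

def two_point_function(u, v):
    """2 x 2 determinant of the sine kernel, cf. the structure of (1.4)."""
    return (sine_kernel(u, u) * sine_kernel(v, v)
            - sine_kernel(u, v) * sine_kernel(v, u))

if __name__ == "__main__":
    for gap in (0.0, 0.25, 0.5, 1.0):
        print(gap, sine_kernel(gap, 0.0), two_point_function(gap, 0.0))
```

The 2-point function vanishes as the two arguments approach each other, which is the familiar level repulsion of eigenvalues in the bulk.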
Near singular points, similar results hold. In those singular cases it is interesting to consider double scaling limits where the external field V depends on additional parameters. In [2, 5, 6, 29], an external field V was considered such that there is a type II singular (interior) point $x^*$ with m = 1, i.e.
$$\rho(x) \sim c(x - x^*)^2, \quad \text{as } x \to x^*.$$
If an additional parameter is included in the external field, $V_t = V/t$, one observes for t close to 1 the transition where two intervals in the support of the limiting mean density of eigenvalues merge to one interval through the critical case of a type II singular point. In the double scaling limit where n → ∞ and t → 1 in such a way that $c_0 n^{2/3}(t - 1) \to s \in \mathbb{R}$ for some appropriately chosen constant $c_0$, there exists a constant c such that (for the associated correlation kernel $K_{n,t}$),
$$\lim \frac{1}{cn^{1/3}} K_{n,t}\left( x^* + \frac{u}{cn^{1/3}},\, x^* + \frac{v}{cn^{1/3}} \right) = K^{\mathrm{crit,II}}(u, v; s).$$
Here, $K^{\mathrm{crit,II}}(u, v; s)$ is built out of functions associated with the Hastings-McLeod solution [18] of the second Painlevé equation.

The main purpose of this paper is to obtain, for the random matrix models in Sect. 1.1 above, a similar result near the type III singular (edge) point of $V_0$ with m = 1. We take a double scaling limit (n → ∞ and s, t → 0), and the limiting kernel $K^{\mathrm{crit,III}}$ will be built out of functions which are associated with a special solution of the fourth order analogue of the Painlevé I equation. The case of a type III singular (edge) point was also studied in the physics literature [3, 4]. In addition, the techniques that we use to prove this allow us to determine the asymptotics (in a double scaling limit) of the recurrence coefficients in the three-term recurrence relation satisfied by the orthogonal polynomials $p_k^{(n,s,t)}$ with respect to the varying weights $e^{-nV_{s,t}}$ on $\mathbb{R}$.

1.3. $\Psi$-functions associated with a special solution of the $P_I^2$ equation. We consider the following differential equation for y = y(s, t), which we denote as the $P_I^2$ equation,
$$s = ty - \left( \frac{1}{6} y^3 + \frac{1}{24}\left( y_s^2 + 2 y y_{ss} \right) + \frac{1}{240} y_{ssss} \right). \tag{1.14}$$
For t = 0, this equation is the second member in the Painlevé I hierarchy [20, 23]. The $P_I^2$ equation has been studied for example in [4, 19, 26] (for t = 0) and [7, 15] (for general t). The Lax pair for the $P_I^2$ equation is the linear system of differential equations
$$\frac{\partial \Psi}{\partial \zeta} = U \Psi, \qquad \frac{\partial \Psi}{\partial s} = W \Psi, \tag{1.15}$$
where
$$U = \frac{1}{240} \begin{pmatrix} -4 y_s \zeta - (12 y y_s + y_{sss}) & 8\zeta^2 + 8 y \zeta + (12 y^2 + 2 y_{ss} - 120 t) \\ U_{21} & 4 y_s \zeta + (12 y y_s + y_{sss}) \end{pmatrix}, \tag{1.16}$$
$$U_{21} = 8\zeta^3 - 8 y \zeta^2 - (4 y^2 + 2 y_{ss} + 120 t)\zeta + (16 y^3 - 2 y_s^2 + 4 y y_{ss} + 240 s), \tag{1.17}$$
and
$$W = \begin{pmatrix} 0 & 1 \\ \zeta - 2y & 0 \end{pmatrix}. \tag{1.18}$$
The system of differential equations (1.15)–(1.18) can only be solvable if y = y(s, t) is a solution to the $P_I^2$ equation (1.14). For different solutions y, we have different Lax pairs. We are interested in the special solution y which was studied in [4, 7, 15]. This solution y = y(s, t) is characterized by the vanishing of its Stokes multipliers $s_1$, $s_2$, $s_5$, and $s_6$, see [19] for details. It was shown in [7] that y has no poles for real s and t, and that it has, for fixed $t \in \mathbb{R}$, the following asymptotic behavior:
$$y(s, t) = \mp (6|s|)^{1/3} \mp \frac{1}{3}\, 6^{2/3}\, t\, |s|^{-1/3} + O(|s|^{-1}), \quad \text{as } s \to \pm\infty. \tag{1.19}$$
It has been shown in [26, App. A] that for t = 0, y is uniquely determined by realness and asymptotic condition (1.19). For general t we are not aware of a similar result although it is supported by a conjecture of Dubrovin [15] that this should hold for general t. For $s, t \in \mathbb{R}$, the Lax pair (1.15)–(1.18) associated with this special choice of y has a unique solution $\begin{pmatrix} \Phi_1 \\ \Phi_2 \end{pmatrix}$ for which the following limit holds, see [7, 19],
$$\zeta^{\frac{1}{4}\sigma_3} \begin{pmatrix} \Phi_1(\zeta; s, t) \\ \Phi_2(\zeta; s, t) \end{pmatrix} e^{\theta(\zeta; s, t)} \longrightarrow \frac{1}{\sqrt{2}}\, e^{-\frac{1}{4}\pi i} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \text{as } \zeta \to \infty \text{ with } 0 < \operatorname{Arg} \zeta < 6\pi/7, \tag{1.20}$$
where $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ denotes the third Pauli-matrix, and where $\theta$ is given by
$$\theta(\zeta; s, t) = \frac{1}{105} \zeta^{7/2} - \frac{1}{3} t \zeta^{3/2} + s \zeta^{1/2}. \tag{1.21}$$
The functions $\Phi_1$ and $\Phi_2$ will appear below in the universal limiting correlation kernel near type III singular (edge) points of $V_0$ with m = 1.

1.4. Statement of results. We work under the following assumptions.

Assumptions 1.1.
(i) We consider external fields $V_{s,t}$ of the form
$$V_{s,t} = V_0 + sV_1 + tV_2, \tag{1.22}$$
where $V_0$, $V_1$, and $V_2$ are real analytic and are such that there exists a $\delta_0 > 0$ such that the following holds
$$\lim_{|x| \to \infty} \frac{V_{s,t}(x)}{\log(x^2 + 1)} = +\infty, \quad \text{uniformly for } s, t \in [-\delta_0, \delta_0]. \tag{1.23}$$
(ii) $V_0$ is such that the equilibrium measure $\nu_0$ in external field $V_0$ is supported on one single interval $[a, b] \subset \mathbb{R}$, and b is a type III singular (edge) point of $V_0$ with m = 1. Then, $\nu_0$ is of the form [9],
$$d\nu_0(x) = \frac{1}{2\pi} h_0(x) \sqrt{(b - x)(x - a)}\, \chi_{[a,b]}(x)\,dx, \tag{1.24}$$
with $\chi_{[a,b]}$ the indicator function of the set [a, b], and with $h_0$ real analytic and satisfying,
$$h_0(b) = h_0'(b) = 0, \quad \text{and} \quad h_0''(b) > 0. \tag{1.25}$$
Furthermore, we assume that $V_0$ has no other singular points besides b. In particular, a is a regular (edge) point and we then have that
$$h_0(a) > 0. \tag{1.26}$$
(iii) $V_2$ is such that it satisfies the critical condition
$$\int_a^b \sqrt{\frac{u - a}{b - u}}\, V_2'(u)\,du = 0. \tag{1.27}$$
Throughout the rest of this paper we let $\mathcal{V}$ be the neighborhood of the real line where $V_0$, $V_1$, $V_2$, and $h_0$ are analytic.

Example 1.2. The assumptions above are valid for the particular example where $V_0$, $V_1$, and $V_2$ are given by
$$V_0(x) = \frac{1}{20} x^4 - \frac{4}{15} x^3 + \frac{1}{5} x^2 + \frac{8}{5} x, \qquad V_1(x) = x, \qquad V_2(x) = x^3 - 6x. \tag{1.28}$$
Then, the equilibrium measure $\nu_0$ is supported on the interval [−2, 2] and given by
$$d\nu_0(x) = \frac{1}{10\pi} (x + 2)^{1/2} (2 - x)^{5/2} \chi_{[-2,2]}(x)\,dx. \tag{1.29}$$
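Two claims in Example 1.2 are easy to check numerically: that the density in (1.29), read as $\frac{1}{10\pi}(x+2)^{1/2}(2-x)^{5/2}$ on [−2, 2], has total mass one, and that $V_2(x) = x^3 - 6x$ satisfies the critical condition (1.27) with a = −2, b = 2. The sketch below does both with a plain midpoint rule after the substitution $x = 2\cos\theta$, which removes the endpoint singularity in (1.27). The function names, the substitution and the number of quadrature steps are choices made for this illustration only.

```python
import math

def midpoint(f, lo, hi, steps=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def total_mass():
    """Integral of the density (1.29) over [-2, 2].

    With x = 2 cos(theta): (x+2)^{1/2} (2-x)^{5/2} dx
    becomes 16 sin^2(theta) (1 - cos(theta))^2 dtheta on [0, pi].
    """
    g = lambda th: 16.0 * math.sin(th) ** 2 * (1.0 - math.cos(th)) ** 2
    return midpoint(g, 0.0, math.pi) / (10.0 * math.pi)

def critical_integral():
    """Left-hand side of (1.27) for V2(x) = x^3 - 6x on [a, b] = [-2, 2].

    With u = 2 cos(theta): sqrt((u+2)/(2-u)) du
    becomes 2 (1 + cos(theta)) dtheta on [0, pi].
    """
    dv2 = lambda u: 3.0 * u * u - 6.0          # V2'(u)
    g = lambda th: 2.0 * (1.0 + math.cos(th)) * dv2(2.0 * math.cos(th))
    return midpoint(g, 0.0, math.pi)

if __name__ == "__main__":
    print("mass of nu_0:", total_mass())
    print("critical integral:", critical_integral())
```

Both quantities can also be obtained in closed form from the same substitution, so the numerical check mainly guards against transcription errors in the coefficients of (1.28) and (1.29).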
It should be noted that a type III singular (edge) point cannot occur when $V_0$ is a polynomial of degree lower than 4.

Example 1.3. In the continuum limit of the Toda lattice [12], an external field of the form $V_{t_1,t_2}(x) = (1 + t_1)(V_0(x) + t_2 x)$ has been studied. This deformation of $V_0$ can be written in the form (1.22) (so that it is included in the class of external fields studied in this paper). Indeed, if we let $V_1(x) = x$ and $V_2(x) = V_0(x) + cx$, with c some constant chosen such that the critical condition (1.27) holds, then $V_{t_1,t_2} = V_0 + sV_1 + tV_2$, with $s = t_2 + t_1 t_2 - c t_1$ and $t = t_1$.

Remark 1.4. In Sect. 2 we will show that assumption (iii) is equivalent to the vanishing of the equilibrium density $\frac{d\nu_2(x)}{dx}$ at the right endpoint b, where $\nu_2$ is the unique measure which minimizes $I_{V_2}(\nu)$, see (1.6), among all signed measures $\nu$, supported on [a, b] and having zero mass, $\nu([a, b]) = 0$.

Remark 1.5. The case where the left (instead of the right) endpoint of the support is singular can be transformed to our case by considering the external field $V_{s,t}(-x)$.
Remark 1.6. Without giving any mathematical details, we now describe the transitions that can occur for s and t near 0. First, if we let t = 0 and s vary around 0, one typically observes the transition from the regular one-interval case to the singular case and back to the regular one-interval case. Next, for s = 0 and t around 0, we can observe the transition from the regular one-interval case to the regular two-interval case. Finally, letting both s and t vary around 0, we can observe one of the above described transitions, or the critical transition where a type II singular point moves to the endpoint b, where it becomes a type III singular point before moving on as a type I singular point.

Further, to describe our results, we have to introduce constants c, $c_1$, and $c_2$,
$$c = \left( \frac{15}{2}\, h_0''(b) \sqrt{b - a} \right)^{2/7} > 0, \qquad c_1 = \frac{h_1(b)}{c^{1/2} (b - a)^{1/2}}, \qquad c_2 = -\frac{h_2'(b)}{c^{3/2} (b - a)^{1/2}}, \tag{1.30}$$
where $h_0$ is the real analytic function appearing in (1.24), and where the functions $h_1$ and $h_2$ are defined as,
$$h_j(x) = \frac{1}{\pi}\, \mathrm{P.V.}\!\int_a^b \sqrt{(b - u)(u - a)}\, V_j'(u)\, \frac{du}{u - x}, \quad \text{for } x \in [a, b] \text{ and } j = 1, 2. \tag{1.31}$$

1.4.1. Universality of the double scaling limit. Our main result is the following.

Theorem 1.7. Let $V_{s,t} = V_0 + sV_1 + tV_2$ be such that Assumptions 1.1 above are satisfied. We take a double scaling limit where we let n → ∞ and at the same time s, t → 0, in such a way that $\lim n^{6/7} s$ and $\lim n^{4/7} t$ exist, and put
$$s_0 = c_1 \cdot \lim n^{6/7} s \in \mathbb{R}, \qquad t_0 = c_2 \cdot \lim n^{4/7} t \in \mathbb{R}, \tag{1.32}$$
where the constants $c_1$ and $c_2$ are defined by (1.30). Then, the 2-point kernel $K_n^{(s,t)}$ satisfies the following universality result:
$$\lim \frac{1}{cn^{2/7}} K_n^{(s,t)}\left( b + \frac{u}{cn^{2/7}},\, b + \frac{v}{cn^{2/7}} \right) = K^{\mathrm{crit,III}}(u, v; s_0, t_0), \tag{1.33}$$
uniformly for u, v in compact subsets of $\mathbb{R}$. Here, $K^{\mathrm{crit,III}}$ is built out of the functions $\Phi_1$ and $\Phi_2$ defined in Sect. 1.3,
$$K^{\mathrm{crit,III}}(u, v; s, t) = \frac{\Phi_1(u; s, t)\Phi_2(v; s, t) - \Phi_1(v; s, t)\Phi_2(u; s, t)}{-2\pi i (u - v)}. \tag{1.34}$$
Remark 1.8. Since $y(s,t)$ has no poles [7] for $s, t \in \mathbb{R}$, the kernel $K^{crit,III}(u, v; s, t)$ exists for all real $u$, $v$, $s$, and $t$. Furthermore, using a similar argument as in [7, Lemma 2.3 (ii)], one can show that $e^{\pi i/4}\Phi_1$ and $e^{\pi i/4}\Phi_2$ are real. It then follows that $K^{crit,III}(u, v; s, t)$ is real for real $u$, $v$, $s$, and $t$.

Remark 1.9. It is possible to give an integral formula for $K^{crit,III}$. Using the fact that $(\Phi_1, \Phi_2)^T$ satisfies the second differential equation of the Lax pair (1.15), we have that

$\frac{\partial \Phi_1}{\partial s}(\zeta; s, t) = \Phi_2(\zeta; s, t)$, and $\frac{\partial \Phi_2}{\partial s}(\zeta; s, t) = (\zeta - 2y(s,t))\,\Phi_1(\zeta; s, t)$.
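The step from these two identities to the $s$-derivative of the kernel is a short computation. The following sketch spells it out, abbreviating $\Phi_j(\zeta) = \Phi_j(\zeta; s, t)$ and $y = y(s,t)$, and using only (1.34) and the two identities above:

```latex
\begin{align*}
\frac{\partial}{\partial s}\bigl[\Phi_1(u)\Phi_2(v) - \Phi_1(v)\Phi_2(u)\bigr]
  &= \Phi_2(u)\Phi_2(v) + (v - 2y)\,\Phi_1(u)\Phi_1(v) \\
  &\quad - \Phi_2(v)\Phi_2(u) - (u - 2y)\,\Phi_1(v)\Phi_1(u) \\
  &= -(u - v)\,\Phi_1(u)\Phi_1(v).
\end{align*}
```

Dividing by $-2\pi i(u - v)$, as in (1.34), leaves $\frac{1}{2\pi i}\Phi_1(u)\Phi_1(v)$; the terms containing $y$ cancel, so the derivative involves $\Phi_1$ only.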
T. Claeys, M. Vanlessen
Using (1.34) this yields

$\frac{\partial}{\partial s} K^{crit,III}(u, v; s, t) = \frac{1}{2\pi i}\,\Phi_1(u; s, t)\Phi_1(v; s, t)$.

Now, since $\lim_{s \to -\infty} K^{crit,III}(u, v; s, t) = 0$, which can be shown using a Deift/Zhou steepest descent argument [13], it follows that $K^{crit,III}$ has the following integral formula,

$K^{crit,III}(u, v; s, t) = \frac{1}{2\pi i}\int_{-\infty}^{s} \Phi_1(u; \sigma, t)\Phi_1(v; \sigma, t)\,d\sigma$. (1.35)

Remark 1.10. Theorem 1.7 can be generalized to the case where the support of $\nu_0$ (the equilibrium measure in external field $V_0$) consists of more than one interval. Then, the proof becomes much more technical, although the main ideas remain the same. We comment in Remark 3.8 on the modifications that have to be made in the multi-interval case.

1.4.2. Recurrence coefficients for orthogonal polynomials. It is well-known [30] that the orthonormal polynomials $p_k = p_k^{(n,s,t)}$ satisfy a three-term recurrence relation of the form

$x p_k(x) = a_{k+1} p_{k+1}(x) + b_k p_k(x) + a_k p_{k-1}(x)$, (1.36)

where $a_k = a_k^{(n,s,t)} > 0$ and $b_k = b_k^{(n,s,t)} \in \mathbb{R}$ (we suppress the $s$ and $t$ dependence for brevity). In the generic case where $V_0$ has no singular points, the recurrence coefficients for $s = t = 0$ have the following asymptotics, see e.g. [2, 8]:

$a_n^{(n,0,0)} = \frac{b-a}{4} + O(n^{-1}), \qquad b_n^{(n,0,0)} = \frac{b+a}{2} + O(n^{-1})$, as $n \to \infty$. (1.37)
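The constant terms in (1.37) can be illustrated numerically in a regular case. The sketch below is not part of the paper: it assumes the Gaussian field $V(x) = x^2/2$, whose equilibrium measure is supported on $[-2, 2]$, so that $(b-a)/4 = 1$ and $(b+a)/2 = 0$, and it computes the recurrence coefficients with a discretized Stieltjes procedure. The grid bounds and sizes are ad hoc choices.

```python
import math

def recurrence_coefficients(V, n, kmax, lo=-6.0, hi=6.0, m=6001):
    """Stieltjes procedure for the weight exp(-n V(x)), discretized on a uniform grid.

    Returns lists a, b such that x p_k = a[k+1] p_{k+1} + b[k] p_k + a[k] p_{k-1},
    the convention of (1.36); a[0] is a dummy entry equal to 0.
    """
    h = (hi - lo) / (m - 1)
    xs = [lo + i * h for i in range(m)]
    w = [h * math.exp(-n * V(x)) for x in xs]      # rectangle-rule quadrature weights
    p_prev = [0.0] * m
    p = [1.0 / math.sqrt(sum(w))] * m              # orthonormal p_0
    a, b = [0.0], []
    for k in range(kmax + 1):
        bk = sum(wi * xi * pi * pi for wi, xi, pi in zip(w, xs, p))
        b.append(bk)
        # q = a_{k+1} p_{k+1}, obtained from the recurrence itself
        q = [(xi - bk) * pi - a[k] * pp for xi, pi, pp in zip(xs, p, p_prev)]
        a_next = math.sqrt(sum(wi * qi * qi for wi, qi in zip(w, q)))
        a.append(a_next)
        p_prev, p = p, [qi / a_next for qi in q]
    return a, b

n = 10
a, b = recurrence_coefficients(lambda x: x * x / 2, n, n)
print(a[n], b[n])   # compare with (b - a)/4 = 1 and (b + a)/2 = 0 for the support [-2, 2]
```

For this particular weight the polynomials are rescaled Hermite polynomials and $a_k = \sqrt{k/n}$ holds exactly, so the $O(n^{-1})$ correction happens to vanish; a non-Gaussian regular field would show a genuine $O(n^{-1})$ deviation.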
For singular potentials $V_0$, the constant terms in the expansions (1.37) remain the same, but the error terms behave differently [2, 6]. In our case of interest, where we have a type III singular (edge) point of $V_0$ with $m = 1$, the error term is of order $O(n^{-2/7})$, and the coefficient of the $n^{-2/7}$ term is expressed in terms of the special solution $y$ of the $P_I^2$ equation discussed in Sect. 1.3.

Theorem 1.11. Let $V_{s,t}$ be such that Assumptions 1.1 above are satisfied. Consider the three-term recurrence relation (1.36) satisfied by the orthonormal polynomials $p_k = p_k^{(n,s,t)}$ with respect to the weight function $e^{-nV_{s,t}}$. Then, in the double scaling limit where $n \to \infty$ and $s, t \to 0$ in such a way that $\lim n^{6/7}s$ and $\lim n^{4/7}t$ exist, and with

$s_0 = c_1 \cdot \lim n^{6/7}s \in \mathbb{R}, \qquad t_0 = c_2 \cdot \lim n^{4/7}t \in \mathbb{R}$, (1.38)

where $c_1$ and $c_2$ are given by (1.30), we have

$a_n^{(n,s,t)} = \frac{b-a}{4} + \frac{1}{2c}\, y(c_1 n^{6/7}s, c_2 n^{4/7}t)\, n^{-2/7} + O(n^{-3/7}) = \frac{b-a}{4} + \frac{1}{2c}\, y(s_0, t_0)\, n^{-2/7}(1 + o(1))$, (1.39)
and

$b_n^{(n,s,t)} = \frac{b+a}{2} + \frac{1}{c}\, y(c_1 n^{6/7}s, c_2 n^{4/7}t)\, n^{-2/7} + O(n^{-3/7}) = \frac{b+a}{2} + \frac{1}{c}\, y(s_0, t_0)\, n^{-2/7}(1 + o(1))$, (1.40)
where the constant $c$ is given by (1.30), and where $y$ is the special solution of the $P_I^2$ equation discussed in Sect. 1.3.

Remark 1.12. Note that the expansions of the recurrence coefficients are of the same form as the expansions for solutions of perturbed hyperbolic equations conjectured by Dubrovin [15, Main Conjecture, Part 3], see also [14]. Here, the perturbation parameter plays the role of $1/n$ in our context.

Remark 1.13. For polynomials which are orthogonal on certain complex contours, it can occur that the equilibrium density vanishes like a power 3/2. Asymptotics of the recurrence coefficients in this case were obtained in [16]. Here, a special solution of the Painlevé I equation occurs instead of a solution of the $P_I^2$ equation and the asymptotics are in powers of $n^{-1/5}$. Observe further that in [16] there is no term of order $n^{-1/5}$ in the asymptotics. In (1.39) and (1.40) we see that there is no term of order $n^{-1/7}$. In the proof of Theorem 1.11 this term will drop out in a similar way as the $n^{-1/5}$-term in [16].

1.5. Outline of the rest of the paper. We prove our results by characterizing the orthogonal polynomials via the well-known 2 × 2 matrix valued Fokas-Its-Kitaev Riemann-Hilbert (RH) problem [17] and applying the Deift/Zhou steepest descent method [13] to analyze this RH problem asymptotically. This approach has been used many times before, see e.g. [5, 6, 8, 10, 11, 16, 22, 31, 32].

An important step in the Deift/Zhou steepest descent method is the construction of so-called g-functions associated with equilibrium measures. Those equilibrium measures will be constructed in Sect. 2. In order to deal with the deformations $V_{s,t}$ of $V_0$, we use modified equilibrium problems where we allow the measures to be negative, which was also done in [5, 6, 16]. Another modification of the equilibrium problem is that we choose the support of the equilibrium measure fixed, instead of allowing it to choose its own support.

In Sect. 3, we apply the Deift/Zhou steepest descent analysis to the RH problem for $Y$ for orthogonal polynomials. Via a series of transformations $Y \to T \to S \to R$ we want to arrive at a RH problem for $R$ which is normalized at infinity (i.e. $R(z) \to I$ as $z \to \infty$) and with jumps uniformly close to the identity matrix. Then, $R$ itself is close to the identity matrix. By unfolding the series of transformations we then get the asymptotics of $Y$. The key step in this method will be the local analysis near the endpoints $a$ and $b$. Near the regular endpoint $a$, we construct (in Sect. 3.5) a parametrix built out of Airy functions. Due to the modified equilibrium measures, which have a fixed support, we also need to make a technical modification in the construction of the Airy parametrix, compared with the parametrix as used e.g. in [8]. To construct the local parametrix near the singular endpoint $b$ (in Sect. 3.6) we use a model RH problem associated with the special solution $y$ of the $P_I^2$ equation as discussed in Sect. 1.3.

The results of Sect. 3 will be used in Sect. 4 to prove the universality result for the correlation kernel (see Theorem 1.7) and in Sect. 5 to determine the asymptotics of the recurrence coefficients (see Theorem 1.11).
2. Equilibrium Measures

We consider external fields $V_{s,t} = V_0 + sV_1 + tV_2$ which satisfy Assumptions 1.1 in the beginning of Sect. 1.4. In order to apply the Deift/Zhou steepest descent analysis to the RH problem for orthogonal polynomials one would expect to use the equilibrium measure $\mu_{s,t}$ in external field $V_{s,t}$ minimizing $I_{V_{s,t}}(\mu)$, see (1.6), among all probability measures $\mu$ on $\mathbb{R}$. However, as in [5, 6, 16] it will be more convenient to use modified equilibrium measures $\nu_{s,t}$ which we allow to be negative. Furthermore, unlike in [5, 6, 16], we take the support of the measures $\nu_{s,t}$ to be fixed instead of letting it depend on $s$ and $t$.

The aim of this section is to find measures $\nu_{s,t}$ (depending on the parameters $s, t \in \mathbb{R}$) supported on the interval $[a, b] \subset \mathbb{R}$ (where $[a, b]$ is the support of the equilibrium measure $\nu_0$ in external field $V_0$), such that $\nu_{s,t}([a, b]) = 1$, and such that they satisfy the following condition: there exist $\ell_{s,t} \in \mathbb{R}$ such that for every $\delta > 0$ there are $\varepsilon, \kappa > 0$ sufficiently small such that for $s, t \in [-\varepsilon, \varepsilon]$,

$2\int \log|x - u|\,d\nu_{s,t}(u) - V_{s,t}(x) = \ell_{s,t}$, for $x \in [a, b]$, (2.1)

$2\int \log|x - u|\,d\nu_{s,t}(u) - V_{s,t}(x) < \ell_{s,t} - \kappa$, for $x \in \mathbb{R} \setminus [a - \delta, b + \delta]$. (2.2)

We seek $\nu_{s,t}$ in the following form:

$\nu_{s,t} = \nu_0 + s\nu_1 + t\nu_2$, (2.3)

where $\nu_0$ is the equilibrium measure in external field $V_0$ minimizing $I_{V_0}(\nu)$, see (1.6), among all probability measures $\nu$ on $\mathbb{R}$. From Assumption 1.1 (ii) we know that $\nu_0$ can be written as follows:

$d\nu_0(x) = \psi_{0,+}(x)\chi_{[a,b]}(x)\,dx$, (2.4)

where $\chi_{[a,b]}$ is the indicator function of the set $[a, b]$, and where $\psi_{0,+}$ is the +boundary value of the function

$\psi_0(z) = \frac{1}{2\pi i} R(z) h_0(z)$, for $z \in \mathcal{V} \setminus [a, b]$, (2.5)

with $h_0$ analytic in the neighborhood $\mathcal{V}$ of the real line, and with

$R(z) = \bigl((z - a)(z - b)\bigr)^{1/2}$, for $z \in \mathbb{C} \setminus [a, b]$. (2.6)

Here, we take the principal branch of the square root so that $R$ is analytic in $\mathbb{C} \setminus [a, b]$. Further, since $a$ is a regular (edge) point and since $b$ is a type III singular (edge) point with $m = 1$, we have, cf. (1.25) and (1.26),

$h_0(a) > 0, \qquad h_0(b) = h_0'(b) = 0, \qquad \text{and} \qquad h_0''(b) > 0$. (2.7)

Since $V_0$ is assumed to have no other singular points besides $b$, we know (cf. (1.8) and (1.9)) that $\nu_0$ satisfies the following condition: there exists $\ell_0 \in \mathbb{R}$ such that

$2\int \log|x - u|\,d\nu_0(u) - V_0(x) = \ell_0$, for $x \in [a, b]$, (2.8)

$2\int \log|x - u|\,d\nu_0(u) - V_0(x) < \ell_0$, for $x \in \mathbb{R} \setminus [a, b]$. (2.9)
We will now construct the two measures $\nu_1$ and $\nu_2$. In order to do this we introduce the following auxiliary (analytic) functions:

$h_j(z) = \frac{1}{2\pi i}\oint_\gamma R(\xi) V_j'(\xi)\,\frac{d\xi}{\xi - z}$, for $z \in \mathcal{V}$ and $j = 1, 2$, (2.10)

where $\gamma$ is a positively oriented contour in $\mathcal{V}$ with $[a, b]$ and $z$ in its interior, and where $R$ is given by (2.6). Observe that, using the fractional residue theorem, one has

$h_j(x) = -\frac{1}{\pi i}\int_a^b R_+(u) V_j'(u)\,\frac{du}{u - x}$, for $x \in [a, b]$, (2.11)

where the integral is a Cauchy principal value integral. So, $h_j$ is real on $[a, b]$. Observe that by Assumption 1.1 (iii) and (2.11),

$h_2(b) = 0$. (2.12)

Lemma 2.1. Define two signed measures $\nu_1$ and $\nu_2$ supported on $[a, b]$ as

$d\nu_j(x) = \psi_{j,+}(x)\chi_{[a,b]}(x)\,dx, \qquad j = 1, 2$, (2.13)

where $\chi_{[a,b]}$ is the indicator function of the set $[a, b]$, and where $\psi_{j,+}$ is the +boundary value of the function

$\psi_j(z) = \frac{1}{2\pi i}\,\frac{h_j(z)}{R(z)}$, for $z \in \mathcal{V} \setminus [a, b]$. (2.14)

Here, $h_j$ is given by (2.10), see also (2.11) for its expression on $[a, b]$, and $R$ is given by (2.6). Then, $\nu_j$ has zero mass, i.e.

$\nu_j([a, b]) = \int_a^b \psi_{j,+}(u)\,du = 0$, (2.15)

and there exist constants $\ell_j \in \mathbb{R}$ such that

$2\int \log|x - u|\,d\nu_j(u) - V_j(x) = \ell_j$, for $x \in [a, b]$. (2.16)

Proof. Define, for $j = 1, 2$, the auxiliary functions

$F_j(z) = \frac{1}{2\pi i R(z)}\int_a^b R_+(u) V_j'(u)\,\frac{du}{u - z}$, for $z \in \mathbb{C} \setminus [a, b]$, (2.17)

which, by standard techniques and by (2.10) and (2.14), are equal to

$F_j(z) = \frac{1}{2} V_j'(z) - \frac{1}{4\pi i R(z)}\oint_\gamma R(\xi) V_j'(\xi)\,\frac{d\xi}{\xi - z} = \frac{1}{2} V_j'(z) - \pi i \psi_j(z)$, for $z \in \mathcal{V} \setminus [a, b]$,

where $\gamma$ is a positively oriented contour in $\mathcal{V}$ with $[a, b]$ and $z$ in its interior. This, together with the fact that $\psi_{j,+} = -\psi_{j,-}$ on $(a, b)$, yields

$F_{j,+}(x) - F_{j,-}(x) = -2\pi i \psi_{j,+}(x)$, for $x \in [a, b]$, (2.18)

$F_{j,+}(x) + F_{j,-}(x) = V_j'(x)$, for $x \in [a, b]$. (2.19)
Since $F_j$ is analytic in $\mathbb{C} \setminus [a, b]$ and since, by (2.17), $F_j(z) = O(z^{-2})$ as $z \to \infty$, a standard complex analysis argument shows that

$\frac{1}{2\pi i}\int_a^b \frac{F_{j,+}(u) - F_{j,-}(u)}{u - z}\,du = F_j(z)$, for $z \in \mathbb{C} \setminus [a, b]$.

By (2.18), this yields

$F_j(z) = -\int_a^b \frac{\psi_{j,+}(u)}{u - z}\,du = z^{-1}\int_a^b \psi_{j,+}(u)\,du + O(z^{-2})$, as $z \to \infty$.

Comparing this with the fact that $F_j(z) = O(z^{-2})$ as $z \to \infty$, we obtain $\int_a^b \psi_{j,+}(u)\,du = 0$, so that (2.15) is proven.

It remains to prove (2.16). It is straightforward to check that

$F_j(z) = -\int_a^b \frac{\psi_{j,+}(u)}{u - z}\,du = -\pi i \psi_j(z) + \frac{1}{2}\oint_\gamma \frac{\psi_j(\xi)}{\xi - z}\,d\xi$, for $z \in \mathcal{V} \setminus [a, b]$,

so that, using the fractional residue theorem,

$F_{j,\pm}(x) = -\pi i \psi_{j,\pm}(x) - \int_a^b \frac{\psi_{j,+}(u)}{u - x}\,du$, for $x \in [a, b]$,

where the integral is again a Cauchy principal value integral. From (2.19) and the fact that $\psi_{j,+} + \psi_{j,-} = 0$ on $[a, b]$ this yields

$-\frac{d}{dx}\left[2\int \log|x - u|\,d\nu_j(u) - V_j(x)\right] = 2\int_a^b \frac{\psi_{j,+}(u)}{u - x}\,du + F_{j,+}(x) + F_{j,-}(x) = 0$. (2.20)

This proves (2.16).
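As a sanity check of (2.10) and of the zero-mass property (2.15), one can evaluate the contour integral numerically for $V_1(x) = x$ (the choice of Example 1.3). A residue computation at infinity suggests $h_1(z) = z - \frac{a+b}{2}$ in that case. The endpoints, the circular contour and the quadrature sizes below are illustrative assumptions, not values from the paper.

```python
import cmath
import math

A, B = -1.0, 2.0          # illustrative endpoints a < b (not taken from the paper)

def R(z):
    # ((z-a)(z-b))^{1/2}, analytic off [a, b] and behaving like z at infinity.
    return cmath.sqrt(z - A) * cmath.sqrt(z - B)

def h(z, dV, m=400, radius=3.0):
    # (2.10): (1/(2 pi i)) * integral of R(xi) V'(xi) / (xi - z) over a positively
    # oriented circle enclosing [a, b] and z, approximated by the trapezoidal rule.
    center = (A + B) / 2
    total = 0j
    for k in range(m):
        e = cmath.exp(2j * math.pi * k / m)
        xi = center + radius * e
        total += R(xi) * dV(xi) / (xi - z) * radius * e
    return total / m

def mass(dV, m=200):
    # Gauss-Chebyshev rule for the integral of psi_{j,+} over [a, b], using
    # psi_{j,+}(x) = -h_j(x) / (2 pi sqrt((b - x)(x - a))), which follows from (2.14).
    nodes = [(A + B) / 2 + (B - A) / 2 * math.cos((2 * k - 1) * math.pi / (2 * m))
             for k in range(1, m + 1)]
    return -sum(h(x, dV).real for x in nodes) / (2 * m)

dV1 = lambda xi: 1.0      # V_1(x) = x, hence V_1'(x) = 1
z = 0.3 + 0.2j
err_h = abs(h(z, dV1) - (z - (A + B) / 2))
mass_1 = mass(dV1)
print(err_h, mass_1)
```

The trapezoidal rule on a circle converges geometrically for integrands analytic in an annulus around the contour, which is why a few hundred nodes suffice here.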
Corollary 2.2. Let $\nu_{s,t} = \nu_0 + s\nu_1 + t\nu_2$. Then, $d\nu_{s,t}(x) = \psi_{s,t,+}(x)\chi_{[a,b]}(x)\,dx$, where

$\psi_{s,t} = \psi_0 + s\psi_1 + t\psi_2$, on $\mathcal{V} \setminus [a, b]$, (2.21)

with $\psi_0$ given by (2.5) and $\psi_1$ and $\psi_2$ given by (2.14). So, $\nu_{s,t}$ is supported on $[a, b]$ and has mass one, i.e. $\nu_{s,t}([a, b]) = 1$. Further, there exist constants $\ell_{s,t} \in \mathbb{R}$ such that for any $\delta > 0$ there are $\varepsilon, \kappa > 0$ sufficiently small such that for $s, t \in [-\varepsilon, \varepsilon]$ the conditions (2.1) and (2.2) are satisfied.

Proof. Since $\nu_{s,t} = \nu_0 + s\nu_1 + t\nu_2$, from (2.15), and from the fact that $\nu_0([a, b]) = 1$ it is clear that $\nu_{s,t}([a, b]) = 1$. Next, with $\ell_{s,t} = \ell_0 + s\ell_1 + t\ell_2$, we have

$2\int \log|x - u|\,d\nu_{s,t}(u) - V_{s,t}(x) - \ell_{s,t} = I_0(x) + sI_1(x) + tI_2(x)$, (2.22)

where

$I_j(x) = 2\int \log|x - u|\,d\nu_j(u) - V_j(x) - \ell_j, \qquad j = 0, 1, 2$.
Then, condition (2.1) follows from (2.8) and (2.16). Now, by using (2.9) and the fact that $I_0(x) \to -\infty$ as $|x| \to \infty$, there exists $\kappa > 0$ such that

$I_0 < -\frac{3}{2}\kappa$, on $\mathbb{R} \setminus [a - \delta, b + \delta]$. (2.23)

Further, one can check that $I_1$ and $I_2$ are bounded on $\mathbb{R} \setminus [a - \delta, b + \delta]$, and thus there exists $\varepsilon > 0$ such that for $s, t \in [-\varepsilon, \varepsilon]$,

$sI_1 + tI_2 < \frac{1}{2}\kappa$, on $\mathbb{R} \setminus [a - \delta, b + \delta]$. (2.24)

Inserting (2.23) and (2.24) into (2.22) we obtain condition (2.2).
Remark 2.3. The measure $\nu_1$ ($\nu_2$) is the equilibrium measure that minimizes $I_{V_1}(\nu)$ ($I_{V_2}(\nu)$) among all signed measures $\nu$ supported on $[a, b]$ with $\nu([a, b]) = 0$. The measures $\nu_{s,t}$ on the other hand minimize $I_{V_{s,t}}(\nu)$ among all signed measures supported on $[a, b]$ with $\nu([a, b]) = 1$. Observe that since $\nu_0$ has a strictly positive density on $(a, b)$ (since $\nu_0$ has no type II singular points) we have for any $\delta > 0$ that $\nu_{s,t}$ is positive on $(a + \delta, b - \delta)$ for $s, t$ sufficiently small.

3. Riemann-Hilbert Analysis

3.1. RH problem for orthogonal polynomials. For each fixed $n$, $s$, and $t$, we consider the Fokas-Its-Kitaev Riemann-Hilbert problem [17] characterizing the orthogonal polynomials $p_k^{(n,s,t)}$ with respect to the weight functions $e^{-nV_{s,t}}$. We seek a 2 × 2 matrix-valued function $Y(z) = Y(z; n, s, t)$ (we suppress the $n$, $s$, and $t$ dependence for brevity) that satisfies the following conditions.

RH problem for $Y$.
(a) $Y : \mathbb{C} \setminus \mathbb{R} \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $Y$ possesses continuous boundary values for $x \in \mathbb{R}$ denoted by $Y_+(x)$ and $Y_-(x)$, where $Y_+(x)$ and $Y_-(x)$ denote the limiting values of $Y(z')$ as $z'$ approaches $x$ from above and below, respectively, and
$Y_+(x) = Y_-(x)\begin{pmatrix} 1 & e^{-nV_{s,t}(x)} \\ 0 & 1 \end{pmatrix}$, for $x \in \mathbb{R}$. (3.1)
(c) $Y$ has the following asymptotic behavior at infinity:
$Y(z) = \bigl(I + O(z^{-1})\bigr)\begin{pmatrix} z^n & 0 \\ 0 & z^{-n} \end{pmatrix}$, as $z \to \infty$. (3.2)

The unique solution of the RH problem is given by

$Y(z) = \begin{pmatrix} \kappa_n^{-1} p_n(z) & \dfrac{\kappa_n^{-1}}{2\pi i}\displaystyle\int_{\mathbb{R}} \dfrac{p_n(u) e^{-nV_{s,t}(u)}}{u - z}\,du \\[2ex] -2\pi i \kappa_{n-1} p_{n-1}(z) & -\kappa_{n-1}\displaystyle\int_{\mathbb{R}} \dfrac{p_{n-1}(u) e^{-nV_{s,t}(u)}}{u - z}\,du \end{pmatrix}$, for $z \in \mathbb{C} \setminus \mathbb{R}$, (3.3)

where $p_k = p_k^{(n,s,t)}$ is the $k$-th degree orthonormal polynomial with respect to the varying weight $e^{-nV_{s,t}}$, and where $\kappa_k = \kappa_k^{(n,s,t)} > 0$ is the leading coefficient of $p_k$. The solution (3.3) is due to Fokas, Its, and Kitaev [17], see also [8, 10, 11].

It is now possible to write the 2-point kernel $K_n^{(s,t)}$, see (1.3), in terms of $Y$. Indeed, using the Christoffel-Darboux formula for orthogonal polynomials and the fact that $\det Y \equiv 1$ (which follows easily from (3.1), (3.2), and Liouville's theorem), we get

$K_n^{(s,t)}(x, y) = e^{-\frac{n}{2}V_{s,t}(x)}\, e^{-\frac{n}{2}V_{s,t}(y)}\, \frac{1}{2\pi i(x - y)}\begin{pmatrix} 0 & 1 \end{pmatrix} Y_\pm^{-1}(y)\, Y_\pm(x) \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. (3.4)
So, in order to prove Theorem 1.7, we need to analyze the RH problem for $Y$ asymptotically. We do this by applying the Deift/Zhou steepest descent method [13] to this RH problem.

3.2. Normalization of the RH problem at infinity: $Y \to T$. In order to normalize the RH problem for $Y$ at infinity, the equilibrium measures $\nu_{s,t}$ introduced in Sect. 2 play a key role. Consider the log-transform $g_{s,t}$ of $\nu_{s,t}$,

$g_{s,t}(z) = \int_a^b \log(z - u)\,d\nu_{s,t}(u)$, for $z \in \mathbb{C} \setminus (-\infty, b]$. (3.5)

Here, we take the principal branch of the logarithm so that $g_{s,t}$ is analytic in $\mathbb{C} \setminus (-\infty, b]$. We now give properties of $g_{s,t}$ which are crucial in the following. From (3.5) and condition (2.1) it follows that

$g_{s,t,+}(x) + g_{s,t,-}(x) - V_{s,t}(x) - \ell_{s,t} = 0$, for $x \in [a, b]$. (3.6)

Another crucial property is that

$g_{s,t,+}(x) - g_{s,t,-}(x) = 2\pi i\int_x^b d\nu_{s,t}(u)$, for $x \in \mathbb{R}$, (3.7)

so that, since $\nu_{s,t}$ is supported on $[a, b]$ and has mass one (see Corollary 2.2),

$g_{s,t,+}(x) - g_{s,t,-}(x) = \begin{cases} 2\pi i, & \text{for } x < a, \\ 0, & \text{for } x > b. \end{cases}$ (3.8)

Now, we are ready to perform the first transformation $Y \to T$. Define the matrix valued function $T$ as

$T(z) = e^{-\frac{1}{2}n\ell_{s,t}\sigma_3}\, Y(z)\, e^{-n g_{s,t}(z)\sigma_3}\, e^{\frac{1}{2}n\ell_{s,t}\sigma_3}$, for $z \in \mathbb{C} \setminus \mathbb{R}$, (3.9)

where $\ell_{s,t}$ is the constant that appears in the variational conditions (2.1) and (2.2), and where $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ denotes the third Pauli matrix. Using (3.6), (3.8), the RH conditions for $Y$, and the fact that $g_{s,t}(z) = \log z + O(1/z)$ as $z \to \infty$, it is straightforward to check that $T$ is a solution to the following RH problem.
RH problem for $T$.
(a) $T : \mathbb{C} \setminus \mathbb{R} \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $T_+(x) = T_-(x) v_T(x)$ for $x \in \mathbb{R}$, with
$v_T = \begin{cases} \begin{pmatrix} e^{-n(g_{s,t,+} - g_{s,t,-})} & 1 \\ 0 & e^{n(g_{s,t,+} - g_{s,t,-})} \end{pmatrix}, & \text{on } (a, b), \\[2ex] \begin{pmatrix} 1 & e^{n(g_{s,t,+} + g_{s,t,-} - V_{s,t} - \ell_{s,t})} \\ 0 & 1 \end{pmatrix}, & \text{on } \mathbb{R} \setminus (a, b). \end{cases}$ (3.10)
(c) $T(z) = I + O(1/z)$, as $z \to \infty$.

Remark 3.1. From (3.7) we see that the diagonal entries of $v_T$ on $(a, b)$ are rapidly oscillating for large $n$. Further, using condition (2.2) and (3.5), we see that $v_T - I$ decays exponentially on $\mathbb{R} \setminus [a - \delta, b + \delta]$.

3.3. Opening of the lens: $T \to S$. Here, we will transform the oscillatory diagonal entries of the jump matrix $v_T$ on $(a, b)$ into exponentially decaying off-diagonal entries. This step is referred to as the opening of the lens. Introduce a scalar function $\phi_{s,t}$ as

$\phi_{s,t}(z) = -\pi i\int_z^b \psi_{s,t}(\xi)\,d\xi$, for $z \in \mathcal{V} \setminus (-\infty, b]$, (3.11)

where the path of integration does not cross the real line, and where $\psi_{s,t}$ is defined by (2.21). The important feature of the function $\phi_{s,t}$ is that by (3.7), $\phi_{s,t,+}$ and $\phi_{s,t,-}$ are purely imaginary on $(a, b)$ and satisfy

$-2\phi_{s,t,+}(x) = 2\phi_{s,t,-}(x) = 2\pi i\int_x^b d\nu_{s,t}(u) = g_{s,t,+}(x) - g_{s,t,-}(x)$, for $x \in (a, b)$, (3.12)

which means that $-2\phi_{s,t}$ and $2\phi_{s,t}$ provide analytic extensions of $g_{s,t,+} - g_{s,t,-}$ into the upper half-plane and lower half-plane, respectively. Further, $2g_{s,t} + 2\phi_{s,t} - V_{s,t} - \ell_{s,t}$ is analytic in $\mathcal{V} \setminus (-\infty, b]$ and satisfies by (3.12) and (3.6),

$2g_{s,t,\pm} + 2\phi_{s,t,\pm} - V_{s,t} - \ell_{s,t} = g_{s,t,+} + g_{s,t,-} - V_{s,t} - \ell_{s,t} = 0$, on $(a, b)$,

so that by the identity theorem,

$2g_{s,t} - V_{s,t} - \ell_{s,t} = -2\phi_{s,t}$, on $\mathcal{V} \setminus (-\infty, a]$. (3.13)

Using (3.8) this yields

$g_{s,t,+} + g_{s,t,-} - V_{s,t} - \ell_{s,t} = 2g_{s,t,-} - V_{s,t} - \ell_{s,t} + (g_{s,t,+} - g_{s,t,-}) = -2\phi_{s,t,-} + 2\pi i$, on $(-\infty, a)$. (3.14)
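Inserting (3.12), (3.13), and (3.14) into (3.10), the jump matrix for $T$ can be written in terms of $\phi_{s,t}$ as

$v_T = \begin{cases} \begin{pmatrix} e^{2n\phi_{s,t,+}} & 1 \\ 0 & e^{2n\phi_{s,t,-}} \end{pmatrix}, & \text{on } (a, b), \\[2ex] \begin{pmatrix} 1 & e^{-2n\phi_{s,t,-}} \\ 0 & 1 \end{pmatrix}, & \text{on } \mathbb{R} \setminus (a, b). \end{cases}$ (3.15)

It is straightforward to check, using the fact that $\phi_{s,t,+} + \phi_{s,t,-} = 0$ on $(a, b)$, that $v_T$ has on the interval $(a, b)$ the following factorization:

$v_T = \begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t,-}} & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t,+}} & 1 \end{pmatrix}$, on $(a, b)$, (3.16)

and the opening of the lens is based on this factorization. Observe that, since $\operatorname{Re}\phi_{s,t,\pm}(x) = 0$ and $\operatorname{Im}\phi_{s,t,\pm}(x) = \mp\pi\int_x^b d\nu_{s,t}(u)$ for $x \in (a, b)$ (see (3.12)), and since $\nu_{s,t}$ is positive on $(a + \delta, b - \delta)$ for $\delta > 0$ and $s, t$ sufficiently small (see Remark 2.3), it follows (as in [8]) from the Cauchy-Riemann conditions that

$\operatorname{Re}\phi_{s,t}(z) < 0$, for $|\operatorname{Im} z| \neq 0$ small and $a + \delta < \operatorname{Re} z < b - \delta$. (3.17)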
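The factorization (3.16) is a purely algebraic identity that relies only on $\phi_{s,t,+} + \phi_{s,t,-} = 0$. A minimal numerical check is sketched below; the value of `n` and the purely imaginary sample value of $\phi_{s,t,+}$ are arbitrary choices made here for illustration.

```python
import cmath

def matmul(X, Y):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

n = 25
phi_plus = -0.37j          # arbitrary purely imaginary sample value of phi_{s,t,+}(x)
phi_minus = -phi_plus      # phi_{s,t,+} + phi_{s,t,-} = 0 on (a, b)

p = cmath.exp(2 * n * phi_plus)
m = cmath.exp(2 * n * phi_minus)

v_T = [[p, 1], [0, m]]                 # jump matrix (3.15) on (a, b)
left = [[1, 0], [m, 1]]                # first factor in (3.16)
middle = [[0, 1], [-1, 0]]             # second factor
right = [[1, 0], [p, 1]]               # third factor
product = matmul(matmul(left, middle), right)

max_err = max(abs(product[i][j] - v_T[i][j]) for i in range(2) for j in range(2))
print(max_err)
```

The lower-left entry of the product equals $-1 + e^{2n(\phi_{s,t,+} + \phi_{s,t,-})}$, so the identity fails as soon as the two boundary values do not sum to zero.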
We deform the RH problem for $T$ into a RH problem for $S$ by opening a lens as shown in Fig. 1, so that we obtain a contour $\Sigma$. For now, we choose the lens to be contained in $\mathcal{V}$, but we will specify later how we choose the lens exactly. Let

$S(z) = \begin{cases} T(z), & \text{for } z \text{ outside the lens}, \\[1ex] T(z)\begin{pmatrix} 1 & 0 \\ -e^{2n\phi_{s,t}(z)} & 1 \end{pmatrix}, & \text{for } z \text{ in the upper part of the lens}, \\[2ex] T(z)\begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t}(z)} & 1 \end{pmatrix}, & \text{for } z \text{ in the lower part of the lens}. \end{cases}$ (3.18)

Then, using (3.16) and the RH conditions for $T$, one can check that $S$ is the unique solution of the following RH problem:

Fig. 1. The lens $\Sigma$
RH problem for $S$.
(a) $S : \mathbb{C} \setminus \Sigma \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $S_+(z) = S_-(z) v_S(z)$ for $z \in \Sigma$, with
$v_S = \begin{cases} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, & \text{on } (a, b), \\[2ex] \begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t}} & 1 \end{pmatrix}, & \text{on } \Sigma \cap \mathbb{C}_\pm, \\[2ex] \begin{pmatrix} 1 & e^{-2n\phi_{s,t,-}} \\ 0 & 1 \end{pmatrix}, & \text{on } \mathbb{R} \setminus (a, b). \end{cases}$ (3.19)
(c) $S(z) = I + O(1/z)$, as $z \to \infty$.

Remark 3.2. On the lips of the lens (away from $a$ and $b$) and on $\mathbb{R} \setminus [a - \delta, b + \delta]$, it follows from (3.17) and (2.2) that the jump matrix for $S$ converges exponentially fast to the identity matrix as $n \to \infty$. This convergence is uniform as long as we stay away from small disks surrounding the endpoints $a$ and $b$. Near these endpoints we have to construct local parametrices.

3.4. Parametrix $P^{(\infty)}$ for the outside region. From Remark 3.2, we expect that the leading order asymptotics of $Y$ will be determined by a solution $P^{(\infty)}$ of the following RH problem:

RH problem for $P^{(\infty)}$.
(a) $P^{(\infty)} : \mathbb{C} \setminus [a, b] \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $P^{(\infty)}_+(x) = P^{(\infty)}_-(x)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for $x \in (a, b)$.
(c) $P^{(\infty)}(z) = I + O(1/z)$, as $z \to \infty$.

It is well known, see for example [8, 11], that $P^{(\infty)}$ given by

$P^{(\infty)}(z) = \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}\left(\frac{z - b}{z - a}\right)^{\sigma_3/4}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}^{-1}$, for $z \in \mathbb{C} \setminus [a, b]$, (3.20)

is a solution to the above RH problem. Note that $P^{(\infty)}$ is independent of the parameters $s$, $t$ and $n$.

3.5. Parametrix $P^{(a)}$ near the regular endpoint $a$. Here, we do the local analysis near the regular endpoint $a$. Let $U_{\delta,a} = \{z \in \mathbb{C} : |z - a| < \delta\}$ be a small disk with center $a$ and radius $\delta > 0$ sufficiently small such that the disk lies in $\mathcal{V}$. We seek a 2 × 2 matrix valued function $P^{(a)}$ (depending on the parameters $n$, $s$, and $t$) in the disk $U_{\delta,a}$ with the same jumps as $S$ and which matches with $P^{(\infty)}$ on the boundary $\partial U_{\delta,a}$ of the disk. We thus seek a 2 × 2 matrix valued function that satisfies the following RH problem:
RH problem for $P^{(a)}$.
(a) $P^{(a)} : U_{\delta,a} \setminus \Sigma \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $P^{(a)}_+(z) = P^{(a)}_-(z) v_S(z)$ for $z \in \Sigma \cap U_{\delta,a}$, where $v_S$ is given by (3.19).
(c) $P^{(a)}$ satisfies the matching condition
$P^{(a)}(z)\bigl(P^{(\infty)}\bigr)^{-1}(z) = I + O(n^{-1/7})$, (3.21)
as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, uniformly for $z \in \partial U_{\delta,a} \setminus \Sigma$.

3.5.1. Airy model RH problem. We will construct $P^{(a)}$ by introducing an auxiliary 2 × 2 matrix valued function $A(\zeta; r)$ with jumps (in the variable $\zeta$) on an oriented contour $\Gamma = \bigcup_j \Gamma_j$, shown in Fig. 2, consisting of four straight rays

$\Gamma_1 : \arg\zeta = 0, \qquad \Gamma_2 : \arg\zeta = \frac{6\pi}{7}, \qquad \Gamma_3 : \arg\zeta = \pi, \qquad \Gamma_4 : \arg\zeta = -\frac{6\pi}{7}$.

These four rays divide the complex plane into four regions I, II, III, and IV, also shown in Fig. 2. Put

$y_j = y_j(\zeta; r) = \omega^j \operatorname{Ai}\bigl(\omega^j(\zeta + r)\bigr), \qquad j = 0, 1, 2$,

with $\omega = e^{\frac{2\pi i}{3}}$ and with $\operatorname{Ai}$ the Airy function, and let

$A(\zeta; r) = \sqrt{2\pi}\, e^{-\frac{\pi i}{4}} \times \begin{cases} \begin{pmatrix} y_0 & -y_2 \\ y_0' & -y_2' \end{pmatrix}, & \text{for } \zeta \in \mathrm{I}, \\[2ex] \begin{pmatrix} -y_1 & -y_2 \\ -y_1' & -y_2' \end{pmatrix}, & \text{for } \zeta \in \mathrm{II}, \\[2ex] \begin{pmatrix} -y_2 & y_1 \\ -y_2' & y_1' \end{pmatrix}, & \text{for } \zeta \in \mathrm{III}, \\[2ex] \begin{pmatrix} y_0 & y_1 \\ y_0' & y_1' \end{pmatrix}, & \text{for } \zeta \in \mathrm{IV}. \end{cases}$ (3.22)

With $y_j'$ we mean the derivative of $y_j$ with respect to $\zeta$. It is well-known, see e.g. [8, 11], that $A$ satisfies the following RH problem:

Fig. 2. The oriented contour $\Gamma$. The four straight rays $\Gamma_1, \ldots, \Gamma_4$ divide the complex plane into four regions I, II, III and IV.
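RH problem for $A$.
(a) $A$ is analytic for $\zeta \in \mathbb{C} \setminus \Gamma$ and for $r$ in $\mathbb{C}$.
(b) $A$ satisfies the following jump relations on $\Gamma$,
$A_+(\zeta) = A_-(\zeta)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for $\zeta \in \Gamma_3$, (3.23)
$A_+(\zeta) = A_-(\zeta)\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, for $\zeta \in \Gamma_1$, (3.24)
$A_+(\zeta) = A_-(\zeta)\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, for $\zeta \in \Gamma_2 \cup \Gamma_4$. (3.25)
(c) $A$ has the following asymptotic behavior at infinity,
$A(\zeta; r) = (\zeta + r)^{-\frac{\sigma_3}{4}} N \left[I + O\bigl((\zeta + r)^{-3/2}\bigr)\right] e^{-\frac{2}{3}(\zeta + r)^{3/2}\sigma_3}$
$\phantom{A(\zeta; r)} = \zeta^{-\frac{\sigma_3}{4}} N \left[I - \frac{1}{4} r^2 \zeta^{-1/2}\sigma_3 + \frac{1}{32} r^4 \zeta^{-1} I + O(r^6 \zeta^{-3/2}) + O(r\zeta^{-1})\right] e^{-\left(\frac{2}{3}\zeta^{3/2} + r\zeta^{1/2}\right)\sigma_3}$, (3.26)
as $\zeta \to \infty$, uniformly for $r$ such that
$\operatorname{sgn}\operatorname{Im}(\zeta + r) = \operatorname{sgn}\operatorname{Im}\zeta$, and $|r| < |\zeta|^{1/4}$. (3.27)
In (3.26), $N$ is given by
$N = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} e^{-\frac{1}{4}\pi i\sigma_3}$. (3.28)

3.5.2. Construction of $P^{(a)}$. We seek $P^{(a)}$ in the following form

$P^{(a)}(z) = E^{(a)}(z)\,\sigma_3\, A\bigl(n^{2/3} f_a(z);\, n^{2/3} r_{s,t}(z)\bigr)\,\sigma_3\, e^{n\phi_{s,t}(z)\sigma_3}$, (3.29)

where $E^{(a)}$ is an invertible 2 × 2 matrix valued function analytic on $U_{\delta,a}$ and where $f_a$ and $r_{s,t}$ are (scalar) analytic functions on $U_{\delta,a}$ which are real on $(a - \delta, a + \delta)$. In addition we take $f_a$ to be a conformal map from $U_{\delta,a}$ onto a convex neighborhood $f_a(U_{\delta,a})$ of 0 such that $f_a(a) = 0$ and $f_a'(a) < 0$. If those conditions are all satisfied, and if we open the lens (recall that the lens was not yet fully specified) such that

$f_a\bigl(\Sigma \cap (U_{\delta,a} \cap \mathbb{C}_+)\bigr) = \Gamma_4 \cap f_a(U_{\delta,a})$, and $f_a\bigl(\Sigma \cap (U_{\delta,a} \cap \mathbb{C}_-)\bigr) = \Gamma_2 \cap f_a(U_{\delta,a})$,

then it is straightforward to verify, using (3.19) and (3.23)–(3.25), that $P^{(a)}$ defined by (3.29) satisfies conditions (a) and (b) of the RH problem for $P^{(a)}$.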
Let

$f_a(z) = \left[\frac{3}{2}\left(-\pi i\int_z^a \psi_0(\xi)\,d\xi\right)(a - z)^{-3/2}\right]^{2/3}(a - z) = -\left(\frac{1}{2} h_0(a)\sqrt{b - a}\right)^{2/3}(z - a) + O\bigl((z - a)^2\bigr)$, as $z \to a$, (3.30)

where we have used (2.5), and let

$r_{s,t}(z) = \left(-\pi i\int_z^a \bigl(s\psi_1(\xi) + t\psi_2(\xi)\bigr)\,d\xi\right) f_a(z)^{-1/2}$. (3.31)

Then, $f_a$ is analytic with $f_a(a) = 0$ and $f_a'(a) < 0$, it is real on $(a - \delta, a + \delta)$, and it is a conformal mapping on $U_{\delta,a}$ provided $\delta > 0$ is sufficiently small. Further, it is straightforward to check that $r_{s,t}$ is analytic on $U_{\delta,a}$ and real on $(a - \delta, a + \delta)$, as well. Thus, $f_a$ and $r_{s,t}$ satisfy the above conditions, so that $P^{(a)}$ defined by (3.29), with $E^{(a)}$ any invertible analytic matrix valued function, satisfies conditions (a) and (b) of the RH problem for $P^{(a)}$.

Remark 3.3. We can use any functions $f_a$ and $r_{s,t}$, satisfying the conditions stated under Eq. (3.29), to construct the parametrix $P^{(a)}$. However, we have to choose them so as to compensate for the factor $e^{n\phi_{s,t}\sigma_3}$ in (3.29). Using (2.21), (3.11), and the fact that $\int_a^b \psi_{s,t,+}(u)\,du = 1$ we have

$e^{-n\left(\frac{2}{3} f_a(z)^{3/2} + r_{s,t}(z) f_a(z)^{1/2}\right)\sigma_3} = (-1)^n e^{-n\phi_{s,t}(z)\sigma_3}$, for $z \in U_{\delta,a} \setminus [a, a + \delta)$. (3.32)

From this and (3.26) it is clear that our choice of $f_a$ and $r_{s,t}$ will do the job.

It now remains to determine $E^{(a)}$ such that the matching condition (c) holds as well. In order to do this we make use of the following result.

Proposition 3.4. Let $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. Then,

$P^{(a)}(z) = (-1)^n E^{(a)}(z)\bigl(n^{2/3} f_a(z)\bigr)^{-\sigma_3/4}\sigma_3 N \sigma_3 \times \left[I - \frac{(n^{4/7} r_{s,t}(z))^2}{4 f_a(z)^{1/2}}\,\sigma_3\, n^{-1/7} + \frac{(n^{4/7} r_{s,t}(z))^4}{32 f_a(z)}\, I\, n^{-2/7} + O(n^{-3/7})\right]$. (3.33)

Proof. We will use the asymptotics (3.26) of $A$. In order to do this we have to check that condition (3.27) is satisfied for our choice of $\zeta = n^{2/3} f_a(z)$ and $r = n^{2/3} r_{s,t}(z)$. Obviously $r_{s,t}(z) = O(n^{-4/7})$ as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, uniformly for $z \in \partial U_{\delta,a}$. Then, it is straightforward to check that there exist $n_0 \in \mathbb{N}$ sufficiently large, and $\kappa_1, \kappa_2 > 0$ sufficiently small, such that

$|n^{2/3} r_{s,t}(z)| < |n^{2/3} f_a(z)|^{1/4}$, (3.34)

for $z \in \partial U_{\delta,a}$ (for a possibly smaller $\delta$), for $n \geq n_0$, and for $s$ and $t$ such that $|c_1 n^{6/7} s - s_0| \leq \kappa_1$ and $|c_2 n^{4/7} t - t_0| \leq \kappa_2$.
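The powers of $n$ behind condition (3.34) and behind the error terms in (3.33) are simple exponent bookkeeping. The sketch below redoes it with exact rational arithmetic; it only tracks exponents of $n$, treating $f_a(z)$ and $n^{4/7} r_{s,t}(z)$ as quantities of order one on $\partial U_{\delta,a}$ (an assumption consistent with the statement $r_{s,t}(z) = O(n^{-4/7})$ above).

```python
from fractions import Fraction as F

zeta_exp = F(2, 3)              # zeta = n^{2/3} f_a(z): power of n carried by zeta
r_exp = F(2, 3) - F(4, 7)       # r = n^{2/3} r_{s,t}(z) with r_{s,t}(z) = O(n^{-4/7})

# Condition (3.34), |r| < |zeta|^{1/4} for large n, needs r_exp < zeta_exp / 4.
condition_ok = r_exp < zeta_exp / 4

# Powers of n of the correction terms appearing in the expansion (3.26).
orders = {
    "r^2 zeta^-1/2": 2 * r_exp - zeta_exp / 2,
    "r^4 zeta^-1": 4 * r_exp - zeta_exp,
    "r^6 zeta^-3/2": 6 * r_exp - 3 * zeta_exp / 2,
    "r zeta^-1": r_exp - zeta_exp,
}
print(condition_ok, orders)
```

The first two entries reproduce the $n^{-1/7}$ and $n^{-2/7}$ terms of (3.33), and the last two are of order $n^{-3/7}$ and $n^{-4/7}$, both covered by the $O(n^{-3/7})$ remainder.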
Further, since $f_a$ and $r_{s,t}$ are analytic near $a$ and real valued on $(a - \delta, a + \delta)$ one can check that

$\operatorname{Im} f_a(z) = f_a'(\operatorname{Re} z)\operatorname{Im} z + O\bigl((\operatorname{Im} z)^2\bigr)$, as $z \to a$,
$\operatorname{Im} r_{s,t}(z) = r_{s,t}'(\operatorname{Re} z)\operatorname{Im} z + O\bigl((\operatorname{Im} z)^2\bigr)$, as $z \to a$.

Now, since $f_a'(a) \neq 0$ and $r_{s,t}'(\operatorname{Re} z) = O(n^{-4/7})$ uniformly for $z \in U_{\delta,a}$, one can then find a constant $C > 0$ such that

$|\operatorname{Im} r_{s,t}(z)| < C|\operatorname{Im} z| < |\operatorname{Im} f_a(z)|$,

for $z \in \partial U_{\delta,a} \setminus (a - \delta, a + \delta)$, for $n \geq n_0$, and for $s$ and $t$ such that $|c_1 n^{6/7} s - s_0| \leq \kappa_1$ and $|c_2 n^{4/7} t - t_0| \leq \kappa_2$ (for possibly smaller $\delta$, $\kappa_1$ and $\kappa_2$, and for a possibly larger $n_0$). This yields

$\operatorname{sgn}\bigl(\operatorname{Im}(f_a(z) + r_{s,t}(z))\bigr) = \operatorname{sgn}\bigl(\operatorname{Im} f_a(z)\bigr)$. (3.35)

We now have shown that condition (3.27) is satisfied so that we can use the asymptotic behavior (3.26) of $A$. Using (3.29), (3.26), (3.32), and the fact that $r_{s,t}(z) = O(n^{-4/7})$ we obtain (3.33).

From (3.33) and the fact that $r_{s,t} = O(n^{-4/7})$ it is clear that (in order that the matching condition (c) is satisfied) we have to define $E^{(a)}$ by

$E^{(a)} = (-1)^n P^{(\infty)}\sigma_3 N^{-1}\sigma_3\bigl(n^{2/3} f_a\bigr)^{\sigma_3/4}$. (3.36)

Obviously, $E^{(a)}$ is well-defined and analytic in $U_{\delta,a} \setminus (a, a + \delta)$. Further, using condition (b) of the RH problem for $P^{(\infty)}$, Eq. (3.28), and the fact that $f_{a,-}^{1/4} = i f_{a,+}^{1/4}$ on $(a, a + \delta)$, it is straightforward to check that $E^{(a)}$ has no jump on $(a, a + \delta)$. We then have that $E^{(a)}$ is analytic in $U_{\delta,a}$ except for a possible isolated singularity at $a$. However, $E^{(a)}$ has at most a square root singularity at $a$ and hence it has to be a removable singularity. Further, since $\det P^{(\infty)} \equiv 1$ and $\det N = 1$ it is clear that $\det E^{(a)} \equiv 1$ and thus $E^{(a)}$ is invertible. This ends the construction of the parametrix near the regular endpoint.

3.6. Parametrix $P^{(b)}$ near the critical endpoint $b$. Here, we do the local analysis near the critical endpoint $b$. Let $U_{\delta,b} = \{z \in \mathbb{C} : |z - b| < \delta\}$ be a small disk with center $b$ and radius $\delta > 0$ sufficiently small such that $U_{\delta,b}$ lies in $\mathcal{V}$ and such that the disks $U_{\delta,a}$ and $U_{\delta,b}$ do not intersect. We seek a 2 × 2 matrix valued function $P^{(b)}$ (depending on $n$, $s$ and $t$) in the disk $U_{\delta,b}$ with the same jumps as $S$ and which matches with $P^{(\infty)}$ on the boundary $\partial U_{\delta,b}$ of the disk. We thus seek a 2 × 2 matrix valued function that satisfies the following RH problem:

RH problem for $P^{(b)}$.
(a) $P^{(b)} : U_{\delta,b} \setminus \Sigma \to \mathbb{C}^{2\times 2}$ is analytic.
(b) $P^{(b)}_+(z) = P^{(b)}_-(z) v_S(z)$ for $z \in U_{\delta,b} \cap \Sigma$, where $v_S$ is given by (3.19).
(c) $P^{(b)}$ satisfies the matching condition
$P^{(b)}(z)\bigl(P^{(\infty)}\bigr)^{-1}(z) = I + O(n^{-1/7})$, (3.37)
as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, uniformly for $z \in \partial U_{\delta,b} \setminus \Sigma$.

Due to the singular behavior of the equilibrium measure $d\nu_0(x)$ near $b$, see Assumptions 1.1 (ii), the Airy parametrix does not fit near $b$. Instead we use a different model RH problem associated with the $P_I^2$ equation (1.14).
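3.6.1. Model RH problem for the $P_I^2$ equation. We construct $P^{(b)}$ by introducing the following model RH problem for the special solution $y$ of the $P_I^2$ equation (1.14) as discussed in Sect. 1.3. This RH problem depends on two complex parameters $s, t$ and has jumps on the oriented contour $\Gamma$ as defined in Sect. 3.5, see Fig. 2. We seek a 2 × 2 matrix valued function $\Psi(\zeta) = \Psi(\zeta; s, t)$ satisfying the following conditions:

RH problem for $\Psi$.
(a) $\Psi$ is analytic for $\zeta \in \mathbb{C} \setminus \Gamma$.
(b) $\Psi$ satisfies the following jump relations on $\Gamma$,
$\Psi_+(\zeta) = \Psi_-(\zeta)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, for $\zeta \in \Gamma_3$, (3.38)
$\Psi_+(\zeta) = \Psi_-(\zeta)\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, for $\zeta \in \Gamma_1$, (3.39)
$\Psi_+(\zeta) = \Psi_-(\zeta)\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, for $\zeta \in \Gamma_2 \cup \Gamma_4$. (3.40)
(c) $\Psi$ has the following behavior at infinity,
$\Psi(\zeta) = \zeta^{-\frac{1}{4}\sigma_3} N \left[I - h\sigma_3\zeta^{-1/2} + \frac{1}{2}\begin{pmatrix} h^2 & iy \\ -iy & h^2 \end{pmatrix}\zeta^{-1} + O(\zeta^{-3/2})\right] e^{-\theta(\zeta; s, t)\sigma_3}$, (3.41)
where $y = y(s, t)$ is the special solution of the $P_I^2$ equation (1.14) as discussed in Sect. 1.3, where $\frac{\partial h}{\partial s} = -y$, where $N$ is given by (3.28), and where $\theta$ is given by (1.21).

Remark 3.5. Note that the only difference between the model RH problem for Airy functions and the one for $P_I^2$ lies in the asymptotic condition (c). In particular, in $\theta$ we have an extra factor $\zeta^{7/2}$.

If we fix $s_0, t_0 \in \mathbb{R}$, it was proven in [7, Lemma 2.3 and Prop. 2.5] that there exists a neighborhood $\mathcal{U}$ of $s_0$ and a neighborhood $\mathcal{W}$ of $t_0$ such that the RH problem for $\Psi$ is (uniquely) solvable for all $(s, t) \in \mathcal{U} \times \mathcal{W}$. Furthermore, for $(s, t) \in \mathcal{U} \times \mathcal{W}$, $\Psi$ is analytic both in $s$ and $t$, and condition (c) holds uniformly for $(s, t)$ in compact subsets of $\mathcal{U} \times \mathcal{W}$. In [7, Sect. 2.3], the authors have shown that the solution $\Psi$ of the RH problem for $\Psi$ satisfies the Lax pair (1.15)–(1.18). From (1.20), (3.41), and (3.40) we then obtain

$\begin{pmatrix} \Phi_1(\zeta; s, t) \\ \Phi_2(\zeta; s, t) \end{pmatrix} = \begin{cases} \begin{pmatrix} \Psi_{11}(\zeta; s, t) \\ \Psi_{21}(\zeta; s, t) \end{pmatrix}, & \text{for } 0 < \operatorname{Arg}\zeta < 6\pi/7, \\[2ex] \begin{pmatrix} \left[\Psi(\zeta; s, t)\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\right]_{11} \\[1.5ex] \left[\Psi(\zeta; s, t)\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\right]_{21} \end{pmatrix}, & \text{for } 6\pi/7 < \operatorname{Arg}\zeta < \pi. \end{cases}$ (3.42)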
Universality of Eigen value Correlation Kernel in Double Scaling Limit
3.6.2. Construction of $P^{(b)}$.

We seek $P^{(b)}$ in the following form:
\[ P^{(b)}(z) = E^{(b)}(z)\, \Psi\bigl(n^{2/7} f_b(z);\, n^{6/7} s f_1(z),\, n^{4/7} t f_2(z)\bigr)\, e^{n\phi_{s,t}(z)\sigma_3}, \tag{3.43} \]
where $E^{(b)}$ is an invertible $2 \times 2$ matrix valued function analytic on $U_{\delta,b}$ and where $f_b$, $f_1$, and $f_2$ are (scalar) analytic functions on $U_{\delta,b}$ which are real on $(b - \delta, b + \delta)$. We take $f_1$ and $f_2$ to be such that $f_1(b) = c_1$ and $f_2(b) = c_2$ (where $c_1$ and $c_2$ are given by (1.30)). Then it is clear from (1.32) that for $n$ sufficiently large and $s$ and $t$ sufficiently small, $n^{6/7} s f_1(z) \in \mathcal{U}$ and $n^{4/7} t f_2(z) \in \mathcal{W}$ for $z \in U_{\delta,b}$, where $\mathcal{U}$ and $\mathcal{W}$ are the neighborhoods of $s_0$ and $t_0$ where $\Psi$ exists. In addition we take $f_b$ to be a conformal map from $U_{\delta,b}$ onto a convex neighborhood $f_b(U_{\delta,b})$ of 0 such that $f_b(b) = 0$ and $f_b'(b) > 0$. If those conditions are all satisfied, and if we open the lens (recall that the lens was not yet fully specified near $b$) such that
\[ f_b\bigl(\Sigma \cap (U_{\delta,b} \cap \mathbb{C}_+)\bigr) = \Gamma_2 \cap f_b(U_{\delta,b}), \qquad f_b\bigl(\Sigma \cap (U_{\delta,b} \cap \mathbb{C}_-)\bigr) = \Gamma_4 \cap f_b(U_{\delta,b}), \]
then it is straightforward to verify, using (3.19) and (3.38)–(3.40), that $P^{(b)}$ defined by (3.43) satisfies conditions (a) and (b) of the RH problem for $P^{(b)}$. Let
\[ f_b(z) = \left[ 105 \left( -\pi i \int_z^b \psi_0(\xi)\, d\xi \right) (z - b)^{-7/2} \right]^{2/7} (z - b) = c(z - b) + O\bigl((z - b)^2\bigr), \quad \text{as } z \to b, \tag{3.44} \]
where
\[ c = \left( \frac{15}{2}\, h_0''(b) \sqrt{b - a} \right)^{2/7}. \]
To get the expansion of $f_b$ near $b$ we have used (2.5) and the facts that $h_0(b) = h_0'(b) = 0$ (see (2.7)). Further, since $h_0''(b) > 0$ we have that $c > 0$. So, we have defined an analytic function $f_b$ with $f_b(b) = 0$ and $f_b'(b) = c > 0$, which is real on $(b - \delta, b + \delta)$, and which is a conformal mapping on $U_{\delta,b}$ provided $\delta > 0$ is sufficiently small. Next, let $f_1$ and $f_2$ be defined by
\[ f_1(z) = \left( -\pi i \int_z^b \psi_1(\xi)\, d\xi \right) f_b(z)^{-1/2}, \qquad f_2(z) = -3 \left( -\pi i \int_z^b \psi_2(\xi)\, d\xi \right) f_b(z)^{-3/2}. \tag{3.45} \]
Since $f_b$ is a conformal mapping in $U_{\delta,b}$ it is clear from (2.14) and (2.6) that $f_1$ is analytic in $U_{\delta,b}$. To see that $f_2$ is analytic in $U_{\delta,b}$ as well, we also need to use the extra condition $h_2(b) = 0$ (see (2.12)). Further, $f_1$ and $f_2$ are real on $(b - \delta, b + \delta)$ and one can check that
\[ f_1(b) = \frac{h_1(b)}{c^{1/2} (b - a)^{1/2}} = c_1, \qquad f_2(b) = -\frac{h_2'(b)}{c^{3/2} (b - a)^{1/2}} = c_2. \tag{3.46} \]
Thus, f b , f 1 , and f 2 satisfy the above conditions, so that P (b) defined by (3.43), with E (b) any invertible analytic matrix valued function, satisfies conditions (a) and (b) of the RH problem for P (b) .
Remark 3.6. As in Remark 3.3 we note that we could have also used different functions $f_b$, $f_1$, and $f_2$. However, we have to choose them so as to compensate for the factor $e^{n\phi_{s,t}\sigma_3}$ in (3.43). Using (1.21), (3.44), (3.45), (2.21), and (3.11) we have
\[ \theta\bigl(n^{2/7} f_b(z);\, n^{6/7} s f_1(z),\, n^{4/7} t f_2(z)\bigr) = n\phi_{s,t}(z), \quad \text{for } z \in U_{\delta,b} \setminus (b - \delta, b]. \tag{3.47} \]
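As a consistency check on (3.47), the substitution can be written out term by term. This is only a sketch, and it rests on two assumptions that are not restated in this section: that (1.21) reads $\theta(\zeta; s, t) = \frac{1}{105}\zeta^{7/2} - \frac{1}{3} t \zeta^{3/2} + s \zeta^{1/2}$, and that (2.21) and (3.11) give $\phi_{s,t}(z) = -\pi i \int_z^b (\psi_0 + s\psi_1 + t\psi_2)(\xi)\, d\xi$.

```latex
\begin{align*}
\theta\bigl(n^{2/7} f_b;\, n^{6/7} s f_1,\, n^{4/7} t f_2\bigr)
 &= \tfrac{1}{105}\, n\, f_b^{7/2}
    - \tfrac{1}{3}\, n^{4/7} t f_2 \cdot n^{3/7} f_b^{3/2}
    + n^{6/7} s f_1 \cdot n^{1/7} f_b^{1/2} \\
 &= n \Bigl[ \tfrac{1}{105} f_b^{7/2} + s\, f_1 f_b^{1/2} - \tfrac{t}{3}\, f_2 f_b^{3/2} \Bigr] \\
 &= n \Bigl[ -\pi i \int_z^b \psi_0\, d\xi
      + s \Bigl( -\pi i \int_z^b \psi_1\, d\xi \Bigr)
      + t \Bigl( -\pi i \int_z^b \psi_2\, d\xi \Bigr) \Bigr]
  = n\, \phi_{s,t}(z).
\end{align*}
```

Under these assumptions every term carries exactly one power of $n$, which is what fixes the exponents $2/7$, $6/7$, and $4/7$, and the factor $-3$ in the definition of $f_2$ cancels the coefficient $-\frac{1}{3}$ in $\theta$.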
From this and (3.41) it is clear that our choice of $f_b$, $f_1$, and $f_2$ will do the job. It now remains to determine $E^{(b)}$ such that the matching condition (c) holds as well. In order to do this we make use of the following proposition (the analogue of Proposition 3.4).

Proposition 3.7. Let $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. Then,
\[ P^{(b)}(z) = E^{(b)}(z) \bigl(n^{2/7} f_b(z)\bigr)^{-\sigma_3/4} N \left[ I - h f_b(z)^{-1/2} \sigma_3 n^{-1/7} + \frac{1}{2} \begin{pmatrix} h^2 & iy \\ -iy & h^2 \end{pmatrix} f_b(z)^{-1} n^{-2/7} + O(n^{-3/7}) \right], \tag{3.48} \]
where we have used for brevity the notation
\[ h = h\bigl(n^{6/7} s f_1(z), n^{4/7} t f_2(z)\bigr), \quad \text{and} \quad y = y\bigl(n^{6/7} s f_1(z), n^{4/7} t f_2(z)\bigr). \]

Proof. This follows easily from (3.43), (3.41), and (3.47).
From (3.48) it is clear that (in order that the matching condition (c) is satisfied) we have to define $E^{(b)}$ by
\[ E^{(b)} = P^{(\infty)} N^{-1} \bigl(n^{2/7} f_b\bigr)^{\sigma_3/4}, \tag{3.49} \]
where $N$ is given by (3.28) and where $P^{(\infty)}$ is the parametrix for the outside region, given by (3.20). Just as we proved that $E^{(a)}$ is an invertible analytic matrix valued function in $U_{\delta,a}$, we can check that $E^{(b)}$ is invertible and analytic in $U_{\delta,b}$. This completes the construction of the parametrix near the singular endpoint.

3.7. Final transformation: $S \to R$.

Having the parametrix $P^{(\infty)}$ for the outside region and the parametrices $P^{(a)}$ and $P^{(b)}$ near the endpoints $a$ and $b$, we have all the ingredients to perform the final transformation of the RH problem. Define
\[ R(z) = \begin{cases} S(z) \bigl(P^{(a)}\bigr)^{-1}(z), & \text{for } z \in U_{\delta,a} \setminus \Sigma, \\ S(z) \bigl(P^{(b)}\bigr)^{-1}(z), & \text{for } z \in U_{\delta,b} \setminus \Sigma, \\ S(z) \bigl(P^{(\infty)}\bigr)^{-1}(z), & \text{for } z \in \mathbb{C} \setminus (\Sigma \cup U_{\delta,a} \cup U_{\delta,b}). \end{cases} \tag{3.50} \]
Then, by construction of the parametrices, $R$ has only jumps on the reduced system of contours $\Sigma_R$ shown in Fig. 3, and $R$ satisfies the following RH problem. The circles around $a$ and $b$ are oriented clockwise.
Fig. 3. The contour R after the third and final transformation
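RH problem for $R$.
(a) $R : \mathbb{C} \setminus \Sigma_R \to \mathbb{C}^{2 \times 2}$ is analytic.
(b) $R_+(z) = R_-(z) v_R(z)$ for $z \in \Sigma_R$, with
\[ v_R = \begin{cases} P^{(a)} \bigl(P^{(\infty)}\bigr)^{-1}, & \text{on } \partial U_{\delta,a}, \\ P^{(b)} \bigl(P^{(\infty)}\bigr)^{-1}, & \text{on } \partial U_{\delta,b}, \\ P^{(\infty)} v_S \bigl(P^{(\infty)}\bigr)^{-1}, & \text{on the rest of } \Sigma_R. \end{cases} \tag{3.51} \]
(c) $R(z) = I + O(1/z)$, as $z \to \infty$.
(d) $R$ remains bounded near the intersection points of $\Sigma_R$.

As $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, we have by construction of the parametrices that the jump matrix for $R$ is close to the identity matrix, both in $L^2$ and $L^\infty$-sense on $\Sigma_R$,
\[ v_R(z) = \begin{cases} I + O(n^{-1/7}), & \text{on } \partial U_{\delta,a} \cup \partial U_{\delta,b}, \\ I + O(e^{-\gamma n}), & \text{on the rest of } \Sigma_R, \end{cases} \tag{3.52} \]
with $\gamma > 0$ some fixed constant. Then, arguments as in [10, 11] guarantee that $R$ itself is close to the identity matrix,
\[ R(z) = I + O(n^{-1/7}), \quad \text{uniformly for } z \in \mathbb{C} \setminus \Sigma_R, \tag{3.53} \]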
as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. This completes the Deift/Zhou steepest descent analysis.

Remark 3.8. The Deift/Zhou steepest descent method can be generalized to the case where the support of $\nu_0$ consists of more than one interval. However, there are two (technical) differences. First, in the multi-interval case, the equilibrium measures $\nu_1$ and $\nu_2$ have densities which are more complicated than in the one-interval case, but it remains possible to give explicit formulae. Consequently, condition (1.27), which expresses the requirement that the density of $\nu_2$ vanishes at the singular endpoint, has to be modified. Further, the construction of the outside parametrix $P^{(\infty)}$ is more complicated, since it uses $\Theta$-functions as in [10, Lemma 4.3]. With these modifications the asymptotic analysis can be carried through in the multi-interval case.

3.8. Asymptotics of $R$.

For the purpose of proving the universality result for the kernel $K_n^{(s,t)}$ (Theorem 1.7) it is enough to unfold the series of transformations $Y \to T \to S \to R$ and to use (3.53). This will be done in the next section. However, in order to determine the asymptotics of the recurrence coefficients $a_n^{(n,s,t)}$ and $b_n^{(n,s,t)}$ (Theorem 1.11) we need to expand the $O(n^{-1/7})$ term in (3.53).
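We show that the jump matrix $v_R$ for $R$ has an expansion of the form
\[ v_R(z) = I + \frac{\Delta_1(z)}{n^{1/7}} + \frac{\Delta_2(z)}{n^{2/7}} + O(n^{-3/7}), \tag{3.54} \]
as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, uniformly for $z \in \Sigma_R$, and we will explicitly determine $\Delta_1$ and $\Delta_2$. On $\Sigma_R \setminus (\partial U_{\delta,a} \cup \partial U_{\delta,b})$, the jump matrix is the identity matrix plus an exponentially small term, so that
\[ \Delta_1(z) = 0, \qquad \Delta_2(z) = 0, \qquad \text{for } z \in \Sigma_R \setminus (\partial U_{\delta,a} \cup \partial U_{\delta,b}). \tag{3.55} \]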
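Now, from (3.51), (3.33), and (3.48) we obtain (3.54) with
\[ \Delta_1(z) = -\frac{1}{4} \bigl(n^{4/7} r_{s,t}(z)\bigr)^2 f_a(z)^{-1/2} P^{(\infty)}(z) \sigma_3 P^{(\infty)}(z)^{-1}, \quad \text{for } z \in \partial U_{\delta,a}, \tag{3.56} \]
\[ \Delta_1(z) = -h f_b(z)^{-1/2} P^{(\infty)}(z) \sigma_3 P^{(\infty)}(z)^{-1}, \quad \text{for } z \in \partial U_{\delta,b}, \tag{3.57} \]
and
\[ \Delta_2(z) = \frac{1}{32} \bigl(n^{4/7} r_{s,t}(z)\bigr)^4 f_a(z)^{-1} I, \quad \text{for } z \in \partial U_{\delta,a}, \tag{3.58} \]
\[ \Delta_2(z) = \frac{1}{2} f_b(z)^{-1} \begin{pmatrix} h^2 & iy \\ -iy & h^2 \end{pmatrix}, \quad \text{for } z \in \partial U_{\delta,b}, \tag{3.59} \]
where we have used for brevity
\[ h = h\bigl(n^{6/7} s f_1(z), n^{4/7} t f_2(z)\bigr), \qquad y = y\bigl(n^{6/7} s f_1(z), n^{4/7} t f_2(z)\bigr). \]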
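A short remark on why the term on $\partial U_{\delta,b}$ of order $n^{-2/7}$ carries no conjugation by $P^{(\infty)}$ while the term of order $n^{-1/7}$ does. This is a sketch, assuming that $P^{(\infty)}$ from (3.20), which is not restated in this section, has the form $\alpha I + \beta J$ with scalar functions $\alpha$, $\beta$ and the constant matrix $J$ below; the symbols $\alpha$, $\beta$, $J$ are introduced here for illustration only.

```latex
\[
J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
P^{(\infty)} = \alpha I + \beta J, \qquad
\begin{pmatrix} h^2 & iy \\ -iy & h^2 \end{pmatrix} = h^2 I + i y\, J .
\]
\[
P^{(\infty)} \bigl( h^2 I + i y J \bigr) \bigl( P^{(\infty)} \bigr)^{-1} = h^2 I + i y J,
\qquad \text{whereas} \qquad
J \sigma_3 = -\sigma_3 J .
\]
```

Under that assumption the matrix in the $n^{-2/7}$ term is a polynomial in $J$ and so commutes with $P^{(\infty)}$, while $\sigma_3$ anticommutes with $J$, so the conjugation $P^{(\infty)} \sigma_3 (P^{(\infty)})^{-1}$ in the $n^{-1/7}$ term cannot be dropped.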
Observe that $\Delta_1$ and $\Delta_2$ have an extension to an analytic function in a punctured neighborhood of $a$ and a punctured neighborhood of $b$ with simple poles at $a$ and $b$. As in [11, Theorem 7.10] we obtain from (3.54) that $R$ satisfies
\[ R(z) = I + \frac{R^{(1)}(z)}{n^{1/7}} + \frac{R^{(2)}(z)}{n^{2/7}} + O(n^{-3/7}), \tag{3.60} \]
as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds, which is valid uniformly for $z \in \mathbb{C} \setminus (\partial U_{\delta,a} \cup \partial U_{\delta,b})$. We have that
\[ R^{(1)} \text{ and } R^{(2)} \text{ are analytic on } \mathbb{C} \setminus (\partial U_{\delta,a} \cup \partial U_{\delta,b}), \tag{3.61} \]
\[ R^{(1)}(z) = O(1/z), \qquad R^{(2)}(z) = O(1/z), \qquad \text{as } z \to \infty. \tag{3.62} \]
We will now compute the functions $R^{(1)}$ and $R^{(2)}$ explicitly.
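Determination of $R^{(1)}$. Expanding the jump relation $R_+ = R_- v_R$ using (3.54) and (3.60), and collecting the terms with $n^{-1/7}$ we find
\[ R^{(1)}_+(z) = R^{(1)}_-(z) + \Delta_1(z), \quad \text{for } z \in \partial U_{\delta,a} \cup \partial U_{\delta,b}. \]
This together with (3.61) and (3.62) gives an additive RH problem for $R^{(1)}$. Recall that $\Delta_1$ is analytic in a neighborhood of $z = a$ and $z = b$ except for simple poles at $a$ and $b$. So,
\[ \Delta_1(z) = \frac{A^{(1)}}{z - a} + O(1), \quad \text{as } z \to a, \qquad \Delta_1(z) = \frac{B^{(1)}}{z - b} + O(1), \quad \text{as } z \to b, \]
for certain matrices $A^{(1)}$ and $B^{(1)}$. We then see by inspection that
\[ R^{(1)}(z) = \begin{cases} \dfrac{A^{(1)}}{z - a} + \dfrac{B^{(1)}}{z - b}, & \text{for } z \in \mathbb{C} \setminus (\overline{U}_{\delta,a} \cup \overline{U}_{\delta,b}), \\[2ex] \dfrac{A^{(1)}}{z - a} + \dfrac{B^{(1)}}{z - b} - \Delta_1(z), & \text{for } z \in U_{\delta,a} \cup U_{\delta,b}, \end{cases} \tag{3.63} \]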
solves the additive RH problem for $R^{(1)}$. It now remains to determine $A^{(1)}$ and $B^{(1)}$. This can be done by expanding the formulas (3.56) and (3.57) near $z = a$ and $z = b$, respectively. We then find after a straightforward calculation (using also the fact that $f_1(b) = c_1$ and $f_2(b) = c_2$, see (3.46)),
\[ A^{(1)} = \frac{1}{8} \sqrt{b - a}\, \bigl(n^{4/7} r_{s,t}(a)\bigr)^2 \bigl(-f_a'(a)\bigr)^{-1/2} \begin{pmatrix} 1 & i \\ i & -1 \end{pmatrix}, \tag{3.64} \]
\[ B^{(1)} = \frac{1}{2} h \sqrt{b - a}\, f_b'(b)^{-1/2} \begin{pmatrix} -1 & i \\ i & 1 \end{pmatrix}, \tag{3.65} \]
where we used $h$ to denote $h(c_1 n^{6/7} s, c_2 n^{4/7} t)$ for brevity.

Determination of $R^{(2)}$. Next, expanding the jump relation $R_+ = R_- v_R$ using (3.54) and (3.60), and collecting the terms with $n^{-2/7}$ we find
\[ R^{(2)}_+(z) = R^{(2)}_-(z) + R^{(1)}_-(z) \Delta_1(z) + \Delta_2(z), \quad \text{for } z \in \partial U_{\delta,a} \cup \partial U_{\delta,b}. \]
This together with (3.61) and (3.62) gives an additive RH problem for $R^{(2)}$. Since $R^{(1)}_-$ is the boundary value of the restriction of $R^{(1)}$ to the disks $U_{\delta,a}$ and $U_{\delta,b}$ and since $\Delta_1$ and $\Delta_2$ are analytic in a neighborhood of $a$ and $b$, except for simple poles at $a$ and $b$, we have
\[ R^{(1)}(z) \Delta_1(z) + \Delta_2(z) = \frac{A^{(2)}}{z - a} + O(1), \quad \text{as } z \to a, \]
\[ R^{(1)}(z) \Delta_1(z) + \Delta_2(z) = \frac{B^{(2)}}{z - b} + O(1), \quad \text{as } z \to b, \]
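for certain matrices $A^{(2)}$ and $B^{(2)}$. As in the determination of $R^{(1)}$ we then see by inspection that
\[ R^{(2)}(z) = \begin{cases} \dfrac{A^{(2)}}{z - a} + \dfrac{B^{(2)}}{z - b}, & \text{for } z \in \mathbb{C} \setminus (\overline{U}_{\delta,a} \cup \overline{U}_{\delta,b}), \\[2ex] \dfrac{A^{(2)}}{z - a} + \dfrac{B^{(2)}}{z - b} - R^{(1)}(z) \Delta_1(z) - \Delta_2(z), & \text{for } z \in U_{\delta,a} \cup U_{\delta,b}, \end{cases} \tag{3.66} \]
solves the additive RH problem for $R^{(2)}$. The determination of $A^{(2)}$ and $B^{(2)}$ is more complicated than the determination of $A^{(1)}$ and $B^{(1)}$. It involves $R^{(1)}(a)$ and $R^{(1)}(b)$, for which we need to determine also the constant terms in the expansions of $\Delta_1$ near $z = a$ and $z = b$. After a straightforward (but rather long) calculation we find
\[ A^{(2)} = \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^4}{32 \bigl(-f_a'(a)\bigr)} \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} + \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^2 h}{8 \bigl(-f_a'(a)\bigr)^{1/2} f_b'(b)^{1/2}} \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}, \tag{3.67} \]
\[ B^{(2)} = \frac{y + h^2}{2 f_b'(b)} \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} + \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^2 h}{8 \bigl(-f_a'(a)\bigr)^{1/2} f_b'(b)^{1/2}} \begin{pmatrix} -1 & i \\ -i & -1 \end{pmatrix}, \tag{3.68} \]
where we used $h$ and $y$ to denote $h(c_1 n^{6/7} s, c_2 n^{4/7} t)$ and $y(c_1 n^{6/7} s, c_2 n^{4/7} t)$ for brevity.

4. Universality of the Double Scaling Limit

Here, we will prove the universality result for the 2-point correlation kernel $K_n^{(s,t)}$. We do this by using the expression (3.4) for $K_n^{(s,t)}$ in terms of $Y$ and by unfolding the series of transformations $Y \to T \to S \to R$.

Proof of Theorem 1.7. From Eqs. (3.4), (3.9), and (3.13), the reader can verify that the 2-point kernel $K_n^{(s,t)}$ can be written as, cf. [5, 6],
\[ K_n^{(s,t)}(x, y) = e^{-n\phi_{s,t,+}(x)} e^{-n\phi_{s,t,+}(y)} \frac{1}{2\pi i (x - y)} \begin{pmatrix} 0 & 1 \end{pmatrix} T_+^{-1}(y) T_+(x) \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \text{for } x, y \in \mathbb{R}. \]
From (3.18) and the fact that $S_+ = R P_+^{(b)}$ on $(b - \delta, b + \delta)$, see (3.50), we have
\[ T_+ = \begin{cases} R P_+^{(b)}, & \text{on } (b, b + \delta), \\[1ex] R P_+^{(b)} \begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t,+}} & 1 \end{pmatrix}, & \text{on } (b - \delta, b). \end{cases} \]
Inserting this in the previous equation for $K_n^{(s,t)}$ we arrive at
\[ K_n^{(s,t)}(x, y) = e^{-n\phi_{s,t,+}(x)} e^{-n\phi_{s,t,+}(y)} \frac{1}{2\pi i (x - y)} \begin{pmatrix} 0 & 1 \end{pmatrix} \widehat{P}^{-1}(y) R^{-1}(y) R(x) \widehat{P}(x) \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{4.1} \]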
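for $x, y \in (b - \delta, b + \delta)$, where
\[ \widehat{P} = \begin{cases} P_+^{(b)}, & \text{on } (b, b + \delta), \\[1ex] P_+^{(b)} \begin{pmatrix} 1 & 0 \\ e^{2n\phi_{s,t,+}} & 1 \end{pmatrix}, & \text{on } (b - \delta, b). \end{cases} \tag{4.2} \]
Further, we define
\[ \widehat{\Psi} = \begin{cases} \Psi_+, & \text{on } \mathbb{R}_+, \\[1ex] \Psi_+ \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, & \text{on } \mathbb{R}_-, \end{cases} \tag{4.3} \]
where $\Psi$ is the solution of the RH problem for $\Psi$, see Sect. 3.6. By (3.42), we have that $\widehat{\Psi}_{11} = \Phi_1$ and $\widehat{\Psi}_{21} = \Phi_2$. Using (3.43), (4.2), and (4.3) a straightforward calculation yields
\[ \widehat{P}(x) = E^{(b)}(x)\, \widehat{\Psi}\bigl(n^{2/7} f_b(x);\, n^{6/7} s f_1(x),\, n^{4/7} t f_2(x)\bigr)\, e^{n\phi_{s,t,+}(x)\sigma_3}, \quad \text{for } x \in (b - \delta, b + \delta). \]
Inserting this into (4.1) we then obtain
\[ K_n^{(s,t)}(x, y) = \frac{1}{2\pi i (x - y)} \begin{pmatrix} 0 & 1 \end{pmatrix} \widehat{\Psi}^{-1}\bigl(n^{2/7} f_b(y);\, n^{6/7} s f_1(y),\, n^{4/7} t f_2(y)\bigr) \bigl(E^{(b)}\bigr)^{-1}(y) R^{-1}(y) R(x) E^{(b)}(x)\, \widehat{\Psi}\bigl(n^{2/7} f_b(x);\, n^{6/7} s f_1(x),\, n^{4/7} t f_2(x)\bigr) \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{4.4} \]
for $x, y \in (b - \delta, b + \delta)$. Now, we introduce for the sake of brevity some notation. Let
\[ u_n = b + \frac{u}{c n^{2/7}}, \quad \text{and} \quad v_n = b + \frac{v}{c n^{2/7}}, \quad \text{with } c = f_b'(b) = \left( \frac{15}{2}\, h_0''(b) \sqrt{b - a} \right)^{2/7}. \tag{4.5} \]
We then have
\[ \lim_{n \to \infty} n^{2/7} f_b(u_n) = u, \quad \text{and} \quad \lim_{n \to \infty} n^{2/7} f_b(v_n) = v. \tag{4.6} \]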
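The limits in (4.6) use only that $f_b(b) = 0$ and $f_b'(b) = c$. The following minimal numerical sketch illustrates this with a hypothetical stand-in for $f_b$ (a quadratic polynomial chosen for illustration, not the map defined in (3.44)); the names `scaled_map` and `f_b_model` and all numerical values are assumptions.

```python
def scaled_map(f, b, c, u, n):
    """Return n^(2/7) * f(b + u / (c * n^(2/7))), the quantity appearing in (4.6)."""
    scale = n ** (2.0 / 7.0)
    return scale * f(b + u / (c * scale))


# Hypothetical model data: endpoint b, derivative c = f'(b), and a quadratic correction.
b, c = 1.0, 2.0


def f_b_model(z):
    """Stand-in for f_b: vanishes at b, has derivative c there (illustration only)."""
    return c * (z - b) + 0.7 * (z - b) ** 2


u = 1.5
# Deviation from the limit u for increasing n; it decays like n^(-2/7).
errors = [abs(scaled_map(f_b_model, b, c, u, n) - u) for n in (1e2, 1e4, 1e6, 1e8)]
```

For this model the deviation equals $0.7\,u^2 / (c^2 n^{2/7})$, so the convergence in (4.6) is slow in $n$, in line with the $n^{-1/7}$ scale of the error terms elsewhere in the analysis.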
Furthermore, since $f_1(b) = c_1$ and $f_2(b) = c_2$ (see (3.46)) we have in the limit as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds,
\[ \lim n^{6/7} s f_1(u_n) = s_0, \qquad \lim n^{6/7} s f_1(v_n) = s_0, \tag{4.7} \]
\[ \lim n^{4/7} t f_2(u_n) = t_0, \qquad \lim n^{4/7} t f_2(v_n) = t_0. \tag{4.8} \]
Now, a similar argument as in [24] shows that
\[ \lim \bigl(E^{(b)}\bigr)^{-1}(v_n) R(v_n)^{-1} R(u_n) E^{(b)}(u_n) = I. \tag{4.9} \]
Inserting (4.6)–(4.9) into (4.4) and using the fact that $\widehat{\Psi}_{11} = \Phi_1$ and $\widehat{\Psi}_{21} = \Phi_2$ it is then straightforward to obtain
\[ \lim \frac{1}{c n^{2/7}} K_n^{(s,t)}(u_n, v_n) = \frac{1}{2\pi i (u - v)} \begin{pmatrix} 0 & 1 \end{pmatrix} \widehat{\Psi}^{-1}(v; s_0, t_0)\, \widehat{\Psi}(u; s_0, t_0) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{\Phi_1(u; s_0, t_0) \Phi_2(v; s_0, t_0) - \Phi_1(v; s_0, t_0) \Phi_2(u; s_0, t_0)}{-2\pi i (u - v)}, \tag{4.10} \]
where we take the limit $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. This completes the proof of Theorem 1.7.

5. Asymptotics of the Recurrence Coefficients

We will now determine the asymptotics of $a_n^{(n,s,t)}$ and $b_n^{(n,s,t)}$ as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. In order to do this, we make use of the following result, see e.g. [8, 11]. Let $Y$ be the unique solution of the RH problem for $Y$. There exist $2 \times 2$ constant (independent of $z$ but depending on $n$, $s$ and $t$) matrices $Y_1$ and $Y_2$ such that
\[ Y(z) \begin{pmatrix} z^{-n} & 0 \\ 0 & z^n \end{pmatrix} = I + \frac{Y_1}{z} + \frac{Y_2}{z^2} + O(1/z^3), \quad \text{as } z \to \infty, \tag{5.1} \]
and
\[ a_n^{(n,s,t)} = \sqrt{(Y_1)_{12} (Y_1)_{21}}, \qquad b_n^{(n,s,t)} = (Y_1)_{11} + \frac{(Y_2)_{12}}{(Y_1)_{12}}. \tag{5.2} \]
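We need to determine the constant matrices $Y_1$ and $Y_2$. For large $|z|$ it follows from (3.9), (3.18), and (3.50), that
\[ Y(z) = e^{\frac{1}{2} n \ell_{s,t} \sigma_3} R(z) P^{(\infty)}(z) e^{n g_{s,t}(z) \sigma_3} e^{-\frac{1}{2} n \ell_{s,t} \sigma_3}. \tag{5.3} \]
So, in order to determine $Y_1$ and $Y_2$ we need the asymptotic behavior of $P^{(\infty)}(z)$, $e^{n g_{s,t}(z)\sigma_3}$, and $R(z)$ as $z \to \infty$.

Asymptotic behavior of $P^{(\infty)}(z)$ as $z \to \infty$. Expanding the factor $((z - b)/(z - a))^{\sigma_3/4}$ in (3.20) at $z = \infty$ it is clear that
\[ P^{(\infty)}(z) = I + \frac{P_1^{(\infty)}}{z} + \frac{P_2^{(\infty)}}{z^2} + O(1/z^3), \quad \text{as } z \to \infty, \tag{5.4} \]
with
\[ P_1^{(\infty)} = \frac{i}{4} (b - a) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad P_2^{(\infty)} = \frac{i}{8} (b^2 - a^2) \begin{pmatrix} * & 1 \\ -1 & * \end{pmatrix}. \tag{5.5} \]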
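Asymptotic behavior of $e^{n g_{s,t}(z)\sigma_3}$ as $z \to \infty$. By (3.5) we have
\[ e^{n g_{s,t}(z)\sigma_3} \begin{pmatrix} z^{-n} & 0 \\ 0 & z^n \end{pmatrix} = I + \frac{G_1}{z} + \frac{G_2}{z^2} + O(1/z^3), \quad \text{as } z \to \infty, \tag{5.6} \]
with
\[ G_1 = -n \int_a^b u\, d\nu_{s,t}(u) \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad G_2 = \begin{pmatrix} * & 0 \\ 0 & * \end{pmatrix}. \tag{5.7} \]

Asymptotic behavior of $R(z)$ as $z \to \infty$. As in [11] the matrix valued function $R$ has the following asymptotic behavior at infinity:
\[ R(z) = I + \frac{R_1}{z} + \frac{R_2}{z^2} + O(1/z^3), \quad \text{as } z \to \infty. \tag{5.8} \]
The compatibility with (3.60), (3.63), and (3.66) yields that
\[ R_1 = \bigl(A^{(1)} + B^{(1)}\bigr) n^{-1/7} + \bigl(A^{(2)} + B^{(2)}\bigr) n^{-2/7} + O(n^{-3/7}), \tag{5.9} \]
\[ R_2 = \bigl(a A^{(1)} + b B^{(1)}\bigr) n^{-1/7} + \bigl(a A^{(2)} + b B^{(2)}\bigr) n^{-2/7} + O(n^{-3/7}), \tag{5.10} \]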
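as $n \to \infty$ and $s, t \to 0$ such that (1.32) holds. Here, $A^{(1)}$, $B^{(1)}$, $A^{(2)}$, and $B^{(2)}$ are given by (3.64), (3.65), (3.67), and (3.68), respectively. Now, we are ready to determine the asymptotics of the recurrence coefficients.

Proof of Theorem 1.11. Note that by (5.3), (5.4), (5.6) and (5.8),
\[ Y_1 = e^{\frac{1}{2} n \ell_{s,t} \sigma_3} \left[ P_1^{(\infty)} + G_1 + R_1 \right] e^{-\frac{1}{2} n \ell_{s,t} \sigma_3} \tag{5.11} \]
and
\[ Y_2 = e^{\frac{1}{2} n \ell_{s,t} \sigma_3} \left[ P_2^{(\infty)} + G_2 + R_2 + R_1 P_1^{(\infty)} + \bigl(P_1^{(\infty)} + R_1\bigr) G_1 \right] e^{-\frac{1}{2} n \ell_{s,t} \sigma_3}. \tag{5.12} \]
We start with the recurrence coefficient $a_n^{(n,s,t)}$. Inserting (5.11) into (5.2), and using the facts that $(P_1^{(\infty)})_{12} = -(P_1^{(\infty)})_{21} = i(b - a)/4$ (by (5.5)) and $(G_1)_{12} = (G_1)_{21} = 0$ (by (5.7)), we obtain
\[ a_n^{(n,s,t)} = \left[ \left( \frac{b - a}{4} \right)^2 + i\, \frac{b - a}{4} \bigl( (R_1)_{21} - (R_1)_{12} \bigr) + (R_1)_{12} (R_1)_{21} \right]^{1/2}. \tag{5.13} \]
Now, from the formula (5.9) for $R_1$ and the formulas (3.64), (3.65), (3.67), and (3.68) for $A^{(1)}$, $B^{(1)}$, $A^{(2)}$, and $B^{(2)}$, we have
\[ (R_1)_{21} - (R_1)_{12} = -i \left[ \frac{y}{f_b'(b)} + \left( \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^2}{4 \bigl(-f_a'(a)\bigr)^{1/2}} + \frac{h}{f_b'(b)^{1/2}} \right)^2 \right] n^{-2/7} + O(n^{-3/7}), \]
and
\[ (R_1)_{12} (R_1)_{21} = -\frac{b - a}{4} \left( \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^2}{4 \bigl(-f_a'(a)\bigr)^{1/2}} + \frac{h}{f_b'(b)^{1/2}} \right)^2 n^{-2/7} + O(n^{-3/7}). \]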
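The algebra behind (5.13), and the cancellation that follows when the two expansions above are inserted, can be checked numerically. The sketch below is illustrative only: the endpoints, the values standing in for $y$, $c = f_b'(b)$ and the bracketed combination (called `X` here), the free coefficient `r12`, and `delta` (standing for $n^{-1/7}$) are all hypothetical sample numbers, and higher-order terms are simply dropped.

```python
import math


def a_squared_direct(b, a, R12, R21):
    """(Y1)_12 (Y1)_21 with (P1)_12 = -(P1)_21 = i(b-a)/4 and (G1)_12 = (G1)_21 = 0."""
    P12 = 1j * (b - a) / 4
    return (P12 + R12) * (-P12 + R21)


def a_squared_513(b, a, R12, R21):
    """Expression under the square root in (5.13)."""
    q = (b - a) / 4
    return q ** 2 + 1j * q * (R21 - R12) + R12 * R21


# Hypothetical sample data (not taken from the paper).
a_end, b_end = -1.0, 2.0
c, y, X = 1.3, 0.4, 0.9
delta = 1e-3                      # plays the role of n^(-1/7)
root = math.sqrt(b_end - a_end)

# Entries of R1 shaped like the expansions in the text:
# equal first-order parts, second-order parts differing by -i (y/c + X^2).
r12 = 0.25j                       # arbitrary second-order coefficient
r21 = r12 - 1j * (y / c + X ** 2)
R12 = 0.5j * root * X * delta + r12 * delta ** 2
R21 = 0.5j * root * X * delta + r21 * delta ** 2

lhs = a_squared_direct(b_end, a_end, R12, R21)
rhs = a_squared_513(b_end, a_end, R12, R21)

q = (b_end - a_end) / 4
a_n = rhs ** 0.5                  # principal square root
predicted = q + y / (2 * c) * delta ** 2
```

In this model the $X^2$ contributions cancel between the two terms, so `a_n` agrees with `predicted`, that is with $(b - a)/4 + \frac{y}{2c}\, n^{-2/7}$, up to terms of order `delta ** 3`.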
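Note that we have used $y$ to denote $y(c_1 n^{6/7} s, c_2 n^{4/7} t)$ for brevity. Inserting the latter two equations into (5.13) and using the fact that $f_b'(b) = c$ (by (3.44)) we then obtain (1.39).

We will now consider the recurrence coefficient $b_n^{(n,s,t)}$. Inserting (5.11) and (5.12) into (5.2), and using the facts that $(P_1^{(\infty)})_{11} = (P_1^{(\infty)})_{22} = 0$, $(G_1)_{12} = (G_2)_{12} = 0$, $(G_1)_{11} + (G_1)_{22} = 0$, and $R_1 = O(n^{-1/7})$, we obtain
\begin{align*}
b_n^{(n,s,t)} &= (R_1)_{11} + \frac{(P_2^{(\infty)})_{12} + (R_1)_{11} (P_1^{(\infty)})_{12} + (R_2)_{12}}{(P_1^{(\infty)})_{12} + (R_1)_{12}} \\
&= (R_1)_{11} + \left[ \frac{(P_2^{(\infty)})_{12}}{(P_1^{(\infty)})_{12}} + (R_1)_{11} + \frac{(R_2)_{12}}{(P_1^{(\infty)})_{12}} \right] \times \left[ 1 - \frac{(R_1)_{12}}{(P_1^{(\infty)})_{12}} + \frac{(R_1)_{12}^2}{(P_1^{(\infty)})_{12}^2} + O(n^{-3/7}) \right].
\end{align*}
Since $(P_1^{(\infty)})_{12} = i(b - a)/4$, $(P_2^{(\infty)})_{12} = i(b^2 - a^2)/8$, $R_1 = O(n^{-1/7})$, and $R_2 = O(n^{-1/7})$ we then obtain after a straightforward calculation and combining terms,
\[ b_n^{(n,s,t)} = \frac{b + a}{2} + \left[ 2 (R_1)_{11} + 2i\, \frac{b + a}{b - a} (R_1)_{12} - \frac{4i}{b - a} (R_2)_{12} \right] \left[ 1 + \frac{4i}{b - a} (R_1)_{12} \right] - \frac{4i}{b - a} (R_1)_{11} (R_1)_{12} + O(n^{-3/7}). \tag{5.14} \]
Now, from (5.9), (5.10), (3.64), (3.65), (3.67), and (3.68) we have
\begin{align*}
2 (R_1)_{11} + 2i\, \frac{b + a}{b - a} (R_1)_{12} - \frac{4i}{b - a} (R_2)_{12} &= 2i \bigl( A^{(2)}_{12} - B^{(2)}_{12} \bigr) n^{-2/7} + O(n^{-3/7}) \\
&= \left[ \frac{y}{f_b'(b)} + \frac{h^2}{f_b'(b)} - \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^4}{16 \bigl(-f_a'(a)\bigr)} \right] n^{-2/7} + O(n^{-3/7}),
\end{align*}
and
\begin{align*}
(R_1)_{11} (R_1)_{12} &= -i \left[ \bigl(A^{(1)}_{12}\bigr)^2 - \bigl(B^{(1)}_{12}\bigr)^2 \right] n^{-2/7} + O(n^{-3/7}) \\
&= -i\, \frac{b - a}{4} \left[ \frac{h^2}{f_b'(b)} - \frac{\bigl(n^{4/7} r_{s,t}(a)\bigr)^4}{16 \bigl(-f_a'(a)\bigr)} \right] n^{-2/7} + O(n^{-3/7}).
\end{align*}
Inserting the latter two equations into (5.14) and using the facts that $(R_1)_{12} = O(n^{-1/7})$ and $f_b'(b) = c$ we obtain (1.40). So, the theorem is proven.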
Acknowledgements. We thank Arno Kuijlaars for careful reading and stimulating discussions. The authors are supported by FWO research project G.0455.04, by K.U.Leuven research grant OT/04/24, and by INTAS Research Network NeCCA 03-51-6637. The second author is Postdoctoral Fellow of the Fund for Scientific Research - Flanders (Belgium).
References
1. Bleher, P., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann-Hilbert problem, and universality in the matrix model. Ann. Math. 150, 185–266 (1999)
2. Bleher, P., Its, A.: Double scaling limit in the random matrix model: the Riemann-Hilbert approach. Comm. Pure Appl. Math. 56, 433–516 (2003)
3. Bowick, M.J., Brézin, E.: Universal scaling of the tail of the density of eigenvalues in random matrix models. Phys. Lett. B 268(1), 21–28 (1991)
4. Brézin, E., Marinari, E., Parisi, G.: A non-perturbative ambiguity free solution of a string model. Phys. Lett. B 242(1), 35–38 (1990)
5. Claeys, T., Kuijlaars, A.B.J.: Universality of the double scaling limit in random matrix models. Comm. Pure Appl. Math. 59, 1573–1603 (2006)
6. Claeys, T., Kuijlaars, A.B.J., Vanlessen, M.: Multi-critical unitary random matrix ensembles and the general Painlevé II equation. http://arxiv.org/list/math-ph/0508062, to appear in Ann. Math.
7. Claeys, T., Vanlessen, M.: The existence of a real pole-free solution of the fourth order analogue of the Painlevé I equation. Nonlinearity 20, 1163–1184 (2007)
8. Deift, P.: Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach. Courant Lecture Notes 3, New York: New York University, 1999
9. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
10. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure Appl. Math. 52, 1335–1425 (1999)
11. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R., Venakides, S., Zhou, X.: Strong asymptotics of orthogonal polynomials with respect to exponential weights. Comm. Pure Appl. Math. 52, 1491–1552 (1999)
12. Deift, P., McLaughlin, K.T-R.: A continuum limit of the Toda lattice. Vol. 131, Memoirs of the Amer. Math. Soc. 624, Providence, RI: Amer. Math. Soc., 1998
13. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the MKdV equation. Ann. Math. 137, 295–368 (1993)
14. Dubrovin, B., Liu, S.-Q., Zhang, Y.: On Hamiltonian perturbations of hyperbolic systems of conservation laws I: quasi-triviality of bi-Hamiltonian perturbations. Comm. Pure Appl. Math. 59(4), 559–615 (2006)
15. Dubrovin, B.: On Hamiltonian perturbations of hyperbolic systems of conservation laws, II: universality of critical behaviour. Commun. Math. Phys. 267, 117–139 (2006)
16. Duits, M., Kuijlaars, A.B.J.: Painlevé I asymptotics for orthogonal polynomials with respect to a varying quartic weight. Nonlinearity 19, 2211–2245 (2006)
17. Fokas, A.S., Its, A.R., Kitaev, A.V.: The isomonodromy approach to matrix models in 2D quantum gravity. Commun. Math. Phys. 147, 395–430 (1992)
18. Hastings, S.P., McLeod, J.B.: A boundary value problem associated with the second Painlevé transcendent and the Korteweg-de Vries equation. Arch. Rat. Mech. Anal. 73, 31–51 (1980)
19. Kapaev, A.A.: Weakly nonlinear solutions of equation $P_I^2$. J. Math. Sc. 73(4), 468–481 (1995)
20. Kawai, T., Koike, T., Nishikawa, Y., Takei, Y.: On the Stokes geometry of higher order Painlevé equations. Analyse Complexe, systèmes Dynamiques, sommabilité Des séries Divergentes Et théories Galoisiennes. II. Astérisque 297, 117–166 (2004)
21. Kuijlaars, A.B.J., McLaughlin, K.T-R.: Generic behavior of the density of states in random matrix theory and equilibrium problems in the presence of real analytic external fields. Comm. Pure Appl. Math. 53, 736–785 (2000)
22. Kuijlaars, A.B.J., McLaughlin, K.T-R., Van Assche, W., Vanlessen, M.: The Riemann–Hilbert approach to strong asymptotics for orthogonal polynomials. Adv. Math. 188(2), 337–398 (2004)
23. Kudryashov, N.A., Soukharev, M.B.: Uniformization and transcendence of solutions for the first and second Painlevé hierarchies. Phys. Lett. A 237(4–5), 206–216 (1998)
24. Kuijlaars, A.B.J., Vanlessen, M.: Universality for eigenvalue correlations at the origin of the spectrum. Commun. Math. Phys. 243, 163–191 (2003)
25. Mehta, M.L.: Random Matrices. 2nd ed. Boston: Academic Press, 1991
26. Moore, G.: Geometry of the string equations. Commun. Math. Phys. 133(2), 261–304 (1990)
27. Pastur, L., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86(1–2), 109–147 (1997)
28. Saff, E.B., Totik, V.: Logarithmic Potentials with External Fields. New York: Springer-Verlag, 1997
29. Shcherbina, M.: Double scaling limit for matrix models with non analytic potentials. http://arxiv.org/list/cond-math/0511161, 2005
30. Szegő, G.: "Orthogonal polynomials". 3rd ed., Providence, RI: Amer. Math. Soc. 1974
31. Vanlessen, M.: Strong asymptotics of the recurrence coefficients of orthogonal polynomials associated to the generalized Jacobi weight. J. Approx. Theory 125, 198–237 (2003)
32. Vanlessen, M.: Strong asymptotics of Laguerre-type orthogonal polynomials and applications in random matrix theory. http://arxiv.org/list/math.CA/0504604, 2005, to appear in Constr. Approx.

Communicated by B. Simon
Commun. Math. Phys. 273, 533–559 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0220-8
Communications in
Mathematical Physics
Scattering Solutions in Networks of Thin Fibers: Small Diameter Asymptotics

S. Molchanov, B. Vainberg

Dept. of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA. E-mail: [email protected]

Received: 24 July 2006 / Accepted: 5 October 2006
Published online: 13 March 2007 – © Springer-Verlag 2007

Abstract: Small diameter asymptotics is obtained for scattering solutions in a network of thin fibers. The asymptotics is expressed in terms of solutions of related problems on the limiting quantum graph $\Gamma$. We calculate the Lagrangian gluing conditions at vertices $v \in \Gamma$ for the problems on the limiting graph. If the frequency of the incident wave is above the bottom of the absolutely continuous spectrum, the gluing conditions are formulated in terms of the scattering data for each individual junction of the network.

1. Formulation of the Problem and Statement of the Results

The paper concerns the asymptotic analysis of wave propagation through a system of wave guides when the thickness $\varepsilon$ of the wave guides is very small and the wave length is comparable to $\varepsilon$. The problem is described by the stationary wave (Helmholtz) equation
\[ -\varepsilon^2 \Delta u = \lambda u, \quad x \in \Omega_\varepsilon, \tag{1} \]
in a domain $\Omega_\varepsilon \subset \mathbb{R}^d$, $d \geq 2$, with infinitely smooth boundary (for simplicity) which has the following structure: $\Omega_\varepsilon$ is a union of a finite number of cylinders $C_{j,\varepsilon}$ (which we shall call channels), $1 \leq j \leq N$, of lengths $l_j$ with the diameters of cross-sections of order $O(\varepsilon)$ and domains $J_{1,\varepsilon}, \ldots, J_{M,\varepsilon}$ (which we shall call junctions) connecting the channels into a network. It is assumed that the junctions have diameters of the same order $O(\varepsilon)$. Let $m$ channels have infinite length. We start the numeration of $C_{j,\varepsilon}$ with the infinite channels. So, $l_j = \infty$ for $1 \leq j \leq m$. The axes of the channels form edges $\Gamma_j$ of the limiting ($\varepsilon \to 0$) metric graph $\Gamma$. The vertices $v_j \in V$ of the graph $\Gamma$ correspond to the junctions $J_{j,\varepsilon}$.

Fig. 1. An example of a domain $\Omega_\varepsilon$ with four junctions, four unbounded channels and four bounded channels.

The Helmholtz equation in $\Omega_\varepsilon$ must be complemented by the boundary conditions (BC) on $\partial\Omega_\varepsilon$. In some cases (for instance, when studying heat transport in $\Omega_\varepsilon$) the Neumann BC is natural. In fact, the Neumann BC presents the simplest case due to the existence of a simple ground state (a constant) of the problem in $\Omega_\varepsilon$. However, in many applications, the Dirichlet, Robin or impedance BC are more important. We shall consider (apart from a general discussion) only the Dirichlet BC, but all the arguments and results can be modified to be applied to the problem with other BC.

The authors were supported partially by the NSF grant DMS-0405927.

An important class of domains $\Omega_\varepsilon$ are self-similar domains with only one junction and all the channels being infinite. We will call them spider domains. Thus, if $\Omega_\varepsilon$ is a spider domain, then there exist a point $\hat{x} = \hat{x}(\varepsilon)$ and an $\varepsilon$-independent domain $\Omega$ such that
\[ \Omega_\varepsilon = \{ (\hat{x} + \varepsilon x) : x \in \Omega \}. \tag{2} \]
Thus, $\Omega_\varepsilon$ is the $\varepsilon$-contraction of $\Omega = \Omega_1$.

For any $\Omega_\varepsilon$, let $J_{j(v),\varepsilon}$ be the junction which corresponds to a vertex $v \in V$ of the limiting graph $\Gamma$. Consider a junction $J_{j(v),\varepsilon}$ and all the channels adjacent to $J_{j(v),\varepsilon}$. If some of these channels have a finite length, we extend them to infinity. We assume that, for each $v \in V$, the resulting domain $\Omega_{v,\varepsilon}$, which consists of the junction $J_{j(v),\varepsilon}$ and the semi-infinite channels emanating from it, is a spider domain (i.e., $\Omega_{v,\varepsilon}$ is self-similar). This assumption can be weakened. For example, one can consider some type of "curved" channels, and the final results (with some changes) will remain valid. Simple equations on the limiting graph in this case will be replaced by more complicated equations with variable coefficients. However, even a small deviation from the assumption on the self-similarity of $\Omega_{v,\varepsilon}$ would make the statement of the results and the proofs much more technical. So, we consider only domains $\Omega_\varepsilon$ for which $\Omega_{v,\varepsilon}$, $v \in V$, are self-similar.

Hence, the cross sections $\omega_{j,\varepsilon}$ of channels $C_{j,\varepsilon}$ are $\varepsilon$-homotheties of bounded domains $\omega_j \subset \mathbb{R}^{d-1}$. Let $\lambda_{j,0} < \lambda_{j,1} \leq \lambda_{j,2} \leq \ldots$ be eigenvalues of the negative Laplacian $-\Delta_{d-1}$ in $\omega_j$ with the Dirichlet boundary condition on $\partial\omega_j$, and let $\{\varphi_{j,n}\}$ be the set of corresponding orthonormal eigenfunctions. The eigenvalues $\lambda_{j,n}$ coincide with the eigenvalues of $-\varepsilon^2 \Delta_{d-1}$ in $\omega_{j,\varepsilon}$. In the presence of infinite channels, the spectrum of the operator $-\varepsilon^2 \Delta$ in $\Omega_\varepsilon$ with the Dirichlet boundary condition on $\partial\Omega_\varepsilon$ has an absolutely continuous component which coincides with the semi-bounded interval $[\lambda_0, \infty)$, where
\[ \lambda_0 = \min_{1 \leq j \leq m} \lambda_{j,0}. \tag{3} \]
Equation (1) is considered under the assumption that $\lambda \geq \lambda_0$, when propagation of waves is possible. There are two very different cases: $\lambda \to \lambda_0$ as $\varepsilon \to 0$, i.e. the frequency is at the edge (or bottom) of the absolutely continuous spectrum, or $\lambda \to \hat{\lambda} > \lambda_0$, i.e. the
frequency is above the bottom of the absolutely continuous spectrum. There are many results about the first case, the references will be given later. This paper concerns the asymptotic analysis of the scattering solutions for the Dirichlet problem in $\Omega_\varepsilon$ when $\lambda$ is close to $\hat{\lambda} > \lambda_0$.

If $\varepsilon \to 0$, one can expect that the solution $u_\varepsilon$ of (1) in $\Omega_\varepsilon$ can be described in terms of the solution $\varsigma = \varsigma_\varepsilon(t)$ of a much simpler problem on the graph $\Gamma$. For example, if $\lambda_{j,0} < \lambda < \lambda_{j,1}$ for all $j$, then $\varsigma$ satisfies the following equation on each edge of the graph
\[ -\varepsilon^2 \frac{d^2 \varsigma(t)}{dt^2} = (\lambda - \lambda_{j,0}) \varsigma(t), \tag{4} \]
where $t$ is the length parameter on the edges. One has to add appropriate gluing conditions (GC) at the vertices $v$ of $\Gamma$. These gluing conditions give basic information on the propagation of waves through the junctions. They define the solution $\varsigma$ of the problem (4) on the limiting graph. The ordinary differential equation (4), the GC, and the solution $\varsigma$ depend on $\varepsilon$. However, we shall often call the corresponding problem on the graph the limiting problem, since it enables one to find the main term of the asymptotics as $\varepsilon \to 0$ for the solution $u = u_\varepsilon$ of the problem (1) in $\Omega_\varepsilon$. One of the main difficulties in the problem under investigation was to find the GC, in particular, since the GC differ dramatically from those which were known in the case of $\lambda$ close to the bottom of the spectrum.

Let us define the scattering solutions for the Dirichlet problem in $\Omega_\varepsilon$. We introduce local coordinates $(t, y)$ in each channel $C_{j,\varepsilon}$ with $t$ axis parallel to the cylinder $C_{j,\varepsilon}$, $0 < t < l_j$, and $y \in \mathbb{R}^{d-1}$ being Euclidean coordinates in the plane perpendicular to the $t$ axis. The coordinate $y$ is chosen in such a way that $\omega_{j,\varepsilon} = \{ \varepsilon y : y \in \omega_j \subset \mathbb{R}^{d-1} \}$. For each $j$, the set $\{ \varepsilon^{\frac{1-d}{2}} \varphi_{j,n}(y/\varepsilon) \}$ is the orthonormal basis in $L^2(\omega_{j,\varepsilon})$ consisting of eigenfunctions of the operator $-\varepsilon^2 \Delta_{d-1}$.

Let $l$ be a bounded closed interval of the real axis which does not contain the points $\lambda_{j,n}$, $j \leq N$. Thus, there exist $m_j \geq 1$ such that $\lambda_{j,m_j} < \lambda < \lambda_{j,m_j+1}$ for all $\lambda \in l$. As will be seen from the definitions below, $m_j + 1$ is the number of waves which may propagate in each direction in the channel $C_{j,\varepsilon}$ without loss of energy and with frequencies less than $\sqrt{\lambda}$, $\lambda \in l$. We put $m_j = -1$, thus $\{\lambda_{j,n},\ 0 \leq n \leq m_j\}$ is the empty set, if $\lambda_{j,0} > \lambda$ for $\lambda \in l$.

Consider the non-homogeneous Dirichlet problem
\[ (-\varepsilon^2 \Delta - \lambda) u = f, \quad x \in \Omega_\varepsilon; \qquad u = 0 \text{ on } \partial\Omega_\varepsilon. \tag{5} \]
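The count $m_j + 1$ of propagating waves introduced above can be made concrete in the simplest geometry. The sketch below assumes $d = 2$ and a cross-section $\omega_j$ that is an interval of length `width`, so that the Dirichlet thresholds are $\lambda_{j,n} = ((n+1)\pi/\text{width})^2$; this geometry, the function names, and the sample values are illustrative assumptions, not part of the paper.

```python
import math


def dirichlet_threshold(width, n):
    """n-th Dirichlet eigenvalue (n = 0, 1, ...) of -d^2/dy^2 on the interval (0, width)."""
    return ((n + 1) * math.pi / width) ** 2


def propagating_modes(width, lam):
    """Return m_j + 1: the number of thresholds lying strictly below lam (0 if none)."""
    count = 0
    while dirichlet_threshold(width, count) < lam:
        count += 1
    return count


def wavenumbers(width, lam, eps):
    """Longitudinal wavenumbers sqrt(lam - lambda_{j,n}) / eps of the propagating modes."""
    return [
        math.sqrt(lam - dirichlet_threshold(width, n)) / eps
        for n in range(propagating_modes(width, lam))
    ]


# Two hypothetical channels of unscaled widths 1 and 2 at spectral parameter lam = 50.
widths = [1.0, 2.0]
lam = 50.0
counts = [propagating_modes(w, lam) for w in widths]
```

Because the thresholds $\lambda_{j,n}$ do not depend on $\varepsilon$, the counts are the same for every $\varepsilon$; only the wavenumbers $\sqrt{\lambda - \lambda_{j,n}}/\varepsilon$ grow as the channels become thinner.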
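Definition 1. Let $f \in L^2_{com}(\Omega_\varepsilon)$ have a compact support, and $\lambda \in l$. A solution $u$ of (5) is called an outgoing solution if it has the following asymptotic behavior at infinity in each infinite channel $C_{j,\varepsilon}$, $1 \leq j \leq m$:
\[ u = \sum_{n=0}^{m_j} a_{j,n}\, e^{i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t} \varphi_{j,n}(y/\varepsilon) + O(e^{-\gamma t}), \quad \gamma = \gamma(\varepsilon) > 0. \tag{6} \]

Definition 2. A function $\Psi = \Psi^{(\varepsilon)}_{s,k}$, $1 \leq s \leq m$, $0 \leq k \leq m_j$, is called a solution of the scattering problem in $\Omega_\varepsilon$ if
\[ (-\varepsilon^2 \Delta - \lambda) \Psi = 0, \quad x \in \Omega_\varepsilon; \qquad \Psi = 0 \text{ on } \partial\Omega_\varepsilon, \tag{7} \]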
and has the following asymptotic behavior at infinity in each infinite channel C j,ε , 1 ≤ j ≤ m: (ε) s,k
= δs, j e
−i
√
λ−λs,k ε
t
ϕs,k (y/ε) +
mj
√ t j,n e
i
λ−λ j,n ε
t
ϕ j,n (y/ε) + O(e−γ t ),
(8)
n=0
where $\gamma = \gamma(\varepsilon) > 0$, and $\delta_{s,j}$ is the Kronecker symbol, i.e. $\delta_{s,j} = 1$ if $s = j$, $\delta_{s,j} = 0$ if $s \ne j$. The first term in (8) corresponds to the incident wave, and all other terms describe the transmitted waves. The incident wave depends on $s$ and $k$, where $s$ determines the channel, and $s$ and $k$ together determine the frequency of the incident wave. The transmission coefficients $t_{j,n}$ also depend on $s$ and $k$ (i.e. on the choice of the incident wave), so sometimes we will denote them by $t^{s,k}_{j,n}$.

We introduce an order in the set of incident waves and corresponding scattering solutions and the same order in the set of transmitted waves. Namely, we number the incident waves in the channel $C_{1,\varepsilon}$ taking them in the order of increase of absolute values of their frequencies, then we number all the solutions in the channel $C_{2,\varepsilon}$, and so on. With this order taken into account, the transmission coefficients for a particular scattering solution form a column vector with
$$M = \sum_{j=1}^{m} (m_j + 1) \tag{9}$$
entries. Together, they form an $M \times M$ scattering matrix
$$T = \{t^{s,k}_{j,n}\}, \tag{10}$$
where $s, k$ define the column of $T$ and $j, n$ define the row. We denote by $D$ the diagonal $M \times M$ matrix with elements $\sqrt{\lambda - \lambda_{j,n}}$ on the diagonal taken in the same order as above. The following statement can be useful in some applications, and will be proved in the next section (although it will not be used in this paper).

Theorem 3. The matrix $D^{1/2} T D^{-1/2}$ is unitary and symmetric.

The operator $H = -\varepsilon^2 \Delta$ with the Dirichlet boundary conditions on $\partial\Omega_\varepsilon$ is non-negative, and therefore the resolvent
$$R_\lambda = (-\varepsilon^2 \Delta - \lambda)^{-1} : L^2(\Omega_\varepsilon) \to L^2(\Omega_\varepsilon) \tag{11}$$
is analytic in the complex $\lambda$ plane outside the positive semi-axis $\lambda \ge 0$. Hence, the operator $R_{k^2}$ is analytic in $k$ in the half plane $\mathrm{Im}\, k > 0$. We are going to consider an analytic extension of the operator $R_{k^2}$ onto the real axis and into the lower half plane. Such an extension does not exist if $R_{k^2}$ is considered as an operator in $L^2(\Omega_\varepsilon)$, since $R_{k^2}$ is an unbounded operator when $\lambda = k^2$ belongs to the spectrum of the operator $H$. However, one can extend $R_{k^2}$ analytically if it is considered as an operator in the following spaces (with a smaller domain and a larger range):
$$R_{k^2} : L^2_{com}(\Omega_\varepsilon) \to L^2_{loc}(\Omega_\varepsilon). \tag{12}$$
Theorem 4. (1) The spectrum of the operator $H = -\varepsilon^2 \Delta$ in $\Omega_\varepsilon$ with the Dirichlet boundary conditions on $\partial\Omega_\varepsilon$ consists of the absolutely continuous component $[\lambda_0, \infty)$, where $\lambda_0 > 0$ is given by (3) and, possibly, a discrete set of positive eigenvalues $\{\lambda_{j,\varepsilon}\}$ with the only possible limiting point at infinity. The multiplicity of the a.c. spectrum changes at points $\lambda = \lambda_{j,n}$, and at any point $\lambda$, it is equal to the number of points $\lambda_{j,n}$, $1 \le j \le m$, located below $\lambda$. The eigenvalues $\lambda_{j,\varepsilon} = \lambda_j$ for spider domains $\Omega_\varepsilon$ do not depend on $\varepsilon$.

(2) The operator (12) admits a meromorphic extension from the upper half plane $\mathrm{Im}\, k > 0$ into the lower half plane $\mathrm{Im}\, k < 0$ with the branch points at $k = \pm\sqrt{\lambda_{j,n}}$ of the second order and the real poles at $k = \pm\sqrt{\lambda_{s,\varepsilon}}$ and, perhaps, at some of the branch points. The resolvent (12) has a pole at $k = \pm\sqrt{\lambda_{j,n}}$ if and only if the homogeneous problem (5) with $\lambda = \lambda_{j,n}$ has a nontrivial solution $u$ such that
$$u = \sum_{j,n:\, \lambda_{j,n} = \lambda} a_{j,n}\, \varphi_{j,n}(y/\varepsilon) + O(e^{-\gamma t}), \quad x \in C_{j,\varepsilon}, \ t \to \infty, \ 1 \le j \le m. \tag{13}$$

(3) If $f \in L^2_{com}(\Omega_\varepsilon)$, and $k = \sqrt{\lambda}$ is real and is not a pole or a branch point of the operator (12), and $\lambda > \lambda_0$, then the problem (5), (6) is uniquely solvable and the outgoing solution $u$ can be found as the $L^2_{loc}(\Omega_\varepsilon)$ limit
$$u = R_{\lambda + i0} f. \tag{14}$$

(4) There exist exactly $M$ (see (9)) different scattering solutions for values of $\lambda > \lambda_0$ such that $k = \sqrt{\lambda}$ is not a pole or a branch point of the operator (12), and the scattering solution is defined uniquely after the incident wave is chosen.

Remark 1. The operator $H = -\varepsilon^2 \Delta$ and its domain depend on $\varepsilon$. One could use the term "family of operators" when referring to $H$. We prefer to drop the word "family", but one must always keep in mind that $H$ depends on $\varepsilon$.
2. Existence of a pole of the operator (12) at a branch point means that $R_{k^2}$ has a pole at $z = 0$ if this operator function is considered as a function of $z = \sqrt{k^2 - \lambda_{j,n}}$.
3. One cannot identify poles of the resolvent and eigenvalues of the operator based only on general theorems of functional analysis, since we deal with the poles of the modified resolvent (12) which belong to the absolutely continuous spectrum of the operator.
4. The eigenvalues $\lambda_{j,\varepsilon}$ of the operator $H$ can be embedded into the absolutely continuous spectrum, and can be located below the absolutely continuous spectrum. In particular, from the minimax principle it follows that $H$ necessarily has a non-empty discrete spectrum below $\lambda_0$ if at least one of the junctions is wide enough. For example, non-empty discrete spectrum below $\lambda_0$ exists if a junction contains a ball $B_\rho$ of the radius $\rho = r\varepsilon$ such that the negative Dirichlet Laplacian in the ball $B_r$ has an eigenvalue below $\lambda_0$.
Let us describe the asymptotic behavior of scattering solutions $\Psi = \Psi^{(\varepsilon)}_{s,k}$ as $\varepsilon \to 0$, $\lambda \in l$. Note that an arbitrary solution $u$ of Eq. (1) in a channel $C_{j,\varepsilon}$ can be represented as a series with respect to the orthogonal basis $\{\varphi_{j,n}(y/\varepsilon)\}$ of the eigenfunctions of the Laplacian in the cross-section of $C_{j,\varepsilon}$. Thus it can be represented as a linear combination of the travelling waves
$$e^{\pm i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t}\, \varphi_{j,n}(y/\varepsilon), \quad 0 \le n \le m_j,$$
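and functions which grow or decay exponentially along the axis of $C_{j,\varepsilon}$. The main term of small $\varepsilon$ asymptotics of scattering solutions contains only travelling waves, i.e. on each channel $C_{j,\varepsilon}$, any function $\Psi$ has the form
$$\Psi = \Psi^{(\varepsilon)}_{s,k} = \sum_{n=0}^{m_j} \left( \alpha_{j,n}\, e^{i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t} + \beta_{j,n}\, e^{-i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t} \right) \varphi_{j,n}(y/\varepsilon) + r^{\varepsilon}_{s,k}, \tag{15}$$
where
$$|r^{\varepsilon}_{s,k}| \le C e^{-\frac{\gamma d(t)}{\varepsilon}}, \quad \gamma > 0, \quad \text{and } d(t) = \min(t,\ l_j - t).$$
The constants $\alpha_{j,n}$ and $\beta_{j,n}$ depend also on $s$, $k$ and $\varepsilon$. The formula (15) can be written in a shorter form as follows:
$$\Psi = \Psi^{(\varepsilon)}_{s,k} = \varsigma_j \cdot \varphi_j + r^{\varepsilon}_{s,k}, \qquad |r^{\varepsilon}_{s,k}| \le C e^{-\frac{\gamma d(t)}{\varepsilon}},$$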
where $\varphi_j = \varphi_j(y/\varepsilon)$ is the vector with components $\varphi_{j,n}(y/\varepsilon)$, $0 \le n \le m_j$, and $\varsigma_j = \varsigma_j(t)$ is an $(m_j + 1)$-vector whose components $\varsigma_{j,n}$ are linear combinations of the corresponding oscillating exponents in $t$, i.e. $\varsigma_j$ satisfies the following equation:
$$\left( \varepsilon^2 \frac{d^2}{dt^2} + D_j^2 \right) \varsigma_j = 0, \quad 0 < t < l_j, \tag{16}$$
where $D_j$ is the diagonal matrix with elements $\sqrt{\lambda - \lambda_{j,n}}$, $0 \le n \le m_j$, on the diagonal.

In order to complete the description of the main term of the asymptotic expansion (15), we need to provide the choice of constants in the representation of $\varsigma_{j,n}$ as linear combinations of the exponents. Thus, $2(m_j + 1)$ constants must be chosen for each channel $C_{j,\varepsilon}$. We consider the limiting graph $\Gamma$, whose edges $\Gamma_j$ are the axes of the channels $C_{j,\varepsilon}$. Let $\varsigma$ be the vector valued function on $\Gamma$ which is equal to $\varsigma_j$ on $\Gamma_j$. The vector $\varsigma$ has a different number of coordinates on different edges $\Gamma_j$ of the graph $\Gamma$. We specify $\varsigma$ by imposing conditions at infinity and gluing conditions (GC) at each vertex $v$ of the graph $\Gamma$. Let $V = \{v\}$ be the set of vertices $v$ of the limiting graph $\Gamma$. These vertices correspond to the junctions in $\Omega_\varepsilon$.

The conditions at infinity concern only the infinite channels $C_{j,\varepsilon}$, $j \le m$. They depend on the choice of the incident wave and have the form:
$$\beta_{j,n} = \begin{cases} 1 & \text{if } (j,n) = (s,k), \\ 0 & \text{if } (j,n) \ne (s,k), \end{cases} \qquad 1 \le j \le m. \tag{17}$$
The GC at vertices $v$ of the graph $\Gamma$ are universal for all incident waves and depend on $\lambda$. In order to state the GC at a vertex $v$, we choose the parametrization on $\Gamma$ in such a way that $t = 0$ at $v$ for all edges adjacent to this particular vertex. The origin ($t = 0$) on all other edges can be chosen at any of the end points of the edge. Consider auxiliary scattering problems for the spider type domain $\Omega_{v,\varepsilon}$ formed by the individual junction, which corresponds to the vertex $v$, and all channels with an end at this junction, where the channels are extended to infinity if they have a finite length. We denote by $\Gamma_v$ the limiting graph which is defined by $\Omega_{v,\varepsilon}$. Definitions 1, 2 and Theorem 4 remain valid for the domain $\Omega_{v,\varepsilon}$. In particular, one can define the scattering matrix $T = T_v$ for the problem (1) in the domain $\Omega_{v,\varepsilon}$.

Let $v_1, v_2, \ldots, v_l$, $l = l(v)$, be indices of channels in $\Omega_\varepsilon$ which correspond to channels in $\Omega_{v,\varepsilon}$. Let us form a vector $\varsigma^{(v)}$ by writing the coordinates of all vectors $\varsigma_{v_s}$ in one column, starting with coordinates of $\varsigma_{v_1}$, then coordinates of $\varsigma_{v_2}$, and so on. Let us denote by $D_v(\lambda)$ the diagonal matrix with the diagonal elements $\sqrt{\lambda - \lambda_{v_s,k}}$ written in the same order as the coordinates of the vector $\varsigma^{(v)}$. Let $I_v$ be the unit matrix of the same size as the matrix $D_v(\lambda)$. The GC at the vertex $v$ has the form
$$\varepsilon [I_v + T_v] D_v^{-1}(\lambda) \frac{d}{dt} \varsigma^{(v)}(t) + i [I_v - T_v] \varsigma^{(v)}(t) = 0, \qquad t = 0. \tag{18}$$
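A quick consistency check of (18): the traces at $t = 0$ of the travelling-wave parts of the scattering solutions of the spider domain satisfy it for any matrix $T_v$, because $I_v + T_v$ and $I_v - T_v$ commute. The sketch below verifies this numerically in plain Python. It assumes that the diagonal entries of $D_v$ are the wave numbers $\sqrt{\lambda - \lambda_{j,n}}$ (passed in as `kappa`); the sample matrices and wave numbers are illustrative values, not data from the paper.

```python
def gc_residual(T, kappa, sigma, eps_dsigma):
    """Left-hand side of (18) at t = 0.

    T          -- square matrix (list of rows); row = outgoing mode, column = incident mode
    kappa      -- diagonal of D_v, assumed to be the wave numbers sqrt(lambda - lambda_{j,n})
    sigma      -- the vector sigma^{(v)}(0)
    eps_dsigma -- the vector eps * d(sigma^{(v)})/dt at t = 0
    """
    n = len(T)
    w = [eps_dsigma[c] / kappa[c] for c in range(n)]  # D_v^{-1} * (eps * sigma')
    residual = []
    for r in range(n):
        acc = 0j
        for c in range(n):
            delta = 1.0 if r == c else 0.0
            acc += (delta + T[r][c]) * w[c] + 1j * (delta - T[r][c]) * sigma[c]
        residual.append(acc)
    return residual


def scattering_traces(T, kappa, s):
    """Values at t = 0 of sigma_j(t) = delta_{js} e^{-i kappa_j t/eps} + T[j][s] e^{i kappa_j t/eps}."""
    n = len(T)
    sigma = [(1.0 if j == s else 0.0) + T[j][s] for j in range(n)]
    eps_dsigma = [1j * kappa[j] * (T[j][s] - (1.0 if j == s else 0.0)) for j in range(n)]
    return sigma, eps_dsigma


# Illustrative two-channel, single-mode data (hypothetical values).
T_sample = [[-0.6, 0.8j], [0.8j, -0.6]]
kappa_sample = [1.3, 0.7]
residuals = [gc_residual(T_sample, kappa_sample, *scattering_traces(T_sample, kappa_sample, s))
             for s in range(2)]

# Perfectly transmitting junction (an infinite cylinder cut into two half-cylinders):
# (18) then reduces to sigma_1 = sigma_2 and sigma_1' + sigma_2' = 0.
T_cyl = [[0.0, 1.0], [1.0, 0.0]]
ok = gc_residual(T_cyl, [1.0, 1.0], [1.0, 1.0], [0.3, -0.3])
bad = gc_residual(T_cyl, [1.0, 1.0], [1.0, 2.0], [0.3, -0.3])
```

In the second example both derivatives are taken along edges that point away from the vertex, so the pair of conditions is the usual continuity of the function and of the flow.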
The GC (18) has the following form in the coordinate representation. Let $Z = Z(v)$ be the set of indices $(j, n)$, where $j$ are the indices of the edges of $\Gamma$ ending at $v$ and $0 \le n \le m_j$. Then
$$\sum_{(j,n) \in Z} \left[ \varepsilon \left( \delta^{s,k}_{j,n} + t^{s,k}_{j,n}(v) \right) (\lambda - \lambda_{j,n})^{-1/2} \frac{d}{dt} \varsigma_{j,n} + i \left( \delta^{s,k}_{j,n} - t^{s,k}_{j,n}(v) \right) \varsigma_{j,n} \right] = 0 \ \text{ at } v, \qquad (s,k) \in Z,$$
where $t^{s,k}_{j,n}(v)$ are the transmission coefficients of the auxiliary problem in the spider domain $\Omega_{v,\varepsilon}$ (i.e. $t^{s,k}_{j,n}(v)$ are the elements of $T_v$), and $\delta^{s,k}_{j,n} = 1$ if $(s,k) = (j,n)$, $\delta^{s,k}_{j,n} = 0$ if $(s,k) \ne (j,n)$.

Definition 5. A family of subsets $l(\varepsilon)$ of a bounded closed interval $l \subset R^1$ will be called thin if, for any $\delta > 0$, there exist constants $\beta > 0$ and $c_1$, independent of $\delta$ and $\varepsilon$, and $c_2 = c_2(\delta)$, such that $l(\varepsilon)$ can be covered by $c_1$ intervals of length $\delta$ together with $c_2 \varepsilon^{-1}$ intervals of length $c_2 e^{-\beta/\varepsilon}$. Note that $|l(\varepsilon)| \to 0$ as $\varepsilon \to 0$.

Theorem 6. Let $l$ be a bounded closed interval of the $\lambda$-axis which does not contain points $\lambda_{j,n}$. Then there exists $\gamma = \gamma(\omega_j, l) > 0$ and a thin family of sets $l(\varepsilon)$ such that the asymptotic expansion (15) holds on all (finite and infinite) channels $C_{j,\varepsilon}$ uniformly in $\lambda \in l \setminus l(\varepsilon)$ and $x$ in any bounded region of $R^d$. The function $\varsigma$ in (15) is a vector function on the limiting graph which satisfies Eq. (16), conditions (17) at infinity, and the GC (18).

Remark 1) It will be shown in the proof of Lemma 11 that for spider domains the estimate of the remainder is uniform for all $x \in R^d$. For general domains, we provide the estimate of the remainder only in bounded regions of $R^d$ in order not to complicate the exposition.
2) The arguments used to justify the asymptotic behavior of the scattering solutions and prove Theorem 6 can be applied to study the asymptotic behavior of the outgoing solutions of the non-homogeneous problem (5) as $\varepsilon \to 0$, $\lambda > \lambda_0$. The asymptotics will be expressed in terms of solutions of the corresponding non-homogeneous equation on the limiting graph. One can easily show that the GC cannot be chosen independently of $f$ even if we consider only functions $f$ with compact support. However, if the support of $f$ is separated from the junctions then the solution of the non-homogeneous equation on the limiting graph satisfies the same universal GC (18) that appear when scattering solutions are studied. The latter is related to the following fact: the outgoing solution in a narrow channel behaves as a combination of plane waves plus a term which decays exponentially outside of the support of $f$ when $\varepsilon \to 0$.
Note that the GC for the function $\varsigma$ on the limiting graph $\Gamma$ depend on $\lambda$. In fact, there exists an effective matrix potential on $\Gamma$ which is independent of $\lambda$, and allows one to single out the scattering solutions $\varsigma$ on $\Gamma$ with the same scattering data as for the original problem in $\Omega_\varepsilon$. These results will be published elsewhere.

The convergence of the spectrum of the problem in $\Omega_\varepsilon$ to the spectrum of a problem on the limiting graph has been extensively discussed in the physical and mathematical literature (e.g., [4–7, 9, 12, 13, 16, 18] and references therein). What makes our paper different is the following: all the publications that we are aware of are devoted to the convergence of the spectra (or resolvents) only in a small (in fact, shrinking with $\varepsilon \to 0$) neighborhood of $\lambda_0$ (bottom of the absolutely continuous spectrum), or below $\lambda_0$. Usually, the Neumann BC on $\partial\Omega_\varepsilon$ is assumed. We deal with asymptotic behavior of solutions of the scattering problem in $\Omega_\varepsilon$ when $\lambda$ is close to $\lambda' > \lambda_0$, and the BC on $\partial\Omega_\varepsilon$ can be arbitrary.

In particular, papers [5, 12, 13, 18] contain the gluing conditions and the justification of the limiting procedure $\varepsilon \to 0$ near the bottom of the spectrum $\lambda_0$ under the assumption that the Neumann BC is imposed at the boundary of $\Omega_\varepsilon$. Note that $\lambda_0 = 0$ for the Neumann BC. Typically, the GC in this case are: the continuity of $\varsigma(s)$ at each vertex $v$ and $\sum_{j=1}^{d} \varsigma_j'(v) = 0$, i.e. the continuity of both the field and the flow. These GC are called Kirchhoff's GC. In the case when the shrinkage rate of the volume of the junction neighborhoods is lower than the one of the area of the cross-sections of the guides, more complex energy dependent or decoupling conditions can arise (see [9, 13, 7] for details). Let us stress again that this is the situation near the bottom $\lambda_0 = 0$ of the absolutely continuous spectrum. As follows from Theorem 3, the GC and the small $\varepsilon$ asymptotics are different when $\lambda > \lambda_0$.
Both assumptions ($\lambda \to \lambda_0$, and the fact that the BC is the Neumann condition) in the papers above are very essential. The Dirichlet Laplacian near the bottom of the absolutely continuous spectrum $\lambda_0 > 0$ was studied in a recent paper [16] under the condition that the junctions are narrower than the tubes. It is assumed there that the domain $\Omega_\varepsilon$ is bounded. Therefore, the spectrum of the operator (1) is discrete. It is proved that the eigenvalues of the operator (1) in the $O(\varepsilon^2)$-neighborhood of $\lambda_0$ behave asymptotically, when $\varepsilon \to 0$, as eigenvalues of the problem in the disconnected domain that one gets by omitting the junctions, separating the channels in $\Omega_\varepsilon$, and adding the Dirichlet conditions on the bottoms of the channels. This result indicates that the waves do not propagate through the narrow junctions when $\lambda$ is close to the bottom of the absolutely continuous spectrum. A similar result was obtained in [3] for the Schrödinger operator with a potential having a deep strict minimum on the graph, when the width of the walls shrinks to zero.

We also studied the Dirichlet problem for general domains $\Omega_\varepsilon$ without special assumptions on the geometry of the junctions when, simultaneously, $\varepsilon \to 0$, $\lambda \to \lambda_0$, and the diameters of the guides and junctions have the same order $O(\varepsilon)$. Our conclusion is that, generically, waves do not propagate through the junctions when the frequency is close to the bottom of the absolutely continuous spectrum. Let us stress that this is true both in the case when the diameters of the junctions are smaller than the diameters of the guides, and in the case when they are larger. Some special conditions must be satisfied for waves to propagate if $\lambda \to \lambda_0$. An infinite cylinder, viewed as two half-infinite tubes with a junction of the same shape, is an example of a domain where the propagation of waves at $\lambda = \lambda_0$ is not suppressed. Less trivial examples will be given in our next paper.
We do not deal with the problem near the bottom of the absolutely continuous spectrum in this publication. A detailed analysis of this problem will be published elsewhere. However, we show here that the GC on the limiting graph with $\lambda > \lambda_0$, generically, have a limit as $\varepsilon \to 0$, $\lambda \to \lambda_0$, and the limiting conditions are the Dirichlet conditions. To be more exact, the following statement will be proved.

Theorem 7. 1) Assume that the resolvent (12) does not have a pole at $k = \sqrt{\lambda_0}$. Then the scattering matrix (10), defined for $\lambda > \lambda_0$, admits an analytic in $z = \sqrt{\lambda - \lambda_0}$ extension to a neighborhood of the point $z = 0$ and is equal to $-I$ at $z = 0$, where $I$ is the $(m_0 \times m_0)$-identity matrix and $m_0$ is the number of infinite channels $C_{j,\varepsilon}$ with $\lambda_{j,0} = \lambda_0$.
2) Assume that the resolvent of the auxiliary problem in the spider type domain $\Omega_{v,\varepsilon}$ does not have a pole at $k = \sqrt{\lambda_0}$. Then the GC (18) have a limit as $\lambda \to \lambda_0$ of the form
$$\varepsilon T_v' \frac{d}{dt} \varsigma^{(v)}(t) + 2i \varsigma^{(v)}(t) = 0, \qquad t = 0,$$
where $T_v' = \frac{d}{dz} T_v$. The GC also have a limit when $\varepsilon \to 0$, $\lambda \to \lambda_0$ independently. This limit is the Dirichlet condition $\varsigma^{(v)}(0) = 0$.
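The limiting form in part 2) can be made plausible by a short formal expansion of (18). The sketch below assumes, beyond what is stated in the theorem, that for $\lambda$ slightly above $\lambda_0$ every propagating mode at the vertex has threshold $\lambda_0$, so that $D_v(\lambda) = z I_v$ with $z = \sqrt{\lambda - \lambda_0}$, and that part 1) applies to $T_v$. It is an illustration only, not the authors' proof.

```latex
% Formal expansion of the GC (18) near z = 0 (illustrative sketch).
\begin{align*}
T_v(z) &= -I_v + z\,T_v'(0) + O(z^2), \qquad D_v(\lambda) = z\,I_v, \\
I_v + T_v &= z\,T_v'(0) + O(z^2), \qquad I_v - T_v = 2 I_v + O(z), \\
\varepsilon\,[I_v + T_v]\,D_v^{-1}\,\frac{d\varsigma^{(v)}}{dt}
  + i\,[I_v - T_v]\,\varsigma^{(v)}
 &= \varepsilon\,T_v'(0)\,\frac{d\varsigma^{(v)}}{dt} + 2i\,\varsigma^{(v)}
  + O(z)\Big(\varepsilon\,\frac{d\varsigma^{(v)}}{dt} + \varsigma^{(v)}\Big).
\end{align*}
```

Letting $z \to 0$ leaves the first two terms, which is the stated limit. If in addition $\varepsilon \to 0$ while $d\varsigma^{(v)}/dt$ stays bounded, only $2i\varsigma^{(v)}(0) = 0$ remains, i.e. the Dirichlet condition.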
A simple version of the results presented in this paper (for models admitting the separation of variables) was published in our paper [14]. The next section contains the proofs of Theorems 4 and 3. The statements of these theorems mostly concern problems with a fixed value of $\varepsilon$. Without loss of generality, one can assume that $\varepsilon = 1$ there. The last section is devoted to the proof of Theorem 6 on asymptotic behavior of the scattering solutions as $\varepsilon \to 0$. Here the dependence of all objects on $\varepsilon$ is essential. At the end of the last section, one can find a proof and a short discussion of Theorem 7.

2. Analytic Properties of the Resolvent $R_\lambda$

We denote by $\Omega^{(a)}_\varepsilon$ the following bounded part of $\Omega_\varepsilon$:
$$\Omega^{(a)}_\varepsilon = \Omega_\varepsilon \setminus \bigcup_{j \le m} \left( C_{j,\varepsilon} \cap \{t > a\} \right). \tag{19}$$
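The next lemma will be needed later.

Lemma 8. If the homogeneous problem (5), (6) with a real $\lambda > 0$ has a non-trivial solution $u$, then either $\lambda$ is an eigenvalue of $-\varepsilon^2 \Delta$ and $u$ decays exponentially at infinity, or $\lambda \in \{\lambda_{j,n}\}$ and (13) holds.

Proof. From the Green formula for $u$ and $\bar{u}$ in the domain $\Omega^{(a)}_\varepsilon$, $a > 0$, it follows that
$$\mathrm{Im} \int_{\partial\Omega^{(a)}_\varepsilon} \frac{\partial u}{\partial \nu}\, \bar{u}\, dS = 0,$$
where $\nu$ is the unit normal to $\partial\Omega^{(a)}_\varepsilon$ and $dS$ is an element of the surface area. Using the boundary condition (5) we arrive at
$$\mathrm{Im} \int_{\partial\Omega^{(a)}_\varepsilon \setminus \partial\Omega_\varepsilon} u_t\, \bar{u}\, dy = 0. \tag{20}$$
This, (6), and the orthogonality of the functions $\varphi_{j,n}$ imply, for $a \to \infty$,
$$\sum_{j,n:\, \lambda_{j,n} < \lambda} \sqrt{\lambda - \lambda_{j,n}}\, |a_{j,n}|^2 + O(e^{-\gamma a}) = 0,$$
which justifies the lemma after taking the limit as $a \to \infty$. This completes the proof.

Let $\tilde{C}_{j,\varepsilon}$ be the channel $C_{j,\varepsilon}$ extended along the whole $t$ axis,
$$\tilde{C}_{j,\varepsilon} = \{(t, \varepsilon y) : t \in R,\ y \in \omega_j \subset R^{n-1}\}.$$
We denote by $R^{(j)}_\lambda$ the resolvent (11) of the operator $-\varepsilon^2 \Delta$ in the extended channel $\tilde{C}_{j,\varepsilon}$. Let $L^2_a(\tilde{C}_{j,\varepsilon})$ be the set of functions from $L^2(\tilde{C}_{j,\varepsilon})$ with the support in the region $|t| \le a$, and let $H^2(\tilde{C}^b_{j,\varepsilon})$ be the Sobolev space of functions in the domain $\tilde{C}_{j,\varepsilon} \cap \{b < |t| < b + 1\}$. Consider the operator
$$R^{(j)}_\lambda : L^2_{com}(\tilde{C}_{j,\varepsilon}) \to L^2_{loc}(\tilde{C}_{j,\varepsilon}). \tag{21}$$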
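The following lemma can be easily proved using the method of separation of variables.

Lemma 9. (1) The operator (21) admits an analytic continuation from the upper half plane $\mathrm{Im}\, \lambda > 0$ onto the real axis with the branch points at $\lambda = \lambda_{j,n}$, $n = 0, 1, \ldots$.

(2) If $\lambda_{j,m_j} < \lambda < \lambda_{j,m_j+1}$ and $h \in L^2_{com}(\tilde{C}_j)$ then $R^{(j)}_\lambda h$ has the following behavior as $t \to \pm\infty$:
$$R^{(j)}_\lambda h = \sum_{n=0}^{m_j} c^{\pm}_{j,n}\, e^{i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} |t|}\, \varphi_{j,n}(y/\varepsilon) + O(e^{-\gamma(\varepsilon) |t|}), \quad \gamma > 0, \tag{22}$$
where
$$c^{\pm}_{j,n} = c^{\pm}_{j,n}(h) = \frac{\varepsilon^{-d}}{2i \sqrt{\lambda - \lambda_{j,n}}} \int_{\omega_{j,\varepsilon}} \int_{-\infty}^{\infty} e^{\mp i \frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} \tau}\, \varphi_{j,n}(y/\varepsilon)\, h(\tau, y)\, d\tau\, dy. \tag{23}$$

(3) Let $\lambda \in l$, where $l$ is a bounded closed interval of the real axis such that $\lambda_{j,m_j} < \lambda < \lambda_{j,m_j+1}$ for all $\lambda \in l$. Let $h \in L^2_{3\varepsilon}(\tilde{C}_{j,\varepsilon})$ and $b \ge 0$. Then there exist positive constants $c = c(l)$ and $\gamma = \gamma(l)$ which are independent of $\lambda \in l$, $\varepsilon$ and $h$, and such that the remainder term $r$ in the right-hand side of (22) has the estimate
$$\|r\|_{H^2(\tilde{C}^b_{j,\varepsilon})} \le c\, e^{-\gamma b/\varepsilon}\, \|h\|_{L^2_{3\varepsilon}(\tilde{C}_{j,\varepsilon})}.$$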
Proof of Theorem 4. The statements of the theorem mostly concern the problem with a fixed value of $\varepsilon$. Without loss of generality, we can assume that $\varepsilon = 1$, and we omit $\varepsilon$ in the notations of all objects ($\Omega_\varepsilon$, $C_{j,\varepsilon}$, and so on). The dependence on $\varepsilon$ will be restored in some parts of the proof, when this dependence on $\varepsilon$ is essential.

Step 1. Construction of the resolvent. Let us introduce the following partition of unity on $\Omega$:
$$\sum_{j=0}^{m} \phi_j = 1. \tag{24}$$
We fix arbitrary functions $\phi_j \in C^\infty(\Omega)$, $1 \le j \le m$, such that $\phi_j = 1$ in the (infinite) channel $C_j$ for $t \ge 2$, $\phi_j = 0$ in $C_j$ for $t \le 1$ and outside of $C_j$. The function $\phi_0$ is defined as follows: $\phi_0 = 1 - \sum_{j \le m} \phi_j$. We also need functions $\psi_j$ that are equal to one on the supports of $\phi_j$, which will allow us to smoothly extend functions defined only on infinite channels or only in a bounded part of $\Omega$ onto the whole domain $\Omega$. We fix functions $\psi_j \in C^\infty(\Omega)$, $1 \le j \le m$, such that $\psi_j = 1$ in the infinite channel $C_j$ for $t \ge 1$ (i.e. on the support of $\phi_j$), $\psi_j = 0$ outside of $C_j$. Let $\psi_0 \in C^\infty(\Omega)$ be a function such that $\psi_0 = 1$ on the support of $\phi_0$, and $\psi_0 = 0$ in all infinite channels $C_j$ when $t \ge 3$. Note that
$$\psi_j \phi_j = \phi_j, \quad 0 \le j \le m. \tag{25}$$
Pλ f = ψ0 Rλ (φ0 f ) +
m
( j)
ψ j Rλ (φ j f ),
(26)
j=1
where Rλ is the resolvent (11) of the operator in with a fixed λ = iσ, σ > 0, ( j) which will be chosen later, and Rλ are resolvents of the negative Dirichlet Laplacians 2 in C j . If f ∈ L () then φ j f = 0 outside C j , and we consider φ j f as an element of ( j) ( j) L 2 (C j ). Then the operator Rλ can be applied to φ j f and Rλ (φ j f ) ∈ L 2 (C j ). Since ( j) ψ j = 0 at the bottom of C j and outside of C j , we consider ψ j Rλ (φ j f ) as an element of L 2 () that is equal to zero outside of C j . In this way, the operator Pλ is well defined for λ ∈ / [0, ∞). Let us look for a solution u ∈ L 2 () of the problem (5) with λ ∈ / [0, ∞) in the form of u = Pλ h with unknown h ∈ L 2 (). Obviously, u satisfies the Dirichlet boundary condition since each term in (26), applied to any h, satisfies the Dirichlet boundary condition. The substitution of Pλ h for u in Eq. (5) with λ ∈ / [0, ∞) (and ε = 1) leads to (− − λ)Pλ h = −(ψ0 )[Rλ (φ0 h)] − 2∇ψ0 · ∇[Rλ (φ0 h)] −ψ0 ( + λ )[Rλ (φ0 h)] − (λ − λ )ψ0 [Rλ (φ0 h)] m ( j) ( j) ( j) (ψ j )[Rλ (φ j f )] + 2∇ψ j · ∇ Rλ (φ j h) + ψ j ( + λ)[Rλ (φ j h)] = f. − j=1
Using (25), (24), the last relation can be rewritten in the form h + Fλ h = f,
(27)
where Fλ h = −[( + λ − λ )ψ0 ][Rλ (φ0 h)] − 2∇ψ0 · ∇[Rλ (φ0 h)] m ( j) ( j) (ψ j )[Rλ (φ j h)] + 2∇ψ j · ∇ Rλ (φ j h) . −
(28)
j=1
Let us show that the operator
$$F_\lambda : L^2(\Omega) \to L^2(\Omega), \qquad \lambda \notin [\lambda_0, \infty), \tag{29}$$
is compact and depends analytically on $\lambda$. Indeed, the resolvents $R_{\lambda'}$ and $R^{(j)}_\lambda$ map any function $f \in L^2$ into the solution of the problem (5) in the domains $\Omega$, $C_j$, respectively. Thus, these operators are bounded as operators from $L^2$ into the Sobolev spaces $H^2$. Since the formula (28) contains at most first derivatives of the resolvents, the operator $F_\lambda$, $\lambda \notin [\lambda_0, \infty)$, is bounded if it is considered as an operator from $L^2(\Omega)$ into the Sobolev space $H^1(\Omega)$. Since $\nabla\psi_0 = \nabla\psi_j = 0$ at points $x \in C_j$ with $t > 3$, from (28) it follows that, for any infinite channel $C_j$,
$$F_\lambda h = 0, \quad x \in C_j \cap \{t > 3\}. \tag{30}$$
Hence, the Sobolev imbedding theorem implies that the operator (29) is compact. The analyticity of the operator (29) is obvious since the operators $R^{(j)}_\lambda$ depend analytically on $\lambda$, and $R_{\lambda'}$ does not depend on $\lambda$.

Now we put $\lambda = \lambda' = i\sigma$ and show that $\|F_{i\sigma}\| \to 0$ as $\sigma \to \infty$. In fact, since the norm of the resolvent does not exceed the inverse distance from the spectrum, we have that
$$\|R_{\lambda'}\|,\ \|R^{(j)}_\lambda\| \le 1/\sigma, \tag{31}$$
where the first norm is considered in the space $L^2(\Omega)$ and the second one is in the space $L^2(C_j)$. Multiplying Eq. (5), considered in the domain $\Omega$ or $C_j$, by $\bar{u}$ and integrating over the domain, we get the following relation for the functions $u = R_\lambda f$ and $u = R^{(j)}_\lambda f$, respectively:
$$\|\nabla u\|^2_{L^2} - i\sigma \|u\|^2_{L^2} = \int \bar{u} f\, dx,$$
which implies that
$$\|\nabla u\|^2_{L^2} \le \left| \int \bar{u} f\, dx \right| \le \|u\|_{L^2} \|f\|_{L^2}.$$
Thus,
$$\|R_\lambda f\|_{H^1(\Omega)},\ \|R^{(j)}_\lambda f\|_{H^1(C_j)} \le C \sigma^{-1/2} \|f\|_{L^2}. \tag{32}$$
Since the formula (28) contains at most first derivatives of the resolvents, estimates (31), (32) imply that $\|F_{i\sigma}\| \to 0$ as $\sigma \to \infty$. We fix $\lambda' = i\sigma$ in (26) in such a way that $\|F_{\lambda'}\| < 1$. Then from the analytic Fredholm theorem it follows that the operator
$$(E + F_\lambda)^{-1} : L^2(\Omega) \to L^2(\Omega), \quad \lambda \notin [\lambda_0, \infty), \tag{33}$$
exists and depends meromorphically on $\lambda$. From here, (26) and (27) the representation for the resolvent follows:
$$R_\lambda = P_\lambda (E + F_\lambda)^{-1}, \quad \lambda \notin [\lambda_0, \infty). \tag{34}$$
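The algebra behind (27) and (34) can be illustrated in finite dimensions. In the sketch below a matrix `A` stands in for $-\Delta - \lambda$ and `P` for the parametrix. With $F = AP - E$, solving $Au = f$ reduces to $h + Fh = f$, $u = Ph$, and when $\|F\| < 1$ the inverse $(E + F)^{-1}$ is given by a Neumann series. The matrices are hypothetical; this is an analogue of the construction, not the operator-theoretic argument itself.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matvec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def solve_with_parametrix(A, P, f, iterations=200):
    """Solve A u = f as u = P (E + F)^{-1} f with F = A P - E, assuming ||F|| < 1."""
    n = len(A)
    AP = matmul(A, P)
    F = [[AP[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Neumann series: h = f - F f + F^2 f - ...  solves  h + F h = f.
    h = list(f)
    term = list(f)
    for _ in range(iterations):
        term = [-x for x in matvec(F, term)]
        h = [a + b for a, b in zip(h, term)]
    return matvec(P, h)


# Hypothetical example: P inverts only the diagonal part of A, so A P = E + F with small F.
A = [[4.0, 1.0], [2.0, 3.0]]
P = [[0.25, 0.0], [0.0, 1.0 / 3.0]]
u = solve_with_parametrix(A, P, [1.0, 2.0])
```

In the proof the role of the smallness of `F` is played by the estimate $\|F_{i\sigma}\| \to 0$, and the analytic Fredholm theorem replaces the Neumann series away from $\lambda = \lambda'$.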
Step 2. Analytic continuation of the resolvent. In order to extend the operator (12) meromorphically into the lower half plane $\mathrm{Im}\, k < 0$ we need to repeat the arguments used to justify (34). Consider the space $L^2_a(\Omega)$ of functions $f \in L^2(\Omega)$ with supports in $\Omega^{(a)}$ (see (19)), i.e. $f = 0$ in the infinite channels $C_j$ when $t > a$. Let $f \in L^2_a(\Omega)$. Without loss of generality, one can assume that $a > 3$. Then (27) and (30) imply that $h$ is also supported in $\Omega^{(a)}$, i.e. $F_\lambda$ can be considered as an operator in $L^2_a(\Omega)$:
$$F_\lambda : L^2_a(\Omega) \to L^2_a(\Omega), \quad \lambda \notin [0, \infty).$$
Let $\chi = \chi_a(t)$ be a function equal to one when $t \le a$ and zero when $t > a$. From Lemma 9 it follows that the operators
$$\chi R^{(j)}_{k^2} : L^2_a(C_j) \to L^2_a(C_j), \quad \mathrm{Im}\, k > 0,$$
admit an analytic continuation into the lower half plane with the branch points at $k = \pm\sqrt{\lambda_{j,n}}$. Further, $u = R^{(j)}_{k^2} f$ satisfies Eq. (5) with $\lambda = k^2$ for all complex $k \in C$, and therefore the operators
$$\chi R^{(j)}_{k^2},\ \chi \nabla R^{(j)}_{k^2} : L^2_a(C_j) \to L^2_a(C_j), \quad k \in C,$$
are compact and analytic in the complex plane $C$. Since $\chi = 1$ on the supports of $\nabla\psi_j$, $0 \le j \le m$, we can insert the factor $\chi$ on the left of all the resolvents $R^{(j)}_\lambda$ in (28). From here it follows that the operator $F_{k^2} : L^2_a(\Omega) \to L^2_a(\Omega)$, $k \in C$, is compact and analytic with branch points at $k = \pm\sqrt{\lambda_{j,n}}$. Hence, the operator
$$(E + F_{k^2})^{-1} : L^2_a(\Omega) \to L^2_a(\Omega), \quad k \in C, \tag{35}$$
is meromorphic with the branch points at $k = \pm\sqrt{\lambda_{j,n}}$. Together with (26), (34) and the analyticity of the operators $R^{(j)}_{k^2} : L^2_a(C_j) \to L^2_{loc}(C_j)$, $k \in C$, this implies that the operator (12) admits a meromorphic continuation to the lower half plane with the branch points at $k = \pm\sqrt{\lambda_{j,n}}$ and poles determined by the poles of the operator (35). Obviously, the poles of the operator (35) may have a limiting point only at $\lambda = \infty$.

Step 3. Spectral analysis. First of all note that the existence of the meromorphic extension of the operator (12) together with the Stone formula immediately imply that the operator $H = -\Delta$ does not have singular spectrum. The proof of this fact can be found in [17] (see Theorem XIII.20). In order to prove the part of statement (1) of the theorem concerning the absolutely continuous spectrum of the operator $H = -\Delta$, we split the domain $\Omega$ into pieces by introducing cuts along the bases $t = 0$ of all infinite channels. We denote the new (not connected) domain by $\Omega'$, and denote the negative Dirichlet Laplacian in $\Omega'$ by $H'$, i.e. $H'$ is obtained from $H$ by introducing additional Dirichlet boundary conditions on the cuts. Obviously, the operator $H'$ has the absolutely continuous spectrum described in statement (1) of the theorem. Thus, it remains to show that the wave operators for the couple $H$, $H'$ exist and are complete. The justification of the existence and completeness of the wave operators can be found in [1]. Another option is to derive the latter fact independently using the Birman theorem stating that the validity of the inclusion
$$(H - \lambda)^{-n} - (H' - \lambda)^{-n} \in J_1 \tag{36}$$
for some $\lambda$ and $n \ge 1$ implies the existence and completeness of the wave operators. Here $J_1$ is the space of operators of the trace class. The inclusion (36) can be derived from (34) and a similar formula for the resolvent of the operator $H'$. This completes the proof of the statement about the absolutely continuous spectrum.

The discreteness of the set $\{\lambda_{j,\varepsilon}\}$ of eigenvalues follows from the fact that the operator (12) is meromorphic in $\lambda$ and has poles at $\{\lambda_{j,\varepsilon}\}$. The existence of the poles at $\{\lambda_{j,\varepsilon}\}$ can be derived from the Stone formula. Another proof will be given below.

Let us prove the part of statement (1) concerning the spider domains. If $\Omega_\varepsilon$ is a spider domain, then there exists a point $\hat{x}(\varepsilon)$ and an $\varepsilon$-independent domain $\Omega$ such that the transformation (see (2))
$$L_\varepsilon : x \to \hat{x}(\varepsilon) + \varepsilon x \tag{37}$$
maps $\Omega$ into $\Omega_\varepsilon$. In order to stress the fact that the operator $H = -\varepsilon^2 \Delta$ in the domain $\Omega_\varepsilon$ depends on $\varepsilon$, we shall denote it by $H^{(\varepsilon)}$. The operator $-\Delta$ in the domain $\Omega$ shall be denoted by $H^{(1)}$. Obviously,
$$H^{(\varepsilon)} = L_\varepsilon H^{(1)} L_\varepsilon^{-1}, \tag{38}$$
and this implies the independence of the eigenvalues of the operator $H^{(\varepsilon)}$ of $\varepsilon$. This completes the proof of statement (1).

Step 4. Real poles of the resolvent. The first part of statement (2) about the existence of the analytic extension of the resolvent was justified in Step 2 of the proof. Now we are going to prove the second part of that statement concerning the set of real poles of the operator (12). We denote this set of poles by $K$.

Let us assume that either $u$ is an eigenfunction of the operator $H = -\Delta$ with an eigenvalue $\lambda = \lambda' > 0$ or $u$ is a non-trivial solution of the homogeneous problem (5), (13) with $\lambda = \lambda' > 0$ (recall that we assume that $\varepsilon = 1$). We are going to show that $k = \pm\sqrt{\lambda'} \in K$. Consider the restrictions $u_j$ of $u$ to the cylinders $C_j$, $1 \le j \le m$. Let $v_j \in L^2(C_j)$ be the solution of the problem
$$(-\Delta - \lambda) v_j = 0, \quad x \in C_j; \qquad v_j = 0 \ \text{on } \partial_l C_j, \qquad v_j = u_j \ \text{when } t = 0,$$
where $\lambda \notin [0, \infty)$, and $\partial_l C_j$ is the lateral boundary of $C_j$. The solution $v_j \in L^2(C_j)$ of this problem is unique and can be found by separation of variables. The function $u_j$ satisfies the same equation with the fixed $\lambda = \lambda'$ and the same boundary conditions. It is also defined uniquely by its values at $t = 0$ and can be found by separation of variables. This implies that $v_j$ converges to $u_j$ as $\lambda \to \lambda' + i0$. Since $u$ is a solution of a homogeneous elliptic problem, $u \in C^\infty$. Thus, $u_j$ is infinitely smooth when $t = 0$, and the convergence $v_j \to u_j$ takes place, for example, in the Sobolev space $H^2$ on the part of the cylinder $C_j$ where $0 \le t \le 2$. Let
$$v = \sum_{j=1}^{m} \phi_j v_j + \phi_0 u \in L^2(C_j), \quad \lambda \notin [0, \infty),$$
where $\{\phi_j\}$ is the partition of unity which was introduced above. The function $u$ cannot be equal to zero identically on $\Omega \setminus \cup C_j$ due to the uniqueness of the solution of the Cauchy problem for the operator $-\Delta - \lambda'$. Thus
$$\|v\|_{L^2(\Omega \setminus \cup C_j)} = \|u\|_{L^2(\Omega \setminus \cup C_j)} = c_0 > 0. \tag{39}$$
On the other hand,
$$(-\Delta - \lambda) v = -\sum_{j=1}^{m} [(\Delta \phi_j) v_j + 2\nabla\phi_j \cdot \nabla v_j] - (\lambda - \lambda') \phi_0 u - (\Delta \phi_0) u - 2\nabla\phi_0 \cdot \nabla u.$$
Thus, $(-\Delta - \lambda) v \in L^2_a(\Omega)$. From the convergence $v_j \to u_j$ and (24) it follows that $(-\Delta - \lambda) v$ tends to zero in $L^2_a(\Omega)$ as $\lambda \to \lambda' + i0$. Together with (39) this provides the existence of the pole of the operator (12) at $k = \sqrt{\lambda'}$. The pole at $k = -\sqrt{\lambda'}$ exists due to the relation $R_{\bar{\lambda}} = \overline{R_\lambda}$. Hence, $\pm\sqrt{\lambda'} \in K$.

Now let us assume that at least one of the points $\pm\sqrt{\lambda'}$ belongs to $K$. The relation $R_{\bar{\lambda}} = \overline{R_\lambda}$ implies that the second point also belongs to $K$, i.e. $\sqrt{\lambda'} \in K$, and there exist $a > 0$ and $f \in L^2_a(\Omega)$ such that
$$w := R_\lambda f = \frac{u(x)}{(\lambda - \lambda')^n} + \frac{v(x, \lambda)}{(\lambda - \lambda')^{n-1}}; \qquad \|v\|_{L^2(\Omega^{(a+2)})} \le c, \quad \lambda \to \lambda' + i0, \tag{40}$$
where $n \ge 1$ and $u$ does not vanish identically. In fact, $n$ cannot exceed one, but it is not important for us now. Obviously,
$$(-\Delta - \lambda') u = 0, \quad x \in \Omega; \qquad u = 0 \ \text{on } \partial\Omega. \tag{41}$$
From here and Lemma 8 it follows that in order to complete the proof of the second statement of the theorem it is sufficient to show that the asymptotic expansion (6) holds for the function $u$. Note that (41) implies that $u \in C^\infty$. Since $f = 0$ in all infinite channels $C_j$ when $t > a$, from relation (40) it follows that
$$(-\Delta - \lambda) v = (\lambda - \lambda') u, \quad x \in C_j \cap \{t > a\}; \qquad v = 0 \ \text{on } \partial\Omega.$$
From here, the estimate in (40), and standard local a priori estimates for solutions of elliptic problems it follows that for any multi-index $\alpha$,
$$\left| \frac{\partial^\alpha v}{\partial x^\alpha} \right| \le c(\alpha), \quad x \in C_j \cap \left\{ a + \tfrac{1}{2} < t < a + \tfrac{3}{2} \right\}, \quad \lambda \to \lambda' + i0,$$
and therefore
$$\frac{\partial^\alpha [(\lambda - \lambda')^n w]}{\partial x^\alpha} \to \frac{\partial^\alpha u}{\partial x^\alpha} \tag{42}$$
uniformly on $C_j \cap \{t = a + 1\}$ as $\lambda \to \lambda' + i0$. We restrict the functions $(\lambda - \lambda')^n w$ and $u$ to $C_j \cap \{t = a + 1\}$ and expand the restrictions with respect to the basis $\{\varphi_{j,n}\}$ of the eigenfunctions of the operator $-\Delta$ in the cross section of the channel $C_j$. Let $\gamma_{j,n}(\lambda)$ and $\gamma^0_{j,n}$ be the coefficients of these expansions. Then (42) implies that for any $\beta$,
$$|\gamma_{j,n}(\lambda) - \gamma^0_{j,n}| < c_\beta n^{-\beta}, \quad \lambda \to \lambda' + i0. \tag{43}$$
The function $\tilde{w} := (\lambda - \lambda')^n w$ satisfies the following relations in $C_j \cap \{t \ge a + 1\}$:
$$(-\Delta - \lambda) \tilde{w} = 0, \qquad \tilde{w} = 0 \ \text{for } x \in \partial C_j \cap \{t > a + 1\}, \qquad \tilde{w}|_{t = a+1} = \sum_{n} \gamma_{j,n}(\lambda) \varphi_{j,n}(y),$$
where $\lambda \notin [0, \infty)$. One can find the solution $\tilde{w} \in L^2$ of this problem by the method of separation of variables and then pass to the limit as $\lambda \to \lambda' + i0$ using (43). This leads to the asymptotic expansion (6) for $u$ and completes the proof of the second statement of Theorem 4.
Step 5. The proof of the last two statements of the theorem. If $k = \sqrt{\lambda'}$, $\lambda' > 0$, is not a pole or a branch point of $R_{k^2}$ then
$$w := R_\lambda f = u(x) + (\lambda - \lambda') v(x, \lambda); \qquad \|v\|_{L^2(\Omega^{(a)})} \le c(a), \quad \lambda \to \lambda' + i0,$$
where $a > 0$ is arbitrary and
$$(-\Delta - \lambda') u = f, \quad x \in \Omega; \qquad u = 0 \ \text{on } \partial\Omega.$$
In order to prove the third statement of the theorem, we need only to show that the asymptotic expansion (6) holds for $u$. It can be done exactly in the same way as it was done for the function $u$ in (40), by representing $u$ in $C_j \cap \{t > a + 1\}$ as the limit of functions $w$ as $\lambda \to \lambda' + i0$. In order to prove the last statement of the theorem one can look for the solution $\Psi = \Psi_{s,k}$ of the scattering problem in the form
$$\Psi = \phi_s\, e^{-i\sqrt{\lambda - \lambda_{s,k}}\, t}\, \varphi_{s,k}(y) + u,$$
where $\phi_s$ is the function from the partition of unity (24). This reduces problem (7), (8) to the uniquely solvable problem (5), (6) for $u$. This completes the proof of Theorem 4.

Proposition 10. Let $\Psi_{s,k}$ and $\Psi_{s',k'}$ be two scattering solutions, and let $a^{s,k}_{j,n}$ be the transmission coefficients for the scattering solution $\Psi_{s,k}$. Then
1) The following energy conservation law is valid:
$$\sum_{j,n} \sqrt{\lambda - \lambda_{j,n}}\, |a^{s,k}_{j,n}|^2 = \sum_{j=1}^{m} \sum_{n=0}^{m_j} \sqrt{\lambda - \lambda_{j,n}}\, |a^{s,k}_{j,n}|^2 = \sqrt{\lambda - \lambda_{s,k}}.$$
2) If these solutions correspond to different incident waves ($(s,k) \ne (s',k')$), then
$$\sum_{j,n} \sqrt{\lambda - \lambda_{j,n}}\, a^{s,k}_{j,n}\, \overline{a^{s',k'}_{j,n}} = 0.$$
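Proof. Since the statement concerns the problem with a fixed value of $\varepsilon$, one can put $\varepsilon = 1$ and omit $\varepsilon$ in the notations $\Omega_\varepsilon$, $\Omega^{(a)}_\varepsilon$. Green's formula for $\Psi_{s,k}$ and $\overline{\Psi_{s',k'}}$ in the domain $\Omega^{(a)}$ implies, similarly to (20), that
$$\int_{\partial\Omega^{(a)} \setminus \partial\Omega} \left[(\Psi_{s,k})_t\,\overline{\Psi_{s',k'}} - \Psi_{s,k}\,(\overline{\Psi_{s',k'}})_t\right] dy = 0.$$
From here, (8), and the orthogonality of the functions $\varphi_{j,n}$ it follows that
$$\sum_{j,n} \sqrt{\lambda - \lambda_{j,n}}\,a^{s,k}_{j,n}\,\overline{a^{s',k'}_{j,n}} - \sqrt{\lambda - \lambda_{s,k}}\,\overline{a^{s',k'}_{s,k}}\,e^{-2i\sqrt{\lambda - \lambda_{s,k}}\,a} + \sqrt{\lambda - \lambda_{s',k'}}\,a^{s,k}_{s',k'}\,e^{2i\sqrt{\lambda - \lambda_{s',k'}}\,a} - \sqrt{\lambda - \lambda_{s,k}}\,\delta + O(e^{-\gamma a}) = 0, \quad a \to \infty,$$
where $\delta = 1$ if $(s, k) = (s', k')$, and $\delta = 0$ otherwise. We take the average with respect to $a \in (A, 2A)$ and pass to the limit as $A \to \infty$. Then we get
$$\sum_{j,n} \sqrt{\lambda - \lambda_{j,n}}\,a^{s,k}_{j,n}\,\overline{a^{s',k'}_{j,n}} = \sqrt{\lambda - \lambda_{s,k}}\,\delta,$$
which justifies both statements of the proposition. This completes the proof.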
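The averaging step in the proof above relies on the fact that a term of the form $e^{\pm 2ica}$ with real $c \ne 0$ has mean $O(1/A)$ over $a \in (A, 2A)$, while the constant terms are unchanged. The following minimal sketch illustrates this numerically; the function names and the sample value of $c$ (standing in for $\sqrt{\lambda - \lambda_{s,k}}$, assumed real and positive) are illustrative and not taken from the paper.

```python
import cmath

def averaged_phase(c, A):
    """Closed form of (1/A) * integral over a in (A, 2A) of exp(2i*c*a), for c != 0."""
    return (cmath.exp(4j * c * A) - cmath.exp(2j * c * A)) / (2j * c * A)

def averaged_phase_numeric(c, A, steps=200000):
    """Midpoint-rule approximation of the same average, used as an independent check."""
    h = A / steps
    total = 0j
    for i in range(steps):
        a = A + (i + 0.5) * h
        total += cmath.exp(2j * c * a)
    return total * h / A

if __name__ == "__main__":
    c = 1.3  # hypothetical value standing in for sqrt(lambda - lambda_{s,k})
    for A in (10.0, 100.0, 1000.0):
        # The modulus is bounded by 1/(c*A), so it decays as A grows.
        print(A, abs(averaged_phase(c, A)))
```

Since the numerator of the closed form has modulus at most 2, the average is bounded by $1/(cA)$, which is why only the non-oscillating terms survive in the limit $A \to \infty$.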
Scattering Solutions in Networks of Thin Fibers: Small Diameter Asymptotics
Proof of Theorem 3. Proposition 10 is equivalent to the relation $A^* A = I$ for $A = D^{1/2} T D^{-1/2}$, which provides the unitarity of the matrix $A$. If one applies Green's formula to the scattering solutions $\Psi_{s,k}$ and $\Psi_{s',k'}$, then the arguments used in the proof of Proposition 10 lead to the symmetry of $D^{1/2} T D^{-1/2}$. This completes the proof of Theorem 3.

3. Asymptotic Behavior of Scattering Solutions as ε → 0

We start with a study of scattering solutions in spider domains $\Omega_\varepsilon$.

Lemma 11. Theorem 6 is valid for spider domains.

Proof. The transformation $L_\varepsilon^{-1}$, see (37), maps the spider domain $\Omega_\varepsilon$ into the $\varepsilon$-independent domain $\Omega$ with the channels $C_j$, $1 \le j \le m$. The coordinates $(\tilde t, \tilde y)$ in $C_j$ are related to the coordinates $(t, y)$ in $C_{j,\varepsilon}$ via the formulas
$$\tilde t = t/\varepsilon, \qquad \tilde y = y/\varepsilon. \tag{44}$$
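The scattering solution $\tilde\Psi = \tilde\Psi_{s,k}$ of the problem in $\Omega$ has a form similar to (8):
$$\tilde\Psi_{s,k} = \delta_{s,j}\,e^{-i\sqrt{\lambda - \lambda_{s,k}}\,\tilde t}\varphi_{s,k}(\tilde y) + \sum_{n=0}^{m_j} t_{j,n}\,e^{i\sqrt{\lambda - \lambda_{j,n}}\,\tilde t}\varphi_{j,n}(\tilde y) + O(e^{-\gamma \tilde t}), \quad x \in C_j, \quad \tilde t \to \infty.$$
Since $\tilde\Psi_{s,k}$ is a smooth function, the remainder term $\tilde r$ in the formula above can be estimated for all values of $\tilde t$: $|\tilde r| \le C e^{-\gamma \tilde t}$, $x \in C_j$. Since the scattering solutions in the domains $\Omega_\varepsilon$ and $\Omega$ are related via the formula $\Psi^{(\varepsilon)}_{s,k}(x) = \tilde\Psi_{s,k}(L_\varepsilon^{-1} x)$, it follows that
$$\Psi^{(\varepsilon)}_{s,k} = \delta_{s,j}\,e^{-i\frac{\sqrt{\lambda - \lambda_{s,k}}}{\varepsilon} t}\varphi_{s,k}(y/\varepsilon) + \sum_{n=0}^{m_j} t_{j,n}\,e^{i\frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t}\varphi_{j,n}(y/\varepsilon) + r^{(\varepsilon)}, \qquad |r^{(\varepsilon)}| \le C e^{-\gamma t/\varepsilon}, \quad x \in C_j. \tag{45}$$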
Thus, the asymptotic expansion (15), (17) is valid, and it only remains to show that the GC (18) holds for the vectors $\varsigma = \varsigma_{s,k}$ determined by (45) (the definition of these vectors is given in the paragraph above formula (18)). We form the matrix $\Xi = \Xi(t)$ with columns $\varsigma_{s,k}$, taking them in the same order as the order chosen for the elements in each of these vectors (first we put the columns with $s = 1$ and $k = 1, 2, \ldots, m_1$, then the columns with $s = 2$, and so on). From (45) it follows that
$$\Xi(0) = I + T, \qquad \Xi'(0) = \frac{i}{\varepsilon} D(-I + T),$$
where $T$ is the scattering matrix, $I$ is the identity matrix of the same size, and $D$ is the diagonal matrix of the same size with elements $\sqrt{\lambda - \lambda_{j,n}}$ on the diagonal. Hence, $\varepsilon(I + T)D^{-1}\Xi'(0) + i(I - T)\Xi(0) = 0$ and GC (18) holds for the columns of the matrix $\Xi$. This completes the proof of the lemma.
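The final identity in the proof of Lemma 11 follows by direct substitution. The short derivation below is a sketch that writes $\Xi$ for the matrix whose columns are the vectors $\varsigma_{s,k}$ (a placeholder name chosen here, since the symbol is not legible in this copy) and uses only the two boundary values stated in the proof.

```latex
\begin{align*}
\varepsilon (I+T) D^{-1}\,\Xi'(0)
  &= \varepsilon (I+T) D^{-1}\cdot \frac{i}{\varepsilon}\, D\,(T - I)
   = i\,(I+T)(T-I) = i\,(T^{2} - I),\\
i\,(I-T)\,\Xi(0)
  &= i\,(I-T)(I+T) = i\,(I - T^{2}),\\
\varepsilon (I+T) D^{-1}\,\Xi'(0) + i\,(I-T)\,\Xi(0)
  &= i\,(T^{2}-I) + i\,(I-T^{2}) = 0 .
\end{align*}
```

The factors $I + T$ and $I - T$ commute because both are polynomials in $T$, and $D^{-1}D = I$ cancels the dependence on $D$; this is why the relation holds for every column of the matrix at once.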
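The following two lemmas about spider domains will be needed in order to prove Theorem 6 for general domains. Let $R_\lambda$ be the resolvent of the operator $H = -\varepsilon^2\Delta$ in a spider domain $\Omega_\varepsilon$, and let $\tilde R_\lambda$ be the resolvent of the similar operator $\tilde H = -\Delta$ in the domain $\Omega$ which is the image of $\Omega_\varepsilon$ under the map $L_\varepsilon^{-1}$, see (37). Note that the operator $R_\lambda$ and its domain, $L^2(\Omega_\varepsilon)$, depend on $\varepsilon$, while the operator $\tilde R_\lambda$ is $\varepsilon$-independent. Formula (38) implies

Lemma 12. The following relation holds:
$$R_\lambda = L_\varepsilon \tilde R_\lambda L_\varepsilon^{-1}.$$

Let us fix $m$ constants $t_j > 0$, $1 \le j \le m$. Let $\Omega_\varepsilon$ be a spider domain with the channels $C_{j,\varepsilon}$, $1 \le j \le m$. Consider the slices $D_{j,\varepsilon}$ of $C_{j,\varepsilon}$ defined by the inequalities $|t - t_j| \le 3\varepsilon$. Let $\Omega'_\varepsilon$ be a bounded domain which is obtained from $\Omega_\varepsilon$ by cutting off the infinite parts of the channels $C_{j,\varepsilon}$ on which $t \ge \tfrac34 t_j$. Let a function $h \in L^2(\Omega_\varepsilon)$ be supported in one of the domains $D_{j,\varepsilon}$, for example, with $j = s$. Below, when the resolvent $R_\lambda$ of the operator $H = -\varepsilon^2\Delta$ in $\Omega_\varepsilon$ is considered with $\lambda$ belonging to the continuous spectrum of the operator, $R_\lambda$ is understood in the sense of the analytic extension described in Theorem 4. We denote the Sobolev spaces of functions which are square integrable together with their derivatives of up to the second order by $H^2(\Omega'_\varepsilon)$ and $H^2(D_{j,\varepsilon})$.

Lemma 13. Let $\Omega_\varepsilon$ be a spider domain. Let $l$ be a bounded closed interval of the $\lambda$-axis that does not contain the points $\lambda_{j,n}$, and let a function $h \in L^2(\Omega_\varepsilon)$ be supported in the domain $D_{s,\varepsilon}$. Then there exists $\gamma = \gamma(\omega_j, l) > 0$ such that

(1)
$$R_\lambda h = \sum_{k=0}^{m_s} c_{s,k}\Psi_{s,k} + r_0 \ \text{in } \Omega'_\varepsilon, \qquad |r_0(x)| \le \frac{C e^{-\gamma/\varepsilon}}{\prod_{\lambda_j \in l}|\lambda - \lambda_j|}\,\|h\|_{L^2(\Omega_\varepsilon)}, \tag{46}$$
where $\Psi_{s,k}$ are scattering solutions, the coefficients $c_{s,k} = c^-_{s,k}(h)$ are given by (23), and $\lambda_j$ are eigenvalues of the operator $H$ in $\Omega_\varepsilon$ (see statement (1) of Theorem 4);

(2)
$$R_\lambda h = \delta_{s,j} R^{(s)}_\lambda h + \sum_{k=0}^{m_s} c_{s,k} \sum_{n=0}^{m_j} t^{s,k}_{j,n}\,e^{i\frac{\sqrt{\lambda - \lambda_{j,n}}}{\varepsilon} t}\varphi_{j,n}(y/\varepsilon) + r_j \ \text{in } D_{j,\varepsilon}, \tag{47}$$
where
$$\|r_j\|_{H^2(D_{j,\varepsilon})} \le \frac{C e^{-\gamma/\varepsilon}}{\prod_{\lambda_j \in l}|\lambda - \lambda_j|}\,\|h\|_{L^2(\Omega_\varepsilon)}.$$
Here $\delta_{s,j}$ is the Kronecker symbol ($\delta_{s,j} = 1$ if $s = j$, $\delta_{s,j} = 0$ if $s \ne j$), $R^{(s)}_\lambda$ is the resolvent of $-\Delta$ in the extended channel $C'_{s,\varepsilon}$ (the channel $C_{s,\varepsilon}$ extended to $-\infty$ along the $t$-axis), $c_{s,k} = c^-_{s,k}(h)$, and $t^{s,k}_{j,n}$ are the transmission coefficients (see the remark following Definition 2).

Proof. Let a function $\alpha \in C^\infty(\Omega_\varepsilon)$ have the form: $\alpha = 1$ in $C_{s,\varepsilon}$ when $t > \tfrac78 t_s + \varepsilon$, $\alpha = 0$ in $\Omega_\varepsilon \setminus C_{s,\varepsilon}$, and $\alpha = 0$ in $C_{s,\varepsilon}$ when $t < \tfrac78 t_s$. Consider the function
$$u = \alpha R^{(s)}_\lambda h + \sum_{k=0}^{m_s} c_{s,k}\left[\Psi_{s,k} - \alpha\,e^{-i\frac{\sqrt{\lambda - \lambda_{s,k}}}{\varepsilon} t}\varphi_{s,k}(y/\varepsilon)\right]. \tag{48}$$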
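Obviously, $u = 0$ on $\partial\Omega_\varepsilon$, since each term on the right-hand side above satisfies the Dirichlet boundary condition. Furthermore,
$$-\varepsilon^2\Delta u - \lambda u = h - \varepsilon^2\left[\nabla\alpha \cdot \nabla R^{(s)}_\lambda h + (\Delta\alpha) R^{(s)}_\lambda h - \nabla\alpha \cdot \nabla g - (\Delta\alpha) g\right], \tag{49}$$
where
$$g = \sum_{k=0}^{m_s} c_{s,k}\,e^{-i\frac{\sqrt{\lambda - \lambda_{s,k}}}{\varepsilon} t}\varphi_{s,k}(y/\varepsilon).$$
The right-hand side in (49) has the form $h + h_1$, where $h_1$ is supported in the slice $\tfrac78 t_s \le t \le \tfrac78 t_s + \varepsilon$ of $C_{s,\varepsilon}$. From Lemma 9 it follows that $\|h_1\|_{L^2(\Omega_\varepsilon)} \le C e^{-\gamma/\varepsilon}\|h\|_{L^2(\Omega_\varepsilon)}$. It is also clear that the behavior of the function $u$ at infinity is described by (6). Hence, $u = R_\lambda(h + h_1)$ due to statement (3) of Theorem 4. From here and (48) it follows that
$$R_\lambda h = \alpha R^{(s)}_\lambda h + \sum_{k=0}^{m_s} c_{s,k}\left(\Psi_{s,k} - \alpha\,e^{-i\frac{\sqrt{\lambda - \lambda_{s,k}}}{\varepsilon} t}\varphi_{s,k}(y/\varepsilon)\right) - R_\lambda h_1. \tag{50}$$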
This implies equality (46) with $r_0 = -R_\lambda h_1$. Let $\Omega''_\varepsilon$ be obtained from $\Omega_\varepsilon$ by cutting off the parts of the channels $C_{j,\varepsilon}$ where $t \ge \tfrac78 t_j$. Since operator (12) is meromorphic (due to Theorem 4) and has poles of at most first order due to the Stone formula, $\|r_0\|_{L^2(\Omega''_\varepsilon)}$ can be estimated by the right-hand side of inequality (46). Since $(-\varepsilon^2\Delta - \lambda) r_0 = 0$ in $\Omega''_\varepsilon$ and $r_0 = 0$ on the lateral side of $\partial\Omega''_\varepsilon$, standard a priori estimates for elliptic equations lead to estimates on $r_0$ in Sobolev norms in $\Omega'_\varepsilon$. These estimates, together with the Sobolev imbedding theorems, justify the estimate (46). Similarly, Eq. (47) follows from (50) and Lemma 12. This completes the proof of the lemma.

We need two more auxiliary statements in order to prove Theorem 6.

Lemma 14. Let a real-valued function $f$ belong to $C^{n+1}(R^1)$ and
$$\|f\|_{C^{n+1}} = \sum_{k=0}^{n+1} \sup_x |f^{(k)}| = A_+ < \infty, \tag{51}$$
$$\sum_{k=0}^{n} |f^{(k)}(x)| \ge A_- > 0, \quad x \in R^1. \tag{52}$$
Then for any $\sigma \le A_-/2$, the set $\Lambda_\sigma = \{x : |f(x)| \le \sigma\}$ has the following structure. There exists a constant $c$ which depends only on $A_\pm$ and $n$ and such that, for any bounded interval $\Delta \subset R^1$, a) the number of connected components of $\Lambda_\sigma$ in $\Delta$ is finite and does not exceed $c(|\Delta| + 1)$, b) the measure of each connected component of $\Lambda_\sigma$ in $\Delta$ does not exceed $c\sigma^{1/n}$.

Remark. The last estimate cannot be improved. In fact, if $f(x) = \sin^n x$ then $|\Lambda_\sigma \cap [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]| \sim 2\sigma^{1/n}$.
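The sharpness claim in the remark can be checked directly: for $f(x) = \sin^n x$ the sublevel set $\{x : |f(x)| \le \sigma\}$ inside $[-\pi/2, \pi/2]$ is the interval $|x| \le \arcsin(\sigma^{1/n})$, whose length is $2\arcsin(\sigma^{1/n}) \approx 2\sigma^{1/n}$ for small $\sigma$. The sketch below compares this closed form with a brute-force grid count; the function names are illustrative, and integer $n \ge 1$ and $0 < \sigma \le 1$ are assumed.

```python
import math

def sublevel_length_exact(n, sigma):
    """Length of {x in [-pi/2, pi/2] : |sin(x)**n| <= sigma}, for integer n >= 1, 0 < sigma <= 1."""
    return 2.0 * math.asin(sigma ** (1.0 / n))

def sublevel_length_grid(n, sigma, steps=400000):
    """Brute-force estimate of the same length using midpoints of a uniform grid."""
    h = math.pi / steps
    count = 0
    for i in range(steps):
        x = -math.pi / 2 + (i + 0.5) * h
        if abs(math.sin(x) ** n) <= sigma:
            count += 1
    return count * h

if __name__ == "__main__":
    n = 3
    for sigma in (1e-3, 1e-6, 1e-9):
        exact = sublevel_length_exact(n, sigma)
        # The ratio to 2 * sigma**(1/n) tends to 1 as sigma decreases.
        print(sigma, exact, exact / (2.0 * sigma ** (1.0 / n)))
```

The ratio printed in the last column approaches one as $\sigma \to 0$, which is the sense in which the exponent $1/n$ in part b) of Lemma 14 cannot be improved.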
Proof. We shall denote by $c_j$ different constants which depend on $A_\pm$ and $n$ but not on $f$. If $x \in \Lambda_\sigma$ then (52) implies that
$$\sum_{k=1}^{n} |f^{(k)}(x)| \ge A_-/2,$$
and therefore $|f^{(k)}(x)| \ge A_-/2n$ for the chosen $x$ and some $k = k(x)$, $1 \le k \le n$. Since $|f^{(k+1)}| \le A_+$, $x \in R^1$, there exists an interval $\Delta_x$ such that $x \in \Delta_x$, $|f^{(k)}(x)| \ge A_-/4n$ on $\Delta_x$, and $|\Delta_x| = c_0 = \frac{A_-}{4nA_+}$. The set of intervals $\Delta_x$ covers $\Lambda_\sigma \cap \Delta$. Hence, one can select a finite number of intervals $\Delta_x$ covering $\Lambda_\sigma \cap \Delta$. Then one can omit some of them in such a way that the remaining intervals still cover $\Lambda_\sigma \cap \Delta$ with multiplicity at most two. This leaves us with at most $2\left(\frac{|\Delta|}{c_0} + 1\right) \le c_1(|\Delta| + 1)$ intervals $\Delta_x$ covering $\Lambda_\sigma \cap \Delta$. Thus, it is enough to prove the lemma for an individual interval $\Delta'$ (one of the intervals $\Delta_x$) such that $|\Delta'| = c_0$ and $|f^{(k)}(x)| \ge c_2$ on $\Delta'$ for some fixed value of $k$, $1 \le k \le n$.

The equations $f(x) = \pm\sigma$ have at most $k$ solutions on $\Delta'$. In fact, if there exist $k + 1$ points where $f(x) = \sigma$ then there are $k$ intermediate points where $f'(x) = 0$. Thus, there are $k - 1$ points where $f''(x) = 0$, and so on. Finally, there has to be a point where $f^{(k)}(x) = 0$. This contradicts the assumption that $|f^{(k)}(x)| \ge c_2$ on $\Delta'$. Hence, the set $\Lambda_\sigma \cap \Delta'$ consists of at most $k + 1$ intervals. It remains only to show that the length of these intervals does not exceed $c\sigma^{1/k}$. In order to estimate this length, we assume that there is an interval $[x_1, x_1 + h]$, where $|f(x)| \le \sigma$, $|f^{(k)}(x)| \ge c_2$. Put $h' = h/k$ and consider the $k$-th difference
$$\Delta_k = f(x_1) - \binom{k}{1} f(x_1 + h') + \binom{k}{2} f(x_1 + 2h') - \cdots + (-1)^k f(x_1 + kh'). \tag{53}$$
There exists a point $\xi_k \in [x_1, x_1 + h]$ such that $|\Delta_k| = (h')^k |f^{(k)}(\xi_k)|$. Thus, $|\Delta_k| \ge \frac{c_2 h^k}{k^k} = c_3 h^k$. On the other hand, from (53) and the estimate $|f(x)| \le \sigma$ it follows that $|\Delta_k| \le \sigma 2^k$. Hence, $c_3 h^k \le \sigma 2^k$, i.e. $h \le c\sigma^{1/k}$. This completes the proof of the lemma.

Lemma 15. Let a set of functions $f_\varepsilon = f_\varepsilon(\lambda)$, $\varepsilon \to 0$, on a closed interval $l \subset R^1$, have the form
$$f_\varepsilon = \sum_{j=1}^{M} C_j(\lambda)\,e^{i\frac{g_j(\lambda)}{\varepsilon}}, \tag{54}$$
where the functions $C_j(\lambda)$ are real valued, the functions $g_j(\lambda)$ are analytic, there are no two functions $g_j(\lambda)$ whose difference is a constant, and
$$\sum_{j=1}^{M} |C_j(\lambda)| \ge 1. \tag{55}$$
Then, for any $\eta > 0$, the set
$$\Lambda_\eta(\varepsilon) = \{\lambda : |f_\varepsilon(\lambda)| \le e^{-\eta/\varepsilon}\} \tag{56}$$
is thin (see the definition in the introduction).
Proof. Consider the set $\Lambda^0$, where $g'_i(\lambda) = g'_j(\lambda)$ for some $i \ne j$. Due to the analyticity of the functions $g_j(\lambda)$, this set consists of a finite number of points. Let us denote the number of points in $\Lambda^0$ by $c_1$. Let $\Lambda^\delta$ be the $\delta/2$-neighborhood of $\Lambda^0$. Then
$$|g'_i(\lambda) - g'_j(\lambda)| \ge a(\delta) > 0, \quad i \ne j, \quad \lambda \in l \setminus \Lambda^\delta. \tag{57}$$
Consider the functions
$$\hat f_\varepsilon(\mu) = \sum_{j=1}^{M} C_j(\varepsilon\mu)\,e^{i\frac{g_j(\varepsilon\mu)}{\varepsilon}}, \quad \varepsilon\mu \in l \setminus \Lambda^\delta.$$
For any $k$,
$$\frac{d^k}{d\mu^k}\hat f_\varepsilon(\mu) = \sum_{j=1}^{M} [g'_j(\varepsilon\mu)]^k C_j(\varepsilon\mu)\,e^{i\frac{g_j(\varepsilon\mu)}{\varepsilon}} + O(\varepsilon). \tag{58}$$
We move the remainders to the left-hand side and consider (58) with $1 \le k \le M$ as equations for the unknowns $C_j(\varepsilon\mu)\,e^{i g_j(\varepsilon\mu)/\varepsilon}$. The matrix of this system of equations, with the elements $a_{j,k} = [g'_j(\varepsilon\mu)]^k$, is a Vandermonde matrix, and its determinant is bounded from below due to (57). This and (55) imply that
$$\sum_{k=1}^{M} \left|\frac{d^k}{d\mu^k}\hat f_\varepsilon(\mu)\right| \ge A_-(\delta) > 0$$
if $\varepsilon$ is small enough. It also follows from (58) that
$$\sum_{k=1}^{M+1} \left|\frac{d^k}{d\mu^k}\hat f_\varepsilon(\mu)\right| \le A_+.$$
Hence, Lemma 14 is applicable to at least one of the functions $\mathrm{Re}\,\hat f_\varepsilon(\mu)$ or $\mathrm{Im}\,\hat f_\varepsilon(\mu)$ on each connected interval of the set $l \setminus \Lambda^\delta$ stretched by a factor of $\varepsilon^{-1}$. Since we have at most $c_1 + 1$ of those intervals, this implies that the set $\{\lambda : |f_\varepsilon(\lambda)| \le \sigma\}$ can be covered by $\Lambda^\delta$ and $c_2(\delta)\varepsilon^{-1}$ intervals of length $c_2(\delta)\sigma^{1/M}$. We take $\sigma = e^{-\eta/\varepsilon}$, and this completes the proof of the lemma.

Proof of Theorem 6. The proof is based on a representation of the resolvent $R_\lambda$ of the problem in $\Omega_\varepsilon$ through the resolvents $R_{v,\lambda}$ of the operators $H = -\varepsilon^2\Delta$ in the spider domains $\Omega_{v,\varepsilon}$, formed by an individual junction, which corresponds to a vertex $v$, and all the channels with an end at this junction, where the channels are extended to infinity if they have finite length. Let us consider the slices $D_{j,\varepsilon}$ of the finite channels $C_{j,\varepsilon}$, $j > m$, defined by the conditions $t_j \le t \le t_j + 3\varepsilon$ where $t_j = 4l_j/5$. We construct the following partition of unity on $\Omega_\varepsilon$:
$$\sum_{v \in V} \phi_v = 1,$$
where $V$ is the set of all the vertices $v$ of the limiting graph $\Gamma$, $\phi_v \in C^\infty(\Omega_\varepsilon)$, and $\phi_v$ is defined as follows. The function $\phi_v$ is equal to one on the junction $J_v$, which corresponds to the vertex $v$, on the infinite channels adjacent to $J_v$, and on the parts of the
finite channels adjacent to $J_v$, where $t \le t_j + \varepsilon$. The function $\phi_v$ is equal to zero on the parts of the finite channels adjacent to $J_v$, where $t \ge t_j + 2\varepsilon$, and also on all the other junctions and channels which are not adjacent to $J_v$. Let $\psi_v \in C^\infty(\Omega_\varepsilon)$, $\psi_v = 1$ on the support of $\phi_v$, $\psi_v = 0$ on the parts of the finite channels adjacent to $J_v$, where $t \ge t_j + 3\varepsilon$, and also on all the other junctions and channels which are not adjacent to $J_v$.

We fix a vertex $v = \hat v$. Let $J_{\hat v}$ be the corresponding junction of $\Omega_\varepsilon$. We choose the parametrization on $\Gamma$ in such a way that the value $t = 0$ on all the edges adjacent to $\hat v$ corresponds to $\hat v$. The origin ($t = 0$) on all the other edges can be chosen at any end of the edge. We are going to justify the asymptotic expansion (15) in the domain $C(\hat v)$ consisting of the infinite channels adjacent to $J_{\hat v}$ and the parts $t < 3l_j/5$ of the finite channels $C_{j,\varepsilon}$ adjacent to $J_{\hat v}$. Moreover, it will be shown that the function $\varsigma$ in the asymptotic expansion satisfies Eq. (16), conditions (17) at infinity, and the GC (18). Since $\hat v$ is arbitrary and the union of all domains $C(\hat v)$, $\hat v \in V$, covers all the channels, the validity of (15) in $C(\hat v)$ justifies the statements of Theorem 6.

Let us show that the asymptotic expansion (15) in $C(\hat v)$ for any scattering solution $\Psi^{(\varepsilon)}_{s,k}$ follows from a similar expansion for functions of the form $u = R_\lambda f$, where $f \in L^2(\Omega_\varepsilon)$ is supported in $\cup D_{j,\varepsilon}$. In fact, let $u = \psi_{v_1}\Psi^{(\varepsilon)}_{s,k,v_1}$, where the vertex $v_1 = v_1(s)$ corresponds to the first junction $J_{v_1}$ encountered by the incident wave, $\Psi^{(\varepsilon)}_{s,k,v_1}$ is the solution of the scattering problem in the spider domain $\Omega_{v_1,\varepsilon}$, and the function $u$ is considered as a function in $\Omega_\varepsilon$ which is equal to zero outside of the support of $\psi_{v_1}$. Then
$$(-\varepsilon^2\Delta - \lambda)u = f, \qquad f := -\varepsilon^2\left[\nabla\psi_{v_1} \cdot \nabla\Psi^{(\varepsilon)}_{s,k,v_1} + (\Delta\psi_{v_1})\Psi^{(\varepsilon)}_{s,k,v_1}\right].$$
Obviously, $f \in L^2(\Omega_\varepsilon)$ and $f$ is supported in $\cup D_{j,\varepsilon}$. From statement (3) of Theorem 4 it follows that there exists a unique outgoing solution $v = R_\lambda f$ of the equation $(-\varepsilon^2\Delta - \lambda)v = f$, $\lambda \in l \setminus \{\lambda_j\}$. Then
$$\Psi^{(\varepsilon)}_{s,k} = \psi_{v_1}\Psi^{(\varepsilon)}_{s,k,v_1} - R_\lambda f,$$
since this function satisfies (7) and (8). From here and Lemma 11 it follows that the asymptotic expansion (15) and the properties of $\varsigma$ mentioned in Theorem 6 hold for $\Psi^{(\varepsilon)}_{s,k}$ in $C(\hat v)$ if the corresponding properties are valid for $R_\lambda f$ in $C(\hat v)$. Hence, the proof of the theorem will be complete as soon as we show that, for any $f \in L^2(\Omega_\varepsilon)$ with the support in $\cup D_{j,\varepsilon}$, the function $u = R_\lambda f$ has expansion (15) in $C(\hat v)$ with $\beta_{j,n} = 0$ and $\varsigma$ satisfying the GC (18). Consider the operator $P_\lambda$ defined by the formula
$$P_\lambda h = \sum_{v \in V} \psi_v R_{v,\lambda}(\phi_v h), \quad \lambda \in l, \tag{59}$$
where $h \in L^2(\Omega_\varepsilon)$ is supported in $\cup D_{j,\varepsilon}$, $l$ is defined in the statement of Theorem 6, and the resolvents $R_{v,\lambda}$ for real $\lambda \in l$ are understood in the sense of Theorem 4. We look for $u = R_\lambda f$ in the form of $P_\lambda h$ with an unknown $h \in L^2(\Omega_\varepsilon)$. This leads to the equation (compare with (27), (28))
$$h + F_\lambda h = f, \qquad F_\lambda h = -\varepsilon^2 \sum_{v \in V} \left[2\nabla\psi_v \cdot \nabla R_{v,\lambda}(\phi_v h) + (\Delta\psi_v) R_{v,\lambda}(\phi_v h)\right]. \tag{60}$$
From here, similarly to (34), it follows that
$$R_\lambda f = P_\lambda (I + F_\lambda)^{-1} f \tag{61}$$
for $\mathrm{Im}\,\lambda > 0$. Similarly to (35), one can show that the operator
$$(I + F_\lambda)^{-1} : L^2_a(\Omega_\varepsilon) \to L^2_a(\Omega_\varepsilon) \tag{62}$$
admits a meromorphic extension into the lower half plane $\mathrm{Im}\,\lambda \le 0$ with the branch points at $\lambda = \lambda_{j,n}$. The only difference is that now we use the operators $R_{v,\lambda}$ instead of $R^{(j)}_\lambda$, and $R_{v,\lambda}$ depend meromorphically on $\lambda$, while $R^{(j)}_\lambda$ are analytic in $\lambda$. So, one needs to refer to a version of the analytic Fredholm theorem where the operator may have poles (with residues of finite rank). This version of the theorem can be found in [2], and applications of this theorem to operators similar to (62) can be found in [19, 20]. Hence, formula (61) is established for all complex $\lambda$.

All the operators in (61) depend on $\varepsilon$. The function $(I + F_\lambda)^{-1} f$ is meromorphic in $\lambda$, and its poles depend on $\varepsilon$. In order to find a set on the interval $l$ where the operator $(I + F_\lambda)^{-1}$ exists and is bounded uniformly in $\lambda$ we shall use the following reduction of Eq. (60) to a system of equations where the domains of the operators do not depend on $\varepsilon$. Recall that $f$ is supported in $\cup D_{j,\varepsilon}$. Formula (60) for $F_\lambda$ implies that the function $F_\lambda h$ is also supported in $\cup D_{j,\varepsilon}$. Thus, the support of any solution $h$ of (60) belongs to $\cup D_{j,\varepsilon}$. We shall identify the functions $f$ and $h$ with vector functions whose components are the restrictions of $f$ and $h$, respectively, to the individual domains $D_{j,\varepsilon}$, $m + 1 \le j \le N$. Furthermore, we map $D_{j,\varepsilon}$ onto the $\varepsilon$-independent domain $D_j$ by the transformation $L_\varepsilon$ defined by the formulas
$$t - t_j = \varepsilon\tilde t, \qquad y = \varepsilon\tilde y. \tag{63}$$
This transformation differs from (44) by a shift in $t$ (compare with (37)). The vector functions of the variables $(\tilde t, \tilde y)$ with components from $L^2(D_j)$ defined by $f$, $h$ will be denoted by $\hat f$ and $\hat h$, respectively. Then Eq. (60) can be written in the form $\hat h + \hat F_\lambda \hat h = \hat f$, where $\hat F_\lambda$ is the $[(N - m) \times (N - m)]$-matrix operator which corresponds to the operator $F_\lambda$. Here $N - m$ is the number of finite channels in $\Omega_\varepsilon$. Recall that the entries of the vectors $\hat f$ and $\hat h$ are functions with the domains $D_j$, which do not depend on $\varepsilon$ (and $\lambda$). Our next goal is to describe how the entries of the matrix $\hat F_\lambda$ depend on $\varepsilon$ and $\lambda$. It will be done using (60), where each resolvent $R_{v,\lambda}$ can be specified using (47). The first term in the right-hand side of (47),
$$R^{(s)}_\lambda : L^2(C'_{s,\varepsilon}) \to L^2(C'_{s,\varepsilon}),$$
depends on $\varepsilon$. The transformation (63) maps $C'_{s,\varepsilon}$ onto the $\varepsilon$-independent cylinder $C'_s$. The operator
$$\hat R^{(s)}_\lambda := L_\varepsilon R^{(s)}_\lambda (L_\varepsilon)^{-1} : L^2(C'_s) \to L^2(C'_s)$$
does not depend on $\varepsilon$ (Lemma 12), and it depends meromorphically on $\lambda$. Thus, the contributions from the first term in the right-hand side of (47) to the entries of the matrix $\hat F_\lambda$ are operators which are $\varepsilon$-independent and meromorphic in $\lambda$. The rest of the terms in the right-hand side of (47) (other than the remainder) are operators of finite rank. Due to Lemma 13 (see also the formula (23) for $c_{s,k} = c^-_{s,k}$), the contributions of these
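terms to the entries of $\hat F_\lambda$ are $\varepsilon$-independent operators which are analytic in $\lambda$ and are of rank one, multiplied by functions $q_{v;j,n,s,k}$ of the form
$$q_{v;j,n,s,k}(\lambda, \varepsilon) = e^{i\frac{\alpha_j\sqrt{\lambda - \lambda_{j,n}} + \beta_s\sqrt{\lambda - \lambda_{s,k}}}{\varepsilon}}. \tag{64}$$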
Here $\alpha_j = t_j$ or $\alpha_j = l_j - t_j$ (independently, $\beta_s = t_s$ or $\beta_s = l_s - t_s$). Formula (47) leads to $\alpha_j = t_j$, $\beta_s = t_s$ if 1) the channels $C_{j,\varepsilon}$ and $C_{s,\varepsilon}$ are adjacent to a common junction, which corresponds to the vertex $v$, and 2) the parameter $t$ on both channels $C_{j,\varepsilon}$ and $C_{s,\varepsilon}$ is introduced in such a way that $t = 0$ at the vertex $v$. Other options in the choice of $\alpha_j$ and $\beta_s$ correspond to the opposite parametrization of the channels $C_{j,\varepsilon}$, $C_{s,\varepsilon}$, or both. If $C_{j,\varepsilon}$ and $C_{s,\varepsilon}$ do not have a common junction which corresponds to the vertex $v$ then $q_{v;j,n,s,k} = 0$. Thus, the matrix operator $\hat F_\lambda$ can be represented in the form
$$\hat F_\lambda = F^0_\lambda + \Big[\sum_{v;n,k} q_{v;j,n,s,k}(\lambda, \varepsilon) F^{j,n,s,k}_\lambda\Big]_{j,s>m} + R, \tag{65}$$
where $F^0_\lambda$, $F^{j,n,s,k}_\lambda$ are $\varepsilon$-independent operators, $F^0_\lambda$ is meromorphic in $\lambda$, the operators $F^{j,n,s,k}_\lambda$ are analytic in $\lambda$ and have rank one, and the summation extends over all the vertices $v$ and over $n \in [0, m_j]$, $k \in [0, m_s]$. The operator $R = R(\varepsilon, \lambda)$ corresponds to the remainder term in (47) and has the following estimate:
$$\|R\| \le \frac{C e^{-\gamma/\varepsilon}}{\prod_{\lambda_j \in l}|\lambda - \lambda_j|}.$$
Since the analytic Fredholm theorem [2] is applicable to the operator $I + \hat F_\lambda$, from (65) it follows that it is also applicable to the operator $I + F^0_\lambda$. Let $l^\delta$ be the $\delta/2$-neighborhood of the set consisting of both the poles $\hat\lambda_j$ of the operator $(I + F^0_\lambda)^{-1}$ located inside $l$ and the points $\lambda_j \in l$. Then
$$\|(I + F^0_\lambda)^{-1}\| \le C(\delta), \qquad \|R'\| \le C(\delta) e^{-\gamma/\varepsilon}, \qquad \lambda \in l \setminus l^\delta, \tag{66}$$
where $R' = R(I + F^0_\lambda)^{-1}$. Formula (65) implies, for $\lambda \in l \setminus l^\delta$,
$$(I + \hat F_\lambda)^{-1} = (I + F^0_\lambda)^{-1}\left(I + qG + R'\right)^{-1}, \tag{67}$$
where $qG$ is the matrix operator with the matrix elements
$$\sum_{v;n,k} q_{v;j,n,s,k}(\lambda, \varepsilon) G^{j,n,s,k}_\lambda, \qquad N - m < j, s \le N.$$
Here
$$G^{j,n,s,k}_\lambda = F^{j,n,s,k}_\lambda (I + F^0_\lambda)^{-1}.$$
The operators $G^{j,n,s,k}_\lambda$ are meromorphic in $\lambda$ and have rank one. The equation
$$(I + qG)x = g \tag{68}$$
for $x$ can be reduced to an equation in the finite-dimensional space $S$ spanned by the ranges of the operators $G^{j,n,s,k}_\lambda$. We fix a basis in $S$, reduce Eq. (68) to the algebraic system $A\hat x = \hat g$ for the coordinates of the projection of $x$ on $S$, and solve the system using Cramer's rule. Since the functions $q_{v;j,n,s,k}$ are bounded when $\lambda \in l$, the procedure described above allows us to estimate the norm of the operator $(I + qG)^{-1}$ through $|\det A|^{-1}$. Hence, there exists a polynomial $P = P(q_{v;j,n,s,k})$ of the variables $q_{v;j,n,s,k}$ which has the following properties. Its coefficients are meromorphic in $\lambda$ (with poles belonging to the set $\{\hat\lambda_j\} \cup \{\lambda_j\}$), the polynomial is linear with respect to each variable $q_{v;j,n,s,k}$, and is such that
$$\|(I + qG)^{-1}\| \le C|f_\varepsilon(\lambda)|^{-1}, \qquad \lambda \in l \setminus l^\delta, \qquad f_\varepsilon(\lambda) := 1 + P(q_{v;j,n,s,k}(\lambda, \varepsilon)).$$
The function $f_\varepsilon(\lambda)$ here has the form (54) with one of the $g_j$ identically equal to zero, and the corresponding coefficient $C_j$ equal to one. The latter implies (55). Thus, Lemma 15 can be applied to the function $f_\varepsilon(\lambda)$ above on each connected interval of the set $l \setminus l^\delta$. There are only finitely many such intervals. Thus, for any $\eta > 0$, there exists a thin set $\Lambda_\eta(\varepsilon)$ such that
$$\|(I + qG)^{-1}\| \le C e^{\eta/\varepsilon}, \qquad \lambda \in l \setminus \Lambda_\eta(\varepsilon).$$
We choose $\eta < \gamma$, where $\gamma$ is defined in (66). Then
$$\|(I + qG + R')^{-1}\| \le C e^{\gamma/2\varepsilon}, \qquad \lambda \in l \setminus \Lambda_\eta(\varepsilon),$$
when $\varepsilon$ is small enough. A similar estimate holds for operator (67):
$$\|(I + \hat F_\lambda)^{-1}\| \le C e^{\gamma/2\varepsilon}, \qquad \lambda \in l \setminus \Lambda_\eta(\varepsilon).$$
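The passage from the bound on $(I + qG)^{-1}$ to the bound on $(I + qG + R')^{-1}$ can be spelled out with a Neumann series. The derivation below is a sketch: $B$ is shorthand introduced here for $I + qG$, the constants $C$ are generic, and the condition $\eta \le \gamma/2$ in the last line is an assumption made here so that the exponent matches the displayed estimate (the text itself only states $\eta < \gamma$).

```latex
\begin{align*}
B + R' &= B\,(I + B^{-1}R'), \qquad B := I + qG,\\
\|B^{-1}R'\| &\le \|B^{-1}\|\,\|R'\| \le C\,e^{\eta/\varepsilon}\cdot C(\delta)\,e^{-\gamma/\varepsilon}
   = C\,C(\delta)\,e^{-(\gamma-\eta)/\varepsilon} \le \tfrac12
   \quad\text{for small } \varepsilon, \text{ since } \eta < \gamma,\\
\|(B + R')^{-1}\| &\le \|(I + B^{-1}R')^{-1}\|\,\|B^{-1}\|
   \le \frac{1}{1 - \|B^{-1}R'\|}\,C\,e^{\eta/\varepsilon}
   \le 2C\,e^{\eta/\varepsilon} \le 2C\,e^{\gamma/2\varepsilon}
   \quad\text{if } \eta \le \gamma/2 .
\end{align*}
```

Combined with the uniform bound on $(I + F^0_\lambda)^{-1}$ from (66), the factorization (67) then gives a bound of the same exponential form for the full inverse.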
Hence, the same estimate is valid for the operator $(I + F_\lambda)^{-1}$, and from (59), (61) it follows that
$$R_\lambda f = \sum_{v \in V} \psi_v R_{v,\lambda}(\phi_v h), \quad \lambda \in l, \tag{69}$$
where $h \in L^2(\Omega_\varepsilon)$ is supported in $\cup D_{j,\varepsilon}$, $j > m$, and
$$\|h\|_{L^2(\Omega_\varepsilon)} \le C e^{\gamma/2\varepsilon}\|f\|_{L^2(\Omega_\varepsilon)}, \quad \lambda \in l \setminus \Lambda_\eta(\varepsilon). \tag{70}$$
Relations (69), (70), (46), and Lemma 11 together provide the asymptotic expansion (15) for $R_\lambda f$ needed to complete the proof of Theorem 6.

The last result, which we are going to discuss now, concerns the limiting behavior of the GC as $\lambda$ approaches $\lambda_0$, the bottom of the absolutely continuous spectrum. We assume that the resolvent (12) does not have a pole at $k = \sqrt{\lambda_0}$. Obviously this assumption holds for generic domains $\Omega_\varepsilon$. Theorem 4 implies that this assumption is equivalent to the absence of bounded solutions of the homogeneous problem (5) with $\lambda = \lambda_0$. Recall that the scattering matrix (10) depends on $\lambda > \lambda_0$ and the GC (18) depend on both $\lambda > \lambda_0$ and $\varepsilon > 0$.
Proof of Theorem 7. Consider an infinite channel $C_s$, for which $\lambda_{s,0} = \lambda_0$. Let $\Psi^{(\varepsilon)}_{s,0}$ be the scattering solution which corresponds to the incident wave
$$\psi_{inc} = e^{-i\frac{\sqrt{\lambda - \lambda_{s,0}}}{\varepsilon} t}\varphi_{s,0}(y/\varepsilon).$$
Let $\phi_s \in C^\infty(\Omega_\varepsilon)$, $\phi_s = 1$ in the channel $C_s$ for $t \ge 2$, $\phi_s = 0$ in $C_s$ for $t \le 1$ and outside of $C_s$. We represent $\Psi^{(\varepsilon)}_{s,0}$ in the form
$$\Psi^{(\varepsilon)}_{s,0} = \phi_s\psi_{inc} + u, \quad \lambda > \lambda_0.$$
Then $u$ is the outgoing solution of the problem
$$(-\varepsilon^2\Delta - \lambda)u = f, \quad x \in \Omega_\varepsilon; \qquad u = 0 \ \text{on } \partial\Omega_\varepsilon,$$
where $f = -\varepsilon^2(\Delta\phi_s)\psi_{inc} - 2\varepsilon^2\nabla\phi_s\nabla\psi_{inc}$ has a compact support. Hence, $u = R_\lambda f$. From here, the second statement of Theorem 4, and the absence of a pole at $k = \sqrt{\lambda_0}$ it follows that $u$, if considered as an element of $L^2_{loc}(\Omega_\varepsilon)$, is analytic in $z = \sqrt{\lambda - \lambda_0}$ in a neighborhood of the point $z = 0$. Then from standard local a priori estimates for solutions of elliptic problems it follows that $u$ is analytic, if considered as an element of any Sobolev space of functions on any bounded part of $\Omega_\varepsilon$. Hence, the restrictions $u_j$ of $u$ to the cross-sections $t = 2$ of the infinite channels $C_j$ are analytic in $z$. Thus, for any infinite channel $C_j$, $u$ is an outgoing solution of the problem
$$(-\varepsilon^2\Delta - \lambda)u = 0, \quad x \in \Omega_\varepsilon \cap \{t > 2\}; \qquad u = 0 \ \text{on } \partial\Omega_\varepsilon \cap \{t > 2\}; \qquad u|_{t=2} = u_j. \tag{71}$$
Since $u_j$ is analytic in $z = \sqrt{\lambda - \lambda_0}$ in a neighborhood of the point $z = 0$, the coefficients $a_{j,n}$ in the asymptotic expansion (6) for the solution $u$ of (71) are analytic in $z$. This proves the analyticity of the scattering matrix.

From the analyticity of $u$ in $z$ and (71) it also follows that the scattering solution $\Psi^{(\varepsilon)}_{s,0}$, when $z = 0$, is a solution of the homogeneous problem (5) with $\lambda = \lambda_0$, and satisfies (13). Thus $\Psi^{(\varepsilon)}_{s,0} \equiv 0$ when $z = 0$ due to Theorem 4. This implies that $T = -I$ and completes the proof of the first statement. The second statement of the theorem is an obvious consequence of the analyticity of $T_v$ and (18). This completes the proof of the theorem.

Remarks concerning Theorem 7. 1) Consider a bounded domain $\Omega'_\varepsilon$ with one junction and several channels of finite length. Let $\Omega_\varepsilon$ be a spider type domain which one gets by extending the channels of $\Omega'_\varepsilon$ to infinity. The spectrum of the problem (5) in $\Omega'_\varepsilon$ is discrete, and there exists a sequence of eigenvalues which approach $\lambda_0$ as $\varepsilon \to 0$. Each of these eigenvalues has the form
$$\lambda_n(\varepsilon) = \lambda_0 + O(\varepsilon^2). \tag{72}$$
Theorem 7, concerning the problem in $\Omega_\varepsilon$, can be used to specify the asymptotic behavior (72) of the eigenvalues $\lambda_n(\varepsilon)$. The last statement of the theorem and (72) indicate that, for generic domains $\Omega_\varepsilon$, the asymptotic behavior of $\lambda_n(\varepsilon)$ as $\varepsilon \to 0$ (when $n$ is fixed or $n \to \infty$ not very fast) is the same as for the eigenvalues of the corresponding Dirichlet problem on the limiting graph with the Dirichlet GC at the vertex. This result will be discussed in more detail elsewhere.

2) Our paper [14] contains a mistake in the statement of Theorem 5.1 (which is a simplified version of Theorem 7 above) about the form of the GC at the bottom of the
absolutely continuous spectrum: k → 0 has to be replaced there by k → 0, ε → 0. The arguments in the last 5 lines of the proof are wrong, but can be easily corrected with the additional assumption that ε → 0. (Also, the index of summation in the formulas (5.2), (5.4), (5.6) of that paper must be n, not j.)

References
1. Birman, M.S.: Perturbations of the continuous spectrum of a singular elliptic operator by varying the boundary and the boundary conditions (Russian, English summary). Vestnik Leningrad. Univ. 17(1), 22–55 (1962)
2. Blekher, P.M.: On operators that depend meromorphically on a parameter. Moscow Univ. Math. Bull. 24(5–6), 21–26 (1972)
3. Dell’Antonio, G., Tenuta, L.: Quantum graphs as holonomic constraints. J. Math. Phys. 47, 072102:1–21 (2006)
4. Duclos, P., Exner, P.: Curvature-induced bound states in quantum waveguides in two and three dimensions. Rev. Math. Phys. 7, 73–102 (1995)
5. Freidlin, M., Wentzel, A.: Diffusion processes on graphs and averaging principle. Ann. Probab. 21(4), 2215–2245 (1993)
6. Exner, P., Seba, P.: Electrons in semiconductor microstructures: a challenge to operator theorists. In: Schrödinger Operators, Standard and Nonstandard (Dubna 1988), Singapore: World Scientific, 1989, pp. 79–100
7. Exner, P., Post, O.: Convergence of spectra of graph-like thin manifolds. J. Geom. Phys. 54, 77–115 (2005)
8. Kostrykin, V., Schrader, R.: Kirchhoff’s rule for quantum waves. J. Phys. A: Mathematical and General 32, 595–630 (1999)
9. Kuchment, P.: Graph models of wave propagation in thin structures. Waves in Random Media 12, 1–24 (2002)
10. Kuchment, P.: Quantum graphs. I. Some basic structures. Waves in Random Media 14(1), 107–128 (2004)
11. Kuchment, P.: Quantum graphs. II. Some spectral properties of quantum and combinatorial graphs. J. Phys. A: Mathematical and General 38(22), 4887–4900 (2005)
12. Kuchment, P., Zeng, H.: Convergence of spectra of mesoscopic systems collapsing onto a graph. J. Math. Anal. Appl. 258, 671–700 (2001)
13. Kuchment, P., Zeng, H.: Asymptotics of spectra of Neumann Laplacians in thin domains. In: Advances in Differential Equations and Mathematical Physics, Yu. Karpeshina etc. (editors), Contemporary Mathematics 387, Providence, RI: Amer. Math. Soc., 2003, pp. 199–213
14. Molchanov, S., Vainberg, B.: Transition from a network of thin fibers to quantum graph: an explicitly solvable model. Contemporary Mathematics 115, Providence, RI: Amer. Math. Soc., 2006, pp. 227–239
15. Novikov, S.: Schrödinger operators on graphs and symplectic geometry. The Arnold fest., Fields Inst. Commun. 24, Providence, RI: Amer. Math. Soc., 1999, pp. 397–413
16. Post, O.: Branched quantum wave guides with Dirichlet BC: the decoupling case. J. Phys. A: Mathematical and General 38(22), 4917–4932 (2005)
17. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, IV: Analysis of Operators. New York: Academic Press, A Subsidiary of Harcourt Brace Jovanovich, Publishers, 1978
18. Rubinstein, J., Schatzman, M.: Variational problems on multiply connected thin strips. I. Basic estimates and convergence of the Laplacian spectrum. Arch. Ration. Mech. Anal. 160(4), 293–306 (2001)
19. Vainberg, B.: On analytic properties of the resolvent for a class of sheaves of operators. Math. USSR-Sb. 6, 241–273 (1968)
20. Vainberg, B.: On short wave asymptotic behavior of solutions of stationary problems and the asymptotic behavior as t → ∞ of solutions of non-stationary problems. Russ. Math. Surv. 30(2), 1–58 (1975)

Communicated by B. Simon
Commun. Math. Phys. 273, 561–599 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0270-y
Communications in
Mathematical Physics
On Absolute Moments of Characteristic Polynomials of a Certain Class of Complex Random Matrices

Yan V. Fyodorov (1), Boris A. Khoruzhenko (2)

(1) School of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK. E-mail: [email protected]
(2) School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, UK. E-mail: [email protected]

Received: 31 January 2006 / Accepted: 8 March 2007
Published online: 31 May 2007 – © Springer-Verlag 2007
Abstract: The integer moments of the spectral determinant $|\det(zI-W)|^2$ of complex random matrices $W$ are obtained in terms of the characteristic polynomial of the positive-semidefinite matrix $WW^\dagger$ for the class of matrices $W=AU$, where $A$ is a given matrix and $U$ is random unitary. This work is motivated by studies of complex eigenvalues of random matrices and potential applications of the obtained results are discussed in this context.

The research in Nottingham was supported by EPSRC grant EP/C515056/1: “Random Matrices and Polynomials: a tool to understand complexity”. Part of this work was carried out during the Newton Institute programme on Random Matrix Approaches in Number Theory.

1. Introduction

The characteristic polynomials of random matrices have recently attracted considerable interest in the mathematical physics literature. Initially, the interest was stimulated by applications in number theory [35, 36], quantum chaos [3, 21, 27] and quantum chromodynamics (QCD) [43, 46, 29, 22], but with the emerging connections to integrable systems [39, 47], combinatorics [16, 44], representation theory [9, 10, 12, 15] and analysis [5] it has become apparent that the characteristic polynomials of random matrices are also of independent interest.

In this paper we are concerned with the integer moments of the squared modulus of the characteristic polynomial of complex random matrices in a rather general class of matrices $W=AU$, where $A\ge 0$ is fixed and $U$ is a random unitary matrix distributed uniformly over the unitary group. In the particular case when $A$ is an identity matrix, the matrix $W$ is unitary and its eigenvalues lie on the unit circle. Various moments of the characteristic polynomial for this class of matrices were obtained recently, see [35, 36, 13-15]. In the general case, the eigenvalues of $W=AU$ will be distributed in a region in the complex plane. Eigenvalue statistics of such complex eigenvalues, and in particular the mean eigenvalue density, are of interest for physics of open chaotic systems, see, e.g. [24, 25], and in QCD, see, e.g. [47] and references therein, and are difficult to study analytically. In this context moments of the squared modulus of the characteristic polynomial frequently provide a very useful tool. Indeed, in a variety of random matrix ensembles the mean eigenvalue density,

$$\rho(x,y)=\big\langle \operatorname{tr}\,\delta(zI-W)\big\rangle_W,\qquad z=x+iy, \tag{1.1}$$

can be expressed in terms of the mean-square-modulus of the characteristic polynomial in a closely related random matrix ensemble. In (1.1) the angle brackets stand for the average over the matrix distribution, and $I$ is the identity matrix.

An obvious example is provided by the Ginibre ensemble of complex matrices [31]. In this ensemble the matrix distribution has density $\mathrm{Const.}\,\exp(-\operatorname{tr} WW^\dagger)$, where $W^\dagger$ is the complex conjugate transpose of $W$. The mean density $\rho_n(x,y)$ of eigenvalues of the Ginibre matrices of size $n\times n$ is given by

$$\rho_n(x,y)=\frac{1}{\pi}\,e^{-|z|^2}\sum_{k=0}^{n-1}\frac{|z|^{2k}}{k!}. \tag{1.2}$$
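Formula (1.2) is easy to evaluate directly. The following is a minimal sketch, not taken from the paper: the function names `ginibre_density` and `total_mass` are illustrative. It evaluates the density and checks numerically that it integrates to $n$ over the plane, as a density of $n$ eigenvalues should.

```python
import math

def ginibre_density(z, n):
    """Mean eigenvalue density (1.2) of n x n Ginibre matrices at the complex point z."""
    r2 = abs(z) ** 2
    # Truncated exponential series: sum over k = 0..n-1 of |z|^(2k) / k!
    partial_sum = sum(r2 ** k / math.factorial(k) for k in range(n))
    return math.exp(-r2) * partial_sum / math.pi

def total_mass(n, r_max=12.0, steps=20000):
    """Integrate the density over the disk |z| < r_max in polar coordinates (midpoint rule)."""
    h = r_max / steps
    mass = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        mass += ginibre_density(r, n) * 2.0 * math.pi * r * h
    return mass
```

The density is radially symmetric, so a one-dimensional radial integral suffices; the cut-off `r_max` is an assumption that is harmless for moderate $n$ because of the factor $e^{-|z|^2}$.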
One can arrive at (1.2) in various ways. Ginibre computed the joint probability density function of the eigenvalues and then applied the method of orthogonal polynomials. Another way is to use the method of dimensional reduction, see e.g. [45, 17, 18], which gives the following relation:

$$\rho_n(x,y)=\frac{e^{-|z|^2}}{\pi (n-1)!}\,\big\langle |\det(zI_{n-1}-W_{n-1})|^2\big\rangle_{W_{n-1}}. \tag{1.3}$$
Here the angle brackets stand for averaging over the Ginibre ensemble of complex matrices of size $(n-1)\times(n-1)$. The mean-square on the right-hand side in (1.3) can be easily computed, giving again (1.2).

A less obvious example, which in fact provided the initial impetus for the present study, is the so-called ensemble of ‘random contractions’ [25]. In its simplest variant of rank-one deviations from unitarity, these are random $n\times n$ matrices satisfying the constraint

$$W_n^\dagger W_n=\begin{pmatrix}1-\gamma & 0\\ 0 & I_{n-1}\end{pmatrix},\qquad 0<\gamma<1. \tag{1.4}$$

In the ‘polar’ coordinates, $W_n=G_nU_n$, where $U_n$ is a CUE$_n$ matrix, i.e. a matrix drawn at random from the unitary group $U(n)$, and $G_n=\operatorname{diag}(\sqrt{1-\gamma},1,\ldots,1)$. The mean density of eigenvalues$^1$ of $W_n$ can be expressed as the mean-square-modulus of the characteristic polynomial of $(n-1)\times(n-1)$ matrices $\tilde G_{n-1}U_{n-1}$,

$$\rho_n(x,y)=\frac{n-1}{\pi\gamma|z|^2}\left(\frac{\tilde\gamma}{\gamma}\right)^{n-2}\big\langle|\det(zI_{n-1}-\tilde G_{n-1}U_{n-1})|^2\big\rangle_{U_{n-1}},\qquad 1-\gamma<|z|^2<1, \tag{1.5}$$

where now the angle brackets stand for averaging over the unitary group $U(n-1)$ with respect to the normalized Haar measure, and

$$\tilde G_{n-1}=\operatorname{diag}\big(\sqrt{1-\tilde\gamma},1,\ldots,1\big),\qquad \tilde\gamma=\frac{|z|^2+\gamma-1}{|z|^2}.$$

$^1$ Note that constraint (1.4) implies that the eigenvalues of $W$ lie in the annulus $1-\gamma\le|z|^2\le 1$.
Another example is provided by finite-rank deviations from Hermiticity [23]. We only consider the simplest but still non-trivial case of rank-one deviations. Let

$$W_n=H_n+i\Gamma_n, \tag{1.6}$$

where $H_n$ is a GUE$_n$ matrix, i.e. random Hermitian matrix of size $n\times n$ with probability distribution

$$dP_{\beta,n}(H)=\mathrm{Const.}\;e^{-\frac{\beta}{2}\operatorname{tr}H^2}\prod_{j=1}^{n}dH_{jj}\prod_{1\le j<k\le n}\frac{dH_{jk}\,d\overline{H}_{jk}}{2},\qquad \beta>0,$$

and $\Gamma_n=\operatorname{diag}(\gamma,0,\ldots,0)$, $\gamma>0$. For the mean eigenvalue density $\rho_n(x,y)$ of matrices (1.6) we have

$$\rho_n(x,y)=r_{\beta,n}(x,y)\,\big\langle|\det(zI_{n-1}-(H_{n-1}+i\tilde\Gamma_{n-1}))|^2\big\rangle_{H_{n-1}},\qquad 0<y<\gamma, \tag{1.7}$$
where βx 2
β n (γ − y)n−2 e− 2 −β(γ −y)y rβ,n (x, y) = , √ 4 2πβ γ n−1 (n − 2)!
Γ˜n−1 = diag(γ − y, 0, . . . , 0),
and the angle brackets stand for averaging with respect to the distribution $dP_{\beta,n-1}(H)$. We derive (1.5) and (1.7) in Sect. 6.

The above formulas relating the mean eigenvalue density and the mean-square-modulus of the characteristic polynomial are specific to the considered matrix distributions. In the general situation, the mean density of eigenvalues can be determined from the fractional moments of $|\det(zI-W)|^2=\det(zI-W)(zI-W)^\dagger$ (e.g., by way of the logarithmic potential of the eigenvalue distribution), or from averages of ratios of $\det[(zI-W)(zI-W)^\dagger+\varepsilon^2I]$, see e.g. [25]. Getting explicit formulas for objects of that kind outside the classes of Hermitian and unitary matrices is, however, a considerable challenge. Although it is known that the fractional moments of $|\det(zI-W)|^2$ can be written in terms of a hypergeometric function of matrix argument $WW^\dagger$ [39], the corresponding series are hard to deal with in the limit of the infinite matrix dimension.

Our main result, Theorem 1, expresses

$$\Big\langle \det{}^{\pm m}\big[(zI-AU)(zI-AU)^\dagger\big]\Big\rangle_U,\qquad m=1,2,\ldots,$$

where the integration is over the unitary matrices $U$ with respect to the Haar measure, as an $m$-fold integral of powers of the characteristic polynomial of $AA^\dagger$. This integral can be written as an $m\times m$ determinant with entries given by a certain integral transform of the characteristic polynomial of $AA^\dagger$, see (2.11)-(2.12). In particular, this result implies that for the ensembles of random complex matrices $W$ with unitary invariant matrix distribution (e.g., for the Feinberg-Zee ensemble [19]) our formulas effectively reduce the original non-Hermitian problem to a Hermitian one, albeit on the level of characteristic polynomials. This, as explained in more detail at the end of the next section, has a clear computational advantage, as one can then use various formulas for the averages of products and ratios of the characteristic polynomials of Hermitian matrices which have been obtained recently, see [11, 26, 5]. In contrast, with the exception of essentially Gaussian weights [1, 2], no such formulas are known for complex matrices.
We also express

$$\left\langle\frac{1}{\det[(zI-AU)(zI-AU)^\dagger+\varepsilon^2I]}\right\rangle_U$$

as a two-fold integral of the inverse spectral determinant of $AA^\dagger$, see Theorem 2. Again, the non-Hermitian problem is reduced to a Hermitian one. This regularized inverse spectral determinant can be useful as an indicator of the domain of the distribution of complex eigenvalues.

2. Statement of Main Results and Discussion

Let $n$ and $m$ be positive integers. Define

$$d\mu_n(t_1,\ldots,t_m)=\frac{1}{c_n}\,\Delta^2(t_1,\ldots,t_m)\prod_{j=1}^{m}(1+t_j)^{-n-2m}\,dt_1\ldots dt_m,\qquad t_j\ge 0, \tag{2.1}$$

and, for $n\ge 2m$,

$$d\nu_n(t_1,\ldots,t_m)=\frac{1}{k_n}\,\Delta^2(t_1,\ldots,t_m)\prod_{j=1}^{m}(1-t_j)^{n-2m}\,dt_1\ldots dt_m,\qquad 0\le t_j\le 1, \tag{2.2}$$

where

$$\Delta(t_1,\ldots,t_m)=\det\big[t_i^{\,m-j}\big]_{i,j=1}^{m}=\prod_{1\le i<j\le m}(t_i-t_j) \tag{2.3}$$

is the Vandermonde determinant, and

$$c_n=\prod_{j=0}^{m-1}\frac{j!\,(j+1)!\,(n+j)!}{(n+m+j)!}\qquad\text{and}\qquad k_n=\prod_{j=0}^{m-1}\frac{j!\,(j+1)!\,(n-m-j-1)!}{(n-j-1)!} \tag{2.4}$$

are the normalization constants. The Selberg Integral, see e.g. [38], asserts that $d\mu_n$ and $d\nu_n$ are unit mass measures,

$$\int_0^\infty\!\!\ldots\int_0^\infty d\mu_n(t_1,\ldots,t_m)=\int_0^1\!\!\ldots\int_0^1 d\nu_n(t_1,\ldots,t_m)=1.$$
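The unit-mass statement can be checked in exact rational arithmetic for small $m$. The sketch below is an illustration rather than the paper's argument: it uses the determinantal identity quoted in the Remark after Theorem 1 to write $\int\Delta^2\prod w(t_j)\,dt = m!\det[\int t^{i+j}w(t)\,dt]_{i,j=0}^{m-1}$, evaluates the one-dimensional moments as Beta functions, and compares with $c_n$ and $k_n$ from (2.4). The helper names (`det`, `beta`, `mass_mu`, `mass_nu`) are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def det(a):
    """Exact determinant via the Leibniz formula (adequate for the small sizes used here)."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def beta(p, q):
    """Euler Beta function B(p, q) for positive integers p and q."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

def c_n(n, m):
    out = Fraction(1)
    for j in range(m):
        out *= Fraction(factorial(j) * factorial(j + 1) * factorial(n + j), factorial(n + m + j))
    return out

def k_n(n, m):
    out = Fraction(1)
    for j in range(m):
        out *= Fraction(factorial(j) * factorial(j + 1) * factorial(n - m - j - 1),
                        factorial(n - j - 1))
    return out

def mass_mu(n, m):
    # int_0^inf t^a (1+t)^(-n-2m) dt = B(a+1, n+2m-a-1)
    gram = [[beta(i + j + 1, n + 2 * m - i - j - 1) for j in range(m)] for i in range(m)]
    return factorial(m) * det(gram) / c_n(n, m)

def mass_nu(n, m):
    # int_0^1 t^a (1-t)^(n-2m) dt = B(a+1, n-2m+1); requires n >= 2m
    gram = [[beta(i + j + 1, n - 2 * m + 1) for j in range(m)] for i in range(m)]
    return factorial(m) * det(gram) / k_n(n, m)
```

Working with `Fraction` avoids any floating-point tolerance: both masses should come out as exactly 1 whenever the parameters are admissible.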
The measures $d\mu_n$ and $d\nu_n$ define probability distributions which have the following random matrix interpretation. Consider two families of matrix distributions on the space of $m\times m$ complex matrices $Z=\big(x_{jk}+iy_{jk}\big)_{j,k=1}^{m}$:

$$d\hat\mu_n(Z)=\frac{1}{\hat c_n}\,\frac{1}{\det{}^{n+2m}(I_m+ZZ^\dagger)}\prod_{j,k=1}^{m}dx_{jk}\,dy_{jk},\qquad n\ge 0, \tag{2.5}$$

and

$$d\hat\nu_n(Z)=\frac{1}{\hat k_n}\,\det{}^{n-2m}(I_m-ZZ^\dagger)\prod_{j,k=1}^{m}dx_{jk}\,dy_{jk},\qquad n\ge 2m. \tag{2.6}$$
The measures $d\hat\nu_n(Z)$ are defined on the matrix ball $ZZ^\dagger<I_m$ and the constants $\hat c_n$ and $\hat k_n$ are determined by the normalization condition

$$\int_{ZZ^\dagger\ge 0}d\hat\mu_n(Z)=\int_{0\le ZZ^\dagger\le I_m}d\hat\nu_n(Z)=1.$$

A standard calculation, see e.g. [30], shows that if $s(ZZ^\dagger)$ is a symmetric function of the eigenvalues $t_1,\ldots,t_m$ of $ZZ^\dagger$, i.e. $s(ZZ^\dagger)=s(t_1,\ldots,t_m)$, then

$$\int_{ZZ^\dagger\ge 0}s(ZZ^\dagger)\,d\hat\mu_n(Z)=\int_0^{+\infty}\!\!\cdots\int_0^{+\infty}s(t_1,\ldots,t_m)\,d\mu_n(t_1,\ldots,t_m), \tag{2.7}$$

$$\int_{0\le ZZ^\dagger\le I_m}s(ZZ^\dagger)\,d\hat\nu_n(Z)=\int_0^{1}\!\!\cdots\int_0^{1}s(t_1,\ldots,t_m)\,d\nu_n(t_1,\ldots,t_m). \tag{2.8}$$
Theorem 1 below, which we state in slightly more generality than required for spectral determinants, tells how to integrate moments of determinants over the unitary group equipped with the Haar measure $dU$ fixed by the normalization $\int_{U(n)}dU=1$.

Theorem 1. Let $A$, $B$, $C$, $D$ be complex matrices of size $n\times n$.

(i) For any positive integer $m$,

$$\int_{U(n)}\det{}^{m}\big[(AU+C)(BU+D)^\dagger\big]\,dU=\int_0^\infty\!\!\ldots\int_0^\infty\prod_{j=1}^{m}\det(CD^\dagger+t_jAB^\dagger)\,d\mu_n(t_1,\ldots,t_m). \tag{2.9}$$

(ii) If $AA^\dagger<CC^\dagger$ and $BB^\dagger<DD^\dagger$ then for any positive integer $m$ such that $2m\le n$,

$$\int_{U(n)}\frac{dU}{\det{}^{m}\big[(AU+C)(BU+D)^\dagger\big]}=\int_0^1\!\!\ldots\int_0^1\frac{d\nu_n(t_1,\ldots,t_m)}{\prod_{j=1}^{m}\det(CD^\dagger-t_jAB^\dagger)}. \tag{2.10}$$
Remark. Identities (2.9)-(2.10) may be written in yet another form by making use of the well-known identity, see e.g. [40] Part 2, Problem 68,

$$\int\!\!\ldots\!\!\int\det\big[p_j(t_i)\big]_{i,j=1}^{m}\,\det\big[q_j(t_i)\big]_{i,j=1}^{m}\,dt_1\ldots dt_m=m!\,\det\left[\int p_i(t)\,q_j(t)\,dt\right]_{i,j=1}^{m}.$$
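The identity holds for any integration measure, so it can be tested exactly by replacing the integral with a weighted sum over finitely many points. The sketch below does this with rational arithmetic; the points, weights and test functions are arbitrary choices made for illustration, and `both_sides` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def det(a):
    """Exact determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def both_sides(points, weights, p_funcs, q_funcs):
    """Return (lhs, rhs) of the identity for the discrete measure sum_k weights[k] * delta(points[k])."""
    m = len(p_funcs)
    lhs = Fraction(0)
    # The m-fold "integral" becomes a sum over all m-tuples of sample points.
    for idx in product(range(len(points)), repeat=m):
        ts = [points[i] for i in idx]
        weight = Fraction(1)
        for i in idx:
            weight *= weights[i]
        p_det = det([[p(t) for p in p_funcs] for t in ts])
        q_det = det([[q(t) for q in q_funcs] for t in ts])
        lhs += weight * p_det * q_det
    # Single determinant of one-dimensional "integrals" of p_i * q_j.
    gram = [[sum(w * p(t) * q(t) for t, w in zip(points, weights)) for q in q_funcs]
            for p in p_funcs]
    return lhs, factorial(m) * det(gram)
```

For example, with four points and $m=3$ the left-hand side is a sum over $4^3$ triples, while the right-hand side needs only a single $3\times 3$ determinant, which is the computational point of the Remark.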
We have

$$\int_{U(n)}\det{}^{m}\big[(AU+C)(BU+D)^\dagger\big]\,dU=\frac{m!}{c_n}\,\det\left[\int_0^{+\infty}\frac{\det(CD^\dagger+tAB^\dagger)\,t^{i+j}}{(1+t)^{n+2m}}\,dt\right]_{i,j=0}^{m-1} \tag{2.11}$$

and

$$\int_{U(n)}\frac{dU}{\det{}^{m}\big[(AU+C)(BU+D)^\dagger\big]}=\frac{m!}{k_n}\,\det\left[\int_0^{1}\frac{(1-t)^{n-2m}\,t^{i+j}}{\det(CD^\dagger-tAB^\dagger)}\,dt\right]_{i,j=0}^{m-1}, \tag{2.12}$$

where the constants $c_n$ and $k_n$ are given in (2.4).

Obviously, by letting $C=D=zI$ in (2.9) and (2.10) one obtains formulas for the integer moments of the spectral determinants $|\det(zI-AU)|^2$. In particular,

$$\int_{U(n)}|\det(zI-AU)|^2\,dU=(n+1)\int_0^\infty\frac{\det(|z|^2I+tAA^\dagger)}{(1+t)^{n+2}}\,dt \tag{2.13}$$

and, provided $n\ge 2$,

$$\int_{U(n)}\frac{dU}{|\det(zI-AU)|^2}=\begin{cases}\displaystyle\int_0^1\frac{(n-1)(1-t)^{n-2}}{\det(AA^\dagger-t|z|^2I)}\,dt, & \text{if } |z|^2<\lambda_{\min}(AA^\dagger);\\[3mm] \displaystyle\int_0^1\frac{(n-1)(1-t)^{n-2}}{\det(|z|^2I-tAA^\dagger)}\,dt, & \text{if } |z|^2>\lambda_{\max}(AA^\dagger),\end{cases} \tag{2.14}$$
where $\lambda_{\min}(AA^\dagger)$ and $\lambda_{\max}(AA^\dagger)$ are respectively the smallest and the largest eigenvalues of $AA^\dagger$.

If $\lambda_{\min}(AA^\dagger)\le|z|^2\le\lambda_{\max}(AA^\dagger)$, then the integral on the left-hand side in (2.14) should be handled with care. One way to do this is to regularize the integrand. For positive $\varepsilon$, define

$$R_{z,\varepsilon}(A,A^\dagger)=\int_{U(n)}\frac{dU}{\det\left[\varepsilon^2I+\left(I-\frac{1}{z}AU\right)\left(I-\frac{1}{z}AU\right)^\dagger\right]}.$$

The integral on the right-hand side is, in fact, a function of $AA^\dagger$ and our next theorem evaluates this function in terms of the eigenvalues of $AA^\dagger$.

Theorem 2. Let $\varepsilon>0$, and assume that $n\ge 2$. Then for any $n\times n$ matrix $A$ and any non-zero complex $z$,

$$R_{z,\varepsilon}(A,A^\dagger)=\frac{n-1}{2\pi i}\int_0^1(1-t)^{n-2}\,dt\int_{-\infty}^{+\infty}\frac{dx}{x}\,\frac{1}{\det\left[\frac{1}{|z|^2}AA^\dagger+(\varepsilon^2-t)I-i\varepsilon\sqrt{t}\left(x+\frac{1}{x}\right)I\right]}.$$

If the eigenvalues $a_j^2$ of $AA^\dagger$ are all distinct then for any $z$ in the annulus $\lambda_{\min}(AA^\dagger)<|z|^2<\lambda_{\max}(AA^\dagger)$ we have

$$\lim_{\varepsilon\to 0}\frac{R_{z,\varepsilon}(A,A^\dagger)}{\ln(1/\varepsilon^2)}=(n-1)\,|z|^2\sum_{j=1}^{n}(|z|^2-a_j^2)^{n-2}\,\theta(|z|^2-a_j^2)\prod_{k\ne j}\frac{1}{a_k^2-a_j^2}, \tag{2.15}$$

where $\theta$ is the Heaviside step function.
We prove Theorems 1 and 2 in Sects. 4 and 5, respectively, by making use of two techniques, which to a certain extent are equivalent. One is based on the expansion of moments of the spectral determinants in characters of the unitary group and subsequent use of the orthogonality of the characters. In this way, Theorem 1 is equivalent to two combinatorial identities (3.18) and (3.19), one of which is a particular case of the Selberg integral in the form of Kaneko [34] and Kadell [33]. We prove (3.18) and (3.19) in Sect. 3. These combinatorial identities can be stated in the form of matrix integrals (3.27) and (3.28) and are of independent interest. They lead to evaluation of some nontrivial matrix integrals, as discussed at the end of Sect. 3. The other technique is based on the so-called color-flavor transformation, due to Zirnbauer [49]. This transformation has many uses, and in the random matrix context it provides a very convenient tool to handle moments of spectral determinants.

As an application of Theorem 1, let us consider random matrices (1.4). In the limit $n\to\infty$ the eigenvalues of $W_n$ get closer and closer to the unit circle. Let $N_n(a,b)$ be the number of eigenvalues of $W_n$ in the annulus

$$D_{a,b}=\left\{z:\ \frac{2a}{n}\le 1-|z|^2\le\frac{2b}{n}\right\},\qquad 0<a<b.$$

By (1.5),

$$\big\langle N_n(a,b)\big\rangle_{U(n)}=\int_{D_{a,b}}\left(\frac{n-1}{\pi\gamma|z|^2}\left(\frac{\tilde\gamma}{\gamma}\right)^{n-2}\int_{U(n-1)}|\det(zI_{n-1}-\tilde G_{n-1}U_{n-1})|^2\,dU_{n-1}\right)dx\,dy.$$

Making use of (2.13),

$$\int_{U(n-1)}|\det(zI_{n-1}-\tilde G_{n-1}U_{n-1})|^2\,dU_{n-1}=n\int_0^\infty\frac{\big(|z|^2+t(1-\tilde\gamma)\big)(|z|^2+t)^{n-2}}{(1+t)^{n+1}}\,dt$$

and

$$\frac{1}{n}\big\langle N_n(a,b)\big\rangle_{U(n)}=\pi\int_{1-\frac{2b}{n}}^{1-\frac{2a}{n}}f_n(q)\,dq,$$

where

$$f_n(q)=\frac{n-1}{\pi\gamma q}\left(\frac{\tilde\gamma}{\gamma}\right)^{n-2}\int_0^\infty\frac{[q+t(1-\tilde\gamma)](q+t)^{n-2}}{(1+t)^{n+1}}\,dt$$

with $\tilde\gamma=(|z|^2+\gamma-1)/|z|^2$ and $q=|z|^2$. Letting $n\to\infty$, we obtain, after simple manipulations,

$$\lim_{n\to\infty}\frac{1}{n}\big\langle N_n(a,b)\big\rangle_{U(n)}=\frac{\sinh a}{a}\exp\left[\frac{a(\gamma-2)}{\gamma}\right]-\frac{\sinh b}{b}\exp\left[\frac{b(\gamma-2)}{\gamma}\right],$$

recovering one of the formulas of [25], in which, using a different method requiring knowledge of the joint probability distribution of eigenvalues, they found the mean density of eigenvalues and higher order correlation functions for the general case of finite-rank deviation from the CUE. Note that when $\gamma=1$ the nonzero eigenvalues of
$G_nU_n$ coincide with those of the $(n-1)\times(n-1)$ matrix obtained from $U_n$ by removing its first row and column, see [50] for more information about eigenvalue statistics of truncated unitary matrices.

Now, we would like to elaborate on the point made at the end of the Introduction. Consider random complex matrices $W$ of the size $n\times n$ with unitary invariant matrix distribution. Then, by making use of the unitary invariance and Theorem 1,

$$\big\langle|\det(zI-W)|^2\big\rangle_W=\left\langle\int_{U(n)}|\det(zI-WU)|^2\,dU\right\rangle_W=(n+1)\int_0^\infty\frac{p_n(|z|^2t)}{(1+t)^{n+2}}\,dt, \tag{2.16}$$

where $p_n(x)=\big\langle\det(xI+WW^\dagger)\big\rangle_W$; the prefactor $n+1$ is the one appearing in (2.13), and the argument $|z|^2t$ arises from the change of variable $t\to 1/t$ there. A similar formula holds for higher order moments of $|\det(zI-W)|^2$. Thus, Theorem 1 reduces the original non-Hermitian problem to a Hermitian one.

The integral on the right-hand side in (2.16) can be evaluated, in the limit of the infinite matrix dimension, in terms of the limiting eigenvalue distribution of $WW^\dagger$. To this end, consider, for example, the complex $n\times n$ matrices $W$ with the matrix distribution characterized by the Feinberg-Zee density

$$\mathrm{Const.}\;e^{-n\operatorname{tr}V(WW^\dagger)}, \tag{2.17}$$

where $V(r)$ is a polynomial in $r$, $V(r)=a_mr^m+\ldots$, $a_m>0$. Then

$$p_n(x)=e^{\,n\int\ln(x+\lambda)\,dw(\lambda)}\,(1+o(1)),$$

where $dw(\lambda)$ is the limiting normalized eigenvalue counting measure of $WW^\dagger$, and it follows that

$$\lim_{n\to\infty}\frac{1}{n}\ln\big\langle|\det(zI-W)|^2\big\rangle_W=\Phi(x,y),$$

where

$$\Phi(x,y)=\begin{cases}\ln|z|^2 & \text{if } |z|^2>m_1=\displaystyle\int_0^\infty\lambda\,dw(\lambda),\\[3mm] \displaystyle\int_0^\infty\ln\lambda\,dw(\lambda) & \text{if } |z|^2<1/m_{-1}=\left(\displaystyle\int_0^\infty\lambda^{-1}\,dw(\lambda)\right)^{-1},\\[3mm] \ln|z|^2+\displaystyle\int_0^\infty\ln\frac{\lambda+t_0}{|z|^2+t_0}\,dw(\lambda) & \text{if } 1/m_{-1}<|z|^2<m_1,\end{cases} \tag{2.18}$$

where $t_0$ is the unique non-negative solution of

$$\int_0^\infty\frac{dw(\lambda)}{\lambda+t}=\frac{1}{|z|^2+t}.$$

The function $\Phi(x,y)$ is subharmonic and, hence, defines a measure $d\nu=\frac{1}{4\pi}\Delta\Phi$ in the complex plane. Here $\Delta$ is the Laplacian in variables $x$ and $y$. For the Ginibre ensemble of random matrices this measure can be found explicitly. In this case $V(r)=r$ and $WW^\dagger$ is the Wishart ensemble of random matrices. Its limiting eigenvalue distribution $dw(\lambda)$ is given by $dw(\lambda)=\frac{1}{2\pi}\sqrt{(4-\lambda)/\lambda}\,d\lambda$, $0<\lambda<4$, with $m_1=1$ and $m_{-1}=\infty$.
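For the Ginibre case just described, the third branch of (2.18) can be evaluated numerically. The sketch below rests on two assumptions made for illustration: the potential is read as $\ln|z|^2+\int\ln\frac{\lambda+t_0}{|z|^2+t_0}\,dw(\lambda)$, and the substitution $\lambda=4\sin^2\theta$ is used, which turns $dw(\lambda)$ into $\frac{4}{\pi}\cos^2\theta\,d\theta$ on $(0,\pi/2)$. It solves the equation for $t_0$ by bisection and returns the potential as a function of $q=|z|^2$; the function names are hypothetical.

```python
import math

STEPS = 2000  # number of midpoint nodes in theta

def mp_average(func):
    """Average func(lam) over dw(lam) = sqrt((4 - lam)/lam) / (2 pi) d lam on (0, 4).

    With lam = 4 sin^2(theta) the measure becomes (4/pi) cos^2(theta) d theta on (0, pi/2).
    """
    h = (math.pi / 2) / STEPS
    total = 0.0
    for i in range(STEPS):
        theta = (i + 0.5) * h
        total += func(4.0 * math.sin(theta) ** 2) * math.cos(theta) ** 2
    return total * h * 4.0 / math.pi

def saddle_point(q):
    """Solve int dw(lam)/(lam + t) = 1/(q + t) for t > 0, where q = |z|^2 lies in (0, 1)."""
    def mismatch(t):
        return mp_average(lambda lam: 1.0 / (lam + t)) - 1.0 / (q + t)
    lo, hi = 1e-9, 1e6
    for _ in range(60):
        mid = math.sqrt(lo * hi)  # bisection on a logarithmic scale
        if mismatch(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def potential(q):
    """Third branch of (2.18) evaluated for the law dw above."""
    t0 = saddle_point(q)
    return math.log(q) + mp_average(lambda lam: math.log((lam + t0) / (q + t0)))
```

Evaluating `potential(q)` for a few values of `q` in $(0,1)$ gives a direct numerical comparison with the closed form $|z|^2-1$ that the text states next.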
A straightforward but tedious calculation shows that $\Phi(x,y)=|z|^2-1$ inside the unit disk $|z|^2<1$. Therefore $d\nu$ is the uniform distribution on the unit disk, which is the same as the limiting eigenvalue distribution in the Ginibre ensemble of random matrices, and, hence, for this ensemble

$$\lim_{n\to\infty}\frac{1}{n}\big\langle\ln|\det(zI-W)|^2\big\rangle_W=\lim_{n\to\infty}\frac{1}{n}\ln\big\langle|\det(zI-W)|^2\big\rangle_W, \tag{2.19}$$

so that the operations of taking the logarithm and average commute in the limit $n\to\infty$. A similar relation is known to hold for Wigner ensembles of Hermitian matrices [6]. It would be interesting to investigate conditions on random matrix distributions which guarantee (2.19). As the left-hand side in (2.19) is the logarithmic potential of the limiting eigenvalue distribution of $W$, this together with our Theorem 1 would give a useful tool for calculating eigenvalue distributions in the complex plane. There are indications that the range of matrix distributions for which (2.19) holds is quite wide and contains the invariant ensembles (2.17). Indeed, $\Phi(x,y)$ of Eq. (2.18) reproduces the density of eigenvalue distribution in ensembles (2.17) which was obtained in [19, 20] with the help of the method of Hermitization$^2$.

In this context we would like to mention calculation of Brown's measure for R-diagonal elements in finite von Neumann algebras [28], see also [8]. A matrix model for such elements is provided by random matrices $RU$, where $U$ is random unitary and $R$ is positive-definite, and Brown's measure is in a way a regularized version of the eigenvalue distribution. Again, $\Phi(x,y)$ of Eq. (2.18) reproduces Brown's measure found in [28].

3. Combinatorial Identities

Schur functions. In order to make our paper self-contained we recall below the required facts from the theory of symmetric polynomials. A partition is a finite sequence $\lambda=(\lambda_1,\lambda_2,\ldots,\lambda_n)$ of integers, called parts, such that $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n\ge 0$. The weight of a partition, $|\lambda|$, is the sum of its parts, $|\lambda|=\sum_j\lambda_j$, and the length, $l(\lambda)$, is the number of its non-zero parts. No distinction is made between partitions which differ merely by the number of zero parts, and different partitions of weight $r$ represent different ways to write $r$ as a sum of natural numbers. Partitions can be viewed as Young diagrams.
The Young diagram of $\lambda$ is a rectangular array of boxes (or dots), with $\lambda_j$ boxes in the $j$th row, the rows being lined up on the left. By transposing the diagram of $\lambda$ (i.e. interchanging the rows and columns) one obtains another partition. This partition is called the conjugate of $\lambda$ and denoted by $\lambda'$. For example the conjugate of the partition $(r)$ of length one is the partition $(1,\ldots,1)\equiv(1^r)$ of length $r$. Obviously, $l(\lambda')=\lambda_1$ and $|\lambda|=|\lambda'|$.

For any partition $\lambda$ of length $l(\lambda)\le n$,

$$s_\lambda(x_1,\ldots,x_n)=\frac{\det\big[x_i^{\lambda_j+n-j}\big]_{i,j=1}^{n}}{\det\big[x_i^{\,n-j}\big]_{i,j=1}^{n}} \tag{3.1}$$

is a symmetric polynomial in $x_1,\ldots,x_n$, homogeneous of degree $|\lambda|$. These polynomials are known as the Schur functions. By convention, $s_\lambda(x_1,\ldots,x_n)=0$ if $l(\lambda)>n$.

$^2$ This method has a hidden regularization procedure which has to be justified to satisfy mathematical rigor.
This convention is in agreement with the apparent identities

$$s_\lambda(x_1,\ldots,x_{n-1},0)=s_\lambda(x_1,\ldots,x_{n-1})\qquad\text{if } l(\lambda)\le n-1, \tag{3.2}$$

$$s_\lambda(x_1,\ldots,x_{n-1},0)=0\qquad\text{if } l(\lambda)>n-1. \tag{3.3}$$

For partitions of length one, $\lambda=(r)$, the Schur functions $s_\lambda$ are the complete symmetric functions $h_r$,

$$s_{(r)}(x_1,\ldots,x_n)=h_r(x_1,\ldots,x_n)=\sum_{1\le i_1\le i_2\le\ldots\le i_r\le n}x_{i_1}x_{i_2}\ldots x_{i_r}, \tag{3.4}$$

and $s_{\lambda'}$ are the elementary symmetric functions $e_r$,

$$s_{(1^r)}(x_1,\ldots,x_n)=e_r(x_1,\ldots,x_n)=\sum_{1\le i_1<i_2<\ldots<i_r\le n}x_{i_1}x_{i_2}\ldots x_{i_r}. \tag{3.5}$$

More generally, see e.g. [37] p. 41, the Jacobi-Trudi identity asserts that for any $n\ge l(\lambda)$,

$$s_\lambda=\det\big[h_{\lambda_i-i+j}\big]_{i,j=1}^{n},\qquad s_{\lambda'}=\det\big[e_{\lambda_i-i+j}\big]_{i,j=1}^{n}, \tag{3.6}$$

where, by convention, $e_r=h_r=0$ if $r<0$.

We shall also need the Schur functions of matrix argument. If $M$ is an $n\times n$ matrix then $s_\lambda(M)=s_\lambda(x_1,\ldots,x_n)$, where $x_1,\ldots,x_n$ are the eigenvalues of $M$. Thus $s_\lambda(M)$ is a symmetric polynomial in the eigenvalues of $M$. In view of (3.6), it is also a polynomial in the matrix entries of $M$.

The Schur functions of the matrix argument are the characters of irreducible representations of the general linear group and its unitary subgroup and, as a consequence, have an important property of orthogonality. If $\lambda$ and $\mu$ are two partitions and $A$ and $B$ are two $n\times n$ matrices then, see e.g. [37] p. 445,

$$\int_{U(n)}s_\lambda(AU)\,\overline{s_\mu(BU)}\,dU=\delta_{\lambda,\mu}\,\frac{s_\lambda(AB^\dagger)}{d_\lambda}, \tag{3.7}$$

and

$$\int_{U(n)}s_\lambda(AUBU^\dagger)\,dU=\frac{s_\lambda(A)\,s_\lambda(B)}{d_\lambda}, \tag{3.8}$$

where $d_\lambda$ is the dimension of the irreducible representations of $U(n)$ with signature $\lambda$, $d_\lambda=s_\lambda(I_n)=s_\lambda(1^n)$. We use the notation $(1^n)$ for the $n$-tuple $(1,\ldots,1)$. If $\lambda$ is a partition of length 1, $\lambda=(r)$, then

$$s_\lambda(1^n)=h_r(1^n)=\binom{n+r-1}{r}=\frac{(n+r-1)!}{r!\,(n-1)!}, \tag{3.9}$$

and

$$s_{\lambda'}(1^n)=e_r(1^n)=\binom{n}{r}=\frac{n!}{r!\,(n-r)!}. \tag{3.10}$$
In general, explicit expressions are known for $s_\lambda(1^n)$ and $s_{\lambda'}(1^n)$ in terms of the $\lambda_j$'s. If $l(\lambda)\le n$ then for any $m\ge l(\lambda)$,

$$s_\lambda(1^n)=\prod_{1\le i<j\le m}(\lambda_i-i-\lambda_j+j)\ \prod_{j=1}^{m}\frac{(n+\lambda_j-j)!}{(m+\lambda_j-j)!\,(n-j)!}. \tag{3.11}$$

If $l(\lambda')\le n$ then for any $m\ge l(\lambda)$,

$$s_{\lambda'}(1^n)=\prod_{1\le i<j\le m}(\lambda_i-i-\lambda_j+j)\ \prod_{j=1}^{m}\frac{(n+j-1)!}{(n+j-1-\lambda_j)!\,(m+\lambda_j-j)!}. \tag{3.12}$$
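The two product formulas can be cross-checked against the Jacobi-Trudi determinant (3.6) evaluated at $x=(1^n)$, where $h_r(1^n)=\binom{n+r-1}{r}$ by (3.9). The following is a small exact sketch; the function names are illustrative, and it assumes $m\le n$ so that every factorial argument in (3.11) is non-negative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def det(a):
    """Exact determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def conjugate(lam):
    """Conjugate partition: column lengths of the Young diagram of lam."""
    lam = [part for part in lam if part > 0]
    return [sum(1 for part in lam if part > k) for k in range(lam[0])] if lam else []

def dim_jacobi_trudi(lam, n):
    """s_lambda(1^n) = det[h_{lam_i - i + j}(1^n)], using h_r(1^n) = C(n + r - 1, r)."""
    lam = [part for part in lam if part > 0]
    def h(r):
        return comb(n + r - 1, r) if r >= 0 else 0
    size = len(lam)
    return det([[h(lam[i] - i + j) for j in range(size)] for i in range(size)])

def vandermonde_factor(lam, m):
    """Product over i < j of (lam_i - i - lam_j + j); it is unchanged by 0- versus 1-based indexing."""
    out = Fraction(1)
    for i in range(m):
        for j in range(i + 1, m):
            out *= lam[i] - i - lam[j] + j
    return out

def dim_formula_311(lam, n, m):
    """Right-hand side of (3.11); lam is padded with zeros to length m."""
    lam = list(lam) + [0] * (m - len(lam))
    out = vandermonde_factor(lam, m)
    for j in range(1, m + 1):
        lj = lam[j - 1]
        out *= Fraction(factorial(n + lj - j), factorial(m + lj - j) * factorial(n - j))
    return out

def dim_formula_312(lam, n, m):
    """Right-hand side of (3.12), i.e. s_{lambda'}(1^n); needs lam_1 <= n."""
    lam = list(lam) + [0] * (m - len(lam))
    out = vandermonde_factor(lam, m)
    for j in range(1, m + 1):
        lj = lam[j - 1]
        out *= Fraction(factorial(n + j - 1), factorial(n + j - 1 - lj) * factorial(m + lj - j))
    return out
```

For instance, `dim_formula_311((2, 1), 3, 2)` should reproduce the dimension 8 of the adjoint representation of $U(3)$, and padding the same partition to $m=3$ should leave the value unchanged.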
Both identities can be derived by evaluating the binomial determinants in (3.6).

We shall also need the Cauchy identities for Schur functions, see, e.g., [37], pp. 63, 65. Let $X$ be an $n\times n$ matrix. Then

$$\prod_{i=1}^{m}\det(I_n+t_iX)=\sum_\lambda s_\lambda(t_1,\ldots,t_m)\,s_{\lambda'}(X), \tag{3.13}$$

$$\prod_{i=1}^{m}\frac{1}{\det(I_n-t_iX)}=\sum_\lambda s_\lambda(t_1,\ldots,t_m)\,s_\lambda(X). \tag{3.14}$$

The summation in (3.13) is over all partitions such that $l(\lambda)\le m$ and $l(\lambda')\le n$ and is finite. The summation in (3.14) is over all partitions such that $l(\lambda)\le\min(m,n)$ and is infinite. The corresponding series converges absolutely if $XX^\dagger<I_n$.

Beta-function determinants. When $m=1$ identities (3.13) and (3.14) take the familiar form of the expansion of the characteristic polynomial and its reciprocal in terms of the elementary symmetric functions and complete symmetric functions, respectively. In this case Theorem 1 is a straightforward consequence of the orthogonality property of the Schur functions (3.7) and the Euler integral

$$\int_0^1t^{p-1}(1-t)^{q-1}\,dt=\int_0^{+\infty}\frac{t^{p-1}}{(1+t)^{p+q}}\,dt=\frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}=B(p,q),\qquad \operatorname{Re}p,\ \operatorname{Re}q>0, \tag{3.15}$$

where $\Gamma(p)$ and $B(p,q)$ are the Gamma and Beta functions respectively. Indeed, for example,

$$\int_{U(n)}\det(I+AU)\,\det(I+BU)^\dagger\,dU=\sum_{r=0}^{n}\frac{e_r(AB^\dagger)}{e_r(1^n)}=(n+1)\int_0^\infty\frac{\det(I+tAB^\dagger)}{(1+t)^{n+2}}\,dt,$$

where we have used (3.15) in the form

$$\frac{1}{e_r(1^n)}=(n+1)\int_0^{+\infty}\frac{t^r}{(1+t)^{n+2}}\,dt. \tag{3.16}$$

Similarly,

$$\frac{1}{h_r(1^n)}=(n-1)\int_0^1t^r(1-t)^{n-2}\,dt, \tag{3.17}$$

and this identity does the trick for the reciprocal characteristic polynomials.
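Identities (3.16) and (3.17) are plain Beta-function evaluations and can be confirmed in exact arithmetic. The sketch below also evaluates the $m=1$ moment (2.13) as the finite sum $\sum_r|z|^{2(n-r)}e_r(AA^\dagger)/\binom{n}{r}$, which is what (2.13) becomes after expanding the determinant in powers of $t$ and applying (3.16) term by term. The helper names are hypothetical, and the inputs are the squared modulus `z2` and the list `a2` of eigenvalues of $AA^\dagger$.

```python
from fractions import Fraction
from math import comb, factorial

def beta(p, q):
    """Euler Beta function B(p, q) for positive integers p and q."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

def inverse_e(n, r):
    """Right-hand side of (3.16): (n+1) * int_0^inf t^r (1+t)^(-n-2) dt = (n+1) B(r+1, n-r+1)."""
    return (n + 1) * beta(r + 1, n - r + 1)

def inverse_h(n, r):
    """Right-hand side of (3.17): (n-1) * int_0^1 t^r (1-t)^(n-2) dt = (n-1) B(r+1, n-1)."""
    return (n - 1) * beta(r + 1, n - 1)

def elementary_symmetric(values):
    """Coefficients e_0, ..., e_n of prod_j (1 + x * values[j])."""
    coeffs = [Fraction(1)]
    for v in values:
        shifted = [Fraction(0)] + coeffs          # coefficients multiplied by x
        padded = coeffs + [Fraction(0)]
        coeffs = [c + v * s for c, s in zip(padded, shifted)]
    return coeffs

def mean_square_modulus(z2, a2):
    """Haar average of |det(zI - AU)|^2 from (2.13), as a finite sum over r."""
    n = len(a2)
    e = elementary_symmetric([Fraction(x) for x in a2])
    return sum(Fraction(z2) ** (n - r) * e[r] / comb(n, r) for r in range(n + 1))
```

As a sanity check on the design, taking $A=I$ and $|z|=1$ gives $n+1$, the second moment of the characteristic polynomial of a CUE matrix on the unit circle.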
Our proof of Theorem 1 uses the following generalization of (3.16)-(3.17) to the multivariate setting.

Lemma 3. Let $m$ and $n$ be nonnegative integers.

(a) For any partition $\lambda$ such that $l(\lambda)\le m$ and $l(\lambda')\le n$,

$$\frac{s_\lambda^2(1^m)}{s_{\lambda'}(1^n)}=\frac{1}{c_n}\int_0^\infty\!\!\ldots\int_0^\infty s_\lambda(t_1,\ldots,t_m)\,\Delta^2(t_1,\ldots,t_m)\prod_{j=1}^{m}\frac{dt_j}{(1+t_j)^{n+2m}}. \tag{3.18}$$

(b) If $2m\le n$ then for any partition $\lambda$ such that $l(\lambda)\le m$,

$$\frac{s_\lambda^2(1^m)}{s_\lambda(1^n)}=\frac{1}{k_n}\int_0^1\!\!\ldots\int_0^1 s_\lambda(t_1,\ldots,t_m)\,\Delta^2(t_1,\ldots,t_m)\prod_{j=1}^{m}(1-t_j)^{n-2m}\,dt_j. \tag{3.19}$$

The normalization constants $c_n$ and $k_n$ are given in (2.4), and $\Delta(t_1,\ldots,t_m)$ is the Vandermonde determinant (2.3).

Remark. Identity (3.19) can be inferred from a generalization of the Selberg Integral due to Kaneko [34] and Kadell [33], of which (3.19) is a particular case. However, we are not aware of any generalization of the Selberg Integral leading to (3.18). Below, we give an elementary proof of (3.18) and (3.19) based on evaluation of a determinant consisting of Beta functions. Our proof has limited scope and does not extend to the generality of Kaneko and Kadell formulas.
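Both parts of Lemma 3 can be tested exactly for small partitions. The sketch below is an independent check, not the paper's proof: it writes $s_\lambda\Delta=\det[t_i^{f_j}]$ with $f_j=m+\lambda_j-j$, reduces the $m$-fold integral to $m!\det[\int t^{f_i+m-j}w(t)\,dt]$, and evaluates the moments by (3.15). Dimensions $s_\lambda(1^n)$ are taken from the Jacobi-Trudi determinant. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def det(a):
    """Exact determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def beta(p, q):
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

def conjugate(lam):
    lam = [part for part in lam if part > 0]
    return [sum(1 for part in lam if part > k) for k in range(lam[0])] if lam else []

def schur_at_ones(lam, n):
    """s_lambda(1^n) via Jacobi-Trudi (3.6) with h_r(1^n) = C(n + r - 1, r)."""
    lam = [part for part in lam if part > 0]
    def h(r):
        return comb(n + r - 1, r) if r >= 0 else 0
    size = len(lam)
    return det([[h(lam[i] - i + j) for j in range(size)] for i in range(size)])

def integral_side(lam, m, moment, norm):
    """(1/norm) * int s_lambda * Delta^2 * weight = m! det[moment(f_i + m - j)] / norm."""
    padded = list(lam) + [0] * (m - len(lam))
    f = [m + padded[j] - (j + 1) for j in range(m)]
    gram = [[moment(f[i] + m - (j + 1)) for j in range(m)] for i in range(m)]
    return factorial(m) * det(gram) / norm

def lemma3a(lam, m, n):
    """Return (lhs, rhs) of (3.18); needs l(lam) <= m and lam_1 <= n."""
    c = Fraction(1)
    for j in range(m):
        c *= Fraction(factorial(j) * factorial(j + 1) * factorial(n + j), factorial(n + m + j))
    lhs = schur_at_ones(lam, m) ** 2 / schur_at_ones(conjugate(lam), n)
    rhs = integral_side(lam, m, lambda a: beta(a + 1, n + 2 * m - a - 1), c)
    return lhs, rhs

def lemma3b(lam, m, n):
    """Return (lhs, rhs) of (3.19); needs l(lam) <= m and 2m <= n."""
    k = Fraction(1)
    for j in range(m):
        k *= Fraction(factorial(j) * factorial(j + 1) * factorial(n - m - j - 1),
                      factorial(n - j - 1))
    lhs = schur_at_ones(lam, m) ** 2 / schur_at_ones(lam, n)
    rhs = integral_side(lam, m, lambda a: beta(a + 1, n - 2 * m + 1), k)
    return lhs, rhs
```

For $\lambda=(1)$, $m=2$, $n=3$ both sides of (3.18) equal $4/3$, and for $m=1$ the two functions reduce to (3.16) and (3.17).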
Proof. Let $f_j=m+\lambda_j-j$, $j=1,2,\ldots,m$. If $l(\lambda)\le m\le n$ and $l(\lambda')\le n$ then by (3.11)-(3.12),

$$\frac{s_\lambda^2(1^m)}{s_{\lambda'}(1^n)}=\prod_{j=0}^{m-1}\frac{1}{j!^2\,(n+j)!}\ \Delta(f_1,\ldots,f_m)\prod_{j=1}^{m}f_j!\,(n+m-1-f_j)!,$$

where

$$\Delta(f_1,\ldots,f_m)=\prod_{1\le i<j\le m}(f_i-f_j)=\det\big[f_j^{\,m-i}\big]_{i,j=1}^{m}.$$

By adding rows in the Vandermonde determinant $\det\big[f_j^{\,m-i}\big]$,

$$\Delta(f_1,\ldots,f_m)=\det\big[p_{m-i}(f_j)\big]_{i,j=1}^{m},$$

where $p_k(x)=(x+1)(x+2)\ldots(x+k)$. Hence

$$f_1!\,f_2!\ldots f_m!\,\Delta(f_1,\ldots,f_m)=\det\big[(f_j+m-i)!\big]_{i,j=1}^{m},$$

and

$$\frac{s_\lambda^2(1^m)}{s_{\lambda'}(1^n)}=\prod_{j=0}^{m-1}\frac{(n+m+j)!}{(j!)^2\,(n+j)!}\ \det\big[B(f_j+m-i+1,\ n+m-f_j)\big]_{i,j=1}^{m}, \tag{3.20}$$

where $B$ is the Beta function. By making use of Proposition 4 below,

$$\det\big[B(f_j+m-i+1,\ n+m-f_j)\big]_{i,j=1}^{m}=\det\big[B(f_j+m-i+1,\ n+m-f_j+i-1)\big]_{i,j=1}^{m}=\det\left[\int_0^{+\infty}\frac{t^{f_j}\,t^{m-i}}{(1+t)^{n+2m}}\,dt\right]_{i,j=1}^{m}.$$

It is apparent that

$$\det\left[\int_0^{+\infty}\frac{t^{f_j}\,t^{m-i}}{(1+t)^{n+2m}}\,dt\right]_{i,j=1}^{m}=\int_0^{+\infty}\!\!\cdots\int_0^{+\infty}s_\lambda(t_1,\ldots,t_m)\,\det\big[t_i^{\,m-j}\big]\prod_{i=1}^{m}\frac{t_i^{\,m-i}\,dt_i}{(1+t_i)^{n+2m}}$$

$$=\frac{1}{m!}\int_0^{+\infty}\!\!\cdots\int_0^{+\infty}s_\lambda(t_1,\ldots,t_m)\,\det{}^2\big[t_i^{\,m-j}\big]\prod_{i=1}^{m}\frac{dt_i}{(1+t_i)^{n+2m}},$$

and (3.18) follows.

Similarly, if $2m\le n$ and $l(\lambda)\le m$ then by (3.11),

$$\frac{s_\lambda^2(1^m)}{s_\lambda(1^n)}=\prod_{j=0}^{m-1}\frac{(n-j-1)!}{j!^2}\ \Delta(f_1,\ldots,f_m)\prod_{j=1}^{m}\frac{f_j!}{(n-m+f_j)!},$$

and, in view of (3.20),

$$\frac{s_\lambda^2(1^m)}{s_\lambda(1^n)}=\prod_{j=0}^{m-1}\frac{(n-j-1)!}{(j!)^2\,(n-m-j-1)!}\times\det\big[B(f_j+m-i+1,\ n-2m+i)\big]_{i,j=1}^{m}. \tag{3.21}$$

By Proposition 4 below,

$$\det\big[B(f_j+m-i+1,\ n-2m+i)\big]_{i,j=1}^{m}=\det\big[B(f_j+m-i+1,\ n-2m+1)\big]_{i,j=1}^{m} \tag{3.22}$$

$$=\det\left[\int_0^1t^{f_j}\,t^{m-i}\,(1-t)^{n-2m}\,dt\right]_{i,j=1}^{m}, \tag{3.23}$$

and (3.19) follows.
Proposition 4. For any $p_1,p_2,\ldots,p_m$ and $q_1,q_2,\ldots,q_m$ such that $\operatorname{Re}p_j>m$ and $\operatorname{Re}q_j>-1$ we have

$$\det\big[B(p_j-i,\ q_j+i)\big]_{i,j=1}^{m}=\det\big[B(p_j-i,\ q_j+1)\big]_{i,j=1}^{m}. \tag{3.24}$$
Proof. We shall use the identity

$$B(p,q-1)+B(p-1,q)=B(p-1,q-1) \tag{3.25}$$

and the operation of addition of columns to transform the determinant on the left in (3.24) to the one on the right. It is convenient to write determinants by showing their columns. With this convention,

$$|B(p-1,q+1),\ B(p-2,q+2),\ \ldots,\ B(p-m+1,q+m-1),\ B(p-m,q+m)|$$

represents the determinant on the left-hand side in (3.24). Let us label its columns by numbers $1,\ldots,m$ from left to right (so that the leftmost column is column 1). Note a particular property of columns in this determinant. As we move from column $j$ to column $j+1$ the first argument of the Beta function decreases by 1, the second argument increases by 1. To be able to refer to this property, we say that columns $1,2,\ldots,m$ are balanced. Observing that column 1 has the desired form already, let us perform the following operation on columns $2,3,\ldots,m$. Starting at column $m$ and working backwards, let us add to each column the one that precedes it. In view of (3.25) and the above mentioned property of columns, this operation yields

$$|B(p-1,q+1),\ B(p-2,q+1),\ B(p-3,q+2),\ \ldots,\ B(p-m,q+m-1)|.$$

Observing that columns 1 and 2 have the desired form now, and that columns $2,\ldots,m$ remain balanced, we apply our operation again, now on columns $3,\ldots,m$. This yields the determinant

$$|B(p-1,q+1),\ B(p-2,q+1),\ B(p-3,q+1),\ \ldots,\ B(p-m,q+m-2)|,$$

where columns 1, 2, 3 have the desired form and columns $3,\ldots,m$ are balanced. It is clear that repeated application of our operation will yield the determinant

$$|B(p-1,q+1),\ B(p-2,q+1),\ B(p-3,q+1),\ \ldots,\ B(p-m,q+1)|$$

after the final step. This is exactly the determinant on the right in (3.24).
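Proposition 4 lends itself to an exact spot check with integer arguments, for which $B(p,q)=(p-1)!\,(q-1)!/(p+q-1)!$ is rational. The sketch below is an illustration only: the parameter values are arbitrary choices satisfying $p_j>m$ and $q_j\ge 0$, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def det(a):
    """Exact determinant via the Leibniz formula (adequate for small matrices)."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def beta(p, q):
    """Euler Beta function B(p, q) for positive integers p and q."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

def balanced_determinant(p, q):
    """Left-hand side of (3.24): entries B(p_j - i, q_j + i), i = 1..m."""
    m = len(p)
    return det([[beta(p[j] - i, q[j] + i) for j in range(m)] for i in range(1, m + 1)])

def reduced_determinant(p, q):
    """Right-hand side of (3.24): entries B(p_j - i, q_j + 1), i = 1..m."""
    m = len(p)
    return det([[beta(p[j] - i, q[j] + 1) for j in range(m)] for i in range(1, m + 1)])
```

The matrices here are indexed with $i$ along rows, whereas the proof displays the index $i$ along columns; the determinant is unaffected by transposition.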
Applications to matrix integrals. In view of the integration formulas (2.7) and (2.8), identities (3.18) and (3.19) can be rewritten as:

$$\int_{ZZ^\dagger\ge 0}s_\lambda(ZZ^\dagger)\,d\hat\mu_n(Z)=\frac{s_\lambda^2(I_m)}{s_{\lambda'}(I_n)},\qquad\int_{ZZ^\dagger\le I_m}s_\lambda(ZZ^\dagger)\,d\hat\nu_n(Z)=\frac{s_\lambda^2(I_m)}{s_\lambda(I_n)}, \tag{3.26}$$

where $Z$ are complex $m\times m$ matrices, and $I_m$ and $I_n$ are identity matrices of sizes $m\times m$ and $n\times n$, respectively. The first identity holds for any non-negative integer $n$ and any partition $\lambda$ such that $l(\lambda')\le n$. The second one holds for any integer $n\ge 2m$ and any $\lambda$. Since the above identities become trivial (both sides vanish) for partitions of length $>m$ we drop the restriction $l(\lambda)\le m$.

These two identities lead to several useful matrix integrals. Let $M$ be an $m\times m$ matrix. Then, for any non-negative integer $n$,

$$\int_{ZZ^\dagger\ge 0}s_\lambda(MZZ^\dagger)\,d\hat\mu_n(Z)=\frac{s_\lambda(M)\,s_\lambda(I_m)}{s_{\lambda'}(I_n)}, \tag{3.27}$$

provided $l(\lambda')\le n$, and if $n\ge 2m$ then

$$\int_{ZZ^\dagger\le I_m}s_\lambda(MZZ^\dagger)\,d\hat\nu_n(Z)=\frac{s_\lambda(M)\,s_\lambda(I_m)}{s_\lambda(I_n)} \tag{3.28}$$

for any $\lambda$. These two integrals follow from (3.26) and (3.8) and the unitary invariance of $d\hat\mu_n(Z)$ and $d\hat\nu_n(Z)$.

If $L$ and $M$ are two $m\times m$ matrices then for any non-negative integer $n$,

$$\int_{ZZ^\dagger\ge 0}s_\lambda(LZ)\,\overline{s_\mu(MZ)}\,d\hat\mu_n(Z)=\delta_{\lambda,\mu}\,\frac{s_\lambda(LM^\dagger)}{s_{\lambda'}(I_n)}, \tag{3.29}$$

provided $l(\lambda')\le n$ and $l(\mu')\le n$, and if $n\ge 2m$ then

$$\int_{ZZ^\dagger\le I_m}s_\lambda(LZ)\,\overline{s_\mu(MZ)}\,d\hat\nu_n(Z)=\delta_{\lambda,\mu}\,\frac{s_\lambda(LM^\dagger)}{s_\lambda(I_n)}. \tag{3.30}$$

These orthogonality relations follow from (3.7) and (3.27)-(3.28), and, in turn, lead to Berezin-Hua integrals [30, 7]

$$\int_{ZZ^\dagger\ge 0}\det{}^n(I_m+LZ)\,\det{}^n(I_m+MZ)^\dagger\,d\hat\mu_n(Z)=\det{}^n(I_m+LM^\dagger),$$

$$\int_{ZZ^\dagger\le I_m}\frac{d\hat\nu_n(Z)}{\det{}^n(I_m-LZ)\,\det{}^n(I_m-MZ)^\dagger}=\frac{1}{\det{}^n(I_m-LM^\dagger)},\qquad n\ge 2m.$$

One only has to recall the Cauchy identities (3.13) and (3.14).

If $P$ and $Q$ are two $n\times m$ matrices and $n\ge 2m$ then it follows from (3.7) and (3.30) that

$$\int_{U(n)}s_\lambda(PQ^\dagger U)\,\overline{s_\mu(PQ^\dagger U)}\,dU=\int_{ZZ^\dagger\le I_m}s_\lambda(P^\dagger PZ)\,\overline{s_\mu(Q^\dagger QZ)}\,d\hat\nu_n(Z) \tag{3.31}$$

for any $\lambda$ and $\mu$. Identity (3.31) implies that

$$\int_{U(n)}e^{\operatorname{tr}(PQ^\dagger U+U^\dagger QP^\dagger)}\,dU=\int_{ZZ^\dagger\le I_m}e^{\operatorname{tr}(P^\dagger PZ+Z^\dagger Q^\dagger Q)}\,d\hat\nu_n(Z). \tag{3.32}$$

The duality relation (3.32) is a particular case of Zirnbauer's color-flavor transformation [49]. It can be easily obtained from (3.31) by making use of the expansion

$$e^{\operatorname{tr}A}=\sum_\lambda c_\lambda\,s_\lambda(A). \tag{3.33}$$

In fact, (3.32) extends to any series $g(A)=\sum_\lambda c_\lambda s_\lambda(A)$,

$$\int_{U(n)}|g(PQ^\dagger U)|^2\,dU=\int_{ZZ^\dagger\le I_m}g(P^\dagger PZ)\,\overline{g(Q^\dagger QZ)}\,d\hat\nu_n(Z).$$

It follows from (3.33) and (3.7) that the integral over the unitary group on the left-hand side in (3.32) is a function of $Q^\dagger QP^\dagger P$. This function can be evaluated explicitly in terms of the eigenvalues of $Q^\dagger QP^\dagger P$. We would like to demonstrate this in a slightly more general setting.
For square matrices $A$ and $B$ of size $n\times n$ define

$$F_n(AB^\dagger)=\int_{U(n)}e^{\operatorname{tr}(AU+U^\dagger B^\dagger)}\,dU. \tag{3.34}$$

If the eigenvalues $z_1^2,\ldots,z_n^2$ of the matrix $AB^\dagger$ are all distinct then [42]

$$F_n(AB^\dagger)=\frac{\mathrm{Const.}}{\Delta(z_1^2,\ldots,z_n^2)}\,\det\big[z_i^{\,j-1}I_{j-1}(2z_i)\big]_{i,j=1}^{n},$$

where $I_k$ is the modified Bessel function,

$$I_k(z)=\sum_{j=0}^{\infty}\frac{(z/2)^{2j+k}}{j!\,(j+k)!}.$$

For our purposes, we want to know $F_n(AB^\dagger)$ for matrices $AB^\dagger$ of low rank, e.g. when $AB^\dagger$ is rank one.

Lemma 5. Suppose that $AB^\dagger$ has $m$ distinct non-zero eigenvalues $z_1^2,\ldots,z_m^2$ and $2m\le n$. Then

$$F_n(AB^\dagger)=\prod_{j=1}^{m}\frac{(n-j)!}{(n-m-j)!}\int_0^1\!\!\ldots\int_0^1\frac{\det\big[g(t_iz_j^2)\big]_{i,j=1}^{m}}{\Delta(z_1^2,\ldots,z_m^2)}\prod_{i=1}^{m}t_i^{\,m-i}(1-t_i)^{n-2m}\,dt_i,$$

where $g(x)=I_0\big(2\sqrt{x}\big)$. In particular, if $AB^\dagger$ is rank one and $z^2$ is its non-zero eigenvalue then

$$F_n(AB^\dagger)=(n-1)\int_0^1I_0\big(2\sqrt{tz^2}\big)(1-t)^{n-2}\,dt. \tag{3.35}$$
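The rank-one formula (3.35) can be checked numerically for real $z^2>0$. Expanding $g(tz^2)=\sum_k(tz^2)^k/k!^2$ and integrating term by term with (3.15) gives the series $\sum_r z^{2r}(n-1)!/\big(r!\,(n+r-1)!\big)$; this termwise form is a derivation supplied for this sketch, not a formula quoted from the paper. The code compares it with a composite Simpson quadrature of (3.35); the function names are hypothetical.

```python
import math

def g(x, terms=80):
    """g(x) = I_0(2 sqrt(x)) = sum_k x^k / (k!)^2, summed directly from the Bessel series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / ((k + 1) ** 2)
    return total

def integral_form(z2, n, steps=2000):
    """Right-hand side of (3.35) by composite Simpson quadrature (steps must be even)."""
    h = 1.0 / steps
    def f(t):
        return g(t * z2) * (1.0 - t) ** (n - 2)
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return (n - 1) * total * h / 3.0

def series_form(z2, n, terms=80):
    """Termwise integration of (3.35): sum_r z2^r (n-1)! / (r! (n+r-1)!)."""
    total, term = 0.0, 1.0  # the r = 0 term equals 1
    for r in range(terms):
        total += term
        term *= z2 / ((r + 1) * (n + r))  # ratio of consecutive terms
    return total
```

For $n=2$ the weight $(1-t)^{n-2}$ is identically 1, so the integral reduces to $\int_0^1 g(tz^2)\,dt$, which is a convenient special case to test first.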
Proof. It follows from (3.33) and (3.7) that

$$F_n(AB^\dagger)=\sum_\lambda\frac{c_\lambda^2}{s_\lambda(1^n)}\,s_\lambda(AB^\dagger).$$

The coefficients $c_\lambda$ are given by

$$c_\lambda=\det\left[\frac{1}{(\lambda_j-j+i)!}\right]_{i,j=1}^{m}=s_\lambda(1^m)\prod_{j=1}^{m}\frac{(m-j)!}{(m+\lambda_j-j)!},$$

see, e.g., [4] and references therein, and

$$F_n(AB^\dagger)=\prod_{j=0}^{m-1}j!^2\ \sum_\lambda\frac{s_\lambda^2(1^m)}{s_\lambda(1^n)}\,\frac{s_\lambda(AB^\dagger)}{f_1!^2\cdot\ldots\cdot f_m!^2},$$

where as before $f_j=m+\lambda_j-j$. Note that the summation is over all partitions $\lambda$ of length $\le m$, or, equivalently, over all $f_1>f_2>\ldots>f_m\ge 0$. It follows now from (3.21)-(3.23) that

$$F_n(AB^\dagger)=\prod_{j=1}^{m}\frac{(n-j)!}{(n-m-j)!}\int_0^1\!\!\ldots\int_0^1\frac{g(t_1,\ldots,t_m)}{\Delta(z_1^2,\ldots,z_m^2)}\prod_{i=1}^{m}t_i^{\,m-i}(1-t_i)^{n-2m}\,dt_i,$$

where

$$g(t_1,\ldots,t_m)=\sum_{f_1>f_2>\ldots>f_m\ge 0}\frac{\det\big[t_i^{f_j}\big]_{i,j=1}^{m}\,\det\big[z_i^{2f_j}\big]_{i,j=1}^{m}}{f_1!^2\ldots f_m!^2}.$$

To complete the proof, recall the following generalization of the Cauchy-Binet formula, see e.g. [30] p. 22. If $g(x)=\sum_{f\ge 0}\gamma_fx^f$ is an analytic function in the complex $x$-plane then

$$\det\big[g(t_ix_j)\big]_{i,j=1}^{m}=\sum_{f_1>f_2>\ldots>f_m\ge 0}\gamma_{f_1}\ldots\gamma_{f_m}\,\det\big[t_i^{f_j}\big]_{i,j=1}^{m}\,\det\big[x_i^{f_j}\big]_{i,j=1}^{m}.$$

By making use of this formula,

$$g(t_1,\ldots,t_m)=\det\left[I_0\Big(2\sqrt{t_iz_j^2}\Big)\right]_{i,j=1}^{m},$$

and the lemma follows.
4. Proof of Theorem 1

After all the preparatory work of the previous section, Theorem 1 becomes almost evident. We first prove (2.9)-(2.10) for $C=D=I$. With Lemma 3 in hand, this becomes a routine calculation. Expanding powers of determinants in the Schur functions as in (3.13)-(3.14) and integrating over the unitary group with the help of (3.7), one gets

$$\int_{U(n)}\det{}^m\big[(I+AU)(I+BU)^\dagger\big]\,dU=\sum_\lambda\frac{s_\lambda^2(1^m)}{s_{\lambda'}(1^n)}\,s_{\lambda'}(AB^\dagger) \tag{4.1}$$

and

$$\int_{U(n)}\frac{dU}{\det{}^m\big[(I-AU)(I-BU)^\dagger\big]}=\sum_\lambda\frac{s_\lambda^2(1^m)}{s_\lambda(1^n)}\,s_\lambda(AB^\dagger). \tag{4.2}$$

The sum in (4.1) is finite and the sum in (4.2) is absolutely convergent for any $AA^\dagger<I$ and $BB^\dagger<I$. Now, by making use of (3.18) and (3.19), and then of the Cauchy identities (3.13)-(3.14) again to resum the series, one arrives at

$$\int_{U(n)}\det{}^m\big[(I+AU)(I+BU)^\dagger\big]\,dU=\int_0^\infty\!\!\ldots\int_0^\infty\prod_{j=1}^{m}\det(I+t_jAB^\dagger)\,d\mu_n(t_1,\ldots,t_m) \tag{4.3}$$

and

$$\int_{U(n)}\frac{dU}{\det{}^m\big[(I-AU)(I-BU)^\dagger\big]}=\int_0^1\!\!\ldots\int_0^1\frac{d\nu_n(t_1,\ldots,t_m)}{\prod_{j=1}^{m}\det(I-t_jAB^\dagger)}. \tag{4.4}$$

Extending (4.3) and (4.4) to the generality of (2.9) and (2.10) is straightforward. If $C$ and $D$ are not degenerate, then

$$\int_{U(n)}\det{}^m\big[(AU+C)(BU+D)^\dagger\big]\,dU=\det{}^m(CD^\dagger)\int_{U(n)}\det{}^m\big[(I+C^{-1}AU)(I+D^{-1}BU)^\dagger\big]\,dU$$

and (2.9) follows from (4.3). The assumption that $C$ and $D$ are not degenerate can be removed by the continuity argument.
5. Regularization of the Inverse Determinant

In this section we employ another approach to the problem of evaluation of negative moments of spectral determinants. This approach is to write the determinants as Gaussian integrals and then perform the integration over the unitary group with the help of the color-flavor transformation. Such an approach is not new. It was pioneered by Zirnbauer in the context of unitary random matrix ensembles. The new element here is that we apply it in the general context of complex matrices. We shall write the spectral determinant $\det[(I - AU)(I - AU)^\dagger]$ of $n \times n$ matrices as a $2n \times 2n$ block determinant

\[ \det[(I - AU)(I - AU)^\dagger] = \det[(U^\dagger - A)(U^\dagger - A)^\dagger] = \begin{vmatrix} 0 & i(U^\dagger - A) \\ i(U^\dagger - A)^\dagger & 0 \end{vmatrix} \]

and more generally

\[ \det[\varepsilon^2 I + (I - AU)(I - AU)^\dagger] = \begin{vmatrix} \varepsilon I & i(U^\dagger - A) \\ i(U^\dagger - A)^\dagger & \varepsilon I \end{vmatrix}. \]
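The block-determinant representation of the regularized spectral determinant is easy to test numerically for small matrices. The following is a minimal sketch in pure Python for n = 2; the helper names, the explicit parametrization of a 2 × 2 unitary matrix, and the sample values are assumptions made for illustration only.

```python
from cmath import exp
from math import cos, sin


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (complex entries)."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def dagger(x):
    """Conjugate transpose."""
    return [[complex(x[j][i]).conjugate() for j in range(len(x))] for i in range(len(x[0]))]


def unitary_2x2(phi, alpha, beta):
    """A 2 x 2 unitary matrix built from three angles."""
    return [[cos(phi) * exp(1j * alpha), sin(phi) * exp(1j * beta)],
            [-sin(phi) * exp(-1j * beta), cos(phi) * exp(-1j * alpha)]]


def regularized_determinant(a, u, eps):
    """det[eps^2 I + (I - A U)(I - A U)^dagger]."""
    n = len(a)
    au = matmul(a, u)
    m = [[(1 if i == j else 0) - au[i][j] for j in range(n)] for i in range(n)]
    mm = matmul(m, dagger(m))
    return det([[eps ** 2 * (1 if i == j else 0) + mm[i][j] for j in range(n)]
                for i in range(n)])


def block_determinant(a, u, eps):
    """det of the 2n x 2n block matrix [[eps I, i(U^dagger - A)], [i(U^dagger - A)^dagger, eps I]]."""
    n = len(a)
    ud = dagger(u)
    b = [[ud[i][j] - a[i][j] for j in range(n)] for i in range(n)]
    bd = dagger(b)
    top = [[eps * (1 if i == j else 0) for j in range(n)] + [1j * b[i][j] for j in range(n)]
           for i in range(n)]
    bottom = [[1j * bd[i][j] for j in range(n)] + [eps * (1 if i == j else 0) for j in range(n)]
              for i in range(n)]
    return det(top + bottom)


if __name__ == "__main__":
    A = [[0.3 + 0.1j, -0.2j], [0.5, 0.1 - 0.4j]]
    U = unitary_2x2(0.7, 0.3, -1.1)
    print(regularized_determinant(A, U, 0.2), block_determinant(A, U, 0.2))
```

Both numbers printed should coincide up to rounding and have negligible imaginary part, since the left-hand side is the determinant of a positive definite Hermitian matrix.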
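Proposition 6. Suppose that $\operatorname{Re} \lambda_j > 0$, $j = 1, 2$. Then for any complex $n \times n$ matrix $\Omega$ we have

\[ \begin{vmatrix} \lambda_1 I & i\Omega \\ i\Omega^\dagger & \lambda_2 I \end{vmatrix}^{-1} = \frac{1}{\pi^{n}} \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, \exp\left\{ -\left[ \lambda_1 \mathbf{v}^\dagger \mathbf{v} + \lambda_2 \mathbf{w}^\dagger \mathbf{w} + i(\mathbf{w}^\dagger \Omega^\dagger \mathbf{v} + \mathbf{v}^\dagger \Omega \mathbf{w}) \right] \right\}. \tag{5.1} \]

The integral on the right-hand side converges absolutely.

Remark. In this section, we shall use letters in boldface to represent column vectors in $\mathbb{C}^n$. The symbol $d^2\mathbf{v}$ will denote the volume element of $\mathbf{v}$ in $\mathbb{C}^n$,

\[ d^2\mathbf{v} = \prod_{j=1}^{n} d^2 v_j = \prod_{j=1}^{n} d\operatorname{Re} v_j\, d\operatorname{Im} v_j. \]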
Absolute Moments of Characteristic Polynomials of Complex Random Matrices
579
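Proof. Note that

\[ \lambda_1 \mathbf{v}^\dagger \mathbf{v} + \lambda_2 \mathbf{w}^\dagger \mathbf{w} + i(\mathbf{w}^\dagger \Omega^\dagger \mathbf{v} + \mathbf{v}^\dagger \Omega \mathbf{w}) = (\mathbf{v}^\dagger, \mathbf{w}^\dagger) \begin{pmatrix} \lambda_1 I & i\Omega \\ i\Omega^\dagger & \lambda_2 I \end{pmatrix} \begin{pmatrix} \mathbf{v} \\ \mathbf{w} \end{pmatrix}. \]

In view of the singular value decomposition $\Omega = U^\dagger \omega V$,

\[ \begin{pmatrix} \lambda_1 I & i\Omega \\ i\Omega^\dagger & \lambda_2 I \end{pmatrix} = \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}^{\dagger} \begin{pmatrix} \lambda_1 I & i\omega \\ i\omega & \lambda_2 I \end{pmatrix} \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}, \]

where $\omega$ is the diagonal matrix of singular values of $\Omega$, $\omega = \operatorname{diag}(\omega_1, \ldots, \omega_n)$, and $U$ and $V$ are unitary matrices. Introducing $\mathbf{f} = U\mathbf{v}$ and $\mathbf{g} = V\mathbf{w}$,

\[ (\mathbf{v}^\dagger, \mathbf{w}^\dagger) \begin{pmatrix} \lambda_1 I & i\Omega \\ i\Omega^\dagger & \lambda_2 I \end{pmatrix} \begin{pmatrix} \mathbf{v} \\ \mathbf{w} \end{pmatrix} = (\mathbf{f}^\dagger, \mathbf{g}^\dagger) \begin{pmatrix} \lambda_1 I & i\omega \\ i\omega & \lambda_2 I \end{pmatrix} \begin{pmatrix} \mathbf{f} \\ \mathbf{g} \end{pmatrix} = \sum_{j=1}^{n} (\bar f_j, \bar g_j) \begin{pmatrix} \lambda_1 & i\omega_j \\ i\omega_j & \lambda_2 \end{pmatrix} \begin{pmatrix} f_j \\ g_j \end{pmatrix}. \]

Since $U$ and $V$ are unitary, $d^2\mathbf{v} = d^2\mathbf{f}$ and $d^2\mathbf{w} = d^2\mathbf{g}$. Changing the variables of integration in (5.1) from $\mathbf{v}$ and $\mathbf{w}$ to $\mathbf{f}$ and $\mathbf{g}$ breaks this $2n$-fold integral into the product of the 2-fold integrals

\[ \frac{1}{\pi} \int_{\mathbb{C}^2} \exp\left[ -\lambda_1 |f_j|^2 - \lambda_2 |g_j|^2 - i\omega_j (f_j \bar g_j + \bar f_j g_j) \right] d^2 f_j\, d^2 g_j = (\lambda_1 \lambda_2 + \omega_j^2)^{-1}. \]

Thus, the integral on the right-hand side in (5.1) equals $\prod_{j=1}^{n} (\lambda_1 \lambda_2 + \omega_j^2)^{-1}$, which is obviously the same as the inverse determinant on the left-hand side.

Proof of Theorem 2. Obviously, without loss of generality we can put $z = 1$. Let

\[ R_\varepsilon(A, A^\dagger) = \int_{U(n)} \frac{dU}{\det[\varepsilon^2 I + (I - AU)(I - AU)^\dagger]}, \qquad \varepsilon > 0. \tag{5.2} \]

The integral on the right-hand side converges for any $n \times n$ matrix $A$. It follows from Proposition 6 that

\[ R_\varepsilon(A, A^\dagger) = \frac{1}{\pi^{n}} \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-[\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) - i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})]}\, f_n(\mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}), \tag{5.3} \]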
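where, cf. (3.34),

\[ f_n(\mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}) = \int_{U(n)} e^{i(\mathbf{w}^\dagger U \mathbf{v} + \mathbf{v}^\dagger U^\dagger \mathbf{w})}\, dU = \int_{U(n)} e^{i \operatorname{tr}(\mathbf{v}\mathbf{w}^\dagger U + U^\dagger \mathbf{w}\mathbf{v}^\dagger)}\, dU. \tag{5.4} \]

By Lemma 5,

\[ f_n(\mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}) = \int_0^1 J_0\!\left( 2\sqrt{t\, \mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}} \right) d\sigma_n(t), \]

where $d\sigma_n(t) = (n - 1)(1 - t)^{n-2}\, dt$ and $J_0$ is the Bessel function

\[ J_0(z) = \sum_{j=0}^{\infty} \frac{(iz/2)^{2j}}{(j!)^2} = I_0(iz). \]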
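As a quick numerical illustration of the objects just introduced, the sketch below evaluates $J_0$ from its power series, compares it with Bessel's classical integral $J_0(z) = \frac{1}{\pi}\int_0^\pi \cos(z \sin\theta)\, d\theta$, and approximates the average against $d\sigma_n(t)$ by a midpoint rule. The function names, the quadrature choice, and the use of a single scalar argument `s` standing for the product of the two squared norms are assumptions made for this example.

```python
from math import cos, sin, sqrt, pi, factorial


def bessel_j0_series(z, terms=40):
    """J0(z) from the series sum_j (-1)^j (z/2)^(2j) / (j!)^2 (adequate for moderate real z)."""
    return sum((-1) ** j * (z / 2) ** (2 * j) / factorial(j) ** 2 for j in range(terms))


def bessel_j0_integral(z, num_points=2000):
    """J0(z) from Bessel's integral (1/pi) int_0^pi cos(z sin(theta)) dtheta, midpoint rule."""
    h = pi / num_points
    return sum(cos(z * sin((k + 0.5) * h)) for k in range(num_points)) * h / pi


def f_n(s, n, num_points=400):
    """Midpoint approximation of int_0^1 J0(2 sqrt(t s)) (n-1)(1-t)^(n-2) dt for n >= 2."""
    h = 1.0 / num_points
    total = 0.0
    for k in range(num_points):
        t = (k + 0.5) * h
        total += bessel_j0_series(2 * sqrt(t * s)) * (n - 1) * (1 - t) ** (n - 2) * h
    return total


if __name__ == "__main__":
    print(bessel_j0_series(3.0), bessel_j0_integral(3.0))
    print(f_n(0.0, 3), f_n(2.5, 3))
```

At `s = 0` the average reduces to the total mass of $d\sigma_n$, which is 1, and for real arguments the values stay within $[-1, 1]$, in line with the bound $|J_0| \le 1$ on the real axis.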
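We have $|f_n(\mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w})| \le 1$ for all $\mathbf{v}$ and $\mathbf{w}$. This is because $|J_0(z)| \le 1$ for all real $z$. Therefore we can interchange the order of integrations on replacing $f_n$ in (5.3) by the integral of (5.4). This yields

\[ R_\varepsilon(A, A^\dagger) = \int_0^1 d\sigma_n(t)\, \frac{1}{\pi^{n}} \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})}\, J_0\!\left( 2\sqrt{t\, \mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}} \right). \tag{5.5} \]

In order to perform the integration in variables $\mathbf{v}$ and $\mathbf{w}$ we shall make use of the integral representation

\[ J_0(2\sqrt{pq}) = \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{dx}{x}\, e^{i(px + q/x)} \tag{5.6} \]

which holds for any $p > 0$ and $q > 0$ and is a particular case of Eq. 3.871.1 in [32]. The integral in (5.6) converges because of the oscillations of the exponential function, however the convergence is not absolute. We have

\[ J_0\!\left( 2\sqrt{t\, \mathbf{v}^\dagger \mathbf{v}\, \mathbf{w}^\dagger \mathbf{w}} \right) = \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)}. \tag{5.7} \]

Note that by Proposition 6,

\[ \frac{1}{\pi^{n}} \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} = \begin{vmatrix} (\varepsilon - i\sqrt{t}\,x) I & -iA \\ -iA^\dagger & (\varepsilon - i\sqrt{t}/x) I \end{vmatrix}^{-1}. \]

Therefore, on replacing $J_0$ in (5.5) by the integral of (5.7) and reversing the order of integrations one arrives at

\[ R_\varepsilon(A, A^\dagger) = \int_0^1 d\sigma_n(t)\, \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{dx}{x}\, \frac{1}{\det[AA^\dagger + (\varepsilon^2 - t) I - i\varepsilon\sqrt{t}\,(x + \frac{1}{x}) I]}, \tag{5.8} \]

which is the identity claimed in Theorem 2. It remains to justify reversing the order of integrations with respect to $x$ and $\mathbf{v}$, $\mathbf{w}$. Firstly, we will show that the integral on the right-hand side in (5.8) is well-defined.

Proposition 7. For any $\varepsilon > 0$ and $n \ge 2$ the integral in (5.8) converges absolutely (and uniformly in $A$).

Proof. Let $a_j^2$ be the eigenvalues of $AA^\dagger$ so that

\[ \frac{1}{x} \times \frac{1}{\det[AA^\dagger + (\varepsilon^2 - t) I - i\varepsilon\sqrt{t}\,(x + \frac{1}{x}) I]} = \frac{1}{x} \prod_{j=1}^{n} w(a_j, t), \]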
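where

\[ w(a, t) = \frac{1}{a^2 + \varepsilon^2 - t - i\varepsilon\sqrt{t}\,(x + \frac{1}{x})}. \]

Since $|z| \ge |\operatorname{Im} z|$, we have

\[ \left| \frac{1}{x}\, w(a_1, t) \right| \le \frac{1}{\varepsilon\sqrt{t}\,(1 + x^2)}. \]

Also, for all $0 \le t \le \varepsilon^2/2$ we have

\[ |w(a_j, t)| \le \frac{1}{|\varepsilon^2 - t + a_j^2|} \le \frac{2}{\varepsilon^2} \]

and for all $\varepsilon^2/2 \le t \le 1$ we have

\[ |w(a_j, t)| \le \frac{1}{|\varepsilon\sqrt{t}\,(x + \frac{1}{x})|} \le \frac{1}{2\varepsilon\sqrt{t}} \le \frac{1}{\sqrt{2}\,\varepsilon^2}. \]

Therefore the absolute value of the integrand in (5.8) is majorated by the function

\[ \frac{1}{\varepsilon\sqrt{t}\,(1 + x^2)} \left( \frac{2}{\varepsilon^2} \right)^{n-1}, \]

which is obviously integrable with respect to $d\sigma_n(t) \times dx$.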
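We can now turn to justification of reversing the order of integrations in $\mathbf{v}$, $\mathbf{w}$ and $x$ in the integral

\[ I = \int_0^1 d\sigma_n(t) \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})} \times \int_{-\infty}^{+\infty} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)}. \]

The corresponding calculation is routine but tedious. First we restrict the $x$-integration to the finite interval $\delta \le |x| \le 1/\delta$, $\delta > 0$, reverse the order of integrations, and then show that the corresponding tail integrals are negligible in the limit $\delta \to 0$. Let

\[ I_\delta = \int_0^1 d\sigma_n(t) \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w} \int_{\delta \le |x| \le 1/\delta} \frac{dx}{x}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})} \times e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)}. \]

The absolute value of the integrand is majorated by the integrable function $\frac{1}{|x|}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w})}$, and therefore we can reverse the order of integrations and then perform the integration in $\mathbf{v}$, $\mathbf{w}$. This yields

\[ I_\delta = \int_0^1 d\sigma_n(t) \int_{\delta \le |x| \le 1/\delta} \frac{dx}{x}\, \frac{1}{\det[AA^\dagger + (\varepsilon^2 - t) I - i\varepsilon\sqrt{t}\,(x + \frac{1}{x}) I]}. \]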
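It follows from this, in view of Proposition 7, that

\[ I_\delta = \int_0^1 d\sigma_n(t) \int_{-\infty}^{+\infty} \frac{dx}{x}\, \frac{1}{\det[AA^\dagger + (\varepsilon^2 - t) I - i\varepsilon\sqrt{t}\,(x + \frac{1}{x}) I]} + o(1) \]

in the limit $\delta \to 0$. It only remains to show that the tail integrals

\[ I'_\delta = \int_0^1 d\sigma_n(t) \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})} \times \int_{|x| \ge 1/\delta} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} \]

and

\[ I''_\delta = \int_0^1 d\sigma_n(t) \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}) + i(\mathbf{w}^\dagger A^\dagger \mathbf{v} + \mathbf{v}^\dagger A \mathbf{w})} \times \int_{|x| \le \delta} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} \]

vanish in the limit $\delta \to 0$. For real $r$, $p$ and $q$ define

\[ g_L(r, p, q) = \int_L^{+\infty} \frac{dx}{x}\, e^{ir(px + q/x)} = \int_0^{1/L} \frac{dx}{x}\, e^{ir(qx + p/x)}. \tag{5.9} \]

By integration by parts,

\[ g_L(r, p, q) = -\frac{1}{iprL}\, e^{ir(pL + q/L)} + \frac{1}{ipr} \int_L^{+\infty} \frac{e^{ir(px + q/x)}}{x^2}\, dx + \frac{q}{p} \int_L^{+\infty} \frac{e^{ir(px + q/x)}}{x^3}\, dx, \]

and, therefore, for $L > 0$ we have

\[ |g_L(r, p, q)| \le \frac{2}{|p||r|L} + \frac{|q|}{2|p|L^2}. \tag{5.10} \]

Obviously,

\[ \int_{|x| \ge 1/\delta} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} = g_{1/\delta}(\sqrt{t}, \mathbf{v}^\dagger \mathbf{v}, \mathbf{w}^\dagger \mathbf{w}) - g_{1/\delta}(-\sqrt{t}, \mathbf{v}^\dagger \mathbf{v}, \mathbf{w}^\dagger \mathbf{w}), \]

and, by (5.10),

\[ \left| \int_{|x| \ge 1/\delta} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} \right| \le \frac{4\delta}{\mathbf{v}^\dagger \mathbf{v}\, \sqrt{t}} + \frac{\mathbf{w}^\dagger \mathbf{w}\, \delta^2}{\mathbf{v}^\dagger \mathbf{v}}. \]

Therefore

\[ |I'_\delta| \le \int_0^1 d\sigma_n(t) \int_{\mathbb{C}^n} d^2\mathbf{v} \int_{\mathbb{C}^n} d^2\mathbf{w}\, e^{-\varepsilon(\mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w})} \left( \frac{4\delta}{\mathbf{v}^\dagger \mathbf{v}\, \sqrt{t}} + \frac{\mathbf{w}^\dagger \mathbf{w}\, \delta^2}{\mathbf{v}^\dagger \mathbf{v}} \right). \]

As the function $\frac{1}{\mathbf{v}^\dagger \mathbf{v}}$ is locally integrable with respect to $d^2\mathbf{v}$ for $n \ge 2$, we conclude that

\[ I'_\delta = O(\delta) \quad \text{when } \delta \to 0. \tag{5.11} \]

Similarly

\[ \int_{|x| \le \delta} \frac{dx}{x}\, e^{i\sqrt{t}\,(x\, \mathbf{v}^\dagger \mathbf{v} + \mathbf{w}^\dagger \mathbf{w}/x)} = g_{1/\delta}(\sqrt{t}, \mathbf{w}^\dagger \mathbf{w}, \mathbf{v}^\dagger \mathbf{v}) - g_{1/\delta}(-\sqrt{t}, \mathbf{w}^\dagger \mathbf{w}, \mathbf{v}^\dagger \mathbf{v}), \]

and repeating the above argument one obtains that

\[ I''_\delta = O(\delta) \quad \text{when } \delta \to 0, \]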
so that both $I'_\delta$ and $I''_\delta$ vanish in the limit $\delta \to 0$. This completes our proof of the first part of Theorem 2.

It is worth mentioning another formula for the regularized average of the inverse spectral determinant,

\[ R_\varepsilon(A, A^\dagger) = (n - 1) \int_{|z|^2 \le 1} \frac{(1 - |z|^2)^{n-2}\, d^2 z}{\det[\varepsilon^2 I + (1 - \bar z) I + (1 - z) AA^\dagger]}, \]

which is almost an immediate corollary of (3.32) and the representation

\[ \frac{1}{\det[\varepsilon^2 I_n + (I_n - AU)(I_n - AU)^\dagger]} = \frac{1}{\pi^n} \int_{\mathbb{C}^n} e^{-[\mathbf{v}^\dagger (1 + \varepsilon^2) \mathbf{v} + \mathbf{v}^\dagger A^\dagger A \mathbf{v}]}\, e^{\operatorname{tr}(\mathbf{v}\mathbf{v}^\dagger AU + U^\dagger A^\dagger \mathbf{v}\mathbf{v}^\dagger)}\, d^2\mathbf{v}. \]

This formula, however, does not seem to be easy to handle in the limit $\varepsilon \to 0$.

We now turn to the integral in (5.8) and evaluate it in the limit $\varepsilon \to 0$ under the assumption that $AA^\dagger$ has no repeated eigenvalues. The limit in (2.15) follows immediately from the asymptotic relation (5.19) which is the end-product of our calculation. The following identities, which can be obtained from the Lagrange interpolation formula, see, e.g., [41] Part VI, Problem 67, will be useful for our purposes.

Proposition 8. Suppose that $x_1, \ldots, x_n$ are pairwise distinct. Then

\[ \prod_{j=1}^{n} \frac{1}{x_j - t} = \sum_{j=1}^{n} \frac{1}{x_j - t} \prod_{k \ne j} \frac{1}{x_k - x_j}, \tag{5.12} \]

and, for non-negative integer $r$,

\[ \sum_{j=1}^{n} x_j^r \prod_{k \ne j} \frac{1}{x_k - x_j} = \begin{cases} 0 & \text{if } r \le n - 2, \\ h_{r-n+1}(x_1, \ldots, x_n) & \text{if } r \ge n - 1, \end{cases} \tag{5.13} \]

where $h_r$, $r = 0, 1, 2, \ldots$, are the complete symmetric functions.

It follows from (5.12) that

\[ \frac{1}{\det[AA^\dagger + (\varepsilon^2 - t) I - i\varepsilon\sqrt{t}\,(x + \frac{1}{x}) I]} = \sum_{j=1}^{n} w(a_j, t, x) \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2}, \tag{5.14} \]
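where $a_1^2, \ldots, a_n^2$ are the eigenvalues of $AA^\dagger$ and

\[ w(a, t, x) = \frac{1}{a^2 + \varepsilon^2 - t - i\varepsilon\sqrt{t}\,(x + \frac{1}{x})}. \]

By the calculus of residues,

\[ \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{dx}{x}\, w(a, t, x) = \frac{1}{\sqrt{(a^2 - t - \varepsilon^2)^2 + 4\varepsilon^2 a^2}}, \tag{5.15} \]

and putting (5.14) and (5.15) into (5.8) we arrive at the following expression of $R_\varepsilon(A, A^\dagger)$ in terms of the eigenvalues of $AA^\dagger$:

\[ R_\varepsilon(A, A^\dagger) = \sum_{j=1}^{n} F_\varepsilon(a_j) \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2}, \tag{5.16} \]

where

\[ F_\varepsilon(a) = \int_0^1 \frac{d\sigma_n(t)}{\sqrt{(a^2 - t - \varepsilon^2)^2 + 4\varepsilon^2 a^2}}, \qquad d\sigma_n(t) = (n - 1)(1 - t)^{n-2}\, dt. \tag{5.17} \]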
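The two ingredients of (5.16) lend themselves to a direct check. The sketch below verifies the partial-fraction identity (5.12) and the vanishing case $r \le n - 2$ of (5.13) in exact rational arithmetic, and evaluates the $x$-integral in (5.15) numerically after the substitution $x = \tan\varphi$, which turns it into an integral of a smooth periodic function. The substitution, the midpoint quadrature and all names are choices made for this illustration, not taken from the paper.

```python
from fractions import Fraction
from math import pi, sin, sqrt


def partial_fraction_weights(xs):
    """Weights prod_{k != j} 1/(x_k - x_j), j = 1..n, for pairwise distinct x_j."""
    weights = []
    for j, xj in enumerate(xs):
        w = Fraction(1)
        for k, xk in enumerate(xs):
            if k != j:
                w /= (xk - xj)
        weights.append(w)
    return weights


def product_side(xs, t):
    """Left-hand side of (5.12): prod_j 1/(x_j - t)."""
    result = Fraction(1)
    for xj in xs:
        result /= (xj - t)
    return result


def sum_side(xs, t):
    """Right-hand side of (5.12)."""
    return sum(w / (xj - t) for xj, w in zip(xs, partial_fraction_weights(xs)))


def power_sum(xs, r):
    """Left-hand side of (5.13): sum_j x_j^r prod_{k != j} 1/(x_k - x_j)."""
    return sum(xj ** r * w for xj, w in zip(xs, partial_fraction_weights(xs)))


def residue_integral(a, t, eps, num_points=4000):
    """(1/(2 pi i)) int dx/x w(a, t, x), computed with x = tan(phi) and the midpoint rule."""
    c = a * a + eps * eps - t
    s = eps * sqrt(t)
    h = pi / num_points
    total = 0j
    for k in range(num_points):
        phi = -pi / 2 + (k + 0.5) * h
        total += h / (0.5 * c * sin(2 * phi) - 1j * s)
    return total / (2j * pi)


def residue_closed_form(a, t, eps):
    """Right-hand side of (5.15)."""
    return 1 / sqrt((a * a - t - eps * eps) ** 2 + 4 * eps * eps * a * a)


if __name__ == "__main__":
    xs = [Fraction(1, 2), Fraction(3), Fraction(-2, 5), Fraction(7, 3)]
    print(product_side(xs, Fraction(1, 7)) == sum_side(xs, Fraction(1, 7)))
    print([power_sum(xs, r) for r in range(len(xs) - 1)])
    print(residue_integral(1.2, 0.5, 0.5), residue_closed_form(1.2, 0.5, 0.5))
```

The vanishing of the power sums for $r \le n - 2$ is exactly what later removes polynomial terms of degree at most $n - 2$ in $a^2$ from sums of the form (5.16).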
This formula is convenient for finding $R_\varepsilon(A, A^\dagger)$ in the limit $\varepsilon \to 0$. If $AA^\dagger > I$, by letting $\varepsilon \to 0$ in (5.16) and recalling (5.12) we immediately obtain

\[ \lim_{\varepsilon \to 0} R_\varepsilon(A, A^\dagger) = \int_0^1 \frac{d\sigma_n(t)}{\det(AA^\dagger - tI)}, \]

thus reproducing the corresponding formula of Theorem 1, part (ii). If $AA^\dagger < I$ or if $AA^\dagger$ has eigenvalues on each side of $a^2 = 1$, evaluation of the right-hand side in (5.16) in the limit $\varepsilon \to 0$ requires some work. The integral in (5.17) is standard. There are different methods available to evaluate it. None seems to give an explicit expression for all parameter values. However, we are only interested in $\varepsilon \to 0$, and in this regime

\[ F_\varepsilon(a) = (n - 1)(1 - a^2)^{n-2} \left[ \gamma_{n-2}\, \operatorname{sgn}(a^2 - 1) + L_0(\varepsilon, a) \right] + q_{n-2}(a^2) + O(\varepsilon), \tag{5.18} \]

where

\[ L_0(\varepsilon, a) = \begin{cases} \ln \dfrac{1 - a^2}{\varepsilon^2} & \text{if } a^2 < 1, \\[2mm] \ln \dfrac{a^2}{a^2 - 1} & \text{if } a^2 > 1, \\[2mm] \ln \dfrac{1}{\varepsilon} & \text{if } a^2 = 1; \end{cases} \]

$\gamma_{n-2}$ is the partial sum of the harmonic series,

\[ \gamma_{n-2} = \sum_{j=1}^{n-2} \frac{1}{j}, \]

sgn is the sign function, $\operatorname{sgn}(x)$ takes value 1 if $x > 0$, $-1$ if $x < 0$ and 0 if $x = 0$, and $q_{n-2}(a^2)$ is a polynomial of degree $n - 2$ in $a^2$ with coefficients which do not depend on $\varepsilon$. Details of the derivation of (5.18) are given in Appendix A.
Let us now put (5.18) into (5.16). In view of (5.13) the polynomial $q_{n-2}$ gives no contribution. After rearranging the remaining terms we obtain

\[ R_\varepsilon(A, A^\dagger) = \alpha(AA^\dagger) \ln \frac{1}{\varepsilon^2} + \beta(AA^\dagger) + O(\varepsilon), \tag{5.19} \]

with the coefficients $\alpha$ and $\beta$ given by

\[ \alpha(AA^\dagger) = (n - 1) \sum_{j=1}^{n} (1 - a_j^2)^{n-2}\, \theta(1 - a_j^2) \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2}, \]

\[ \beta(AA^\dagger) = (n - 1) \sum_{j=1}^{n} (1 - a_j^2)^{n-2}\, \psi(a_j^2) \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2}, \]

where $\theta$ is Heaviside's step function,

\[ \theta(x) = \begin{cases} 1 & \text{if } x > 0, \\ \tfrac{1}{2} & \text{if } x = 0, \\ 0 & \text{if } x < 0, \end{cases} \]

and

\[ \psi(a^2) = \begin{cases} \gamma_{n-2} + \ln a^2 - \ln(a^2 - 1) & \text{if } a^2 > 1, \\ -\gamma_{n-2} + \ln(1 - a^2) & \text{if } a^2 < 1, \\ \ln 2 & \text{if } a^2 = 1. \end{cases} \]

As one would expect, the coefficient $\alpha$ vanishes if $AA^\dagger < I$ or $AA^\dagger > I$. This follows from identity (5.13). If $AA^\dagger > I$ then the constant $\gamma_{n-2}$ gives no contribution, again by (5.13), and

\[ \lim_{\varepsilon \to 0} R_\varepsilon(A, A^\dagger) = (n - 1) \sum_{j=1}^{n} (1 - a_j^2)^{n-2} \ln \frac{a_j^2}{a_j^2 - 1} \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2} = \int_0^1 \frac{d\sigma_n(t)}{\det(AA^\dagger - tI)}. \]

Similarly, if $AA^\dagger < I$ then

\[ \lim_{\varepsilon \to 0} R_\varepsilon(A, A^\dagger) = (n - 1) \sum_{j=1}^{n} (1 - a_j^2)^{n-2} \ln(1 - a_j^2) \prod_{k \ne j} \frac{1}{a_k^2 - a_j^2} = \int_0^1 \frac{d\sigma_n(t)}{\det(I - tAA^\dagger)}. \]

Thus, (5.8) indeed reproduces formulas of Theorem 1 part (ii), and our proof of Theorem 1 is now complete.
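6. Rank-One Deviations from CUE and GUE

In this section we express the mean eigenvalue density for the random matrix ensembles (1.4) and (1.6) in terms of the spectral determinants. Our calculation is inspired by similar calculations in [17, 18] and makes use of a process known as eigenvalue deflation which was introduced in the context of random matrices in [45]. We need to recall a few facts about elementary unitary Hermitian matrices [48]. Let $\mathbf{v}$ be a column-vector in $\mathbb{C}^n$. The matrix

\[ R_{\mathbf{v}} = I_n - 2\mathbf{v}\mathbf{v}^\dagger / |\mathbf{v}|^2, \qquad |\mathbf{v}|^2 = \mathbf{v}^\dagger \mathbf{v}, \]

where $I_n$ is the identity matrix, defines a linear transformation which is a reflection across the hyperplane through the origin with normal $\mathbf{v}/|\mathbf{v}|$. It is straightforward to verify that $R_{\mathbf{v}}$ is unitary and Hermitian,

\[ R_{\mathbf{v}} = R_{\mathbf{v}}^\dagger \quad \text{and} \quad R_{\mathbf{v}} R_{\mathbf{v}}^\dagger = R_{\mathbf{v}}^2 = I_n. \]

In the context of numerical linear algebra the matrices $R_{\mathbf{v}}$ are known as Householder reflections. Any matrix can be brought to triangular form by a succession of Householder reflections. We only need the first step of this process which we now describe. Let $W_n$ be an $n \times n$ matrix and $z$ and $\mathbf{x} = (x_1, \ldots, x_n)^T$ be an eigenvalue and eigenvector of $W_n$, so that $W_n \mathbf{x} = z\mathbf{x}$. Without loss of generality we may assume that $x_1 \ge 0$ and $|\mathbf{x}|^2 = \mathbf{x}^\dagger \mathbf{x} = 1$. Let $\mathbf{e}_1 = (1, 0, \ldots, 0)^T$ and

\[ \mathbf{v} = \frac{\mathbf{x} + \mathbf{e}_1}{|\mathbf{x} + \mathbf{e}_1|} = \frac{\mathbf{x} + \mathbf{e}_1}{\sqrt{2(1 + x_1)}}. \tag{6.1} \]

Since the vector $\mathbf{v}$ bisects the angle of $\mathbf{x}$ and $\mathbf{e}_1$, we have $R_{\mathbf{v}} \mathbf{x} = -\mathbf{e}_1$ and $R_{\mathbf{v}} \mathbf{e}_1 = -\mathbf{x}$. Therefore $R_{\mathbf{v}} W_n R_{\mathbf{v}} \mathbf{e}_1 = z\mathbf{e}_1$ and (recall that $R_{\mathbf{v}}^2 = I_n$)

\[ W_n = R_{\mathbf{v}} \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} R_{\mathbf{v}}, \tag{6.2} \]

for some $W_{n-1}$ and $\mathbf{w}$. Note that $W_{n-1}$ is $(n - 1) \times (n - 1)$ and $\mathbf{w}^\dagger$ is $1 \times (n - 1)$. Obviously, applying this procedure again (to the matrix $W_{n-1}$) and again, one can reduce $W_n$ to triangular form by means of unitary transformations. Such a factorization is known as the Schur decomposition.

It is convenient to write $\mathbf{v} = (v_1, \mathbf{q})^T$, where $\mathbf{q} = (v_2, \ldots, v_n)^T$. Since $\mathbf{v}$ is a unit vector, $v_1^2 + |\mathbf{q}|^2 = 1$. Note that the first equation in (6.1) reads $v_1 = \sqrt{(1 + x_1)/2}$. Since $0 \le x_1 \le 1$, we must have $1/2 \le v_1 \le 1$. Therefore,

\[ v_1 = \sqrt{1 - |\mathbf{q}|^2} \quad \text{and} \quad \frac{1}{2} \le |\mathbf{q}|^2 \le 1. \tag{6.3} \]

In terms of $\mathbf{q}$ the matrix $R_{\mathbf{v}}$ is given by

\[ R_{\mathbf{v}} = \begin{pmatrix} 1 - 2v_1^2 & -2v_1 \mathbf{q}^\dagger \\ -2v_1 \mathbf{q} & I_{n-1} - 2\mathbf{q}\mathbf{q}^\dagger \end{pmatrix} = \begin{pmatrix} 2|\mathbf{q}|^2 - 1 & -2\sqrt{1 - |\mathbf{q}|^2}\, \mathbf{q}^\dagger \\ -2\sqrt{1 - |\mathbf{q}|^2}\, \mathbf{q} & I_{n-1} - 2\mathbf{q}\mathbf{q}^\dagger \end{pmatrix}. \tag{6.4} \]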
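The basic properties of the Householder reflection used in the first deflation step can be confirmed with a few lines of code. The sketch below, in pure Python with nested lists, builds $R_{\mathbf{v}}$ for $\mathbf{v} \propto \mathbf{x} + \mathbf{e}_1$ and checks that it is Hermitian, squares to the identity, and maps $\mathbf{x}$ to $-\mathbf{e}_1$; the helper names and the sample vector are assumptions for illustration.

```python
def householder(v):
    """R_v = I - 2 v v^dagger / |v|^2 for a list of complex numbers v."""
    norm2 = sum(abs(c) ** 2 for c in v)
    n = len(v)
    return [[(1 if i == j else 0) - 2 * v[i] * complex(v[j]).conjugate() / norm2
             for j in range(n)] for i in range(n)]


def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def matmul(x, y):
    """Matrix-matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def reflection_for_eigenvector(x):
    """Reflection built from v = x + e1, for a unit vector x whose first entry is real and >= 0."""
    v = [complex(c) for c in x]
    v[0] += 1
    return householder(v)


if __name__ == "__main__":
    x = [0.6 + 0j, 0.48j, 0.64 + 0j]          # unit vector: 0.36 + 0.2304 + 0.4096 = 1
    R = reflection_for_eigenvector(x)
    print(apply(R, x))                         # close to (-1, 0, 0)
    print(apply(R, [1, 0, 0]))                 # close to -x
```

Normalizing $\mathbf{v}$ is unnecessary here because $R_{\mathbf{v}}$ depends only on the direction of $\mathbf{v}$.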
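The incomplete Schur decomposition (6.2) gives rise to a new coordinate system in the space of complex matrices, the new (complex) coordinates being $z$, $\mathbf{w}$, $\mathbf{q}$ and the matrix entries of $W_{n-1}$. There are no restrictions on the range of variation of $z$, $\mathbf{w}$ and $W_{n-1}$, and, in view of (6.3), the vector $\mathbf{q}$ is restricted to the spherical segment $\frac{1}{2} \le |\mathbf{q}|^2 \le 1$. The Jacobian of the transformation from $(W_{n,jk})$ to this new system of coordinates,$^3$

\[ \prod_{j,k=1}^{n} \frac{d(W_n)_{jk}\, d\overline{(W_n)_{jk}}}{2} = J(z, \mathbf{w}, \mathbf{q}, W_{n-1})\, \frac{dz\, d\bar z}{2} \prod_{j=1}^{n-1} \frac{dw_j\, d\bar w_j}{2} \prod_{j=1}^{n-1} \frac{dq_j\, d\bar q_j}{2} \prod_{j,k=1}^{n-1} \frac{d(W_{n-1})_{jk}\, d\overline{(W_{n-1})_{jk}}}{2}, \]

is given by (cf. Lemma 3.2 in [18])

\[ J(z, \mathbf{w}, \mathbf{q}, W_{n-1}) = 2^{2n-2}\, |\det(zI_{n-1} - W_{n-1})|^2\, (1 - |\mathbf{q}|^2)^{n-2}\, (2|\mathbf{q}|^2 - 1). \tag{6.5} \]

$^3$ For Jacobian computations it is convenient to consider $z$ and $\bar z$ as functionally independent variables, so that $d^2 z \equiv d\operatorname{Re} z\, d\operatorname{Im} z = dz\, d\bar z / 2$.

We derive (6.5) in Appendix B. Suppose that we have a probability distribution

\[ dP(W_n) = p(W_n) \prod_{i,j=1}^{n} \frac{d(W_n)_{ij}\, d\overline{(W_n)_{ij}}}{2} \tag{6.6} \]

on the space of complex $n \times n$ matrices. Then, following the argument of [18], see their Lemma 3.1, the mean eigenvalue density $\rho_n(x, y)$, $z = x + iy$, of $W_n$ is given by

\[ \rho_n(x, y) = \int_{\mathbb{C}^{2(n-1)^2}} d^2 W_{n-1} \int_{\mathbb{C}^{n-1}} d^2\mathbf{w} \int_{\frac{1}{2} \le |\mathbf{q}|^2 \le 1} d^2\mathbf{q}\; J(z, \mathbf{w}, \mathbf{q}, W_{n-1})\; p\!\left( R_{\mathbf{v}} \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} R_{\mathbf{v}} \right), \tag{6.7} \]

where

\[ d^2 W_{n-1} = \prod_{j,k=1}^{n-1} \frac{d(W_{n-1})_{jk}\, d\overline{(W_{n-1})_{jk}}}{2}, \qquad d^2\mathbf{q} = \prod_{j=1}^{n-1} \frac{dq_j\, d\bar q_j}{2}, \qquad d^2\mathbf{w} = \prod_{j=1}^{n-1} \frac{dw_j\, d\bar w_j}{2}. \]

Since we integrate in the $\mathbf{q}$-space over the spherical segment $1/2 \le |\mathbf{q}|^2 \le 1$, it is convenient to introduce spherical coordinates,

\[ \mathbf{q} = \sqrt{t}\, \boldsymbol{\sigma}, \qquad t = |\mathbf{q}|^2, \qquad \boldsymbol{\sigma} = \mathbf{q}/|\mathbf{q}|. \]

The element of volume in the $\mathbf{q}$-space is then

\[ d^2\mathbf{q} = \frac{1}{2}\, t^{n-2}\, dt\, dS(\boldsymbol{\sigma}), \]

where $dS(\boldsymbol{\sigma})$ is the element of area of the sphere $|\boldsymbol{\sigma}|^2 = 1$. The range of variation of $t$ is $1/2 \le t \le 1$. Next, on making the substitution

\[ r = (2t - 1)^2, \qquad (2t - 1)\, dt = \frac{1}{4}\, dr, \qquad (1 - t)t = \frac{1}{4}(1 - r), \qquad t = \frac{1 + \sqrt{r}}{2}, \]

the expression for the Jacobian becomes simpler,

\[ J(z, \mathbf{w}, \mathbf{q}, W_{n-1})\, d^2\mathbf{q} = J(z, \mathbf{w}, \sqrt{t}\boldsymbol{\sigma}, W_{n-1})\, d^2\mathbf{q} = 2^{2n-3}\, |\det(zI_{n-1} - W_{n-1})|^2\, [(1 - t)t]^{n-2} (2t - 1)\, dt\, dS(\boldsymbol{\sigma}) = \frac{1}{2}\, |\det(zI_{n-1} - W_{n-1})|^2 (1 - r)^{n-2}\, dr\, dS(\boldsymbol{\sigma}). \]

Substituting this into (6.7), we arrive at the desired formula for the mean density of eigenvalues in the ensemble with matrix distribution (6.6):

\[ \rho_n(x, y) = \frac{1}{2} \int_{\mathbb{C}^{2(n-1)^2}} d^2 W_{n-1} \int_{\mathbb{C}^{n-1}} d^2\mathbf{w} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma}) \int_0^1 (1 - r)^{n-2}\, dr\; |\det(zI_{n-1} - W_{n-1})|^2\; p\!\left( R \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} R \right). \tag{6.8} \]

Here

\[ R = \begin{pmatrix} \sqrt{r} & -\sqrt{1 - r}\, \boldsymbol{\sigma}^\dagger \\ -\sqrt{1 - r}\, \boldsymbol{\sigma} & I_{n-1} - (1 + \sqrt{r})\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \end{pmatrix}. \tag{6.9} \]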
We shall now apply this result to express the mean density of eigenvalues in terms of the absolute square modulus of characteristic polynomials for two ensembles of random matrices.

Rank-one deviations from unitarity. Let $U_n$ be an $n \times n$ unitary matrix and

\[ G_n = \begin{pmatrix} \sqrt{1 - \gamma} & 0 \\ 0 & I_{n-1} \end{pmatrix}, \qquad 0 \le \gamma \le 1. \tag{6.10} \]

The Haar measure on $U(n)$ induces a measure on the manifold $W_n^\dagger W_n = G_n^2$ in the space of complex $n \times n$ matrices via the correspondence $W_n = U_n G_n$. The corresponding matrix distribution is uniform and can be conveniently described via matrix delta-function

\[ dP(W_n) = c_n\, \delta(W_n^\dagger W_n - G_n^2) \prod_{i,j=1}^{n} \frac{d(W_n)_{ij}\, d\overline{(W_n)_{ij}}}{2}. \tag{6.11} \]
For Hermitian $H$, we define $\delta(H)$ as

\[ \delta(H) = \prod_{j} \delta(H_{jj}) \prod_{j<k} \delta(H_{jk})\, \delta(\overline{H_{jk}}). \tag{6.12} \]
The normalization constant $c_n$ can be easily computed by changing to the matrix 'polar' coordinates,

\[ c_n = \frac{2^n}{\operatorname{Vol}(U(n))} = \frac{1!\, 2! \cdots (n - 1)!}{\pi^{n(n+1)/2}}. \tag{6.13} \]

Note that the eigenvalues of $G_n U_n$ and $U_n G_n$ coincide, and, therefore, for the purpose of calculating the eigenvalue statistics the ensembles $G_n U_n$ and $U_n G_n$ are equivalent. In view of (6.2) it is more convenient to work with matrices $U_n G_n$.
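Changing the coordinate system to $z$, $\mathbf{w}$, $r$, $\boldsymbol{\sigma}$, $W_{n-1}$,

\[ dP(W_n) = \frac{c_n}{2}\, \delta\!\left( \begin{pmatrix} \bar z & 0 \\ \mathbf{w} & W_{n-1}^\dagger \end{pmatrix} \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} - R G_n^2 R \right) \times |\det(zI_{n-1} - W_{n-1})|^2\, d^2 z\, (1 - r)^{n-2}\, dr\, dS(\boldsymbol{\sigma})\, d^2\mathbf{w}\, d^2 W_{n-1}, \tag{6.14} \]

where we have used the unitary invariance of the matrix delta-function. The matrix inside the delta-function in (6.14) is

\[ \begin{pmatrix} |z|^2 - 1 + \gamma r & \bar z \mathbf{w}^\dagger - \gamma\sqrt{(1 - r)r}\, \boldsymbol{\sigma}^\dagger \\ z\mathbf{w} - \gamma\sqrt{(1 - r)r}\, \boldsymbol{\sigma} & W_{n-1}^\dagger W_{n-1} - I_{n-1} + \mathbf{w}\mathbf{w}^\dagger + \gamma(1 - r)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \end{pmatrix} \]

and the delta-function factorizes into the product of the three delta-functions, correspondingly. On substituting this into (6.8) we obtain

\[ \rho_n(x, y) = \frac{c_n}{2} \int_{\mathbb{C}^{2(n-1)^2}} |\det(zI_{n-1} - W_{n-1})|^2\, f(W_{n-1})\, d^2 W_{n-1}, \tag{6.15} \]

where

\[ f(W_{n-1}) = \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma}) \int_0^1 (1 - r)^{n-2}\, dr\; \delta\!\left( |z|^2 - 1 + \gamma r \right) \times \int_{\mathbb{C}^{n-1}} d^2\mathbf{w}\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \mathbf{w}\mathbf{w}^\dagger + \gamma(1 - r)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right) \delta\!\left( z\mathbf{w} - \gamma\sqrt{(1 - r)r}\, \boldsymbol{\sigma} \right). \]

Note that

\[ \delta\!\left( z\mathbf{w} - \gamma\sqrt{(1 - r)r}\, \boldsymbol{\sigma} \right) = \frac{1}{|z|^{2(n-1)}}\, \delta\!\left( \mathbf{w} - \frac{\gamma}{z}\sqrt{(1 - r)r}\, \boldsymbol{\sigma} \right). \]

Therefore the integral over $\mathbf{w}$ yields

\[ \frac{1}{|z|^{2(n-1)}}\, \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \frac{\gamma(1 - r)(\gamma r + |z|^2)}{|z|^2}\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right) \]

and

\[ f(W_{n-1}) = \frac{1}{|z|^{2(n-1)}} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma}) \int_0^1 (1 - r)^{n-2}\, dr\; \delta\!\left( |z|^2 - 1 + \gamma r \right) \times \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \frac{\gamma(1 - r)(\gamma r + |z|^2)}{|z|^2}\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]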
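It is apparent that if $|z|^2 < 1 - \gamma$ or $|z|^2 > 1$ then the integral over $r$ vanishes and $f(W_{n-1}) = 0$. Therefore $\rho_n(x, y) = 0$ for such values of $z$. If $1 - \gamma < |z|^2 < 1$ then the integral over $r$ produces a non-trivial contribution and

\[ f(W_{n-1}) = \frac{1}{\gamma |z|^{2(n-1)}} \left( 1 - \frac{1 - |z|^2}{\gamma} \right)^{n-2} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma})\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \frac{\gamma - 1 + |z|^2}{|z|^2}\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

Introducing

\[ \tilde\gamma = \frac{\gamma - 1 + |z|^2}{|z|^2}, \]

we can rewrite the above expression in a shorter form,

\[ f(W_{n-1}) = \frac{1}{\gamma |z|^2} \left( \frac{\tilde\gamma}{\gamma} \right)^{n-2} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma})\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \tilde\gamma\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

On substituting this into (6.15), we arrive at

\[ \rho_n(x, y) = \frac{c_n}{2\gamma |z|^2} \left( \frac{\tilde\gamma}{\gamma} \right)^{n-2} \int_{\mathbb{C}^{2(n-1)^2}} d^2 W_{n-1}\, |\det(zI_{n-1} - W_{n-1})|^2 \times \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma})\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - I_{n-1} + \tilde\gamma\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

Since the matrix $\boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger$ is unitary equivalent to the matrix

\[ \begin{pmatrix} 1 & 0 \\ 0 & 0_{n-2} \end{pmatrix}, \]

the integration over $\boldsymbol{\sigma}$ can be easily performed yielding

\[ \rho_n(x, y) = \frac{c_n \operatorname{Vol}(S^{2n-3})}{2\gamma |z|^2} \left( \frac{\tilde\gamma}{\gamma} \right)^{n-2} \int_{\mathbb{C}^{2(n-1)^2}} |\det(zI_{n-1} - W_{n-1})|^2\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - \tilde G_{n-1}^2 \right) d^2 W_{n-1}, \]

where $\tilde G_{n-1}$ is the $(n - 1) \times (n - 1)$ matrix (cf. (6.10))

\[ \tilde G_{n-1} = \begin{pmatrix} \sqrt{1 - \tilde\gamma} & 0 \\ 0 & I_{n-2} \end{pmatrix} \]

and $\operatorname{Vol}(S^{2n-3})$ is the area of the unit sphere in $\mathbb{R}^{2(n-1)}$,

\[ \operatorname{Vol}(S^{2n-3}) = \frac{2\pi^{n-1}}{(n - 2)!}. \tag{6.16} \]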
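The constants entering the final step can be cross-checked numerically. The sketch below compares (6.16) with the general formula $2\pi^{d/2}/\Gamma(d/2)$ for the area of the unit sphere in $\mathbb{R}^d$, and uses $c_n$ from (6.13) to confirm that $c_n \operatorname{Vol}(S^{2n-3})/2 = \frac{n-1}{\pi}\, c_{n-1}$; the function names are illustrative.

```python
from math import pi, factorial, gamma


def sphere_area(dim):
    """Area of the unit sphere in R^dim: 2 pi^(dim/2) / Gamma(dim/2)."""
    return 2 * pi ** (dim / 2) / gamma(dim / 2)


def sphere_area_6_16(n):
    """Vol(S^(2n-3)) as written in (6.16), valid for n >= 2."""
    return 2 * pi ** (n - 1) / factorial(n - 2)


def c_n(n):
    """Normalization constant (6.13): 1! 2! ... (n-1)! / pi^(n(n+1)/2)."""
    prod = 1
    for k in range(1, n):
        prod *= factorial(k)
    return prod / pi ** (n * (n + 1) / 2)


if __name__ == "__main__":
    for n in range(2, 7):
        lhs = c_n(n) * sphere_area_6_16(n) / 2
        rhs = (n - 1) / pi * c_n(n - 1)
        print(n, sphere_area(2 * (n - 1)), sphere_area_6_16(n), lhs, rhs)
```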
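It follows from (6.13) that $c_n / c_{n-1} = (n - 1)!/\pi^n$. Hence

\[ \frac{c_n \operatorname{Vol}(S^{2n-3})}{2} = \frac{n - 1}{\pi}\, c_{n-1}, \]

and finally

\[ \rho_n(x, y) = \frac{n - 1}{\pi\gamma |z|^2} \left( \frac{\tilde\gamma}{\gamma} \right)^{n-2} c_{n-1} \int_{\mathbb{C}^{2(n-1)^2}} |\det(zI_{n-1} - W_{n-1})|^2\; \delta\!\left( W_{n-1}^\dagger W_{n-1} - \tilde G_{n-1}^2 \right) d^2 W_{n-1} = \frac{n - 1}{\pi\gamma |z|^2} \left( \frac{\tilde\gamma}{\gamma} \right)^{n-2} \int_{U(n-1)} \left| \det\!\left( zI_{n-1} - U_{n-1} \tilde G_{n-1} \right) \right|^2 dU_{n-1} \]

as claimed in (1.5).

Rank-one deviations from hermiticity. Let $H_n$ be a GUE$_n$ matrix, i.e. random Hermitian matrix of size $n \times n$ with probability distribution

\[ c_{\beta,n}\, e^{-\frac{\beta}{2} \operatorname{tr} H_n^2} \prod_{j=1}^{n} d(H_n)_{jj} \prod_{1 \le j < k \le n} \frac{d(H_n)_{jk}\, d\overline{(H_n)_{jk}}}{2}, \qquad \beta > 0, \]

and $\Gamma_n$ be the $n \times n$ matrix

\[ \Gamma_n = \gamma \begin{pmatrix} 1 & 0 \\ 0 & 0_{n-1} \end{pmatrix}, \qquad \gamma > 0. \]

Consider the random matrices $W_n = H_n + i\Gamma_n$. Obviously,

\[ \operatorname{Re} W_n := \frac{W_n + W_n^\dagger}{2} = H_n, \quad \text{and} \quad \operatorname{Im} W_n := \frac{W_n - W_n^\dagger}{2i} = \Gamma_n. \]

The matrices $W_n$ are complex and their probability distribution is given by

\[ dP(W_n) = c_{\beta,n}\, e^{-\frac{\beta}{2} \operatorname{tr}(\operatorname{Re} W_n)^2}\, \delta(\operatorname{Im} W_n - \Gamma_n) \prod_{i,j=1}^{n} \frac{d(W_n)_{ij}\, d\overline{(W_n)_{ij}}}{2}, \tag{6.17} \]

where $c_{\beta,n}$ is the normalization constant,

\[ c_{\beta,n} = \left( \frac{1}{2} \right)^{n/2} \left( \frac{\beta}{\pi} \right)^{n^2/2} \tag{6.18} \]

and $\delta$ is the matrix delta-function (6.12).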
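Changing the coordinate system to $z$, $\mathbf{w}$, $r$, $\boldsymbol{\sigma}$ and $W_{n-1}$,

\[ dP(W_n) = \frac{c_{\beta,n}}{2}\, e^{-\frac{\beta}{2} \operatorname{tr}(\operatorname{Re} W_{n-1})^2 - \frac{\beta}{4} |\mathbf{w}|^2 - \frac{\beta}{2} (\operatorname{Re} z)^2}\; \delta\!\left( \begin{pmatrix} \operatorname{Im} z & \frac{\mathbf{w}^\dagger}{2i} \\ -\frac{\mathbf{w}}{2i} & \operatorname{Im} W_{n-1} \end{pmatrix} - R\Gamma_n R \right) \times |\det(zI_{n-1} - W_{n-1})|^2\, d^2 z\, (1 - r)^{n-2}\, dr\, dS(\boldsymbol{\sigma})\, d^2\mathbf{w}\, d^2 W_{n-1}, \]

where we have used the unitary invariance of the matrix delta-function. The matrix inside the delta-function in this expression is

\[ \begin{pmatrix} \operatorname{Im} z - \gamma r & \frac{\mathbf{w}^\dagger}{2i} + \frac{\gamma}{2}\sqrt{(1 - r)r}\, \boldsymbol{\sigma}^\dagger \\ -\frac{\mathbf{w}}{2i} + \frac{\gamma}{2}\sqrt{(1 - r)r}\, \boldsymbol{\sigma} & \operatorname{Im} W_{n-1} - \gamma(1 - r)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \end{pmatrix}, \]

and the delta-function factorizes into the product of the three delta-functions correspondingly. On substituting this into (6.8) we obtain

\[ \rho_n(x, y) = \frac{c_{\beta,n}}{2}\, e^{-\frac{\beta x^2}{2}} \int_{\mathbb{C}^{2(n-1)^2}} |\det(zI_{n-1} - W_{n-1})|^2\, f(W_{n-1})\, e^{-\frac{\beta}{2} \operatorname{tr}(\operatorname{Re} W_{n-1})^2}\, d^2 W_{n-1}, \tag{6.19} \]

where

\[ f(W_{n-1}) = \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma}) \int_0^1 (1 - r)^{n-2}\, dr\; \delta(y - \gamma r)\; \delta\!\left( \operatorname{Im} W_{n-1} - \gamma(1 - r)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right) \times \int_{\mathbb{C}^{n-1}} d^2\mathbf{w}\; e^{-\frac{\beta |\mathbf{w}|^2}{4}}\; \delta\!\left( \frac{\mathbf{w}^\dagger}{2i} + \frac{\gamma}{2}\sqrt{(1 - r)r}\, \boldsymbol{\sigma}^\dagger \right). \]

The integral over $\mathbf{w}$ yields $\frac{1}{4}\, e^{-\beta\gamma^2 (1 - r)r}$, and we arrive at

\[ f(W_{n-1}) = \frac{1}{4} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma}) \int_0^1 dr\, (1 - r)^{n-2}\, e^{-\beta\gamma^2 (1 - r)r}\; \delta(y - \gamma r)\; \delta\!\left( \operatorname{Im} W_{n-1} - \gamma(1 - r)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

It is apparent that if $y < 0$ or $y > \gamma$ then the integral over $r$ vanishes. Therefore $\rho_n(x, y) = 0$ if $y < 0$ or $y > \gamma$. If $0 < y < \gamma$, then the integration over $r$ produces the factor $\frac{1}{\gamma}$ and the constraint $r = \frac{y}{\gamma}$, so that

\[ f(W_{n-1}) = \frac{(\gamma - y)^{n-2}\, e^{-\beta(\gamma - y)y}}{4\gamma^{n-1}} \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma})\; \delta\!\left( \operatorname{Im} W_{n-1} - (\gamma - y)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

On substituting the obtained expression for $f(W_{n-1})$ into (6.19), we arrive at

\[ \rho_n(x, y) = \frac{c_{\beta,n}\, (\gamma - y)^{n-2}\, e^{-\frac{\beta x^2}{2} - \beta(\gamma - y)y}}{8\gamma^{n-1}} \int_{\mathbb{C}^{2(n-1)^2}} d^2 W_{n-1}\, |\det(zI_{n-1} - W_{n-1})|^2\, e^{-\frac{\beta}{2} \operatorname{tr}(\operatorname{Re} W_{n-1})^2} \times \int_{|\boldsymbol{\sigma}|^2 = 1} dS(\boldsymbol{\sigma})\; \delta\!\left( \operatorname{Im} W_{n-1} - (\gamma - y)\, \boldsymbol{\sigma}\boldsymbol{\sigma}^\dagger \right). \]

The integral over $\boldsymbol{\sigma}$ yields

\[ \operatorname{Vol}(S^{2n-3})\; \delta\!\left( \operatorname{Im} W_{n-1} - \tilde\Gamma_{n-1} \right), \]

where $\tilde\Gamma_{n-1}$ is the $(n - 1) \times (n - 1)$ matrix

\[ \tilde\Gamma_{n-1} = (\gamma - y) \begin{pmatrix} 1 & 0 \\ 0 & 0_{n-2} \end{pmatrix} \]

and $\operatorname{Vol}(S^{2n-3})$ is the area of the unit sphere in $\mathbb{R}^{2(n-1)}$ (6.16). We have that

\[ \frac{c_{\beta,n}}{c_{\beta,n-1}} = \left( \frac{\beta}{\pi} \right)^{n} \left( \frac{\pi}{2\beta} \right)^{1/2}, \]

and it now follows that

\[ \rho_n(x, y) = \frac{1}{4\sqrt{2\pi\beta}}\, \frac{\beta^n}{(n - 2)!}\, \frac{(\gamma - y)^{n-2}}{\gamma^{n-1}}\, e^{-\frac{\beta x^2}{2} - \beta(\gamma - y)y}\; c_{\beta,n-1} \times \int_{\mathbb{C}^{2(n-1)^2}} d^2 W_{n-1}\, |\det(zI_{n-1} - W_{n-1})|^2\, e^{-\frac{\beta}{2} \operatorname{tr}(\operatorname{Re} W_{n-1})^2}\; \delta\!\left( \operatorname{Im} W_{n-1} - \tilde\Gamma_{n-1} \right), \]

as claimed in (1.7).

Acknowledgement. We would like to thank A. Gamburd for useful discussions and, in particular, for bringing reference [45] to our attention and for pointing us towards the link between our Lemma 3 and the Selberg Integral. We are also grateful to Ph. Biane for bringing reference [28] to our attention.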
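A. Appendix

In this appendix we evaluate the integral

\[ I_k(\varepsilon^2, a^2) = \int_0^1 \frac{(1 - t)^k\, dt}{\sqrt{(t - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2}} \]

in the limit $\varepsilon \to 0$.

We shall use the following fact from calculus. If $P(t)$ is a polynomial of degree $k$ then (integrate by parts)

\[ \int \frac{P(t)\, dt}{\sqrt{t^2 + pt + q}} = Q(t) \sqrt{t^2 + pt + q} + \lambda \int \frac{dt}{\sqrt{t^2 + pt + q}}, \tag{A.1} \]

where $Q$ is a polynomial of degree $k - 1$ and $\lambda$ is a constant. For $Q$ and $\lambda$ one has the equation (differentiate (A.1))

\[ P(t) = Q'(t)(t^2 + pt + q) + \frac{1}{2}\, Q(t)(t^2 + pt + q)' + \lambda. \]

It follows from this that

\[ I_k(\varepsilon^2, a^2) = \left. Q_{\varepsilon,a}(t) \sqrt{(t - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2}\, \right|_{t=0}^{t=1} + \lambda_{\varepsilon,a}\, I_0(\varepsilon, a) = Q_{\varepsilon,a}(1) \sqrt{(1 - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2} - Q_{\varepsilon,a}(0)(\varepsilon^2 + a^2) + \lambda_{\varepsilon,a}\, I_0(\varepsilon, a), \]

and the equation for $Q(t)$ and $\lambda$ is

\[ (1 - t)^k = Q'_{\varepsilon,a}(t) \left[ (t - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2 \right] + Q_{\varepsilon,a}(t)(t - a^2 + \varepsilon^2) + \lambda_{\varepsilon,a}. \tag{A.2} \]

It is apparent from (A.2) that $Q(t)$ and $\lambda$ must be polynomials in $a^2$ and $\varepsilon^2$ and, therefore, in the limit $\varepsilon \to 0$, $Q_{\varepsilon,a}(t) = Q_a(t) + O(\varepsilon^2)$ and $\lambda_{\varepsilon,a} = \lambda_a + O(\varepsilon^2)$, and

\[ (1 - t)^k = Q'_a(t)(t - a^2)^2 + Q_a(t)(t - a^2) + \lambda_a. \]

This equation for $Q_a$ and $\lambda_a$ can be explicitly solved, the solution being

\[ \lambda_a = (1 - a^2)^k \quad \text{and} \quad Q_a(t) = \sum_{l=1}^{k} \frac{(-1)^l}{l} \binom{k}{l} (t - a^2)^{l-1} (1 - a^2)^{k-l}. \]

Note that $Q_a(0)$ is a polynomial in $a^2$ of degree $k - 1$, and

\[ Q_a(1) = (1 - a^2)^{k-1} \sum_{l=1}^{k} \frac{(-1)^l}{l} \binom{k}{l} = -(1 - a^2)^{k-1} \gamma_k, \]

where $\gamma_k$ is the partial sum of the harmonic series,

\[ \gamma_k = 1 + \frac{1}{2} + \cdots + \frac{1}{k}. \]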
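The explicit solution of the $\varepsilon \to 0$ equation can be verified in exact arithmetic. The sketch below checks, for sample rational values, that $Q_a$ and $\lambda_a$ satisfy $(1 - t)^k = Q'_a(t)(t - a^2)^2 + Q_a(t)(t - a^2) + \lambda_a$ and that $Q_a(1) = -(1 - a^2)^{k-1}\gamma_k$; the function names and sample values are assumptions for this illustration.

```python
from fractions import Fraction
from math import comb


def harmonic(k):
    """Partial sum gamma_k = 1 + 1/2 + ... + 1/k."""
    return sum(Fraction(1, j) for j in range(1, k + 1))


def q_a(t, a2, k):
    """Q_a(t) = sum_{l=1}^k (-1)^l / l * C(k, l) (t - a^2)^(l-1) (1 - a^2)^(k-l)."""
    return sum(Fraction((-1) ** l * comb(k, l), l) * (t - a2) ** (l - 1) * (1 - a2) ** (k - l)
               for l in range(1, k + 1))


def q_a_prime(t, a2, k):
    """Derivative of Q_a with respect to t."""
    return sum(Fraction((-1) ** l * comb(k, l), l) * (l - 1) * (t - a2) ** (l - 2)
               * (1 - a2) ** (k - l)
               for l in range(2, k + 1))


def limiting_equation_residual(t, a2, k):
    """(1 - t)^k minus the right-hand side of the limiting equation; should be exactly 0."""
    lam = (1 - a2) ** k
    rhs = q_a_prime(t, a2, k) * (t - a2) ** 2 + q_a(t, a2, k) * (t - a2) + lam
    return (1 - t) ** k - rhs


if __name__ == "__main__":
    a2, t = Fraction(1, 3), Fraction(2, 5)
    for k in range(1, 6):
        print(k, limiting_equation_residual(t, a2, k),
              q_a(Fraction(1), a2, k) + (1 - a2) ** (k - 1) * harmonic(k))
```

Both printed quantities should be exactly zero for every $k$, which reflects the binomial expansion of $((1 - a^2) - (t - a^2))^k$ and the classical identity $\sum_{l=1}^{k} \frac{(-1)^{l}}{l}\binom{k}{l} = -\gamma_k$.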
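We now turn to $I_0(\varepsilon^2, a^2)$. Recalling the table integral

\[ \int \frac{dt}{\sqrt{t^2 + \alpha^2}} = \ln\left| t + \sqrt{t^2 + \alpha^2} \right|, \]

we have

\[ I_0(\varepsilon^2, a^2) = \ln \frac{1 - a^2 + \varepsilon^2 + \sqrt{(1 - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2}}{2\varepsilon^2}. \]

At $a^2 = 1$,

\[ I_0(\varepsilon^2, 1) = \ln \frac{\varepsilon^2 + \sqrt{\varepsilon^4 + 4\varepsilon^2}}{2\varepsilon^2} = \ln \frac{1}{\varepsilon} + O(\varepsilon). \]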
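The closed form for $I_0$ can be compared with a direct numerical quadrature of its defining integral, and its small-$\varepsilon$ behaviour can be inspected at and away from $a^2 = 1$. The following sketch uses a plain midpoint rule; the function names, grid size and sample parameters are illustrative assumptions.

```python
from math import log, sqrt


def i0_closed(eps2, a2):
    """Closed form ln[(1 - a^2 + eps^2 + sqrt((1 - a^2 + eps^2)^2 + 4 eps^2 a^2)) / (2 eps^2)]."""
    root = sqrt((1 - a2 + eps2) ** 2 + 4 * eps2 * a2)
    return log((1 - a2 + eps2 + root) / (2 * eps2))


def i0_quadrature(eps2, a2, num_points=20000):
    """Midpoint-rule value of int_0^1 dt / sqrt((t - a^2 + eps^2)^2 + 4 eps^2 a^2)."""
    h = 1.0 / num_points
    total = 0.0
    for k in range(num_points):
        t = (k + 0.5) * h
        total += h / sqrt((t - a2 + eps2) ** 2 + 4 * eps2 * a2)
    return total


if __name__ == "__main__":
    print(i0_closed(0.01, 0.5), i0_quadrature(0.01, 0.5))
    eps = 1e-3
    print(i0_closed(eps ** 2, 1.0), log(1 / eps))           # a^2 = 1
    print(i0_closed(1e-4, 2.0), log(2.0 / (2.0 - 1.0)))     # a^2 > 1
    print(i0_closed(1e-4, 0.5), log((1 - 0.5) / 1e-4))      # a^2 < 1
```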
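For $a^2 \ne 1$,

\[ \sqrt{(1 - a^2 + \varepsilon^2)^2 + 4\varepsilon^2 a^2} = |1 - a^2| + \frac{\varepsilon^2 (1 + a^2)}{|1 - a^2|} + O(\varepsilon^4), \]

and therefore

\[ I_0(\varepsilon^2, a^2) = \begin{cases} \ln \dfrac{1 - a^2}{\varepsilon^2} + O(\varepsilon^2) & \text{if } a^2 < 1, \\[2mm] \ln \dfrac{a^2}{a^2 - 1} + O(\varepsilon^2) & \text{if } a^2 > 1. \end{cases} \]

After collecting all relevant terms we arrive at the desired formula

\[ I_k(\varepsilon^2, a^2) = (1 - a^2)^k\, \theta(1 - a^2) \ln \frac{1}{\varepsilon^2} + \beta(a^2) + q_k(a^2) + O(\varepsilon), \]

where $\theta$ is the Heaviside step-function,

\[ \theta(x) = \begin{cases} 1 & \text{if } x > 0, \\ \tfrac{1}{2} & \text{if } x = 0, \\ 0 & \text{if } x < 0, \end{cases} \]

$q_k$ is a polynomial of degree $k$ and

\[ \beta(a^2) = (1 - a^2)^k \left[ \operatorname{sgn}(a^2 - 1)\left( \gamma_k - \ln|1 - a^2| \right) + \theta(a^2 - 1) \ln a^2 \right]. \]

We use the convention according to which the sign function, $\operatorname{sgn}(x)$, vanishes at $x = 0$.

B. Appendix

In this appendix we derive Eq. (6.5). Let

\[ W_n = R \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} R, \qquad R = \begin{pmatrix} 2\mathbf{q}^\dagger \mathbf{q} - 1 & -2\sqrt{1 - \mathbf{q}^\dagger \mathbf{q}}\; \mathbf{q}^\dagger \\ -2\sqrt{1 - \mathbf{q}^\dagger \mathbf{q}}\; \mathbf{q} & I_{n-1} - 2\mathbf{q}\mathbf{q}^\dagger \end{pmatrix}. \]

When $z$, $\mathbf{w}$, $\mathbf{q}$ and $W_{n-1}$ get infinitesimal increments $dz$, $d\mathbf{w}$, $d\mathbf{q}$ and $dW_{n-1}$ the matrix $W_n$ gets increment

\[ dW_n = dR \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} R + R \begin{pmatrix} dz & d\mathbf{w}^\dagger \\ 0 & dW_{n-1} \end{pmatrix} R + R \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} dR. \]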
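Since $R$ is unitary Hermitian, the matrix $R\, dR$ is skew-Hermitian, so that

\[ dT = R\, dR = \begin{pmatrix} df & -d\mathbf{h}^\dagger \\ d\mathbf{h} & dT_{n-1} \end{pmatrix} \]

for some $f$, $\mathbf{h}$ and $T_{n-1}$. Also $R\, dR = -(dR)R$, and it follows that

\[ R(dW_n)R = dT \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} - \begin{pmatrix} z & \mathbf{w}^\dagger \\ 0 & W_{n-1} \end{pmatrix} dT + \begin{pmatrix} dz & d\mathbf{w}^\dagger \\ 0 & dW_{n-1} \end{pmatrix} = \begin{pmatrix} dz & d\mathbf{w}^\dagger \\ 0 & dW_{n-1} \end{pmatrix} + \begin{pmatrix} \mathbf{w}^\dagger d\mathbf{h} & \mathbf{w}^\dagger df - z\, d\mathbf{h}^\dagger - d\mathbf{h}^\dagger W_{n-1} + \mathbf{w}^\dagger dT_{n-1} \\ (zI - W_{n-1})\, d\mathbf{h} & d\mathbf{h}\, \mathbf{w}^\dagger + dT_{n-1} W_{n-1} - W_{n-1}\, dT_{n-1} \end{pmatrix}. \tag{B.1} \]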
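Let $dM = R(dW_n)R$. It is apparent that

\[ \prod_{j,k=1}^{n} d(W_n)_{jk}\, d\overline{(W_n)_{jk}} = \prod_{j,k=1}^{n} dM_{jk}\, d\overline{M_{jk}}. \tag{B.2} \]

On the other hand, it follows from (B.1) that

\[ \prod_{j,k=1}^{n} dM_{jk}\, d\overline{M_{jk}} = |\det(zI - W_{n-1})|^2\, dz\, d\bar z \prod_{j=1}^{n-1} dw_j\, d\bar w_j \prod_{j=1}^{n-1} dh_j\, d\bar h_j \prod_{j,k=1}^{n-1} d(W_{n-1})_{jk}\, d\overline{(W_{n-1})_{jk}}. \tag{B.3} \]

To complete our derivation we now compute the Jacobian of the transformation from $\mathbf{h}$ to $\mathbf{q}$. Recall that $d\mathbf{h}$ is the (2,1)-entry of the matrix $dT = R\, dR$. A straightforward computation yields

\[ d\mathbf{h} = (2a + b)(d\mathbf{q}^\dagger\, \mathbf{q})\, \mathbf{q} + a\, (d\mathbf{q}) + b\, \mathbf{q}\mathbf{q}^\dagger (d\mathbf{q}), \tag{B.4} \]

where

\[ a = -2\sqrt{1 - \mathbf{q}^\dagger \mathbf{q}}, \qquad b = \frac{1 - 2\mathbf{q}^\dagger \mathbf{q}}{\sqrt{1 - \mathbf{q}^\dagger \mathbf{q}}}. \]

Equation (B.4) can be written as $d\mathbf{h} = (aI + b\, \mathbf{q}\mathbf{q}^\dagger)(d\mathbf{q}) + (2a + b)\, \mathbf{q}\mathbf{q}^T (d\bar{\mathbf{q}})$, and, therefore,

\[ \begin{pmatrix} d\mathbf{h} \\ d\bar{\mathbf{h}} \end{pmatrix} = \begin{pmatrix} aI + b\, \mathbf{q}\mathbf{q}^\dagger & (2a + b)\, \mathbf{q}\mathbf{q}^T \\ (2a + b)\, \bar{\mathbf{q}}\mathbf{q}^\dagger & aI + b\, \bar{\mathbf{q}}\mathbf{q}^T \end{pmatrix} \begin{pmatrix} d\mathbf{q} \\ d\bar{\mathbf{q}} \end{pmatrix}. \]

It now follows that

\[ \prod_{j=1}^{n-1} dh_j\, d\bar h_j = \det(aI + L) \prod_{j=1}^{n-1} dq_j\, d\bar q_j, \tag{B.5} \]

where $L$ is the $2(n - 1) \times 2(n - 1)$ matrix

\[ L = \begin{pmatrix} b\, \mathbf{q}\mathbf{q}^\dagger & (2a + b)\, \mathbf{q}\mathbf{q}^T \\ (2a + b)\, \bar{\mathbf{q}}\mathbf{q}^\dagger & b\, \bar{\mathbf{q}}\mathbf{q}^T \end{pmatrix}. \]

If we find the eigenvalues of $L$, we shall know $\det(aI + L)$. To solve the eigenvalue problem for $L$, we observe that if $(\mathbf{f}, \mathbf{g})^T$ is an eigenvector of $L$ then

\[ \begin{aligned} b(\mathbf{q}^\dagger \mathbf{f})\, \mathbf{q} + (2a + b)(\mathbf{q}^T \mathbf{g})\, \mathbf{q} &= \lambda \mathbf{f}, \\ (2a + b)(\mathbf{q}^\dagger \mathbf{f})\, \bar{\mathbf{q}} + b(\mathbf{q}^T \mathbf{g})\, \bar{\mathbf{q}} &= \lambda \mathbf{g} \end{aligned} \]

for some $\lambda$. If $\lambda \ne 0$ we must have $\mathbf{f} = c_1 \mathbf{q}$ and $\mathbf{g} = c_2 \bar{\mathbf{q}}$ for some $c_1$ and $c_2$, and

\[ \begin{aligned} b(\mathbf{q}^\dagger \mathbf{q})\, c_1 + (2a + b)(\mathbf{q}^\dagger \mathbf{q})\, c_2 &= \lambda c_1, \\ (2a + b)(\mathbf{q}^\dagger \mathbf{q})\, c_1 + b(\mathbf{q}^\dagger \mathbf{q})\, c_2 &= \lambda c_2. \end{aligned} \]

This reduced eigenvalue problem yields the two non-zero eigenvalues of $L$, $\lambda_1 = -2a\, \mathbf{q}^\dagger \mathbf{q}$ and $\lambda_2 = 2(a + b)\, \mathbf{q}^\dagger \mathbf{q}$. It is now apparent that $\lambda = 0$ is an eigenvalue of $L$ of multiplicity $2(n - 2)$. This fact can be verified independently of the eigenvalue count by observing that for any vector $\mathbf{u}$ which is orthogonal to $\mathbf{q}$,

\[ L \begin{pmatrix} \mathbf{u} \\ 0 \end{pmatrix} = 0 \quad \text{and} \quad L \begin{pmatrix} 0 \\ \bar{\mathbf{u}} \end{pmatrix} = 0. \]

It follows now that

\[ \det(aI + L) = a^{2(n-2)} (a + \lambda_1)(a + \lambda_2) = (-2)^{2n-2} (1 - \mathbf{q}^\dagger \mathbf{q})^{n-2} (1 - 2\mathbf{q}^\dagger \mathbf{q}). \tag{B.6} \]

Collecting (B.2)–(B.3) and (B.5)–(B.6), one arrives at (6.5).

References

1. Akemann, G., Vernizzi, G.: Characteristic Polynomials of Complex Random Matrix Models. Nucl. Phys. B 660, 532–556 (2003)
2. Akemann, G., Pottier, A.: Ratios of characteristic polynomials in complex matrix models. J. Phys. A: Math and General 37, L453–L460 (2004)
3. Andreev, A.V., Simons, B.D.: Correlators of spectral determinants in quantum chaos. Phys. Rev. Lett. 75, 2304–2307 (1995)
4. Balantekin, A.B.: Character expansions, Itzykson-Zuber integrals, and the QCD partition function. Phys. Rev. D(3) 62, 085017–085023 (2000)
5. Baik, J., Deift, P., Strahov, E.: Products and ratios of characteristic polynomials of random Hermitian matrices. J. Math. Phys. 44, 3657–3670 (2003)
6. Berezin, F.A.: Some remarks on the Wigner distribution (in Russian). Teor. Mat. Fiz. 17, 305–318 (1973). English translation: Theoret. and Math. Phys. 17(3), 1163–1171 (1974)
7. Berezin, F.A.: Quantization in complex symmetric spaces (in Russian). Izv Akad Nauk SSSR, Ser Math 39, 363–402 (1975); English translation: Math USSR-Izv 9(2), 341–379 (1976)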
8. Biane, Ph., Lehner, F.: Computation of some examples of Brown's spectral measure in free probability. Colloq. Math. 90, 181–211 (2001)
9. Borodin, A., Olshanski, G., Strahov, E.: Giambelli compatible point processes. Adv. in Appl. Math. 37(2), 209–248 (2006)
10. Borodin, A., Strahov, E.: Averages of characteristic polynomials in Random Matrix Theory. Commun. Pure and Applied Math. 59(2), 161–253 (2006)
11. Brezin, E., Hikami, S.: Characteristic polynomials of random matrices. Commun. Math. Phys. 214, 111–135 (2000)
12. Bump, D., Gamburd, A.: On the average of characteristic polynomials from classical groups. Commun. Math. Phys. 265, 227–274 (2006)
13. Conrey, J.B., Farmer, D.W., Keating, J.P., Rubinstein, M.O., Snaith, N.C.: Autocorrelation of random matrix polynomials. Commun. Math. Phys. 237, 365–395 (2003)
14. Conrey, J.B., Forrester, P.J., Snaith, N.C.: Averages of ratios of characteristic polynomials for the compact classical groups. Int. Math. Res. Not. 7, 397–431 (2005)
15. Conrey, J.B., Farmer, D.W., Zirnbauer, M.R.: Howe pairs, supersymmetry, and ratios of random characteristic polynomials for the unitary groups U(N). http://arxiv.org/list/math-ph/0511024, 2005
16. Diaconis, P., Gamburd, A.: Random matrices, magic squares and matching polynomials. Electron. J. Combin. 11(2), Research Paper 2, 26 pp. (2004/05)
17. Edelman, A.: The probability that a random real gaussian matrix has k real eigenvalues, related distributions, and the Circular law. J. Multiv. Anal. 60, 203–232 (1997)
18. Edelman, A., Kostlan, E., Shub, M.: How many eigenvalues of a random matrix are real? J. Amer. Math. Soc. 7, 247–267 (1994)
19. Feinberg, J., Zee, A.: Non-Gaussian Non-Hermitean Random Matrix Theory: phase transitions and addition formalism. Nucl. Phys. B 501, 643–669 (1997)
20. Feinberg, J., Scalettar, R., Zee, A.: "Single Ring Theorem" and the Disk-Annulus Phase Transition. J. Math. Phys. 42, 5718–5740 (2001)
21. Fyodorov, Y.V.: Negative moments of characteristic polynomials of random matrices: Ingham-Siegel integral as an alternative to Hubbard-Stratonovich transformation. Nucl. Phys. B 621, 643–674 (2002)
22. Fyodorov, Y.V., Akemann, G.: On the supersymmetric partition function in QCD-inspired random matrix models. JETP Lett. 77, 438–441 (2003)
23. Fyodorov, Y.V., Khoruzhenko, B.A.: Systematic analytical approach to correlation functions of resonances in quantum chaotic scattering. Phys. Rev. Lett. 83, 65–68 (1999)
24. Fyodorov, Y.V., Sommers, H.-J.: Statistics of resonance poles, phase shifts and time delays in quantum chaotic scattering: Random matrix approach for systems with broken time-reversal invariance. J. Math. Phys. 38, 1918–1981 (1997)
25. Fyodorov, Y.V., Sommers, H.-J.: Random matrices close to Hermitian or unitary: overview of methods and results. J. Phys. A 36, 3303–3347 (2003)
26. Fyodorov, Y.V., Strahov, E.: An exact formula for general spectral correlation function of random Hermitian matrices. J. Phys. A: Maths and General 36, 3203–3213 (2003)
27. Fyodorov, Y.V., Strahov, E.: Characteristic polynomials of random Hermitian matrices and Duistermaat-Heckman localisation on non-compact Kähler Manifolds. Nucl. Phys. B 630, 453–491 (2002)
28. Haagerup, U., Larsen, F.: Brown's spectral distribution measure for R-diagonal elements in finite von Neumann algebras. J. Funct. Anal. 176, 331–367 (2000)
29. Halasz, M.A., Jackson, A.D., Verbaarschot, J.J.M.: Fermion determinants in matrix models of QCD at nonzero chemical potential. Phys. Rev. D 56, 5140–5152 (1997)
30. Hua, L.K.: Harmonic Analysis of Functions of Several Complex variables in the Classical Domains. Providence, RI: Amer. Math. Soc., 1963
31. Ginibre, J.: Statistical Ensembles of Complex, Quaternion, and Real Matrices. J. Math. Phys. 6, 440–449 (1964)
32. Gradshtein, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products, 5th ed., A. Jeffrey, ed. New York: Academic Press, 1994
33. Kadell, K.W.J.: The Selberg-Jack symmetric functions. Adv. Math. 130, 33–102 (1997)
34. Kaneko, J.: Selberg integrals and hypergeometric functions associated with Jack polynomials. SIAM J. Math. Anal. 24, 1086–1110 (1993)
35. Keating, J.P., Snaith, N.C.: Random matrix theory and ζ(1/2 + it). Commun. Math. Phys. 214, 57–89 (2000)
36. Keating, J.P., Snaith, N.C.: Random matrix theory and L-functions at s = 1/2. Commun. Math. Phys. 214, 91–110 (2000)
37. Macdonald, I.G.: Symmetric Functions and Hall Polynomials. 2nd ed. Oxford: Clarendon Press, 1995
38. Mehta, M.L.: Random Matrices. 3rd ed. Amsterdam: Elsevier/Academic Press, 2004
39. Orlov, A.Yu.: New Solvable Matrix Integrals. In: Proceedings of 6th International Workshop on Conformal Field Theory and Integrable Models. Internat. J. Modern Phys. A 19, May, suppl., 276–293 (2004)
40. Pólya, G., Szegö, G.: Problems and Theorems in Analysis. Vol. I, Berlin-Heidelberg-New York: Springer-Verlag, 1972
41. Pólya, G., Szegö, G.: Problems and Theorems in Analysis. Vol. II, Berlin-Heidelberg-New York: Springer-Verlag, 1976
42. Schlittgen, B., Wettig, T.: Generalizations of some integrals over the unitary group. J. Phys. A: Math and General 36, 3195–3202 (2003)
43. Shuryak, E.V., Verbaarschot, J.J.M.: Random matrix theory and spectral sum rules for the Dirac operator in QCD. Nucl. Phys. A 560, 306–320 (1993)
44. Strahov, E.: Moments of characteristic polynomials enumerate two-rowed lexicographic arrays. Electron. J. Combin. 10, Research paper 24, 8 pp. (2003)
45. Trotter, H.F.: Eigenvalue distributions of large Hermitian matrices: Wigner semicircle and a theorem of Kac, Murdock, and Szego. Adv. Math. 54, 67–82 (1984)
46. Verbaarschot, J.J.M.: Spectrum of the QCD Dirac Operator and Chiral Random Matrix Theory. Phys. Rev. Lett. 72, 2531–2533 (1994)
47. Verbaarschot, J.J.M.: QCD, chiral random matrix theory and integrability. In: Applications of random matrices in physics, NATO Sci. Ser. II Math. Phys. Chem. 221, Dordrecht: Springer, 2006, pp. 163–217
48. Wilkinson, J.H.: The Algebraic Eigenvalue Problem. Oxford: Clarendon Press, 1965
49. Zirnbauer, M.R.: Supersymmetry for systems with unitary disorder: circular ensembles. J. Phys. A: Math and General 29, 7113–7136 (1996)
50. Zyczkowski, K., Sommers, H.-J.: Truncations of random unitary matrices. J. Phys. A: Math. and General 33, 2045–2058 (2000)

Communicated by P. Sarnak
Commun. Math. Phys. 273, 601–618 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0252-0
Communications in
Mathematical Physics
Upper Bounds On Wavepacket Spreading For Random Jacobi Matrices
Svetlana Jitomirskaya (1), Hermann Schulz-Baldes (2)
(1) Department of Mathematics, University of California at Irvine, Irvine, CA 92697, USA
(2) Mathematisches Institut, Universität Erlangen-Nürnberg, Erlangen, Germany.
E-mail: [email protected] Received: 10 July 2006 / Accepted: 9 November 2006 Published online: 28 April 2007 – © Springer-Verlag 2007
Abstract: A method is presented for proving upper bounds on the moments of the position operator when the dynamics of quantum wavepackets is governed by a random (possibly correlated) Jacobi matrix. As an application, one obtains sharp upper bounds on the diffusion exponents for random polymer models, coinciding with the lower bounds obtained in a prior work. The second application is an elementary argument (not using multiscale analysis or the Aizenman-Molchanov method) showing that under the condition of uniformly positive Lyapunov exponents, the moments of the position operator grow at most logarithmically in time.

1. Introduction

One of the fundamental questions of quantum mechanics concerns the spreading of an initially localized wave packet $\phi$ under the time evolution $e^{-\imath t H}$ associated to a Schrödinger operator $H$. If the physical space is $\mathbb{R}^d$ or $\mathbb{Z}^d$ and the position operator is denoted by $X$, the spreading can be quantified using the time-averaged moments of $X$ (or equivalently the moments of the associated classical probability distribution):
$$M_T^q = \int_0^\infty \frac{dt}{T/2}\; e^{-\frac{2t}{T}}\; \langle \phi |\, e^{\imath H t}\, |X|^q\, e^{-\imath H t}\, | \phi \rangle , \qquad q > 0. \tag{1}$$
It is well known that for short-range operators $H$, the moments cannot grow faster than ballistically, that is $M_T^q \leq C(q)\, T^q$. The growth actually is ballistic in typical scattering situations and for periodic operators $H$ describing Bloch electrons. On the other hand, if the moments $M_T^q$ are bounded uniformly in time, one speaks of dynamical localization. This can be proven in the regime of Anderson localization for random operators, but also for certain almost-periodic operators (see [Jit] for a review). There are many models where the moments $M_T^q$ exhibit some non-trivial power law behavior. If $M_T^q \sim T^{q/2}$, the quantum motion is diffusive, and any other asymptotic growth behavior is called
anomalous diffusion. In order to distinguish various anomalous diffusive motions, one defines the diffusion exponents
$$\beta_q^+ = \limsup_{T \to \infty} \frac{\log(M_T^q)}{\log(T^q)}, \qquad \beta_q^- = \liminf_{T \to \infty} \frac{\log(M_T^q)}{\log(T^q)}, \qquad q > 0. \tag{2}$$
If the limit exists, we write $\beta_q = \beta_q^+ = \beta_q^-$. The diffusion exponents correspond to the Levy-Khinchin classification of Levy flights in classical probability; however, we stress that the quantum anomalous diffusion does not result from a probabilistic dynamics, but rather from a Hamiltonian one. It is due to delicate quantum interference phenomena. The ballistic bound implies $0 \leq \beta_q^\pm \leq 1$ and convexity inequalities show that $\beta_q^\pm$ is non-decreasing in $q$. In the regime of dynamical localization $\beta_q = 0$ and for quantum diffusion $\beta_q = \frac{1}{2}$. Anomalous diffusion corresponds to all other values of $\beta_q^\pm$. Typically $\beta_q^\pm$ then also varies with $q$, which reflects a rich multiscale behavior of the wave packet spreading. Such anomalous diffusion was exhibited numerically in several almost-periodic Jacobi matrices having singular continuous spectra (e.g. Fibonacci and critical Harper operator), and also some random and sparse Jacobi matrices. It is a challenging problem of mathematical physics to calculate the diffusion exponents for a given Schrödinger operator, in particular, when the quantum motion is anomalous diffusive. In this work we accomplish this for the so-called random polymer models, a random Jacobi matrix described in detail in the next section, and show that
$$\beta_q = \max\Big\{ 0,\; 1 - \frac{1}{2q} \Big\}, \qquad q > 0, \tag{3}$$
see Theorem 3. This result confirms the heuristics and numerical results of Dunlap, Wu and Phillips [DWP] for the random dimer model, the prototype of a random polymer. The latter model was introduced and analyzed in our prior work in collaboration with G. Stolz [JSS], which already contained a rigorous proof of the lower bound $\beta_q^- \geq 1 - \frac{1}{2q}$. In this work we hence focus on the upper bound, which amounts to proving quantitative localization estimates. Next let us discuss this result in the context of prior rigorous work on other one-dimensional models (Jacobi matrices) exhibiting anomalous diffusion.
First of all, the Guarneri bound [Gua] and its subsequent improvements [Com, Las, GS2, GS3, KL, BGT] allow one to estimate the diffusion exponents from below in terms of various fractal dimensions of the spectral measure. However, those results cannot be used to prove the lower bound in (3) because, as was shown by de Bievre and Germinet [BG], the random dimer model has pure-point spectrum so that the Hausdorff dimension vanishes and the Guarneri bound is empty. In fact, the argument in [JSS] is based on a large deviation estimate on the localization length of the eigenstates near the so-called critical energies at which the Lyapunov exponent vanishes. Upper bounds on anomalous quantum diffusion were first proven for Jacobi matrices with self-similar spectra [GS1, BS], and these bounds are even optimal for the so-called Julia matrices. Kiselev, Killip and Last [KKL] presented a technique based on subordinacy theory that controls the spread of a certain portion of the wave packet (not the fastest one and therefore not the moments), and applied it to the Fibonacci model. Tcheremchantsev [Tch] proved tight upper bounds on growing sparse potential Hamiltonians introduced in [JL] and further analyzed in [CM]. Recently, Damanik and Tcheremchantsev [DT] developed a transfer matrix based method for proving upper bounds on the diffusion exponents and also applied it to the Fibonacci model.
Another way to achieve upper bounds on the diffusion exponents in terms of properties of the finite size approximants (Thouless widths and eigenvalue clustering) was recently developed and applied to the Fibonacci operator by Breuer, Last, and Strauss [BLS]. In the models considered in [GS1, BS, DT, BLS, Tch] the anomalous diffusion is closely linked to dimensional properties of the spectral measures, even though the (generalized) eigenfunctions have to be controlled as well. As a result, the transport slows down as the fractal dimension of the spectral measure decreases. The origin of the anomalous transport in the random polymer model is of a different nature. In fact, in the random polymer model only a few, but very extended localized states near the critical energies lead to the growth of the moments. Hence this model illustrates that spectral theory may be of little use for the calculation of the diffusion exponents. This statement is even more true if the dimension of physical space is higher. There are examples of three-dimensional operators with absolutely continuous spectral measures, but subdiffusive quantum diffusion with diffusion exponents as low as $\frac{1}{3}$ [BeS]. The strategy for proving upper bounds advocated in [DT] appears to be more efficient in the present context than prior techniques [GS1, KKL]. We refine and generalize the relevant part in Sect. 3, see in particular Proposition 2. It allows us to give a rather simple proof of our second result, namely Theorem 1, which establishes a logarithmic bound $M_T^q \leq \log(T)^{\beta q}$, $\beta > 2$, under the condition of a uniformly positive Lyapunov exponent. This result is neither new nor fully optimal. However, in the generality we have (any condition on the distribution of randomness, including e.g. Bernoulli) the Aizenman-Molchanov method [AM] cannot be applied, and the only technique previously available to obtain this statement was the multi-scale analysis of [CKM] (see also [BG, DSS]).
Thus our proof is significantly simpler. Moreover, one can argue that it captures the physically relevant effect of localization. Indeed, it was shown by Gordon [Gor] and del Rio, Makarov and Simon [DMS] that a generic rank one perturbation of a model in the regime of strict dynamical localization ($M_T^q$ bounded) leads to singular continuous spectrum, and therefore, by the RAGE theorem, growth of the moments. However, it was shown in [DJLS] that this growth can be at most logarithmic, just as proven in Theorem 1. The proof of Theorem 1 constitutes essentially a part of the proof of Theorem 3. In the next section we present our models and results with technical details. Section 3 contains a general (non-random) strategy for proving the upper bounds. Section 4 provides the proof of Theorem 1 as well as some statements that are used in Sect. 6. In Sect. 5 we obtain probabilistic bounds on the transfer matrices near a critical energy based on the large deviation estimate of [JSS]. Section 6 contains the proof of Theorem 3, that is the identity (3).

2. Models and Results

A Jacobi matrix is an operator $H_\omega$ on $\ell^2(\mathbb{Z})$ associated to the data $\omega = (t_n, v_n)_{n \in \mathbb{Z}}$ of positive numbers $t_n$ and real numbers $v_n$ which we suppose to be both bounded by a constant $C$, and $t_n$ bounded away from 0. Using the Dirac notation $|n\rangle$ for the canonical basis in $\ell^2(\mathbb{Z})$, $H_\omega$ is given by
$$H_\omega |n\rangle = t_{n+1} |n+1\rangle + v_n |n\rangle + t_n |n-1\rangle . \tag{4}$$
Each $\omega$ is called a configuration. The set of all configurations is contained in $\Omega = ([-C, C]^{\times 2})^{\times \mathbb{Z}}$. The left shift $S$ is naturally defined on $\Omega$. A stochastic Jacobi matrix is a family $(H_\omega)_{\omega \in \Omega}$ of Jacobi matrices drawn according to a probability measure $\mathbf{P}$ on
$\Omega$ which is invariant and ergodic w.r.t. $S$. Furthermore, we speak of a random Jacobi matrix if $\mathbf{P}$ has at most finite distance correlations, namely there exists a finite correlation length $L \in \mathbb{N}$ such that $(t_n, v_n)$ and $(t_m, v_m)$ are independent whenever $|n - m| \geq L$. The most prominent example of a random Jacobi matrix is the one-dimensional Anderson model for which $t_n = 1$ and the $v_n$ are independent and identically distributed so that $L = 1$. Random polymer models as studied in [DWP, JSS] and described in more detail below provide an example of a random Jacobi matrix with finite distance correlations (the second crucial feature of these models is that the $(t_n, v_n)$ only take a finite number of values). We will consider the disorder and time averaged moments of the position operator $X$ on $\ell^2(\mathbb{Z})$, denoted by $M_T^q$ as in (1):
$$M_T^q = \int_0^\infty \frac{dt}{T/2}\; e^{-\frac{2t}{T}}\; \mathbf{E}\, \langle 0 |\, e^{\imath H_\omega t}\, |X|^q\, e^{-\imath H_\omega t}\, | 0 \rangle , \qquad q > 0. \tag{5}$$
Here $\mathbf{E}$ denotes the average over $\omega$ w.r.t. $\mathbf{P}$. (Note that upper bounds on the expectation w.r.t. $\mathbf{P}$ yield upper bounds almost surely.) One may replace $|0\rangle$ by any other localized initial state (at least in $\ell^1(\mathbb{Z})$). As discussed in the introduction, it is well-known that the Anderson model exhibits dynamical localization, that is $M_T^q \leq C(q) < \infty$ uniformly in $T$ for all $q > 0$. A byproduct of our analysis of the random polymer model discussed below is a simple proof of the weaker result that $M_T^q$ grows at most logarithmically in $T$ whenever the Lyapunov exponent is strictly positive. In order to define the latter, let us introduce as usual in the analysis of one-dimensional systems the transfer matrices at a complex energy $z$ by
$$T_\omega^z(n, m) = T_{n-1}^z \cdot \ldots \cdot T_m^z , \quad n > m, \qquad T_n^z = \begin{pmatrix} (z - v_n)\, t_n^{-1} & -t_n \\ t_n^{-1} & 0 \end{pmatrix} .$$
Furthermore $T_\omega^z(n, m) = T_\omega^z(m, n)^{-1}$ for $n < m$ and $T_\omega^z(n, n) = \mathbf{1}$. Then the Lyapunov exponent is
$$\gamma(z) = \lim_{N \to \infty} \frac{1}{N}\, \mathbf{E} \log \| T_\omega^z(N, 0) \| .$$
Because the $t_n$ and $v_n$ are uniformly bounded, one shows by estimating the norm of a product of matrices by the product of their norms that for every bounded set $U \subset \mathbb{C}$,
$$\| T_\omega^z(n, m) \| \leq e^{\gamma_1 |n - m|}, \qquad z \in U, \tag{6}$$
where the constant $\gamma_1$ depends on $U$. This implies that $\gamma(z) \leq \gamma_1$ for $z \in U$. For a random Jacobi matrix, uniform lower bounds on $\gamma(z)$ can be proven by the Furstenberg theorem (e.g. [PF]). This applies in particular to the Anderson model (also with a Bernoulli potential), and more generally to random Jacobi matrices with correlation length equal to 1.

Theorem 1. Consider a random Jacobi matrix. Let the (non-random) spectrum be $\sigma(H) \subset (E_0, E_1)$. Suppose that the Lyapunov exponent is strictly positive:
$$\gamma(z) \geq \gamma_0 > 0, \qquad z \in (E_0, E_1). \tag{7}$$
Then for any $\beta > 2$ there exists a constant $C(\beta, q)$ such that
$$M_T^q \leq (\log T)^{q \beta} + C(\beta, q). \tag{8}$$
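To make the definition of the Lyapunov exponent concrete, the following minimal Python sketch estimates $\gamma(z)$ by applying the single-site transfer matrices $T_n^z$ to a vector and averaging the logarithmic growth. It is an illustration only and not part of the argument: it assumes constant hopping $t_n = t$, independent potentials drawn by a user-supplied function, and it replaces the expectation of the matrix norm by the growth rate of a single vector along one sample, which is a common numerical shortcut. The names `transfer_step` and `lyapunov_estimate` are hypothetical.

```python
import math
import random


def transfer_step(vec, z, v, t):
    """Apply T_n^z = [[(z - v)/t, -t], [1/t, 0]] to a 2-vector."""
    x, y = vec
    return ((z - v) / t * x - t * y, x / t)


def lyapunov_estimate(z, potential, hopping=1.0, n_steps=20000, seed=0):
    """Estimate gamma(z) from the growth of one vector under the cocycle.

    `potential` is a function taking a random.Random instance and returning
    the next site potential v_n (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    vec = (1.0, 0.0)
    log_growth = 0.0
    for _ in range(n_steps):
        vec = transfer_step(vec, z, potential(rng), hopping)
        norm = math.hypot(abs(vec[0]), abs(vec[1]))
        log_growth += math.log(norm)
        # Renormalize to avoid overflow; only the accumulated log matters.
        vec = (vec[0] / norm, vec[1] / norm)
    return log_growth / n_steps


if __name__ == "__main__":
    # Free operator outside the spectrum [-2, 2]: exact value is arccosh(z/2).
    print(lyapunov_estimate(3.0, lambda rng: 0.0), math.acosh(1.5))
    # Bernoulli-Anderson sample estimate (value depends on the sample).
    print(lyapunov_estimate(0.0, lambda rng: rng.choice([-1.0, 1.0])))
```

For the free operator the estimate can be compared with the closed form $\mathrm{arccosh}(z/2)$; for random potentials the number returned is only a finite-sample estimate.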
Another self-averaging quantity associated to ergodic Jacobi matrices is the integrated density of states (IDS) $\mathcal{N}(E)$ which can be defined by
$$\int \mathcal{N}(dE)\, f(E) = \mathbf{E}\, \langle 0 | f(H_\omega) | 0 \rangle , \qquad f \in C_0(\mathbb{R}).$$
The Thouless formula (e.g. [PF]) links the Lyapunov exponent to the IDS. It implies that, if (7) holds for real $E \in (E_0, E_1)$, then one also has $\gamma(z) \geq \gamma_0$ for all $z$ with $\Re e(z) \in (E_0, E_1)$. As shown by Theorem 1 and the remark just before it, it is necessary for a random Jacobi matrix to have a correlation length larger than 1 for the moments $M_T^q$ to grow faster than logarithmically, so that the diffusion exponents $\beta_q$ do not vanish. That this actually happens for the random dimer model was discovered by Dunlap, Wu and Phillips [DWP]. Next let us describe in more detail the more general random polymer model considered in [JSS]. Given are two finite sequences $\hat{t}_\pm = (\hat{t}_\pm(0), \ldots, \hat{t}_\pm(L_\pm - 1))$ and $\hat{v}_\pm = (\hat{v}_\pm(0), \ldots, \hat{v}_\pm(L_\pm - 1))$ of real numbers, satisfying $\hat{t}_\pm(l) > 0$ for all $l = 0, \ldots, L_\pm - 1$, $L_\pm \geq 1$. The associated random polymer model is the random Jacobi matrix constructed by random juxtaposition of these sequences and randomizing the origin. More precisely, configurations $\omega \in \Omega$ can be identified with the data of a sequence of signs $(\sigma_n)_{n \in \mathbb{Z}}$ and an integer $0 \leq l \leq L_{\sigma_1} - 1$, via the correspondence $(t_n) = (\ldots, \hat{t}_{\sigma_1}(l), \ldots, \hat{t}_{\sigma_1}(L_{\sigma_1} - 1), \hat{t}_{\sigma_2}(0), \ldots, \hat{t}_{\sigma_2}(L_{\sigma_2} - 1), \hat{t}_{\sigma_3}(0), \ldots)$ and similarly for $(v_n)$, with choice of origin $t_0 = \hat{t}_{\sigma_1}(l)$ and $v_0 = \hat{v}_{\sigma_1}(l)$. The shift is as usual, and the probability $\mathbf{P}$ is the Bernoulli measure with probabilities $p_+$ and $p_- = 1 - p_+$ combined with a randomization for $l$ (cf. [JSS] for details). The correlation length in this model is $L = \max\{L_+, L_-\}$. It is now natural and convenient to consider the polymer transfer matrices
$$T_\pm^z = T^z_{\hat{v}_\pm(L_\pm - 1),\, \hat{t}_\pm(L_\pm - 1)} \cdot \ldots \cdot T^z_{\hat{v}_\pm(0),\, \hat{t}_\pm(0)} , \qquad \text{where} \quad T^z_{v,t} = \begin{pmatrix} (z - v)\, t^{-1} & -t \\ t^{-1} & 0 \end{pmatrix} . \tag{9}$$

Definition 1. An energy $E_c \in \mathbb{R}$ is called critical for the random polymer model $(H_\omega)_{\omega \in \Omega}$ if the polymer transfer matrices $T_\pm^{E_c}$ are elliptic (i.e. $|\mathrm{Tr}(T_\pm^{E_c})| < 2$) or equal to $\pm \mathbf{1}$ and commute
$$[T_-^{E_c}, T_+^{E_c}] = 0 . \tag{10}$$

If $L_\pm = 1$, the model reduces to the Bernoulli-Anderson model and there are no critical energies. The most studied [DWP, Bov, BG] example is the random dimer model for which $L_+ = L_- = 2$ and $\hat{t}_\pm(0) = \hat{t}_\pm(1) = 1$, $\hat{v}_+(0) = \hat{v}_+(1) = \lambda$ and $\hat{v}_-(0) = \hat{v}_-(1) = -\lambda$ for some $\lambda \in \mathbb{R}$. This model has two critical energies $E_c = \lambda$ and $E_c = -\lambda$ as long as $\lambda < 1$. For further examples we refer to [JSS]. It follows from the definition that a simultaneous change of coordinates reduces both $T_+$ and $T_-$ to rotations by angles that are denoted by $\eta_+$ and $\eta_-$. It is immediate from (10) that the Lyapunov exponent vanishes at a critical energy. Because the transfer matrices $T_\pm^z$ are analytic in $z$, it follows that there is a constant $c_0$ such that for all $\epsilon \in \mathbb{C}$ with $|\epsilon| < \epsilon_0$ one has for $n, m \in \mathbb{N}$,
$$\| T_\omega^{E_c + \epsilon}(n, m) \| \leq e^{c_0 |\epsilon|\, |n - m|} . \tag{11}$$
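Definition 1 can be checked mechanically for the random dimer model. The sketch below, which is only an illustration under the stated conventions, builds the two polymer transfer matrices for $L_\pm = 2$, $\hat{t}_\pm = 1$, $\hat{v}_\pm = \pm\lambda$ and tests the two requirements: each matrix is elliptic or equal to $\pm\mathbf{1}$, and the commutator vanishes. All function names and the tolerance are hypothetical choices.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def single_site(energy, v, t):
    """Single-site transfer matrix [[(E - v)/t, -t], [1/t, 0]]."""
    return (((energy - v) / t, -t), (1.0 / t, 0.0))


def polymer_matrix(energy, potentials, hoppings):
    """Ordered product T_{v(L-1),t(L-1)} ... T_{v(0),t(0)}."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for v, t in zip(potentials, hoppings):
        result = mat_mul(single_site(energy, v, t), result)
    return result


def elliptic_or_sign_identity(m, tol):
    """True if |Tr m| < 2 or m equals +1 or -1 up to the tolerance."""
    if abs(m[0][0] + m[1][1]) < 2.0 - tol:
        return True
    for sign in (1.0, -1.0):
        if all(abs(m[i][j] - (sign if i == j else 0.0)) <= tol
               for i in range(2) for j in range(2)):
            return True
    return False


def is_critical_dimer(energy, lam, tol=1e-12):
    """Check Definition 1 for the dimer model at the given energy."""
    t_plus = polymer_matrix(energy, [lam, lam], [1.0, 1.0])
    t_minus = polymer_matrix(energy, [-lam, -lam], [1.0, 1.0])
    ab, ba = mat_mul(t_minus, t_plus), mat_mul(t_plus, t_minus)
    commute = all(abs(ab[i][j] - ba[i][j]) <= tol
                  for i in range(2) for j in range(2))
    return (commute
            and elliptic_or_sign_identity(t_plus, tol)
            and elliptic_or_sign_identity(t_minus, tol))


if __name__ == "__main__":
    print(is_critical_dimer(0.5, 0.5), is_critical_dimer(0.0, 0.5))
```

At $E = \lambda$ the matrix $T_+$ equals $-\mathbf{1}$, so the commutator vanishes trivially, and $T_-$ has trace $4\lambda^2 - 2$, which is elliptic exactly when $|\lambda| < 1$.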
In particular, $|\gamma(E_c + \epsilon)| \leq c_0 |\epsilon|$. However, the correct asymptotics for the Lyapunov exponent is $\gamma(E_c + \epsilon) = O(\epsilon^2)$. This was first shown (non-rigorously) by Bovier for the case of the random dimer model [Bov], but heuristics were already given in [DWP]. The rigorous result about the Lyapunov exponent and also the integrated density of states $\mathcal{N}$ are combined in the following theorem.

Theorem 2 [JSS]. Suppose that $\mathbf{E}(e^{2\imath\eta_\sigma}) \neq 1$ and $\mathbf{E}(e^{4\imath\eta_\sigma}) \neq 1$. Then for $\epsilon \in \mathbb{R}$ and some $D \geq 0$, the Lyapunov exponent of a random polymer model satisfies
$$\gamma(E_c + \epsilon) = D\, \epsilon^2 + O(\epsilon^3), \tag{12}$$
in the vicinity of a critical energy $E_c$. If $\| [T_-^{E_c + \epsilon}, T_+^{E_c + \epsilon}] \| \geq C\, \epsilon$ for some $C > 0$ and small $\epsilon$, one has $D > 0$. Moreover, the IDS $\mathcal{N}$ satisfies
$$\mathcal{N}(E_c + \epsilon) - \mathcal{N}(E_c - \epsilon) = D'\, \epsilon + O(\epsilon^2), \tag{13}$$
for some constant $D' > 0$.

Furthermore [JSS] contains explicit formulas for $D$ and $D'$. The bound $D > 0$ is not explicitly contained in [JSS], but can be efficiently checked using Proposition 1 in [SSS]. The statement about the IDS only requires $\mathbf{E}(e^{2\imath\eta_\sigma}) \neq 1$. Let us remark that the hypothesis $\mathbf{E}(e^{4\imath\eta_\sigma}) \neq 1$ does not hold, for example, in the special case of a random dimer model if $\lambda = 1/\sqrt{2}$. In this situation, one is confronted with an anomaly. Nevertheless, the asymptotics is as in (12) and again one can calculate $D$ explicitly [Sch]. As we did not perform the large deviation analysis of [JSS] in the case of an anomaly, we retain the hypothesis of Theorem 2 throughout. The following is the main result of this work.

Theorem 3. Suppose that $\mathbf{E}(e^{2\imath\eta_\sigma}) \neq 1$ and $\mathbf{E}(e^{4\imath\eta_\sigma}) \neq 1$, and that the random polymer model has a critical energy at which (12) holds with $D > 0$. Then for $q > 0$,
$$\beta_q = \max\Big\{ 0,\; 1 - \frac{1}{2q} \Big\} .$$
As already pointed out, the lower bound $\beta_q^- \geq 1 - \frac{1}{2q}$ was proven in [JSS].
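As a small worked illustration of the formula in Theorem 3, the following Python snippet evaluates $\beta_q = \max\{0, 1 - \frac{1}{2q}\}$ for a few values of $q$. The function name is hypothetical; the point is only to make visible that the exponent vanishes for $q \leq \frac{1}{2}$, equals the diffusive value $\frac{1}{2}$ at $q = 1$, and approaches the ballistic value 1 as $q$ grows.

```python
def diffusion_exponent(q):
    """Evaluate beta_q = max(0, 1 - 1/(2q)) for q > 0."""
    if q <= 0:
        raise ValueError("q must be positive")
    return max(0.0, 1.0 - 1.0 / (2.0 * q))


if __name__ == "__main__":
    for q in (0.25, 0.5, 1.0, 2.0, 10.0):
        print(q, diffusion_exponent(q))
```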
3. A Strategy for Proving Upper Bounds on Dynamics

In this section it is not necessary for the Jacobi matrix to be random or ergodic; hence the index $\omega$ and the average $\mathbf{E}$ are suppressed. Let the notation for the Green's function be
$$G^z(n, m) = \langle n |\, \frac{1}{H - z}\, | m \rangle , \tag{14}$$
where $n, m \in \mathbb{Z}$ and $z \in \mathbb{C}$ is not in the spectrum $\sigma(H)$ of $H$. The starting point of the analysis is to express the time averaged moments (5) in terms of the Green's function
$$M_T^q = \sum_{n \in \mathbb{Z}} |n|^q \int_{\mathbb{R}} \frac{dE}{\pi T}\; |G^{E + \frac{\imath}{T}}(0, n)|^2 . \tag{15}$$
Using the spectral theorem, this well known identity can be checked immediately by a contour integration. In order to decompose the expression on the r.h.s., let us introduce for $0 < \alpha_0 < \alpha_1$ and $E_0 \leq E_1$,
$$M_T^{q, \alpha_0, \alpha_1}(E_0, E_1) = \sum_{T^{\alpha_0} < |n| \leq T^{\alpha_1}} |n|^q \int_{E_0}^{E_1} \frac{dE}{\pi T}\; |G^{E + \frac{\imath}{T}}(0, n)|^2 , \tag{16}$$
and $M_T^{q, 0, \alpha}(E_0, E_1)$ is defined similarly with the sum running over $0 \leq |n| \leq T^\alpha$, $\alpha > 0$. The following result also holds for higher dimensional models.

Proposition 1. Suppose $\sigma(H) \subset (E_0, E_1)$, $\Delta = \mathrm{dist}(\{E_0, E_1\}, \sigma(H)) > 0$, and $\alpha > 1$. Then there exists a constant $C_1 = C_1(\alpha, \Delta, q)$ such that
$$\big| M_T^q - M_T^{q, 0, \alpha}(E_0, E_1) \big| \leq \frac{C_1}{T} .$$

The proof is based on the following Combes-Thomas estimate. Even though standard, its proof is sufficiently short and beautiful to reproduce it.

Lemma 1. Let $\Delta(z) = \mathrm{dist}(z, \sigma(H))$ and $C_2 = (4 \|t\|_\infty)^{-1}$ where $\|t\|_\infty = \sup_{n \in \mathbb{Z}} t_n$. Then
$$| G^z(n, m) | \leq \frac{2}{\Delta(z)}\, \exp\big( - \mathrm{arcsinh}(C_2\, \Delta(z))\, |n - m| \big) .$$
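The Combes-Thomas bound of Lemma 1, read as $|G^z(n,m)| \leq \frac{2}{\Delta(z)} \exp(-\mathrm{arcsinh}(C_2 \Delta(z)) |n-m|)$ with $C_2 = (4\|t\|_\infty)^{-1}$, can be observed numerically on a finite section. The Python sketch below is a hedged illustration, not the authors' method: it assumes a finite Jacobi matrix with Dirichlet truncation, computes one column of the resolvent with the standard tridiagonal (Thomas) elimination, and compares it with the bound evaluated at a lower estimate of $\Delta(z)$, which only weakens the bound. The function names are hypothetical.

```python
import math


def green_column(diag, off, z, m):
    """Solve (H - z) x = e_m for a finite Jacobi matrix H.

    `diag` holds the potentials v_n, `off[i]` the hopping between sites i and
    i + 1. The returned list satisfies x[n] = G^z(n, m).
    """
    size = len(diag)
    main = [d - z for d in diag]
    rhs = [0.0] * size
    rhs[m] = 1.0
    c_prime = [0.0] * size
    d_prime = [0.0] * size
    if size > 1:
        c_prime[0] = off[0] / main[0]
    d_prime[0] = rhs[0] / main[0]
    # Forward elimination.
    for i in range(1, size):
        denom = main[i] - off[i - 1] * c_prime[i - 1]
        if i < size - 1:
            c_prime[i] = off[i] / denom
        d_prime[i] = (rhs[i] - off[i - 1] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * size
    x[-1] = d_prime[-1]
    for i in range(size - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def combes_thomas_bound(dist, t_sup, separation):
    """Right-hand side of Lemma 1 with Delta(z) replaced by `dist`."""
    return 2.0 / dist * math.exp(-math.asinh(dist / (4.0 * t_sup)) * separation)


if __name__ == "__main__":
    size = 60
    # Free operator: spectrum inside [-2, 2], so dist(3, spectrum) >= 1.
    column = green_column([0.0] * size, [1.0] * (size - 1), 3.0, 0)
    for n in (0, 5, 10):
        print(n, abs(column[n]), combes_thomas_bound(1.0, 1.0, n))
```

In this example the true decay rate of the Green's function is considerably faster than the rate guaranteed by the lemma, which is expected: the estimate is uniform in the operator and not tuned to the free case.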
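Proof. For $\eta \in \mathbb{R}$, set $H_\eta = e^{\eta X} H e^{-\eta X}$. A short calculation shows that $\| H_\eta - H \| \leq \|t\|_\infty\, |e^\eta - e^{-\eta}|$. Hence one has
$$\| (H_\eta - z)^{-1} \| \leq \big( \| (H - z)^{-1} \|^{-1} - \| H_\eta - H \| \big)^{-1} .$$
Since $\| (H - z)^{-1} \| \leq \Delta(z)^{-1}$, the choice $|\eta| = \mathrm{arcsinh}\big( \Delta(z) / (4 \|t\|_\infty) \big)$ hence gives $\| (H_\eta - z)^{-1} \| \leq 2 / \Delta(z)$. The bound now follows from $\langle n | (H - z)^{-1} | m \rangle = e^{\eta (m - n)} \langle n | (H_\eta - z)^{-1} | m \rangle$.

The following estimate will be used not only for the proof of Proposition 1, but several times below.

Lemma 2. Let $\Delta, \alpha > 0$, $q \geq 0$ and $N \in \mathbb{N}$. Let $p = [\frac{q+1}{\alpha}]$ where $[\,\cdot\,]$ stands for the integer part. Then
$$\sum_{n > N} n^q\, e^{-\Delta n^\alpha} \leq \frac{2\, p!\; e^{-\Delta N^\alpha}}{\alpha\, \Delta}\, \big( N^\alpha + \Delta^{-1} \big)^p .$$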
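Proof. Bounding the sum by the integral on each interval of monotonicity and then extending this integral to the entire range $[N, \infty)$, we obtain
$$\sum_{n > N} n^q\, e^{-\Delta n^\alpha} \leq 2 \int_N^\infty dx\; x^q\, e^{-\Delta x^\alpha} \leq \frac{2}{\alpha} \int_{N^\alpha}^\infty dy\; y^p\, e^{-\Delta y} = \frac{2}{\alpha}\, \frac{e^{-\Delta N^\alpha}}{\Delta^{1+p}} \sum_{j=0}^{p} \frac{p!}{j!}\, (\Delta N^\alpha)^j .$$
Bounding the sum over $j$ by $p!\, (\Delta N^\alpha + 1)^p$ completes the proof.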
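As a numerical sanity check of the tail estimate in Lemma 2, read here as $\sum_{n>N} n^q e^{-\Delta n^\alpha} \leq \frac{2\,p!}{\alpha \Delta}\, e^{-\Delta N^\alpha} (N^\alpha + \Delta^{-1})^p$ with $p = [\frac{q+1}{\alpha}]$, the following Python sketch compares a truncated version of the left-hand side with the right-hand side for a few parameter choices. This is an illustration under assumptions: the infinite sum is cut off after a fixed number of terms (so the computed value is a slight underestimate), and the function names are hypothetical.

```python
import math


def tail_sum(q, delta, alpha, n_cut, n_terms=200000):
    """Truncated sum over n > n_cut of n^q exp(-delta n^alpha)."""
    return sum(n ** q * math.exp(-delta * n ** alpha)
               for n in range(n_cut + 1, n_cut + 1 + n_terms))


def lemma2_bound(q, delta, alpha, n_cut):
    """Right-hand side of the tail estimate with p = floor((q + 1) / alpha)."""
    p = int((q + 1) / alpha)
    prefactor = 2.0 * math.factorial(p) / (alpha * delta)
    return (prefactor * math.exp(-delta * n_cut ** alpha)
            * (n_cut ** alpha + 1.0 / delta) ** p)


if __name__ == "__main__":
    for params in [(1, 1.0, 1.0, 1), (2, 0.5, 0.5, 10), (0, 0.1, 1.0, 5)]:
        print(params, tail_sum(*params), lemma2_bound(*params))
```

The bound is far from tight in these examples, which is harmless for the applications below where only the exponential factor $e^{-\Delta N^\alpha}$ matters.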
Proof of Proposition 1. We first consider the energies above the spectrum and set $\Delta = \mathrm{dist}(E_1, \sigma(H))$. Due to the previous two lemmata and if $p = [q + 1]$,
$$M_T^{q, 0, \infty}(E_1, \infty) \leq \int_0^\infty \frac{dE}{\pi T} \sum_{|n| \geq 1} |n|^q\, \frac{2}{\Delta + E}\, \exp\big( - \mathrm{arcsinh}(C_2 (\Delta + E))\, |n| \big) \leq \int_0^\infty \frac{dE}{\pi T}\, \frac{8\, p!}{\Delta + E}\, \mathrm{arcsinh}(C_2 (\Delta + E))^{-(p+1)} .$$
Since $\mathrm{arcsinh}(y) \geq \ln(y)$ for large $y$, this shows that $M_T^{q, 0, \infty}(E_1, \infty) \leq C / T$ for some constant $C = C(q, \Delta)$. A similar bound holds for $M_T^{q, 0, \infty}(-\infty, E_0)$. Now using the imaginary part of the energy in Lemma 1 and the bound $\mathrm{arcsinh}(y) \geq y$ for sufficiently small $y \geq 0$, we obtain
$$M_T^{q, \alpha, \infty}(E_0, E_1) \leq \int_{E_0}^{E_1} dE\; \frac{2}{\pi} \sum_{|n| > T^\alpha} |n|^q\, \exp\big( - \mathrm{arcsinh}(C_2 / T)\, |n| \big) \leq \frac{4\, (E_1 - E_0)\, p!\; e^{-C_2 T^{\alpha - 1}}}{\pi\, C_2 / T}\, \big( T^\alpha + T / C_2 \big)^p .$$
For $\alpha > 1$, this decreases faster than any power of $T$. Combining these estimates implies the proposition.

According to Proposition 1 and Eq. (16), one now needs a good bound on the decay (in $n$) of the Green's function for complex energies in the vicinity of the spectrum. As shown by Damanik and Tcheremchantsev [DT], such bounds can be obtained for Jacobi matrices in a very efficient way in terms of the transfer matrices. Here we give a streamlined proof of this statement which also works for arbitrary Jacobi matrices (the kinetic part is not necessarily the discrete Laplacian) and does not contain energy dependent constants as in [DT]. It will be convenient to first consider such bounds for the half-line problem, which is operator (4) on $\ell^2(\mathbb{N})$ with Dirichlet boundary conditions. This operator and its Green's function are denoted by $\hat{H}$ and $\hat{G}^z(n, m)$.

Proposition 2. Set $\tau = \max\{ \|t\|_\infty^2,\, 1,\, \|t^{-1}\|_\infty^2 \}$ and $z = E + \frac{\imath}{T}$. One has the bounds
$$\sum_{n > N} |\hat{G}^z(0, n)|^2 \leq \frac{4\, \tau^3\, T^4}{\max_{0 \leq n \leq N} \| T^z(n, 0) \|^2} ,$$
and, for $T \geq 1$,
$$\sum_{|n| > N} |G^z(0, n)|^2 \leq \frac{16\, \tau^4\, T^6}{\max_{0 \leq |n| \leq N} \| T^z(n, 0) \|^2} .$$
Proof. Let $\Gamma_N$ be the decoupling operator at $N$ defined by $\Gamma_N = t_{N+1} \big( |N\rangle \langle N+1| + |N+1\rangle \langle N| \big)$ and set $\hat{G}_N^z = (\hat{H} - z - \Gamma_N)^{-1}$ and $\hat{G}^z = (\hat{H} - z)^{-1}$. The resolvent identity reads $\hat{G}^z = \hat{G}_N^z - \hat{G}_N^z\, \Gamma_N\, \hat{G}^z$. Thus, with the notation $\hat{G}_N^z(n, m) = \langle n | \hat{G}_N^z | m \rangle$,
$$\sum_{n > N} |\hat{G}^z(0, n)|^2 = \sum_{n > N} |\hat{G}_N^z\, \Gamma_N\, \hat{G}^z(0, n)|^2 = |\hat{G}_N^z(0, N)\, t_{N+1}|^2 \sum_{n > N} |\hat{G}^z(N+1, n)|^2 \leq \|t\|_\infty^2\, T^2\, |\hat{G}_N^z(0, N)|^2 ,$$
since $T^{-1} = \Im m(z)$ and $\| \hat{G}^z \| \leq T$. As the l.h.s. is decreasing in $N$, we therefore have
$$\sum_{n > N} |\hat{G}^z(0, n)|^2 \leq \|t\|_\infty^2\, T^2 \min_{0 \leq n \leq N} |\hat{G}_n^z(0, n)|^2 . \tag{17}$$
Now let $\Pi_N = \sum_{n=0}^{N} |n\rangle \langle n|$ be the projection on the states on the first $N + 1$ sites and set $\hat{H}_N = \Pi_N \hat{H} \Pi_N$. Then $\hat{G}_N^z(n, m)$, $0 \leq n, m \leq N$, are the matrix elements of the inverse of an $(N+1) \times (N+1)$ matrix $\hat{H}_N - z$. The matrix elements are closely linked to the transfer matrix
$$T^z(N+1, 0) = \begin{pmatrix} a & b \\ c & d \end{pmatrix} ,$$
since $a, b, c, d$, when multiplied by $\prod_{n=1}^{N} t_n$, are the determinants of certain minors of $z - \hat{H}_N$. Namely by Cramer's rule (or, alternatively, by the Stieltjes continued fraction expansion and geometric resolvent identity) the following identities hold for $z \notin \sigma(\hat{H}_N)$:
$$\hat{G}_N^z(0, 0) = \frac{1}{t_0^2}\, \frac{b}{a} , \qquad \hat{G}_N^z(N, N) = - \frac{c}{a} , \qquad \hat{G}_{N-1}^z(0, 0) = \frac{1}{t_0^2}\, \frac{d}{c} ,$$
and
$$\hat{G}_N^z(0, N) = - \frac{1}{t_0}\, \frac{1}{a} , \qquad \hat{G}_{N-1}^z(0, N-1) = - \frac{1}{t_0\, t_N}\, \frac{1}{c} .$$
Therefore $|b| \leq t_0^2\, T\, |a|$, $|c| \leq T\, |a|$ and $|d| \leq t_0^2\, T\, |c|$. As the matrix norm is bounded by the Hilbert-Schmidt norm, it follows that
$$\| T^z(N+1, 0) \|^2 \leq \frac{4\, T^2\, \tau^2}{\min\{ |\hat{G}_N^z(0, N)|^2 ,\; |\hat{G}_{N-1}^z(0, N-1)|^2 \}} .$$
By (17) this proves the first inequality. The second one follows from the first one (coupled with the same statement for the left half-line) by observing that the resolvent identity gives $G^z(0, n) = \hat{G}^z(0, n) - G^z(0, -1)\, t_0\, \hat{G}^z(0, n)$. Therefore $|G^z(0, n)| \leq (1 + T\, t_0)\, |\hat{G}^z(0, n)|$ which implies the second bound.
4. Logarithmic Bounds in the Localization Phase

In this section we provide the proof of Theorem 1 and hence suppose throughout that the stated hypotheses hold. The main idea is to use the given positivity of the Lyapunov exponent (7) and combine it with the given uniform upper bound (6) in order to deduce good probabilistic estimates on the growth of the transfer matrices. This growth in turn allows one to bound the Green's function due to Proposition 2, which then readily leads to the logarithmic upper bound on the moments. Let us set $U = \{ z \in \mathbb{C} \mid E_0 \leq \Re e(z) \leq E_1 ,\; |\Im m(z)| \leq 1 \}$.

Lemma 3. For $z \in U$ and $N \in \mathbb{N}$, the set
$$\Omega_N(z) = \big\{ \omega \in \Omega \;\big|\; \| T_\omega^z(N, 0) \|^2 \geq e^{\gamma_0 N} \big\}$$
satisfies
$$\mathbf{P}(\Omega_N(z)) \geq \frac{\gamma_0}{2 \gamma_1 - \gamma_0} .$$

Proof. Let us set $P = \mathbf{P}(\Omega_N(z))$. Due to (7), the subadditivity of the transfer-matrix cocycle and the bound (6), it follows that
$$\gamma_0 \leq \frac{1}{N}\, \mathbf{E} \log( \| T_\omega^z(N, 0) \| ) \leq (1 - P)\, \frac{1}{2}\, \gamma_0 + P\, \gamma_1 ,$$
with $\gamma_1$ defined by (6) using $U$ as above. This directly implies the result.
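Lemma 4. Let $z \in U$ and $N \in \mathbb{N}$. Then there is a constant $C_3 = C_3(\gamma_0, \gamma_1)$ such that the set
$$\hat{\Omega}_N(z) = \Big\{ \omega \in \Omega \;\Big|\; \max_{0 \leq n \leq N} \| T_\omega^z(n, 0) \|^2 \geq e^{C_3 N^{1/2}} \Big\}$$
satisfies
$$\mathbf{P}(\hat{\Omega}_N(z)) \geq 1 - e^{- C_3 N^{1/2}} .$$

Proof. Let us split $N$ into $\frac{N}{N_0}$ pieces of length $N_0$ (here and in the sections below, we suppose without giving further details that there is an integer number of pieces and that the boundary terms are treated separately). By the stationarity, on each piece $[j N_0 + 1, (j+1) N_0)$, Lemma 3 with $N = N_0$ applies. As the pieces are independent, we deduce
$$\mathbf{P}\Big( \Big\{ \omega \in \Omega \;\Big|\; \max_{0 \leq j \leq N/N_0} \| T_\omega^z((j+1) N_0, j N_0 + 1) \|^2 \leq e^{\gamma_0 N_0} \Big\} \Big) \leq (1 - p_0)^{\frac{N}{N_0}} ,$$
where $p_0 = \gamma_0 / (2 \gamma_1 - \gamma_0)$. Furthermore $T_\omega^z((j+1) N_0, j N_0 + 1) = T_\omega^z((j+1) N_0, 0)\, T_\omega^z(j N_0, 0)^{-1}$. As $A = BC$ implies either $\|B\| \geq \|A\|^{\frac{1}{2}}$ or $\|C\| \geq \|A\|^{\frac{1}{2}}$ for arbitrary matrices, and $\|A^{-1}\| = \|A\|$ for $A \in \mathrm{SL}(2, \mathbb{C})$, it therefore follows that
$$\mathbf{P}\Big( \Big\{ \omega \in \Omega \;\Big|\; \max_{0 \leq j \leq N/N_0} \| T_\omega^z(j N_0, 0) \|^2 \geq e^{\frac{1}{2} \gamma_0 N_0} \Big\} \Big) \geq 1 - (1 - p_0)^{\frac{N}{N_0}} .$$
Choosing $N_0 = c N^{\frac{1}{2}}$ with adequate $c$ concludes the proof.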
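The two matrix facts used in the proof above, namely that $A = BC$ forces $\|B\| \geq \|A\|^{1/2}$ or $\|C\| \geq \|A\|^{1/2}$, and that $\|A^{-1}\| = \|A\|$ when $\det A = 1$, are easy to observe numerically for $2 \times 2$ matrices. The Python sketch below is only an illustration: it computes the operator norm from the closed-form singular values of a $2 \times 2$ matrix and tests both facts on single-site transfer matrices with $t_n = 1$ (an assumption made for simplicity). All names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def op_norm(a):
    """Largest singular value of a 2x2 complex matrix.

    Uses s1^2 + s2^2 = (squared Hilbert-Schmidt norm) and s1 * s2 = |det|.
    """
    (p, q), (r, s) = a
    frob2 = abs(p) ** 2 + abs(q) ** 2 + abs(r) ** 2 + abs(s) ** 2
    det_abs = abs(p * s - q * r)
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det_abs * det_abs, 0.0))
    return math.sqrt((frob2 + disc) / 2.0)


def inverse_sl2(a):
    """Inverse of a 2x2 matrix of determinant 1."""
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))


def site_matrix(z, v):
    """Single-site transfer matrix with hopping 1; its determinant is 1."""
    return ((z - v, -1.0), (1.0, 0.0))


if __name__ == "__main__":
    z = 0.7 + 0.2j
    b, c = site_matrix(z, 0.3), site_matrix(z, -1.1)
    a = mat_mul(b, c)
    print(op_norm(a), op_norm(inverse_sl2(a)))
    print(max(op_norm(b), op_norm(c)), math.sqrt(op_norm(a)))
```

The first fact is just submultiplicativity of the norm, $\|A\| \leq \|B\|\,\|C\|$; the second holds because the two singular values of a determinant-one matrix are reciprocal to each other.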
Since the above lemma applies equally well to negative integers $N$, it is a direct corollary of Lemma 4 and Proposition 2 that, for $T$ sufficiently large,
$$\mathbf{E} \sum_{|n| > N} |G_\omega^z(0, n)|^2 \leq 32\, \tau^4\, T^6\, e^{- C_3 N^{1/2}} , \qquad z = E + \frac{\imath}{T} \in U. \tag{18}$$
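The following lemma, holding for arbitrary ergodic families of Jacobi matrices, is useful for bounding a sum similar to the one in (18), but over $|n| \leq N$. Let $B_\mu(z) = \int \mu(dE)\, (z - E)^{-1}$ denote the Borel transform of a measure $\mu$.

Lemma 5. For any $E \in \mathbb{R}$, $T > 0$ and $N \geq 1$, one has
$$\sum_{0 \leq |n| \leq N} \frac{|n|^q}{\pi T}\; \mathbf{E}\, |G_\omega^z(0, n)|^2 \leq \frac{N^q}{\pi}\; \Im m\, B_{\mathcal{N}}(z), \qquad z = E + \frac{\imath}{T} . \tag{19}$$
Furthermore, for any $E_0 < E_1$,
$$\sum_{0 \leq |n| \leq N} |n|^q \int_{E_0}^{E_1} \frac{dE}{\pi T}\; \mathbf{E}\, |G_\omega^{E + \frac{\imath}{T}}(0, n)|^2 \leq N^q . \tag{20}$$

Proof. One has
$$\sum_{0 \leq |n| \leq N} \frac{|n|^q}{\pi T}\; \mathbf{E}\, |G_\omega^z(0, n)|^2 \leq N^q\, \frac{1}{\pi T}\; \mathbf{E} \sum_{n \in \mathbb{Z}} \langle 0 |\, \frac{1}{H_\omega - E - \frac{\imath}{T}}\, | n \rangle \langle n |\, \frac{1}{H_\omega - E + \frac{\imath}{T}}\, | 0 \rangle \leq N^q\; \mathbf{E}\, \frac{1}{\pi T}\, \langle 0 |\, \frac{1}{(H_\omega - E)^2 + \frac{1}{T^2}}\, | 0 \rangle ,$$
which by the spectral theorem gives (19). Using
$$\int_{E_0}^{E_1} \frac{dE}{\pi}\; \Im m\, B_{\mathcal{N}}(E + \imath T^{-1}) \leq \int_{\mathbb{R}} \mathcal{N}(de) \int_{\mathbb{R}} \frac{dE}{\pi T}\, \frac{1}{(e - E)^2 + \frac{1}{T^2}} = \int_{\mathbb{R}} \mathcal{N}(de) = 1 ,$$
the inequality (20) follows upon integrating (19).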
Proof of Theorem 1. According to Proposition 1 it remains to bound $\mathbf{E}\, M_T^{q, 0, \alpha}(E_0, E_1)$ for $\alpha > 1$. We further split this into two contributions:
$$M_T(1) = \sum_{0 \leq |n| \leq (\log T)^\beta} |n|^q \int_{E_0}^{E_1} \frac{dE}{\pi T}\; \mathbf{E}\, |G_\omega^{E + \frac{\imath}{T}}(0, n)|^2 ,$$
and $M_T(2)$ corresponding to the sum over $(\log T)^\beta < |n| \leq T^\alpha$. The bound (20) with $N = (\log T)^\beta$ immediately gives $M_T(1) \leq (\log T)^{\beta q}$. The second contribution can be bounded using (18):
$$M_T(2) \leq \frac{64\, \tau^4\, T^5}{\pi} \sum_{n \geq (\log T)^\beta} n^q\, e^{- C_3 n^{1/2}} ,$$
which is bounded by a constant $C(\beta, q)$ due to Lemma 2 as long as $\beta > 2$.

The above proof applies equally well to the half-line problem with arbitrary boundary condition, the only difference being that the IDS $\mathcal{N}$ has to be replaced by $\mathbf{E}(\mu_\omega(dE))$, where $\mu_\omega$ is the spectral measure of $|0\rangle$ and $\hat{H}$.
5. Probabilistic Estimates Near a Critical Energy

In this section, we first derive more quantitative versions of Lemmata 3 and 4 by replacing the input (6) and (7) by the estimates (11) and (12). However, these estimates are not sufficient for the proof of Theorem 3. In fact, one can further improve the lemmata by replacing the uniform upper bound (11) by a probabilistic one, deduced from a large deviation estimate from [JSS] recalled below and showing that the transfer matrices grow no more than given by the Lyapunov exponent with high probability. For the sake of notational simplicity, we suppose that $E_c = 0$. Furthermore, according to (12) and the Thouless formula we may choose positive $d < D$ and $\epsilon_0$ such that
$$\gamma(z) \geq d\, \epsilon^2 , \qquad \text{for } z = \epsilon + \imath \delta \text{ with } |\epsilon| < \epsilon_0 . \tag{21}$$
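In order to further simplify notation, we also assume that $\delta, \epsilon > 0$ even though all estimates hold with $|\delta|$ and $|\epsilon|$.

Lemma 6. For $z = \epsilon + \imath \delta$ with $\delta \leq \epsilon < \epsilon_0$ introduce the set
$$\Omega_N(z) = \big\{ \omega \in \Omega \;\big|\; \| T_\omega^z(N, 0) \|^2 \geq e^{d\, \epsilon^2 N} \big\} . \tag{22}$$
Then $\mathbf{P}(\Omega_N(z)) \geq c_1\, \epsilon$.

Proof. This is an immediate corollary of Lemma 3 with $c_1 = d / (2 c_0)$, where $c_0$ is introduced in (11).

Lemma 7. Let $z = \epsilon + \imath \delta$ with $\delta \leq \epsilon < \epsilon_0$. Then there exists a constant $c_2$ such that the set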
z 2
3 N
ˆ N (z) = ω ∈ max Tω (n, 0) ≥ e 0≤n≤N
satisfies ˆ N (z)) ≥ 1 − e− c2 N . P( Proof. Let us split N into NN0 pieces of length N0 and follow the proof of Lemma 4 invoking Lemma 6 instead of Lemma 3, giving
N
1 2N −c N z 2 d
0
≥ 1 − (1 − c1 ) N0 ≥ 1 − e 1 N0 . P ω ∈ max Tω (n, 0) ≥ e 2 0≤n≤N
Choosing N0 =
2
d
shows that one may take c2 = c1 d/2.
Certainly other choices of N0 are possible in the previous proof, but the present one leading to Lemma 7 implies the following estimate, which is sufficient in order to deal with one of the terms in the next section (a boundary term of energies close to 0 ). Corollary 1. Let z = + ıδ with δ ≤ < 0 . There is a constant c3 such that 1 3 ≤ e−c3 N . E z 2 max0≤|n|≤N T (n, 0) Now we turn to the refined statements and start by recalling the following
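Theorem 4. [JSS] Suppose that $\mathbf{E}(e^{2\imath\eta_\sigma}) \neq 1$ and $0 < \alpha \leq \frac{1}{2}$. Then there exist constants $c_4, c_5, c_6, c_7$ such that the set
$$\Omega_N^\alpha = \Big\{ \omega \in \Omega \;\Big|\; \max_{0 \leq n, m \leq N} \| T_\omega^{\epsilon + \imath \delta}(n, m) \| \leq e^{c_4} \quad \forall\; |\delta| \leq c_5 N^{-1} ,\; |\epsilon| \leq N^{-\frac{1}{2} - \alpha} \Big\}$$
satisfies
$$\mathbf{P}(\Omega_N^\alpha) \geq 1 - c_6\, e^{- c_7 N^\alpha} .$$

For a fixed energy $z$, this can be extended to length scales $N$ beyond the localization length (inverse Lyapunov exponent).

Lemma 8. For $z = \epsilon + \imath \delta$ with $\delta \leq c_5\, \epsilon^2$ and $\epsilon \geq N^{-\frac{1}{2} - \alpha}$, the set
$$\Omega_N^\alpha(z) = \Big\{ \omega \in \Omega \;\Big|\; \max_{0 \leq n, m \leq N} \| T_\omega^z(n, m) \| \leq e^{c_4 N \epsilon^{2 (1 - 2\alpha)}} \Big\}$$
satisfies
$$\mathbf{P}(\Omega_N^\alpha(z)) \geq 1 - c_6\, N\, e^{- c_7\, \epsilon^{-\alpha}} .$$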
Proof. We split N into NN0 pieces of length N0 = − 2α+1 . The condition ≥ N − 2 −α insures that N0 ≤ N . By the stationarity, on each piece we may apply Theorem 4 (with 1
− 1 −α
N0 instead of N ) because ≤ N0 2 . For the j th piece, denote the set appearing in α, j α, j Theorem 4 by N . For any ω ∈ ∩ j=1,..., N N one then has for δ ≤ c5 N0−1 (and hence also δ ≤ c5 2 ) the estimate max
0≤n,m≤N
Therefore ∩ j=1,...,
N N0
N0
Tωz (n, m) ≤ e
c4
N N0
≤ e c4 N
2(1−2α)
.
α, j
N ⊂ αN (z). We hence deduce from Theorem 4 that
P(αN (z)c ) ≤ c6
N −c7 N α −α 0 ≤ c N e −c7
e , 6 N0
which is precisely the statement of the lemma.
This last lemma can now be used in order to improve Lemma 6 in the range δ < c 2 . 1
Lemma 9. Let z = + ıδ with δ ≤ c5 2 and N − 2 −α ≤ ≤ N −α , 0 < α ≤ 21 . Then the set N (z) defined in (22) satisfies for some constant c8 , P( N (z)) ≥ c8 4α . Proof. We argue as in the proof of Lemma 3. Let us set again P = P( N (z)), and estimate separately the contribution from the complement of N (z), αN (z) and its complement. Due to (21), Lemma 8 and the a priori bound (11) (used on the complement of αN (z)), it follows that d 2 ≤ (1 − P)
1 −α d 2 + P c4 2(1−2α) + 2 c0 c6 N 2 e−c7 . 2
Hence P ≥
4α
d − 4 c0 c6 N e−c7
2 c4 − d 4α
−α
.
The hypothesis ≤ N −α implies the result (it would be enough to assume ≤ log(N ) p for some p). As the final preparatory step for the next section, we improve Lemma 7 in the range δ < c 2 , by invoking Lemma 9 in its proof. 1
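Lemma 10. Let $z = \epsilon + \imath\delta$ with $\delta \leq c_5\epsilon^2$ and $N^{-\frac12-\alpha} \leq \epsilon \leq N^{-\alpha}$, $0 < \alpha \leq \frac12$. Then the set
\[ \hat\Omega^\alpha_N(z) = \left\{ \omega \in \Omega \;\middle|\; \max_{0\leq n\leq N} \|T^z_\omega(n,0)\|^2 \geq e^{N^{1-\alpha}\epsilon^{2(1+2\alpha)}} \right\} \]
satisfies for some constant $c_9$
\[ \mathbf{P}(\hat\Omega^\alpha_N(z)) \geq 1 - e^{-c_9 N^\alpha}. \]

Proof. Splitting $N$ into $\frac{N}{N_0}$ pieces of length $N_0$ and arguing exactly as in Lemma 4, invoking Lemma 9 instead of Lemma 3, we obtain
\[ \mathbf{P}\left( \left\{ \omega \in \Omega \;\middle|\; \max_{0\leq n\leq N} \|T^z_\omega(n,0)\|^2 \geq e^{\frac12 d\,\epsilon^2 N_0} \right\} \right) \geq 1 - \left(1 - c_8\,\epsilon^{4\alpha}\right)^{\frac{N}{N_0}}. \]
Choosing $N_0 = (2N^{1-\alpha}\epsilon^{4\alpha})/d$ shows that one may take $c_9 = c_8 d/2$.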
Lemma 10 implies the following estimate, which is the main result of this section and will be used in the next one.

Corollary 2. Let $z = \epsilon + \imath\delta$. If $\delta \leq c_5\epsilon^2$ and $N^{-\frac12-\alpha} \leq \epsilon \leq N^{-\alpha}$, $0 < \alpha \leq \frac12$, one has
\[ \mathbf{E}\left( \frac{1}{\max_{0\leq|n|\leq N} \|T^z(n,0)\|^2} \right) \leq e^{-N^{1-\alpha}\epsilon^{2(1+2\alpha)}} + e^{-c_9 N^\alpha}. \]

6. Proof of Upper Bound for Random Polymer Models

In this section we complete the proof of Theorem 3. For this purpose, we follow the strategy discussed in Sect. 3 and consider $M_T^{q,\alpha_0,\alpha_1}(E_0,E_1)$ defined as in (16), but with a disorder average $\mathbf{E}$. By Proposition 1 it is sufficient to bound $M_T^{q,0,1+\alpha}(E_0,E_1)$ for $\alpha > 0$ if $(E_0,E_1)$ contains the spectrum. Moreover, energies bounded away from critical energies have a strictly positive Lyapunov exponent [BG]. By the results of Sect. 4, these energies hence lead at most to logarithmic growth in time, and therefore give no contribution to the diffusion exponents $\beta_q$. Thus we are left to deal with energy intervals around the critical energies. All of them are treated the same way, so we focus on one of them. We suppose that $E_c = 0$ and consider only the energy interval $[0,\epsilon_0]$ with $\epsilon_0$ chosen as in (21); the other side $[-\epsilon_0,0]$ is treated similarly. Furthermore we split the contribution as follows:
\[ M_T^{q,0,1+\alpha}(0,\epsilon_0) = M_T^{q,0,1+\alpha}(0,T^{-\eta}) + M_T^{q,0,1+\alpha}(T^{-\eta},T^{-\alpha}) + M_T^{q,0,1+\alpha}(T^{-\alpha},\epsilon_0), \]
where $\eta = \min\{q, \frac12\}$. This is a good choice due to the following lemma, showing that the contribution $M_T^{q,0,1+\alpha}(0,T^{-\eta})$ is bounded by the diffusion exponent as given in Theorem 3.

Lemma 11. For some constant $C_1$, one has
\[ M_T^{q,0,1+\alpha}(0,T^{-\eta}) \leq C_1\, T^{q-\eta+\alpha q}. \]

Proof. By (19) we have
\[ M_T^{q,0,1+\alpha}(0,T^{-\eta}) \leq T^{q(1+\alpha)} \int_0^{T^{-\eta}} d\epsilon \; \Im m\, B_{\mathcal N}(\epsilon + \imath T^{-1}). \]
The estimate now follows from (13) and Proposition 3 in the Appendix, with $\epsilon_0 = T^{-\eta}$.

Next let us consider the boundary term $M_T^{q,0,1+\alpha}(T^{-\alpha},\epsilon_0)$.

Lemma 12. For some constant $C_2 = C_2(\alpha)$, one has
\[ M_T^{q,0,1+\alpha}(T^{-\alpha},\epsilon_0) \leq T^{4\alpha q} + C_2. \]
Proof. Let us split the sum over $n$ appearing in the definition of $M_T^{q,0,1+\alpha}(T^{-\alpha},\epsilon_0)$ into one over $|n| \leq T^{4\alpha}$ and the other over $|n| > T^{4\alpha}$. The first one can be bounded by $T^{4\alpha q}$ using (20). In the second one we apply Proposition 2 combined with Corollary 1 in order to bound the Green's function. This shows that $M_T^{q,0,1+\alpha}(T^{-\alpha},\epsilon_0)$ is bounded above by
\[ T^{4\alpha q} + \int_{T^{-\alpha}}^{\epsilon_0} \frac{d\epsilon}{\pi T} \sum_{|n| > T^{4\alpha}} |n|^q \, 16\,\tau^4\, T^6\, e^{-c_3\,\epsilon^3 |n|}. \]
In the second term, the sum over $n$ is bounded by Lemma 2. As $\epsilon > T^{-\alpha}$, it follows that the second term is bounded by $C\,T^p e^{-c_3 T^\alpha}$ for some $C, p > 0$. Hence this term gives the second contribution in the bound.

Lemma 13. For $\eta = \min\{q,\frac12\}$ and some constants $C_3 = C_3(\alpha)$ and $C_4$, one has
\[ M_T^{q,0,\infty}(T^{-\eta},T^{-\alpha}) \leq C_3 + C_4\, T^{6q\alpha} \cdot \begin{cases} T^{q-\frac12}, & q \geq \frac12, \\ T^{\alpha(2q-1)}, & q \leq \frac12. \end{cases} \]

Proof. We first split $M_T^{q,0,\infty}(T^{-\eta},T^{-\alpha})$ into two contributions $M_T(1)$ and $M_T(2)$, the first containing all the summands with $|n|$ smaller than the (energy dependent) localization length:
\[ M_T(1) = \int_{T^{-\eta}}^{T^{-\alpha}} \frac{d\epsilon}{\pi T} \sum_{1\leq|n|\leq \epsilon^{-2}T^{4\alpha}} |n|^q \; \mathbf{E}\, \big|G_\omega^{\epsilon+\frac{\imath}{T}}(0,n)\big|^2, \]
and the second $M_T(2)$ containing the sum over $|n| > \epsilon^{-2}T^{4\alpha}$, corresponding to the summands beyond the localization length. $M_T(1)$ is bounded using (19):
\[ M_T(1) \leq T^{4\alpha q} \int_{T^{-\eta}}^{T^{-\alpha}} \frac{d\epsilon}{\pi T}\; \epsilon^{-2q} \int \mathcal{N}(dE)\, \frac{1}{(E-\epsilon)^2 + T^{-2}}. \]
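In order to bound the factor $\epsilon^{-2q}$, let us split the integral over $\epsilon$ into $\frac{\eta-\alpha}{\alpha}$ pieces:
\[ M_T(1) \leq T^{4\alpha q} \sum_{j=1}^{\frac{\eta-\alpha}{\alpha}} T^{2q(\eta-(j-1)\alpha)} \int \mathcal{N}(dE) \int_{T^{-\eta+(j-1)\alpha}}^{T^{-\eta+j\alpha}} \frac{d\epsilon}{\pi T}\, \frac{1}{(E-\epsilon)^2 + T^{-2}} \]
\[ \phantom{M_T(1)} = T^{4\alpha q} \sum_{j=1}^{\frac{\eta-\alpha}{\alpha}} T^{2q(\eta-(j-1)\alpha)} \int_{T^{-\eta+(j-1)\alpha}}^{T^{-\eta+j\alpha}} \frac{d\epsilon}{\pi}\; \Im m\, B_{\mathcal N}(\epsilon + \imath T^{-1}). \]
Using Proposition 3 we obtain
\[ M_T(1) \leq C\, T^{4\alpha q} \sum_{j=1}^{\frac{\eta-\alpha}{\alpha}} T^{2q(\eta-(j-1)\alpha)}\, T^{-\eta+j\alpha} = C\, T^{6q\alpha}\, T^{(2q-1)\eta} \sum_{j=1}^{\frac{\eta-\alpha}{\alpha}} T^{(1-2q)\alpha j}. \]
For $q \geq 1/2$, we use the bound $T^{(1-2q)\alpha j} \leq 1$, showing that the sum is bounded by $\frac{\eta-\alpha}{\alpha}$. For $q \leq 1/2$ we bound each summand by $T^{(1-2q)\alpha j} \leq T^{(1-2q)(\eta-\alpha)}$. This gives the second contribution in the lemma. It remains to show that $M_T(2) \leq C_3$. Due to Proposition 2 and Corollary 2,
\[ M_T(2) \leq \int_{T^{-\eta}}^{T^{-\alpha}} \frac{d\epsilon}{\pi T} \sum_{|n| > \epsilon^{-2}T^{4\alpha}} |n|^q \, 16\,\tau^4\, T^6 \left( e^{-|n|^{1-\alpha}\epsilon^{2(1+2\alpha)}} + e^{-c_9 |n|^\alpha} \right). \]
Using Lemma 2 it is now elementary to bound $M_T(2)$ by a constant.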
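The exponent bookkeeping in the bound on $M_T(1)$ is easy to get wrong, so the following minimal sketch re-checks it with exact rational arithmetic. It only verifies the arithmetic of the powers of $T$ stated above; the function names are hypothetical and constants such as $C$ are ignored.

```python
from fractions import Fraction as F

def lhs_exponent(q, alpha, eta, j):
    # Power of T in  T^{4 alpha q} * T^{2q(eta-(j-1)alpha)} * T^{-eta + j alpha}
    return 4 * alpha * q + 2 * q * (eta - (j - 1) * alpha) + (-eta + j * alpha)

def rhs_exponent(q, alpha, eta, j):
    # Power of T in  T^{6 q alpha} * T^{(2q-1) eta} * T^{(1-2q) alpha j}
    return 6 * q * alpha + (2 * q - 1) * eta + (1 - 2 * q) * alpha * j

def extra_exponent(q, alpha):
    """Power of T multiplying T^{6 q alpha} after bounding the sum over j
    (the number of summands is treated as a constant)."""
    eta = min(q, F(1, 2))
    if q >= F(1, 2):
        # each summand is bounded by 1, leaving T^{(2q-1) eta}
        return (2 * q - 1) * eta
    # each summand is bounded by T^{(1-2q)(eta - alpha)}
    return (2 * q - 1) * eta + (1 - 2 * q) * (eta - alpha)
```

For $q \geq 1/2$ the function `extra_exponent` returns $q - \frac12$, and for $q \leq 1/2$ it returns $\alpha(2q-1)$, in agreement with the two cases in the statement of Lemma 13.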
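Combining Lemmata 11, 12 and 13, and recalling that $\alpha$ can be taken arbitrarily close to 0, proves Theorem 3.

Appendix: Estimates on the Borel Transform

In Sect. 6 we used well-known estimates on the Borel transform $B_\mu(z) = \int \mu(dE)\,(z-E)^{-1}$ of a measure $\mu$. For the sake of completeness we provide a short proof.

Proposition 3. If a measure $\mu$ satisfies at some $E$ the bound $\mu([E-\epsilon, E+\epsilon]) < C\,\epsilon$ for all $\epsilon > 0$, then for any finite positive $\delta$ and $\epsilon_0$,
\[ \Im m\, B_\mu(E + \imath\delta) < \frac{\pi}{2}\, C, \qquad \int_0^{\epsilon_0} d\epsilon \; \Im m\, B_\mu(E + \epsilon + \imath\delta) < \pi^2\, C\, \epsilon_0. \]

Proof. One has, uniformly in $\delta$,
\[ \Im m\, B_\mu(E + \imath\delta) = \int \mu(de)\, \frac{\delta}{(e-E)^2 + \delta^2} = \delta \int_0^{\frac{1}{\delta^2}} dt \; \mu\left( \left\{ e \in \mathbb{R} \;\middle|\; |e - E| < \sqrt{\tfrac{1}{t} - \delta^2} \right\} \right) < C\, \delta \int_0^{\frac{1}{\delta^2}} dt\, \sqrt{\tfrac{1}{t} - \delta^2} = \frac{\pi}{2}\, C, \]
which gives the first inequality. For the second one, let us bound the indicator function $\chi_{[-\epsilon_0,\epsilon_0]}(\epsilon)$ above by $\frac{2\epsilon_0^2}{\epsilon^2 + \epsilon_0^2}$. Then one can use the stability of the Cauchy distribution to obtain
\[ \int_0^{\epsilon_0} d\epsilon \; \Im m\, B_\mu(E + \epsilon + \imath\delta) < \int d\epsilon\, \frac{2\epsilon_0^2}{\epsilon^2 + \epsilon_0^2} \int \mu(de)\, \frac{\delta}{(E + \epsilon - e)^2 + \delta^2} = 2\pi\epsilon_0 \int \mu(de)\, \frac{\delta + \epsilon_0}{(E-e)^2 + (\delta+\epsilon_0)^2} = 2\pi\epsilon_0\; \Im m\, B_\mu(E + \imath(\delta + \epsilon_0)), \]
and thus the second inequality follows from the first one.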
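As a sanity check of Proposition 3, the sketch below evaluates both bounds for one illustrative measure, the uniform density $\rho$ on $[-1,1]$, for which $\mu([E-\epsilon,E+\epsilon]) \leq 2\rho\,\epsilon$, so that any $C > 2\rho$ is admissible. The choice of measure, the function names and the midpoint rule are assumptions made for illustration only; the quantity computed is the positive Poisson-type integral $\int \mu(de)\,\delta/((e-x)^2+\delta^2)$ used in the proof.

```python
import math

def im_borel_uniform(x, delta, rho=0.5, a=-1.0, b=1.0):
    """Closed form of  int_a^b rho * delta / ((e - x)^2 + delta^2) de
    for the uniform density rho on [a, b] (an illustrative measure)."""
    return rho * (math.atan((b - x) / delta) - math.atan((a - x) / delta))

def integrated_im_borel(E, delta, eps0, steps=2000, rho=0.5):
    """Midpoint-rule approximation of  int_0^eps0 ImB(E + eps + i*delta) d eps."""
    h = eps0 / steps
    return h * sum(im_borel_uniform(E + (k + 0.5) * h, delta, rho)
                   for k in range(steps))

def check_proposition3(E, delta, eps0, rho=0.5):
    """True if both bounds of Proposition 3 hold for the uniform measure."""
    C = 2.0 * rho * (1.0 + 1e-9)   # any C > 2*rho satisfies the hypothesis
    first = im_borel_uniform(E, delta, rho) < 0.5 * math.pi * C
    second = integrated_im_borel(E, delta, eps0, rho=rho) < math.pi ** 2 * C * eps0
    return first and second
```

For this measure the first bound is essentially sharp as $\delta \to 0$ at interior points $E$, while the second bound holds with room to spare.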
Acknowledgement. This work would have been impossible without [JSS]. We are very thankful to G. Stolz for this collaboration. We also thank the anonymous referees for comments that improved the paper. The work of S. J. was supported in part by the NSF, grant DMS-0300974, and Grant No. 2002068 from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel. H. S.-B. acknowledges support by the DFG.
References

[AM] Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: an elementary derivation. Commun. Math. Phys. 157, 245–278 (1993)
[BGT] Barbaroux, J.M., Germinet, F., Tcheremchantsev, S.: Fractal dimensions and the phenomenon of intermittency in quantum dynamics. Duke Math. J. 110, 161–193 (2001)
[BS] Barbaroux, J.-M., Schulz-Baldes, H.: Anomalous transport in presence of self-similar spectra. Ann. I.H.P. Phys. Théo. 71, 539–559 (1999)
[BeS] Bellissard, J., Schulz-Baldes, H.: Subdiffusive quantum transport for 3d-Hamiltonians with absolutely continuous spectra. J. Stat. Phys. 99, 587–594 (2000)
[BG] de Bièvre, S., Germinet, F.: Dynamical localization for the random dimer Schrödinger operator. J. Stat. Phys. 98, 1135–1148 (2000)
[BL] Bougerol, P., Lacroix, J.: Products of Random Matrices with Applications to Schrödinger Operators. Boston: Birkhäuser, 1985
[Bov] Bovier, A.: Perturbation theory for the random dimer model. J. Phys. A: Math. Gen. 25, 1021–1029 (1992)
[BLS] Breuer, J., Last, Y., Strauss, Y.: Upper bounds on the dynamical spreading of wavepackets. In preparation
[CKM] Carmona, R., Klein, A., Martinelli, F.: Anderson localization for Bernoulli and other singular potentials. Commun. Math. Phys. 108, 41–66 (1987)
[Com] Combes, J.-M.: Connections between quantum dynamics and spectral properties of time-evolution operators. In: Differential Equations with Applications to Mathematical Physics, Ames, W.F., Harell, E.M., Herod, J.V., eds. Boston: Academic Press, 1993
[CM] Combes, J.-M., Mantica, G.: Fractal dimensions and quantum evolution associated with sparse potential Jacobi matrices. In: Proceedings of the Bologna APTEX International Conference, Ser. Concr. Appl. Math. 1, River Edge, NJ: World Scientific, pp. 107–123 (2001)
[DSS] Damanik, D., Sims, R., Stolz, G.: Localization for discrete one-dimensional random word models. J. Funct. Anal. 208, 423–445 (2004)
[DT] Damanik, D., Tcheremchantsev, S.: Upper bounds in quantum dynamics. To appear in J. Amer. Math. Soc., electronically posted on Nov. 30, 2006
[DJLS] del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum: IV, Hausdorff dimension, rank-one perturbations and localization. J. d'Analyse Math. 69, 153–200 (1996)
[DMS] del Rio, R., Makarov, N., Simon, B.: Operators with singular continuous spectrum: II, rank one operators. Commun. Math. Phys. 165, 59–67 (1994)
[DWP] Dunlap, D.H., Wu, H.-L., Phillips, P.W.: Absence of localization in random-dimer model. Phys. Rev. Lett. 65, 88–91 (1990)
[Gor] Gordon, A.: Pure point spectrum under 1-parameter perturbations and instability of Anderson localization. Commun. Math. Phys. 164, 489–505 (1994)
[Gua] Guarneri, I.: Spectral properties of quantum diffusion on discrete lattices. Europhys. Lett. 10, 95–100 (1989); On an estimate concerning quantum diffusion in the presence of a fractal spectrum. Europhys. Lett. 21, 729–733 (1993)
[GS1] Guarneri, I., Schulz-Baldes, H.: Upper bounds for quantum dynamics governed by Jacobi matrices with self-similar spectra. Rev. Math. Phys. 11, 1249–1268 (1999)
[GS2] Guarneri, I., Schulz-Baldes, H.: Lower bounds on wave packet propagation by packing dimensions of spectral measures. Math. Phys. Elect. J. 5(1), 16 pages (1999)
[GS3] Guarneri, I., Schulz-Baldes, H.: Intermittent lower bound on quantum diffusion. Lett. Math. Phys. 49, 317–324 (1999)
[Jit] Jitomirskaya, S.: Ergodic Schrödinger operators (on one foot). Preprint 2006, to appear in Barry Simon Festschrift
[JL] Jitomirskaya, S., Last, Y.: Power-law subordinacy and singular spectra, I. Half line operators. Acta Math. 183(2), 171–189 (1999)
[JSS] Jitomirskaya, S., Schulz-Baldes, H., Stolz, G.: Delocalization in random polymer chains. Commun. Math. Phys. 233, 27–48 (2003)
[KKL] Killip, R., Kiselev, A., Last, Y.: Dynamical upper bounds on wavepacket spreading. Amer. J. Math. 125, 1165–1198 (2003)
[KL] Kiselev, A., Last, Y.: Solutions, spectrum, and dynamics for Schrödinger operators on infinite domains. Duke Math. J. 102, 125–150 (2000)
[Las] Last, Y.: Quantum dynamics and decomposition of singular continuous spectra. J. Funct. Anal. 142, 402–445 (1996)
[PF] Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Berlin: Springer, 1992
[SSS] Schrader, R., Schulz-Baldes, H., Sedrakyan, A.: Perturbative test of single parameter scaling for 1d random media. Ann. H. Poincaré 5, 1159–1180 (2004)
[Sch] Schulz-Baldes, H.: Lyapunov exponents at anomalies of SL(2, R) actions, to appear in Operator Theory: Advances and Applications. Basel: Birkhäuser, 2006
[Tch] Tcheremchantsev, S.: Dynamical analysis of Schrödinger operators with growing sparse potentials. Commun. Math. Phys. 253, 221–252 (2005)
Communicated by B. Simon
Commun. Math. Phys. 273, 619–636 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0221-7
Communications in
Mathematical Physics
On the Distinguishability of Random Quantum States

Ashley Montanaro
Department of Computer Science, University of Bristol, Woodland Road, Bristol, BS8 1UB, UK. E-mail: [email protected]

Received: 13 July 2006 / Accepted: 16 October 2006
Published online: 6 March 2007 – © Springer-Verlag 2007

Abstract: We develop two analytic lower bounds on the probability of success $p$ of identifying a state picked from a known ensemble of pure states: a bound based on the pairwise inner products of the states, and a bound based on the eigenvalues of their Gram matrix. We use the latter, and results from random matrix theory, to lower bound the asymptotic distinguishability of ensembles of $n$ random quantum states in $d$ dimensions, where $n/d$ approaches a constant. In particular, for almost all ensembles of $n$ states in $n$ dimensions, $p > 0.72$. An application to distinguishing Boolean functions (the "oracle identification problem") in quantum computation is given.

1. Introduction

A fundamental property of quantum mechanics is that non-orthogonal pure quantum states may not be distinguished perfectly. This leads to the following quantum detection problem: given an unknown quantum state $|\psi_?\rangle$, picked from a known set $E$ with known a priori probabilities, find the "optimal" measurement $M^{opt}$ to determine $|\psi_?\rangle$. Several different criteria for optimality may be considered [7, 8, 14]; here we only concern ourselves with optimising the probability of success $P^{opt}$, and in particular the related state distinguishability problem of finding $P^{opt}$ without necessarily finding $M^{opt}$. Efficient optimisation techniques can be used to estimate $P^{opt}$ numerically [9]; however, the problem of finding an analytic expression for $P^{opt}$ seems intractable. We are therefore led to attempting to produce bounds on $P^{opt}$.

This note derives two lower bounds on $P^{opt}$: one based on the pairwise distinguishability of the states in $E$, and one based on the eigenvalues of their Gram matrix. We use the latter, and a powerful result from random matrix theory (the Marčenko-Pastur law [19]), to bound the probability of distinguishing a set of random quantum states, for a quite general notion of randomness.

This has an application to quantum computation in the so-called oracle identification problem introduced by Ambainis et al. [1], where we are given an $n$-bit Boolean function $f$ picked from a known set of $N$ functions, and
must identify $f$ with the minimum number of queries to $f$. We show that, for all but an exponentially small fraction of sets with $N = 2^n$, a quantum computer can perform this task successfully in a constant number of queries (with arbitrarily high probability), whereas classical computation requires $n$ queries for all such sets. As showing that a set of quantum states are quite distinguishable forms an essential part of proofs in many areas of quantum information theory, we hope that these results will find application elsewhere.

The organisation of the paper is as follows. Section 2 introduces notation and our main tool, the so-called "pretty good measurement", before moving on to give the lower bounds on $P^{opt}$. An extension of the lower bounds to mixed states is considered. Section 3 applies the bounds to a specific family of ensembles (those where all the states have constant inner product). Section 4 describes the random matrix theory we will be using, and applies it to the distinguishability of random quantum states. Section 5 gives the application to the oracle identification problem, and the paper closes with some discussion in Sect. 6.
2. Bounds on the Distinguishability of Quantum States

We consider an ensemble $E$ containing $n$ $d$-dimensional pure states $|\psi_i\rangle$ with their a priori probabilities $p_i$. We will use $\{|\psi'_i\rangle\}$ to denote the set containing the same states, renormalised to reflect their probabilities (i.e. $|\psi'_i\rangle = \sqrt{p_i}\,|\psi_i\rangle$). Given an unknown state $|\psi_?\rangle$, picked in accordance with these probabilities, the quantity we are interested in is the average probability of success for a given generalised measurement to distinguish which state we were given. For a measurement $M$ (given by a set of positive operators $\{M_i\}$ summing to the identity), let this probability be denoted by $P^M(E)$. Then we have
\[ P^M(E) = \sum_i \langle\psi'_i|M_i|\psi'_i\rangle = \sum_i p_i\, \langle\psi_i|M_i|\psi_i\rangle. \tag{1} \]
$M^{opt}(E)$ will denote the measurement with the optimal probability of success, and in an abuse of notation $P^{opt}(E)$ will denote this optimal probability. We call this the optimal probability of distinguishing the states in $E$.

We use three matrix norms: the Euclidean (Frobenius) norm $\|A\|_2 = \sqrt{\mathrm{tr}\, A^\dagger A} = \sqrt{\sum_{i,j} |A_{ij}|^2}$, the trace norm $\|A\|_1 = \mathrm{tr}\sqrt{A^\dagger A} = \sum_i \sigma_i(A)$, where $\sigma_i(A)$ denotes the $i$-th singular value of $A$, and the $l_1$ norm $\sum_{i,j} |A_{ij}|$. We will often use the $d \times n$ state matrix $S = S(E) = (|\psi'_1\rangle, \ldots, |\psi'_n\rangle)$ whose $i$-th column is the state $|\psi'_i\rangle$. Then $G = S^\dagger S$ gives the $n \times n$ Gram matrix [16] encoding all the inner products between the renormalised states in $E$. If $n > d$, $G$ will have at least $n - d$ zero eigenvalues. Note that every rectangular matrix $M$ with $\|M\|_2 = 1$ is a state matrix. $\rho$ will represent the density matrix of the ensemble:
\[ \rho = \sum_{i=1}^n |\psi'_i\rangle\langle\psi'_i|. \tag{2} \]
It is well-known [17] that $G$ and $\rho$ have the same non-zero eigenvalues.
2.1. Use of the "pretty good measurement". We will use a specific measurement to provide bounds on $P^{opt}(E)$, which is "canonical" in the sense that it performs reasonably well for any ensemble $E$. This is the so-called pretty good measurement (PGM), which was independently identified by several authors (e.g. [11, 12]) and has a number of useful properties. It is usually defined as a set of projectors $\{|\nu_i\rangle\langle\nu_i|\}$ onto "measurement vectors" $|\nu_i\rangle$, where $|\nu_i\rangle = \rho^{-1/2}|\psi'_i\rangle$ (the inverse only being taken on the support of $\rho$). However, it may also be defined implicitly, which brings out its "canonical" nature.

To this end, consider an arbitrary measurement $M$ for $E$ that consists of a set of $n$ rank 1 projectors onto unnormalised measurement vectors $|\mu_i\rangle$, where each measurement vector corresponds to a state $|\psi'_i\rangle$ in the ensemble. (In fact, it turns out that the optimal measurement for an ensemble of pure states always falls into this category [9].) The probability of getting measurement outcome $i$ and receiving state $j$ is then $|\langle\mu_i|\psi'_j\rangle|^2$, and the overall probability of success of this measurement is $\sum_{i=1}^n |\langle\mu_i|\psi'_i\rangle|^2$. We may thus encode all the inner products (and hence the probabilities) in a matrix $P$, where $P_{ij} = \langle\mu_i|\psi'_j\rangle$; and rather than looking for an optimal measurement $M$, we can rephrase our task as looking for an optimal matrix $P$ that corresponds to a valid measurement. We have the following requirement on $P$, from the fact that $M$ must be a valid POVM:
\[ (P^\dagger P)_{ij} = \sum_{k=1}^n \langle\psi'_i|\mu_k\rangle\langle\mu_k|\psi'_j\rangle = \langle\psi'_i| \left( \sum_{k=1}^n |\mu_k\rangle\langle\mu_k| \right) |\psi'_j\rangle = G_{ij} = (S^\dagger S)_{ij}. \tag{3} \]
A natural way to produce a matrix $P$ that satisfies this condition from any given $S$ is to take $P = \sqrt{G}$, the positive semidefinite square root of $G$. The PGM turns out to be a measurement corresponding to this matrix $P$, for, if $P_{ij} = \langle\nu_i|\psi'_j\rangle$, then
\[ (P^2)_{ij} = \sum_{k=1}^n \langle\psi'_i|\rho^{-1/2}|\psi'_k\rangle\langle\psi'_k|\rho^{-1/2}|\psi'_j\rangle \tag{4} \]
\[ \phantom{(P^2)_{ij}} = \langle\psi'_i|\, \rho^{-1/2} \left( \sum_{k=1}^n |\psi'_k\rangle\langle\psi'_k| \right) \rho^{-1/2} \,|\psi'_j\rangle = G_{ij}. \tag{5} \]
The probability of success for the PGM is thus given by $P^{pgm}(E) = \sum_{i=1}^n (\sqrt{G})_{ii}^2$. Barnum and Knill have proved [4] that the PGM has the further property that it is almost optimal in the following sense.

Theorem 1 (Barnum, Knill) [4]. $P^{pgm}(E) \geq P^{opt}(E)^2$.

So there is the overall relationship $P^{opt}(E)^2 \leq P^{pgm}(E) \leq P^{opt}(E)$. For completeness, we include (in Appendix A) a simplified proof of Barnum and Knill's result in the case of pure states.

2.2. Bounds from the pairwise inner products. A set of states that are pairwise almost orthogonal are pairwise almost distinguishable. It thus seems intuitively clear that, given such a set, the probability of success in distinguishing one state from all the others must also be high. However, this intuition is wrong. This was noted by Jozsa and Schlienz [17], who showed that the inner products of an ensemble of states may all be reduced, while simultaneously reducing the von Neumann entropy of the ensemble (which gives a measure of overall distinguishability). This effect also manifests itself in quantum
fingerprinting [6]. Here, $d$-dimensional states are "compressed" to $\log d$-dimensional "fingerprint" states that can be distinguished pairwise. However, given such a fingerprint the corresponding original state may not be identified, as this would violate Holevo's theorem [15]. Nevertheless, for certain ensembles the pairwise inner products can give a good lower bound on the overall distinguishability, as noted by several authors [11, 4]. In this section we derive such a bound. Our approach is based on that of Hausladen et al. [11], who found a parabola forming a lower bound on the square root function, which is useful because of the following lemma.

Lemma 1. If the function $\sqrt{x}$ is bounded below by $f(x) = ax + bx^2$ for $x \geq 0$, then $(\sqrt{G})_{ii} \geq a\,G_{ii} + b \sum_{j=1}^n |G_{ij}|^2$.

Proof. $G$ is a positive semidefinite matrix and thus may be diagonalised: $G = U D U^\dagger$, where $D = \mathrm{diag}(\{\lambda_i\})$ and $U = (u_{ij})$ is unitary. Working out the matrix algebra shows that $(\sqrt{G})_{ii} = \sum_{k=1}^n \sqrt{\lambda_k}\,|u_{ik}|^2$, so $(\sqrt{G})_{ii} \geq \sum_{k=1}^n f(\lambda_k)|u_{ik}|^2 = f(G)_{ii}$. But $f(G)_{ii} = (aG + bG^2)_{ii} = a\,G_{ii} + b \sum_{j=1}^n G_{ij}G_{ji} = a\,G_{ii} + b \sum_{j=1}^n |G_{ij}|^2$.

Our goal will be to find $a$ and $b$ to parametrise $f$ such that $a\,G_{ii} + b \sum_{j=1}^n |G_{ij}|^2$ is maximised. It is clear that, for this to be maximised, $f(r)$ must equal $\sqrt{r}$ for some $r$ (or we could just increase $a$ or $b$). So we will pick $a$ and $b$ such that $f(r) = \sqrt{r}$ and $f'(r) = \frac{1}{2\sqrt{r}}$ (i.e. the curves are tangent at this point). This leads to the simultaneous equations
\[ ar + br^2 = \sqrt{r}, \qquad a + 2br = \frac{1}{2\sqrt{r}}. \tag{6} \]
Solving for $a$ and $b$ gives the optimal values
\[ a = \frac{3}{2\sqrt{r}}, \qquad b = -\frac{1}{2r^{3/2}}. \tag{7} \]
To see that $f(x)$ actually is a lower bound for $\sqrt{x}$ for any positive value of $r$ (with these values for $a$ and $b$), note that the only solutions to the related equation $f(x)^2 = x$ are $x = 0$, $x = r$, or $x = 4r$. As $f(4r)$ is negative, we have that $f(x) = \sqrt{x}$ if and only if $x = 0$ or $x = r$. So the only remaining possibility is that $f(x) > \sqrt{x}$ for all $0 < x < r$. Substituting a suitable value of $x$ (e.g. $r/2$) shows that this is not the case.

The expression $a\,G_{ii} + b \sum_{j=1}^n |G_{ij}|^2$ may now be expressed solely in terms of $r$. Optimising this for $r$ gives that the maximum is found at the point
\[ r = \frac{\sum_{j=1}^n |G_{ij}|^2}{G_{ii}}. \tag{8} \]
Returning to the original inequality, we have
\[ (\sqrt{G})_{ii} \geq \frac{3}{2\sqrt{r}}\, G_{ii} - \frac{1}{2r^{3/2}} \sum_{j=1}^n |G_{ij}|^2 \;\Rightarrow\; (\sqrt{G})_{ii}^2 \geq \frac{G_{ii}^3}{\sum_{j=1}^n |G_{ij}|^2}. \tag{9} \]
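The tangent-parabola construction of Eqs. (6)-(9) can be checked numerically. The sketch below uses hypothetical helper names and a simple grid check, not a proof: it evaluates $f(x) = ax + bx^2$ with the coefficients of Eq. (7) and compares the resulting bound at the optimal $r$ of Eq. (8) with the closed form $G_{ii}^{3/2}/\sqrt{\sum_j |G_{ij}|^2}$ implied by Eq. (9).

```python
import math

def tangent_parabola(r):
    """Coefficients (a, b) of f(x) = a*x + b*x**2 from Eq. (7)."""
    a = 3.0 / (2.0 * math.sqrt(r))
    b = -1.0 / (2.0 * r ** 1.5)
    return a, b

def f(x, r):
    """Parabola tangent to sqrt(x) at x = r."""
    a, b = tangent_parabola(r)
    return a * x + b * x * x

def row_bound(G_ii, row_sq_sum, r=None):
    """Lower bound a*G_ii + b*sum_j |G_ij|^2 on (sqrt G)_ii;
    by default r is the optimal value of Eq. (8)."""
    if r is None:
        r = row_sq_sum / G_ii
    a, b = tangent_parabola(r)
    return a * G_ii + b * row_sq_sum
```

For example, with $G_{ii} = 0.25$ and $\sum_j |G_{ij}|^2 = 0.1$ the optimal point is $r = 0.4$, and `row_bound(0.25, 0.1)` agrees with $0.25^{3/2}/\sqrt{0.1}$ up to rounding.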
We thus have the following bound on the probability of distinguishing the states in $E$:
\[ P^{pgm}(E) \geq \sum_{i=1}^n \frac{\langle\psi'_i|\psi'_i\rangle^3}{\sum_{j=1}^n |\langle\psi'_i|\psi'_j\rangle|^2} = \sum_{i=1}^n \frac{p_i^2}{\sum_{j=1}^n p_j\,|\langle\psi_i|\psi_j\rangle|^2}. \tag{10} \]
If all the states have equal a priori probabilities, the bound simplifies further to
\[ P^{pgm}(E) \geq \frac{1}{n} \sum_{i=1}^n \frac{1}{\sum_{j=1}^n |\langle\psi_i|\psi_j\rangle|^2}. \tag{11} \]
Unlike previous bounds obtained by other authors for the probability of success of the PGM [11, 4], the bound (10) is always positive and greater than or equal to $\sum_{i=1}^n p_i^2$, thus showing that the PGM always does at least as well as the "non-measurement" of guessing which state was received in accordance with their a priori probabilities.

2.3. Bounds from eigenvalues. The eigenvalues of a Hermitian matrix are closely related to its diagonal elements; indeed, the former majorises the latter [16]. With this in mind, we look for a bound on the unknown diagonal elements of $\sqrt{G}$ in terms of the known eigenvalues $\{\lambda_i\}$ of $G$.

Lemma 2. $P^{pgm}(E) \geq \frac{1}{n} \left( \sum_{i=1}^n \sqrt{\lambda_i} \right)^2 = \frac{1}{n} \|S\|_1^2$.

Proof. By the fact that the trace of a matrix is the sum of its eigenvalues, we have
\[ \sum_{i=1}^n (\sqrt{G})_{ii} = \sum_{i=1}^n \sqrt{\lambda_i} \tag{12} \]
\[ \Rightarrow \left( \sum_{i=1}^n (\sqrt{G})_{ii} \right)^2 = \left( \sum_{i=1}^n \sqrt{\lambda_i} \right)^2 \tag{13} \]
\[ \Rightarrow n \sum_{i=1}^n (\sqrt{G})_{ii}^2 \geq \left( \sum_{i=1}^n \sqrt{\lambda_i} \right)^2 \tag{14} \]
\[ \Rightarrow P^{pgm}(E) \geq \frac{1}{n} \left( \sum_{i=1}^n \sqrt{\lambda_i} \right)^2, \tag{15} \]
where in (14) we used a Cauchy-Schwarz inequality, showing that equality can only be attained in step (14) when all the $(\sqrt{G})_{ii}$ are equal.

Interestingly, this bound is the same as the fidelity of $G$ with the maximally mixed state $I/n$, where the fidelity $F(\rho,\sigma)$ is defined as $\left( \mathrm{tr} \sqrt{\rho^{1/2} \sigma \rho^{1/2}} \right)^2$ [20]. It is worth noting that no upper bound on the success probability in terms of the eigenvalues alone can be found, for the following reason. Any set of eigenvalues $\{\lambda_i\}$ summing to 1 can give rise to a Gram matrix $G$, where $G_{ii} = \lambda_i$, and $G_{ij} = 0$ (for $i \neq j$). Such matrices correspond to an ensemble $E$ of perfectly distinguishable states where $P^{pgm}(E) = 1$. As future work, it would be interesting to determine whether an upper bound (or an improved lower bound) could be produced by considering the diagonal entries of $G$ as well as its eigenvalues.
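To make the two lower bounds concrete, the following sketch compares them with the exact PGM success probability for the smallest non-trivial case: two pure states with prior probabilities $p_1$, $1-p_1$ and real overlap $c$. This is an illustrative assumption, as are the function names; the restriction to two states is made only so that the matrix square root has a closed form, $\sqrt{G} = (G + sI)/t$ with $s = \sqrt{\det G}$ and $t = \sqrt{\mathrm{tr}\,G + 2s}$, which follows from the Cayley-Hamilton theorem.

```python
import math

def gram_two_states(p1, c):
    """Gram matrix of the renormalised states for priors (p1, 1-p1), 0 < p1 < 1,
    and real overlap <psi_1|psi_2> = c."""
    p2 = 1.0 - p1
    g = c * math.sqrt(p1 * p2)
    return [[p1, g], [g, p2]]

def sqrt_psd_2x2(G):
    """Positive semidefinite square root of a real symmetric 2x2 PSD matrix."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    s = math.sqrt(max(det, 0.0))
    t = math.sqrt(G[0][0] + G[1][1] + 2.0 * s)
    return [[(G[0][0] + s) / t, G[0][1] / t],
            [G[1][0] / t, (G[1][1] + s) / t]]

def pgm_success(G):
    """P^pgm = sum_i ((sqrt G)_ii)^2."""
    R = sqrt_psd_2x2(G)
    return sum(R[i][i] ** 2 for i in range(2))

def pairwise_bound(G):
    """Right-hand side of Eq. (10): sum_i G_ii^3 / sum_j |G_ij|^2."""
    return sum(G[i][i] ** 3 / sum(G[i][j] ** 2 for j in range(2))
               for i in range(2))

def eigenvalue_bound(G):
    """Lemma 2: (1/n) (sum_i sqrt(lambda_i))^2 with n = 2."""
    tr = G[0][0] + G[1][1]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    lams = [max((tr + disc) / 2.0, 0.0), max((tr - disc) / 2.0, 0.0)]
    return sum(math.sqrt(lam) for lam in lams) ** 2 / 2.0
```

For equal priors the diagonal entries of $\sqrt{G}$ coincide, so the eigenvalue bound is attained, as the equality condition in step (14) predicts; for unequal priors both bounds lie strictly below the PGM value in the cases tried here.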
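2.4. Distinguishing mixed states. It is natural to ask to what extent these lower bounds hold for the generalised problem of distinguishing an ensemble $E$ consisting of mixed states $\{\rho_i\}$. The following theorem allows the problem to be related to that of distinguishing pure states.

Theorem 2. Let $E$ be an ensemble of $n$ $d$-dimensional mixed states $\{\rho_i\}$ with a priori probabilities $\{p_i\}$, and having spectral decompositions $\rho_i = \sum_{k=1}^d \lambda_{ik}\,|v_{ik}\rangle\langle v_{ik}|$. Let $F$ be an ensemble of the $nd$ pure states given by the eigenvectors $\{|v_{ik}\rangle\}$ with a priori probabilities $\{p_i\lambda_{ik}\}$. Then $P^{pgm}(E) \geq P^{pgm}(F)$.

Proof. For mixed states, the PGM is defined by the following measurement operators $\{M_i\}$:
\[ M_i = \rho^{-1/2} \rho'_i\, \rho^{-1/2}, \quad \text{where } \rho'_i = p_i \rho_i \text{ and } \rho = \sum_{i=1}^n \rho'_i. \tag{16} \]
So the probability of success can be bounded as follows, where we use the renormalised eigenvectors $|v'_{ik}\rangle = \sqrt{p_i \lambda_{ik}}\,|v_{ik}\rangle$:
\[ P^{pgm}(E) = \sum_{i=1}^n \mathrm{tr}\left( \rho^{-1/2} \rho'_i\, \rho^{-1/2} \rho'_i \right) \tag{17} \]
\[ = \sum_{i=1}^n \mathrm{tr}\left( \rho^{-1/2} \left( \sum_{k=1}^d |v'_{ik}\rangle\langle v'_{ik}| \right) \rho^{-1/2} \left( \sum_{l=1}^d |v'_{il}\rangle\langle v'_{il}| \right) \right) \tag{18} \]
\[ = \sum_{i=1}^n \sum_{k,l=1}^d \mathrm{tr}\left( \rho^{-1/2} |v'_{ik}\rangle\langle v'_{ik}|\, \rho^{-1/2} |v'_{il}\rangle\langle v'_{il}| \right) \tag{19} \]
\[ = \sum_{i=1}^n \sum_{k,l=1}^d |\langle v'_{ik}|\rho^{-1/2}|v'_{il}\rangle|^2 \geq \sum_{i=1}^n \sum_{k=1}^d |\langle v'_{ik}|\rho^{-1/2}|v'_{ik}\rangle|^2 = P^{pgm}(F). \]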
3. The Distinguishability of States with Constant Inner Product

An illustrative case to apply these bounds to is that of equiprobable states where the pairwise inner products are all equal, so the states are all equally distinguishable from each other. Consider an ensemble $E$ with Gram matrix $G$, where $G_{ii} = 1/n$ and $G_{ij} = p/n$ for $i \neq j$ (and $p$ is a positive real constant). In this case, the inner product bound of Sect. 2.2 gives the bound
\[ P^{pgm}(E) \geq \frac{1}{1 + p^2(n-1)} = O(1/n). \tag{20} \]
The eigenvalue bound, however, gives much better results. The symmetry of $G$ shows immediately that it has an eigenvector $(1, 1, \ldots, 1)$; the corresponding eigenvalue is $\lambda_1 = p + (1-p)/n$. The set of eigenvectors may be completed by taking any $n - 1$
vectors orthogonal to $(1, 1, \ldots, 1)$, which will be eigenvectors with eigenvalues $\lambda_{2 \ldots n} = (1-p)/n$. We therefore have
\[ P^{pgm}(E) \geq \frac{1}{n} \left( \sqrt{p + \frac{1-p}{n}} + (n-1)\sqrt{\frac{1-p}{n}} \right)^2 \tag{21} \]
\[ \geq \frac{1}{n} \left( (n-1) \sqrt{\frac{1-p}{n}} \right)^2 \geq (1-p) - \frac{2(1-p)}{n}, \tag{22} \]
so the probability of distinguishing these states approaches a constant as $n \to \infty$. In fact, one can show that inequality (21) is actually an equality giving the precise probability of success $P^{pgm}(E)$ (this follows from showing that the diagonal entries of $\sqrt{G}$ are all equal). Such an ensemble therefore provides a kind of converse to the ensemble of states used in quantum fingerprinting [6]: in this case, no matter how many states there are in the ensemble, their joint distinguishability is of the same order as their pairwise distinguishability. We will see below that this behaviour is not typical; however, it is perhaps not surprising, because $E$ can only be realised in $n$ dimensions. To see this, note that $G$ is non-singular, so the states in $E$ must be linearly independent.

4. The Distinguishability of Random Quantum States

We will use Lemma 2 and some results from the theory of random matrices to put a lower bound on the probability of distinguishing random quantum states. We will find that it is possible to give strong lower bounds on the distinguishability of $n$ random states in $d$ dimensions, in the regime where $n/d$ is constant.
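Before moving on, the gap between the two bounds for the constant-inner-product ensemble of Sect. 3 can be made explicit numerically. The sketch below simply evaluates the closed-form expressions (20), (21) and (22); the function names are hypothetical.

```python
import math

def pairwise_bound(n, p):
    """Inner product bound, Eq. (20)."""
    return 1.0 / (1.0 + p * p * (n - 1))

def eigenvalue_bound(n, p):
    """Eigenvalue bound, Eq. (21), from lambda_1 = p + (1-p)/n and
    the (n-1)-fold eigenvalue (1-p)/n."""
    lam1 = p + (1.0 - p) / n
    lam_rest = (1.0 - p) / n
    return (math.sqrt(lam1) + (n - 1) * math.sqrt(lam_rest)) ** 2 / n

def simple_lower_bound(n, p):
    """Final expression in Eq. (22)."""
    return (1.0 - p) - 2.0 * (1.0 - p) / n
```

For $p = 0.5$ and $n = 10000$ the pairwise bound is below $10^{-3}$, while the eigenvalue bound stays above $1 - p - 2(1-p)/n \approx 0.4999$, illustrating the $O(1/n)$ versus constant behaviour described in Sect. 3.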
4.1. A little random matrix theory. In this section, we will calculate the expected value of the trace norm of a random matrix. The distribution of the trace norm (i.e. the sum of singular values) of a matrix $M$ is clearly related to that of the eigenvalues of the matrix $M M^\dagger$, which is known to statisticians as a sample covariance matrix. The asymptotic distribution of the eigenvalues of such a matrix is given by the Marčenko-Pastur law [19], which is stated in the form we need in [3].

Theorem 3 (Marčenko-Pastur law) [19]. Let $R_r$ be a family of $d \times n$ matrices with $n \geq d$ and $d/n \to r \in (0,1]$ as $n, d \to \infty$, where the entries of $R_r$ are i.i.d. complex random variables with mean 0 and variance 1. Then, as $n, d \to \infty$, the eigenvalues of the rescaled matrix $\frac{1}{n} R_r R_r^\dagger$ tend almost surely to a limiting distribution with density
\[ p_r(x) = \frac{\sqrt{(x - A^2)(B^2 - x)}}{2\pi r x} \tag{23} \]
for $A^2 \leq x \leq B^2$ (where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$), and density 0 elsewhere.

We will translate this to a similar statement about the singular values of $R_r$. The following lemma is straightforward.
Lemma 3. Let $R_r$ be a family of $d \times n$ matrices with $k/m \to r \in (0,1]$ as $n, d \to \infty$, where $k = \min(n,d)$ and $m = \max(n,d)$, and the entries of $R_r$ are i.i.d. complex random variables with mean 0 and variance 1. Then, as $n, d \to \infty$, the singular values of $R_r/\sqrt{m}$ tend almost surely to a limiting distribution with density
\[ p_r(y) = \frac{\sqrt{(y^2 - A^2)(B^2 - y^2)}}{\pi r y} \tag{24} \]
for $A \leq y \leq B$ (where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$), and density 0 elsewhere.

Proof. The lemma follows from Theorem 3 for $n \geq d$ by substituting $y = \sqrt{x}$. For $n \leq d$, note that the singular values of $R$ are the same as those of $R^T$, so the roles of $n$ and $d$ need merely be interchanged.

Lemma 4. Let $R_r$ be a family of $d \times n$ matrices with $k/m \to r \in (0,1]$ as $n, d \to \infty$, where $k = \min(n,d)$ and $m = \max(n,d)$, and the entries of $R_r$ are i.i.d. complex random variables with mean 0 and variance 1. Then, as $n, d \to \infty$, the expected trace norm of $R_r$ tends almost surely to
\[ E(\|R_r\|_1) = \frac{m^{3/2}}{\pi} \int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\; dy, \tag{25} \]
where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$.

Proof. With probability 1, $R_r$ will have $k$ non-zero singular values. Let $\sigma_i(R_r)$ denote the value of the $i$-th (unsorted) singular value of $R_r$, for arbitrary $i$ between 1 and $k$. We have
\[ E(\|R_r\|_1) = (k\sqrt{m})\, E(\sigma_i(R_r/\sqrt{m})) = k\sqrt{m} \int_A^B y\, p_r(y)\; dy \tag{26} \]
and using Lemma 3 gives the desired result.

This turns out to be an elliptic integral which cannot be expressed in terms of elementary functions [10]. However, it is possible to produce a good lower bound, which is tight in the case $r = 1$:

Lemma 5. Let $R_r$, $k$, $m$ be defined as in Lemma 4. Then, as $k, m \to \infty$, the expected trace norm of $R_r$ satisfies
\[ E(\|R_r\|_1) \geq k\sqrt{m}\, \sqrt{1 - r\left(1 - \frac{64}{9\pi^2}\right)} \tag{27} \]
with equality when $r = 1$.

Proof. See Appendix B.

As these are asymptotic results, it is important to bound the rate of convergence of this expected value to that given by Lemma 5. This can be done using a theorem of Bai [2], who has shown that the Kolmogorov distance between the (rescaled) expected empirical spectral distribution of an $m \times k$ matrix (with $m \geq k$) and the asymptotic distribution given by the Marčenko-Pastur law is $O(m^{-5/48})$. After some algebra, this may be used with Lemma 5 to give
\[ E(\|R_r\|_1) \geq k\sqrt{m} \left( \sqrt{1 - r\left(1 - \frac{64}{9\pi^2}\right)} - O(m^{-5/48}) \right) \tag{28} \]
for a finite-dimensional $m \times k$ matrix $R_r$.
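The relationship between the integral of Lemma 4 and the closed-form bound of Lemma 5 can be explored numerically. The following sketch rests on stated assumptions: it uses a plain midpoint rule, hypothetical function names, and works with the normalised quantity $E(\|R_r\|_1)/(k\sqrt{m}) = \frac{1}{\pi r}\int_A^B \sqrt{(y^2-A^2)(B^2-y^2)}\,dy$, which follows from (25) with $k = rm$.

```python
import math

def mean_singular_value(r, steps=20000):
    """Midpoint-rule estimate of (1/(pi*r)) * int_A^B sqrt((y^2-A^2)(B^2-y^2)) dy,
    i.e. the first moment of the limiting singular value density (24)."""
    A = 1.0 - math.sqrt(r)
    B = 1.0 + math.sqrt(r)
    h = (B - A) / steps
    total = 0.0
    for k in range(steps):
        y = A + (k + 0.5) * h          # midpoints lie strictly inside (A, B)
        total += math.sqrt((y * y - A * A) * (B * B - y * y))
    return total * h / (math.pi * r)

def lemma5_lower_bound(r):
    """Normalised right-hand side of Eq. (27)."""
    return math.sqrt(1.0 - r * (1.0 - 64.0 / (9.0 * math.pi ** 2)))
```

At $r = 1$ both expressions equal $8/(3\pi) \approx 0.849$, whose square $64/(9\pi^2) \approx 0.7205$ is consistent with the figure $p > 0.72$ quoted in the abstract for $n$ states in $n$ dimensions; for $r < 1$ the integral lies slightly above the bound.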
4.2. Random quantum states. We can apply this result, and the lower bound of Lemma 2, to estimate the distinguishability of random quantum states uniformly distributed on the complex unit sphere in $d$ dimensions. In fact, we may exploit the concentration of measure effects characteristic of high-dimensional spaces to show lower bounds on the distinguishability of almost all ensembles of quantum states. A uniformly random quantum state may be produced by creating a vector $v$, each of whose components is a complex Gaussian (say $v_i \sim \tilde{N}(0, 1/d)$), and normalising the result. The intuition that this normalisation step is “almost unnecessary” [22] can be formalised as follows. It is straightforward to see that $\mathbb{E}(\|v\|_2^2) = 1$. In order to get an explicit expression for the concentration around this expectation, the following result from Appendix A of [5] can be used.

Lemma 6 (Norm concentration of Gaussian vectors) [5]. Let $v$ be a $d$-dimensional random vector, each of whose components $v_i \sim \tilde{N}(0, 1/d)$. Then, for any $\epsilon$,
$$\Pr\left[\, \left| \|v\|_2^2 - 1 \right| \ge \epsilon \,\right] < 2e^{-d\epsilon^2/12}. \tag{29}$$
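The concentration in Eq. (29) can be observed in a small Monte Carlo experiment. The sketch below assumes that $\tilde{N}(0, 1/d)$ denotes a complex Gaussian whose real and imaginary parts are independent $N(0, 1/(2d))$ variables; that reading, the function names and the sample sizes are assumptions made for this illustration.

```python
import math
import random


def squared_norm(d, rng):
    """Squared 2-norm of a d-dimensional complex Gaussian vector.

    Assumption: each component has independent real and imaginary parts
    distributed as N(0, 1/(2d)), so that E|v_i|^2 = 1/d.
    """
    sigma = math.sqrt(1.0 / (2.0 * d))
    return sum(rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
               for _ in range(d))


def empirical_tail(d, eps, trials=1000, seed=0):
    """Return (sample mean of ||v||^2, fraction with | ||v||^2 - 1 | >= eps)."""
    rng = random.Random(seed)
    samples = [squared_norm(d, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    tail = sum(1 for s in samples if abs(s - 1.0) >= eps) / trials
    return mean, tail


def lemma6_bound(d, eps):
    """Right-hand side of Eq. (29)."""
    return 2.0 * math.exp(-d * eps ** 2 / 12.0)


if __name__ == "__main__":
    d, eps = 200, 0.4
    mean, tail = empirical_tail(d, eps)
    print(f"mean={mean:.4f}, empirical tail={tail:.4f}, bound={lemma6_bound(d, eps):.4f}")
```

The empirical tail is typically far below the bound, which is consistent with the remark that the constants in these estimates are not tight.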
Similarly, the state matrix of an ensemble $\mathcal{E}$ of $n$ equiprobable $d$-dimensional uniformly random quantum states is given by a $d \times n$ matrix $S$ whose columns are uniformly random quantum states renormalised so that each column has norm $1/\sqrt{n}$. Let $S'$ denote the matrix produced by rescaling each column of Gaussians by $1/\sqrt{n}$, rather than normalising them. We will show that $S$ and $S'$ are close with high probability. Consider an arbitrary column of $S$ and the same column in $S'$, denoted $v$ and $v'$ respectively. Lemma 6 allows a bound to be put on the probability of $v$ and $v'$ being far apart, as
$$\|v - v'\|_2^2 = \left\| (\sqrt{n}\,\|v'\|_2 - 1)\, v \right\|_2^2 = \frac{1}{n} (\sqrt{n}\,\|v'\|_2 - 1)^2. \tag{30}$$
We may therefore obtain
$$\Pr[\|v - v'\|_2^2 \ge \epsilon] = \Pr[(\sqrt{n}\,\|v'\|_2 - 1)^2 \ge n\epsilon] \le \Pr\left[\, \left| n\|v'\|_2^2 - 1 \right| \ge n\epsilon \,\right] \le 2e^{-n^2 d \epsilon^2/12}. \tag{31}$$
Considering all the columns in the matrices $S$ and $S'$, and using the union bound, we have
$$\Pr[\|S - S'\|_2^2 \ge \epsilon] \le 2n e^{-d\epsilon^2/12}. \tag{32}$$
In order to convert this to a statement about the “distinguishability” function $f(S) = \frac{1}{n}\|S\|_1^2$ that we are interested in, we need the following lemma, which is proved in Appendix C.

Lemma 7. Let $S$ be an $n \times d$ matrix with $\|S\|_2 \le l$, and define $f(S) = \frac{1}{n}\|S\|_1^2$. Then the Lipschitz constant $\eta$ of $f$, $\eta = \sup_{x,y} |f(x) - f(y)| / \|x - y\|_2$, satisfies $\eta \le 2l$.

Lemma 7 implies the following relationship, for any $l > 0$:
$$\Pr\left[\, \left| (\|S'\|_1^2 - \|S\|_1^2)/n \right| \ge 2l\sqrt{\epsilon} \,\right] \le \Pr[\|S'\|_2 \ge l] + \Pr[\|S - S'\|_2^2 \ge \epsilon] \le 2e^{-nd(l^2 - 1)^2/12} + 2n e^{-d\epsilon^2/12}. \tag{33}$$

The final result we will need is the following concentration lemma.
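Lemma 8 (Concentration of Gaussian measure) [18]. Let $p$ be a point in $\mathbb{R}^d$ picked in accordance with standard Gaussian measure. Then
$$\Pr[\, |f(p) - \mathbb{E}(f)| \ge \epsilon \,] \le 2e^{-\epsilon^2/2\eta^2}, \tag{34}$$
where $\eta$ is the Lipschitz constant of $f$, $\eta = \sup_{x,y} |f(x) - f(y)| / \|x - y\|_2$.

We now have all the required ingredients to prove a lower bound on the distinguishability of almost all quantum states.

Theorem 4. Let $\mathcal{E}$ be an ensemble of $n$ equiprobable $d$-dimensional quantum states picked uniformly at random. Set $p = \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right) - O(n^{-5/48})$ if $n \ge d$, and $p = 1 - r\left(1 - \frac{64}{9\pi^2}\right) - O(d^{-5/48})$ otherwise. Then, for any $\epsilon \le p/2$,
$$\Pr[P^{pgm}(\mathcal{E}) \le p - 2\epsilon] \le 2\left( (n + 1)\, e^{-d\epsilon^4/K} + e^{-nd\epsilon^2/5} \right), \tag{35}$$
where $K$ is a constant $\le 300$.

Proof. As before, let $S$ be the state matrix of $\mathcal{E}$, and let $S'$ be the matrix produced by rescaling the vectors of Gaussians which would produce $S$ if they were normalised. The matrix $R = \sqrt{nd}\, S'$ fulfills the criteria for the Marčenko-Pastur law (Theorem 3), as its entries are complex random variables with mean 0 and variance 1. We therefore have
$$\mathbb{E}\left( \frac{1}{n}\|S'\|_1^2 \right) \ge \frac{1}{n}\, \mathbb{E}(\|S'\|_1)^2 = \frac{1}{n^2 d}\, \mathbb{E}(\|R\|_1)^2 \ge p \tag{36}$$
using the lower bound on the expected trace norm of $R$ from Lemma 5 and the convergence result of Bai [2]. We will show that this implies a bound on $\frac{1}{n}\|S\|_1^2$, and hence (by Lemma 2) a bound on $P^{pgm}(\mathcal{E})$. From Lemma 8 (identifying $\mathbb{C}^d$ with $\mathbb{R}^{2d}$) and Eqn. (33), we have for any $l$,
\begin{align}
& \Pr\left[ \|S\|_1^2/n \le p - 2\epsilon \right] \tag{37} \\
& \quad \le \Pr\left[ \left| (\|S'\|_1^2 - \|S\|_1^2)/n \right| \ge \epsilon \right] + \Pr\left[ \left| \|S'\|_1^2/n - \mathbb{E}\left(\|S'\|_1^2/n\right) \right| \ge \epsilon \right] \tag{38} \\
& \quad \le 2\left( \exp\left( -\frac{nd(l^2 - 1)^2}{12} \right) + n \exp\left( -\frac{d\epsilon^4}{192\, l^4} \right) + \exp\left( -\frac{nd\epsilon^2}{4 l^2} \right) \right) \tag{39} \\
& \quad \le 2\left( \exp\left( -\frac{nd\epsilon^4}{12} \right) + n \exp\left( -\frac{d\epsilon^4}{300} \right) + \exp\left( -\frac{nd\epsilon^2}{5} \right) \right), \tag{40}
\end{align}
where, in the last line, we pick $l^2 = 1 + \epsilon^2$ and note that $\epsilon \le 1/2$.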
Despite the large constants that appear in these expressions, Fig. 1 shows numerical evidence that ensembles $\mathcal{E}$ of quantum states picked uniformly at random in fact appear to have a value of $P^{pgm}(\mathcal{E})$ close to the asymptotic lower bound, even when the states are (relatively) low-dimensional.
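The asymptotic lower bound referred to here is the leading term of $p$ in Theorem 4. The following sketch evaluates it as a function of $r$, assuming $r = n/d$ (which matches the caption of Fig. 1, where $n = 50r$ and $d = 50$) and dropping the $O(\cdot)$ correction. The function names are hypothetical, and the bisection simply locates where the leading term falls to 1/2.

```python
import math

C = 1.0 - 64.0 / (9.0 * math.pi ** 2)


def asymptotic_bound(r):
    """Leading term of p in Theorem 4 for r = n/d, with the O(.) correction dropped."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r >= 1.0:
        return (1.0 / r) * (1.0 - C / r)
    return 1.0 - r * C


def crossing(target=0.5, lo=1.0, hi=10.0, iterations=60):
    """Bisection for the r >= 1 at which the bound falls to `target`.

    Valid because the bound is decreasing in r on [1, infinity).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if asymptotic_bound(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r in (0.25, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"r={r}: bound={asymptotic_bound(r):.4f}")
    print(f"bound equals 1/2 near r={crossing():.3f}")
```

The two branches agree at $r = 1$, where the leading term is $64/(9\pi^2) \approx 0.72$.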
Fig. 1. Asymptotic bound on $P^{pgm}(\mathcal{E})$ vs. numerical results (averaged over 10 runs) for ensembles of $n = 50r$ 50-dimensional uniformly random states. Panels: (a) $0 \le r \le 2$; (b) $0 \le r \le 10$. Axes: $r$ (horizontal), $P^{pgm}(\mathcal{E})$ (vertical); curves: asymptotic lower bound and numerical results.
5. Application to Oracle Identification

The oracle identification problem may be defined as follows [1]. Given an unknown $n$-bit Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ (the oracle), picked uniformly at random from a known set $F$ of functions, identify $f$ with the minimum number of uses of $f$. Set $N = |F|$ and $D = 2^n$. Clearly, classical computation cannot identify $f$ with certainty in fewer than $\log_2 N$ queries (as each query may reduce the search space by at most half). However, quantum computation can sometimes do better. On a quantum computer, we can encode the oracle as an $n$ qubit unitary operator $U_f$, defined by the action $U_f |x\rangle \to (-1)^{f(x)} |x\rangle$. Now if the uniform superposition $\frac{1}{2^{n/2}} \sum_{x=0}^{2^n - 1} |x\rangle$ is input to the oracle, the following oracle state will be produced:
$$|\psi_f\rangle = \frac{1}{2^{n/2}} \sum_{x=0}^{2^n - 1} (-1)^{f(x)} |x\rangle. \tag{41}$$
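To make Eq. (41) concrete, the sketch below builds oracle states for small $n$ as lists of real amplitudes and computes their overlaps. The parity oracles are a hypothetical example family chosen for this illustration because their oracle states are mutually orthogonal; none of the helper names come from the paper.

```python
import math
from itertools import product


def oracle_state(f, n):
    """Amplitudes of |psi_f> in Eq. (41), listed over x in {0,1}^n."""
    norm = math.sqrt(2 ** n)
    return [(-1) ** f(x) / norm for x in product((0, 1), repeat=n)]


def overlap(u, v):
    """Inner product <u|v>; the amplitudes here are real."""
    return sum(a * b for a, b in zip(u, v))


def parity_oracle(mask):
    """Hypothetical example oracle: f(x) = mask . x mod 2."""
    return lambda x: sum(m * b for m, b in zip(mask, x)) % 2


if __name__ == "__main__":
    n = 3
    masks = list(product((0, 1), repeat=n))
    states = [oracle_state(parity_oracle(m), n) for m in masks]
    gram = [[round(overlap(u, v), 12) for v in states] for u in states]
    for row in gram:
        print(row)
```

For functions that differ on a single input the overlap is $1 - 2/D$ instead, so such oracles become hard to tell apart as $D$ grows.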
In some cases, a single quantum query to $U_f$ may be enough to identify $f$ with certainty. This will be the case if $\langle \psi_f | \psi_g \rangle = 0$ for all $f \ne g$ (although this is not a necessary condition). The satisfaction of this orthogonality condition may be expected to be a rare event, and is certainly impossible when $N > D$. However, if we are content with a small probability of error, the situation is better: we will show here that, in particular, almost all sets of $N = D$ oracles may be distinguished almost certainly in a constant number of quantum queries.

The oracle identification problem was introduced and studied by Ambainis et al. [1], who (among other results) developed a hybrid quantum-classical algorithm for the random oracle case with which we concern ourselves here. However, the upper bound they obtained in the case where $N = D$ is only $O(\log_2 N)$ queries, which can be shown to be no better than classical computation. Indeed, consider a set of $k$ random classical queries to the unknown function $f$. The probability that any given pair of the random functions in $F$ agrees on all $k$ queries is $2^{-k}$, so by the union bound they will all differ on at least one of the queries with probability $\ge 1 - \binom{N}{2} 2^{-k}$. Setting $k = 3 \log_2 N$ makes this success probability approach 1 for large $N$, showing that almost all oracles can be identified with $O(\log_2 N)$ classical queries.

Lemma 9. Let $\mathcal{E}$ be an ensemble of $N$ $D$-dimensional oracle states corresponding to Boolean functions picked uniformly at random (call these random oracle states). Then the rescaled state matrix $\sqrt{ND}\, S(\mathcal{E})$ defines a point picked uniformly at random on the $ND$-dimensional hypercube $\{-1, 1\}^{ND}$.

Proof. Each component of each state will be $\pm 1/\sqrt{ND}$, with equal probability of each.

$\sqrt{ND}\, S(\mathcal{E})$ therefore meets the required conditions for the Marčenko-Pastur law (Theorem 3), so we may say immediately

Lemma 10. Let $\mathcal{E}$ be an ensemble of $N$ $D$-dimensional random oracle states, and set $r = N/D$. Then
$$\mathbb{E}(P^{pgm}(\mathcal{E})) \ge \begin{cases} \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right) - O(N^{-5/48}) & \text{if } N \ge D \\ 1 - r\left(1 - \frac{64}{9\pi^2}\right) - O(D^{-5/48}) & \text{otherwise} \end{cases} \tag{42}$$
and in particular $\mathbb{E}(P^{pgm}(\mathcal{E})) \ge 0.720 - O(D^{-5/48})$ when $N \le D$.

Like the sphere, the high-dimensional hypercube exhibits the concentration of measure phenomenon, and we can write down a similar result to Levy’s Lemma [18]:

Lemma 11 (Concentration of measure on the cube) [18]. Given a function $f : \{-1, 1\}^d \to \mathbb{R}$ defined on a $d$-dimensional hypercube, and a point $p$ on the hypercube chosen uniformly at random,
$$\Pr[\, |f(p) - \mathbb{E}(f)| \ge \epsilon \,] \le 2 \exp\left( \frac{-2\epsilon^2}{d\eta^2} \right), \tag{43}$$
where $\eta$ is the Lipschitz constant of $f$ with respect to the Hamming distance, $\eta = \sup_{x,y} |f(x) - f(y)| / d(x, y)$.

Lemma 12. Let $H$ be a point on the $nd$-dimensional hypercube written down as an $n \times d$ $\{-1, 1\}$-matrix, and let $f(H) = \frac{1}{n^2 d} \|H\|_1^2$. Then the Lipschitz constant $\eta$ of $f$ satisfies $\eta \le 4/nd$.

Proof. See Appendix C.

Inserting this value of $\eta$ into Lemma 11 gives

Theorem 5. Let $\mathcal{E}$ be an ensemble of $N$ $D$-dimensional random oracle states. Set $p = \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right) - O(N^{-5/48})$ if $N \ge D$, and $p = 1 - r\left(1 - \frac{64}{9\pi^2}\right) - O(D^{-5/48})$ otherwise, where $r = N/D$. Then
$$\Pr[P^{pgm}(\mathcal{E}) \le p - \epsilon] \le 2 \exp\left( \frac{-2ND\epsilon^2}{16} \right). \tag{44}$$
We have our desired result: with 1 query, all but an exponentially small fraction of the possible sets of $N$ $N$-dimensional random oracle states may be distinguished with a constant probability bounded away from 1/2 (in fact, to get a probability of success greater than 1/2, we may take $r = N/D$ to be as high as $\sim 1.66$). A constant number of repetitions allows this probability to be boosted to be arbitrarily high.

6. Discussion

This work can be seen as part of an overall programme of understanding the behaviour of random quantum states [13, 21–23]. There is a fundamental correspondence between the mixed state obtained from an equal mixture of uniformly random pure states, and that produced by starting with a larger system in a uniformly random pure state, and tracing out part of the system. Consider a $d$-dimensional state
$$\rho_{n,d} = \frac{1}{n} \sum_{i=1}^{n} |\psi_i\rangle\langle\psi_i|, \tag{45}$$
where each state in the set $\mathcal{E} = \{|\psi_i\rangle\}$ is picked uniformly at random. We can think of $\rho_{n,d}$ as being produced from the following $dn$-dimensional state (which we consider to live in a Hilbert space $\mathcal{H}_d \otimes \mathcal{H}_n$) by tracing out the second subsystem:
$$|\upsilon\rangle = \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} |\upsilon_k\rangle |k\rangle = \frac{1}{\sqrt{n}} \sum_{k=0}^{n-1} \sum_{l=0}^{d-1} \alpha_{kl} |l\rangle |k\rangle \tag{46}$$
for some coefficients $\alpha_{kl}$. As mentioned previously, the $\alpha_{kl}$ will be approximately normally distributed as $\tilde{N}(0, 1/d)$. So, because of the normalisation factor at the front of the sum, the overall state $|\upsilon\rangle$ has coefficients which are normally distributed and scaled as $\tilde{N}(0, 1/dn)$. Therefore, this state is approximately picked from the uniform distribution on the unit sphere in $\mathbb{C}^{dn}$. Popescu, Short and Winter [21] obtained an upper bound on the expected trace distance of such a state $\rho_{n,d}$ from the maximally mixed state $I/d$, and used this to show that for $n \gg d$, $\rho \approx I/d$. Because the non-zero eigenvalues of the Gram matrix of (rescaled) states in $\mathcal{E}$ are the same as the eigenvalues of $\rho_{n,d}$ [17], this paper can be seen as obtaining a similar result to [21] for the fidelity of $\rho_{n,d}$ with the maximally mixed state, via quite different methods. However, the bound is tighter for $n$ close to $d$, and the notion of “randomness” of the states $\{|\psi_i\rangle\}$ is more general (which is simply a side-effect of relying on the powerful Marčenko-Pastur law).
Acknowledgements. I would like to thank Richard Jozsa for careful reading of this manuscript, Tony Short for helpful discussions, and Aram Harrow for many helpful comments and suggestions. I would also like to thank Jon Tyson for pointing out an error in Appendix A, and a referee for comments that greatly improved the paper. This work was supported in part by the UK Engineering and Physical Sciences Research Council QIP-IRC grant.
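Appendices

A. The PGM is Close to Optimal

Theorem 1 (Barnum, Knill) [4]. $P^{pgm}(\mathcal{E}) \ge P^{opt}(\mathcal{E})^2$.

Proof. Consider an arbitrary POVM $R$ consisting of measurement operators $\{R_i\}$, and an arbitrary ensemble $\mathcal{E}$ of renormalised states $\{|\psi'_i\rangle\}$, with a priori probabilities $p_i$, where as before $|\psi'_i\rangle = \sqrt{p_i}\, |\psi_i\rangle$ and $\rho = \sum_{i=1}^{n} |\psi'_i\rangle\langle\psi'_i|$. Assume wlog that $R_i = |\mu_i\rangle\langle\mu_i|$ for some vectors $|\mu_i\rangle$, as the optimal measurement will always be of this form [9]. Then
\begin{align}
P^{R}(\mathcal{E}) &= \sum_{i=1}^{n} \langle\psi'_i| R_i |\psi'_i\rangle = \sum_{i=1}^{n} |\langle\psi'_i|\mu_i\rangle|^2 = \sum_{i=1}^{n} |\langle\psi'_i| \rho^{-1/4} \rho^{1/4} |\mu_i\rangle|^2 \tag{47} \\
&\le \sum_{i=1}^{n} \langle\psi'_i| \rho^{-1/2} |\psi'_i\rangle \langle\mu_i| \rho^{1/2} |\mu_i\rangle \tag{48} \\
&\le \sqrt{ \sum_{i=1}^{n} \langle\psi'_i| \rho^{-1/2} |\psi'_i\rangle^2 } \left( \sqrt{ \sum_{j=1}^{n} \langle\mu_j| \rho^{1/2} |\mu_j\rangle^2 } \right) \tag{49} \\
&\le \sqrt{ \sum_{i=1}^{n} \langle\psi'_i| \rho^{-1/2} |\psi'_i\rangle^2 } = \sqrt{P^{pgm}(\mathcal{E})}. \tag{50}
\end{align}
The first and second inequalities are Cauchy-Schwarz inequalities, and the third follows because the vectors $\{\rho^{1/2} |\mu_i\rangle\}$ can easily be seen to define an ensemble with density matrix $\rho$:
$$\sum_{i=1}^{n} \rho^{1/2} |\mu_i\rangle\langle\mu_i| \rho^{1/2} = \rho^{1/2} \left( \sum_{i=1}^{n} |\mu_i\rangle\langle\mu_i| \right) \rho^{1/2} = \rho, \tag{51}$$
and we therefore have $\sum_{i=1}^{n} \langle\mu_i| \rho^{1/2} |\mu_i\rangle^2 \le 1$, as this is the probability of success of the measurement $R$ applied to this ensemble.

B. Proof of Lemma 5

In this appendix we will prove a lemma which immediately implies Lemma 5. See [10] for the facts used about elliptic integrals and hypergeometric series.

Lemma 13. Let $0 \le r \le 1$ and $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$. Then
$$\int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\, dy \ge r\pi \sqrt{1 - r\left(1 - \frac{64}{9\pi^2}\right)} \tag{52}$$
with equality at $r = 0$, $r = 1$.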
Proof. We have
\begin{align}
f(r) &= \int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\, dy \tag{53} \\
&= \frac{B}{3} \left( (A^2 + B^2)\, E\left( \sqrt{\frac{B^2 - A^2}{B^2}} \right) - 2A^2\, K\left( \sqrt{\frac{B^2 - A^2}{B^2}} \right) \right) \tag{54} \\
&= \frac{2(1 + \sqrt{r})}{3} \left( (1 + r)\, E\left( \frac{2r^{1/4}}{1 + \sqrt{r}} \right) - (1 - \sqrt{r})^2\, K\left( \frac{2r^{1/4}}{1 + \sqrt{r}} \right) \right), \tag{55}
\end{align}
where $K(r)$ and $E(r)$ are the complete elliptic integrals of the first and second kind, respectively:
$$K(r) = \int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - r^2 x^2)}}, \qquad E(r) = \int_0^1 \frac{\sqrt{1 - r^2 x^2}}{\sqrt{1 - x^2}}\, dx. \tag{56}$$
Note that $f(r)$ may be evaluated explicitly for $r = 0$ and $r = 1$, giving 0 and 8/3 respectively. Now we may apply a standard change of variables (Landen’s transformation) to both elliptic integrals, giving
\begin{align}
f(r) &= \frac{2(1 + \sqrt{r})}{3} \left( \frac{1 + r}{1 + \sqrt{r}} \left( 2E(\sqrt{r}) - (1 - r) K(\sqrt{r}) \right) - (1 - \sqrt{r})^2 (1 + \sqrt{r}) K(\sqrt{r}) \right) \notag \\
&= \frac{4}{3} \left( (1 + r) E(\sqrt{r}) - (1 - r) K(\sqrt{r}) \right). \tag{57}
\end{align}
We now move to the representation of $K(r)$ and $E(r)$ as hypergeometric series, which are defined as follows (using the notation $a^{\bar{n}} = a(a + 1) \cdots (a + n - 1)$):
$${}_2F_1(a, b; c; r) = \sum_{n=0}^{\infty} \frac{a^{\bar{n}}\, b^{\bar{n}}}{c^{\bar{n}}\, n!}\, r^n, \tag{58}$$
$$K(r) = (\pi/2)\, {}_2F_1(1/2, 1/2; 1; r^2), \qquad E(r) = (\pi/2)\, {}_2F_1(-1/2, 1/2; 1; r^2). \tag{59}$$
This has the advantage that, by a transformation rule due to Gauss, we can rewrite $f(r)$ as a single hypergeometric series,
\begin{align}
f(r) &= \frac{2\pi}{3} \left( (1 + r)\, {}_2F_1(-1/2, 1/2; 1; r) - (1 - r)\, {}_2F_1(1/2, 1/2; 1; r) \right) \tag{60} \\
&= \pi r\, {}_2F_1(-1/2, 1/2; 2; r). \tag{61}
\end{align}
Returning to the original inequality, our task has been simplified to showing that
$$g(r) = {}_2F_1(-1/2, 1/2; 2; r)^2 \ge 1 - r\left(1 - \frac{64}{9\pi^2}\right). \tag{62}$$
Evaluating $g(r)$ at 0 and 1 makes it clear that it suffices to show that $g(r)$ is concave for $0 \le r \le 1$, which would follow from showing the second derivative $g''(r)$ to be negative in this region. From the rules governing differentiation of hypergeometric series, it is easy to show that
$$g''(r) = \frac{1}{32} \left( {}_2F_1(1/2, 3/2; 3; r)^2 - 2\, {}_2F_1(-1/2, 1/2; 2; r)\, {}_2F_1(3/2, 5/2; 4; r) \right). \tag{63}$$
The following hypergeometric transformation allows this to be simplified.
\begin{align}
{}_2F_1(a, b; c; r) &= (1 - r)^{c - a - b}\, {}_2F_1(c - a, c - b; c; r) \tag{64} \\
\Rightarrow g''(r) &= \frac{1}{32} \Big( (1 - r)^2\, {}_2F_1(5/2, 3/2; 3; r)^2 \tag{65} \\
&\qquad - 2(1 - r)^2\, {}_2F_1(5/2, 3/2; 2; r)\, {}_2F_1(3/2, 5/2; 4; r) \Big). \tag{66}
\end{align}
We will show that ${}_2F_1(5/2, 3/2; 3; r)^2 \le {}_2F_1(5/2, 3/2; 2; r)\, {}_2F_1(5/2, 3/2; 4; r)$ for all positive $r$, implying that $g''(r)$ is negative in this region. We write out the two hypergeometric series explicitly:
\begin{align}
{}_2F_1(5/2, 3/2; 3; r)^2 &= \sum_{m,n=0}^{\infty} \frac{k_m k_n}{3^{\bar{m}}\, 3^{\bar{n}}}, \quad \text{where } k_n = \frac{(5/2)^{\bar{n}} (3/2)^{\bar{n}}}{n!}\, r^n, \tag{67} \\
{}_2F_1(5/2, 3/2; 2; r)\, {}_2F_1(5/2, 3/2; 4; r) &= \sum_{m,n=0}^{\infty} \frac{k_m k_n}{4^{\bar{m}}\, 2^{\bar{n}}} \tag{68} \\
&= \sum_{m,n=0}^{\infty} \frac{k_m k_n}{3^{\bar{m}}\, 3^{\bar{n}}} \left( \frac{3}{3 + m} \right) \left( \frac{2 + n}{2} \right) \tag{69} \\
&= \sum_{m=0}^{\infty} \frac{k_m^2}{3^{\bar{m}}\, 3^{\bar{m}}} \left( \frac{6 + 3m}{6 + 2m} \right) + \sum_{\substack{m,n=0 \\ m>n}}^{\infty} \frac{k_m k_n}{3^{\bar{m}}\, 3^{\bar{n}}} \left( \frac{3(2 + n)}{2(3 + m)} + \frac{3(2 + m)}{2(3 + n)} \right) \tag{70} \\
&\ge \sum_{m=0}^{\infty} \frac{k_m^2}{3^{\bar{m}}\, 3^{\bar{m}}} + \sum_{\substack{m,n=0 \\ m>n}}^{\infty} \frac{2 k_m k_n}{3^{\bar{m}}\, 3^{\bar{n}}} = {}_2F_1(5/2, 3/2; 3; r)^2, \tag{71}
\end{align}
where elementary methods can be used to show that the bracketed last term in Eq. (70) is at least 2 for any non-negative $m$ and $n$. This completes the proof of the lemma.
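The main steps of this proof can be spot-checked numerically. The sketch below sums the Gauss series (58) directly and tests the closed form (61) at $r = 1$, inequality (62) on a grid, the sign of the second differences of $g$, and the bracketed factor in Eq. (70). It is a sanity check under the stated truncation, not a substitute for the argument; all function names are hypothetical.

```python
import math

C = 1.0 - 64.0 / (9.0 * math.pi ** 2)


def hyp2f1(a, b, c, x, terms=4000):
    """Partial sum of the Gauss series (58); intended for 0 <= x <= 1 here.

    At x = 1 the series converges only when c - a - b > 0.
    """
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


def g(r):
    """Left-hand side of inequality (62)."""
    return hyp2f1(-0.5, 0.5, 2.0, r) ** 2


def chord(r):
    """Right-hand side of inequality (62)."""
    return 1.0 - r * C


def bracket(m, n):
    """Bracketed factor in the off-diagonal sum of Eq. (70)."""
    return 3 * (2 + n) / (2 * (3 + m)) + 3 * (2 + m) / (2 * (3 + n))


if __name__ == "__main__":
    print("f(1) =", math.pi * hyp2f1(-0.5, 0.5, 2.0, 1.0), "vs 8/3 =", 8.0 / 3.0)
    for i in range(0, 11):
        r = i / 10.0
        print(f"r={r:.1f}: g={g(r):.6f}, chord={chord(r):.6f}")
    print("smallest bracket:", min(bracket(m, n) for m in range(1, 40) for n in range(m)))
```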
C. Lipschitz Constants

This appendix contains derivations of the Lipschitz constants of the functions used for the concentration of measure results.

Lemma 7. Let $S$ be an $n \times d$ matrix with $\|S\|_2 \le l$, and define $f(S) = \frac{1}{n}\|S\|_1^2$. Then the Lipschitz constant $\eta$ of $f$ satisfies $\eta \le 2l$.
Fig. 2. Error in approximation to elliptic integral (52) for $0 \le r \le 1$. Axes: $r$ (horizontal), error (vertical).
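Proof. Let $k = \min(n, d)$. We have
\begin{align}
\eta &= \sup_{S,T} \frac{|f(S) - f(T)|}{\|S - T\|_2} = \sup_{S,T} \frac{\left| \|S\|_1^2 - \|T\|_1^2 \right|}{n \|S - T\|_2} \tag{72} \\
&= \sup_{S,T} \frac{(\|S\|_1 + \|T\|_1) \left| \|S\|_1 - \|T\|_1 \right|}{n \|S - T\|_2} \tag{73} \\
&\le \sup_{S,T} \frac{(\|S\|_1 + \|T\|_1)\, \|S - T\|_1}{n \|S - T\|_2} \tag{74} \\
&\le \sup_{S,T} \frac{\sqrt{k}\, (\|S\|_1 + \|T\|_1)}{n} \le 2kl/n \le 2l. \tag{75}
\end{align}
The first inequality is a triangle inequality, and the second two are derived from
$$\|S\|_1 = \sum_{i=1}^{k} \sigma_i(S) \le \sqrt{k} \sqrt{\sum_{i=1}^{k} \sigma_i^2(S)} \le \sqrt{k}\, \|S\|_2, \tag{76}$$
which in turn uses a Cauchy-Schwarz inequality.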
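For $2 \times 2$ real matrices the quantities in this proof have closed forms, which allows a small numerical illustration of Lemma 7. The sketch below assumes that $\|\cdot\|_2$ is the norm appearing in Eq. (76), that is, the square root of the sum of squared singular values. It samples random pairs $S$, $T$ with $\|S\|_2, \|T\|_2 \le l$ and records the largest observed ratio $|f(S) - f(T)| / \|S - T\|_2$. The helper names and the sampling scheme are choices made for this example.

```python
import math
import random


def frobenius_norm(m):
    """The 2-norm of Eq. (76): square root of the sum of squared singular values."""
    return math.sqrt(sum(x * x for row in m for x in row))


def trace_norm_2x2(m):
    """Trace norm (sum of singular values) of a real 2 x 2 matrix.

    Uses s1^2 + s2^2 = ||M||_2^2 and s1 * s2 = |det M|.
    """
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.sqrt(frobenius_norm(m) ** 2 + 2.0 * abs(det))


def f(m, n=2):
    """The function f(S) = ||S||_1^2 / n of Lemma 7, here with n = d = 2."""
    return trace_norm_2x2(m) ** 2 / n


def random_matrix(rng, l):
    """Random real 2 x 2 matrix with 2-norm at most l."""
    raw = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    scale = l * rng.random() / frobenius_norm(raw)
    return [[scale * x for x in row] for row in raw]


def worst_ratio(l=1.0, pairs=2000, seed=1):
    """Largest observed |f(S) - f(T)| / ||S - T||_2 over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(pairs):
        s, t = random_matrix(rng, l), random_matrix(rng, l)
        diff = [[s[i][j] - t[i][j] for j in range(2)] for i in range(2)]
        worst = max(worst, abs(f(s) - f(t)) / frobenius_norm(diff))
    return worst


if __name__ == "__main__":
    print("largest observed ratio for l = 1:", worst_ratio(1.0), "(Lemma 7 bound: 2)")
```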
Lemma 12. Let $S$ be a point on the $nd$-dimensional hypercube written down as an $n \times d$ $\{-1, 1\}$-matrix, and let $f(S) = \frac{1}{n^2 d} \|S\|_1^2$. Then the Lipschitz constant $\eta$ of $f$ (with respect to the Hamming distance) satisfies $\eta \le 4/nd$.

Proof. The proof is very similar to that of Lemma 7. As before, let $k = \min(n, d)$. We have
\begin{align}
\eta &= \sup_{S,T} \frac{|f(S) - f(T)|}{d(S, T)} = \sup_{S,T} \frac{1}{n^2 d} \frac{\left| \|S\|_1^2 - \|T\|_1^2 \right|}{d(S, T)} \tag{77} \\
&\le \sup_{S,T} \frac{1}{n^2 d} \frac{\|S - T\|_1\, (\|S\|_1 + \|T\|_1)}{\frac{1}{2} \sum_{i,j} |S - T|_{ij}} \tag{78} \\
&\le \sup_{S,T} \frac{2\sqrt{k}\, (\|S\|_1 + \|T\|_1)}{n^2 d} \le 4k/n^2 d \le 4/nd, \tag{79}
\end{align}
where, extending inequality (76), we use $\|S\|_1 \le \sqrt{k}\, \|S\|_2 \le \sqrt{k} \sum_{i,j} |S|_{ij}$.
References

1. Ambainis, A., Iwama, K., Kawachi, A., Masuda, H., Putra, R.H., Yamashita, S.: Quantum identification of boolean oracles. Proc. STACS ’04, LNCS 2996, Berlin-Heidelberg: Springer, 2004, pp. 105–116
2. Bai, Z.D.: Convergence rate of expected spectral distributions of large random matrices. Part II. Sample covariance matrices. Ann. Prob. 21, 649–672 (1993)
3. Bai, Z.D.: Methodologies in spectral analysis of large dimensional random matrices. A review. Statist. Sinica 9, 611–677 (1999)
4. Barnum, H., Knill, E.: Reversing quantum dynamics with near-optimal quantum and classical fidelity. J. Math. Phys. 43, 2097–2106 (2002)
5. Bennett, C.H., Hayden, P., Leung, D., Shor, P., Winter, A.: Remote preparation of quantum states. IEEE Trans. Inform. Theory 51, 56–74 (2003)
6. Buhrman, H., Cleve, R., Watrous, J., de Wolf, R.: Quantum fingerprinting. Phys. Rev. Lett. 87, 167902 (2001)
7. Davies, E.B.: Information and quantum measurement. IEEE Trans. Inform. Theory 24, 596–599 (1978)
8. Eldar, Y.C., Forney, G.D. Jr.: On quantum detection and the square-root measurement. IEEE Trans. Inform. Theory 47, 858–872 (2001)
9. Eldar, Y.C., Megretski, A., Verghese, G.: Designing optimal quantum detectors via semidefinite programming. IEEE Trans. Inform. Theory 49, 1007–1012 (2003)
10. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series and products. New York, Academic Press (1980)
11. Hausladen, P., Jozsa, R., Schumacher, B., Westmoreland, M., Wootters, W.: Classical information capacity of a quantum channel. Phys. Rev. A 54, 1869–1876 (1996)
12. Hausladen, P., Wootters, W.: A “pretty good” measurement for distinguishing quantum states. J. Mod. Opt. 41, 2385 (1994)
13. Hayden, P., Leung, D.W., Winter, A.: Aspects of generic entanglement. Commun. Math. Phys. 265, 95–117 (2006)
14. Helstrom, C.W.: Quantum Detection and Estimation Theory. New York, Academic Press (1976)
15. Holevo, A.S.: Bounds for the quantity of information transmittable by a quantum communications channel. Problemy Peredachi Informatsii 9, no. 3, 3–11 (1993). English translation: Problems of Information Transmission 9, 177–183 (1973)
16. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge, University Press (1985)
17. Jozsa, R., Schlienz, J.: Distinguishability of states and von Neumann entropy. Phys. Rev. A 62, 012301 (2000)
18. Ledoux, M.: The concentration of measure phenomenon. AMS Mathematical Surveys and Monographs 89, Providence, RI: Amer. Math. Soc. 2001
19. Marčenko, V.A., Pastur, L.A.: Distributions of eigenvalues of some sets of random matrices. Math. USSR-Sb. 1, 507–536 (1967)
20. Nielsen, M.A., Chuang, I.L.: Quantum computation and quantum information. Cambridge, Cambridge University Press (2000)
21. Popescu, S., Short, A.J., Winter, A.: Entanglement and the foundations of statistical mechanics. http://arXiv.org/list/quant-ph/0511225 (2005)
22. Wootters, W.K.: Random quantum states. Found. Phys. 20, 1365 (1990)
23. Zyczkowski, K., Sommers, H.: Average fidelity between random quantum states. Phys. Rev. A 71, 032313 (2005)

Communicated by M.B. Ruskai
Commun. Math. Phys. 273, 637–650 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0247-x
Communications in
Mathematical Physics
Regularity of the Diffusion Coefficient Matrix for Lattice Gas Reversible under Gibbs Measures with Mixing Condition

Yukio Nagahata

Department of Mathematical Science, Graduate School of Engineering Science, Osaka University, Toyonaka, 560–8531, Japan. E-mail: [email protected]

Received: 20 July 2006 / Accepted: 11 December 2006
Published online: 11 April 2007 – © Springer-Verlag 2007
Abstract: In this paper we show that the diffusion coefficient matrix for a lattice gas reversible under Gibbs measures with a mixing condition is continuously differentiable with respect to the order parameter.
1. Introduction

In [10], Varadhan and Yau proved the hydrodynamic limit of a stochastic lattice gas reversible under Gibbs measures which satisfy a certain mixing condition. They comment that the uniqueness of the weak solution to the Cauchy problem for the limiting diffusion equation is quite subtle and give two sufficient conditions for the uniqueness to hold: if the diffusion coefficient is either diagonal or Lipschitz continuous, then the uniqueness is valid. They also give a sufficient condition to be satisfied by the jump rate in order that the coefficient becomes diagonal. In this paper we prove that the coefficient is continuously differentiable, and accordingly complete the proof of the hydrodynamic limit of the model whether the coefficient is diagonal or not. The smoothness of the bulk-diffusion coefficient is proved by Bernardin [1] for a lattice gas reversible under the Bernoulli measures, by Sued [9] for a mean zero exclusion process and by the present author [6, 7] both for a lattice gas with energy and for the generalized exclusion process. Prior to these works the smoothness of the self-diffusion coefficient of the symmetric simple exclusion process was proved by Landim, Olla, and Varadhan [4]. According to [4], we have only to prove the differentiability of the central limit theorem variance of the current which appears in the Green-Kubo formula (see Sect. 5). But it seems difficult to adapt to our model the method which is introduced by [4] and developed in [1, 9], since we do not have any suitable orthonormal basis (with respect to invariant measures) of functions on the configuration space. In this paper we use a certain basis, which, although not orthonormal, is natural: by means of it, our process of particles is transformed to a dual process which may be viewed as a family of random
walks on the square lattices of various dimensions and leads to an inductive Poisson equation (inductive resolvent equation), which is rather tractable. By getting some tail estimate (estimate of the rate of convergence to zero of the tail) of the solution of the inductive Poisson equation, we prove the differentiability of the central limit theorem variance of the current. This paper is organized as follows: In Sect. 2 we state the model and results. In Sect. 3 we introduce a basis of continuous functions on the configuration space and dual operator. In Sect. 4 by using the inductive Poisson equation (inductive resolvent equation), we get a coefficient of a function which solves the resolvent equation and order estimate of it. In Sect. 5 we prove the main result. 2. Model and Result Let N be a cube in Zd with width 2N + 1, centered at the origin and X N := {0, 1} N d (or X := {0, 1}Z ) be the state space of lattice gas. Let η = (ηx )x∈ N (or η = (ηx )x∈Zd ) stand for a generic element of X N (resp. X ), so that for each x, ηx is equal to 0 or 1. Each element of X N or X represents a configuration of particles on understanding that ηx = 1 if there exists a particle at x and ηx = 0 if x is vacant. We define shift operators τx ( x ∈ Zd ), which act on A ⊂ Zd and local functions f as well as configurations η, by τx A := x + A, τx f (η) := f (τx η), (τx η)z := ηz−x and a family of functions { A } A , A ⊂ Zd , by A := 1{ηx =1} .
(1)
x∈A
Let J = {J A } A⊂Z be a family of real numbers, called a potential, and suppose that it is of finite range and invariant under shift, namely there exists r = r (J ) such that if diam A > r , then J A = 0, and if A = τx B, then J A = J B . We use the letter ω to denote a configuration in X N +r (J ) as representing a boundary condition. Let us define the Hamiltonian in N with potential {J A } and boundary condition ω ∈ X N +r (J ) by J A A (η ∪ ω), H N ,ω (η) := A;A∩=∅
where η∪ω ∈ X N +r (J ) such that (η∪ω)x = ηx if x ∈ N and (η∪ω)x = ωx if x ∈ / N . We define the Gibbs measure µ N ,ω in N with potential {J A }, chemical potential λ and boundary condition ω ∈ X N +r (J ) by 1 exp[−H N ,ω (η) − λ ηx ]. µ N ,ω,λ (η) := Z N ,ω,λ x∈ N
Assumption 1. The potential {J A } A satisfies Dobrushin’s condition. A useful sufficient condition for this assumption is found in [2, Example 8.9 (p. 144)]. By Assumption 1, there is a unique infinite volume Gibbs measure, Pρ say, corresponding to each density ρ ∈ [0, 1]: Pρ [{η : η0 = 1}] = ρ. Let E ρ denote the expectation with respect to Pρ . According to [2, Corollary 8.37 (p. 162)], it holds that if f is a local function on X , then E ρ [ f ] is a continuously
differentiable function of ρ. Furthermore, if f is a local function, then we have d 1 (E ρ [ f {x} ] − ρ E ρ [ f ]), Eρ [ f ] = dρ χ (ρ) x where χ (ρ) = x E ρ [({x} − ρ)({0} − ρ)]. For any function f on X N (or X ) we define π x,y by π x,y f (η) := f (η x,y ) − f (η), where η x,y is a configuration defined by ⎧ ⎨ η y if z = x, (η x,y )z := ηx if z = y, ⎩ η otherwise. z d Assumption 2. There exists a set of local functions {ci }i=1 that satisfies the detailed balance condition
ci (η) exp[−H N ,ω (η)] = ci (η0,ei ) exp[−H N ,ω (η0,ei )] for some large N and all ω ∈ X N +r (J ) , where ei is the usual unit vector along the i th coordinate axis, and that if η0 = ηei , then ci (η) = 0 and if η0 = ηei , then ci (η) > 0. d Given local functions {ci }i=1 as in Assumption 2 we define the operator L N (resp. L) by
L N f (η) :=
d
τx ci (η)π x,x+ei f (η),
i=1 x;x,x+ei ∈ N
L f (η) :=
d
τx ci (η)π x,x+ei f (η),
i=1 x∈Zd
where f is a function on X N (or a local function on X ). In a standard way one can show that there exists a unique closed extension of L in the Banach space of continuous functions on X equipped with supremum norm (cf. [5]). We denote by et L the semigroup generated by the closed extension. According to [10], Assumptions 1 and 2 together guarantee that the diffusion coefficient matrix for the limit non-linear diffusion equation is the symmetric d × d matrix given by variational formula as follows: χ (ρ) := E ρ [({0} − ρ)({x} − ρ)], x∈Zd
α, D(ρ)α :=
d
αi Di, j α j
(2)
i=1
=
d 1 inf E ρ [ ci (η)(αi ({ei } − {0} ) − π 0,ei τx g)2 ], 2χ (ρ) g d i=1
where α =
t (α , . . . , α ) 1 d
x∈Z
and the infimum is taken over all local functions.
Theorem 1. Under Assumptions 1 and 2, the diffusion coefficient matrix given by (2) is a continuously differentiable function of ρ.
3. Basis of C(X) and Dual Operator Let C(X ) denote the space of continuous functions on X which we equip with the supremum norm. We recall the definition of { A } A given by (1), A := 1{ηx =1} . x∈A
It is easy to see that { A } A is a basis of local functions and the coefficient fˆ in the expansion of a local function f with respect to { A } A is given by fˆ(A) = (−1)#(A\B) f (η B ), B⊂A
where η B is a special configuration defined by 1 if x ∈ B (η B )x := 0 if x ∈ / B. Note that if a function f (η) depends only on {ηx |x ∈ A}, then fˆ(B) = 0 if B ∩ Ac = ∅ as well as that for every pair of A and B, A B = A∪B .
(3)
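The expansion of a local function in the basis of products of occupation indicators, and the product rule (3), can be checked on a small example. In the sketch below a configuration is represented by its set of occupied sites, so that the special configuration with exactly the sites of $B$ occupied is the set $B$ itself. The coefficient formula is the inclusion-exclusion sum stated above; all function names and the test function are hypothetical.

```python
from itertools import combinations


def subsets(sites):
    """All subsets of `sites`, as frozensets."""
    sites = tuple(sites)
    for size in range(len(sites) + 1):
        for combo in combinations(sites, size):
            yield frozenset(combo)


def indicator_product(a, eta):
    """Product over x in A of 1{eta_x = 1}; eta is the set of occupied sites."""
    return 1 if a <= eta else 0


def coefficients(f, sites):
    """Coefficient at A: sum over B subset of A of (-1)^{#(A minus B)} f(B)."""
    return {a: sum((-1) ** (len(a) - len(b)) * f(b) for b in subsets(a))
            for a in subsets(sites)}


def expand(f_hat, eta):
    """Evaluate the expansion sum over A of f_hat(A) times the basis function at eta."""
    return sum(coef * indicator_product(a, eta) for a, coef in f_hat.items())


if __name__ == "__main__":
    sites = (0, 1, 2)
    # Hypothetical local function: site 0 occupied and site 1 empty, plus 3 per particle.
    f = lambda eta: int(0 in eta and 1 not in eta) + 3 * len(eta)
    f_hat = coefficients(f, sites)
    for a, coef in f_hat.items():
        if coef:
            print(sorted(a), coef)
    assert all(expand(f_hat, eta) == f(eta) for eta in subsets(sites))
```

For this test function the non-zero coefficients are 4 at $\{0\}$, 3 at $\{1\}$ and at $\{2\}$, and $-1$ at $\{0, 1\}$, in agreement with writing $f = 4\eta_0 + 3\eta_1 + 3\eta_2 - \eta_0\eta_1$.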
Let us define A x,y ⊂ Zd by ⎧ / A, ⎨ A \ {x} ∪ {y} if x ∈ A and y ∈ / A, A x,y := A \ {y} ∪ {x} if y ∈ A and x ∈ ⎩ A otherwise. It is easy to see that π x,y A (η) = A (η x,y ) − A (η) = A x,y (η) − A (η), and that if = A then with (4) we have A x,y
L A =
π x,y
d
A
(4)
= 0. Making expansion of ci and using (3) together
τx ci (η)π x,x+ei A
i=1 x∈Zd
=
d i=1 x∈Zd D⊂Zd
=
d
cˆi (τ−x D) D ( A x,x+ei − A )
cˆi (τ−x D) A x,x+ei
i=1 x:A x,x+ei = A D⊂A x,x+ei
+ −
E⊂A x,x+ei
F⊂Zd \A x,x+ei :F=∅
cˆi (τ−x D) A
D⊂A
−
cˆi (τ−x (E ∪ F)) A x,x+ei ∪F
E⊂A F⊂Zd \A:F=∅
cˆi (τ−x (E ∪ F)) A∪F .
(5)
In this formula we regard the first and third terms as main terms and denote the remainder by A , namely A :=
d
cˆi (τ−x (E ∪ F)) A x,x+ei ∪F
i=1 x:A x,x+ei = A E⊂A x,x+ei F⊂Zd \A x,x+ei :F=∅
−
cˆi (τ−x (E ∪ F)) A∪F .
E⊂A F⊂Zd \A:F=∅
Since ci are local functions, there exists r such that if # A > r then cˆi (A) = 0. Therefore
A (B), which is the coefficient of A , satisfies that if # B > # A + r , then
A (B) = 0.
A (B) also satisfies that if # B ≤ # A, then By the definition of A , the coefficient
A (B) = 0. Let us define c(A, B) by
c(A, B) :=
⎧ ⎨ τx ci (η A ) = cˆi (τ−x D) if A = B x,x+ei = B, ⎩
D⊂A
0
otherwise.
Then the main term in (5) is rewritten as d {c(A x,x+ei , A) A x,x+ei − c(A, A x,x+ei ) A }. i=1
x
By using a summation by parts formula we have Lf = L
fˆ(A) A
A
=
fˆ(A)[
A
=
d {c(A x,x+ei , A) A x,x+ei − c(A, A x,x+ei ) A } + A ] i=1
d A i=1
x
c(A, A x,x+ei )( fˆ(A x,x+ei ) − fˆ(A)) A +
x
fˆ(A) A
(6)
A
for any local function f . We define L by L fˆ(A) :=
d i=1
c(A, A x,x+ei )( fˆ(A x,x+ei ) − fˆ(A)).
x
Let Ys be a Markov process on P(Zd ) generated by L and PA a distribution of the Markov process starting from A. Then this process is equivalent to the original process which starts from the configuration η A . We call L the dual operator of L.
4. The Resolvent Equation and the Tail Estimate At the end of the preceding section, we remarked that the Markov process generated by L starting from A is equivalent to the original process starting from η A . Let us decompose the power set P(Zd ) into Pn = Pn (Zd ) := {A ⊂ Zd ; # A = n}. It is easy to see that Pn are ergodic classes of the Markov process generated by L. First we suppose that d = 1. On considering the position of the left-most particle and the distances of particles, there is a bijection from Pn to Z × Nn−1 : for A = {x1 , x2 , . . . , xn }, where xi < x j if i < j, the bijection φ is given by φ(A) = φ({x1 , x2 , . . . , xn }) := (x1 , x2 − x1 , x3 − x2 , . . . , xn − xn−1 ). Therefore for each n the ergodic class Pn may be identified with Z × Nn−1 , and the Markov process generated by L can be regarded as a (continuous time) random walk on Z × Nn−1 . By the definition of L each particle in the original process moves to one of two nearest neighbor sites subject to the exclusion rule. Corresponding to this transition rule the random walk moves from φ(A) to φ(A) ± ji , i = 1, . . . , n, with the reflecting boundary condition such that it suppresses the transition when the walker attempts to move out of the space Z × Nn−1 . Here ji are n dimensional vectors defined by ( ji )i = 1, ( ji )i+1 = −1, ( ji )l = 0 for l = i, i + 1. Let us define a discrete measure m(A) whose mass is given by m(A) := E 1/2 [η A ]. If we consider a family of discrete measures {m ρ (A)} whose mass is given by m ρ (A) := E ρ [η A ], they are essentially equivalent to one another on each ergodic class of the Markov process in the sense that m ρ1 and m ρ2 are absolutely continuous to each other and the Radon-Nikodym derivative is a constant which depends only on ρ1 , ρ2 and the ergodic class. By Assumption 2, it is easy to see that m is a reversible measure of L on each ergodic class. 
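The bijection $\phi$ used in the one-dimensional case above, and the way a single particle jump acts on its image, can be written out directly. The sketch below is an illustration with hypothetical function names; it does not model the jump rates or the exclusion rule, only the change of coordinates.

```python
def phi(a):
    """Map a finite set of integers to (leftmost site, successive gaps)."""
    xs = sorted(a)
    return (xs[0],) + tuple(xs[i] - xs[i - 1] for i in range(1, len(xs)))


def phi_inverse(v):
    """Recover the set of sites from (leftmost site, successive gaps)."""
    xs = [v[0]]
    for gap in v[1:]:
        xs.append(xs[-1] + gap)
    return frozenset(xs)


def jump(a, x, y):
    """Move the particle at x to the empty site y (no rates, no exclusion check)."""
    return (frozenset(a) - {x}) | {y}


if __name__ == "__main__":
    a = frozenset({0, 2, 5})
    b = jump(a, 2, 3)
    print(phi(a), "->", phi(b))
    print("difference:", tuple(q - p for p, q in zip(phi(a), phi(b))))
```

Moving the second particle one site to the right changes the image by $(0, 1, -1)$, which is the vector $j_2$ in the notation of the text.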
Since the Markov process on $P_n$ is a random walk on $\mathbb{Z} \times \mathbb{N}^{n-1}$ reversible under $m$, it is not difficult to obtain an estimate of the rate of convergence to zero at infinity of the Green function $G$ of L, which we call the tail estimate of $G$. Furthermore, since the random walk has a reflecting boundary condition, it is not difficult to estimate the tail of the difference of $G$.

Now suppose that $d \ge 2$. The state space of the Markov process then fails to have a linear order structure and we cannot follow the argument made in the 1-dimensional case. But in general (including the case $d = 1$), we can still get a tail estimate of the Green function of L similar to that obtained in the case $d = 1$ by matching an element of $P_n$ with a set of $n!$ elements of $(\mathbb{Z}^d)^n$, as follows: Pick $A = \{x_1, x_2, \dots, x_n\} \in P_n$. Since $x_i \neq x_j$ if $i \neq j$, there are $n!$ permutations, say $\phi^1(A) = (x_1^1, x_2^1, \dots, x_n^1), \dots, \phi^{n!}(A) = (x_1^{n!}, x_2^{n!}, \dots, x_n^{n!})$. Then $\mathcal{A}_n := \{\phi^i(A) \mid A \in P_n,\ i = 1, 2, \dots, n!\}$ is the same as $(\mathbb{Z}^d)^n$ except for the diagonal part. We define $\phi^{-1} : \mathcal{A}_n \to P_n$ by $\phi^{-1}(x_1, x_2, \dots, x_n) := \{x_1, x_2, x_3, \dots, x_n\}$.
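As a minimal sketch of this matching, the following code lists the $n!$ ordered tuples attached to one configuration and maps each of them back to the underlying set. The names `orderings` and `forget_order` are hypothetical stand-ins for the maps $\phi^i$ and $\phi^{-1}$, and sites of $\mathbb{Z}^d$ are represented as Python tuples.

```python
from itertools import permutations
from math import factorial


def orderings(A):
    """All ordered n-tuples whose entries are exactly the sites of A."""
    return [tuple(p) for p in permutations(sorted(A))]


def forget_order(x):
    """Map an ordered tuple of distinct sites back to the set of sites."""
    return frozenset(x)


# Three particles on the square lattice (d = 2).
A = frozenset({(0, 0), (1, 0), (0, 2)})
lifts = orderings(A)
```

Every lift has pairwise distinct entries, which is why the collection of all lifts misses only the diagonal part of $(\mathbb{Z}^d)^n$.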
Regularity of the Diffusion Coefficient Matrix for Lattice Gas
We consider a Markov generator $\tilde L$ on $\mathcal{A}_n$ by
$$\tilde L f(x_1, \dots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{d} \sum_{l=\pm 1} c\big(\phi^{-1}(x_1, \dots, x_n),\ \phi^{-1}((x_1, \dots, x_n) + l e_{i,j})\big) \times \{ f((x_1, \dots, x_n) + l e_{i,j}) - f(x_1, \dots, x_n) \},$$
where $e_{i,j}$ is the element of $(\mathbb{Z}^d)^n$ whose $i$th component is $e_j \in \mathbb{Z}^d$ and whose other components are 0. Here it sometimes happens that $(x_1, \dots, x_n) + l e_{i,j} \notin \mathcal{A}_n$. In such cases, we regard $(x_1, \dots, x_n) + l e_{i,j}$ as $(x_1, \dots, x_n)$. Suppose that $G = G_A(B)$ and $\tilde G = \tilde G_A(B)$ are the Green functions of L and $\tilde L$, respectively. Then by the definition of L and $\tilde L$, we have
$$G_A(B) = \sum_{i=1}^{n!} \tilde G_{\phi^1(A)}(\phi^i(B)).$$
Since the Markov process generated by $\tilde L$ is essentially a (continuous time) random walk on $\mathbb{Z}^{dn}$, it is not difficult to get the decay estimate of the Green function of $\tilde L$. Furthermore, by the definition of $\phi^i$, we can find the symmetry relative to planes. Therefore we can adapt the reflection principle. We state these as a proposition, but we omit the details of the proof.

Proposition 1. The resolvent kernel $G^\lambda$ ($\lambda > 0$) is well-defined and vanishes exponentially fast, namely there exists $G^\lambda = G^\lambda_A(B)$ for $A, B \in P$ with $\#A = \#B$ such that $G^\lambda$ solves
$$\lambda G^\lambda_A(B) - L G^\lambda_A(B) = \delta_A(B),$$
where $\delta_A(B) = 0$ or $1$ according to $A \neq B$ or $A = B$, and there exist constants $\lambda' > 0$ and $C$, which may depend on $\lambda$, such that
$$|G^\lambda_A(B)| \le C \exp(-\lambda' d(\phi(A), \phi(B))),$$
where $d(\cdot, \cdot)$ is the Euclidean distance on $\mathbb{Z}^{d\#A}$. If $d\#A \ge 3$, then the Green function $G$ is also well-defined, is given by the limit of $G^\lambda$, and vanishes polynomially fast, namely there exists $\lim_{\lambda \to 0} G^\lambda = G = G_A(B)$ for $A, B \in P$ with $\#A = \#B \ge 3/d$ such that $G$ solves $-L G_A(B) = \delta_A(B)$ and there exists a constant $C$ such that
$$|G_A(B)| \le C\, d(\phi(A), \phi(B))^{-(d\#A - 2)}.$$
Furthermore, for all $d \ge 1$, there exists $\lim_{\lambda \to 0} (G^\lambda_A / m(A) - G^\lambda_B / m(B)) = G_{A,B} = G_{A,B}(C)$ for $A, B, C \in P$ with $\#A = \#B = \#C$ such that $G_{A,B}$ solves
$$-L G_{A,B}(C) = \frac{1}{m(A)} \delta_A(C) - \frac{1}{m(B)} \delta_B(C),$$
and there exists a constant $D$, which may depend on $A$, $B$, such that
$$|G_{A,B}(C)| \le D\, d(\phi(A), \phi(C))^{-d\#A}. \qquad (7)$$
Let us define currents we by we (η) := ce (η)({e} − {0} ). Denote by wˆ e (A) the coefficient of the current, i.e., we (η) = A wˆ e (A) A (η) and by
A (B) the coefficient of the remainder term A in the formula (6). From now on, we simply write w for we and wˆ for wˆ e . By using Proposition 1, we have the following lemma. Lemma 1. There is a function gˆ λ : P → R such that gˆ λ solves the resolvent equation λgˆ λ (A) − Lgˆ λ (A) − gˆ λ (B) ˆ (8) B (A) = w(A). It also satisfies that gˆ λ vanishes exponentially fast, namely there exists C and λ > 0 such that |gˆ λ (A)| ≤ C exp(−λ diam(A ∪ {0})). Furthermore gλ (η) := A gˆ λ A (η) is well-defined and solves the resolvent equation λgλ − Lgλ = w. We also have a function gˆ 0 : P → R such that gˆ 0 = limλ→0 gˆ λ and solves the Poisson equation −Lgˆ 0 (A) − gˆ 0 (B) ˆ (9) B (A) = w(A). Furthermore gˆ 0 vanishes polynomially fast: there exists a constant C such that |gˆ 0 (A)| ≤ Cdiam(A ∪ {0})−d# A . Proof. We will construct gˆ 0 by using the Green function in Proposition 1. Construction of gˆ λ can be given in a similar way. We rewrite (9) by −Lgˆ 0 (A) = w(A) ˆ + gˆ 0 (B) (10) B (A), B;# B<# A
since B (A) = 0 if # A ≤ # B. Therefore we call (10) an inductive Poisson equation and can give gˆ 0 (A) inductively on # A, as follows: First, suppose that # A = 0, then A = ∅. Then B (A) = 0 for all B, since 0 = # A ≤ # B for all B. By the definition of L, L f (∅) = 0 for all f . Therefore if w(∅) ˆ = 0, then we get gˆ 0 (∅) = 0 (to simplify). Second, suppose that # A = 1, i.e., A ∈ P1 . Since if 1 = # A ≤ # B then B (A) = 0 and gˆ 0 (∅) = 0, the right-hand side in (10) for A ∈ P1 is also w(A). ˆ Suppose that d ≥ 3, then by using the Green function G B (A) in Proposition 1, we have for A ∈ P1 , w(B)G ˆ gˆ 0 (A) = B (A). B∈P1
Suppose that $\sum_{A \in P_1} \hat w(A) m(A) = 0$. Since $w$ is a local function, there exists a finite sequence of sets $\{A_i^1 ;\ 1 \le i \le m\} \subset P_1$ such that if $B \in P_1 \setminus \{A_i^1 ;\ 1 \le i \le m\}$ then $\hat w(B) = 0$. Put $\tilde w_i^1 := \sum_{j=1}^{i} \hat w(A_j^1) m(A_j^1)$ for $1 \le i \le m - 1$; then we can easily see that
$$\hat w(A) = \sum_{i=1}^{m-1} \tilde w_i^1 \left\{ \frac{1}{m(A_i^1)} \delta_{A_i^1}(A) - \frac{1}{m(A_{i+1}^1)} \delta_{A_{i+1}^1}(A) \right\}.$$
Therefore by using the Green function $G_{B,C}(A)$ in Proposition 1, we have for $A \in P_1$,
$$\hat g_0(A) = \sum_{i=1}^{m-1} \tilde w_i^1\, G_{A_i^1, A_{i+1}^1}(A).$$
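The rewriting of $\hat w$ as a combination of differences $\delta_{A_i}/m(A_i) - \delta_{A_{i+1}}/m(A_{i+1})$, weighted by the partial sums $\tilde w_i = \sum_{j \le i} \hat w(A_j) m(A_j)$, is a telescoping identity that relies on the mean-zero condition $\sum \hat w(A) m(A) = 0$. The sketch below checks it with exact rational arithmetic; the sample values of `w_hat` and `m` are made up for the illustration, and the helper names are hypothetical.

```python
from fractions import Fraction


def partial_sums(w_hat, m):
    """Return w~_i = sum_{j<=i} w_hat[j]*m[j] for all but the last index."""
    sums, running = [], Fraction(0)
    for w, mass in zip(w_hat[:-1], m[:-1]):
        running += w * mass
        sums.append(running)
    return sums


def reconstruct(w_tilde, m, k):
    """Evaluate sum_i w~_i * (delta_{i,k}/m[i] - delta_{i+1,k}/m[i+1])."""
    total = Fraction(0)
    for i, wt in enumerate(w_tilde):
        if i == k:
            total += wt / m[i]
        if i + 1 == k:
            total -= wt / m[i + 1]
    return total


# Illustrative data: four sites, each with mass 1/2, and a mean-zero w_hat.
m = [Fraction(1, 2)] * 4
w_hat = [Fraction(1), Fraction(-3), Fraction(5), Fraction(-3)]
mean = sum(w * mass for w, mass in zip(w_hat, m))

w_tilde = partial_sums(w_hat, m)
rebuilt = [reconstruct(w_tilde, m, k) for k in range(len(w_hat))]
```

The last coefficient is recovered only because `mean` vanishes; without that condition the final term of the telescoping sum would not match.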
Third, suppose that gˆ 0 (B) for # B ≤ n − 1 is given. Since B (A) = 0 for all A such that # B ≥ # A, for A ∈ P the right-hand side in (10) is equal to w(A) ˆ + n B;# B≤n−1 gˆ 0 (B) B (A) and well-defined. If dn = d# A ≥ 3, then we have
gˆ 0 (A) =
{w(B) ˆ +
B∈Pn
gˆ 0 (C) C (B)}G B (A)
C;#C≤n−1
ˆ + B;# B≤n−1 gˆ 0 (B) for A ∈ Pn . If B (A)) = 0 then we can find A∈Pn (w(A) i n n a pair of sequences {Ai ; 1 ≤ i ≤ m n } and {w˜ i }i , that is w˜ in := ˆ in ) + j=1 {w(A n B;# B≤n−1 gˆ 0 (B) B (A )}, such that i
w(A) ˆ +
gˆ 0 (B) B (A) =
w˜ in {
i
B;# B≤n−1
Therefore we have gˆ 0 (A) =
B
1 1 n δ An (A) − n ) δ Ai+1 (A)}. m(Ain ) i m(Ai+1
n (A). w˜ in G Ain ,Ai+1
i
Finally, we conclude that if we can prove the following condition for w(A) ˆ then the existence of gˆ 0 is proved; for d ≥ 3, w(∅) ˆ =0, for d = 2, w(∅) ˆ = 0 and A∈P1 w(A)m(A) ˆ = 0, and for d = 3, w(∅) ˆ = 0, A∈P1 w(A)m(A) ˆ = 0 and {w(A) ˆ + B;# B≤1 gˆ 0 (B) B (A)}m(A) = 0. We note that the last condition A∈ P 2 ˆ + B;# B≤1 gˆ 0 (B) B (A)}m(A) = 0 makes sense if wˆ satisfies the other A∈P2 {w(A) conditions. Note that gˆ λ (A) for λ > 0 is given without any condition. We note that there exists s ∈ R such that E[w|Fs ] = 0, where Fs is a σ -algebra generated by x∈s ηx and {ηx ; x ∈ / s }. Note that w depend on {ηx ; x ∈ s }. There fore if we pick A0 = {η; x∈s η y = 0, ηz = 0 for all z ∈ / s }, A1 = {η; x∈s η y = 1, ηz = 0 if z ∈ / s } and A2 = {η; x∈s η y = 2, ηz = 0 if z ∈ / s }. Then we have 0 = E[w|A0 ] = w(∅), ˆ 0 = E[w|A1 ] = w(B)m(B)Z ˆ ˆ A1 + w(∅), B∈P1
0 = E[w|A2 ] = (
B∈P2
w(B)m(B) ˆ +
B∈P1 x∈s \B
(11) w(B)m(B ˆ ∪ {x}))Z A2 + w(∅), ˆ
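respectively. Here $Z_{A_i}$ are normalizing constants. By using (11), we get
$$\hat w(\emptyset) = 0 \quad \text{and} \quad \sum_{B \in P_1} \hat w(B) m(B) = 0 \qquad (12)$$
for all $d$. Suppose that $d = 1$ and we consider L on $P_1$. Then L is the discrete Laplacian on $\mathbb{Z}_+$, and $m(A) = 1/2$ for all $A \in P_1$. Since $w$ is a local function there exists a finite set $\mathcal{B} \subset P_1$ such that $\hat w(B) = 0$ if $B \notin \mathcal{B}$. Therefore we conclude that there exist $\{\hat g_0(B)\}_{B \in P_1}$ and a finite set $\mathcal{B}' \subset P_1$ such that $\hat g_0$ solves the Poisson equation (10) and $\hat g_0(B) = 0$ if $B \notin \mathcal{B}'$. According to the binomial expansion $0 = (1 - 1)^{\#A} = \sum_{B \subset A} (-1)^{\#(A \setminus B)}$ if $A \neq \emptyset$, in general we have
$$\sum_{E \subset A} \hat f(E \cup F) = \sum_{G \subset F} (-1)^{\#G} \left( \sum_{E \subset (A \cup F) \setminus G} \hat f(E) \right) \qquad (13)$$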
if F = ∅. Suppose F = ∅ and B ∩ F = ∅. Put A := B ∪ F. First we decompose B (A) into B (A) :=
d
i,x B (A),
i=1 x;A x,x+ei = A
cˆi (τ−x (E ∪ F)). i,x B (A) := E⊂A
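The inclusion-exclusion identity (13), which expresses a sum of $\hat f(E \cup F)$ over $E \subset A$ through alternating sums over subsets of $(A \cup F) \setminus G$ with $G \subset F$, can be checked by brute force on small sets. The sketch below assumes $A \cap F = \emptyset$ and uses random integer values for $\hat f$; all names are hypothetical and chosen for the illustration.

```python
from itertools import combinations
import random


def subsets(S):
    """Yield every subset of S as a frozenset."""
    items = list(S)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def lhs(f_hat, A, F):
    """Sum of f_hat(E | F) over all E contained in A."""
    return sum(f_hat[E | F] for E in subsets(A))


def rhs(f_hat, A, F):
    """Alternating sum over G in F of the sums of f_hat over subsets of (A | F) - G."""
    total = 0
    for G in subsets(F):
        inner = sum(f_hat[E] for E in subsets((A | F) - G))
        total += (-1) ** len(G) * inner
    return total


random.seed(0)
A, F = frozenset({1, 2, 3}), frozenset({4, 5})
f_hat = {E: random.randint(-9, 9) for E in subsets(A | F)}
```

The reason the two sides agree is the binomial cancellation: a subset of $A \cup F$ that misses part of $F$ is counted with total weight $(1-1)^k = 0$.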
By using (13), we have
⎞ ⎛ i,x B (A) = (−1)#G ⎝ cˆi (τ−x E)⎠ G⊂F
=
E⊂A\G
(−1)
#G
c(A \ G, A x,x+ei \ G).
G⊂F
Similarly we have x,x+ei )= B (A
(−1)#G c(A x,x+ei \ G, A \ G).
G⊂F
It is easy to see that if neither B ⊂ A nor B ⊂ A x,x+ei , then B (A) = 0. We conclude that the coefficient of B is rewritten as B (A) =
d i=1
x
i,x B (A),
⎧ ⎪ − (−1)#G c(A \ G, A x,x+ei \ G) ⎪ ⎪ ⎪ ⎪ G⊂A\B ⎪ ⎪ ⎪ if B ⊂ A and B \ A x,x+ei = ∅ ⎨ i,x #G x,x+e i \ G, A \ G) . (A) = B (−1) c(A ⎪ ⎪ ⎪ G⊂A x,x+ei \B ⎪ ⎪ ⎪ ⎪ if B ⊂ A x,x+ei and B \ A = ∅ ⎪ ⎩ 0 otherwise
We note that the summation in the above formula is only a finite summation since if i,x A = A x,x+ei then B (A) = 0. By using this we have
gˆ 0 (B)
B∈P1
B (A)m(A) =
A∈P2
gˆ 0 (B)
B∈P1
d A∈P2 i=1
i,x B (A)m(A).
x
Since B ∈ P1 , A ∈ P2 and if neither B ⊂ A nor B ⊂ A x,x+ei , then B (A) = 0, the right-hand side above is rewritten as i,x
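$$\sum_{B \in P_1} \sum_{i=1}^{d} \sum_{x} \hat g_0(B) \times \sum_{y} \Big[ \{ c(B^{x,x+e_i} \cup \{y\}, B \cup \{y\}) - c(B^{x,x+e_i}, B) \}\, m(B^{x,x+e_i} \cup \{y\}) - \{ c(B \cup \{y\}, B^{x,x+e_i} \cup \{y\}) - c(B, B^{x,x+e_i}) \}\, m(B \cup \{y\}) \Big].$$
By Assumption 2 we have $c(A, B) m(A) = c(B, A) m(B)$, in general. Therefore we arrive at
$$\sum_{B \in P_1} \sum_{i=1}^{d} \sum_{x} \hat g_0(B) \sum_{y} \Big[ -c(B^{x,x+e_i}, B)\, m(B^{x,x+e_i} \cup \{y\}) + c(B, B^{x,x+e_i})\, m(B \cup \{y\}) \Big].$$
By using the summation by parts formula, we have
$$\sum_{B \in P_1} \sum_{i=1}^{d} \sum_{x} \sum_{y} \Big[ -c(B, B^{x,x+e_i}) (\hat g_0(B^{x,x+e_i}) - \hat g_0(B))\, m(B \cup \{y\}) \Big].$$
Since $\sum_{i=1}^{d} \sum_{x} [-c(B, B^{x,x+e_i}) (\hat g_0(B^{x,x+e_i}) - \hat g_0(B))] = -L \hat g_0(B)$ and $\hat g_0(B)$ solves the Poisson equation (10), we have
$$\sum_{B \in P_1} \sum_{y} \hat w(B)\, m(B \cup \{y\}).$$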
By using (11), we have
w(A)m(A) ˆ +
A∈P2
=
A∈P2
gˆ 0 (B)
B∈P1
w(A)m(A) ˆ +
B∈P1
B (A)m(A)
A∈P2
w(B)m(B ˆ ∪ {y}) = 0.
y
Therefore we conclude that there exists a solution of the inductive Poisson equation (10).
i,x . The proof is In order to get the tail estimate, we also use the notation B i,x (A) + also inductive on the cardinality of A. It is not so difficult to see that g(B) ˆ B i,xx,x+e (A) is dominated by a constant multiple of diam(A ∪ {0})−d# A , due g(B ˆ x,x+ei ) i B to cancellation of the first order term. It is also not so difficult to see that i,x (A) + g(B i,xx,x+e (A)}m(A) {g(B) ˆ ˆ x,x+ei ) i B B i,xx,x+e (A x,x+ei )}m(A x,x+ei ) = 0. i,x (A x,x+ei ) + g(B ˆ x,x+ei ) +{g(B) ˆ i B B Adapting the argument for the estimate (7) in Proposition 1, we get that |gˆ 0 (A)| ≤ Cdima(A ∪ {0})−d# A . 5. Proof of the Theorem Proof of Theorem 1. According to [8, Prop. II.2.2 (p.180)], the diffusion coefficient matrix defined by the variational formula (2) coincides with the diffusion coefficient matrix defined by the Green-Kubo formula based on the current-current correlation function: ¯ α, D(ρ)α :=
1 Eρ [ ci (η)(αi ({ei } − {0} ))2 ] 2χ (ρ) i=1 1 ∞ − E ρ [wα τx et L wα ]dt . 2 0 x d
Since ci and {x} are local functions, we have only to prove the differentiability of the second term. According to [3], we have ∞ E ρ [wτx et L w]dt = lim E ρ [wτx gλ ], 0
λ→0
x
x
where gλ is a solution of the resolvent equation λgλ = Lgλ = w. We note that there exists s ∈ Z such that E ρ [w|Fs ] = 0. Therefore it is easy to see that if A ∩ s = ∅, then we have E ρ [w A ] = 0. Furthermore if x ∈ s , x = 0, ei and B ∩ s = ∅, then by using Assumption 2, we also have E ρ [w{x} B ] = 0. By substituting A gˆ λ (A)(A) for gλ , where gˆ λ is given by Lemma 1 and using these equalities, we have E ρ [wτx gλ ] = E ρ [wτx gˆ λ (A) A ] = E ρ [w gˆ λ (A)τ−x A ], x
x
A⊂Zd
x
A∈Bx
where Bx := {A : #(τ−x A ∩ s ) ≥ 2 or τ−x A ∩ {0, ei } = ∅}. In order to apply Fubini’s theorem (and Lebesgue’s convergence theorem), we estimate |gˆ λ (A)|E ρ [|wτ−x A |], x
A∈Bx
for λ ≥ 0. We infer that E ρ [|w A |] ≤ w∞ E ρ [ A ] ≤ w∞ θ # A , where θ = θ (ρ) is defined by θ :=
E ρ [{0} |F] < 1.
sup d \{0}
F∈σ ({0,1}Z
)
Therefore we have x
|gˆ λ (A)|E ρ [|wτ−x A |] ≤
∞
A∈Bx
|gˆ λ (A)|w∞ θ n .
x n=1 A∈Bx :# A=n
By using Lemma 1, it is not difficult to see that there exists C which may depend on λ such that for all n, |gˆ λ (A)| ≤ Cn. x
A∈Bx :# A=n
Since θ < 1, we conclude that x
|gˆ λ (A)|E ρ [|wτ−x A |] < ∞.
A∈Bx
By using Fubini’s theorem, we have
E ρ [w
gˆ λ (A)τ−x A ] =
A∈Bx
x
∞ n=1 x
gˆ λ (A)E ρ [wτ−x A ].
A∈Bx :# A=n
By using Lebesgue’s convergence theorem and Lemma 1, we have lim
λ→0
E ρ [wτx gλ ] =
x
∞ n=1 x
gˆ 0 (A)E ρ [wτ−x A ].
A∈Bx :# A=n
d We know that dρ E ρ [ f ] = x E ρ [ f ({x} − ρ)]/χ (ρ) and it is not difficult to see that there exists a constant C such that x E ρ [ A ({x} − ρ)] ≤ Cθ # A . Therefore we can justify the exchange of the order of summation and differentiation. Remark 1. Let us define a sequence of local functions {gn }n by gˆ 0 (A) A . gn := A⊂n
Then it is not difficult to see that lim E ρ [
n→∞
d i=1
= lim E ρ [ λ→0
d i=1
ci (η)(αi ({ei } − {0} ) −
π 0,ei τx gn )2 ]
x∈Zd
ci (η)(αi ({ei } − {0} ))2 ] −
1 E ρ [wα τx gλ ] . 2 x
Namely, $\{g_n\}_n$ is a minimizing sequence of the variational formula (2) which does not depend on the density $\rho$. Furthermore, the left-hand side above and its derivative with respect to $\rho$ converge uniformly in $\rho$.

Acknowledgement. The author would like to thank Professor K. Uchiyama for helping him with valuable suggestions.
References
1. Bernardin, C.: Regularity of the diffusion coefficient for lattice gas reversible under Bernoulli measures. Stochastic Process. Appl. 101(1), 43–68 (2002)
2. Georgii, H.O.: Gibbs Measures and Phase Transitions. Berlin: Walter de Gruyter & Co., 1988
3. Kipnis, C., Varadhan, S.R.S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusion. Commun. Math. Phys. 104(1), 1–19 (1986)
4. Landim, C., Olla, S., Varadhan, S.R.S.: Symmetric simple exclusion process: regularity of the self-diffusion coefficient. Commun. Math. Phys. 224(1), 307–321 (2001)
5. Liggett, T.M.: Interacting particle systems. Berlin-Heidelberg-New York: Springer, 1985
6. Nagahata, Y.: Regularity of the diffusion coefficient matrix for the lattice gas with energy. Ann. Inst. H. Poincaré Probab. Statist. 41, 45–67 (2005)
7. Nagahata, Y.: Regularity of the diffusion coefficient matrix for generalized exclusion process. Preprint
8. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin-Heidelberg-New York: Springer, 1991
9. Sued, M.: Regularity properties of the diffusion coefficient for a mean zero exclusion process. Ann. Inst. H. Poincaré Probab. Statist. 41, 1–33 (2005)
10. Varadhan, S.R.S., Yau, H.T.: Diffusive limit of lattice gases with mixing condition. Asian J. Math. 1, 623–678 (1997)

Communicated by H.-T. Yau
Commun. Math. Phys. 273, 651–675 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0198-2
Communications in
Mathematical Physics
Adiabatic Theorems for Quantum Resonances

Walid K. Abou Salem, Jürg Fröhlich

Institute for Theoretical Physics, ETH Zurich, CH-8093 Zurich, Switzerland. E-mail: [email protected]; [email protected]

Received: 25 July 2006 / Accepted: 3 October 2006
Published online: 17 March 2007 – © Springer-Verlag 2007
Abstract: We study the adiabatic time evolution of quantum resonances over time scales which are small compared to the lifetime of the resonances. We consider three typical examples of resonances: The first one is that of shape resonances corresponding, for example, to the state of a quantum-mechanical particle in a potential well whose shape changes over time scales small compared to the escape time of the particle from the well. Our approach to studying the adiabatic evolution of shape resonances is based on a precise form of the time-energy uncertainty relation and the usual adiabatic theorem in quantum mechanics. The second example concerns resonances that appear as isolated complex eigenvalues of spectrally deformed Hamiltonians, such as those encountered in the N-body Stark effect. Our approach to studying such resonances is based on the Balslev-Combes theory of dilatation-analytic Hamiltonians and an adiabatic theorem for nonnormal generators of time evolution. Our third example concerns resonances arising from eigenvalues embedded in the continuous spectrum when a perturbation is turned on, such as those encountered when a small system is coupled to an infinitely extended, dispersive medium. Our approach to this class of examples is based on an extension of adiabatic theorems without a spectral gap condition. We finally comment on resonance crossings, which can be studied using the last approach.

1. Introduction

There are many physically interesting examples of quantum resonances in atomic physics and quantum optics. To mention one, the state of a cold gas of atoms localized in a trap may be metastable, since the trap may not be strictly confining. In typical Bose-Einstein condensation experiments, the shape of the trap usually varies slowly over time scales small compared to the lifetime of the metastable state, yet larger than a typical relaxation time (see for example [1]). This is an example of an adiabatic evolution of shape resonances.
Current address: Department of Mathematics, University of Toronto, M5S 2E4 Toronto, ON, Canada

While there has been much progress in a time-independent theory of quantum
resonances (see [2–7]), there has been relatively little work on a time-dependent theory of quantum resonances (see [8–11]). Surprisingly, and in spite of its relevance to the interpretation of many experiments and phenomena in atomic physics, the problem of adiabatic evolution of quantum resonances has received very little attention so far (but see [12]). In this paper, we study the adiabatic evolution of three general types of quantum resonances. This is a first step towards a rigorous understanding of resonance and metastability phenomena, such as hysteresis in magnets and Sisyphus cooling of atomic gases (see for example [13–15]).

We first consider the adiabatic evolution of so-called shape resonances. More specifically, we consider a quantum-mechanical particle in a potential well, say that of a quantum dot or a locally harmonic trap, with the property that the shape of the potential well changes over time scales which are small compared to the time needed for the particle to escape from the well. The analysis of this problem is based on a precise form of the time-energy uncertainty relation, see [8], and the standard adiabatic theorem in quantum mechanics, [16]. In our approach, we obtain an explicit estimate on the distance between the true state of the system and an instantaneous metastable state. Our approach can also be applied to study the time evolution of the state of an electron in a He$^+$ ion moving in a time-dependent magnetic field which changes over time scales that are small compared to the ionization time of the ion (see [8] for a discussion of this example in the time-independent situation).
The second class of examples concerns quantum resonances that appear as isolated complex eigenvalues of spectrally deformed Hamiltonians, such as the N-body Stark effect (see for example [4, 6]).1 Our analysis is based on Balslev-Combes theory for dilatation analytic Hamiltonians, [17], and on an adiabatic theorem for generators of evolution that are not necessarily normal or bounded, [18]. This approach, too, yields explicit estimates on the distance between the true state of the system and an instantaneous metastable state.

The third class of examples concerns resonances that emerge from eigenvalues of an unperturbed Hamiltonian embedded in the continuous spectrum after a perturbation has been added to the Hamiltonian. Typical examples of such resonances arise when a small quantum-mechanical system, say an impurity spin, is coupled to an infinite, dispersive medium, such as magnons (see for example [22–24] for relevant physical models). Our approach to such examples is based on an extension of adiabatic theorems without a spectral gap condition, [25–29]. Our results also cover the case of resonance crossings. Further details of applications where our assumptions are explicitly verified for various physical models will appear in [30].

1 For the sake of simplicity, we consider nondegenerate resonances. However, our analysis can be extended to the case of degenerate resonances (see [10, 11] for a discussion of the latter in the time-independent case).

2. Adiabatic Evolution of Shape Resonances

In this section, we study the time evolution of the state of a quantum-mechanical particle moving in $\mathbb{R}^d$ under the influence of a potential well which is not strictly confining. The potential well is described by a time-dependent function on $\mathbb{R}^d$,
$$v_\theta^\tau(x, t) \equiv \theta^2 v\!\left(\frac{x}{\theta}, s\right), \qquad (1)$$
where $\tau$ is the adiabatic time scale, $t$ is the time, $s = t/\tau$ is the rescaled time, $\theta \ge 1$ is a parameter characterizing the width and height of the well, and $v(x, s)$ is a function on
$\mathbb{R}^d \times \mathbb{R}$ that is twice differentiable in $s \in \mathbb{R}$ and smooth in $x \in \mathbb{R}^d$; see below for precise assumptions on the potential. We assume that $\tau$ is small compared to the escape time of the particle from the well.2 By introducing an auxiliary adiabatic evolution, we obtain precise estimates on the difference between the true state of the particle and an instantaneous metastable state. Our analysis is based on the generalized time-energy uncertainty relation, as derived in [8], and on the usual adiabatic theorem in quantum mechanics, [16].

The Hilbert space of the system is $\mathcal{H} := L^2(\mathbb{R}^d, d^d x)$. Its dynamics is generated by the time-dependent Hamiltonian
$$H^\tau(t) := -\Delta/2 + v_\theta^\tau(x, t), \qquad (2)$$
where $\Delta$ is the $d$-dimensional Laplacian.3 We make the following assumptions on the potential $v_\theta(x, s)$, for $s \in I$, where $I$ is an arbitrary, but fixed compact interval of $\mathbb{R}$.

(A1) The origin $x = 0$ is a local minimum of $v(x, s)$, for all $s \in I$, and, without loss of generality, $v(0, s) = 0$ for $s \in I$.

(A2) The Hessian of $v(\cdot, s)$ at $x = 0$ is positive-definite, with eigenvalues $\Omega_i^2(s) > \Omega_0^2$, $i = 1, \dots, d$, where $\Omega_0 > 0$ is a constant independent of $s$.

(A3) Consider a smooth function $g(x)$ with the properties that $g(x) = 1$ for $|x| < \frac{1}{2}$ and $g(x) = 0$ for $|x| > 1$, where $|x| := \big(\sum_{i=1}^{d} x_i^2\big)^{1/2}$. For $\varepsilon > 0$, we define the rescaled function $g_{\varepsilon,\theta}$ by
$$g_{\varepsilon,\theta}(x) := g\!\left(\frac{x}{(\varepsilon\theta)^{1/3}}\right). \qquad (3)$$
We assume that, for all $\varepsilon > 0$,
$$\max_{x \in \mathbb{R}^d} g_{\varepsilon,\theta}(x)\, \Big| v_\theta(x, s) - \frac{1}{2}\, x \cdot \Omega^2(s) x \Big| \le c\,\varepsilon, \qquad (4)$$
uniformly in $s \in I$, where $\Omega^2(s)$ is the Hessian of $v(\cdot, s)$ at $x = 0$ and $c$ is a finite constant independent of $s \in I$.

(A4) $v(x, s)$ is smooth, polynomially bounded in $x \in \mathbb{R}^d$, and bounded from below, uniformly in $s \in I$. Moreover, $v(x, s)$ is twice differentiable in $s \in I$. We also assume that $\|H(s_1) - H(s_2)\| \le C$, $\forall s_1, s_2 \in I$, where $C$ is a finite constant.

Note that under these assumptions, $v_\theta$ is a potential well of diameter of order $O(\theta)$ and height $O(\theta^2)$. Let
$$H_0(s) := -\Delta/2 + \frac{1}{2}\, x \cdot \Omega^2(s) x, \qquad (5)$$
and
$$H_1(s) := H_0(s) + w_{\varepsilon,\theta}(x, s), \qquad (6)$$
where4
$$w_{\varepsilon,\theta}(x, s) := g_{\varepsilon,\theta}(x) \Big[ v_\theta(x, s) - \frac{1}{2}\, x \cdot \Omega^2(s) x \Big]. \qquad (7)$$

2 The escape time of the particle from the well, which is related to $\theta$, will be estimated later in this section.
3 We work in units where the mass of the particle $m = 1$, and Planck's constant $\hbar = 1$.
4 $H_1(s)$ depends on the parameters $\theta$ and $\varepsilon$, but we drop the explicit dependence to simplify notation.
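To make the splitting of the anharmonic part of the potential into an inner piece (cut off by the rescaled function from (A3)) and an outer piece concrete, here is a minimal one-dimensional, time-independent sketch. The sample potential `v`, the bump construction in `g`, the constant `OMEGA_SQ = 1` and all function names are assumptions made for this illustration only; they are not taken from the paper.

```python
import math


def h(t):
    """Smooth function that vanishes identically for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def g(x):
    """Smooth cutoff: equals 1 for |x| <= 1/2 and 0 for |x| >= 1."""
    r = abs(x)
    a, b = h(1.0 - r), h(r - 0.5)
    return a / (a + b)


def g_scaled(x, eps, theta):
    """Rescaled cutoff with length scale (eps * theta)^(1/3)."""
    return g(x / (eps * theta) ** (1.0 / 3.0))


def v(x):
    """Hypothetical well: minimum 0 at x = 0, v''(0) = 1, not confining."""
    return 0.5 * x * x * math.exp(-x * x)


OMEGA_SQ = 1.0  # Hessian of the sample potential at the origin


def anharmonic(x, theta):
    """theta^2 v(x/theta) minus its harmonic approximation at the origin."""
    return theta ** 2 * v(x / theta) - 0.5 * OMEGA_SQ * x * x


def w(x, eps, theta):
    """Inner part of the anharmonic term (kept in the auxiliary Hamiltonian)."""
    return g_scaled(x, eps, theta) * anharmonic(x, theta)


def delta_v(x, eps, theta):
    """Outer part of the anharmonic term (treated as the perturbation)."""
    return (1.0 - g_scaled(x, eps, theta)) * anharmonic(x, theta)
```

With `eps = 0.1` and `theta = 1000.0` the cutoff scale is about 4.64, so `delta_v` vanishes for `|x|` below roughly 2.32, while `w + delta_v` always reproduces the full anharmonic term.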
Note that
$$H(s) = H_1(s) + \delta v_{\varepsilon,\theta}(x, s), \qquad (8)$$
where
$$\delta v_{\varepsilon,\theta}(x, s) := (1 - g_{\varepsilon,\theta}(x)) \Big[ v_\theta(x, s) - \frac{1}{2}\, x \cdot \Omega^2(s) x \Big]. \qquad (9)$$
It follows from Assumptions (A3) and (A4), and (9) that
$$\max_{s \in I} |\delta v_{\varepsilon,\theta}(x, s)| \le \begin{cases} 0, & |x| \le \frac{1}{2} (\varepsilon\theta)^{1/3} \\ \theta^2 P(x/\theta), & |x| \ge \frac{1}{2} (\varepsilon\theta)^{1/3} \end{cases}, \qquad (10)$$
uniformly in $s \in I$, for some polynomial $P(x)$ of $x$. Denote by $P_1^n(s)$, $n \in \mathbb{N}$, the projection onto the eigenstates of $H_1(s)$ corresponding to the $n$th eigenvalue of $H_1(s)$. It follows from Assumptions (A3) and (A4) that, for $\varepsilon$ small enough, $P_1^n(s)$ is twice differentiable in $s$ as a bounded operator for $s \in [0, 1]$. Denote by $U^\tau(s, s')$ the propagator generated by $H(s)$, which solves the equation5
$$\partial_s U^\tau(s, s') = -i\tau H(s) U^\tau(s, s'), \qquad U^\tau(s, s) = 1. \qquad (11)$$
Suppose that the initial state of the system is given by a density matrix $\rho_0$, which is a positive trace-class operator with unit trace. Then the state of the particle at time $t = \tau s$ is given by the density matrix $\rho_s$, which satisfies the Liouville equation
$$\dot\rho_s = -i\tau [H(s), \rho_s] \qquad (12)$$
and $\rho_{s=0} = \rho_0$. The solution of (12) is given by
$$\rho_s = U^\tau(s, 0)\, \rho_0\, U^\tau(0, s). \qquad (13)$$
Let $P$ be an orthogonal projection onto a reference subspace $P\mathcal{H}$, and let $p_s$ denote the probability of finding the state of the particle in the reference subspace $P\mathcal{H}$ at time $t = \tau s$. This probability is given by
$$p_s := \mathrm{Tr}(\rho_s P). \qquad (14)$$
We are interested in studying the adiabatic evolution of a state of a particle which initially, at time $t = 0$, is localized inside the well. Such a state may be approximated by a superposition of eigenstates of $H_1(0)$ (defined in (6)). The initial state of the particle is chosen to be given by
$$\rho_0 = \sum_{n=0}^{N} c_n P_1^n(0), \qquad (15)$$
where $P_1^n(0)$ are the eigenprojections onto the states corresponding to the eigenvalues $E_n$ of $H_1(0)$, $c_n \ge 0$, with $\sum_{n=1}^{N} c_n = 1$, for some finite integer $N$. We let $U_1$ be the propagator of the auxiliary evolution generated by $H_1(s)$. It is given as the solution of the equation
$$\partial_s U_1(s, s') = -i\tau H_1(s) U_1(s, s'), \qquad U_1(s, s) = 1. \qquad (16)$$

5 Assumptions (A1)–(A4) are sufficient to show that $U^\tau$ exists as a unique unitary operator with domain $D$, a common dense core of $H(s)$, $s \in I$.
Moreover, let
$$W(s, 0) := U_1(0, s)\, U^\tau(s, 0). \qquad (17)$$
Then $W(s, 0)$ solves the equation
$$\partial_s W(s, 0) = -i\tau \tilde H(s) W(s, 0), \qquad W(0, 0) = 1, \qquad (18)$$
where
$$\tilde H(s) = U_1^*(s, 0)\, \delta v_{\varepsilon,\theta}(s)\, U_1(s, 0), \qquad (19)$$
as follows from (11), (6), (16) and (17). Then
$$p_s = \mathrm{Tr}(\rho_s P) = \sum_n c_n p_s^n, \qquad (20)$$
where
$$p_s^n := \mathrm{Tr}(P\, U^\tau(s, 0) P_1^n(0) U^\tau(0, s)). \qquad (21)$$
We define
$$\tilde P_1^n(s) := U_1(s, 0) P_1^n(0) U_1(0, s). \qquad (22)$$
We have the following proposition.

Proposition 2.1. Suppose Assumptions (A1)–(A4) hold. Then
$$p_s^n \;{}^{\le}_{\ge}\; \sin_*^2\!\left( \arcsin\sqrt{\mathrm{Tr}(P \tilde P_1^n(s))} \pm 2\tau \int_0^s ds'\, f(P_1^n(0), \tilde H(s')) \right), \qquad (23)$$
for $s \ge 0$, where $p_s^n$ is defined in (21), $\tilde H(s)$ in (19), $\tilde P_1^n(s)$ in (22),
$$\sin_*(x) := \begin{cases} 0, & x < 0 \\ \sin(x), & 0 \le x \le \frac{\pi}{2} \\ 1, & x > \frac{\pi}{2} \end{cases}, \qquad (24)$$
and
$$f(P, A) := \sqrt{\mathrm{Tr}(P A^* (1 - P) A)}. \qquad (25)$$
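The two-sided bound of Proposition 2.1 is easy to evaluate once the reference overlap and the time integral of $f$ are known. The following sketch implements the truncated sine and returns the resulting lower and upper bounds; the function names and the idea of passing the integral as a precomputed number `integral_f` are assumptions made for this illustration.

```python
import math


def sin_star(x):
    """Truncated sine: 0 below 0, sin(x) on [0, pi/2], 1 above pi/2."""
    if x < 0:
        return 0.0
    if x > math.pi / 2:
        return 1.0
    return math.sin(x)


def probability_bounds(overlap, integral_f, tau):
    """Lower and upper bounds on p_s^n.

    overlap    -- Tr(P P~_1^n(s)), a number in [0, 1]
    integral_f -- the integral of f(P_1^n(0), H~(s')) over [0, s]
    tau        -- the adiabatic time scale
    """
    base = math.asin(math.sqrt(overlap))
    shift = 2.0 * tau * integral_f
    lower = sin_star(base - shift) ** 2
    upper = sin_star(base + shift) ** 2
    return lower, upper


tight = probability_bounds(0.25, 0.0, 10.0)     # no leakage: both bounds equal the overlap
loose = probability_bounds(0.25, 1.0, 10.0)     # large shift: bounds become trivial
```

The truncation is what keeps the bounds inside $[0, 1]$: once the accumulated shift exceeds the available angle, the estimate simply stops carrying information.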
The proof of Proposition 2.1 is given in the Appendix, and it is based on the generalized time-energy uncertainty relation derived in [8]. Before stating an adiabatic theorem for shape resonances, we want to estimate the time needed for the quantum-mechanical particle to escape from the potential well if its initial state is given by (15). Note that, for each fixed value of $s \in I$, the spectrum of $H_0(s)$, $\sigma(H_0(s))$, is formed of the eigenvalues
$$E_l^s = \sum_{i=1}^{d} \Omega_i(s) \Big( l_i + \frac{1}{2} \Big), \qquad (26)$$
where $l = (l_1, \dots, l_d) \in \mathbb{N}^d$, with corresponding eigenfunctions
$$\phi_l^s(x) = \prod_{i=1}^{d} \Omega_i(s)^{1/4}\, h_{l_i}\big( \sqrt{\Omega_i(s)}\, x_i \big), \qquad (27)$$
where $h_{l_i}$ are Hermite functions normalized such that $\int dx\, h_l(x) h_k(x) = \delta_{lk}$. Recall that the Hermite functions decay like a Gaussian away from the origin,
$$|h_l(x)| \le c_{l,\delta}\, e^{-(\frac{1}{2} - \delta) x^2}, \qquad (28)$$
for an arbitrary $\delta > 0$ and a finite constant $c_{l,\delta}$ (see for example [36]). It follows from analytic perturbation theory (Lemma A.1 in the Appendix) that the eigenstates of $H_1(s)$ decay like a Gaussian away from the origin. Moreover, it follows from Assumption (A3) that $\delta v_{\varepsilon,\theta}$ is supported outside a ball of radius $\frac{1}{2} (\varepsilon\theta)^{1/3}$. Let $\pi_1^n(x, y; s)$ denote the kernel of $U_1(s, 0) P_1^n(0) U_1(0, s)$, whose modulus decays like a Gaussian away from the origin for arbitrary finite $\tau$; see Lemma A.1 in the Appendix. For each fixed $s \in I$, the following estimate follows from Lemma A.1, Assumptions (A3)–(A4) and (19):
$$\begin{aligned} f(P_1^n(0), \tilde H(s))^2 &= |\mathrm{Tr}(P_1^n(0) \tilde H(s)^2 - P_1^n(0) \tilde H(s) P_1^n(0) \tilde H(s))| \\ &= \tfrac{1}{2} |\mathrm{Tr}([P_1^n(0), \tilde H(s)]^2)| \\ &= \tfrac{1}{2} |\mathrm{Tr}([U_1(s, 0) P_1^n(0) U_1(0, s), \delta v_{\varepsilon,\theta}(s)]^2)| \\ &= \tfrac{1}{2} \int dx\, dy\, |\pi_1^n(x, y; s)|^2 (\delta v_{\varepsilon,\theta}(x, s) - \delta v_{\varepsilon,\theta}(y, s))^2 \\ &\le C_{\varepsilon,n}\, e^{-\mu_\varepsilon \theta^{2/3}}, \end{aligned} \qquad (29)$$
where $\mu_\varepsilon$ is proportional to $\varepsilon^{2/3}$, and $C_{\varepsilon,n}$ is a finite constant independent of $s \in I$ (for finite $n$ appearing in (15) and fixed $\varepsilon$). Let
$$\tau_l \sim e^{\mu_\varepsilon \theta^{2/3} / 2}, \qquad (30)$$
which, by (29) and (15), is a lower bound for the time needed for the particle to escape from the well.6

6 In other words, the particle spends an exponentially large time in $\theta$ inside the well. Note that one may also directly use time-dependent perturbation theory to estimate the time needed for the particle to escape from the well, see [8].

We now introduce the generator of the adiabatic time evolution for each eigenprojection,
$$H_a^n(s) := H_1(s) + \frac{i}{\tau} [\dot P_1^n(s), P_1^n(s)], \qquad (31)$$
and the corresponding propagator $U_a^n(s, s')$ which satisfies
$$\partial_s U_a^n(s, s') = -i\tau H_a^n(s) U_a^n(s, s'); \qquad U_a^n(s, s) = 1. \qquad (32)$$
By Assumptions (A1)–(A4), it follows that (32) has a unique solution, $U_a^n(s, s')$, which is a unitary operator. From the standard adiabatic theorem in quantum mechanics [16],
we know that
$$U_a^n(s, s') P_1^n(s') U_a^n(s', s) = P_1^n(s), \qquad (33)$$
$$\sup_{s \in [0,1]} \| U_a^n(s, 0) - U_1(s, 0) \| = O(\tau^{-1}), \qquad (34)$$
for $\tau \gg 1$.7 For $1 \ll \tau \ll \tau_l$, where $\tau_l$ is given in (30), it follows from (23), (22), (33) and (34) that
$$p_s^n = \mathrm{Tr}(P P_1^n(s)) + O(\max(1/\tau, \tau/\tau_l)). \qquad (35)$$
Let
$$\tilde\rho_s := \sum_{n=1}^{N} c_n P_1^n(s), \qquad (36)$$
the instantaneous metastable state of the particle inside the well. By (15), (35) and (36), we have that, for
$$1 \ll \tau \ll \tau_l, \qquad (37)$$
$$\sup_{s \in [0,1]} | p_s - \mathrm{Tr}(P \tilde\rho_s) | \le A/\tau + B\tau/\tau_l, \qquad (38)$$
where $A$ and $B$ are finite constants. This proves the following theorem for the adiabatic evolution of shape resonances.

Theorem 2.2. (Adiabatic evolution of shape resonances). Suppose Assumptions (A1)–(A4) hold for some $\tau$ satisfying (37). Then
$$p_s = \mathrm{Tr}(P \tilde\rho_s) + O\Big( \max\Big( \frac{1}{\tau}, \frac{\tau}{\tau_l} \Big) \Big). \qquad (39)$$
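The error term of Theorem 2.2 balances two competing effects: the adiabatic error, which shrinks like $1/\tau$, and the leakage out of the well, which grows like $\tau/\tau_l$. The sketch below is only an illustration under simplifying assumptions: it treats the bound $A/\tau + B\tau/\tau_l$ from (38) as an exact function of $\tau$ with hypothetical constants `A = B = 1`, and it reads the relation (30) for the escape time as an equality. Elementary calculus then puts the minimum at $\tau = \sqrt{A \tau_l / B}$.

```python
import math


def escape_time(mu, theta):
    """Escape-time scale exp(mu * theta^(2/3) / 2), read as an equality."""
    return math.exp(mu * theta ** (2.0 / 3.0) / 2.0)


def adiabatic_error_bound(tau, tau_l, A=1.0, B=1.0):
    """The right-hand side A/tau + B*tau/tau_l with illustrative constants."""
    return A / tau + B * tau / tau_l


def best_tau(tau_l, A=1.0, B=1.0):
    """Value of tau minimizing A/tau + B*tau/tau_l."""
    return math.sqrt(A * tau_l / B)


tau_l = 1.0e6
tau_opt = best_tau(tau_l)
err_opt = adiabatic_error_bound(tau_opt, tau_l)
```

With these constants the best achievable bound scales like $2/\sqrt{\tau_l}$, so a longer-lived resonance allows a proportionally sharper adiabatic approximation.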
In other words, over time scales that are small compared to the escape time $\tau_l$, given in (30), of the particle from the potential well, the true state of the particle which is initially localized inside the well, as given by the choice (15), is approximately equal to the instantaneous metastable state given in (36). We remark that a similar analysis can be applied to study the adiabatic evolution of the metastable state of the electron of a He$^+$ ion moving in a time-dependent magnetic field (see [8] for a discussion of this model in the time-independent case).

7 We work in units where a microscopic relaxation time is of order unity.

3. Isolated Eigenvalues of Spectrally Deformed Hamiltonians

In this section, we discuss the adiabatic evolution of quantum resonances which appear as isolated eigenvalues of spectrally deformed Hamiltonians. Examples of such resonances include those arising in the Stark effect and the N-body Stark effect (see for example [4, 6, 2, 3, 12]). Our analysis is based on Balslev-Combes theory for dilatation analytic Hamiltonians and on an adiabatic theorem for nonnormal and unbounded generators of evolution. The main result of this section is Theorem 3.3, which gives an estimate on the distance between the true state and an instantaneous metastable state when the adiabatic time scale is much smaller than the lifetime of the metastable state.
Fig. 1.
3.1. Approximate metastable states. Consider a quantum mechanical system with Hilbert space $\mathcal{H}$ and a family of selfadjoint Hamiltonians $\{H_g^\tau(t)\}_{t \in \mathbb{R}}$, which are given by
$$H_g^\tau(t) = H_g(s), \qquad (40)$$
with fixed dense domain of definition, where
$$H_g(s) = H_0(s) + gV(s), \qquad (41)$$
and $H_0(s)$ is the (generally time-dependent) unperturbed Hamiltonian, while $gV(s)$ is a perturbation bounded relative to $H_0(s)$, unless specified otherwise; see the footnote after assumption (B1) below. Here, $s = t/\tau \in [0, 1]$ is the rescaled time. Let $U(\theta)$, $\theta \in \mathbb{R}$, denote the one-parameter unitary group of dilatations. For fixed $g$, we assume that there exists a positive $\beta$, independent of $s \in [0, 1]$, such that
$$H_g(s, \theta) := U(\theta) H_g(s) U(-\theta), \qquad (42)$$
extends from real values of $\theta$ to an analytic family in a strip $|\mathrm{Im}\,\theta| < \beta$, for all $s \in [0, 1]$. The spectrum of $H_g(s, \theta)$ is assumed to lie in the closed lower half-plane for $\mathrm{Im}\,\theta \in (0, \beta)$. The relation
$$H_g(s, \theta)^* = H_g(s, \bar\theta) \qquad (43)$$
holds for real $\theta$ and extends by analyticity to the strip $|\mathrm{Im}\,\theta| < \beta$. We make the following assumptions:

(B1) $\lambda_0(s)$ is an isolated or embedded simple eigenvalue of $H_0(s)$ with eigenprojection $P_0(s)$.8 We assume that, for each fixed $s \in [0, 1]$ and $\mathrm{Im}\,\theta \in (0, \beta)$, $\lambda_0(s)$ is separated from the essential spectrum of $H_0(s, \theta)$. We also assume that the corresponding eigenprojection $P_0(s, \theta)$ is analytic in $\theta$ for $\mathrm{Im}\,\theta \in (0, \beta)$ and strongly continuous in $\theta$ for $\mathrm{Im}\,\theta \in [0, \beta)$.

(B2) For $0 < \mathrm{Im}\,\theta < \beta$, let $\tilde H_g(s, \theta) = P_g(s, \theta) H_g(s, \theta) P_g(s, \theta)$ denote the reduced Hamiltonian acting on $\mathrm{Ran}(P_g(s, \theta))$, and let $\lambda_g(s)$ be its corresponding eigenvalue. Then
$$\lambda_g(s) \xrightarrow{\ g \to 0\ } \lambda_0(s).$$
We assume that $\lambda_g(s)$ is differentiable in $s \in [0, 1]$.

8 The Stark effect for discrete eigenvalues of Coulomb systems is an example where isolated eigenvalues of the unperturbed Hamiltonian become resonances once the unbounded perturbation is turned on [32, 33].
Adiabatic Theorems for Quantum Resonances
(B3) For each fixed $s\in[0,1]$ and fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$, there is an annulus $N(s,\theta)\subset\mathbb{C}$ centered at $\lambda_0(s)$ such that the resolvent,
$$R_g(s,\theta;z) := (z - H_g(s,\theta))^{-1}, \tag{44}$$
exists for each $z\in N(s,\theta)$ and $0\le g < g_0(z)$.

(B4) Let $\gamma(s)$ be an arbitrary contour in $N(s,\theta)$ enclosing $\lambda_0(s)$ and $\lambda_g(s)$, for $\mathrm{Im}\,\theta\in(0,\beta)$. Then, for $0\le g < g(\gamma(s))$, the spectral projection
$$P_g(s,\theta) := \oint_{\gamma(s)} \frac{dz}{2\pi i}\, R_g(s,\theta;z) \tag{45}$$
satisfies
$$\lim_{g\to 0}\|P_g(s,\theta) - P_0(s,\theta)\| = 0. \tag{46}$$
We assume that $P_g(s,\theta)$ is twice differentiable in $s\in[0,1]$ as a bounded operator, for fixed $\theta$, $\mathrm{Im}\,\theta\in(0,\beta)$.

(B5) RS (Rayleigh-Schrödinger) Expansion. The perturbation $V(s,\theta)$, for $|\mathrm{Im}\,\theta|<\beta$, is densely defined and closed, and $V(s,\theta)^* = V(s,\bar\theta)$. We define $H_g(s,\theta) := H_0(s,\theta) + gV(s,\theta)$ on a core of $H_g(s,\theta)$. For $\mathrm{Im}\,\theta\neq 0$, $z\in N(s,\theta)$ and $g$ small enough, the iterated resolvent equation is
$$R_g(s,\theta;z)P_0(s,\theta) = \sum_{n=0}^{N-1} g^n R_0(s,\theta;z)A_n(s,\theta;z) + g^N R_g(s,\theta;z)A_N(s,\theta;z), \tag{47}$$
for $N\ge 1$ (depending on the model), where
$$A_n(s,\theta;z) := (V(s,\theta)R_0(s,\theta;z))^n P_0(s,\theta). \tag{48}$$
We assume that the individual terms in (47) are well-defined, and that $A_n(s,\theta;z)$ defined in (48) are analytic in $\theta$ in the strip $\mathrm{Im}\,\theta\in(0,\beta)$, for $n=1,\dots,N$, and $z\in N(s,\theta)$, and strongly continuous in $\mathrm{Im}\,\theta\in[0,\beta)$. This assumption is satisfied for $N=1$ in dilatation-analytic systems where $V(s,\theta)$ is bounded relative to $H_0(s,\theta)$, $\mathrm{Im}\,\theta\in[0,\beta)$; see, e.g., [2, 3]. Moreover, this assumption holds for arbitrary $N\ge 1$, if $\lambda_0(s)$ is an isolated eigenvalue of the unperturbed Hamiltonian $H_0(s)$, as in the case of discrete eigenvalues of Coulomb systems, with $V(s,\theta)$ a perturbation describing the Stark effect, [32, 33].

The RS-expansion for $P_g(s,\theta)$ implies that, for $\mathrm{Im}\,\theta\in(0,\beta)$,
$$P_g(s,\theta) = P_g^N(s,\theta) + O(g^N), \tag{49}$$
where $P_g^N(s,\theta)$ is analytic in the strip $\mathrm{Im}\,\theta\in(0,\beta)$, and strongly continuous in $\mathrm{Im}\,\theta\in[0,\beta)$. In other words, the spectral projection onto the resonance state is only defined up to a certain order $N$ in the coupling constant $g$. This is to be expected since resonance states decay with time. We now show that, for each fixed $s\in[0,1]$, the projections $P_g^N(s)$ can be regarded as projections onto approximate metastable states, up to an error of order $O(g^N)$.
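The contour-integral definition (45) and the second-order Rayleigh-Schrödinger shift of the eigenvalue can be checked on a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the 3×3 matrices `H0` and `V`, the coupling `g`, and the helper `riesz_projection` are hypothetical and do not come from the models discussed here. The non-real diagonal entries of `H0` merely mimic a spectrally deformed operator whose remaining spectrum lies in the lower half-plane.

```python
import numpy as np

def riesz_projection(H, center, radius, n_nodes=200):
    """Approximate (1/(2*pi*i)) * contour integral of (z - H)^{-1} dz over a circle
    using the trapezoidal rule, which converges very fast for analytic integrands."""
    dim = H.shape[0]
    P = np.zeros((dim, dim), dtype=complex)
    for k in range(n_nodes):
        w = np.exp(2j * np.pi * k / n_nodes)
        z = center + radius * w
        # dz = i * radius * w * dphi; the factor i and 2*pi cancel against 1/(2*pi*i)
        P += radius * w * np.linalg.inv(z * np.eye(dim) - H)
    return P / n_nodes

# Hypothetical toy model: lambda_0 = 0 is simple and isolated, other levels are complex.
lam = np.array([0.0, 1.0 - 0.3j, 2.0 - 0.5j])
H0 = np.diag(lam)
V = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 1.0],
              [0.5, 1.0, 0.0]])
g = 0.05
Hg = H0 + g * V

P0 = riesz_projection(H0, center=0.0, radius=0.4)
Pg = riesz_projection(Hg, center=0.0, radius=0.4)

# Eigenvalue of the reduced operator Pg Hg Pg on the (one-dimensional) range of Pg.
lam_g = np.trace(Hg @ Pg)
# Second-order RS coefficient: sum over j != 0 of V_0j V_j0 / (lambda_0 - lambda_j).
second_order = sum(V[0, j] * V[j, 0] / (lam[0] - lam[j]) for j in (1, 2))

print("||Pg^2 - Pg||           =", np.linalg.norm(Pg @ Pg - Pg, 2))
print("||Pg - P0||             =", np.linalg.norm(Pg - P0, 2))
print("lambda_g                =", lam_g)
print("g^2 * second-order term =", g**2 * second_order)
```

In this toy model the imaginary part of `lam_g` is negative and of order $g^2$, which is the finite-dimensional analogue of a lifetime of order $g^{-2}$.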
Denote by $\psi_0(s)$ the eigenstate of $H_0(s)$ with corresponding eigenvalue $\lambda_0(s)$, and let
$$\psi_g^N(s) = \frac{1}{\|P_g^N(s)\psi_0(s)\|}\, P_g^N(s)\psi_0(s). \tag{50}$$
We have the following proposition for approximate metastable states, for each fixed $s\in[0,1]$; see [4].

Proposition 3.1. (Approximate metastable states). Assume that (B1)–(B5) hold, and fix $s\in[0,1]$. Let $\xi\in C_0^\infty(\mathbb{R})$ be supported close to $\lambda_0(s)$ with $\xi = 1$ in some open interval containing $\lambda_0(s)$. Then
$$\langle\psi_g^N(s), e^{-iH_g(s)t}\xi(H_g(s))\psi_g^N(s)\rangle = a_g^N(s)e^{-i\lambda_g(s)t} + b_g^N(t), \tag{51}$$
for small $g$, where
$$a_g^N(s) = \langle\psi_g^N(s,\bar\theta), P_g(s,\theta)\psi_g^N(s,\theta)\rangle = 1 + O(g^{2N}), \quad \mathrm{Im}\,\theta\in(0,\beta),$$
and
$$|b_g^N(t)| \le g^{2N} C_m (1+t)^{-m},$$
for $m>0$, where $C_m$ is a finite constant, independent of $s\in[0,1]$.

Although the proof of Proposition 3.1 is a straightforward extension of the results in [4], it is sketched in the Appendix to make the presentation self-contained. Choosing $t=0$ in (51) gives
$$\langle\psi_g^N(s), (1-\xi(H_g(s)))\psi_g^N(s)\rangle = O(g^{2N}). \tag{52}$$
In particular, for $0<\xi\le 1$,
$$\langle\psi_g^N(s), e^{-iH_g(s)t}\psi_g^N(s)\rangle = e^{-i\lambda_g(s)t} + O(g^{2N}). \tag{53}$$
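The quasi-exponential law in (51) and (53) can be illustrated with an idealized spectral measure. The sketch below assumes, purely for illustration, a Lorentzian (Breit-Wigner) energy density for the state, which is not a model treated here and ignores the lower bound on the spectrum. For this density the survival amplitude is exactly $e^{-i\lambda t}$ with $\lambda = E_0 - i\Gamma/2$, so the remainder term vanishes. The names `E0`, `Gamma` and `survival_amplitude` are hypothetical.

```python
import numpy as np

def survival_amplitude(t, energy, density):
    """<psi, exp(-iHt) psi> = integral of rho(E) exp(-iEt) dE, by the trapezoidal rule."""
    integrand = density * np.exp(-1j * energy * t)
    dE = energy[1] - energy[0]
    return dE * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

E0, Gamma = 1.0, 0.1   # hypothetical resonance position and width
energy = np.linspace(E0 - 200.0, E0 + 200.0, 80001)   # grid spacing 0.005
density = (Gamma / (2 * np.pi)) / ((energy - E0) ** 2 + Gamma ** 2 / 4)

for t in (0.0, 5.0, 10.0, 20.0):
    amp = survival_amplitude(t, energy, density)
    # Compare |amplitude| with the exponential law exp(-Gamma * t / 2).
    print(t, abs(amp), np.exp(-Gamma * t / 2))
```

The small discrepancy at $t=0$ comes from truncating the Lorentzian tails to a finite energy window.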
This motivates considering $\psi_g^N(s)$ as approximate instantaneous metastable states, up to an error term of order $O(g^{2N})$. In the next subsection, we recall a general adiabatic theorem proven in [18].

3.2. A general adiabatic theorem. Consider a family of closed operators $\{A(t)\}_{t\in\mathbb{R}}$ acting on a Hilbert space $\mathcal{H}$, with common dense domain of definition $D$. Let $U(t)$ be the propagator given by
$$\partial_t U(t)\psi = -A(t)U(t)\psi, \quad U(t=0) = 1, \tag{54}$$
for $t\ge 0$; $\psi\in D$. We make the following assumptions, which will be verified in the application we consider later in this section.

(C1) $U(t)$ is a bounded semigroup, for $t\in\mathbb{R}_+$, i.e., $\|U(t)\|\le M$, where $M$ is a finite constant.

(C2) For $z\in\rho(A(t))$, the resolvent set of $A(t)$, let $R(z,t) := (z - A(t))^{-1}$. Assume that $R(-1,t)$ is bounded and differentiable as a bounded operator on $\mathcal{H}$, and that $A(t)\dot R(-1,t)$ is bounded, where the dot stands for differentiation with respect to $t$.
Assume that $A(t)\equiv A(0)$ for $t\le 0$, and that it is perturbed slowly over a time scale $\tau$ such that $A^{(\tau)}(t)\equiv A(s)$, where $s := t/\tau\in[0,1]$ is the rescaled time. The following two assumptions are needed to prove an adiabatic theorem.

(C3) The eigenvalue $\lambda(s)\in\sigma(A(s))$ is isolated and simple, with
$$\mathrm{dist}(\lambda(s), \sigma(A(s))\setminus\{\lambda(s)\}) > \delta,$$
where $\delta>0$ is a constant independent of $s\in[0,1]$, and $\lambda(s)$ is continuously differentiable in $s\in[0,1]$.

(C4) The projection onto $\lambda(s)$,
$$P_\lambda(s) := \frac{1}{2\pi i}\oint_{\gamma_\lambda(s)} R(z,s)\,dz, \tag{55}$$
where $\gamma_\lambda(s)$ is a contour enclosing $\lambda(s)$ only, is twice differentiable as a bounded operator.

Note that, since $\lambda(s)$ is simple, the resolvent of $A(s)$ in a neighborhood $N$ of $\lambda(s)$, contained in a ball $B(\lambda(s),r)$ centered at $\lambda(s)$ with radius $r<\delta$, is
$$R(z,s) = \frac{P_\lambda(s)}{z-\lambda(s)} + R_{\mathrm{analytic}}(z,s), \tag{56}$$
where $R_{\mathrm{analytic}}(z,s)$ is analytic in $N$.

We now discuss our general adiabatic theorem. Let $U_\tau(s,s')$ be the propagator given by
$$\partial_s U_\tau(s,s') = -\tau A(s)U_\tau(s,s'), \quad U_\tau(s,s) = 1, \tag{57}$$
for $s\ge s'$. Moreover, define the generator of the adiabatic time evolution,
$$A_a(s) := A(s) - \frac{1}{\tau}[\dot P_\lambda(s), P_\lambda(s)], \tag{58}$$
with the corresponding propagator $U_a(s,s')$, which is given by
$$\partial_s U_a(s,s') = -\tau A_a(s)U_a(s,s'); \quad U_a(s,s) = 1, \tag{59}$$
for $s\ge s'$. It follows from Assumption (C4) that
$$\sup_{s\in[0,1]}\|[\dot P_\lambda(s), P_\lambda(s)]\| \le C,$$
for some finite constant $C$, and hence by perturbation theory for semigroups, [35] Chap. IX, and Assumption (C1), $U_a$ defined on the domain $D$ exists and is unique, and $\|U_a(s,s')\| < M'$ for $s\ge s'$, where $M' = Me^{C}$. We are in a position to state our adiabatic theorem.

Theorem 3.2. (A general adiabatic theorem). Assume (C1)–(C4). Then the following holds:

(i)
$$P_\lambda(s)U_a(s,0) = U_a(s,0)P_\lambda(0), \quad \text{for } s\ge 0 \text{ (the intertwining property)}. \tag{60}$$
(ii)
$$\sup_{s\in[0,1]}\|U_\tau(s,0) - U_a(s,0)\| \le \frac{C}{1+\tau},$$
for $\tau>0$ and $C$ a finite constant. In particular,
$$\sup_{s\in[0,1]}\|U_\tau(s,0) - U_a(s,0)\| = O(\tau^{-1}),$$
for $\tau\gg 1$.

We refer the reader to [18] for a proof of Theorem 3.2.

Remark. Assumption (C1) can be relaxed, but the result of Theorem 3.2 will be weakened. Suppose $A(t)$ generates a quasi-bounded semigroup, i.e., there exist finite positive constants $M$ and $\gamma$ such that $\|U(t)\|\le Me^{\gamma t}$, $t\in\mathbb{R}_+$. Then (ii) in Theorem 3.2 becomes
$$\sup_{s\in[0,1]}\|U_\tau(s,0) - U_a(s,0)\| \le C\,\frac{e^{\tau\gamma}}{\tau},$$
for $1\ll\tau\ll\gamma^{-1}$.

3.3. Adiabatic evolution of resonances that appear as isolated eigenvalues of spectrally deformed Hamiltonians. We consider a quantum mechanical system satisfying Assumptions (B1)–(B5), Subsect. 3.1. Denote by $U_\tau(s,s',\theta)$ the propagator corresponding to the deformed time evolution, which is given by
$$\partial_s U_\tau(s,s',\theta) = -i\tau H_g(s,\theta)U_\tau(s,s',\theta), \quad U_\tau(s,s,\theta) = 1, \tag{61}$$
for $0\le s'\le s\le 1$ and $\mathrm{Im}\,\theta\in[0,\beta)$. We make the following assumption on the existence of the deformed time evolution, which can be shown to hold in specific physical models; see [30, 4, 32] and [35], Chap. IX.

(B6) For fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$, $U_\tau(s,s',\theta)$, $0\le s'\le s\le 1$, exists and is unique as a bounded semigroup with some dense domain of definition $D$.9 In particular, there exists a finite constant $M$ such that $\|U_\tau(s,s',\theta)\|\le M$, $0\le s'\le s\le 1$.

The generator of the deformed adiabatic time evolution is given by
$$H_a(s,\theta) := H_g(s,\theta) + \frac{i}{\tau}[\dot P_g(s,\theta), P_g(s,\theta)], \tag{62}$$
and it generates the propagator
$$\partial_s U_a(s,s',\theta) = -i\tau H_a(s,\theta)U_a(s,s',\theta), \quad U_a(s,s,\theta) = 1, \tag{63}$$
for $0\le s'\le s\le 1$ and fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$.

9 We remark later how this assumption can be relaxed.
For fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$, Assumptions (B4) and (B6) and perturbation theory for semigroups, [35], imply that $U_a(s,s',\theta)$, $s\ge s'$, exists and
$$\|[\dot P_g(s,\theta), P_g(s,\theta)]\| < \infty, \quad \mathrm{Im}\,\theta\in(0,\beta), \tag{64}$$
$$\|U_a(s,s',\theta)\| \le M', \tag{65}$$
where $M'$ is a finite constant independent of $s,s'\in[0,1]$. Assumptions (B1)–(B6) in Subsect. 3.1 imply Assumptions (C1)–(C4) in Subsect. 3.2, with the identification
$$H_g(s,\theta)\leftrightarrow -iA(s), \quad \lambda_g(s)\leftrightarrow -i\lambda(s), \quad P_g(s,\theta)\leftrightarrow P_\lambda(s),$$
for fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$.

We consider a reference subspace corresponding to a projection $P$ which is dilatation analytic, i.e., $P(\theta) = U(\theta)PU(-\theta)$ extends from real values of $\theta$ to an analytic family in a strip $|\mathrm{Im}\,\theta|<\beta$, $\beta>0$. Moreover, we assume that the initial state of the quantum mechanical system is
$$\rho_0 = |\psi_g^N(0)\rangle\langle\psi_g^N(0)|, \tag{66}$$
where $\psi_g^N(s)$ has been defined in (50). We are interested in estimating the difference between the true state of the system and the instantaneous metastable state defined in (50) when $H_g$ varies over a time scale smaller than the lifetime of the metastable state,
$$\tau_l = \min_{s\in[0,1]} |\mathrm{Im}\,\lambda_g(s)|^{-1} \sim g^{-2}.$$
More precisely, we are interested in comparing
$$p_s^\tau := \mathrm{Tr}(PU_\tau(s,0)\rho_0U_\tau^*(s,0)) = \mathrm{Tr}(PU_\tau(s,0)|\psi_g^N(0)\rangle\langle\psi_g^N(0)|U_\tau^*(s,0)) \tag{67}$$
to
$$\tilde p_s^\tau := \mathrm{Tr}(P|\psi_g^N(s)\rangle\langle\psi_g^N(s)|). \tag{68}$$
This is given in the following theorem.

Theorem 3.3. (Adiabatic evolution of isolated resonances). Suppose Assumptions (B1)–(B6) hold. Then, for $g$ small enough and for $1\ll\tau\ll\tau_l\sim g^{-2}$,
$$|p_s^\tau - \tilde p_s^\tau| = O(\max(1/\tau,\ g^N\tau,\ \tau/\tau_l(g))). \tag{69}$$

Proof. This result is a consequence of Theorem 3.2. Since Assumptions (B1)–(B6) hold, we know that, for fixed $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$,
$$U_a(s,0,\theta)P_g^N(0,\theta) = P_g^N(s,\theta)U_a(s,0,\theta) + O(g^N\tau), \tag{70}$$
$$\sup_{s\in[0,1]}\|U_a(s,0,\theta) - U_\tau(s,0,\theta)\| \le \frac{C}{\tau}, \tag{71}$$
for $\tau\gg 1$, where $C$ is a finite constant. For $1\ll\tau\ll g^{-2}$, and $\mathrm{Im}\,\theta\in(0,\beta)$, we have
$$\begin{aligned}
p_s^\tau &= \langle U_\tau(s,0)\psi_g^N(0), PU_\tau(s,0)\psi_g^N(0)\rangle \\
&= \langle U_\tau(s,0,\bar\theta)\psi_g^N(0,\bar\theta), P(\theta)U_\tau(s,0,\theta)\psi_g^N(0,\theta)\rangle \\
&= \langle U_a(s,0,\bar\theta)\psi_g^N(0,\bar\theta), P(\theta)U_a(s,0,\theta)\psi_g^N(0,\theta)\rangle + O(1/\tau) \\
&= \langle\psi_g^N(s,\bar\theta), P(\theta)\psi_g^N(s,\theta)\rangle + O(\max(\tau/\tau_l(g),\ 1/\tau,\ g^N\tau)) \\
&= \tilde p_s^\tau + O(\max(1/\tau,\ g^N\tau,\ \tau/\tau_l(g))).
\end{aligned}$$

Remarks.

(1) To estimate the survival probability of the true state of the system, choose $P = |\psi_g^N(0)\rangle\langle\psi_g^N(0)|$, where $\psi_g^N(s)$ is defined in (50).

(2) One may also estimate the difference between the true expectation value of a bounded operator $A$ and its expectation value in the instantaneous metastable state, provided the operator $A$ is dilatation analytic. Similar to the proof of Theorem 3.3, one can show that
$$\langle\psi_g^N(0), U_\tau(s,0)^*AU_\tau(s,0)\psi_g^N(0)\rangle = \langle\psi_g^N(s), A\psi_g^N(s)\rangle + O(\max(1/\tau,\ g^N\tau,\ \tau/\tau_l(g))),$$
for $1\ll\tau\ll g^{-2}$.

(3) The results of this section can be extended to study the quasi-static evolution of equilibrium and nonequilibrium steady states of quantum mechanical systems at positive temperatures, e.g., when one or more thermal reservoirs are coupled to a small system with a finite dimensional Hilbert space; see [18, 29] for further details. In these applications, the generator of time evolution is deformed using complex translations instead of complex dilatations.

(4) Assumption (B6) can be relaxed. Fix $\theta$ with $\mathrm{Im}\,\theta\in(0,\beta)$. Suppose that $H_g(s,\theta)$ generates a quasi-bounded semigroup,
$$\|U_\tau(s,s',\theta)\| \le Me^{g\alpha\tau(s-s')}, \tag{72}$$
where $M$ and $\alpha$ are positive constants and $g$ is the coupling constant. It follows from Assumption (B4) that
$$\frac{1}{\tau}\sup_{s\in[0,1]}\|[\dot P_g(s,\theta), P_g(s,\theta)]\| \le \frac{C}{\tau}$$
for finite $C$. Together with (72), this implies that
$$\|U_a(s,s',\theta)\| \le M'e^{g\alpha\tau(s-s')},$$
where $M'$ is a finite constant. Then, under Assumptions (B1)–(B6), the result of Theorem 3.3 becomes
$$|p_s^\tau - \tilde p_s^\tau| = O(\max(e^{g\alpha\tau}/\tau,\ g^N\tau,\ \tau/\tau_l(g))),$$
for $1\ll\tau\ll g^{-2}$.
(5) The results of this section can be extended to study “superadiabatic” evolution of quantum resonances.10 In the last decade, there has been a lot of progress in studying superadiabatic processes (see for example [19] and references therein). Depending on the smoothness of the generator of the time evolution, superadiabatic theorems give improved estimates of the difference between the true time evolution and the adiabatic one. Very recently, and after the submission of this paper, superadiabatic theorems with a gap condition have been extended to evolutions generated by nonselfadjoint operators [20]. Using superadiabatic theorems and methods developed in [21], the results of this section can be extended to longer time scales under additional regularity assumptions on the Hamiltonian. Further details will appear in [30].
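The intertwining property (60) and the $O(\tau^{-1})$ estimate in Theorem 3.2 can be observed numerically in the simplest possible setting. The sketch below is a minimal illustration under strong assumptions: it uses a selfadjoint two-level Hamiltonian $H(s) = n(s)\cdot\sigma$ with a slowly rotating unit vector $n(s)$, which is a special case and not the nonnormal, unbounded setting of this section. The adiabatic generator is built as in (62). All names (`OMEGA`, `propagate`, `compare`) and the step count are hypothetical choices.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
OMEGA = np.pi / 2   # total rotation angle of the field (illustrative choice)

def n_vec(s):
    th = OMEGA * s
    return np.array([np.sin(th), 0.0, np.cos(th)])

def dn_vec(s):
    th = OMEGA * s
    return OMEGA * np.array([np.cos(th), 0.0, -np.sin(th)])

def dot_sigma(v):
    return v[0] * SX + v[1] * SY + v[2] * SZ

def hamiltonian(s):
    return dot_sigma(n_vec(s))                        # eigenvalues +1 and -1

def projection(s):
    return 0.5 * (np.eye(2) + dot_sigma(n_vec(s)))    # eigenprojection for +1

def adiabatic_generator(s, tau):
    # H_a = H + (i / tau) [dP/ds, P], as in (62); Hermitian in this selfadjoint case.
    P, dP = projection(s), 0.5 * dot_sigma(dn_vec(s))
    return hamiltonian(s) + (1j / tau) * (dP @ P - P @ dP)

def propagate(gen, tau, steps):
    """Solve dU/ds = -i tau gen(s) U, U(0) = 1, by the exponential midpoint rule."""
    U, h = np.eye(2, dtype=complex), 1.0 / steps
    for k in range(steps):
        w, v = np.linalg.eigh(gen((k + 0.5) * h))
        U = (v * np.exp(-1j * tau * h * w)) @ v.conj().T @ U
    return U

def compare(tau):
    steps = int(50 * tau)
    U_true = propagate(hamiltonian, tau, steps)
    U_adia = propagate(lambda s: adiabatic_generator(s, tau), tau, steps)
    intertwining = np.linalg.norm(projection(1.0) @ U_adia - U_adia @ projection(0.0), 2)
    distance = np.linalg.norm(U_true - U_adia, 2)
    return intertwining, distance

for tau in (25.0, 100.0):
    intertwining, distance = compare(tau)
    # tau * distance should stay bounded as tau grows.
    print(tau, intertwining, distance, tau * distance)
```

The intertwining defect printed here is pure discretization error, while the distance between the two propagators shrinks roughly like $1/\tau$.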
4. General Resonances

In this section, we study the case of resonances which emerge from eigenvalues of an unperturbed Hamiltonian embedded in the continuous spectrum after a perturbation has been added to the Hamiltonian. Such resonances arise, for example, when a small system, say a toy atom or impurity spin, is coupled to a quantized field, e.g. to magnons or the electromagnetic field. The main result of this section is Theorem 4.1, which is based on an extension of the adiabatic theorem without a spectral gap; see for example [25, 27, 29]. The results of this section are more general than those of Sect. 3, since the perturbation is not restricted to be dilatation analytic.

Consider a quantum mechanical system with a Hilbert space $\mathcal{H}$ and a family of time-dependent selfadjoint Hamiltonians $\{H_g(t)\}_{t\in\mathbb{R}}$ such that $H_g(t) = H_0(t) + gV(t)$, where $H_0(t)$ is the unperturbed Hamiltonian with fixed common dense domain of definition $D$, $\forall t\in\mathbb{R}$, and $V(t)$ is a perturbation which is bounded relative to $H_0(t)$ in the sense of Kato [35]. We assume that the variation of the true Hamiltonian, $H_g^\tau(t)$, in time is given by $H_g^\tau(t)\equiv H_g(s)$, where $s\in[0,1]$ is the rescaled time. We make the following assumptions on the model.

(D1) $H_g(s)$ is a generator of a contraction semigroup for $s\in[0,1]$ with fixed dense core. Let $R_g(z,s) := (z - H_g(s))^{-1}$ for $z\in\rho(H_g(s))$, the resolvent set of $H_g(s)$. We assume that $R_g(i,s)$ is differentiable in $s$ as a bounded operator, and $H_g(s)\dot R_g(i,s)$ is bounded uniformly in $s\in[0,1]$. This assumption is sufficient to show that the unitary propagator generated by $H_g(s)$ exists and is unique.

(D2) $\lambda_0(s)$ is a simple eigenvalue of $H_0(s)$ which is embedded in the continuous spectrum of $H_0(s)$, with corresponding eigenvector $\varphi(s)$, $H_0(s)\varphi(s) = \lambda_0(s)\varphi(s)$. Furthermore, the eigenprojection $P_0(s)$ corresponding to $\lambda_0(s)$ is twice differentiable in $s$ as a bounded operator for almost all $s\in[0,1]$, and is continuous in $s$, $s\in[0,1]$, as a bounded operator.
10 We are grateful to an anonymous referee for indicating this possibility to us.
(D3) Let $\bar P_0(s) := 1 - P_0(s)$, and, for a given operator $A$ on $\mathcal{H}$, denote by $\hat A_s$ its restriction to the range of $\bar P_0(s)$, $\hat A_s := \bar P_0(s)A\bar P_0(s)$. Let
$$F(z,s) := \langle\varphi(s), V(s)\bar P_0(s)(z - \widehat{H_0(s)}_s)^{-1}\bar P_0(s)V(s)\varphi(s)\rangle. \tag{73}$$
For each $s\in[0,1]$, we have
$$\mathrm{Im}\,F(\lambda_0(s)+i0, s) \le 0 \quad \text{(Fermi's Golden Rule)}. \tag{74}$$
We note that
$$P_0(s)H_g(s) = \lambda_0(s)P_0(s) + O(g), \quad H_g(s)P_0(s) = \lambda_0(s)P_0(s) + O(g).$$

(D4) Instantaneous metastable states. Let $\xi\in C_0^\infty(\mathbb{R})$ be supported in a neighborhood of $\lambda_0(s)$. For each fixed $s\in[0,1]$, we have
$$\langle\varphi(s), e^{-itH_g(s)}\xi(H_g(s))\varphi(s)\rangle = a_g(s)e^{-it\lambda_g(s)} + b_g(t), \quad t\ge 0, \tag{75}$$
where
$$\lambda_g(s) = \lambda_0(s) + g\langle\varphi(s), V(s)\varphi(s)\rangle + g^2F(\lambda_0(s)-i0, s) + o(g^2),$$
and
$$|a_g(s) - 1| \le Cg^2, \quad |b_g(t)| \le Cg^2(1+t)^{-n},$$
$C$ is a finite constant independent of $s\in[0,1]$, for some $n\ge 1$. Note that $\mathrm{Im}\,\lambda_g(s)\le 0$. Equation (75) uniquely defines the instantaneous resonance state, up to an error $O(g^4)$.11

11 The latter assumption is satisfied if the following holds, for each fixed $s\in[0,1]$; see [7] for a proof of this claim in the $s$-independent case:
(1) There exists a selfadjoint operator $A_s$ such that $e^{itA_s}D\subset D$, for each fixed $s\in[0,1]$ and $t\in\mathbb{R}$. This implies that $D\cap D(A_s)$ is a core of $H_0(s)$.
(2) Denote by $\mathrm{ad}^j_{A_s}(\cdot) := [A_s, \mathrm{ad}^{j-1}_{A_s}(\cdot)]$, $\mathrm{ad}^1_{A_s}(\cdot) := [A_s,\cdot]$. For some integer $m\ge n+6$, where $n$ appears in (D4), the multiple commutators $\mathrm{ad}^i_{A_s}(H_0(s))$ and $\mathrm{ad}^i_{A_s}(V(s))$, $i=1,\dots,m$, exist as $H_0(s)$-bounded operators in the sense of Kato [35].
(3) Mourre's inequality holds for some open interval $\Delta_s\ni\lambda_0(s)$,
$$E_{\Delta_s}(H_0(s))\,i[H_0(s),A_s]\,E_{\Delta_s}(H_0(s)) \ge \theta E_{\Delta_s}(H_0(s)) + K,$$
where $E_{\Delta_s}(H_0(s))$ is the spectral projection of $H_0(s)$ onto $\Delta_s$, $\theta$ is a positive constant, and $K$ is a compact operator.
A physical example where Assumptions (D1)–(D4) may be satisfied is a small system interacting with a field of noninteracting bosons or fermions, for example, a spin system coupled to a time-dependent magnetic field; see [22–24, 30] for further details on the relevant model of a toy atom interacting with the electromagnetic radiation field.

We are interested in the adiabatic evolution of the quantum resonance over time scales which are much smaller than the lifetime of the resonance. We will prove an adiabatic theorem without a spectral gap condition for quantum resonances for weak coupling $g$ (see [25, 27–29]). Let $U_\tau(s,s')$ be the propagator given by
$$\partial_s U_\tau(s,s') = -i\tau H_g(s)U_\tau(s,s'), \quad U_\tau(s,s) = 1, \tag{76}$$
with some dense domain of definition $D$. Existence of $U_\tau$ as a unique unitary operator follows from Assumption (D1) and Theorem X.70 in [36]. Moreover, we introduce the generator of the adiabatic time evolution
$$H_a^0(s) := H_g(s) + \frac{i}{\tau}[\dot P_0(s), P_0(s)]. \tag{77}$$
The propagator corresponding to the approximate adiabatic evolution is given by
$$\partial_s U_a^0(s,s') = -i\tau H_a^0(s)U_a^0(s,s'), \quad U_a^0(s,s) = 1, \tag{78}$$
with domain of definition $D$. Note that $U_a^0$ exists as a unique unitary operator due to Assumptions (D1) and (D2). We have the following theorem, which is an extension of the results in [25, 27, 29].

Theorem 4.1. (Adiabatic theorem for embedded resonances). Suppose Assumptions (D1)–(D4) hold. Then, for small enough coupling $g$ and large enough $\tau$,
$$U_a^0(s,0)P_0(0)U_a^0(0,s) = P_0(s) + O(\tau g), \tag{79}$$
and
$$\sup_{s\in[0,1]}\|U_\tau(s,0) - U_a^0(s,0)\| \le \frac{A}{\tau^{1/2}} + Bg\tau^{1/4} + C(\tau^{-1/4}), \tag{80}$$
where $A$ and $B$ are finite constants, and $C(x)$ is a positive function of $x\in\mathbb{R}$ such that $\lim_{x\to 0}C(x) = 0$. In particular, choosing $\tau\sim g^{-2/3}$ gives
$$\sup_{s\in[0,1]}\|U_\tau(s,0)P_0(0) - P_0(s)\| \le Ag^{1/3} + C(g^{1/6}). \tag{81}$$

Proof. Let
$$h(s,s') := U_a^0(s,s')P_0(s')U_a^0(s',0). \tag{82}$$
Then
$$\begin{aligned}
\partial_{s'}h(s,s') &= i\tau U_a^0(s,s')\{H_a^0(s')P_0(s') - P_0(s')H_a^0(s')\}U_a^0(s',0) \\
&= i\tau U_a^0(s,s')\Big\{\lambda_0(s')P_0(s') + \frac{i}{\tau}\dot P_0(s')P_0(s') - \lambda_0(s')P_0(s') + \frac{i}{\tau}P_0(s')\dot P_0(s') + O(g)\Big\}U_a^0(s',0) \\
&= O(g\tau),
\end{aligned}$$
where we have used the definition of the generator of the adiabatic evolution and the property that $\dot P_0(s)P_0(s) + P_0(s)\dot P_0(s) = 0$. It follows that $h(s,0) = h(s,s) + O(g\tau)$, which is claim (79).

Moreover, we are interested in estimating the difference between the true evolution and the adiabatic time evolution. For $\psi\in D$, we have that
$$\begin{aligned}
(U_\tau(s,0) - U_a^0(s,0))\psi &= -\int_0^s ds'\,\partial_{s'}(U_\tau(s,s')U_a^0(s',0))\psi \\
&= -i\tau\int_0^s ds'\,U_\tau(s,s')[H_g(s') - H_a^0(s')]U_a^0(s',0)\psi \\
&= -\int_0^s ds'\,U_\tau(s,s')[\dot P_0(s'), P_0(s')]U_a^0(s',0)\psi.
\end{aligned}$$
Since the domain of definition $D$ is dense in $\mathcal{H}$, it follows that
$$U_\tau(s,0) - U_a^0(s,0) = -\int_0^s ds'\,U_\tau(s,s')[\dot P_0(s'), P_0(s')]U_a^0(s',0). \tag{83}$$
We will now use a variant of Kato's commutator method to express the integrand as a total derivative plus a remainder term, see [25]. Let
$$X(s) := R_g(\lambda_0(s)+i\epsilon, s)\dot P_0(s)P_0(s) + P_0(s)\dot P_0(s)R_g(\lambda_0(s)-i\epsilon, s). \tag{84}$$
Note that
$$\begin{aligned}
[H_g(s), X(s)] &= [H_g(s)-\lambda_0(s)-i\epsilon,\ R_g(\lambda_0(s)+i\epsilon, s)\dot P_0(s)P_0(s)] + [H_g(s)-\lambda_0(s)+i\epsilon,\ P_0(s)\dot P_0(s)R_g(\lambda_0(s)-i\epsilon, s)] \\
&= [\dot P_0(s), P_0(s)] + i\epsilon X(s) + O(g/\epsilon).
\end{aligned}$$
Furthermore,
$$\partial_{s'}(U_\tau(s,s')X(s')U_a^0(s',0)) = i\tau U_\tau(s,s')[H_g(s'), X(s')]U_a^0(s',0) + U_\tau(s,s')X(s')[\dot P_0(s'), P_0(s')]U_a^0(s',0) + U_\tau(s,s')\dot X(s')U_a^0(s',0).$$
Therefore,
$$\Big\|\int_0^s ds'\,U_\tau(s,s')[\dot P_0(s'), P_0(s')]U_a^0(s',0)\Big\| \le \sup_{s\in[0,1]}\Big\{\frac{1}{\tau}\big[\|X(s)\|(1+2\|\dot P_0(s)P_0(s)\|) + \|\dot X(s)\|\big] + \epsilon\|X(s)\|\Big\} + Cg/\epsilon, \tag{85}$$
where $C$ is a finite constant independent of $s\in[0,1]$. We claim that the following estimates are true for small enough $\epsilon$ and $g$:
$$\text{(i)}\quad \|X(s)\| < C/\epsilon, \tag{86}$$
$$\text{(ii)}\quad \|\dot X(s)\| < C/\epsilon^2, \tag{87}$$
$$\text{(iii)}\quad \epsilon\|X(s)\| < B(\epsilon) + Cg/\epsilon, \tag{88}$$
where $\lim_{\epsilon\to 0}B(\epsilon) = 0$, and $C$ is a finite constant, uniformly in $s\in[0,1]$.
Estimates (i) and (ii) follow from our knowledge of the spectrum of $H_g(s)$ and the resolvent identity. To prove estimate (iii), we compare the LHS of (88) to the case when $g=0$. Let
$$\tilde X(s) := R_0(\lambda_0(s)+i\epsilon, s)\dot P_0(s)P_0(s) + P_0(s)\dot P_0(s)R_0(\lambda_0(s)-i\epsilon, s). \tag{89}$$
Then, by the second resolvent identity,
$$\|X(s)\| \le \|\tilde X(s)\| + Cg/\epsilon^2,$$
uniformly in $s$, for some finite constant $C$. We claim that
$$\lim_{\epsilon\to 0}\epsilon^2\|\tilde X(s)\|^2 = 0. \tag{90}$$
Consider $\varphi\in D$, then $\psi(s) = \dot P_0(s)P_0(s)\varphi\in\mathrm{Ker}(P_0(s))$. Using the spectral theorem for $H_0(s)$, we have the following result:
$$\begin{aligned}
\lim_{\epsilon\to 0}\epsilon^2\|R_0(\lambda_0(s)+i\epsilon, s)\dot P_0(s)P_0(s)\varphi\|^2 &= \lim_{\epsilon\to 0}\epsilon^2\langle\psi(s), R_0(\lambda_0(s)-i\epsilon, s)R_0(\lambda_0(s)+i\epsilon, s)\psi(s)\rangle \\
&= \lim_{\epsilon\to 0}\epsilon^2\int d\mu_{\psi(s)}(\lambda)\,\frac{1}{(\lambda-\lambda_0(s))^2+\epsilon^2} \\
&= \mu(\psi(s)\in\mathrm{Ran}(P_0(s))) = 0,
\end{aligned}$$
and hence claim (90). Therefore,
$$\sup_{s\in[0,1]}\|U_\tau(s,0) - U_a^0(s,0)\| \le \frac{C_1}{\tau\epsilon^2} + \frac{C_2g}{\epsilon} + C(\epsilon), \tag{91}$$
where $C_{1,2}$ are finite constants, and $\lim_{\epsilon\to 0}C(\epsilon) = 0$. Choosing $\epsilon = \tau^{-1/4}$ gives (80). By choosing $\tau\sim g^{-2/3}$, (81) follows from Assumption (D4), (79) and (80).

Remarks.

(1) We note that, using an argument due to Kato, [16], the case of finitely many resonance crossings is already covered by Theorem 4.1, since the latter holds for $P_0(s)$ twice differentiable as a bounded operator for almost all $s\in[0,1]$ and continuous as a bounded operator for $s\in[0,1]$. Suppose that at time $s_0\in[0,1]$, a crossing of $\lambda_0(s)$ with an eigenvalue of $H_0(s)$ happens. It follows from continuity of $P_0(s)$ that, for small $\epsilon>0$, $\mathrm{Ran}(P_0(s_0-\epsilon))$ and $\mathrm{Ran}(P_0(s_0+\epsilon))$ are close up to an error which is arbitrarily small in $\epsilon$, and hence our claim follows.

(2) Further knowledge of the spectrum of $H_0(s)$ will yield a better estimate of the convergence of $\epsilon\|\tilde X\|$ to zero as $\epsilon\to 0$. For example, it is shown in [25–27] that if the spectral measure $\mu_{\varphi(s)}$, $\varphi(s)\in\mathrm{Ran}(P_0(s))$, is $\alpha$-Hölder continuous, for $\alpha\in[0,1]$, uniformly in $s\in[0,1]$, then12
$$\sup_{s\in[0,1]}\|\epsilon R_0(\lambda_0(s)+i\epsilon, s)\dot P_0(s)P_0(s)\| \le A\epsilon^{\alpha/2}, \tag{92}$$
for $\epsilon$ small enough, where $A$ is a finite constant, and hence estimate (81) becomes
$$\sup_{s\in[0,1]}\|U_\tau(s,0)P_0(0) - P_0(s)\| = O(g^{\alpha/12}) \tag{93}$$
for $g$ small enough.

Acknowledgements. WAS is grateful to an anonymous referee for pointing out references [12, 19–21].

12 A measure $\mu$ is $\alpha$-Hölder continuous, $\alpha\in[0,1]$, if there exists a finite constant $C$ such that, for every set $\Delta$ with Lebesgue measure $|\Delta|<1$, $\mu(\Delta) < C|\Delta|^\alpha$, see, e.g., [35].
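The limit used to establish claim (90) says that $\epsilon^2\int d\mu(\lambda)\,((\lambda-\lambda_0)^2+\epsilon^2)^{-1}$ converges to the mass that $\mu$ assigns to the point $\lambda_0$. A quick numerical check is sketched below, assuming for illustration a uniform density on $[-1,1]$ with $\lambda_0$ shifted to $0$; this density and the helper `smoothed_mass` are hypothetical and not taken from the text. For this bounded density the quantity decays like $\epsilon$, so its square root decays like $\epsilon^{1/2}$, in line with the rate in (92) for $\alpha = 1$.

```python
import numpy as np

def smoothed_mass(eps, lam, density, point_mass=0.0):
    """eps^2 * integral of dmu(lam) / (lam^2 + eps^2), for
    dmu = density * dlam + point_mass * (Dirac mass at lam = 0)."""
    integrand = density * eps**2 / (lam**2 + eps**2)
    dlam = lam[1] - lam[0]
    ac_part = dlam * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    # An atom at lam = 0 contributes eps^2 / eps^2 = 1 times its mass.
    return ac_part + point_mass

lam = np.linspace(-1.0, 1.0, 400001)   # spacing 5e-6 resolves eps down to 1e-3
uniform = np.full_like(lam, 0.5)       # hypothetical a.c. spectral density on [-1, 1]

for eps in (1e-1, 1e-2, 1e-3):
    numeric = smoothed_mass(eps, lam, uniform)
    exact = eps * np.arctan(1.0 / eps)  # closed form for the uniform density
    print(eps, numeric, exact)
```

Adding a point mass at $\lambda_0$ (the `point_mass` argument) makes the limit equal to that mass instead of zero, which is why the argument needs $\psi(s)$ to be orthogonal to the eigenvector.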
5. Appendix

Proof of Proposition 2.1, Sect. 2.

This proposition effectively follows by integrating the Liouville equation and applying the Cauchy-Schwarz inequality. It is a special case of the generalized time-energy uncertainty relations derived in [8]. Consider an orthogonal projection $P$ and selfadjoint operators $A$ and $B$ acting on a Hilbert space $\mathcal{H}$. Then it follows from a direct application of the Cauchy-Schwarz inequality that
$$|\mathrm{Tr}(P[A,B])|^2 \le 4\,\mathrm{Tr}(PA^2 - PAPA)\,\mathrm{Tr}(PB^2 - PBPB), \tag{94}$$
with equality when there exist $a,b\in\mathbb{R}\setminus\{0\}$ such that
$$[aA + ibB, P]P = 0. \tag{95}$$
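Inequality (94) is easy to test numerically in finite dimensions. The following sketch draws random Hermitian matrices and a random orthogonal projection and compares both sides. The dimension, rank, random seed and helper names are arbitrary illustrative choices, not part of the argument.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(dim):
    M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return 0.5 * (M + M.conj().T)

def random_projection(dim, rank):
    # Orthonormalize `rank` random complex vectors and project onto their span.
    Q, _ = np.linalg.qr(rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank)))
    return Q @ Q.conj().T

def both_sides(P, A, B):
    """Return (|Tr(P[A,B])|^2, 4 Tr(PA^2 - PAPA) Tr(PB^2 - PBPB))."""
    lhs = abs(np.trace(P @ (A @ B - B @ A))) ** 2
    var_A = np.trace(P @ A @ A - P @ A @ P @ A).real
    var_B = np.trace(P @ B @ B - P @ B @ P @ B).real
    return lhs, 4.0 * var_A * var_B

for _ in range(5):
    P = random_projection(6, 2)
    lhs, rhs = both_sides(P, random_hermitian(6), random_hermitian(6))
    print(lhs, rhs)
```

Each printed pair should have the first entry no larger than the second.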
We use inequality (94) to derive upper and lower bounds for $p_s^n$. Let
$$p_{s,s'}^n := \mathrm{Tr}(PU_\tau(s,s')P_1^n(0)U_\tau(s',s)). \tag{96}$$
Then
$$\begin{aligned}
|\partial_{s'}p_{s,s'}^n| &= |i\tau\,\mathrm{Tr}(PU_\tau(s,s')[H(s'), P_1^n(0)]U_\tau(s',s))| \\
&= |\tau\,\mathrm{Tr}(P_1^n(0)[U_\tau(s',s)PU_\tau(s,s'), H(s')])| \\
&\le 2\tau\,\mathrm{Tr}\big(U_\tau(s,s')P_1^n(0)U_\tau(s',s)P^2 - U_\tau(s,s')P_1^n(0)U_\tau(s',s)PU_\tau(s,s')P_1^n(0)U_\tau(s',s)P\big)^{1/2} \\
&\quad\times\mathrm{Tr}\big(P_1^n(0)H(s')^2 - P_1^n(0)H(s')P_1^n(0)H(s')\big)^{1/2} \\
&\le 2\tau\sqrt{p_{s,s'}^n - (p_{s,s'}^n)^2}\;f(P_1^n(0), H(s')),
\end{aligned}$$
where $f(P,A) := \sqrt{\mathrm{Tr}(PA^*(1-P)A)}$. It follows that
$$\Big|\arcsin\sqrt{p_{s,0}^n} - \arcsin\sqrt{p_{s,s}^n}\Big| = \Big|\int_0^s ds'\,\frac{\partial_{s'}p_{s,s'}^n}{2\sqrt{p_{s,s'}^n - (p_{s,s'}^n)^2}}\Big| \le 2\tau\int_0^s ds'\,f(P_1^n(0), H(s')),$$
and hence
$$p_s^n = p_{s,0}^n\ \substack{\le \\ \ge}\ \sin_*^2\Big(\arcsin\sqrt{\mathrm{Tr}(PP_1^n(0))} \pm 2\tau\int_0^s ds'\,f(P_1^n(0), H(s'))\Big). \tag{97}$$
We note that
$$p_s^n = \mathrm{Tr}(PU_\tau(s,0)P_1^n(0)U_\tau(0,s)) = \mathrm{Tr}(U_1(0,s)PU_1(s,0)W(s,0)P_1^n(0)W(0,s)). \tag{98}$$
Together with (97), and the identification
$$P\leftrightarrow U_1(0,s)PU_1(s,0), \quad H(s)\leftrightarrow\tilde H(s),$$
where $\tilde H(s)$ is the generator of the auxiliary propagator $W$, as defined in (19), we have
$$p_s^n\ \substack{\le \\ \ge}\ \sin_*^2\Big(\arcsin\sqrt{\mathrm{Tr}(PU_1(s,0)P_1^n(0)U_1(0,s))} \pm 2\tau\int_0^s ds'\,f(P_1^n(0), \tilde H(s'))\Big). \tag{99}$$

Proof that the eigenstates of $H_1$, Sect. 2, decay like a Gaussian in space. It follows from Assumption (A3) that w,θ (x, s) defined in (7) is uniformly bounded by c, for $s\in I$. Therefore, the spectrum of $H_1(s)$, for each fixed $s\in I$, can be computed by applying analytic perturbation theory (see, e.g., [35, 36]). Also using analytic perturbation theory, one can show that the eigenstates of $H_1(s)$, for each fixed $s$, decay like a Gaussian away from the origin (see [31, 32]). To prove the last claim, choose $E>0$. There exist finitely many sequences $l^{(1)},\dots,l^{(k_E)}$, such that $E^s_{l^{(j)}} < E$, $j = 1,\dots,k_E$,
(100)
where
E ), 0 A is a finite geometrical constant, 0 appears in Assumption (A2), and E ls( j) is given in (26). Let |l| := max li . Then |l ( j) | < E0 for j = 1, . . . , k E . Choose a contour γ E in the complex plane surrounding σ (H0 (s)) ∩ [0, E), such that k E ≤ A(
1 min(E s(k +1) − E ls(k E ) ) > 0. 2 s∈I l E For each fixed time s ∈ I , we define the spectral projection of H1 (s), 1 dz(z − H1 (s))−1 . PEθ, (s) := 2πi γ E d E := min dist[γ E , σ (H0 (s))] = s∈I
(101)
(102)
Let PE0 (s) be the orthogonal projection of H0 (s) onto the subspace H E(s) spanned by the eigenfunctions {φl(1) , . . . , φl(k E ) }, and choose such that c <
d E2 , 3(E + 0 )
(103)
where c is a finite constant appearing in Assumption (A3). It follows from analytic perturbation theory, with satisfying (103), that T r (PEθ, (s)) = T r (PE0 (s)) = k E ,
(104)
PEθ, (s) − PE0 (s) < 1.
(105)
and We have the following lemma.
Lemma A.1. Suppose Assumptions (A2) and (A3) hold. Choose satisfying (103), and fix s ∈ I. Furthermore, suppose that ψ s ∈ Ran PEθ, (s). Then there exist finite constants C > 1 and α > 0 (depending on ) such that, for sufficiently small α, eα|x| ψ s ≤ Cψ s . 2
Furthermore,
(106)
eα|x| U1 (s, s )ψ s ≤ Cψ s , for τ < ∞ and α small enough. 2
(107)
Proof. It follows from (105) that there exists φ s ∈ H E(s) , the subspace spanned by the eigenfunctions {φl(1) , . . . , φl(k E ) }, such that
and hence
ψ s := PEθ, φ s ,
(108)
ψ s ≤ Cφ s ,
(109)
for some finite constant C. Moreover, it follows from (102) and (108) that dz α|x|2 2 2 α|x|2 s e ψ = (z − H1 (s))−1 e−α|x| eα|x| φ s . e γ E(s) 2πi
(110)
For α small enough, we know from (27) and (28) that eα|x| φ s ≤ C φ s , 2
(111)
for some finite constant C . Moreover, for z ∈ γ E , it follows from analytic perturbation theory, [35], that eα|x| (z − H1 (s))−1 e−α|x| = (z − H 1 (s))−1 ) < ∞, 2
2
(112)
for α small enough (depending on ), where H 1 (s) := H1 (s) + 2αd − 4α 2 |x|2 + 4αx · ∇. The claim (106) follows from (110), (111) and (112). Now,
eα|x| U1 (s, s )ψ s = U 1 (s, s )eα|x| ψ s , 2
2
where U 1 = eα|x| U1 (s, s )e−α|x| is the propagator generated by H 1 (s). By applying analytic perturbation theory, it follows that 2
2
U 1 (s, s )eα|x| ψ s ≤ e M(α)τ eα|x| ψ s , 2
2
where $M(\alpha)$ is a positive constant such that $M(\alpha)\to 0$ as $\alpha\to 0$. Together with (106), this implies (107) for $\alpha$ small enough.

Proof of Proposition 3.1, Sect. 3.

Fix $\theta$, with $0<\mathrm{Im}\,\theta<\beta$. By Assumptions (B1) and (B3), there exists an open interval $I\subset N(s,\theta)\cap\mathbb{R}$, with $\lambda_0(s)\in I$. Choose $\xi\in C_0^\infty(I)$. Then
$$F(s,t) := \langle\psi_g^N(s), e^{-iH_g(s)t}\xi(H_g(s))\psi_g^N(s)\rangle = \lim_{\epsilon\to 0}\int_I\frac{dz}{2\pi i}\,e^{-izt}\xi(z)\langle\psi_g^N(s), (R_g(s,z-i\epsilon) - R_g(s,z+i\epsilon))\psi_g^N(s)\rangle. \tag{113}$$
Let
$$f(\theta,s,t) := \frac{1}{2\pi i}\int_I dz\,e^{-izt}\xi(z)\langle\psi_g^N(s,\bar\theta), R_g(s,\theta;z)\psi_g^N(s,\theta)\rangle, \tag{114}$$
where $\psi_g^N(s,\theta) := U(\theta)\psi_g^N(s)$. Then $F(s,t) = f(\bar\theta,s,t) - f(\theta,s,t)$. The resolvent in $N(s,\theta)$ can be decomposed into a singular and regular part,
$$R_g(s,\theta;z) = \frac{P_g(s,\theta)}{z-\lambda_g(s)} + R_g^{\mathrm{analytic}}(s,\theta;z), \tag{115}$$
where $R_g^{\mathrm{analytic}}(s,\theta;z)$ is analytic in $z$. Note that
$$R_g^{\mathrm{analytic}}(s,\theta;z)P_g(s,\theta) = P_g(s,\theta)R_g^{\mathrm{analytic}}(s,\theta;z) = 0. \tag{116}$$
Using (116), the contribution of the regular part to $f(\theta,s,t)$ defined in (114) is
$$\frac{1}{2\pi i}\int_I dz\,e^{-izt}\xi(z)\langle u_g^N(s,\bar\theta), R_g^{\mathrm{analytic}}(s,\theta;z)u_g^N(s,\theta)\rangle,$$
where
$$u_g^N(s,\theta) := \frac{1}{\|P_g^N(s)\psi_0(s)\|}\,[P_g^N(s,\theta) - P_g(s,\theta)]\psi_0(s,\theta)$$
is of order $g^N$. Since $\xi\in C_0^\infty(I)$, the last integral is bounded by $C_mt^{-m}$ for any $m\ge 0$, and hence the contribution of the regular part is bounded by $g^{2N}C_mt^{-m}$. The contribution of the singular part of the resolvent to $F(s,t)$ is
$$\overline{a_g^N(s)}\,\frac{1}{2\pi i}\int_I dz\,e^{-izt}\xi(z)(z-\overline{\lambda_g(s)})^{-1} - a_g^N(s)\,\frac{1}{2\pi i}\int_I dz\,e^{-izt}\xi(z)(z-\lambda_g(s))^{-1}. \tag{117}$$
Using the fact that $\xi = 1$ in some open interval $I_0\ni\lambda_0$, one may deform the path $I$ into two contours, $C_0$ and $C_1$, in the lower complex half-plane, as shown in Fig. 2. The term in (117) corresponding to the path $C_0$ picks the residue $a_g^N(s)e^{-i\lambda_g(s)t}$. It follows from the identity
$$P_g^N(s,\theta)P_g(s,\theta)P_g^N(s,\theta) = (P_g^N(s,\theta))^2 + [P_g^N(s,\theta) - P_g(s,\theta)][P_g(s,\theta) - 1][P_g^N(s,\theta) - P_g(s,\theta)],$$
and from the fact that $P_g^N(s,\theta) - P_g(s,\theta) = O(g^N)$, that
$$a_g^N(s) = 1 + O(g^{2N}). \tag{118}$$

Fig. 2.
Using (118), one may write the remainder term in (117) due to the path C1 as dz −i zt e ξ(z)(z − λg (s))−1 (z − λg (s))−1 I mλg (s) C1 πi + O(g 2N ) dze−i zt (z − λg (s))−1 + C1 dze−i zt (z − λg (s))−1 , + O(g 2N ) C1
which is of order O(g 2N ). References 1. Dalfovo, F., Giorgini, S., Pitaevskii, L.P.: Theory of Bose-Einstein condensation in trapped gases. Rev. Mod. Phys. 71, 463–512 (1999) 2. Simon, B.: Resonances in N-body quantum systems with dilation analytic potentials and the foundations of time-dependent perturbation theory. Ann. Math. 97, 247 (1973) 3. Simon, B.: Resonances and complex scaling: a rigorous overview. Int. J. Quant. Chem. 14, 529 (1978) 4. Hunziker, W.: Resonances, metastable states and exponential decay laws in perturbation theory. Commun. Math. Phys. 132, 177 (1990) 5. Orth, A.: Quantum mechanical resonances and limiting absorption: the many body problem. Commun. Math. Phys. 126, 559 (1990) 6. Herbst, I.: Exponential decay in the Stark effect. Commun. Math. Phys. 87, 429 (1982/3) 7. Cattaneo, L., Graf, G.M., Hunziker, W.: A general resonance theory based on Mourre’s inequality. Annales Henri Poincare 7, 583–601 (2006) 8. Pfeifer, P., Fröhlich, J.: Generalized time-energy uncertainty relations and bounds on lifetimes of resonances. Rev. Mod. Phys. 67, 759 (1995) 9. Soffer, A., Weinstein, M.I.: Time-dependent resonance theory. Geom. Funct. Anal. 8, 1086 (1998) 10. Merkli, M., Sigal, I.M.: A time-dependent theory of quantum resonances. Commun. Math. Phys. 201, 549 (1999) 11. Jensen, A., Nenciu, G.: On the Fermi Golden Rule: Degenerate Eigenvalues. http://mpej.unige.ch/ mp-arc/html/html/c/06/06-157.pdf,2006 12. Davies, E.B.: An adiabatic theorem applicable to the Stark effect. Commun. Math. Phys. 89, 329–339 (1983) 13. Chakrabarti, B.K., Ascharyya, M.: Dynamical Transitions and Hysteresis. Rev. Mod. Phys. 71, 847–859 (1999) 14. Cohen-Tannoudji, C.: Manipulating atoms with photons. Rev. Mod. Phys. 70, 707–719 (1998) 15. Phillips, W.D.: Laser cooling and trapping of neutral atoms. Rev. Mod. Phys. 70, 721–741 (1998) 16. Kato, T.: On the adiabatic theorem of quantum mechanics. Phys. Soc. Jap. 5, 435–439 (1958) 17. 
Balslev, E., Combes, J.M.: Spectral properties of Schrödinger operators with dilation analytic interactions. Commun. Math. Phys. 22, 280 (1971) 18. Abou Salem, W.: On the quasi-static evolution of nonequilibrium steady states. http://arxiv.org/list/mathph/060104, 2006. To appear in Annales Henri Poincare (2007) 19. Hagedorn, G., Joye, A.: Elementary exponential error estimates for the adiabatic approximation. J. Math. Anal. Appl. 267, 235–246 (2002) 20. Joye, A.: General adiabatic evolution with a gap condition. http://arxiv.org/list/math-ph/0608059, 2006 21. Nenciu, G.: Linear adiabatic theory, exponential estimates. Commun. Math. Phys. 152, 479–496 (1993) 22. Bach, V., Fröhlich, J., Sigal, I.M.: Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field. Commun. Math. Phys. 207, 249–290 (1999) 23. Bach, V., Fröhlich, J., Sigal, I.M.: Mathematical Theory of nonrelativistic matter and radiation. Lett. Math. Phys. 34, 183–201 (1995) 24. Bach, V., Fröhlich, J., Sigal, I.M.: Quantum electrodynamics of confined nonrelativistic particles. Adv. in Math. 137, 299–395 (1998) 25. Teufel, S.: A note on the adiabatic theorem without a gap condition. Lett. Math. Phys. 58, 261–266 (2002) 26. Teufel, S.: Adiabatic perturbation theory in quantum dynamics. Lecture Notes in Mathematics 1821, Heidelberg, New York: Springer-Verlag (2003) 27. Avron, J.E., Elgart, A.: Adiabatic theorem without a gap condition, Commun. Math. Phys. 203, 445– 463 (1999)
28. Avron, J.E., Elgart, A.: An adiabatic theorem without a spectral gap. In: Mathematical results in quantum mechanics (Prague 1998), Oper. Theory Adv. Appl. 108, Basel: Birkhäuser, (1999), pp. 3–12 29. Abou Salem, W., Fröhlich, J.: Adiabatic theorems and reversible isothermal processes. Lett. Math. Phys. 72, 153–163 (2005) 30. Abou Salem, W., Fröhlich, J.: In preparation 31. Combes, J.M., Thomas, L.: Asymptotic behaviour of eigenfunctions for multiparticle Schrödinger operators. Commun. Math. Phys. 34, 251 (1973) 32. Hunziker, W.: Notes on asymptotic perturbation theory for Schrödinger eigenvalue problems. Helv. Phys. Acta. 61, 257–304 (1988) 33. Herbst, I., Simon, B.: Dilation analyticity in constant electric field II. Commun. Math. Phys. 80, 181– 216 (1981) 34. Aguilar, J., Combes, J.M.: A class of analytic perturbations for one-body Schrödinger Hamiltonians. Commun. Math. Phys. 22, 269–279 (1971) 35. Kato, T.: Perturbation theory for linear operators, Berlin: Springer (1980) 36. Reed, M., Simon, B.: Methods of Modern Mathematical Physics, Vol. I (Functional Analysis), Vol. II (Fourier Analysis, Self-Adjointness). New York: Academic Press (1975) Communicated by B. Simon
Commun. Math. Phys. 273, 677–704 (2007)
Digital Object Identifier (DOI) 10.1007/s00220-007-0229-z
Communications in Mathematical Physics

Spectral Analysis and Zeta Determinant on the Deformed Spheres

M. Spreafico1, S. Zerbini2
1 ICMC-Universidade de São Paulo, São Carlos, Brazil. E-mail: [email protected]
2 Dipartimento di Fisica, Università di Trento, Gruppo Collegato di Trento, Sezione INFN di Padova, Padova, Italy. E-mail: [email protected]

Received: 25 July 2006 / Accepted: 17 October 2006
Published online: 13 March 2007 – © Springer-Verlag 2007
Abstract: We consider a class of singular Riemannian manifolds, the deformed spheres $S^N_k$, defined as the classical spheres with a one-parameter family $g[k]$ of singular Riemannian structures that reduces for $k = 1$ to the classical metric. After giving explicit formulas for the eigenvalues and eigenfunctions of the metric Laplacian $\Delta_{S^N_k}$, we study the associated zeta functions $\zeta(s, \Delta_{S^N_k})$. We introduce a general method to deal with some classes of simple and double abstract zeta functions, generalizing the ones appearing in $\zeta(s, \Delta_{S^N_k})$. An application of this method allows us to obtain the main zeta invariants for these zeta functions in all dimensions, and in particular $\zeta(0, \Delta_{S^N_k})$ and $\zeta'(0, \Delta_{S^N_k})$. We give explicit formulas for the zeta regularized determinant in the low dimensional cases, $N = 2, 3$, thus generalizing a result of Dowker [25], and we compute the first coefficients in the expansion of these determinants in powers of the deformation parameter $k$.

Partially supported by FAPESP: 2005/04363-4.

1. Introduction

In the last decades there has been a continuously increasing interest in the problem of obtaining explicit information on the zeta regularized determinant of differential operators [2, 49, 37, 50, 43, 61]. Despite the lack of a general method, many results are available in the literature for various particular cases or by means of some kind of approximation. Moreover, quite complete results have been obtained for the geometric case of the metric Laplacian on a compact Riemannian manifold for some classes of simple spaces: spheres [18, 11], projective spaces [54], balls [5], orbifolded spheres [25], compact (and non-compact) hyperbolic manifolds [20, 9, 10], or in particular cases: Sturm operators on a line segment [8, 45], cone on a circle [56]. In particular, many works in the recent physical literature applied this zeta function regularization process to study the modifications induced at quantum level by some
kind of deformation of the background space geometry of physical models [40, 24, 22, 52]. In this context, a full class of deformed spaces, called deformed spheres, has been introduced in [52], where the perturbation of the heat kernel expansion has been studied. This is a particularly interesting class of spaces in Einstein's theory of gravitation and in cosmology, since the appearance of a non-trivial deformation produces a symmetry breaking of the space. In fact, the deformed sphere may be considered as the Euclidean version of a deformed de Sitter space, which is particularly relevant in modern cosmology, since it represents the inflationary as well as the recent accelerated phase. It is well known that the quantum effective action is related to the regularized functional determinant of Laplace type operators (see, for example, [30] and references therein). As a consequence, an expansion of the functional determinant with respect to the deformation parameter around its spherically symmetric value describes the effects of such geometric symmetry breaking. It is therefore natural to ask whether the explicit calculation of the zeta determinant for the Laplace operator on this class of spaces is possible. In this work we give a positive answer to this question, establishing a general method that permits the computation of the zeta regularized determinant on a deformed sphere of any dimension. Actually, for a particular discrete set of values of the deformation parameter $k$, the $N$-dimensional deformed sphere turns out to be isometric to the so-called orbifolded sphere, the quotient space $S^N/\Gamma$ of the standard $N$-sphere by a finite subgroup $\Gamma$ of the rotation group $O_N(\mathbb{R})$. Determinants on these spaces have been studied by J.S. Dowker in a series of works [25–27], where results are also obtained for different couplings.
From this point of view, the present work is a generalization of the results of Dowker to the continuous range of variation of the deformation parameter $k$, and in fact the results are consistent (see Sect. 4). The main motivation of the present work, besides the particular result, is that the method introduced has the advantage of being completely general and not tied to this specific problem. In particular, we show how it can be applied to obtain the main zeta invariants of some classes of abstract simple and double zeta functions (Sects. 4.2 and 4.3). In order to give the explicit form of the zeta function on the deformed spheres, we produce an explicit description of the spectrum and of the eigenfunctions of the associated Laplace operator in any dimension (Lemma 3.2). In particular, the 2 dimensional case turns out to be very interesting from the point of view of geometry: in fact the 2 dimensional deformed sphere is a space with singularities of conical type. This class of singular spaces was introduced and studied by Cheeger [17], and although it has since become a subject of deep interest and investigation, there are in fact relatively few occasions where explicit results can be obtained.

2. The Geometry of the Deformed Spheres

In this section we provide the definition of the $N$ dimensional deformed sphere $S^N_k$, where $k$ is the deformation parameter, and we study its geometry. This produces a particularly interesting relation with elliptic functions and conical singularities, at least in the 2 dimensional case. The deformed $N$-sphere is defined as the standard $N$-sphere with a singular Riemannian structure. When $N = 2$, we have an isometry with the surface immersed in $\mathbb{R}^3$ that can be obtained by rotating around an axis a curve described by an elliptic integral function. The surface obtained presents two singular points of conical type, as considered by Brüning and Seeley in [7] generalizing the definition of metric cone of Cheeger [17].
Thus, the 2 dimensional deformed sphere is a space with singularities of
conical type, and due to the great interest in this kind of singular space, both from the point of view of differential geometry and zeta function analysis (see for example [23, 34, 19, 6, 68, 21]), its study is of particular interest (compare also with [56]).

Consider the immersion of the $N+1$ dimensional sphere $S^{N+1}$ in $\mathbb{R}^{N+2}$,
\[
\begin{cases}
x_0 = \sin\theta_0 \sin\theta_1 \cdots \sin\theta_N, \\
x_1 = \sin\theta_0 \sin\theta_1 \cdots \cos\theta_N, \\
\quad\vdots \\
x_{N+1} = \cos\theta_0,
\end{cases}
\]
and the induced metric (in local coordinates)
\[
g_{S^{N+1}} = (d\theta_0)^2 + \sin^2\theta_0\, g_{S^N}.
\]
We deform this metric as follows. Let $k$ be a real parameter with $0 < k \le 1$, and consider the family
\[
g_{S^{N+1}}[k] = (d\theta_0)^2 + \sin^2\theta_0\, g_{S^N}[k], \qquad g_{S^1}[k] = k^2 (d\theta_0)^2.
\]
This is a one-parameter family of singular Riemannian metrics on $S^{N+1}$. We call the singular Riemannian manifolds $(S^{N+1}, g_{S^{N+1}}[k])$ the deformed spheres of dimension $N+1$ and we use the notation $S^{N+1}_k$. By direct inspection, we see that the locus of the singular points of the metric in dimension $N+1$ is a sub-manifold isomorphic to two disjoint copies of $S^{N-1}$. In particular, in the 2 dimensional case we have
\[
g_{S^2}[k] = (d\theta_0)^2 + k^2 \sin^2\theta_0\, (d\theta_1)^2,
\]
which shows that the deformed 2-sphere $S^2_k$ is a space with singularities of conical type as defined in [7]. Proceeding as in [7] Sect. 7, we will show in the next subsection that the singularity is generated by rotation of a curve in the plane. Observe that, in a different language, $S^2_k$ is a periodic lune, that is to say it can be pictured by taking a segment of the standard 2-sphere (a lune) and identifying the sides. This situation generalizes to higher dimensions [1], and when the angle of the lune is $\frac{\pi}{n}$, $n \in \mathbb{Z}$, we obtain a spherical orbifold $S^N/\Gamma$, as pointed out in the introduction.
Note also that, by direct verification on the local description of the metric $g_{S^N}[k]$, the non-compact Riemannian manifold obtained by removing the singular subspace of the metric from $S^N_k$ is a space of constant curvature and locally symmetric. It is not symmetric, as is clear from the geometry of the low dimensional cases, or by observing that it is not simply connected (see Corollary 8.3.13 of [65]). On the other hand, the classical sphere $S^N_1$ is a symmetric space; for example, the 2 dimensional one has the maximum number $\frac{N(N+1)}{2}$ of global isometries, namely the 3 spatial rotations. Therefore, the variation of the parameter $k$ away from the trivial value produces a breaking of the global symmetric type of the space. For example, on the 2-sphere it breaks two continuous rotations into one discrete symmetry, namely the reflection through the horizontal plane. We conclude this subsection with the explicit expression for the Laplace operator. With $a = \frac{1}{k}$, the (negative) of the induced Laplace operator on the deformed sphere $S^{N+1}_k$ is
\[
\Delta_{S^{N+1}_k} = -d^2_{\theta_0} - N \frac{\cos\theta_0}{\sin\theta_0}\, d_{\theta_0} + \frac{1}{\sin^2\theta_0}\, \Delta_{S^N_k}.
\]
2.1. Elliptic integrals and the deformed 2-sphere. The geometry of the 2 dimensional case is particularly interesting and this subsection is dedicated to its study. The ellipse $x^2 + \frac{y^2}{b^2} = 1$ can be given parametrically in the first quadrant by the formula
\[
x = t, \qquad y = b\sqrt{1 - t^2},
\]
where $0 \le t \le 1$. If we assume $b \le 1$, the arc length is
\[
l(t) = \int_0^t \sqrt{\frac{1 - k^2 s^2}{1 - s^2}}\, ds,
\]
where $k = \sqrt{1 - b^2}$. With the new variables $t = \sin\theta$, $s = \sin\psi$, we obtain
\[
x = \sin\theta, \qquad y = b\cos\theta,
\]
with $0 \le \theta \le \frac{\pi}{2}$, and the arc length is
\[
E(\theta, k) = l(\sin\theta) = \int_0^\theta \sqrt{1 - k^2 \sin^2\psi}\, d\psi,
\]
that is the elliptic integral of the second kind in Legendre normal form [35] 8.110.2 (see [48] or [64] for elliptic functions and integrals). Note that we cannot find a parameterization of the curve by the arc length reversing the above equation using Jacobi elliptic functions. Consider now the curve $f(\sin\theta) = E(\theta, k)$. This is a smooth curve in the interval $0 \le t \le 1$, with $f(0) = 0$ and $f(1) = E(\frac{\pi}{2}, k)$. We can rotate this function around the horizontal axis getting a surface with a geometric singularity at the origin. For further use, it is more convenient to place the surface in the upper half space. Thus, we consider the function
\[
f(t) = E\!\left(\arccos\frac{t}{k}, k\right) = \int_0^{\sqrt{1 - \frac{t^2}{k^2}}} \sqrt{\frac{1 - k^2 s^2}{1 - s^2}}\, ds,
\]
with $0 \le t \le k$, and the curve: $x = t$, $y = f(t)$. We reparametrize this curve by its arc length
\[
\theta = l(t) = \int_0^t \sqrt{1 + (y'(s))^2}\, ds = \int_0^t \frac{1}{\sqrt{k^2 - s^2}}\, ds = \arcsin\frac{t}{k},
\]
obtaining
\[
x = k\sin\theta, \qquad y = f(k\sin\theta) = E\!\left(\tfrac{\pi}{2} - \theta, k\right) = \int_0^{\frac{\pi}{2} - \theta} \sqrt{1 - k^2\sin^2\psi}\, d\psi,
\]
with $0 \le \theta \le \frac{\pi}{2}$
(as before θ is the angle from the vertical axis).
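The arc-length claim can be checked numerically. The following Python sketch is an illustration only, not part of the original paper: it evaluates $E(\theta, k)$ by composite Simpson quadrature (the helper names `elliptic_e`, `profile` and `speed` are hypothetical) and verifies by central differences that the curve $\theta \mapsto (k\sin\theta, E(\frac{\pi}{2} - \theta, k))$ has unit speed.

```python
import math


def elliptic_e(theta, k, steps=2000):
    """Incomplete elliptic integral of the second kind, E(theta, k),
    approximated with composite Simpson quadrature (steps must be even)."""
    if theta == 0.0:
        return 0.0

    def integrand(psi):
        # max() only guards against tiny negative values from rounding
        return math.sqrt(max(0.0, 1.0 - (k * math.sin(psi)) ** 2))

    h = theta / steps
    total = integrand(0.0) + integrand(theta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def profile(theta, k):
    """Point (x, y) of the profile curve at parameter theta."""
    return k * math.sin(theta), elliptic_e(math.pi / 2 - theta, k)


def speed(theta, k, eps=1e-5):
    """Central-difference estimate of the speed |(x'(theta), y'(theta))|."""
    x1, y1 = profile(theta - eps, k)
    x2, y2 = profile(theta + eps, k)
    return math.hypot(x2 - x1, y2 - y1) / (2 * eps)


if __name__ == "__main__":
    for k in (0.3, 0.8, 1.0):
        for theta in (0.2, 0.7, 1.2):
            print(k, theta, speed(theta, k))
```

Analytically, $x'(\theta)^2 + y'(\theta)^2 = k^2\cos^2\theta + (1 - k^2\cos^2\theta) = 1$, which is what the finite-difference estimate should reproduce up to discretization error.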
Let us now consider the surface $Y_k^+$ obtained by rotating the above curve around the vertical axis. We have the parameterization
\[
Y_k^+ : \begin{cases}
x = k\sin\theta\cos\phi, \\
y = k\sin\theta\sin\phi, \\
z = E\!\left(\tfrac{\pi}{2} - \theta, k\right),
\end{cases}
\]
where $0 \le \phi \le 2\pi$. This is clearly a smooth surface except at the possible singular point $(0, 0, E(\pi/2, k))$, with the circle $C_k : x^2 + y^2 = k^2$, $z = 0$ of radius $k$ as boundary. Moreover, since the coordinate line tangent vectors on the boundary are $v_\phi = -\sin\phi\, e_x + \cos\phi\, e_y$ and $v_\theta = e_z$, the tangent space is vertical and hence we can glue smoothly $Y_k^+$ with the surface $Y_k^-$ obtained by reflecting through the horizontal plane. We call the surface obtained $Y_k^2 = Y_k^+ \cup Y_k^-$, and the parameter $k$ the deformation parameter. The surface $X_k$ obtained from $Y_k^2$ by removing the poles $(0, 0, \pm E(\pi/2, k))$ is clearly a smooth (non-compact) surface. The Riemannian metric induced on $X_k$ from the immersion in $\mathbb{R}^3$ is
\[
g_{Y_k^2}(\theta, \phi) = (d\theta)^2 + k^2\sin^2\theta\, (d\phi)^2.
\]
It is clear that the local map $f : (\theta, \phi) \mapsto (\theta, \phi)$ extends to a diffeomorphism $f : Y_k^2 \to S^2$, and since $f^* g_{S^2}[k] = g_{Y_k^2}$, it follows that $f$ is an isometry between $S_k^2 = (S^2, g_{S^2}[k])$ and $(Y_k^2, g_{Y_k^2})$.
3. Spectral Analysis

In this section we give the eigenvalues and eigenfunctions of the Laplace operator on the deformed sphere. As observed in Sect. 2, the two dimensional case is of particular interest, since it represents an instance of a space with singularities of conical type that can be solved explicitly. Therefore we spend a few words to describe the concrete operator appearing in that case, using the language of spectral analysis for spaces with conical singularities [17, 7]. With $a = \frac{1}{k}$, the (negative) of the induced Laplace operator on the deformed sphere $S_k^2$ is
\[
\Delta_{S^2_{1/a}} = -\partial_\theta^2 - \frac{\cos\theta}{\sin\theta}\,\partial_\theta - \frac{a^2}{\sin^2\theta}\,\partial_\phi^2,
\]
on $L^2(S^2_{1/a})$. With the Liouville transform $u = Ev$, with $E(\theta) = \frac{1}{\sqrt{\sin\theta}}$, we obtain the operator
\[
L_a = -\partial_\theta^2 - \frac{1}{4}\left(1 + \frac{1}{\sin^2\theta}\right) - \frac{a^2}{\sin^2\theta}\,\partial_\phi^2.
\]
This is a regular singular operator as defined in [7],
\[
L_a = -d_\theta^2 + \frac{1}{\theta^2} A(\theta),
\]
where
\[
A(\theta) = \frac{\theta^2}{\sin^2\theta}\left(-a^2\partial_\phi^2 - \frac{1}{4}\right) - \frac{1}{4}\theta^2,
\]
is a family of operators on the section of the cone, that is the circle of radius 1. It is clear that the operator $-\partial_\phi^2$ has the complete system $\{\mu_m = m^2, e^{im\phi}\}$, with $m \in \mathbb{Z}$, where all the eigenvalues are double up to the null one that is simple with the unique eigenfunction given by the constant map. Since the problem decomposes spectrally on this system, we reduce to study the family of singular Sturm operators
\[
T_{am} = -d_\theta^2 + \frac{a^2 m^2 - \frac{1}{4}}{\sin^2\theta} - \frac{1}{4}.
\]
In order to define an appropriate self adjoint extension, we introduce the following boundary conditions at the singular points:
\[
BC_0 : \quad \lim_{\theta \to 0} \theta^{am}\left[\left(am + \frac{1}{2}\right)\frac{1}{\sqrt{\theta}}\, f(\theta) - \sqrt{\theta}\, f'(\theta)\right] = 0,
\]
and
\[
BC_\pi : \quad \lim_{\theta \to \pi} (\theta - \pi)^{am}\left[\left(am + \frac{1}{2}\right)\frac{1}{\sqrt{\theta - \pi}}\, f(\theta - \pi) - \sqrt{\theta - \pi}\, f'(\theta - \pi)\right] = 0.
\]
These are the natural generalizations of the classical Dirichlet boundary conditions (compare with [62] 8.4) and were first considered in [7]. In particular, it was proved in [7], Sect. 7, that the self adjoint extension defined by these conditions is the Friedrichs extension. The eigenvalue equation associated with the operators $T_{am}$ can be more easily studied going back to the original Hilbert space. This equation was in fact already studied by Gromes [36], who found a complete solution. Generalizing the standard approach used for the standard sphere (see for example [39]), we can prove that this solution provides a complete set of eigenvalues and eigenfunctions for the metric Laplacian, as stated in the following lemma.

Lemma 3.1. The operator $\Delta_{S^2_{1/a}}$ has the complete system:
\[
\lambda_{n,m} = (a|m| + n)(a|m| + n + 1), \qquad n \in \mathbb{N},\ m \in \mathbb{Z},
\]
where all the eigenvalues with $m \neq 0$ are double with eigenfunctions (where the $P^\mu_\nu$ are the associated Legendre functions)
\[
e^{iam\phi} P^{-am}_{am+n}, \qquad e^{-iam\phi} P^{-am}_{am+n},
\]
while the eigenvalues $n(n+1)$ are simple with eigenfunctions the functions $P_n$.

Next, we pass to the higher dimensions. The (negative) of the induced Laplace operator on the deformed sphere $S_k^{N+1}$ is
\[
\Delta_{S_k^{N+1}} = -d^2_{\theta_0} - N\frac{\cos\theta_0}{\sin\theta_0}\, d_{\theta_0} + \frac{1}{\sin^2\theta_0}\,\Delta_{S_k^N}
\]
on $L^2(S_k^{N+1})$. Projecting on the spectrum of $\Delta_{S_k^N}$, we obtain the differential equation
\[
\left(-d^2_{\theta_0} - N\frac{\cos\theta_0}{\sin\theta_0}\, d_{\theta_0} + \frac{\lambda_{S_k^N}}{\sin^2\theta_0}\right) u = \lambda_{S_k^{N+1}}\, u.
\]
Following [52], we make the substitutions
\[
u(\theta_0) = \sin^b\theta_0\, v(\theta_0), \qquad z = \frac{1}{2}(\cos\theta_0 + 1),
\]
where $b = \frac{1}{2}\left(1 - N + \sqrt{(N-1)^2 + 4\lambda_{S_k^N}}\right)$. This gives the hypergeometric equation [35] 9.151,
\[
z(1 - z)v'' + [\gamma - (\alpha + \beta + 1)z]v' - \alpha\beta v = 0,
\]
with
\[
\alpha = \frac{1}{2}\left(2b + N \mp \sqrt{N^2 + 4\lambda_{S_k^{N+1}}}\right), \qquad
\beta = \frac{1}{2}\left(2b + N \pm \sqrt{N^2 + 4\lambda_{S_k^{N+1}}}\right), \qquad
\gamma = \frac{1}{2}(2b + N + 1).
\]
Boundary conditions give the equation
\[
2n + 2b + N = \pm\sqrt{N^2 + 4\lambda_{S_k^{N+1}}},
\]
where $n \in \mathbb{N}$, that, in turn, gives the recurrence relation
\[
\lambda_{S_k^{N+1}} = n^2 + \left(1 + \sqrt{(1-N)^2 + 4\lambda_{S_k^N}}\right) n + \frac{1}{2}\left(1 - N + \sqrt{(1-N)^2 + 4\lambda_{S_k^N}} + 2\lambda_{S_k^N}\right).
\]
We can prove that this recurrence relation is satisfied by the numbers
\[
\lambda_{S_k^N} = \left(am + n_1 + \cdots + n_{N-1} + \frac{N-1}{2}\right)^2 - \frac{(N-1)^2}{4},
\]
where each $n_i \in \mathbb{N}$ must be a positive integer. We have obtained
\[
b = am + n_1 + \cdots + n_{N-1}, \qquad \alpha = -n_N, \qquad \beta = 2(am + n_1 + \cdots + n_{N-1}) + n_N + N,
\]
\[
\gamma = am + n_1 + \cdots + n_{N-1} + \frac{N+1}{2},
\]
and the family of solutions for the eigenvalue equation (up to a constant)
\[
u_{n_N}(\cos\theta_0) = \sin^{\frac{1-N}{2}}\theta_0\; P^{-am - n_1 - \cdots - n_{N-1} - \frac{N-1}{2}}_{am + n_1 + \cdots + n_{N-1} + \frac{N-1}{2} + n_N}(\cos\theta_0).
\]
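The recurrence relation and its claimed closed-form solution can be cross-checked numerically. The Python sketch below is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, the base case is taken to be the circle eigenvalue $\lambda_{S^1_k} = (am)^2$, and the closed form is written as $A(A + N - 1)$ with $A = a|m| + n_1 + \cdots + n_{N-1}$, which is algebraically the same as the displayed expression.

```python
import math


def recurrence_step(lam_prev, N, n):
    """Eigenvalue on S_k^{N+1} produced by the recurrence from an
    eigenvalue lam_prev on S_k^N and an integer n >= 0."""
    root = math.sqrt((1 - N) ** 2 + 4 * lam_prev)
    return n * n + (1 + root) * n + 0.5 * (1 - N + root + 2 * lam_prev)


def closed_form(a, m, ns):
    """Closed form on the deformed sphere of dimension len(ns) + 1:
    A * (A + len(ns)) with A = a|m| + n_1 + ... + n_len(ns)."""
    A = a * abs(m) + sum(ns)
    return A * (A + len(ns))


def check(a, m, ns):
    """Apply the recurrence dimension by dimension and compare."""
    lam = closed_form(a, m, ())          # circle: (a m)^2
    for i, n in enumerate(ns):
        lam = recurrence_step(lam, i + 1, n)
        if not math.isclose(lam, closed_form(a, m, ns[: i + 1]), rel_tol=1e-12):
            return False
    return True


if __name__ == "__main__":
    print(check(1.7, 2, (3, 0, 2, 5)))
    print(check(1.0, 0, (4, 1)))
```

The agreement reflects the identity $b^2 + (N-1)b = \lambda_{S_k^N}$, so that $\lambda_{S_k^{N+1}} = (n + b)(n + b + N)$.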
Using standard arguments, we can then prove the following result.
Lemma 3.2. The operator $\Delta_{S^{N+1}_{1/a}}$ has the complete system:
\[
\lambda_{m,n_1,\dots,n_N} = (a|m| + n_1 + \cdots + n_N)(a|m| + n_1 + \cdots + n_N + N), \qquad n_i \in \mathbb{N},\ m \in \mathbb{Z},
\]
where all the eigenvalues with $m \neq 0$ are double with eigenfunctions (up to normalization)
\[
e^{iam\theta_N} \prod_{j=0}^{N-1} \sin^{\frac{1-N+j}{2}}(\theta_j)\; P^{-am - n_1 - \cdots - n_{N-1-j} - \frac{N-1-j}{2}}_{am + n_1 + \cdots + n_{N-j} + \frac{N-1-j}{2}}(\cos\theta_j),
\]
\[
e^{-iam\theta_N} \prod_{j=0}^{N-1} \sin^{\frac{1-N+j}{2}}(\theta_j)\; P^{-am - n_1 - \cdots - n_{N-1-j} - \frac{N-1-j}{2}}_{am + n_1 + \cdots + n_{N-j} + \frac{N-1-j}{2}}(\cos\theta_j),
\]
while the eigenvalues with $m = 0$ are simple with eigenfunctions
\[
\prod_{j=0}^{N-1} \sin^{\frac{1-N+j}{2}}(\theta_j)\; P^{-n_1 - \cdots - n_{N-1-j} - \frac{N-1-j}{2}}_{n_1 + \cdots + n_{N-j} + \frac{N-1-j}{2}}(\cos\theta_j).
\]
4. Zeta Regularized Determinants

In this section we study the zeta function associated to the Laplace operator on the deformed sphere $S_k^{N+1}$. For this, we introduce two quite general classes of zeta functions and we compute their main zeta invariants. This allows us to define a general technique to obtain the zeta regularized determinant of the Laplace operator on $S_k^{N+1}$ as a function of the deformation parameter. We apply this technique to the lower cases, $N = 1$ and $2$, giving explicit formulas. Our last result is the computation of the coefficients in the expansions of the zeta determinants in powers of the deformation parameter. By Lemma 3.2, the zeta function on $S_k^{N+1}$ is the function defined by the series
\[
\zeta(s, \Delta_{S^{N+1}_{1/a}}) = \sum_{n \in \mathbb{N}_0^N} [n(n + N)]^{-s} + 2\sum_{m=1}^{\infty} \sum_{n \in \mathbb{N}^N} [(am + n)(am + n + N)]^{-s},
\]
when $\mathrm{Re}(s) > N + 1$, and by analytic continuation elsewhere. Here $n$ is a positive integer vector $n = (n_1, \dots, n_N)$, and the notation $\mathbb{N}_0^N$ means $\mathbb{N} \times \cdots \times \mathbb{N} - \{0, \dots, 0\}$. Multidimensional gamma and zeta functions, namely zeta functions where the general term is of the form $(n^T a n + b^T n + c)^{-s}$, where $a$ is a real symmetric matrix of rank $k \ge 1$, $b$ a vector in $\mathbb{R}^k$, $c$ a real number and $n$ an integer vector in $\mathbb{Z}^k$, were originally introduced by Barnes [3, 4] and Epstein [32, 33] as natural generalizations of the Euler gamma function. Whenever the sum is on the integers (i.e. $n \in \mathbb{Z}^k$), there is a large symmetry that allows one to express the zeta function by a theta series. Multidimensional theta series have been deeply studied in the literature, and by a generalization of the Poisson summation formula (see for example [16] XI.2, 3) it is possible to compute the main zeta invariants for multiple series of this type (see [63, 47, 30, 31, 15] and references therein). The main problem in the present case is that the zeta functions are associated to series of Dirichlet type, namely the sums are over $\mathbb{N}_0^k$. We then lose
many symmetries and in particular a formula of Poisson type. Consequently, it is more difficult to find general results, and different techniques have been introduced to deal with the specific cases (see for example [12, 13, 18, 29, 54, 55] for simple series or series that can be reduced to simple series, or [15, 46] for multiple linear series). Note in particular that the case of a double ($k = 2$) homogeneous quadratic series of Dirichlet type is much harder. The zeta functions of this type (with integer coefficients) appear when dealing with the zeta functions of a narrow ideal class for a real quadratic field, as shown by Zagier in [66] and [67], where he also computes the values at non-positive integers (see also [51, 28, 14, 15], and in particular [58] for the derivative). Nevertheless, we can overcome this difficulty in the case under study, first by reducing the multi-dimensional zeta functions to a sum of 2 dimensional linear and quadratic zeta functions, and then by studying the quadratic one by means of a general method introduced in [59] in order to deal with non-homogeneous zeta functions. Note that, for particular values of the deformation parameter, the zeta function can be reduced to a sum of zeta functions of Barnes type [3], and this allows a direct computation of the main zeta invariants [25, 26]. This approach does not work for generic values of the deformation parameter, and therefore the more sophisticated technique introduced here is necessary. In the next subsection we present generalizations of some results of [59] that are needed to treat the present case, and in the following subsections we give some applications to general classes of abstract simple and double zeta functions. As explained hereafter, by means of these two classes of zeta functions, we can in principle calculate the zeta invariants for the deformed sphere in any dimension.
Finally, in the last subsections we apply the method to obtain the main zeta invariants for the zeta functions on the 2 and 3 dimensional deformed spheres. By the following lemma (see [60] or [59]), we can reduce $\zeta(s, \Delta_{S_k^{N+1}})$ to a sum of simple and double zeta functions.

Lemma 4.1. Let $f(z)$ be a regular function of $z$. Then
\[
\sum_{n \in \mathbb{N}^{N+1}} f(n) = \sum_{n=0}^{\infty} \binom{n + N}{N} f(n), \qquad
\sum_{n \in \mathbb{N}_0^{N+1}} f(n) = \sum_{n=1}^{\infty} \binom{n + N}{N} f(n).
\]
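Lemma 4.1 amounts to a counting statement, which the following Python sketch illustrates by brute force. It assumes (this reading is ours, suggested by the form of the eigenvalues in Lemma 3.2) that $f(n)$ on the left-hand side stands for $f(n_1 + \cdots + n_{N+1})$ and that $\mathbb{N}$ contains 0; the function names are hypothetical.

```python
import math
from itertools import product


def count_vectors_with_sum(total, length):
    """Brute-force count of vectors with `length` non-negative integer
    entries whose entries add up to `total`."""
    return sum(
        1 for v in product(range(total + 1), repeat=length) if sum(v) == total
    )


def multi_sum(f, N, cutoff):
    """Left-hand side of Lemma 4.1, restricted to vectors in N^(N+1)
    with n_1 + ... + n_{N+1} <= cutoff."""
    return sum(
        f(sum(v))
        for v in product(range(cutoff + 1), repeat=N + 1)
        if sum(v) <= cutoff
    )


def single_sum(f, N, cutoff):
    """Right-hand side of Lemma 4.1, truncated at n = cutoff."""
    return sum(math.comb(n + N, N) * f(n) for n in range(cutoff + 1))


if __name__ == "__main__":
    for N in (1, 2, 3):
        for n in range(6):
            print(N, n, count_vectors_with_sum(n, N + 1), math.comb(n + N, N))
```

The multiplicity $\binom{n+N}{N}$ is the number of ways of writing $n$ as an ordered sum of $N+1$ non-negative integers, so the two truncated sums agree exactly for any cutoff.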
Proposition 4.2. The zeta function associated with the Laplace operator on the $N+1$ dimensional deformed sphere is ($N \ge 1$)
\[
\zeta(s, \Delta_{S^{N+1}_{1/a}}) = \sum_{n=1}^{\infty} \binom{n + N - 1}{N - 1} [n(n + N)]^{-s} + 2\sum_{m=1}^{\infty} \sum_{n=0}^{\infty} \binom{n + N - 1}{N - 1} [(am + n)(am + n + N)]^{-s}.
\]
Since $\binom{n+N-1}{N-1} = P_N(n)$ is a polynomial of order $N$ in $n$, and since given any polynomial $P_N(n)$ we have a polynomial $Q_N(n + x)$ for any given $x$, such that $P_N(n) = Q_N(n + x)$ (and we can find explicitly the coefficients of $Q$ as functions of those of $P$ and $x$), it is sufficient to consider the two classes of zeta functions
\[
z(s; \alpha, 2, x, p) = \sum_{n=1}^{\infty} (n + x)^\alpha [(n + x)^2 + p]^{-s},
\]
and
\[
Z(s; \alpha, a, x, p) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} n^\alpha [(n + am + x)^2 + p]^{-s}.
\]
This will be done in Subsects. 4.2 and 4.3, but first, the next subsection is dedicated to recalling and generalizing some results on sequences of spectral type and associated zeta functions introduced in [59], which are needed in what follows.
4.1. Sequences of spectral type and zeta invariants. In this subsection we will use some concepts and results developed in [59], which we briefly recall here. We refer to that work for further details and complete proofs. Let $T = \{\lambda_n\}_{n=1}^{\infty}$ be a sequence of positive numbers with unique accumulation point at infinity, finite exponent $s_0$ and genus $q$. We associate to $T$ the heat function
\[
f(t, T) = 1 + \sum_{n=1}^{\infty} e^{-\lambda_n t},
\]
the logarithmic Fredholm determinant
\[
\log F(z, T) = \log \prod_{n=1}^{\infty} \left(1 + \frac{z}{\lambda_n}\right) e^{\sum_{j=1}^{q} \frac{(-1)^j}{j} \frac{z^j}{\lambda_n^j}},
\]
and the zeta function
\[
\zeta(s, T) = \sum_{n=1}^{\infty} \lambda_n^{-s}.
\]
The sequence $T$ is said to be of spectral type if there exists an asymptotic expansion of the associated heat function for small $t$ in powers of $t$ and powers of $t$ times positive integer powers of $\log t$. In particular it is said to be a simply regular sequence of spectral type if the associated zeta function has at most simple poles (see [59] pp. 4 and 9). Formulas to deal with the zeta invariants for sequences of spectral type are given in [59]. In particular, non-homogeneous sequences are considered there as well. We generalize the concept of non-homogeneous sequence here by considering, for any given sequence of spectral type $T_0 = \{\lambda_n\}_{n=1}^{\infty}$, the shifted sequence $T_d = \{\lambda_n + d\}_{n=1}^{\infty}$, where $d$ is a parameter subject to the unique condition that $\mathrm{Re}(\lambda_n + d)$ is always positive. We can prove the following results for a shifted sequence (see [59] Proposition 2.9 and Corollary 2.10 for details).

Lemma 4.3. Let $T_0 = \{\lambda_n\}_{n=1}^{\infty}$ be a sequence of finite exponent $s_0$ and genus $q$; then the associated shifted sequence $T_d = \{\lambda_n + d\}_{n=1}^{\infty}$, with $d$ such that $\mathrm{Re}(\lambda_n + d) > 0$ for all $n$, is a sequence of finite exponent $s_0$ and genus $q$. Moreover, $T_0$ is of spectral type if and only if $T_d$ is of spectral type. If $T_0$ is simply regular, so is $T_d$.
Proposition 4.4. Let $T_0 = \{\lambda_n\}_{n=1}^{\infty}$ be a simply regular sequence of spectral type with finite exponent $s_0$ and genus $q$, and $T_d = \{\lambda_n + d\}_{n=1}^{\infty}$, with $d$ such that $\mathrm{Re}(\lambda_n + d) > 0$ for all $n$, an associated shifted sequence. Then,
\[
\zeta(0, T_d) = \zeta(0, T_0) + \sum_{j=1}^{q} \frac{(-1)^j}{j}\, \mathrm{Res}_1(\zeta(s, T_0), s = j)\, d^j,
\]
\[
\zeta'(0, T_d) = \zeta'(0, T_0) - \log F(d, T_0) + \sum_{j=1}^{q} \frac{(-1)^j}{j} \Big[\mathrm{Res}_0(\zeta(s, T_0), s = j) + (\gamma + \psi(j))\, \mathrm{Res}_1(\zeta(s, T_0), s = j)\Big] d^j.
\]

Proposition 4.5. Let $T_0 = \{\lambda_n\}_{n=1}^{\infty}$ be a simply regular sequence of spectral type with finite exponent $s_0$ and genus $q$. Let $L_0 = \{\lambda_n^2\}_{n=1}^{\infty}$, and $d$ such that $\mathrm{Re}(\lambda_n + d) > 0$ for all $n$. Then,
\[
\zeta(0, L_{d^2}) = \frac{1}{2}\Big(\zeta(0, T_{id}) + \zeta(0, T_{-id})\Big),
\]
\[
\zeta'(0, L_{d^2}) = \zeta'(0, T_{id}) + \zeta'(0, T_{-id}) - \sum_{j=1}^{[q/2]} \frac{(-1)^j}{j} \sum_{k=1}^{j} \frac{1}{2k - 1}\, \mathrm{Res}_1(\zeta(s, T_0), s = 2j)\, d^{2j}.
\]
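As a sanity check of Proposition 4.4, one can take the simplest example $T_0 = \{1, 2, 3, \dots\}$, for which $\zeta(s, T_0)$ is the Riemann zeta function (genus $q = 1$, $\mathrm{Res}_1 = 1$ and $\mathrm{Res}_0 = \gamma$ at $s = 1$) and $\zeta(s, T_d)$ is the Hurwitz zeta function $\zeta_H(s, 1 + d)$. The Python sketch below is our illustration, not taken from the paper: it compares the right-hand side of the formula for $\zeta'(0, T_d)$, with the Fredholm determinant truncated to finitely many factors, against the classical value $\log\Gamma(1 + d) - \frac{1}{2}\log 2\pi$. The function names and the truncation length are arbitrary choices.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_fredholm(d, terms=200_000):
    """Truncated log F(d, T0) for T0 = {1, 2, 3, ...} with genus q = 1:
    sum over n of log(1 + d/n) - d/n."""
    return sum(math.log1p(d / n) - d / n for n in range(1, terms + 1))


def zeta_prime_zero_shifted(d, terms=200_000):
    """Right-hand side of Proposition 4.4 for zeta'(0, T_d)."""
    zeta_prime_0 = -0.5 * math.log(2 * math.pi)  # Riemann zeta'(0)
    res1 = 1.0                                   # residue at s = 1
    res0 = EULER_GAMMA                           # finite part at s = 1
    psi_1 = -EULER_GAMMA                         # digamma function at 1
    j = 1
    correction = ((-1) ** j / j) * (res0 + (EULER_GAMMA + psi_1) * res1) * d ** j
    return zeta_prime_0 - log_fredholm(d, terms) + correction


def zeta_prime_zero_hurwitz(d):
    """Classical value of d/ds zeta_H(s, 1 + d) at s = 0."""
    return math.lgamma(1 + d) - 0.5 * math.log(2 * math.pi)


if __name__ == "__main__":
    for d in (0.3, 0.7, 1.5):
        print(d, zeta_prime_zero_shifted(d), zeta_prime_zero_hurwitz(d))
```

In this example the first formula of the proposition reads $\zeta(0, T_d) = -\frac{1}{2} - d$, in agreement with $\zeta_H(0, 1 + d) = \frac{1}{2} - (1 + d)$; the two printed columns should agree up to the truncation error of the product, of order $d^2/(2 \cdot \texttt{terms})$.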
Remark 4.6. Note that the numbers $\lambda_n$ in the sequence need not be distinct, i.e. the cases with multiplicity are covered by Propositions 4.4 and 4.5. In particular, assume the sequence is $T_0 = \{\lambda_n\}_{n=1}^{\infty}$, each $\lambda_n$ having multiplicity $\rho_n$ (we cover the case of a general abstract multiplicity, given by any positive real number). Then, the only difficulty can be in defining the exponent of convergence of the sequence. But for our purpose it is sufficient to know the genus, and this can be obtained whenever we know the asymptotics of $\lambda_n$ and $\rho_n$ for large $n$. In fact, if $\lambda_n \sim n^b$ and $\rho_n \sim n^a$, then the general term of the associated zeta function behaves as $n^{a - bs}$, and therefore the genus is $q = \left[\frac{a+1}{b}\right]$ (the integer part).

Some more remarks on these results are in order. First, note that the approach of considering some general class of abstract sequences and of studying the analytic properties of the associated spectral functions has been developed by various authors, and in particular instances of Proposition 4.4 can be found in the literature. The original idea is probably due to Voros [61], while a good reference for a rigorous and very general setting is the work of Jorgenson and Lang [41]. However, for our purpose here, the simpler setting of [59] is more convenient. Second, observe that Proposition 4.5 was originally proved by Choi and Quine in [18], and also obtained in [25], Eq. (25). In particular, the reader can see the proof given in [59], as well as the more rapid route to this result suggested in [25].

4.2. A class of simple zeta functions. We consider the following class of simple zeta functions (compare with [57])
\[
z(s; \alpha, \beta, x, p) = \sum_{n=1}^{\infty} (n + x)^\alpha [(n + x)^\beta + p]^{-s},
\]
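for $\mathrm{Re}(s) > \frac{1+\alpha}{\beta}$, where $\alpha$ and $\beta$ are real positive numbers, and $x$ and $p$ are real numbers subject to the conditions that $n + x > 0$ and $(n + x)^\beta + p > 0$ for all $n$. Note that different equivalent techniques could be applied to deal with this case; namely one could use the Plana theorem as in [54], a regularized product as in [18], a complex integral representation as in [55], or heat-kernel techniques [30, 31].

Proposition 4.7. The function $z(s; \alpha, \beta, x, p)$ has a regular analytic continuation in the whole complex $s$-plane up to simple poles at $s = \frac{1+\alpha}{\beta} - j$, $j = 0, 1, 2, \dots$, whenever these values are not $0, -1, -2, \dots$. The origin is a regular point and if $\frac{1+\alpha}{\beta}$ is not a positive integer
\[
z(0; \alpha, \beta, x, p) = \zeta_H(-\alpha, x + 1),
\]
and
\[
z'(0; \alpha, \beta, x, p) = \beta\zeta_H'(-\alpha, x + 1) + \sum_{j=1}^{\left[\frac{\alpha+1}{\beta}\right]} \frac{(-1)^j}{j}\, \zeta_H(\beta j - \alpha, x + 1)\, p^j
- \log \prod_{n=1}^{\infty} \left(1 + \frac{p}{(n + x)^\beta}\right)^{(n+x)^\alpha} e^{(n+x)^\alpha \sum_{j=1}^{\left[\frac{\alpha+1}{\beta}\right]} \frac{(-1)^j}{j} \frac{p^j}{(n+x)^{\beta j}}},
\]
while if $\frac{1+\alpha}{\beta}$ is a positive integer
\[
z(0; \alpha, \beta, x, p) = \zeta_H(-\alpha, x + 1) + \frac{(-1)^{\frac{\alpha+1}{\beta}}}{\alpha + 1}\, p^{\frac{\alpha+1}{\beta}},
\]
and
\[
z'(0; \alpha, \beta, x, p) = \beta\zeta_H'(-\alpha, x + 1) + \sum_{j=1}^{\frac{\alpha+1}{\beta} - 1} \frac{(-1)^j}{j}\, \zeta_H(\beta j - \alpha, x + 1)\, p^j
+ \frac{(-1)^{\frac{\alpha+1}{\beta}}}{\frac{\alpha+1}{\beta}} \left[-\psi(x + 1) + \frac{1}{\beta}\left(\gamma + \psi\!\left(\frac{1+\alpha}{\beta}\right)\right)\right] p^{\frac{\alpha+1}{\beta}}
- \log \prod_{n=1}^{\infty} \left(1 + \frac{p}{(n + x)^\beta}\right)^{(n+x)^\alpha} e^{(n+x)^\alpha \sum_{j=1}^{\frac{\alpha+1}{\beta}} \frac{(-1)^j}{j} \frac{p^j}{(n+x)^{\beta j}}}.
\]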
Proof. The result follows by applying Proposition 4.4. First, note that the unshifted sequence is $T_0 = \{(n + x)^\beta\}$, with multiplicity $(n + x)^\alpha$. By Remark 4.6, the sequence has genus $q = \left[\frac{\alpha+1}{\beta}\right]$. The associated zeta function is $z(s; \alpha, \beta, x, 0) = \zeta_H(\beta s - \alpha, x + 1)$, and this clearly shows that $T_0$ is a simply regular sequence of spectral type, and so is $T_p$ by Lemma 4.3. The unique pole is at $s = \frac{1+\alpha}{\beta}$ and
\[
\mathrm{Res}_1\!\left(z(s; \alpha, \beta, x, 0), s = \frac{1+\alpha}{\beta}\right) = \frac{1}{\beta}, \qquad
\mathrm{Res}_0\!\left(z(s; \alpha, \beta, x, 0), s = \frac{1+\alpha}{\beta}\right) = -\psi(x + 1).
\]
The associated Fredholm determinant is
\[
F(z, T_0) = \prod_{n=1}^{\infty} \left(1 + \frac{z}{(n + x)^\beta}\right)^{(n+x)^\alpha} e^{(n+x)^\alpha \sum_{j=1}^{q} \frac{(-1)^j}{j} \frac{z^j}{(n+x)^{\beta j}}}.
\]
Next, using the expression given in the proof of Proposition 4.4 we have
\[
z(s; \alpha, \beta, x, p) = \sum_{j=0}^{\infty} \binom{-s}{j} \zeta_H(\beta(s + j) - \alpha, x + 1)\, p^j,
\]
thus we have poles when $\beta(s + j) - \alpha = 1$, i.e. $s = \frac{1+\alpha}{\beta} - j$, $j = 0, 1, 2, \dots$, whenever these values are not $0, -1, -2, \dots$, and the residues are easily computed. To obtain the value at $s = 0$, it is useful to distinguish two cases (see [57]). In fact, from the above expression, when $s = 0$ the unique term that is singular is the one with $\beta j - \alpha = 1$, i.e. $j = \frac{\alpha+1}{\beta}$, that is necessarily a positive integer since $\alpha \ge 0$. Now, if $\frac{\alpha+1}{\beta}$ is not a positive integer, then we have no integer poles, $q = \left[\frac{\alpha+1}{\beta}\right]$, and hence
\[
z(0; \alpha, \beta, x, p) = z(0; \alpha, \beta, x, 0),
\]
and since $\mathrm{Res}_0(z(s; \alpha, \beta, x, 0), s = j) = z(j; \alpha, \beta, x, 0) = \zeta_H(\beta j - \alpha, x + 1)$,
\[
z'(0; \alpha, \beta, x, p) = z'(0; \alpha, \beta, x, 0) + \sum_{j=1}^{\left[\frac{\alpha+1}{\beta}\right]} \frac{(-1)^j}{j}\, \mathrm{Res}_0(z(s; \alpha, \beta, x, 0), s = j)\, p^j - \log F(p, T_0).
\]
If $\frac{\alpha+1}{\beta}$ is a positive integer, we have a pole, $q = \left[\frac{\alpha+1}{\beta}\right] = \frac{\alpha+1}{\beta}$, and we also need to take the residue into account. As we have seen, since the Hurwitz zeta function has only one pole at $s = 1$ with residue 1, all the terms, apart from those with $j = 0$ and $j = \frac{\alpha+1}{\beta}$, have vanishing residue, and we obtain
\[
z(0; \alpha, \beta, x, p) = z(0; \alpha, \beta, x, 0) + \frac{(-1)^{\frac{\alpha+1}{\beta}}}{\frac{\alpha+1}{\beta}}\, \mathrm{Res}_1\!\left(z(s; \alpha, \beta, x, 0), s = \frac{1+\alpha}{\beta}\right) p^{\frac{\alpha+1}{\beta}},
\]
and
\[
z'(0; \alpha, \beta, x, p) = z'(0; \alpha, \beta, x, 0) + \sum_{j=1}^{\frac{\alpha+1}{\beta} - 1} \frac{(-1)^j}{j}\, \mathrm{Res}_0(z(s; \alpha, \beta, x, 0), s = j)\, p^j
+ \frac{(-1)^{\frac{\alpha+1}{\beta}}}{\frac{\alpha+1}{\beta}} \left[\mathrm{Res}_0\!\left(z(s; \alpha, \beta, x, 0), s = \frac{1+\alpha}{\beta}\right) + \left(\gamma + \psi\!\left(\frac{1+\alpha}{\beta}\right)\right) \mathrm{Res}_1\!\left(z(s; \alpha, \beta, x, 0), s = \frac{1+\alpha}{\beta}\right)\right] p^{\frac{\alpha+1}{\beta}} - \log F(p, T_0),
\]
that gives the formula stated in the thesis.
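The binomial expansion used in the proof can be tested numerically in the region of absolute convergence. The following Python sketch is an illustration only: the parameter values, truncation lengths and helper names are arbitrary assumptions, and the expansion is used under the extra condition $|p| < (1 + x)^\beta$ so that the series in $j$ converges. It compares the defining series of $z(s; \alpha, \beta, x, p)$ with $\sum_j \binom{-s}{j} \zeta_H(\beta(s + j) - \alpha, x + 1)\, p^j$, computing the Hurwitz zeta function from a partial sum plus an Euler–Maclaurin tail.

```python
import math


def hurwitz_zeta(sigma, a, head=60):
    """zeta_H(sigma, a) for sigma > 1: direct sum of `head` terms plus
    the leading Euler-Maclaurin terms for the remaining tail."""
    partial = sum((n + a) ** (-sigma) for n in range(head))
    y = head + a
    tail = (
        y ** (1 - sigma) / (sigma - 1)
        + 0.5 * y ** (-sigma)
        + sigma * y ** (-sigma - 1) / 12.0
    )
    return partial + tail


def generalized_binomial(top, j):
    """Generalized binomial coefficient binom(top, j) for real `top`."""
    result = 1.0
    for i in range(j):
        result *= (top - i) / (i + 1)
    return result


def z_direct(s, alpha, beta, x, p, terms=20_000):
    """Truncated defining series of z(s; alpha, beta, x, p)."""
    return sum(
        (n + x) ** alpha * ((n + x) ** beta + p) ** (-s)
        for n in range(1, terms + 1)
    )


def z_series(s, alpha, beta, x, p, order=60):
    """Truncated binomial expansion in powers of p."""
    return sum(
        generalized_binomial(-s, j)
        * hurwitz_zeta(beta * (s + j) - alpha, x + 1)
        * p ** j
        for j in range(order)
    )


if __name__ == "__main__":
    args = (2.5, 1.0, 2.0, 0.5, 0.4)  # s, alpha, beta, x, p
    print(z_direct(*args), z_series(*args))
```

The expansion follows from $[(n+x)^\beta + p]^{-s} = (n+x)^{-\beta s}\left(1 + p (n+x)^{-\beta}\right)^{-s}$ and the binomial series; the analytic continuation to $s = 0$ is then read off term by term, as in the proof.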
4.3. A class of double zeta functions. Consider the following class of double zeta functions
\[
Z(s;\alpha,a,x,p) = \sum_{m,n=1}^{\infty} n^{\alpha} \left[(am+n+x)^2 + p\right]^{-s},
\]
for $\mathrm{Re}(s) > 1+\alpha$, and where $x$ and $p$ are real constants subject to the conditions that $am+n+x > 0$ and $(am+n+x)^2 + p > 0$ for all $n$ and $m$, and $\alpha$ is a non-negative integer (the case where $\alpha$ is any real number can be treated by similar methods, but is much more complicated, see [56]).

Remark 4.8. In the more general case
\[
Z(s;\alpha,\beta,a,x,p) = \sum_{m,n=1}^{\infty} n^{\alpha} \left[(am+n+x)^{\beta} + p\right]^{-s},
\]
for $\mathrm{Re}(s) > \frac{2(1+\alpha)}{\beta}$, and where $x$ and $p$ are real constants subject to the conditions that $am+n+x > 0$ and $(am+n+x)^{\beta} + p > 0$ for all $n$ and $m$, we would have genus $q = \left[\frac{2(1+\alpha)}{\beta}\right]$ by Remark 4.6 since the leading term behaves like $n^{\alpha} n^{-\beta s/2}$, but we would not be able to prove that these are regular sequences of spectral type as in the following proof of Lemma 4.9.

The sequences appearing in these zeta functions are: $S_0 = \{\lambda_{m,n} = (am+n+x)^2\}_{m,n=1}^{\infty}$ and the associated shifted sequence $S_p = \{\lambda_{m,n} + p\}_{m,n=1}^{\infty}$, both with multiplicity $n^{\alpha}$. These are sequences with finite exponent and genus $q = [1+\alpha]$ by Remark 4.6. We first show that $S_p$ is a simply regular sequence of spectral type.

Lemma 4.9. The sequence $S_p = \{(am+n+x)^2 + p\}_{m,n=1}^{\infty}$ is a simply regular sequence of spectral type.

Proof. By Lemma 4.3, we need to show that there exists an expansion of the desired type for the heat function
\[
f(t, S_0) = 1 + \sum_{m,n=1}^{\infty} n^{\alpha} e^{-(am+n+x)^2 t}.
\]
Consider the sequence $L = \{am+n+x\}_{m,n=1}^{\infty}$, with multiplicity $n^{\alpha}$, of finite exponent and genus 2 (since $m^a + n^b \leq (mn)^{\frac{ab}{a+b}}$). The associated heat function is
\[
f(t, L) = 1 + \sum_{m,n=1}^{\infty} n^{\alpha} e^{-(am+bn+c)t},
\]
and the associated Fredholm determinant is
\[
F(z, L) = \prod_{m,n=1}^{\infty} \left(1 + \frac{z}{am+bn+c}\right)^{n^{\alpha}} e^{\sum_{j=1}^{2} \frac{(-1)^j}{j} \frac{n^{\alpha} z^j}{(am+bn+c)^j}}.
\]
Since
\[
f(t, L) = 1 + \sum_{m,n=1}^{\infty} n^{\alpha} e^{-(am+bn+c)t} = 1 + e^{-ct} \sum_{m=1}^{\infty} e^{-amt} \sum_{n=1}^{\infty} n^{\alpha} e^{-bnt},
\]
and we have an expansion of each factor in powers of $t$ (see [57] Sect. 3.1 for the last sum), it is clear that we have an expansion of the form
\[
f(t, L) = \sum_{j=0}^{\infty} e_j t^{\delta_j}.
\]
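By Lemma 2.5 of [57], $L$ is simply regular, and hence the unique logarithmic terms in the expansion of $F(z, L)$ are of the form $z^k \log z$, with integer $k \leq 2$. Now, consider the product
\[
\begin{aligned}
F(iz, L) F(-iz, L) = \prod_{m,n=1}^{\infty} & \left(1 + \frac{iz}{am+bn+c}\right)^{n^{\alpha}} e^{\sum_{j=1}^{2} \frac{(-1)^j}{j} \frac{n^{\alpha} (iz)^j}{(am+bn+c)^j}} \\
& \times \left(1 - \frac{iz}{am+bn+c}\right)^{n^{\alpha}} e^{\sum_{j=1}^{2} \frac{(-1)^j}{j} \frac{n^{\alpha} (-iz)^j}{(am+bn+c)^j}}.
\end{aligned}
\]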
Since $i^j + (-i)^j = 0$ for odd $j$, and $-2$ when $j = 2$, this gives
\[
F(iz, L) F(-iz, L) = \prod_{m,n=1}^{\infty} \left(1 + \frac{z^2}{(am+bn+c)^2}\right)^{n^{\alpha}} e^{-\frac{n^{\alpha} z^2}{(am+bn+c)^2}} = F(z^2, S_0),
\]
and we obtain a decomposition of the Fredholm determinant associated to the sequence $S_0$. This means that $\log F(z, S_0)$ has an expansion with unique logarithmic terms of the form $z^k \log z$, with integer $k \leq 1$, and therefore $S_0$ is a simply regular sequence of spectral type by Lemma 2.5 of [59].

Lemma 4.9 shows that the sequences appearing in the definition of the function $Z(s;\alpha,a,x,p) = \zeta(s, S_p)$ are such that we can apply Proposition 4.4 in order to obtain all the desired zeta invariants. For this we need explicit knowledge of the zeta invariants of the sequence $S_0$. This is in the next lemma.

Lemma 4.10. The function $\chi(s;\alpha,a,x)$ defined for real $a$ and $x$ such that $am+n+x > 0$, for all $m, n \in \mathbb{N}_0$, and $\alpha$ a non-negative integer, by the sum
\[
\chi(s;\alpha,a,x) = \sum_{m,n=1}^{\infty} n^{\alpha} (am+n+x)^{-s},
\]
when $\mathrm{Re}(s) > 2(\alpha+1)$, can be continued analytically to the whole complex plane up to a finite set of simple poles at $s = 1, 2, \dots, \alpha+2$, by means of the following formula:
\[
\begin{aligned}
\chi(s;\alpha,a,x) ={}& \frac{1}{2} a^{-s} \zeta_H(s, (x+1)/a + 1) + \frac{a^{1-s}}{s-1} \zeta_H(s-1, (x+1)/a + 1) \\
&+ \sum_{j=1}^{\alpha} \frac{\alpha(\alpha-1)\cdots(\alpha-j+1)}{(s-1)(s-2)\cdots(s-j-1)}\, a^{j+1-s} \zeta_H(s-j-1, (x+1)/a + 1) \\
&+ i a^{-s} \int_0^{\infty} \frac{(1+iy)^{\alpha} \zeta_H(s, (x+1+iy)/a + 1) - (1-iy)^{\alpha} \zeta_H(s, (x+1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy.
\end{aligned}
\]
In particular, this shows that the point $s = 0$ is a regular point.

Proof. We apply the Plana theorem as in [54]. Since the general term behaves as $n^{\alpha} n^{-s/2}$, we assume $\mathrm{Re}(s) > 2(\alpha+1)$,
\[
\begin{aligned}
\chi(s;\alpha,a,x) ={}& \frac{1}{2} \sum_{m=1}^{\infty} (am+x+1)^{-s} + \sum_{m=1}^{\infty} \int_1^{\infty} t^{\alpha} (am+t+x)^{-s}\, dt \\
&+ i \sum_{m=1}^{\infty} \int_0^{\infty} \frac{(1+iy)^{\alpha} (am+x+1+iy)^{-s} - (1-iy)^{\alpha} (am+x+1-iy)^{-s}}{e^{2\pi y} - 1}\, dy.
\end{aligned}
\]
Recall that $\alpha$ is a non-negative integer, then we can integrate recursively the middle term obtaining, for $\alpha > 0$,
\[
\int_1^{\infty} t^{\alpha} (am+t+x)^{-s}\, dt = \sum_{j=0}^{\alpha} \frac{\alpha(\alpha-1)\cdots(\alpha-j+1)}{(s-1)(s-2)\cdots(s-j-1)} (am+x+1)^{j+1-s};
\]
this gives
\[
\begin{aligned}
\chi(s;\alpha,a,x) ={}& \frac{1}{2} a^{-s} \sum_{m=1}^{\infty} (m + (x+1)/a)^{-s} + \frac{a^{1-s}}{s-1} \sum_{m=1}^{\infty} (m + (x+1)/a)^{1-s} \\
&+ \sum_{j=1}^{\alpha} \frac{\alpha(\alpha-1)\cdots(\alpha-j+1)}{(s-1)(s-2)\cdots(s-j-1)}\, a^{j+1-s} \sum_{m=1}^{\infty} (m + (x+1)/a)^{j+1-s} \\
&+ i a^{-s} \sum_{m=1}^{\infty} \int_0^{\infty} \frac{(1+iy)^{\alpha} (m + (x+1+iy)/a)^{-s} - (1-iy)^{\alpha} (m + (x+1-iy)/a)^{-s}}{e^{2\pi y} - 1}\, dy,
\end{aligned}
\]
and, due to uniform convergence of the integral, concludes the proof.

Remark 4.11. We could deal with this kind of double zeta function by applying the classical integral formula of Hermite as in the case of the Riemann zeta function. This approach confirms the above results, but it would not give a tractable expression for the singular part.
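As a numerical illustration of the Plana summation formula used in the proof of Lemma 4.10, the following minimal Python sketch applies the same three-term decomposition (half of the first term, a real integral, and an imaginary-axis correction) to the single series $\sum_{n \geq 1} n^{-2}$ and compares the result with $\pi^2/6$. The test series, the function name `plana_sum`, the truncation `y_max` and the midpoint quadrature are assumptions of this sketch and do not come from the paper.

```python
import math


def plana_sum(f, tail_integral, y_max=12.0, steps=200_000):
    """Approximate sum_{n>=1} f(n) by the Plana formula

        f(1)/2 + int_1^inf f(t) dt
               + i * int_0^inf [f(1+iy) - f(1-iy)] / (e^{2 pi y} - 1) dy.

    `tail_integral` is the value of int_1^inf f(t) dt, supplied in closed form.
    """
    h = y_max / steps
    correction = 0.0
    for k in range(steps):
        # Midpoint rule: avoids evaluating the removable singularity at y = 0.
        y = (k + 0.5) * h
        diff = f(complex(1.0, y)) - f(complex(1.0, -y))
        # For f real on the real axis, i * (f(1+iy) - f(1-iy)) is real.
        correction += (1j * diff).real / math.expm1(2.0 * math.pi * y)
    return 0.5 * f(1.0).real + tail_integral + h * correction


# Illustrative choice: f(t) = t^(-2), whose integral over [1, inf) equals 1.
value = plana_sum(lambda t: t ** -2, tail_integral=1.0)
```

In this sketch `value` should agree with `math.pi ** 2 / 6` up to the quadrature error; in the lemma the same decomposition is applied to the inner sum over $n$ for each fixed $m$.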
We can now obtain the zeta invariants of the zeta function $Z(s;\alpha,a,x,p)$ for all the acceptable values of the parameters. This allows us to compute the regularized determinant of the deformed sphere of any dimension, as pointed out at the beginning of this section. Besides, we will give explicit formulas and results for the low dimensional cases in the next subsections.

4.4. Zeta determinant on the deformed 2 sphere. By Proposition 4.2, the zeta function associated to the operator $\Delta_{S^2_{1/a}}$ is the function defined by the series
\[
\zeta(s, \Delta_{S^2_{1/a}}) = \sum_{n=1}^{\infty} [n(n+1)]^{-s} + 2 \sum_{m=1,n=0}^{\infty} [(am+n)(am+n+1)]^{-s},
\]
when $\mathrm{Re}(s) > 2$, and by analytic continuation elsewhere. The aim of this section is to study this zeta function and in particular to obtain a formula for the values of $\zeta(0, \Delta_{S^2_{1/a}})$ and $\zeta'(0, \Delta_{S^2_{1/a}})$. When $a = 1$, this reduces to the zeta function on the 2-sphere: $\zeta(s, \Delta_{S^2_{1/a=1}}) = \sum_{n=1}^{\infty} (2n+1)(n^2+n)^{-s}$ [18, 54, 55]. The zeta function $\zeta(s, \Delta_{S^2_{1/a}})$ decomposes as
\[
\zeta(s, \Delta_{S^2_{1/a}}) = z(s; 0, 2, 1/2, -1/4) + 2 Z(s; 0, a, -1/2, -1/4),
\]
and we can easily check that the values of the parameters satisfy the conditions in the definition of these functions. We provide two equivalent formulas for the zeta determinant on the deformed 2-sphere, Theorems 4.15 and 4.16. The first is obtained applying Proposition 4.4, the second applying Proposition 4.5. Computations are given in the proofs of the following lemmas. The first lemma follows by a direct application of Proposition 4.7 and properties of special functions.

Lemma 4.12. $z(0; 0, 2, 1/2, -1/4) = -1$, $z'(0; 0, 2, 1/2, -1/4) = -\log 2\pi$.
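The values in Lemma 4.12 can be checked numerically. The minimal Python sketch below assumes, following the decomposition above, that $z(s; 0, 2, 1/2, 0) = \zeta_H(2s, 3/2)$ and that the relevant product is $F(-1/4, T_0) = \prod_{n \geq 1} \left(1 - \frac{1}{4(n+1/2)^2}\right)$, with no exponential factor since $(\alpha+1)/\beta = 1/2$ is not an integer. It uses the Lerch formula $\zeta_H'(0, c) = \log \Gamma(c) - \frac{1}{2} \log 2\pi$. The function names and the tail estimate are choices of this sketch, not taken from the paper.

```python
import math


def z_at_zero():
    # zeta_H(0, c) = 1/2 - c with c = 3/2.
    return 0.5 - 1.5


def z_prime_at_zero(n_terms=200_000):
    # d/ds zeta_H(2s, 3/2) at s = 0 equals 2 * zeta_H'(0, 3/2) (Lerch formula).
    unshifted = 2.0 * (math.lgamma(1.5) - 0.5 * math.log(2.0 * math.pi))
    # log F(-1/4, T0) for the sequence T0 = {(n + 1/2)^2}, n >= 1.
    log_f = sum(math.log1p(-0.25 / (n + 0.5) ** 2) for n in range(1, n_terms + 1))
    # Tail estimate: sum_{n > N} -1/(4 (n + 1/2)^2) is approximately -1/(4 (N + 1)).
    log_f -= 0.25 / (n_terms + 1.0)
    return unshifted - log_f


value_at_zero = z_at_zero()
derivative_at_zero = z_prime_at_zero()
```

Under these assumptions `value_at_zero` is $-1$ and `derivative_at_zero` should agree with $-\log 2\pi$ up to the truncation error of the product.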
Lemma 4.13.
\[
Z(0; 0, a, -1/2, -1/4) = \frac{a}{12} + \frac{1}{12a},
\]
\[
\begin{aligned}
Z'(0; 0, a, -1/2, -1/4) ={}& \frac{1}{6}\left(\frac{1}{2a} - a\right) \log a + \zeta_H'(0, 1/(2a)+1) - 2a \zeta_H(-1, 1/(2a)+1) - 2a \zeta_H'(-1, 1/(2a)+1) \\
&+ 2i \int_0^{\infty} \frac{\zeta_H'(0, (1/2+iy)/a + 1) - \zeta_H'(0, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&+ \frac{1}{8a^2} \zeta_H(2, 1/(2a)+1) - \frac{1}{4a}\left(\psi(1/(2a)+1) + 1 + \log a\right) \\
&+ \frac{i}{4a^2} \int_0^{\infty} \frac{\zeta_H(2, (1/2+iy)/a + 1) - \zeta_H(2, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&- \log \prod_{m,n=1}^{\infty} \left(1 - \frac{1}{4(am+n-1/2)^2}\right) e^{\frac{1}{4(am+n-1/2)^2}}.
\end{aligned}
\]

Proof. The function $Z(s; 0, a, -1/2, -1/4)$ is the zeta function associated with the sequence $S_{-1/4} = \{(am+n-1/2)^2 - 1/4\}$, all terms with multiplicity 1. By Lemma 4.9, $S_{-1/4}$ is a simply regular sequence of spectral type. In order to apply Proposition 4.4, we need to study the unshifted sequence $S_0 = \{(am+n-1/2)^2\}$. This sequence has genus 1, the associated Fredholm determinant is
\[
F(z, S_0) = \prod_{m,n=1}^{\infty} \left(1 + \frac{z}{(am+n-1/2)^2}\right) e^{-\frac{z}{(am+n-1/2)^2}},
\]
and the associated zeta function is $\zeta(s, S_0) = \chi(2s; 0, a, -1/2)$. By Proposition 4.4 and since the genus is 1, we have that
\[
Z(0; 0, a, -1/2, -1/4) = \chi(0; 0, a, -1/2) + \frac{1}{4} \operatorname{Res}_1(\chi(2s; 0, a, -1/2), s = 1),
\]
and that
\[
Z'(0; 0, a, -1/2, -1/4) = \chi'(2s; 0, a, -1/2)|_{s=0} + \frac{1}{4} \operatorname{Res}_0(\chi(2s; 0, a, -1/2), s = 1) - \log F(-1/4, S_0),
\]
and hence we need to compute the values at $s = 0$ of $\zeta(s, S_0) = \chi(2s; 0, a, -1/2)$, and the residua at $s = 1$. For this, we use the formula provided in Lemma 4.10, namely
\[
\begin{aligned}
\chi(2s; 0, a, -1/2) ={}& \frac{1}{2} a^{-2s} \zeta_H(2s, 1/(2a)+1) + \frac{1}{2s-1} a^{1-2s} \zeta_H(2s-1, 1/(2a)+1) \\
&+ i a^{-2s} \int_0^{\infty} \frac{\zeta_H(2s, (1/2+iy)/a + 1) - \zeta_H(2s, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy.
\end{aligned}
\tag{1}
\]
We obtain
\[
\begin{aligned}
\chi(0; 0, a, -1/2) ={}& \frac{1}{2} \zeta_H(0, 1/(2a)+1) - a \zeta_H(-1, 1/(2a)+1) \\
&+ i \int_0^{\infty} \frac{\zeta_H(0, (1/2+iy)/a + 1) - \zeta_H(0, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy = \frac{a}{12} - \frac{1}{24a},
\end{aligned}
\]
where we have used [35] 9.531 and 9.611.1. Next, we use Eq. (1) to compute the residua at the pole $s = 1$. The unique singular term is the middle one, so we expand the different factors in it near $s = 1$, using [35] 9.533.2,
\[
\frac{a^{1-2s}}{2s-1} \zeta_H(2s-1, 1 + 1/(2a)) = \frac{1}{2a} \frac{1}{s-1} - \frac{1}{a}\left(\psi(1 + 1/(2a)) + 1 + \log a\right) + O(s-1).
\]
This gives
\[
\operatorname{Res}_1(\chi(2s; 0, a, -1/2), s = 1) = \frac{1}{2a},
\]
and
\[
\begin{aligned}
\operatorname{Res}_0(\chi(2s; 0, a, -1/2), s = 1) ={}& \frac{1}{2a^2} \zeta_H(2, 1/(2a)+1) - \frac{1}{a}(1 + \log a) - \frac{1}{a} \psi(1/(2a)+1) \\
&+ \frac{i}{a^2} \int_0^{\infty} \frac{\zeta_H(2, (1/2+iy)/a + 1) - \zeta_H(2, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy.
\end{aligned}
\]
Last, we compute the derivative:
\[
\begin{aligned}
\chi'(0; 0, a, -1/2) ={}& -2 \chi(0; 0, a, -1/2) \log a + \zeta_H'(0, 1/(2a)+1) - 2a \zeta_H(-1, 1/(2a)+1) - 2a \zeta_H'(-1, 1/(2a)+1) \\
&+ 2i \int_0^{\infty} \frac{\zeta_H'(0, (1/2+iy)/a + 1) - \zeta_H'(0, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
={}& \frac{1}{6}\left(\frac{1}{2a} - a\right) \log a + \log \Gamma(1/(2a)+1) - \frac{1}{2} \log 2\pi + \frac{a}{6} + \frac{1}{4a} + \frac{1}{2} - 2a \zeta_H'(-1, 1/(2a)+1) \\
&+ 2i \int_0^{\infty} \log \frac{\Gamma((1/2+iy)/a + 1)}{\Gamma((1/2-iy)/a + 1)} \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]
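Collecting, we obtain the thesis.

Lemma 4.14.
\[
\begin{aligned}
Z'(0; 0, a, -1/2, -1/4) ={}& -\left(\frac{a}{6} + \frac{1}{6a}\right) \log a - \frac{1}{2} \log 2\pi + \frac{1}{2} \log \Gamma\!\left(1 + \frac{1}{a}\right) \\
&+ \frac{a}{6} + \frac{1}{2} + \frac{3}{4a} - a \zeta_R'(-1) - a \zeta_H'\!\left(-1, 1 + \frac{1}{a}\right) \\
&+ i \int_0^{\infty} \log \frac{\Gamma\!\left(1 + i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{1}{a} + i\frac{y}{a}\right)}{\Gamma\!\left(1 - i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{1}{a} - i\frac{y}{a}\right)} \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]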
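Proof. In the language of Proposition 4.5, we have
\[
L_0 = \{(am+n+x)^2\}, \qquad L_{b^2} = \{(am+n+x)^2 + b^2\},
\]
\[
S_0 = \{am+n+x\}, \qquad S_{ib} = \{am+n+x+ib\},
\]
where the genus of $S_0$ is $p = 2$. Therefore, by Proposition 4.5,
\[
Z'(0; 0, a, x, b^2) = \zeta'(0, L_{b^2}) = \zeta'(0, S_{ib}) + \zeta'(0, S_{-ib}) - \operatorname{Res}_1(\zeta(s, S_0), s = 2)\, b^2.
\]
Also, we have that
\[
\zeta(s, S_{ib}) = \sum_{m,n=1}^{\infty} (am+n+x+ib)^{-s} = \chi(s; 0, a, x+ib),
\]
and therefore, we need information on $\chi$. We use Lemma 4.10. We have, with $z = x \pm ib$,
\[
\begin{aligned}
\chi(s; 0, a, z) ={}& \frac{1}{2} a^{-s} \zeta_H\!\left(s, \frac{z+1}{a} + 1\right) + \frac{a^{1-s}}{s-1} \zeta_H\!\left(s-1, \frac{z+1}{a} + 1\right) \\
&+ i a^{-s} \int_0^{\infty} \frac{\zeta_H\!\left(s, \frac{z+1+iy}{a} + 1\right) - \zeta_H\!\left(s, \frac{z+1-iy}{a} + 1\right)}{e^{2\pi y} - 1}\, dy.
\end{aligned}
\]
This gives
\[
\operatorname{Res}_1(\chi(s; 0, a, z), s = 2) = \frac{1}{a},
\]
\[
\chi(0; 0, a, z) = \frac{1}{4} + \frac{a}{12} + \frac{1}{12a} + \frac{z^2}{2a} + \frac{z}{2a} + \frac{z}{2},
\]
\[
\begin{aligned}
\chi'(0; 0, a, z) ={}& -\chi(0; 0, a, z) \log a + \frac{1}{2} \zeta_H'\!\left(0, \frac{z+1}{a} + 1\right) - a \zeta_H\!\left(-1, \frac{z+1}{a} + 1\right) \\
&- a \zeta_H'\!\left(-1, \frac{z+1}{a} + 1\right) + i \int_0^{\infty} \log \frac{\Gamma\!\left(1 + \frac{z+iy+1}{a}\right)}{\Gamma\!\left(1 + \frac{z-iy+1}{a}\right)} \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]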
Using the decomposition at the beginning of this subsection and the results in Lemmas 4.12, 4.13 and 4.14 respectively, we can prove the following theorems.

Theorem 4.15.
\[
\zeta(0, \Delta_{S^2_{1/a}}) = -1 + \frac{a}{6} + \frac{1}{6a},
\]
\[
\begin{aligned}
\zeta'(0, \Delta_{S^2_{1/a}}) ={}& -2 \log 2\pi + 1 + \frac{a}{3} - \frac{1}{3}\left(a + \frac{1}{a}\right) \log a + 2 \log \Gamma(1/(2a)+1) \\
&+ \frac{1}{4a^2} \zeta_H(2, 1/(2a)+1) - \frac{1}{2a} \psi(1/(2a)+1) - 4a \zeta_H'(-1, 1/(2a)+1) \\
&+ 4i \int_0^{\infty} \log \frac{\Gamma((1/2+iy)/a + 1)}{\Gamma((1/2-iy)/a + 1)} \frac{dy}{e^{2\pi y} - 1} \\
&+ \frac{i}{2a^2} \int_0^{\infty} \frac{\zeta_H(2, (1/2+iy)/a + 1) - \zeta_H(2, (1/2-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&- 2 \log \prod_{m,n=1}^{\infty} \left(1 - \frac{1}{4(am+n-1/2)^2}\right) e^{\frac{1}{4(am+n-1/2)^2}}.
\end{aligned}
\]
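Theorem 4.16.
\[
\begin{aligned}
\zeta'(0, \Delta_{S^2_{1/a}}) ={}& -\left(\frac{a}{3} + \frac{1}{3a}\right) \log a - 2 \log 2\pi + \frac{a}{3} + 1 + \frac{3}{2a} + \log \Gamma\!\left(1 + \frac{1}{a}\right) \\
&- 2a \zeta_R'(-1) - 2a \zeta_H'\!\left(-1, 1 + \frac{1}{a}\right) \\
&+ 2i \int_0^{\infty} \log \frac{\Gamma\!\left(1 + i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{1}{a} + i\frac{y}{a}\right)}{\Gamma\!\left(1 - i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{1}{a} - i\frac{y}{a}\right)} \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]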
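The value $\zeta(0, \Delta_{S^2_{1/a}}) = -1 + \frac{a}{6} + \frac{1}{6a}$ of Theorem 4.15 can be cross-checked in exact rational arithmetic from the ingredients of the proof of Lemma 4.13. The minimal Python sketch below assumes the standard closed forms $\zeta_H(0, c) = \frac{1}{2} - c$, $\zeta_H(-1, c) = -\frac{1}{2} B_2(c)$ with $B_2(c) = c^2 - c + \frac{1}{6}$, and $\int_0^{\infty} \frac{y\, dy}{e^{2\pi y} - 1} = \frac{1}{24}$. The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def zeta_h_0(c):
    # zeta_H(0, c) = 1/2 - c
    return Fraction(1, 2) - c


def zeta_h_m1(c):
    # zeta_H(-1, c) = -B_2(c)/2 with B_2(c) = c^2 - c + 1/6
    return -(c * c - c + Fraction(1, 6)) / 2


def chi_at_zero(a):
    """chi(0; 0, a, -1/2) evaluated from formula (1) at s = 0."""
    c = 1 / (2 * a) + 1
    # zeta_H(0, c + iy/a) - zeta_H(0, c - iy/a) = -2iy/a, so the integral term is
    # i * (-2i/a) * int_0^inf y/(e^{2 pi y} - 1) dy = (2/a) * (1/24) = 1/(12a).
    integral_term = Fraction(1, 12) / a
    return zeta_h_0(c) / 2 - a * zeta_h_m1(c) + integral_term


def zeta_at_zero(a):
    """zeta(0) = z(0) + 2 Z(0), with Z(0) = chi(0) + (1/4) Res_1 and Res_1 = 1/(2a)."""
    res1 = 1 / (2 * a)
    big_z = chi_at_zero(a) + res1 / 4
    return Fraction(-1) + 2 * big_z


samples = [Fraction(1), Fraction(3, 2), Fraction(5, 7)]
values = [(a, chi_at_zero(a), zeta_at_zero(a)) for a in samples]
```

For $a = 1$ this gives $\zeta(0) = -\frac{2}{3}$, the known value on the round 2-sphere.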
Observe that, although the formula given in Theorem 4.16 looks nicer, it is in fact less useful than the one given in Theorem 4.15, since convergence of the integral is much slower than convergence of the infinite product. Note also that the analytic formulas obtained in the previous theorems provide a rigorous answer to the problem studied in [27], where an attempt to obtain such formulas was made. In particular, we can compare the graphs given in [27] Sect. XI (note the opposite sign there) with the following one, where $\zeta'(0, \Delta_{S^2_{1/a}})$ is plotted using the formula given in Theorem 4.15, and the relation with the lune angle $\omega$ is $a = \frac{\pi}{\omega}$.
[Figure: plot of $\zeta'(0, \Delta_{S^2_{1/a}})$ against $a$; horizontal axis $a$ from 2 to 10, vertical axis from $-1$ to $0.4$.]
4.5. The zeta determinant on the deformed 3 sphere. On the deformed 3-sphere we have $N = 2$ and
\[
\zeta(s, \Delta_{S^3_{1/a}}) = \sum_{n=1}^{\infty} (n+1) [n(n+2)]^{-s} + 2 \sum_{m=1}^{\infty} \sum_{n=0}^{\infty} (n+1) [(am+n)(am+n+2)]^{-s}.
\]
We can check that this reduces to the usual zeta function on the 3-sphere $\zeta(s, \Delta_{S^3_1}) = \sum_{n=1}^{\infty} (n+1)^2 [n(n+2)]^{-s}$ [54, 18], and we can decompose it as follows
\[
\zeta(s, \Delta_{S^3_{1/a}}) = z(s; 1, 2, 1, -1) + 2 Z(s; 1, a, 0, -1).
\]
As in the previous subsection, we apply Propositions 4.4 and 4.5 and properties of special functions to prove the following lemmas. Observe that, in this case, an application of Proposition 4.5 gives a simpler formula for $z'(0; 1, 2, 1, -1)$; we thank the referee for pointing out this fact.

Lemma 4.17. $z(0; 1, 2, 1, -1) = -1$,
\[
z'(0; 1, 2, 1, -1) = \gamma - 1 + 2 \zeta_H'(-1, 2) - \log \prod_{n=2}^{\infty} \left(1 - \frac{1}{n^2}\right)^{n} e^{\frac{1}{n}} = 2 \zeta_H'(-1, 2) + \log 2 - 1.
\]
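The two expressions for $z'(0; 1, 2, 1, -1)$ in Lemma 4.17 agree exactly when $\log \prod_{n \geq 2} \left(1 - \frac{1}{n^2}\right)^{n} e^{1/n} = \gamma - \log 2$. A minimal Python sketch that checks this identity numerically is given below. The function name, the truncation point and the tail estimate are choices of this sketch, and the value of the Euler constant is hard-coded.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def log_product(n_terms=200_000):
    """Approximate sum_{n>=2} [ n*log(1 - 1/n^2) + 1/n ]."""
    total = 0.0
    for n in range(2, n_terms + 1):
        total += n * math.log1p(-1.0 / (n * n)) + 1.0 / n
    # Each term behaves like -1/(2 n^3), so the tail beyond N is about -1/(4 N^2).
    total -= 0.25 / (n_terms * n_terms)
    return total


lhs = log_product()
rhs = EULER_GAMMA - math.log(2.0)
```

The identity also follows by telescoping the partial sums, which reduce to $-\log 2 - \log N + N \log(1 + 1/N) + H_N - 1$ and tend to $\gamma - \log 2$.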
Remark 4.18. The above result allows us to obtain the following interesting formulas for the Barnes G-function $G(z)$ and the double sine function $S(z)$ (see [3, 53 or 59] for the definition of the G-function, and [44 or 59] for the multiple sine function):
\[
\lim_{z \to 1} \frac{G(1-z)}{1-z} = \frac{4}{e}, \qquad \lim_{z \to 1} \frac{S(\pi(1-z))}{1-z} = \frac{\pi}{e}.
\]
The proofs of the next lemmas are the same as for Lemmas 4.13 and 4.14. Besides the increasing difficulty of the calculation and the fact that now the multiplicity is not trivial ($\alpha = 1$), the main difference is that a new singular term appears in the unshifted zeta function, namely applying Lemma 4.10, we obtain the expression
\[
\begin{aligned}
\chi(2s; 1, a, 0) ={}& \frac{a^{-2s}}{2} \zeta_H(2s, 1/a + 1) + \frac{a^{1-2s}}{2s-1} \zeta_H(2s-1, 1/a + 1) + \frac{a^{2-2s}}{(2s-1)(2s-2)} \zeta_H(2s-2, 1/a + 1) \\
&+ i a^{-2s} \int_0^{\infty} \frac{(1+iy) \zeta_H(2s, (1+iy)/a + 1) - (1-iy) \zeta_H(2s, (1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy
\end{aligned}
\]
instead of formula (1).
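Lemma 4.19.
\[
Z(0; 1, a, 0, -1) = -\frac{5}{24},
\]
\[
\begin{aligned}
Z'(0; 1, a, 0, -1) ={}& \frac{3}{4} - \frac{a}{12} + \frac{1}{2a} + \frac{5}{12} \log a - \frac{11}{12} \log 2\pi + 2 \log \Gamma(1/a + 1) \\
&+ \frac{1}{2a^2} \zeta_H(2, 1/a + 1) - \frac{1}{a} \psi(1 + 1/a) - 2a \zeta_H'(-1, 1/a + 1) + a^2 \zeta_H'(-2, 1/a + 1) \\
&+ 2i \int_0^{\infty} \log \frac{\Gamma((1+iy)/a + 1)}{\Gamma((1-iy)/a + 1)} \frac{dy}{e^{2\pi y} - 1} - 2 \int_0^{\infty} y \log \left|\Gamma((1+iy)/a + 1)\right|^2 \frac{dy}{e^{2\pi y} - 1} \\
&+ \frac{i}{a^2} \int_0^{\infty} \frac{\zeta_H(2, (1+iy)/a + 1) - \zeta_H(2, (1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&- \frac{1}{a^2} \int_0^{\infty} y\, \frac{\zeta_H(2, (1+iy)/a + 1) + \zeta_H(2, (1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&- \log \prod_{m,n=1}^{\infty} \left(1 - \frac{1}{(am+n)^2}\right)^{n} e^{\frac{n}{(am+n)^2}}.
\end{aligned}
\]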
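Lemma 4.20.
\[
\begin{aligned}
Z'(0; 1, a, 0, -1) ={}& \frac{5}{12} \log a - 1 - \frac{a}{12} - \frac{5}{12} \log 2\pi + \frac{1}{2} \log \Gamma\!\left(\frac{2}{a} + 1\right) \\
&- a \left[\zeta_R'(-1) + \zeta_H'\!\left(-1, \frac{2}{a} + 1\right)\right] + \frac{a^2}{2} \left[\zeta_R'(-2) + \zeta_H'\!\left(-2, \frac{2}{a} + 1\right)\right] \\
&+ i \int_0^{\infty} \log \frac{\Gamma\!\left(1 + i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{2+iy}{a}\right)}{\Gamma\!\left(1 - i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{2-iy}{a}\right)} \frac{dy}{e^{2\pi y} - 1} \\
&- \int_0^{\infty} y \log \left[\frac{\pi y}{a \sinh \frac{\pi y}{a}}\, \Gamma\!\left(1 + \frac{2+iy}{a}\right) \Gamma\!\left(1 + \frac{2-iy}{a}\right)\right] \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]

Using the decomposition at the beginning of this subsection and the results in Lemmas 4.17, 4.19 and 4.20 we can prove the following theorems.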
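Theorem 4.21.
\[
\zeta(0, \Delta_{S^3_{1/a}}) = -1,
\]
\[
\begin{aligned}
\zeta'(0, \Delta_{S^3_{1/a}}) ={}& \gamma - 1 + 2 \zeta_H'(-1, 2) - \log \prod_{n=2}^{\infty} \left(1 - \frac{1}{n^2}\right)^{n} e^{\frac{1}{n}} \\
&+ \frac{3}{2} - \frac{a}{6} + \frac{1}{a} + \frac{5}{6} \log a - \frac{11}{6} \log 2\pi + 4 \log \Gamma(1/a + 1) \\
&+ \frac{1}{a^2} \zeta_H(2, 1/a + 1) - \frac{2}{a} \psi(1 + 1/a) - 4a \zeta_H'(-1, 1/a + 1) + 2a^2 \zeta_H'(-2, 1/a + 1) \\
&+ 4i \int_0^{\infty} \frac{(1+iy) \log \Gamma((1+iy)/a + 1) - (1-iy) \log \Gamma((1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&+ \frac{2i}{a^2} \int_0^{\infty} \frac{(1+iy) \zeta_H(2, (1+iy)/a + 1) - (1-iy) \zeta_H(2, (1-iy)/a + 1)}{e^{2\pi y} - 1}\, dy \\
&- 2 \log \prod_{m,n=1}^{\infty} \left(1 - \frac{1}{(am+n)^2}\right)^{n} e^{\frac{n}{(am+n)^2}}.
\end{aligned}
\]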
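Theorem 4.22.
\[
\begin{aligned}
\zeta'(0, \Delta_{S^3_{1/a}}) ={}& \log 2 + 2 \zeta_R'(-1) - 1 + \frac{5}{6} \log a - 2 - \frac{a}{6} - \frac{5}{6} \log 2\pi + \log \Gamma\!\left(\frac{2}{a} + 1\right) \\
&- 2a \left[\zeta_R'(-1) + \zeta_H'\!\left(-1, \frac{2}{a} + 1\right)\right] + a^2 \left[\zeta_R'(-2) + \zeta_H'\!\left(-2, \frac{2}{a} + 1\right)\right] \\
&+ 2i \int_0^{\infty} \log \frac{\Gamma\!\left(1 + i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{2+iy}{a}\right)}{\Gamma\!\left(1 - i\frac{y}{a}\right) \Gamma\!\left(1 + \frac{2-iy}{a}\right)} \frac{dy}{e^{2\pi y} - 1} \\
&- 2 \int_0^{\infty} y \log \left[\frac{\pi y\, \Gamma\!\left(1 + \frac{2+iy}{a}\right) \Gamma\!\left(1 + \frac{2-iy}{a}\right)}{a \sinh \frac{\pi y}{a}}\right] \frac{dy}{e^{2\pi y} - 1}.
\end{aligned}
\]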
4.6. Expansions. In this subsection we give explicit formulas and numerical values of the first coefficients appearing in the expansions of the determinants of the Laplace operator on the 2 and 3 dimensional deformed sphere $S^N_k$ for small deformations of the parameter $k = 1 - \delta$, with small positive $\delta$. We first state a lemma that allows us to deal with the expansion of the values of the zeta function, and thus justifies the formal series expansion of all the functions appearing in Theorems 4.15 and 4.21 except for the infinite products, which can be treated directly. The proof of Lemma 4.23 follows by the same argument as the one used in the proof of Proposition 4.4.

Lemma 4.23. Let $x$, $q$ and $\delta$ be real with $0 \leq \delta \leq 1$, then for all $\mathrm{Re}(s) > -2$ we have the expansion
\[
\zeta_H(s, 1+x+q\delta) = \zeta_H(s, 1+x) - s \zeta_H(s+1, 1+x)\, q\delta + \frac{s(s+1)}{2} \zeta_H(s+2, 1+x)\, q^2 \delta^2 + O(\delta^3),
\]
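and
\[
\begin{aligned}
\zeta_H'(s, 1+x+q\delta) ={}& \zeta_H'(s, 1+x) - \left[\zeta_H(s+1, 1+x) + s \zeta_H'(s+1, 1+x)\right] q\delta \\
&+ \left[\left(s + \frac{1}{2}\right) \zeta_H(s+2, 1+x) + \frac{s(s+1)}{2} \zeta_H'(s+2, 1+x)\right] q^2 \delta^2 + O(\delta^3),
\end{aligned}
\]
where note that the coefficients of the second and third term in the second formula are defined as limits.

Proposition 4.24. For $a = 1 + \delta + O(\delta^2)$,
\[
\zeta'(0, \Delta_{S^2_{1-\delta}}) = \zeta'(0, \Delta_{S^2_1}) + Z_2 \delta + O(\delta^2),
\]
where
\[
\begin{aligned}
\zeta'(0, \Delta_{S^2_1}) ={}& 4 \zeta_R'(-1) - \frac{1}{2} \\
={}& -\log 2\pi - \frac{2}{3} + \frac{\pi^2}{8} + \frac{\gamma}{2} - 4 \zeta_H'(-1, 1/2) + \frac{i}{2} \int_0^{\infty} \frac{\psi'(3/2+iy) - \psi'(3/2-iy)}{e^{2\pi y} - 1}\, dy \\
&+ 4i \int_0^{\infty} \log \frac{\Gamma(3/2+iy)}{\Gamma(3/2-iy)} \frac{dy}{e^{2\pi y} - 1} - 2 \log \prod_{m,n=1}^{\infty} \left(1 - \frac{1}{4(m+n+1/2)^2}\right) e^{\frac{1}{4(m+n+1/2)^2}}
\end{aligned}
\]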
= −1.161684575, π2 7 1 γ + ζ R (3) − 4ζ H (−1, 1/2) + 2π Z2 = − + − 3 2 8 4
∞ 0
∞ +4 0
i − 4
1 (1/2 + i y) + (1/2 − i y) dy + y e2π y − 1 2
∞ 0
∞ y 0
tanh π y dy+ e2π y − 1
(3/2 + i y) + (3/2 − i y) dy+ e2π y − 1
∞ ∞ (3/2 + i y) − (3/2 − i y) 1 m dy − 4 = e2π y − 1 4j (m + n + 1/2)2 j+1 j=2
m,n=1
= 0.7116523492.

Corollary 4.25.
\[
\det \Delta_{S^2_{1-\delta}} = \det \Delta_{S^2_1} - Z_2 \det \Delta_{S^2_1}\, \delta + O(\delta^2) = 3.195311305 - 2.273950797\, \delta + O(\delta^2).
\]
Proposition 4.26. For $a = 1 + \delta + O(\delta^2)$,
\[
\zeta'(0, \Delta_{S^3_{1-\delta}}) = \zeta'(0, \Delta_{S^3_1}) + Z_3 \delta + O(\delta^2),
\]
where
\[
\zeta'(0, \Delta_{S^3_1}) = 2 \zeta_R'(-2) + 2 \zeta_R'(0) + \log 2 =
\]
∞ 1 n 1 π2 11 5 1 − 2 en + log(2π ) + + 2ζ R (−2) − log = 3γ − − 2ζ R (−1) − 3 6 6 n n=2
∞ log
+4i ∞ +2i 0
0
dy (2 + i y) −4 (2 − i y) e2π y − 1
∞ y log |(2 + i y)|2 0
ζ H (2, 2 + i y) − ζ H (2, 2 − i y) dy − 2 e2π y − 1 ∞ = −2 log 1− m,n=1
1 (m + n)2
∞ y 0
n
dy + e2π y − 1
ζ H (2, 2 + i y) + ζ H (2, 2 − i y) dy+ e2π y − 1 n
e (m+n)2 = −1.205626800,
1 Z 3 = − + 2γ + 2ζ R (3) − 8ζ R (−1) − 2 log(2π ) + 4ζ R (−2)+ 2 ∞ (1 + i y)2 (2 + i y) − (1 − i y)2 (2 − i y) −4i dy+ e2π y − 1 0
∞
−4i ∞ −2i
0
(1 + i y) (2 + i y) − (1 − i y) (2 − i y) dy+ e2π y − 1
(1+i y)2 (2 + i y) − (2 − i y)2 (2−i y) e2π y − 1
0
dy +
3 π2 2 − = 0.6666666661 = . 2 9 3
Corollary 4.27.
\[
\det \Delta_{S^3_{1-\delta}} = \det \Delta_{S^3_1} - Z_3 \det \Delta_{S^3_1}\, \delta + O(\delta^2) = 3.338845845 - 2.225897228\, \delta + O(\delta^2).
\]
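The numbers in Corollaries 4.25 and 4.27 follow from Propositions 4.24 and 4.26 through $\det = e^{-\zeta'(0)}$, so that $\det \Delta_{S^N_{1-\delta}} = e^{-\zeta'(0) - Z_N \delta + O(\delta^2)} = \det \Delta_{S^N_1} (1 - Z_N \delta) + O(\delta^2)$. The minimal Python sketch below reproduces this arithmetic from the decimal values quoted in the text. The function name is illustrative, and the comparison is only meaningful to the number of digits quoted.

```python
import math


def first_order_determinant(zeta_prime_round, slope):
    """Return (det on the round sphere, coefficient of delta) from zeta'(0) and Z_N."""
    det_round = math.exp(-zeta_prime_round)
    return det_round, -slope * det_round


# Decimal values quoted in Propositions 4.24 and 4.26.
det_s2, coeff_s2 = first_order_determinant(-1.161684575, 0.7116523492)
det_s3, coeff_s3 = first_order_determinant(-1.205626800, 0.6666666661)
```

With these inputs the results should match the quoted values 3.195311305 and $-2.273950797$ for the 2-sphere, and 3.338845845 and $-2.225897228$ for the 3-sphere, to within the tolerance of about $10^{-4}$ used in this check.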
Acknowledgements. We would like to thank an anonymous referee for useful remarks and suggestions. One of the authors, M. S., thanks the Departments of Mathematics and Physics of the University of Trento, and the INFN for their nice hospitality. S. Z. thanks V. Moretti for discussions.
References

1. Apps, J.S., Dowker, J.S.: The C2 heat-kernel coefficient in the presence of boundary discontinuities. Class. Quant. Grav. 15, 1121–1139 (1998)
2. Atiyah, M., Bott, R., Patodi, V.K.: On the Heat Equation and the Index Theorem. Invent. Math. 19, 279–330 (1973)
3. Barnes, E.W.: The theory of the multiple Gamma function. Trans. Cambridge Phil. Soc. 19, 374–425 (1904)
4. Barnes, E.W.: The theory of the G function. Quart. J. Math. 31, 264–314 (1899)
5. Bordag, M., Geyer, B., Kirsten, K., Elizalde, E.: Zeta function determinant of the Laplace operator on the D-dimensional ball. Commun. Math. Phys. 179, 215–234 (1996)
6. Bordag, M., Dowker, J.S., Kirsten, K.: Heat kernel and functional determinants on the generalized cone. Commun. Math. Phys. 182, 371–394 (1996)
7. Brüning, J., Seeley, R.: The resolvent expansion for second order regular singular operators. J. Funct. Anal. 73, 369–429 (1987)
8. Burghelea, D., Friedlander, L., Kappeler, T.: On the determinant of elliptic boundary value problems on a line segment. Proc. Am. Math. Soc. 123, 3027–3028 (1995)
9. Bytsenko, A.A., Cognola, G., Vanzo, L., Zerbini, S.: Quantum fields and extended objects on space-times with constant curvature spatial section. Phys. Rept. 266, 1–126 (1996)
10. Bytsenko, A.A., Cognola, G., Zerbini, S.: Determinant of Laplacian on non-compact 3-dimensional hyperbolic manifold with finite volume. J. Phys. A: Math. Gen. 30, 3543–3552 (1997)
11. Camporesi, R.: Harmonic analysis and propagators on homogeneous spaces. Phys. Reports 196, 1–134 (1990)
12. Carletti, E., Monti Bragadin, G.: On Dirichlet series associated with polynomials. Proc. Am. Math. Soc. 121, 33–37 (1994)
13. Carletti, E., Monti Bragadin, G.: On Minakshisundaram-Pleijel zeta functions on spheres. Proc. Am. Math. Soc. 122, 993–1001 (1994)
14. Cassou-Noguès, P.: Valeurs aux entiers négatifs des fonctions zêta et fonctions zêta p-adiques. Invent. Math. 51, 29–59 (1979)
15. Cassou-Noguès, P.: Dirichlet series associated with a polynomial. Number theory and Physics, Springer Proc. Phys. 47, 247–252 (1990)
16. Chandrasekharan, K.: Elliptic functions. GMW 281, Berlin-Heidelberg-New York: Springer (1985)
17. Cheeger, J.: Spectral geometry of singular Riemannian spaces. J. Diff. Geom. 18, 575–657 (1984)
18. Choi, J., Quine, J.R.: Zeta regularized products and functional determinants on spheres. Rocky Mount. J. Math. 26, 719–729 (1996)
19. Cognola, G., Kirsten, K., Vanzo, L.: Free and self-interacting scalar fields in the presence of conical singularities. Phys. Rev. D 49, 1029–1038 (1984)
20. Cognola, G., Vanzo, L., Zerbini, S.: Regularization dependence of vacuum energy in arbitrarily shaped cavities. J. Math. Phys. 33, 222–228 (1992)
21. Cognola, G., Zerbini, S.: Zeta determinant on a generalized cone. Lett. Math. Phys. 42, 95–101 (1997)
22. Critchley, R., Dowker, J.S.: Vacuum stress tensor for a slightly squashed Einstein universe. J. Phys. A: Math. Gen. 14, 1943–1955 (1981)
23. Dowker, J.S.: Quantum field theory on a cone. J. Phys. A: Math. Gen. 10, 115–124 (1977)
24. Dowker, J.S.: Vacuum energy in a squashed Einstein universe. In: Quantum theory of gravity. S. M. Christensen, ed. Bristol: Adam Hilger (1994)
25. Dowker, J.S.: Effective actions in spherical domains. Commun. Math. Phys. 162, 633–647 (1994)
26. Dowker, J.S.: Functional determinants on spheres and sections. J. Math. Phys. 35, 4989–4999 (1994)
27. Dowker, J.S.: Magnetic fields and factored two-spheres. J. Math. Phys. 42, 1501–1532 (2001)
28. Eie, M.: On the values at negative half integers of the Dedekind zeta function of a real quadratic field. Proc. Am. Math. Soc. 105, 273–280 (1989)
29. Eie, M.: On a Dirichlet series associated with a polynomial. Proc. Am. Math. Soc. 110, 583–590 (1990)
30. Elizalde, E., Odintsov, S.D., Romeo, A., Bytsenko, A.A., Zerbini, S.: Zeta regularization techniques with applications. Singapore: World Scientific (1994)
31. Elizalde, E.: Ten physical applications of spectral zeta functions. Springer-Verlag, Berlin-Heidelberg-New York (1995)
32. Epstein, P.: Zur Theorie allgemeiner Zetafunctionen. Math. Ann. 56, 615–645 (1903)
33. Epstein, P.: Zur Theorie allgemeiner Zetafunctionen II. Math. Ann. 63, 205–216 (1907)
34. Fursaev, D.V.: The heat-kernel expansion on a cone and quantum fields near cosmic strings. Class. Quantum Grav. 11, 1431–1443 (1994)
35. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series and products. London-New York: Ac. Press (1980)
36. Gromes, D.: Über die asymptotische Verteilung der Eigenwerte des Laplace-Operators für Gebiete auf der Kugeloberfläche. Math. Zeit. 94, 110–121 (1966)
37. Hawking, S.W.: Zeta function regularization of path integrals in curved space time. Commun. Math. Phys. 55, 139–170 (1977)
38. Higgins, J.R.: Completeness and basis properties of sets of special functions. Cambridge University Press, Cambridge (1977)
39. Hobson, E.W.: The theory of spherical and ellipsoidal harmonics. Cambridge Univ. Press, Cambridge (1955)
40. Hu, B.L.: Scalar waves in the Mixmaster universe. I. The Helmholtz equation in a fixed background. Phys. Rev. D 8, 1048–1060 (1973)
41. Jorgenson, J., Lang, S.: Complex analytic properties of regularized products. Lect. Notes Math. 1564, Berlin-Heidelberg-New York: Springer (1993)
42. Kolmogorov, A.N., Fomin, S.V.: Éléments de la théorie des fonctions et de l'analyse fonctionnelle. Moscow: Editions Mir (1977)
43. Kontsevich, M., Vishik, S.: Geometry of determinants of elliptic operators. Functional Analysis on the Eve of the 21st Century, Progr. Math. 131, 173–197 (1995)
44. Kurokawa, N.: Multiple sine functions and the Selberg zeta function. Proc. Jpn. Acad. A 67, 61–64 (1991)
45. Lesch, M.: Determinants of regular singular Sturm-Liouville operators. Math. Nachr. 194, 139–170 (1998)
46. Matsumoto, K.: Asymptotic series for double zeta, double gamma and Hecke L-functions. Math. Proc. Cambridge Phil. Soc. 123, 385–405 (1998)
47. Ortenzi, G., Spreafico, M.: Zeta function regularization for a scalar field in a compact domain. J. Phys. A: Math. Gen. 37, 11499–11517 (2004)
48. Prasolov, V., Solovyev, Y.: Elliptic functions and elliptic integrals. AMS Translations of Monographs 170, Providence, RI: Amer. Math. Soc. (1997)
49. Ray, D.B., Singer, I.M.: R-torsion and the Laplacian on Riemannian manifolds. Adv. Math. 7, 145–210 (1974)
50. Sarnak, P.: Determinants of Laplacians. Commun. Math. Phys. 110, 113–120 (1987)
51. Shintani, T.: On evaluations of zeta functions of totally real algebraic number fields at nonpositive integers. J. Fac. Sci. Univ. Tokyo 23, 393–417 (1976)
52. Shtykov, N., Vassilevich, D.V.: The heat kernel for deformed spheres. J. Phys. A: Math. Gen. 28, 37–43 (1995)
53. Shuster, R.: A generalized Barnes G-function. Z. Analysis Anwend. 11, 229–236 (1992)
54. Spreafico, M.: Zeta function and regularized determinant on projective spaces. Rocky Mount. J. Maths. 33, 1499–1512 (2003)
55. Spreafico, M.: On the non-homogenous Bessel zeta function. Mathematika 51, 123–130 (2004)
56. Spreafico, M.: Zeta function and regularized determinant on a disc and on a cone. J. Geom. Phys. 54, 355–371 (2005)
57. Spreafico, M.: A generalization of the Euler Gamma function. Funct. Anal. Appl. 39, 156–159 (2005)
58. Spreafico, M.: Zeta invariants for Dirichlet series. Pacific J. Math. 224, 100–114 (2006)
59. Spreafico, M.: Zeta functions, special functions and the Lerch formula. Proc. Royal Soc. Ed. 136, 865–889 (2006)
60. Vardi, I.: Determinants of Laplacians and multiple Gamma functions. SIAM J. Math. Anal. 19, 493–507 (1988)
61. Voros, A.: Spectral functions, special functions and the Selberg zeta function. Comm. Math. Phys. 110, 439–465 (1987)
62. Weidmann, J.: Linear operators in Hilbert spaces. GTM 68, Berlin-Heidelberg-New York (1980)
63. Weil, A.: Elliptic functions according to Eisenstein and Kronecker. Springer-Verlag, Berlin-Heidelberg-New York (1976)
64. Whittaker, E.T., Watson, G.N.: A course in modern analysis. Cambridge Univ. Press, Cambridge (1946)
65. Wolf, J.A.: Spaces of constant curvature. McGraw-Hill, New York (1967)
66. Zagier, D.: A Kronecker limit formula for real quadratic fields. Ann. Math. 213, 153–184 (1975)
67. Zagier, D.: Valeurs des fonctions zêta des corps quadratiques réels aux entiers négatifs. Astérisque 41-42, 135–151 (1977)
68. Zerbini, S., Cognola, G., Vanzo, L.: Euclidean approach to the entropy for a scalar field in Rindler-like space-time. Phys. Rev. D 54, 2699–2710 (1996)

Communicated by G.W. Gibbons
Commun. Math. Phys. 273, 705–754 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0255-x
Communications in
Mathematical Physics
Geometrical (2+1)-Gravity and the Chern-Simons Formulation: Grafting, Dehn Twists, Wilson Loop Observables and the Cosmological Constant

C. Meusburger

Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario N2L 2Y5, Canada. E-mail: [email protected]

Received: 27 July 2006 / Accepted: 8 November 2006
Published online: 15 May 2007 – © Springer-Verlag 2007
Abstract: We relate the geometrical and the Chern-Simons description of (2+1)-dimensional gravity for spacetimes of topology $\mathbb{R} \times S_g$, where $S_g$ is an oriented two-surface of genus $g > 1$, for Lorentzian signature and general cosmological constant and the Euclidean case with negative cosmological constant. We show how the variables parametrising the phase space in the Chern-Simons formalism are obtained from the geometrical description and how the geometrical construction of (2+1)-spacetimes via grafting along closed, simple geodesics gives rise to transformations on the phase space. We demonstrate that these transformations are generated via the Poisson bracket by one of the two canonical Wilson loop observables associated to the geodesic, while the other acts as the Hamiltonian for infinitesimal Dehn twists. For spacetimes with Lorentzian signature, we discuss the role of the cosmological constant as a deformation parameter in the geometrical and the Chern-Simons formulation of the theory. In particular, we show that the Lie algebras of the Chern-Simons gauge groups can be identified with the (2+1)-dimensional Lorentz algebra over a commutative ring, characterised by a formal parameter $\Theta_\Lambda$ whose square is minus the cosmological constant. In this framework, the Wilson loop observables that generate grafting and Dehn twists are obtained as the real and the $\Theta_\Lambda$-component of a Wilson loop observable with values in the ring, and the grafting transformations can be viewed as infinitesimal Dehn twists with the parameter $\Theta_\Lambda$.

1. Introduction

The quantisation of Einstein's theory of gravity is often viewed as the problem of constructing a quantum theory of geometry. In particular, a physically meaningful quantum theory of gravity should allow one to recover spacetime geometry from the gauge theory-like formulations used in most quantisation approaches.
While the quantisation of gravity in (3+1) dimensions is far from complete, the (2+1)-dimensional version of the theory has been used successfully as a testing ground for various quantisation formalisms [1, 2]. As in the (3+1)-dimensional case, most of these formalisms are based on gauge
theoretical descriptions of the theory. To apply these results to concrete physics questions, it would therefore be necessary to recover their geometrical interpretation. Yet the relation between the phase space variables used in these approaches and spacetime geometry is not fully clarified even in the classical theory.

The simplifications in (2+1)-dimensional gravity compared to the (3+1)-dimensional case are due to the absence of local gravitational degrees of freedom and the finite-dimensionality of its phase space. In the geometrical formulation of the theory, this manifests itself in the fact that vacuum solutions of Einstein's equations are flat or of constant curvature. They are therefore locally isometric to certain model spacetimes, into which any simply connected region of the spacetime can be embedded. The physical degrees of freedom are purely topological and encoded in transition functions, which take values in the isometry group of the model spacetime and relate the embedding of different spacetime regions.

From a gauge theoretical perspective, the absence of local gravitational degrees of freedom in (2+1)-dimensional gravity results in its formulation as a Chern-Simons gauge theory with the isometry group of the associated model spacetime as the gauge group [3, 4]. The Einstein equations then take the form of a flatness condition on the gauge field, and their solutions can be locally trivialised, i.e. written as pure gauge. The physical degrees of freedom are then encoded in a set of elements of the gauge group which relate the trivialisations on different regions of the spacetime manifold.

The advantage of the Chern-Simons formulation of (2+1)-dimensional gravity is that it allows one to apply gauge theoretical concepts and methods to achieve an explicit parametrisation of the phase space that serves as a starting point for quantisation.
As gauge fields solving the equations of motion are flat, physical states can be characterised in terms of the holonomies along closed curves in the spacetime manifold. Conjugation invariant functions of such holonomies then define a complete set of gauge invariant Wilson loop observables, which were first investigated in the context of (2+1)-dimensional gravity in [5–11]. Moreover, by parametrising the phase space in terms of the holonomies along a set of generators of the fundamental group, one obtains an efficient description of the Poisson structure [12, 13]. These descriptions were used in [14] to investigate the classical phase space of the theory and are the basis of the combinatorial quantisation formalism of Alekseev, Grosse and Schomerus [15, 16] and the related approaches in [17, 18].

The drawback of the Chern-Simons formulation is that it complicates the physical interpretation of the theory by obscuring the underlying spacetime geometry. Except for particularly simple spacetimes such as static spacetimes and the torus universe, it is in general difficult to reconstruct spacetime geometry from the gauge theoretical variables that parametrise the phase space. In a geometrical framework, the relation between holonomies and geometry was first investigated by Mess [19], who shows how the holonomies determine the geometry of the spacetime. More recent results on this problem were obtained by Benedetti and Guadagnini [20] and by Benedetti and Bonsante [21, 22], who focus on the construction of (2+1)-dimensional spacetimes via grafting and relate the resulting spacetimes for different values of the cosmological constant.

However, despite these results, the relation between spacetime geometry and the description of the phase space of (2+1)-dimensional gravity in the Chern-Simons formalism is still not fully clarified.
While the results in [19–22] establish a relation between holonomies and geometry in the geometrical formulation of the theory, they do not relate these variables to the quantities encoding the physical degrees of freedom in the Chern-Simons formalism. In particular, it is not clear how the embedding of spacetime regions into model spacetimes and the associated transition functions are related to
Geometrical (2+1)-Gravity and the Chern-Simons Formulation
the corresponding concepts in Chern-Simons theory, the trivialisation of the gauge field and the gauge group elements linking the trivialisations on different regions. Moreover, a full understanding of the relation between spacetime geometry and the Chern-Simons formulation should clarify the role of phase space and Poisson structure. This includes a geometrical interpretation of the phase space transformations generated by the Wilson loop observables as well as the question of how constructions that change the geometry of a spacetime, such as grafting and Dehn twists, manifest themselves on the phase space of the theory. These questions concerning the relation between the geometrical and the Chern-Simons formulation of (2+1)-dimensional gravity are the subject of the present paper, in which we consider vacuum spacetimes of topology $\mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of general genus $g > 1$. Our results are valid for spacetimes of Lorentzian signature with general cosmological constant and for the Euclidean case with negative cosmological constant. They can be summarised as follows.
1. Embedding and trivialisation. We relate the embedding of spacetime regions into model manifolds in the geometrical formulation and the trivialisation of the gauge field in the Chern-Simons formalism and derive explicit formulas linking the variables which encode the physical degrees of freedom in the two approaches.
2. Grafting transformations on phase space. We show how the geometrical construction of (2+1)-spacetimes by grafting along closed, simple geodesics gives rise to a transformation on the phase space in the Chern-Simons formulation and derive explicit expressions for the action of this transformation on the holonomies along a set of generators of the fundamental group.
3. The transformations generated by Wilson loop observables. We investigate the two basic Wilson loop observables associated to a closed, simple curve on $S_g$ and to the two linearly independent Ad-invariant, symmetric bilinear forms on the Lie algebra of the gauge group. We derive explicit expressions for the phase space transformations these observables generate via the Poisson bracket and show that one of these observables acts as a Hamiltonian for the grafting transformations, while the other generates infinitesimal Dehn twists.
4. Relation between grafting and Dehn twists. We demonstrate that the phase space transformations representing grafting and Dehn twists are closely related for all values of the cosmological constant and that this relation is reflected in a general symmetry relation for the corresponding Wilson loop observables. We show that grafting can be viewed as a Dehn twist with a formal parameter whose square is identified with minus the cosmological constant.
5. The cosmological constant as a deformation parameter. We establish a unified description for spacetimes of Lorentzian signature in which the cosmological constant plays the role of a deformation parameter. In the geometrical description, its square root appears as a parameter relating the embedding into the different model spacetimes and the action of the associated isometry groups. In the Chern-Simons formulation, it plays the role of a deformation parameter in the gauge group and the associated Lie algebra. More precisely, we demonstrate that the Lie algebra of the gauge group can be viewed as the (2+1)-dimensional Lorentz algebra over a commutative ring with a multiplication law that depends on the cosmological constant.
Results similar to 1 to 4 were obtained in an earlier paper [23] for the case of Lorentzian (2+1)-spacetimes with vanishing cosmological constant.
Although the general approach in [23] is similar, the reasoning and many proofs in [23] make use of specific simplifications resulting from the properties of Minkowski space and the (2+1)-dimensional Poincaré group.
C. Meusburger
The inclusion of these spacetimes in the present paper allows one to see how these results arise from a general pattern present for all values of the cosmological constant and to investigate the role of the cosmological constant as a deformation parameter. The paper is structured as follows: In Sect. 2 we introduce definitions and notations for the Lie groups and Lie algebras considered in this paper and summarise some facts from hyperbolic geometry used in the geometrical description of (2+1)-spacetimes. Section 3 gives an overview of the geometrical description of (2+1)-dimensional spacetimes of topology $\mathbb{R} \times S_g$ for Lorentzian signature and general cosmological constant and for the Euclidean case with negative cosmological constant. We start by introducing the relevant model spacetimes, which are (2+1)-dimensional Minkowski space, anti de Sitter space and de Sitter space, respectively, for Lorentzian signature and vanishing, negative and positive cosmological constant, and three-dimensional hyperbolic space for the Euclidean case with negative cosmological constant. We then review the description of (2+1)-spacetimes of topology $\mathbb{R} \times S_g$ which are obtained as the quotients of regions in the model spacetimes by certain actions of a cocompact Fuchsian group. After summarising the description of static universes, we describe the construction of evolving universes via grafting along closed, simple geodesics following the presentation in [21, 22]. In Sect.
4 we review the formulation of (2+1)-dimensional gravity as a Hamiltonian Chern-Simons gauge theory, where the gauge group is the isometry group of the associated model spacetime: the (2+1)-dimensional Poincaré group $PSU(1,1) \ltimes \mathbb{R}^3 \cong PSL(2,\mathbb{R}) \ltimes \mathbb{R}^3$ for Lorentzian signature and vanishing cosmological constant, the group $PSU(1,1) \times PSU(1,1) \cong PSL(2,\mathbb{R}) \times PSL(2,\mathbb{R})$ for Lorentzian signature and negative cosmological constant, and $SL(2,\mathbb{C})/\mathbb{Z}_2$ for Lorentzian signature and positive cosmological constant and for the Euclidean case. We discuss how the local trivialisation of the gauge field gives rise to a parametrisation of the phase space in terms of the holonomies along a set of generators of the fundamental group $\pi_1(S_g)$ and introduce Fock and Rosly’s description of the Poisson structure [12]. Section 5 relates the geometry of (2+1)-spacetimes to their description in the Chern-Simons formalism. We discuss the relation between the variables encoding the physical degrees of freedom in the geometrical and in the Chern-Simons approach and show how the embedding into the model spacetimes is obtained from the trivialisation of the gauge field in the Chern-Simons formalism. In Sect. 6 we demonstrate how the construction of evolving (2+1)-spacetimes via grafting along closed, simple geodesics in [21, 22] is implemented in the Chern-Simons formalism and show that it gives rise to a transformation on phase space, given explicitly by its action on the holonomies along a set of generators of the fundamental group $\pi_1(S_g)$. In Sect. 7, we relate this transformation to the Poisson structure and to the Wilson loop observables. We show that the phase space transformation obtained by grafting along a closed, simple geodesic $\eta$ is generated via the Poisson bracket by one of the two basic Wilson loop observables associated to $\eta$, while the other observable acts as the Hamiltonian for Dehn twists.
We discuss the properties of the grafting transformations and their relation to Dehn twists, which manifests itself in a general symmetry relation for the Poisson brackets of the associated observables. Section 8 investigates the role of the cosmological constant in spacetimes of Lorentzian signature. Using the results by Benedetti and Bonsante [21, 22], we show
that its square root can be viewed as a deformation parameter in the geometrical description of both static and grafted (2+1)-spacetimes. For the Chern-Simons formulation, we establish a common framework relating the different gauge groups by identifying their Lie algebras with the (2+1)-dimensional Lorentz algebra over a commutative ring. The cosmological constant then appears in the ring’s multiplication law and can be implemented by introducing a formal parameter whose square is minus the cosmological constant. We show that the grafting transformations can be viewed as Dehn twists with this parameter. Section 9 contains a discussion of our results and conclusions.

2. Definitions and Notation

2.1. Lie groups and Lie algebras. Throughout the paper we employ Einstein’s summation convention. Indices are raised and lowered either with the three-dimensional Minkowski metric
\[ x \cdot y = \eta_L(x, y) = \eta_{ab}\, x^a y^b = -x_0 y_0 + x_1 y_1 + x_2 y_2 \tag{1} \]
or with the three-dimensional Euclidean metric
\[ x \cdot y = \eta_E(x, y) = \delta_{ab}\, x^a y^b = x_0 y_0 + x_1 y_1 + x_2 y_2 . \tag{2} \]
To avoid confusion, we denote the signature of the spacetime by a variable $S$ and write $S = L$ for Lorentzian and $S = E$ for Euclidean signature. In the following we consider a set of six-dimensional Lie algebras $\mathfrak{h}_{\Lambda,S}$ over $\mathbb{R}$ whose generators we denote by $J_a^S$, $P_a^S$, $a = 0, 1, 2$. For Lorentzian signature, the Lie algebras $\mathfrak{h}_{\Lambda,L}$ depend on a parameter $\Lambda \in \mathbb{R}$, and their Lie brackets are given by
\[ [J_a^L, J_b^L] = \epsilon_{ab}{}^{c} J_c^L, \qquad [J_a^L, P_b^L] = \epsilon_{ab}{}^{c} P_c^L, \qquad [P_a^L, P_b^L] = \Lambda\, \epsilon_{ab}{}^{c} J_c^L, \tag{3} \]
where indices are raised and lowered with the three-dimensional Minkowski metric (1) and $\epsilon_{abc}$ is the three-dimensional antisymmetric tensor satisfying $\epsilon_{012} = 1$. For Euclidean signature, we consider parameters $\Lambda < 0$, and the Lie algebra $\mathfrak{h}_{\Lambda,E}$ has the bracket
\[ [J_a^E, J_b^E] = \epsilon_{ab}{}^{c} J_c^E, \qquad [J_a^E, P_b^E] = \epsilon_{ab}{}^{c} P_c^E, \qquad [P_a^E, P_b^E] = \Lambda\, \epsilon_{ab}{}^{c} J_c^E, \qquad \Lambda < 0, \tag{4} \]
where indices are raised with the Euclidean metric (2).¹ The generators $J_a^E$ in (4) span the real Lie algebra $\mathfrak{su}(2)$ and can be represented by the matrices
\[ J_0^E = \tfrac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad J_1^E = \tfrac{1}{2}\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \qquad J_2^E = \tfrac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{5} \]
Similarly, the bracket of the generators $J_a^L$ in (3) is the Lie bracket of the three-dimensional Lorentz algebra $\mathfrak{so}(2,1) \cong \mathfrak{sl}(2,\mathbb{R}) \cong \mathfrak{su}(1,1)$. A set of $\mathfrak{sl}(2,\mathbb{R})$-matrices representing these generators is given by
\[ J_0 = \tfrac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad J_1 = \tfrac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad J_2 = \tfrac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{6} \]
¹ Note that the parameter $\Lambda$ in (3), denoted by $\lambda$ in [4], is not equal to the cosmological constant but to minus the cosmological constant for Lorentzian signature, while its Euclidean analogue in (4) agrees with the cosmological constant. See also the discussion at the beginning of Sect. 3.1.
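However, in the following we will mostly work with the Lie algebra $\mathfrak{su}(1,1)$, which is conjugate to $\mathfrak{sl}(2,\mathbb{R})$ in $\mathfrak{sl}(2,\mathbb{C})$ via
\[ \mathfrak{su}(1,1) = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix} \cdot \mathfrak{sl}(2,\mathbb{R}) \cdot \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}. \tag{7} \]
The $\mathfrak{su}(1,1)$ matrices associated to the generators (6) are given by
\[ J_0^L = \tfrac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad J_1^L = \tfrac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad J_2^L = \tfrac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{8} \]
and by exponentiating linear combinations of these matrices over $\mathbb{R}$, one obtains the Lie group
\[ SU(1,1) = \left\{ \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix} \;\Big|\; a, b \in \mathbb{C},\ |a|^2 - |b|^2 = 1 \right\} \cong SL(2,\mathbb{R}). \tag{9} \]
The group $SU(1,1) \cong SL(2,\mathbb{R})$ is the double cover of the proper orthochronous Lorentz group in three dimensions, $SO(2,1)^+ = PSL(2,\mathbb{R}) \cong PSU(1,1) = SU(1,1)/\mathbb{Z}_2$. In the following, we will often parametrise elements of $SU(1,1)$ and $PSU(1,1)$ via the exponential map, which in both cases we denote by $\exp : p^a J_a^L \mapsto e^{p^a J_a^L}$. Using expressions (8) for the generators of $\mathfrak{su}(1,1)$, we find that the parametrisation of $SU(1,1)$ in terms of a vector $p \in \mathbb{R}^3$ is given by
\[ \exp : \mathfrak{su}(1,1) \to SU(1,1), \qquad p^a J_a^L \mapsto e^{p^a J_a^L} = \begin{cases} \cosh\frac{|p|}{2}\, \mathbb{1} + 2 \sinh\frac{|p|}{2}\, \hat p^a J_a^L & \text{for } p^2 > 0 \\ \mathbb{1} + p^a J_a^L & \text{for } p^2 = 0 \\ \cos\frac{|p|}{2}\, \mathbb{1} + 2 \sin\frac{|p|}{2}\, \hat p^a J_a^L & \text{for } p^2 < 0, \end{cases} \qquad \hat p = \frac{1}{\sqrt{|p^2|}}\, p, \tag{10} \]
where $|p| = \sqrt{|p^2|}$.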
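The generators (8) and the closed form (10) can be checked numerically. The following Python sketch is an illustration only and not part of the original derivation: the helper names (`mul`, `comm`, `exp_closed`, `exp_series`) are hypothetical, the bracket is assumed to read $[J_a, J_b] = \epsilon_{ab}{}^{c} J_c$ with $\epsilon_{012} = 1$ and the index $c$ raised with the metric (1), and $|p|$ is assumed to mean $\sqrt{|p^2|}$.

```python
import math

# 2x2 complex matrices are stored as nested tuples ((a, b), (c, d)).
ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def scale(c, A):
    return tuple(tuple(c * A[i][j] for j in range(2)) for i in range(2))

def comm(A, B):
    return add(mul(A, B), scale(-1, mul(B, A)))

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

ETA = (-1, 1, 1)  # diagonal of the Minkowski metric (1)

# su(1,1) generators as read off from (8)
J = (scale(0.5, ((1j, 0), (0, -1j))),
     scale(0.5, ((0, -1j), (1j, 0))),
     scale(0.5, ((0, 1), (1, 0))))

def eps(a, b, c):
    """Totally antisymmetric symbol on {0, 1, 2} with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

def brackets_hold():
    """Check [J_a, J_b] = eps_{ab}^c J_c, with c raised by ETA (assumption)."""
    for a in range(3):
        for b in range(3):
            rhs = ZERO
            for c in range(3):
                rhs = add(rhs, scale(eps(a, b, c) * ETA[c], J[c]))
            if not close(comm(J[a], J[b]), rhs):
                return False
    return True

def contract(p):
    """Return the matrix p^a J_a."""
    out = ZERO
    for a in range(3):
        out = add(out, scale(p[a], J[a]))
    return out

def exp_closed(p):
    """Closed form (10), assuming |p| = sqrt(|p^2|)."""
    p2 = sum(ETA[a] * p[a] ** 2 for a in range(3))
    pJ = contract(p)
    if p2 == 0:                                   # parabolic
        return add(ONE, pJ)
    n = math.sqrt(abs(p2))
    if p2 > 0:                                    # hyperbolic
        return add(scale(math.cosh(n / 2), ONE),
                   scale(2 * math.sinh(n / 2) / n, pJ))
    return add(scale(math.cos(n / 2), ONE),       # elliptic
               scale(2 * math.sin(n / 2) / n, pJ))

def exp_series(p, terms=40):
    """Truncated power series of exp(p^a J_a), used only for comparison."""
    pJ = contract(p)
    result, term = ONE, ONE
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, pJ))
        result = add(result, term)
    return result
```

Comparing `exp_closed(p)` with `exp_series(p)` for one hyperbolic, one elliptic and one parabolic vector `p` exercises all three branches of (10). The closed form relies on $(p^a J_a^L)^2 = \tfrac{1}{4} p^2\, \mathbb{1}$, which holds for the matrices (8).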
Elements $u = e^{p^a J_a^L} \in SU(1,1)$ are called elliptic, parabolic and hyperbolic, respectively, for $p^2 < 0$, $p^2 = 0$ and $p^2 > 0$. It follows directly from expression (10) that the exponential map for $SU(1,1)$ is neither surjective nor injective. The exponential map $\exp : \mathfrak{su}(1,1) \to PSU(1,1) \cong SO(2,1)^+$ is surjective, but again not injective, since $e^{p^a J_a^L} = 1$ for $p^2 = -(2\pi n)^2$, $n \in \mathbb{Z}$. However, in the following we will mainly consider hyperbolic elements of $PSU(1,1)$, for which the parametrisation in terms of a vector $p = (p^0, p^1, p^2) \in \mathbb{R}^3$ is unique. For $\Lambda = 0$, the six-dimensional real Lie algebra (3) is the three-dimensional Poincaré algebra $\mathfrak{h}_{0,L} = \mathfrak{iso}(2,1) = \mathfrak{su}(1,1) \oplus \mathbb{R}^3$, and the associated Lie group obtained by exponentiation is the semidirect product $SU(1,1) \ltimes \mathbb{R}^3 \cong SL(2,\mathbb{R}) \ltimes \mathbb{R}^3$, where $SU(1,1)$ acts on $\mathbb{R}^3 \cong \mathfrak{su}(1,1)$ via the adjoint action
\[ (u_1, a_1) \cdot (u_2, a_2) = (u_1 u_2,\; a_1 + \mathrm{Ad}(u_1) a_2), \qquad u_1, u_2 \in SU(1,1),\ a_1, a_2 \in \mathbb{R}^3. \tag{11} \]
For $\Lambda > 0$, one can introduce an alternative set of generators $J_a^\pm$, in terms of which the Lie bracket (3) takes the form of a direct sum:
\[ J_a^\pm = \tfrac{1}{2}\Big( J_a^L \pm \tfrac{1}{\sqrt{\Lambda}} P_a^L \Big) \quad\Rightarrow\quad [J_a^\pm, J_b^\pm]_{\Lambda>0} = \epsilon_{ab}{}^{c} J_c^\pm, \qquad [J_a^\pm, J_b^\mp]_{\Lambda>0} = 0. \tag{12} \]
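The direct sum structure in (12) can be verified from the structure constants alone. The sketch below is a hedged illustration: it assumes that the brackets (3) read $[J_a, J_b] = \epsilon_{ab}{}^{c} J_c$, $[J_a, P_b] = \epsilon_{ab}{}^{c} P_c$ and $[P_a, P_b] = \Lambda\, \epsilon_{ab}{}^{c} J_c$, and it represents Lie algebra elements by their six coefficients in the basis $(J_0, J_1, J_2, P_0, P_1, P_2)$. The function names are hypothetical.

```python
import math

ETA = (-1, 1, 1)  # diagonal of the Minkowski metric (1)

def eps(a, b, c):
    """Totally antisymmetric symbol on {0, 1, 2} with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

def bracket(x, y, lam):
    """Bracket (3) on coefficient vectors (j0, j1, j2, p0, p1, p2).

    Assumed relations: [J,J] = eps J, [J,P] = eps P, [P,P] = lam * eps J.
    """
    out = [0.0] * 6
    for a in range(3):
        for b in range(3):
            for c in range(3):
                f = eps(a, b, c) * ETA[c]          # eps_{ab}^c
                out[c] += f * (x[a] * y[b] + lam * x[3 + a] * y[3 + b])
                out[3 + c] += f * (x[a] * y[3 + b] + x[3 + a] * y[b])
    return out

def j_pm(a, sign, lam):
    """Coefficient vector of J_a^(+/-) = (J_a +/- P_a / sqrt(lam)) / 2, lam > 0."""
    v = [0.0] * 6
    v[a] = 0.5
    v[3 + a] = sign * 0.5 / math.sqrt(lam)
    return v

def near(u, v, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(u, v))

def check_direct_sum(lam):
    """Verify both relations in (12) for every pair of indices."""
    for a in range(3):
        for b in range(3):
            mixed = bracket(j_pm(a, +1, lam), j_pm(b, -1, lam), lam)
            if not near(mixed, [0.0] * 6):
                return False
            for sign in (+1, -1):
                expected = [0.0] * 6
                for c in range(3):
                    f = eps(a, b, c) * ETA[c]
                    expected = [e + f * w
                                for e, w in zip(expected, j_pm(c, sign, lam))]
                same = bracket(j_pm(a, sign, lam), j_pm(b, sign, lam), lam)
                if not near(same, expected):
                    return False
    return True
```

Calling `check_direct_sum(lam)` for any `lam > 0` confirms that the two copies spanned by $J_a^+$ and $J_a^-$ commute with each other and each satisfy the Lorentz algebra relations.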
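Hence, for $\Lambda > 0$, the Lie algebra $\mathfrak{h}_{\Lambda>0,L} = \mathfrak{su}(1,1) \oplus \mathfrak{su}(1,1)$ is the direct sum of two copies of $\mathfrak{su}(1,1)$ and the associated Lie group is $SU(1,1) \times SU(1,1)$, whose elements we will parametrise using an index $+$ for the first and $-$ for the second component:
\[ (u_+, u_-) \cdot (v_+, v_-) = (u_+ v_+,\; u_- v_-), \qquad u_\pm, v_\pm \in SU(1,1). \tag{13} \]
For the Lie algebras $\mathfrak{h}_{\Lambda<0,S}$, a set of matrices representing the generators $J_a^S$, $P_a^S$ in (3), (4) is obtained by setting $P_a^S = i\sqrt{|\Lambda|}\, J_a^S$. This implies that the Lie algebras $\mathfrak{h}_{\Lambda<0,L}$, $\mathfrak{h}_{\Lambda<0,E}$ are both isomorphic to $\mathfrak{sl}(2,\mathbb{C})$. In the first case $\mathfrak{sl}(2,\mathbb{C})$ is realised as the complexification $\mathfrak{h}_{\Lambda<0,L} = \mathfrak{sl}(2,\mathbb{C}) = \mathfrak{sl}(2,\mathbb{R}) \oplus i\, \mathfrak{sl}(2,\mathbb{R})$ of its normal real form $\mathfrak{sl}(2,\mathbb{R})$. In the second case it is given as the complexification $\mathfrak{h}_{\Lambda<0,E} = \mathfrak{sl}(2,\mathbb{C}) = \mathfrak{su}(2) \oplus i\, \mathfrak{su}(2)$ of its compact real form $\mathfrak{su}(2)$. Hence, depending on the signature $S$ and on the parameter $\Lambda$, the Lie algebras $\mathfrak{h}_{\Lambda,S}$ and the associated Lie groups $H_{\Lambda,S}$ are given by
\[ \mathfrak{h}_{\Lambda,S} = \begin{cases} \mathfrak{su}(1,1) \oplus \mathbb{R}^3 & \Lambda = 0,\ S = L \\ \mathfrak{su}(1,1) \oplus \mathfrak{su}(1,1) & \Lambda > 0,\ S = L \\ \mathfrak{sl}(2,\mathbb{C}) & \Lambda < 0,\ S = L, E, \end{cases} \qquad H_{\Lambda,S} = \begin{cases} SU(1,1) \ltimes \mathbb{R}^3 & \Lambda = 0,\ S = L \\ SU(1,1) \times SU(1,1) & \Lambda > 0,\ S = L \\ SL(2,\mathbb{C}) & \Lambda < 0,\ S = L, E. \end{cases} \]
For all signatures and all values of the parameter $\Lambda$, the three-dimensional Lorentz algebra $\mathfrak{su}(1,1) \cong \mathfrak{sl}(2,\mathbb{R})$ is a subalgebra of the Lie algebra $\mathfrak{h}_{\Lambda,S}$. The corresponding embedding of the group $SU(1,1)$ into the groups $H_{\Lambda,S}$ is given by
\[ \imath_{\Lambda,S}(v) = \begin{cases} (v, 0) \in SU(1,1) \ltimes \mathbb{R}^3 & \Lambda = 0,\ S = L \\ (v, v) \in SU(1,1) \times SU(1,1) & \Lambda > 0,\ S = L \\ v \in SL(2,\mathbb{C}) & \Lambda < 0,\ S = L, E. \end{cases} \tag{14} \]
The embedding $\imath_{\Lambda,S}(\pm 1)$ induces an action of $\mathbb{Z}_2$ on $H_{\Lambda,S}$. The quotients of $H_{\Lambda,S}$ by this action are the (2+1)-dimensional Poincaré group $H_{0,L}/\mathbb{Z}_2 = PSU(1,1) \ltimes \mathbb{R}^3$, the group $H_{\Lambda>0,L}/\mathbb{Z}_2 = SU(1,1) \times SU(1,1)/\mathbb{Z}_2$ and the proper orthochronous Lorentz group in (3+1) dimensions $H_{\Lambda<0,L}/\mathbb{Z}_2 = H_{\Lambda<0,E}/\mathbb{Z}_2 = SL(2,\mathbb{C})/\mathbb{Z}_2 = SO(3,1)^+$. In addition to these groups, we will need to consider the group $PSU(1,1) \times PSU(1,1) = H_{\Lambda>0,L}/(\mathbb{Z}_2 \times \mathbb{Z}_2)$, whose elements we parametrise as in (13). The embedding (14) then induces an embedding of $PSU(1,1)$ into the quotients $H_{\Lambda,S}/\mathbb{Z}_2$ and into the group $PSU(1,1) \times PSU(1,1)$, which we will also denote by $\imath_{\Lambda,S}$.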
In the following we will sometimes parametrise elements of the groups $H_{\Lambda,S}$, $H_{\Lambda,S}/\mathbb{Z}_2$ and of $PSU(1,1) \times PSU(1,1)$ via the exponential map, for which in all cases we use the symbol $\exp_{\Lambda,S}$. Depending on the value of the parameter $\Lambda$ and the signature, these exponential maps are given by
\[ \exp_{\Lambda,S}\big( p^a J_a^S + k^a P_a^S \big) = \begin{cases} \big( e^{p^a J_a^L},\; -T(-p)k \big) & \Lambda = 0,\ \text{Lorentzian} \\ \big( e^{(p^a + \sqrt{\Lambda} k^a) J_a^L},\; e^{(p^a - \sqrt{\Lambda} k^a) J_a^L} \big) & \Lambda > 0,\ \text{Lorentzian} \\ e^{(p^a + i\sqrt{|\Lambda|} k^a) J_a^L} & \Lambda < 0,\ \text{Lorentzian} \\ e^{(p^a + i\sqrt{|\Lambda|} k^a) J_a^E} & \Lambda < 0,\ \text{Euclidean}, \end{cases} \tag{15} \]
where expressions of the form $e^{p^a J_a^S}$ denote the image of the exponential map (10) or the associated exponential map for $PSU(1,1)$, expressions $e^{(p^a + i q^a) J_a^S}$ the image of the exponential map for $SL(2,\mathbb{C})$ and $SL(2,\mathbb{C})/\mathbb{Z}_2$, and $T(p) : \mathbb{R}^3 \to \mathbb{R}^3$, $p \in \mathbb{R}^3$, is a bijective linear map given via the identification $\mathbb{R}^3 \cong \mathfrak{su}(1,1)$ by
\[ (T(p)k)^a J_a^L = \sum_{n=0}^{\infty} \frac{\mathrm{ad}^n_{p^a J_a^L}\big( k^a J_a^L \big)}{(n+1)!} = k^a J_a^L + \tfrac{1}{2} [p^b J_b^L, k^a J_a^L] + \tfrac{1}{6} [p^c J_c^L, [p^b J_b^L, k^a J_a^L]] + \ldots . \tag{16} \]
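The series (16) can be evaluated numerically and tested against the identity $[p^a J_a^L, (T(p)k)^b J_b^L] = e^{p^a J_a^L}\, k^b J_b^L\, e^{-p^a J_a^L} - k^b J_b^L$, which follows by applying $\mathrm{ad}_{p^a J_a^L}$ to the series term by term. The Python sketch below is a minimal illustration under the assumption that the generators are the matrices (8); the helper names are hypothetical and the series are simply truncated.

```python
# 2x2 complex matrices are stored as nested tuples ((a, b), (c, d)).
ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def scale(c, A):
    return tuple(tuple(c * A[i][j] for j in range(2)) for i in range(2))

def comm(A, B):
    return add(mul(A, B), scale(-1, mul(B, A)))

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# su(1,1) generators as read off from (8)
J = (scale(0.5, ((1j, 0), (0, -1j))),
     scale(0.5, ((0, -1j), (1j, 0))),
     scale(0.5, ((0, 1), (1, 0))))

def contract(p):
    """Return the matrix p^a J_a."""
    out = ZERO
    for a in range(3):
        out = add(out, scale(p[a], J[a]))
    return out

def exp_series(X, terms=60):
    """Truncated power series of the matrix exponential exp(X)."""
    result, term = ONE, ONE
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, X))
        result = add(result, term)
    return result

def T_series(P, K, terms=60):
    """Truncated series (16): sum over n of ad_P^n(K) / (n + 1)!."""
    result = ZERO
    term = K            # ad_P^0 (K)
    fact = 1.0
    for n in range(terms):
        fact *= (n + 1)                     # fact is now (n + 1)!
        result = add(result, scale(1.0 / fact, term))
        term = comm(P, term)                # next power of ad_P
    return result

def identity_holds(p, k):
    """Check [P, T(P)K] = exp(P) K exp(-P) - K for P = p^a J_a, K = k^a J_a."""
    P, K = contract(p), contract(k)
    lhs = comm(P, T_series(P, K))
    conj = mul(mul(exp_series(P), K), exp_series(scale(-1, P)))
    return close(lhs, add(conj, scale(-1, K)))
```

For small `p` the output of `T_series` agrees with the first three terms written out in (16), which gives a second, independent check of the factorials.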
Note that for all values of $\Lambda$ and all signatures under consideration, the exponential map $\exp_{\Lambda,S} : \mathfrak{h}_{\Lambda,S} \to H_{\Lambda,S}$ is neither surjective nor injective, which follows from the corresponding statement for the group $SU(1,1)$. For the groups $PSU(1,1) \ltimes \mathbb{R}^3$, $SL(2,\mathbb{C})/\mathbb{Z}_2$ and $PSU(1,1) \times PSU(1,1) = SU(1,1) \times SU(1,1)/(\mathbb{Z}_2 \times \mathbb{Z}_2)$ the exponential maps are surjective but again not injective.

2.2. Hyperbolic geometry. In this section we summarise some facts and definitions from hyperbolic geometry used in this paper. For a general reference, we refer the reader to the book [24] by Benedetti and Petronio, and for a specialised treatment focusing on Fuchsian groups to the book [25] by Katok. In the following, we denote by $\mathbb{H}^d_k$ the $d$-dimensional hyperbolic space of curvature $-|k|$, realised as the hyperboloid
\[ \mathbb{H}^d_k = \Big\{ x = (x^0, x^1, \ldots, x^d) \in \mathbb{R}^{d+1} \;\Big|\; -x_0^2 + x_1^2 + \ldots + x_d^2 = -\tfrac{1}{|k|},\ x^0 > 0 \Big\} \tag{17} \]
with the metric induced by the $(d+1)$-dimensional Minkowski metric. In the two-dimensional case, we also work with the disc model, in which $\mathbb{H}^2 = \mathbb{H}^2_1$ is realised as the unit disc
\[ D = \{ z \in \mathbb{C} \mid |z| < 1 \}, \qquad ds^2 = \frac{4 |dz|^2}{(1 - |z|^2)^2}, \tag{18} \]
and which is related to the two-dimensional hyperboloid model (17) via a map $z \in D \mapsto x(z) \in \mathbb{H}^2_k$,
\[ x_0(z) = \frac{1}{\sqrt{|k|}} \frac{1 + |z|^2}{1 - |z|^2}, \qquad x_1(z) = \frac{1}{\sqrt{|k|}} \frac{2\,\mathrm{Re}(z)}{1 - |z|^2}, \qquad x_2(z) = \frac{1}{\sqrt{|k|}} \frac{2\,\mathrm{Im}(z)}{1 - |z|^2} \qquad \forall z \in D. \tag{19} \]
In the hyperboloid model, the geodesics of $\mathbb{H}^d_k$ are obtained as the intersection of $\mathbb{H}^d_k$ with $d$-dimensional hyperplanes through the origin. In the two-dimensional disc model, the geodesics are the diameters of the disc and arcs of circles orthogonal to its boundary. The isometry group $\mathrm{Isom}(\mathbb{H}^2_k) = \mathrm{Isom}(D, ds^2)$ is the proper orthochronous Lorentz group $PSL(2,\mathbb{R}) \cong PSU(1,1) = SU(1,1)/\mathbb{Z}_2$, which acts on the hyperboloid $\mathbb{H}^2_k$ via its canonical action on Minkowski space and whose action on the disc $D$ is given by
\[ \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix} \in SU(1,1) : \quad z \mapsto \frac{a z + b}{\bar b z + \bar a}. \tag{20} \]
The uniformization theorem states that every orientable two-surface of genus $g > 1$ with a metric of constant curvature $-|k|$ is isometric to a quotient $\mathbb{H}^2_k/\Gamma$ of $\mathbb{H}^2_k$ by the action of a cocompact Fuchsian group $\Gamma$ with $2g$ hyperbolic generators
\[ \Gamma = \big\langle v_{A_1}, v_{B_1}, \ldots, v_{A_g}, v_{B_g} \;\big|\; [v_{B_g}, v_{A_g}^{-1}] \cdots [v_{B_1}, v_{A_1}^{-1}] = 1 \big\rangle \subset PSU(1,1). \tag{21} \]
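The relation between the disc model and the hyperboloid model in (19) and the action (20) can be illustrated with a few lines of Python. This is a sketch under the stated formulas only; the function names and the sample parametrisation of an $SU(1,1)$ element by `a` and `b` with $|a|^2 - |b|^2 = 1$ are assumptions made for the example.

```python
import cmath
import math

def disc_to_hyperboloid(z, k=1.0):
    """Map (19) from the unit disc to the hyperboloid of curvature -|k|."""
    s = 1.0 / math.sqrt(abs(k))
    r2 = abs(z) ** 2
    return (s * (1 + r2) / (1 - r2),
            s * 2 * z.real / (1 - r2),
            s * 2 * z.imag / (1 - r2))

def minkowski(x, y):
    """Three-dimensional Minkowski product (1)."""
    return -x[0] * y[0] + x[1] * y[1] + x[2] * y[2]

def su11_entries(t, phi, psi):
    """Entries (a, b) of an SU(1,1) matrix ((a, b), (conj b, conj a))."""
    a = math.cosh(t) * cmath.exp(1j * phi)
    b = math.sinh(t) * cmath.exp(1j * psi)
    return a, b

def mobius(a, b, z):
    """Action (20) of the SU(1,1) matrix with entries a, b on the disc."""
    return (a * z + b) / (b.conjugate() * z + a.conjugate())
```

Since the action (20) is by isometries, the Minkowski product of the hyperboloid images of two disc points is unchanged when both points are moved by the same group element; this gives a simple numerical test of the correspondence.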
The group $\Gamma$ induces a tessellation of $\mathbb{H}^2_k$ by geodesic arc $4g$-gons, which are mapped into each other by the elements of $\Gamma$. Hence, for each polygon in the tessellation, there exist $4g$ elements of $\Gamma$ which map this polygon into its $4g$ neighbours and identify its sides pairwise. The surface $\mathbb{H}^2_k/\Gamma$ is obtained by glueing these pairs of sides of a polygon in the tessellation. In particular, there exists a polygon, in the following referred to as a fundamental polygon and denoted by $P_\Gamma$, which is mapped into its $4g$ neighbours by a fixed set of generators of $\Gamma$ and their inverses. If we label the sides of $P_\Gamma$ as in Fig. 5, the generators $v_{A_i}$, $v_{B_i}$ in (21) identify the sides of this fundamental polygon $P_\Gamma$ according to
\[ v_{A_i} : a_i \to a_i', \qquad v_{B_i} : b_i \to b_i'. \tag{22} \]
The geodesics on the surface $\mathbb{H}^2_k/\Gamma$ are obtained by projecting the geodesics on $\mathbb{H}^2_k$. In particular, closed geodesics $\eta : [0,1] \to \mathbb{H}^2_k/\Gamma$, $\eta(0) = \eta(1)$, on $\mathbb{H}^2_k/\Gamma$ arise as the projections of geodesics $c_\eta : [0,1] \to \mathbb{H}^2_k$ for which there exists an element of $\Gamma$ that maps the geodesic to itself:
\[ c_\eta(1) = v_\eta\, c_\eta(0), \qquad v_\eta = e^{n_\eta^a J_a} \in \Gamma. \tag{23} \]
In the following we will refer to this element as the translation element of $\eta$ and to the associated vectors $n_\eta$ and $\hat n_\eta = n_\eta / \sqrt{|n_\eta^2|}$ as the translation vector and unit translation vector of $\eta$. Closed geodesics on the surface $\mathbb{H}^2_k/\Gamma$ are therefore in one-to-one correspondence with elements of the cocompact Fuchsian group $\Gamma$, which is isomorphic to the surface’s fundamental group $\pi_1(\mathbb{H}^2_k/\Gamma) \cong \Gamma$. In the following we will often not distinguish notationally between such geodesics, their homotopy equivalence classes in $\pi_1(\mathbb{H}^2_k/\Gamma)$ and general curves on the surface which represent these homotopy equivalence classes.

3. (2+1)-Dimensional Gravity: The Geometrical Formulation

3.1. Model spacetimes. In this section, we summarise the geometrical description of (2+1)-spacetimes as quotients of certain model spacetimes. A general reference for (2+1)-spacetimes is the book [1] by Carlip. A more specific treatment focusing on the construction of (2+1)-spacetimes via grafting is given in the papers by Benedetti and Bonsante [21, 22]. (2+1)-dimensional gravity is a theory without local gravitational degrees of freedom. As the curvature tensor of a three-dimensional manifold is determined completely by its Ricci tensor, vacuum solutions of the (2+1)-dimensional Einstein equations are flat or of constant curvature. This implies that they are locally isometric to a three-dimensional model spacetime. In this paper, we consider Lorentzian (2+1)-gravity with general cosmological constant and the Euclidean case with negative cosmological constant. In the following, we work with a parameter $\Lambda \in \mathbb{R}$ which is identified with minus the cosmological constant for Lorentzian spacetimes and agrees with the cosmological constant in the Euclidean case. The choice of this convention is motivated by the conventions in the Chern-Simons formulation of the theory and leads to notational simplifications there. The model spacetimes for Lorentzian signature are then three-dimensional Anti de Sitter space $\mathrm{AdS}_\Lambda$, three-dimensional Minkowski space $\mathbb{M}^3$ and three-dimensional de Sitter space $\mathrm{dS}_\Lambda$, respectively, for $\Lambda > 0$ (negative cosmological constant), $\Lambda = 0$ (vanishing cosmological constant) and $\Lambda < 0$ (positive cosmological constant). The
model spacetime for Euclidean signature and negative cosmological constant ($\Lambda < 0$) is three-dimensional hyperbolic space $\mathbb{H}^3_\Lambda$. In the following, we parametrise these spacetimes in terms of matrices, which is convenient for establishing a link with their description in the Chern-Simons formalism. For the Lorentzian case with vanishing cosmological constant, the relevant model spacetime is (2+1)-dimensional Minkowski space $\mathbb{X}_{0,L} = \mathbb{M}^3$ and the group of orientation and time orientation preserving isometries is the (2+1)-dimensional Poincaré group $\mathrm{Isom}(\mathbb{X}_{0,L}) = PSU(1,1) \ltimes \mathbb{R}^3 = H_{0,L}/\mathbb{Z}_2$. In the canonical identification of Minkowski space with the set of $\mathfrak{su}(1,1)$-matrices
\[ \mathbb{M}^3 \ni x = (x^0, x^1, x^2) \mapsto X = 2 x^a J_a^L = \begin{pmatrix} i x^0 & -i(x^1 + i x^2) \\ i(x^1 - i x^2) & -i x^0 \end{pmatrix} \in \mathfrak{su}(1,1), \tag{24} \]
the (2+1)-dimensional Minkowski metric agrees with the Killing form of $\mathfrak{su}(1,1)$,
\[ -x_0 y_0 + x_1 y_1 + x_2 y_2 = \tfrac{1}{2} \mathrm{Tr}(X \cdot Y), \tag{25} \]
and the action of $\mathrm{Isom}(\mathbb{X}_{0,L}) = PSU(1,1) \ltimes \mathbb{R}^3$ is given by
\[ (u, a) \in PSU(1,1) \ltimes \mathbb{R}^3 : \quad X \mapsto u X u^{-1} + 2 a^b J_b^L \qquad \forall X \in \mathfrak{su}(1,1). \tag{26} \]
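The matrix description of Minkowski space in (24) to (26) can be made concrete as follows. The Python sketch is illustrative only: the function names are hypothetical, group elements are taken in $SU(1,1)$ in the form (9), and the translation part of (26) is implemented as adding the matrix $2 a^b J_b^L$, which by (24) is the matrix associated to the vector $a$.

```python
import cmath
import math

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def sub(A, B):
    return tuple(tuple(A[i][j] - B[i][j] for j in range(2)) for i in range(2))

def to_matrix(x):
    """Identification (24): x -> X = 2 x^a J_a."""
    x0, x1, x2 = x
    return ((1j * x0, -1j * (x1 + 1j * x2)),
            (1j * (x1 - 1j * x2), -1j * x0))

def half_trace(X, Y):
    """Right-hand side of (25): Tr(X Y) / 2."""
    return 0.5 * sum(X[i][k] * Y[k][i] for i in range(2) for k in range(2))

def minkowski(x, y):
    """Left-hand side of (25)."""
    return -x[0] * y[0] + x[1] * y[1] + x[2] * y[2]

def su11(t, phi, psi):
    """An SU(1,1) matrix of the form (9)."""
    a = math.cosh(t) * cmath.exp(1j * phi)
    b = math.sinh(t) * cmath.exp(1j * psi)
    return ((a, b), (b.conjugate(), a.conjugate()))

def inverse(u):
    """Inverse of a 2x2 matrix of determinant one."""
    (a, b), (c, d) = u
    return ((d, -b), (-c, a))

def act(u, a_vec, X):
    """Poincare action (26): X -> u X u^{-1} + 2 a^b J_b."""
    return add(mul(mul(u, X), inverse(u)), to_matrix(a_vec))

def interval(X, Y):
    """Minkowski interval between two points given as matrices."""
    D = sub(X, Y)
    return half_trace(D, D)
```

The interval between two points is unchanged by `act`, because conjugation preserves the trace and the translation cancels in the difference.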
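The model spacetime for negative cosmological constant ($\Lambda > 0$) and Lorentzian signature is three-dimensional Anti de Sitter space $\mathrm{AdS}_\Lambda$. We adopt the conventions of [21, 22], in which Anti de Sitter space is realised as a quotient of its double cover $\widetilde{\mathrm{AdS}}_\Lambda$ by the action of $\mathbb{Z}_2$. The double cover $\widetilde{\mathrm{AdS}}_\Lambda$ is the manifold
\[ \widetilde{\mathbb{X}}_{\Lambda>0,L} = \widetilde{\mathrm{AdS}}_\Lambda = \Big\{ (t_1, t_2, x_1, x_2) \in \mathbb{R}^4 \;\Big|\; t_1^2 + t_2^2 - x_1^2 - x_2^2 = \tfrac{1}{\Lambda} \Big\}, \qquad dx^2 = -(dt_1)^2 - (dt_2)^2 + (dx_1)^2 + (dx_2)^2. \tag{27} \]
Via the map
\[ \widetilde{\mathrm{AdS}}_\Lambda \ni x = (t_1, t_2, x_1, x_2) \mapsto X = \begin{pmatrix} t_1 + i t_2 & -i(x_1 + i x_2) \\ i(x_1 - i x_2) & t_1 - i t_2 \end{pmatrix} \in SU(1,1), \tag{28} \]
it can be identified with the group $SU(1,1)$ such that its metric is given by minus the determinant,
\[ \widetilde{\mathrm{AdS}}_\Lambda = \Big\{ \tfrac{1}{\sqrt{\Lambda}} A \;\Big|\; A \in SU(1,1) \Big\}, \qquad dx^2 = -\det(dX). \tag{29} \]
The group of orientation and time orientation preserving isometries of $\widetilde{\mathrm{AdS}}_\Lambda$ is the group $SU(1,1) \times SU(1,1)/\mathbb{Z}_2 = H_{\Lambda>0,L}/\mathbb{Z}_2$, whose action is given by the action of $SU(1,1) \times SU(1,1)$ via
\[ (G_+, G_-) \in SU(1,1) \times SU(1,1) : \quad X \mapsto G_+ X G_-^{-1} \qquad \forall X \in SU(1,1). \tag{30} \]
Anti de Sitter space $\mathrm{AdS}_\Lambda$ is the quotient of $\widetilde{\mathrm{AdS}}_\Lambda$ by the action of the elements $(\pm 1, \mp 1)$ via (30) and its isometry group is the quotient $PSU(1,1) \times PSU(1,1)$,
\[ \mathbb{X}_{\Lambda>0,L} = \mathrm{AdS}_\Lambda = \widetilde{\mathrm{AdS}}_\Lambda/\mathbb{Z}_2, \qquad \mathrm{Isom}(\mathrm{AdS}_\Lambda) = PSU(1,1) \times PSU(1,1). \tag{31} \]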
For positive cosmological constant ($\Lambda < 0$) and Lorentzian signature, the model spacetime is three-dimensional de Sitter space
\[ \mathbb{X}_{\Lambda<0,L} = \mathrm{dS}_\Lambda = \Big\{ x = (x^0, x^1, x^2, x^3) \in \mathbb{M}^4 \;\Big|\; -x_0^2 + x_1^2 + x_2^2 + x_3^2 = \tfrac{1}{|\Lambda|} \Big\}, \tag{32} \]
and for the Euclidean case with negative cosmological constant ($\Lambda < 0$), it is three-dimensional hyperbolic space
\[ \mathbb{X}_{\Lambda<0,E} = \mathbb{H}^3_\Lambda = \Big\{ x = (x^0, x^1, x^2, x^3) \in \mathbb{M}^4 \;\Big|\; -x_0^2 + x_1^2 + x_2^2 + x_3^2 = -\tfrac{1}{|\Lambda|},\ x^0 > 0 \Big\}. \tag{33} \]
In both cases the metric is the one induced by the four-dimensional Minkowski metric, and the group of orientation and, in the de Sitter case, time orientation preserving isometries is the proper orthochronous Lorentz group in (3+1) dimensions, $\mathrm{Isom}(\mathbb{X}_{\Lambda<0,L}) = \mathrm{Isom}(\mathbb{X}_{\Lambda<0,E}) = SO(3,1)^+ = SL(2,\mathbb{C})/\mathbb{Z}_2$. The parametrisation of these spacetimes in terms of matrices is obtained by identifying vectors in four-dimensional Minkowski space with certain sets of matrices in $GL(2,\mathbb{C})$. The first identification is the standard identification of Minkowski space with the set of hermitian $GL(2,\mathbb{C})$ matrices
\[ \mathbb{M}^4 \ni x = (x^0, x^1, x^2, x^3) \mapsto X = x^0 \mathbb{1} + x^i \sigma_i = \begin{pmatrix} x^0 + x^3 & x^1 + i x^2 \\ x^1 - i x^2 & x^0 - x^3 \end{pmatrix}, \tag{34} \]
in which the four-dimensional Minkowski metric takes the form
\[ -\det(X) = -x_0^2 + x_1^2 + x_2^2 + x_3^2. \tag{35} \]
The action of the isometry group $SO(3,1)^+ \cong SL(2,\mathbb{C})/\mathbb{Z}_2$ on the set of hermitian $2 \times 2$-matrices is given by the action of $SL(2,\mathbb{C})$ via
\[ G \in SL(2,\mathbb{C}) : \quad X \mapsto G X G^\dagger \qquad \forall X \in GL(2,\mathbb{C}),\ X^\dagger = X. \tag{36} \]
As this action has kernel $\pm 1$ and leaves the determinant invariant, it induces an action of $SO(3,1)^+$ which preserves the metric (35). Hence, three-dimensional hyperbolic space $\mathbb{H}^3_\Lambda$ can be identified with the set of hermitian $SL(2,\mathbb{C})$ matrices with metric (35), and the action of the isometry group $\mathrm{Isom}(\mathbb{H}^3_\Lambda) = SO(3,1)^+$ is given by (36),
\[ \mathbb{H}^3_\Lambda = \Big\{ \tfrac{1}{\sqrt{|\Lambda|}} A \;\Big|\; A \in SL(2,\mathbb{C}),\ A^\dagger = A \Big\}. \tag{37} \]
The matrix representation of $\mathrm{dS}_\Lambda$ is similar, but instead of the identification (34), one uses the identification
\[ \mathbb{M}^4 \ni x = (x^0, x^1, x^2, x^3) \mapsto X = \begin{pmatrix} x^3 + x^0 & x^1 + i x^2 \\ -(x^1 - i x^2) & x^3 - x^0 \end{pmatrix}, \tag{38} \]
which assigns to each vector in Minkowski space a matrix in the set
\[ CL(2,\mathbb{C}) = \{ A \in GL(2,\mathbb{C}) \mid A^\circ = A \}, \tag{39} \]
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\circ} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\dagger} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = \begin{pmatrix} \bar a & -\bar c \\ -\bar b & \bar d \end{pmatrix}, \tag{40} \]
such that the four-dimensional Minkowski metric is given by the determinant
\[ \det(X) = -x_0^2 + x_1^2 + x_2^2 + x_3^2. \tag{41} \]
As the map $\circ : GL(2,\mathbb{C}) \to GL(2,\mathbb{C})$ satisfies $(A^\circ)^\circ = A$, $(AB)^\circ = B^\circ A^\circ$, $\det(A^\circ) = \overline{\det A}$, one obtains an action of the group $SL(2,\mathbb{C})$ on $CL(2,\mathbb{C})$ via
\[ G \in SL(2,\mathbb{C}) : \quad M \mapsto G M G^\circ, \tag{42} \]
which has kernel $\pm 1$, preserves the determinant, and thus induces an action of $SO(3,1)^+ = SL(2,\mathbb{C})/\mathbb{Z}_2$ which preserves the metric (41). Hence, de Sitter space $\mathrm{dS}_\Lambda$ can be realised as the set of $SL(2,\mathbb{C})$ matrices invariant under the operation $\circ$ with metric (41), and with isometry group $SL(2,\mathbb{C})/\mathbb{Z}_2$ whose action is given by (42),
\[ \mathrm{dS}_\Lambda = \Big\{ \tfrac{1}{\sqrt{|\Lambda|}} A \;\Big|\; A \in SL(2,\mathbb{C}),\ A^\circ = A \Big\}. \tag{43} \]
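The two matrix models (34) and (38) and the corresponding $SL(2,\mathbb{C})$ actions (36) and (42) can be compared in a short numerical experiment. The Python sketch below is an illustration with hypothetical function names; the operation $\circ$ is implemented through its closed form $\begin{pmatrix} a & b \\ c & d \end{pmatrix}^\circ = \begin{pmatrix} \bar a & -\bar c \\ -\bar b & \bar d \end{pmatrix}$, and the sample group element returned by `example_sl2c` is an arbitrary choice with unit determinant.

```python
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def dagger(A):
    """Hermitian conjugate of a 2x2 matrix."""
    (a, b), (c, d) = A
    return ((a.conjugate(), c.conjugate()), (b.conjugate(), d.conjugate()))

def circ(A):
    """Closed form of the involution (40)."""
    (a, b), (c, d) = A
    return ((a.conjugate(), -c.conjugate()), (-b.conjugate(), d.conjugate()))

def herm_matrix(x):
    """Identification (34) with hermitian matrices."""
    x0, x1, x2, x3 = x
    return ((complex(x0 + x3), x1 + 1j * x2), (x1 - 1j * x2, complex(x0 - x3)))

def ds_matrix(x):
    """Identification (38) with matrices invariant under circ."""
    x0, x1, x2, x3 = x
    return ((complex(x3 + x0), x1 + 1j * x2), (-(x1 - 1j * x2), complex(x3 - x0)))

def minkowski4(x):
    return -x[0] ** 2 + x[1] ** 2 + x[2] ** 2 + x[3] ** 2

def example_sl2c():
    """An arbitrary SL(2,C) element; d is fixed by det = 1."""
    a, b, c = 1 + 0.5j, 0.3 + 0j, 0.2j
    return ((a, b), (c, (1 + b * c) / a))

def act_hyperbolic(G, X):
    """Action (36): X -> G X G^dagger."""
    return mul(mul(G, X), dagger(G))

def act_de_sitter(G, M):
    """Action (42): M -> G M G^circ."""
    return mul(mul(G, M), circ(G))
```

Both actions preserve the respective invariance condition (hermiticity or invariance under $\circ$) and the determinant, which is what allows them to descend to isometries of the metrics (35) and (41).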
Hence, depending on the cosmological constant and the signature, the model spacetimes $\mathbb{X}_{\Lambda,S}$ can be identified with the sets of matrices
\[ \mathbb{X}_{\Lambda,S} = \begin{cases} \mathbb{M}^3 \cong \mathfrak{su}(1,1) & \Lambda = 0,\ S = L \\ \mathrm{AdS}_\Lambda \cong \tfrac{1}{\sqrt{\Lambda}}\, PSU(1,1) & \Lambda > 0,\ S = L \\ \mathrm{dS}_\Lambda \cong \tfrac{1}{\sqrt{|\Lambda|}} \{ A \in SL(2,\mathbb{C}) : A = A^\circ \} & \Lambda < 0,\ S = L \\ \mathbb{H}^3_\Lambda \cong \tfrac{1}{\sqrt{|\Lambda|}} \{ A \in SL(2,\mathbb{C}) : A = A^\dagger \} & \Lambda < 0,\ S = E \end{cases} \tag{44} \]
with metrics given by (25), (29), (35), (41), and their groups of (orientation and time orientation) preserving isometries
\[ \mathrm{Isom}(\mathbb{X}_{\Lambda,S}) = \begin{cases} PSU(1,1) \ltimes \mathbb{R}^3 & \Lambda = 0,\ S = L \\ PSU(1,1) \times PSU(1,1) & \Lambda > 0,\ S = L \\ SL(2,\mathbb{C})/\mathbb{Z}_2 & \Lambda < 0,\ S = L \\ SL(2,\mathbb{C})/\mathbb{Z}_2 & \Lambda < 0,\ S = E \end{cases} \tag{45} \]
act via (26), (30), (36), (42).

3.2. Static universes and the embedding of hyperbolic space $\mathbb{H}^2$. The defining characteristic of the model spacetimes introduced in the last subsection is that their topology is trivial. (2+1)-spacetimes with nontrivial topology are obtained as the quotients of domains $U_{\Lambda,S} \subset \mathbb{X}_{\Lambda,S}$ in the model spacetimes by the action of certain subgroups of the isometry groups $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$. In this paper we restrict attention to spacetimes for which these subgroups are cocompact Fuchsian groups $\Gamma$ with $2g > 2$ generators and act via group homomorphisms $h_{\Lambda,S} : \Gamma \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$. The resulting spacetimes have topology $\mathbb{R} \times S_g$, where $S_g$ is an oriented two-surface of genus $g > 1$. The simplest such spacetimes are the static spacetimes associated to a cocompact Fuchsian group $\Gamma$; for a detailed discussion see for example [1]. For Lorentzian signature, the associated domain $U_{\Lambda,L} \subset \mathbb{X}_{\Lambda,L}$ in the model spacetime is the interior of a forward lightcone, i.e. the set of points connected to a given point $x_{\Lambda,L} \in \mathbb{X}_{\Lambda,L}$ by timelike geodesics. In the Euclidean case, it is the whole model spacetime $\mathbb{H}^3_\Lambda$. In each model spacetime $\mathbb{X}_{\Lambda,S}$, this domain is foliated by two-surfaces $U_{\Lambda,S}(T)$ of constant cosmological time $T$, i.e. surfaces of constant geodesic distance $T$ from a given point $x_{\Lambda,S}$, which represents a singularity of the spacetime.
For all values of the cosmological constant and all signatures under consideration, the surfaces $U_{\Lambda,S}(T)$ are surfaces of constant curvature and can be identified with copies of two-dimensional hyperbolic space. The
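action of the (2+1)-dimensional Lorentz group $PSU(1,1)$ via its canonical embedding $\imath_{\Lambda,S} : PSU(1,1) \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ preserves the surfaces $U_{\Lambda,S}(T)$ and agrees with the action induced by (20). This induces an action of the cocompact Fuchsian group $\Gamma$ and a tessellation of each surface $U_{\Lambda,S}(T)$ by geodesic arc $4g$-gons as described in Sect. 2.2. The static spacetimes $M^{st}_{\Lambda,S,\Gamma}$ associated to $\Gamma$ are then obtained by identifying on each surface of constant cosmological time the points related by this action of $\Gamma$,
\[ M^{st}_{\Lambda,S,\Gamma} = U_{\Lambda,S}/\Gamma. \tag{46} \]
To obtain explicit expressions for the static domains $U_{\Lambda,S} \subset \mathbb{X}_{\Lambda,S}$ and their foliation by copies of hyperbolic space, we consider timelike geodesics $c_{\Lambda,L}$ in the Lorentzian model spacetimes $\mathbb{X}_{\Lambda,L}$ and an associated geodesic $c_{\Lambda,E}$ in $\mathbb{H}^3_\Lambda$,
\[ c_{\Lambda,S}(T) = \begin{cases} 2T J_0^L, & T \in (0, \infty) & \Lambda = 0,\ \text{Lorentzian} \\ \tfrac{1}{\sqrt{\Lambda}}\, e^{2\sqrt{\Lambda}\, T J_0^L}, & T \in (0, \pi/\sqrt{\Lambda}) & \Lambda > 0,\ \text{Lorentzian} \\ \tfrac{1}{\sqrt{|\Lambda|}}\, e^{-2i\sqrt{|\Lambda|}\, T J_0^L}, & T \in (0, \infty) & \Lambda < 0,\ \text{Lorentzian and Euclidean}, \end{cases} \tag{47} \]
which are parametrised by arclength and based at the identity. Furthermore we introduce a map
\[ g : \mathbb{H}^2 \to SU(1,1), \qquad z \mapsto g(z) = \frac{1}{\sqrt{1 - |z|^2}} \begin{pmatrix} 1 & z \\ \bar z & 1 \end{pmatrix}. \tag{48} \]
A brief calculation shows that, up to right-multiplication with a phase, the action of $SU(1,1)$ on the disc via (20) corresponds to left-multiplication of the image $g(z)$,
\[ g(Mz) = M \cdot g(z) \cdot e^{\psi(M,z) J_0^L}, \qquad \psi(M,z) \in \mathbb{R} \qquad \forall M \in PSU(1,1),\ z \in \mathbb{H}^2. \tag{49} \]
As the phase commutes with $J_0^L$ and is mapped to its inverse by the operations $\circ$, $\dagger$, one finds that the map $\Phi^{\Lambda,S}_T : \mathbb{H}^2 \to \mathbb{X}_{\Lambda,S}$ defined by
\[ \Phi^{\Lambda,S}_T(z) = \imath_{\Lambda,S} \circ g(z)\; c_{\Lambda,S}(T) = \begin{cases} g(z)\, c_{0,L}(T)\, g(z)^{-1} & \Lambda = 0,\ \text{Lorentzian} \\ g(z)\, c_{\Lambda>0,L}(T)\, g(z)^{-1} & \Lambda > 0,\ \text{Lorentzian} \\ g(z)\, c_{\Lambda<0,L}(T)\, g(z)^{\circ} & \Lambda < 0,\ \text{Lorentzian} \\ g(z)\, c_{\Lambda<0,E}(T)\, g(z)^{\dagger} & \Lambda < 0,\ \text{Euclidean} \end{cases} \tag{50} \]
satisfies the covariance condition
\[ \Phi^{\Lambda,S}_T(Mz) = \imath_{\Lambda,S}(M)\, \Phi^{\Lambda,S}_T(z) = \begin{cases} M\, \Phi^{0,L}_T(z)\, M^{-1} & \Lambda = 0,\ \text{Lorentzian} \\ M\, \Phi^{\Lambda,L}_T(z)\, M^{-1} & \Lambda > 0,\ \text{Lorentzian} \\ M\, \Phi^{\Lambda,L}_T(z)\, M^{\circ} & \Lambda < 0,\ \text{Lorentzian} \\ M\, \Phi^{\Lambda,E}_T(z)\, M^{\dagger} & \Lambda < 0,\ \text{Euclidean}. \end{cases} \tag{51} \]
The action of the (2+1)-dimensional Lorentz group $PSU(1,1)$ via its canonical embedding into $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ therefore preserves the images $\Phi^{\Lambda,S}_T(\mathbb{H}^2)$ and agrees with the action induced by its action on the Poincaré disc via (20). Furthermore, as the geodesics (47) are parametrised by arclength, all points in the image $\Phi^{\Lambda,S}_T(\mathbb{H}^2)$ have constant geodesic distance $T$ from the initial singularity $c_{\Lambda,S}(0)$. Hence, one obtains a foliation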
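of the forward lightcone or, for Euclidean signature, of $\mathbb{H}^3$ by surfaces of constant cosmological time
\[
U_{\Lambda,S} =
\begin{cases}
\bigcup_{T\in(0,\infty)} U_{0,L}(T) & \Lambda = 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\pi/\sqrt{\Lambda})} U_{\Lambda,L}(T) & \Lambda > 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\infty)} U_{\Lambda,L}(T) & \Lambda < 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\infty)} U_{\Lambda,E}(T) & \Lambda < 0,\ \text{Euclidean,}
\end{cases}
\qquad U_{\Lambda,S}(T) = \Phi_T^{\Lambda,S}(\mathbb{H}^2). \tag{52}
\]
To obtain concrete expressions for the matrices in (50), one evaluates (47) using expression (10) for the exponential map. For Lorentzian signature and vanishing cosmological constant, this yields
\[
\Phi_T^{0,L}(z) = \begin{pmatrix} iT\,\dfrac{1+|z|^2}{1-|z|^2} & -T\,\dfrac{2iz}{1-|z|^2} \\[2ex] T\,\dfrac{2i\bar z}{1-|z|^2} & -iT\,\dfrac{1+|z|^2}{1-|z|^2} \end{pmatrix}. \tag{53}
\]
By comparing with (24), we recover the formula (19) which relates the disc model of hyperbolic space to the hyperboloids $\mathbb{H}^2_{1/T^2}$ of curvature $1/T^2$. For Lorentzian signature and $\Lambda > 0$, we consider the associated geodesic in the double cover AdS and find that the parameters in (28) and the metric (27) take the form
\[
\begin{gathered}
t_1 = \frac{\cos(\sqrt{\Lambda}T)}{\sqrt{\Lambda}}, \quad
t_2 = \frac{\sin(\sqrt{\Lambda}T)}{\sqrt{\Lambda}}\,\frac{1+|z|^2}{1-|z|^2}, \quad
x_1 = \frac{\sin(\sqrt{\Lambda}T)}{\sqrt{\Lambda}}\,\frac{2\,\mathrm{Re}(z)}{1-|z|^2}, \quad
x_2 = \frac{\sin(\sqrt{\Lambda}T)}{\sqrt{\Lambda}}\,\frac{2\,\mathrm{Im}(z)}{1-|z|^2}, \\
dx^2 = -(dt_1)^2 - (dt_2)^2 + (dx_1)^2 + (dx_2)^2 = -dT^2 + \frac{4\sin^2(\sqrt{\Lambda}T)}{\Lambda}\,\frac{|dz|^2}{(1-|z|^2)^2}.
\end{gathered}
\tag{54}
\]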
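The surfaces $U_{\Lambda>0,L}(T) \subset \mathbb{X}_{\Lambda>0,L} = \mathrm{AdS}$ therefore have constant curvature $-\Lambda/\sin^2(\sqrt{\Lambda}T)$. For Lorentzian signature and $\Lambda < 0$, the coordinates parametrising $dS$ in (38) and the metric are given by
\[
\begin{gathered}
x^0 = \frac{\sinh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{1+|z|^2}{1-|z|^2}, \quad
x^1 = -\frac{\sinh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{2\,\mathrm{Re}(z)}{1-|z|^2}, \quad
x^2 = -\frac{\sinh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{2\,\mathrm{Im}(z)}{1-|z|^2}, \\
x^3 = \frac{\cosh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}, \qquad
dx^2 = -dT^2 + \frac{4\sinh^2(\sqrt{|\Lambda|}T)}{|\Lambda|}\,\frac{|dz|^2}{(1-|z|^2)^2},
\end{gathered}
\tag{55}
\]
and the surfaces $U_{\Lambda<0,L}(T)$ have constant curvature $-|\Lambda|/\sinh^2(\sqrt{|\Lambda|}T)$. For Euclidean signature and $\Lambda < 0$, the curvature of the surfaces $U_{\Lambda<0,E}(T)$ is $-|\Lambda|/\cosh^2(\sqrt{|\Lambda|}T)$, since the parameters in (34) and the metric (35) take the form
\[
\begin{gathered}
x^0 = \frac{\cosh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{1+|z|^2}{1-|z|^2}, \quad
x^1 = \frac{\cosh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{2\,\mathrm{Re}(z)}{1-|z|^2}, \quad
x^2 = \frac{\cosh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}\,\frac{2\,\mathrm{Im}(z)}{1-|z|^2}, \\
x^3 = \frac{\sinh(\sqrt{|\Lambda|}T)}{\sqrt{|\Lambda|}}, \qquad
dx^2 = dT^2 + \frac{4\cosh^2(\sqrt{|\Lambda|}T)}{|\Lambda|}\,\frac{|dz|^2}{(1-|z|^2)^2}.
\end{gathered}
\tag{56}
\]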
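As a consistency check on the parametrisations (54), (55) and (56), one can verify numerically that the coordinates lie on quadrics of radius $1/\sqrt{|\Lambda|}$. The sketch below assumes, without quoting (28), (34) and (38), that the model spacetimes are the quadrics $-t_1^2-t_2^2+x_1^2+x_2^2=-1/\Lambda$ (AdS), $-x_0^2+x_1^2+x_2^2+x_3^2=1/|\Lambda|$ (dS) and $-x_0^2+x_1^2+x_2^2+x_3^2=-1/|\Lambda|$ ($\mathbb{H}^3$); the function names are hypothetical.

```python
import math


def disc_factors(z):
    """The three z-dependent factors shared by (54), (55) and (56)."""
    d = 1.0 - abs(z) ** 2
    return (1.0 + abs(z) ** 2) / d, 2.0 * z.real / d, 2.0 * z.imag / d


def ads_point(T, z, lam):
    """(t1, t2, x1, x2) from (54), for lam > 0."""
    s = math.sqrt(lam)
    h0, h1, h2 = disc_factors(z)
    c, n = math.cos(s * T) / s, math.sin(s * T) / s
    return (c, n * h0, n * h1, n * h2)


def ds_point(T, z, lam):
    """(x0, x1, x2, x3) from (55), for lam < 0."""
    s = math.sqrt(abs(lam))
    h0, h1, h2 = disc_factors(z)
    c, n = math.cosh(s * T) / s, math.sinh(s * T) / s
    return (n * h0, -n * h1, -n * h2, c)


def h3_point(T, z, lam):
    """(x0, x1, x2, x3) from (56), for lam < 0."""
    s = math.sqrt(abs(lam))
    h0, h1, h2 = disc_factors(z)
    c, n = math.cosh(s * T) / s, math.sinh(s * T) / s
    return (c * h0, c * h1, c * h2, n)


def quadric(point, signs):
    """Evaluate sum_i signs[i] * point[i]**2."""
    return sum(sg * p * p for sg, p in zip(signs, point))


z, T = 0.3 + 0.4j, 0.8
ads_value = quadric(ads_point(T, z, 2.0), (-1, -1, 1, 1))
ds_value = quadric(ds_point(T, z, -2.0), (-1, 1, 1, 1))
h3_value = quadric(h3_point(T, z, -2.0), (-1, 1, 1, 1))
```

All three checks reduce to the identity $(1+|z|^2)^2 - 4|z|^2 = (1-|z|^2)^2$ together with $\cos^2+\sin^2=1$ or $\cosh^2-\sinh^2=1$.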
The cocompact Fuchsian group $\Gamma \subset PSU(1,1)$ acts on the domains $U_{\Lambda,S}$ freely and properly discontinuously via the canonical inclusion $\imath_{\Lambda,S} : PSU(1,1) \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ induced by (14). It follows from the identity (51) that this action preserves the surfaces $U_{\Lambda,S}(T)$ of constant cosmological time and agrees with the action induced by the identification of these surfaces with hyperbolic space. The static (2+1)-spacetimes $M^{st}_{\Lambda,S,\Gamma}$ associated to $\Gamma$ are given as the quotients of the domains $U_{\Lambda,S}$ by this action of $\Gamma$,
\[
M^{st}_{\Lambda,S,\Gamma} = U_{\Lambda,S}/\Gamma = \bigcup_{T} U_{\Lambda,S}(T)/\Gamma. \tag{57}
\]
3.3. The construction of evolving (2+1)-spacetimes via grafting.

After discussing the static spacetimes associated to a cocompact Fuchsian group $\Gamma$, we will now summarise the construction of evolving (2+1)-spacetimes via grafting following the presentation in [21, 22]. Grafting along measured geodesic laminations is a method for constructing two-surfaces. The simplest case is that of geodesic laminations which are sets of non-intersecting closed, simple geodesics on a two-surface. Grafting along closed, simple geodesics was first investigated in the context of complex projective structures and Teichmüller theory [26–28]. General geodesic laminations were first considered by Thurston [29, 30]; for historical remarks see for instance [31]. The role of geodesic laminations in (2+1)-dimensional gravity was first explored by Mess [19], who investigated the characterisation of (2+1)-dimensional spacetimes in terms of holonomies. More recent work on grafting in the context of (2+1)-dimensional gravity includes the papers by Benedetti and Bonsante [21, 22], which relate the construction of (2+1)-spacetimes via grafting for different values of the cosmological constant.

The ingredients of the grafting construction are a cocompact Fuchsian group $\Gamma$ and a measured geodesic lamination on the associated two-surface $\mathbb{H}^2_k/\Gamma$. In the following, we restrict attention to the case where this geodesic lamination is a weighted multicurve on $\mathbb{H}^2_k/\Gamma$, i. e. a set of non-intersecting closed, simple geodesics $\eta_i$ on $\mathbb{H}^2_k/\Gamma$, each equipped with a weight $w_i > 0$,
\[
G = \{(\eta_i, w_i) \mid i \in I\}. \tag{58}
\]
Geometrically, grafting along the multicurve (58) amounts to cutting the surface $\mathbb{H}^2_k/\Gamma$ along each geodesic $\eta_i$ and inserting a strip of width $w_i$ as shown in Fig. 1.

Fig. 1. Grafting along a closed simple geodesic c with weight w on a genus 2 surface
In the construction of (2+1)-spacetimes via grafting, the grafting procedure is applied to each two-surface $U_{\Lambda,S}(T)/\Gamma$ in (57). The construction is performed on their universal covers, i. e. the constant cosmological time surfaces $U_{\Lambda,S}(T)$, which are identified with copies of hyperbolic space via (50) and foliate the static domains $U_{\Lambda,S} \subset \mathbb{X}_{\Lambda,S}$ as in (52). The first step in the grafting construction is to lift each geodesic $\eta_i$ in the multicurve (58) to a geodesic $c_{\eta_i}$ on the universal cover $\mathbb{H}^2_k$. By acting on these geodesics with the cocompact Fuchsian group $\Gamma$, one obtains a $\Gamma$-invariant multicurve on $\mathbb{H}^2_k$,
\[
G_{\mathbb{H}^2_k} = \{(v c_{\eta_i}, w_i) \mid i \in I,\ v \in \Gamma\}, \tag{59}
\]
i. e. a set of non-intersecting geodesics on $\mathbb{H}^2_k$ with associated weights $w_i > 0$, which are mapped into each other by the elements of $\Gamma$. Via the maps $\Phi_T^{\Lambda,S} : \mathbb{H}^2 \to U_{\Lambda,S}(T) \subset \mathbb{X}_{\Lambda,S}$ in (50), which identify hyperbolic space with the constant cosmological time surfaces $U_{\Lambda,S}(T)$, one then obtains a $\Gamma$-invariant set of non-intersecting geodesics on each surface $U_{\Lambda,S}(T)$.

Grafting along the multicurve (58) assigns to each surface $U_{\Lambda,S}(T)$ a deformed surface $U^G_{\Lambda,S}(T)$ constructed as follows. One selects a basepoint $q_0 \in \mathbb{H}^2$ outside of the geodesics in the multicurve (58) and considers the images $\Phi_T^{\Lambda,S}(q_0)$ on the surfaces $U_{\Lambda,S}(T)$. One then cuts each surface $U_{\Lambda,S}(T)$ along the images $\Phi_T^{\Lambda,S}(v c_{\eta_i})$, $i \in I$, $v \in \Gamma$, of the geodesics in the multicurve (59) on $U_{\Lambda,S}(T)$. The resulting pieces which do not contain the images of the basepoint are then shifted away from the basepoint in the direction determined by the geodesics' unit translation vectors and by a distance given by the geodesic's weight. Finally, one inserts strips, which connect the shifted pieces of each constant cosmological time surface $U_{\Lambda,S}(T)$, and thus obtains a connected deformed surface $U^G_{\Lambda,S}(T)$. The union of these deformed surfaces for all values of the cosmological time $T$ then forms a simply connected regular domain in $\mathbb{X}_{\Lambda,S}$:
\[
U^G_{\Lambda,S} =
\begin{cases}
\bigcup_{T\in(0,\infty)} U^G_{0,L}(T) & \Lambda = 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\pi/\sqrt{\Lambda})} U^G_{\Lambda,L}(T) & \Lambda > 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\infty)} U^G_{\Lambda,L}(T) & \Lambda < 0,\ \text{Lorentzian} \\
\bigcup_{T\in(0,\infty)} U^G_{\Lambda,E}(T) & \Lambda < 0,\ \text{Euclidean.}
\end{cases}
\tag{60}
\]
Under the grafting construction, the initial singularity of the static domains $U_{\Lambda,S}$ is mapped to a graph in $\mathbb{X}_{\Lambda,S}$. It is shown in [21, 22] that the deformed surfaces $U^G_{\Lambda,S}(T)$ are surfaces of constant geodesic distance $T$ from this graph and therefore again surfaces of constant cosmological time $T$.

It is discussed in [21, 22] that the cocompact Fuchsian group $\Gamma$ acts on the grafted domain $U^G_{\Lambda,S}$ via a group homomorphism $h^G_{\Lambda,S} : \Gamma \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$. This action is free and properly discontinuous and preserves each surface $U^G_{\Lambda,S}(T)$. Hence, by taking the quotient $U^G_{\Lambda,S}(T)/h^G_{\Lambda,S}(\Gamma)$ of the deformed constant cosmological time surfaces by this action of $\Gamma$, one obtains a two-surface of genus $g$. The grafted spacetimes $M^{\Gamma,G}_{\Lambda,S}$ associated to the cocompact Fuchsian group $\Gamma$ and the multicurve (58) on $\mathbb{H}^2_k/\Gamma$ are then given as the union of these surfaces for all values of the cosmological time or, equivalently, as the quotient of the regular domains $U^G_{\Lambda,S}$ by this action of $\Gamma$,
\[
M^{\Gamma,G}_{\Lambda,S} = U^G_{\Lambda,S}/h^G_{\Lambda,S}(\Gamma) = \bigcup_{T} U^G_{\Lambda,S}(T)/h^G_{\Lambda,S}(\Gamma). \tag{61}
\]
Fig. 2. Grafting along a geodesic with weight w in hyperbolic space
The procedure is most easily visualised in Lorentzian (2+1)-gravity with vanishing cosmological constant, where the surfaces of constant cosmological time are the hyperboloids $\mathbb{H}^2_{1/T^2}$ which foliate the interior of the forward lightcone. Geodesics on the hyperboloids $\mathbb{H}^2_{1/T^2}$ are given as the intersection of $\mathbb{H}^2_{1/T^2}$ with planes through the origin, whose unit normal vector is the unit translation vector of the geodesic given in (23). Cutting each surface $U_{0,L}(T)$ along these geodesics therefore amounts to cutting the interior of the forward lightcone along the associated planes. The resulting pieces are then shifted away from the basepoint in the direction of the plane's normal vector by a distance given by the weight of the associated geodesic as shown in Fig. 2. The strips connecting the different pieces of a surface $U_{0,L}(T)$ are obtained by connecting the points of the different pieces of $U_{0,L}(T)$ which correspond to a single point on a geodesic by straight lines.

For the other model spacetimes the construction is similar but its description is more involved. As we will not need the details of the construction, we refer the reader to the papers [21, 22], which give an explicit parametrisation of the resulting surfaces and relate these surfaces for different values of the cosmological constant. In the following, we will only make use of a formula for the translation of the images $\Phi_T^{\Lambda,S}(x) \in U_{\Lambda,S}(T)$ of points outside of the geodesics in the multicurve (59). The relative shift of such points under the grafting construction is determined by their position relative to the geodesics in (59) and given by a map $B_{G,\Lambda,S} : \mathbb{H}^2 \times \mathbb{H}^2 \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$. To determine the value of $B_{G,\Lambda,S}(p,q)$ for two points $p, q \in \mathbb{H}^2$ outside the geodesics in (59), one connects them with a geodesic $a_{pq}$ on $\mathbb{H}^2$ oriented towards $q$. One then determines the geodesics in the multicurve (59) which intersect this geodesic as well as the associated oriented intersection numbers.
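It is shown in [21, 22], see in particular Sects. 4.2.1, 4.4.1, 4.6.1 and 4.7.2, that if these geodesics are labelled by $c_i$, $i = 1, \ldots, m$, such that the intersection point of $a_{pq}$ with $c_i$ occurs before the one with $c_j$ for $i < j$ and if $\epsilon_i$ are the associated oriented intersection numbers with the convention $\epsilon_i = 1$ if $c_i$ crosses $a_{pq}$ from the left to the right, then the relative shift $B_{G,\Lambda,S}(p,q)$ is given by²
\[
\begin{aligned}
B_{G,0,L}(p,q) &= \sum_{i=1}^{m} \epsilon_i w_i \hat n_i \in \mathbb{R}^3 \subset PSU(1,1) \ltimes \mathbb{R}^3, \\
B_{G,\Lambda>0,L}(p,q) &= \left(B^+_{G,\Lambda>0,L}(p,q),\, B^-_{G,\Lambda>0,L}(p,q)\right) \in PSU(1,1) \times PSU(1,1), \\
B^{\pm}_{G,\Lambda>0,L}(p,q) &= e^{\pm\sqrt{\Lambda}\,\epsilon_1 w_1 \hat n_1^a J_a^L}\, e^{\pm\sqrt{\Lambda}\,\epsilon_2 w_2 \hat n_2^a J_a^L} \cdots e^{\pm\sqrt{\Lambda}\,\epsilon_m w_m \hat n_m^a J_a^L}, \\
B_{G,\Lambda<0,L}(p,q) = B_{G,\Lambda<0,E}(p,q) &= e^{i\sqrt{|\Lambda|}\,\epsilon_1 w_1 \hat n_1^a J_a^L}\, e^{i\sqrt{|\Lambda|}\,\epsilon_2 w_2 \hat n_2^a J_a^L} \cdots e^{i\sqrt{|\Lambda|}\,\epsilon_m w_m \hat n_m^a J_a^L} \in SL(2,\mathbb{C})/\mathbb{Z}_2,
\end{aligned}
\tag{62}
\]
where $w_i$ is the weight of the geodesic $c_i$ and $\hat n_i$ its unit translation vector as defined in (23).

² The factors $\sqrt{\Lambda}$, $\sqrt{|\Lambda|}$ are not present in [21, 22], where only spacetimes with cosmological constant $\Lambda \in \{0, \pm 1\}$ are considered. However, this normalisation is suggested by the fact that the associated spacelike geodesics should be parametrised by arc length.

It is shown in [21, 22] that the map $B_{G,\Lambda,S} : \mathbb{H}^2 \times \mathbb{H}^2 \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ satisfies the identities
\[
\begin{aligned}
B_{G,\Lambda,S}(p,q) \cdot B_{G,\Lambda,S}(q,r) &= B_{G,\Lambda,S}(p,r) & &\forall p, q, r \in \mathbb{H}^2, \\
B_{G,\Lambda,S}(vp, vq) &= \imath_{\Lambda,S}(v) \cdot B_{G,\Lambda,S}(p,q) \cdot \imath_{\Lambda,S}(v)^{-1} & &\forall p, q \in \mathbb{H}^2,\ v \in PSU(1,1),
\end{aligned}
\tag{63}
\]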
which reflect the geometrical properties of the grafting procedure. This allows one to define a group homomorphism $h^G_{\Lambda,S} : \Gamma \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ from the cocompact Fuchsian group $\Gamma$ into the isometry group of the model spacetime by setting
\[
h^G_{\Lambda,S}(v) = B_{G,\Lambda,S}(q_0, v q_0) \cdot \imath_{\Lambda,S}(v) \qquad \forall v \in \Gamma, \tag{64}
\]
where $\imath_{\Lambda,S} : PSU(1,1) \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ is the canonical embedding of $PSU(1,1)$ into the isometry group of the model spacetime given by (14) and $q_0 \in \mathbb{H}^2$ the basepoint. It is discussed in [21, 22] that this group homomorphism defines a free and properly discontinuous action of the group $\Gamma$ on the grafted domains $U^G_{\Lambda,S}$ which maps each surface $U^G_{\Lambda,S}(T)$ to itself. Furthermore, for any two points $\Phi_T^{\Lambda,S}(x), \Phi_T^{\Lambda,S}(x') \in U_{\Lambda,S}(T)$ outside the geodesics which are related by the canonical action (51) of an element $v \in \Gamma$, the corresponding points on the grafted surface $U^G_{\Lambda,S}(T)$ are related by the action of $v$ via (64),
\[
\Phi_T^{\Lambda,S}(x') = \imath_{\Lambda,S}(v) \cdot \Phi_T^{\Lambda,S}(x)
\quad\Rightarrow\quad
B_{G,\Lambda,S}(q_0, x')\, \Phi_T^{\Lambda,S}(x') = h^G_{\Lambda,S}(v) \cdot B_{G,\Lambda,S}(q_0, x)\, \Phi_T^{\Lambda,S}(x). \tag{65}
\]
The quotient (61) of the domains $U^G_{\Lambda,S} \subset \mathbb{X}_{\Lambda,S}$ by this action of $\Gamma$ is therefore well-defined and gives rise to a spacetime of topology $\mathbb{R} \times S_g$.
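For the reader's convenience, the following short derivation sketches why the two identities (63) make the map (64) a group homomorphism. It assumes only that $\imath_{\Lambda,S}$ is itself a group homomorphism; subscripts $G,\Lambda,S$ are suppressed.

```latex
\begin{align*}
h(v_1 v_2)
  &= B(q_0, v_1 v_2 q_0)\cdot \imath(v_1 v_2) \\
  &= B(q_0, v_1 q_0)\cdot B(v_1 q_0, v_1 v_2 q_0)\cdot \imath(v_1)\,\imath(v_2)
     && \text{first identity in (63)} \\
  &= B(q_0, v_1 q_0)\cdot \imath(v_1)\cdot B(q_0, v_2 q_0)\cdot \imath(v_1)^{-1}
     \cdot \imath(v_1)\,\imath(v_2)
     && \text{second identity in (63)} \\
  &= h(v_1)\cdot h(v_2)
     && \forall v_1, v_2 \in \Gamma .
\end{align*}
```

The same two identities, applied to $B(q_0, vx) = B(q_0, vq_0)\cdot B(vq_0, vx)$, give the implication (65).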
4. (2+1)-Dimensional Gravity: The Chern-Simons Formulation

4.1. (2+1)-dimensional gravity as a Chern-Simons gauge theory.

The absence of local gravitational degrees of freedom in (2+1)-dimensional gravity allows one to formulate the theory as a Chern-Simons gauge theory [3, 4]. The Chern-Simons formulation of (2+1)-dimensional gravity is derived from Cartan's description, in which a spacetime manifold $M$ is characterised in terms of a dreibein of one-forms $e^a$, $a = 0, 1, 2$, and spin connection one-forms $\omega^a$, $a = 0, 1, 2$, on $M$. The metric on $M$ is given by the dreibein,
\[
g = \eta^S_{ab}\, e^a \otimes e^b, \tag{66}
\]
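where $\eta^S_{ab}$ denotes the Minkowski metric (1) or the Euclidean metric (2), while the one-forms $\omega^a$ are the coefficients of the spin connection $\omega = \omega^a J_a^S$. Einstein's equations of motion then take the form of the requirements of vanishing torsion and constant curvature,
\[
T^a = de^a + \epsilon^a{}_{bc}\, \omega^b \wedge e^c = 0, \qquad
F_\omega^a = d\omega^a + \tfrac{1}{2} \epsilon^a{}_{bc}\, \omega^b \wedge \omega^c = -\tfrac{\Lambda}{2} \epsilon^a{}_{bc}\, e^b \wedge e^c. \tag{67}
\]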
To obtain the Chern-Simons formulation of (2+1)-dimensional gravity, one combines dreibein and spin connection into the Cartan connection [32] or Chern-Simons gauge field,
\[
A = \omega^a J_a^S + e^a P_a^S, \tag{68}
\]
where $J_a^S$, $P_a^S$, $a = 0, 1, 2$, denote the generators of the six-dimensional Lie algebras $\mathfrak{h}_{\Lambda,S}$ with bracket (3), (4). Hence, depending on the signature and the cosmological constant, the Chern-Simons gauge field is a one-form on $M$ with values in the Lie algebra $\mathfrak{h}_{\Lambda,S}$. The choice of the Lie algebra determines the gauge group of the associated Chern-Simons theory up to coverings, and in the following, we will take the isometry groups $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ of the associated model spacetimes as the gauge groups. When expressed in terms of the one-form (68), the Einstein-Hilbert action in Cartan's formulation of the theory takes the form of a Chern-Simons action,
\[
S_{CS}[A] = \int_M \left\langle A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A \right\rangle, \tag{69}
\]
where $\langle\,,\rangle$ is an Ad-invariant, non-degenerate bilinear form on the Lie algebra $\mathfrak{h}_{\Lambda,S}$ given by
\[
\langle J_a^S, P_b^S \rangle = \eta^S_{ab}, \qquad \langle J_a^S, J_b^S \rangle = \langle P_a^S, P_b^S \rangle = 0. \tag{70}
\]
The equations of motion derived from (69) are a flatness condition on the gauge field,
\[
F = dA + A \wedge A = 0, \tag{71}
\]
which combines the requirements (67) of vanishing torsion and constant curvature,
\[
F = T^a P_a^S + \left(F_\omega^a + \tfrac{\Lambda}{2} \epsilon^a{}_{bc}\, e^b \wedge e^c\right) J_a^S. \tag{72}
\]
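The decomposition (72) can be made explicit by expanding $A \wedge A = \tfrac12 [A, A]$ for the gauge field (68). The sketch below assumes that the brackets (3), (4), which are not reproduced in this section, take the form $[J_a, J_b] = \epsilon_{ab}{}^{c} J_c$, $[J_a, P_b] = \epsilon_{ab}{}^{c} P_c$, $[P_a, P_b] = \Lambda\, \epsilon_{ab}{}^{c} J_c$; signs and normalisations may differ from the conventions used there.

```latex
\begin{align*}
A \wedge A
  &= \tfrac12\, \omega^a \wedge \omega^b\, [J_a, J_b]
   + \omega^a \wedge e^b\, [J_a, P_b]
   + \tfrac12\, e^a \wedge e^b\, [P_a, P_b] \\
  &= \tfrac12\, \epsilon^{c}{}_{ab}\, \omega^a \wedge \omega^b\, J_c
   + \epsilon^{c}{}_{ab}\, \omega^a \wedge e^b\, P_c
   + \tfrac{\Lambda}{2}\, \epsilon^{c}{}_{ab}\, e^a \wedge e^b\, J_c , \\
dA + A \wedge A
  &= \bigl(de^c + \epsilon^{c}{}_{ab}\, \omega^a \wedge e^b\bigr) P_c
   + \bigl(d\omega^c + \tfrac12 \epsilon^{c}{}_{ab}\, \omega^a \wedge \omega^b
   + \tfrac{\Lambda}{2} \epsilon^{c}{}_{ab}\, e^a \wedge e^b\bigr) J_c .
\end{align*}
```

Under these assumptions the coefficient of $P_c$ is the torsion $T^c$ and the coefficient of $J_c$ is $F_\omega^c + \tfrac{\Lambda}{2}\epsilon^{c}{}_{ab}\, e^a \wedge e^b$, so that $F = 0$ is equivalent to the pair of conditions (67).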
The Chern-Simons action (69) is invariant under Chern-Simons gauge transformations
\[
A \mapsto \gamma A \gamma^{-1} + \gamma\, d\gamma^{-1}, \qquad \gamma : M \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S}). \tag{73}
\]
It has been shown by Witten [4] that infinitesimal Chern-Simons gauge transformations are on-shell equivalent to infinitesimal diffeomorphisms. The space of metrics solving Einstein’s equation modulo infinitesimally generated diffeomorphisms is therefore isomorphic to the space of flat Chern-Simons gauge fields modulo infinitesimally generated Chern-Simons gauge transformations. Note, however, that some caution should be applied when identifying the phase space of (2+1)-dimensional gravity in its geometrical formulation with the phase space of the associated Chern-Simons theory. First, the equivalence between diffeomorphisms and Chern-Simons gauge transformations does not hold for large diffeomorphisms, which are not infinitesimally generated, and for the large gauge transformations arising in Chern-Simons theory with non-simply connected gauge groups. Second, in order to
define a metric of Lorentzian or Euclidean signature via (66), the dreibein $e^a$ has to be non-degenerate, which is not required in the Chern-Simons formalism. It is discussed in [33] for the Lorentzian case with vanishing cosmological constant and spacetimes containing particles that this leads to differences in the global structure of the phase spaces. A similar result for a spacetime with three particles is derived in [22], Sect. 4.9, where it is argued that such problems arise generically when the spacetime is of topology $\mathbb{R} \times S$, where $S$ is a non-compact two-surface. However, as this paper restricts attention to spacetimes with compact spatial surfaces and is mainly concerned with the local properties of the phase space, we will not address these issues in the following.

On the spacetime manifolds of topology $M \cong \mathbb{R} \times S_g$ considered in this paper, it is possible to give a Hamiltonian formulation of the theory. For this, one introduces coordinates $x^0, x^1, x^2$ on $M \cong \mathbb{R} \times S_g$ such that $x^0$ parametrises $\mathbb{R}$ and $x^1, x^2$ are coordinates on $S_g$, and splits the gauge field (68) as
\[
A = A_0\, dx^0 + A_S, \tag{74}
\]
where $A_0 : \mathbb{R} \times S_g \to \mathfrak{h}_{\Lambda,S}$ is a function with values in the Lie algebra $\mathfrak{h}_{\Lambda,S}$ and $A_S$ a gauge field on $S_g$. The Chern-Simons action (69) on $M$ then takes the form
\[
S[A_S, A_0] = \int_{\mathbb{R}} dx^0 \int_{S_g} \tfrac{1}{2} \langle \partial_0 A_S \wedge A_S \rangle + \langle A_0, F_S \rangle, \tag{75}
\]
where $F_S$ is the curvature of the spatial gauge field $A_S$,
\[
F_S = d_S A_S + A_S \wedge A_S, \tag{76}
\]
with $d_S$ denoting differentiation on the surface $S_g$. The function $A_0$ plays the role of a Lagrange multiplier. Varying it leads to the flatness constraint
\[
F_S = 0, \tag{77}
\]
while variation of $A_S$ results in the evolution equation
\[
\partial_0 A_S = d_S A_0 + [A_S, A_0]. \tag{78}
\]
The phase space of the theory is therefore the moduli space $\mathcal{M}_g^{\mathrm{Isom}(\mathbb{X}_{\Lambda,S})}$ of flat $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$-connections $A_S$ modulo gauge transformations on the spatial surface $S_g$.
4.2. Trivialisation and holonomies.

As discussed in Sect. 3, the absence of local gravitational degrees of freedom in (2+1)-dimensional gravity implies that each (2+1)-spacetime is locally isometric to one of the model spacetimes $\mathbb{X}_{\Lambda,S}$. In the Chern-Simons formalism, this absence of local degrees of freedom manifests itself in the fact that gauge fields solving the equations of motion are flat and can be trivialised, i. e. written as pure gauge on any simply connected region $R \subset \mathbb{R} \times S_g$,
\[
A = \gamma\, d\gamma^{-1}, \qquad \gamma : R \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S}). \tag{79}
\]
Given a function $\gamma : R \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ which trivialises a flat gauge field $A$ on $R$, the associated functions $\gamma(x^0, \cdot)$ trivialise the corresponding flat spatial gauge fields $A_S$ for all values of $x^0$,
\[
A_S(x^0, \cdot) = \gamma(x^0, \cdot)\, d_S \gamma^{-1}(x^0, \cdot). \tag{80}
\]
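That a pure gauge field of the form (79) is automatically flat can be seen in two lines. The sketch below treats $\gamma$ as matrix-valued and uses only $\gamma\, d\gamma^{-1} = -d\gamma\, \gamma^{-1}$, which follows from differentiating $\gamma \gamma^{-1} = 1$.

```latex
\begin{align*}
dA &= d\bigl(\gamma\, d\gamma^{-1}\bigr) = d\gamma \wedge d\gamma^{-1}, \\
A \wedge A &= \bigl(\gamma\, d\gamma^{-1}\bigr) \wedge \bigl(\gamma\, d\gamma^{-1}\bigr)
            = -\,d\gamma\, \gamma^{-1} \wedge \gamma\, d\gamma^{-1}
            = -\,d\gamma \wedge d\gamma^{-1}, \\
F &= dA + A \wedge A = 0 .
\end{align*}
```

The converse statement, that every flat gauge field on a simply connected region is of this form, is the content of the trivialisation (79).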
Fig. 3. Cutting the surface $S_g$ along the generators of $\pi_1(S_g)$
To simplify notation, we will often neglect the dependence on the parameter $x^0$ in the following and denote this function also by $\gamma$. A maximal simply connected region in $\mathbb{R} \times S_g$ is obtained by cutting the spatial surface $S_g$ along a set of generators of the fundamental group $\pi_1(S_g)$ as in Fig. 3. As discussed in Sect. 2.2, the fundamental group of a genus $g$ surface $S_g$ is isomorphic to a cocompact Fuchsian group $\Gamma$ with $2g$ generators, which are subject to a single defining relation,
\[
\pi_1(S_g) = \langle a_1, b_1, \ldots, a_g, b_g \,;\, [b_g, a_g^{-1}] \cdots [b_1, a_1^{-1}] = 1 \rangle, \qquad
[b_i, a_i^{-1}] = b_i \circ a_i^{-1} \circ b_i^{-1} \circ a_i. \tag{81}
\]
Throughout the paper, we work with a fixed system of generators $a_i, b_i$, $i = 1, \ldots, g$, which are the homotopy equivalence classes of two loops around each handle and based at a point $p \in S_g$ as shown in Fig. 4. Cutting the surface along each of the curves representing these generators results in a $4g$-gon $P_g$ pictured in Fig. 5. As discussed by Alekseev and Malkin [13], a function $\gamma : P_g \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ on $P_g$ defines a flat gauge field on $S_g$ if and only if it satisfies an overlap condition relating its values on the two sides which correspond to a given generator of the fundamental group. For any $y \in \{a_1, b_1, \ldots, a_g, b_g\}$, $y' \in \{a'_1, b'_1, \ldots, a'_g, b'_g\}$ one must have
\[
A_S|_{y'} = \gamma\, d_S \gamma^{-1}|_{y'} = \gamma\, d_S \gamma^{-1}|_{y} = A_S|_{y}, \tag{82}
\]
which is the case if and only if there exist constant elements $N_Y \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ such that
\[
\gamma^{-1}|_{y'} = N_Y\, \gamma^{-1}|_{y}. \tag{83}
\]
The elements $N_Y$, $Y \in \{A_1, B_1, \ldots, A_g, B_g\}$, are the Chern-Simons analogue of the group homomorphisms (64) in the geometrical formulation. They contain all information about the physical state and are closely related to the holonomies $A_i$, $B_i$ along the generators $a_i$, $b_i$ of the fundamental group. These holonomies are given by the value of the trivialising function on the corners of the polygon $P_g$ [13],
\[
\begin{aligned}
A_i &= \gamma(p_{4i-3})\, \gamma(p_{4i-4})^{-1} = \gamma(p_{4i-2})\, \gamma(p_{4i-1})^{-1}, \\
B_i &= \gamma(p_{4i-3})\, \gamma(p_{4i-2})^{-1} = \gamma(p_{4i})\, \gamma(p_{4i-1})^{-1},
\end{aligned}
\tag{84}
\]
and satisfy a single relation arising from the defining relation of the fundamental group $\pi_1(S_g)$,
\[
[B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] \approx 1, \qquad
[B_i, A_i^{-1}] = B_i \cdot A_i^{-1} \cdot B_i^{-1} \cdot A_i. \tag{85}
\]

Fig. 4. Generators and dual generators of the fundamental group $\pi_1(S_g)$

Fig. 5. The polygon $P_g$
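Via the overlap condition (82), one can relate the value of the trivialising function $\gamma$ at the corners of the polygon $P_g$ to its value at a given corner $p_0$,
\[
\begin{aligned}
\gamma^{-1}(p_{4i}) &= N_{H_i} N_{H_{i-1}} \cdots N_{H_1} \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1}, \\
\gamma^{-1}(p_{4i+1}) &= N_{A_{i+1}}^{-1} N_{B_{i+1}}^{-1} N_{A_{i+1}} N_{H_i} \cdots N_{H_1} \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1}, \\
\gamma^{-1}(p_{4i+2}) &= N_{B_{i+1}}^{-1} N_{A_{i+1}} N_{H_i} \cdots N_{H_1} \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1} B_{i+1}, \\
\gamma^{-1}(p_{4i+3}) &= N_{A_{i+1}} N_{H_i} \cdots N_{H_1} \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1} B_{i+1} A_{i+1}, \\
N_{H_i} &= [N_{B_i}, N_{A_i}^{-1}], \qquad H_i = [B_i, A_i^{-1}],
\end{aligned}
\tag{86}
\]
which allows one to express the holonomies $A_i$, $B_i$ along the generators $a_i, b_i \in \pi_1(S_g)$ in terms of the Poincaré elements $N_{A_i}$, $N_{B_i}$ in the overlap condition (82) and vice versa,
\[
\begin{aligned}
A_i &= \gamma(p_0)\, N_{H_1}^{-1} \cdots N_{H_{i-1}}^{-1} N_{H_i}^{-1} \cdot N_{B_i} \cdot N_{H_{i-1}} \cdots N_{H_1}\, \gamma^{-1}(p_0), \\
B_i &= \gamma(p_0)\, N_{H_1}^{-1} \cdots N_{H_{i-1}}^{-1} N_{H_i}^{-1} \cdot N_{A_i} \cdot N_{H_{i-1}} \cdots N_{H_1}\, \gamma^{-1}(p_0),
\end{aligned}
\tag{87}
\]
\[
\begin{aligned}
N_{A_i} &= \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} B_i H_{i-1} \cdots H_1\, \gamma(p_0), \\
N_{B_i} &= \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_i H_{i-1} \cdots H_1\, \gamma(p_0).
\end{aligned}
\tag{88}
\]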
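The relation between the corner values of the trivialising function and the holonomies can be tested numerically. The sketch below is an illustration, not the construction of [13]: it picks random $SL(2,\mathbb{C})$ matrices as stand-ins for the holonomies (the helper `rand_sl2c` is hypothetical), builds $\gamma^{-1}(p_k)$ from the right-hand expressions $\gamma^{-1}(p_{4i+1}) = \gamma^{-1}(p_{4i}) A_{i+1}^{-1}$, $\gamma^{-1}(p_{4i+2}) = \gamma^{-1}(p_{4i}) A_{i+1}^{-1} B_{i+1}$, $\gamma^{-1}(p_{4i+3}) = \gamma^{-1}(p_{4i}) A_{i+1}^{-1} B_{i+1} A_{i+1}$, $\gamma^{-1}(p_{4i+4}) = \gamma^{-1}(p_{4i}) H_{i+1}^{-1}$, and then recovers the holonomies through (84). The closure constraint (85) is deliberately not imposed, since (84) holds corner by corner.

```python
import random
from functools import reduce


def mul(*ms):
    """Product of 2x2 matrices, multiplied left to right."""
    def mul2(m, n):
        return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return reduce(mul2, ms)


def inv(m):
    """Inverse of a 2x2 matrix of determinant 1 (its adjugate)."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def rand_sl2c(rng):
    """Hypothetical helper: a random determinant-1 matrix built from elementary factors."""
    x = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    y = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    t = complex(rng.uniform(0.5, 1.5), rng.uniform(-0.5, 0.5))
    return mul([[1, x], [0, 1]], [[1, 0], [y, 1]], [[t, 0], [0, 1 / t]])


def comm(b, a):
    """[B, A^{-1}] = B A^{-1} B^{-1} A, the convention of (85)."""
    return mul(b, inv(a), inv(b), a)


def corners(hol_a, hol_b, ginv0):
    """Return the list of gamma^{-1}(p_k), k = 0, ..., 4g, starting from gamma^{-1}(p_0)."""
    ginv = [ginv0]
    for a, b in zip(hol_a, hol_b):
        base = ginv[-1]                             # gamma^{-1}(p_{4i})
        ginv.append(mul(base, inv(a)))              # p_{4i+1}
        ginv.append(mul(base, inv(a), b))           # p_{4i+2}
        ginv.append(mul(base, inv(a), b, a))        # p_{4i+3}
        ginv.append(mul(base, inv(comm(b, a))))     # p_{4i+4}
    return ginv


def dist(m, n):
    """Largest entrywise difference between two matrices."""
    return max(abs(m[i][j] - n[i][j]) for i in range(2) for j in range(2))


rng = random.Random(0)
genus = 2
A = [rand_sl2c(rng) for _ in range(genus)]
B = [rand_sl2c(rng) for _ in range(genus)]
ginv = corners(A, B, rand_sl2c(rng))


def hol(k, l):
    """gamma(p_k) gamma(p_l)^{-1}, computed from the stored inverses."""
    return mul(inv(ginv[k]), ginv[l])
```

With these definitions, `hol(4*i-3, 4*i-4)` and `hol(4*i-2, 4*i-1)` both reproduce `A[i-1]`, while `hol(4*i-3, 4*i-2)` and `hol(4*i, 4*i-1)` both reproduce `B[i-1]`, in agreement with (84).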
Up to conjugation with the value $\gamma(p_0)$ of the trivialising function at the basepoint, the expressions (87) and (88) relating the holonomies $A_i$, $B_i$ and the group elements $N_{A_i}$, $N_{B_i}$ are of the same form. This reflects the fact that, up to conjugation with $\gamma(p_0)$, the elements $N_{A_i}$, $N_{B_i}$ are the holonomies along another system of generators $\tilde a_i, \tilde b_i \in \pi_1(S_g)$ pictured in Fig. 4 and given by
\[
\begin{aligned}
\tilde a_i &= h_1^{-1} \circ \ldots \circ h_i^{-1} \circ b_i \circ h_{i-1} \circ \ldots \circ h_1, \\
\tilde b_i &= h_1^{-1} \circ \ldots \circ h_i^{-1} \circ a_i \circ h_{i-1} \circ \ldots \circ h_1.
\end{aligned}
\tag{89}
\]
These generators are investigated in detail in [34], where it is shown that their representatives can be viewed as a dual graph for the curves representing $a_i, b_i \in \pi_1(S_g)$ and that they can be used to determine the intersection points of a general embedded curve on $S_g$ with the generators $a_i$, $b_i$. In the following we will therefore refer to the generators $\tilde a_i$, $\tilde b_i$ as the dual generators.

As the elements $N_{A_i}, N_{B_i} \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ in the overlap condition or, equivalently, the holonomies $A_i, B_i \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ contain all information about the physical state, they can be used to parametrise the phase space of the theory. Taking into account that these variables are subject to a constraint (85) and that gauge transformations on the surface $S_g$ act on the holonomies $A_i$, $B_i$ by simultaneous conjugation, one finds that the moduli space of flat $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$-connections on $S_g$ is given as the quotient
\[
\mathcal{M}_g^{\mathrm{Isom}(\mathbb{X}_{\Lambda,S})} = \{(A_1, B_1, \ldots, A_g, B_g) \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S})^{2g} \mid [B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] = 1\}/\mathrm{Isom}(\mathbb{X}_{\Lambda,S}). \tag{90}
\]
4.3. Phase space and Poisson structure.

The advantage of the Chern-Simons formulation of (2+1)-dimensional gravity is that it allows one to give a rather simple description of the Poisson structure on the phase space, which is based on its parametrisation (90) in terms of the holonomies along a set of generators $a_i, b_i \in \pi_1(S_g)$ [13, 12]. In the following we will use the formalism by Fock and Rosly [12], which is defined for Chern-Simons theory with a general gauge group $H$ and parametrises the Poisson structure on the moduli space $\mathcal{M}_g^H$ in terms of an auxiliary Poisson structure on the manifold $H^{2g}$. The description is summarised in the following theorem.

Theorem 4.1 (Fock, Rosly [12]). Consider Chern-Simons theory with gauge group $H$ on a manifold $\mathbb{R} \times S_g$. Denote by $T_a$, $a = 1, \ldots, \dim H$, a basis of the Lie algebra $\mathfrak{h} = \mathrm{Lie}\, H$ and by $t_{ab}$ the matrix representing the Ad-invariant symmetric bilinear form in the Chern-Simons action (69) with respect to this basis,
\[
t_{ab} = \langle T_a, T_b \rangle, \qquad t^{ab} t_{bc} = \delta^a_c. \tag{91}
\]
Let $r = r^{ab}\, T_a \otimes T_b$ be a classical $r$-matrix for the gauge group $H$, i. e. an element $r \in \mathfrak{h} \otimes \mathfrak{h}$ which satisfies the classical Yang-Baxter equation (CYBE)
\[
\begin{gathered}
[[r, r]] = [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] = 0, \\
r_{12} = r^{ab}\, T_a \otimes T_b \otimes 1, \quad r_{13} = r^{ab}\, T_a \otimes 1 \otimes T_b, \quad r_{23} = r^{ab}\, 1 \otimes T_a \otimes T_b,
\end{gathered}
\tag{92}
\]
and whose symmetric part is dual to the bilinear form (91),
\[
r = r_{(s)} + r_{(a)}, \qquad
r_{(a)} = \tfrac{1}{2}(r^{ab} - r^{ba})\, T_a \otimes T_b, \qquad
r_{(s)} = \tfrac{1}{2}(r^{ab} + r^{ba})\, T_a \otimes T_b = \tfrac{1}{2} t^{ab}\, T_a \otimes T_b. \tag{93}
\]
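Consider the manifold $H^{2g}$, where the different copies of $H$ are identified with the holonomies $A_i, B_i \in H$ along a set of generators of the fundamental group $\pi_1(S_g)$, and denote by $R_a^X$, $L_a^X$, $X \in \{A_1, B_1, \ldots, A_g, B_g\}$, the left- and right-invariant vector fields associated to a basis of $\mathfrak{h}$ and the different components of $H^{2g}$,
\[
\begin{aligned}
L_a^X f(A_1, \ldots, B_g) &= \frac{d}{dt}\Big|_{t=0} f(\ldots, e^{-tT_a} \cdot X, \ldots), \\
R_a^X f(A_1, \ldots, B_g) &= \frac{d}{dt}\Big|_{t=0} f(\ldots, X \cdot e^{tT_a}, \ldots).
\end{aligned}
\tag{94}
\]
Then, the bivector $B \in \mathrm{Vec}(H^{2g}) \otimes \mathrm{Vec}(H^{2g})$,
\[
\begin{aligned}
B ={}& r_{(a)}^{ab} \Bigl( \sum_{j=1}^{g} R_a^{A_j} + L_a^{A_j} + R_a^{B_j} + L_a^{B_j} \Bigr) \otimes \Bigl( \sum_{j=1}^{g} R_b^{A_j} + L_b^{A_j} + R_b^{B_j} + L_b^{B_j} \Bigr) \\
&+ \tfrac{1}{2} t^{ab} \sum_{i,j=1,\ i<j}^{g} \bigl(R_a^{A_i} + L_a^{A_i} + R_a^{B_i} + L_a^{B_i}\bigr) \wedge \bigl(R_b^{A_j} + L_b^{A_j} + R_b^{B_j} + L_b^{B_j}\bigr) \\
&+ \tfrac{1}{2} t^{ab} \sum_{i=1}^{g} R_a^{A_i} \wedge \bigl(R_b^{B_i} + L_b^{A_i} + L_b^{B_i}\bigr) + R_a^{B_i} \wedge \bigl(L_b^{A_i} + L_b^{B_i}\bigr) + L_a^{A_i} \wedge L_b^{B_i},
\end{aligned}
\tag{95}
\]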
defines a Poisson structure on $H^{2g}$. After imposing the constraint (85) and dividing by the associated gauge transformations, which act by simultaneous conjugation of all components with $H$, this Poisson structure agrees with the canonical Poisson structure on the moduli space
\[
\mathcal{M}_g^H = \{(A_1, B_1, \ldots, A_g, B_g) \in H^{2g} \mid [B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] = 1\}/H. \tag{96}
\]
In Fock and Rosly's formalism, physical observables are given by functions on the manifold $H^{2g}$ which are invariant under simultaneous conjugation of all components with the gauge group $H$. Note that the Poisson bracket of such observables with a general function $g \in C^\infty(H^{2g})$ does not depend on the particular choice of the classical $r$-matrix but only on the matrix $t^{ab}$ representing the Ad-invariant, symmetric bilinear form in the Chern-Simons action. As the component of the bivector (95) which depends on the antisymmetric component $r_{(a)}$ is proportional to terms of the form $\sum_{i=1}^{g} R_a^{A_i} + L_a^{A_i} + R_a^{B_i} + L_a^{B_i}$, this contribution vanishes if one of the functions is invariant under simultaneous conjugation of its arguments with $H$.

A particular set of physical observables, in the following referred to as Wilson loop observables, are conjugation invariant functions of the holonomies along closed curves on $S_g$. As the equations of motion are a flatness condition on the gauge field, these observables do not depend on the curve itself but only on its homotopy equivalence class in $\pi_1(S_g)$ and are invariant under a change of the basepoint. In Fock and Rosly's formalism, these observables are described by expressing the holonomy along an element $\eta \in \pi_1(S_g)$ as a product in the holonomies along the generators $a_i, b_i \in \pi_1(S_g)$,
\[
\eta = x_r^{\alpha_r} \cdots x_1^{\alpha_1}, \quad x_k \in \{a_1, \ldots, b_g\},\ \alpha_k \in \{\pm 1\}
\quad\Rightarrow\quad
H_\eta = X_r^{\alpha_r} \cdots X_1^{\alpha_1}. \tag{97}
\]
The Wilson loop observable $f_\eta \in C^\infty(H^{2g})$ associated to $\eta \in \pi_1(S_g)$ and a general conjugation invariant function $f \in C^\infty(H)$ on the gauge group is then given by
\[
f_\eta : (A_1, \ldots, B_g) \mapsto f(H_\eta) \tag{98}
\]
with $H_\eta$ given as a product in the elements $A_i$, $B_i$ and their inverses as in (97). It follows directly that the Wilson loop observables are invariant under simultaneous conjugation of all holonomies $A_i$, $B_i$ with elements of $H$ and satisfy
\[
f_{\tau \circ \eta \circ \tau^{-1}} = f_\eta \qquad \forall \eta, \tau \in \pi_1(S_g). \tag{99}
\]
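The two invariance properties of Wilson loop observables can be illustrated with a small numerical experiment. The sketch below makes several assumptions that are not part of the text: it takes $H = SL(2,\mathbb{C})$, uses the trace as the conjugation invariant function $f$, represents a word $\eta = x_r^{\alpha_r} \cdots x_1^{\alpha_1}$ as a Python list of `(name, power)` pairs written in the same left-to-right order, and uses a hypothetical helper `rand_sl2c` for test data.

```python
import random
from functools import reduce


def mul(*ms):
    """Product of 2x2 matrices, multiplied left to right."""
    def mul2(m, n):
        return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return reduce(mul2, ms)


def inv(m):
    """Inverse of a 2x2 matrix of determinant 1 (its adjugate)."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]


def rand_sl2c(rng):
    """Hypothetical helper: a random determinant-1 matrix built from elementary factors."""
    x = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    y = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    t = complex(rng.uniform(0.5, 1.5), rng.uniform(-0.5, 0.5))
    return mul([[1, x], [0, 1]], [[1, 0], [y, 1]], [[t, 0], [0, 1 / t]])


def holonomy(word, gens):
    """H_eta = X_r^{alpha_r} ... X_1^{alpha_1} for a non-empty word, cf. (97)."""
    factors = [gens[name] if power == 1 else inv(gens[name]) for name, power in word]
    return mul(*factors)


def inverse_word(word):
    """Word representing the inverse element: reverse the order and flip the powers."""
    return [(name, -power) for name, power in reversed(word)]


def wilson_loop(word, gens):
    """f_eta of (98) with f = trace, a conjugation invariant function on SL(2, C)."""
    h = holonomy(word, gens)
    return h[0][0] + h[1][1]


rng = random.Random(1)
gens = {name: rand_sl2c(rng) for name in ("a1", "b1", "a2", "b2")}
eta = [("b2", 1), ("a1", -1), ("b1", 1), ("a2", 1)]
tau = [("a1", 1), ("b2", -1)]

# Simultaneous conjugation of all holonomies with a fixed group element c.
c = rand_sl2c(rng)
conjugated = {name: mul(c, m, inv(c)) for name, m in gens.items()}

value = wilson_loop(eta, gens)
value_conjugated = wilson_loop(eta, conjugated)
value_shifted = wilson_loop(tau + eta + inverse_word(tau), gens)
```

Up to rounding, `value`, `value_conjugated` and `value_shifted` agree: the second reflects gauge invariance, the third is an instance of (99).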
In order to apply Fock and Rosly's description [12] of phase space and Poisson structure to the Chern-Simons formulation of (2+1)-dimensional gravity, one needs classical $r$-matrices for the Lie algebras $\mathfrak{h}_{\Lambda,S}$ such that the symmetric components of these $r$-matrices agree with the Ad-invariant, symmetric bilinear forms (70). For Lorentzian signature, such a classical $r$-matrix is given by
\[
r = P_a^L \otimes J_L^a + n^a \epsilon_{abc}\, J_L^b \otimes J_L^c \tag{100}
\]
with a constant vector $n = (n^0, n^1, n^2) \in \mathbb{R}^3$ satisfying $\eta_L(n, n) = \Lambda$. The corresponding $r$-matrix for the Euclidean case with $\Lambda < 0$ has the form
\[
r_E = P_a^E \otimes J_E^a + n^a \epsilon_{abc}\, J_E^b \otimes J_E^c \tag{101}
\]
with a constant vector $n = (n^0, n^1, n^2) \in \mathbb{R}^3$ satisfying $\eta_E(n, n) = |\Lambda|$. Note that the choice of the classical $r$-matrix and hence the Poisson structure (95) is not necessarily unique; for a list of classical $r$-matrices for the (2+1)-dimensional Poincaré algebra see [35]. However, in the following we will only consider Poisson brackets where at least one of the functions is a Wilson loop observable so that our results do not depend on this choice.
5. Trivialisation and Embedding

As discussed in Sect. 3 and Sect. 4.2, the absence of local gravitational degrees of freedom manifests itself in the geometrical and the Chern-Simons description of (2+1)-spacetimes in, respectively, the embedding of simply connected regions into the model spacetimes $\mathbb{X}_{\Lambda,S}$ and the trivialisation of the Chern-Simons gauge field. In this section, we discuss the relation between these concepts and show how the embedding of spacetime regions can be constructed from the function trivialising the Chern-Simons gauge field.

We consider a simply connected region $R \subset \mathbb{R} \times S_g$ in the spacetime manifold with a metric $g$ and a flat Chern-Simons gauge field $A$, related to the metric via (66). We denote by $X_{\Lambda,S} : R \to \mathbb{X}_{\Lambda,S}$ the embedding into the model spacetime $\mathbb{X}_{\Lambda,S}$ and by $\gamma : R \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ a function which trivialises the gauge field as in (79). The decomposition (68) of the gauge field in terms of the generators $J_a^S$, $P_a^S$ then implies that the dreibein $e_a$ and the spin connection $\omega_a$ are given by
\[
e_a = \langle \gamma\, d\gamma^{-1}, J_a^S \rangle, \qquad \omega_a = \langle \gamma\, d\gamma^{-1}, P_a^S \rangle, \tag{102}
\]
where $J_a^S$, $P_a^S$ denote the generators of the Lie algebras (3), (4) and $\langle\,,\rangle$ the Ad-invariant bilinear form (70) in the Chern-Simons action. The expression (66) for the metric in terms of the dreibein then implies that the metric $g$ on $R$ takes the form
\[
g = \eta^S_{ab}\, e^a \otimes e^b = \eta_S^{ab}\, \langle \gamma\, d\gamma^{-1}, J_a^S \rangle \langle \gamma\, d\gamma^{-1}, J_b^S \rangle = -4 \det(\eta^S)\, \det\bigl(\langle \gamma\, d\gamma^{-1}, J_S^a \rangle J_a^S\bigr). \tag{103}
\]
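On the other hand, the metric $g$ must agree with the pull-back of the metric in the model spacetime $\mathbb{X}_{\Lambda,S}$ via the embedding $X_{\Lambda,S}$. To relate the trivialising function $\gamma : R \to \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ to the embedding $X_{\Lambda,S} : R \to \mathbb{X}_{\Lambda,S}$, one therefore has to construct a function $\Pi_{\Lambda,S} : \mathrm{Isom}(\mathbb{X}_{\Lambda,S}) \to \mathbb{X}_{\Lambda,S}$ from the isometry group into the model spacetime such that the pull-back of the metric in the model spacetime via $\Pi_{\Lambda,S} \circ \gamma^{-1}$ agrees with the metric (103),
\[
d(\Pi_{\Lambda,S} \circ \gamma^{-1})^2 = \eta_S^{ab}\, \langle \gamma\, d\gamma^{-1}, J_a^S \rangle \langle \gamma\, d\gamma^{-1}, J_b^S \rangle. \tag{104}
\]
Furthermore, as the embedding of the region $R$ into the model spacetime $\mathbb{X}_{\Lambda,S}$ is only defined up to a global action of the isometry group $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$, two embeddings related by such an action of the isometry group should correspond to the same gauge field on $R$. This is the case if and only if the action of $\mathrm{Isom}(\mathbb{X}_{\Lambda,S})$ on $\mathbb{X}_{\Lambda,S}$ corresponds to left-multiplication of the trivialising function, $\gamma^{-1} \mapsto N \gamma^{-1}$, $N \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S})$, i. e. if the function $\Pi_{\Lambda,S} : \mathrm{Isom}(\mathbb{X}_{\Lambda,S}) \to \mathbb{X}_{\Lambda,S}$ satisfies the condition
\[
\Pi_{\Lambda,S}(N \gamma^{-1}) = N\, \Pi_{\Lambda,S}(\gamma^{-1}) \qquad \forall N \in \mathrm{Isom}(\mathbb{X}_{\Lambda,S}). \tag{105}
\]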
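This suggests that the functions $\Pi_{\Lambda,S} : \mathrm{Isom}(\mathbb{X}_{\Lambda,S}) \to \mathbb{X}_{\Lambda,S}$ should be defined as
\[
\begin{aligned}
\Pi_{0,L}(v, x) &= x & &\forall (v, x) \in PSU(1,1) \ltimes \mathbb{R}^3, \\
\Pi_{\Lambda>0,L}(v_+, v_-) &= \tfrac{1}{\sqrt{\Lambda}}\, v_+ v_-^{-1} & &\forall (v_+, v_-) \in PSU(1,1) \times PSU(1,1), \\
\Pi_{\Lambda<0,L}(v) &= \tfrac{1}{\sqrt{|\Lambda|}}\, v v^{\circ} & &\forall v \in SL(2,\mathbb{C})/\mathbb{Z}_2, \\
\Pi_{\Lambda<0,E}(v) &= \tfrac{1}{\sqrt{|\Lambda|}}\, v v^{\dagger} & &\forall v \in SL(2,\mathbb{C})/\mathbb{Z}_2,
\end{aligned}
\tag{106}
\]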
since this ensures that identity (105) is satisfied. It remains to show that these functions yield the right metric on R, i. e. that identity (104) holds for each value of the cosmological constant and each signature of the spacetime. The simplest case is the one with Lorentzian signature and vanishing cosmological constant, which is investigated in [14, 23]. Parametrising the trivialising function as γ −1 = (v, x) with v : R → P SU (1, 1), x : R → R3 and using the group multiplication law (11), one finds γ dγ −1 = ωa JaL + ea PaL = v −1 dv + v −1 d x
ea PaL = v −1 d x, ωa JaL = v −1 dv. (107)
The Lorentz invariance of the (2+1)-dimensional Minkowski metric then implies that the metric given by (103) agrees with the pull-back of the (2+1)-dimensional Minkowski metric via the composition of the $\Lambda = 0$ map in (106) with $\gamma^{-1}$,
\[ g = \eta^{ab}_L\,(v^{-1}dx)_a\,(v^{-1}dx)_b = -dx_0^2 + dx_1^2 + dx_2^2. \tag{108} \]
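As a consistency check of the $\Lambda = 0$ case, the following sketch verifies condition (105) and the metric (108) for the map $(v, x) \mapsto x$ of (106). It assumes that the multiplication law (11), which is not reproduced in this section, has the standard semidirect product form $(v_1, x_1)\cdot(v_2, x_2) = (v_1 v_2, x_1 + \mathrm{Ad}(v_1)x_2)$ and that $N = (u, a)$ acts on Minkowski space by $x \mapsto \mathrm{Ad}(u)x + a$. The name $\phi$ for the map is introduced only for this illustration.

```latex
\begin{align*}
N\gamma^{-1} &= (u,a)\cdot(v,x) = \bigl(uv,\; a+\mathrm{Ad}(u)x\bigr),\\
\phi(N\gamma^{-1}) &= a+\mathrm{Ad}(u)x = N\,\phi(\gamma^{-1}), \qquad \phi(v,x) := x,\\
\gamma\,d\gamma^{-1} &= (v,x)^{-1}\,d(v,x) = \bigl(v^{-1}dv,\; \mathrm{Ad}(v^{-1})dx\bigr),\\
g &= \eta\bigl(\mathrm{Ad}(v^{-1})dx,\,\mathrm{Ad}(v^{-1})dx\bigr) = \eta(dx,dx) = -dx_0^2+dx_1^2+dx_2^2 .
\end{align*}
```

The last equality uses only that $\mathrm{Ad}(v^{-1})$ is a Lorentz transformation, which is the Lorentz invariance invoked in the text.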
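For Lorentzian signature and $\Lambda > 0$, the trivialising function can be parametrised as $\gamma^{-1} = (\gamma_+, \gamma_-) : R \to PSU(1,1) \times PSU(1,1)$. Using the relation between the generators $J_a, P_a$ of the Lie algebra (3) and the alternative generators $J_a^\pm$ defined by (12), we find that the gauge field is given by
\[ \gamma\,d\gamma^{-1} = (\omega^a + \sqrt{\Lambda}\,e^a)J_a^+ + (\omega^a - \sqrt{\Lambda}\,e^a)J_a^- = \gamma_+^{-1}d\gamma_+ + \gamma_-^{-1}d\gamma_-, \tag{109} \]
and the expression for the dreibein in terms of $\gamma$ takes the form
\[ e_a = \frac{1}{2\sqrt{\Lambda}}\bigl\langle \gamma_+^{-1}d\gamma_+ - \gamma_-^{-1}d\gamma_-,\, J_a^L\bigr\rangle = \frac{1}{2\sqrt{\Lambda}}\bigl\langle \gamma_-^{-1}\cdot(\gamma_+\gamma_-^{-1})^{-1}d(\gamma_+\gamma_-^{-1})\cdot\gamma_-,\, J_a^L\bigr\rangle. \tag{110} \]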
To prove that the metric defined by this dreibein via (103) agrees with the one induced by the metric on the model spacetime $X_{\Lambda,S}$, we use the following lemma, which can be proved by direct calculation.

Lemma 5.1. For general $M \in SU(1,1)$ parametrised as in (9), we have
\[ M^{-1}dM = 2e^a J_a^L \;\Rightarrow\; \det dM = \det(M^{-1}dM) = e_0^2 - e_1^2 - e_2^2 = |da|^2 - |db|^2. \tag{111} \]
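The determinant identities in Lemma 5.1 can be spot-checked numerically. The sketch below assumes that the parametrisation (9), which is not shown in this section, is the usual one, $M = \begin{pmatrix} a & b \\ \bar b & \bar a\end{pmatrix}$ with $|a|^2 - |b|^2 = 1$, and evaluates the matrix of one-forms $dM$ on a single tangent direction. The helper names `su11` and `su11_velocity` are hypothetical, and the relation to $e_0, e_1, e_2$ is not tested because it depends on the conventions for $J_a^L$.

```python
import numpy as np

def su11(r, phi, psi):
    """SU(1,1) matrix [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1 (assumed form of (9))."""
    a = np.cosh(r) * np.exp(1j * phi)
    b = np.sinh(r) * np.exp(1j * psi)
    return np.array([[a, b], [np.conj(b), np.conj(a)]])

def su11_velocity(r, phi, psi, dr, dphi, dpsi):
    """Exact derivative of su11 along the direction (dr, dphi, dpsi); returns (da, db, dM)."""
    da = (np.sinh(r) * dr + 1j * np.cosh(r) * dphi) * np.exp(1j * phi)
    db = (np.cosh(r) * dr + 1j * np.sinh(r) * dpsi) * np.exp(1j * psi)
    dM = np.array([[da, db], [np.conj(db), np.conj(da)]])
    return da, db, dM

point = (0.7, 0.3, -1.1)       # arbitrary point on the group
direction = (0.4, -0.2, 0.9)   # arbitrary tangent direction

M = su11(*point)
da, db, dM = su11_velocity(*point, *direction)

det_dM = np.linalg.det(dM)                          # det dM
det_MinvdM = np.linalg.det(np.linalg.inv(M) @ dM)   # det(M^{-1} dM)
quadratic = abs(da) ** 2 - abs(db) ** 2             # |da|^2 - |db|^2
```

The three quantities coincide because $\det M = 1$, so multiplying $dM$ by $M^{-1}$ does not change the determinant.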
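Applying this lemma to $\gamma_+\gamma_-^{-1}$ together with expression (110) for the dreibein and using the Ad-invariance of the determinant, we find that the metric defined by (103) agrees with the pull-back of the AdS-metric via the embedding obtained by composing the $\Lambda > 0$ map in (106) with $\gamma^{-1}$,
\[ g = -\frac{1}{\Lambda}\det\bigl(d(\gamma_+\gamma_-^{-1})\bigr) = \eta^{ab}\,\langle \gamma\,d\gamma^{-1}, J_a^L\rangle\,\langle \gamma\,d\gamma^{-1}, J_b^L\rangle. \tag{112} \]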
For Lorentzian and Euclidean signature and $\Lambda < 0$, the gauge field takes the form
\[ \gamma\,d\gamma^{-1} = \omega^a J_a^S + e^a P_a^S = (\omega^a + i\sqrt{|\Lambda|}\,e^a)J_a^S. \tag{113} \]
To prove that the metric defined via (103) agrees with the pull-back of the metric on the model spacetime via the corresponding map in (106) composed with $\gamma^{-1}$, we apply the following lemma to $M = \gamma^{-1}(\gamma^{-1})^\circ$ and $M = \gamma^{-1}(\gamma^{-1})^\dagger$.
732
C. Meusburger
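Lemma 5.2. For general $M \in SL(2,\mathbb{C})$ we have
\[ M^{-1}dM = (\omega^a + i\sqrt{|\Lambda|}\,e^a)J_a^L \;\Rightarrow\; -e_0^2 + e_1^2 + e_2^2 = \frac{1}{|\Lambda|}\det\bigl(d(MM^\circ)\bigr), \tag{114} \]
\[ M^{-1}dM = (\omega^a + i\sqrt{|\Lambda|}\,e^a)J_a^E \;\Rightarrow\; e_0^2 + e_1^2 + e_2^2 = -\frac{1}{|\Lambda|}\det\bigl(d(MM^\dagger)\bigr). \tag{115} \]
Proof. The proof is a straightforward calculation. Parametrising the matrix $M \in SL(2,\mathbb{C})$ as
\[ M = \begin{pmatrix} u & v \\ w & z \end{pmatrix}, \qquad uz - vw = 1, \tag{116} \]
we find that the $SL(2,\mathbb{C})$ matrices $MM^\circ$ and $MM^\dagger$ are given by
\[ MM^\circ = \begin{pmatrix} |u|^2 - |v|^2 & v\bar z - u\bar w \\ -(\bar v z - \bar u w) & |z|^2 - |w|^2 \end{pmatrix}, \qquad MM^\dagger = \begin{pmatrix} |u|^2 + |v|^2 & v\bar z + u\bar w \\ \bar v z + \bar u w & |z|^2 + |w|^2 \end{pmatrix}, \tag{117} \]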
and, after some computation det(d(M M ◦ )) = |wdu|2 + |udw|2 + |zdv|2 + |vdz|2 − 2|udz − vdw|2 +2Re(dudz − dvdw − v z¯ d vd ¯ z¯ − u wd ¯ udw), ¯ † 2 2 2 2 det(d(M M )) = −(|wdu| + |udw| + |zdv| + |vdz| ) − 2|udz − vdw|2 +2Re(dudz − dvdw + v z¯ d vd ¯ z¯ + u wd ¯ udw). ¯
(118) (119)
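The matrix products in (117) are easy to confirm numerically. The sketch below assumes that the involution is $M^\circ = \sigma M^\dagger \sigma$ with $\sigma = \mathrm{diag}(1,-1)$; this definition is not stated in this section and is chosen here because it reproduces the entries of (117). The function names are hypothetical.

```python
import numpy as np

SIGMA = np.diag([1.0, -1.0])

def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled to determinant 1."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))

def circ(m):
    """Assumed involution: sigma @ (conjugate transpose of m) @ sigma."""
    return SIGMA @ m.conj().T @ SIGMA

rng = np.random.default_rng(0)
M = random_sl2c(rng)
(u, v), (w, z) = M  # entries as in (116)

# Right-hand sides of (117), written out entry by entry.
MMcirc_expected = np.array([
    [abs(u) ** 2 - abs(v) ** 2, v * np.conj(z) - u * np.conj(w)],
    [-(np.conj(v) * z - np.conj(u) * w), abs(z) ** 2 - abs(w) ** 2],
])
MMdag_expected = np.array([
    [abs(u) ** 2 + abs(v) ** 2, v * np.conj(z) + u * np.conj(w)],
    [np.conj(v) * z + np.conj(u) * w, abs(z) ** 2 + abs(w) ** 2],
])

MMcirc = M @ circ(M)
MMdag = M @ M.conj().T
```

Both products have unit determinant, and $MM^\dagger$ is Hermitian while $MM^\circ$ is Hermitian up to conjugation by $\sigma$.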
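On the other hand, expanding $M^{-1}dM = (\omega^a + i\sqrt{|\Lambda|}\,e^a)J_a^L$ yields
\[ \begin{aligned} \sqrt{|\Lambda|}\,e_0 &= 2\,\mathrm{Re}(u\,dz - w\,dv),\\ \sqrt{|\Lambda|}\,e_1 &= -\mathrm{Re}(u\,dw - w\,du) + \mathrm{Re}(z\,dv - v\,dz),\\ \sqrt{|\Lambda|}\,e_2 &= \mathrm{Im}(u\,dw - w\,du) + \mathrm{Im}(z\,dv - v\,dz), \end{aligned} \tag{120} \]
while the corresponding expressions for the Euclidean case are given by
\[ \begin{aligned} \sqrt{|\Lambda|}\,e_0 &= -2\,\mathrm{Re}(u\,dz - w\,dv),\\ \sqrt{|\Lambda|}\,e_1 &= \mathrm{Re}(u\,dw - w\,du) + \mathrm{Re}(z\,dv - v\,dz),\\ \sqrt{|\Lambda|}\,e_2 &= -\mathrm{Im}(u\,dw - w\,du) + \mathrm{Im}(z\,dv - v\,dz). \end{aligned} \tag{121} \]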
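The relation between $u\,dz - w\,dv$ and $v\,dw - z\,du$ used in the remaining computation is simply the differential of the determinant constraint in (116), as the following short derivation shows.

```latex
\begin{align*}
0 = d(uz - vw) &= u\,dz + z\,du - v\,dw - w\,dv\\
\Longrightarrow\quad u\,dz - w\,dv &= v\,dw - z\,du .
\end{align*}
```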
After some further computation using udz − wdv = vdw − zdu we obtain (114), (115). Hence, we have shown for all values of the cosmological constant and all signatures under consideration that the maps ,S : Isom(X,S ) → X,S defined in (106) satisfy the identities (104) and (105). The embedding into the model spacetimes X,S characterised by these conditions is thus given by composing these maps with the trivialising function γ −1 : R → Isom(X,S ), X ,S = ,S ◦ γ −1 : R → X,S .
(122)
6. Grafting in the Chern-Simons Formalism

6.1. Embedding into the regular domain and action of the group $\Gamma$. After deriving explicit expressions which relate the embedding of a spacetime region into the model spacetimes $X_{\Lambda,S}$ to the function trivialising the Chern-Simons gauge field, we will now apply these results to investigate the construction of (2+1)-spacetimes via grafting from the Chern-Simons viewpoint. The reasoning is similar to the one in [23] but does not make use of the simplifications specific to Lorentzian spacetimes with vanishing cosmological constant. To see how grafting manifests itself on the phase space of the associated Chern-Simons theory, one needs to determine how the variables parametrising the phase space, the holonomies $A_i, B_i$ along a set of generators of the fundamental group $\pi_1(S_g)$, transform under the grafting construction. This requires relating these holonomies to the variables which encode the physical degrees of freedom in the geometrical description. For this, we recall that in the geometrical formulation of (2+1)-dimensional gravity, spacetimes are given as quotients of regular domains $U_{\Lambda,S} \subset X_{\Lambda,S}$ by the action of a cocompact Fuchsian group $\Gamma$ via a homomorphism $h^G_{\Lambda,S} : \Gamma \to \mathrm{Isom}(X_{\Lambda,S})$. This action leaves the surfaces $U_{\Lambda,S}(T)$ of constant cosmological time $T$ invariant, and the spacetime is given by identifying on each surface $U_{\Lambda,S}(T)$ the points related by this action of $\Gamma$. The physical degrees of freedom are therefore encoded in the cocompact Fuchsian group $\Gamma$ and the group homomorphism $h^G_{\Lambda,S} : \Gamma \to \mathrm{Isom}(X_{\Lambda,S})$. In the Chern-Simons formalism, the physical degrees of freedom are given by the holonomies $A_i, B_i \in \mathrm{Isom}(X_{\Lambda,S})$ or, equivalently, the elements $N_{A_i}, N_{B_i} \in \mathrm{Isom}(X_{\Lambda,S})$ which arise in the overlap condition (83).
By cutting the manifold along the representatives of the generators of the fundamental group and trivialising the gauge field on the resulting region, one obtains a set of functions γ (x 0 , ·) : Pg → Isom(X,S ) on a 4g-gon Pg . The values of γ −1 at the two sides corresponding to a given generator are related by left-multiplication with the elements N Ai , N Bi , and it is shown in Sect. 5 that the left multiplication of the trivialising function γ −1 with elements of the isometry group Isom(X,S ) corresponds to the action of this group on the model space time X,S . This suggests identifying the parameter x 0 in the splitting (74) of the Chern-Simons gauge field with the cosmological time T , identifying the generators ai , bi ∈ π1 (Sg ) with the projection of the geodesics on U,S (T ) which are identified by the action of the generators v Ai , v Bi ∈ and to take the corner p0 of the resulting polygon Pg as the basepoint for the grafting. With these identifications, the embedding constructed from the trivialising function γ (x 0 , ·) : Pg → Isom(X,S ) for constant x 0 = T then maps the polygon Pg into the surface U,S (T ) of constant cosmological time T . As the sides of the embedded polygon in U,S (T ) are identified pairwise by the action h ,S (vY ), Y ∈ {A1 , B1 , . . . , A g , Bg }, of the generators of , this implies that the group elements in the overlap condition (82) must agree with the image of these generators under the group homomorphism h ,S : → Isom(X,S ), NY = h ,S (vY )
∀Y ∈ {A1 , B1 , . . . , A g , Bg }.
(123)
Using the formula (86) which expresses the value of the trivialising function at the corners pi of the polygon Pg in terms of the elements N Ai , N Bi and the value of γ at a point p0 ∈ Pg we can then determine the holonomies Ai , Bi and find that they are given by −1 −1 Ai = γ ( p0 ) · h ,S (v −1 H1 · · · v Hi v Bi v Hi−1 · · · v H1 ) · γ ( p0 ) , −1 −1 Bi = γ ( p0 ) · h ,S (v −1 H1 · · · v Hi v Ai v Hi−1 · · · v H1 ) · γ ( p0 ) .
(124)
6.2. The transformation of the holonomies under grafting. We will now use the relation between the phase space variables N Ai , N Bi in the Chern-Simons formalism and the group homomorphism h ,S : → Isom(X,S ) in the geometrical description to derive the transformation of the holonomies Ai , Bi along a set of generators of the fundamental group under grafting. We start by considering the static universes associated to the cocompact Fuchsian group . As discussed in Sect. 3.2, the surfaces of constant cosmological time are then copies of hyperbolic space H2 embedded into the model spacetime via the maps : H2 → X,S defined in (50). The polygon Pg obtained by cutting the spatial sur,S T T (P ) in the tessellation face Sg is embedded into the image fundamental polygon ,S of U,S (T ) induced by T X st ,S (T, ·) : Pg → ,S (P ) ⊂ U,S (T ).
(125)
The group homomorphism h st ,S : → Isom(X,S ) is given by the canonical embedding of P SU (1, 1) into the isometry group of the model spacetime h st ,S (v) = ı,S (v). Hence, using the identities (123) and (124), we find that, up to conjugation with the value of γ at the basepoint p0 ∈ Sg , all holonomies Ai , Bi are purely Lorentzian −1 −1 Ai = γ ( p0 ) · ı,S (v −1 H1 · · · v Hi v Bi v Hi−1 · · · v H1 ) · γ ( p0 ) ,
(126)
−1 −1 Bi = γ ( p0 ) · ı,S (v −1 H1 · · · v Hi v Ai v Hi−1 · · · v H1 ) · γ ( p0 ) .
We now consider the spacetimes obtained from these static spacetimes by grafting along a closed, simple geodesic η on H2k / with weight w. Again, the identification of the parameter x 0 in (74) with the cosmological time implies that for each value of the parameη ter x 0 = T the polygon Pg is embedded into a surface U,S (T ) of constant cosmological time. However, these surfaces are no longer copies of hyperbolic space but the deformed surfaces obtained by inserting a strip along each geodesic in the -invariant multicurve on U,S (T ) associated to η. The cocompact Fuchsian group acts on these surfaces η of constant cosmological time via the group homomorphism h ,S : → Isom(X,S ) defined by (62),(64), and the group elements in the overlap condition are given by η NY = h ,S (vY ). To derive a formula for the transformation of the holonomies Ai , Bi along the generators of the fundamental group, we consider a generic side y of the polygon Pg with starting point and endpoint piY , p Yf ∈ { p0 , . . . p4g }. Denoting by viY , v Yf ∈ the elements of the cocompact Fuchsian group that relate the value of the trivialising function γ −1 at the points piY , p Yf to its value at p0 in the static case γst−1 ( piY ) = ı,S (viY )γst−1 ( p0 )
γst−1 ( p Yf ) = ı,S (v Yf )γst−1 ( p0 ),
(127)
we can express the holonomy along y as η
η
Y = γ ( p Yf )γ −1 ( piY ) = γ ( p0 )h ,S (v Yf )−1 h ,S (viY )γ ( p0 )−1 .
(128)
η
Using identity (64) for the group homomorphism h ,S and the identities (63) we then obtain Y = γ ( p0 )ı,S (v Yf )−1 Bη,,S (q0 , v Yf q0 )−1 Bη,,S (q0 , viY q0 )ı,S (viY )γ ( p0 )−1 (129) = γst ( p Yf )Bη,,S (viY q0 , v Yf q0 )−1 γst ( piY )−1 = Yst · γst ( piY )Bη,,S (viY q0 , v Yf q0 )−1 γst ( piY )−1 ,
where γst , Yst denote, respectively, the trivialising function and the holonomy along y in the static universe associated to , q0 ∈ H2 the basepoint for the grafting, which we took to coincide with the embedding of the corner p0 and Bη,,S is given by (62). Setting piAi = p4(i−1) , p Af i = p4i−3 and piBi = p4i−2 , p Bf i = p4i−3 in (129) and using expression (86) for the value of the trivialising function γ at the corners of the polygon Pg , we find that the group elements viAi , viBi , v Af i , v Bf i are given by viAi = v Hi−1 · · · v H1 , viBi = v −1 Bi v Ai v Hi−1 · · · v H1 ,
−1 v Af i = v −1 Ai v Bi v Ai v Hi−1 · · · v H1 , −1 v Bf i = v −1 Ai v Bi v Ai v Hi−1 · · · v H1 ,
(130)
and that the transformation of the holonomies Ai , Bi under grafting along η takes the form st st Aist → Aist ·(Hi−1 · · · H1st ) γ ( p0 )Bη,,S (viAi q0 , v Af i q0 )−1 γ −1 ( p0 ) (Hi−1 · · · H1st )−1 ,
(131) st · · · H1st ) γ ( p0 )Bη,,S (viBi q0 , v Bf i q0 )−1 γ −1 ( p0 ) Bist → Bist · (Bist −1 Aist Hi−1 st ×(B st −1 Aist Hi−1 · · · H1st )−1 .
We will now evaluate this formula for the case of a simple, closed geodesic η with weight w on H2k / using the concrete expression (62) for the map Bη,,S : H2 × H2 → Isom(X,S ). As discussed in Sect. 3, the geodesic η lifts to a -invariant multicurve on H2k , H2
k = {Ad(v)cη | v ∈ }, G η,
(132)
where cη : R → H2k , cη (0) ∈ P is the lift of η with basepoint in the fundamental polygon P ⊂ H2k and with translation element a
cη (1) = vη cη (0) = en η Ja ,
cη (t) = vcη (0) ∀v ∈ , t ∈ (0, 1).
(133)
In order to evaluate the formula (62) for the multicurve (132), we need to determine which geodesics in (132) intersect a given side of the fundamental polygon P and to derive their translation vectors. For this, we note that a geodesic c = vcη , v ∈ , intersects a side of P if and only if cη intersects the corresponding side of the polygon P = v −1 P . Hence, the intersection points of geodesics in the multicurve (132) with the sides of P are in one-to-one correspondence with intersection points of cη |[0,1) : [0, 1) → H2k with polygons in the tessellation of H2k induced by . Furthermore, these intersection points are labelled by the factors in the expression of the translation element (133) as a product in the generators v Ai , v Bi of , which can be seen as follows. Because cη is the lift of closed, simple geodesic η on H2k / , it traverses a sequence of polygons in the tessellation of H2k induced by P1 = P ,
P2 = vr P ,
P3 = vr −1 vr P , . . . ,
Pr +1 = v1 · · · vr P = vη P , (134)
which are mapped into each other by group elements vi ∈ , until it reaches the point cη (1) = v1 · · · vr cη (0) = vη cη (0) ∈ Pr +1 identified with cη (0). As the elements of the Fuchsian group which map the polygon P into its neighbours are the generators v Ai , v Bi ∈ and their inverses, we find that the group element vr is of the form
$v_r = v_{X_r}^{\alpha_r}$, with $v_{X_r} \in \{v_{A_1}, \ldots, v_{B_g}\}$, $\alpha_r \in \{\pm 1\}$. Similarly, for a general polygon $P' = vP_\Gamma$ the elements of $\Gamma$ which map this polygon into its neighbours are given by $v\,v_{A_i}^{\pm 1}v^{-1}$, $v\,v_{B_i}^{\pm 1}v^{-1}$, which implies that the group elements $v_k$ in (134) are of the form
\[ v_k = v_{X_r}^{\alpha_r}\cdots v_{X_{k+1}}^{\alpha_{k+1}}\, v_{X_k}^{\alpha_k}\, v_{X_{k+1}}^{-\alpha_{k+1}}\cdots v_{X_r}^{-\alpha_r} \tag{135} \]
with $v_{X_i} \in \{v_{A_1}, \ldots, v_{B_g}\}$, $\alpha_i \in \{\pm 1\}$. In particular, the translation element $v_\eta$ in (133) and the associated translation vector $n_\eta$ are given by
\[ v_\eta = v_1\cdots v_r = v_{X_r}^{\alpha_r}\cdots v_{X_1}^{\alpha_1} = e^{n_\eta^a J_a}. \tag{136} \]
Hence, intersection points of the geodesics in the multicurve $G^{\mathbb{H}^2_k}_{\eta,\Gamma}$ with a given side $y \in \{a_1, b_1, \ldots, a_g, b_g\}$ of $P_\Gamma$ are in one-to-one correspondence with factors $v_{X_k}^{\alpha_k}$, $X_k = Y$, in the expression (136) of the translation element $v_\eta$ in terms of the generators of $\Gamma$. The geodesics in (132) which intersect the fundamental polygon $P_\Gamma$ are therefore given by
\[ c_1 = c_\eta,\quad c_2 = v_{X_1}^{\alpha_1}c_\eta,\quad c_3 = v_{X_2}^{\alpha_2}v_{X_1}^{\alpha_1}c_\eta,\quad \ldots,\quad c_r = v_{X_{r-1}}^{\alpha_{r-1}}\cdots v_{X_1}^{\alpha_1}c_\eta, \tag{137} \]
and the associated translation vectors take the form
\[ n_{c_k} = \mathrm{Ad}\bigl(v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}\bigr)\,n_\eta. \tag{138} \]
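The bookkeeping behind (136) to (138) is purely combinatorial and can be sketched on words in the generators. In the illustration below, the translation element is stored as a list `factors` whose entry `k-1` is the pair $(X_k, \alpha_k)$, so that $v_\eta$ reads $v_{X_r}^{\alpha_r}\cdots v_{X_1}^{\alpha_1}$ from left to right. The function names and the sample word are hypothetical; the case distinction on $\alpha_k$ follows the cyclic permutations of Theorem 6.1.

```python
def intersections(factors, side):
    """Indices k (1-based) of the factors of v_eta whose generator label equals `side`."""
    return [k for k, (label, _) in enumerate(factors, start=1) if label == side]

def conjugating_word(factors, k):
    """Word p_k, written left to right, such that v_eta^k = p_k v_eta p_k^{-1}.

    For alpha_k = +1 this is v_{X_{k-1}} ... v_{X_1}; for alpha_k = -1 the
    k-th factor is included as well.
    """
    alpha_k = factors[k - 1][1]
    upto = k - 1 if alpha_k == 1 else k
    return list(reversed(factors[:upto]))

def cyclic_permutation(factors, k):
    """Cyclic permutation v_eta^k of the word v_eta, written left to right."""
    alpha_k = factors[k - 1][1]
    upto = k - 1 if alpha_k == 1 else k
    head = list(reversed(factors[:upto]))   # v_{X_upto} ... v_{X_1}
    tail = list(reversed(factors[upto:]))   # v_{X_r} ... v_{X_{upto+1}}
    return head + tail

# Sample word v_eta = v_{A2}^{+1} v_{B1}^{-1} v_{A1}^{+1}, i.e. r = 3.
factors = [("A1", 1), ("B1", -1), ("A2", 1)]
permutations = [cyclic_permutation(factors, k) for k in (1, 2, 3)]
```

For this sample word the second and third cyclic permutations coincide, which reflects the rule that a side crossed with $\alpha_k = -1$ is intersected by $c_{k+1}$ rather than $c_k$.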
Furthermore, we note that the geodesic in (137) which intersects the side y = xk is ck if αk = 1 and ck+1 if αk = −1. Taking into account the orientation of the sides ai , bi in the polygon Pg , see Fig. 5, we find that intersections with sides ai have positive intersection number for αk = 1 and negative intersection number for αk = −1, while the intersection numbers for sides bi are positive and negative, respectively, for αk = −1 and αk = 1. With the definition (Y ) = 1 for Y = Ai , (Y ) = −1 for Y = Bi and α Ad(v Xk−1 · · · v αX11 )nη for αk = 1 k−1 (139) nk = αk for αk = −1, Ad(v X k · · · v αX11 )nη we can express the group elements Bη,,S (viY q0 , v Yf q0 ) in (131) as
Bη,>0,L (viY q0 , v Yf q0 ) = (
X k =Y
e
αk nˆ k , X k =Y √ −w (Y )αk nˆ ak JaL
Bη,=0,L (viY q0 , v Yf q0 ) = −w(Y )
(140)
,
ew
√
(Y )αk nˆ ak JaL
X k =Y
Bη,<0,L (viY q0 , v Yf q0 ) = Bη,<0,E (viY q0 , v Yf q0 ) =
e−i
√
),
||w(Y )αk nˆ ak JaL
,
X k =Y
where the factors are ordered from the left to the right in the order in which the intersection points occur on the generator y ∈ {a1 , . . . , bg }. Inserting formula (140) into (131), then yields an expression for the transformation of the holonomies under grafting along η in terms of the translation vector of cη . We obtain the following theorem.
Theorem 6.1. Consider a closed, simple geodesic $\eta : [0,1] \to \mathbb{H}^2_k/\Gamma$ with weight $w > 0$ and its lift $c_\eta : [0,1] \to \mathbb{H}^2_k$ with basepoint $c_\eta(0) \in P_\Gamma$. Let $v_\eta \in \Gamma$ be the translation element of $c_\eta$ defined as in (133) and given in terms of the generators $v_{A_i}, v_{B_i} \in \Gamma$ by (136) and denote by $v_\eta^k$ the associated cyclic permutations of (136),
\[ v_\eta^k = e^{n_k^a J_a} = \begin{cases} v_{X_{k-1}}^{\alpha_{k-1}}\cdots v_{X_1}^{\alpha_1}\,v_{X_r}^{\alpha_r}\cdots v_{X_k}^{\alpha_k} & \alpha_k = 1\\[2pt] v_{X_k}^{\alpha_k}\cdots v_{X_1}^{\alpha_1}\,v_{X_r}^{\alpha_r}\cdots v_{X_{k+1}}^{\alpha_{k+1}} & \alpha_k = -1. \end{cases} \tag{141} \]
Then the transformation of the holonomies $A_i, B_i$ under grafting along $\eta$ with weight $w$ is given by
\[ \begin{aligned} Gr^w_{\eta,\Lambda,S} :\ A_i^{st} &\mapsto A_i^{st}\cdot(H^{st}_{i-1}\cdots H^{st}_1)\,\gamma(p_0)\Bigl(\prod_{X_k = A_i} F^{-w\alpha_k}_{\Lambda,S}(v_\eta^k)\Bigr)\gamma^{-1}(p_0)\,(H^{st}_{i-1}\cdots H^{st}_1)^{-1},\\ B_i^{st} &\mapsto B_i^{st}\cdot(B_i^{st\,-1}A_i^{st}H^{st}_{i-1}\cdots H^{st}_1)\,\gamma(p_0)\Bigl(\prod_{X_k = B_i} F^{w\alpha_k}_{\Lambda,S}(v_\eta^k)\Bigr)\gamma^{-1}(p_0)\,(B_i^{st\,-1}A_i^{st}H^{st}_{i-1}\cdots H^{st}_1)^{-1}, \end{aligned} \tag{142} \]
where the factors are ordered from the right to the left in the order in which the associated intersection point occurs on the generators $a_i, b_i$ and $F^w_{\Lambda,S} : PSU(1,1) \to \mathrm{Isom}(X_{\Lambda,S})$ is given by
\[ F^w_{\Lambda,S}(e^{n^a J_a^L}) = \begin{cases} (1, w\hat n) & \Lambda = 0,\ \text{Lorentzian}\\[2pt] \bigl(e^{\sqrt{\Lambda}\,w\,\hat n^a J_a^L},\, e^{-\sqrt{\Lambda}\,w\,\hat n^a J_a^L}\bigr) & \Lambda > 0,\ \text{Lorentzian}\\[2pt] e^{i\sqrt{|\Lambda|}\,w\,\hat n^a J_a^L} & \Lambda < 0,\ \text{Lorentzian and Euclidean,} \end{cases} \qquad \hat n = \frac{1}{\sqrt{|n^2|}}\,n \in \mathbb{R}^3. \tag{143} \]

7. Grafting and Poisson Structure

7.1. The transformations generated by the physical observables. Theorem 6.1 gives a formula for the transformation of the holonomies $A_i, B_i$ for a static spacetime under grafting along a closed simple geodesic $\eta$ on $\mathbb{H}^2_k/\Gamma$ with weight $w$. In this section, we will demonstrate that the associated one-parameter group of diffeomorphisms on the phase space is generated via the Poisson bracket by a gauge invariant Hamiltonian. We find that for all values of the cosmological constant and all signatures under consideration this Hamiltonian is a Wilson loop observable associated to $\eta$ and constructed from an Ad-invariant, symmetric bilinear form on the Lie algebra $\mathfrak{h}_{\Lambda,S}$. In the Chern-Simons formulation of (2+1)-dimensional gravity, the Poisson brackets of Wilson loop observables were first investigated in the work of Regge and Nelson [5–7, 9, 8] and by Ashtekar, Husain, Rovelli and Smolin [10]; for the case of a punctured disc see also Martin [11]. In a mathematical context, the Poisson brackets of Wilson loop observables and the associated flows on phase space were first derived in the classical paper [36] by Goldman, who considers the moduli space of flat $H$-connections on surfaces $S_g$ and for general groups $H$. However, the formulation in [36] is rather abstract
and does not characterise these flows in terms of the holonomies of a set of generators of the fundamental group $\pi_1(S_g)$. It is shown in [34] that Fock and Rosly's description of the phase space [12] allows one to obtain concrete expressions for the transformations of these holonomies under the flow generated by the Wilson loop observables associated to a general simple curve on $S_g$ and a general conjugation invariant function of the gauge group. The results are valid for Chern-Simons theory with a general gauge group $H$ and can be summarised as follows.

Theorem 7.1 ([34], see also [36]). Consider Fock and Rosly's description of the moduli space $\mathcal{M}^H_g$ of flat $H$-connections on $S_g$ with the notation introduced in Theorem 4.1. Let $\eta, \lambda$ be closed, simple curves on $S_g$, whose homotopy equivalence classes are given, respectively, as a product of the dual generators $a^i, b^i$ defined in (89) and as a product in the generators $a_i, b_i$ and their inverses
\[ \begin{aligned} \eta &= x_r^{\alpha_r}\circ x_{r-1}^{\alpha_{r-1}}\circ\ldots\circ x_1^{\alpha_1}, & x_k &\in \{a^1, b^1, \ldots, a^g, b^g\},\ \alpha_k \in \{\pm 1\},\\ \lambda &= y_s^{\beta_s}\circ y_{s-1}^{\beta_{s-1}}\circ\ldots\circ y_1^{\beta_1}, & y_k &\in \{a_1, \ldots, b_g\},\ \beta_k \in \{\pm 1\}. \end{aligned} \tag{144} \]
Denote by $\eta_k, \lambda_k$ the cyclic permutations of the products in (144)
\[ \eta_k = \begin{cases} x_{k-1}^{\alpha_{k-1}}\circ\ldots\circ x_1^{\alpha_1}\circ x_r^{\alpha_r}\circ\ldots\circ x_k^{\alpha_k} & \alpha_k = 1\\[2pt] x_k^{\alpha_k}\circ\ldots\circ x_1^{\alpha_1}\circ x_r^{\alpha_r}\circ\ldots\circ x_{k+1}^{\alpha_{k+1}} & \alpha_k = -1 \end{cases} \tag{145} \]
\[ \lambda_k = \begin{cases} y_{k-1}^{\beta_{k-1}}\circ\ldots\circ y_1^{\beta_1}\circ y_s^{\beta_s}\circ\ldots\circ y_k^{\beta_k} & \beta_k = 1\\[2pt] y_k^{\beta_k}\circ\ldots\circ y_1^{\beta_1}\circ y_s^{\beta_s}\circ\ldots\circ y_{k+1}^{\beta_{k+1}} & \beta_k = -1 \end{cases} \tag{146} \]
and by $H_{\eta_k}, H_{\lambda_k}$ the associated holonomies. Then, the intersection points of $\eta$ and $\lambda$, respectively, with curves representing the generators $a_i, b_i \in \pi_1(S_g)$ and $a^i, b^i$ are in one-to-one correspondence with factors $x_k = a^i$, $x_k = b^i$ in (145) and factors $y_k = a_i$, $y_k = b_i$ in (146) and the exponents $\alpha_k, \beta_k$ determine the associated oriented intersection numbers. Furthermore, let $f_\eta, h_\lambda$ be the Wilson loop observables associated to $\eta, \lambda$ and to conjugation invariant functions $f, h \in C^\infty(H)$ of the gauge group. Then, the following statements hold.

1. The Poisson bracket of the Wilson loop observables $f_\eta, h_\lambda$ is given by
\[ \begin{aligned} \{h_\lambda, f_\eta\}(A_1, \ldots, B_g) ={}& \sum_{i=1}^{g}\ \sum_{x_k = a^i,\, y_l = a_i} \alpha_k\beta_l\,\bigl\langle g_f(H_{\eta_k}),\, g_h(H_{i-1}\cdots H_1 H_{\lambda_l} H_1^{-1}\cdots H_{i-1}^{-1})\bigr\rangle\\ &- \sum_{i=1}^{g}\ \sum_{x_k = b^i,\, y_l = b_i} \alpha_k\beta_l\,\bigl\langle g_f(H_{\eta_k}),\, g_h(B_i^{-1}A_i H_{i-1}\cdots H_1 H_{\lambda_l} H_1^{-1}\cdots H_{i-1}^{-1}A_i^{-1}B_i)\bigr\rangle, \end{aligned} \tag{147} \]
where $g_f, g_h : H \to \mathfrak{h}$ are defined by the action of the left invariant vector fields $R_a \in \mathrm{Vec}(H)$ on $f, h \in C^\infty(H)$,
\[ \langle g_f(u), T_a\rangle = R_a f(u), \qquad \langle g_h(u), T_a\rangle = R_a h(u) \qquad \forall u \in H. \tag{148} \]
2. The one-parameter group of diffeomorphisms $T^t_{f,\eta} : H^{2g} \to H^{2g}$ generated by the Wilson loop observable $f_\eta$ via the Poisson bracket acts on the holonomies $A_i, B_i$ according to
\[ \begin{aligned} \{f_\eta, h\} &= \frac{d}{dt}\Big|_{t=0}\, h\circ T^t_{f,\eta} \qquad \forall h \in C^\infty(H^{2g}),\\ T^t_{f,\eta} :\ A_i &\mapsto A_i\cdot(H_{i-1}\cdots H_1)\cdot\prod_{x_k = a^i} G_f^{-t\alpha_k}(H_{\eta_k})\cdot(H_{i-1}\cdots H_1)^{-1},\\ B_i &\mapsto B_i\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1)\cdot\prod_{x_k = b^i} G_f^{-t\alpha_k}(H_{\eta_k})\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1)^{-1}, \end{aligned} \tag{149} \]
where the factors are ordered from the right to the left in the order in which the associated intersection points occur on the curves representing the generators $a_i, b_i$ and the function $G^t_f : H \to H$ is obtained by exponentiating the function $g_f : H \to \mathfrak{h}$,
\[ G^t_f(u) = e^{t\,g_f(u)} \qquad \forall u \in H. \tag{150} \]
A particular set of Wilson loop observables which will be relevant in the following are the observables associated to Ad-invariant, symmetric bilinear forms $\kappa \in \mathfrak{h}^* \otimes \mathfrak{h}^*$ on the Lie algebra of the gauge group. These observables are constructed using the parametrisation via the exponential map $\exp : \mathfrak{h} \to H$ by setting
\[ \tilde\kappa(e^{k^a T_a}) = \tfrac{1}{2}\kappa(k^a T_a, k^a T_a) = \tfrac{1}{2}\kappa_{ab}\,k^a k^b. \tag{151} \]
As it is supposed in [34] that the exponential map is surjective, but not necessarily bijective, one has to restrict the range of admissible vectors $k \in \mathbb{R}^{\dim H}$ to obtain a well-defined expression. The resulting function $\tilde\kappa : H \to \mathbb{R}$ is then not necessarily continuous everywhere. However, in the following we are interested in the local properties of the phase space. As the exponential map is locally bijective, any element $u \in H$ has a neighbourhood in which the Wilson loop observable $\tilde\kappa : H \to \mathbb{R}$ is a $C^\infty$-function. It has been shown by Goldman [36] that the associated functions $g_{\tilde\kappa} : H \to \mathfrak{h}$, $G^t_{\tilde\kappa} : H \to H$ defined in (148), (150) then take the form
\[ g_{\tilde\kappa}(e^{k^a T_a}) = l_\kappa(k^a T_a), \qquad G^t_{\tilde\kappa}(e^{k^a T_a}) = e^{t\,l_\kappa(k^a T_a)} \tag{152} \]
with a linear map $l_\kappa : \mathfrak{h} \to \mathfrak{h}$ given by
\[ \langle l_\kappa(x), y\rangle = \langle x, l_\kappa(y)\rangle = \kappa(x, y) \qquad \forall x, y \in \mathfrak{h}. \tag{153} \]
In particular, one obtains a generic set of observables constructed from the Ad-invariant, non-degenerate, symmetric bilinear form $\langle\,,\rangle$ in the Chern-Simons action with associated conjugation invariant function $\tilde t \in C^\infty(H)$,
\[ \tilde t(e^{k^a T_a}) = \tfrac{1}{2}\langle k^a T_a, k^a T_a\rangle = \tfrac{1}{2}t_{ab}\,k^a k^b. \tag{154} \]
In this case, the linear map $l_{\langle,\rangle} : \mathfrak{h} \to \mathfrak{h}$ is the identity, $l_{\langle,\rangle} = \mathrm{id}_{\mathfrak{h}}$, and the associated functions $g_{\tilde t}, G^t_{\tilde t}$ in (148), (150) are given by
\[ g_{\tilde t}(e^x) = x, \qquad G^t_{\tilde t}(e^x) = e^{tx} \qquad \forall x \in \mathfrak{h}. \tag{155} \]
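For the pairing in the Chern-Simons action, the map $G^t_{\tilde t}(e^x) = e^{tx}$ of (155) can be explored with small matrices. The sketch below is an illustration only: it takes $x$ in $\mathfrak{su}(1,1)$, represents a group element $u = e^x$ by its logarithm $x$ (which presupposes that $x$ lies in the range where the exponential parametrisation is unique), and checks that the flow is a one-parameter group and is covariant under conjugation. The function names and the numerical values are hypothetical, and the closed form for the exponential is specific to traceless $2\times 2$ matrices.

```python
import numpy as np

def exp_traceless(x):
    """exp(x) for a traceless 2x2 matrix, using x @ x = -det(x) * identity."""
    s = np.sqrt(complex(-np.linalg.det(x)))
    if abs(s) < 1e-12:
        return np.eye(2, dtype=complex) + x
    return np.cosh(s) * np.eye(2) + (np.sinh(s) / s) * x

def flow(x, t):
    """G^t applied to u = exp(x), with u represented by x: returns exp(t x)."""
    return exp_traceless(t * x)

def su11(r, phi, psi):
    """SU(1,1) matrix [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1."""
    a = np.cosh(r) * np.exp(1j * phi)
    b = np.sinh(r) * np.exp(1j * psi)
    return np.array([[a, b], [np.conj(b), np.conj(a)]])

x = np.array([[0.3j, 0.5 - 0.2j], [0.5 + 0.2j, -0.3j]])  # sample element of su(1,1)
g = su11(0.6, 0.4, -0.9)
g_inv = np.linalg.inv(g)
t = 0.7

# g exp(x) g^{-1} = exp(g x g^{-1}), so the conjugated element has logarithm g x g^{-1}.
lhs = flow(g @ x @ g_inv, t)    # G^t(g u g^{-1})
rhs = g @ flow(x, t) @ g_inv    # g G^t(u) g^{-1}
```

In this representation the covariance property reduces to the fact that conjugation commutes with the matrix exponential.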
The transformations (149) which the associated observables $\tilde t_\lambda$ generate via the Poisson bracket are investigated in [34] (for the case of semidirect product gauge groups $G \ltimes \mathfrak{g}^*$ see also [37]), where it is shown that they have the interpretation of infinitesimal Dehn twists along $\lambda$. We will discuss these observables and the associated flows on phase space in more detail in Sect. 7.3 and Sect. 8.3, where we investigate the relation between grafting and infinitesimal Dehn twists.

7.2. Hamiltonians for grafting. Using the results from [34, 36], summarised in the last subsection, we can now investigate the transformations on phase space generated by observables associated to certain Ad-invariant, symmetric bilinear forms in the Chern-Simons formulation of (2+1)-dimensional gravity and show that they agree with the grafting transformation (142). The first step is to identify the Ad-invariant bilinear forms on the Lie algebras $\mathfrak{h}_{\Lambda,S}$. It is shown in [4] that for all values of the cosmological constant and all signatures under consideration, the space of Ad-invariant, symmetric bilinear forms on $\mathfrak{h}_{\Lambda,S}$ is two-dimensional. Besides the pairing $\langle\,,\rangle$ in the Chern-Simons action given by (70), the Lie algebras $\mathfrak{h}_{\Lambda,S}$ admit another Ad-invariant symmetric bilinear form, which is given in terms of the generators $J_a^S, P_a^S$ by
\[ \kappa(J_a^S, J_b^S) = \eta^S_{ab}, \qquad \kappa(J_a^S, P_b^S) = 0, \qquad \kappa(P_a^S, P_b^S) = \Lambda\,\eta^S_{ab}. \tag{156} \]
It follows that the associated linear maps $l_\kappa : \mathfrak{h}_{\Lambda,S} \to \mathfrak{h}_{\Lambda,S}$ defined in (153) take the form
\[ l_\kappa(J_a^S) = P_a^S, \qquad l_\kappa(P_a^S) = \Lambda J_a^S. \tag{157} \]
The associated map $G^t_{\tilde\kappa,\Lambda,S} : \mathrm{Isom}(X_{\Lambda,S}) \to \mathrm{Isom}(X_{\Lambda,S})$ defined in (150) is obtained by exponentiating this map $l_\kappa : \mathfrak{h}_{\Lambda,S} \to \mathfrak{h}_{\Lambda,S}$ as in (152). As the exponential maps $\exp_{\Lambda,S} : \mathfrak{h}_{\Lambda,S} \to \mathrm{Isom}(X_{\Lambda,S})$ are surjective and therefore locally bijective for all isometry groups $\mathrm{Isom}(X_{\Lambda,S})$, see the remark after (16), we obtain a one-parameter group of transformations $G^t_{\tilde\kappa,\Lambda,S} : \mathrm{Isom}(X_{\Lambda,S}) \to \mathrm{Isom}(X_{\Lambda,S})$ given by
\[ \begin{aligned} G^t_{\tilde\kappa,\Lambda,S}\bigl(\exp_{\Lambda,S}(p^a J_a^S + k^a P_a^S)\bigr) &= \exp_{\Lambda,S}\bigl(t(p^a P_a^S + \Lambda k^a J_a^S)\bigr)\\ &= \begin{cases} (1, tp) & \Lambda = 0,\ \text{Lorentzian}\\[2pt] \bigl(e^{t\sqrt{\Lambda}(p^a + \sqrt{\Lambda}k^a)J_a^L},\, e^{-t\sqrt{\Lambda}(p^a - \sqrt{\Lambda}k^a)J_a^L}\bigr) & \Lambda > 0,\ \text{Lorentzian}\\[2pt] e^{it\sqrt{|\Lambda|}(p^a + i\sqrt{|\Lambda|}k^a)J_a^L} & \Lambda < 0,\ \text{Lorentzian}\\[2pt] e^{it\sqrt{|\Lambda|}(p^a + i\sqrt{|\Lambda|}k^a)J_a^E} & \Lambda < 0,\ \text{Euclidean,} \end{cases} \end{aligned} \tag{158} \]
and can formally express the map $G^t_{\tilde\kappa,\Lambda,S} : \mathrm{Isom}(X_{\Lambda,S}) \to \mathrm{Isom}(X_{\Lambda,S})$ as
\[ \begin{aligned} G^t_{\tilde\kappa,\Lambda=0,L}(e^{p^a J_a^L}, a) &= (1, tp) & &\forall p, a \in \mathbb{R}^3,\\ G^t_{\tilde\kappa,\Lambda>0,L}(u_+, u_-) &= \bigl(u_+^{\sqrt{\Lambda}\,t},\, u_-^{-\sqrt{\Lambda}\,t}\bigr) & &\forall u_\pm \in PSU(1,1),\\ G^t_{\tilde\kappa,\Lambda<0,L}(u) = G^t_{\tilde\kappa,\Lambda<0,E}(u) &= u^{it\sqrt{|\Lambda|}} & &\forall u \in SL(2,\mathbb{C})/\mathbb{Z}_2. \end{aligned} \tag{159} \]
The fact that the exponential map exp,S : h,S → Isom(X,S ) is locally, but not globally bijective, implies that the maps (158), (159) are only defined locally. In order
to obtain a unique parametrisation of elements of $\mathrm{Isom}(X_{\Lambda,S})$ in terms of elements of $\mathfrak{h}_{\Lambda,S}$, one has to restrict the range of the vectors $p, k \in \mathbb{R}^3$ appropriately, which implies that the Wilson loop observable $\tilde\kappa$ and the map $G^t_{\tilde\kappa,\Lambda,S} : \mathrm{Isom}(X_{\Lambda,S}) \to \mathrm{Isom}(X_{\Lambda,S})$ are not necessarily continuous everywhere. However, as the exponential map is locally bijective, every element of $\mathrm{Isom}(X_{\Lambda,S})$ has a neighbourhood in which the Wilson loop observable $\tilde\kappa$ is $C^\infty$ and the map $G^t_{\tilde\kappa,\Lambda,S}$ is a one-parameter group of diffeomorphisms. In particular, the parametrisation via the exponential map is unique for elements of the group $PSU(1,1)$ embedded into $\mathrm{Isom}(X_{\Lambda,S})$ via (14), see the remark after (10), which is the form the holonomies take for the static universes. Hence, the parametrisation via the exponential map is well-defined and $C^\infty$ in this case, and we can insert expressions (158), (159) into the general formula (149). We obtain the following theorem, which generalises Theorem 5.2 in [23].

Theorem 7.2. Let $\eta$ be a closed simple curve on $S_g$ whose homotopy equivalence class is given as a product in the dual generators (89) and their inverses as in (144) and $\kappa$ the Ad-invariant, symmetric bilinear form on $\mathfrak{h}_{\Lambda,S}$ defined in (156). Then, the one-parameter group of diffeomorphisms $T^t_{\tilde\kappa,\eta,\Lambda,S} : \mathrm{Isom}(X_{\Lambda,S})^{2g} \to \mathrm{Isom}(X_{\Lambda,S})^{2g}$ generated by the Wilson loop observable $\tilde\kappa_\eta$ is given by
\[ \begin{aligned} \{\tilde\kappa_\eta, g\} &= \frac{d}{dt}\Big|_{t=0}\, g\circ T^t_{\tilde\kappa,\eta,\Lambda,S} \qquad \forall g \in C^\infty(H^{2g}),\\ T^t_{\tilde\kappa,\eta,\Lambda,S} :\ A_i &\mapsto A_i\cdot(H_{i-1}\cdots H_1)\cdot\prod_{x_k = a^i} G^{-t\alpha_k}_{\tilde\kappa,\Lambda,S}(H_{\eta_k})\cdot(H_{i-1}\cdots H_1)^{-1},\\ B_i &\mapsto B_i\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1)\cdot\prod_{x_k = b^i} G^{-t\alpha_k}_{\tilde\kappa,\Lambda,S}(H_{\eta_k})\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1)^{-1}, \end{aligned} \tag{160} \]
where Hηk : Isom(X,S )2g → Isom(X,S ) represents the holonomy along ηk defined by (145), G tκ,,S : Isom(X,S ) → Isom(X,S ) is given by (158), (159) and the factors ˜ are ordered from the right to the left in the order in which the corresponding intersection points occur on the generators ai , bi . t : We will now demonstrate that the one-parameter group of transformations Tκ,η,,S ˜ 2g 2g Isom(X,S ) → Isom(X,S ) is related to the transformation (142) of the holonomies Ai , Bi under grafting along a closed simple geodesic η on the surface H2k / with homotopy equivalence class (144). For this, we evaluate (160) for the static case, where the group elements N Ai , N Bi in the overlap condition (83) are the images of the generators v Ai , v Bi ∈ under the canonical embedding (14) of P SU (1, 1) into the isometry group of the model spacetime N Ai = ı,S (v Ai ), N Bi = ı,S (v Bi ). From expressions (124) for the holonomies Ai , Bi in terms of N Ai , N Bi it then follows that the holonomies along the elements ηk defined in (145) take the form
Hηk = γ ( p0 )vηk γ −1 ( p0 ) = γ ( p 0 )en k Ja γ −1 ( p0 ) a
L
(161)
: with vηk , nk given by (141). As it is shown in [36, 34] that the functions G tκ,,S ˜ Isom(X,S ) → Isom(X,S ) satisfy the covariance condition G tκ,,S (gug −1 ) = g · G t,S (u) · g −1 ˜
∀u, g ∈ Isom(X,S )
(162)
we obtain an expression in terms of the translation vectors nk defined in (139) ⎧ −1 = 0, Lorentzian ⎪ 0) ⎨γ ( p0 )(1,√t nk )γ ( p√ a aJ L tαk t n J − tn −1 a a k k G κ,,S (Hηk ) = γ ( p0 )(e ,e )γ ( p0 ) > 0, Lorentzian ˜ √ ⎪ a L ⎩ γ ( p0 )eit ||n k Ja γ −1 ( p0 ) < 0, Lorentzian or Euclidean. By inserting this expression into (160) and comparing with the transformation of the holonomies under grafting given by (142), (143), we find that these transformations agree up to normalisation. Theorem 7.3. Consider a static spacetime, where the holonomies Ai , Bi along the generators of the fundamental group are given by −1 −1 Ai = γ ( p0 ) · ı,S (v −1 H1 · · · v Hi v Bi v Hi−1 · · · v H1 ) · γ ( p0 ) , −1 −1 Bi = γ ( p0 ) · ı,S (v −1 H1 · · · v Hi v Ai v Hi−1 · · · v H1 ) · γ ( p0 ) .
(163)
Then, the transformation (142) of the holonomies under grafting along a closed simple geodesic $\eta$ on $\mathbb{H}^2_k/\Gamma$ agrees with the transformation (160) generated by the observable $\tilde\kappa_\eta$ if the parameter $t$ in (160) is related to the weight $w$ by $t = w|n_\eta|$,
\[ Gr^w_{\eta,\Lambda,S}(A_1, \ldots, B_g) = T^t_{\tilde\kappa,\eta,\Lambda,S}(A_1, \ldots, B_g) \qquad \text{for } t = w\sqrt{|\kappa(n_\eta, n_\eta)|} = w|n_\eta|. \tag{164} \]
The fact that the transformation of the holonomies $A_i, B_i$ under grafting is generated by a gauge invariant observable allows us to directly deduce some of its properties, which are summarised in the following corollary; for the Lorentzian case with vanishing cosmological constant see also [23].

Corollary 7.4.
1. The grafting transformations $T^t_{\tilde\kappa,\eta,\Lambda,S}$ act by Poisson isomorphisms
\[ \{f\circ T^t_{\tilde\kappa,\eta,\Lambda,S},\, g\circ T^t_{\tilde\kappa,\eta,\Lambda,S}\} = \{f, g\}\circ T^t_{\tilde\kappa,\eta,\Lambda,S} \qquad \forall f, g \in C^\infty(\mathrm{Isom}(X_{\Lambda,S})^{2g}). \tag{165} \]
2. The grafting transformation $T^t_{\tilde\kappa,\eta,\Lambda,S}$ leaves the constraint (85) invariant and commutes with the associated gauge transformations by simultaneous conjugation with $\mathrm{Isom}(X_{\Lambda,S})$,
\[ T^t_{\tilde\kappa,\eta,\Lambda,S}(gA_1g^{-1}, \ldots, gB_gg^{-1}) = g\,T^t_{\tilde\kappa,\eta,\Lambda,S}(A_1, \ldots, B_g)\,g^{-1} \qquad \forall g \in \mathrm{Isom}(X_{\Lambda,S}). \tag{166} \]
3. In the Lorentzian case with $\Lambda = 0$, all grafting transformations commute.

Proof. The first two statements are a direct consequence of the fact that the transformations $T^t_{\tilde\kappa,\eta,\Lambda,S}$ are generated by the gauge invariant observable $\tilde\kappa_\eta$. The last statement follows immediately from the formula (158).
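The mechanism behind the third statement can be illustrated with a toy computation: for $\Lambda = 0$ the map (158) produces pure translations that depend only on Lorentz components, and right-multiplication by pure translations neither changes Lorentz components nor depends on the order of the translations. The sketch below assumes the semidirect product law $(L_1, x_1)(L_2, x_2) = (L_1L_2, x_1 + L_1x_2)$ with $L \in SO(2,1)$ acting on $\mathbb{R}^3$ in place of $\mathrm{Ad}(v)$; it does not reproduce the full flow (160), and all names and numerical values are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0])  # (2+1)-dimensional Minkowski metric

def boost(chi):
    """Lorentz boost in the (x0, x1) plane."""
    c, s = np.cosh(chi), np.sinh(chi)
    return np.array([[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rotation(theta):
    """Rotation in the (x1, x2) plane."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def mul(g, h):
    """Assumed product law: (L1, x1)(L2, x2) = (L1 L2, x1 + L1 x2)."""
    return g[0] @ h[0], g[1] + g[0] @ h[1]

def translation(a):
    """Pure translation (1, a)."""
    return np.eye(3), np.asarray(a, dtype=float)

holonomy = (boost(0.8) @ rotation(0.5), np.array([0.2, -1.0, 0.4]))
first = translation([0.3, 0.1, -0.7])
second = translation([-0.5, 0.9, 0.2])

path_ab = mul(mul(holonomy, first), second)   # translate by `first`, then by `second`
path_ba = mul(mul(holonomy, second), first)   # same translations in the opposite order
```

Both orders give the same group element, and its Lorentz component equals that of the original holonomy, so a subsequent translation computed from Lorentz components is unaffected by earlier ones.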
Fig. 6. A Dehn twist with parameter w around a closed, simple geodesic c on a genus 2 surface
7.3. Grafting and Dehn twists. After demonstrating that the transformation of the holonomies under grafting along a closed, simple curve $\eta$ on $S_g$ is generated by the Wilson loop observable $\tilde\kappa_\eta$, we will now investigate the relation between this grafting transformation and the action of infinitesimal Dehn twists along $\eta$. Geometrically, an infinitesimal Dehn twist along a closed geodesic $\eta$ on a two-surface $\mathbb{H}^2_k/\Gamma$ with parameter $w$ amounts to cutting the surface along $\eta$ and rotating the edges of the cut by an angle $2\pi w$ as shown in Fig. 6. In Chern-Simons theory, infinitesimal Dehn twists along closed, simple curves on the spatial surface are present generically for any gauge group $H$ and give rise to a transformation on the moduli space of flat $H$-connections. The action of (infinitesimal) Dehn twists in Fock and Rosly’s description [12] of the moduli space is investigated in [34]; for the case of semidirect product groups $G\ltimes\mathfrak{g}^*$ see also [37], where it is shown that they are generated by the gauge invariant observable constructed from the Ad-invariant bilinear form $\langle\,,\rangle$ in the Chern-Simons action. The results can be summarised as follows.
Theorem 7.5 ([34, 37]). Consider Fock and Rosly’s description of the moduli space $\mathcal{M}^H_g$ in terms of the auxiliary Poisson structure (95) on $H^{2g}$. Let $\eta$ be a closed, simple curve on $S_g$, whose homotopy equivalence class is given as a product in the dual generators (89) and their inverses as in (144). Denote by $\eta_k$ the cyclic permutations of this expression as in (145) and by $H_{\eta_k}$ the associated holonomies. Then the transformation $T^t_{\tilde t,\eta}: H^{2g}\to H^{2g}$ generated by the Wilson loop observable $\tilde t_\eta$ associated to $\eta$ and the Ad-invariant symmetric bilinear form in the Chern-Simons action as in (154) is given by
\[
\begin{aligned}
T^t_{\tilde t,\eta}:\quad A_i &\mapsto A_i\cdot(H_{i-1}\cdots H_1)^{-1}\cdot\Big(\prod_{x_k=a_i} H_{\eta_k}^{-t\alpha_k}\Big)\cdot(H_{i-1}\cdots H_1),\\
B_i &\mapsto B_i\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1)^{-1}\cdot\Big(\prod_{x_k=b_i} H_{\eta_k}^{-t\alpha_k}\Big)\cdot(B_i^{-1}A_iH_{i-1}\cdots H_1),
\end{aligned}
\tag{167}
\]
where the elements $\eta_k\in\pi_1(S_g)$ with holonomies $H_{\eta_k}$ are defined as in (145) and the products are ordered from the right to the left in the order in which the associated intersection points occur on the generators $a_i, b_i\in\pi_1(S_g)$. For flow parameter $t=-1$, this transformation agrees with the transformation of the holonomies $A_i$, $B_i$ under a Dehn twist around $\eta$,
\[ T^{-1}_{\tilde t,\eta} = D_\eta, \tag{168} \]
C. Meusburger
where $D_\eta: H^{2g}\to H^{2g}$ is the diffeomorphism induced by the action of the Dehn twist along $\eta$ on the generators of the fundamental group $\pi_1(S_g)$. Using the results from [34, 37] summarised in Theorem 7.5, we can now compare the transformation of the holonomies $A_i$, $B_i$ under the grafting transformations $T^t_{\tilde\kappa,\eta,\Lambda,S}$ in (160) with their transformations (167) under infinitesimal Dehn twists. For this, we note that the formulas (160), (167) differ only in the fact that instead of the factor $H_{\eta_k}^{t\alpha_k}$ in (167), the grafting transformation (160) contains a factor
\[ G^{t\alpha_k}_{\tilde\kappa,\Lambda,S}(H_{\eta_k}) = G^{1}_{\tilde\kappa,\Lambda,S}(H_{\eta_k}^{t\alpha_k}), \tag{169} \]
with the map $G^t_{\tilde\kappa,\Lambda,S}: \mathrm{Isom}(X_{\Lambda,S})\to\mathrm{Isom}(X_{\Lambda,S})$ given by (159). For the Lorentzian case with $\Lambda=0$, the map $G^t_{\tilde\kappa,0,L}$ assigns to each element of the gauge group $PSU(1,1)\ltimes\mathbb{R}^3$ the translation vector associated to its Lorentz component
\[ G^t_{\tilde\kappa,0,L}\big(\exp_{\Lambda=0,L}(p^aJ^L_a+k^aP^L_a)\big) = G^t_{\tilde\kappa,0,L}\big((e^{p^aJ_a}, T(\boldsymbol{p})\boldsymbol{k})\big) = (1, t\boldsymbol{p}) = \exp_{0,L}(t\,p^aP^L_a). \tag{170} \]
For static universes the holonomies $H_{\eta_k}$ in (160) and (167) are conjugated to elements of the (2+1)-dimensional Lorentz group $PSU(1,1)$ and grafting acts on the holonomies $A_i$, $B_i$ by right-multiplication with the associated elements of the Lie algebra $su(1,1)$. Hence, for the Lorentzian case with vanishing cosmological constant, grafting along a closed, simple curve $\eta$ can be viewed as the derivative or first-order approximation of a Dehn twist along $\eta$. For Lorentzian signature with $\Lambda>0$ and gauge group $PSU(1,1)\times PSU(1,1)$, identity (159) allows one to express the factors $G^{t\alpha_k}_{\tilde\kappa,\Lambda>0,L}(H_{\eta_k})$ in (160) as
\[ G^{t\alpha_k}_{\tilde\kappa,\Lambda>0,L}(H_{\eta_k}) = \Big(\big(H^{\alpha_k}_{\eta_k}\big)_+^{t\sqrt{\Lambda}},\ \big(H^{\alpha_k}_{\eta_k}\big)_-^{-t\sqrt{\Lambda}}\Big) \quad\text{where } H^{\alpha_k}_{\eta_k} = \Big(\big(H^{\alpha_k}_{\eta_k}\big)_+,\ \big(H^{\alpha_k}_{\eta_k}\big)_-\Big). \tag{171} \]
Hence, we find that the grafting transformation (160) with parameter $t$ along $\eta$ acts on the first component as an infinitesimal Dehn twist along $\eta$ with parameter $t\sqrt{\Lambda}$ and as an infinitesimal Dehn twist with parameter $-t\sqrt{\Lambda}$ on the second component of $PSU(1,1)\times PSU(1,1)$. In Lorentzian and Euclidean (2+1)-gravity with $\Lambda<0$, the factor $G^t_{\tilde\kappa,\Lambda,S}(H^{\alpha_k}_{\eta_k})$ in the expression (160) for the transformation of the holonomies under grafting is
\[ G^t_{\tilde\kappa,\Lambda,S}(H^{\alpha_k}_{\eta_k}) = H_{\eta_k}^{\,it\sqrt{|\Lambda|}\,\alpha_k}, \tag{172} \]
and the grafting transformation $T^t_{\tilde\kappa,\eta,\Lambda,S}$ with parameter $t$ given by (160) can therefore be viewed as a Dehn twist (167) with parameter $it\sqrt{|\Lambda|}$. We will come back to this relation between grafting and Dehn twist in Sect. 8.3, where we discuss the role of the cosmological constant as a deformation parameter.
These relations between the transformation of the holonomies under grafting and Dehn twists for the different values of the cosmological constant and different signatures are mirrored in a relation for the Poisson brackets of the Wilson loop observables $\tilde t_\eta$, $\tilde\kappa_\eta$ associated to closed, simple curves $\eta$ on $S_g$. By inserting the maps $g_{\tilde t}, g_{\tilde\kappa}: \mathrm{Isom}(X_{\Lambda,S})\to\mathfrak{h}_{\Lambda,S}$ into the formula (147) for the Poisson brackets of the Wilson loop observables and using the identity $l_\kappa^2 = \Lambda\,\mathrm{id}_{\mathfrak{h}_{\Lambda,S}}$, we obtain the following theorem which generalises Theorem 5.4 in [23] for the case of Lorentzian signature with $\Lambda=0$.
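Theorem 7.6 (Symmetry relation for the observables). For any two closed, simple curves $\lambda$, $\eta$ on $S_g$, the associated Wilson loop observables $\tilde t_\lambda$, $\tilde\kappa_\lambda$ and $\tilde t_\eta$, $\tilde\kappa_\eta$ satisfy the symmetry relations
\[ \{\tilde t_\eta, \tilde\kappa_\lambda\} = \{\tilde\kappa_\eta, \tilde t_\lambda\}, \qquad \{\tilde\kappa_\eta, \tilde\kappa_\lambda\} = \Lambda\,\{\tilde t_\eta, \tilde t_\lambda\}. \tag{173} \]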
Theorem 7.6 establishes a relation between the transformation of the gauge invariant observables $\tilde t_\eta$, $\tilde\kappa_\eta$ associated to a closed, simple curve $\eta$ on $S_g$ under infinitesimal Dehn twist and grafting along another closed simple curve $\lambda$. The first identity in (173) implies that, infinitesimally, the transformation of the observable $\tilde t_\eta$ under grafting along $\lambda$ is the same as the transformation of $\tilde\kappa_\eta$ under a Dehn twist along $\lambda$. The second identity states that the transformation of the observable $\tilde\kappa_\eta$ under infinitesimal grafting along $\lambda$ corresponds to the transformation of $\tilde t_\eta$ under an infinitesimal Dehn twist along $\lambda$, rescaled by a factor $\Lambda$.
8. The Cosmological Constant as a Deformation Parameter
8.1. The geometrical description. In this section we restrict attention to (2+1)-spacetimes with Lorentzian signature and investigate the role of the cosmological constant as a deformation parameter in both the geometrical and the Chern-Simons formulation of (2+1)-dimensional gravity. In the geometrical formulation, the cosmological constant appears as a parameter in the model spacetime $X_{\Lambda,L}$, the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ and in the group isomorphism $h^G_{\Lambda,L}: \Gamma\to\mathrm{Isom}(X_{\Lambda,L})$ which determines the action of the cocompact Fuchsian group $\Gamma$ on the domain $U^G_{\Lambda,L}$. The dependence of the domains $U^G_{\Lambda,S}$ on the sign of the cosmological constant is investigated in detail in the papers by Benedetti and Bonsante [21, 22] who show that Lorentzian spacetimes with $\Lambda\in\{0,\pm1\}$ and the Euclidean case with $\Lambda=-1$ can be related by rescalings and a Wick rotation compatible with the associated actions of the cocompact Fuchsian group $\Gamma$. In contrast, our focus is on the role of the cosmological constant as a continuous parameter deforming these domains, the associated model spacetimes and group isomorphisms.
We start by considering the static spacetimes associated to $\Gamma$. As discussed in Sect. 3.2, these spacetimes are given as quotients of the interior of the forward lightcone $U_{\Lambda,L}\subset X_{\Lambda,L}$ by the canonical action of $\Gamma$ via the embedding $\imath_{\Lambda,S}: PSU(1,1)\to\mathrm{Isom}(X_{\Lambda,S})$ into the isometry group of the model spacetime. The parametrisation of these lightcones in terms of matrices and their foliation by copies of hyperbolic space are given by expressions (53), (54) and (55), respectively, for $\Lambda=0$, $\Lambda>0$ and $\Lambda<0$. For vanishing cosmological constant, the entries in the $su(1,1)$ matrix (24) parametrising the forward lightcone are given by the identification (19) of the disc model of hyperbolic space with the hyperboloids. For $\Lambda>0$, the entries in the $PSU(1,1)$-matrices (28) which parametrise the model spacetime are given by Eq. (54) and involve the functions $\sin(\sqrt{\Lambda}T)$, $\cos(\sqrt{\Lambda}T)$. Using the identities
\[ \lim_{\sqrt{\Lambda}\to0}\cos(\sqrt{\Lambda}T) = 1, \qquad \lim_{\sqrt{\Lambda}\to0}\tfrac{1}{\sqrt{\Lambda}}\sin(\sqrt{\Lambda}T) = T, \tag{174} \]
one finds that in the limit $\sqrt{\Lambda}\to0$, these parameters behave according to
\[
\begin{aligned}
\lim_{\sqrt{\Lambda}\to0} t_2(T,z) &= T\,\frac{1+|z|^2}{1-|z|^2} = x^L_0(T,z), &
\lim_{\sqrt{\Lambda}\to0} x_1(T,z) &= T\,\frac{\mathrm{Re}(z)}{1-|z|^2} = x^L_1(T,z),\\
\lim_{\sqrt{\Lambda}\to0} x_2(T,z) &= T\,\frac{\mathrm{Re}(z)}{1-|z|^2} = x^L_2(T,z), &
\lim_{\sqrt{\Lambda}\to0} t_1(T,z) &= \infty,
\end{aligned}
\tag{175}
\]
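where $x^L_0(T,z)$, $x^L_1(T,z)$, $x^L_2(T,z)$ denote the coordinates in the identification of the unit disc with the hyperboloid $\mathbb{H}^2_{1/T^2}$ given by (19). For $\Lambda<0$, the entries in the parametrisation (38) of the forward lightcone are given by (55). Using the identities
\[ \lim_{\sqrt{|\Lambda|}\to0}\cosh(\sqrt{|\Lambda|}T) = 1, \qquad \lim_{\sqrt{|\Lambda|}\to0}\tfrac{1}{\sqrt{|\Lambda|}}\sinh(\sqrt{|\Lambda|}T) = T, \tag{176} \]
in (55), we obtain
\[
\begin{aligned}
\lim_{\sqrt{|\Lambda|}\to0} x_0(T,z) &= T\,\frac{1+|z|^2}{1-|z|^2} = x^L_0(T,z), &
\lim_{\sqrt{|\Lambda|}\to0} x_1(T,z) &= T\,\frac{\mathrm{Re}(z)}{1-|z|^2} = x^L_1(T,z),\\
\lim_{\sqrt{|\Lambda|}\to0} x_2(T,z) &= T\,\frac{\mathrm{Re}(z)}{1-|z|^2} = x^L_2(T,z), &
\lim_{\sqrt{|\Lambda|}\to0} x_3(T,z) &\to \infty.
\end{aligned}
\tag{177}
\]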
Hence, for both positive and negative cosmological constant, in the limit $\sqrt{|\Lambda|}\to0$ one coordinate in the parametrisation of the forward lightcone in the model spacetime $X_{\Lambda,L}$ tends to infinity while the other coordinates converge to the corresponding coordinates parametrising the forward lightcone in Minkowski space.
To determine the role of the cosmological constant as a deformation parameter of the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ associated to grafted spacetimes, we make use of a result by Benedetti and Bonsante [22] concerning the dependence of the grafted domains $U^G_{\Lambda,L}$ on the weight of the multicurve $G$. In [22], Proposition 4.7.1, they consider the multicurve $tG$ on $\mathbb{H}^2_k$ obtained by multiplying all weights in a multicurve $G$ with a factor $t$ and the associated domains $U^G_{\Lambda,L}, U^{tG}_{\Lambda,L}\subset X_{\Lambda,L}$ for cosmological constant $\Lambda\in\{0,\pm1\}$. They show that if the grafted domains $U^{tG}_{\Lambda,L}$ are rescaled by a factor $1/t$, they converge to the corresponding domain $U^G_{0,L}$ in Minkowski space in the limit $t\to0$,
\[ \lim_{t\to0}\tfrac{1}{t}\,U^{tG}_{\Lambda,L} = U^G_{0,L}, \qquad \Lambda\in\{0,\pm1\}, \tag{178} \]
where the limit is understood in terms of the coordinates in a certain parametrisation of these domains given in [22]. Although they do not consider cosmological constants $\Lambda\neq0,\pm1$, this result can be applied to determine the dependence of the grafted domains $U^G_{\Lambda,L}$ on the cosmological constant $\Lambda$. For this, it is sufficient to recall from Sect. 3.1 that the parametrisation of the model spacetimes $X_{\Lambda,L}$ for $|\Lambda|\neq0,1$ is obtained by rescaling the associated matrices parametrising $X_{\pm1,L}$ with a factor $1/\sqrt{|\Lambda|}$. Furthermore, we found in Sect. 3.3, see in particular Eq. (62), that the weight of the multicurves for $|\Lambda|\neq1$ had to be rescaled with a factor $\sqrt{|\Lambda|}$ to ensure that the associated geodesics are parametrised by arclength. The domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ for non-vanishing cosmological constant are therefore related to the associated domains $U^G_{\pm1,L}$ via the identity
\[ U^G_{\Lambda,L} = \tfrac{1}{\sqrt{|\Lambda|}}\,U^{\sqrt{|\Lambda|}G}_{\mathrm{sgn}(\Lambda),L}, \qquad \Lambda\neq0, \tag{179} \]
and Eq. (178) implies
\[ \lim_{\sqrt{|\Lambda|}\to0} U^G_{\Lambda,L} = U^G_{0,L}. \tag{180} \]
Hence, with an appropriate identification of the coordinates parametrising the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ in the model spacetimes, the limit $\sqrt{|\Lambda|}\to0$ is defined and yields the coordinates parametrising the domain $U^G_{0,L}$ in Minkowski space for both positive and negative cosmological constant.
Finally, we consider the role of the cosmological constant as a deformation parameter of the group homomorphism $h^G_{\Lambda,L}: \Gamma\to\mathrm{Isom}(X_{\Lambda,L})$ by which the cocompact Fuchsian group $\Gamma$ acts on the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$. For this we note that the map $B_{G,\Lambda,L}: \mathbb{H}^2\times\mathbb{H}^2\to\mathrm{Isom}(X_{\Lambda,L})$ in formula (62) satisfies
\[ B_{G,\Lambda,L}(p,q) = B_{\sqrt{|\Lambda|}G,\,\mathrm{sgn}(\Lambda),\,L}(p,q) \qquad \forall p,q\in\mathbb{H}^2,\ \Lambda\neq0, \tag{181} \]
which implies
\[ \frac{d}{d\sqrt{\Lambda}}\Big|_{\sqrt{\Lambda}=0} h^G_{\Lambda>0,L}(v) = \big(h^G_{0,L}(v),\,-h^G_{0,L}(v)\big)\in su(1,1)\oplus su(1,1) \qquad \forall v\in\Gamma, \tag{182} \]
\[ \frac{d}{d\sqrt{|\Lambda|}}\Big|_{\sqrt{|\Lambda|}=0} h^G_{\Lambda<0,L}(v) = i\,h^G_{0,L}(v)\in sl(2,\mathbb{C}) \qquad \forall v\in\Gamma. \tag{183} \]
Hence, one finds that in the geometrical description of grafted (2+1)-spacetimes given by Benedetti and Bonsante [22], the square root $\sqrt{|\Lambda|}$ of the cosmological constant plays the role of a deformation parameter for both the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ and the isomorphisms $h^G_{\Lambda,L}: \Gamma\to\mathrm{Isom}(X_{\Lambda,L})$ which determine how the group $\Gamma$ acts on these domains. In the former, it appears as a parameter in the coordinates parametrising the domains $U^G_{\Lambda,L}$ in the model spacetimes $X_{\Lambda,L}$, and the limit $\sqrt{|\Lambda|}\to0$ relates these coordinates to the parametrisation of the associated domain $U^G_{0,L}$ in Minkowski space. In the latter, it appears as a parameter in the group homomorphisms $h^G_{\Lambda,S}: \Gamma\to\mathrm{Isom}(X_{\Lambda,S})$, and one finds that the group homomorphism $h^G_{0,L}$ is given by the derivatives of $h^G_{\Lambda>0,L}$ and $h^G_{\Lambda<0,L}$ with respect to $\sqrt{|\Lambda|}$.
8.2. The Chern-Simons description. In the Chern-Simons description of (2+1)-spacetimes, the cosmological constant appears as a parameter in the Lie algebras $\mathfrak{h}_{\Lambda,L}$ and the associated gauge groups $\mathrm{Isom}(X_{\Lambda,L})$. To clarify in what sense it can be viewed as a deformation parameter, one has to introduce a common framework which encompasses the Lie algebras (3) for all signs of the cosmological constant. Such a framework can be realised by interpreting the Lie algebras (3) as the (2+1)-dimensional Lorentz algebra over a commutative ring $R_\Lambda$ with a $\Lambda$-dependent multiplication law.
Lemma 8.1. Consider the vector space $\mathbb{R}^2$ with the usual vector addition and a multiplication operation $\cdot: \mathbb{R}^2\times\mathbb{R}^2\to\mathbb{R}^2$ which depends on a real parameter $\Lambda$ and is given by
\[ (a,b)\cdot(c,d) = (ac+\Lambda bd,\ ad+bc) \qquad \forall a,b,c,d\in\mathbb{R}. \tag{184} \]
Then, (184) equips $\mathbb{R}^2$ with the structure of a commutative ring $R_\Lambda = (\mathbb{R}^2,+,\cdot)$ with the unit elements for the addition $+$ and the multiplication $\cdot$ given by, respectively, $(0,0)$, $(1,0)$.
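The ring of Lemma 8.1 can be sketched in a few lines of Python. The class name `RingElement` and the helper `check_ring_axioms` are illustrative names introduced here, not notation from the paper, and the multiplication is assumed to read $(a,b)\cdot(c,d) = (ac+\Lambda bd,\ ad+bc)$. The check only tests the axioms on a handful of integer sample values; it is not a proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RingElement:
    """Pair (a, b) in R_Lambda; lam is the real parameter Lambda of (184)."""
    a: int
    b: int
    lam: int

    def __add__(self, other):
        assert self.lam == other.lam, "elements must share the same Lambda"
        return RingElement(self.a + other.a, self.b + other.b, self.lam)

    def __mul__(self, other):
        assert self.lam == other.lam, "elements must share the same Lambda"
        # (a, b) . (c, d) = (ac + Lambda*bd, ad + bc)
        return RingElement(
            self.a * other.a + self.lam * self.b * other.b,
            self.a * other.b + self.b * other.a,
            self.lam,
        )


def check_ring_axioms(lam):
    """Spot-check commutativity, associativity, distributivity and the units."""
    x = RingElement(2, 3, lam)
    y = RingElement(-1, 4, lam)
    z = RingElement(5, -2, lam)
    zero = RingElement(0, 0, lam)
    one = RingElement(1, 0, lam)
    return (
        x * y == y * x
        and (x * y) * z == x * (y * z)
        and x * (y + z) == x * y + x * z
        and x + zero == x
        and x * one == x
    )
```

Integer samples keep the arithmetic exact, so the equality tests are not affected by rounding. The same class works for negative, vanishing and positive `lam`.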
In the following we will express elements of the ring $R_\Lambda$ in terms of a formal parameter $\Theta_\Lambda$ and write $a+\Theta_\Lambda b$ for the element $(a,b)$. Formally, the product of two elements $(a,b),(c,d)\in R_\Lambda$ is then given by
\[ (a+\Theta_\Lambda b)\cdot(c+\Theta_\Lambda d) = ac+\Theta_\Lambda(ad+bc)+\Theta_\Lambda^2\,bd, \tag{185} \]
and to obtain agreement with the multiplication law (184), one must set $\Theta_\Lambda^2=\Lambda$, and the parameter $\Theta_\Lambda$ can therefore be viewed as a formal square root of $\Lambda$. For $\Lambda<0$, the commutative ring $R_\Lambda$ is isomorphic to the field $\mathbb{C}$ and the formal parameter $\Theta_\Lambda$ is the complex number $\Theta_\Lambda=\sqrt{\Lambda}=i\sqrt{|\Lambda|}$. For $\Lambda=0$, the formal parameter satisfies $\Theta_0^2=0$ like the one occurring in supersymmetry and corresponds to the formal parameter $\theta$ used to parametrise the (2+1)-dimensional Poincaré group in [11]; for a more detailed discussion see also [23]. This implies that the commutative ring $R_0$ is not a field, since elements of the form $\Theta_0 b$, $b\in\mathbb{R}\setminus\{0\}$, are zero divisors:
\[ \Theta_0 b\cdot\Theta_0 c = 0 \qquad \forall b,c\in\mathbb{R}. \tag{186} \]
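The two cases just discussed can be checked numerically. The sketch below writes ring elements as pairs $(a,b)$ standing for $a+\Theta b$ with $\Theta^2=\Lambda$; the function names are illustrative and not taken from the paper. It compares the product (184) for $\Lambda<0$ with ordinary complex multiplication under $\Theta\mapsto i\sqrt{|\Lambda|}$, and evaluates the product in (186) for $\Lambda=0$.

```python
import math


def ring_mul(p, q, lam):
    """Product (184) of p = (a, b) and q = (c, d) in R_Lambda."""
    a, b = p
    c, d = q
    return (a * c + lam * b * d, a * d + b * c)


def to_complex(p, lam):
    """Map a + Theta*b to a + i*sqrt(|Lambda|)*b; only meaningful for Lambda < 0."""
    return complex(p[0], math.sqrt(-lam) * p[1])


def matches_complex_product(p, q, lam):
    """Compare the ring product with complex multiplication for Lambda < 0."""
    lhs = to_complex(ring_mul(p, q, lam), lam)
    rhs = to_complex(p, lam) * to_complex(q, lam)
    return abs(lhs - rhs) < 1e-12


def nilpotent_product(b, c):
    """Product Theta_0*b . Theta_0*c in R_0, cf. (186)."""
    return ring_mul((0.0, b), (0.0, c), 0.0)
```

For example, with `lam = -4.0` the pairs `(1.0, 2.0)` and `(3.0, -0.5)` correspond to the complex numbers `1+4i` and `3-1i`, and both sides of the comparison give `7+11i`.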
For $\Lambda>0$, the formal parameter satisfies $\Theta_\Lambda^2=\Lambda>0$. Again, $R_\Lambda$ is not a field and has zero divisors $\sqrt{\Lambda}a\pm\Theta_\Lambda a$, $a\in\mathbb{R}\setminus\{0\}$, which satisfy
\[ (\sqrt{\Lambda}a+\Theta_\Lambda a)(\sqrt{\Lambda}a-\Theta_\Lambda a) = 0, \qquad (\sqrt{\Lambda}a\pm\Theta_\Lambda a)^2 = 2\sqrt{\Lambda}a\,(\sqrt{\Lambda}a\pm\Theta_\Lambda a). \tag{187} \]
The ring $R_\Lambda$ allows one to identify all Lie algebras $\mathfrak{h}_{\Lambda,L}$ with brackets (3) with the (2+1)-dimensional Lorentz algebra, only that now this Lie algebra is no longer considered as a Lie algebra over $\mathbb{R}$ but as a Lie algebra over the commutative ring³ $R_\Lambda$. This identification of the Lie algebras $\mathfrak{h}_{\Lambda,L}$ with the (2+1)-dimensional Lorentz algebra over $R_\Lambda$ generalises the concept of complexification of real Lie algebras and in the case of negative cosmological constant yields the complexification $sl(2,\mathbb{C}) = sl(2,\mathbb{R})\oplus i\,sl(2,\mathbb{R})$. We consider the (2+1)-dimensional Lorentz algebra with generators $J_a$, $a=0,1,2$, and bracket
\[ [J_a, J_b] = \epsilon_{abc}J^c. \tag{188} \]
By identifying the generators $J_a$ with the $sl(2,\mathbb{R})$ matrices in (6), we obtain an $R_\Lambda$-module isomorphism from $R_\Lambda^3$ into the set $sl(2,R_\Lambda)$ of traceless two-by-two matrices with entries in $R_\Lambda$ or, equivalently, the set of endomorphisms of the $R_\Lambda$-module $R_\Lambda^2$ with vanishing trace form,
\[
\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}\in R_\Lambda^3 \ \mapsto\ (x^a+\Theta_\Lambda y^a)J_a
= \begin{pmatrix}
-\tfrac12(x^1+\Theta_\Lambda y^1) & \tfrac12(x^0+\Theta_\Lambda y^0)+\tfrac12(x^2+\Theta_\Lambda y^2)\\[2pt]
-\tfrac12(x^0+\Theta_\Lambda y^0)+\tfrac12(x^2+\Theta_\Lambda y^2) & \tfrac12(x^1+\Theta_\Lambda y^1)
\end{pmatrix}.
\tag{189}
\]
The commutator of two $sl(2,R_\Lambda)$ matrices then agrees with the bracket obtained by extending (188) bilinearly in $R_\Lambda$, and with the identification
\[ P_a = \Theta_\Lambda J_a \tag{190} \]
one recovers the Lie bracket (3) of the real Lie algebras $\mathfrak{h}_{\Lambda,L}$.
³ Definitions and properties concerning Lie algebras over commutative rings can be found for instance in [38], Chapter 1, but in the following we will make use only of some basic concepts.
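The effect of the identification (190) on commutators can be illustrated with exact arithmetic. The sketch below is written under two assumptions that are not taken from the text: the three matrices in `J` are hypothetical stand-ins for the $sl(2,\mathbb{R})$ matrices (6), which are not reproduced in this section, and the formal parameter $\Theta$ with $\Theta^2=\Lambda$ is represented by the pair $(0,1)$. The check confirms only what follows from $\Theta^2=\Lambda$, namely $[J_a,\Theta J_b]=\Theta[J_a,J_b]$ and $[\Theta J_a,\Theta J_b]=\Lambda[J_a,J_b]$; the explicit form of bracket (3) is not assumed.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Hypothetical stand-ins for the sl(2, R) matrices (6); they satisfy
# [J_0, J_1] = J_2 but are an assumption, not taken from the text.
J = [
    ((0, HALF), (-HALF, 0)),
    ((-HALF, 0), (0, HALF)),
    ((0, HALF), (HALF, 0)),
]


def r_add(p, q):
    return (p[0] + q[0], p[1] + q[1])


def r_sub(p, q):
    return (p[0] - q[0], p[1] - q[1])


def r_mul(p, q, lam):
    """Product (184) of ring elements p = (a, b), q = (c, d)."""
    return (p[0] * q[0] + lam * p[1] * q[1], p[0] * q[1] + p[1] * q[0])


def embed(matrix):
    """Turn a real 2x2 matrix into a matrix over the ring, entries (x, 0)."""
    return tuple(tuple((Fraction(e), Fraction(0)) for e in row) for row in matrix)


def scale(c, matrix, lam):
    """Multiply every entry of a ring-valued matrix by the ring element c."""
    return tuple(tuple(r_mul(c, e, lam) for e in row) for row in matrix)


def mat_mul(m, n, lam):
    return tuple(
        tuple(
            r_add(r_mul(m[i][0], n[0][j], lam), r_mul(m[i][1], n[1][j], lam))
            for j in range(2)
        )
        for i in range(2)
    )


def commutator(m, n, lam):
    mn, nm = mat_mul(m, n, lam), mat_mul(n, m, lam)
    return tuple(
        tuple(r_sub(mn[i][j], nm[i][j]) for j in range(2)) for i in range(2)
    )


def brackets_consistent(lam):
    """Check [J_a, P_b] = Theta [J_a, J_b] and [P_a, P_b] = Lambda [J_a, J_b]."""
    lam = Fraction(lam)
    theta = (Fraction(0), Fraction(1))
    lam_element = (lam, Fraction(0))
    js = [embed(m) for m in J]
    ps = [scale(theta, m, lam) for m in js]  # P_a = Theta * J_a, Eq. (190)
    for a, b in product(range(3), repeat=2):
        jj = commutator(js[a], js[b], lam)
        if commutator(js[a], ps[b], lam) != scale(theta, jj, lam):
            return False
        if commutator(ps[a], ps[b], lam) != scale(lam_element, jj, lam):
            return False
    return True
```

Because the ring is commutative, these relations hold for any choice of matrices; the particular `J` only makes the example concrete.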
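Moreover, the identification of the Lie algebras $\mathfrak{h}_{\Lambda,L}$ with $sl(2,R_\Lambda)$ allows one to relate the Ad-invariant, symmetric bilinear forms $\langle\,,\rangle$ and $\kappa$ on $\mathfrak{h}_{\Lambda,L}$ defined by (70), (156) to the Killing form of the (2+1)-dimensional Lorentz algebra. For this one extends the Killing form $g_K$ on $sl(2,\mathbb{R})$
\[ g_K(J_a,J_b) = \tfrac12\eta^L_{ab} = \mathrm{Tr}(J_a\cdot J_b) \tag{191} \]
bilinearly to an Ad-invariant symmetric $R_\Lambda$-bilinear form $g_K: sl(2,R_\Lambda)\times sl(2,R_\Lambda)\to R_\Lambda$. Using the parametrisation (189) and comparing the resulting expressions with (70), (156) one obtains
\[
\begin{aligned}
g_K\big((\boldsymbol{p}+\Theta_\Lambda\boldsymbol{k}),(\boldsymbol{q}+\Theta_\Lambda\boldsymbol{m})\big)
&= \tfrac12(p^aq^b+\Lambda k^am^b)\,\eta^L_{ab} + \tfrac12\Theta_\Lambda(p^am^b+k^aq^b)\,\eta^L_{ab}\\
&= \tfrac12\kappa(p^aJ_a+k^aP_a,\ q^bJ_b+m^bP_b) + \tfrac12\Theta_\Lambda\langle p^aJ_a+k^aP_a,\ q^bJ_b+m^bP_b\rangle.
\end{aligned}
\tag{192}
\]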
Hence, the Ad-invariant symmetric forms $\kappa$ and $\langle\,,\rangle$ on $\mathfrak{h}_{\Lambda,L}$ are realised as the projections of the Killing form on $sl(2,R_\Lambda)$ on, respectively, the first and second component of the ring $R_\Lambda=(\mathbb{R}^2,+,\cdot)$. This generalises the situation for $\Lambda<0$, where these forms can be identified with the real and imaginary part of the Killing form on $sl(2,\mathbb{C})$. Moreover, it sheds some light on the distinguished role played by the Ad-invariant symmetric bilinear forms $\langle\,,\rangle$ and $\kappa$ on the Lie algebra $\mathfrak{h}_{\Lambda,L}$. While any linear combination of these two forms is again an Ad-invariant, symmetric bilinear form on $\mathfrak{h}_{\Lambda,L}$, the forms $\langle\,,\rangle$ and $\kappa$ are the only ones that arise canonically from the Killing form on $sl(2,R_\Lambda)$.
We will now demonstrate how the identification of the Lie algebra $\mathfrak{h}_{\Lambda,L}$ with the (2+1)-dimensional Lorentz algebra over the commutative ring $R_\Lambda$ gives rise to the associated matrix groups $H_{\Lambda,L}$. Although in general the exponential of Lie algebras over commutative rings cannot be defined in a straightforward manner, the particularly simple structure of the ring $R_\Lambda$ allows us to obtain the groups $H_{\Lambda>0,L} = SU(1,1)\times SU(1,1)\cong SL(2,\mathbb{R})\times SL(2,\mathbb{R})$, $H_{0,L} = SU(1,1)\ltimes\mathbb{R}^3\cong SL(2,\mathbb{R})\ltimes\mathbb{R}^3$ and $H_{\Lambda<0,L} = SL(2,\mathbb{C})$ by exponentiating matrices in $sl(2,R_\Lambda)$. For this, we consider the formal expression
\[ \exp_\Lambda(\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}) = \sum_{n=0}^\infty\frac{\big((x^a+\Theta_\Lambda y^a)J_a\big)^n}{n!}, \qquad \boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}\in sl(2,R_\Lambda), \tag{193} \]
where $((x^a+\Theta_\Lambda y^a)J_a)^n$ stands for the $n$-th power of the matrices (189). In the case of negative cosmological constant, we have $\Theta_\Lambda = i\sqrt{|\Lambda|}$ and expression (193) is the exponential map $\exp_{\Lambda<0,L}: sl(2,\mathbb{C})\to SL(2,\mathbb{C})$. The case of vanishing cosmological constant is investigated in [23]. Using identity $\Theta_0^2=0$, one can express (193) as
\[ \exp_{\Lambda=0}\big((x^a+\Theta_0y^a)J_a\big) = \sum_{n=0}^\infty\frac{(x^aJ_a)^n}{n!} + \Theta_0\sum_{n=0}^\infty\frac{1}{n!}\sum_{m=0}^{n-1}(x^aJ_a)^m\,(y^bJ_b)\,(x^cJ_c)^{n-m-1}. \tag{194} \]
To simplify this expression further, we move the factors $y^bJ_b$ in (194) to the left and evaluate the resulting commutators using the formulas
\[ [(x^aJ_a)^m,\ y^bJ_b] = \sum_{k=1}^m\binom{m}{k}\,\mathrm{ad}^k_{x^aJ_a}(y^bJ_b)\cdot(x^cJ_c)^{m-k}, \qquad \sum_{m=k}^{n-1}\binom{m}{k} = \binom{n}{k+1}, \tag{195} \]
which can be proved by induction. After some further computation, this yields
\[ \exp_{\Lambda=0}\big((x^a+\Theta_0y^a)J_a\big) = \big(1+\Theta_0\,(T(\boldsymbol{x})\boldsymbol{y})^bJ_b\big)\cdot e^{x^aJ_a}, \tag{196} \]
where $e^{x^aJ_a}$ stands for the exponential of the $sl(2,\mathbb{R})$-matrix $x^aJ_a$ given by (10) and the linear map $T(\boldsymbol{x}): \mathbb{R}^3\to\mathbb{R}^3$ is the one defined in (16). By comparing with the parametrisation of the (2+1)-dimensional Poincaré group $SU(1,1)\ltimes\mathbb{R}^3$ introduced in Sect. 2, we find that (196) agrees with (15) if we identify
\[ (u,\boldsymbol{y})\in SU(1,1)\ltimes\mathbb{R}^3 \ \cong\ (1+\Theta_0y^aJ_a)\cdot u \qquad \forall u\in SL(2,\mathbb{R}),\ \boldsymbol{y}\in\mathbb{R}^3, \tag{197} \]
and we recover the multiplication law (11)
\[ (1+\Theta_0x^aJ_a)u\cdot(1+\Theta_0y^bJ_b)v = \big(1+\Theta_0(x^aJ_a+\mathrm{Ad}(u)\,y^bJ_b)\big)uv \qquad \forall u,v\in SL(2,\mathbb{R}),\ \boldsymbol{x},\boldsymbol{y}\in\mathbb{R}^3. \]
Hence, by exponentiating the Lie algebra $sl(2,R_\Lambda)$ for cosmological constant $\Lambda=0$, one obtains the parametrisation of the (2+1)-dimensional Poincaré group in Sect. 2 whose description in terms of a formal supersymmetry parameter $\Theta_0$ satisfying $\Theta_0^2=0$ was first introduced in the context of (2+1)-dimensional gravity by Martin [11].
For $\Lambda>0$, the exponential (193) can be evaluated by introducing the generators defined in (12)
\[ J^\pm_a = \tfrac12\Big(1\pm\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)J_a, \tag{198} \]
in terms of which the argument of (193) takes the form
\[ (x^a+\Theta_\Lambda y^a)J_a = (x^a+\sqrt{\Lambda}y^a)J^+_a + (x^a-\sqrt{\Lambda}y^a)J^-_a. \tag{199} \]
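Using identity (187), we recover the splitting of $\mathfrak{h}_{\Lambda>0,L}$ into the direct sum $sl(2,\mathbb{R})\oplus sl(2,\mathbb{R})$,
\[ J^+_a\cdot J^-_b = J^-_a\cdot J^+_b = 0, \qquad J^\pm_a\cdot J^\pm_b = \tfrac12\Big(1\pm\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)\big(\tfrac14\eta_{ab}+\tfrac12\epsilon_{abc}J^c\big), \tag{200} \]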
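and by applying these identities to (193) we obtain
\[
\begin{aligned}
\exp_{\Lambda>0}\big((x^a+\Theta_\Lambda y^a)J_a\big)
&= \sum_{n=0}^\infty\frac{\big((x^a+\sqrt{\Lambda}y^a)J^+_a\big)^n}{n!} + \sum_{n=0}^\infty\frac{\big((x^a-\sqrt{\Lambda}y^a)J^-_a\big)^n}{n!}\\
&= \tfrac12\Big(1+\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)e^{(x^a+\sqrt{\Lambda}y^a)J_a} + \tfrac12\Big(1-\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)e^{(x^a-\sqrt{\Lambda}y^a)J_a}.
\end{aligned}
\tag{201}
\]
With the identification
\[ (u_+,u_-) \ \cong\ \tfrac12\Big(1+\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)u_+ + \tfrac12\Big(1-\tfrac{\Theta_\Lambda}{\sqrt{\Lambda}}\Big)u_- \qquad \forall u_\pm\in SL(2,\mathbb{R}) \tag{202} \]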
we then recover formula (15) for the exponential map $\exp: sl(2,\mathbb{R})\oplus sl(2,\mathbb{R})\to SL(2,\mathbb{R})\times SL(2,\mathbb{R})$ and the group multiplication law (13). Hence, for all values of the cosmological constant, the group $H_{\Lambda,L}$ and the exponential map $\exp_{\Lambda,L}: \mathfrak{h}_{\Lambda,L}\to H_{\Lambda,L}$ can be obtained from the identification of the Lie algebra $\mathfrak{h}_{\Lambda,L}$ with $sl(2,R_\Lambda)$. The cosmological constant can therefore be implemented in the Chern-Simons formulation of (2+1)-dimensional gravity by interpreting it as a parameter in the multiplication law (184) of a commutative ring $R_\Lambda$. By parametrising the elements of this ring in terms of a formal parameter $\Theta_\Lambda$ satisfying $\Theta_\Lambda^2=\Lambda$ one then obtains a unified description of the Lie algebras $\mathfrak{h}_{\Lambda,L}$ and the associated Lie groups $H_{\Lambda,L}$, which can be identified, respectively, with the Lie algebra $sl(2,R_\Lambda)$ and the associated matrix groups obtained by exponentiation.
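The mechanism behind the splitting for positive cosmological constant can be seen already for scalars in the ring, without any matrices. The sketch below is a simplified scalar analogue and not the matrix computation of the paper: ring elements are pairs $(a,b)$ standing for $a+\Theta b$ with $\Theta^2=\Lambda>0$, and all function names are illustrative. It checks that $e_\pm=\tfrac12(1\pm\Theta/\sqrt{\Lambda})$ are orthogonal idempotents and that the power series for the exponential agrees with the closed form obtained by exponentiating the two components separately.

```python
import math


def ring_mul(p, q, lam):
    """Product (184) of p = (a, b) and q = (c, d) in R_Lambda."""
    return (p[0] * q[0] + lam * p[1] * q[1], p[0] * q[1] + p[1] * q[0])


def idempotents(lam):
    """e_pm = (1 +/- Theta/sqrt(Lambda))/2 as pairs (a, b); requires Lambda > 0."""
    root = math.sqrt(lam)
    return (0.5, 0.5 / root), (0.5, -0.5 / root)


def ring_exp_series(p, lam, terms=60):
    """Truncated power series sum_n p**n / n! in the ring."""
    total = term = (1.0, 0.0)
    for n in range(1, terms):
        term = ring_mul(term, p, lam)
        term = (term[0] / n, term[1] / n)  # term now holds p**n / n!
        total = (total[0] + term[0], total[1] + term[1])
    return total


def ring_exp_split(p, lam):
    """Closed form exp(x + root*y) e_+ + exp(x - root*y) e_- for Lambda > 0."""
    x, y = p
    root = math.sqrt(lam)
    plus, minus = math.exp(x + root * y), math.exp(x - root * y)
    e_plus, e_minus = idempotents(lam)
    return (
        plus * e_plus[0] + minus * e_minus[0],
        plus * e_plus[1] + minus * e_minus[1],
    )
```

Since $\Theta e_\pm=\pm\sqrt{\Lambda}\,e_\pm$, multiplication by $x+\Theta y$ acts on the two components as multiplication by $x\pm\sqrt{\Lambda}y$, which is why the series separates into two ordinary exponentials.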
8.3. Grafting transformations as deformed Dehn twists. We will now apply these results to demonstrate that the parameter $\Theta_\Lambda$ which can be viewed as a formal square root of $\Lambda$ appears as a deformation parameter relating the Dehn twist and grafting transformations (167) and (160) for all values of the cosmological constant $\Lambda$. For this we recall the discussion from Sect. 7.2 and Sect. 7.3, where it is shown that the grafting transformation $T^t_{\tilde\kappa,\eta,\Lambda,L}: \mathrm{Isom}(X_{\Lambda,S})^{2g}\to\mathrm{Isom}(X_{\Lambda,S})^{2g}$ associated to a closed, simple curve $\eta$ on $S_g$ and the corresponding Dehn twist $T^t_{\tilde t,\eta,\Lambda,L}: \mathrm{Isom}(X_{\Lambda,S})^{2g}\to\mathrm{Isom}(X_{\Lambda,S})^{2g}$ are generated, respectively, by Wilson loop observables $\tilde\kappa_\eta$ and $\tilde t_\eta$ constructed from bilinear forms $\kappa$ and $\langle\,,\rangle$ on $\mathfrak{h}_{\Lambda,S}$. The fact that the bilinear forms $\kappa$ and $\langle\,,\rangle$ on $\mathfrak{h}_{\Lambda,L}$ appear as projections on, respectively, the real and the $\Theta_\Lambda$-component of the Killing form $g_K$ on $sl(2,R_\Lambda)$ allows one to interpret the Wilson loop observables $\tilde\kappa_\eta$, $\tilde t_\eta$ as projections on the real and the $\Theta_\Lambda$-component of a Wilson loop observable $g_{K,\eta}\in C^\infty(\mathrm{Isom}(X_{\Lambda,L})^{2g})$ which takes values in the ring $R_\Lambda$,
\[ g_{K,\eta}(A_1,\ldots,B_g) = \tfrac12\tilde\kappa_\eta(A_1,\ldots,B_g) + \tfrac12\Theta_\Lambda\,\tilde t_\eta(A_1,\ldots,B_g). \]
Moreover, we found in Sect. 7.3 that the grafting and Dehn twist transformations are of a similar form. Both act on the holonomies $A_i$, $B_i$ by right-multiplication with functions $G^t_{\tilde t,\Lambda,L}, G^t_{\tilde\kappa,\Lambda,L}: \mathrm{Isom}(X_{\Lambda,L})\to\mathrm{Isom}(X_{\Lambda,L})$ of the holonomies of certain curves conjugated to $\eta$, which are obtained as exponentials of linear maps $l_t, l_\kappa: \mathfrak{h}_{\Lambda,L}\to\mathfrak{h}_{\Lambda,L}$. For all values of the cosmological constant, the linear map for the Dehn twist is the identity $l_t = \mathrm{id}_{\mathfrak{h}_{\Lambda,L}}$, and the associated map $G^t_{\tilde t,\Lambda,L}$ takes the form
\[ G^t_{\tilde t,\Lambda,L}(u) = u^t \qquad \forall u\in\mathrm{Isom}(X_{\Lambda,L}). \tag{203} \]
The linear map $l_\kappa$ and the associated one-parameter group of diffeomorphisms $G^t_{\tilde\kappa,\Lambda,L}$ for the grafting transformation are given by, respectively, (157) and (158). Unlike the corresponding maps for the Dehn twists, they show an explicit dependence on the cosmological constant $\Lambda$. The identification of the Lie algebra $\mathfrak{h}_{\Lambda,L}$ with $sl(2,R_\Lambda)$ allows us to relate these maps for the different values of the cosmological constant and to establish a link with the corresponding maps for Dehn twists. For this we note that for all values of the cosmological constant the linear map $l_\kappa$ on $\mathfrak{h}_{\Lambda,L}$ given by (157) can be identified with a linear map on the Lie algebra $sl(2,R_\Lambda)$, which acts by multiplication with $\Theta_\Lambda$,
\[ l_\kappa(\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}) = \Theta_\Lambda(\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}) \qquad \forall\,\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}\in sl(2,R_\Lambda). \tag{204} \]
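The discussion in the previous subsection then allows us to express the associated one-parameter group of transformations $G^t_{\tilde\kappa,\Lambda,L}: \mathrm{Isom}(X_{\Lambda,L})\to\mathrm{Isom}(X_{\Lambda,L})$ via the exponential map (193),
\[ G^t_{\tilde\kappa,\Lambda,L}\big(\exp_\Lambda(\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y})\big) = \exp_\Lambda\big(t\,\Theta_\Lambda(\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y})\big) \qquad \forall\,\boldsymbol{x}+\Theta_\Lambda\boldsymbol{y}\in sl(2,R_\Lambda). \tag{205} \]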
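Evaluating this expression by setting $\Theta_\Lambda = i\sqrt{|\Lambda|}$ for $\Lambda<0$ and by using expressions (196) and (201) for $\Lambda=0$ and $\Lambda>0$, we recover expression (158). Hence, the one-parameter groups of transformations $G^t_{\tilde\kappa,\Lambda,L}, G^t_{\tilde t,\Lambda,L}: \mathrm{Isom}(X_{\Lambda,L})\to\mathrm{Isom}(X_{\Lambda,L})$ are formally related by the identity
\[ G^t_{\tilde\kappa,\Lambda,L} = G^{t\Theta_\Lambda}_{\tilde t,\Lambda,L}. \tag{206} \]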
After inserting this identity in the expressions (167), (160) for the Dehn twist and the grafting transformation, we find that formally, the transformation of the holonomies $A_i$, $B_i$ under grafting along a closed, simple curve $\eta$ on $S_g$ with parameter $t$ can be expressed as a Dehn twist with parameter $t\Theta_\Lambda$,
\[ T^t_{\tilde\kappa,\eta,\Lambda,L} = T^{t\Theta_\Lambda}_{\tilde t,\eta,\Lambda,L}. \tag{207} \]
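The substitution of the twist parameter $t$ by $t\Theta$ can be illustrated with a scalar analogue in the ring, where $\Theta$ denotes the formal parameter with $\Theta^2=\Lambda$. The sketch below is a toy model under stated assumptions and not the transformation of the holonomies itself: group elements are replaced by ring elements $\exp(p)$ with $p=(x,y)$ standing for $x+\Theta y$, the twist factor is modelled as $\exp(tp)$ and the grafting factor as $\exp(t\Theta p)$. All function names are illustrative.

```python
import math


def ring_mul(p, q, lam):
    """Product (184) in R_Lambda for p = (a, b), q = (c, d)."""
    return (p[0] * q[0] + lam * p[1] * q[1], p[0] * q[1] + p[1] * q[0])


def ring_exp(p, lam, terms=60):
    """Truncated power series for exp(p) in the ring."""
    total = term = (1.0, 0.0)
    for n in range(1, terms):
        term = ring_mul(term, p, lam)
        term = (term[0] / n, term[1] / n)  # term is now p**n / n!
        total = (total[0] + term[0], total[1] + term[1])
    return total


def dehn_twist_factor(t, p, lam):
    """Toy version of u -> u**t with u = exp(p)."""
    return ring_exp((t * p[0], t * p[1]), lam)


def grafting_factor(t, p, lam):
    """Toy version of exp(p) -> exp(t * Theta * p)."""
    theta_p = ring_mul((0.0, 1.0), p, lam)  # multiplication by Theta
    return ring_exp((t * theta_p[0], t * theta_p[1]), lam)


def components(p, lam):
    """Split a + Theta*b into (a + sqrt(Lambda) b, a - sqrt(Lambda) b), Lambda > 0."""
    root = math.sqrt(lam)
    return (p[0] + root * p[1], p[0] - root * p[1])
```

In this toy model, vanishing `lam` gives `grafting_factor(t, (x, y), 0.0)` equal to `(1, t*x)` up to rounding, which is the first-order term of the twist. For positive `lam` the two components of the grafting factor are the components of `exp(p)` raised to the powers `t*sqrt(lam)` and `-t*sqrt(lam)`, mirroring the pattern described for the two copies of the Lorentz group.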
By interpreting the Lie algebra $\mathfrak{h}_{\Lambda,L}$ of the gauge group in the Chern-Simons formulation as a (2+1)-dimensional Lorentz algebra over the commutative ring $R_\Lambda$, we therefore establish a common pattern which relates the grafting and Dehn twist transformations for all values of the cosmological constant. The dependence on the cosmological constant is encoded in the formal parameter $\Theta_\Lambda$ satisfying $\Theta_\Lambda^2=\Lambda$ which plays the role of a deformation parameter. In this formalism, the two Wilson loop observables $\tilde t_\eta$, $\tilde\kappa_\eta$ associated to a closed, simple curve $\eta$ on $S_g$ arise canonically as the projection on the $\Theta_\Lambda$-component and on the real component of the $R_\Lambda$-valued Wilson loop observable (203) constructed from the Killing form on $sl(2,R_\Lambda)$. Via the Poisson bracket, these two canonical observables generate the two basic geometry changing transformations on the phase space. The former acts as the Hamiltonian for the Dehn twist transformations (167), while the latter is the Hamiltonian for the grafting transformation (160). Viewed as transformations over the ring $R_\Lambda$, these two phase space transformations exhibit a similar structure and can be transformed into each other by substituting $t\to t\Theta_\Lambda$.
9. Conclusions
In this paper we clarified the relation between the geometrical and the Chern-Simons description of (2+1)-dimensional spacetimes of topology $\mathbb{R}\times S_g$ for Lorentzian signature and general cosmological constant and for the Euclidean case with negative cosmological constant. We showed how the fact that such spacetimes are obtained as quotients of the model spacetimes $X_{\Lambda,S}$ corresponds to the trivialisation of the gauge field in the Chern-Simons formalism. This allowed us to relate the variables encoding the physical degrees of freedom in the two approaches, the group homomorphism $h^G_{\Lambda,S}: \Gamma\to\mathrm{Isom}(X_{\Lambda,S})$ in the geometric formulation and the holonomies along a set of generators of the fundamental group $\pi_1(S_g)$ in the Chern-Simons description.
We demonstrated how the construction of evolving (2+1)-spacetimes via grafting along closed, simple geodesics $\eta$ gives rise to a transformation on the phase space of the associated Chern-Simons theory. After deriving an explicit expression for the transformation of the holonomies, we showed that this transformation is generated via the Poisson bracket by one of the two canonical Wilson loop observables associated to $\eta$, while the other observable generates Dehn twists. We found a close relation between the action of these transformations on the phase space which is reflected in a general symmetry relation for the associated Wilson loop observables.
Finally, we investigated the role of the cosmological constant in the geometrical and the Chern-Simons formulation of the theory with Lorentzian signature. We found that the square root of minus the cosmological constant can be viewed as a deformation parameter in the parametrisation of the domains $U^G_{\Lambda,L}\subset X_{\Lambda,L}$ and in the group homomorphisms $h^G_{\Lambda,L}: \Gamma\to\mathrm{Isom}(X_{\Lambda,L})$. In the Chern-Simons formulation, we obtained a unified description for the different signs of the cosmological constant by identifying the Lie algebras of the gauge groups with the (2+1)-dimensional Lorentz algebra $sl(2,R_\Lambda)$ over a commutative ring $R_\Lambda$. In this framework, the cosmological constant
arises as a parameter in the ring’s multiplication law and can be implemented via a formal parameter $\Theta_\Lambda$ satisfying $\Theta_\Lambda^2=\Lambda$. By extending the Killing form on the (2+1)-dimensional Lorentz algebra to an Ad-invariant, bilinear form on $sl(2,R_\Lambda)$ and considering the associated Wilson loop observables with values in $R_\Lambda$, we found that the Wilson loop observables generating grafting and Dehn twists arise canonically as the projections on the real and the $\Theta_\Lambda$-component of this $R_\Lambda$-valued observable. Moreover, we found that a grafting transformation with weight $w$ associated to a closed, simple curve $\eta$ on $S_g$ can be viewed as a Dehn twist around $\eta$ with parameter $\Theta_\Lambda w$.
These results clarify the relation between spacetime geometry and the description of the phase space in the Chern-Simons formalism and provide a geometrical interpretation of the Wilson loop observables. Moreover, we obtained explicit expressions for the action of grafting and Dehn twists in Fock and Rosly’s description of the phase space [12], which is the basis of the combinatorial quantisation formalism [15, 16] and the related approaches [17] and [18] for the group $SL(2,\mathbb{C})$ and semidirect product groups $G\ltimes\mathfrak{g}^*$ such as the (2+1)-dimensional Poincaré group. It would therefore be interesting to see how these results can be applied to the quantised theory and to use them to investigate concrete physics questions in quantum (2+1)-gravity.
Acknowledgements. I thank Bernd Schroers for comments on the draft of this paper. Research at Perimeter Institute is supported in part by the Government of Canada through NSERC and by the Province of Ontario through MEDT.
References
1. Carlip, S.: Quantum gravity in 2+1 dimensions. Cambridge University Press, Cambridge (1998)
2. Carlip, S.: Quantum Gravity in 2+1 Dimensions: The Case of a Closed Universe. Living Rev. Rel. 8, 1 (2005)
3. Achucarro, A., Townsend, P.: A Chern–Simons action for three-dimensional anti-de Sitter supergravity theories. Phys. Lett. B 180, 85–100 (1986)
4. Witten, E.: 2+1 dimensional gravity as an exactly soluble system. Nucl. Phys. B 311, 46–78 (1988), Nucl. Phys. B 339, 516–32 (1988)
5. Nelson, J.E., Regge, T.: Homotopy groups and (2+1)-dimensional quantum gravity. Nucl. Phys. B 328, 190–202 (1989)
6. Nelson, J.E., Regge, T.: (2+1) Gravity for genus > 1. Commun. Math. Phys. 141, 211–23 (1991)
7. Nelson, J.E., Regge, T.: (2+1) Gravity for higher genus. Class. Quant. Grav. 9, 187–96 (1992)
8. Nelson, J.E., Regge, T.: The mapping class group for genus 2. Int. J. Mod. Phys. B 6, 1847–1856 (1992)
9. Nelson, J.E., Regge, T.: Invariants of 2+1 quantum gravity. Commun. Math. Phys. 155, 561–568 (1993)
10. Ashtekar, A., Husain, V., Rovelli, C., Samuel, J., Smolin, L.: (2+1) quantum gravity as a toy model for the (3+1) theory. Class. Quant. Grav. 6, L185–L193 (1989)
11. Martin, S.P.: Observables in 2+1 dimensional gravity. Nucl. Phys. B 327, 178–204 (1989)
12. Fock, V.V., Rosly, A.A.: Poisson structure on moduli of flat connections on Riemann surfaces and r-matrices. Am. Math. Soc. Transl. 191, 67–86 (1999)
13. Alekseev, A.Y., Malkin, A.Z.: Symplectic structure of the moduli space of flat connections on a Riemann surface. Commun. Math. Phys. 169, 99–119 (1995)
14. Meusburger, C., Schroers, B.J.: Poisson structure and symmetry in the Chern-Simons formulation of (2+1)-dimensional gravity. Class. Quant. Grav. 20, 2193–2234 (2003)
15. Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory. Commun. Math. Phys. 172, 317–58 (1995)
16. Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory II. Commun. Math. Phys. 174, 561–604 (1995)
17. Buffenoir, E., Noui, K., Roche, P.: Hamiltonian Quantization of Chern-Simons theory with SL(2, C) Group. Class. Quant. Grav. 19, 4953–5016 (2002)
18. Meusburger, C., Schroers, B.J.: The quantisation of Poisson structures arising in Chern-Simons theory with gauge group $G\ltimes\mathfrak{g}^*$. Adv. Theor. Math. Phys. 7, 1003–1043 (2004)
19. Mess, G.: Lorentz spacetimes of constant curvature. Preprint IHES/M/90/28, Avril 1990
20. Benedetti, R., Guadagnini, E.: Cosmological time in (2+1)-gravity. Nucl. Phys. B 613, 330–352 (2001)
21. Benedetti, R., Bonsante, F.: Wick rotations in 3D gravity: $\mathcal{ML}(\mathbb{H}^2)$ spacetimes. http://arxiv.org/list/math.DG/0412470, 2004
22. Benedetti, R., Bonsante, F.: Canonical Wick Rotations in 3-dimensional gravity. http://arxiv.org/list/math.DG/0508485, 2004
23. Meusburger, C.: Grafting and Poisson structure in (2+1)-gravity with vanishing cosmological constant. Commun. Math. Phys. 266, 735–775 (2006)
24. Benedetti, R., Petronio, C.: Lectures on Hyperbolic Geometry. Berlin-Heidelberg: Springer Verlag, 1992
25. Katok, S.: Fuchsian Groups. Chicago: The University of Chicago Press, 1992
26. Goldman, W.M.: Projective structures with Fuchsian holonomy. J. Diff. Geom. 25, 297–326 (1987)
27. Hejhal, D.A.: Monodromy groups and linearly polymorphic functions. Acta. Math. 135, 1–55 (1975)
28. Maskit, B.: On a class of Kleinian groups. Ann. Acad. Sci. Fenn. Ser. A 442, 1–8 (1969)
29. Thurston, W.P.: Geometry and Topology of Three-Manifolds. Lecture notes, Princeton, NJ: Princeton University, 1979
30. Thurston, W.P.: Earthquakes in two-dimensional hyperbolic geometry. In: Epstein, D.B. (ed.), Low dimensional topology and Kleinian groups. Cambridge: Cambridge University Press, 1987, pp. 91–112
31. McMullen, C.: Complex Earthquakes and Teichmüller theory. J. Amer. Math. Soc. 11, 283–320 (1998)
32. Sharpe, R.W.: Differential Geometry. Springer Verlag, New York (1996)
33. Matschull, H.-J.: On the relation between (2+1) Einstein gravity and Chern-Simons Theory. Class. Quant. Grav. 16, 2599–609 (1999)
34. Meusburger, C.: Dual generators of the fundamental group and the moduli space of flat connections. J. Phys. A: Math. Gen. 39, 14781–14832 (2006)
35. Stachura, P.: Poisson-Lie structures on Poincaré and Euclidean groups in three dimensions. J. Phys. A 31, 4555–4564 (1998)
36. Goldman, W.M.: Invariant functions on Lie groups and Hamiltonian flows of surface group representations. Invent. Math. 85, 263–302 (1986)
37. Meusburger, C., Schroers, B.J.: Mapping class group actions in Chern-Simons theory with gauge group $G\ltimes\mathfrak{g}^*$. Nucl. Phys. B 706, 569–597 (2005)
38. Bourbaki, N. (Pseud.): Elements of Mathematics, Lie groups and Lie algebras, Part I: Chapters 1–3. Paris: Hermann, 1975
Communicated by G.W. Gibbons
Commun. Math. Phys. 273, 755–783 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0265-8
Communications in
Mathematical Physics
The Parameter Planes of $\lambda z^m \exp(z)$ for $m \ge 2$

Núria Fagella¹, Antonio Garijo²

¹ Dep. de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Gran Via de les Corts Catalanes, 585, 08005 Barcelona, Spain. E-mail: [email protected]
² Dep. d’Eng. Informàtica i Matemàtiques, Universitat Rovira i Virgili, Av. Països Catalans, 26, 43007 Tarragona, Spain. E-mail: [email protected]

Received: 3 August 2006 / Accepted: 24 November 2006
Published online: 31 May 2007 – © Springer-Verlag 2007
Abstract: We consider the families of entire transcendental maps given by $F_{\lambda,m}(z) = \lambda z^m \exp(z)$, where $m \ge 2$. All functions $F_{\lambda,m}$ have a superattracting fixed point at $z = 0$ and a critical point at $z = -m$. In the parameter planes we focus on the capture zones, i.e., the $\lambda$ values for which the critical point belongs to the basin of attraction of $z = 0$, denoted by $A(0)$. In particular, we study the main capture zone (parameter values for which the critical point lies in the immediate basin, $A^*(0)$) and prove that it is bounded, connected and simply connected. All other capture zones are unbounded and simply connected. For each parameter $\lambda$ in the main capture zone, $A(0)$ consists of a single connected component with non-locally connected boundary. For all remaining values of $\lambda$, $A^*(0)$ is a quasidisk. In a different approach, we introduce some families of holomorphic maps of $\mathbb{C}^*$ which serve as a model for $F_{\lambda,m}$, in the sense that they are related to $F_{\lambda,m}$ by means of quasiconformal surgery.

Both authors were supported by MTM2005-02139/Consolider (including a FEDER contribution) and CIRIT 2005 SGR01028. The first author was also supported by MTM2006-05849/Consolider (including a FEDER contribution).

1. Introduction and Results

One of the central topics in complex dynamics is the study of the dynamics of the quadratic polynomial $Q_c(z) = z^2 + c$. The dynamical behavior of the map $Q_c$ is determined by the orbit of the unique critical point $z = 0$. These maps have been thoroughly studied by many authors (see for example [DH1, DH2, CG, M1, L]). In analogy with the quadratic family of polynomials $Q_c$, the exponential map $E_\lambda(z) = \lambda \exp(z)$, with a unique asymptotic value at $v = 0$, is the simplest example of an entire transcendental map with rich and interesting dynamics.

The systematic study of cubic polynomials began with the work of Branner and Hubbard ([BH1]), who considered the two parameter family of monic and centered cubic polynomials which, after a suitable normalization, is given by $C_{a,b}(z) = z^3 - 3a^2 z + b$.
Notice that any cubic polynomial is affine conjugate to one in this family. The dynamics of monic centered cubic polynomials is determined by the orbits of the two critical points located at $\pm a$. Moreover, they proved that the cubic connectedness locus, which is the subset of $\mathbb{C}^2$ consisting of all the parameters $(a, b) \in \mathbb{C} \times \mathbb{C}$ such that $J(C_{a,b})$ is connected, is compact and connected. Many authors have investigated subfamilies, or slices, of the family of cubic polynomials (among others see [M, Fau, BH2, R, Z, BuHe]). Milnor studied the one parameter family of cubic polynomials having a superattracting fixed point ([M]). These polynomials are given by

$$M_a(z) = z^3 - \frac{3}{2}\, a z^2. \tag{1}$$
It is easy to see that $M_a$ has a superattracting fixed point at $z = 0$ and a free critical point at $z = a$. When $z = a$ belongs to the basin of attraction of the superattracting fixed point $z = 0$ we say that the critical point $z = a$ has been captured. The connected components of the parameter space for which this phenomenon occurs are called capture zones. We also define the main capture zone as the set of parameter values $a$ for which the critical point $z = a$ belongs to the immediate basin of $z = 0$. The original parametrization of the Milnor cubic polynomials was $\tilde{C}_a(z) = z^3 - 3a^2 z + 2a^3 + a$, but both families are equivalent since they are conjugate under an affine change of coordinates.

Milnor ([M]) suggested two questions about the family of cubic polynomials $M_a$, one in the dynamical plane and another one in the parameter plane. The first one was to investigate whether, for all parameter values $a$, the boundary of the immediate basin of attraction of $z = 0$ is a Jordan curve. The second one was to investigate whether the boundary of the main capture zone is a Jordan curve. Both questions were answered by Faught ([Fau]) using a modification of Yoccoz’s puzzle for a rational like mapping (see [R]). Faught proved that for all parameter values $a \in \mathbb{C}$ the immediate basin of attraction of $z = 0$ is a Jordan domain, and also that the boundary of the main capture zone is a Jordan curve. Roesch ([R]) generalized this result, in the dynamical plane, to an extended family of the Milnor cubic polynomials. More precisely, we can consider the family of polynomials

$$M_{m,a}(z) = z^{m+1} - \frac{m+1}{m}\, a z^m \tag{2}$$
as a generalized family of the Milnor cubic polynomials. For each $m \ge 2$ the point $z = 0$ is a superattracting fixed point of multiplicity $m$, and $z = a$ is a free critical point (when $m = 2$ we find exactly the Milnor cubic polynomial $M_a$). It is proven in [R] that for every value of $m \ge 2$ and for all parameters $a \in \mathbb{C}$, the boundary of the immediate basin of attraction of the superattracting fixed point $z = 0$ is a Jordan curve.

Our goal in this work is to study some dynamical aspects of the families of entire transcendental maps

$$F_{\lambda,m}(z) = \lambda z^m \exp(z), \quad m \ge 2. \tag{3}$$
All functions of the form Fλ,m , with m ≥ 2, have a superattracting fixed point at z = 0 of multiplicity m, which is also an asymptotic value. The only other critical point is z = −m. The coexistence of a superattracting fixed point and a free critical point makes this family an entire transcendental analogue of the generalized Milnor polynomials (Eq. (2)).
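The location of the critical points can be checked directly, since $F'_{\lambda,m}(z) = \lambda z^{m-1} e^z (m + z)$ vanishes only at $z = 0$ and $z = -m$. The following is a minimal sketch of that check; the function names `F`, `dF` and `critical_value` are our own and do not come from the paper.

```python
import cmath

def F(z, lam, m):
    """F_{lambda,m}(z) = lambda * z**m * exp(z)."""
    return lam * z**m * cmath.exp(z)

def dF(z, lam, m):
    """Closed-form derivative: lambda * z**(m-1) * exp(z) * (m + z)."""
    return lam * z**(m - 1) * cmath.exp(z) * (m + z)

def critical_value(lam, m):
    """Image of the free critical point z = -m."""
    return F(-m, lam, m)
```

The derivative is zero at both critical points for every $\lambda$, and `critical_value(lam, m)` returns $\lambda (-m)^m e^{-m}$, the only critical value other than $0$.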
Some functions in the family $F_{\lambda,m} = \lambda z^m \exp(z)$ for $m \ge 2$ have been used in the literature as examples of certain dynamical phenomena (see for example [Be], for a Baker domain at a positive distance from any singular orbit for a lift of a certain member $F_{\lambda,m}$). We also notice that fixed points of $F_{\lambda,m}$ appear in a different mathematical context. More precisely, $F_{\lambda,m}(z) = z$ is the characteristic equation of the following delay differential equation:

$$\frac{d^{m-1}x}{dt^{m-1}} = \frac{1}{\lambda}\, x(t - 1).$$

If we search for some value $z_0$ such that $x(t) = c\, e^{z_0 t}$ is a solution, we obtain the characteristic equation $\lambda z_0^m \exp(z_0) = z_0$.

In [FG] we made an initial study of the discrete dynamical system generated by the map $F_{\lambda,m}$. We focused our attention on a description of the dynamical planes, and especially on the basin of attraction of the superattracting fixed point at $z = 0$. In this paper we turn our attention to the parameter planes of the family of functions $F_{\lambda,m}$. As A. Douady said of the usual practice in complex dynamics: “you first plow in the dynamical plane and then harvest in the parameter space”.

As we mentioned, the origin is a superattracting fixed point of the function $F_{\lambda,m}$, for all $m \ge 2$ and $\lambda \in \mathbb{C}$. We denote by $A(0) = A_{\lambda,m}(0)$ the basin of attraction of the origin, given by

$$A(0) = A_{\lambda,m}(0) = \{z \in \mathbb{C} \mid F^{\circ n}_{\lambda,m}(z) \to 0 \text{ as } n \to \infty\}. \tag{4}$$
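For completeness, the substitution behind the characteristic equation of the delay differential equation mentioned above can be written out as follows. This is only a worked version of the step stated in the text, with $c \neq 0$ assumed.

```latex
% Substitute x(t) = c e^{z_0 t} into  d^{m-1}x/dt^{m-1} = (1/\lambda) x(t-1).
\begin{align*}
\frac{d^{m-1}}{dt^{m-1}}\bigl(c\,e^{z_0 t}\bigr) &= c\,z_0^{m-1} e^{z_0 t},
&
\frac{1}{\lambda}\,x(t-1) &= \frac{c}{\lambda}\,e^{z_0 t} e^{-z_0}. \\
\intertext{Equating both sides and cancelling $c\,e^{z_0 t}$ gives}
\lambda\, z_0^{m-1} e^{z_0} &= 1,
&
\text{hence}\quad \lambda\, z_0^{m} e^{z_0} &= z_0 .
\end{align*}
```

The last identity is exactly $F_{\lambda,m}(z_0) = z_0$. Multiplying by $z_0$ also introduces the root $z_0 = 0$, which does not solve $\lambda z_0^{m-1} e^{z_0} = 1$.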
The immediate basin of attraction of $z = 0$ is the connected component of $A(0)$ containing $z = 0$, and we denote it by $A^*(0) = A^*_{\lambda,m}(0)$.

One of the main objectives of this work is the study of $A_{\lambda,m}(0)$. We would like to answer the following questions: How many connected components does $A_{\lambda,m}(0)$ have? Are they simply connected? Are they bounded? When is the boundary of $A^*_{\lambda,m}(0)$ locally connected?

For some parameter values, the free critical point $z = -m$ belongs to the basin of attraction of $z = 0$, in which case we say that it has been captured. The connected components of parameter space for which this phenomenon occurs are called capture zones, and they clearly do not exist for members of the family $F_{\lambda,m}$ with $m < 2$, i.e., for the exponential family. We will study the capture zones given by

$$H_m^n = \{\lambda \in \mathbb{C} \mid F^{\circ n}_{\lambda,m}(-m) \in A^*_{\lambda,m}(0) \text{ and } n \text{ is the smallest number with this property}\}. \tag{5}$$

As a special case, we define the main capture zone, $H_m^0$, as the set of parameter values $\lambda$ for which the critical point $z = -m$ itself belongs to the immediate basin of $0$. That is,

$$H_m^0 = \{\lambda \in \mathbb{C} \mid -m \in A^*_{\lambda,m}(0)\}. \tag{6}$$
We shall see that this is a quite special capture zone, since its boundary separates the parameter values for which $\mathcal{F}(F_{\lambda,m})$ has one connected component from those for which it has infinitely many.

In the parameter plane we will answer the following questions: Is $H_m^n$ connected? Are the connected components of $H_m^n$ simply connected? Are they bounded? How does the boundary of $A^*_{\lambda,m}(0)$ depend on $\lambda$? Is $\partial A^*_{\lambda,m}(0)$ locally connected when $\lambda$ belongs to $H_m^n$?
In order to answer all of these questions we divide our study into two parts. In the first one, we study directly the family of functions $F_{\lambda,m} = \lambda z^m \exp(z)$ using standard tools in complex dynamics. In the second one, we relate it to a new family of maps given by

$$G_{\alpha,\beta,m}(z) = \exp(i\alpha)\, z^m \exp\bigl(\beta/2\,(z - 1/z)\bigr), \tag{7}$$
where $\alpha$ and $\beta$ are real numbers and $m \ge 2$. The family of functions $G_{\alpha,\beta,2}$ has been investigated as real maps on the unit circle by M. Misiurewicz and A. Rodrigues ([MR]). Using quasiconformal surgery, we relate members of $G_{\alpha,\beta,m}$ to those of $F_{\lambda,m}$, and use this correspondence to prove some results for the original maps.

We first concentrate on the dynamical plane, and especially on the basin of attraction $A_{\lambda,m}(0)$. More precisely, we prove the following result related to the topology of the connected components of $A_{\lambda,m}(0)$.

Proposition A. Let $\lambda \in \mathbb{C}$, $m \in \mathbb{N}$, $m \ge 2$ and $F_{\lambda,m}(z) = \lambda z^m \exp(z)$. Let $A_{\lambda,m}(0)$ and $A^*_{\lambda,m}(0)$ be the basin and the immediate basin of attraction of $z = 0$ for the map $F_{\lambda,m}$, respectively. The following statements hold.
a) All connected components of the Fatou set of $F_{\lambda,m}$ are simply connected.
b) $A_{\lambda,m}(0)$ has either one or infinitely many connected components.
c) All the connected components of $A_{\lambda,m}(0)$ different from $A^*_{\lambda,m}(0)$ are unbounded.

Further, we describe the main features of the parameter planes of the functions $F_{\lambda,m}$ and, in particular, the structure of the capture zones. We summarize some of these facts in the following theorems. In the first one we study the topology of the capture zones. In the second one we investigate the local connectivity of the boundary of $A^*_{\lambda,m}(0)$. In the third we study the complement of the closure of the main capture zone $H_m^0$.

Theorem B. For all parameters $m \in \mathbb{N}$, $m \ge 2$, let $H_m^n$, $H_m^0$ be the capture zones as in (5) and (6), respectively. The following statements hold:
a) The critical point $-m$ belongs to $A^*_{\lambda,m}(0)$ if and only if the critical value $F_{\lambda,m}(-m)$ belongs to $A^*_{\lambda,m}(0)$. Hence $H_m^1 = \emptyset$.
b) There exist $\rho = \rho(m)$, $\rho' = \rho'(m)$ verifying $0 < \rho < \rho'$ such that $D_\rho \subset H_m^0 \subset D_{\rho'}$, where $D_r = \{z \in \mathbb{C} \mid |z| < r\}$.
c) The main capture zone $H_m^0$ is connected and simply connected.
d) Let $n \ge 2$. All the connected components of $H_m^n$ are simply connected and unbounded.

Theorem C. Let $\lambda \in \mathbb{C}$, $m \in \mathbb{N}$, $m \ge 2$. Let $H_m^n$, $H_m^0$ be the capture zones as in (5) and (6), respectively. The following statements hold:
a) If $\lambda \in H_m^0$ then $A^*_{\lambda,m}(0) = A_{\lambda,m}(0)$. However, if $\lambda \notin H_m^0$ then $A_{\lambda,m}(0)$ has infinitely many connected components.
b) If $\lambda \in H_m^0$ the boundary of $A^*_{\lambda,m}(0)$ (which is equal to the Julia set) is a Cantor bouquet, and it is not locally connected.
c) Let $U_m$ be the unbounded connected component of $\mathbb{C} \setminus \overline{H_m^0}$. If $\lambda \in U_m$, then the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle. In particular, if $\lambda \in H_m^n$ for any $n \ge 2$, the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle.
Theorem D. Let $U_m$ be the unbounded connected component of $\mathbb{C} \setminus \overline{H_m^0}$. The following statements hold:
a) $\partial H_m^0 = \partial U_m$.
b) If there exists a bounded connected component $V$ of $\mathbb{C} \setminus \overline{H_m^0}$, then $U_m$, $H_m^0$ and $V$ are lakes of Wada, i.e., they have a common boundary.

There is another question which will remain unanswered in this work and which we state as a conjecture.

Conjecture. The boundary of $H_m^0$ is a Jordan curve.

Finally, we take a second approach, using quasiconformal surgery, to further describe the maps at hand. More precisely, we consider the family
$$G_{\alpha,\beta,m}(z) = \exp(i\alpha)\, z^m \exp\bigl(\beta/2\,(z - 1/z)\bigr),$$

where $\alpha$ and $\beta$ are real numbers and $m \ge 2$, which we relate to the original one by means of quasiconformal surgery. Roughly speaking, quasiconformal surgery is a technique to construct holomorphic maps with some prescribed dynamics. In our case, we combine two dynamical systems acting in different parts of the plane to construct a new system that combines the dynamics of both. In this process we use quasiconformal mappings to glue different behaviors. The key ingredient of this technique is the Measurable Riemann Mapping Theorem ([Ah, LV]), which ensures that the resulting mapping is a holomorphic map.

For our construction, the fact that $G_{\alpha,\beta}$ preserves the unit circle $S^1$ will play a fundamental role. More precisely, $G_{\alpha,\beta}$ induces a one dimensional mapping on the unit circle,

$$\widetilde{G}_{\alpha,\beta,m} : \theta \mapsto \alpha + m\theta + \beta \sin(\theta) \mod (2\pi), \quad \theta \in \mathbb{R}/2\pi\mathbb{Z}.$$

We are interested in the set of parameters

$$W_m = \{(\alpha, \beta) \mid \widetilde{G}_{\alpha,\beta,m} \text{ is quasisymmetrically conjugate to } \theta \mapsto m\theta\},$$

which in particular includes all those for which $\widetilde{G}_{\alpha,\beta,m}$ is an expanding map on the unit circle ([SS]). This is summarized in the following theorem.

Theorem E. For any $(\alpha, \beta) \in W_m$, there exists a $\lambda$ in the complement of $H_m^0$ such that $F_{\lambda,m}$ is quasiconformally conjugate on the complement of $A^*_{\lambda,m}(0)$ to $G_{\alpha,\beta}$ on the complement of the closed unit disc. For this value of $\lambda$ the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle.
The rest of the paper is organized as follows. In Sect. 2 we present some previous results concerning the basin of attraction of the origin. In Sect. 3 we summarize some tools which we will use in this paper. Finally, Sects. 4 and 5 are devoted to proving the main results of this work. Experts can go directly to Sect. 4.
Fig. 1. Sketch of some sets included in the basin of attraction of $z = 0$. Precisely, $D_{\epsilon_0} \subset A^*_{\lambda,m}(0)$, $F_{\lambda,m}(H_{|\lambda|,m}) \subset D_{\epsilon_0}$ and $F_{\lambda,m}(S^k_{\lambda,m}) \subset H_{|\lambda|,m}$
2. Preliminaries

The systematic study of the functions $F_{\lambda,m}$ was started in [FG]. In this section we recall some results from that work that will be useful later on.

Theorem 2.1 (Skeleton of $A_{\lambda,m}(0)$, see Fig. 1). Let $\lambda \in \mathbb{C}$, $m \in \mathbb{N}$, $m \ge 2$ and $F_{\lambda,m}(z) = \lambda z^m \exp(z)$. Let $A^*_{\lambda,m}(0)$ be the immediate basin of attraction of $z = 0$ for the map $F_{\lambda,m}$. The following statements hold:
a) For $\lambda \neq 0$, if we define $\epsilon_0 = \epsilon_0(|\lambda|, m) > 0$ as the unique positive solution of $x^{m-1} e^x = 1/|\lambda|$, then $A^*_{\lambda,m}(0)$ contains the disk $D_{\epsilon_0} = \{z \in \mathbb{C} \,;\, |z| < \epsilon_0\}$.
b) There exist $x_0 = x_0(|\lambda|, m) < 0$ and a continuous (decreasing) function $x \mapsto C(x) > 0$ defined for $x < x_0$ such that the open set
$$H_{|\lambda|,m} = \{z = x + yi \mid x \in (-\infty, x_0),\ y \in (-C(x), C(x))\}$$
satisfies $F_{\lambda,m}(H_{|\lambda|,m}) \subset D_{\epsilon_0}$.
c) There exist infinitely many strips, denoted by $S^k_{\lambda,m}$, which are preimages of $H_{|\lambda|,m}$. These horizontal strips extend to $+\infty$, and they have asymptotic width equal to $\pi$.

The skeleton of the main components of $A_{\lambda,m}(0)$ is needed later to study the parameter planes. In the first statement of Theorem 2.1 we give an estimate of the size of the immediate basin of attraction of $z = 0$. Since $z = 0$ is a superattracting fixed point, there exists $\epsilon_0 > 0$ such that the open disk $D_{\epsilon_0} = \{z \in \mathbb{C} \,;\, |z| < \epsilon_0\}$ is contained in the immediate basin of attraction of $z = 0$. In the second statement we find the first preimage of $D_{\epsilon_0}$, which contains an unbounded open set in $\mathbb{C}$ extending to the left and containing an unbounded interval $(-\infty, x_0)$ for some real value $x_0$. In the third statement we find the second preimage of $D_{\epsilon_0}$, which contains countably many horizontal strips extending to $+\infty$. In the following auxiliary result we find a lower bound for $\epsilon_0$, which will be used later on.

Lemma 2.2. The value of $\epsilon_0$ is always larger than or equal to $\min\bigl\{1, \bigl(\tfrac{1}{|\lambda| e}\bigr)^{\frac{1}{m-1}}\bigr\}$.
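The radius in statement a) of Theorem 2.1 has no closed form, but it is easy to approximate, since $x \mapsto x^{m-1} e^x$ is increasing on $(0, \infty)$. The following sketch uses plain bisection and compares the result with the lower bound of Lemma 2.2. The function names `epsilon0` and `lemma_2_2_bound` and the tolerance are our own choices, not taken from [FG].

```python
import math

def epsilon0(abs_lam, m, tol=1e-14):
    """Positive root x of x**(m-1) * exp(x) = 1/abs_lam, found by bisection."""
    target = 1.0 / abs_lam
    g = lambda x: x ** (m - 1) * math.exp(x) - target
    lo, hi = 0.0, 1.0
    while g(hi) < 0:          # enlarge the bracket until g changes sign
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lemma_2_2_bound(abs_lam, m):
    """Lower bound min{1, (1/(|lambda| e))**(1/(m-1))} from Lemma 2.2."""
    return min(1.0, (1.0 / (abs_lam * math.e)) ** (1.0 / (m - 1)))
```

For $|\lambda| = 1$ and $m = 2$ the root solves $x e^x = 1$, while the bound of Lemma 2.2 equals $1/e$, so in this case the bound is not sharp.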
3. Tools

In this section we present well known tools in complex dynamics which we will use in this paper. We also present applications of some of them to our particular case. The first tool is a classical result related to the behavior of holomorphic maps near a superattracting fixed point ([Bo]), which we apply to make a detailed description of the superattracting basin of $z = 0$ for $F_{\lambda,m}$. The second subsection concerns the extension of a holomorphic motion, established by Słodkowski ([Sl]). In the third one, we briefly recall the relevant definitions and results on quasiconformal mappings ([Ah, LV]). Finally, in the miscellanea subsection we provide precise definitions of several concepts related to circle maps.

3.1. Böttcher coordinates near a superattracting fixed point.

Theorem 3.1. Suppose that $f$ is a holomorphic map, defined in some neighborhood $U$ of $0$, having a superattracting fixed point at $0$, i.e.,
$$f(z) = a_m z^m + a_{m+1} z^{m+1} + \cdots,$$
where $m \ge 2$ and $a_m \neq 0$. Then, there exists a local conformal change of coordinate $w = \varphi(z)$, called Böttcher coordinate at $0$ (or Böttcher map), such that $\varphi \circ f \circ \varphi^{-1}$ is the map $w \mapsto w^m$ throughout some neighborhood of $\varphi(0) = 0$. Furthermore, $\varphi$ is unique up to multiplication by an $(m-1)$st root of unity.

In practice, it is customary to make a linear change of coordinates so that the map $f$ is monic, i.e., so that $a_m = 1$. When $f$ is monic we obtain a unique Böttcher coordinate such that $\lim_{z \to 0} \varphi(z)/z = 1$. Also it is natural to extend $\varphi$ to a maximal domain using the functional relation $\varphi(f(z)) = \varphi(z)^m$ (see [DH1, DH2] or [BuHe] for details).

One might hope that the change of coordinates $z \mapsto \varphi(z)$ extends throughout the entire immediate basin of attraction of the superattracting point as a holomorphic mapping. However, this is not always possible. Such an extension involves computing expressions of the form $z \mapsto \sqrt[m]{\varphi(f(z))}$, and this does not work in general since the $n$th root cannot be defined as a single valued function. This happens, for example, when some other point in the basin maps exactly onto the superattracting point, or when the basin is not simply connected.

Using the Böttcher map we can define a useful polar coordinate near $0$. We define the dynamical ray of argument $\theta$, where $\theta \in \mathbb{R}/\mathbb{Z}$, to be the image under the inverse of the Böttcher map of the half line through $0$ with argument $\theta$ turns, i.e., $2\pi\theta$ radians,
$$R_0(\theta) = \varphi^{-1}(\{s e^{2\pi i \theta} \mid s \ge 0\}).$$
We say that the dynamical ray $R_0(\theta)$ lands if and only if the limit
$$\lim_{s \to 1} \varphi^{-1}(s e^{2\pi i \theta})$$
exists.
When a dynamical ray $R_0(\theta)$ lands we call the limit the landing point of the ray $R_0(\theta)$. We define the dynamical equipotential of level $s$, where $0 < s < 1$, to be the image under the inverse of the Böttcher map of the circle of radius $s$ centered at $0$,
$$E_0(s) = \varphi^{-1}(\{s e^{2\pi i \theta} \mid 0 \le \theta < 1\}).$$
Since $\varphi$ conjugates $f$ to $w \mapsto w^m$, the dynamics under $f$ is easy to compute on these dynamical objects (rays and equipotentials). Precisely, we have
$$f(R_0(\theta)) = R_0(m\theta) \quad \text{and} \quad f(E_0(s)) = E_0(s^m).$$
As we already mentioned, the Böttcher map verifies the functional equation, or Böttcher equation, $\varphi(f(z)) = \varphi(z)^m$. On the other hand, there exists an explicit form of the Böttcher map, given by
$$\varphi(z) = \lim_{n \to \infty} \bigl(f^{\circ n}(z)\bigr)^{1/m^n}.$$
In order to remove the ambiguity of the root, we write the sequence in the following form:
$$\varphi(z) = z \cdot \left(\frac{f(z)}{z^m}\right)^{1/m} \cdot \left(\frac{f^{\circ 2}(z)}{(f(z))^m}\right)^{1/m^2} \cdots \left(\frac{f^{\circ n}(z)}{(f^{\circ(n-1)}(z))^m}\right)^{1/m^n} \cdots. \tag{8}$$
For the general term, we have
$$\left(\frac{f^{\circ n}(z)}{(f^{\circ(n-1)}(z))^m}\right)^{1/m^n} = \bigl(1 + O(f^{\circ(n-1)}(z))\bigr)^{1/m^n}.$$
Hence, in a neighborhood of the superattracting fixed point $z = 0$, we can define the root by the binomial formula:
$$(1 + u)^{\alpha} = \sum_{n=0}^{\infty} \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!}\, u^n \quad \text{when } |u| < 1.$$
It is not difficult to see that the product converges uniformly.

In our case, $z = 0$ is a superattracting fixed point of $F_{\lambda,m} = \lambda z^m \exp(z)$. Using a suitable linear change of variables we obtain a new family of entire transcendental maps, so that near the superattracting fixed point $z = 0$ the functions can be written as $z^m + O(z^{m+1})$, and thus have a preferred Böttcher coordinate in this region. More precisely, we consider the following auxiliary family of entire transcendental maps:
$$L_{a,m}(z) = z^m e^{z/a}, \quad a \in \mathbb{C} \setminus \{0\}, \text{ and } m \in \mathbb{N},\ m \ge 2. \tag{9}$$
In the next lemma we prove some fundamental properties of the Böttcher coordinate near $z = 0$ for the map $L_{a,m}$. In particular, we obtain an explicit expression of the Böttcher map and we see that it extends to the whole immediate basin of attraction of $z = 0$.

Lemma 3.2. Consider $L_{a,m}(z) = z^m \exp(z/a)$ for $a \neq 0$ and $m \ge 2$. Then, the Böttcher coordinate $\varphi_a$ extends to the whole immediate basin of attraction of the superattracting fixed point $z = 0$.
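Proof. The map $L_{a,m}$ is affine conjugate to $F_{\lambda,m}$ with $\lambda = a^{m-1}$ through the map $c_a(z) = z/a$. In other words, if we choose two parameter values $\lambda_0$ and $a_0$ such that $\lambda_0 = a_0^{m-1}$, then $F_{\lambda_0,m}$ and $L_{a_0,m}$ are conformally conjugate, i.e.,
$$L_{a_0,m}(z) = (c_{a_0}^{-1} \circ F_{\lambda_0,m} \circ c_{a_0})(z) \quad \forall z \in \mathbb{C}.$$
Indeed, $c_{a_0}^{-1}\bigl(F_{\lambda_0,m}(z/a_0)\bigr) = \lambda_0\, a_0^{1-m} z^m e^{z/a_0} = z^m e^{z/a_0}$.

For each $a \neq 0$, and when $z$ is small enough, we can write the Böttcher coordinate $\varphi_a(z)$ using the auxiliary expression (8). More precisely, we have
$$\varphi_{a,m}(z) = z \cdot \left(\frac{L_{a,m}(z)}{z^m}\right)^{1/m} \cdot \left(\frac{L^{\circ 2}_{a,m}(z)}{(L_{a,m}(z))^m}\right)^{1/m^2} \cdots \left(\frac{L^{\circ n}_{a,m}(z)}{(L^{\circ(n-1)}_{a,m}(z))^m}\right)^{1/m^n} \cdots.$$
For the general term, we have
$$\left(\frac{L^{\circ n}_{a,m}(z)}{(L^{\circ(n-1)}_{a,m}(z))^m}\right)^{1/m^n} = \exp\left(\frac{L^{\circ(n-1)}_{a,m}(z)}{a\, m^n}\right).$$
Hence, in a neighborhood of the superattracting fixed point $z = 0$, we obtain
$$\varphi_a(z) = z \exp\left(\sum_{n=0}^{\infty} \frac{L^{\circ n}(z)}{a\, m^{n+1}}\right). \tag{10}$$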
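Formula (10) lends itself to a quick numerical sanity check of the Böttcher equation $\varphi_a(L_{a,m}(z)) = \varphi_a(z)^m$. The sketch below truncates the series after a fixed number of terms and assumes that $z$ lies in the basin of $0$, so that the orbit stays bounded; the names `L` and `bottcher` and the truncation length are our own.

```python
import cmath

def L(z, a, m):
    """Auxiliary map L_{a,m}(z) = z**m * exp(z/a)."""
    return z**m * cmath.exp(z / a)

def bottcher(z, a, m, terms=60):
    """Partial sum of (10): z * exp(sum_n L^{on}(z) / (a * m**(n+1)))."""
    total = 0j
    w = z                      # w runs through the orbit z, L(z), L(L(z)), ...
    for n in range(terms):
        total += w / (a * m ** (n + 1))
        w = L(w, a, m)
    return z * cmath.exp(total)
```

Comparing `bottcher(L(z, a, m), a, m)` with `bottcher(z, a, m) ** m` for a small `z` gives agreement up to rounding, and `bottcher(z, a, m) / z` tends to $1$ as $z \to 0$, in line with the monic normalization.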
Finally, we observe that this holomorphic map is well defined (the series converges) in the whole immediate basin of attraction of $z = 0$.

3.2. Holomorphic motions.

Definition. Let $X \subset \hat{\mathbb{C}}$. We say that a map
$$\Phi : \mathbb{D} \times X \to \mathbb{D} \times \hat{\mathbb{C}}, \quad (c, z) \mapsto \Phi(c, z) = (c, \Phi_c(z)) = (c, \Phi_z(c))$$
is a holomorphic motion of $X$ parameterized by $\mathbb{D}$ if
(a) $\Phi_0(z) = z$ for all $z \in X$.
(b) $\Phi_c(z)$ is injective for all fixed $c \in \mathbb{D}$.
(c) For all $z \in X$, the map $\Phi_z : \mathbb{D} \to \mathbb{C}$ is holomorphic.

There are two important theorems on the extension of a holomorphic motion. The first one is the $\lambda$ Lemma ([MSS]), which extends a holomorphic motion of $X$ to the closure of $X$. The second one is the Słodkowski Lemma ([Sl]), which extends a holomorphic motion parameterized by $\mathbb{D}$ to the whole Riemann sphere. We only recall the Słodkowski Lemma, since it is a generalization of the $\lambda$ Lemma.

Theorem 3.3 (Słodkowski Lemma, [Sl]). Let $\Phi : \mathbb{D} \times X \to \mathbb{D} \times \hat{\mathbb{C}}$ be a holomorphic motion. Then, we can extend $\Phi$ to a holomorphic motion $\widetilde{\Phi} : \mathbb{D} \times \hat{\mathbb{C}} \to \mathbb{D} \times \hat{\mathbb{C}}$. Moreover, for every parameter $c \in \mathbb{D}$, the map $\widetilde{\Phi}_c : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ is a quasiconformal homeomorphism whose dilatation ratio $K_c$ is bounded by $\frac{1+|c|}{1-|c|}$.
In the following lemma we prove that the holomorphic motion of a quasidisk is also a quasidisk. This property will play a fundamental role in the proof of Theorem C.

Lemma 3.4. Let $U$ be a quasidisk, i.e., assume that there exists a quasiconformal mapping $h : \mathbb{C} \to \mathbb{C}$ so that $U = h(\mathbb{D})$. Let $\Phi : \mathbb{D} \times U \to \mathbb{D} \times \mathbb{C}$ be a holomorphic motion of $U$. Then for all $c \in \mathbb{D}$ we have that $\Phi_c(U)$ is also a quasidisk.

Proof. Applying the Słodkowski Lemma (Theorem 3.3) we can extend $\Phi$ to a holomorphic motion $\widetilde{\Phi} : \mathbb{D} \times \hat{\mathbb{C}} \to \mathbb{D} \times \hat{\mathbb{C}}$ such that for every parameter $c \in \mathbb{D}$, the map $\widetilde{\Phi}_c : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ is a quasiconformal mapping. If we denote $U_c := \{\Phi_c(z) \mid z \in U\}$, we have that $\widetilde{\Phi}_c \circ h : \mathbb{C} \to \mathbb{C}$ is a quasiconformal mapping and $U_c = \widetilde{\Phi}_c \circ h(\mathbb{D})$.

3.3. Quasiconformal surgery.

Definition. A quasiconformal map of $\mathbb{C}$ is a homeomorphism $\varphi$ such that infinitesimal circles are mapped onto infinitesimal ellipses with bounded axis ratio. The analytic formulation of this condition is that $\varphi(x + iy)$ is absolutely continuous in $x$ for almost every $y$ and in $y$ for almost every $x$, and that the partial derivatives are locally square integrable and satisfy the Beltrami differential equation
$$\frac{\partial \varphi}{\partial \bar{z}} = \mu(z)\, \frac{\partial \varphi}{\partial z} \quad \text{for almost all } z \in \mathbb{C},$$
where $\mu$ is a complex measurable function with $|\mu(z)| \le \kappa < 1$ for $z \in \mathbb{C}$. In this case we say that $\varphi$ is $\kappa$–quasiconformal.

An almost complex structure $\sigma$ on $\mathbb{C}$ is a measurable field of ellipses $(E_z)_{z \in \mathbb{C}}$, equivalently defined by a measurable Beltrami form $\mu$ on $\mathbb{C}$,
$$\mu = u\, \frac{d\bar{z}}{dz}.$$
The correspondence between Beltrami forms and complex structures is as follows: the argument of $u(z)$ is twice the argument of the major axis of $E_z$, and $|u(z)| = \frac{K-1}{K+1}$, where $K \ge 1$ is the ratio of the lengths of the axes. The standard complex structure $\sigma_0$ is defined by circles or by the Beltrami form $\mu_0 = 0$.

Suppose that $\varphi : \mathbb{C} \to \mathbb{C}$ is a quasiconformal homeomorphism. Then $\varphi$ gives rise to an almost complex structure $\sigma$ on $\mathbb{C}$. For almost every $z \in \mathbb{C}$, $\varphi$ is differentiable and the $\mathbb{R}$-linear tangent map $T_z\varphi : T_z\mathbb{C} \to T_{\varphi(z)}\mathbb{C}$ defines, up to multiplication by a positive factor, an ellipse $E_z$ in $T_z\mathbb{C}$:
$$E_z = (T_z\varphi)^{-1}(S^1).$$
Moreover, there exists a constant $K > 1$ such that the ratio of the axes of $E_z$ is bounded by $K$ for almost every $z \in \mathbb{C}$. The smallest bound is called the dilatation ratio of $\varphi$. Equivalently, $\varphi$ defines a measurable Beltrami form on $\mathbb{C}$,
$$\mu = \frac{\bar{\partial}\varphi}{\partial\varphi} = \frac{\partial\varphi/\partial\bar{z}}{\partial\varphi/\partial z}\, \frac{d\bar{z}}{dz} = u(z)\, \frac{d\bar{z}}{dz}.$$
An almost complex structure is quasiconformally equivalent to the standard structure if it is defined by a measurable field of ellipses with bounded dilatation ratio.

Given a quasiconformal homeomorphism $\varphi : \mathbb{C} \to \mathbb{C}$, an almost complex structure $\sigma$ on $\mathbb{C}$ can be pulled back into an almost complex structure $\varphi^*\sigma$ on $\mathbb{C}$. If $\sigma$ is defined by an infinitesimal field of ellipses $(E_z)_{z \in \mathbb{C}}$, then $\varphi^*\sigma$ is defined by $(E'_z)_{z \in \mathbb{C}}$, where
$$E'_z = (T_z\varphi)^{-1} E_{\varphi(z)}$$
whenever defined.

To integrate an almost complex structure $\sigma$ means to find a quasiconformal homeomorphism $\varphi$ such that $(T_z\varphi)^{-1}(S^1) = \rho(z) E_z$ for almost every $z \in \mathbb{C}$. Informally, we will say that $\sigma$ is transported to $\sigma_0$ by $\varphi$. Surgery techniques are based on the following result:

Theorem 3.5 (Measurable Riemann mapping Theorem, [Ah, LV]). Let $\sigma_\mu$ be any almost complex structure on $\mathbb{C}$ given by the Beltrami form
$$\mu = u\, \frac{d\bar{z}}{dz}$$
with bounded dilatation ratio, i.e., $\|\mu\|_\infty := \sup |u(z)| < m < 1$. Then $\sigma_\mu$ is integrable, i.e., there exists a quasiconformal homeomorphism $\varphi$ such that
$$\mu = \frac{\bar{\partial}\varphi}{\partial\varphi},$$
or equivalently $\varphi^*\sigma_0 = \sigma_\mu$. Moreover, $\varphi : \mathbb{C} \to \mathbb{C}$ is unique up to composition with an affine map.

Remarks 3.6. The application of Ahlfors-Bers’ theorem to complex dynamics is the following. Let $f$ be a quasiregular mapping of $\mathbb{C}$ and let $\sigma_\mu$ be an almost complex structure with bounded dilatation ratio, such that $f^*\sigma_\mu = \sigma_\mu$. If we apply Theorem 3.5 to integrate $\sigma_\mu$, we obtain a quasiconformal mapping $\varphi$ such that $\varphi^*\sigma_0 = \sigma_\mu$. Then $g = \varphi \circ f \circ \varphi^{-1}$ verifies $g^*\sigma_0 = \sigma_0$, and hence $g$ is a holomorphic map of $\mathbb{C}$. Moreover, $f$ and $g$ are quasiconformally conjugate, i.e., they have the same dynamics.

3.4. Miscellanea. Our goal in this subsection is to give precise definitions of expanding maps ([dMvS]) and of the quasiconformal extension of a quasisymmetric map on the circle ([Pom]). We also need the concept of growth order of a continuous function.

Definition. We say that a $C^1$ map $f : \mathbb{T} \to \mathbb{T}$ is expanding if there exist real constants $C > 0$ and $\mu > 1$ such that
$$|D(f^{\circ n}(x))| > C\mu^n$$
for all $n \in \mathbb{N}$ and all $x \in \mathbb{T}$.

We observe that a sufficient condition to ensure that $f$ is expanding is given by $\min\{|f'(x)| \,,\ x \in \mathbb{T}\} > 1$. The following theorem states that any two expanding maps of the same degree are quasisymmetrically conjugate.
Theorem 3.7 (Shub and Sullivan, [SS]). Let $f, g : \mathbb{T} \to \mathbb{T}$ be expanding and $C^{1+\delta}$, with $\delta \in (0, 1)$, maps of degree $m$. Then there exists a quasisymmetric conjugacy $h : \mathbb{T} \to \mathbb{T}$ such that $f = h^{-1} \circ g \circ h$.

Quasisymmetry is precisely the property that allows a circle map to be extended to a quasiconformal map of the disc, as shown by the following theorem.

Theorem 3.8 (Beurling and Ahlfors [BA], Douady and Earle [DE]). Let $h : \mathbb{T} \to \mathbb{T}$ be an orientation preserving quasisymmetric map. We can extend $h$ to a quasiconformal map $H : \mathbb{D} \to \mathbb{D}$. Moreover, if $\sigma, \tau \in \text{Möb}(\mathbb{D})$ then the extension of $\sigma \circ h \circ \tau$ is given by $\sigma \circ H \circ \tau$.

Finally we will need the definition of the growth order of a continuous function.

Definition. Let $f : \mathbb{C} \to \mathbb{C}$ be a continuous function. We define $M(r, f) := \max_{|z|=r} |f(z)|$ and the growth order $\rho(f)$ by
$$\rho(f) := \limsup_{r \to \infty} \frac{\log^+ \log^+ M(r, f)}{\log r},$$
where $\log^+(t) = \log(\max(1, t))$.

4. Transcendental Part

When we consider a holomorphic map $f : \mathbb{C} \to \mathbb{C}$ with an essential singularity at infinity, this point plays a crucial role. For instance, the great Picard Theorem says that such a function assumes every value in the complex plane, with at most one exception, in any neighborhood of infinity. Thus, in general, the iteration of entire transcendental maps is more complicated than that of rational maps. As an example, there are transcendental maps presenting wandering domains ([B1, B2]) and/or Baker domains ([F]), also called “parabolic domains at $\infty$”.

We concentrate on the class of entire transcendental maps of finite type, that is,
$$\mathcal{S} = \{f : \mathbb{C} \to \mathbb{C} \mid f \text{ transcendental entire with only finitely many critical and asymptotic values}\}.$$
Dynamically, entire maps of finite type share some of the properties of polynomials, since their Fatou sets cannot include wandering or Baker domains, nor Herman rings ([EL2, GK]). Observe that the family of functions $F_{\lambda,m}(z) = \lambda z^m \exp(z)$ belongs to $\mathcal{S}$. The function $F_{\lambda,m}$ has two critical values, at $0$ and at $\lambda(-m)^m \exp(-m)$, since the critical points are located at $z = 0$ and $z = -m$. It also has an asymptotic value at $v = 0$, since the function tends to $0$ as $z$ tends to $\infty$ along $\mathbb{R}^-$.

If $f \in \mathcal{S}$, there exists a characterization of the Julia set ([EL1]), namely as the closure of the set of points whose orbits tend to $\infty$. Using this characterization we can plot an approximation of $J(F_{\lambda,m})$. Generally, orbits tend to $\infty$ in specific directions. In our case, if $\lim_{n \to \infty} |F^{\circ n}_{\lambda,m}(z)| = +\infty$, then we have $\lim_{n \to \infty} \text{Re}(F^{\circ n}_{\lambda,m}(z)) = +\infty$. Thus, an approximation of the Julia set is given by the set of points whose orbit contains a point with real part greater than, say, 90. Observe that filled black regions are due to numerics, since the Julia set contains no open set.
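The escape criterion described above can be sketched as follows. This is only an illustration under our own assumptions, not the code used for the figures: the function name `classify`, the iteration cap and the small radius used to detect convergence to $0$ are hypothetical choices, and the radius is assumed to be smaller than the radius of the disk in Theorem 2.1 a), which holds unless $|\lambda|$ is extremely large.

```python
import cmath

def classify(z, lam, m, max_iter=200, escape_re=90.0, basin_radius=1e-8):
    """Label the orbit of z under F_{lambda,m} as 'escape', 'basin0' or 'unknown'."""
    for _ in range(max_iter):
        if z.real > escape_re:        # real part above the threshold: treated as escaping
            return "escape"
        if abs(z) < basin_radius:     # inside a tiny disk around the superattracting point
            return "basin0"
        try:
            z = lam * z**m * cmath.exp(z)
        except OverflowError:         # |z|**m too large for floating point
            return "unknown"
    return "unknown"
```

Evaluating `classify` on a grid of starting points and painting the `'escape'` points black gives a rough picture of the Julia set. Points labelled `'unknown'` would need more iterations or a finer test.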
Fig. 2. The Julia set for $F_{\lambda,2}$ (panels (a) to (f))
In Fig. 2, we display the Julia set of $F_{\lambda,m}$ for different values of $\lambda$ and $m$. The immediate basin of attraction of $z = 0$ is shown in blue, while the other components of $A_{\lambda,m}(0) \setminus A^*_{\lambda,m}(0)$ are shown in red (color plots are available in the online version of this paper; otherwise, blue is darker than red and orange is light). The components of the Fatou set different from $A_{\lambda,m}(0)$ are shown in orange. Points in the Julia set are shown in black. We show the dynamical plane of the function $F_{\lambda,2} = \lambda z^2 \exp(z)$ for three different values of $\lambda$ and different ranges. As we proved in [FG], the basin of $0$ contains an infinite number of horizontal strips that extend to $+\infty$ as their real parts tend to $+\infty$. Between these strips we find the well known structures named Cantor bouquets, which are invariant sets of curves governed by some symbolic dynamics. This kind of structure in the Julia set is typical for critically finite entire transcendental functions ([DT]). Also, as we change the parameter $\lambda$ we observe that the relative position of these bands changes, but not their width. Finally, we can see the existence of an unbounded region that extends to $-\infty$ contained in $A_{\lambda,m}(0)$. In the zoom plates of Fig. 2, range $(-1, 1) \times (-1, 1)$, we can see the dynamical plane near the origin. It seems that the immediate basin of attraction of $z = 0$ is a Jordan domain for $\lambda = -8$ and $\lambda = 6.9$.

The orbit of the free critical point $z = -m$ determines in large measure the dynamics of $F_{\lambda,m}$. Indeed, the functions $F_{\lambda,m}(z) = \lambda z^m \exp(z)$ are entire maps with a finite number of critical and asymptotic values, hence we know that if the orbit of $z = -m$ tends to $\infty$, no other Fatou components can exist besides those that belong to $A_{\lambda,m}(0)$. Hence the Fatou set must coincide with the basin of $0$, i.e., $\mathcal{F}(F_{\lambda,m}) = A_{\lambda,m}(0)$. The set $B_m$ is defined as
N. Fagella, A. Garijo
$$B_m = \{\lambda \in \mathbb{C} \mid F^{\circ n}_{\lambda,m}(-m) \not\to \infty\}.$$

Fig. 3. Parameter plane for $F_{\lambda,2}$ (panels (a) to (c)). Color codes are explained in the text
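A numerical membership test for $B_m$ can be sketched by following the orbit of the free critical point $-m$. The sketch below is a hedged illustration rather than the procedure used for the figures: the function name `critical_orbit_fate`, its three return labels, the iteration cap, the cutoff `tiny`, and the reuse of a real-part threshold of 90 as an escape test are assumptions introduced here.

```python
import cmath


def critical_orbit_fate(lam, m=2, re_bound=90.0, max_iter=200, tiny=1e-12):
    """Classify the orbit of the free critical point -m under F_{lam,m}.

    Returns one of three hypothetical labels:
      "escapes"   - the orbit reaches real part > re_bound (lam looks outside B_m)
      "to_zero"   - the orbit falls onto the superattracting point 0
      "undecided" - neither happened within max_iter steps
    """
    z = complex(-m)
    for _ in range(max_iter):
        if z.real > re_bound:
            return "escapes"
        if abs(z) < tiny:
            return "to_zero"
        try:
            z = lam * z**m * cmath.exp(z)
        except OverflowError:
            return "escapes"     # assumption: overflow is treated as escape
    return "undecided"
```

Evaluating this function on a grid of λ values and coloring by the returned label gives a rough picture of the parameter plane; parameters labeled "undecided" need more iterations or a finer analysis.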
In each of these sets, we may also distinguish between two different behaviors: those parameter values for which $-m \in A_{\lambda,m}(0)$ and those for which this does not occur. Let $\mathrm{int}(B_m)$ denote the interior of $B_m$.

Definition. Let $U$ be a connected component of $\mathrm{int}(B_m)$. We say that $U$ is a capture zone if for all λ in $U$ we have that $\lim_{n\to+\infty} F^{\circ n}_{\lambda,m}(-m) = 0$, or in other words, $-m \in A_{\lambda,m}(0)$. We then say that the orbit of the critical point is captured by the basin of attraction of the superattracting fixed point $z = 0$.

In Fig. 3, we show a numerical approximation of the set $B_2$. The main capture zone is shown in blue, while other capture zones are shown in red. All other components of $B_2$ are shown in orange. The parameter values for which the orbit of the free critical point tends to ∞ are shown in black. In these sets we can see a countable quantity of horizontal strips. In Fig. 3 (c) we can see the main capture zone $H_m^0$.

4.1. Dynamical plane: Proof of Proposition A. The first assertion of this proposition, i.e. that all connected components of the Fatou set are simply connected, is a general result for all functions in class $S$ ([B]), which we have included here for completeness. To see that the number of connected components of $A_{\lambda,m}(0)$ is either 1 or ∞, we observe that the basin of $z = 0$, $A_{\lambda,m}(0)$, consists of the immediate basin $A^*_{\lambda,m}(0)$ and all its preimages. For every connected component $U$ of $A_{\lambda,m}(0)$ other than $A^*_{\lambda,m}(0)$ there exists a number $i > 0$ such that $F^{i}_{\lambda,m}(U) \subset A^*_{\lambda,m}(0)$, where $i$ is the smallest number with this property. Suppose that there exists a finite number of connected components, say $A^*_{\lambda,m}(0), U_1, U_2, \ldots, U_N$. By assumption, for each $U_k$ there exists a number $i_k$
such that $F^{i_k}_{\lambda,m}(U_k) \subset A^*_{\lambda,m}(0)$, for $1 \le k \le N$. Let $i_l$ be the maximum of the indices $i_1, \ldots, i_N$. Consider $z \in U_l$ that is not exceptional; then points in $F^{-1}_{\lambda,m}(z)$ belong to $A_{\lambda,m}(0)$, but not to $A^*_{\lambda,m}(0) \cup U_1 \cup \cdots \cup U_N$, which is a contradiction.

It remains to prove that all connected components of $A_{\lambda,m}(0)$ are unbounded except, maybe, $A^*_{\lambda,m}(0)$. To this end, suppose that $U$ is a connected component of $A_{\lambda,m}(0)$ different from $A^*_{\lambda,m}(0)$, and let $i > 0$ be the smallest number such that $F^{i}_{\lambda,m}(U) \subset A^*_{\lambda,m}(0)$. Let $z \in U$, and denote by γ a simple path in $A^*_{\lambda,m}(0)$ that joins $F^{i}_{\lambda,m}(z)$ and $0$. The preimage of γ in $U$ must include a path $\gamma_1$ that joins $z$ and ∞, since 0 is an asymptotic value with no other finite preimage than itself. Thus we conclude that $U$ is unbounded. This concludes the proof of Proposition A.
4.2. Parameter plane: Proof of Theorem B. In this section we describe some properties of the capture zones $H_m^n$. We are mainly interested in their topological properties. For clarity's sake we prove each of the statements in a different proposition.

Proposition 4.1. The critical point $-m$ belongs to $A^*_{\lambda,m}(0)$ if and only if the critical value $F_{\lambda,m}(-m)$ belongs to $A^*_{\lambda,m}(0)$. Hence $H_m^1 = \emptyset$.

Proof. Suppose that $F_{\lambda,m}(-m) \in A^*_{\lambda,m}(0)$. Let γ be a simple path in $A^*_{\lambda,m}(0)$ that joins $F_{\lambda,m}(-m)$ and $0$. The set of preimages of γ must include a path $\gamma_1$ that joins −∞ with $-m$, and also a path $\gamma_2$ that joins $-m$ and $0$ (since $-m$ is a critical point and 0 is a fixed point and asymptotic value). Hence $\gamma_1 \cup \gamma_2 \subset A^*_{\lambda,m}(0)$ and so does $-m$. Conversely, if $-m \in A^*_{\lambda,m}(0)$ we have that $F_{\lambda,m}(-m) \in A^*_{\lambda,m}(0)$.

We define $\rho = \min\{\frac{1}{e}, (\frac{e}{m})^m\}$, i.e., $\rho = 1/e$ for $m = 2, 3$ and $\rho = (\frac{e}{m})^m$ for $m \ge 4$. We also define $\tilde\rho = (\frac{e}{m-1})^{m-1}$.

Proposition 4.2. $D_\rho \subset H_m^0 \subset D_{\tilde\rho}$.

Proof. First we prove that $D_\rho = \{\lambda \in \mathbb{C} ;\ |\lambda| < \rho\} \subset H_m^0$. For $\lambda \in D_\rho$, we will prove that $F_{\lambda,m}(-m)$ lies in $D_{\varepsilon_0}$, which we know belongs to $A^*_{\lambda,m}(0)$. In order to do so, we use that $\varepsilon_0 \ge \min\big(1, (\frac{1}{|\lambda| e})^{1/(m-1)}\big)$ (Lemma 2.2). If $\lambda \in D_\rho$, then $|\lambda| < \frac{1}{e}$, and hence $\varepsilon_0 \ge 1$. The condition $\lambda \in D_\rho$ also implies that $|\lambda| < (\frac{e}{m})^m$. Hence
$$|F_{\lambda,m}(-m)| = |\lambda|\,|(-m)^m e^{-m}| = |\lambda| \Big(\frac{m}{e}\Big)^m < 1 \le \varepsilon_0,$$
and $F_{\lambda,m}(-m)$ lies in $A^*_{\lambda,m}(0)$.

Second we prove that $H_m^0 \subset D_{\tilde\rho}$, where $\tilde\rho = (\frac{e}{m-1})^{m-1}$. We will prove that $-m \notin A^*_{\lambda,m}(0)$ for all $\lambda \in \mathbb{C}$ such that $|\lambda| > (\frac{e}{m-1})^{m-1}$. Let $D$ be the disk centered at 0 of radius $m - 1$. If we calculate the modulus of the image of its boundary, $\{|z| = m - 1\}$, we obtain
$$|F_{\lambda,m}(z)| = |\lambda|\,|z|^m e^{\mathrm{Re}(z)} \ge |\lambda| (m-1)^m e^{-(m-1)} > m - 1,$$
where the inequality is obtained using $|\lambda| > (\frac{e}{m-1})^{m-1}$. This shows that $D \subset F_{\lambda,m}(D)$. Let $W$ be the component of $F^{-1}_{\lambda,m}(D)$ that contains the origin. It is clear that $W \subset D$ and $A^*_{\lambda,m}(0) \subset W$. Moreover, $F_{\lambda,m}$ is a proper function of degree $m$ from $W$ onto
Fig. 4. Fλ,m is a polynomial-like mapping of degree m near the origin
$D$ (see Fig. 4). In the terminology of polynomial-like mappings, developed by Douady and Hubbard ([DH3]), the triple $(F_{\lambda,m}; W, D)$ is a polynomial-like mapping of degree $m$. By the Straightening Theorem ([DH3]), there exists a quasiconformal mapping φ that conjugates $F_{\lambda,m}$ to a polynomial $P$ of degree $m$ on the set $W$. That is, $(\phi^{-1} \circ F_{\lambda,m} \circ \phi)(z) = P(z)$ for all $z \in W$. Since $z = 0$ is superattracting for $F_{\lambda,m}$ and φ is a conjugacy, we have that $z = 0$ is superattracting for $P$. Hence, after perhaps a holomorphic change of variables, we may assume that $P(z) = z^m$. Hence $A^*_{\lambda,m}(0) \subset D$. Since $-m \notin D$ we conclude that $H_m^0$ is bounded.

Proposition 4.3. The main capture zone $H_m^0$ is connected and simply connected.

Proof. We prove that $H_m^0$ is conformally a disk. Since $F_{\lambda,m}(z)$ has a superattracting fixed point at $z = 0$, we can use the Böttcher coordinate near the origin (see Sect. 3.1) to define a suitable biholomorphic map in the main capture zone. Using a suitable linear change of variables we obtain a new family of entire transcendental maps, so that near the superattracting fixed point $z = 0$ the functions can be written as $z^m + O(z^{m+1})$, and thus have a preferred Böttcher coordinate in this region (see Sect. 3.1). We consider
$$L_{a,m}(z) = z^m e^{z/a}, \qquad a \in \mathbb{C} \setminus \{0\}, \ m \in \mathbb{N}, \ m \ge 2. \qquad (11)$$
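The relation between the two families can be checked numerically. The sketch below assumes the linear change of variables $z = a w$ and the parameter relation $\lambda = a^{m-1}$, which is consistent with the definition of the auxiliary parameter set in terms of $a^{m-1}$; the function names are hypothetical and the code is only a sanity check, not part of the proof.

```python
import cmath


def F(w, lam, m):
    """F_{lam,m}(w) = lam * w**m * exp(w)."""
    return lam * w**m * cmath.exp(w)


def L(z, a, m):
    """L_{a,m}(z) = z**m * exp(z / a), the monic family of Eq. (11)."""
    return z**m * cmath.exp(z / a)


def conjugacy_defect(w, a, m):
    """|L_a(a*w) - a*F_lam(w)| with lam = a**(m-1); zero up to rounding
    if h(w) = a*w conjugates F_{lam,m} to L_{a,m}."""
    lam = a ** (m - 1)
    return abs(L(a * w, a, m) - a * F(w, lam, m))
```

Under the same change of variables the critical point $-m$ of $F_{\lambda,m}$ corresponds to $-ma$ for $L_{a,m}$.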
Under this map, the superattracting fixed point $z = 0$ is still at $z = 0$, and the free critical point (located at $z = -m$ for $F_{\lambda,m}$) is now at $c_{a,m} = -ma$ for $L_{a,m}$. We now define the following auxiliary set for the family of maps $L_{a,m}$, which is closely related to the main capture zone; more precisely,
$$\hat H_m^0 = \{a \in \mathbb{C} \text{ such that } a^{m-1} \in H_m^0\}.$$
By construction, $a \in \hat H_m^0 \mapsto a^{m-1} \in H_m^0$ is an $(m-1)$-fold branched covering. We consider the following mapping
$$\Phi : \hat H_m^0 \to \mathbb{D}, \qquad a \mapsto \varphi_{a,m}(c_{a,m}), \qquad (12)$$
where $\varphi_{a,m}$ is the Böttcher coordinate defined in the immediate basin of attraction of $z = 0$ (Sect. 3.1 and Lemma 3.2). We claim that the map Φ is well defined and, in fact, is a conformal isomorphism which is tangent to $a \mapsto -\frac{m}{e} a$ at the origin. If $a \in \hat H_m^0 \setminus \{0\}$,
the Böttcher map extends until the critical point $c_{a,m} = -ma$, and using Eq. (10) we have that
$$\varphi_{a,m}(c_{a,m}) = (-ma) \exp\left( \sum_{n=0}^{\infty} \frac{L^{\circ n}_{a,m}(-ma)}{a\, m^{n+1}} \right).$$
We see inductively that $\frac{L^{\circ n}_{a,m}(-ma)}{a}$ is a holomorphic function of $a^{m-1}$. Indeed, using the definition $L_{a,m}(z) = z^m e^{z/a}$, we have
$$\frac{L^{\circ 0}_{a,m}(-ma)}{a} = -m \qquad \text{and} \qquad \frac{L^{\circ 1}_{a,m}(-ma)}{a} = \Big(\frac{-m}{e}\Big)^m a^{m-1}.$$
Assuming then that $\frac{L^{\circ n}_{a,m}(-ma)}{a} = R(a^{m-1})$, where $R(w)$ is a holomorphic map of $w$, we see that
$$\frac{L^{\circ (n+1)}_{a,m}(-ma)}{a} = \frac{\big(L^{\circ n}_{a,m}(-ma)\big)^m}{a} \exp\left(\frac{L^{\circ n}_{a,m}(-ma)}{a}\right) = a^{m-1} [R(a^{m-1})]^m \exp[R(a^{m-1})],$$
proving thus that $\frac{L^{\circ n}_{a,m}(-ma)}{a}$ is a holomorphic function of $a^{m-1}$.

As $a \to 0$, a brief computation shows that $\varphi_{a,m}(c_{a,m}) = -\frac{m}{e}\, a\, \eta(a^{m-1})$, where $\eta(w)$ is a holomorphic mapping such that $\eta(0) = 1$. Hence the apparent singularity at $a = 0$ is removable. Since the correspondence $\Phi : a \mapsto \varphi_{a,m}(c_{a,m})$ (Eq. 12) is well defined and holomorphic, it suffices to show that it is a proper map of degree one from $\hat H_m^0$ onto $\mathbb{D}$. To this end, we first consider a boundary point $a_0 \in \partial \hat H_m^0$. Then, as noted earlier, the Böttcher mapping from the immediate basin $A^*_{a_0,m}(0)$ onto the unit disc has no critical points, and in fact is a conformal diffeomorphism. In particular, $\varphi_{a_0,m}$ can be defined as a single valued function on the disc of radius $1 - \varepsilon$, for any $\varepsilon > 0$. This last property must be preserved under any small perturbation of $a_0$, and it follows that $|\varphi_{a,m}(-ma)| > 1 - \varepsilon$ for any $a \in \hat H_m^0$ sufficiently close to $a_0$. Thus Φ is a proper map from $\hat H_m^0$ onto $\mathbb{D}$. Since $\Phi^{-1}(0)$ is the single point $0$, with $\Phi'(0) = -\frac{m}{e} \ne 0$, it follows that Φ is a conformal diffeomorphism.

We can now define the conformal mapping from $H_m^0$ to $\mathbb{D}$ using the construction above. Since the conformal mapping $\Phi : \hat H_m^0 \to \mathbb{D}$ is written as $\Phi(a) = -\frac{m}{e}\, a\, \eta(a^{m-1})$, it follows that $\Phi^{-1}$ sends a sector $S = \{z \in \mathbb{D},\ 0 \le \arg(z) \le \frac{2\pi}{m-1}\}$ into a sector $\Phi^{-1}(S) \subset \hat H_m^0$ with an amplitude equal to $\frac{2\pi}{m-1}$. We can see that $S \cong \mathbb{D}$. Hence we obtain a conformal mapping from $S$ to $H_m^0$ defined as
$$S \to H_m^0, \qquad z \mapsto \big(\Phi^{-1}(z)\big)^{m-1}. \qquad (13)$$
Proposition 4.4. For all $n, m \ge 2$, the connected components of $H_m^n$ are unbounded.
Proof. Let $U$ be a connected component of a capture zone different from $H_m^0$. We assume that $U$ is bounded, then
$$\sup_{\lambda \in \partial U} |\lambda| = M_1 < +\infty.$$
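Since $\lambda = 0$ belongs to $H_m^0$, we observe that $0 \notin U$. We claim that there exists $\varepsilon_1(m) > 0$ such that for all $\lambda \in \partial U$, we have that $|F^{\circ k}_{\lambda,m}(-m)| \ge \varepsilon_1(m)$ for all $k \ge 0$. To see this, we only need to prove that for all $\lambda \in \partial U$ we can find $\varepsilon_1 > 0$ such that $D(0, \varepsilon_1) \subset A^*_{\lambda,m}(0)$. For all $\lambda \in \mathbb{C}$ there exists $\varepsilon_0 > 0$, depending on $|\lambda|$ and $m$ (Theorem 2.1), such that $D(0, \varepsilon_0) \subset A^*_{\lambda,m}(0)$. We also know (see Lemma 2.2) that
$$\varepsilon_0(|\lambda|, m) \ge \min\left\{1, \Big(\frac{1}{|\lambda| e}\Big)^{\frac{1}{m-1}}\right\}.$$
If λ belongs to $\partial U$, then $|\lambda| \le M_1$, and we have
$$\Big(\frac{1}{|\lambda| e}\Big)^{\frac{1}{m-1}} \ge \Big(\frac{1}{M_1 e}\Big)^{\frac{1}{m-1}}.$$
Hence, we define $\varepsilon_1 = \min\big\{1, (\frac{1}{M_1 e})^{\frac{1}{m-1}}\big\}$ and this proves the claim.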
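Let $\lambda_0 \in U$. Since $U$ is a capture zone, by definition we have that $F^{\circ k}_{\lambda_0,m}(-m) \to 0$ as $k \to \infty$. Let $k_0 \ge 0$ be such that, for all $k \ge k_0$, we have $|F^{\circ k}_{\lambda_0,m}(-m)| < \varepsilon_1/2$. We consider now the mapping
$$U \to \mathbb{C}, \qquad \lambda \mapsto F^{\circ k_0}_{\lambda,m}(-m).$$
On the one hand, this is a holomorphic function of λ. On the other hand, since $0 \notin U$, we have that $F^{\circ k_0}_{\lambda,m}(-m) \ne 0$ for all $\lambda \in U$ (the only preimage of $z = 0$ under $F_{\lambda,m}(z)$ is $z = 0$). If we apply the minimum principle to $F^{\circ k_0}_{\lambda,m}(-m)$, we have
$$\frac{\varepsilon_1}{2} \ge |F^{\circ k_0}_{\lambda_0,m}(-m)| \ge \inf_{\lambda \in U} |F^{\circ k_0}_{\lambda,m}(-m)| = \inf_{\lambda \in \partial U} |F^{\circ k_0}_{\lambda,m}(-m)| \ge \varepsilon_1,$$
obtaining thus a contradiction.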
Proposition 4.5. For all $n, m \ge 2$, the connected components of $H_m^n$ are simply connected.
Proof. The proof uses a surgery construction (see Sect. 3.3 for preliminaries on this technique). Let $U$ be a connected component of $H_m^n$ where $m, n \ge 2$. We consider the following mapping:
$$\Phi_U : U \to \mathbb{D} \setminus \{0\}, \qquad \lambda \mapsto \varphi_\lambda\big(F^{\circ k+1}_{\lambda,m}(-m)\big),$$
where $\varphi_\lambda$ denotes the Böttcher coordinate near the origin. As in the previous proposition, the map $\Phi_U$ is a proper mapping and we will prove that it is a local homeomorphism.
Let $\lambda_0 \in U$ and $z_0 = \Phi_U(\lambda_0)$, where $\Phi_U$ denotes the mapping defined above. The idea of this surgery construction is the following: for $z$ near $z_0$ we can build a map $F_{\lambda(z),m}$ such that $F^{k+1}_{\lambda(z),m}(-m)$ has Böttcher coordinate $z$. We denote by $W_{\lambda_0}$ the connected component of $A_{\lambda_0,m}(0)$ containing $F^{\circ n}_{\lambda_0,m}(-m)$, a preimage of $A^*_{\lambda_0,m}(0)$. Let $C_{\lambda_0}$ be a small open neighborhood of $F^{\circ n+1}_{\lambda_0,m}(-m)$ contained in $A^*_{\lambda_0,m}(0)$, and let $B_{\lambda_0} \subset W_{\lambda_0}$ be the preimage of $C_{\lambda_0}$ containing $F^{\circ n}_{\lambda_0,m}(-m)$. For any $0 < \varepsilon < \min\{|z_0|, 1 - |z_0|\}$ and any $z \in D(z_0, \varepsilon)$, we choose a diffeomorphism $\delta_z : B_{\lambda_0} \to C_{\lambda_0}$ with the following properties:
• $\delta_{z_0} = F_{\lambda_0,m}$;
• $\delta_z$ coincides with $F_{\lambda_0,m}$ in a neighborhood of $\partial B_{\lambda_0}$ for any $z$;
• $\delta_z(F^{\circ k}_{\lambda_0,m}(-m)) = \varphi^{-1}_{\lambda_0}(z)$.

We consider, for any $z \in D(z_0, \varepsilon)$, the following mapping $G_z : \mathbb{C} \to \mathbb{C}$:
$$G_z(x) = \begin{cases} \delta_z(x) & \text{if } x \in B_{\lambda_0}, \\ F_{\lambda_0,m}(x) & \text{if } x \notin B_{\lambda_0}. \end{cases}$$
We proceed to construct an invariant almost complex structure, $\sigma_z$, with bounded dilatation ratio. Let $\sigma_0$ be the standard complex structure of $\mathbb{C}$. We define a new almost complex structure $\sigma_z$ in $\mathbb{C}$,
$$\sigma_z := \begin{cases} (\delta_z)^* \sigma_0 & \text{on } B_{\lambda_0}, \\ (F^{n}_{\lambda_0,m})^* \sigma_z & \text{on } F^{-n}_{\lambda_0,m}(B_{\lambda_0}) \text{ for all } n \ge 1, \\ \sigma_0 & \text{on } \mathbb{C} \setminus \bigcup_{n \ge 1} F^{-n}_{\lambda_0,m}(B_{\lambda_0}). \end{cases}$$
By construction $\sigma_z$ is $G_z$-invariant, i.e., $(G_z)^* \sigma_z = \sigma_z$, and it has bounded distortion since $\delta_z$ is a diffeomorphism and $F_{\lambda_0,m}$ is holomorphic. If we apply the Measurable Riemann Mapping Theorem (see Sect. 3.3 and Remark 3.6) we obtain a quasiconformal map $\tau_z : \mathbb{C} \to \mathbb{C}$ such that $\tau_z$ integrates the complex structure $\sigma_z$, i.e., $(\tau_z)^* \sigma_z = \sigma_0$, normalized so that $\tau_z(0) = 0$ and $\tau_z(-m) = -m$. Finally, we define $R_z = \tau_z \circ G_z \circ \tau_z^{-1}$, which is analytic, hence an entire function.

We claim that this resulting mapping is $R_z(x) = \lambda x^m \exp(x)$, for some λ. Indeed, the map $R_z : \mathbb{C} \to \mathbb{C}$ is an entire map (∞ is an essential singularity) with a superattracting fixed point at the origin. Moreover, $R_z$ has a critical point at $x = -m$. Thus $R_z(x) = \nu x^m \exp(h_1(x))$. It is easy to show that the growth order of $F_{\lambda,m}$ is equal to 1, hence $R_z$ has the same growth order.
We know that the composition of a function of finite growth order with a quasiconformal function can only change the growth order by a real factor ([G]). We can conclude that $R_z$ has finite growth order, hence $h_1(x)$ is a polynomial of some degree $d$. Then there are $d$ directions where $\mathrm{Re}(h_1(x)) \to +\infty$ and $\mathrm{Im}(h_1(x))$ is bounded for $x \to \infty$, separated by $d$ directions where $\mathrm{Re}(h_1(x)) \to -\infty$ and $\mathrm{Im}(h_1(x))$ is bounded. Thus there are $d$ directions where $R_z \to \infty$ separated by $d$ directions where $R_z \to 0$. This behavior is invariant under topological conjugation. Since $F_{\lambda_0,m}$ has only one direction (along the positive real axis) where $F_{\lambda_0,m} \to \infty$ and one (the negative real axis) where $F_{\lambda_0,m} \to 0$, we conclude that $d = 1$ and $R_z(x) = \nu x^m \exp(a_0 + a_1 x)$. If we use that $-m$ is a critical point, then $a_1$ must be equal to 1. Finally, if we define $\lambda = \nu \exp(a_0)$, we can conclude that $R_z(x) = \lambda x^m \exp(x)$. By construction, $\tau_{z}$ is the identity for $z = z_0$; hence there exists a continuous function $z \in D(z_0, \varepsilon) \mapsto \lambda(z) \in U$ such that $\lambda(z_0) = \lambda_0$ and $F_{\lambda(z),m} = \tau_z \circ G_z \circ \tau_z^{-1}$.
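Moreover, $\tau_z$ is holomorphic on $A^*_{\lambda_0,m}(0)$, conjugating $F_{\lambda_0,m}$ and $F_{\lambda(z),m}$. Hence, observing the following commutative diagram:
$$\begin{array}{ccc}
\mathbb{D} & \xrightarrow{\ z \mapsto z^m\ } & \mathbb{D} \\
\big\uparrow \varphi_{\lambda_0} & & \big\uparrow \varphi_{\lambda_0} \\
A^*_{\lambda_0,m}(0) & \xrightarrow{\ F_{\lambda_0,m}\ } & A^*_{\lambda_0,m}(0) \\
\big\downarrow \tau_z & & \big\downarrow \tau_z \\
A^*_{\lambda(z),m}(0) & \xrightarrow{\ F_{\lambda(z),m}\ } & A^*_{\lambda(z),m}(0)
\end{array}$$
we have that $\varphi_{\lambda(z)} = \varphi_{\lambda_0} \circ \tau_z^{-1}$ is the Böttcher coordinate of $A^*_{\lambda(z),m}(0)$. Finally we conclude that the mapping $\Phi_U : \lambda \mapsto \varphi_\lambda(F^{\circ n+1}_{\lambda,m}(-m))$ of this proof satisfies
$$\Phi_U(\lambda(z)) = \varphi_{\lambda(z)}\big(F^{\circ n+1}_{\lambda(z),m}(-m)\big) = z,$$
since
$$F^{\circ n+1}_{\lambda(z),m}(-m) = \tau_z \circ G_z^{\circ n+1} \circ \tau_z^{-1}(-m) = \tau_z \circ G_z^{\circ n+1}(-m) = \tau_z \circ G_z\big(F^{\circ n}_{\lambda_0,m}(-m)\big) = \tau_z \circ \varphi^{-1}_{\lambda_0}(z) = \tau_z \circ \tau_z^{-1} \circ \varphi^{-1}_{\lambda(z)}(z) = \varphi^{-1}_{\lambda(z)}(z).$$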
4.3. Parameter Plane: Proof of Theorem C.

Proposition 4.6. If $\lambda \in H_m^0$ then $A^*_{\lambda,m}(0) = A_{\lambda,m}(0)$. Otherwise, if $\lambda \notin H_m^0$, then $A_{\lambda,m}(0)$ has infinitely many connected components.

Proof. Let $\lambda \in H_m^0$. As in Proposition 4.1, let γ be a simple path in $A^*_{\lambda,m}(0)$ that joins $F_{\lambda,m}(-m)$ and $0$. The preimage of γ must include a path $\tilde\gamma$ contained in $A^*_{\lambda,m}(0)$ that joins −∞ with 0 passing through $-m$ ($\tilde\gamma$ maps 2-1 to γ). Since $H_{|\lambda|,m}$ intersects $\tilde\gamma$, it follows that $H_{|\lambda|,m} \subset A^*_{\lambda,m}(0)$. We recall that $H_{|\lambda|,m}$ is a preimage of a small disk of radius $\varepsilon_0$ (see Sect. 2). All preimages of $\tilde\gamma$ are contained in $A^*_{\lambda,m}(0)$ as well, since they all intersect $H_{|\lambda|,m}$. In fact, we have that $A_{\lambda,m}(0) = A^*_{\lambda,m}(0)$, since any preimage of $D_{\varepsilon_0}$ must contain points of $H_{|\lambda|,m}$. Hence $A_{\lambda,m}(0)$ has a unique connected component.

Now assume $\lambda \notin H_m^0$. From Proposition A(b) we have that $A_{\lambda,m}(0)$ has either one or infinitely many connected components. If we suppose that $A_{\lambda,m}(0)$ has only one connected component, then $A_{\lambda,m}(0)$ is a completely invariant component of the Fatou set. Then all the critical values of $F_{\lambda,m}$ are in $A_{\lambda,m}(0)$ (see [B2]), and hence we conclude that $-m$ belongs to $A_{\lambda,m}(0)$. However, this is impossible if $\lambda \notin H_m^0$.

Proposition 4.7. If $\lambda \in H_m^0$, the boundary of $A^*_{\lambda,m}(0)$ (which is equal to the Julia set) is a Cantor bouquet and it is not locally connected.

Proof. Using the proposition above, if λ belongs to the main capture zone $H_m^0$, we have that $A^*_{\lambda,m}(0) = A_{\lambda,m}(0)$. Hence, the Fatou set contains a totally invariant component. In fact, from [BD], it follows that the Julia set has an uncountable number of connected components and it is not locally connected at any point. Using standard techniques analogous to [DT] one can show that the Julia set contains a Cantor Bouquet tending to ∞ in the direction of the positive real axis. Indeed,
it is sufficient to construct a hyperbolic exponential tract on which $F_{\lambda,m}$ has asymptotic direction $\theta^*$. To this end, let $B_r$ be an open disk containing $F_{\lambda,m}(-m)$. The preimage of this set is an open set similar to $H_{|\lambda|,m}$ (see Sect. 2). Let $D$ be the complement of this set. We have that $F_{\lambda,m}$ maps $D$ onto the exterior of $B_r$, hence $D$ is an exponential tract for $F_{\lambda,m}$. We may choose the negative real axis to define the fundamental domains in $D$. More precisely, we can find the preimage of the negative real axis under the function $F_{\lambda,m}$. Hereafter, we denote by $\mathrm{Arg}(\cdot) \in (-\pi, \pi]$ the principal argument. Using the definition of $F_{\lambda,m}$ it is easy to see that
$$\mathrm{Arg}(F_{\lambda,m}(z)) = \mathrm{Arg}(\lambda) + m\, \mathrm{Arg}(z) + \mathrm{Im}(z) \mod (2\pi).$$
Finding the preimages of $\mathbb{R}^-$ is equivalent to solving $\mathrm{Arg}(F_{\lambda,m}(z)) = \pi$. The equation above is equivalent to
$$\mathrm{Arg}(\lambda) + m\alpha + r \sin(\alpha) = (2k + 1)\pi, \qquad k \in \mathbb{Z},$$
where $r = |z|$ and $\alpha = \mathrm{Arg}(z)$. Hence, we obtain
$$r = \rho(\alpha) = \frac{(2k + 1)\pi - m\alpha - \mathrm{Arg}(\lambda)}{\sin(\alpha)}, \qquad \alpha \in (-\pi, \pi).$$
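The parametrization of these preimage curves can be checked numerically: a point with polar coordinates $(\rho(\alpha), \alpha)$ should be sent by $F_{\lambda,m}$ to the negative real axis. The following sketch assumes only the formula above; the function name `sigma_k_point` and the sample values in the usage note are hypothetical.

```python
import cmath
import math


def sigma_k_point(alpha, k, lam, m):
    """Point z = r*exp(i*alpha) on the k-th preimage curve of the negative
    real axis, with r = ((2k+1)*pi - m*alpha - Arg(lam)) / sin(alpha).

    Raises ValueError when the formula does not give a positive radius,
    i.e. when the angle alpha is not admissible for this k.
    """
    s = math.sin(alpha)
    if s == 0.0:
        raise ValueError("alpha must not be a multiple of pi")
    r = ((2 * k + 1) * math.pi - m * alpha - cmath.phase(lam)) / s
    if r <= 0.0:
        raise ValueError("no point of the curve at this angle")
    return cmath.rect(r, alpha)


def F(z, lam, m):
    """F_{lam,m}(z) = lam * z**m * exp(z)."""
    return lam * z**m * cmath.exp(z)
```

For instance, with $\lambda = 1 + i$, $m = 2$, $k = 0$ and $\alpha = 0.5$, the image `F(sigma_k_point(0.5, 0, 1 + 1j, 2), 1 + 1j, 2)` has negative real part and an imaginary part that vanishes up to rounding.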
We denote each of these curves by $\sigma_k = \sigma_k(\lambda, m)$, where the possible values of the argument depend on $k$. As their real parts tend to +∞, the $\sigma_k$'s are asymptotic to the lines $\mathrm{Im}(z) = (2k + 1)\pi - \mathrm{Arg}(\lambda)$. Since the curves $\sigma_k$ for $k \in \mathbb{Z}$ are mapped by $F_{\lambda,m}$ onto the negative real axis, it follows that $D$ has asymptotic direction $\theta^* = 0$. Furthermore, since $F_{\lambda,m}(z) = \lambda z^m \exp(z)$, one may check readily that $D$ is a hyperbolic exponential tract.

Before proving assertion (c) of Theorem C we prove the following auxiliary lemma.

Lemma 4.8. If $|\lambda| > (\frac{e}{m-1})^{m-1}$, then the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle.

Proof. Let $\lambda \notin H_m^0$ be such that $|\lambda| > (\frac{e}{m-1})^{m-1}$. By using the same arguments as in Proposition 4.2 we have that $F_{\lambda,m}$ is polynomial-like of degree $m$ near the origin. From this construction we obtain that $\partial A^*_{\lambda,m}(0) = \phi(\mathbb{T})$, and the lemma follows.

Remarks 4.9. The reason to ask for $|\lambda| > (\frac{e}{m-1})^{m-1}$ as a condition is as follows. We want to find a value $K > 0$ such that if $|z| = K$ then $|F_{\lambda,m}(z)| > K$. This condition is equivalent to
$$|F_{\lambda,m}(z)| \ge |\lambda|\,|z|^m e^{-|z|} = |\lambda| K^m e^{-K} > K,$$
or equivalently $|\lambda| > K^{1-m} e^{K}$.
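The best choice of $K$ in this bound follows from elementary calculus. The short derivation below is a sketch in which the auxiliary function $g$ is notation introduced here and does not appear in the original argument.

```latex
g(K) = K^{1-m} e^{K}, \qquad K > 0,
\qquad
g'(K) = (1-m) K^{-m} e^{K} + K^{1-m} e^{K} = K^{-m} e^{K} \bigl( K - (m-1) \bigr).
% g' < 0 for K < m-1 and g' > 0 for K > m-1, so g has its minimum at K = m-1:
g(m-1) = (m-1)^{1-m} e^{m-1} = \Bigl( \frac{e}{m-1} \Bigr)^{m-1}.
```

This is exactly the threshold on $|\lambda|$ that appears in Lemma 4.8 and in the second inclusion of Proposition 4.2.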
We want to use this argument for the largest possible region of values of λ. Hence, we choose $K > 0$ such that $K^{1-m} e^{K}$ is minimum. This minimum value is reached exactly at $K = m - 1$.

Proposition 4.10. Let $U_m$ be the unbounded connected component of $\mathbb{C} \setminus H_m^0$. If $\lambda \in U_m$, then the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle.

Proof. Let $U_m$ be the unbounded component of $\mathbb{C} \setminus H_m^0$. Since $U_m$ is unbounded, let $\lambda_0 \in U_m$ be such that $\partial A^*_{\lambda_0,m}(0)$ is a quasicircle (see Lemma 4.8), and hence $A^*_{\lambda_0,m}(0)$ is a quasidisk. On the other hand, since $U_m$ is an open and simply connected set, let $\psi : \mathbb{D} \to U_m$ be the Riemann mapping such that $\psi(0) = \lambda_0$. We claim that for all $\lambda \in U_m$, the Böttcher mapping $\varphi_\lambda$ conjugating $F_{\lambda,m}$ to $z \mapsto \lambda z^m$ extends to the whole immediate basin of attraction $A^*_{\lambda,m}(0)$ (see Sect. 3.1). To see the claim we only need to observe that, when $\lambda \in U_m$, the critical point $-m$ does not belong to $A^*_{\lambda,m}(0)$, hence no critical point other than $z = 0$ belongs to $A^*_{\lambda,m}(0)$. It follows that for all $\lambda \in U_m$, the Böttcher coordinate
$$\varphi_\lambda : A^*_{\lambda,m}(0) \to \mathbb{D}$$
is a conformal mapping. We can now define a holomorphic motion of $A^*_{\lambda_0,m}(0)$ (see Sect. 3.2). We use as main ingredients the Böttcher map, $\varphi_\lambda$, and the conformal Riemann mapping ψ. More precisely, we consider the following map:
$$\Theta : \mathbb{D} \times A^*_{\lambda_0,m}(0) \to \mathbb{D} \times \mathbb{C}, \qquad (c, z) \mapsto (c, \Theta_c(z)) = (c, \Theta_z(c)) = \big(c, \varphi^{-1}_{\psi(c)} \circ \varphi_{\lambda_0}(z)\big). \qquad (14)$$
We can check that Θ is a holomorphic motion. By construction, we have that $\Theta_0(z) = \varphi^{-1}_{\psi(0)} \circ \varphi_{\lambda_0}(z) = z$. If we fix the parameter $c$ we must see that the map $\Theta_c(z)$ is injective. This is immediate, since the Böttcher mapping $\varphi_\lambda$ is conformal. Finally, if we fix a point $z \in A^*_{\lambda_0,m}(0)$ we must see that $\Theta_z : \mathbb{D} \to \mathbb{C}$ is a holomorphic map. In this case the map $\Theta_z$ is a composition of holomorphic maps, since the Böttcher map depends analytically on parameters (see Fig. 5). Geometrically, if we fix $\lambda \in U_m$, the map $z \mapsto \Theta_{\psi^{-1}(\lambda)}(z)$ sends points in $A^*_{\lambda_0,m}(0)$ to points in $A^*_{\lambda,m}(0)$ according to the Böttcher coordinates. Finally, we apply Lemma 3.4 to the holomorphic motion Θ, which, roughly speaking, says that a holomorphic motion of a quasidisk is also a quasidisk. The final assertion of Theorem C(c) follows directly from the fact that all the sets $H_m^n$ are unbounded and hence belong to $U_m$.

Remarks 4.11. Since Θ extends to a holomorphic motion $\hat\Theta$ of $\overline{A^*_{\lambda_0,m}(0)}$, we have that for all $c \in \mathbb{D}$, $\hat\Theta_c(z_1) \ne \hat\Theta_c(z_2)$ for all $z_1 \ne z_2$ in $\overline{A^*_{\lambda_0,m}(0)}$. In other words, if we take $z_1$ and $z_2$ in the boundary of $A^*_{\lambda_0,m}(0)$, the property above proves that two internal rays never land at a common point.

4.4. Parameter Plane: Proof of Theorem D. From Theorem B, statements b) and c), we know that $H_m^0$ is bounded, connected and simply connected. As we mentioned in the introduction, we conjecture that $H_m^0$ is a topological disc. If this were the case then
Fig. 5. Sketch of the holomorphic motion $\Theta_c(z)$, where $\lambda = \psi(c)$. Geometrically, $\Theta_c(z)$ sends equipotentials and rays from $A^*_{\lambda_0,m}(0)$ to $A^*_{\lambda,m}(0)$ according to Böttcher coordinates
$\mathbb{C} \setminus H_m^0$ would consist of only one connected component, which would be unbounded. But as long as this result is not proven, the complement of $H_m^0$ might have other connected components different from the unbounded one, which we denote by $U_m$. In Theorem D we study the topological relation between these sets.
Proof of Theorem D. In this proof we use the monic family of functions $L_{a,m}(z) = z^m \exp(z/a)$ (see Sect. 3.1 and Proposition 4.3). We recall that $L_{a,m}(z)$ is conformally conjugate to $F_{\lambda,m}(z)$. We introduce this new family of maps in order to obtain a preferred Böttcher coordinate near $z = 0$. We also recall that the free critical point for the family $L_{a,m}(z)$ is at the point $c_{a,m} = -ma$. We denote by $\varphi_a$ the Böttcher coordinate defined in the whole immediate basin of attraction of $z = 0$ (Lemma 3.2) and by $\Phi(a) = \varphi_a(c_{a,m})$ (Eq. 12) the uniformization mapping of the main capture zone (Proposition 4.3).

We want to show that $H_m^0$ and $U_m$ have a common boundary. Since $U_m$ is the unbounded component of $\mathbb{C} \setminus H_m^0$ we have that $\partial U_m \subset \partial H_m^0$. Now, we will prove that $\partial H_m^0 \subset \partial U_m$ and thus statement a) follows. In order to do this, we first observe that the rest of the capture zones $H_m^n$, for $n \ge 2$, are contained in $U_m$ since they are unbounded and disjoint from $H_m^0$. Second, notice that for any point $a_0$ in $\partial H_m^0$, the sequence $\{L^{n}_{a,m}(c_{a,m})\}_{n \ge 0}$ is not a normal family in any neighborhood of $a_0$.

Third, we claim that any arbitrary neighborhood of any point in $\partial H_m^0$ meets $H_m^n$ for some $n \ge 2$. To see the claim, let $a_0$ be a point in $\partial H_m^0$ and let $W$ be a neighborhood of $a_0$. We must show that $W \cap H_m^n \ne \emptyset$ for some $n \ge 2$. We also consider $\alpha = 1/2$ and complex numbers $\beta_1, \ldots, \beta_m$ such that $(\beta_i)^m = \alpha$. Set
$$K = \{a \in H_m^0 \mid |\Phi(a)| > |\sqrt[m]{\alpha}|\} \qquad \text{and} \qquad P = \mathbb{C} \setminus \{a \in H_m^0 \mid |\Phi(a)| \le |\sqrt[m]{\alpha}|\}.$$
By shrinking $W$, if necessary, we can assume that $W \subset P$. Define functions $\alpha(a) = \varphi_a^{-1}(\alpha)$ and $\beta_i(a) = \varphi_a^{-1}(\beta_i)$ for $i = 1, \ldots, m$. See Fig. 6 for a sketch of the relevant objects of this construction. Notice that by construction of $\varphi_a$, if $a \in H_m^0 \setminus \{0\}$ then the forward orbit of the free critical point $c_{a,m} = -ma$ is contained in $\varphi_a^{-1}(D_{|\Phi(a)|})$. In particular, if $a \in K$ and $L^{n}_{a,m}(c_{a,m}) = \alpha(a)$, then
$$L^{n-1}_{a,m}(c_{a,m}) \in \{\beta_1(a), \ldots, \beta_m(a)\}.$$
Fig. 6. Sketch of the relevant objects in proof of Theorem D
Now, let $x_{a_0}$ be a preimage of $\alpha(a_0)$ that is not equal to $\beta_i(a_0)$ for any $1 \le i \le m$; notice that $L_{a,m}$ is ∞ to 1. We cannot have $c_{a_0,m} = x_{a_0}$, because then $L^{n}_{a,m}(c_{a,m})$ would be normal in a neighborhood of $a_0$. From the implicit function theorem, we know that there exists a holomorphic function $x(a)$ such that $L_{a,m}(x(a)) = \alpha(a)$ in some neighborhood of $a_0$, which we can suppose is $W$ by shrinking it, if necessary. Again by shrinking $W$, we can suppose that $x(a) \ne \beta_i(a)$ for all $a \in W$, for $i = 1, \ldots, m$. By lack of normality, the iterates $L^{n}_{a,m}(c_{a,m})$ do not avoid $0$, ∞ and $x(a)$. So, there exist $a' \in W$ and $n \ge 0$ such that
$$L^{n}_{a',m}(c_{a',m}) = x(a').$$
It follows that $a' \in H_m^n$ for some $n \ge 0$. We finally claim that $n > 0$. If $n = 0$, then $a' \in K$ and $L^{n+1}_{a',m}(c_{a',m}) = \alpha(a')$; this would mean that $L^{n}_{a',m}(c_{a',m}) = \beta_i(a')$ for some $i$, a contradiction.

To prove the second statement of Theorem D, let $V$ be a bounded connected component of $\mathbb{C} \setminus H_m^0$. Hence, we have that $\partial V \subset \partial H_m^0$ and $\partial V \subset \partial U_m$, since, by statement a), $\partial U_m = \partial H_m^0$. Then $V$ has a common boundary with $H_m^0$ and $U_m$.
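5. A Model for $F_{\lambda,m}$

For each natural value $m \in \mathbb{N}$, $m \ge 2$, and $\alpha, \beta \in \mathbb{R}$ we define the two-parameter family of maps
$$G_{\alpha,\beta,m}(z) = e^{i\alpha} z^m e^{\frac{\beta}{2}(z - 1/z)}.$$
It is easy to check that $G_{\alpha,\beta,m}$ preserves the unit circle, $S^1$, and on this circle we have the following dynamical system:
$$\widetilde G_{\alpha,\beta,m} : \theta \mapsto \alpha + m\theta + \beta \sin(\theta) \mod (2\pi), \qquad \theta \in \mathbb{R}/2\pi\mathbb{Z}.$$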
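The claim that the unit circle is preserved, and that the induced circle map has the stated form, follows from $z - 1/z = 2i \sin\theta$ for $z = e^{i\theta}$. A small numerical check, written as a sketch with hypothetical function names, is:

```python
import cmath
import math


def G(z, alpha, beta, m):
    """G_{alpha,beta,m}(z) = exp(i*alpha) * z**m * exp((beta/2) * (z - 1/z))."""
    return cmath.exp(1j * alpha) * z**m * cmath.exp(0.5 * beta * (z - 1 / z))


def circle_map(theta, alpha, beta, m):
    """Induced map theta -> alpha + m*theta + beta*sin(theta) mod 2*pi."""
    return (alpha + m * theta + beta * math.sin(theta)) % (2 * math.pi)


def circle_defect(theta, alpha, beta, m):
    """Distance between G(exp(i*theta)) and exp(i*circle_map(theta));
    zero up to rounding if G restricted to the circle is the map above."""
    w = G(cmath.exp(1j * theta), alpha, beta, m)
    return abs(w - cmath.exp(1j * circle_map(theta, alpha, beta, m)))
```

Evaluating `circle_defect` at a handful of angles for fixed real α and β gives values at the level of rounding error, in agreement with the formula.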
When $\beta < 1$ and $m = 1$ this family of circle diffeomorphisms is known as the standard family or Arnold family, and its parameter space contains the well known Arnold Tongues ([A]). When $m \ge 2$ the situation is very different because $\widetilde G_{\alpha,\beta,m}$ is an $m$ to 1 map of $\mathbb{T}$ and hence not a circle diffeomorphism.

For each parameter value α and β the map $G_{\alpha,\beta,m}$ is a holomorphic function defined on the punctured plane, $\mathbb{C}^*$, with 0 and ∞ as essential singularities. We denote by $P$ this class of functions. Maps of this type are studied in [Ke, Ko1, Ko2] and [Mak] among others. Let $f$ be a holomorphic self-mapping of $\mathbb{C}^*$. The usual definitions of Fatou and Julia sets apply for functions in class $P$, although in this case the Julia set can be characterized as the closure of the set of points whose orbits tend to 0 or to ∞ under iteration. Using the above characterization we can plot the Julia set of $G_{\alpha,\beta,m}$ for different values of α, β and $m$. Sullivan's Theorem of nonwandering domains has been extended to the class $P \cap S$ by many authors ([Ke, Ko2]). Also for this kind of functions it is proved ([EL]) that they do not have Baker domains. Hence the classification of Fatou components is exactly the same as in the rational case.

The map $G_{\alpha,\beta,m}$ is of finite type, because it has only two critical points in $\mathbb{C}^*$. Indeed, if we compute $G'_{\alpha,\beta,m}$ we obtain
$$G'_{\alpha,\beta,m}(z) = \frac{1}{2} e^{i\alpha} z^{m-2} e^{\frac{\beta}{2}(z - 1/z)} (\beta z^2 + 2mz + \beta),$$
and hence the two critical points $z_+(\beta)$ and $z_-(\beta)$ are given by
$$z_\pm = \frac{-m \pm \sqrt{m^2 - \beta^2}}{\beta}.$$
In the case where α and β are real parameters we have that $G_{\alpha,\beta,m}$ is symmetric with respect to the unit circle, which is also invariant. This condition is equivalent to $\tau \circ G_{\alpha,\beta,m} = G_{\alpha,\beta,m} \circ \tau$, where $\tau(z) = 1/\bar z$. When $|\beta| < m$ the critical points $z_\pm$ have the same dynamical behavior since $\tau(z_-) = z_+$. Also, it is easy to check that $z_+$ belongs to $\mathbb{D}$ and hence $z_- \in \mathbb{C} \setminus \overline{\mathbb{D}}$.

In Fig. 7 we display the parameter plane of $G_{\alpha,\beta,m}$ for $m = 2, 3$ and $4$. We distinguish between two different behaviors of the free critical points $z_\pm$. Parameter values α and β for which the critical points tend to infinity or to zero are plotted in color, depending on the rate of escape. Parameter values α and β for which this does not occur are plotted in black. Black shapes that look like chess figures consist of parameter regions (shaped as Arnold tongues) where the attracting periodic orbit is contained in the unit circle and parameter regions (shaped as Mandelbrot sets) where the attracting periodic orbit is disjoint from the unit circle. An exhaustive analysis of these Arnold tongues can be found in [MR].

In Fig. 8 we display the dynamical plane of $G_{\alpha,\beta,m}$ for $m = 2$ and different values of α and β. Points tending to $z = 0$ and $z = \infty$ are shown in color, depending on the rate of escape, while points for which this does not occur are shown in black. We also plot the unit circle in blue.

The following is the main idea of our surgery construction. First, we consider two real parameters α and β such that $\widetilde G_{\alpha,\beta,m}$ is quasisymmetrically conjugate to $\theta \mapsto m\theta$
Fig. 7. Parameter planes of $G_{\alpha,\beta,m}$ for $m = 2, 3$ and $m = 4$ (panels (a) to (c))
Fig. 8. Dynamical plane of $G_{\alpha,\beta,m}$ for $m = 2$ (panels (a) to (c)). Range $(-2, 2) \times (-2, 2)$
on the unit circle. Under this condition we can change the behavior of $G_{\alpha,\beta,m}$ on the unit disk. More precisely, we quasiconformally paste the superattracting behavior of $z \mapsto z^m$ inside the unit disk. The corresponding map acts like $G_{\alpha,\beta,m}$ on the complement of $\mathbb{D}$ and acts like $z \mapsto z^m$ on $\mathbb{D}$. Second, applying the Measurable Riemann Mapping Theorem ([Ah, LV]) we can obtain a holomorphic mapping with this dynamical behavior, and finally we will prove that this map is precisely $F_{\lambda,m}(z) = \lambda z^m \exp z$ for some parameter λ. We obtain thus that $F_{\lambda,m}$ is quasiconformally conjugate on the complement of $A^*(0)$ to $G_{\alpha,\beta,m}$ on the complement of the closed unit disc.

5.1. The connection: Proof of Theorem E. Before proving Theorem E we can prove that $W_m$ contains an open set of parameters. We recall that $W_m$ is given by
$$W_m = \{(\alpha, \beta) \mid \widetilde G_{\alpha,\beta,m} \text{ is quasisymmetrically conjugate to } \theta \mapsto m\theta\}.$$

Lemma 5.1. $\{(\alpha, \beta) \in \mathbb{R}^2 \mid |\beta| < m - 1\} \subset W_m$.

Proof. From Theorem 3.7 we can prove that $\widetilde G_{\alpha,\beta,m}$ is quasisymmetrically conjugate to $\theta \mapsto m\theta$ if we are able to prove that $\widetilde G_{\alpha,\beta,m}$ is an expanding map. In order to do so, a sufficient condition is to impose that $\min\{|\widetilde G'_{\alpha,\beta,m}(\theta)|,\ \theta \in \mathbb{T}\} > 1$. From the definition $\widetilde G_{\alpha,\beta,m}(\theta) = \alpha + m\theta + \beta \sin\theta$ we have that
$$\widetilde G'_{\alpha,\beta,m}(\theta) = m + \beta \cos\theta.$$
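Since $|m + \beta\cos\theta| \ge m - |\beta|$, the derivative formula gives the expansion condition directly whenever $|\beta| < m - 1$. A numerical illustration of this bound is sketched below; the function name `min_derivative` and the sampling resolution are assumptions made for the example.

```python
import math


def min_derivative(beta, m, samples=10000):
    """Minimum of |m + beta*cos(theta)| over a uniform grid of angles.

    The grid is a hypothetical discretization; for even `samples` it
    contains theta = 0 and theta = pi, where the extreme values occur.
    """
    return min(
        abs(m + beta * math.cos(2 * math.pi * j / samples))
        for j in range(samples)
    )
```

For example, `min_derivative(0.9, 2)` is about 1.1, so the circle map is expanding, whereas `min_derivative(1.5, 2)` is about 0.5 and the sufficient condition fails.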
Hence, it is easy to see that when $|\beta| < m - 1$ we obtain that $\min\{|\widetilde G'_{\alpha,\beta,m}(\theta)|,\ \theta \in \mathbb{T}\} > 1$.

Proof of Theorem E. Let α and β be in $W_m$. Let $h = h_{\alpha,\beta,m}$ be the quasisymmetric conjugacy, defined on the unit circle, such that $\widetilde G_{\alpha,\beta,m} = h^{-1} \circ g \circ h$, where $g(\theta) = m\theta$. Consider $\widetilde H = \widetilde H_{\alpha,\beta,m} : \mathbb{D} \to \mathbb{D}$ to be the Douady-Earle quasiconformal extension of $h$ such that $\widetilde H(0) = 0$.

We now define a new function $R = R_{\alpha,\beta,m} : \mathbb{C} \to \mathbb{C}$ as follows:
$$R(z) := \begin{cases} G_{\alpha,\beta,m}(z) & z \notin \mathbb{D}, \\ \widetilde H^{-1}\big((\widetilde H(z))^m\big) & z \in \mathbb{D}. \end{cases}$$
This map is equal to $G_{\alpha,\beta,m}$ outside $\mathbb{D}$ and it has the desired superattracting dynamics in $\mathbb{D}$, but it is not holomorphic on $\mathbb{D}$.

We proceed to construct an invariant almost complex structure, $\sigma = \sigma_{\alpha,\beta,m}$, with bounded dilatation ratio. Let $\sigma_0$ be the standard complex structure of $\mathbb{C}$. We define a new almost complex structure σ in $\mathbb{C}$,
$$\sigma := \begin{cases} (\widetilde H)^* \sigma_0 & \text{on } \mathbb{D}, \\ (R^n)^* \sigma & \text{on } R^{-n}(\mathbb{D}) \text{ for all } n \ge 1, \\ \sigma_0 & \text{on } \mathbb{C} \setminus \bigcup_{n \ge 1} R^{-n}(\mathbb{D}). \end{cases}$$
By construction σ is $R$-invariant, i.e., $(R)^* \sigma = \sigma$, and it has bounded distortion since $\widetilde H$ is quasiconformal and $R$ is holomorphic outside $\mathbb{D}$. If we apply the Measurable Riemann Mapping Theorem we obtain a quasiconformal map $\varphi = \varphi_{\alpha,\beta,m} : \mathbb{C} \to \mathbb{C}$ such that φ integrates the complex structure σ, i.e., $(\varphi)^* \sigma = \sigma_0$, normalized so that $\varphi(0) = 0$ and $\varphi(z_-) = -m$. Finally, we define $\tilde R = \tilde R_{\alpha,\beta,m} = \varphi \circ R \circ \varphi^{-1}$, which is analytic, hence an entire function. Our goal now is to show that there exists a complex value λ such that $\tilde R(z) = \lambda z^m \exp(z)$.

The map $\tilde R : \mathbb{C} \to \mathbb{C}$ is an entire map (∞ is an essential singularity) with a superattracting fixed point at the origin. Near the origin $\tilde R$ is conjugate to the map $z \mapsto z^m$. Moreover, $\tilde R$ has a critical point at $z = -m$, since the map $R$ has one critical point at $z_- \in \mathbb{C} \setminus \overline{\mathbb{D}}$ and $\varphi(z_-) = -m$. The other critical point of $G_{\alpha,\beta,m}$ is at $z_+$ and it has been erased by the quasiconformal surgery construction because it belonged to $\mathbb{D}$. Thus $\tilde R(z) = \nu z^m \exp(h_1(z))$. By using the same arguments as in Proposition 4.5 we can conclude that $\tilde R(z) = F_{\lambda,m}(z) = \lambda z^m \exp(z)$ for a suitable value of λ. By construction, the boundary of $A^*_{\lambda,m}(0)$ is a quasicircle, since $A^*_{\lambda,m}(0)$ is the quasiconformal image of the unit disk, obtaining thus a value $\lambda \in \mathbb{C}$ such that $\partial A^*_{\lambda,m}(0)$ is a quasicircle.

Acknowledgements. We would like to thank Christian Henriksen for many discussions and in particular for providing the idea of the proof of Theorem D. We would also like to thank Adrien Douady for all his valuable suggestions.
Communicated by G. Gallavotti
Commun. Math. Phys. 273, 785–801 (2007) Digital Object Identifier (DOI) 10.1007/s00220-007-0259-6
Communications in
Mathematical Physics
Partial Regularity of Solutions to the Four-Dimensional Navier-Stokes Equations at the First Blow-up Time
Hongjie Dong1,2, Dapeng Du3
1 Department of Mathematics, University of Chicago, 5734 S. University Avenue, Chicago, IL 60637, USA
2 School of Mathematics, Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA.
E-mail: [email protected]
3 School of Mathematical Sciences, Fudan University, Shanghai 200433, People’s Republic of China.
E-mail: [email protected]
Received: 16 August 2006 / Accepted: 8 December 2006
Published online: 15 May 2007 – © Springer-Verlag 2007
Abstract: The solutions of the incompressible Navier-Stokes equations in four spatial dimensions are considered. We prove that the two-dimensional Hausdorff measure of the set of singular points at the first blow-up time is equal to zero.

1. Introduction

In this paper we consider both the Cauchy problem and the initial-boundary value problem for the incompressible Navier-Stokes equations in four spatial dimensions with unit viscosity and zero external force:
\[ u_t + u\nabla u - \Delta u + \nabla p = 0, \qquad \operatorname{div} u = 0 \tag{1.1} \]
in a smooth domain $Q_T = \Omega \times (0, T) \subset \mathbb{R}^d \times \mathbb{R}$. The boundary condition $u|_{\partial\Omega \times [0,T]} = 0$ is imposed if $\Omega \neq \mathbb{R}^d$. Here $d = 4$ and the initial data $a$ is in the closure of $\{u \in C_0^\infty(\Omega)\,;\ \operatorname{div} u = 0\}$ in $L_d(\Omega)$ if $\Omega$ is bounded, or is in the closure of $\{u \in C^\infty(\Omega)\,;\ \operatorname{div} u = 0\}$ in $L_d(\Omega) \cap L_2(\Omega)$ if $\Omega = \mathbb{R}^d$. The local well-posedness of such problems is well known (see, for example, [9] and [5]). The solution $u$ is locally smooth in both the spatial and time variables. We are interested in the partial regularity of $u$ at the first blow-up time $T$.

Many authors have studied the partial regularity of solutions (in particular, weak solutions) of the Navier-Stokes equations, especially when $d$ is equal to three. V. Scheffer studied partial regularity in a series of papers [17, 18, 20]. In three space dimensions, he established various partial regularity results for weak solutions satisfying the so-called local energy inequality. For $d = 3$, the notion of suitable weak solutions was first introduced in a celebrated paper [1] by L. Caffarelli, R. Kohn and L. Nirenberg. They called a pair consisting of velocity $u$ and pressure $p$ a suitable weak solution if $u$ has finite energy norm, $p$ belongs to the Lebesgue space $L_{5/4}$, and $u$ and $p$ are weak solutions to the Navier-Stokes equations and satisfy a local energy inequality. After proving some criteria for local boundedness of solutions, they established partial regularity of solutions and estimated the Hausdorff dimension of the singular set. They proved that, for any suitable weak solution $u$, $p$, there is an open subset where the velocity field $u$ is Hölder continuous, and they showed that the 1-D Hausdorff measure of the complement of this subset is equal to zero. In [15], with zero external force, F. Lin gave a more direct, though sketched, proof of Caffarelli, Kohn and Nirenberg’s result. A detailed treatment was later given by O. Ladyzhenskaya and G. A. Seregin in [13]. Very recently, some extended results were obtained by Seregin [16] and by Gustafson, Kang and Tsai [6].

For $d = 4$, V. Scheffer proved in [19] that there exists a weak solution $u$ in $\mathbb{R}^4 \times \mathbb{R}^+$ such that $u$ is continuous outside a locally closed set of $\mathbb{R}^4 \times \mathbb{R}^+$ whose 3-D Hausdorff measure is finite. Although Scheffer’s paper is not recent, it appears to us that this is the only published result on the partial regularity of the 4-D Navier-Stokes equations.

Remark 1.1. The weak solution considered in [19] does not verify the local energy estimate. The existence of a weak solution satisfying the local energy estimate is still an open problem.

Now let us state our result. Instead of dealing with weak solutions, we work on classical solutions of the 4-D Navier-Stokes equations, which are regular before they blow up. Our first result is that the singular set at the first blow-up time is a compact set with zero 2-D Hausdorff measure. We show this after two partial regularity criteria are obtained. Our proof is conceptually similar to Lin’s in [15], but the problem is technically harder. The main difficulty is the lack of certain compactness. We overcome it by a novel use of the backward heat kernel (see the proof of Lemma 2.12) and by the use of two appropriate scaled norms of the pressure. It is possible because the nonlinear term is controlled by using the Sobolev embedding theorem, although we do not have a compact embedding here.

Remark 1.2. In the setting of classical solutions, our result is the 4-D version of Caffarelli, Kohn and Nirenberg’s theorem in [1].

As an application of one of the partial regularity criteria derived in the proof of the first result we get our second result: in the case $\Omega = \mathbb{R}^4$, if a solution blows up, it must blow up at a finite time.

Remark 1.3. We can prove a similar result in 3-D by using the same argument. Detailed discussions on the long-time behavior of solutions to the 3-D Navier-Stokes equations can be found in J. Heywood [7] and M. Wiegner [24] (and references therein).

It seems that we need some new method to deal with the five or higher dimensional case. To the authors’ best knowledge, all the existing methods on partial regularity for the Navier-Stokes equations share the following prerequisite condition: in the energy inequality the nonlinear term should be controlled by the energy norm under the Sobolev embedding theorem. Actually, four is the highest dimension in which we have such a condition. In five or higher dimensions, such a condition fails. Therefore, we cannot hope that the existing methods work in the five or higher dimensional case.

The article is organized as follows. Our main theorems (Theorems 2.1–2.5) are given in the following section. Some auxiliary estimates are proved in Sects. 3 and 4, among which Lemma 2.12 plays a crucial role. We give the proof of our main theorems in the last section.

Hongjie Dong was partially supported by the National Science Foundation under agreement No. DMS0111298. Dapeng Du was partially supported by a postdoctoral grant from School of Mathematical Sciences at Fudan University.
To conclude this Introduction, we explain some notation used in what follows: $\mathbb{R}^d$ is a $d$-dimensional Euclidean space with a fixed orthonormal basis. A typical point in $\mathbb{R}^d$ is denoted by $x = (x_1, x_2, \ldots, x_d)$. As usual the summation convention over repeated indices is enforced, and $x \cdot y = x_i y_i$ is the inner product for $x, y \in \mathbb{R}^d$. For $t > 0$, we denote $H_t = \mathbb{R}^4 \times (0, t)$, and space-time points are denoted by $z = (x, t)$. Various constants are denoted by $N$ in general and the expression $N = N(\cdots)$ means that the given constant $N$ depends only on the contents of the parentheses.

2. Setting and Main Results

We shall use the notation in [13]. Let $\omega$ be a domain in some finite-dimensional space. Denote by $L_p(\omega; \mathbb{R}^n)$ and $W_p^k(\omega; \mathbb{R}^n)$ the usual Lebesgue and Sobolev spaces of functions from $\omega$ into $\mathbb{R}^n$. Denote the norms of the spaces $L_p(\omega; \mathbb{R}^n)$ and $W_p^k(\omega; \mathbb{R}^n)$ by $\|\cdot\|_{L_p,\omega}$ and $\|\cdot\|_{W_p^k,\omega}$ respectively. As usual, for any measurable function $u = u(x, t)$ and any $p, q \in [1, +\infty]$, we define
\[ \|u(x,t)\|_{L_t^p L_x^q} := \big\| \|u(x,t)\|_{L_x^q} \big\|_{L_t^p}. \]
For summable functions $p$, $u = (u_i)$ and $\tau = (\tau_{ij})$, we use the following differential operators
\[ \partial_t u = u_t = \frac{\partial u}{\partial t}, \quad u_{,i} = \frac{\partial u}{\partial x_i}, \quad \nabla p = (p_{,i}), \quad \nabla u = (u_{i,j}), \]
\[ \operatorname{div} u = u_{i,i}, \quad \operatorname{div} \tau = (\tau_{ij,j}), \quad \Delta u = \operatorname{div} \nabla u, \]
which are understood in the sense of distributions. We use the notation of spheres, balls and parabolic cylinders,
\[ S(x_0, r) = \{x \in \mathbb{R}^4 \mid |x - x_0| = r\}, \quad S(r) = S(0, r), \quad S = S(1); \]
\[ B(x_0, r) = \{x \in \mathbb{R}^4 \mid |x - x_0| < r\}, \quad B(r) = B(0, r), \quad B = B(1); \]
\[ Q(z_0, r) = B(x_0, r) \times (t_0 - r^2, t_0), \quad Q(r) = Q(0, r), \quad Q = Q(1). \]
Also we denote mean values of summable functions as follows:
\[ [u]_{x_0,r}(t) = \frac{1}{|B(r)|} \int_{B(x_0,r)} u(x,t)\,dx, \qquad (u)_{z_0,r}(t) = \frac{1}{|Q(r)|} \int_{Q(z_0,r)} u\,dz. \]
In the case $\Omega = \mathbb{R}^d$, Kato proved in a well-known paper [9] that the problem is locally well-posed. By the known local regularity theory for the Navier-Stokes equations it can be proved that Kato’s (also known as mild) solutions are smooth in $\mathbb{R}^d \times (0, T_*]$ for some $T_* > 0$. Meanwhile, for bounded $\Omega$, it is also known (see [5]) that there exists a unique solution $u$ of (1.1) satisfying
(1) $u \in C([0, T_*]; L_d)$, $u(0) = a$ for some $T_* > 0$;
(2) $u \in C((0, T_*]; D(A^\alpha))$ for any $0 < \alpha < 1$;
(3) $A^\alpha u(t) = o(t^{-\alpha})$ as $t \to 0$.
Here $A$ is the Stokes operator. Moreover, such a solution is smooth in $\Omega \times (0, T_*]$. In both cases, let $T = \sup T_*$ be the first blow-up time. Then $u$ is a smooth function in $Q_T$.

Let $\eta(x)$ be a smooth function on $\mathbb{R}^4$ supported in the unit ball $B(1)$, $0 \le \eta \le 1$ and $\eta \equiv 1$ on $\bar{B}(2/3)$. Let $z_0$ be a given point in $\Omega \times (0, T]$ and $r > 0$ a real number such that $Q(z_0, r) \subset Q_T$. It is known that for a.e. $t \in (t_0 - r^2, t_0)$, in the sense of distributions, one has
\[ \Delta p = -\frac{\partial^2}{\partial x_i \partial x_j}(u_i u_j) = -\frac{\partial^2}{\partial x_i \partial x_j}\big[(u_i - [u_i]_{x_0,r})(u_j - [u_j]_{x_0,r})\big] \quad \text{in } B(x_0, r). \]
For these $t$, we consider the decomposition $p = \tilde{p}_{x_0,r} + h_{x_0,r}$ in $B(x_0, r)$, where $\tilde{p}_{x_0,r}$ is the Newtonian potential of $(u_i - [u_i]_{x_0,r})(u_j - [u_j]_{x_0,r})\,\eta(x/r)$. Then $h_{x_0,r}$ is harmonic in $B(x_0, r/2)$. In the sequel, we omit the indices of $\tilde{p}$ and $h$ whenever there is no confusion. The following notation will be used throughout the article:
\[ \begin{aligned}
A(r) &= A(r, z_0) = \operatorname*{ess\,sup}_{t_0 - r^2 \le t \le t_0} \frac{1}{r^2} \int_{B(x_0,r)} |u(x,t)|^2\,dx, \\
E(r) &= E(r, z_0) = \frac{1}{r^2} \int_{Q(z_0,r)} |\nabla u|^2\,dz, \\
C(r) &= C(r, z_0) = \frac{1}{r^3} \int_{Q(z_0,r)} |u|^3\,dz, \\
D(r) &= D(r, z_0) = \frac{1}{r^3} \int_{Q(z_0,r)} |p - [h]_{x_0,r}|^{3/2}\,dz, \\
F(r) &= F(r, z_0) = \frac{1}{r^2} \Big[ \int_{t_0 - r^2}^{t_0} \Big( \int_{B(x_0,r)} |p - [h]_{x_0,r}|^{1+\alpha}\,dx \Big)^{\frac{1}{2\alpha}} dt \Big]^{\frac{2\alpha}{1+\alpha}},
\end{aligned} \]
where $\alpha \in (0, 1)$ is a number to be specified later. Notice that these objects are all invariant under the natural scaling. Here are our main results:

Theorem 2.1. Let $\Omega$ be a smooth bounded set or the whole space $\mathbb{R}^4$ and let $(u, p)$ be the solution of (1.1). There is a positive number $\varepsilon_0$ satisfying the following property. Assume that for a point $z_0 \in \Omega \times \{T\}$ the inequality
\[ \limsup_{r \downarrow 0} E(r) \le \varepsilon_0 \tag{2.1} \]
holds. Then $z_0$ is a regular point.

Theorem 2.2. Let $\Omega$ be a smooth bounded set or the whole space $\mathbb{R}^4$ and let $(u, p)$ be the solution of (1.1). There is a positive number $\varepsilon_0$ satisfying the following property. Assume that for a point $z_0 \in Q_T$ and for some $\rho_0 > 0$ we have $Q(z_0, \rho_0) \subset Q_T$ and
\[ C(\rho_0) + D(\rho_0) + F(\rho_0) \le \varepsilon_0. \tag{2.2} \]
Then $z_0$ is a regular point.
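The scale invariance of $A$, $E$, $C$, $D$ and $F$ can be checked by counting powers of the length scale. The following is a minimal sketch of that bookkeeping, assuming the natural scaling $u \sim r^{-1}$, $\nabla u \sim r^{-2}$, $p \sim r^{-2}$, $dx \sim r^d$, $dt \sim r^2$; the function name `scaling_exponents` is ours and is not part of the paper.

```python
from fractions import Fraction


def scaling_exponents(d, alpha):
    """Net power of the length scale r carried by A, E, C, D, F in dimension d.

    Bookkeeping assumption: u ~ r^-1, grad u ~ r^-2, p ~ r^-2,
    dx ~ r^d, dt ~ r^2 (the natural Navier-Stokes scaling).
    An exponent of 0 means the quantity is scale invariant.
    """
    d = Fraction(d)
    alpha = Fraction(alpha)
    # A(r) = sup_t r^-2 * int_B |u|^2 dx
    a = -2 + d - 2
    # E(r) = r^-2 * int_Q |grad u|^2 dz
    e = -2 + (d + 2) - 4
    # C(r) = r^-3 * int_Q |u|^3 dz
    c = -3 + (d + 2) - 3
    # D(r) = r^-3 * int_Q |p|^(3/2) dz
    dd = -3 + (d + 2) - Fraction(3, 2) * 2
    # F(r) = r^-2 * [ int_t ( int_B |p|^(1+alpha) dx )^(1/(2 alpha)) dt ]^(2 alpha/(1+alpha))
    inner = (d - 2 * (1 + alpha)) / (2 * alpha)
    f = -2 + (inner + 2) * (2 * alpha) / (1 + alpha)
    return {"A": a, "E": e, "C": c, "D": dd, "F": f}


if __name__ == "__main__":
    print(scaling_exponents(4, Fraction(1, 27)))
```

For $d = 4$ every exponent vanishes for any admissible $\alpha$, whereas for $d = 3$ the same normalizations are no longer invariant, which is one way to see that the powers of $r$ in the definitions are tied to the dimension.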
Remark 2.3. It is worth noting that the object under estimation in condition (2.1) involves the gradient of $u$ while the objects in condition (2.2) involve only $u$ and $p$ themselves. However, by using condition (2.1) one can obtain a better estimate of the Hausdorff dimension of the set of all singular points.

Theorem 2.4. Let $\Omega$ be a smooth bounded set or the whole space $\mathbb{R}^4$ and let $(u, p)$ be the solution of (1.1). Then the 2-D Hausdorff measure of the set of singular points in $\Omega \times \{T\}$ is equal to zero.

Theorem 2.5. Assume $\Omega$ is the whole space $\mathbb{R}^4$. Let $(u, p)$ be the solution of (1.1). If the solution does not blow up in finite time, then $u$ is bounded and smooth in $\mathbb{R}^4 \times (0, +\infty)$.

In the sequel, we shall make use of the following well-known interpolation inequality.

Lemma 2.6. For any function $u \in W_2^1(\mathbb{R}^4)$ and real numbers $q \in [2, 4]$ and $r > 0$,
\[ \int_{B_r} |u|^q\,dx \le N(q) \Big[ \Big( \int_{B_r} |\nabla u|^2\,dx \Big)^{q-2} \Big( \int_{B_r} |u|^2\,dx \Big)^{2 - q/2} + r^{-2(q-2)} \Big( \int_{B_r} |u|^2\,dx \Big)^{q/2} \Big]. \tag{2.3} \]
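Two instances of (2.3) are used repeatedly below. Written out (a direct substitution into (2.3), added here only for orientation), they read, for $q = 3$,
\[ \int_{B_r} |u|^3\,dx \le N \Big[ \Big( \int_{B_r} |\nabla u|^2\,dx \Big) \Big( \int_{B_r} |u|^2\,dx \Big)^{1/2} + r^{-2} \Big( \int_{B_r} |u|^2\,dx \Big)^{3/2} \Big], \]
and, for $q = 2(1+\alpha)$ with $\alpha \in (0, 1]$,
\[ \int_{B_r} |u|^{2(1+\alpha)}\,dx \le N(\alpha) \Big[ \Big( \int_{B_r} |\nabla u|^2\,dx \Big)^{2\alpha} \Big( \int_{B_r} |u|^2\,dx \Big)^{1-\alpha} + r^{-4\alpha} \Big( \int_{B_r} |u|^2\,dx \Big)^{1+\alpha} \Big]. \]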
Let $(u, p)$ be the solution of the Navier-Stokes equations (1.1).

Lemma 2.7. (i) We have
\[ u \in L_\infty(0, T; L_2(\Omega; \mathbb{R}^4)) \cap L_2(0, T; W_2^1(\Omega; \mathbb{R}^4)) \cap L_3(Q_T), \tag{2.4} \]
\[ p \in L_{3/2}(Q_T). \tag{2.5} \]
(ii) For $0 < t \le T$ and for all non-negative functions $\psi \in C_0^\infty(\Omega \times (0, \infty))$, the following generalized energy inequality is satisfied:
\[ \operatorname*{ess\,sup}_{0 < s \le t} \int_\Omega |u(x,s)|^2 \psi(x,s)\,dx + 2 \int_{Q_t} |\nabla u|^2 \psi\,dx\,ds \le \int_{Q_t} \big\{ |u|^2 (\psi_t + \Delta\psi) + (|u|^2 + 2p)\, u \cdot \nabla\psi \big\}\,dx\,ds. \tag{2.6} \]
Sketch of the proof. To prove $u \in L_\infty(0, T; L_2(\Omega; \mathbb{R}^4)) \cap L_2(0, T; W_2^1(\Omega; \mathbb{R}^4))$, it suffices to multiply the first equation of (1.1) by $u$ and integrate by parts. By using Lemma 2.6 with $q = 3$ and integrating in $t$, we obtain $u \in L_3(Q_T)$. Since in $Q_T$ it holds that
\[ \Delta p = -\frac{\partial^2}{\partial x_i \partial x_j}(u_i u_j), \]
(2.5) follows from the Calderón-Zygmund estimate. Next, let us prove part (ii). First, notice that for $0 < t < T$, (2.6) can be obtained by multiplying the first equation of
(1.1) by $u\psi$, integrating by parts and integrating with respect to $t$. For the case when $t = T$, due to part (i) and Hölder’s inequality both sides of (2.6) are finite. Then it remains to let $t \to T$ and take the limit on both sides.

We shall prove Theorem 2.4 in three steps. First, we want to control $A$, $C$, $D$, $F$ in a smaller ball by their values in a larger ball under the assumption that $E$ is sufficiently small. Similar results can be found in [13] or [15] in the case when the space dimension is three.

Lemma 2.8. Suppose $\gamma \in (0, 1)$, $\rho > 0$ are constants and $Q(z_0, \rho) \subset Q_T$. Then we have
\[ C(\gamma\rho) \le N \big[ \gamma^{-3} A^{1/2}(\rho) E(\rho) + \gamma^{-9/2} A^{3/4}(\rho) E^{3/4}(\rho) + \gamma C(\rho) \big], \tag{2.7} \]
where $N$ is a constant independent of $\gamma$, $\rho$ and $z_0$.

Lemma 2.9. Suppose $\alpha \in (0, 1/2]$, $\gamma \in (0, 1/3]$, $\rho > 0$ are constants and $Q(z_0, \rho) \subset Q_T$. Then we have
\[ F(\gamma\rho) \le N(\alpha) \big[ \gamma^{-2} A^{\frac{1-\alpha}{1+\alpha}}(\rho) E^{\frac{2\alpha}{1+\alpha}}(\rho) + \gamma^{\frac{3-\alpha}{1+\alpha}} F(\rho) \big], \tag{2.8} \]
where $N(\alpha)$ is a constant independent of $\gamma$, $\rho$ and $z_0$. In particular, for $\alpha = 1/2$ we have
\[ D(\gamma\rho) \le N \big[ \gamma^{-3} A^{1/2}(\rho) E(\rho) + \gamma^{5/2} D(\rho) \big]. \tag{2.9} \]
Moreover, it holds that
\[ D(\gamma\rho) \le N(\alpha) \big[ \gamma^{-3} (A(\rho) + E(\rho))^{3/2} + \gamma^{(9-3\alpha)/(2+2\alpha)} F^{3/2}(\rho) \big]. \tag{2.10} \]
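For orientation, the passage from (2.8) to (2.9) can be unpacked as follows (a sketch that uses only the definitions of $D$ and $F$ given above). With $\alpha = 1/2$ the inner exponent $1/(2\alpha)$ equals $1$ and the outer exponent $2\alpha/(1+\alpha)$ equals $2/3$, so
\[ F(r)\big|_{\alpha = 1/2} = \frac{1}{r^2} \Big( \int_{Q(z_0,r)} |p - [h]_{x_0,r}|^{3/2}\,dz \Big)^{2/3} = \frac{1}{r^2} \big( r^3 D(r) \big)^{2/3} = D^{2/3}(r). \]
Hence (2.8) with $\alpha = 1/2$ reads
\[ D^{2/3}(\gamma\rho) \le N \big[ \gamma^{-2} A^{1/3}(\rho) E^{2/3}(\rho) + \gamma^{5/3} D^{2/3}(\rho) \big], \]
and raising both sides to the power $3/2$, together with the elementary bound $(a + b)^{3/2} \le 2^{1/2} (a^{3/2} + b^{3/2})$ for $a, b \ge 0$, gives (2.9).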
Lemma 2.10. Suppose $\theta \in (0, 1/2]$, $\rho > 0$ are constants and $Q(z_0, \rho) \subset Q_T$. Then we have
\[ A(\theta\rho) + E(\theta\rho) \le N \theta^{-2} \big[ C^{2/3}(\rho) + C(\rho) + C^{1/3}(\rho) D^{2/3}(\rho) \big]. \]
In particular, when $\theta = 1/2$ we have
\[ A(\rho/2) + E(\rho/2) \le N \big[ C^{2/3}(\rho) + C(\rho) + C^{1/3}(\rho) D^{2/3}(\rho) \big]. \tag{2.11} \]
As a conclusion, we obtain

Proposition 2.11. For any $\varepsilon_0 > 0$, there exists $\varepsilon_1 > 0$ small such that for any $z_0 \in Q_T \cup (\mathbb{R}^4 \times \{T\})$ satisfying
\[ \limsup_{r \to 0} E(r) \le \varepsilon_1, \tag{2.12} \]
we can find $\rho_0$ sufficiently small such that
\[ A(\rho_0) + E(\rho_0) + C(\rho_0) + D(\rho_0) + F(\rho_0) \le \varepsilon_0. \tag{2.13} \]
In the second step, our goal is to estimate the values of $A$, $E$, $C$ and $F$ in a smaller ball by their values in a larger ball.

Lemma 2.12. Suppose $\rho > 0$, $\theta \in (0, 1/3]$ are constants and $Q(z_1, \rho) \subset Q_T$. Then we have
\[ A(\theta\rho) + E(\theta\rho) \le N \theta^2 A(\rho) + N \theta^{-3} \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2}, \tag{2.14} \]
where $N$ is a constant independent of $\rho$, $\theta$ and $z_1$.
Lemma 2.13. Suppose $\rho > 0$ is a constant and $Q(z_1, \rho) \subset Q_T$. Then we can find $\theta_1 > 0$ small such that
\[ A(\theta_1\rho) + E(\theta_1\rho) + F(\theta_1\rho) \le \frac{1}{2} \big[ A(\rho) + E(\rho) + F(\rho) \big] + N(\theta_1) \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2}, \tag{2.15} \]
where $N$ is a constant independent of $\rho$ and $z_1$.

Proposition 2.14. For any $\varepsilon_2 > 0$, there exists $\varepsilon_0 > 0$ small such that: if for some $z_0 \in Q_T \cup (\mathbb{R}^4 \times \{T\})$ and $\rho_0 > 0$ we have $Q(z_0, \rho_0) \subset Q_T$ and
\[ C(\rho_0) + D(\rho_0) + F(\rho_0) \le \varepsilon_0, \tag{2.16} \]
then for any $\rho \in (0, \rho_0/4)$ and $z_1 \in Q(z_0, \rho/4)$ we have
\[ A(\rho, z_1) + C(\rho, z_1) + E(\rho, z_1) + F(\rho, z_1) \le \varepsilon_2. \tag{2.17} \]
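The mechanism behind Proposition 2.14 is that an inequality of the form (2.15) can be iterated once the starting value is small. The following toy computation is only a sketch of that mechanism: it iterates the scalar majorant $x \mapsto \frac12 x + N x^{3/2}$, with a hypothetical constant `N_CONST` standing in for $N(\theta_1)$, whose actual value is not computed in the text.

```python
def iterate_bound(x0, n_const, steps):
    """Iterate the majorant x -> x/2 + n_const * x**1.5 suggested by (2.15)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(0.5 * x + n_const * x ** 1.5)
    return xs


# Hypothetical constant standing in for N(theta_1); not a value from the paper.
N_CONST = 10.0
# If N * sqrt(x0) <= 1/4, each step shrinks the bound by a factor of at most 3/4.
X0 = (1.0 / (4.0 * N_CONST)) ** 2
bounds = iterate_bound(X0, N_CONST, steps=20)

if __name__ == "__main__":
    for k, b in enumerate(bounds[:5]):
        print(k, b)
```

Once $N x_0^{1/2} \le 1/4$, the superlinear term is dominated by a quarter of the linear one, so the sequence decays geometrically. This is the role played by the smallness condition in (2.16).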
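Finally, we apply Schoen’s trick to prove the main theorems.

3. Proof of Proposition 2.11

We will prove these lemmas briefly. For more detail, we refer the reader to [13].

Proof of Lemma 2.8. Denote $r = \gamma\rho$. We have, by using Poincaré’s inequality and Cauchy’s inequality,
\[ \begin{aligned}
\int_{B(x_0,r)} |u|^2\,dx &= \int_{B(x_0,r)} \big( |u|^2 - [|u|^2]_{x_0,\rho} \big)\,dx + \int_{B(x_0,r)} [|u|^2]_{x_0,\rho}\,dx \\
&\le N\rho \int_{B(x_0,\rho)} |\nabla u||u|\,dx + \Big( \frac{r}{\rho} \Big)^4 \int_{B(x_0,\rho)} |u|^2\,dx \\
&\le N\rho \Big( \int_{B(x_0,\rho)} |\nabla u|^2\,dx \Big)^{1/2} \Big( \int_{B(x_0,\rho)} |u|^2\,dx \Big)^{1/2} + \Big( \frac{r}{\rho} \Big)^4 \int_{B(x_0,\rho)} |u|^2\,dx \\
&\le N\rho^2 A^{1/2}(\rho) \Big( \int_{B(x_0,\rho)} |\nabla u|^2\,dx \Big)^{1/2} + \Big( \frac{r}{\rho} \Big)^4 \Big( \int_{B(x_0,\rho)} |u|^3\,dx \Big)^{2/3} \rho^{4/3}.
\end{aligned} \]
Owing to Lemma 2.6 with $q = 3$ and using the inequality above, one gets
\[ \int_{B(x_0,r)} |u|^3\,dx \le N \Big[ \rho A^{1/2}(\rho) \int_{B(x_0,r)} |\nabla u|^2\,dx + \rho^3 r^{-2} A^{3/4}(\rho) \Big( \int_{B(x_0,\rho)} |\nabla u|^2\,dx \Big)^{3/4} + \Big( \frac{r}{\rho} \Big)^4 \int_{B(x_0,\rho)} |u|^3\,dx \Big]. \]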
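By integrating with respect to $t$ on $(t_0 - r^2, t_0)$ and applying Hölder’s inequality, we get
\[ \int_{Q(z_0,r)} |u|^3\,dz \le N \Big[ \rho A^{1/2}(\rho) \int_{Q(z_0,\rho)} |\nabla u|^2\,dz + \rho^3 r^{-3/2} A^{3/4}(\rho) \Big( \int_{Q(z_0,\rho)} |\nabla u|^2\,dz \Big)^{3/4} + \Big( \frac{r}{\rho} \Big)^4 \int_{Q(z_0,\rho)} |u|^3\,dz \Big]. \]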
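To see how (2.7) comes out of the last inequality (a bookkeeping sketch, with $r = \gamma\rho$), divide by $r^3$ and insert $\int_{Q(z_0,\rho)} |\nabla u|^2\,dz = \rho^2 E(\rho)$ and $\int_{Q(z_0,\rho)} |u|^3\,dz = \rho^3 C(\rho)$. The three terms become
\[ \frac{\rho A^{1/2}(\rho)\, \rho^2 E(\rho)}{r^3} = \gamma^{-3} A^{1/2}(\rho) E(\rho), \qquad \frac{\rho^3 r^{-3/2} A^{3/4}(\rho)\, \big( \rho^2 E(\rho) \big)^{3/4}}{r^3} = \gamma^{-9/2} A^{3/4}(\rho) E^{3/4}(\rho), \qquad \frac{(r/\rho)^4 \rho^3 C(\rho)}{r^3} = \gamma\, C(\rho). \]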
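The conclusion of Lemma 2.8 follows immediately.

Proof of Lemma 2.9. Denote $r = \gamma\rho$. Recall the decomposition of $p$ introduced in Sect. 2. By using the Calderón-Zygmund estimate, Lemma 2.6 with $q = 2(1+\alpha)$ and the Poincaré inequality, one has
\[ \begin{aligned}
\int_{B(x_0,r)} |\tilde{p}_{x_0,r}(x,t)|^{1+\alpha}\,dx &\le N \int_{B(x_0,r)} |u - [u]_{x_0,r}|^{2(1+\alpha)}\,dx \\
&\le N \Big( \int_{B(x_0,r)} |\nabla u|^2\,dx \Big)^{2\alpha} \Big( \int_{B(x_0,r)} |u - [u]_{x_0,r}|^2\,dx \Big)^{1-\alpha} + N r^{-4\alpha} \Big( \int_{B(x_0,r)} |u - [u]_{x_0,r}|^2\,dx \Big)^{1+\alpha} \\
&\le N \Big( \int_{B(x_0,r)} |\nabla u|^2\,dx \Big)^{2\alpha} \Big( \int_{B(x_0,r)} |u|^2\,dx \Big)^{1-\alpha}.
\end{aligned} \tag{3.1} \]
Here we also use the inequality
\[ \int_{B(x_0,r)} |u - [u]_{x_0,r}|^2\,dx \le \int_{B(x_0,r)} |u|^2\,dx. \]
Similarly,
\[ \int_{B(x_0,\rho)} |\tilde{p}_{x_0,\rho}|^{1+\alpha}\,dx \le N \Big( \int_{B(x_0,\rho)} |\nabla u|^2\,dx \Big)^{2\alpha} \Big( \int_{B(x_0,\rho)} |u|^2\,dx \Big)^{1-\alpha}. \tag{3.2} \]
Since $h_{x_0,\rho}$ is harmonic in $B(x_0, \rho/2)$, any Sobolev norm of $h_{x_0,\rho}$ in a smaller ball can be estimated by any of its $L_p$ norms in $B(x_0, \rho/2)$. Thus, by using the Poincaré inequality one can obtain
\[ \begin{aligned}
\int_{B(x_0,r)} |h_{x_0,\rho} - [h_{x_0,\rho}]_{x_0,r}|^{1+\alpha}\,dx &\le N r^{1+\alpha} \int_{B(x_0,r)} |\nabla h_{x_0,\rho}|^{1+\alpha}\,dx \\
&\le N r^{5+\alpha} \sup_{B(x_0,r)} |\nabla h_{x_0,\rho}|^{1+\alpha} \\
&\le N \Big( \frac{r}{\rho} \Big)^{5+\alpha} \int_{B(x_0,\rho/2)} |h_{x_0,\rho}(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha}\,dx \\
&\le N \Big( \frac{r}{\rho} \Big)^{5+\alpha} \Big[ \int_{B(x_0,\rho)} \big( |p(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha} + |\tilde{p}_{x_0,\rho}(x,t)|^{1+\alpha} \big)\,dx \Big].
\end{aligned} \tag{3.3} \]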
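Combining (3.2) and (3.3) together yields
\[ \begin{aligned}
\int_{B(x_0,r)} |p(x,t) - [h_{x_0,\rho}]_{x_0,r}|^{1+\alpha}\,dx &\le N \Big( \int_{B(x_0,\rho)} |\nabla u(x,t)|^2\,dx \Big)^{2\alpha} \Big( \int_{B(x_0,\rho)} |u(x,t)|^2\,dx \Big)^{1-\alpha} \\
&\quad + N \Big( \frac{r}{\rho} \Big)^{5+\alpha} \int_{B(x_0,\rho)} |p(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha}\,dx.
\end{aligned} \tag{3.4} \]
Since $\tilde{p}_{x_0,r} + h_{x_0,r} = p = \tilde{p}_{x_0,\rho} + h_{x_0,\rho}$ in $B(x_0, r)$, by Hölder’s inequality,
\[ \begin{aligned}
\int_{B(x_0,r)} \big| [h_{x_0,\rho}]_{x_0,r} - [h_{x_0,r}]_{x_0,r} \big|^{1+\alpha}\,dx &= N r^4 \big| [h_{x_0,\rho}]_{x_0,r} - [h_{x_0,r}]_{x_0,r} \big|^{1+\alpha} \\
&= N r^4 \big| [\tilde{p}_{x_0,\rho}]_{x_0,r} - [\tilde{p}_{x_0,r}]_{x_0,r} \big|^{1+\alpha} \\
&\le N \int_{B(x_0,r)} \big( |\tilde{p}_{x_0,\rho}|^{1+\alpha} + |\tilde{p}_{x_0,r}|^{1+\alpha} \big)\,dx.
\end{aligned} \tag{3.5} \]
From (3.1), (3.2), (3.4) and (3.5) we get
\[ \begin{aligned}
\int_{B(x_0,r)} |p(x,t) - [h_{x_0,r}]_{x_0,r}|^{1+\alpha}\,dx &\le N \Big( \int_{B(x_0,\rho)} |\nabla u(x,t)|^2\,dx \Big)^{2\alpha} \Big( \int_{B(x_0,\rho)} |u(x,t)|^2\,dx \Big)^{1-\alpha} \\
&\quad + N \Big( \frac{r}{\rho} \Big)^{5+\alpha} \int_{B(x_0,\rho)} |p(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha}\,dx.
\end{aligned} \tag{3.6} \]
Raising to the power $1/(2\alpha)$ and integrating with respect to $t$ in $(t_0 - r^2, t_0)$ completes the proof of (2.8) and also (2.9).

To prove (2.10), we use a slightly different estimate from (3.3). Again, since $h$ is harmonic in $B(x_0, \rho/2)$, we have
\[ \begin{aligned}
\int_{B(x_0,r)} |h_{x_0,\rho} - [h_{x_0,\rho}]_{x_0,r}|^{3/2}\,dx &\le N r^{3/2} \int_{B(x_0,r)} |\nabla h_{x_0,\rho}|^{3/2}\,dx \\
&\le N r^{11/2} \sup_{B(x_0,r)} |\nabla h_{x_0,\rho}|^{3/2} \\
&\le N \frac{r^{11/2}}{\rho^{3/2 + 6/(1+\alpha)}} \Big[ \int_{B(x_0,\rho/2)} |h_{x_0,\rho}(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha}\,dx \Big]^{\frac{3}{2(1+\alpha)}} \\
&\le N \frac{r^{11/2}}{\rho^{3/2 + 6/(1+\alpha)}} \Big\{ \Big[ \int_{B(x_0,\rho)} |p(x,t) - [h_{x_0,\rho}]_{x_0,\rho}|^{1+\alpha}\,dx \Big]^{\frac{3}{2(1+\alpha)}} + \Big[ \int_{B(x_0,\rho)} |\tilde{p}_{x_0,\rho}(x,t)|^{1+\alpha}\,dx \Big]^{\frac{3}{2(1+\alpha)}} \Big\}.
\end{aligned} \tag{3.7} \]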
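Similar to (3.6), we obtain
\[ \begin{aligned}
\int_{B(x_0,r)} |p(x,t) - [h_{x_0,r}]_{x_0,r}|^{3/2}\,dx &\le N \Big( \int_{B(x_0,\rho)} |\nabla u(x,t)|^2\,dx \Big) \Big( \int_{B(x_0,\rho)} |u(x,t)|^2\,dx \Big)^{1/2} \\
&\quad + N \frac{r^{11/2}}{\rho^{3/2 + 6/(1+\alpha)}} \Big\{ \Big[ \int_{B(x_0,\rho)} |p(x,t) - [h]_{x_0,\rho}|^{1+\alpha}\,dx \Big]^{\frac{3}{2(1+\alpha)}} \\
&\qquad + \Big( \int_{B(x_0,\rho)} |\nabla u(x,t)|^2\,dx \Big)^{\frac{3\alpha}{1+\alpha}} \Big( \int_{B(x_0,\rho)} |u(x,t)|^2\,dx \Big)^{\frac{3(1-\alpha)}{2(1+\alpha)}} \Big\}.
\end{aligned} \tag{3.8} \]
Integrating with respect to $t$ in $(t_0 - r^2, t_0)$ and applying Hölder’s inequality completes the proof of (2.10).

Proof of Lemma 2.10. Let $r = \theta\rho$. In the energy inequality (2.6), we put $t = t_0$ and choose a suitable smooth cut-off function $\psi$ such that
\[ \psi \equiv 0 \text{ in } Q_{t_0} \setminus Q(z_0, \rho), \quad 0 \le \psi \le 1 \text{ in } Q_T, \quad \psi \equiv 1 \text{ in } Q(z_0, r), \]
\[ |\nabla\psi| < N\rho^{-1}, \quad |\partial_t\psi| + |\nabla^2\psi| < N\rho^{-2} \text{ in } Q_{t_0}. \]
By using (2.6) and because $u$ is divergence free, we get
\[ A(r) + 2E(r) \le \frac{N}{r^2} \Big[ \frac{1}{\rho^2} \int_{Q(z_0,\rho)} |u|^2\,dz + \frac{1}{\rho} \int_{Q(z_0,\rho)} \big( |u|^2 + 2|p - [h]_{x_0,\rho}| \big) |u|\,dz \Big]. \]
Due to Hölder’s inequality, one can obtain
\[ \int_{Q(z_0,\rho)} |u|^2\,dz \le \Big( \int_{Q(z_0,\rho)} |u|^3\,dz \Big)^{2/3} \Big( \int_{Q(z_0,\rho)} dz \Big)^{1/3} \le N\rho^4 C^{2/3}(\rho), \]
\[ \int_{Q(z_0,\rho)} |p - [h]_{x_0,\rho}||u|\,dz \le \Big( \int_{Q(z_0,\rho)} |p - [h]_{x_0,\rho}|^{3/2}\,dz \Big)^{2/3} \Big( \int_{Q(z_0,\rho)} |u|^3\,dz \Big)^{1/3} \le N\rho^3 D^{2/3}(\rho) C^{1/3}(\rho). \]
Then the conclusion of Lemma 2.10 follows immediately.

Proof of Proposition 2.11. Let us first prove (2.13) without the presence of $F$ on the left-hand side. For a given point $z_0 = (x_0, t_0) \in Q_T \cup (\mathbb{R}^4 \times \{T\})$ satisfying (2.12), choose $\rho_0 > 0$ such that $Q(z_0, \rho_0) \subset Q_T$. Then for any $\rho \in (0, \rho_0]$ and $\gamma \in (0, 1/6)$, by using (2.11),
\[ A(\gamma\rho) + E(\gamma\rho) \le N \big[ C^{2/3}(2\gamma\rho) + C(2\gamma\rho) + D(2\gamma\rho) \big]. \]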
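This estimate, (2.7) and (2.9) together with Young’s inequality imply
\[ \begin{aligned}
&A(\gamma\rho) + E(\gamma\rho) + C(\gamma\rho) + D(\gamma\rho) \\
&\quad \le N \big[ \gamma^{2/3} C^{2/3}(\rho) + \gamma^{5/2} D(\rho) + \gamma C(\rho) + \gamma A(\rho) \big] + N\gamma^{-100} \big( E(\rho) + E^3(\rho) \big) \\
&\quad \le N\gamma^{2/3} \big[ A(\rho) + E(\rho) + C(\rho) + D(\rho) \big] + N\gamma^{2/3} + N\gamma^{-100} \big( E(\rho) + E^3(\rho) \big).
\end{aligned} \tag{3.9} \]
It is easy to see that for any $\varepsilon_3 > 0$, there are sufficiently small real numbers $\gamma \le 1/(2N)^{3/2}$ and $\varepsilon_1$ such that if (2.12) holds then for all small $\rho$ we have
\[ N\gamma^{2/3} + N\gamma^{-100} \big( E(\rho) + E^3(\rho) \big) < \varepsilon_3/2. \]
By using (3.9) we reach
\[ A(\rho_1) + C(\rho_1) + D(\rho_1) \le \varepsilon_3 \]
for some $\rho_1 > 0$ small enough. To include $F$ in the estimate, it suffices to use (2.8).

4. Proof of Proposition 2.14

Proof of Lemma 2.12. Let $r = \theta\rho$. Define the backward heat kernel as
\[ \Gamma(t, x) = \frac{1}{16\pi^2 (r^2 + t_1 - t)^2} \exp\Big( -\frac{|x - x_1|^2}{4(r^2 + t_1 - t)} \Big). \]
In the energy inequality (2.6) we put $t = t_1$ and choose $\psi = \Gamma\phi := \Gamma\phi_1(x)\phi_2(t)$, where $\phi_1$, $\phi_2$ are suitable smooth cut-off functions satisfying
\[ \begin{gathered}
\phi_1 \equiv 0 \text{ in } \mathbb{R}^4 \setminus B(x_1, \rho), \quad 0 \le \phi_1 \le 1 \text{ in } \mathbb{R}^4, \quad \phi_1 \equiv 1 \text{ in } B(x_1, \rho/2), \\
\phi_2 \equiv 0 \text{ in } (-\infty, t_1 - \rho^2) \cup (t_1 + \rho^2, +\infty), \quad 0 \le \phi_2 \le 1 \text{ in } \mathbb{R}, \quad \phi_2 \equiv 1 \text{ in } (t_1 - \rho^2/4, t_1 + \rho^2/4), \\
|\phi_2'| \le N\rho^{-2} \text{ in } \mathbb{R}, \quad |\nabla\phi_1| < N\rho^{-1}, \quad |\nabla^2\phi_1| < N\rho^{-2} \text{ in } \mathbb{R}^4.
\end{gathered} \tag{4.1} \]
By using the equality $\Delta\Gamma + \Gamma_t = 0$, we have
\[ \begin{aligned}
&\int_{B(x_1,\rho)} |u(x,t)|^2\, \Gamma(t,x)\, \phi(x,t)\,dx + 2 \int_{Q(z_1,\rho)} |\nabla u|^2\, \Gamma\phi\,dz \\
&\quad \le \int_{Q(z_1,\rho)} \big\{ |u|^2 \big( \Gamma\phi_t + \Gamma\Delta\phi + 2\nabla\phi\nabla\Gamma \big) + (|u|^2 + 2p)\, u \cdot \big( \Gamma\nabla\phi + \phi\nabla\Gamma \big) \big\}\,dz.
\end{aligned} \tag{4.2} \]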
After some straightforward computations, it is easy to see the following three properties:
(i) For some constant $c > 0$, on $\bar{Q}(z_1, r)$ it holds that $\Gamma\phi = \Gamma \ge c r^{-4}$.
(ii) For any $z \in Q(z_1, \rho)$, we have $|\phi(z)\nabla\Gamma(z)| + |\nabla\phi(z)\Gamma(z)| \le N r^{-5}$.
(iii) For any $z \in Q(z_1, \rho) \setminus Q(z_1, r)$, we have $|\Gamma(z)\phi_t(z)| + |\Gamma(z)\Delta\phi(z)| + |\nabla\phi\nabla\Gamma| \le N\rho^{-6}$.
These properties together with (4.2) and (4.1) yield
\[ A(r) + E(r) \le N \big[ \theta^2 A(\rho) + \theta^{-3} (C(\rho) + D(\rho)) \big]. \tag{4.3} \]
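Owing to Lemma 2.6 with $q = 3$, one easily gets
\[ C(\rho/3) \le N C(\rho) \le N \big[ A(\rho) + E(\rho) \big]^{3/2}. \tag{4.4} \]
By using (2.10) with $\gamma = 1/3$, we have
\[ D(\rho/3) \le N \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2}. \tag{4.5} \]
Upon combining (4.3) (with $\rho/3$ in place of $\rho$), (4.4) and (4.5) together, the lemma is proved.

Proof of Lemma 2.13. Due to (2.8) and (2.14), for any $\gamma, \theta \in (0, 1/3]$, we have
\[ \begin{aligned}
F(\gamma\theta\rho) &\le N \big[ \gamma^{-2} (A(\theta\rho) + E(\theta\rho)) + \gamma^{(3-\alpha)/(1+\alpha)} F(\theta\rho) \big] \\
&\le N \big[ \gamma^{-2}\theta^2 A(\rho) + \gamma^{(3-\alpha)/(1+\alpha)} \theta^{-2} F(\rho) \big] + N\gamma^{-2}\theta^{-3} \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2},
\end{aligned} \tag{4.6} \]
\[ A(\gamma\theta\rho) + E(\gamma\theta\rho) \le N(\gamma\theta)^2 A(\rho) + N(\gamma\theta)^{-3} \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2}. \tag{4.7} \]
Now we put $\alpha = 1/27$ so that $(3 - \alpha)/(1 + \alpha) = 20/7 > 2$. In Sect. 5, we will explain further why we choose $\alpha = 1/27$. Now one can choose and fix $\gamma$ and $\theta$ sufficiently small such that
\[ N \big[ \gamma^{-2}\theta^2 + \gamma^{20/7}\theta^{-2} + (\gamma\theta)^2 \big] \le 1/2. \]
Upon adding (4.6) and (4.7), we obtain
\[ A(\gamma\theta\rho) + E(\gamma\theta\rho) + F(\gamma\theta\rho) \le \frac{1}{2} \big[ A(\rho) + E(\rho) + F(\rho) \big] + N \big[ A(\rho) + E(\rho) + F(\rho) \big]^{3/2}, \]
where $N$ depends only on $\theta$ and $\gamma$. After putting $\theta_1 = \gamma\theta$, the lemma is proved.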
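The choice of $\gamma$ and $\theta$ above competes two requirements: $\gamma^{-2}\theta^2$ is small only if $\theta \ll \gamma$, while $\gamma^{20/7}\theta^{-2}$ is small only if $\theta \gg \gamma^{10/7}$. The sketch below illustrates one way to satisfy both, by coupling $\theta = \gamma^{17/14}$. This coupling and the value of the constant are our own assumptions for illustration, not taken from the paper.

```python
from fractions import Fraction

ALPHA = Fraction(1, 27)
# Exponent of gamma in front of F in (2.8): (3 - alpha)/(1 + alpha).
F_EXPONENT = (3 - ALPHA) / (1 + ALPHA)  # equals 20/7 for alpha = 1/27


def contraction_factor(gamma, theta, n_const):
    """Return N * [gamma^-2 theta^2 + gamma^(20/7) theta^-2 + (gamma theta)^2]."""
    e = float(F_EXPONENT)
    return n_const * (gamma ** -2 * theta ** 2
                      + gamma ** e * theta ** -2
                      + (gamma * theta) ** 2)


def choose_gamma_theta(n_const):
    """Halve gamma, with theta = gamma^(17/14), until the factor is <= 1/2.

    With this (assumed) coupling the first two terms both equal gamma^(3/7),
    so the factor tends to 0 as gamma -> 0 and the loop terminates.
    """
    gamma = 1.0 / 3.0
    while True:
        theta = gamma ** (17.0 / 14.0)
        if contraction_factor(gamma, theta, n_const) <= 0.5:
            return gamma, theta
        gamma /= 2.0


if __name__ == "__main__":
    # n_const = 10.0 is a hypothetical stand-in for the constant N.
    print(choose_gamma_theta(10.0))
```

The window $\gamma^{10/7} \ll \theta \ll \gamma$ is non-empty precisely because $20/7 > 2$, which is where the choice $\alpha = 1/27$ (or any $\alpha < 1/3$) enters.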
Proof of Proposition 2.14. Take the constant $\theta_1$ from Lemma 2.13. Due to Lemma 2.10, we may choose $\varepsilon_0, \varepsilon' > 0$ small enough such that
\[ A(\rho_0/2) + E(\rho_0/2) + C(\rho_0/2) + D(\rho_0/2) + F(\rho_0/2) \le \varepsilon', \qquad 2\varepsilon' + 8N(\theta_1)\varepsilon'^{3/2} \le \min(4\varepsilon', \theta_1^2\varepsilon_2), \tag{4.8} \]
where the constant $N(\theta_1)$ is the same one as in (2.15). Since $z_1 \in Q(z_0, \rho/4)$, we have
\[ Q(z_1, \rho_0/4) \subset Q(z_0, \rho_0/2) \subset Q_T, \qquad A(\rho_0/4, z_1) + E(\rho_0/4, z_1) + F(\rho_0/4, z_1) \le 4\varepsilon'. \]
By using (4.8) and (2.15), one obtains inductively for $k = 1, 2, \cdots$,
\[ A(\theta_1^k\rho_0/4, z_1) + E(\theta_1^k\rho_0/4, z_1) + F(\theta_1^k\rho_0/4, z_1) \le \min\{\theta_1^2\varepsilon_2, 4\varepsilon'\}. \]
Thus, for any $\rho \in (0, \rho_0/4]$, it holds that
\[ A(\rho, z_1) + E(\rho, z_1) + F(\rho, z_1) \le \varepsilon_2. \]
To include the term $C(\rho, z_1)$ in the estimate, it suffices to use (4.4). The proposition is proved.

5. Proof of Theorems 2.1–2.5

Proof of Theorems 2.1 and 2.2. Let $z_0 \in Q_T \cup (\mathbb{R}^4 \times \{T\})$ be a given point. Propositions 2.11 and 2.14 imply that for any $\varepsilon_2 > 0$ there exist small numbers $\varepsilon_1, \varepsilon_0, \rho_0 > 0$ such that, if either
\[ \limsup_{r \to 0} E(r, z_0) \le \varepsilon_1 \tag{5.1} \]
or
\[ C(\rho_0) + D(\rho_0) + F(\rho_0) \le \varepsilon_0 \tag{5.2} \]
holds true, then we can find $\rho_1 > 0$ so that $Q(z_0, \rho_1) \subset Q_T$ and for any $z_1 \in Q(z_0, \rho_1/2)$, $\rho \in (0, \rho_1/2)$ we have
\[ C(\rho, z_1) + F(\rho, z_1) \le \varepsilon_2. \tag{5.3} \]
Let $\delta \in (0, \rho_1^2/4)$ be a number and denote
\[ M_\delta = \max_{\bar{Q}(z_0, \rho_1/2) \cap \bar{Q}_{T-\delta}} d(z)|u(z)|, \]
where $d(z) = \min[\operatorname{dist}(x, \partial\Omega), (t + \rho_1^2/4 - T)^{1/2}]$.

Lemma 5.1. If (5.3) holds true for a sufficiently small $\varepsilon_2$, then
\[ \sup_{Q(z_0, \rho_1/4)} |u(z)| < +\infty. \tag{5.4} \]
Proof. If for all $\delta \in (0, \rho_1^2/4)$ we have $M_\delta \le 2$, then there is nothing to prove. Otherwise, suppose for some $\delta$ and $z_1 \in \bar{Q}(z_0, \rho_1/2) \cap \bar{Q}_{T-\delta}$,
\[ M := M_\delta = |u(z_1)|\, d(z_1) > 2. \]
Let $r_1 = d(z_1)/M < d(z_1)/2$. We make the scaling as follows:
\[ \bar{u}(y, s) = r_1 u(r_1 y + x_1, r_1^2 s + t_1), \qquad \bar{p}(y, s) = r_1^2\, p(r_1 y + x_1, r_1^2 s + t_1). \]
It is known that the pair $(\bar{u}, \bar{p})$ satisfies the Navier-Stokes equations (1.1) in $Q(0, 1)$. Obviously,
\[ \sup_{Q(0,1)} |\bar{u}| \le 2, \qquad |\bar{u}(0, 0)| = 1. \tag{5.5} \]
Due to the scaling-invariant property of our objects $A$, $E$, $C$, $D$ and $F$, in what follows we look at them as objects associated to $(\bar{u}, \bar{p})$ at the origin. For any $\rho \in (0, 1]$, we have
\[ C(\rho) + F(\rho) \le \varepsilon_2. \tag{5.6} \]
Recall what we did before in the proof of Lemma 2.9. Since $\bar{u}$ is bounded in $Q(0, 1)$, we have
\[ \int_{Q(0,1/3)} |\tilde{\bar{p}}_{0,1}|^{14}\,dz \le \int_{Q(0,1/2)} |\tilde{\bar{p}}_{0,1}|^{14}\,dz \le N, \tag{5.7} \]
\[ \int_{B(0,1/3)} |\bar{h}_{0,1}(z) - [\bar{h}_{0,1}]_{0,1/3}|^{14}\,dx \le N \sup_{B(0,1/3)} |\nabla\bar{h}_{0,1}(x,t)|^{14} \le N \Big( \int_{B(0,1/2)} |\bar{h}_{0,1} - [\bar{h}_{0,1}]_{0,1/2}|^{28/27}\,dx \Big)^{27/2}, \]
and
\[ \int_{Q(0,1/3)} |\bar{h}_{0,1}(z) - [\bar{h}_{0,1}]_{0,1/3}|^{14}\,dz \le N \int_{-1/9}^{0} \Big( \int_{B(0,1/2)} |\bar{h}_{0,1} - [\bar{h}_{0,1}]_{0,1/2}|^{28/27}\,dx \Big)^{27/2} dt \le N (1 + F^{14}(1)). \tag{5.8} \]
Estimates (5.7) and (5.8) yield
\[ \int_{Q(0,1/3)} |\bar{p}(z) - [\bar{h}_{0,1}]_{0,1/3}|^{14}\,dz \le N. \tag{5.9} \]
Because $(\bar{u}, \bar{p})$ satisfies the equation
\[ \bar{u}_t - \Delta\bar{u} = -\operatorname{div}(\bar{u} \otimes \bar{u}) - \nabla(\bar{p} - [\bar{h}_{0,1}]_{0,1/3}) \]
in $Q(0, 1)$. Owing to (5.5), (5.9) and the classical Sobolev space theory of parabolic equations, we have
\[ \bar{u} \in W_{14}^{1,1/2}(Q(0, 1/4)), \qquad \|\bar{u}\|_{W_{14}^{1,1/2}(Q(0,1/4))} \le N. \tag{5.10} \]
Since $1/2 - 6/14 = 1/14 > 0$, owing to the Sobolev embedding theorem (see [11]), we obtain
\[ \bar{u} \in C^{1/14}(Q(0, 1/5)), \qquad \|\bar{u}\|_{C^{1/14}(Q(0,1/5))} \le N, \]
where $N$ is a universal constant independent of $\varepsilon_1$ and $\varepsilon_2$. Therefore, we can find $\delta_1 < 1/5$ independent of $\varepsilon_1$, $\varepsilon_2$ such that
\[ |\bar{u}(x, t)| \ge 1/2 \quad \text{in } Q(0, \delta_1). \tag{5.11} \]
Now we choose $\varepsilon_2$ small enough, so that (5.11) and (5.6) contradict each other. The lemma is proved.

Theorems 2.1 and 2.2 follow immediately from Lemma 5.1.

Proof of Theorem 2.4. Take the number $\varepsilon_1$ in Lemma 5.1. Denote
\[ \Omega^* := \{ z \in \Omega \times \{T\} \mid \limsup_{r \downarrow 0} E(r, z) \le \varepsilon_1 \}. \]
It is well known that the 2-D Hausdorff measure of $(\Omega \times \{T\}) \setminus \Omega^*$ is zero. By using Lemma 5.1, for any $z \in \Omega^*$ we can find $\rho > 0$ such that $u$ is bounded in $Q(z, \rho)$. Then there is no blow-up at $z$ and $z$ is a regular point. The theorem is proved.

Proof of Theorem 2.5. For any $\alpha_0 \in [0, 1]$, due to Lemma 2.7 (i), the interpolation inequality (2.3) with $r = +\infty$, $q = 2(1 + \alpha_0)$ and Hölder’s inequality, one can easily get
\[ \|u\|_{L_t^{(1+\alpha_0)/\alpha_0} L_x^{2(1+\alpha_0)}(\mathbb{R}^4 \times \mathbb{R}^+)} < +\infty. \tag{5.12} \]
Since $(u, p)$ satisfies
\[ \Delta p = -\frac{\partial^2}{\partial x_i \partial x_j}(u_i u_j) \quad \text{in } \mathbb{R}^4 \times \mathbb{R}^+, \]
due to the Calderón-Zygmund estimate, we have
\[ \|p\|_{L_t^{(1+\alpha_0)/(2\alpha_0)} L_x^{1+\alpha_0}(\mathbb{R}^4 \times \mathbb{R}^+)} < +\infty. \tag{5.13} \]
Because of (5.12) with $\alpha_0 = 1/2$ and (5.13) with $\alpha_0 = 1/2$ and with $\alpha_0 = \alpha$, and again by the Calderón-Zygmund estimate, for any $\varepsilon_4 \in (0, 1)$ we can find $R \ge 1$ sufficiently large such that for any $z_0 \in \mathbb{R}^4 \times (R, +\infty)$ it holds that
\[ C(1, z_0) + \|p\|_{L_t^{3/2} L_x^{3/2}(Q(z_0,1))} + \|p\|_{L_t^{(1+\alpha_0)/(2\alpha_0)} L_x^{1+\alpha_0}(Q(z_0,1))} \le \varepsilon_4, \tag{5.14} \]
\[ \|\tilde{p}_{z_0,1}\|_{L_t^{3/2} L_x^{3/2}(Q(z_0,1))} + \|\tilde{p}_{z_0,1}\|_{L_t^{(1+\alpha_0)/(2\alpha_0)} L_x^{1+\alpha_0}(Q(z_0,1))} \le \varepsilon_4. \tag{5.15} \]
Thus,
\[ \|h_{z_0,1}\|_{L_t^{3/2} L_x^{3/2}(Q(z_0,1))} + \|h_{z_0,1}\|_{L_t^{(1+\alpha_0)/(2\alpha_0)} L_x^{1+\alpha_0}(Q(z_0,1))} \le 2\varepsilon_4. \tag{5.16} \]
800
H. Dong, D. Du
After combining (5.14) and (5.16) together, it is clear by using Hölder’s inequality that C(1, z 0 ) + D(1, z 0 ) + F(1, z 0 ) ≤ N ε4 , where N is independent of ε4 . Then owing to Proposition 2.14 and Lemma 5.1, for sufficiently small ε4 we can find a uniform upper bound M0 > 0 such that for any z 0 ∈ R4 × (R, +∞), sup
z∈Q(z 0 ,1/4)
|u(z)| ≤ M0 .
Therefore, u will not blow up as t goes to infinity, and Theorem 2.5 is proved. Acknowledgement. The authors would like to express their sincere gratitude to Prof. V. Sverak for pointing out this problem and giving many useful comments for improvement. The authors would also thank to Prof. N.V. Krylov for helpful discussions and the referee for his careful review of the article.
References
1. Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
2. Cannone, M.: A generalization of a theorem by Kato on Navier-Stokes equations. Rev. Mat. Iberoam. 13, 515–541 (1997)
3. Galdi, G.P.: An introduction to the mathematical theory of Navier-Stokes equations, I, II. New York: Springer-Verlag, 1994
4. Giga, Y., Miyakawa, T.: Navier-Stokes flow in R3 with measures as initial vorticity and Morrey spaces. Commun. Part. Differ. Eqs. 14, 577–618 (1989)
5. Giga, Y., Miyakawa, T.: Solution in L r of the Navier-Stokes initial value problem. Arch. Rat. Mech. Anal. 89, 267–281 (1985)
6. Gustafson, S., Kang, K., Tsai, T.: Interior regularity criteria for suitable weak solutions of the Navier-Stokes equations. Commun. Math. Phys. DOI 10.1007/s00220-007-0214-6, 2007
7. Heywood, J.G.: The Navier-Stokes equations: on the existence, regularity and decay of solutions. Indiana Univ. Math. J. 29, 639–681 (1980)
8. Iftimie, D.: The resolution of the Navier-Stokes equations in anisotropic spaces. Rev. Mat. Iberoam. 15, 1–36 (1999)
9. Kato, T.: Strong L p -solutions of the Navier-Stokes equation in Rm with applications to weak solutions. Math. Z. 187, 471–480 (1984)
10. Koch, H., Tataru, D.: Well-posedness for the Navier-Stokes equations. Adv. Math. 157(1), 22–35 (2001)
11. Ladyzhenskaya, O.A., Solonnikov, V.A., Ural'tseva, N.N.: Linear and quasi-linear equations of parabolic type. Moscow: Nauka, 1967 (in Russian); English translation: Providence, RI: Amer. Math. Soc., 1968
12. Ladyzhenskaya, O.: The Mathematical Theory of Viscous Incompressible Flows. 2nd edition, New York: Gordon and Breach, 1969
13. Ladyzhenskaya, O., Seregin, G.A.: On partial regularity of suitable weak solutions to the three-dimensional Navier–Stokes equations. J. Math. Fluid Mech. 1, 356–387 (1999)
14. Leray, J.: Étude de diverses équations intégrales non linéaires et de quelques problèmes que pose l'hydrodynamique. J. Math. Pures Appl. 12, 1–82 (1933)
15. Lin, F.: A new proof of the Caffarelli-Kohn-Nirenberg theorem. Comm. Pure Appl. Math. 51, 241–257 (1998)
16. Seregin, G.: Regularity for Suitable Weak Solutions to the Navier-Stokes Equations in Critical Morrey Spaces. Preprint, http://arxiv.org/list/math.AP/0607537, 2006
17. Scheffer, V.: Partial regularity of solutions to the Navier-Stokes equations. Pacific J. Math. 66, 535–552 (1976)
18. Scheffer, V.: Hausdorff measure and the Navier-Stokes equations. Commun. Math. Phys. 55, 97–112 (1977)
19. Scheffer, V.: The Navier-Stokes equations in space dimension four. Commun. Math. Phys. 61, 41–68 (1978)
20. Scheffer, V.: The Navier-Stokes equations on a bounded domain. Commun. Math. Phys. 73, 1–42 (1980)
21. Serrin, J.: On the interior regularity of weak solutions of Navier-Stokes equations. Arch. Rat. Mech. Anal. 9, 187–195 (1962)
22. Solonnikov, V.A.: Estimates of solutions to the linearized systems of the Navier-Stokes equations. Trudy Steklov Math. Inst. LXX, 213–317 (1964)
23. Taylor, M.: Analysis on Morrey spaces and applications to Navier-Stokes equation. Comm. Part. Differ. Eqs. 17, 1407–1456 (1992)
24. Wiegner, M.: Higher order estimates in further dimensions for the solutions of Navier-Stokes equations. Evolution equations (Warsaw, 2001), Banach Center Publ. 60, Warsaw: Polish Acad. Sci., 2003, pp 81–84

Communicated by P. Constantin
Commun. Math. Phys. 273, 803–827 (2007)
Digital Object Identifier (DOI) 10.1007/s00220-007-0213-7
Communications in Mathematical Physics

Obstructions to the Existence of Sasaki–Einstein Metrics

Jerome P. Gauntlett¹,², Dario Martelli³, James Sparks⁴,⁵, Shing-Tung Yau⁴

1 Blackett Laboratory, Imperial College, London SW7 2AZ, U.K.
2 The Institute for Mathematical Sciences, Imperial College, London SW7 2PG, U.K.
3 Department of Physics, CERN Theory Unit, 1211 Geneva 23, Switzerland
4 Department of Mathematics, Harvard University, One Oxford Street, Cambridge, MA 02138, U.S.A. E-mail: [email protected]
5 Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, U.S.A.

Received: 17 August 2006 / Accepted: 5 October 2006
Published online: 4 May 2007 – © Springer-Verlag 2007
Abstract: We describe two simple obstructions to the existence of Ricci–flat Kähler cone metrics on isolated Gorenstein singularities or, equivalently, to the existence of Sasaki–Einstein metrics on the links of these singularities. In particular, this also leads to new obstructions for Kähler–Einstein metrics on Fano orbifolds. We present several families of hypersurface singularities that are obstructed, including 3–fold and 4–fold singularities of ADE type that have been studied previously in the physics literature. We show that the AdS/CFT dual of one obstruction is that the R–charge of a gauge invariant chiral primary operator violates the unitarity bound.

Contents
1. Introduction
2. The Obstructions
   2.1 The Bishop obstruction
   2.2 The Lichnerowicz obstruction
   2.3 Smooth Fanos
   2.4 AdS/CFT interpretation
3. Isolated Hypersurface Singularities
   3.1 The Bishop obstruction
   3.2 The Lichnerowicz obstruction
   3.3 Sufficient conditions for existence
4. A Class of 3–Fold Examples
   4.1 Obstructions
   4.2 Cohomogeneity one metrics
   4.3 Field theory
5. Other Examples
   5.1 ADE 4–fold singularities
   5.2 Weighted actions on $\mathbb{C}^n$
6. Conclusions
A. Cohomogeneity One Metrics

1. Introduction

The study of string theory and M–theory on singular manifolds is a very rich subject that has led to many important insights. For geometries that develop an isolated singularity, one can model the local behaviour using a non–compact manifold. In this case, a natural geometric boundary condition is for the metric to asymptote to a cone away from the singularity. This means that one studies a family of metrics that asymptotically approach the conical form
\[ g_X = dr^2 + r^2 g_L \tag{1.1} \]
with $(L, g_L)$ a compact Riemannian manifold. The dynamics of string theory or M–theory on special holonomy manifolds that are developing an isolated conical singularity X, with metric (1.1), has proved to be an extremely intricate subject.

A particularly interesting setting is in the context of the AdS/CFT correspondence [1]. The worldvolume theory of a large number of D3–branes placed at an isolated conical Calabi–Yau 3–fold singularity is expected to flow, at low energies, to a four–dimensional $\mathcal{N} = 1$ superconformal field theory. In this case, the AdS/CFT conjecture states that this theory is dual to type IIB string theory on $AdS_5 \times L$ [2–5]. Similar remarks apply to M–theory on conical eight–dimensional singularities with special holonomy, which lead to superconformal theories in three dimensions that are dual to $AdS_4 \times L$, although far less is known about this situation.

The focus of this paper will be on conical Calabi–Yau singularities, by which we mean Ricci–flat Kähler metrics of the conical form (1.1). This gives, by definition, a Sasaki–Einstein metric on the base of the cone L. We tacitly assume that L is simply–connected which, although not entirely necessary, always ensures the existence of a globally defined Killing spinor on L. A central role is played by the Reeb vector field
\[ \xi = J\left( r \frac{\partial}{\partial r} \right), \tag{1.2} \]
where J denotes the complex structure tensor on the cone X. The vector field $\xi$ is holomorphic, Killing, and has constant norm on the link $L = \{r = 1\}$ of the singularity at $r = 0$. If the orbits of $\xi$ all close then L has a U(1) isometry, which necessarily acts locally freely, and the Sasakian structure is said to be either regular or quasi–regular if this action is free or not, respectively. The orbit space is in general a positively curved Kähler–Einstein orbifold $(V, g_V)$, which is a smooth manifold in the regular case. More generally, the generic orbits of $\xi$ need not close, in which case the Sasakian structure is said to be irregular.
The AdS/CFT correspondence maps the symmetry generated by the Reeb vector field to the R–symmetry of the dual CFT. Thus for the quasi–regular case the CFT has a U(1) R–symmetry, whereas for the irregular case it has a non–compact $\mathbb{R}$ R–symmetry.

Given a Sasaki–Einstein manifold $(L, g_L)$, the cone X, as a complex variety, is an isolated Gorenstein singularity. If $X_0$ denotes X with the singular point removed, we have $X_0 = \mathbb{R}_+ \times L$ with $r > 0$ a coordinate on $\mathbb{R}_+$. X being Gorenstein means simply that there exists a nowhere zero holomorphic (n, 0)–form $\Omega$ on $X_0$. One may then turn things around and ask which isolated Gorenstein singularities admit Sasaki–Einstein metrics on their links. This is a question in algebraic geometry, and it is an extremely
difficult one. To give some idea of how difficult this question is, let us focus on the quasi–regular case. Thus, suppose that X has a holomorphic $\mathbb{C}^*$ action, with orbit space being a Fano¹ manifold, or Fano orbifold, V. Then existence of a Ricci–flat Kähler cone metric on X, with conical symmetry generated by $\mathbb{R}_+ \subset \mathbb{C}^*$, is well known to be equivalent to finding a Kähler–Einstein metric on V – for a review, see [6]. Existence of Kähler–Einstein metrics on Fanos is a very subtle problem that is still unsolved. That is, a set of necessary and sufficient algebraic conditions on V is not known in general. There are two well–known holomorphic obstructions, due to Matsushima [7] and Futaki [8]. The latter was related to Sasakian geometry in [9] and is not in fact an obstruction from the Sasaki–Einstein point of view. Specifically, it is possible to have a Fano V that has non–zero Futaki invariant and thus does not admit a Kähler–Einstein metric, but nevertheless the link of the total space of the canonical bundle over V can admit a Sasaki–Einstein metric – the point is simply that the Reeb vector field is not² the one associated with the canonical bundle over V. It is also known that vanishing of these two obstructions is, in general, insufficient for there to exist a Kähler–Einstein metric on V. It has been conjectured in [11] that V admits a Kähler–Einstein metric if and only if it is stable; proving this conjecture is currently a major research programme in geometry – see, for example, [12]. Thus, one also expects the existence of Ricci–flat Kähler cone metrics on an isolated Gorenstein singularity X to be a subtle problem. This issue has been overlooked in some of the physics literature, and it has sometimes been incorrectly assumed, or stated, that such conical Calabi–Yau metrics exist on particular singularities, as we shall discuss later.

The Reeb vector field contains a significant amount of information about the metric.
For a fixed X 0 = R+ × L, the Reeb vector field ξ for a Sasaki–Einstein metric on L satisfies a variational problem that depends only on the complex structure of X [13, 9]. This is the geometric analogue of a–maximisation [14] in four dimensional superconformal field theories. This allows one, in principle and often in practice, to obtain ξ , and hence in particular the volume, of a Sasaki–Einstein metric on L – assuming that this metric exists. Now, for any (2n−1)–dimensional Einstein manifold (L , g L ) with Ric = 2(n−1)g L , Bishop’s theorem [15] (see also [16]) implies that the volume of L is bounded from above by that of the round unit radius sphere. Thus we are immediately led to what we will call the Bishop obstruction to the existence of Sasaki–Einstein metrics: If the volume of the putative Sasaki–Einstein manifold, calculated using the results of [13, 9], is greater than that of the round sphere, then the metric cannot exist. It is not immediately obvious that this can ever happen, but we shall see later that this remarkably simple fact can often serve as a powerful obstruction. We will also discuss the AdS/CFT interpretation of this result. The Reeb vector field ξ also leads to a second possible obstruction. Given ξ for a putative Sasaki–Einstein metric, it is a simple matter to show that holomorphic functions f on the corresponding cone X with definite charge λ > 0, Lξ f = λi f
(1.3)
1 We define a Fano orbifold V to be a compact Kähler orbifold, such that the cohomology class of the Ricci–form in $H^2(V; \mathbb{R})$ is represented by a positive (1, 1)–form on V.
2 This happens, for example, when $V = F_1$ – the first del Pezzo surface. In this case both the Matsushima and Futaki theorems obstruct existence of a Kähler–Einstein metric on V, but there is nevertheless an irregular Sasaki–Einstein metric on the link in the total space of the canonical bundle over V [10].
give rise to eigenfunctions of the Laplacian on the Sasaki–Einstein manifold with eigenvalue λ(λ + 2n − 2). Lichnerowicz’s theorem [17] states that the smallest eigenvalue of this Laplacian is bounded from below by the dimension of the manifold, and this leads to the restriction λ ≥ 1. Thus we have what we will call the Lichnerowicz obstruction: If one can demonstrate the existence of a holomorphic function on X with positive charge λ < 1 with respect to the putative Reeb vector field ξ , one concludes that no Sasaki–Einstein metric can exist with this Reeb vector field. Again, it is not immediately obvious that this can ever happen. Indeed, if ξ is regular, so that the orbit space V is a Fano manifold, we show that this cannot happen. Nevertheless, there are infinitely many examples of simple hypersurface singularities with non–regular Reeb vector fields that violate Lichnerowicz’s bound. We shall show that for Calabi–Yau 3–folds and 4–folds, the Lichnerowicz obstruction has a beautiful AdS/CFT interpretation: holomorphic functions on the cone X are dual to chiral primary operators in the dual superconformal field theory. The Lichnerowicz bound then translates into the unitarity bound for the dimensions of the operators. The plan of the rest of the paper is as follows. In Sect. 2 we discuss the two obstructions in a little more detail. In Sect. 3 we investigate the obstructions in the context of isolated quasi–homogeneous hypersurface singularities. We also compare our results with the sufficient conditions reviewed in [6] for existence of Sasaki–Einstein metrics on links of such singularities. In Sect. 4 we show that some 3–fold examples discussed in [18] do not admit Ricci–flat Kähler cone metrics. We briefly discuss the implications for the dual field theory. 
The results of this section leave open the possibility³ of a single new cohomogeneity one Sasaki–Einstein metric on $S^5$ and we present some details of the relevant ODE that needs to be solved in an appendix. In Sect. 5 we show that some of the 4–fold examples discussed in [20] also do not admit Ricci–flat Kähler cone metrics. Section 6 briefly concludes.

2. The Obstructions

In this section we describe two obstructions to the existence of a putative Sasaki–Einstein metric on the link of an isolated Gorenstein singularity X with Reeb vector field $\xi$. These are based on Bishop's theorem [15] and Lichnerowicz's theorem [17], respectively. We prove that the case when $\xi$ generates a freely acting circle action, with orbit space a Fano manifold V, is never obstructed by Lichnerowicz. We also give an interpretation of Lichnerowicz's bound in terms of the unitarity bound in field theory, via the AdS/CFT correspondence.

Let X be an isolated Gorenstein singularity, and $X_0$ be the smooth part of X. We take $X_0$ to be diffeomorphic as a real manifold to $\mathbb{R}_+ \times L$, where L is compact, and let r be a coordinate on $\mathbb{R}_+$ with $r > 0$, so that $r = 0$ is the isolated singular point of X. We shall refer to L as the link of the singularity. Since X is Gorenstein, by definition there exists a nowhere zero holomorphic (n, 0)–form $\Omega$ on $X_0$. Suppose that X admits a Kähler metric that is a cone with respect to a homothetic vector field $r\,\partial/\partial r$, as in (1.1). This in particular means that L is the orbit space of $r\,\partial/\partial r$ and $g_L$ is a Sasakian metric. The Reeb vector field is defined to be
\[ \xi = J\left( r \frac{\partial}{\partial r} \right). \tag{2.1} \]
3 Recently reference [19] appeared. The conclusions of the latter imply that this solution does not in fact exist.
In the special case that the Kähler metric on X is Ricci–flat, the case of central interest, $(L, g_L)$ is Sasaki–Einstein and we have
\[ \mathcal{L}_\xi \Omega = n i\, \Omega \tag{2.2} \]
since $\Omega$ is homogeneous of degree n under $r\,\partial/\partial r$. This fixes the normalisation of $\xi$.

2.1. The Bishop obstruction. The volume $\mathrm{vol}(L, g_L)$ of a Sasakian metric on the link L depends only on the Reeb vector field [9]. Thus, specifying a Reeb vector field $\xi$ for a putative Sasaki–Einstein metric on L is sufficient to specify the volume, assuming that the metric in fact exists. We define the normalised volume as
\[ V(\xi) = \frac{\mathrm{vol}(L, g_L)}{\mathrm{vol}(S^{2n-1})}, \tag{2.3} \]
where $\mathrm{vol}(S^{2n-1})$ is the volume of the round sphere. Since Bishop's theorem [15] (see also [16]) implies that for any (2n − 1)–dimensional Einstein manifold $(L, g_L)$ with $\mathrm{Ric} = 2(n - 1) g_L$
\[ \mathrm{vol}(L, g_L) \le \mathrm{vol}(S^{2n-1}) \tag{2.4} \]
we immediately have

Bishop obstruction. Let $(X, \Omega)$ be an isolated Gorenstein singularity with link L and putative Reeb vector field $\xi$. If $V(\xi) > 1$ then X admits no Ricci–flat Kähler cone metric with Reeb vector field $\xi$. In particular L does not admit a Sasaki–Einstein metric with this Reeb vector field.

There are a number of methods for computing the normalised volume $V(\xi)$. For quasi–regular $\xi$, the volume $V(\xi)$ is essentially just a Chern number, which makes it clear that $V(\xi)$ is a holomorphic invariant. In general, one can compute $V(\xi)$ as a function of $\xi$, and a number of different formulae have been derived in [13, 9]. In [9] a general formula for the normalised volume $V(\xi)$ was given that involves (partially) resolving the singularity X and applying localisation. For toric Sasakian manifolds there is a simpler formula [13], giving the volume in terms of the toric data defining the singularity. In this paper we shall instead exploit the fact that the volume $V(\xi)$ can be extracted from a limit of a certain index–character [9]; this is easily computed algebraically for isolated hypersurface singularities, which shall constitute our main set of examples in this paper.

We briefly recall some of the details from [9]. Suppose we have a holomorphic $(\mathbb{C}^*)^r$ action on X. We may define the character
\[ C(q, X) = \mathrm{Tr}\, q \tag{2.5} \]
as the trace⁴ of the action of $q \in (\mathbb{C}^*)^r$ on the holomorphic functions on X. Holomorphic functions f on X that are eigenvectors of the induced $(\mathbb{C}^*)^r$ action
\[ (\mathbb{C}^*)^r : f \to q^m f, \tag{2.6} \]
with eigenvalue $q^m = \prod_{a=1}^{r} q_a^{m_a}$ form a vector space over $\mathbb{C}$ of dimension $n_m$. Each eigenvalue then contributes $n_m q^m$ to the trace (2.5). Let $\zeta_a$ form a basis for the Lie algebra of $U(1)^r \subset (\mathbb{C}^*)^r$, and write the Reeb vector field as
\[ \xi = \sum_{a=1}^{r} b_a \zeta_a. \tag{2.7} \]
Then the volume of a Sasakian metric on L with Reeb vector field $\xi$, relative to that of the round sphere, is given by
\[ V(\xi) = \lim_{t \to 0} t^n\, C(q_a = \exp(-t b_a), X). \tag{2.8} \]
4 As in [9], we don't worry about where this trace converges, since we are mainly interested in the behaviour near a certain pole.
In general, the right-hand side of this formula may be computed by partially resolving X and using localisation. However, for isolated quasi–homogeneous hypersurface singularities it is straightforward to compute this algebraically. In addition, it was shown in [9] that the Reeb vector field for a Sasaki–Einstein metric on L extremises V as a function of the ba , subject to the constraint (2.2). This is a geometric analogue of a–maximisation [14] in superconformal field theories.
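As a rough numerical illustration of the limit (2.8), consider the simplest case $X = \mathbb{C}^n$ with the diagonal torus action. This example is an assumption made here for illustration and is not one of the hypersurface examples treated in the paper. The coordinate monomials form a basis of holomorphic functions, so the character is $\prod_a 1/(1 - q_a)$, and the limit can be approximated by evaluating at a small value of t. The function names below are hypothetical.

```python
import math


def character_flat_space(q):
    """Character of C^n under the diagonal torus action: prod_a 1/(1 - q_a).

    Assumption for illustration: X = C^n, whose holomorphic functions are
    spanned by monomials, each monomial contributing q^m to the trace.
    """
    return math.prod(1.0 / (1.0 - qa) for qa in q)


def normalised_volume_estimate(b, t=1e-4):
    """Approximate V(xi) = lim_{t->0} t^n C(q_a = exp(-t b_a), X) at small t."""
    n = len(b)
    q = [math.exp(-t * ba) for ba in b]
    return t**n * character_flat_space(q)


# Reeb vector with all weights equal to one (the round sphere S^5 in C^3).
v_round = normalised_volume_estimate([1.0, 1.0, 1.0])
# Another choice with b_1 + b_2 + b_3 = 3, as required by (2.2) in this example.
v_other = normalised_volume_estimate([0.5, 1.0, 1.5])
print(v_round, v_other)
```

In this flat example the estimate approaches $1/(b_1 \cdots b_n)$, so the first value is close to 1 and the second is close to 4/3. A value above 1 is exactly the situation in which the Bishop obstruction applies.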
2.2. The Lichnerowicz obstruction. Let f be a holomorphic function on X with
\[ \mathcal{L}_\xi f = \lambda i f, \tag{2.9} \]
where $\mathbb{R} \ni \lambda > 0$, and we refer to $\lambda$ as the charge of f under $\xi$. Since f is holomorphic, this immediately implies that
\[ f = r^\lambda \tilde f, \tag{2.10} \]
where $\tilde f$ is homogeneous degree zero under $r\,\partial/\partial r$ – that is, $\tilde f$ is the pull–back to X of a function on the link L. Moreover, since $(X, g_X)$ is Kähler,
\[ \nabla_X^2 f = 0, \tag{2.11} \]
where $-\nabla_X^2$ is the Laplacian on $(X, g_X)$. For a metric cone, this is related to the Laplacian on the link $(L, g_L)$ at $r = 1$ by
\[ \nabla_X^2 = \frac{1}{r^2} \nabla_L^2 + \frac{1}{r^{2n-1}} \frac{\partial}{\partial r}\left( r^{2n-1} \frac{\partial}{\partial r} \right). \tag{2.12} \]
From this, one sees that
\[ -\nabla_L^2 \tilde f = E \tilde f, \tag{2.13} \]
where
\[ E = \lambda[\lambda + (2n - 2)]. \tag{2.14} \]
Thus any holomorphic function f of definite charge under $\xi$, or equivalently degree under $r\,\partial/\partial r$, corresponds to an eigenfunction of the Laplacian on the link. The charge $\lambda$ is then related simply to the eigenvalue E by the above formula (2.14).
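The relation (2.14) is easy to tabulate. The following minimal sketch evaluates E for a given charge and compares it with the lower bound quoted in the Introduction, namely the dimension 2n − 1 of the link. The function names are hypothetical and the sample values are chosen only for illustration.

```python
def laplacian_eigenvalue(lam, n):
    """Eigenvalue E = lam * (lam + 2n - 2) of -nabla_L^2, as in (2.14)."""
    return lam * (lam + 2 * n - 2)


def violates_lichnerowicz(lam, n):
    """True if E falls below dim L = 2n - 1, the bound quoted in the Introduction."""
    return laplacian_eigenvalue(lam, n) < 2 * n - 1


# Sample charges on a Calabi-Yau 3-fold cone (n = 3, link of dimension 5).
for lam in (0.5, 1.0, 2.0):
    print(lam, laplacian_eigenvalue(lam, 3), violates_lichnerowicz(lam, 3))
```

Since $E - (2n - 1) = (\lambda - 1)(\lambda + 2n - 1)$, the bound is violated for positive charge precisely when $\lambda < 1$, which is the form of the restriction used in the text.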
By assumption, $(X, g_X)$ is Ricci–flat Kähler, which implies that $(L, g_L)$ is Einstein with Ricci curvature 2n − 2. The first non–zero eigenvalue $E_1 > 0$ of $-\nabla_L^2$ is bounded from below:
\[ E_1 \ge 2n - 1. \tag{2.15} \]
This is Lichnerowicz's theorem [17]. Moreover, equality holds if and only if $(L, g_L)$ is isometric to the round sphere $S^{2n-1}$ [21]. This is important as we shall find examples of links, that are not even diffeomorphic to the sphere, which hit this bound. From (2.14), we immediately see that Lichnerowicz's bound becomes $\lambda \ge 1$. This leads to a potential holomorphic obstruction to the existence of Sasaki–Einstein metrics:

Lichnerowicz obstruction. Let $(X, \Omega)$ be an isolated Gorenstein singularity with link L and putative Reeb vector field $\xi$. Suppose that there exists a holomorphic function f on X of positive charge $\lambda < 1$ under $\xi$. Then X admits no Ricci–flat Kähler cone metric with Reeb vector field $\xi$. In particular L does not admit a Sasaki–Einstein metric with this Reeb vector field.

As we stated earlier, it is not immediately clear that this can ever happen. In fact, there are examples of hypersurface singularities where this serves as the only obvious simple obstruction, as we explain later. However, in the next subsection we treat a situation where Lichnerowicz never obstructs.

Before concluding this subsection we note that the volume of a Sasakian metric on L with Reeb vector field $\xi$ is also related to holomorphic functions on X of definite charge, as we briefly reviewed in the previous subsection. In fact we may write (2.8) as
\[ V(\xi) = \lim_{t \to 0} t^n\, \mathrm{Tr} \exp(-t \mathcal{L}_{r\partial/\partial r}), \tag{2.16} \]
where $r\,\partial/\partial r = -J(\xi)$. Here the trace denotes a trace of the action of $\mathcal{L}_{r\partial/\partial r}$ on the holomorphic functions on X. Thus a holomorphic function f of charge $\lambda$ under $\xi$ contributes $\exp(-t\lambda)$ to the trace. That (2.16) agrees with (2.8) follows from the fact that we can write $\lambda = (b, m)$. Given our earlier discussion relating $\lambda$ to eigenvalues of the Laplacian on L, the above trace very much resembles the trace of the heat kernel, also known as the partition function, on L. In fact, since it is a sum over only holomorphic eigenvalues, we propose to call it the holomorphic partition function. The fact that the volume of a Riemannian manifold appears as a pole in the heat kernel is well known [22], and (2.16) can be considered a holomorphic Sasakian analogue. Notice then that the Lichnerowicz obstruction involves holomorphic functions on X of small charge with respect to $\xi$, whereas the Bishop obstruction is a statement about the volume, which is determined by the asymptotic growth of holomorphic functions on X.

2.3. Smooth Fanos. Let V be a smooth Fano Kähler manifold. Let K denote the canonical line bundle over V. By definition, $K^{-1}$ is an ample holomorphic line bundle, which thus specifies a positive class
\[ c_1(K^{-1}) = -c_1(K) \in H^2(V; \mathbb{Z}) \cap H^{1,1}(V; \mathbb{R}) \cong \mathrm{Pic}(V). \tag{2.17} \]
Recall here that Pic(V) is the group of holomorphic line bundles on V. Let I(V) denote the largest positive integer such that $c_1(K^{-1})/I(V)$ is an integral class in Pic(V). I(V) is called the Fano index of V. For example, $I(\mathbb{CP}^2) = 3$, $I(\mathbb{CP}^1 \times \mathbb{CP}^1) = 2$, $I(F_1) = 1$.
Let $\mathcal{L}$ be the holomorphic line bundle $\mathcal{L} = K^{1/I(V)}$, which is primitive in Pic(V) by construction. Denote the total space of the unit circle bundle in $\mathcal{L}$ by L – this is our link. We thus have a circle bundle
\[ S^1 \to L \to V, \tag{2.18} \]
where $\mathcal{L}$ is the associated line bundle. If V is simply–connected then L is also simply–connected, as follows from the Gysin sequence of the fibration (2.18). Note that V admits a Kähler–Einstein metric if and only if L admits a regular Sasaki–Einstein metric with Reeb vector field that rotates the $S^1$ fibre of (2.18).

X is obtained from the total space of $\mathcal{L}$ by collapsing (or deleting, to obtain $X_0$) the zero section. Holomorphic functions on X of definite charge are then in 1–1 correspondence with global sections of $\mathcal{L}^{-k}$, which are elements of the group $H^0(\mathcal{O}(\mathcal{L}^{-k}))$. Let $\zeta$ be the holomorphic vector field on X that rotates the fibre of $\mathcal{L}$ with weight one. That is, if $s \in H^0(\mathcal{O}(\mathcal{L}^{-1}))$ is a holomorphic section of the ample line bundle $\mathcal{L}^{-1}$, viewed as a holomorphic function on X, then
\[ \mathcal{L}_\zeta s = i s. \tag{2.19} \]
Since $K = \mathcal{L}^{I(V)}$ is the canonical bundle of V, it follows that the correctly normalised Reeb vector field is (see, for example, [9])
\[ \xi = \frac{n}{I(V)}\, \zeta. \tag{2.20} \]
We briefly recall why this is true. Let $\psi$ be a local coordinate such that $\xi = \partial/\partial\psi$. Then $n\psi/I(V)$ is a local coordinate on the circle fibre of (2.18) with period $2\pi$. This follows since locally the contact one–form of the Sasakian manifold is $\eta = d\psi - A$, where $A/n$ is a connection on the canonical bundle of V. Holomorphic functions of smallest positive charge obviously correspond to k = 1. Any section $s \in H^0(\mathcal{O}(\mathcal{L}^{-1}))$ then has charge
\[ \lambda = \frac{n}{I(V)} \tag{2.21} \]
under $\xi$. However, it is well known (see, for example, [23], p. 245) that for smooth Fanos V we have $I(V) \le n$, with $I(V) = n$ if and only if $V = \mathbb{CP}^{n-1}$. Thus, in this situation, we always have $\lambda \ge 1$ and Lichnerowicz never obstructs. Lichnerowicz's theorem can only obstruct for non–regular Reeb vector fields.

We expect a similar statement to be true for the Bishop bound. For a regular Sasaki–Einstein manifold with Reeb vector field $\xi$ and orbit space a Fano manifold V, Bishop's bound may be written
\[ I(V) \int_V c_1(V)^{n-1} \le n \int_{\mathbb{CP}^{n-1}} c_1(\mathbb{CP}^{n-1})^{n-1} = n^n. \tag{2.22} \]
It seems reasonable to expect the topological statement (2.22) to be true for any Fano manifold V, so that Bishop never obstructs in the regular case, although we are unaware of any proof. Interestingly, this is closely related to a standard conjecture in algebraic geometry, that bounds $\int_V c_1(V)^{n-1}$ from above by $n^{n-1}$ for any Fano manifold V, with equality if and only if $V = \mathbb{CP}^{n-1}$. In general, this stronger statement is false (see [23], p. 251), although it is believed to be true in the special case that V has Picard number one, i.e. rank(Pic(V)) = 1. This has recently been proven up to dimension n = 5 [24]. It would be interesting to investigate (2.22) further.
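The charge formula $\lambda = n/I(V)$ can be checked against the three Fano indices quoted in this subsection. The sketch below is only an illustration: the surfaces have complex dimension 2, so the cone has n = 3, and the dictionary and function names are hypothetical.

```python
from fractions import Fraction

# Fano indices quoted in the text for three smooth Fano surfaces.
FANO_INDEX = {"CP^2": 3, "CP^1 x CP^1": 2, "F_1": 1}


def smallest_charge(n, fano_index):
    """Smallest positive charge lambda = n / I(V) of a holomorphic function."""
    return Fraction(n, fano_index)


n = 3  # complex dimension of the cone over a Fano surface
for name, index in FANO_INDEX.items():
    lam = smallest_charge(n, index)
    print(name, lam, lam >= 1)
```

In each case the smallest charge is at least 1, in line with the statement that Lichnerowicz never obstructs for regular Reeb vector fields. Exact fractions are used so that the comparison with 1 is not affected by rounding.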
Obstructions to the Existence of Sasaki–Einstein Metrics
811
2.4. AdS/CFT interpretation. In this section we show that the Lichnerowicz obstruction has a very natural interpretation in the AdS/CFT dual field theory, in terms of a unitarity bound. We also briefly discuss the Bishop bound. Recall that every superconformal field theory possesses a supergroup of symmetries and that the AdS/CFT duality maps this to the superisometries of the dual geometry. In particular, in the context of Sasaki–Einstein geometry, it maps the R–symmetry in the field theory to the isometry generated by the Reeb vector field ξ , and the R–charges of operators in the field theory are proportional to the weights under ξ . Generically, Kaluza–Klein excitations in the geometry correspond to gauge invariant operators in the field theory. These operators are characterised by their scaling dimensions . The supersymmetry algebra then implies that a general operator satisfies a BPS bound relating the dimension to the R–charge R: ≥ (d − 1)R/2. When this bound is saturated the corresponding BPS operators belong to short representations of the supersymmetry algebra, and in particular are chiral. Here we will only consider scalar gauge invariant operators which are chiral. It is well known that for any conformal field theory, in arbitrary dimension d, the scaling dimensions of all operators are bounded as a consequence of unitarity. In particular, for scalar operators, we have d −2 . (2.23) 2 In Sect. 2.2 we have argued that a necessary condition for the existence of a Sasaki– Einstein metric is that the charge λ > 0 of any holomorphic function on the corresponding Calabi–Yau cone must satisfy the bound
≥
λ ≥ 1.
(2.24)
In the following, we will show that these two bounds coincide. We start with a gauge theory realised on the world–volume of a large number of D3 branes, placed at a 3–fold Gorenstein singularity $X$. The affine variety $X$ can then be thought of as (part of) the moduli space of vacua of this gauge theory. In particular, the holomorphic functions, defining the coordinate ring of $X$, correspond to (scalar) elements of the chiral ring of the gauge theory [25]. Recalling that an $AdS_{4/5} \times L^{7/5}$ solution arises as the near–horizon limit of a large number of branes at a Calabi–Yau 4–fold/3–fold conical singularity, it is clear that the weights $\lambda$ of these holomorphic functions under the action of $r\,\partial/\partial r$ must be proportional to the scaling dimensions $\Delta$ of the dual operators, corresponding to excitations in AdS space. We now make this relation more precise. According to the AdS/CFT dictionary [26, 27], a generic scalar excitation $\phi$ in AdS obeying
$$(\Box_{AdS_{d+1}} - m^2)\,\phi = 0 \tag{2.25}$$
and which behaves like $\rho^{-\Delta}$ near the boundary of AdS ($\rho \to \infty$), is dual to an operator in the dual CFT with scaling dimension
$$m^2 = \Delta(\Delta - d) \quad\Rightarrow\quad \Delta_\pm = \frac{d}{2} \pm \sqrt{\frac{d^2}{4} + m^2}. \tag{2.26}$$
More precisely, for $m^2 \ge -d^2/4 + 1$ the dimension of the operator is given by $\Delta_+$. However, for $-d^2/4 < m^2 < -d^2/4 + 1$ one can take either $\Delta_\pm$ and these will correspond
J. P. Gauntlett, D. Martelli, J. Sparks, S.-T. Yau
to inequivalent CFTs [28, 29]. Notice that $\Delta_+$ is always well above the bound implied by unitarity. On the other hand, $\Delta_-$ saturates this bound for $m^2 = -d^2/4 + 1$. The values for $m^2$ can be obtained from the eigenvalues $E$ of the scalar Laplacian $-\nabla_L^2$ on the internal manifold $L$ by performing a Kaluza–Klein analysis. The modes corresponding to the chiral primary operators have been identified in the literature in the context of a more general analysis for Einstein manifolds; see [28, 30] for type IIB supergravity compactified on $L^5$, and [31, 32] for M–theory compactified on $L^7$. Consider first $d = 4$ (i.e. $n = 3$). The supergravity modes dual to chiral operators are a mixture of the trace mode of the internal metric and the RR four–form and lead to [33, 28, 30]
$$m^2 = E + 16 - 8\sqrt{E + 4}. \tag{2.27}$$
Combining this with (2.14) it follows that
$$\Delta_\pm = 2 \pm |\lambda - 2| \tag{2.28}$$
so that $\Delta = \lambda$, providing that we take $\Delta_-$ for $\lambda < 2$ and $\Delta_+$ for $\lambda \ge 2$. Notice that for $\lambda = 2$, $\Delta_+ = \Delta_-$, and this corresponds to the Breitenlohner–Freedman bound $m^2(\lambda = 2) = -4$ for stability in $AdS_5$. The case $d = 3$ (i.e. $n = 4$), relevant for $AdS_4 \times L^7$ geometries, is similar. The scalar supergravity modes corresponding to chiral primaries [31, 32] are again a mixture of the metric trace and the three–form potential [34], and fall into short $\mathcal{N} = 2$ multiplets. Their masses are given by^5 [34, 35, 32]
$$m^2 = \frac{E}{4} + 9 - 3\sqrt{E + 9}. \tag{2.29}$$
Combining this with (2.14) it follows that
$$\Delta_\pm = \frac{1}{2}\left(3 \pm |\lambda - 3|\right) \tag{2.30}$$
so that $\Delta = \frac{1}{2}\lambda$, providing that we take $\Delta_-$ for $\lambda < 3$ and $\Delta_+$ for $\lambda \ge 3$. Once again the switching of the two branches occurs at the Breitenlohner–Freedman bound $m^2(\lambda = 3) = -9/4$ for stability in $AdS_4$. In summary, we have shown that
$$\Delta = \begin{cases} \lambda & \text{for } d = 4 \\ \frac{1}{2}\lambda & \text{for } d = 3 \end{cases}. \tag{2.31}$$
Thus in both cases relevant for AdS/CFT the Lichnerowicz bound $\lambda \ge 1$ is equivalent to the unitarity bound (2.23). The Bishop bound also has a direct interpretation in field theory. Recall that the volume of the Einstein 5–manifold $(L, g_L)$ is related to the exact $a$ central charge of the dual four dimensional conformal field theory via [36] (see also [37])
$$a(L) = \frac{\pi^3 N^2}{4\,\mathrm{vol}(L, g_L)}, \tag{2.32}$$
5 Note that the mass formulae in [35] are relative to the operator $\Box_{AdS_4} - 32$. Moreover, the factor of four mismatch between their $m^2$ and ours is simply due to the fact that it is actually $m^2 R^2$ that enters in (2.26), and the radius of $AdS_4$ is 1/2 that of $AdS_5$.
Obstructions to the Existence of Sasaki–Einstein Metrics
where $N$ is the number of D3–branes. The Bishop bound then implies that
$$a(L) \ge \frac{N^2}{4} = a(\mathcal{N} = 4), \tag{2.33}$$
where $N^2/4$ is the central charge of $\mathcal{N} = 4$ super Yang–Mills theory. One can give a heuristic argument for this inequality, as follows^6. By appropriately Higgsing the dual field theory, and then integrating out the massive fields, one expects to be able to flow to $\mathcal{N} = 4$ super Yang–Mills theory. This is because the Higgsing corresponds to moving the D3–branes away from the singular point to a smooth point of the cone, at which the near horizon geometry becomes $AdS_5 \times S^5$. Since the number of massless degrees of freedom is expected to decrease in such a process, we also expect the $a$ central charge to decrease. This would then explain the inequality (2.33).

3. Isolated Hypersurface Singularities

In this section we describe links of isolated quasi–homogeneous hypersurface singularities. These provide many simple examples of both obstructions. Let $w_i \in \mathbb{Z}_+$, $i = 1, \ldots, n+1$, be a set of positive weights. We denote these by a vector $w \in (\mathbb{Z}_+)^{n+1}$. This defines an action of $\mathbb{C}^*$ on $\mathbb{C}^{n+1}$ via
$$(z_1, \ldots, z_{n+1}) \mapsto (q^{w_1} z_1, \ldots, q^{w_{n+1}} z_{n+1}), \tag{3.1}$$
where $q \in \mathbb{C}^*$. Without loss of generality one can take the set $\{w_i\}$ to have no common factor. This ensures that the above $\mathbb{C}^*$ action is effective. However, for the most part, this is unnecessary for our purposes and we shall not always do this. Let
$$F : \mathbb{C}^{n+1} \to \mathbb{C} \tag{3.2}$$
be a quasi–homogeneous polynomial on $\mathbb{C}^{n+1}$ with respect to $w$. This means that $F$ has definite degree $d$ under the above $\mathbb{C}^*$ action:
$$F(q^{w_1} z_1, \ldots, q^{w_{n+1}} z_{n+1}) = q^d F(z_1, \ldots, z_{n+1}). \tag{3.3}$$
Moreover we assume that the affine algebraic variety
$$X = \{F = 0\} \subset \mathbb{C}^{n+1} \tag{3.4}$$
is smooth everywhere except at the origin $(0, 0, \ldots, 0)$. For obvious reasons, such $X$ are called isolated quasi–homogeneous hypersurface singularities. The corresponding link $L$ is the intersection of $X$ with the unit sphere in $\mathbb{C}^{n+1}$:
$$\sum_{i=1}^{n+1} |z_i|^2 = 1. \tag{3.5}$$
6 We thank Ken Intriligator for this argument.

A particularly nice set of such singularities is provided by so–called Brieskorn–Pham singularities. These take the particular form
$$F = \sum_{i=1}^{n+1} z_i^{a_i} \tag{3.6}$$
with $a \in (\mathbb{Z}_+)^{n+1}$. Thus the weights of the $\mathbb{C}^*$ action are given by $w_i = d/a_i$. The corresponding hypersurface singularities $X$ are always isolated, as is easily checked. Moreover, the topology of the links $L$ is also extremely well understood – see [39] for a complete description of the homology groups of $L$. In particular, $L$ is known to be $(n-2)$–connected, meaning that the homotopy groups are $\pi_a(L) = 0$ for all $a = 1, \ldots, n-2$. Returning to the general case, we may define a nowhere zero holomorphic $(n, 0)$–form on the smooth part of $X$ by
$$\Omega = \frac{dz_1 \wedge \cdots \wedge dz_n}{\partial F/\partial z_{n+1}}. \tag{3.7}$$
This defines $\Omega$ on the patch where $\partial F/\partial z_{n+1} \ne 0$. One has similar expressions on patches where $\partial F/\partial z_i \ne 0$ for each $i$, and it is simple to check that these glue together into a nowhere zero form $\Omega$. Thus all such $X$ are Gorenstein, and moreover they come equipped with a holomorphic $\mathbb{C}^*$ action by construction. The orbit space of this $\mathbb{C}^*$ action, or equivalently the orbit space of $U(1) \subset \mathbb{C}^*$ on the link, is a complex orbifold $V$. In fact, $V$ is the weighted variety defined by $\{F = 0\}$ in the weighted projective space $WCP^n_{[w_1, w_2, \ldots, w_{n+1}]}$. The latter is the quotient of the non–zero vectors in $\mathbb{C}^{n+1}$ by the weighted $\mathbb{C}^*$ action
$$WCP^n_{[w_1, w_2, \ldots, w_{n+1}]} = \left(\mathbb{C}^{n+1} \setminus \{(0, 0, \ldots, 0)\}\right)/\mathbb{C}^* \tag{3.8}$$
and is a complex orbifold with a natural Kähler orbifold metric, up to scale, induced from Kähler reduction of the flat metric on $\mathbb{C}^{n+1}$. It is not difficult to show that $V$ is a Fano orbifold if and only if
$$|w| - d > 0, \tag{3.9}$$
where $|w| = \sum_{i=1}^{n+1} w_i$. To see this, first notice that $|w| - d$ is the charge of $\Omega$ under $U(1) \subset \mathbb{C}^*$. To be precise, if $\zeta$ denotes the holomorphic vector field on $X$ with
$$\mathcal{L}_\zeta z_j = w_j\, i\, z_j \tag{3.10}$$
for each $j = 1, \ldots, n+1$, then
$$\mathcal{L}_\zeta \Omega = (|w| - d)\, i\, \Omega. \tag{3.11}$$
Positivity of this charge $|w| - d$ then implies [9] that the cohomology class of the natural Ricci–form induced on $V$ is represented by a positive $(1, 1)$–form, which is the definition that $V$ is Fano. If there exists a Ricci–flat Kähler metric on $X$ which is a cone under $\mathbb{R}_+ \subset \mathbb{C}^*$, then the correctly normalised Reeb vector field is thus
$$\xi = \frac{n}{|w| - d}\, \zeta. \tag{3.12}$$
We emphasise that here we will focus on the possible (non–)existence of a Sasaki–Einstein metric on the link $L$ which has this canonical vector field as its Reeb vector field. It is possible that such metrics are obstructed, but that there exists a Sasaki–Einstein metric on $L$ with a different Reeb vector field. This may be investigated using the results of [9]. In particular we shall come back to this point for a class of 3–fold examples in Sect. 4.
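As a small arithmetic illustration of the weights $w_i = d/a_i$, the Fano condition (3.9) and the normalisation in (3.12), the following Python sketch may be useful. The function names are ours and purely illustrative, and taking $d = \mathrm{lcm}(a_i)$ (so that the weights have no common factor) is an assumption of the sketch; as noted above, the weights need not always be reduced in this way.

```python
from fractions import Fraction
from math import lcm


def brieskorn_pham_weights(exponents):
    """Weights and degree for F = sum_i z_i**a_i, assuming d = lcm(a_i)."""
    d = lcm(*exponents)
    return tuple(d // a for a in exponents), d


def fano_charge(weights, degree):
    """|w| - d, the charge of Omega; V is Fano iff this is positive, cf. (3.9)."""
    return sum(weights) - degree


def reeb_normalisation(weights, degree):
    """Coefficient n / (|w| - d) in (3.12), with n = (number of weights) - 1."""
    n = len(weights) - 1
    return Fraction(n, fano_charge(weights, degree))


if __name__ == "__main__":
    w, d = brieskorn_pham_weights((2, 2, 2, 3))
    print(w, d, fano_charge(w, d), reeb_normalisation(w, d))
```

For exponents (2, 2, 2, 3) this yields weights (3, 3, 3, 2), degree 6, charge 5 and normalisation 3/5.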
3.1. The Bishop obstruction. A general formula for the volume of a Sasaki–Einstein metric on the link of an isolated quasi–homogeneous hypersurface singularity was given in [40]. Strictly speaking, this formula was proven only when the Fano $V$ is well–formed. This means that the orbifold loci of $V$ are at least complex codimension two. When $V$ is not well–formed, the singular sets of $V$ considered as an orbifold and as an algebraic variety are in fact different. A simple example is the weighted projective space $WCP^1_{[p,q]}$, where hcf$(p, q) = 1$. As an orbifold, this is topologically a 2–sphere with conical singularities at the north and south poles of polar angle $2\pi/p$ and $2\pi/q$, respectively. As an algebraic variety, this weighted projective space is just $CP^1$ since $\mathbb{C}/\mathbb{Z}_p = \mathbb{C}$. In fact, as a manifold it is diffeomorphic to $S^2$, for the same reason. When we say Kähler–Einstein orbifold metric, we must keep track of this complex codimension one orbifold data in the non–well–formed case. For further details, the reader is directed to the review [6]. Assuming that there exists a Sasaki–Einstein metric with Reeb action $U(1) \subset \mathbb{C}^*$, then the volume of this link when $V$ is well–formed is given by [40]
$$\mathrm{vol}(L) = \frac{2d}{w\,(n-1)!} \left( \frac{\pi(|w| - d)}{n} \right)^n. \tag{3.13}$$
Here $w = \prod_{i=1}^{n+1} w_i$ denotes the product of the weights. Using the earlier formula (2.8), we may now give an alternative derivation of this formula. The advantage of this approach is that, in contrast to [40], we never descend to the orbifold $V$. This allows us to dispense with the well–formed condition, and show that (3.13) holds in general. The authors of [40] noted that their formula seemed to apply to the general case. Let us apply (2.8) to isolated quasi–homogeneous hypersurface singularities. Let $q \in \mathbb{C}^*$ denote the weighted action on $X$. We may compute the character $C(q, X)$ rather easily, since holomorphic functions on $X$ descend from holomorphic functions on $\mathbb{C}^{n+1}$, and the trace over the latter is simple to compute. A discussion of precisely this problem may be found in [41]. According to the latter reference, the character is simply
$$C(q, X) = \frac{1 - q^d}{\prod_{i=1}^{n+1} (1 - q^{w_i})}. \tag{3.14}$$
The limit (2.8) is straightforward to take, giving the normalised volume
$$V(\xi) = \frac{d}{w\, b^n}, \tag{3.15}$$
where, as above,
$$\xi = b\,\zeta, \tag{3.16}$$
and $\zeta$ generates the $U(1) \subset \mathbb{C}^*$ action. Thus, from our earlier discussion on the charge of $\Omega$, we have
$$b = \frac{n}{|w| - d}, \tag{3.17}$$
giving
$$\mathrm{vol}(L) = \frac{d\,(|w| - d)^n}{w\, n^n}\, \mathrm{vol}(S^{2n-1}). \tag{3.18}$$
Restoring
$$\mathrm{vol}(S^{2n-1}) = \frac{2\pi^n}{(n-1)!} \tag{3.19}$$
we thus obtain the result (3.13). Bishop's theorem then requires, for existence of a Sasaki–Einstein metric on $L$ with Reeb vector field $\xi$ generating the canonical $U(1)$ action,
$$d\,(|w| - d)^n \le w\, n^n. \tag{3.20}$$
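The volume formula (3.18) and the Bishop inequality (3.20) are plain arithmetic in the weights and degree, so they can be checked mechanically. The following Python sketch is one way to do so; the function names are hypothetical (they do not come from the paper), and the quadric with weights (1, 1, 1, 1) and degree 2 is used only as an illustrative input.

```python
from fractions import Fraction
from math import factorial, pi, prod


def normalised_volume(weights, degree):
    """vol(L) / vol(S^{2n-1}) = d (|w| - d)^n / (w n^n), cf. (3.18)."""
    n = len(weights) - 1
    charge = sum(weights) - degree
    return Fraction(degree * charge**n, prod(weights) * n**n)


def link_volume(weights, degree):
    """vol(L) as a float, restoring vol(S^{2n-1}) = 2 pi^n / (n-1)!, cf. (3.19)."""
    n = len(weights) - 1
    return float(normalised_volume(weights, degree)) * 2 * pi**n / factorial(n - 1)


def satisfies_bishop(weights, degree):
    """True when d (|w| - d)^n <= w n^n, i.e. when (3.20) holds."""
    return normalised_volume(weights, degree) <= 1


if __name__ == "__main__":
    print(normalised_volume((1, 1, 1, 1), 2))   # quadric 3-fold
    print(satisfies_bishop((1, 1, 1, 1), 2))
```

Exact rational arithmetic is used for the ratio so that borderline cases of (3.20) are not affected by rounding.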
We shall see that infinitely many isolated quasi–homogeneous hypersurface singularities with Fano $V$ violate this inequality.

3.2. The Lichnerowicz obstruction. As we already mentioned, holomorphic functions on $X$ are simply restrictions of holomorphic functions on $\mathbb{C}^{n+1}$. Thus the smallest positive charge holomorphic function is $z_m$, where $m \in \{1, \ldots, n+1\}$ is such that
$$w_m = \min\{w_i,\ i = 1, \ldots, n+1\}. \tag{3.21}$$
Of course, $m$ might not be unique, but this is irrelevant since all such $z_m$ have the same charge in any case. This charge is
$$\lambda = \frac{n\, w_m}{|w| - d} \tag{3.22}$$
and thus the Lichnerowicz obstruction becomes
$$|w| - d \le n\, w_m. \tag{3.23}$$
Moreover, this bound can be saturated if and only if $X$ is $\mathbb{C}^n$ with its flat metric. It is again clearly trivial to construct many examples of isolated hypersurface singularities that violate this bound.

3.3. Sufficient conditions for existence. In a series of works by Boyer, Galicki and collaborators, many examples of Sasaki–Einstein metrics have been shown to exist on links of isolated quasi–homogeneous hypersurface singularities of the form (3.6). Weighted homogeneous perturbations of these singularities can lead to continuous families of Sasaki–Einstein metrics. For a recent review of this work, we refer the reader to [6] and references therein. Existence of these metrics is proven using the continuity method. One of the sufficient (but far from necessary) conditions for there to exist a Sasaki–Einstein metric is that the weights satisfy the condition [6]
$$|w| - d < \frac{n}{n-1}\, w_m. \tag{3.24}$$
In particular, for $n > 2$ this implies that
$$|w| - d < n\, w_m, \tag{3.25}$$
which is precisely Lichnerowicz's bound. Curiously, for $n = 2$ the Lichnerowicz bound and (3.24) are the same, although this case is rather trivial.
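The charge (3.22), the Lichnerowicz condition (3.23) and the sufficient condition (3.24) can likewise be evaluated directly from the weights. The sketch below does this in Python; the function names are illustrative only, and the code assumes $|w| - d > 0$ (the Fano case), since otherwise (3.22) is not meaningful.

```python
from fractions import Fraction


def smallest_charge(weights, degree):
    """lambda = n w_m / (|w| - d), cf. (3.22); assumes |w| - d > 0."""
    n = len(weights) - 1
    return Fraction(n * min(weights), sum(weights) - degree)


def lichnerowicz_ok(weights, degree):
    """True when |w| - d <= n w_m, i.e. lambda >= 1, cf. (3.23)."""
    return smallest_charge(weights, degree) >= 1


def sufficient_condition(weights, degree):
    """True when |w| - d < n w_m / (n - 1), cf. (3.24); assumes n >= 2."""
    n = len(weights) - 1
    return (sum(weights) - degree) * (n - 1) < n * min(weights)


if __name__ == "__main__":
    for w, d in [((1, 1, 1, 1), 3), ((3, 3, 3, 2), 6), ((5, 5, 5, 2), 10)]:
        print(w, d, smallest_charge(w, d), lichnerowicz_ok(w, d),
              sufficient_condition(w, d))
```

Note that the second input passes the Lichnerowicz test without satisfying (3.24), which reflects the statement that (3.24) is sufficient but far from necessary; passing (3.23) alone does not establish existence.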
4. A Class of 3–Fold Examples

Our first set of examples is given by the 3–fold singularities with weights $w = (k, k, k, 2)$ and polynomial
$$F = \sum_{i=1}^{3} z_i^2 + z_4^k, \tag{4.1}$$
where $k$ is a positive integer. The corresponding isolated hypersurface singularities $X_k = \{F = 0\}$ are of Brieskorn–Pham type. Notice that $X_1 = \mathbb{C}^3$ and $X_2$ is the ordinary double point singularity, better known to physicists as the conifold. Clearly, both of these admit Ricci–flat Kähler cone metrics and moreover, the Sasaki–Einstein metrics are homogeneous. The differential topology of the links $L_k$ can be deduced using the results of [39], together with Smale's theorem for 5–manifolds. In particular, for $k$ odd, the link $L_k$ is diffeomorphic to $S^5$. For $k = 2p$ even, one can show that $L_{2p} \cong S^2 \times S^3$ (alternatively, see Lemma 7.1 of [42]). The Fanos $V_k$ are not well–formed for $k > 2$. In fact, the subvariety $z_4 = 0$ in $V_k$ is a copy of $CP^1$, which is a locus of $\mathbb{Z}_k$ orbifold singularities for $k$ odd, and $\mathbb{Z}_{k/2}$ orbifold singularities for $k$ even. As algebraic varieties, all the odd $k$ are equivalent to $CP^2$, and all the even $k$ are equivalent to $CP^1 \times CP^1$. As orbifolds, they are clearly all distinct.

4.1. Obstructions. These singularities have appeared in the physics literature [18] where it was assumed that all $X_k$ admitted conical Ricci–flat Kähler metrics, with Reeb action corresponding to the canonical $U(1)$ action. In fact, it is trivial to show that the Bishop bound (3.20) is violated for all $k > 20$. Moreover, the Lichnerowicz bound (3.23) is even sharper: for $k \ge 2$, $z_4$ has smallest^7 charge under $\xi$, namely
$$\lambda = \frac{6}{k+2}, \tag{4.2}$$
which immediately rules out all $k > 4$. For $k = 4$ we have $\lambda = 1$. Recall that, according to [21], this can happen if and only if $L_4$ is the round sphere. But we already argued that $L_4 = S^2 \times S^3$, which rules out $k = 4$ also. Thus the only link that might possibly admit a Sasaki–Einstein metric with this $U(1)$ Reeb action, apart from $k = 1, 2$, is $k = 3$. We shall return to the $k = 3$ case in the next subsection.

7 For $k = 1$, $z_i$, $i = 1, 2, 3$ have the smallest charge $\lambda = 1$. This is consistent with the fact that $k = 1$ corresponds to the link $L_1 = S^5$ with its round metric.

Given the contradiction, one might think that perhaps the canonical $\mathbb{C}^*$ action is not the critical one, in the sense of [9]. Writing
$$F = z_1^2 + uv + z_4^k, \tag{4.3}$$
there is clearly a $(\mathbb{C}^*)^2$ action generated by weights $(k, k, k, 2)$ and $(0, 1, -1, 0)$ on $(z_1, u, v, z_4)$, respectively. The second $U(1) \subset \mathbb{C}^*$ is the maximal torus of $SO(3)$ acting on the $z_i$, $i = 1, 2, 3$, in the vector representation. It is then straightforward to compute the volume of the link as a function of the Reeb vector field
$$\xi = \sum_{a=1}^{2} b_a \zeta_a \tag{4.4}$$
using the character formula in [41] and taking the limit as in (2.8). We obtain
$$V(b_1, b_2) = \frac{2k b_1}{2b_1 \cdot k b_1 (k b_1 + b_2)(k b_1 - b_2)} = \frac{1}{b_1 (k b_1 + b_2)(k b_1 - b_2)}. \tag{4.5}$$
The first component $b_1$ is fixed by the charge of $\Omega$, as above, to be
$$b_1 = \frac{3}{k+2}. \tag{4.6}$$
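As a sanity check on (4.5) with $b_1$ fixed by (4.6), the dependence of $V$ on $b_2$ can be examined with exact rational arithmetic. The Python sketch below is illustrative only: the function names are ours, and the derivative is obtained by differentiating (4.5) by hand, writing $V = 1/\big(b_1 (k^2 b_1^2 - b_2^2)\big)$.

```python
from fractions import Fraction


def volume(k, b1, b2):
    """V(b1, b2) = 1 / (b1 (k b1 + b2)(k b1 - b2)), cf. (4.5)."""
    return 1 / (b1 * (k * b1 + b2) * (k * b1 - b2))


def dvolume_db2(k, b1, b2):
    """Derivative of (4.5) in b2: 2 b2 / (b1 ((k b1)^2 - b2^2)^2)."""
    return 2 * b2 / (b1 * ((k * b1) ** 2 - b2**2) ** 2)


if __name__ == "__main__":
    k = 3
    b1 = Fraction(3, k + 2)          # (4.6)
    eps = Fraction(1, 100)
    print(volume(k, b1, 0))          # value at b2 = 0
    print(dvolume_db2(k, b1, 0))     # vanishes at b2 = 0
    print(volume(k, b1, eps) > volume(k, b1, 0))
```

The derivative vanishes only at $b_2 = 0$ in the allowed range $|b_2| < k b_1$, and $V$ is even in $b_2$, so this point is a minimum of $V$ along the $b_2$ direction.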
According to [9], the critical Reeb vector field for the putative Sasaki–Einstein metric is obtained by setting to zero the derivative of $V(b_1, b_2)$, with respect to $b_2$. This immediately gives $b_2 = 0$. Thus the original weighted $\mathbb{C}^*$ action is indeed a critical point of the Sasakian–Einstein–Hilbert action on the link, in this 2–dimensional space of Reeb vector fields. We could have anticipated this result without computing anything. According to [9], the critical Reeb vector field could not have mixed with any vector field in the Lie algebra of a $U(1)$ subgroup of $SO(3)$, since the latter group is semi–simple.

4.2. Cohomogeneity one metrics. It is interesting to observe that any conical Ricci–flat Kähler metric on $X_k$ would necessarily have $U(1) \times SO(3)$ isometry (the global form of the effectively acting isometry group will depend on $k$ mod 2). This statement follows from Matsushima's theorem [7]. Specifically, Matsushima's theorem says that the isometry group of a Kähler–Einstein manifold^8 $(V, g_V)$ is a maximal compact subgroup of the group of complex automorphisms of $V$. Quotienting $L_k$ by the $U(1)$ action, one would thus have Kähler–Einstein orbifold metrics on $V_k$ with an $SO(3)$ isometry, whose generic orbit is three–dimensional. In other words, these metrics, when they exist, can be constructed using standard cohomogeneity one techniques. In fact this type of construction is very well motivated since demanding a local $SU(2) \times U(1)^2$ isometry is one way in which the Sasaki–Einstein metrics of [43] can be constructed (in fact they were actually found much more indirectly via M–theory [44]). However, apart from the $k = 3$ case, and of course the $k = 1$ and $k = 2$ cases, we have already shown that any such construction must fail. For $k = 3$, the relevant ODEs that need to be solved have actually been written down in [45]. In Appendix A we record these equations, as well as the boundary conditions that need to be imposed. We have been unable to integrate these equations, so the question of existence of a Sasaki–Einstein metric on $L_3$ remains open.

8 The generalisation to orbifolds is straightforward.

4.3. Field theory. In [18] a family of supersymmetric quiver gauge theories was studied whose classical vacuum moduli space reproduces the affine varieties $X_{2p}$. These theories were argued to flow for large $N$ in the IR to a superconformal fixed point, AdS/CFT dual to a Sasaki–Einstein metric on the link $L_{2p}$ for all $p$. Indeed, the R–charges of fields may be computed using a–maximisation [14], and agree with the naive geometric computations, assuming that the Sasaki–Einstein metrics on $L_{2p}$ exist. However, as we have already seen, these metrics cannot exist for any $p > 1$. We argued in Sect. 2 that this bound, coming from Lichnerowicz's theorem, is equivalent to the unitarity bound in the CFT. We indeed show that a gauge invariant chiral primary operator, dual to the holomorphic function $z_4$ that provides the geometric obstruction, violates the unitarity bound for $p > 1$.
Fig. 1. Quiver diagram of the $A_1$ orbifold gauge theory (field labels in the diagram: A, X, Y, B).
Before we recall the field theories for $k = 2p$ even, let us make a remark on the $X_k$ singularities when $k$ is odd. In the latter case, it is not difficult to prove that $X_k$ admits no crepant resolution^9. That is, there is no blow–up of $X_k$ to a smooth manifold $\tilde{X}$ with trivial canonical bundle. In such cases the field theories might be quite exotic, and in particular not take the form of quiver gauge theories. In contrast, the $X_{2p}$ singularities are resolved by blowing up a single exceptional $CP^1$ [47], which leads to a very simple class of gauge theories. Consider the quiver diagram for the $\mathcal{N} = 2$ $A_1$ orbifold, depicted in Fig. 1. The two nodes represent two $U(N)$ gauge groups. There are 6 matter fields: an adjoint for each gauge group, that we denote by $X$ and $Y$, and two sets of bifundamental fields $A_I$ and $B_I$, where $I = 1, 2$ are $SU(2)$ flavour indices. Here the $A_I$ are in the $(N, \bar{N})$ representation of $U(N) \times U(N)$, and the $B_I$ are in the $(\bar{N}, N)$ representation. This is the quiver for $N$ D3–branes at the $\mathbb{C} \times (\mathbb{C}^2/\mathbb{Z}_2)$ singularity, where $\mathbb{C}^2/\mathbb{Z}_2$ is the $A_1$ surface singularity. However, for our field theories indexed by $p$, the superpotential is given by
$$W = \mathrm{Tr}\left[ X^{p+1} + (-1)^p\, Y^{p+1} + X(A_1 B_1 + A_2 B_2) + Y(B_1 A_1 + B_2 A_2) \right]. \tag{4.7}$$
It is straightforward to verify that the classical vacuum moduli space of this gauge theory gives rise to the $X_{2p}$ singularities. In fact these gauge theories were also studied in detail in [48], and we refer the reader to this reference for further details. The $SU(2)$ flavour symmetry corresponds to the $SO(3)$ automorphism of $X_{2p}$. It is a simple matter to perform a–maximisation for this theory, taken at face value. Recall this requires one to assign trial R–charges to each field, and impose the constraints that $W$ has R–charge 2, and that the β–functions of each gauge group vanish. One then locally maximises the a–function
$$a = \frac{3N^2}{32} \left( 2 + \sum_i \left[ 3(R(X_i) - 1)^3 - (R(X_i) - 1) \right] \right) \tag{4.8}$$
subject to these constraints, where the sum is taken over all R–charges of fields $X_i$. One finds the results, as in [18],
$$R(X) = R(Y) = \frac{2}{p+1}, \qquad R(A_I) = R(B_I) = \frac{p}{p+1}, \tag{4.9}$$

9 Since the link $L_k = S^5$ for all odd $k$, it follows that Pic$(X_k \setminus \{r = 0\})$ is trivial, and hence $X_k$ is factorial. The isolated singularity at $r = 0$ is terminal for all $k$. These two facts, together with Corollary 4.11 of [46], imply that $X_k$ has no crepant resolution.
and central charge
$$a(L_{2p}) = \frac{27 p^2 N^2}{8 (p+1)^3}. \tag{4.10}$$
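The central charge (4.10) can be cross–checked against the geometric volume formula (3.13) through the relation (2.32). The following Python sketch carries out this comparison in exact arithmetic, with the overall factors of $N^2$ and $\pi^3$ stripped off. The function names are hypothetical, and the inputs $w = (2p, 2p, 2p, 2)$, $d = 4p$, $n = 3$ are the Sect. 4 weights with $k = 2p$.

```python
from fractions import Fraction
from math import factorial, prod


def a_over_N2(p):
    """a / N^2 as given in (4.10)."""
    return Fraction(27 * p**2, 8 * (p + 1) ** 3)


def vol_over_pi3(p):
    """vol(L_2p) / pi^3 from (3.13) with w = (2p, 2p, 2p, 2), d = 4p, n = 3."""
    k, n = 2 * p, 3
    weights, d = (k, k, k, 2), 2 * k
    charge = sum(weights) - d
    prefactor = Fraction(2 * d, prod(weights) * factorial(n - 1))
    return prefactor * Fraction(charge, n) ** n


def a_from_volume(p):
    """a / N^2 obtained from the volume via (2.32): pi^3 / (4 vol)."""
    return 1 / (4 * vol_over_pi3(p))


if __name__ == "__main__":
    for p in range(1, 6):
        print(p, a_over_N2(p), a_from_volume(p))
```

Agreement of the two columns is a consistency check of the formulae only; it does not by itself say anything about whether the corresponding Sasaki–Einstein metrics exist.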
This corresponds, under the AdS/CFT relation (2.32), to a Sasaki–Einstein volume
$$\mathrm{vol}(L_{2p}) = \frac{2\pi^3 (p+1)^3}{27 p^2}, \tag{4.11}$$
which agrees with the general formula (3.13). Thus an initial reaction [18] is that one has found agreement between geometric and field theory results. However, the results of this paper imply that the Sasaki–Einstein metrics on $L_{2p}$ do not exist for $p > 1$. In fact, it is clear that, upon closer inspection, the gauge invariant chiral primary operator Tr $X$ (or Tr $Y$) has R–charge $2/(p+1)$, which violates the unitarity bound for $p > 1$. In fact, when one computes the vacuum moduli space for a single D3–brane $N = 1$, Tr $X$ is identified with the holomorphic function $z_4$, and the unitarity bound and Lichnerowicz bound are identical, as we argued to be generally true in Sect. 2. The superpotential (4.7) can be regarded as a deformation of the $A_1$ orbifold theory. For $p > 2$, using a–maximisation (and assuming an a–theorem), this was argued in [48] to be an irrelevant deformation (rather than a “dangerously irrelevant” operator). This is therefore consistent with our geometric results. The case $p = 2$ is interesting since it appears to be marginal. If it is exactly marginal, we expect a one parameter family of solutions with fluxes that interpolates between the $A_1$ orbifold with link $S^5/\mathbb{Z}_2$ and the $X_4$ singularity with flux.

5. Other Examples

In this section we present some further obstructed examples. In particular we examine ADE 4–fold singularities, studied in [20]. All of these, with the exception of $D_4$ and the obvious cases of $A_0$ and $A_1$, do not admit Ricci–flat Kähler cone metrics with the canonical weighted $\mathbb{C}^*$ action. We also examine weighted $\mathbb{C}^*$ actions on $\mathbb{C}^n$.

5.1. ADE 4–fold singularities. Consider the polynomials
$$\begin{array}{ll}
H = z_1^k + z_2^2 + z_3^2 & A_{k-1}, \\
H = z_1^k + z_1 z_2^2 + z_3^2 & D_{k+1}, \\
H = z_1^3 + z_2^4 + z_3^2 & E_6, \\
H = z_1^3 + z_1 z_2^3 + z_3^2 & E_7, \\
H = z_1^3 + z_2^5 + z_3^2 & E_8.
\end{array} \tag{5.1}$$
The hypersurfaces $\{H = 0\} \subset \mathbb{C}^3$ are known as the ADE surface singularities. Their links $L_{ADE}$ are precisely $S^3/\Gamma$, where $\Gamma \subset SU(2)$ are the finite ADE subgroups of $SU(2)$ acting on $\mathbb{C}^2$ in the vector representation. Thus these Gorenstein singularities are both hypersurface singularities and quotient singularities. Clearly, the links admit Sasaki–Einstein metrics – they are just the quotient of the round metric on $S^3$ by the group $\Gamma$.
The 3–fold singularities of the previous section are obtained from the polynomial $H$ for $A_{k-1}$ by simply adding an additional term $z_4^2$ (and relabelling). More generally, we may define the ADE $n$–fold singularities as the zero loci $X = \{F = 0\}$ of
$$F = H + \sum_{i=4}^{n+1} z_i^2. \tag{5.2}$$
Let us consider the particular case $n = 4$. The $\mathbb{C}^*$ actions, for the above cases, are generated by the weight vectors
$$\begin{array}{lll}
w = (2, k, k, k, k) & d = 2k & A_{k-1}, \\
w = (2, k-1, k, k, k) & d = 2k & D_{k+1}, \\
w = (4, 3, 6, 6, 6) & d = 12 & E_6, \\
w = (6, 4, 9, 9, 9) & d = 18 & E_7, \\
w = (10, 6, 15, 15, 15) & d = 30 & E_8.
\end{array} \tag{5.3}$$
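With the weights and degrees in (5.3), the charge of each coordinate function under the normalised Reeb vector field (3.12) is $n w_i/(|w| - d)$ with $n = 4$, and the Lichnerowicz bound requires it to be at least 1. The Python sketch below tabulates these charges; the helper names (`charge`, `a_series`, `d_series`, `EXCEPTIONAL`) are ours and are introduced only for illustration.

```python
from fractions import Fraction


def charge(weights, degree, index):
    """Charge n w_i / (|w| - d) of z_index (1-based) under the Reeb vector (3.12)."""
    n = len(weights) - 1
    return Fraction(n * weights[index - 1], sum(weights) - degree)


def a_series(k):
    """Weights and degree of the A_{k-1} 4-fold from (5.3)."""
    return (2, k, k, k, k), 2 * k


def d_series(k):
    """Weights and degree of the D_{k+1} 4-fold from (5.3)."""
    return (2, k - 1, k, k, k), 2 * k


EXCEPTIONAL = {
    "E6": ((4, 3, 6, 6, 6), 12),
    "E7": ((6, 4, 9, 9, 9), 18),
    "E8": ((10, 6, 15, 15, 15), 30),
}

if __name__ == "__main__":
    for k in range(2, 7):
        print("A", k - 1, charge(*a_series(k), 1),
              "D", k + 1, charge(*d_series(k), 1), charge(*d_series(k), 2))
    for name, (w, d) in EXCEPTIONAL.items():
        print(name, charge(w, d, 2))
```

A charge below 1 signals a violation of (3.23); a charge equal to 1 is the saturated case, which requires the separate topological argument based on [21].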
It is then straightforward to verify that for all the $A_{k-1}$ singularities with $k > 3$ the holomorphic function $z_1$ on $X$ violates the Lichnerowicz bound (3.23). The case $k = 3$ saturates the bound, but since the link is not^{10} diffeomorphic to $S^7$ Obata's result [21] again rules this out. For all the exceptional singularities the holomorphic function $z_2$ on $X$ violates (3.23). The $D_{k+1}$ singularities are a little more involved. The holomorphic function $z_1$ rules out all $k > 3$. On the other hand the function $z_2$ rules out $k = 2$, but the Lichnerowicz bound is unable to rule out $k = 3$. To summarise, the only ADE 4–fold singularity that might possibly admit a Ricci–flat Kähler cone metric with the canonical $\mathbb{C}^*$ action above, apart from the obvious cases of $A_0$ and $A_1$, is $D_4$. Existence of a Sasaki–Einstein metric on the link of this singularity is therefore left open. It would be interesting to investigate whether or not there exist Ricci–flat Kähler metrics that are cones with a different Reeb action. In light of our results on the non–existence of the above Sasaki–Einstein metrics, it would also be interesting to revisit the field theory analysis of [20].

10 One can easily show that $H_3(L; \mathbb{Z}) = \mathbb{Z}_k$ for the links of the $A_{k-1}$ 4–fold singularities.

5.2. Weighted actions on $\mathbb{C}^n$. Consider $X = \mathbb{C}^n$, with a weighted $\mathbb{C}^*$ action with weights $v \in (\mathbb{Z}_+)^n$. The orbit space of non–zero vectors is the weighted projective space $WCP^{n-1}_{[v_1, \ldots, v_n]}$. Existence of a Ricci–flat Kähler cone metric on $\mathbb{C}^n$, with the conical symmetry generated by this $\mathbb{C}^*$ action, is equivalent to existence of a Kähler–Einstein orbifold metric on the weighted projective space. In fact, it is well known that no such metric exists: the Futaki invariant of the weighted projective space is non–zero. In fact, one can see this also from the Sasakian perspective through the results of [13, 9]. The diagonal action with weights $v = (1, 1, \ldots, 1)$ is clearly a critical point of the Sasakian–Einstein–Hilbert action, and this critical point was shown to be unique in the space of toric Sasakian metrics. Nonetheless, in this subsection we show that Lichnerowicz's bound and Bishop's bound both obstruct existence of these metrics. The holomorphic $(n, 0)$–form on $X = \mathbb{C}^n$ has charge $|v|$ under the weighted $\mathbb{C}^*$ action, which implies that the correctly normalised Reeb vector field is
$$\xi = \frac{n}{|v|}\, \zeta \tag{5.4}$$
with notation as before, so that $\zeta$ is the vector field that generates $U(1) \subset \mathbb{C}^*$. The Lichnerowicz bound is therefore
$$\frac{n v_m}{|v|} \ge 1, \tag{5.5}$$
where $v_m$ is the (or a particular) smallest weight. However, clearly $|v| \ge n v_m$, with equality if and only if $v$ is proportional to $(1, 1, \ldots, 1)$. Thus in fact
$$\frac{n v_m}{|v|} \le 1 \tag{5.6}$$
with equality only in the diagonal case, which is just $\mathbb{C}^n$ with the canonical Reeb vector field. Thus our Lichnerowicz bound obstructs Kähler–Einstein orbifold metrics on all weighted projective spaces, apart from $CP^{n-1}$ of course. For the Bishop bound, notice that a Kähler–Einstein orbifold metric on $WCP^{n-1}_{[v_1, \ldots, v_n]}$ would give rise to a Sasaki–Einstein metric on $S^{2n-1}$ with a weighted Reeb action. The volume of this metric, relative to the round sphere, would be
$$V = \frac{|v|^n}{n^n\, v}, \tag{5.7}$$
where $v = \prod_{i=1}^{n} v_i$ denotes the product of the weights. This may either be derived using the methods described earlier, or using the toric methods of [13]. Amusingly, (5.7) is precisely the arithmetic mean of the weights $v$ divided by their geometric mean, all to the $n$th power. Thus the usual arithmetic mean–geometric mean inequality gives $V \ge 1$ with equality if and only if $v$ is proportional to $(1, 1, \ldots, 1)$. This is precisely opposite to Bishop's bound, thus again ruling out all weighted projective spaces, apart from $CP^{n-1}$. Thus Kähler–Einstein orbifold metrics on weighted projective spaces are obstructed by the Futaki invariant, the Bishop obstruction, and the Lichnerowicz obstruction. In some sense, these Fano orbifolds couldn't have more wrong with them.

6. Conclusions

The problem of existence of conical Ricci–flat Kähler metrics on a Gorenstein $n$–fold singularity $X$ is a subtle one; a set of necessary and sufficient algebraic conditions is unknown. This is to be contrasted with the case of compact Calabi–Yau manifolds, where Yau's theorem guarantees the existence of a unique Ricci–flat Kähler metric in a given Kähler class. In this paper we have presented two simple necessary conditions for existence of a Ricci–flat Kähler cone metric on a given isolated Gorenstein singularity $X$ with specified Reeb vector field. The latter is in many ways similar to specifying a “Kähler class”, or polarisation. These necessary conditions are based on the classical results of Bishop and Lichnerowicz, that bound the volume and the smallest eigenvalue of the Laplacian on Einstein manifolds, respectively. The key point that allows us to use these as obstructions is that, in both cases, fixing a putative Reeb vector field $\xi$ for the Sasaki–Einstein metric is sufficient to determine both the volume and the “holomorphic” eigenvalues using only the holomorphic data of $X$.
Note that any such vector field ξ must also be a critical point of the Sasakian–Einstein–Hilbert action of [13, 9], which in Kähler–Einstein terms means that the transverse Futaki invariant is zero. We emphasize, however, that the possible obstructions presented here may be analysed independently of this, the weighted
projective spaces at the end of Sect. 5 being examples that are obstructed by more than one obstruction. To demonstrate the utility of these criteria, we have provided many explicit examples of Gorenstein singularities that do not admit Sasaki–Einstein metrics on their links, for a particular choice of Reeb vector field. The examples include various quasi–homogeneous hypersurface singularities, previously studied in the physics literature, that have been erroneously assumed to admit such Ricci–flat Kähler cone metrics. We expect that in the particular case that the singularity is toric, neither Lichnerowicz nor Bishop’s bound will obstruct for the critical Reeb vector field b∗ of [13]. This is certainly true for all cases that have been analysed in the literature. In this case both bounds reduce to simple geometrical statements on the polyhedral cone C ∗ and its associated semi–group SC = C ∗ ∩ Zn . For instance, given the critical Reeb vector field b∗ , the Lichnerowicz bound implies that (b∗ , m) ≥ 1 for all m ∈ SC . It would be interesting to try to prove that this automatically follows from the extremal problem in [13], for any toric Gorenstein singularity. We have also explained the relevance of these bounds to the AdS/CFT correspondence. We have shown that the Lichnerowicz bound is equivalent to the unitarity bound on the scaling dimensions of BPS chiral operators of the dual field theories. In particular, we analysed a class of obstructed 3–fold singularities, parameterised by a positive integer k, for which, in the case that k is even, the field theory dual is known and has been extensively studied in the literature. The fact that the links L k do not admit Sasaki–Einstein metrics for any k > 3 supports the field theory arguments of [48]. 
It would be interesting to know whether a Sasaki–Einstein metric exists on L 3 ; if it does exist, it might be dual to an exotic type of field theory since the corresponding Calabi– Yau cone does not admit a crepant resolution. For the 4–folds studied in [20], it will be interesting to analyse the implications of our results for the field theories. Acknowledgements. We would like to thank O. Mac Conamhna and especially D. Waldram for collaboration in the early stages of this work. We would also like to thank G. Dall’Agata, M. Haskins, N. Hitchin, K. Intriligator, P. Li, R. Thomas, C. Vafa, N. Warner, and S. S.–T. Yau for discussions. We particularly thank R. Thomas for comments on a draft version of this paper. J. F. S. is supported by NSF grants DMS–0244464, DMS–0074329 and DMS–9803347. S.–T. Y. is supported in part by NSF grants DMS–0306600 and DMS– 0074329.
A. Cohomogeneity One Metrics

Here we discuss the equations that need to be solved to obtain a Kähler–Einstein orbifold metric on the Fano orbifold $V_k$ of Sect. 4, which we recall is a hypersurface $F = z_1^2 + z_2^2 + z_3^2 + z_4^k = 0$ in the weighted projective space $\mathbb{WCP}^3_{[k,k,k,2]}$. The group $SO(3)$ acts on $z_i$, $i = 1, 2, 3$, in the vector representation, and then Matsushima’s theorem [7] implies that this acts isometrically on any Kähler–Einstein metric. The generic orbit is three–dimensional, and hence these metrics are cohomogeneity one. The Kähler–Einstein condition then reduces to a set of ordinary differential equations in a rather standard way. The ODEs for a local Kähler–Einstein 4–metric with cohomogeneity one $SU(2)$ action have been written down in [45]. The metric may be written as

$$ds^2 = dt^2 + a^2(t)\,\sigma_1^2 + b^2(t)\,\sigma_2^2 + c^2(t)\,\sigma_3^2, \tag{A.1}$$

where $\sigma_i$, $i = 1, 2, 3$, are (locally) left–invariant one–forms on $SU(2)$, and $t$ is a coordinate transverse to the principal orbit. The ODEs are then [45]
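$$\begin{aligned}
\frac{\dot a}{a} &= -\frac{1}{2abc}\,(b^2 + c^2 - a^2),\\
\frac{\dot b}{b} &= -\frac{1}{2abc}\,(a^2 + c^2 - b^2),\\
\frac{\dot c}{c} &= -\frac{1}{2abc}\,(a^2 + b^2 - c^2) + \Lambda\,\frac{ab}{c},
\end{aligned} \tag{A.2}$$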
where $\Lambda$ is the Einstein constant, which is $\Lambda = 6$ in the normalisation relevant for Sasaki–Einstein metrics on $L_k$.

The key question, given these local equations, is what the boundary conditions are. For a complete metric on $V_k$, the parameter $t$ must take values in a finite interval, which without loss of generality we may take to be $[0, t_*]$ for some $t_*$. At the endpoints, the principal orbit collapses smoothly (in an orbifold sense) to a special orbit. It is not difficult to work out the details for $V_k$, given its embedding in $\mathbb{WCP}^3_{[k,k,k,2]}$. One must separate $k = 2p$ even and $k$ odd. For $k$ odd, the principal orbit is $SO(3)/\mathbb{Z}_2$. This collapses to the two special orbits

$$B_{t=0} = (SO(3)/\mathbb{Z}_2)/U(1)_1 = \mathbb{RP}^2, \qquad B_{t=t_*} = (SO(3)/\mathbb{Z}_2)/U(1)_3 = \mathbb{CP}^1, \tag{A.3}$$
where the circle subgroups $U(1)_1, U(1)_3 \subset SO(3)$ are rotations about the planes transverse to the 1–axis and the 3–axis, respectively, thinking of $SO(3)$ acting on $\mathbb{R}^3$ in the usual way. Thus the two $U(1)$ subgroups are related by a conjugation. For $k = 2p$ even, the principal orbit is instead simply $SO(3) = \mathbb{RP}^3$. The two special orbits are

$$B_{t=0} = SO(3)/U(1)_1 = S^2, \qquad B_{t=t_*} = SO(3)/U(1)_3 = \mathbb{CP}^1. \tag{A.4}$$
Of course, these are diffeomorphic, but the notation indicates that the second orbit is embedded as a complex curve in $V_{2p}$, whereas $S^2$ is embedded as a real submanifold of $V_{2p}$. In both cases, with $k$ odd or $k = 2p$ even, the bolts are the real section of $V_k$, and the subvariety $z_4 = 0$, respectively. The latter is the image of the conic in $\mathbb{CP}^2 \subset \mathbb{WCP}^3_{[k,k,k,2]}$ at $z_4 = 0$, and is a locus of orbifold singularities. This is the only singular set on $V_k$. The boundary conditions at $t = 0$ are then, in all cases,

$$a(t) = \beta + O(t), \qquad b(t) = \beta + O(t), \qquad c(t) = \frac{2}{k}\,t + O(t^2), \tag{A.5}$$

where

$$\beta^2 = \frac{k+2}{6k}. \tag{A.6}$$
At $t = t_*$, one simply requires that $a$ collapses to zero, $a(t_*) = 0$, with $b(t_*) = c(t_*)$ positive and finite. The metric functions should remain strictly positive on the open interval $(0, t_*)$.

The system of first order ODEs (A.2) may be reduced to a single second order ODE as follows. The change of variables $dr/dt = 1/c$ allows one to find the integral

$$\frac{a}{b}(r) = -\coth(r), \tag{A.7}$$

where an integration constant can be reabsorbed by a shift of $r$. Defining $f(r) = ab$, one obtains

$$\frac{d}{dr}\log\left(f\,\frac{df}{dr}\right) = 2\left[\Lambda f + \coth(2r)\right]. \tag{A.8}$$

Any solution of this equation gives rise to a solution of (A.2), using the fact that

$$c^2 = -\frac{df}{dr}. \tag{A.9}$$
For $k = 1$, $k = 2$, one can write down explicit solutions to these equations and boundary conditions, corresponding to the standard metrics on $\mathbb{CP}^2$ and $\mathbb{CP}^1 \times \mathbb{CP}^1$, respectively. For $k = 1$ we have

$$a(t) = \cos\left(t + \frac{\pi}{4}\right), \qquad b(t) = \sin\left(t + \frac{\pi}{4}\right), \qquad c(t) = \sin(2t), \tag{A.10}$$

where the range of $t$ is $0 \leq t \leq \pi/4$. Correspondingly,

$$f(r) = -\frac{1}{2}\tanh(2r) \tag{A.11}$$

with $\tan(t) = \exp(2r)$, so that $-\infty \leq r \leq 0$. For $k = 2$ we instead have

$$a(t) = \frac{1}{\sqrt{3}}\cos(\sqrt{3}\,t), \qquad b(t) = \frac{1}{\sqrt{3}}, \qquad c(t) = \frac{1}{\sqrt{3}}\sin(\sqrt{3}\,t), \tag{A.12}$$

where the range of $t$ is $0 \leq t \leq \pi/(2\sqrt{3})$. Correspondingly,

$$f(r) = -\frac{1}{3}\tanh(r) \tag{A.13}$$

with $\tan(\sqrt{3}\,t/2) = \exp(r)$ and $-\infty \leq r \leq 0$.

For all $k > 3$, this paper implies that there do not exist any solutions. This still leaves the case $k = 3$. We have neither been able to integrate the equations explicitly, nor have our preliminary numerical investigations been conclusive. We leave the issue of existence of this solution open.
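The explicit $k = 1$ and $k = 2$ solutions can be checked numerically against (A.2) and the boundary data (A.5), (A.6). The following is a minimal sketch, not part of the original analysis: it assumes $\Lambda = 6$, and the helper names (`rhs`, `sol_k1`, `sol_k2`, `max_residual`, `beta_squared`) are hypothetical. It compares a central-difference derivative of each solution with the right-hand side of (A.2) at interior points of $(0, t_*)$.

```python
import math

LAMBDA = 6.0  # Einstein constant, assumed in the normalisation quoted for (A.2)


def rhs(a, b, c, lam=LAMBDA):
    """Right-hand sides of (A.2), solved for (da/dt, db/dt, dc/dt)."""
    pref = 1.0 / (2.0 * a * b * c)
    da = -a * pref * (b * b + c * c - a * a)
    db = -b * pref * (a * a + c * c - b * b)
    dc = -c * pref * (a * a + b * b - c * c) + lam * a * b
    return da, db, dc


def sol_k1(t):
    """Explicit k = 1 solution (A.10)."""
    return math.cos(t + math.pi / 4), math.sin(t + math.pi / 4), math.sin(2 * t)


def sol_k2(t):
    """Explicit k = 2 solution (A.12)."""
    s = math.sqrt(3.0)
    return math.cos(s * t) / s, 1.0 / s, math.sin(s * t) / s


def max_residual(sol, t_star, n=50, h=1e-6):
    """Largest mismatch between a central-difference derivative and rhs()."""
    worst = 0.0
    for i in range(1, n + 1):
        t = t_star * i / (n + 1)  # interior sample points only
        plus, minus = sol(t + h), sol(t - h)
        numeric = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
        exact = rhs(*sol(t))
        worst = max(worst, max(abs(x - y) for x, y in zip(numeric, exact)))
    return worst


def beta_squared(k):
    """Boundary value a(0)^2 = b(0)^2 from (A.6)."""
    return (k + 2) / (6 * k)


if __name__ == "__main__":
    print(max_residual(sol_k1, math.pi / 4))
    print(max_residual(sol_k2, math.pi / (2 * math.sqrt(3.0))))
```

The residuals should be at the level of finite-difference noise. The endpoints are excluded because the prefactor $1/(2abc)$ is singular where $c$ or $a$ vanishes. The same `rhs` function could serve as a starting point for a numerical study of $k = 3$, although the behaviour near the singular endpoint $t = 0$ would need separate care.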
References

1. Maldacena, J. M.: “The large N limit of superconformal field theories and supergravity.” Adv. Theor. Math. Phys. 2, 231 (1998) [Int. J. Theor. Phys. 38, 1113 (1999)]
2. Kehagias, A.: “New type IIB vacua and their F-theory interpretation”. Phys. Lett. B 435, 337 (1998)
3. Klebanov, I.R., Witten, E.: “Superconformal field theory on threebranes at a Calabi-Yau singularity”. Nucl. Phys. B 536, 199 (1998)
4. Acharya, B.S., Figueroa-O’Farrill, J.M., Hull, C.M., Spence, B.: “Branes at conical singularities and holography”. Adv. Theor. Math. Phys. 2, 1249 (1999)
5. Morrison, D.R., Plesser, M.R.: “Non-spherical horizons. I”. Adv. Theor. Math. Phys. 3, 1 (1999)
6. Boyer, C.P., Galicki, K.: “Sasakian Geometry, Hypersurface Singularities, and Einstein Metrics.” Supplemento ai Rendiconti del Circolo Matematico di Palermo Serie II. Suppl 75, 57–87 (2005)
7. Matsushima, Y.: “Sur la structure du groupe d’homéomorphismes analytiques d’une certaine variété kaehlérienne”. Nagoya Math. J. 11, 145–150 (1957)
8. Futaki, A.: “An obstruction to the existence of Einstein Kähler metrics”. Invent. Math. 73, 437–443 (1983)
9. Martelli, D., Sparks, J., Yau, S.-T.: “Sasaki–Einstein Manifolds and Volume Minimisation,” http://arxiv.org/list/hep-th/0603021
10. Martelli, D., Sparks, J.: “Toric geometry, Sasaki-Einstein manifolds and a new infinite class of AdS/CFT duals”. Commun. Math. Phys. 262, 51 (2006)
11. Yau, S.-T.: “Open problems in Geometry”. Proc. Symp. Pure Math. 54, 1–28 (1993)
12. Donaldson, S.K.: “Symmetric spaces, Kähler geometry, and Hamiltonian dynamics”. Amer. Math. Soc. Transl. 196, 13–33 (1999)
13. Martelli, D., Sparks, J., Yau, S.-T.: “The geometric dual of a-maximisation for toric Sasaki-Einstein manifolds”. Commun. Math. Phys. 268, 39–65 (2006)
14. Intriligator, K., Wecht, B.: “The exact superconformal R-symmetry maximizes a”. Nucl. Phys. B 667, 183 (2003)
15. Bishop, R.L., Crittenden, R.J.: “Geometry of manifolds.” New York: Academic Press, 1964
16. Besse, A.L.: “Einstein Manifolds.” Berlin-Heidelberg-New York: Springer–Verlag, 2nd edition, 1987
17. Lichnerowicz, A.: “Géométrie des groupes de transformations.” Paris: Dunod, 1958
18. Cachazo, F., Fiol, B., Intriligator, K.A., Katz, S., Vafa, C.: “A geometric unification of dualities”. Nucl. Phys. B 628, 3 (2002)
19. Conti, D.: “Cohomogeneity one Einstein-Sasaki 5-manifolds.” http://arxiv.org/list/math.DG/0606323, 2006
20. Gukov, S., Vafa, C., Witten, E.: “CFT’s from Calabi-Yau four-folds.” Nucl. Phys. B 584, 69 (2000) [Erratum-ibid. B 608, 477 (2001)]
21. Obata, M.: “Certain conditions for a Riemannian manifold to be isometric to a sphere”. J. Math. Soc. Japan 14, 333–340 (1962)
22. Minakshisundaram, S., Pleijel, A.: “Some Properties of the Eigenfunctions of the Laplace–Operator on Riemannian Manifolds”. Can. J. Math. 1, 242–256 (1949)
23. Kollár, J.: “Rational Curves on Algebraic Varieties.” Berlin-Heidelberg-New York: Springer–Verlag. Ergebnisse der Math. Vol 32, 1996
24. Hwang, J.-M.: “On the degrees of Fano four-folds of Picard number 1”. J. Reine Angew. Math. 556, 225–235 (2003)
25. Gubser, S., Nekrasov, N., Shatashvili, S.: “Generalized conifolds and four dimensional N = 1 superconformal theories”. JHEP 9905, 003 (1999)
26. Gubser, S.S., Klebanov, I.R., Polyakov, A.M.: “Gauge theory correlators from non-critical string theory”. Phys. Lett. B 428, 105 (1998)
27. Witten, E.: “Anti-de Sitter space and holography”. Adv. Theor. Math. Phys. 2, 253 (1998)
28. Klebanov, I.R., Witten, E.: “AdS/CFT correspondence and symmetry breaking”. Nucl. Phys. B 556, 89 (1999)
29. Balasubramanian, V., Kraus, P., Lawrence, A.E.: “Bulk vs. boundary dynamics in anti-de Sitter spacetime”. Phys. Rev. D 59, 046003 (1999)
30. Ceresole, A., Dall’Agata, G., D’Auria, R., Ferrara, S.: “Spectrum of type IIB supergravity on AdS(5) x T(11): Predictions on N = 1 SCFT’s”. Phys. Rev. D 61, 066001 (2000)
31. Fabbri, D., Fre, P., Gualtieri, L., Termonia, P.: “M-theory on AdS(4) x M(111): The complete Osp(2|4) x SU(3) x SU(2) spectrum from harmonic analysis”. Nucl. Phys. B 560, 617 (1999)
32. Fabbri, D., Fre’, P., Gualtieri, L., Reina, C., Tomasiello, A., Zaffaroni, A., Zampa, A.: “3D superconformal theories from Sasakian seven-manifolds: New nontrivial evidences for AdS(4)/CFT(3)”. Nucl. Phys. B 577, 547 (2000)
33. Kim, H.J., Romans, L.J., Nieuwenhuizen, P. van: “The Mass Spectrum Of Chiral N=2 D = 10 Supergravity On $S^5$”. Phys. Rev. D 32, 389 (1985)
34. Castellani, L., D’Auria, R., Fre, P., Pilch, K., Nieuwenhuizen, P. van: “The Bosonic Mass Formula For Freund-Rubin Solutions Of D = 11 Supergravity On General Coset Manifolds”. Class. Quant. Grav. 1, 339 (1984)
35. D’Auria, R., Fre, P.: “Universal Bose-Fermi Mass Relations In Kaluza-Klein Supergravity And Harmonic Analysis On Coset Manifolds With Killing Spinors”. Annals Phys. 162, 372 (1985)
36. Henningson, M., Skenderis, K.: “The holographic Weyl anomaly”. JHEP 9807, 023 (1998)
37. Gubser, S.S.: “Einstein manifolds and conformal field theories”. Phys. Rev. D 59, 025006 (1999)
38. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: “Supersymmetric AdS(5) solutions of type IIB supergravity”. Class. Quant. Grav. 23, 4693 (2006)
39. Randell, R.C.: “The homology of generalized Brieskorn manifolds.” Topology 14, no. 4, 347–355 (1975)
40. Bergman, A., Herzog, C.P.: “The Volume of some Non-spherical Horizons and the AdS/CFT Correspondence.” JHEP 0201, 030 (2002)
41. Nekrasov, N., Shadchin, S.: “ABCD of instantons”. Commun. Math. Phys. 252, 359 (2004)
42. Smith, I., Thomas, R.P.: “Symplectic surgeries from singularities”. Turkish J. Math. 27, 231–250 (2003)
43. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: “Sasaki-Einstein metrics on $S^2 \times S^3$”. Adv. Theor. Math. Phys. 8, 711 (2004)
44. Gauntlett, J.P., Martelli, D., Sparks, J., Waldram, D.: “Supersymmetric AdS5 solutions of M-theory”. Class. Quant. Grav. 21, 4335 (2004)
45. Dancer, A.S., Strachan, I.A.B.: “Kähler–Einstein metrics with $SU(2)$ action”. Math. Proc. Camb. Phil. Soc. 115, 513 (1994)
46. Kollár, J.: “Flops”. Nagoya Math. J. 113, 15–36 (1989)
47. Laufer, H.B.: “On $\mathbb{CP}^1$ as an exceptional set.” In: Recent developments in several complex variables, Tokyo/Princeton, NJ: Princeton University Press and University of Tokyo Press, 1981
48. Corrado, R., Halmagyi, N.: “N = 1 field theories and fluxes in IIB string theory”. Phys. Rev. D 71, 046001 (2005)

Communicated by G.W. Gibbons