December 27, 2001 14:41 WSPC/148-RMP
00107
Reviews in Mathematical Physics, Vol. 14, No. 1 (2002) 1–28 © World Scientific Publishing Company
SEMICLASSICAL MOTION OF DRESSED ELECTRONS
STEFAN TEUFEL∗ and HERBERT SPOHN† Zentrum Mathematik, Technische Universität München, 80290 München, Germany ∗
[email protected] †
[email protected]
Received 6 February 2001

We consider an electron coupled to the quantized radiation field and subject to a slowly varying electrostatic potential. We establish that over sufficiently long times radiation effects are negligible and the dressed electron is governed by an effective one-particle Hamiltonian. In the proof only a few generic properties of the full Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ enter. Most importantly, $H_{\mathrm{PF}}$ must have an isolated ground state band for $|p| < p_c \le \infty$, with $p$ the total momentum and $p_c$ indicating that the ground state band may terminate. This structure demands a local approximation theorem, in the sense that the one-particle approximation holds until the semiclassical dynamics violates $|p| < p_c$. Within this framework we prove an abstract Hilbert space theorem which uses no additional information on the Hamiltonian away from the band of interest. Our result is applicable to other time-dependent semiclassical problems. We discuss semiclassical distributions for the effective one-particle dynamics and show how they can be translated to the full dynamics by our results.
1. Introduction

Electrons, protons, and other elementary charged particles move in essence along classical orbits provided the external potentials have a slow variation. For example, in accelerators it is safe to compute the orbits by means of classical mechanics. As a somewhat crude physical picture one imagines that the electron dressed with its photon cloud at any given instant t is in its state of lowest energy consistent with the current total momentum p(t). During the time span δt the external forces change the momentum of the electron, followed by a rapid adjustment of the photon cloud resulting in the new total momentum p(t + δt). In this picture two physical mechanisms are interlaced. Since the forces are weak, the acceleration is small and radiative losses can be ignored. The effective dynamics of the dressed electron is conservative within a good approximation. In addition the total momentum p(t) has a slow variation and can be regarded as a semiclassical variable with respect to the effective Hamiltonian of the ground state band (lowest energy shell). The goal of our paper is to establish the validity of the physical picture.

To set the scene, let us introduce the quantum Hamiltonian under consideration. In fact, as will be explained below, our technique is fairly general and uses only
a few generic properties of the Hamiltonian. Still, it is useful to have a specific example in mind. We consider a free electron coupled to the quantized Maxwell field. The Hilbert space of states for the electron is $\mathcal{H}_{\mathrm{el}} = L^2(\mathbb{R}^3)$ and its time evolution is governed by the Hamiltonian $-\frac{1}{2m}\Delta$. Here $m$ is the mass of the electron and we have set $\hbar = 1$. For the photon field we introduce the Fock space $\mathcal{F}_b = \mathcal{H}_f$ over the one-particle space $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$, i.e. $\mathcal{F}_b = \bigoplus_{n=0}^{\infty} S_n (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes n}$ with $S_n$ the symmetrizer. Thus a state $\phi \in \mathcal{F}_b$ is a sequence of vectors $\{\phi^{(0)}, \phi^{(1)}, \ldots\}$ with $\phi^{(n)} \in S_n (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes n}$ such that $\|\phi\|^2 = \sum_{n=0}^{\infty} \|\phi^{(n)}\|^2 < \infty$. On $\mathcal{F}_b$ we define the two component Bose field with annihilation operators $a(k,\lambda)$, where $k \in \mathbb{R}^3$ stands for the wave number and $\lambda = 1, 2$ for the helicity of the photon. The fields satisfy the canonical commutation relations
\[ [a(k,\lambda), a^*(k',\lambda')] = \delta(k - k')\,\delta_{\lambda\lambda'}, \qquad [a(k,\lambda), a(k',\lambda')] = 0 = [a^*(k,\lambda), a^*(k',\lambda')]. \]
The Hamiltonian of the free photon field is then given by
\[ H_f = \sum_{\lambda=1,2} \int d^3k\; \omega(k)\, a^*(k,\lambda)\, a(k,\lambda) \tag{1} \]
with dispersion relation $\omega(k) = |k|$, the velocity of light $c = 1$. The electron and the photon field are minimally coupled through the transverse vector potential $A(x)$. To assure transversality we introduce an orthonormal basis of the form $e_1(k)$, $e_2(k)$, $k/|k|$. Then
\[ A(x) = (2\pi)^{-3/2} \sum_{\lambda=1,2} \int d^3k\; \frac{1}{\sqrt{2\omega(k)}}\; e_\lambda(k) \big( e^{ik\cdot x} a(k,\lambda) + e^{-ik\cdot x} a^*(k,\lambda) \big). \tag{2} \]
$A$ is an operator valued distribution. To make it an unbounded operator we smoothen over the form factor $\rho$ as
\[ A_\rho(x) = \int d^3x'\; \rho(x - x')\, A(x') \tag{3} \]
and assume that $\rho$ is radial, smooth, of rapid decrease, and normalized as $\int d^3x\, \rho(x) = 1$. In the corresponding classical Hamiltonian $e\rho$ would be the rigid charge distribution. With all these preparations we can introduce the Pauli–Fierz operator of a free electron as
\[ H_0 = \frac{1}{2m} \big( -i\nabla \otimes 1 - e A_\rho(x) \big)^2 + 1 \otimes H_f \tag{4} \]
acting on $\mathcal{H} = \mathcal{H}_{\mathrm{el}} \otimes \mathcal{H}_f$. Here $e$ is the charge of the electron. In (4) $x$ denotes the position operator of the electron on $L^2(\mathbb{R}^3)$. $H_0$ is invariant under translations jointly of the electron and the photons. Thus the total momentum
\[ p = p_{\mathrm{el}} \otimes 1 + 1 \otimes p_f, \qquad p_f = \sum_{\lambda=1,2} \int d^3k\; k\, a^*(k,\lambda)\, a(k,\lambda), \tag{5} \]
is conserved, $[H_0, p] = 0$. This can be seen more directly by rewriting (4) in momentum representation as
\[ H_0 = \frac{1}{2m} \big( p_{\mathrm{el}} \otimes 1 - e A_\rho(i\nabla_{p_{\mathrm{el}}}) \big)^2 + 1 \otimes H_f \tag{6} \]
and then going to a representation in which $p$ is diagonal. This is achieved through the unitary transformation $T$ defined by
\[ (T\psi)^{(n)}(p, k_1, \ldots, k_n) = \psi^{(n)}\Big( p - \sum_{i=1}^{n} k_i,\; k_1, \ldots, k_n \Big), \tag{7} \]
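where $\psi^{(n)}$ is the $n$-particle sector component of $\psi$ in electron momentum representation. The transformed Hamiltonian $T^{-1} H_0 T$, again denoted by $H_0$, takes the form
\[ H_0 = \frac{1}{2m} \big( p - p_f - e A_\rho(0) \big)^2 + H_f. \tag{8} \]
We decompose $\mathcal{H}$ and $H_0$ on the spectrum of $p$,
\[ \mathcal{H} = \int_{\mathbb{R}^3}^{\oplus} d^3p\; \mathcal{H}_p, \qquad H_0 = \int_{\mathbb{R}^3}^{\oplus} d^3p\; H_0(p). \tag{9} \]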
The spaces $\mathcal{H}_p$ are isomorphic to $\mathcal{H}_f$ in a natural sense and will be identified in the following. $H_0(p)$ is just $H_0$ acting on $\mathcal{H}_f$ for a given value of $p$. Note that with a properly chosen phase $H_0(p)$ is a real operator.

Physically one expects to have a dressed electron state for given momentum $p$, at least if $|p| < p_c$. In our semi-relativistic model states with $|p| \ge p_c$ decay through Cherenkov radiation to lower momentum states. Thus provisionally we assume that there exists a $p_c$ such that for every $p \in \Lambda_g = \{p : |p| < p_c\}$ the operator $H_0(p)$ has a unique ground state with energy $E(p)$, i.e.
\[ H_0(p)\,\psi_0(p) = E(p)\,\psi_0(p) \tag{10} \]
has a solution $\psi_0(p) \in \mathcal{H}_f$ which is unique up to scalar multiples and $E(p) = \inf \operatorname{spec}(H_0(p))$. We will discuss in Sec. 2.3 under which additional conditions on $\rho$ our assumptions can be verified. $E(p)$ is the ground state band energy. The corresponding projection is $P_g = \int_{\Lambda_g}^{\oplus} d^3p\; P_0(p)$, where $P_0(p)$ denotes the orthogonal projection onto the one-dimensional subspace spanned by $\psi_0(p)$. The states in $\operatorname{Ran} P_g$ are called dressed electron states. More explicitly
\[ \operatorname{Ran} P_g = \Big\{ \int_{\Lambda_g}^{\oplus} d^3p\; \phi(p)\,\psi_0(p) \in \mathcal{H};\; \phi \in L^2(\Lambda_g) \Big\}. \tag{11} \]
On $\operatorname{Ran} P_g$ we have
\[ e^{-iH_0 t}\,\psi = \int_{\Lambda_g} d^3p\; \big( e^{-iE(p)t} \phi(p) \big)\, \psi_0(p). \tag{12} \]
Thus, if initially in Ran Pg , the dressed electron propagates like a single quantum particle with dispersion relation E(p), which is generated through the interaction with the photons. In particular, it stays in the dressed electron subspace for all times.
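The invariance of the dressed electron subspace can be read off fiber by fiber. The following is a short sketch of that computation, using only the decomposition (9) and the eigenvalue equation (10); it is meant as an illustration rather than a full argument.

```latex
% Fiberwise computation behind (12): the free dynamics only multiplies
% the momentum amplitude by a phase, so Ran P_g is left invariant.
\begin{align*}
\psi &= \int_{\Lambda_g} d^3p\; \phi(p)\,\psi_0(p), \qquad
H_0(p)\,\psi_0(p) = E(p)\,\psi_0(p), \\
\big( e^{-iH_0 t}\psi \big)(p)
  &= \phi(p)\; e^{-iH_0(p)t}\,\psi_0(p)
   = e^{-iE(p)t}\,\phi(p)\,\psi_0(p), \\
\big\| e^{-iE(\cdot)t}\phi \big\|_{L^2(\Lambda_g)} &= \|\phi\|_{L^2(\Lambda_g)} .
\end{align*}
```

Since the phase factor changes neither the support nor the $L^2$-norm of $\phi$, the evolved state is again of the form (11).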
As already remarked, the motion of an electron is modified through external electromagnetic potentials. In general, they have a slow variation on the scale set by the Compton wave length. Thus we add to $H_0$ the external potential $V(\varepsilon x)$ (and possibly also an external vector potential $A_{\mathrm{ex}}(\varepsilon x)$). Here $\varepsilon$ is a small dimensionless parameter which controls the variation of $V$. The external forces break the translation invariance of $H_0$ and the total momentum is no longer conserved. A state $\psi$ initially in $\operatorname{Ran} P_g$ will no longer remain so under the time evolution generated by the full Hamiltonian $H$, which reflects that an accelerated charge loses energy through radiation. Since the external forces are weak, of order $\varepsilon$, we can expect radiation losses to be negligible. More precisely, the acceleration is of order $\varepsilon$ and by Larmor's formula the energy radiated over the time span $\tau$ is of order $\varepsilon^2 \tau$. Thus the relevant time scale is of order $\varepsilon^{-1}$. On that time scale the radiation loss is of order $\varepsilon$, whereas the cumulative effect of the forces is of order 1 and the electron moves on the scale set by the potential $V$. If the initial $\psi \in \operatorname{Ran} P_g$, the dressed electron should still be governed by an effective one-particle Hamiltonian, which is obtained from the dispersion relation $E(p)$ through the Peierls substitution
\[ H_1 = E\big( p - e A_{\mathrm{ex}}(\varepsilon x) \big) + V(\varepsilon x), \tag{13} \]
and
\[ e^{-iHt}\,\psi = \int_{\Lambda_g} d^3p\; \big( e^{-iH_1 t} \phi(p) \big)\, \psi_0(p) + O(\varepsilon) \tag{14} \]
for $0 \le t \le \varepsilon^{-1} T$ with some suitable macroscopic time $T$. We will establish (14) under some additional assumptions on $H_0$ and for $A_{\mathrm{ex}} = 0$.

In (14) it is crucial to assume that $\psi \in \operatorname{Ran} P_g$. For a general $\psi \in \mathcal{H}$ one expects that, in essence on a time scale of order 1, $\psi$ splits into outgoing radiation and a piece which is approximately in $\operatorname{Ran} P_g$. This second piece is then governed by the approximation (14). To prove that this really happens is a challenging problem of scattering theory for free electrons and outside the scope of our present investigation.

There are several difficulties with the picture proposed in (14). Firstly the Pauli–Fierz Hamiltonian is infrared divergent. For $p \ne 0$, the photon cloud has an infinite number of photons, though of finite total energy. The physical ground state at fixed $p \ne 0$ does not lie in Fock space. Even if we introduce a suitable infrared cut-off by assuming that $\hat\rho(k) \to 0$ as $k \to 0$, with $\hat\rho$ the Fourier transform of $\rho$, $E(p)$ is not separated from the rest of the spectrum of $H_0(p)$. To overcome both difficulties we are forced to give the photons a small mass, which means to set $\omega(k) = (m_{\mathrm{ph}}^2 + k^2)^{1/2}$. In a separate study [19] we remove the gap condition using ideas as developed in [1].

As a second difficulty, which has been accounted for already, we observe that $E(p)$ will cease to exist beyond a certain critical value $p_c$, i.e. for $|p| \ge p_c$. This is most easily understood by considering the uncoupled Hamiltonian $H_0(p)$ at $e = 0$. It has absolutely continuous spectrum and the only eigenvalue $E(p) = \frac{1}{2m_{\mathrm{el}}} p^2$.
This eigenvalue is isolated and below the continuum edge for $|p| < m_{\mathrm{el}}$ ($m_{\mathrm{ph}}$ small) and is embedded in the continuum for $|p| > m_{\mathrm{el}}$. For small coupling the embedded eigenvalue should dissolve and $E(p)$ exists only for $|p| < p_c \cong m_{\mathrm{el}}$. We could avoid the bounded extension of $E(p)$ through a suitable modification of the boson dispersion $\omega(k)$ and/or the electron dispersion $p^2/2m$, cf. Sec. 2.3. However, the termination of bands is a fairly generic phenomenon. Thus, viewed in a more general setting, we have a classical Hamiltonian
\[ H_{\mathrm{cl}}(q,p) = E(p) + V(q) \tag{15} \]
corresponding to (13) on phase space $\Gamma = \mathbb{R}^3 \times \Lambda_g$. The solution flow does not exist globally in time and for given initial conditions $(q,p)$ there is a first time $T$ (including $T = \infty$) when the solution trajectory hits the boundary of $\Gamma$. This means we have to control the approximation (14) up to the time when substantial parts of the wave packet "leave" the allowed phase space. Beyond that time new physical phenomena appear which are not accounted for by (14).

In spirit the approximation (14) is similar to the adiabatic theorem, which states that for a Hamiltonian slowly varying in time, transitions between the time-dependent eigenspaces are small. In our problem the slow variation is in space and transitions from and to $\operatorname{Ran} P_g$ are small. We borrow some elements from the proof of the time-adiabatic theorem; cf. Sec. 4.

It turns out that the validity of (14) relies only on a few rather general facts. Therefore we decided to prove (14) as an abstract operator problem. The basic assumptions are the decomposition (9) and a non-degenerate energy band of possibly finite extension separated by a gap from the rest of the spectrum. We mention three widely studied physical systems which possess this abstract mathematical structure, but are, at first sight, very different from the dressed electron.

(i) For an electron moving in a periodic crystal potential, $p$ becomes the quasimomentum and the energy band is one particular Bloch band. Bloch bands are discrete but may cross. The small parameter arises, as for Pauli–Fierz, through an external potential that varies slowly on the scale of the lattice spacing. In [8] we studied the semiclassical motion in periodic potentials for Bloch bands that are isolated over the whole Brillouin zone. While we used a similar approach, the present paper is a substantial improvement in two respects. The approximation is local, no isolated bands have to be assumed, and we consistently avoid using any information on $H(p)$ away from the band of interest.
(ii) Electrons are lighter than nuclei by a factor of $2\cdot 10^3$–$5\cdot 10^5$, which is the starting point of the Born–Oppenheimer approximation for molecular dynamics. $p$ is now replaced by $\vec{R}$, the collection of nucleonic coordinates. $H(\vec{R})$ is the electron Hamiltonian for fixed nuclei and the band structure arises from eigenvalues of $H(\vec{R})$. They may cross or dive into the continuum. If we include the kinetic energy of the nuclei, the small parameter becomes $\varepsilon = (m_{\mathrm{el}}/m_{\mathrm{nucleus}})^{1/2}$ and, except for the interchange of $p$ and $\vec{R}$, the Born–Oppenheimer approximation
fits our framework. Since in the following only bounded potentials will be covered (the kinetic energy is $p^2$ and plays the role of the potential), we postpone a discussion of the time-dependent Born–Oppenheimer approximation to a separate paper [18]. For an analysis from the point of view of wave packet dynamics we refer to [6].

(iii) The Dirac equation for a single particle has the electron and the positron band. One studies the motion of an electron, say, under slowly varying external potentials [2, 17]. As a novel feature, the bands are doubly degenerate. This would also be the case if we include the electron spin in (8) as
\[ H = \frac{1}{2m} \big( \sigma \cdot (p - p_f - e A_\rho) \big)^2 + H_f , \]
[7]. In the semiclassical limit the internal degrees of freedom (degeneracy) remain quantum mechanical and one has to approximate by matrix valued classical mechanics.

Our paper is organized as follows. The abstract setting will be explained in Sec. 2. In this framework, almost by necessity, the theorems are stated as uniform convergence of certain unitary groups and of the corresponding time dependent semiclassical observables in the Heisenberg picture. The reader may worry, as we did, whether such convergence results imply the semiclassical approximation of quantities of physical interest, like position and momentum distributions. In fact, they do under very mild assumptions on the initial wave function, as discussed in Sec. 6. Our main theorems are stated in Sec. 2 with proofs given in Secs. 3 to 5.

2. General Setting and Main Results

2.1. General setting

For better readability we denote "momentum space" by $M := \mathbb{R}^d$, $d \in \mathbb{N}$. Let $\mathcal{H}_f$ be any separable Hilbert space, although the notation should remind one of the Hilbert space for the Bose field in case of our main application. Let $H_0$ be a self-adjoint operator on $D(H_0) \subset \mathcal{H} = L^2(M) \otimes \mathcal{H}_f$ that can be decomposed on $M$ as
\[ H_0 = \int_M^{\oplus} dp\; H_0(p), \tag{16} \]
where $\{H_0(p),\, p \in M\}$ is a family of self-adjoint operators with a common domain $D_0 \subset \mathcal{H}_f$. We assume that the map $p \mapsto H_0(p)$ is differentiable in the sense that for all $p \in M$ and $j = 1, \ldots, d$ the limit
\[ (\partial_{p_j} H_0(p))\,(H_0(p) - i)^{-1} := \lim_{h \to 0} \frac{H_0(p + h e_j) - H_0(p)}{h}\,(H_0(p) - i)^{-1} \]
exists in the norm of bounded operators on $\mathcal{H}_f$. This defines, in particular, $(\nabla_p H_0)(p)$ on $D_0^d$.
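For orientation, the differentiability assumption can be checked formally on the fiber operator (8) of the main application. The following is only a sketch: the abbreviation $v(p)$ is introduced here for illustration and domain questions are ignored.

```latex
% Formal check of the differentiability assumption for the fiber operator (8).
% v(p) is shorthand introduced for this sketch; domains are not discussed.
\begin{align*}
H_0(p) &= \tfrac{1}{2m}\, v(p)^2 + H_f , \qquad v(p) := p - p_f - eA_\rho(0), \\
\frac{H_0(p + h e_j) - H_0(p)}{h}
  &= \frac{1}{2m}\,\frac{(v(p) + h e_j)^2 - v(p)^2}{h}
   = \frac{1}{m}\, v_j(p) + \frac{h}{2m}, \\
(\partial_{p_j} H_0)(p) &= \frac{1}{m}\, v_j(p).
\end{align*}
% Since H_f \ge 0 one has \|v_j(p)\psi\|^2 \le 2m\,\langle \psi, H_0(p)\psi \rangle,
% so v_j(p)(H_0(p) - i)^{-1} is bounded and the h/(2m) remainder vanishes in norm.
```

In this example the derivative is the velocity operator of the electron, and the difference quotient converges at rate $h$.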
For $p$ in some compact and convex $\Lambda_g \subseteq M$ with non-empty interior let $H_0(p)$ have an isolated non-degenerate eigenvalue $E(p)$, i.e. $H_0(p)\psi_0(p) = E(p)\psi_0(p)$ with $\psi_0(p) \in \mathcal{H}_f$ and $\operatorname{dist}(E(p), \sigma(H_0(p)) \setminus \{E(p)\}) > 0$. We assume that $E(\cdot) \in C^\infty(\Lambda_g, \mathbb{R})$. As before, $P_0(p)$ denotes the rank-one projection onto $\psi_0(p)$ and $P_g$ is defined as the orthogonal projection on $\operatorname{Ran} P_g$ given by (11).

The eigenfunctions $\psi_0(p)$, $p \in \Lambda_g$, are defined only up to a $p$-dependent phase factor and the choice of phase is crucial when comparing the full dynamics and the effective one-particle dynamics. We assume that it is possible to choose this phase such that $\psi_0(\cdot) \in C^2(\Lambda_g, \mathcal{H}_f)$ and, in order to avoid complications coming from geometric phases, $\operatorname{Im} \langle \psi_0(p), \nabla_p \psi_0(p) \rangle_{\mathcal{H}_f} = 0$. This choice is considered fixed from now on.

On $\Lambda_g$ we require $H_0(\cdot)$ to be twice continuously differentiable in the same sense as above, i.e. for all $p \in \Lambda_g$ and $j, k \in \{1, \ldots, d\}$ the limit
\[ (\partial_{p_k} \partial_{p_j} H_0(p))\,(H_0(p) - i)^{-1} := \lim_{h \to 0} \frac{(\partial_{p_j} H_0)(p + h e_k) - (\partial_{p_j} H_0)(p)}{h}\,(H_0(p) - i)^{-1} \]
exists in the norm of bounded operators on $\mathcal{H}_f$ and depends continuously on $p$ in the same topology. We remark that the last condition ensures the existence of a $C^2(\Lambda_g, \mathcal{H}_f)$ version of $\psi_0(\cdot)$. If, in addition, $H_0(p)$ is real, i.e. commutes with complex conjugation, then $\psi_0(p)$ can be chosen real-valued in that basis for each $p$ and a real-valued version of $\psi_0(\cdot) \in C^2(\Lambda_g, \mathcal{H}_f)$ exists, since $\Lambda_g$ is convex and thus contractible. Consequently also $\operatorname{Im} \langle \psi_0(p), \nabla_p \psi_0(p) \rangle_{\mathcal{H}_f} = 0$ is satisfied. Hence the assumptions on $\psi_0(\cdot)$ follow from the assumptions on $H_0(\cdot)$ in the case of real $H_0(p)$.

Let the potential $V : \mathbb{R}^d \to \mathbb{R}$ be such that $V(x) = \int dk\; e^{ik\cdot x}\, \hat V(k)$, where $\int dk\; |k|^n\, |\hat V(k)| < \infty$ for all $n \in \mathbb{N}_0$. This guarantees, in particular, that $V$ and all its partial derivatives are in $C^\infty(\mathbb{R}^d)$ and vanish at infinity. $V^\varepsilon := V(i\varepsilon\nabla_p)$ is a bounded self-adjoint operator on $L^2(M)$ and the full Hamiltonian
\[ H = H_0 + V^\varepsilon \otimes 1_{\mathcal{H}_f} \tag{17} \]
is self-adjoint on $D(H_0)$. We define the corresponding one-particle Hamiltonian on $L^2(M)$ by
\[ H_1 = E(p) + V^\varepsilon. \tag{18} \]
Note that $E(p)$ is a priori only defined for $p \in \Lambda_g$. To make things simple, we continue $E(p)$ to a smooth and compactly supported but otherwise arbitrary function on $M$. Since we will be interested only in the behavior for $p \in \Lambda_g$, we use this global function to define $H_1$. The corresponding unitary groups are denoted by $U(t) = e^{-iHt}$ and $U_1(t) = e^{-iH_1 t}$. Our goal is to show that, in a suitable sense,
\[ \big( U(t/\varepsilon) - U_1(t/\varepsilon) \big) P_g = O(\varepsilon) \tag{19} \]
for macroscopic times $t < T < \infty$, where $T$ depends only on $V$ and the initial momenta. But before we can state the precise result, we have to make sense of $H_1$ acting on $\operatorname{Ran} P_g$. According to (11) we define the map $\mathcal{U} : \operatorname{Ran} P_g \to L^2(M)$,
\[ \mathcal{U}(\phi\psi_0) = \phi, \quad \text{i.e.} \quad (\mathcal{U}\psi)(p) = \langle \psi_0(p), \psi(p) \rangle_{\mathcal{H}_f}. \tag{20} \]
Its adjoint $\mathcal{U}^* : L^2(M) \to \operatorname{Ran} P_g$ is given by
\[ \mathcal{U}^* \phi = \int^{\oplus} dp\; \mathbf{1}_{\Lambda_g}(p)\, \phi(p)\, \psi_0(p), \tag{21} \]
where here and in the following $\mathbf{1}_A$ denotes the characteristic function of a set $A$. Clearly $\mathcal{U}$ is an isometry and $\mathcal{U}^* \mathcal{U} = 1$ on $\operatorname{Ran} P_g$.

The effective dynamics cease to make sense once the momentum of the particle leaves $\Lambda_g$. If that is not excluded by energy conservation, the comparison (19) can hold only up to some finite time, which can be determined from the classical dynamics generated by $H_{\mathrm{cl}}(x,p) = E(p) + V(x)$ on phase space $\mathbb{R}^d \times M$. Let $\Lambda_i \subset M$, the "set of initial momenta", and $\Lambda_m \subset M$, some "maximal set of allowed momenta", both be compact. For $\Lambda \subset M$ compact and $\delta \ge 0$ we let
\[ \Lambda + \delta := \Big\{ p \in M : \inf_{k \in \Lambda} |p - k| \le \delta \Big\} \]
be the corresponding $\delta$-enlarged set, which is again compact. With $\Phi_p^t : \mathbb{R}^d \times M \to M$ denoting the momentum component of the classical flow, we define for $\Lambda_i + \delta \subset \Lambda_m$
\[ T_m^\delta(\Lambda_i, \Lambda_m) := \sup_{t \ge 0} \big\{ t : \operatorname{supp}\big( (\mathbf{1}_{\Lambda_i} \circ \Phi_p^{-s})(x, \cdot) \big) + \delta \subseteq \Lambda_m \;\; \forall\, s \in [0,t],\; x \in \mathbb{R}^d \big\} \]
as the maximal time for which the momentum of any (i.e. with any starting position) classical particle with initial momentum in $\Lambda_i$ stays within a $\delta$-margin inside of $\Lambda_m$.

The following lemma shows that the classical bound on the momentum is respected also by the quantum dynamics in the limit $\varepsilon \to 0$. Let $P_i = \mathbf{1}_{\Lambda_i}(p)$ and $P_m = \mathbf{1}_{\Lambda_m}(p)$.

Lemma 2.1. Let $\Lambda_i, \Lambda_m \subset M$ be both compact and $\Lambda_i + \delta \subset \Lambda_m$ for some $\delta > 0$. For any $T < \infty$ with $T \le T_m^\delta(\Lambda_i, \Lambda_m)$ there is a $C < \infty$ such that for all $t \in [0,T]$
\[ \Big\| (1 - P_m)\, U_1\Big( \frac{t}{\varepsilon} \Big) P_i \Big\|_{\mathcal{L}(L^2(M))} \le C \varepsilon^2. \tag{22} \]
For the proof of Lemma 2.1 see Sec. 3.
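To make the classical time $T_m^\delta$ concrete, here is a minimal numerical sketch in one dimension. Everything specific in it is an assumption chosen for illustration and not taken from the text: the band $E(p) = p^2/2$, the potential $V(x) = -\tanh(x)$, the sets $\Lambda_i = [-0.5, 0.5]$ and $\Lambda_m = [-1, 1]$, the symplectic Euler integrator, and the function names. Because only finitely many starting points are sampled, the result is an estimate of $T_m^\delta$ and not its exact value.

```python
import math

def band_velocity(p):
    # Hypothetical band E(p) = p**2 / 2, so dE/dp = p.
    return p

def force(x):
    # Hypothetical potential V(x) = -tanh(x); the force is -V'(x) = 1 / cosh(x)**2 <= 1.
    return 1.0 / math.cosh(x) ** 2

def exit_time(x0, p0, p_max, delta, t_max, dt=1e-3):
    """First time |p(t)| exceeds p_max - delta; math.inf if that does not happen up to t_max."""
    x, p = x0, p0
    steps = int(round(t_max / dt))
    for n in range(steps + 1):
        if abs(p) > p_max - delta:
            return n * dt
        # Symplectic Euler step for dx/dt = E'(p), dp/dt = -V'(x).
        p += dt * force(x)
        x += dt * band_velocity(p)
    return math.inf

def sampled_T_m(p_i, p_max, delta, t_max, positions, n_momenta=21):
    """Smallest exit time over sampled starting points with |p0| <= p_i."""
    momenta = [-p_i + 2.0 * p_i * j / (n_momenta - 1) for j in range(n_momenta)]
    return min(exit_time(x0, p0, p_max, delta, t_max)
               for x0 in positions for p0 in momenta)

positions = [-2.0 + 0.5 * j for j in range(9)]
T = sampled_T_m(p_i=0.5, p_max=1.0, delta=0.1, t_max=3.0, positions=positions)
print(T)
```

With these choices the force is bounded by 1, so the momentum needs at least time 0.4 to move from 0.5 to 0.9, and the sampled value cannot fall below that bound.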
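2.2. Main results

In the following we will always assume that $\Lambda_i \subset \Lambda_g$ and we can therefore identify $P_i$ with $\mathcal{U}^* P_i\, \mathcal{U} P_g$ to keep notation simple. That is, $P_i$ projects on "dressed electron states" with momenta in $\Lambda_i$.

Theorem 2.2. Let $\Lambda_i$ be compact with $\Lambda_i + \delta \subset \Lambda_g$ for some $\delta > 0$. For any $T < \infty$ with $T \le T_m^\delta(\Lambda_i, \Lambda_g)$ there is a $C < \infty$ such that for all $t \in [0,T]$
\[ \Big\| \Big( U\Big( \frac{t}{\varepsilon} \Big) - \mathcal{U}^*\, U_1\Big( \frac{t}{\varepsilon} \Big)\, \mathcal{U} \Big) P_i \Big\|_{\mathcal{L}(\mathcal{H})} \le C \varepsilon. \tag{23} \]

This means, in the case of our main application, that dressed electron states with initial momenta in $\Lambda_i$ evolve according to the effective dynamics without radiation losses for times of order $\varepsilon^{-1}$ as long as the momenta of the corresponding classical orbits stay inside $\Lambda_g$. Note, in particular, that the macroscopic time $T_m^\delta$ depends only on $\Lambda_i$, but not on $\varepsilon$.

The equivalence between the full dynamics and the effective one-particle dynamics on the isolated energy band extends to the level of semiclassical or macroscopic observables. We consider classical symbols $a \in C^\infty(\mathbb{R}^d \times M)$ such that for all multiindices $\alpha, \beta$ there exists $C_{\alpha,\beta} < \infty$ with
\[ \sup_{x,p} |\partial_x^\alpha \partial_p^\beta a(x,p)| \le C_{\alpha,\beta}. \]
We use the notation of [3] and denote this set of symbols by $S_0^0(1)$. For $a \in S_0^0(1)$ Weyl quantization leads to an operator $a^{W,\varepsilon} \in \mathcal{L}(L^2(M))$ given by
\[ (a^{W,\varepsilon} \phi)(p) = (2\pi)^{-d} \int dx\, dk\; a\Big( \varepsilon x, \frac{p + k}{2} \Big)\, e^{-i(p-k)\cdot x}\, \phi(k), \tag{24} \]
with operator norm that is bounded uniformly in $\varepsilon$ (cf. Sec. 4 for details). For $a \in S_0^0(1)$ let
\[ a^\varepsilon(t) = U(-t/\varepsilon)\, (a^{W,\varepsilon} \otimes 1)\, U(t/\varepsilon) \tag{25} \]
and
\[ a_1^\varepsilon(t) = U_1(-t/\varepsilon)\, a^{W,\varepsilon}\, U_1(t/\varepsilon). \tag{26} \]

Theorem 2.3. Let $\Lambda_i$ be compact with $\Lambda_i + \delta \subset \Lambda_g$ for some $\delta > 0$ and $a \in S_0^0(1)$ such that
\[ \int d\xi\; \sup_{p \in M} |\xi|\, |\hat a^{(1)}(\xi, p)| < \infty, \]
where $\hat{\ }^{(1)}$ denotes Fourier transformation in the first argument. Then for every $T < \infty$ with $T \le T_m^\delta(\Lambda_i, \Lambda_g)$ there is a constant $C < \infty$ such that for all $t \in [0,T]$ and $\psi \in \mathcal{H}$
\[ \big| \langle P_i \psi, (a^\varepsilon(t) - \mathcal{U}^* a_1^\varepsilon(t)\, \mathcal{U})\, P_i \psi \rangle \big| = \big| \langle P_i \psi, a^\varepsilon(t)\, P_i \psi \rangle - \langle \phi, a_1^\varepsilon(t)\, \phi \rangle \big| \le \varepsilon\, C\, \| P_i \psi \|_{\mathcal{H}}^2, \tag{27} \]
where $\phi = \mathcal{U} P_i \psi$.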
Thus one can compute expectations of semiclassical observables in the full system approximately by computing the expectations of the corresponding observable in the effective one-particle theory. The advantage is that powerful methods of semiclassical approximation like the Egorov theorem (cf. Proposition 3.4 in Sec. 3) can be applied to $a_1^\varepsilon(t)$, but not to $a^\varepsilon(t)$. A detailed discussion with important examples is postponed to Sec. 6.

2.3. The massive Nelson and Pauli–Fierz model

It is of interest to see whether our abstract assumptions are in fact satisfied for physically at least semi-realistic models. The best studied case is the Nelson model [13], where the coupling is to the position of the particle and the Maxwell field is replaced by a scalar field. Switching immediately to the total momentum representation, cf. (8), the Nelson Hamiltonian reads
\[ H_N(p) = \frac{1}{2} (p - p_f)^2 + H_f + \int dk\; \frac{\hat\rho(k)}{\sqrt{2\omega(k)}}\, \big( a(k) + a^*(-k) \big), \tag{28} \]
where instead of (1), (5), $a(k)$, $a^*(k)$ is a one-component Bose field over $\mathbb{R}^d$. Again, $p$ is the total momentum and regarded as a parameter. If $\int dk\, |\hat\rho|^2 (\omega^{-3} + \omega^{-1}) < \infty$, then $H_N(p)$ is bounded from below and self-adjoint with domain $D(H_f)$. $\hat\rho(k)$ can be made real through an appropriate canonical transformation $a(k) \to e^{i\theta(k)} a(k)$. With such a choice $H_N$ is a real operator.

According to a result of Fröhlich [5], if $\omega(k) = (m_{\mathrm{ph}}^2 + k^2)^{1/2}$, $m_{\mathrm{ph}} > 0$, $H_N$ has an isolated, nondegenerate ground state band for $|p| < p_c$ with $p_c \ge \sqrt{3} - 1$. If in (28) one replaces the electron dispersion by $E_0(p) = (m_{\mathrm{el}}^2 + p^2)^{1/2}$, $m_{\mathrm{el}} > 0$, still keeping $\omega(k) = (m_{\mathrm{ph}}^2 + k^2)^{1/2}$, $m_{\mathrm{ph}} > 0$, then at zero coupling the ground state lies strictly below the continuum edge for all $p$, which means $p_c = \infty$ for $\hat\rho = 0$. As proved in [5], $p_c = \infty$ persists to arbitrary coupling strength. A larger class of bosonic dispersion relations is studied in [16].

For the particular case $\omega(k) = \omega_0 > 0$ and dimensions $d = 1, 2$ one has $p_c = \infty$, whereas for $d = 3$ and small coupling indeed $p_c < \infty$ [12].

For the Pauli–Fierz model (8) with $\omega(k) = (k^2 + m_{\mathrm{ph}}^2)^{1/2}$, $m_{\mathrm{ph}} > 0$, the existence of the ground state is proved as in [5] for $|p| < p_c$ with $p_c \ge \sqrt{3} - 1$. For the uniqueness no general argument is available, in contrast to the Nelson model. One route is to estimate the overlap of any ground state with the Fock vacuum (we are grateful to V. Bach for this remark). Thereby uniqueness is ensured provided
\[ 4E(p)\, e^2 \int dk\; |\hat\rho(k)|^2\, \omega(k)^{-1}\, \big( E(p - k) - E(p) + \omega(k) \big)^{-2} < 1. \tag{29} \]
$\Delta(p) = \inf\{ E(p - k) - E(p) + \omega(k),\; k \in \mathbb{R}^d \}$ is the gap between $E(p)$ and the continuum edge. Thus $\Delta(p) > 0$ for $|p| < p_c$ by definition. The condition (29) can be made more explicit through suitable estimates on $E(p)$.

We conclude that all assumptions in Theorem 2.2 are satisfied provided $m_{\mathrm{ph}} > 0$, with no further restrictions in the case of the Nelson model and the implicit restriction (29) on the coupling strength $e$ in the case of the Pauli–Fierz model.
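The termination of the band can be illustrated numerically at zero coupling, where the band is simply the free electron dispersion. The sketch below evaluates the infimum defining $\Delta(p)$ on a grid, assuming $E(p) = p^2/(2 m_{\mathrm{el}})$ and $\omega(k) = (m_{\mathrm{ph}}^2 + k^2)^{1/2}$. The parameter values, the grid, and the function name are choices made here for illustration. The search is restricted to $k$ parallel to $p$, which suffices because for fixed $|k|$ the quadratic dispersion is smallest in that direction.

```python
import math

def gap(p, m_el=1.0, m_ph=0.01, k_max=5.0, n=20001):
    """Grid estimate of inf_k [E(p - k) - E(p) + omega(k)] for E(p) = p**2 / (2 * m_el).

    Only k parallel to p is scanned, so p and k are treated as signed scalars.
    """
    best = math.inf
    for j in range(n):
        k = -k_max + 2.0 * k_max * j / (n - 1)
        omega = math.sqrt(m_ph ** 2 + k ** 2)
        value = ((p - k) ** 2 - p ** 2) / (2.0 * m_el) + omega
        best = min(best, value)
    return best

for p in (0.0, 0.5, 1.5):
    print(p, gap(p))
```

For momenta below $m_{\mathrm{el}}$ the estimate stays positive, so the eigenvalue lies below the continuum edge, whereas for $|p|$ well above $m_{\mathrm{el}}$ it becomes negative, which corresponds to an eigenvalue embedded in the continuum.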
3. Preliminaries from Semiclassical Analysis

Although Theorem 2.2 compares a full quantum evolution to an effective quantum evolution independently of any semiclassical behavior of $H_1$, we need to control the leaking through the boundary of $\Lambda_g$ as expressed in Lemma 2.1. To prove Lemma 2.1 we use some tools from semiclassical analysis, which are introduced and applied to the one-particle Hamiltonian $H_1$ in this section. Furthermore, in Sec. 6 we will consider semiclassical distributions based on the notions and results discussed here.

Since $E$ and $V$ are both smooth bounded functions, we can apply standard results of semiclassics to the Hamiltonian $H_1(p, i\varepsilon\nabla_p) = E(p) + V(i\varepsilon\nabla_p)$ acting on $L^2(M)$, where the roles of momentum and position are exchanged and the role of $\hbar$ is taken by $\varepsilon$. In the following we will simply ignore this difference and leave the necessary changes as compared to the standard case to the reader. Note, however, that the change in sign, $i\nabla$ instead of $-i\nabla$, is canceled by the fact that $p' = q$ and $q' = -p$ is the canonical transformation interchanging $q$ and $p$.

We will consider classical symbols $a \in S_0^0(1)$. As was stated in Sec. 2, Weyl quantization of functions in $S_0^0(1)$ leads to bounded operators. The following result sharpens this statement.

Proposition 3.1 (Calderón–Vaillancourt). There is a $C < \infty$ and a finite $n \in \mathbb{N}$ such that for all $a \in S_0^0(1)$ and $\varepsilon \in [0,1]$
\[ \| a^{W,\varepsilon} \|_{\mathcal{L}(L^2(M))} \le C \sup_{|\alpha| \le n,\, |\beta| \le n,\, (x,p) \in \mathbb{R}^d \times M} |\partial_x^\alpha \partial_p^\beta a(x,p)|. \tag{30} \]
For a proof see, e.g., [3, Theorem 7.11]. The statement there is slightly different, but their proof implies our Proposition 3.1. Note, in particular, that the constants $n$ and $C$ in Proposition 3.1 depend on the dimension $d$ of configuration space.

The so called product rule, presented in the following Proposition 3.2, is at the basis of all semiclassical analysis we will apply. For a proof see [3, Proposition 7.7 and Theorem 7.9].

Proposition 3.2 (Product rule). Let $a, b \in S_0^0(1)$. Then for all $n \in \mathbb{N}_0$ there is a $d_n < \infty$ such that
\[ \Big\| a^{W,\varepsilon}\, b^{W,\varepsilon} - \Big( \sum_{k=0}^{n} \varepsilon^k c_k \Big)^{W,\varepsilon} \Big\|_{\mathcal{L}(L^2(M))} \le d_n\, \varepsilon^{n+1}, \tag{31} \]
with
\[ c_k(x,p) = \Big( \frac{i}{2} \Big)^k \sum_{|\alpha| + |\beta| = k} \frac{(-1)^{|\beta|}}{|\alpha|!\, |\beta|!}\, \big( (\partial_x^\beta \partial_p^\alpha a)\, (\partial_p^\beta \partial_x^\alpha b) \big)(x,p). \tag{32} \]
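It may help to write out the lowest orders of (32) explicitly. The block below is a sketch that takes (32) exactly as printed; the notation $c_k(a,b)$, which makes the dependence on the pair of symbols explicit, is introduced here for illustration only.

```latex
% Lowest-order terms of (32), and the parity of c_k under exchange of a and b.
\begin{align*}
c_0(a,b) &= a\,b, \\
c_1(a,b) &= \frac{i}{2} \sum_{j=1}^{d}
      \big( \partial_{p_j} a\; \partial_{x_j} b - \partial_{x_j} a\; \partial_{p_j} b \big), \\
c_k(b,a) &= (-1)^k\, c_k(a,b)
  \qquad \text{(exchange } \alpha \leftrightarrow \beta \text{ in (32))}, \\
a^{W,\varepsilon} b^{W,\varepsilon} - b^{W,\varepsilon} a^{W,\varepsilon}
  &= \big( 2\varepsilon\, c_1(a,b) \big)^{W,\varepsilon} + O(\varepsilon^3).
\end{align*}
```

The last line uses (31) with $n = 2$: the even terms $c_0$ and $c_2$ cancel in the commutator, which is why a commutator divided by $\varepsilon$ agrees with the quantization of a first-order bracket up to an error of order $\varepsilon^2$ rather than $\varepsilon$.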
We state two simple facts that are immediate consequences of the product rule as
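Corollary 3.3. Let $a, b \in S_0^0(1)$. Then

(i) $\| a^{W,\varepsilon} b^{W,\varepsilon} - (ab)^{W,\varepsilon} \|_{\mathcal{L}(L^2(M))} = O(\varepsilon)$.

(ii) If, in addition, $\operatorname{supp}(a) \cap \operatorname{supp}(b) = \emptyset$, then for any $n \in \mathbb{N}$: $\| a^{W,\varepsilon} b^{W,\varepsilon} \|_{\mathcal{L}(L^2(M))} = O(\varepsilon^n)$.

Proof. (i) is just Proposition 3.2 for $n = 0$. (ii) holds since in this case $c_k = 0$ in (31) for all $k \in \mathbb{N}_0$.

Here and in the following $O(\varepsilon^n)$ means that an expression, or its appropriate norm, is bounded by a constant times $\varepsilon^n$ for sufficiently small $\varepsilon$.

The crucial ingredient to our semiclassical analysis is the following first-order version of a theorem going back to Egorov [4] (cf. also [15, Théorème IV-10]), which is also a direct consequence of the product rule.

Proposition 3.4 (Egorov's Theorem). Let $a \in S_0^0(1)$ and $0 \le T < \infty$. There is a $C < \infty$ such that for all $t \in [-T, T]$
\[ \Big\| U_1\Big( -\frac{t}{\varepsilon} \Big)\, a^{W,\varepsilon}\, U_1\Big( \frac{t}{\varepsilon} \Big) - \big( a \circ \Phi^t \big)^{W,\varepsilon} \Big\|_{\mathcal{L}(L^2(M))} \le C \varepsilon^2. \]

Proof. First note that for $a \in S_0^0(1)$ we have that $a \circ \Phi^t \in S_0^0(1)$ for all finite $t$, since the Hamiltonian vector field is smooth, uniformly bounded and all its partial derivatives are uniformly bounded. Moreover, $a \circ \Phi^t$ and all its partial derivatives are each bounded uniformly for $t \in [-T, T]$. Therefore $\| (a \circ \Phi^t)^{W,\varepsilon} \|_{\mathcal{L}(L^2(M))}$ is bounded uniformly for $t \in [-T, T]$ by Proposition 3.1. Writing
\[ U_1\Big( -\frac{t}{\varepsilon} \Big)\, a^{W,\varepsilon}\, U_1\Big( \frac{t}{\varepsilon} \Big) - (a \circ \Phi^t)^{W,\varepsilon} = \int_0^t ds\; \frac{d}{ds} \Big[ U_1\Big( -\frac{s}{\varepsilon} \Big)\, (a \circ \Phi^{t-s})^{W,\varepsilon}\, U_1\Big( \frac{s}{\varepsilon} \Big) \Big], \tag{33} \]
one is led to consider
\[ \frac{d}{ds} \Big[ U_1\Big( -\frac{s}{\varepsilon} \Big)\, (a \circ \Phi^{t-s})^{W,\varepsilon}\, U_1\Big( \frac{s}{\varepsilon} \Big) \Big] = U_1\Big( -\frac{s}{\varepsilon} \Big) \Big( \frac{i}{\varepsilon} \big[ H_{\mathrm{cl}}^{W,\varepsilon}, (a \circ \Phi^{t-s})^{W,\varepsilon} \big] - \big\{ H_{\mathrm{cl}}, a \circ \Phi^{t-s} \big\}^{W,\varepsilon} \Big)\, U_1\Big( \frac{s}{\varepsilon} \Big), \tag{34} \]
where $\{\cdot,\cdot\}$ denotes the Poisson bracket and we used that $H_1 = H_{\mathrm{cl}}^{W,\varepsilon}$. From the product rule one computes that for arbitrary $a, b \in S_0^0(1)$
\[ \frac{i}{\varepsilon} [a^{W,\varepsilon}, b^{W,\varepsilon}] = \frac{i}{\varepsilon} \big( a^{W,\varepsilon} b^{W,\varepsilon} - b^{W,\varepsilon} a^{W,\varepsilon} \big) = \{a, b\}^{W,\varepsilon} + O(\varepsilon^2), \tag{35} \]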
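which implies that (34) is $O(\varepsilon^2)$ for fixed $t - s$. However, one can easily convince oneself that this bound is uniform for $t - s$ in any bounded interval, since the seminorm used in Proposition 3.1 is bounded uniformly for the $c_2$-terms appearing in a derivation of (35) using the product rule.

To have a natural way for extending functions of $x$ or $p$ alone to functions on phase space, we introduce the projections $\pi_x : \mathbb{R}^d \times M \to \mathbb{R}^d$ and $\pi_p : \mathbb{R}^d \times M \to M$ as $\pi_x(x,p) = x$ and $\pi_p(x,p) = p$. We are now ready to prove Lemma 2.1.

Proof of Lemma 2.1. In order to regularize $\mathbf{1}_{\Lambda_i}$ and $\mathbf{1}_{\Lambda_m}$ we pick some $f_i \in C_0^\infty(M)$ and $f_m \in C^\infty(M)$ such that $f_i|_{\Lambda_i} = 1$, $f_m|_{M \setminus \Lambda_m} = 1$ and $\operatorname{supp}(f_i \circ \Phi_p^{-t}) \cap \operatorname{supp}(f_m \circ \pi_p) = \emptyset$ for all $t \in [0, T_m^\delta]$. Egorov's theorem implies that there is a $C < \infty$ such that for $t \in [0, T_m^\delta]$
\[ \Big\| U_1\Big( \frac{t}{\varepsilon} \Big)\, (f_i \circ \pi_p)^{W,\varepsilon}\, U_1\Big( -\frac{t}{\varepsilon} \Big) - (f_i \circ \Phi_p^{-t})^{W,\varepsilon} \Big\|_{\mathcal{L}(L^2)} \le C \varepsilon^2. \]
Since, by construction, $\operatorname{supp}(f_i \circ \Phi_p^{-t}) \cap \operatorname{supp}(f_m \circ \pi_p) = \emptyset$, we have $(f_m \circ \pi_p)^{W,\varepsilon} (f_i \circ \Phi_p^{-t})^{W,\varepsilon} = O(\varepsilon^2)$ by Corollary 3.3. This and the fact that $P_i = (f_i \circ \pi_p)^{W,\varepsilon} P_i$ and that $(1 - P_m) = (1 - P_m)(f_m \circ \pi_p)^{W,\varepsilon}$ implies (22).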
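Several steps in this section use how the quantization rule (24) acts on symbols that depend on only one of the two variables, for instance the identity $H_1 = H_{\mathrm{cl}}^{W,\varepsilon}$ in the proof of Proposition 3.4. The following formal computation is a sketch of that fact; the symbol $f$ is a generic test function introduced here for illustration.

```latex
% Acting with (24) on symbols that depend on p only, or on x only.
% Both lines use (2\pi)^{-d} \int dx\, e^{i\xi\cdot x} = \delta(\xi).
\begin{align*}
a(x,p) = f(p): \quad
(a^{W,\varepsilon}\phi)(p)
  &= \int dk\; f\big(\tfrac{p+k}{2}\big)\,\delta(p-k)\,\phi(k) = f(p)\,\phi(p), \\
a(x,p) = V(x) = \int dq\; e^{iq\cdot x}\,\hat V(q): \quad
(a^{W,\varepsilon}\phi)(p)
  &= \int dq\; \hat V(q) \int dk\; \delta(\varepsilon q - p + k)\,\phi(k)
   = \int dq\; \hat V(q)\,\phi(p - \varepsilon q).
\end{align*}
```

The first line shows that functions of $p$ are quantized as multiplication operators, and the second reproduces the action of $V^\varepsilon = V(i\varepsilon\nabla_p)$ as a superposition of momentum shifts by $\varepsilon q$. Adding the two cases gives $(E \circ \pi_p + V \circ \pi_x)^{W,\varepsilon} = E(p) + V^\varepsilon$.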
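4. Convergence of the Unitary Groups

We prove Theorem 2.2. Let $\Lambda_i + \delta \subset \Lambda_g$. In the following $T < \infty$ with $T \le T_m^\delta(\Lambda_i, \Lambda_g)$ will be fixed once and for all and we will always assume that $0 \le t \le T$. For reasons that will become clear during the proof we have to introduce further sets in momentum space. Let
\[ \Lambda_m = \bigcup_{t \in [0,T],\, x \in \mathbb{R}^d} \operatorname{supp}\big( \mathbf{1}_{\Lambda_i} \circ \Phi_p^{-t}(x, \cdot) \big) \]
be the subset in momentum space that is reached by the classical dynamics starting from $\Lambda_i$ in our relevant time span. By construction we have that $\Lambda_m + \delta \subseteq \Lambda_g$ and we define $\Lambda_{m1} := \Lambda_m + \delta/4$, $\Lambda_{m2} := \Lambda_m + \delta/2$ and $\Lambda_{m3} := \Lambda_m + 3\delta/4$. The respective projections will be denoted by $P_{m1}$, $P_{m2}$ and $P_{m3}$.

The following lemma ensures that $V^\varepsilon$ maps wave functions supported in $\Lambda_{m1}$ (resp. in $\Lambda_{m2}$, $\Lambda_{m3}$) to wave functions supported in $\Lambda_{m2}$ (resp. in $\Lambda_{m3}$, $\Lambda_g$) up to errors of arbitrary order in $\varepsilon$. We will use this result in the following implicitly many times.

Lemma 4.1. Let $\hat V \in L^1(\mathbb{R}^d)$ such that $\int dk\, |k|^n |\hat V(k)| < \infty$ for all $n \in \mathbb{N}$. Then for each compact $\Lambda \subset \Lambda_g$ and all $m \in \mathbb{N}$ and $\delta > 0$ there is a $C_{\delta,m} < \infty$ and a $C_\delta < \infty$ such that

(i) $\| (V^\varepsilon - \mathbf{1}_{\Lambda+\delta} V^\varepsilon) P_\Lambda \|_{\mathcal{H}} \le C_{\delta,m}\, \varepsilon^m$

and, if $\Lambda + \delta \subset \Lambda_g$,

(ii) $\| (V^\varepsilon - P_{\Lambda+\delta} V^\varepsilon) P_\Lambda \|_{\mathcal{H}} \le C_\delta\, \varepsilon$.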
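Proof. Let $\psi \in \operatorname{Ran} P_\Lambda$, i.e. $\psi = \phi\psi_0$ with $\phi \in L^2(M)$ and $\operatorname{supp}\phi \subset \Lambda$. For (i) it suffices to consider the $L^2(M)$ part and calculate
\[
(V^\varepsilon\phi)(p) = \int dk\,\hat V(k)\phi(p-\varepsilon k)
= \int_{|k|\le\varepsilon^{-1/2}} dk\,\hat V(k)\phi(p-\varepsilon k) + \int_{|k|>\varepsilon^{-1/2}} dk\,\hat V(k)\phi(p-\varepsilon k) . \tag{36}
\]
The first term in (36) is supported in $\Lambda + \delta$ for $\varepsilon$ sufficiently small. For the second term note that
\[
\int_{|k|>\varepsilon^{-1/2}} dk\,\hat V(k)\phi(p-\varepsilon k) = \int_{|k'|>\varepsilon^{1/2}} dk'\,\varepsilon^{-d}\hat V(k'/\varepsilon)\phi(p-k')
\]
amounts to convolution with the function $\mathbf{1}_{|k'|>\varepsilon^{1/2}}\,\varepsilon^{-d}\hat V(k'/\varepsilon)$ and therefore, by Young's inequality, the $\mathcal{L}(L^2)$-norm of the corresponding map is bounded by
\[
\int_{|k'|>\varepsilon^{1/2}} dk'\,\varepsilon^{-d}|\hat V(k'/\varepsilon)| = \int_{|k|>\varepsilon^{-1/2}} dk\,|k|^{-2m}|k|^{2m}|\hat V(k)| \le C_m\,\varepsilon^m .
\]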
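This shows (i). Part (ii) follows from the fact that the off diagonal part of $V^\varepsilon$ is of order $\varepsilon$, as expressed in Lemma 4.4.

We turn to the proof of (23). Using Lemma 2.1 together with $\mathcal{U}\mathcal{U}^* = \mathrm{Id}$ on $\operatorname{Ran} P_{m2}$ we get
\[
\begin{aligned}
\big(U(t/\varepsilon) - \mathcal{U}^*U_1(t/\varepsilon)\,\mathcal{U}\big)P_i
&= -iU(t/\varepsilon)\int_0^{t/\varepsilon} ds\,U(-s)(H\mathcal{U}^* - \mathcal{U}^*H_1)U_1(s)\,\mathcal{U}P_i \\
&= -iU(t/\varepsilon)\int_0^{t/\varepsilon} ds\,U(-s)(H - \mathcal{U}^*H_1\mathcal{U})P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) .
\end{aligned} \tag{37}
\]
For the last equality note also that a factor of order $O(\varepsilon^2)$ in the otherwise uniformly bounded integrand leads to the integral being $O(\varepsilon)$. We would be done by the same argument, if we could show that $H - \mathcal{U}^*H_1\mathcal{U}$ acting on $\operatorname{Ran} P_{m2}$ is $O(\varepsilon^2)$. However, the first order term does not vanish, as we will see, and we have to treat the integral more carefully. In order to separate the leading order term we write
\[
H - \mathcal{U}^*H_1\mathcal{U} = (H - H_{\rm diag}) + (H_{\rm diag} - \mathcal{U}^*H_1\mathcal{U}) , \tag{38}
\]
where $H_{\rm diag} = H_0 + P_gV^\varepsilon P_g + P_g^\perp V^\varepsilon P_g^\perp$ and $P_g^\perp := 1_{\mathcal{H}} - P_g$.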
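We will treat the easy part first and show in Lemma 4.2 that the difference $H_{\rm diag} - \mathcal{U}^*H_1\mathcal{U}$ vanishes sufficiently fast on $\operatorname{Ran} P_{m2}$.

Lemma 4.2. For $\varepsilon$ sufficiently small there is a $C < \infty$ such that
\[
\|(H_{\rm diag} - \mathcal{U}^*H_1\mathcal{U})P_{m2}\| \le C\varepsilon^2 . \tag{39}
\]

Proof. By definition we have that $P_g^\perp H_{\rm diag}P_{m2} = P_g^\perp\,\mathcal{U}^*H_1\mathcal{U}P_{m2} = 0$. Hence it suffices to consider the difference projected onto $\operatorname{Ran} P_g$. On $\operatorname{Ran} P_{m2}$ we have
\[
P_g(H_{\rm diag}\phi\psi_0)(p) = E(p)\phi(p)\psi_0(p) + \mathbf{1}_{\Lambda_g}(p)\int dk\,\hat V(k)\phi(p-\varepsilon k)\,\langle\psi_0(p),\psi_0(p-\varepsilon k)\rangle_{\mathcal{H}_f}\,\psi_0(p)
\]
and
\[
(\mathcal{U}^*H_1\mathcal{U}\phi\psi_0)(p) = E(p)\phi(p)\psi_0(p) + \mathbf{1}_{\Lambda_g}(p)\int dk\,\hat V(k)\phi(p-\varepsilon k)\,\psi_0(p) .
\]
Hence
\[
P_g(H_{\rm diag} - \mathcal{U}^*H_1\mathcal{U})(\phi\psi_0)(p) = \mathbf{1}_{\Lambda_g}(p)\int dk\,\hat V(k)\phi(p-\varepsilon k)\,\big(\langle\psi_0(p),\psi_0(p-\varepsilon k)\rangle_{\mathcal{H}_f} - 1\big)\,\psi_0(p) . \tag{40}
\]
We will show that there is a constant $C$ such that
\[
|\langle\psi_0(p),\psi_0(p-\varepsilon k)\rangle_{\mathcal{H}_f} - 1| \le C|k|^2\varepsilon^2 \tag{41}
\]
for all $p \in \Lambda_g$ and $k$ with $p - \varepsilon k \in \Lambda_g$. Therefore
\[
\begin{aligned}
\|(H_{\rm diag} - \mathcal{U}^*H_1\mathcal{U})\phi\psi_0\|_{\mathcal{H}}
&\le C\varepsilon^2\int dk\,|\hat V(k)|\,|k|^2\,\|\phi(\cdot-\varepsilon k)\psi_0(\cdot)\|_{\mathcal{H}} \\
&\le C\varepsilon^2\,\|\phi\|_{L^2(M)}\int dk\,|\hat V(k)|\,|k|^2 \\
&= C\varepsilon^2\,\|\phi\psi_0\|_{\mathcal{H}}\int dk\,|\hat V(k)|\,|k|^2 .
\end{aligned}
\]
To see (41), note that for $\psi_0(\cdot) \in C^2(\Lambda_g, \mathcal{H}_f)$ Taylor expansion yields
\[
\psi_0(p-\varepsilon k) = \psi_0(p) - \varepsilon k\cdot\nabla_p\psi_0(p) + \tfrac12\varepsilon^2\,k\cdot(\nabla_p^{(2)}\psi_0)(p')\,k ,
\]
where $\nabla_p^{(2)}\psi_0$ denotes the Hessian and $\tfrac12\varepsilon^2\,k\cdot(\nabla_p^{(2)}\psi_0)(p')\,k$ is the Lagrangian remainder. In view of $\operatorname{Re}\langle\psi_0(p),\nabla_p\psi_0(p)\rangle_{\mathcal{H}_f} = 0$, which follows from differentiating the identity $\langle\psi_0(p),\psi_0(p)\rangle_{\mathcal{H}_f} = 1$, and $\operatorname{Im}\langle\psi_0(p),\nabla_p\psi_0(p)\rangle_{\mathcal{H}_f} = 0$, which holds by assumption, we obtain
\[
|\langle\psi_0(p),\psi_0(p-\varepsilon k)\rangle_{\mathcal{H}_f} - 1| \le C(p)\,|k|^2\varepsilon^2 .
\]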
Here $C(p) = \tfrac12\sum_{i,j}|\langle\psi_0(p'),\partial_{p_i}\partial_{p_j}\psi_0(p')\rangle|$, which is bounded uniformly for $p \in \Lambda_g$ by continuity.
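The quadratic bound (41) can be illustrated on a toy example. The following minimal Python sketch assumes a hypothetical two-dimensional fibre space and the real, normalized family $\psi_0(p) = (\cos p, \sin p)$; this is not the Pauli–Fierz ground state, but it satisfies $\langle\psi_0(p),\nabla_p\psi_0(p)\rangle = 0$, which is the property used above. The names `psi0`, `overlap_defect` and `quadratic_bound` are illustrative only.

```python
import math

def psi0(p):
    """Toy normalized 'ground state' in R^2 (an assumption, not the true psi_0)."""
    return (math.cos(p), math.sin(p))

def overlap_defect(p, eps, k):
    """Return |<psi0(p), psi0(p - eps*k)> - 1| for the toy family."""
    a, b = psi0(p), psi0(p - eps * k)
    return abs(a[0] * b[0] + a[1] * b[1] - 1.0)

def quadratic_bound(eps, k, c=0.5):
    """Right-hand side C |k|^2 eps^2 of (41); C = 1/2 suffices for this toy family."""
    return c * k * k * eps * eps

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        for k in (1.0, 2.0):
            print(eps, k, overlap_defect(0.3, eps, k), quadratic_bound(eps, k))
```

For this family the overlap equals $\cos(\varepsilon k)$, so the defect is $1 - \cos(\varepsilon k) \le (\varepsilon k)^2/2$: the first order term drops out, and only the second order remainder survives.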
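With the help of Lemma 4.2 and (37), we have
\[
\big(U(t/\varepsilon) - \mathcal{U}^*U_1(t/\varepsilon)\,\mathcal{U}\big)P_i = -iU(t/\varepsilon)\int_0^{t/\varepsilon} ds\,U(-s)(H - H_{\rm diag})P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) . \tag{42}
\]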
Since $(H - H_{\rm diag})P_{m2}$ is of order $\varepsilon$ and not of order $\varepsilon^2$, the last integral needs to be treated more carefully. Modulo technicalities the strategy is as follows. We isolate the first order term and show that $H - H_{\rm diag} = i\varepsilon[(\nabla V)^\varepsilon\cdot\nabla_pP_0, P_0] + O(\varepsilon^2)$. Then an operator $A$ is constructed such that $[(\nabla V)^\varepsilon\cdot\nabla_pP_0, P_0] = [A, H] + O(\varepsilon)$, i.e. $U(-s)[(\nabla V)^\varepsilon\cdot\nabla_pP_0, P_0]U(s)$ is the time derivative of the bounded operator $U(-s)AU(s)$ modulo terms of order $\varepsilon$. Then partial integration in (42) shows that not only the integrand but also the integral is of order $\varepsilon$.

The strategy just described is clearly related to the proof of the time-adiabatic theorem (cf. [10] and, for a recent overview of the extensive literature, [1]). In the time-adiabatic case, however, one has $H(t) - H_{\rm diag}(t) = i\varepsilon[\dot P(t), P(t)]$ without error term and also $[\dot P(t), P(t)] = [H(t), A(t)]$ can be solved exactly.

Lemma 4.3. $P_0(\cdot) \in C^2(\Lambda_g, \mathcal{L}(\mathcal{H}_f))$ and for $p \in \Lambda_g$ one has
\[
P_0(p)(\nabla_pP_0)(p)P_0(p) = Q_0(p)(\nabla_pP_0)(p)Q_0(p) = 0 , \tag{43}
\]
where $Q_0(p) = 1_{\mathcal{H}_f} - P_0(p)$.

Proof. $P_0(\cdot) \in C^2(\Lambda_g, \mathcal{L}(\mathcal{H}_f))$ follows from the standard argument involving the Riesz formula. For $p \in \Lambda_g$ one can express $P_0(p)$ as a contour integral,
\[
P_0(p) = -\frac{1}{2\pi i}\oint_{c(p)} d\lambda\,R_\lambda(H_0(p)) ,
\]
where $c(p)$ is a smooth curve in the complex plane circling the isolated eigenvalue $E(p)$ only and $R_\lambda(H_0(p)) = (H_0(p) - \lambda)^{-1}$. It is easy to see that one can differentiate twice under the integral given our assumptions on $H_0(\cdot)$. (43) follows from
\[
\nabla_pP_0(p) = \nabla_p(P_0^2(p)) = P_0(p)(\nabla_pP_0)(p) + (\nabla_pP_0)(p)P_0(p) .
\]

Lemma 4.4.
\[
(H - H_{\rm diag})P_{m2} = -\varepsilon\,Q_g(\nabla_pP_g)P_{m3}\cdot F^\varepsilon P_{m2} + O(\varepsilon^2) , \tag{44}
\]
where $(\nabla_pP_g) = \int^\oplus_{\Lambda_g} dp\,\nabla_pP_0(p)$, $Q_g = \int^\oplus_{\Lambda_g} dp\,Q_0(p)$ and
\[
(F^\varepsilon\psi)(p) := -i\int dk\,k\,\hat V(k)\psi(p-\varepsilon k) .
\]
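Proof. Using Lemma 4.1(i) we obtain
\[
(H - H_{\rm diag})P_{m2}\psi = P_g^\perp V^\varepsilon P_{m2}\psi = \int^\oplus_{\Lambda_{m3}} dp\,Q_0(p)\int dk\,\hat V(k)P_0(p-\varepsilon k)(P_{m2}\psi)(p-\varepsilon k) + O(\varepsilon^2) .
\]
Since $P_0 : \Lambda_g \to \mathcal{L}(\mathcal{H}_f)$ is twice continuously differentiable, we have that
\[
P_0(p-\varepsilon k) = P_0(p) - \varepsilon k\cdot(\nabla_pP_0)(p) + \varepsilon^2\,k\cdot(\nabla_p^{(2)}P_0)(p'(p,\varepsilon k))\cdot k , \tag{45}
\]
where the last term is again the Lagrangian remainder. Hence, for $p \in \Lambda_g$,
\[
\begin{aligned}
\int dk\,\hat V(k)P_0(p-\varepsilon k)(P_{m2}\psi)(p-\varepsilon k)
&= \int dk\,\hat V(k)\big(P_0(p) - \varepsilon k\cdot(\nabla_pP_0)(p)\big)(P_{m2}\psi)(p-\varepsilon k) && (46) \\
&\quad + \varepsilon^2\int dk\,\hat V(k)\,k\cdot(\nabla_p^{(2)}P_0)(p'(p,\varepsilon k))\cdot k\,(P_{m2}\psi)(p-\varepsilon k) . && (47)
\end{aligned}
\]
Since
\[
\begin{aligned}
&\Big\|\mathbf{1}_{\Lambda_{m3}}(\cdot)\int dk\,\hat V(k)\,k\cdot(\nabla_p^{(2)}P_0)(p'(\cdot,\varepsilon k))\cdot k\,(P_{m2}\psi)(\cdot-\varepsilon k)\Big\|_{\mathcal{H}} && (48) \\
&\quad\le \int dk\,|\hat V(k)|\,\big\|\mathbf{1}_{\Lambda_{m3}}(\cdot)\,k\cdot(\nabla_p^{(2)}P_0)(p'(\cdot,\varepsilon k))\cdot k\,(P_{m2}\psi)(\cdot-\varepsilon k)\big\|_{\mathcal{H}} && (49) \\
&\quad\le \sup_{p\in\Lambda_{m2}}\|(\nabla_p^{(2)}P_0)(p)\|\int dk\,|\hat V(k)|\,|k|^2\,\|\mathbf{1}_{\Lambda_{m3}}(\cdot)(P_{m2}\psi)(\cdot-\varepsilon k)\|_{\mathcal{H}} \\
&\quad\le C\,\|\psi\|_{\mathcal{H}}\int dk\,|\hat V(k)|\,k^2 ,
\end{aligned}
\]
(47) is $O(\varepsilon^2)$ in $\mathcal{L}(\mathcal{H}_f)$ and multiplying (46) with $Q_0(p)$ from the left establishes (44).

Including Lemma 4.4 we are left with
\[
\big(U(t/\varepsilon) - \mathcal{U}^*U_1(t/\varepsilon)\,\mathcal{U}\big)P_i = i\varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,U(-s)\,Q_g(\nabla_pP_g)P_{m3}\cdot F^\varepsilon P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) .
\]
To exploit the time averaging we write, as explained before, $Q_g(\nabla_pP_g)P_g$ as a time derivative, at least in approximation. Let
\[
B(p) := R_{E(p)}(H_0(p))\,Q_0(p)(\nabla_pP_0(p))P_0(p) .
\]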
Note that $R_{E(p)}(H_0(p))Q_0(p) := (H_0(p) - E(p))^{-1}Q_0(p)$ is bounded since $E(p)$ is an isolated eigenvalue for all $p \in \Lambda_g$ and $Q_0(p)$ projects on the orthogonal complement of the corresponding eigenvector.

Lemma 4.5.
\[
\begin{aligned}
Q_g(\nabla_pP_g)P_{m3} &= [H_0, B]P_{m3} && (50) \\
&= [H, B]P_{m3} + O(\varepsilon) && (51)
\end{aligned}
\]
and
\[
[H, F^\varepsilon]P_{m2} = O(\varepsilon) \tag{52}
\]
in $\mathcal{L}(\mathcal{H})$.

Proof. (50) follows from direct computation:
\[
H_0(p)B(p) - B(p)H_0(p) = (H_0(p) - E(p))R_{E(p)}(H_0(p))Q_0(p)(\nabla_pP_0(p))P_0(p) = Q_0(p)(\nabla_pP_0)(p)P_0(p) .
\]
For (51) we need to show that $[V^\varepsilon, B]P_{m3} = O(\varepsilon)$:
\[
[V^\varepsilon, B]P_{m3} = [V^\varepsilon, R_E(H_0)Q_g(\nabla_pP_g)P_g]P_{m3} .
\]
However, $P_0(\cdot)$, $\nabla_pP_0(\cdot)$ and $R_{E(\cdot)}(H_0(\cdot))Q_0(\cdot)$ are all in $C^1(\Lambda_g, \mathcal{L}(\mathcal{H}_f))$ and thus each of them has a commutator with $V^\varepsilon$ which is of order $\varepsilon$ (cf. proof of Lemma 4.4). This shows (51).

For (52) first observe that $[F^\varepsilon, V^\varepsilon] = 0$. $[H_0, F^\varepsilon] = O(\varepsilon)$ follows again from smoothness of $H_0(p)$ as a function of $p$. But since $H_0$ is not bounded we briefly explain the argument. Using (as always) Lemma 4.1(i) we observe that
\[
\begin{aligned}
([H_0, F^\varepsilon]P_{m3}\psi)(p) &= \mathbf{1}_{\Lambda_g}(p)([H_0, F^\varepsilon]P_{m3}\psi)(p) + O(\varepsilon) \\
&= -i\,\mathbf{1}_{\Lambda_g}(p)\int dk\,\hat V(k)\,k\,\big(H_0(p) - H_0(p-\varepsilon k)\big)(P_{m3}\psi)(p-\varepsilon k) + O(\varepsilon) \\
&= -i\varepsilon\,\mathbf{1}_{\Lambda_g}(p)\int dk\,\hat V(k)\,k\;k\cdot(\nabla_pH_0)(p'(k))\,(P_{m3}\psi)(p-\varepsilon k) + O(\varepsilon) ,
\end{aligned} \tag{53}
\]
where the gradient is evaluated at the appropriate point $p'(k)$. By assumption we have that $(\nabla_pH_0)(p)$ is bounded uniformly for $p \in \Lambda_g$ in the graph norm, which is equivalent to the $\mathcal{H}$-norm on $\operatorname{span}\{\psi_0(p),\,p \in \Lambda_g\}$. Thus the above expression is $O(\varepsilon)$ as $\varepsilon \to 0$ (cf. estimate (48)).

Let $A = B\cdot P_{m3}F^\varepsilon$. Then Lemma 4.5 yields
\[
Q_g(\nabla_pP_g)\cdot P_{m3}F^\varepsilon P_{m2} = [A, H]P_{m2} + O(\varepsilon)
\]
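in $\mathcal{L}(\mathcal{H})$, where $[P_{m3}F^\varepsilon, H]P_{m2} = O(\varepsilon)$ follows immediately from (52). Let $A(t) = U(-t)AU(t)$, then
\[
\begin{aligned}
\big(U(t/\varepsilon) - \mathcal{U}^*U_1(t/\varepsilon)\,\mathcal{U}\big)P_i
&= i\varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,U(-s)[A, H]U(s)\,U(-s)P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) \\
&= -\varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,\Big(\frac{d}{ds}A(s)\Big)U(-s)P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) \\
&= -\varepsilon\,U(t/\varepsilon)\big[A(s)U(-s)P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}\big]_0^{t/\varepsilon}P_i \\
&\quad + \varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,A(s)\,\frac{d}{ds}\big(U(-s)P_{m2}\,\mathcal{U}^*U_1(s)\,\mathcal{U}\big)P_i + O(\varepsilon) \\
&= i\varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,A(s)U(-s)(HP_{m2}\,\mathcal{U}^* - P_{m2}\,\mathcal{U}^*H_1)U_1(s)\,\mathcal{U}P_i + O(\varepsilon) \\
&= i\varepsilon\,U(t/\varepsilon)\int_0^{t/\varepsilon} ds\,A(s)U(-s)(H - \mathcal{U}^*H_1\mathcal{U})P_{m1}\,\mathcal{U}^*U_1(s)\,\mathcal{U}P_i + O(\varepsilon) .
\end{aligned}
\]
For the last equality we used again Lemma 2.1 and the fact that Lemma 4.1(i) guarantees $P_{m2}\,\mathcal{U}^*H_1\mathcal{U}P_{m1} = \mathcal{U}^*H_1\mathcal{U}P_{m1} + O(\varepsilon^2)$. Finally also the last integral is $O(\varepsilon)$, since we showed already that $(H - \mathcal{U}^*H_1\mathcal{U})P_{m1} = O(\varepsilon)$, and the proof of Theorem 2.2 is completed.

5. Convergence of the Macroscopic Observables

In this section we prove Theorem 2.3. We start by showing

Lemma 5.1. Let $a \in S_0^0(1)$ such that
\[
\int d\xi\,\sup_{p\in M}|\xi|\,|\hat a^{(1)}(\xi, p)| < \infty
\]
and $\gamma > 0$. Then there is a constant $C < \infty$ such that
\[
\|(a^{W,\varepsilon}\otimes 1 - \mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U})\,P_{g-\gamma}\| \le C\,\varepsilon , \tag{54}
\]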
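where $P_{g-\gamma}$ projects on dressed electron states supported in $\Lambda_g - \gamma := \{p \in \Lambda_g : \inf_{k\in M\setminus\Lambda_g}|k - p| \ge \gamma\}$.

Proof. For $\phi$ in a dense subset of $L^2(\Lambda_g - \gamma)$ and $p \in \Lambda_g$ we have
\[
\begin{aligned}
((a^{W,\varepsilon}\otimes 1)\phi\psi_0)(p)
&= (2\pi)^{-d}\int dk\,dx\;a\Big(\varepsilon x, \frac{p+k}{2}\Big)e^{-i(p-k)\cdot x}\,\phi(k)\psi_0(k) \\
&= (2\pi)^{-d}\int d\tilde k\,d\tilde x\;a\Big(\tilde x, p + \frac{\varepsilon}{2}\tilde k\Big)e^{i\tilde k\cdot\tilde x}\,\phi(p+\varepsilon\tilde k)\psi_0(p+\varepsilon\tilde k) \\
&= (2\pi)^{-d}\int d\tilde k\;\hat a^{(1)}\Big(-\tilde k, p + \frac{\varepsilon}{2}\tilde k\Big)\phi(p+\varepsilon\tilde k)\psi_0(p+\varepsilon\tilde k) \\
&= (2\pi)^{-d}\int d\tilde k\;\hat a^{(1)}\Big(-\tilde k, p + \frac{\varepsilon}{2}\tilde k\Big)\phi(p+\varepsilon\tilde k)\,\psi_0(p) \\
&\quad + \varepsilon\,(2\pi)^{-d}\int d\tilde k\;\hat a^{(1)}\Big(-\tilde k, p + \frac{\varepsilon}{2}\tilde k\Big)\phi(p+\varepsilon\tilde k)\;\tilde k\cdot(\nabla_p\psi_0)(f(p,\varepsilon\tilde k)) \\
&= (\mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U}\phi\psi_0)(p) + R^\varepsilon .
\end{aligned} \tag{55}
\]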
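In the previous computation we substituted $\tilde x = \varepsilon x$ and $\tilde k = (k - p)/\varepsilon$. In the second to last equality we used Taylor expansion of $\psi_0$, and the mean value theorem guarantees the existence of a point $f(p, \varepsilon\tilde k)$ such that the equality holds. From (55) we conclude that
\[
\|(\mathbf{1}_{\Lambda_g}(\cdot)\otimes 1)\,(a^{W,\varepsilon}\otimes 1 - \mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U})P_{g-\gamma}\| \le \|R^\varepsilon\| . \tag{56}
\]
Since
\[
\|(1 - \mathbf{1}_{\Lambda_g}(\cdot)\otimes 1)\,(a^{W,\varepsilon}\otimes 1 - \mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U})P_{g-\gamma}\| = \|(1 - \mathbf{1}_{\Lambda_g}(\cdot)\otimes 1)\,(a^{W,\varepsilon}\otimes 1)(\mathbf{1}_{\Lambda_g-\gamma}(\cdot)\otimes 1)P_{g-\gamma}\| = O(\varepsilon^n)
\]
for arbitrary $n$ by Corollary 3.3 (ii), Lemma 5.1 follows by showing that $R^\varepsilon$ is of order $\varepsilon$:
\[
\begin{aligned}
\|R^\varepsilon\| &\le \varepsilon(2\pi)^{-d}\,\Big\|\int d\tilde k\;\hat a^{(1)}\Big(-\tilde k, \cdot + \frac{\varepsilon}{2}\tilde k\Big)\phi(\cdot+\varepsilon\tilde k)\;\tilde k\cdot(\nabla_p\psi_0)(f(\cdot,\varepsilon\tilde k))\Big\|_{\mathcal{H}} \\
&\le \varepsilon(2\pi)^{-d}\sup_{p\in M}\|(\nabla_p\psi_0)(p)\|_{\mathcal{H}_f}\,\Big\|\int d\tilde k\;\Big|\hat a^{(1)}\Big(-\tilde k, \cdot + \frac{\varepsilon}{2}\tilde k\Big)\Big|\,|\tilde k|\,|\phi(\cdot+\varepsilon\tilde k)|\Big\|_{L^2(M)} \\
&\le \varepsilon\,C\,\|\phi\|_{L^2(M)}\int d\tilde k\,\sup_{p\in M}|\tilde k|\,|\hat a^{(1)}(\tilde k, p)| \\
&= \varepsilon\,\tilde C\,\|\phi\psi_0\|_{\mathcal{H}} .
\end{aligned}
\]

Lemma 5.1 together with Theorem 2.2 and Lemma 2.1 immediately implies Theorem 2.3:
\[
\begin{aligned}
&|\langle P_i\psi, (a^\varepsilon(t) - \mathcal{U}^*a_1^\varepsilon(t)\,\mathcal{U})P_i\psi\rangle| \\
&\quad= |\langle U(t/\varepsilon)P_i\psi, (a^{W,\varepsilon}\otimes 1)U(t/\varepsilon)P_i\psi\rangle - \langle U_1(t/\varepsilon)\,\mathcal{U}P_i\psi, a^{W,\varepsilon}U_1(t/\varepsilon)\,\mathcal{U}P_i\psi\rangle| \\
&\quad\overset{\text{Thm 2.2}}{=} |\langle U(t/\varepsilon)P_i\psi, (a^{W,\varepsilon}\otimes 1)U(t/\varepsilon)P_i\psi\rangle - \langle U(t/\varepsilon)P_i\psi, \mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U}\,U(t/\varepsilon)P_i\psi\rangle| + O(\varepsilon)\|P_i\psi\|^2 \\
&\quad\overset{\text{Lem 2.1 \& Thm 2.2}}{=} |\langle P_mU(t/\varepsilon)P_i\psi, (a^{W,\varepsilon}\otimes 1 - \mathcal{U}^*a^{W,\varepsilon}\,\mathcal{U})P_mU(t/\varepsilon)P_i\psi\rangle| + O(\varepsilon)\|P_i\psi\|^2 \\
&\quad\overset{\text{Lem 5.1}}{=} O(\varepsilon)\|P_i\psi\|^2 ,
\end{aligned}
\]
where $P_m$ was defined in the previous section and satisfies $P_m = P_{g-\delta}P_m$.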
6. Semiclassical Distributions

Theorem 2.2 establishes that, restricted to the ground state band, the full unitary group $U(t)$ and the approximate one-particle unitary $U_1(t)$ are uniformly close to each other, and Theorem 2.3 lifts this assertion to semiclassical observables. What is measured experimentally are empirical statistics of suitable observables, like position and momentum, and we still have the task to investigate in what sense they are approximated through the time evolution $U_1(t)$ for $\varepsilon \ll 1$. For small $\varepsilon$, $H_1$ itself is a semiclassical Hamiltonian, which means that empirical distributions can be determined through the classical flow $\Phi^t$ generated by $H_1$. Somewhat crudely the scheme is as follows: one chooses an initial wave function $\phi^\varepsilon$, which may or may not depend on $\varepsilon$, such that for small $\varepsilon$ it determines the measure $\rho_{\rm cl}(dx\,dp)$ on phase space. We evolve $\phi^\varepsilon$ as $\phi^\varepsilon_t = e^{-iH_1t/\varepsilon}\phi^\varepsilon$ and $\rho_{\rm cl}$ as $\rho_{\rm cl}(dx\,dp, t) = (\rho_{\rm cl}\circ\Phi^{-t})(dx\,dp)$. Then the empirical distributions computed from $\phi^\varepsilon_t$ agree with those of $\rho_{\rm cl}(t)$ up to errors of order $\varepsilon$, i.e. quantum distributions are well approximated by their classical counterpart. In our context the distributions of physical interest are really determined through $\psi^\varepsilon_t = U(t/\varepsilon)\psi^\varepsilon$ with $\psi^\varepsilon = \phi^\varepsilon\psi_0$ and not through $U_1(t/\varepsilon)\phi^\varepsilon$. Theorem 2.3 asserts that they are still well approximated through the semiclassical evolution corresponding to $H_1$.

There are various "schools" which differ in what initial $\psi$'s are regarded as physically natural.

(i) Wave packet dynamics. The initial wave function is well localized in macroscopic position and momentum, i.e. $\rho_{\rm cl}(dx\,dp) = \delta(x - x_0)\delta(p - p_0)\,dx\,dp$. Then the wave packet follows the classical orbit, which only reflects that $\rho_{\rm cl}(dx\,dp, t) = \delta(x - x_t)\delta(p - p_t)\,dx\,dp$.

(ii) Microscopic wave function independent of $\varepsilon$. On the macroscopic scale the position is localized, but there is momentum spread. Therefore $\rho_{\rm cl}(dx\,dp) = \delta(x)|\hat\phi(p)|^2\,dx\,dp$. Such a choice is appropriate immediately after a scattering event. Then $\phi$ is still localized at the scatterer but has considerable momentum spread.

(iii) WKB. The wave function is taken to be built up from local plane waves, which means it has the form $\phi^\varepsilon(x) = \varepsilon^{d/2}f(\varepsilon x)e^{iS(\varepsilon x)/\varepsilon}$ on the microscopic scale. $\phi^\varepsilon$ is spread over macroscopic distances, but at any given point it has a sharp momentum. $\phi^\varepsilon$ yields the phase space measure $\rho_{\rm cl}(dx\,dp) = |f(x)|^2\delta(p - \nabla S(x))\,dx\,dp$.
A general discussion of semiclassical distributions for evolutions of the type $U_1$, including the three mentioned examples, can be found in [11]. However, since the approximation from $U(t/\varepsilon)$ to $U_1(t/\varepsilon)$, covered by Theorems 2.2 and 2.3, holds uniformly on $P_g\mathcal{H}$, we would like to add uniform results also for the semiclassical analysis of $U_1$. This shows that the rate at which the distributions become classical is independent of the microscopic details of the initial wave function. On the other hand, the rate does depend on the type of scaling that is used.

The results from the following subsections can be immediately translated to the language of Wigner measures. For $\psi \in L^2(\mathbb{R}^d)\otimes\mathcal{H}_f$ let the reduced Wigner distribution be defined by
\[
W^\varepsilon_{\rm rd}(\psi)(dx\,dp, t) = \int d\xi\,\varepsilon^{-d}\,\langle\hat\psi^*_{x_0}(x/\varepsilon - \xi/2, t), \hat\psi_{x_0}(x/\varepsilon + \xi/2, t)\rangle_{\mathcal{H}_f}\,e^{ip\cdot\xi}\,dx\,dp ,
\]
where here the hat stands for Fourier transformation in the first argument, i.e. for $\mathcal{F}\otimes 1$. Then the results translate into
\[
\lim_{\varepsilon\to 0}W^\varepsilon_{\rm rd}(\psi)(dx\,dp, t) = \rho_{\rm cl}(dx\,dp, t) := \lim_{\varepsilon\to 0}W^\varepsilon_{\rm rd}(\psi)(t = 0)\circ\Phi^{-t}(dx\,dp)
\]
weakly as measures. But, in addition, we get that the previous limit holds uniformly in $\psi$ if evaluated pointwise on test functions.

6.1. Wave packets following a classical trajectory

The conceptually simplest way for a quantum particle to behave classically is to have a well localized wave function that follows a classical trajectory. Hence we consider initial wave functions with sharply peaked momentum and, at the same time, sharply peaked macroscopic position. Let the initial wave function be
\[
\phi_{x_0,p_0}(x) = \varepsilon^{d/4}e^{ix\cdot p_0}\,\phi(\sqrt{\varepsilon}(x - x_0/\varepsilon))
\]
for some $\phi \in L^2(\mathbb{R}^d)$, i.e., a wave function that is peaked on the macroscopic scale and centered at $x_0$, but spread out on the microscopic scale. Its Fourier transform is given by
\[
\hat\phi_{x_0,p_0}(p) = \varepsilon^{-d/4}e^{-i\frac{x_0\cdot(p-p_0)}{\varepsilon}}\,\hat\phi\Big(\frac{p - p_0}{\sqrt{\varepsilon}}\Big) ,
\]
which becomes sharply peaked at $p_0$ for $\varepsilon$ small. There is no difficulty in including also asymmetric scaling with weights $\varepsilon^{1-\alpha}$ and $\varepsilon^\alpha$, $0 < \alpha < 1$; the choice $\alpha = 1/2$ was made to simplify the presentation. Under the time evolution generated by $H_1 = E(-i\nabla_x) + V(\varepsilon x)$ the wave packet moves along the corresponding classical trajectory starting at $(x_0, p_0)$, following the classical flow $\Phi^t$ generated by
\[
H_{\rm cl} = E(p) + V(x) .
\]
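The classical flow $\Phi^t$ of $H_{\rm cl} = E(p) + V(x)$ is easy to approximate numerically. The following is a minimal one-dimensional sketch, not part of the paper's argument: it integrates Hamilton's equations $\dot x = E'(p)$, $\dot p = -V'(x)$ with a standard kick-drift-kick splitting. The function names and the concrete choices $E(p) = p^2/2$ and $V(x) = x^2/2$ are assumptions made for illustration; the true $E(p)$ is the ground state band energy.

```python
import math

def leapfrog_flow(x0, p0, t, grad_E, grad_V, steps=10000):
    """Approximate Phi^t(x0, p0) for H_cl(x, p) = E(p) + V(x) in one dimension.

    Hamilton's equations: dx/dt = E'(p), dp/dt = -V'(x).
    Uses a second-order symplectic splitting (kick-drift-kick).
    """
    dt = t / steps
    x, p = x0, p0
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(x)   # half kick from the potential
        x += dt * grad_E(p)         # full drift with velocity E'(p)
        p -= 0.5 * dt * grad_V(x)   # second half kick
    return x, p

# Illustrative choices (assumptions): E(p) = p^2/2 and V(x) = x^2/2.
def grad_E(p):
    return p

def grad_V(x):
    return x

if __name__ == "__main__":
    x, p = leapfrog_flow(1.0, 0.0, 1.0, grad_E, grad_V)
    print(x, p)  # close to (cos 1, -sin 1) for this harmonic example
```

For these choices the exact flow is a rotation in phase space, $x(t) = x_0\cos t + p_0\sin t$ and $p(t) = p_0\cos t - x_0\sin t$, which gives a simple check of the integrator.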
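To be consistent with the previous chapters, we continue to work in momentum representation.

Proposition 6.1. Let $a \in S_0^0(1)$ and $T < \infty$. Then there is a $C < \infty$ such that for $\phi \in L^2(\mathbb{R}^d)$ with $\phi, |x|\phi, \hat\phi, |p|\hat\phi \in L^1(\mathbb{R}^d)$ and $t \in [0, T]$
\[
|\langle\hat\phi_{x_0,p_0}, a_1^\varepsilon(t)\,\hat\phi_{x_0,p_0}\rangle - (a\circ\Phi^t)(x_0, p_0)| \le C\sqrt{\varepsilon}\,\big(\|\phi\|_{L^1}\||p|\hat\phi\|_{L^1} + \||x|\phi\|_{L^1}\|\hat\phi\|_{L^1}\big) . \tag{57}
\]

Using Theorem 2.3, this translates immediately to the full dynamics.

Corollary 6.2. Let the assumptions of Theorem 2.3 be satisfied. Then there is a $C < \infty$ such that for $\psi_{x_0,p_0} = \phi_{x_0,p_0}\psi_0 \in \operatorname{Ran} P_i$ with $\phi, |x|\phi, \hat\phi, |p|\hat\phi \in L^1(\mathbb{R}^d)$ and $t \in [0, T]$
\[
|\langle\psi_{x_0,p_0}, a^\varepsilon(t)\,\psi_{x_0,p_0}\rangle - (a\circ\Phi^t)(x_0, p_0)| \le C\sqrt{\varepsilon}\,\big(\|\phi\|_{L^1}\||p|\hat\phi\|_{L^1} + \||x|\phi\|_{L^1}\|\hat\phi\|_{L^1} + \|\phi\|^2_{L^2}\big) .
\]

Hence, when initially, on the macroscopic scale, position and momentum are both sharply defined, the wave packet follows the classical orbit without spreading even for macroscopic times. Such a situation occurs for example in particle accelerators, where one can indeed calculate the particle trajectories based solely on classical dynamics in good approximation.

Proof of Proposition 6.1. Referring to Egorov's theorem we have to compute
\[
\langle\hat\phi_{x_0,p_0}, (a\circ\Phi^t)^{W,\varepsilon}\,\hat\phi_{x_0,p_0}\rangle .
\]
One can replace $(a\circ\Phi^t)^{W,\varepsilon}$ by the so called standard quantization of $a\circ\Phi^t$ defined by
\[
((a\circ\Phi^t)^{S,\varepsilon}\hat\phi)(p) := (2\pi)^{-d/2}\int dx\,(a\circ\Phi^t)(\varepsilon x, p)\,e^{-ip\cdot x}\phi(x) , \tag{58}
\]
where the error is of order $\varepsilon$ uniformly in $t \in [0, T]$ (cf. [3, Chap. 7]). For the standard quantization the result becomes a simple calculation:
\[
\begin{aligned}
\langle\hat\phi_{x_0,p_0}, (a\circ\Phi^t)^{S,\varepsilon}\,\hat\phi_{x_0,p_0}\rangle
&= (2\pi)^{-d/2}\int dx\,dp\;e^{i\frac{x_0\cdot(p-p_0)}{\varepsilon}}\,\hat\phi^*\Big(\frac{p-p_0}{\sqrt{\varepsilon}}\Big)(a\circ\Phi^t)(\varepsilon x, p)\,e^{-ip\cdot x}e^{ix\cdot p_0}\,\phi\Big(\sqrt{\varepsilon}\Big(x - \frac{x_0}{\varepsilon}\Big)\Big) \\
&= (2\pi)^{-d/2}\int dx\,dp\;\hat\phi^*\Big(\frac{p}{\sqrt{\varepsilon}}\Big)(a\circ\Phi^t)(\varepsilon x + x_0, p + p_0)\,e^{-ip\cdot x}\,\phi(\sqrt{\varepsilon}\,x) \\
&= (2\pi)^{-d/2}\int d\bar x\,d\bar p\;\hat\phi^*(\bar p)\,(a\circ\Phi^t)(\sqrt{\varepsilon}\,\bar x + x_0, \sqrt{\varepsilon}\,\bar p + p_0)\,e^{-i\bar p\cdot\bar x}\,\phi(\bar x) \\
&= (a\circ\Phi^t)(x_0, p_0) + R ,
\end{aligned}
\]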
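where
\[
|R| \le \sqrt{\varepsilon}\,\big(\|\phi\|_{L^1}\||p|\hat\phi\|_{L^1}\|\nabla_p(a\circ\Phi^t)\|_\infty + \||x|\phi\|_{L^1}\|\hat\phi\|_{L^1}\|\nabla_x(a\circ\Phi^t)\|_\infty\big) .
\]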
The last inequality follows from a Taylor expansion of $a\circ\Phi^t$ around $(x_0, p_0)$.

6.2. Initial wave function with momentum spread

Generally wave functions are not of the special form described in the previous subsection. Typically they arise from microscopic interactions and thus "live" on the microscopic scale and do not depend on $\varepsilon$. But if the shape of the initial wave function does not depend on $\varepsilon$ it will effectively look like a delta function on the macroscopic scale at $t = 0$. More precisely, let $\phi \in L^2(\mathbb{R}^d)$, then the scaled position distribution is $\varepsilon^{-d}|\phi(x/\varepsilon)|^2$, which converges to $\delta(x)$ as a measure. However, the scaled momentum distribution is still $|\hat\phi(p)|^2$, since in the quotient $x/t$ the $\varepsilon$'s cancel, and thus the initially peaked wave function will spread if evolved to times of order $\varepsilon^{-1}$.

More generally we consider as initial wave function
\[
\phi_{x_0}(x) = \phi(x - x_0/\varepsilon) ,
\]
i.e. we move $\phi$ to the macroscopic initial position $x_0$. Then it is natural to choose
\[
\rho_{\rm cl}(dx\,dp) = \delta(x - x_0)\,|\hat\phi_{x_0}(p)|^2\,dx\,dp
\]
as the corresponding classical phase space distribution at $t = 0$ and evolve it according to the classical flow to
\[
\rho_{\rm cl}(dx\,dp, t) = (\rho_{\rm cl}\circ\Phi^{-t})(dx\,dp) .
\]
Let us now, as the simplest example, compare the quantum mechanical position distribution
\[
\rho^\varepsilon(dx, t) = \varepsilon^{-d}|\phi_t(x/\varepsilon)|^2\,dx , \qquad \phi_t = e^{-iH_1t/\varepsilon}\phi_{x_0} ,
\]
with the classical one,
\[
\rho^x_{\rm cl}(dx, t) = \int_p \rho_{\rm cl}(dx\,dp, t) ,
\]
in the limit $\varepsilon \to 0$. As a first step we calculate, with $f \in C_0^\infty$ a test function,
\[
\int \rho^\varepsilon(dx, t)\,f(x) = \langle\phi^\varepsilon_t, f\,\phi^\varepsilon_t\rangle = \langle\phi_{x_0}, f(x^\varepsilon(t))\,\phi_{x_0}\rangle = \langle\hat\phi_{x_0}, (f\circ\Phi^t_x)^{W,\varepsilon}\,\hat\phi_{x_0}\rangle + O(\varepsilon^2) . \tag{59}
\]
For the last equality we used Egorov’s Theorem. To proceed we need to know how the Weyl quantization of a time evolved classical observable acts on microscopic wave functions. The following proposition follows from a calculation analogous to the one for Proposition 6.1.
Proposition 6.3. Let $a \in S_0^0(1)$. Then for each $T < \infty$ there is a $C < \infty$ such that for $\phi \in L^2(\mathbb{R}^d)$ with $|x|\phi, \hat\phi \in L^1(\mathbb{R}^d)$ and $t \in [0, T]$
\[
\Big|\langle\hat\phi_{x_0}, a_1^\varepsilon(t)\,\hat\phi_{x_0}\rangle - \int dp\,|\hat\phi_{x_0}(p)|^2\,(a\circ\Phi^t)(x_0, p)\Big| \le C\,\varepsilon\,\||x|\phi\|_{L^1}\|\hat\phi\|_{L^1} . \tag{60}
\]

Thus (59) becomes
\[
\langle\hat\phi_{x_0}, (f\circ\Phi^t_x)^{W,\varepsilon}\,\hat\phi_{x_0}\rangle = \int dp\,|\hat\phi_{x_0}(p)|^2\,(f\circ\Phi^t_x)(x_0, p) + O(\varepsilon) . \tag{61}
\]
However, the right hand side of (61) is exactly what we were aiming for:
\[
\begin{aligned}
\int dp\,|\hat\phi_{x_0}(p)|^2\,(f\circ\Phi^t_x)(x_0, p) &= \int \rho_{\rm cl}(dx\,dp, 0)\,(f\circ\Phi^t_x)(x, p) \\
&= \int \rho_{\rm cl}(dx\,dp, t)\,f(x) \\
&= \int \rho^x_{\rm cl}(dx, t)\,f(x) .
\end{aligned}
\]
In summary, one obtains that
\[
\lim_{\varepsilon\to 0}\rho^\varepsilon(dx, t) = \rho^x_{\rm cl}(dx, t) \tag{62}
\]
weakly as measures. Note that this means that the position distribution of the quantum particle converges to the classical one, although the wave function is not following the classical orbit, but spreading. Such a situation occurs for example in a scattering experiment. After the quantum particle is scattered off the target its wave function usually has a large momentum spread and thus it also spreads in position space. However, during the process of detection it is subject to relatively weak potentials and can be treated like a classical particle, at least on the level of statistics.

For general observables $a \in S_0^0(1)$ Proposition 6.3 yields
\[
\langle\hat\phi_{x_0}, a_1^\varepsilon(t)\,\hat\phi_{x_0}\rangle = \int \rho_{\rm cl}(dx\,dp, t)\,a(x, p) + O(\varepsilon) . \tag{63}
\]
Again, Theorem 2.3 allows us to translate this result to the full quantum dynamics.

Corollary 6.4. Let the assumptions of Theorem 2.3 be satisfied. Then there is a $C < \infty$ such that for $\psi_{x_0} = \phi_{x_0}\psi_0 \in \operatorname{Ran} P_i$ with $|x|\phi, \hat\phi \in L^1(\mathbb{R}^d)$ and $t \in [0, T]$
\[
\Big|\langle\psi_{x_0}, a^\varepsilon(t)\,\psi_{x_0}\rangle - \int \rho_{\rm cl}(dx\,dp, t)\,a(x, p)\Big| < C\varepsilon\,\big(\|\phi\|^2_{L^2} + \||x|\phi\|_{L^1}\|\hat\phi\|_{L^1}\big) .
\]
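The convergence of the position distribution for an $\varepsilon$-independent initial wave function can be made explicit in the simplest solvable case. The sketch below is an illustration under strong assumptions that are not taken from the paper: one dimension, free motion with $E(p) = p^2/2$ and $V = 0$, $x_0 = 0$, and a Gaussian $\phi$ with position variance $\sigma^2$ (hence momentum variance $1/(4\sigma^2)$). It compares the variance of the macroscopic position under the quantum evolution with the variance under $\rho_{\rm cl}(t)$; the function names are hypothetical.

```python
def quantum_position_variance(sigma, t, eps):
    """Variance of the macroscopic position eps*x at macroscopic time t.

    Assumes free motion, E(p) = p^2/2 and V = 0, and a Gaussian phi with
    microscopic position variance sigma^2 (momentum variance 1/(4 sigma^2)).
    """
    s = t / eps                                       # microscopic time
    micro_var = sigma ** 2 + s ** 2 / (4.0 * sigma ** 2)  # free spreading
    return eps ** 2 * micro_var                       # rescale to macroscopic units

def classical_position_variance(sigma, t):
    """Variance of x under rho_cl(t): x = p*t with p distributed as |phi_hat|^2."""
    return t ** 2 / (4.0 * sigma ** 2)

if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        q = quantum_position_variance(1.5, 2.0, eps)
        c = classical_position_variance(1.5, 2.0)
        print(eps, q, c, q - c)
```

In this special case the two variances differ by exactly $\varepsilon^2\sigma^2$, the rescaled initial width, so the macroscopic spreading is entirely due to the momentum distribution, as described above.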
6.3. Initial wave function of WKB form

In the previous case of an initially localized wave function the different momentum components travel at different velocities and therefore such a wave function spreads on the macroscopic scale. After some time one expects the wave function to have locally well defined momentum as long as no interference occurs, i.e. it should be of WKB type. On the microscopic scale a WKB wave function has the form
\[
\phi(x) = \varepsilon^{d/2}f(\varepsilon x)\,e^{i\frac{S(\varepsilon x)}{\varepsilon}} ,
\]
with $f$ and $S$ real valued. Hence it locally looks like a plane wave with momentum $\nabla S(\varepsilon x)$ and amplitude $f(\varepsilon x)$. Time-dependent WKB approximation is concerned with showing that the time evolution of such a wave function is in first order given by
\[
(e^{-iH_1t/\varepsilon}\phi)(x) \approx f_t(\varepsilon x)\,e^{i\frac{S_t(\varepsilon x)}{\varepsilon}} ,
\]
where $S_t$ is the solution of the classical Hamilton–Jacobi equation with initial condition $S$ and $f_t$ is the solution of the classical continuity equation
\[
\partial_tf + \operatorname{div}\nabla S_t = 0 .
\]
The corresponding classical phase space distribution is therefore
\[
\rho_{\rm cl}(dx\,dp) = f^2(x)\,\delta(p - \nabla S(x))\,dx\,dp . \tag{64}
\]
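Expectations with respect to the transported measure $\rho_{\rm cl}\circ\Phi^{-t}$ built from (64) reduce to a single integral over the initial position, $\int dy\,f^2(y)\,(a\circ\Phi^t)(y, \nabla S(y))$. The following one-dimensional sketch evaluates this integral by the trapezoidal rule. All concrete choices are assumptions made for illustration: the free flow $\Phi^t(x, p) = (x + pt, p)$, the phase $S(x) = x^2/2$, a Gaussian $f^2$, and the bounded observable $a(x, p) = e^{-x^2}$ (smooth but not compactly supported).

```python
import math

def wkb_expectation(a, flow, f_squared, grad_S, t, lo=-12.0, hi=12.0, n=4801):
    """Trapezoidal approximation of  int dy f^2(y) a(Phi^t(y, S'(y)))  in d = 1."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        y = lo + j * h
        w = 0.5 if j in (0, n - 1) else 1.0   # trapezoid end-point weights
        x_t, p_t = flow(y, grad_S(y), t)      # transport the point (y, S'(y))
        total += w * f_squared(y) * a(x_t, p_t)
    return total * h

# Illustrative assumptions: free flow, S(x) = x^2/2, Gaussian amplitude.
def free_flow(x, p, t):
    return x + p * t, p

def f_squared(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def grad_S(y):
    return y

def observable(x, p):
    return math.exp(-x * x)

if __name__ == "__main__":
    print(wkb_expectation(observable, free_flow, f_squared, grad_S, 1.0))
```

With these choices the point $(y, y)$ is transported to $(y(1+t), y)$, and the Gaussian integral evaluates in closed form to $(1 + 2(1+t)^2)^{-1/2}$, which provides a check of the quadrature.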
The main drawback of the WKB approximation is that it works only as long as no caustics are reached, or, put differently, as long as no interference between different parts of the WKB wave function happens. However, if we focus again on distributions of semiclassical observables on phase space no difficulty arises, only $\rho_{\rm cl}(t)$ is no longer of the particular form (64).

Proposition 6.5. Let $f \in C^\infty(\mathbb{R}^d)\cap L^1(\mathbb{R}^d)$, $S \in C^\infty(\mathbb{R}^d)$ and
\[
\phi(x) = \varepsilon^{d/2}f(\varepsilon x)\,e^{i\frac{S(\varepsilon x)}{\varepsilon}}
\]
be normalized in $L^2(\mathbb{R}^d)$. Let $a \in C_0^\infty(\mathbb{R}^d\times\mathbb{R}^d)$ and $T < \infty$. Then there is a $C < \infty$ such that for all $t \in [0, T]$
\[
\Big|\langle\hat\phi, a_1^\varepsilon(t)\,\hat\phi\rangle - \int \rho_{\rm cl}(dx\,dp, t)\,a(x, p)\Big| \le C\sqrt{\varepsilon} , \tag{65}
\]
where $\rho_{\rm cl}(dx\,dp, t) = (\rho_{\rm cl}\circ\Phi^{-t})(dx\,dp)$ and $\rho_{\rm cl}(dx\,dp)$ was defined in (64).

As in the preceding cases we can again translate Proposition 6.5 to the full dynamics using Theorem 2.3, but we omit the corresponding statement this time.

Proof. We apply Egorov and switch to standard quantization:
\[
\langle\hat\phi, a_1^\varepsilon(t)\,\hat\phi\rangle = \langle\hat\phi, (a\circ\Phi^t)^{W,\varepsilon}\,\hat\phi\rangle + O(\varepsilon) = \langle\hat\phi, (a\circ\Phi^t)^{S,\varepsilon}\,\hat\phi\rangle + O(\varepsilon) .
\]
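We calculate for $a \in C_0^\infty$
\[
\begin{aligned}
\widehat{(a^{S,\varepsilon}\hat\phi)}\Big(\frac{y}{\varepsilon}\Big)
&= (2\pi)^{-d}\varepsilon^{d/2}\int dx\,dp\;a(\varepsilon x, p)\,f(\varepsilon x)\,e^{-ip\cdot(x - y/\varepsilon)}\,e^{iS(\varepsilon x)/\varepsilon} \\
&= (2\pi)^{-d}\varepsilon^{-d/2}\int dx\,dp\;a(x, p)\,f(x)\,e^{i(S(x) - p\cdot(x-y))/\varepsilon} \\
&= (2\pi)^{-d}\varepsilon^{-d/2}\int dz\,dp\;a(z+y, p)\,f(z+y)\,e^{i(S(z+y) - p\cdot z)/\varepsilon} .
\end{aligned}
\]
The stationary phase method (cf. [9, Theorem 7.7.7] with $k = n + 1$) yields that
\[
\Big|\int dz\,dp\;a(z+y, p)\,f(z+y)\,e^{i(S(z+y) - p\cdot z)/\varepsilon} - (2\pi\varepsilon)^d\,a(y, \nabla S(y))\,f(y)\,e^{\frac{i}{\varepsilon}S(y)}\Big| \le C\,\varepsilon^{d+\frac12} ,
\]
where the constant is uniform in $y$ and depends on $a$ via a sum of sup-norms of finitely many partial derivatives. Hence
\[
\widehat{((a\circ\Phi^t)^{S,\varepsilon}\hat\phi)}\Big(\frac{y}{\varepsilon}\Big) = \varepsilon^{d/2}\,(a\circ\Phi^t)(y, \nabla S(y))\,f(y)\,e^{\frac{i}{\varepsilon}S(y)} + \varepsilon^{d/2}\,O(\sqrt{\varepsilon})
\]
and
\[
\langle\hat\phi, (a\circ\Phi^t)^{S,\varepsilon}\,\hat\phi\rangle = \int dy\,(a\circ\Phi^t)(y, \nabla S(y))\,f^2(y) + O(\sqrt{\varepsilon}) ,
\]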
where we used that $f \in L^1(\mathbb{R}^d)$.

References

[1] J. E. Avron and A. Elgart, "Adiabatic theorems without a gap condition", Commun. Math. Phys. 203 (1999) 445–463.
[2] J. Bolte and S. Keppeler, "A semiclassical approach to the Dirac equation", Ann. Phys. 274 (1999) 125–162.
[3] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, London Mathematical Society Lecture Note Series 268, Cambridge University Press, 1999.
[4] Y. V. Egorov, "On canonical transformations of pseudodifferential operators", Uspehi Math. Nauk 25 (1969) 235–236.
[5] J. Fröhlich, "Existence of dressed one electron states in a class of persistent models", Fortschritte der Physik 22 (1974) 159–198.
[6] G. A. Hagedorn, "A time dependent Born-Oppenheimer approximation", Commun. Math. Phys. 77 (1980) 1–19.
[7] F. Hiroshima and H. Spohn, "Ground state degeneracy for the Pauli–Fierz operator with spin", in preparation.
[8] F. Hövermann, H. Spohn and S. Teufel, "Semiclassical limit for the Schrödinger equation with a short scale periodic potential", Commun. Math. Phys. 215 (2001) 609–629.
[9] L. Hörmander, The Analysis of Linear Partial Differential Operators I, Grundlehren der mathematischen Wissenschaften 256, Springer, 1983.
[10] T. Kato, "On the adiabatic theorem of quantum mechanics", Phys. Soc. Japan 5 (1958) 435–439.
[11] P. L. Lions and T. Paul, "Sur les mesures de Wigner", Rev. Math. Iberoamericana 9 (1993).
[12] R. Minlos, "Lower branch of the spectrum of a fermion interacting with a bosonic gas (polaron)", Theor. Math. Phys. 92 (1993) 869–877.
[13] E. Nelson, "Interaction of nonrelativistic particles with a quantized scalar field", J. Math. Phys. 5 (1964) 1190–1197.
[14] M. Reed and B. Simon, Methods of Modern Mathematical Physics 1, Functional Analysis, Academic Press, San Diego, 1972.
[15] D. Robert, Autour de l'Approximation Semi-Classique, Progress in Mathematics, Volume 68, Birkhäuser, 1987.
[16] H. Spohn, "The polaron at large total momentum", J. Phys. A21 (1987) 1199–1211.
[17] H. Spohn, "Semiclassical limit of the Dirac equation and spin precession", Ann. Phys. 282 (2000) 420–431.
[18] H. Spohn and S. Teufel, "Adiabatic decoupling and the time-dependent Born-Oppenheimer theory", to appear in Commun. Math. Phys. (2001).
[19] S. Teufel, "Adiabatic decoupling for perturbations of fibered Hamiltonians", in preparation.
December 27, 2001 15:22 WSPC/148-RMP
00106
Reviews in Mathematical Physics, Vol. 14, No. 1 (2002) 29–85 c World Scientific Publishing Company
SKYRMIONS, FULLERENES AND RATIONAL MAPS
RICHARD A. BATTYE Department of Applied Mathematics and Theoretical Physics Centre for Mathematical Sciences, University of Cambridge Wilberforce Road, Cambridge CB3 0WA, U.K.
[email protected] PAUL M. SUTCLIFFE Institute of Mathematics, University of Kent at Canterbury, Canterbury, CT2 7NZ, U.K.
[email protected]
Received 23 December 2000 We apply two very different approaches to calculate Skyrmions with baryon number B ≤ 22. The first employs the rational map ansatz, where approximate charge B Skyrmions are constructed from a degree B rational map between Riemann spheres. We use a simulated annealing algorithm to search for the minimal energy rational map of a given degree B. The second involves the numerical solution of the full non-linear time dependent equations of motion, with initial conditions consisting of a number of well separated Skyrmion clusters. In general, we find a good agreement between the two approaches. For B ≥ 7 almost all the solutions are of fullerene type, that is, the baryon density isosurface consists of twelve pentagons and 2B −14 hexagons arranged in a trivalent polyhedron. There are exceptional cases where this structure is modified, which we discuss in detail. We find that for a given value of B there are often many Skyrmions, with different symmetries, whose energies are very close to the minimal value, some of which we discuss. We present rational maps which are good approximations to these Skyrmions and accurately compute their energy by relaxation using the full non-linear dynamics.
1. Introduction

1.1. Overview

The possibility that solitons can be used to represent particles is an attractive one, very much at the heart of current ideas in high energy physics. The first such model, known as the Skyrme model [27], was proposed in 1961 as a theory for the strong interactions of pions. The resulting non-linear field theory admits topological soliton solutions, which became known as Skyrmions. In the context of nuclear physics, the topological charge which stabilizes these solitons was identified with baryon number and hence the solitons themselves were identified with baryons.

This model was set aside after the advent of gauge theories and Quantum Chromodynamics (QCD) in the late 1960's, but much later [31] it was shown to be a low-energy effective action for QCD, in the limit of the number of colours ($N_c$) being large. Subsequent work has shown the model to be capable of describing at least some aspects of the low-energy behaviour of hadrons. In particular, it was shown that the properties of the proton and neutron could be adequately described using the simple quantization of the solution with a single unit of topological charge [1], and that the other static configurations appeared to be most stable when considering charges corresponding to $^4$He and $^7$Li [5].

Although understanding the properties of light hadrons is the main motivation for this paper, we shall only comment very briefly at various points in our discussion on the implications of our results for this application. Instead, we will concentrate on the Skyrmions themselves, which are of interest in their own right. As examples of three-dimensional topological solitons, they have been seen to be very similar to BPS monopoles [3, 4, 28]. From a mathematical point of view they can be thought of as maps between 3-spheres and, therefore, the minimum energy configurations which we compute are the minimum energy maps relative to the Skyrme energy functional. Although there are no doubt many other possibilities for an energy functional on the space of maps between 3-spheres, the Skyrme functional is the simplest which supports topological structures and, therefore, it is also interesting to speculate as to their generality in this context.

In the following we will present an extensive study of minimum energy solitons using two very different numerical methods.^a With a small number of caveats, the two approaches come to the same conclusion: that there is a connection with fullerene cages [21] familiar in carbon chemistry, as conjectured in [5], and that the solutions can be represented in terms of the rational map ansatz [18].
For each topological charge, we will discuss the structure and symmetries of the solution in detail, and present an explicit analytic formula for the approximate solution. In the small number of cases where there is some deviation from the fullerene hypothesis, we will discuss qualitatively, and sometimes quantitatively, possible reasons. Finally, in the context of the two applications mentioned above, nuclear physics and generalized harmonic maps between 3-spheres, we will attempt to provide some heuristic insight into why our conclusions as to the structure of Skyrmions are consistent with the particular application.
1.2. Skyrmions

In terms of the algebra valued currents $R_\mu = (\partial_\mu U)U^\dagger$ of an SU(2) valued field $U$, the Skyrme Lagrangian density is
\[
\mathcal{L} = \frac{1}{24\pi^2}\Big[-\mathrm{Tr}(R_\mu R^\mu) + \frac18\,\mathrm{Tr}([R_\mu, R_\nu][R^\mu, R^\nu])\Big] . \tag{1.1}
\]

^a The bare essentials of this work were first reported in [7]. Here, we give a more detailed exposition of our results.
December 27, 2001 15:22 WSPC/148-RMP
00106
Skyrmions, Fullerenes and Rational Maps
At first sight the domain is $\mathbb{R}^3$ and the target space is the group manifold of SU(2), $S^3$, but the finite energy boundary condition, $U(\infty) = I$, means that $U$ is in fact a map from compactified $\mathbb{R}^3 \sim S^3 \mapsto S^3$. Such mappings have non-trivial homotopy classes characterized by $\pi_3(S^3) = \mathbb{Z}$, which has the explicit representation
\[ B = -\frac{1}{24\pi^2} \int \varepsilon_{ijk}\, \mathrm{Tr}(R_i R_j R_k)\, d^3x . \tag{1.2} \]
A more geometrical description of the model can be made in terms of the strain tensor defined at each point $x$ in the domain by
\[ D_{ij} = -\frac{1}{2}\,\mathrm{Tr}(R_i R_j) , \tag{1.3} \]
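which can be thought of as quantifying the deformation induced by the map between 3-spheres. This symmetric, positive definite tensor can be diagonalized with non-negative eigenvalues $\lambda_1^2, \lambda_2^2, \lambda_3^2$, and the static energy, $E$, and baryon number, $B$, can be computed as integrals over $\mathbb{R}^3$ of the corresponding densities $\mathcal{E}$ and $\mathcal{B}$ given by
\[ \mathcal{E} = \frac{1}{12\pi^2}\left( \lambda_1^2 + \lambda_2^2 + \lambda_3^2 + \lambda_1^2\lambda_2^2 + \lambda_2^2\lambda_3^2 + \lambda_3^2\lambda_1^2 \right) , \qquad \mathcal{B} = \frac{1}{2\pi^2}\,\lambda_1\lambda_2\lambda_3 . \tag{1.4} \]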
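The pointwise inequality behind the Faddeev–Bogomolny bound follows from (1.4) by completing squares. The short derivation below is a sketch of that step, written with the densities denoted $\mathcal{E}$ and $\mathcal{B}$ (the calligraphic symbols are a notational assumption used here to distinguish densities from the integrated quantities $E$ and $B$).

```latex
% Completing squares in the densities of (1.4):
12\pi^2\,(\mathcal{E} \mp \mathcal{B})
  = (\lambda_1 \mp \lambda_2\lambda_3)^2
  + (\lambda_2 \mp \lambda_3\lambda_1)^2
  + (\lambda_3 \mp \lambda_1\lambda_2)^2 \;\ge\; 0 ,
% since each square contributes \lambda_i^2 + \lambda_j^2\lambda_k^2
% \mp 2\lambda_1\lambda_2\lambda_3, and 12\pi^2\,\mathcal{B} = 6\lambda_1\lambda_2\lambda_3.
% Hence, pointwise,
\mathcal{E} \;\ge\; |\mathcal{B}| ,
% and integrating over \mathbb{R}^3 gives E \ge |B|.
% Equality at a point with \mathcal{B} \neq 0 requires
% \lambda_1 = \pm\lambda_2\lambda_3,\ \lambda_2 = \pm\lambda_3\lambda_1,\ \lambda_3 = \pm\lambda_1\lambda_2,
% which forces \lambda_1^2 = \lambda_2^2 = \lambda_3^2 = 1.
```

The equality condition reproduces the statement that the bound is attained only when all eigenvalues of the strain tensor equal one everywhere.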
A simple manipulation of these expressions allows one to deduce the Faddeev–Bogomolny bound $E \ge |B|$, but in contrast to monopoles and vortices this bound cannot be saturated for any non-trivial finite energy configuration. In fact, it can be attained only when all the eigenvalues of the strain tensor are equal to one at all points in space (an isometry), and this is clearly not possible since $\mathbb{R}^3$ is not isometric to $S^3$. In practice we shall see that minimization of the Skyrme energy functional associated with the Lagrangian density requires that the solution be as close to this bound as possible. The energy minimization can be thought of as finding the map which is as close to an isometry as possible, when averaged over space [22].

The boundary condition breaks the chiral symmetry (SU(2) × SU(2)) of the Skyrme model to an SO(3) isospin symmetry $U \mapsto O U O^\dagger$, where $O$ is a constant element of SU(2). When we refer to a spatial symmetry of a Skyrmion, such as spherical symmetry, the fields are not invariant under a spatial rotation, but rather there is an equivariance property that the effect of a spatial rotation can be absorbed into an isospin transformation. This implies that both the energy density, $\mathcal{E}$, and baryon density, $\mathcal{B}$, are strictly invariant under the symmetry.

The $B = 1$ Skyrmion is spherically symmetric [27], the maximally allowed symmetry of the Lagrangian, and its energy is $E = 1.232$ [1]. In fact, spherically symmetric solutions exist at all charges, but they are not the minimum energy configurations, which are less symmetric. The $B = 2$ solution is axially symmetric [20, 30] and the higher charge solutions all have point symmetries [8, 5] which are subgroups of O(3). For $B = 3, 4, 7$ the Skyrmions have the Platonic symmetries of the tetrahedron ($T_d$), the cube ($O_h$) and the dodecahedron ($Y_h$) respectively, while
R. A. Battye & P. M. Sutcliffe
for $B = 5, 6, 8$ the Skyrmions have the dihedral symmetries $D_{2d}$, $D_{4d}$ and $D_{6d}$ respectively (see the discussion in Sec. 1.5 if you are unfamiliar with these point groups). For $B = 9$ a tetrahedrally symmetric Skyrmion has been found, but as we shall discuss later this appears not to be the minimum energy configuration, and hence is probably a low-energy saddle point. In all the above cases the baryon density (and also the energy density) is localized around the edges of a polyhedron.

From these known results we were able, in a previous paper [5], to formulate some simple geometrical rules (see footnote b) for the structure of Skyrmions which led us to conjecture that higher charge Skyrmions ($B \ge 7$) would resemble trivalent polyhedra formed from 12 pentagons and $2B - 14$ hexagons. We will refer to such structures as fullerene-like and to the conjecture as the fullerene hypothesis, since precisely the same structures arise in carbon chemistry where carbon atoms sit at the vertices of such polyhedra, known as fullerenes [14]. On the basis of this, it was suggested that the minimum energy Skyrmion of charge $B$, $S_B$, would have the same symmetry as a fullerene from the family $C_{4(B-2)}$. For low charges ($B = 7$, $B = 8$) this leads to a unique prediction for $S_B$, and indeed this was what we found in our original simulations. But as the charge increases the number of possible structures increases: for $B = 9$ there are 2 possibilities, with $D_2$ and $T_d$ symmetries respectively, for $B = 10$ there are 6, and for $B = 11$ there are 15, with a rapid increase for $B > 11$. Using this simple analogy, it was possible to predict that there would be an icosahedral configuration with $B = 17$ corresponding to the famous Buckminsterfullerene structure of $C_{60}$, and given its highly symmetric structure we suggested that this would be the minimum energy configuration at that charge.
Our original work on computing minimum energy Skyrmions [5] and also that of [8] used the full non-linear field equations, or the full non-linear energy functional, a very resource hungry procedure even in the modern era of parallel supercomputers. It is also likely to be somewhat imprecise since (1) it can be very difficult to identify the particular symmetries of the solution that one computes in this way, even using sophisticated visualization packages, and (2) the choice of initial conditions is somewhat arbitrary: even when they are chosen to have no particular symmetry, it is impossible to guarantee that they will relax to the global minimum. Fortunately, help is at hand in the form of the rational map ansatz [18], which was devised as an approximate representation of the solutions computed in [5]. Although this representation is not exact it reduces the degrees of freedom in the problem to a finite, manageable number, allowing computations to take place in an acceptable amount of time. Of course, the original approach based on the full non-linear equations still has a role to play in firstly checking and then fully relaxing the approximate solutions computed in the rational map approach.

Footnote b: The original Geometric Energy Minimization (GEM) rules were that all the solutions computed for $B \le 9$ had the symmetries of almost spherical trivalent polyhedra with $4(B - 2)$ vertices, $2(B - 1)$ faces and $6(B - 2)$ edges.

The development of this subject
now treads an interesting interface between numerically generated solutions and this analytic approximation. In this paper we shall first compute the minimum energy solutions of the Skyrme model up to $B = 22$, under the assumption that they can be adequately represented by the rational map ansatz. Then by simulating collisions of well separated Skyrmion clusters using the full non-linear field equations, followed by a numerical relaxation of the resulting coalesced cluster, we shall attempt to verify that these structures are in fact the minimum energy Skyrmions. We find precisely the fullerene structures conjectured in [5] for all but a small number of cases where there are interesting caveats. We also demonstrate that not only are the minimal energy Skyrmions fullerene-like, but that there are also several other fullerene Skyrmions at a given baryon number whose energies are only very slightly higher than the minimal value. Finally, using the full non-linear equations once again, we relax the rational map generated solutions with specified symmetries to compute accurately the energies of the configurations.
1.3. Rational map ansatz

The rational map ansatz was introduced in [18], and is a way to construct approximate Skyrmions from rational maps between Riemann spheres. Briefly, we use spherical coordinates in $\mathbb{R}^3$, so that a point $x \in \mathbb{R}^3$ is given by a pair $(r, z)$, where $r = |x|$ is the distance from the origin, and $z$ is a Riemann sphere coordinate giving the point on the unit two-sphere which intersects the half-line through the origin and the point $x$. Now, let $R(z)$ be a degree $B$ rational map between Riemann spheres, that is, $R = p/q$ where $p$ and $q$ are polynomials in $z$ such that $\max[\deg(p), \deg(q)] = B$, and $p$ and $q$ have no common factors. Given such a rational map the ansatz for the Skyrme field is
\[ U(r, z) = \exp\left[ \frac{i f(r)}{1 + |R|^2} \begin{pmatrix} 1 - |R|^2 & 2\bar{R} \\ 2R & |R|^2 - 1 \end{pmatrix} \right] , \tag{1.5} \]
where $f(r)$ is a real profile function satisfying the boundary conditions $f(0) = \pi$ and $f(\infty) = 0$, which is determined by minimization of the Skyrme energy of the field (1.5) given a particular rational map $R$. It can be shown that this Skyrme field has charge $B$, and for $1 \le B \le 9$ rational maps were presented in [18] which reproduced Skyrmions with the same symmetries as those computed in [5]. Furthermore, they were shown to have energies which are only about one or two percent above the numerically calculated values. Substitution of the rational map ansatz (1.5) into the Skyrme energy functional results in the following expression for the energy
\[ E = \frac{1}{3\pi} \int \left( r^2 f'^2 + 2B(f'^2 + 1)\sin^2 f + I\, \frac{\sin^4 f}{r^2} \right) dr , \tag{1.6} \]
where $I$ denotes the integral
\[ I = \frac{1}{4\pi} \int \left( \frac{1 + |z|^2}{1 + |R|^2} \left| \frac{dR}{dz} \right| \right)^4 \frac{2i \, dz \, d\bar{z}}{(1 + |z|^2)^2} . \tag{1.7} \]
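As an illustration of how the integral (1.7) can be evaluated for a given map, the following is a minimal sketch in Python. It is not the authors' code: the function names are hypothetical, the midpoint rule in $\cos\theta$ with a uniform grid in $\phi$ is an assumed choice of quadrature, and the parametrization $z = \tan(\theta/2)e^{i\phi}$ is used so that the measure $2i\,dz\,d\bar z/(1+|z|^2)^2$ becomes $d(\cos\theta)\,d\phi$. Writing $R = p/q$, the bracket in (1.7) equals $(1+|z|^2)\,|p'q - pq'|/(|p|^2+|q|^2)$, which avoids dividing by $q$.

```python
import cmath
import math


def poly_eval(coeffs, z):
    """Evaluate a polynomial given by ascending coefficients (Horner's rule)."""
    value = 0j
    for c in reversed(coeffs):
        value = value * z + c
    return value


def poly_derivative(coeffs):
    """Ascending coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def rational_map_integral(p, q, n_theta=400, n_phi=64):
    """Approximate the integral I of (1.7) for R = p/q.

    p and q are lists of ascending polynomial coefficients (hypothetical
    input convention). Quadrature: midpoint rule in cos(theta), uniform in phi.
    """
    dp, dq = poly_derivative(p), poly_derivative(q)
    d_cos = 2.0 / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        cos_theta = -1.0 + (i + 0.5) * d_cos
        # |z| = tan(theta/2) expressed through cos(theta)
        rho = math.sqrt((1.0 - cos_theta) / (1.0 + cos_theta))
        for j in range(n_phi):
            z = rho * cmath.exp(1j * j * d_phi)
            pz, qz = poly_eval(p, z), poly_eval(q, z)
            wronskian = poly_eval(dp, z) * qz - pz * poly_eval(dq, z)
            factor = (1.0 + rho * rho) * abs(wronskian) / (abs(pz) ** 2 + abs(qz) ** 2)
            total += factor ** 4
    return total * d_cos * d_phi / (4.0 * math.pi)


if __name__ == "__main__":
    print(rational_map_integral([0, 1], [1]))     # R = z
    print(rational_map_integral([0, 0, 1], [1]))  # R = z^2
```

For the two axially symmetric test maps the integral can be done in closed form, giving $I = 1$ for $R = z$ and $I = \pi + 8/3$ for $R = z^2$, which provides a check on the quadrature.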
To minimize the energy (1.6), therefore, one first determines the rational map which minimizes $I$, which may be thought of as an energy functional on the space of rational maps. Then given the minimum value of $I$ it is a simple exercise to find the profile function which minimizes the energy (1.6) using a gradient flow method to solve the appropriate boundary value problem. Thus, within the rational map ansatz, the problem of finding the minimal energy Skyrmion reduces to the simpler problem of calculating the rational map which minimizes the function $I$. Computing this minimizing map is the essence of our procedure for finding the minimal energy Skyrmion, and in Sec. 2 we shall describe the numerical techniques used to address this problem.

The baryon density is proportional to the derivative of the rational map, and (counting multiplicities) this will have $2B - 2$ zeros, giving the points on the Riemann sphere for which the baryon density vanishes along the corresponding half-lines through the origin. In terms of a baryon density isosurface plot these correspond to holes in a shell-like structure which resembles a polyhedron, and the holes correspond to the face centres.

1.4. Symmetric maps: general discussion

Since the maps we shall be dealing with describe symmetric Skyrmions, let us recall what it means for a rational map (and hence the associated Skyrmion) to be symmetric under a group $G \subset SO(3)$. Consider a spatial rotation $g \in SO(3)$, which acts on the Riemann sphere coordinate $z$ as an SU(2) Möbius transformation
\[ z \mapsto g(z) = \frac{\gamma z + \delta}{-\bar{\delta} z + \bar{\gamma}} , \qquad \text{where} \quad |\gamma|^2 + |\delta|^2 = 1 . \tag{1.8} \]
Similarly a rotation, $D \in SO(3)$, of the target two-sphere (which corresponds to an isospin transformation) will act in the same way
\[ R \mapsto D(R) = \frac{\Gamma R + \Delta}{-\bar{\Delta} R + \bar{\Gamma}} , \qquad \text{where} \quad |\Gamma|^2 + |\Delta|^2 = 1 . \tag{1.9} \]
A map is $G$-symmetric if, for each $g \in G$, there exists a target space rotation, $D$, which counteracts the effect of the spatial rotation, that is,
\[ R(g(z)) = D(R(z)) . \tag{1.10} \]
Note that in general the rotations on the domain and target spheres will not be the same, so that $(\gamma, \delta) \ne (\Gamma, \Delta)$. Since we are dealing with SU(2) transformations the set of target space rotations will form a representation of the double group of $G$, which is the group of order $2|G|$ obtained from $G$ by the addition of an element $\bar{E}$ which squares to the identity.
The fact that we are dealing with the double group is important since it has representations which are not representations of $G$. From now on it is to be understood that when we refer to a group $G$ we shall actually mean its double group.

To determine the existence and compute particular symmetric rational maps is, therefore, a matter of classical group theory. We are concerned with degree $B$ polynomials which form the carrier space for $\underline{B+1}$, the $(B + 1)$-dimensional irreducible representation of SU(2). Now, as a representation of SU(2) this is irreducible, but if we only consider the restriction to a subgroup $G$, $\underline{B+1}|_G$, this will in general be reducible. What we are interested in is the irreducible decomposition of this representation, and tables of these subductions can be found, for example, in [2]. The simplest case in which a $G$-symmetric degree $B$ rational map exists is if
\[ \underline{B+1}|_G = E + \cdots \tag{1.11} \]
where $E$ denotes a two-dimensional representation. In this case a basis for $E$ consists of two degree $B$ polynomials which can be taken to be the numerator and denominator of the rational map. A subtle point which needs to be addressed is that the two basis polynomials may have a common root, in which case the resulting rational map is degenerate and does not correspond to a genuine degree $B$ map. More complicated situations can arise. For example, if
\[ \underline{B+1}|_G = A_1 + A_2 + \cdots \tag{1.12} \]
where $A_1$ and $A_2$ denote two one-dimensional representations, then a whole one-parameter family of maps can be obtained by taking a constant multiple of the ratio of the two polynomials which are a basis for $A_1$ and $A_2$ respectively. An $m$-parameter family of $G$-symmetric maps can be constructed if the decomposition contains $(m + 1)$ copies of a two-dimensional representation, that is,
\[ \underline{B+1}|_G = (m + 1)E + \cdots \tag{1.13} \]
where the $m$ (complex) parameters correspond to the freedom in the decomposition of $(m + 1)E$ into $m + 1$ copies of $E$. Explicit examples corresponding to the above types of decompositions will be constructed in Sec. 1.5. For a detailed explanation of how to calculate these maps by computing appropriate projectors see [18].

1.5. Symmetric maps: specific examples

As we shall discuss in Sec. 2.2 the basic procedure for finding the minimum energy Skyrmion in the rational map ansatz will be to minimize the function $I$ subject to the map having degree $B$. However, this will find the map in an arbitrary spatial orientation, preventing identification of the symmetry directly from the map. One can always compute the corresponding baryon density isosurface and identify the symmetry by eye, a procedure which is often helpful, but this is also fraught with difficulties, particularly when, for example, the solution only has a small number of
symmetry generators. Therefore, in order to be sure of the symmetry identification we will also search maps which are restricted to have a particular symmetry.

Since all the point groups that we shall consider are subgroups of O(3), they must be either cyclic groups, $C_n$, which involve invariance under rotations by $(360/n)^\circ$ about some axis; dihedral groups, $D_n$, which are obtained from the cyclic group by the addition of a $C_2$ axis which is perpendicular to the main symmetry axis; tetrahedral groups ($T$), which are the symmetries associated with the tetrahedron; octahedral groups ($O$), those associated with the octahedron/cube; or icosahedral groups ($Y$), those associated with the icosahedron/dodecahedron. Each of these symmetry groups can be extended by the inclusion of reflections. All the icosahedral maps presented in this paper have already been discussed in [18], while all the octahedral maps used are easily deduced from the tetrahedral maps discussed below, so we shall only concentrate on understanding the details of the dihedral and tetrahedral maps.

In terms of the Riemann sphere coordinate $z$ the generators of the dihedral group $D_n$ may be taken to be $z \mapsto e^{2\pi i/n} z$ and $z \mapsto 1/z$. This can be extended by the addition of a reflection symmetry in two ways. Including a reflection in the plane perpendicular to the main $C_n$ axis, which is represented on the Riemann sphere by invariance under $z \mapsto 1/\bar{z}$, gives the group $D_{nh}$. Alternatively, a reflection symmetry may be imposed in a plane which contains the main symmetry axis and bisects the $C_2$ axes obtained by applying the $C_n$ symmetry to the $C_2$ axis. This reflection is represented on the Riemann sphere as invariance under $z \mapsto e^{\pi i/n} \bar{z}$, and the resulting group is $D_{nd}$. Constructing $D_n$ symmetric maps does not require the group theory formalism discussed in Sec.
1.4 since it is a simple task to explicitly apply the two generators of $D_n$ to a general degree $B$ rational map to determine a family of symmetric maps. Explicitly, an $s$-parameter family is given by (see footnote c)
\[ R(z) = \frac{\sum_{j=0}^{s} a_j z^{jn+r}}{\sum_{j=0}^{s} a_{s-j} z^{jn}} , \tag{1.14} \]
where $r = B \bmod n$ and $s = (B - r)/n$. Here $a_s = 1$ and $a_0, \ldots, a_{s-1}$ are arbitrary complex parameters. Clearly, this map satisfies the conditions for it to be symmetric under $D_n$,
\[ R(e^{2\pi i/n} z) = e^{2\pi i r/n} R(z) , \qquad R(1/z) = 1/R(z) , \tag{1.15} \]
and imposing a reflection symmetry constrains the otherwise complex coefficients $a_j$ to either be real, or pure imaginary.

Footnote c: There are other $D_n$ symmetric families of maps in addition to those of the form (1.14), but they will not be needed in this paper.

In the case of $D_{nh}$ symmetry the condition
is that all $a_j$ are real, whereas for $D_{nd}$ symmetry the coefficient $a_j$ is either real or purely imaginary depending on whether $(s - j) \bmod 2$ is zero or one respectively.

The procedure for constructing tetrahedrally symmetric maps is more difficult than for dihedral symmetries and the systematic group theory approach discussed in Sec. 1.4 must be employed. No simple formula such as (1.14) exists, so for later reference we need to recall the basic facts about the irreducible representations of the tetrahedral group $T$. $T$ has three one-dimensional representations, which are the trivial representation, $A$, and two conjugate representations $A_1$ and $A_2$. There is also a three-dimensional representation, $F$, which is obtained as $\underline{3}|_T$. In addition to these representations there are three two-dimensional representations of the double group of $T$, which we denote by $E'$, $E'_1$, $E'_2$, where the prime signifies that these are not representations of $T$, but only of the double group of $T$. $E'$ is obtained as $\underline{2}|_T$, and $E'_1$ and $E'_2$ are conjugate representations.

2. Numerical Minimization Algorithms

2.1. Overview

Minimization of an energy functional is a classical numerical problem, with no hard and fast optimum method. Methods which are tailored for a particular application can work very badly in others. In this section we will outline the basic features of the numerical methods which we have employed to compute minimum energy Skyrmions with and without using the rational map ansatz. We will attempt to discuss both the advantages and disadvantages of the two methods.

Our original approach to this problem was to use the code first used in [4] to evolve the full field equations for well-separated Skyrmions, and this is discussed in Sec. 2.3. It worked well for $B < 9$ [5] and its results were the main motivation for the rational map ansatz.
Its disadvantages are that it is very slow, requiring many hours of CPU time on a parallel computer, and in circumstances where there is the possibility of two or more minima separated by a small energy gap, dependence on the choice of initial conditions is also an issue. It is, however, the only way in which the results of approximate methods, such as the rational map ansatz, can be checked. Creating a particular configuration from many different initial conditions using this method can be thought of as strong evidence for it being the minimum energy solution, irrespective of other considerations.

The simulated annealing of rational maps discussed in Sec. 2.2 is by contrast fast (it only requires a serial processor), by virtue of the fact that the number of degrees of freedom has been substantially reduced. It is also relatively independent of the initial conditions and much less sensitive to local minima. But, of course, the results are only as good as the rational map approach to describing Skyrmions. Thankfully, as we shall see, the rational map approach in general works very well, allowing us to generate symmetric Skyrmions with large baryon number.
2.2. Simulated annealing of rational maps

Simulated annealing is a fairly recent numerical method for obtaining the global minimum of an energy function, and is based on the way that a solid cools to form a lattice [29]. Let $E(a)$ be an energy function which depends on a number of parameters $a$. The idea is that at a given temperature, $T$, the system is allowed to reach thermal equilibrium, characterized by the probability of being in a state with energy $E$ given by the Boltzmann distribution
\[ \Pr(\mathcal{E} = E) = \frac{1}{Z(T)} \exp(-E/T) , \tag{2.1} \]
where $Z(T)$ is the partition function. In practice, this is achieved by applying a Metropolis method: starting with a given configuration, $a$, a small random perturbation $\delta a$ is made and the energy of the resulting configuration is computed. If the change in energy, $\delta E = E(a + \delta a) - E(a)$, is negative then the new configuration is accepted, that is, $a$ is replaced by $a + \delta a$. However, if the change results in an increase in energy then the probability of accepting the new configuration is $e^{-\delta E/T}$. By performing a large number of such perturbations thermal equilibrium can be achieved at temperature $T$. The procedure to minimize the energy is to start at a high temperature, bring the system into thermal equilibrium and then lower the temperature before regaining the equilibrium. As the temperature is decreased, the system is more likely to be found in a state with lower energy and in the limit as $T \to 0$ the configuration will move toward a minimum of $E$. In the limit of infinitesimally slow variations in the temperature this can be shown to be the global minimum.

From the above description it is immediately clear that the simulated annealing method has a major advantage over other conventional minimization techniques in that changes which increase the energy are allowed, enabling the algorithm to escape from minima that are not the global minimum. Of course, in practice one is not guaranteed to find the global minimum, since the number of iteration loops used to bring the system into thermal equilibrium at a fixed temperature and also the number of times the temperature can be decreased are both restricted by computational resources. However, it does provide the most efficient means for searching for a global minimum, and with sufficient computational resources, plus sufficient care in applying them, one can be fairly confident of the final result.
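The Metropolis step and cooling loop described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: the function names, the geometric cooling schedule, the step size, and the two-parameter test energy are all assumptions made for the example, and the best configuration seen so far is tracked separately as a convenience.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Accept every decrease; accept an increase with probability exp(-dE/T)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def simulated_annealing(energy, start, rng, t_start=1.0, cooling=0.7,
                        n_temperatures=20, sweeps=500, step=0.3):
    """Anneal the parameter list `start`; all schedule values are assumed defaults."""
    state = list(start)
    e_state = energy(state)
    best, e_best = list(state), e_state
    temperature = t_start
    for _ in range(n_temperatures):
        # Many small random perturbations at fixed temperature.
        for _ in range(sweeps):
            trial = [x + rng.uniform(-step, step) for x in state]
            e_trial = energy(trial)
            if metropolis_accept(e_trial - e_state, temperature, rng):
                state, e_state = trial, e_trial
                if e_state < e_best:
                    best, e_best = list(state), e_state
        # Lower the temperature before the next round of perturbations.
        temperature *= cooling
    return best, e_best


def toy_energy(a):
    """Hypothetical test function: global minimum 0 at the origin, local minima elsewhere."""
    return sum(x * x + 2.0 * (1.0 - math.cos(3.0 * x)) for x in a)


if __name__ == "__main__":
    best, e_best = simulated_annealing(toy_energy, [1.9, -1.9], random.Random(1))
    print(best, e_best)
```

Starting from a point close to a local minimum, the early high-temperature sweeps allow uphill moves across the barrier, which a pure descent method would not make.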
For our application to rational maps we obviously take the energy function to be $I$ and the parameters $a$ to be the constants in the rational map. To compute $I$ involves a numerical integration over the sphere, which can be performed with standard methods, and in a typical simulation this needs to be calculated approximately a million times for a full simulated annealing run. In each case we take the initial rational map to be the axially symmetric one $R = z^B$, which for large $B$ has a very high value of $I$. Note that since we are using the Riemann sphere coordinate $z$, the two-sphere metric is the Fubini–Study metric in this coordinate system. This means that in
terms of moving energy around the sphere there is a bias between points near the south pole as compared to those near the north pole, in terms of small variations of the rational map parameters. To counteract this discrepancy we perform a spatial rotation of the configuration, $z \mapsto 1/z$, each time the temperature is decreased.

To identify the rational map, and more importantly its symmetries, produced by the simulated annealing algorithm is a two stage process. First, we compute the minimum energy map assuming no particular symmetry, that is, we allow a general map of degree $B$. As already pointed out in Sec. 1.5 the end result will be a rational map which is in a random orientation in both the domain and target two-spheres. The target space orientation could easily be fixed (in fact, it is more convenient not to do this, due to the above comments regarding a periodic spatial rotation of the map during the minimization procedure), but there is no simple way to make the spatial orientation such that the symmetry generators of the map are conveniently represented. Once one has the minimizing degree $B$ rational map and the corresponding minimum value of $I$, the Skyrmion is then constructed and its baryon density is plotted and examined in an attempt to identify its symmetries by eye. This conjectured symmetry is then confirmed by constructing the most general map with this symmetry and minimizing within this constrained symmetric family to check that the same minimum value of $I$ is recovered. As a final check the corresponding Skyrmion is constructed and its baryon density examined to confirm that it is identical to the one obtained previously. For each charge we have also performed several simulated annealing runs within constrained symmetric families, such as $D_2$, $D_3$ and $D_4$.
Since all the symmetry groups must be subgroups of O(3), the number of possibilities is finite, and checking just these three possibilities allows one to rule out a large fraction of them; the rest can often be ruled out by eye. This not only provides an additional check that the minimizing map was found, but also allows us to obtain other low energy maps, which may be either saddle points or local minima.

Finally, it is perhaps worth mentioning that a simulated annealing method has recently been used in a rather different way to study Skyrmions [15]. These authors used a simulated annealing algorithm on a discretized version of the full Skyrme energy, taking the parameters to be the field values at the discretized lattice sites. This is a much more computationally expensive approach than using the rational map ansatz, which is probably the reason these authors only considered Skyrmions up to charge $B = 4$, where, of course, the results were already known. Nonetheless, it is interesting to know that a simulated annealing approach is viable in this manner, and it demonstrates yet another application of this versatile technique.

2.3. Full field dynamics

2.3.1. Sigma model formulation

In the SU(2) form, the equations of motion are cumbersome to handle numerically, so we convert to the notation of a non-linear sigma model (NLSM), which has
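Lagrangian,
\[ \mathcal{L} = \partial_\mu\phi \cdot \partial^\mu\phi - \frac{1}{2}(\partial_\mu\phi \cdot \partial^\mu\phi)^2 + \frac{1}{2}(\partial_\mu\phi \cdot \partial_\nu\phi)(\partial^\mu\phi \cdot \partial^\nu\phi) + \lambda(\phi \cdot \phi - 1) , \tag{2.2} \]
with the Lagrange multiplier $\lambda$ introduced to maintain the constraint $\phi \cdot \phi = 1$. The Euler–Lagrange equations are given by
\[ (1 - \partial_\mu\phi \cdot \partial^\mu\phi)\,\Box\phi - (\partial^\nu\phi \cdot \partial_\mu\partial_\nu\phi - \partial_\mu\phi \cdot \Box\phi)\,\partial^\mu\phi + (\partial^\mu\phi \cdot \partial^\nu\phi)\,\partial_\mu\partial_\nu\phi - \lambda\phi = 0 , \tag{2.3} \]
where the Lagrange multiplier can be calculated by contracting (2.3) with $\phi$ and using the second derivative of the constraint,
\[ \lambda = (1 - \partial_\mu\phi \cdot \partial^\mu\phi)\,\phi \cdot \Box\phi + (\partial^\mu\phi \cdot \partial^\nu\phi)(\phi \cdot \partial_\mu\partial_\nu\phi) = -(\partial_\mu\phi \cdot \partial_\nu\phi)(\partial^\mu\phi \cdot \partial^\nu\phi) - (1 - \partial_\mu\phi \cdot \partial^\mu\phi)\,\partial_\nu\phi \cdot \partial^\nu\phi . \tag{2.4} \]
Denoting differentiation with respect to time as a dot, these equations can be recast as
\[ M\ddot{\phi} - \alpha(\dot{\phi}, \partial_i\phi, \partial_i\dot{\phi}, \partial_i\partial_j\phi) - \lambda\phi = 0 , \tag{2.5} \]
where the symmetric matrix $M$ has elements
\[ M_{ab} = (1 + \partial_j\phi \cdot \partial_j\phi)\,\delta_{ab} - \partial_j\phi_a\, \partial_j\phi_b , \tag{2.6} \]
and $\alpha$ is given by
\[ \begin{aligned} \alpha = {} & (\dot{\phi} \cdot \partial_i\partial_i\phi - \partial_i\phi \cdot \partial_i\dot{\phi})\,\dot{\phi} + 2(\dot{\phi} \cdot \partial_i\phi)\,\partial_i\dot{\phi} - (\dot{\phi} \cdot \partial_i\dot{\phi})\,\partial_i\phi - \dot{\phi}^2\, \partial_i\partial_i\phi \\ & + (\partial_i\phi \cdot \partial_i\partial_j\phi - \partial_j\phi \cdot \partial_i\partial_i\phi)\,\partial_j\phi + (1 + \partial_j\phi \cdot \partial_j\phi)\,\partial_i\partial_i\phi - (\partial_i\phi \cdot \partial_j\phi)\,\partial_i\partial_j\phi . \end{aligned} \tag{2.7} \]
Quite clearly these equations of motion are not analytically tractable. In subsequent sections we will discuss our numerical methods for evolving the equations of motion for spatially discretized initial conditions and for obtaining minimal energy static Skyrmions. As we shall see this is still a highly non-trivial task and can only be done for a specialized, but ill-defined, set of initial conditions.

2.3.2. Discretization and boundary conditions

There are three aspects common to almost all numerical approaches to solving non-linear PDEs. The first is a spatial discretization and an approximation for spatial derivatives. We discretized on a regular, cubic grid with $N$ points in each of the Cartesian directions and the array $\phi_{i,j,k} \approx \phi(i\Delta x, j\Delta x, k\Delta x)$. The choice of $N$ and $\Delta x$ is of critical importance, since the soliton configurations we wish to represent are localized. We found that grids with $N = 100$ and $\Delta x = 0.1$ were convenient for representing all the configurations which we study in this paper, although larger grid spacings ($\Delta x = 0.2$) also give sensible results, and larger grids ($N = 200$) were used to obtain more accurate calculations of energies.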
The spatial derivatives used were fourth order accurate, so as to accurately represent the large spatial gradients of the solitonic configurations. Since the reader may not be totally familiar with this procedure, various expressions for derivatives are presented below: for first derivatives,
\[ \frac{\partial\phi}{\partial x} = \frac{-\phi_{i+2,j,k} + 8\phi_{i+1,j,k} - 8\phi_{i-1,j,k} + \phi_{i-2,j,k}}{12\Delta x} + O(\Delta x^4) , \tag{2.8} \]
for second derivatives,
\[ \frac{\partial^2\phi}{\partial x^2} = \frac{-\phi_{i+2,j,k} + 16\phi_{i+1,j,k} - 30\phi_{i,j,k} + 16\phi_{i-1,j,k} - \phi_{i-2,j,k}}{12\Delta x^2} + O(\Delta x^4) , \tag{2.9} \]
and for mixed second derivatives
\[ \frac{\partial^2\phi}{\partial x^2} + 2\frac{\partial^2\phi}{\partial x \partial y} + \frac{\partial^2\phi}{\partial y^2} = \frac{-\phi_{i+2,j+2,k} + 16\phi_{i+1,j+1,k} - 30\phi_{i,j,k} + 16\phi_{i-1,j-1,k} - \phi_{i-2,j-2,k}}{12\Delta x^2} + O(\Delta x^4) . \tag{2.10} \]
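The stencils (2.8) to (2.10) can be checked on a smooth test function. The sketch below is illustrative only: the function names are hypothetical, the stencils are applied to callables rather than to a stored grid array, and the mixed derivative is extracted by subtracting the two pure second derivatives from the diagonal stencil of (2.10), which is one possible reading of how that formula is used.

```python
import math


def d1_fourth_order(f, x, h):
    """First derivative, five-point stencil of (2.8)."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12 * h)


def d2_fourth_order(f, x, h):
    """Second derivative, five-point stencil of (2.9)."""
    return (-f(x + 2 * h) + 16 * f(x + h) - 30 * f(x)
            + 16 * f(x - h) - f(x - 2 * h)) / (12 * h * h)


def mixed_fourth_order(f, x, y, h):
    """d^2 f / dx dy from the diagonal stencil (2.10) minus f_xx and f_yy."""
    diagonal = d2_fourth_order(lambda s: f(x + s, y + s), 0.0, h)
    f_xx = d2_fourth_order(lambda s: f(x + s, y), 0.0, h)
    f_yy = d2_fourth_order(lambda s: f(x, y + s), 0.0, h)
    return 0.5 * (diagonal - f_xx - f_yy)


if __name__ == "__main__":
    h = 0.1  # the grid spacing quoted in the text
    print(abs(d1_fourth_order(math.sin, 0.7, h) - math.cos(0.7)))
    print(abs(d2_fourth_order(math.sin, 0.7, h) + math.sin(0.7)))
    g = lambda x, y: math.sin(x) * math.sin(y)
    print(abs(mixed_fourth_order(g, 0.7, 0.3, h) - math.cos(0.7) * math.cos(0.3)))
```

Halving `h` should reduce each printed error by roughly a factor of sixteen, consistent with the $O(\Delta x^4)$ truncation error.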
The next part of the procedure is a method for time evolution. The equations of motion can be transformed into first order form,
\[ M\dot{\psi} - \alpha(\psi, \partial_i\phi, \partial_i\psi, \partial_i\partial_j\phi) - \lambda\phi = 0 , \tag{2.11} \]
by defining $\psi = \dot{\phi}$, and this can be solved using a leapfrog method. This involves replacing
\[ \dot{\phi} = \frac{\phi^+ - \phi^-}{2\Delta t} + O(\Delta t^2) , \qquad \dot{\psi} = \frac{\psi^+ - \psi^-}{2\Delta t} + O(\Delta t^2) , \tag{2.12} \]
where $+$ and $-$ correspond to the values of $\psi$ and $\phi$ at one step after and one step before respectively. In a much simpler case such as the wave equation, this creates a decoupling of the arrays containing the discretized versions of $\phi$ and $\psi$. However, the non-linear dependence of the function $\alpha$ on $\psi$ requires the storing of two copies of each array, and hence four in total, requiring 64 Mb of core memory for $N = 100$.

The choice of $\Delta t$ is also crucial, since it can create numerical instability. The standard Courant condition for a linear equation in three dimensions states that $\sqrt{3}\,\Delta t < \Delta x$. The non-linear nature of the Skyrme equations leads to a non-trivial modification to this relation, which still stands as the best possible due to reasons of causality. We find, essentially by trial and error, that $\Delta t \approx \Delta x/10$ leads to a stable algorithm. We should note at this stage that there is another, potentially more pathological, instability of the Skyrme model which is discussed in a subsequent section.

Finally, one is required to specify boundary conditions for the finite grid employed. Since the fourth order spatial derivatives require a five point wide stencil, one point away from the boundary it is necessary to use second order spatial approximations. But on the boundary itself the spatial derivatives cannot be evaluated, since the second order spatial approximation requires a three point wide stencil. We experimented with various different types of boundary conditions, such as Neumann (zero normal derivative), Dirichlet (fixed) and periodic.
The results presented here are for Dirichlet boundary conditions, although we believe that the use of Neumann boundaries would have little effect on the results.
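To make the leapfrog update (2.12) and the role of the two stored time levels concrete, here is a minimal sketch for a much simpler problem: the linear wave equation $\ddot\phi = \partial_x^2\phi$ in one dimension with Dirichlet boundaries. Everything specific here is an assumption for illustration: the function names, the second-order Laplacian, the standing-wave test solution, and the grid size. It is not the Skyrme evolution code, for which $M$, $\alpha$ and $\lambda$ would enter as in (2.11).

```python
import math


def laplacian_1d(phi, dx):
    """Second-order centred Laplacian; the two boundary entries are left at zero."""
    lap = [0.0] * len(phi)
    for i in range(1, len(phi) - 1):
        lap[i] = (phi[i + 1] - 2.0 * phi[i] + phi[i - 1]) / (dx * dx)
    return lap


def leapfrog_standing_wave(n_cells=50, t_final=0.5):
    """Evolve phi_tt = phi_xx on [0, 1]; return the max error against the exact wave."""
    dx = 1.0 / n_cells
    dt = dx / 10.0  # the ratio quoted in the text; well inside the linear stability limit
    x = [i * dx for i in range(n_cells + 1)]
    n_steps = int(round(t_final / dt))

    # Two time levels of each array: (phi, psi) at t = 0 and t = dt, taken here
    # from the exact standing wave phi = sin(pi x) cos(pi t), psi = d(phi)/dt.
    phi_old = [math.sin(math.pi * xi) for xi in x]
    psi_old = [0.0 for _ in x]
    phi_now = [math.sin(math.pi * xi) * math.cos(math.pi * dt) for xi in x]
    psi_now = [-math.pi * math.sin(math.pi * xi) * math.sin(math.pi * dt) for xi in x]

    for _ in range(n_steps - 1):
        lap = laplacian_1d(phi_now, dx)
        # Centred differences in time, cf. (2.12).
        phi_new = [po + 2.0 * dt * ps for po, ps in zip(phi_old, psi_now)]
        psi_new = [po + 2.0 * dt * lp for po, lp in zip(psi_old, lap)]
        # Dirichlet boundaries: hold the end points fixed.
        phi_new[0] = phi_new[-1] = 0.0
        psi_new[0] = psi_new[-1] = 0.0
        phi_old, psi_old = phi_now, psi_now
        phi_now, psi_now = phi_new, psi_new

    t = n_steps * dt
    return max(abs(p - math.sin(math.pi * xi) * math.cos(math.pi * t))
               for p, xi in zip(phi_now, x))


if __name__ == "__main__":
    print(leapfrog_standing_wave())
```

In this linear example the old and new levels could be interleaved to save memory; both are stored here to mirror the four-array layout described for the non-linear case.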
2.3.3. Imposing the constraint

In the previous section we have discussed all aspects of the numerical solution of the Skyrme equations of motion which are common to the numerical solution of most non-linear PDEs. There is, however, an added complication which makes life much more difficult in the case of a NLSM. In the previous section we included the Lagrange multiplier λ, assuming that it can be calculated from ψ, φ and their spatial derivatives. In fact, this leads to a numerical scheme which becomes unstable for any choice of ∆t in under ten timesteps. The problem is that we have ignored the reason for its introduction: to maintain the constraint φ · φ = 1, which is manifest in the NLSM.

A number of approaches have been developed to deal with such constraints. Firstly, one could modify the numerical scheme to calculate λ so that it explicitly maintains the constraint for the discretized equations of motion. At each step, this is seen to be almost equal to calculating λ from the formula (2.4), but the two differ by numerical discretization effects at a level well below 1%, which nonetheless cause the solution to slip off the unit sphere if allowed to accumulate. An alternative approach, which was found to work well in simulations of Baby Skyrmions [26], is simply to rescale the field to have unit modulus, that is, to continually make the replacement

\[ \phi \mapsto \frac{\phi}{\sqrt{\phi\cdot\phi}} , \tag{2.13} \]
at each point on the discretized grid after each timestep. While appearing ugly from a purist numerical analysis point of view, this technique is effective and does not become unstable except in the most extreme circumstances. One is effectively projecting the field back onto the sphere along the field itself, which has no particular physical motivation, but if the modification is small, which can be arranged by choosing a sufficiently small timestep, then this should be as good as any other arbitrary choice.

Another possibility is to require that the derivative of the constraint is zero: the relation φ · φ = 1 not only implies that the solution lies on the unit sphere, but that it cannot come off, that is, all the derivatives of the constraint are also satisfied. This leads to an infinite hierarchy of relations which must hold. Obviously, with a second order time evolution one cannot hope to maintain them all explicitly in the discretized system. However, if one manages to satisfy the first one, it may be possible to construct an effective code. It is possible to impose φ · ψ = 0 by computing λ so as to satisfy $\phi_+\cdot\psi_+ = 0$, which requires that

\[ \lambda = -\frac{\psi_-\cdot\psi + \phi\cdot M^{-1}\alpha + 2\psi\cdot M^{-1}\alpha\,\Delta t}{\phi\cdot M^{-1}\phi + 2\psi\cdot M^{-1}\phi\,\Delta t} . \tag{2.14} \]
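The two ingredients introduced so far, the rescaling (2.13) and a multiplier fixed by the orthogonality condition, can be illustrated at a single grid point. The sketch below is not the authors' code: it assumes M is the identity (so that M⁻¹α = α), takes α as a given 4-vector, and uses the hypothetical names `rescale` and `constrained_step`. The multiplier is obtained by solving the linear condition directly for the update rule written in the code; when φ₋ · ψ₋ = 0 the result has the same structure as (2.14).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rescale(phi):
    """Projection (2.13): phi -> phi / sqrt(phi . phi)."""
    norm = math.sqrt(dot(phi, phi))
    return [c / norm for c in phi]

def constrained_step(phi_minus, phi, psi_minus, psi, alpha, dt):
    """One leapfrog step with lam chosen so that phi_plus . psi_plus = 0.

    Assumption: M is the identity, so M^{-1} alpha = alpha and
    M^{-1} phi = phi. Returns (phi_plus, psi_plus, lam).
    """
    phi_plus = [pm + 2.0 * dt * p for pm, p in zip(phi_minus, psi)]
    # psi_plus = psi_minus + 2 dt (alpha + lam * phi) is linear in lam, so
    # the orthogonality condition phi_plus . psi_plus = 0 fixes lam uniquely.
    lam = -(dot(phi_plus, psi_minus) + 2.0 * dt * dot(phi_plus, alpha)) / (
        2.0 * dt * dot(phi_plus, phi))
    psi_plus = [qm + 2.0 * dt * (a + lam * f)
                for qm, a, f in zip(psi_minus, alpha, phi)]
    # Rescaling phi_plus changes its length only, so orthogonality survives.
    return rescale(phi_plus), psi_plus, lam
```

After such a step the new field has unit length and is orthogonal to the new velocity, up to rounding error.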
We should note, however, that imposing $\phi_+\cdot\psi_+ = 0$ does not automatically impose the constraint φ · φ = 1 on the discretized equations. Largely by trial and error, we find that the best method for our numerical scheme is a hybrid of the last two methods. It just so happens that this is possible within
Skyrmions, Fullerenes and Rational Maps
our scheme, since there are two discretized grids with essentially their own time evolution. Firstly, we calculate λ to satisfy $\phi_+\cdot\psi_+ = 0$, followed by the rescaling transformation (2.13) on the field $\phi_+$. We find that this hybrid method maintains the constraint for many thousands of timesteps.

2.3.4. Non-hyperbolic regions

We have already discussed the potential for our numerical scheme to become unstable because of Courant-type instability and also due to the imperfect imposition of the constraint φ · φ = 1. However, there is a much more pathological instability, which comes about because the equations of motion are not manifestly hyperbolic and their precise nature (hyperbolic, parabolic or elliptic) depends on the particular configuration being evolved [13]. The problem is that the specific numerical scheme we have designed will work only for the hyperbolic case and we know of no way of treating all configurations within a single numerical scheme.

Fortunately, the hyperbolic regime is the only physically meaningful one. One can understand this by thinking of the Skyrme model as a low energy effective action, with higher order terms ignored. At higher energies, where the terms which are ignored would be large, the Skyrme equations of motion become non-hyperbolic. It was suggested in [13] that the equations of motion become non-hyperbolic whenever the kinetic energy is greater than the potential energy. Empirically, we find that this is at least partially true, with an instability associated with a very large kinetic energy density relative to that of potential energy. However, this statement is a little imprecise since we found that sometimes the kinetic energy density rose above the potential locally on the discretized grid without creating instability.
Unfortunately, the complicated nature of the equations of motion makes it almost impossible to say for certain which configurations will eventually lead to an instability and which will not, although experience has taught us that most of the configurations we wish to evolve for physical applications, such as low energy nuclear physics, can be evolved.

2.3.5. Locating minima

In the preceding sections we have discussed how to construct a numerical scheme which evolves the full non-linear equations of motion for the Skyrme model. This allowed us to simulate the dynamics of Skyrmion collisions in [4]. However, for the purposes of this paper one would also like to create static multi-soliton configurations. We are assisted in this by the observation that when well separated Skyrmions coalesce they seem to create low energy symmetric multi-soliton states. The procedure that we use is to set up initial conditions with the required topological charge involving a collision between one or more Skyrmions of some particular charge, which can be done using the rational map ansatz, described in Sec. 1.3, using charges for which the rational map is already known. We then evolve
the configuration until the Skyrmions visibly coalesce or until the potential energy begins to increase. At this point all the kinetic energy is removed, that is, we set $\dot{\phi} = 0$ at all points on the grid, and the evolution is continued until once again the potential energy rises. This procedure is repeated many times until the energy is no longer decreasing. Once the solution is sufficiently relaxed it often pays to just evolve the solution under the full equations of motion without removing kinetic energy, since removing it can prevent the minute oscillations required to achieve the minimum from gaining momentum. Even this procedure can be very slow, but can be sped up by adding a dissipative term to the equations of motion,

\[ M\ddot{\phi} - \alpha(\dot{\phi}, \partial_i\phi, \partial_i\dot{\phi}, \partial_i\partial_j\phi) - \lambda\phi = -\epsilon\dot{\phi} , \tag{2.15} \]

for ε > 0. Clearly, the static solutions are still the same. However, the dissipation causes the solution to roll down the potential well to the minimum much quicker in certain circumstances, with ε = 0.5 seeming to work well. We should note that the addition of this dissipation can also affect the Courant instability of the algorithm and, in particular, very near to the minimum one is plagued by instability. Experience has shown us that a combination of running with dissipation and then without helps speed up the process.

Obviously, one should be concerned that this process might not necessarily lead one to the global minimum. One might, for example, relax down to a metastable local minimum, or the initial conditions may have some symmetry which is maintained by the equations of motion and hence by the final solution. We attempt to ensure that we do not encounter the latter possibility by creating initial conditions using the product ansatz, which is manifestly asymmetric ($U_1U_2 \neq U_2U_1$). The possibility of local minima, however, can never be totally excluded, but one can build up confidence in the minima by using different initial configurations. For low charges (B ≤ 4), the attractive channel configurations discussed in [4] are particularly good initial conditions. But for higher charge no such maximally attractive channels exist for B well separated Skyrmions and only a small number of attractive configurations are known. We find that sensible initial conditions can be produced for any charge B > 10 by using two clusters, one with charge B − n and the other with charge n, such that n = 1, 2, 3, 4, 5 but no larger. These are Lorentz boosted together with a velocity of v = 0.3 in a collision which has a small but non-zero impact parameter.

3. Skyrmion Identification

In this section we will present the results of an extensive set of simulations performed with the intention of identifying the symmetry and associated polyhedron of the minimum energy configurations for B ≤ 22. We should note that making definitive statements as to an identification for a particular charge has, historically, been fraught with difficulties. In particular, the identifications of the B = 5 and B = 6
solutions in [8] and the B = 9 configuration in [5] were incorrect to varying degrees. Suffice it to say that when the two numerical methods agree for a wide range of initial conditions and simulation parameters, there are strong grounds to believe that we have created the correct configuration; identifying the symmetry is then the only complication. Conversely, when they disagree, or we know more than one very low energy solution for a particular charge, it is a matter of some debate which is the true minimum energy solution, or whether there is some more complicated symmetry allowing local, or even degenerate, minima. We will engage in this debate at various stages, but the reader should make up their own mind as to the strength of our arguments.

We have argued strongly in the previous sections on numerical methods that the simulated annealing of rational maps is the simplest and probably cleanest approach to compute low energy Skyrmion configurations. Clearly, its veracity for computing the true minima depends on the viability of the ansatz for describing Skyrmions, and on the energy functional based on I being a good approximation to the true energy, or at least to the relative energies of particular configurations. With these caveats in mind, we present our first attempt at identification of the minima as the rational maps which minimize I, before discussing other possibilities.

The results of the simulated annealing algorithm applied to a general rational map of degree B (≤ 22) and the symmetry identification procedure discussed in Sec. 1.5 are presented in Table 1. In each case, we tabulate the identified symmetry group G, the minimum value of I, the quantity $I/B^2$ (which is strikingly uniform, lying between about 1.2 and 1.4 for B ≥ 7) and the value of E/B for a profile function which minimizes the energy functional (1.6) for the particular map.
We should first comment that for B ≤ 8 the rational maps which minimize I are exactly those presented in [18] to approximate the results of the full non-linear simulations [5]; thus the simulated annealing algorithm provides a nice numerical check that the same maps are reproduced by searching the full parameter space of rational maps. Also, for B ≥ 7 all the symmetry groups, with the exception of those for B = 9, B = 10 and B = 13, are compatible with the fullerene hypothesis: that $S_B$ has 4(B − 2) trivalent vertices and is constructed from 2B − 14 hexagons and 12 pentagons. The baryon density isosurfaces for each of the solutions are displayed in Fig. 1, along with a model of the associated polyhedron in Fig. 2, which confirm that for the most part they are indeed of the fullerene type. The symmetry groups of B = 9, B = 10 and B = 13 all contain the cyclic subgroup C4, which is not compatible with them being of the fullerene type, since the associated polyhedron of such a solution must contain either a four-valent bond or a square. These are the first Skyrme solutions found which do not comply with the Geometric Energy Minimization (GEM) rules suggested in [5]. We shall discuss these solutions in more detail in the subsequent section, but it is gratifying to note that all the other solutions appear qualitatively to comply with our expectations based on the GEM rules.
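The counting behind the fullerene hypothesis can be checked against Euler's formula for a polyhedron. The short sketch below uses only the face and vertex counts quoted above; the function name `fullerene_counts` is hypothetical, and the edge count is derived by noting that each edge is shared by two faces.

```python
def fullerene_counts(B):
    """Vertex, edge and face counts implied by the fullerene hypothesis."""
    if B < 7:
        raise ValueError("the hypothesis is stated for B >= 7")
    pentagons, hexagons = 12, 2 * B - 14
    faces = pentagons + hexagons
    vertices = 4 * (B - 2)
    # Each edge is shared by exactly two faces.
    edges = (5 * pentagons + 6 * hexagons) // 2
    return vertices, edges, faces

for B in range(7, 23):
    V, E, F = fullerene_counts(B)
    assert V - E + F == 2    # Euler's formula for a spherical polyhedron
    assert 3 * V == 2 * E    # every vertex is trivalent
    assert F == 2 * (B - 1)  # total number of faces
```

For B = 7 this gives 20 vertices, 30 edges and 12 faces, the dodecahedron, so the stated vertex and face counts are mutually consistent for every charge in the range considered.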
Table 1. Results from the simulated annealing of rational maps of degree B. For 1 ≤ B ≤ 22 we list the symmetry of the rational map, G, the minimal value of I, its comparison with the bound $I/B^2 \geq 1$, and the energy per baryon E/B obtained after computing the profile function which minimizes the Skyrme energy functional.

B     G       I       I/B^2    E/B
1     O(3)    1.0     1.000    1.232
2     D∞h     5.8     1.452    1.208
3     Td      13.6    1.509    1.184
4     Oh      20.7    1.291    1.137
5     D2d     35.8    1.430    1.147
6     D4d     50.8    1.410    1.137
7     Yh      60.9    1.242    1.107
8     D6d     85.6    1.338    1.118
9     D4d     109.3   1.349    1.116
10    D4d     132.6   1.326    1.110
11    D3h     161.1   1.331    1.109
12    Td      186.6   1.296    1.102
13    O       216.7   1.282    1.098
14    D2      258.5   1.319    1.103
15    T       296.3   1.317    1.103
16    D3      332.9   1.300    1.098
17    Yh      363.4   1.257    1.092
18    D2      418.7   1.292    1.095
19    D3      467.9   1.296    1.095
20    D6d     519.7   1.299    1.095
21    T       569.9   1.292    1.094
22    D5d     621.6   1.284    1.092
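As a small arithmetic cross-check of Table 1, the tabulated ratio I/B² can be recomputed from the I column. The sketch below copies the (B, I, I/B²) triples from the table; the tolerance is an assumption that allows for I being quoted to one decimal place and the ratio to three, and the helper name `inconsistent_rows` is hypothetical.

```python
# (B, I, I/B^2) as listed in Table 1.
TABLE_1 = [
    (1, 1.0, 1.000), (2, 5.8, 1.452), (3, 13.6, 1.509), (4, 20.7, 1.291),
    (5, 35.8, 1.430), (6, 50.8, 1.410), (7, 60.9, 1.242), (8, 85.6, 1.338),
    (9, 109.3, 1.349), (10, 132.6, 1.326), (11, 161.1, 1.331),
    (12, 186.6, 1.296), (13, 216.7, 1.282), (14, 258.5, 1.319),
    (15, 296.3, 1.317), (16, 332.9, 1.300), (17, 363.4, 1.257),
    (18, 418.7, 1.292), (19, 467.9, 1.296), (20, 519.7, 1.299),
    (21, 569.9, 1.292), (22, 621.6, 1.284),
]

def inconsistent_rows(rows):
    """Return the charges whose I/B^2 entry disagrees with I beyond rounding."""
    bad = []
    for B, I, ratio in rows:
        # I is rounded to 0.1 and the ratio to 0.001 (assumed rounding model).
        allowed = 0.05 / B ** 2 + 0.0005 + 1e-9
        if abs(I / B ** 2 - ratio) > allowed:
            bad.append(B)
    return bad

# The bound I/B^2 >= 1 holds for every row, with equality only at B = 1.
assert all(ratio >= 1.0 for _, _, ratio in TABLE_1)
```

Under this rounding model every row is self-consistent, and among the charges with B > 1 the smallest ratio occurs at B = 7.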
In Table 2, Fig. 3 and Fig. 4, we present the results of our extensive search for the minimum energy maps with particular symmetries, usually dihedral groups or chosen from the extensive tables of fullerenes presented in [14], which lend further weight to our conclusion that those presented in Table 1 are in fact the minima relative to the energy functional I. They do, however, turn up the possibility that in certain cases the minimum of I may not necessarily be the minimum of the Skyrme energy, since some of them have values of I very close to the values presented in Table 1. When the values are so close it is difficult to make a guess as to how the relaxation to the true solution might affect their relative positions, an issue to which we shall return in subsequent sections. For the moment we will denote them by ∗, and conclude at least that they are not global minima of I, but are believed to represent other critical points.
Fig. 1. The baryon density isosurfaces of the Skyrmions with 1 ≤ B ≤ 22 which are minimum energy configurations (see Table 1) within the rational map ansatz. Each is the isosurface on which the baryon density takes the value $\mathcal{B} = 0.035$, and all are presented to scale.
For 9 ≤ B ≤ 22 we shall now describe in detail the rational maps we have obtained, the structure of the associated Skyrmions and make a comparison with the results from full field simulations. Our study of the rational maps has already turned up a few oddities and we shall attempt to interpret these at the relevant charge. Charges where the fullerene hypothesis appears to break down are B = 9 and B = 13, while the rational map approach to representing the minimum energy Skyrmion appears to need careful consideration for B = 10, B = 14, B = 16 and B = 22.
Fig. 2. The associated polyhedra for the Skyrmions presented in Fig. 1. The models are not to scale.
Table 2. Same as for Table 1, but for the other critical points of I. Notice that the I values for the B = 10 configurations with D3 and D3d symmetry, the B = 13 with D4d, the B = 16 with D2 and the B = 22 with D3 are extremely close to the corresponding values in Table 1, suggesting the possibility of local minima.

B     G      I       I/B^2    E/B
9*    Td     112.8   1.393    1.123
10*   D3     132.8   1.328    1.110
10*   D3d    133.5   1.335    1.111
10*   D3h    143.2   1.432    1.126
13*   D4d    216.8   1.283    1.098
13*   Oh     265.1   1.568    1.140
15*   Td     313.7   1.394    1.113
16*   D2     333.4   1.302    1.098
17*   Oh     367.2   1.271    1.093
19*   Th     469.8   1.301    1.096
22*   D3     623.4   1.288    1.092
Fig. 3. The baryon density isosurfaces of the Skyrmions we found which are other stationary points of I (see Table 2) within the rational map ansatz. Each is the isosurface on which the baryon density takes the value $\mathcal{B} = 0.035$, and all are presented to scale.
3.1. B = 9

In [5] it was suggested that the B = 9 minimum energy configuration had Td symmetry, a symmetric configuration of $C_{28}$ (it corresponds to configuration 28:2 in [14]). The polyhedron to which it corresponds comprises 12 pentagons, fused into 4 triplets placed at the vertices of a tetrahedron, with 4 hexagons placed at the vertices of a dual tetrahedron. This, together with the solutions for B = 7 and B = 8, was one of the main motivations for the fullerene hypothesis. Unfortunately, it appears that the original identification of the symmetry was incorrect, and further relaxation of this configuration using the full non-linear field equations leads to a somewhat different solution.
Fig. 4. The associated polyhedra for the Skyrmions presented in Fig. 3. The models once again are not to scale. Note that we have been unable to make models for the B = 13 solution with Oh symmetry and the B = 15 solution with Td symmetry since they contain a large number of four-valent bonds.
The minimizing map in this case has D4d symmetry and I = 109.3, with the functional form of the map being given by

\[ R = \frac{z(a + ibz^4 + z^8)}{1 + ibz^4 + az^8} , \tag{3.1} \]

where a = −3.38, b = −11.19. It should be noted that this value of I is slightly lower than the value I = 112.8 of the tetrahedral map [18]

\[ R = \frac{5i\sqrt{3}z^6 - 9z^4 + 3i\sqrt{3}z^2 + 1 + az^2(z^6 - i\sqrt{3}z^4 - z^2 + i\sqrt{3})}{z^3(-z^6 - 3i\sqrt{3}z^4 + 9z^2 - 5i\sqrt{3}) + az(-i\sqrt{3}z^6 + z^4 + i\sqrt{3}z^2 - 1)} , \tag{3.2} \]

where a = −1.98.

Amazingly, the solution which was created by the relaxation of initially well-separated B = 8 and B = 1 configurations (henceforth such initial conditions will be
denoted 8 + 1) using the full non-linear field equations, and confirmed using a number of different initial conditions (for example, 7 + 2 and 6 + 3), is precisely that which corresponds to the rational map (3.1). The associated polyhedron is not a fullerene, since the symmetry group D4d is incompatible with pentagons and hexagons forming a trivalent polyhedron. In fact, it has two four-valent links which occur between four pentagons forming the top and bottom pseudo-faces$^{d}$ of a rather flat polyhedron, linked by a belt of eight alternately up and down pointing pentagons, the top and bottom being rotated relative to each other by 45° to give the D4d symmetry. In Sec. 5.2 we will discuss how this solution can be formed by the symmetry enhancement of the other known fullerene corresponding to $C_{28}$ which has D2 symmetry (this is labelled 28:1 in [14]). On the basis of this we conclude that $S_9$ has D4d symmetry and not Td as previously suggested, and that the known Td symmetric solution is a saddle point.

3.2. B = 10

A glance at Table 1 and Table 2 shows that there are (at least) four maps whose I values are very close together. Minimizing over all maps yields the value I = 132.6 and this can be obtained from a D4d symmetric map of the form

\[ R = \frac{z^2(a + ibz^4 + z^8)}{1 + ibz^4 + az^8} , \tag{3.3} \]
where a = −8.67 and b = 14.75. This is not a fullerene Skyrmion, the four-fold symmetry this time manifesting itself in the existence of a square on the top and bottom of the associated polyhedron. There is, however, a fullerene Skyrmion with I = 132.8, very close to that of the minimum, which corresponds to a D3 symmetric map of the form

\[ R = \frac{z(1 + az^3 + bz^6 + cz^9)}{c + bz^3 + az^6 + z^9} , \tag{3.4} \]
where a, b, c are complex parameters. The minimum is obtained for the values a = 4.40 − 1.72i, b = −2.38 + 3.10i, c = −0.12 + 0.19i. The symmetry can be increased to D3d by choosing b to be real and a and c to both be purely imaginary, and within this class the minimum is very slightly higher at I = 133.5, which is attained when a = 20.40i, b = −30.22, c = −4.69i. Finally, if a, b, c are all real then the symmetry is D3h and the minimum in this class has I = 143.2 when a = −5.14, b = −2.20, c = −0.36. The baryon density of the D4d symmetric map is presented in Fig. 1 and for the other three maps, D3, D3d, D3h, in Fig. 3.

The polyhedron associated with the D4d solution can be constructed by taking two squares each surrounded by 4 hexagons, connected via a band of 8 pentagons alternately pointing up and down, the two squares being rotated relative to each other by 45°. Each of the links is trivalent, but instead of comprising 12 pentagons and 6 hexagons, as would have been suggested by the fullerene hypothesis, it contains 8 pentagons, 8 hexagons and 2 squares, although the number of vertices 32 ≡ 4(B − 2) and faces 18 ≡ 2(B − 1) are still compatible with the GEM rules.

$^{d}$ We shall use the term pseudo-face to refer, rather loosely, to a set of connected polygons which act, from the point of view of symmetry of the associated polyhedron, as a single face.

The other three maps give Skyrmions of fullerene type, with the baryon density isosurface comprising the requisite number of pentagons and hexagons arranged in a trivalent polyhedron. The associated polyhedron for the D3h solution (which corresponds to 32:5 in [14]) comprises two copies of a hexagonal triple linked by a belt of 12 pentagons, which can be thought of as being made of 3 sets of four fused pentagons in a C3 arrangement. The D3 and D3d solutions are very similar: to make each of the associated polyhedra (which correspond to configurations 32:6 and 32:4 in [14] respectively) first start with two pentagon triples. There are six places on each triple to which one can add another polygon, which fall into two types: one can connect to a single pentagon edge, or between two pentagons, connecting to an edge of each. To make the two different configurations, one must add three hexagons and three pentagons alternately around each of the triples, the difference being which polygon (pentagon or hexagon) connects to the two different sites. In particular, the D3d configuration has hexagons connected to the single pentagon, and pentagons between the two pentagons; vice versa for the D3 configuration. Once one has added these 6 polygons, the two identical copies are then connected, the two being rotated relative to each other by 60° in the D3d configuration, and at no particular fixed angle in the case of D3. As fullerenes these three structures are tabulated in [14], along with three other possibilities which have less symmetry (2 × C2 and D2).
Unfortunately, these lower symmetry solutions are impossible to find using the simulated annealing algorithm since their symmetry groups are subgroups of D4d and the minimum energy rational map with this symmetry is already known. In order to try to understand which of the four configurations is the true minimum we have relaxed all the different initial configurations made from two individual Skyrmion clusters whose baryon numbers sum to 10, that is, 9 + 1, 8 + 2, 7 + 3, 6 + 4 and 5 + 5. None of these form the D4d configuration, nor that with D3h, suggesting (but not proving) that neither of them is the minimum energy solution, and that they are hence likely to be saddle points. However, it appears that it is possible to produce both the D3d and D3 configurations from collisions. In particular, the 7 + 3 relaxation appears to give the more symmetric D3d configuration, while all the others give one which only has D3 symmetry. We have already found solutions which we believe to be saddle points, for example, the B = 9 configuration with Td symmetry, but here we have evidence for a new phenomenon: local minima. Given that there is no symmetry which can be invoked to explain why there might be degenerate minima, it seems likely that the energies of the two configurations must differ by a minute amount. As we shall discuss in Sec. 4, given the uncertainties we are unable to ascertain which is the global minimum.
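The symmetry of the D4d maps (3.1) and (3.3) can be spot-checked numerically at an arbitrary point of the Riemann sphere. The sketch below is illustrative only: the function names are hypothetical, and the three relations tested (a 90° rotation z ↦ iz, the 180° rotation z ↦ 1/z, and a reflection composed with a 45° rotation, z ↦ e^{iπ/4} z̄) are the relations that follow directly from the algebraic form of the maps when a and b are real, taken here as the generators of the symmetry.

```python
import cmath

def R9(z, a=-3.38, b=-11.19):
    """Degree 9 map (3.1) with the quoted parameter values."""
    return z * (a + 1j * b * z**4 + z**8) / (1 + 1j * b * z**4 + a * z**8)

def R10(z, a=-8.67, b=14.75):
    """Degree 10 map (3.3) with the quoted parameter values."""
    return z**2 * (a + 1j * b * z**4 + z**8) / (1 + 1j * b * z**4 + a * z**8)

def close(u, v, tol=1e-9):
    return abs(u - v) <= tol * max(1.0, abs(u), abs(v))

def has_d4d_relations(R, k, z):
    """Check the three relations for a map of the form z^k * f(z^4)."""
    rot = cmath.exp(1j * cmath.pi / 4)  # rotation by 45 degrees
    return (
        close(R(1j * z), 1j**k * R(z))             # four-fold rotation
        and close(R(1 / z), 1 / R(z))              # two-fold rotation
        and close(R(rot * z.conjugate()),
                  rot**k * R(z).conjugate())       # reflection times rotation
    )
```

With k = 1 for (3.1) and k = 2 for (3.3), all three relations hold to rounding error at a generic test point, whereas pairing (3.1) with k = 2 fails, as expected.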
The reason that the rational map ansatz is so successful in describing Skyrmions is that they appear to prefer to be as spherical$^{e}$ as possible. Examination of the models of the associated polyhedra sheds some light on the preference of the rational map ansatz for the D4d and D3 configurations, which are very spherical, as opposed to the more elongated D3d configuration. It might be that this oddity is not reproduced in the full non-linear energy functional, and the configuration with D4d symmetry is impossible to reproduce.

The phenomenon of many different configurations at a given baryon number, often with energies very close to the minimal value, is a feature of the fullerene hypothesis which one might have been able to predict, since there are many possible polyhedra which contain 12 pentagons and 2B − 14 hexagons for B ≥ 9. The possibility of four-valent vertices and also trivalent configurations containing squares only makes things worse. This was the main motivation for the choice of a simulated annealing algorithm as our minimization scheme for rational maps, a choice which appears to have been vindicated. It is also one reason why we have used two very different numerical techniques, the rational map approach and full field simulations, to try to confirm the results we obtain, thereby increasing the confidence that the solutions we construct are the global minima. Unfortunately, in this case we were unable to make a definitive identification of the minimum energy Skyrmion, $S_{10}$, but have presented evidence that it is one of two configurations which are almost indistinguishable.

3.3. B = 11

The minimum value at this charge is I = 161.1 and this is obtained from the D3h symmetric map

\[ R = \frac{z^2(1 + az^3 + bz^6 + cz^9)}{c + bz^3 + az^6 + z^9} , \tag{3.5} \]
where a, b, c are real parameters, taking the values a = −2.47, b = −0.84, c = −0.13 at the minimum. The associated polyhedron (which corresponds to 36:13 in [14]) can be constructed by considering a hexagon to which 3 pentagons and 3 hexagons are connected alternately with a C3 symmetry. Each of the pentagons is part of a set of four fused together, each of which is placed in the C3 arrangement. Each of these is then connected to another hexagon, which is directly below the first one and can be thought of as being equivalent to the original one from the point of view of symmetry. The spaces in between are filled up with hexagons, the whole structure comprising 12 pentagons and 8 hexagons.

$^{e}$ A sensible quantification of how spherical a solution is might be to consider the eigenvalues of the moments of the baryon density. A distribution which is isotropic, and hence almost spherical, would have all the eigenvalues approximately equal, whereas a more elongated solution would have one which is substantially different to the other two.
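The measure of sphericity suggested in the footnote can be sketched as follows. Everything here is an assumption made for illustration: the density is represented by a weighted cloud of sample points, the moment used is the second moment about the centroid, and the names `moment_eigenvalues` and `anisotropy` are hypothetical. The eigenvalues are obtained with the standard closed form for a real symmetric 3 × 3 matrix.

```python
import math

def moment_eigenvalues(points, weights):
    """Eigenvalues (largest first) of the second-moment tensor about the centroid."""
    total = sum(weights)
    c = [sum(w * p[i] for p, w in zip(points, weights)) / total for i in range(3)]
    m = [[sum(w * (p[i] - c[i]) * (p[j] - c[j])
              for p, w in zip(points, weights)) / total
          for j in range(3)] for i in range(3)]
    # Trigonometric closed form for a symmetric 3x3 matrix.
    off = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p2 = sum((m[i][i] - q) ** 2 for i in range(3)) + 2.0 * off
    if p2 == 0.0:
        return [q, q, q]  # exactly isotropic
    p = math.sqrt(p2 / 6.0)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # guard against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3
    return [e1, e2, e3]

def anisotropy(eigs):
    """Spread of the eigenvalues relative to their mean; zero when isotropic."""
    return (eigs[0] - eigs[2]) / (sum(eigs) / 3.0)
```

For six equal weights at the vertices of a regular octahedron the three eigenvalues coincide and the anisotropy vanishes, while a point set stretched along one direction gives one eigenvalue well separated from the other two.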
The exact same configuration was produced by the collision and then relaxation of two Lorentz-boosted B = 3 Skyrmions and a stationary B = 5 solution in a linear arrangement (3 + 5 + 3). Given that the fullerene hypothesis is clearly not the whole story for B = 9 and B = 10, it is reassuring that things get back on track at B = 11 with what appears to be the unique global minimum, $S_{11}$, being a fullerene type solution describable by the rational map ansatz.

3.4. B = 12

Considering all degree 12 maps the minimum is found to be I = 186.6 and this can be reproduced from a Td symmetric map constructed as follows. Decomposing 13 as a representation of T gives

\[ 13|_T = 2A + A_1 + A_2 + 3F . \tag{3.6} \]

Now let $p_\pm$ be the Klein polynomials [19]

\[ p_\pm = z^4 \pm 2\sqrt{3}\,i z^2 + 1 , \tag{3.7} \]
associated with the vertices and faces of a tetrahedron. On applying the C3 generator contained in the tetrahedral group to these polynomials they acquire the multiplying factors $p_\pm \mapsto e^{\pm 2\pi i/3} p_\pm$. Thus, the degree 12 polynomials $p_\pm^3$ are strictly invariant, forming a basis for the representation 2A in the above decomposition, and the polynomials $p_+^2 p_-$ and $p_+ p_-^2$ are bases for the representations $A_1$ and $A_2$ respectively. Explicitly, the rational map

\[ R = \frac{a p_+^3 + b p_-^3}{p_+^2 p_-} , \tag{3.8} \]
is Td symmetric for all real a and b, with the minimal value I = 186.6 obtained for a = −0.53, b = 0.78. We should note that there are other maps with Td symmetry (the denominator in (3.8) can be replaced by $p_+^3$, for example), but it appears that all these have a larger value for I. As for B = 11, this fullerene-like configuration was reproduced in non-linear field theory simulations, this time from initially well-separated B = 7 and B = 5 solutions (7 + 5), allowing us to conclude that it is the unique $S_{12}$. The associated polyhedron (which corresponds to configuration 40:40 in [14]) is in some ways similar to the Td solution at B = 9, there being four pentagon triplets positioned on the vertices of a tetrahedron. Each of these triplets is completely surrounded by hexagons, forming a polyhedron well-known in fullerene chemistry [14], where it is one of 40 configurations with 12 pentagons and 10 hexagons which are candidates for a $C_{40}$ cage.

3.5. B = 13

The minimal map of degree 13, deduced from simulated annealing of general maps, has cubic symmetry, another case with four-fold symmetry, which is incompatible with
the fullerene hypothesis. It is interesting to note that the fullerene hypothesis would have predicted a trivalent polyhedron made from 12 pentagons and 12 hexagons. We have not discussed representations of the cubic group$^{f}$, O, so we shall describe the group theory of this example by embedding it into the tetrahedral group, whose representations were reviewed earlier. One finds that

\[ 14|_T = 3E' + 2E_1' + 2E_2' , \tag{3.9} \]
so there is a two parameter family of T maps associated with the first component in (3.9). Setting one of these parameters to zero extends the symmetry to O and results in the one parameter family of maps

\[ R = \frac{z(a + (6a - 39)z^4 - (7a + 26)z^8 + z^{12})}{1 - (7a + 26)z^4 + (6a - 39)z^8 + az^{12}} , \tag{3.10} \]
whose minimum occurs at a = 0.40 + 5.18i when I = 216.7. The associated polyhedron is in many ways similar to a cube comprising six pseudo-faces, each of which is made of four pentagons with a four-valent bond, very similar to those in the B = 9 configuration. Clearly, in order for them to fit together, with all the other bonds being trivalent, each of these pseudo-faces must be rotated slightly relative to the one diametrically opposite, which removes the possibility of the reflection symmetries of the cube and, hence, the symmetry group Oh. The polyhedron comprises a total of 24 pentagons, as opposed to the 12 pentagons and 12 hexagons that would have been expected had the fullerene hypothesis been correct for this charge. As we shall discuss in Sec. 5.2, this solution, which is reproduced in relaxation of a range of initially well separated clusters (3 + 7 + 3 and 12 + 1, for example), is rather special, being obtainable via a multiple symmetry enhancement of a D2 fullerene polyhedron (probably from either configuration 44:75 or 44:89 in [14]).

If a is real then the symmetry can be extended to Oh, but the minimum in this class is quite a bit higher at I = 265.1 for a = 7.2. This Skyrmion, which is probably a saddle point, has recently been computed in [24] from the relaxation of a single Skyrmion surrounded by 12 others in an initially face centred cubic array. The polyhedron associated with this configuration is more akin to the octahedron than the cube, comprising eight triangular pseudo-faces. It contains mainly four-valent bonds, the only trivalent ones being placed in the centre of the pseudo-faces.

There is a degree 13 map with D4d symmetry whose I value is extremely close to the minimal one, in fact I = 216.8. This map is

\[ R = \frac{z(ia + bz^4 + icz^8 + z^{12})}{1 + icz^4 + bz^8 + iaz^{12}} , \tag{3.11} \]
where a, b, c are real and take the values a = −5.15, b = −50.46, c = 46.31 at the minimum. This Skyrmion looks similar to the O symmetric minimum, but it only has two four-valent vertices as opposed to the six in the cubic configuration, and can be thought of as being an extension of the D4d configuration with B = 9. The two four-valent bonds are part of two pseudo-faces forming the top and bottom, which are linked by eight copies of a single hexagon connected to a pentagon, alternately arranged pointing up and down, so that four hexagons and four pentagons connect to both the top and bottom pseudo-faces. This configuration contains 8 hexagons and 16 pentagons, breaks the GEM rules since the number of vertices is 42 rather than the predicted number 44 (the number of faces is still 24 ≡ 2(B − 1)) and can be created by a single symmetry enhancement of the same D2 fullerene as the O symmetric configuration (see Sec. 5.2 for details). Given the similarity of this configuration to the minimum, it is no surprise that the values of I are very close. We conclude, therefore, that $S_{13}$ has O symmetry, can be approximated by the rational map (3.10), and that the GEM rules and the fullerene hypothesis break down at this charge as they did for B = 9.

$^{f}$ The cubic symmetry group O is that of the octahedron/cube, without all the reflection symmetries contained in the full symmetry group Oh.

3.6. B = 14

The minimizing map of degree 14 has only a relatively small symmetry, that of D2. The map can be written in the form

\[ R = \left. \sum_{j=0}^{7} a_j z^{2j} \right/ \sum_{j=0}^{7} a_{7-j} z^{2j} , \tag{3.12} \]
where a7 = 1 and a0, ..., a6 are complex parameters. The minimum is I = 258.5 which occurs when the parameters are those given in Table 3. This configuration has much less symmetry than any of the others previously described and the associated polyhedron is difficult to visualize in detail. It can be constructed by arranging the 12 pentagons in 4 sets of 3. The 4 sets should be split up into two pairs, each of which is connected by three hexagons, one in the gap between two pentagons on each side, and the other two either side of the first. The two pairs should then be connected by a band of a further 8 hexagons, making 14 in total. The configuration is of the fullerene type (it corresponds to configuration 48:144 in [14]) and is one of 192 possibilities containing 12 pentagons and 14 hexagons.

Table 3. The coefficients of the minimal D2 map with B = 14.

           a0      a1      a2      a3      a4      a5      a6
Re(a)     0.8    −5.0    −3.0   −53.4   −15.2   −13.1     0.9
Im(a)     0.3   −13.5     3.7   −59.4    66.2    34.1   −11.6
Attempts to reproduce this solution by the relaxation of well-separated clusters have proved unsuccessful. We have tried initial conditions consisting of 7 + 7, 12 + 2 and 13 + 1, and in each case the same configuration, shown in Fig. 5, was the end-product. This configuration has even less symmetry than the minimum energy rational map, having just C2 symmetry. The associated polyhedron is almost impossible to describe due to the lack of symmetry; suffice to say that it contains 12 pentagons and 14 hexagons, and corresponds to configuration 48:83 (footnote g) in [14]. Note that this Skyrmion is very elongated and so it is not surprising that the rational map approximation does not describe this configuration very well, since it assumes that the baryon density has the same angular distribution on concentric spherical shells. Presumably there is a rational map which describes a distorted, more spherical, version of this Skyrmion, but its I value will be larger than the minimal one. Unfortunately we are unable to find this rational map using simulated annealing since its symmetry group is contained within that of the minimizing map.

Fig. 5. The baryon density isosurface and associated polyhedron of the B = 14 solution with C2 symmetry which is created during the collision of well-separated Skyrmion clusters. We believe this elongated solution to be the minimum energy Skyrmion at this charge.

We believe that S14 is the elongated configuration shown in Fig. 5, based on the fact that it was created easily in the collision of well-separated clusters. It is of the fullerene type, but at this stage we do not have a good description of it in terms of a rational map. In subsequent sections, when we use the rational map ansatz as the starting point for a quantitative investigation of Skyrmion properties, we will be forced to use the D2 configuration instead of what we believe to be the minimum. However, this should not lead to substantial errors.
3.7. B = 15

Considering all degree 15 maps, the minimum is found to be I = 296.3, attained by a map which has tetrahedral symmetry, T (footnote h). To construct the map the relevant decomposition is
$$ \underline{16}|_T = 2E' + 3E'_1 + 3E'_2 . \tag{3.13} $$

Footnote g. This was identified by computing the pentagon index for the associated polyhedron and checking it against the Table in [14].

Footnote h. As for the cubic group O, the group T is that of the tetrahedron, but without the reflection symmetries of Td.
At first sight it may appear that there is a one (complex) parameter family of tetrahedral maps corresponding to the 2E' component. However, this is not the case since this family of maps is degenerate, having common factors. From
$$ \underline{8}|_T = 2E' + E'_1 + E'_2 , \tag{3.14} $$
it follows that there is a one parameter family of degree seven tetrahedral maps (this family is constructed explicitly in [18]). Furthermore,
$$ \underline{9}|_T = A + A_1 + A_2 + 2F , \tag{3.15} $$
so there is a strictly invariant degree eight tetrahedral polynomial, which is given by p_+ p_− = 1 + 14z^4 + z^8 and is the vertex polynomial of a cube. A basis for the 2E' component of (3.13) is obtained by multiplying each basis polynomial for the 2E' component in (3.14) by p_+ p_−, and hence the corresponding map is degenerate, being only degree seven rather than 15.

The 3E'_1 component in (3.13) does correspond to a genuine two (complex) parameter family of degree 15 tetrahedral maps. Using the methods described in [18] we find that this family of maps is given by R = p/q where
$$ \begin{aligned}
p ={}& i\sqrt{3}(1 + a - b)z^{15} + (77 - 99a - 5b)z^{13} \\
&+ i\sqrt{3}(637 + 21a + 35b)z^{11} + (1001 + 561a - 65b)z^{9} \\
&+ i\sqrt{3}(-429 + 99a + 45b)z^{7} + (-1001 - 297a - 127b)z^{5} \\
&- i\sqrt{3}(273 + 185a + 15b)z^{3} + (115 + 27a + 5b)z ,
\end{aligned} \tag{3.16} $$
and q(z) = z^15 p(1/z). The value I = 296.3 is obtained when a = 0.16 + 2.06i, b = −4.47 − 8.57i. If a and b are both real then the symmetry extends to Td, but the minimum in this class is higher at I = 313.7, when a = 4.64, b = −20.45.

The polyhedron associated with the T symmetric solution is of the fullerene type. It contains 12 pentagons and 16 hexagons which can be thought of as being arranged in 8 pseudo-faces: 4 of these consist of hexagon triples, whereas the other 4 can be made from a hexagon connected to 3 pentagons in a C3 arrangement. The 4 hexagon triples can be thought of as being placed on the vertices of a tetrahedron, and the other 4 pseudo-faces, which are connected to the others, can be thought of as being on the vertices of another tetrahedron which is not dual to the first, removing the possibility of reflection symmetries. There is no fullerene polyhedron with Td symmetry [14], and hence the Td configuration must contain four-valent bonds. In fact it contains more four-valent bonds than trivalent ones, in an essentially similar way to the Oh configuration for B = 13.

The T symmetric solution was reproduced in the relaxation of clusters containing 12 + 3 and 13 + 2, initially well separated, and hence we conclude that S15 has T symmetry, is of the fullerene type and can be reproduced by a rational map.
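The degeneracy argument for the 2E' component can be checked by expanding the invariant polynomial explicitly. The sketch below assumes the standard Klein forms p_± = z^4 ± 2√3 i z^2 + 1 for the tetrahedral vertex and face polynomials; these forms are not written out in this section, so they should be read as an assumption rather than as a quotation.

```latex
\begin{align*}
p_+ p_- &= \bigl(z^4 + 1 + 2\sqrt{3}\,i z^2\bigr)\bigl(z^4 + 1 - 2\sqrt{3}\,i z^2\bigr) \\
        &= (z^4 + 1)^2 - \bigl(2\sqrt{3}\,i z^2\bigr)^2 \\
        &= z^8 + 2z^4 + 1 + 12 z^4 \\
        &= 1 + 14 z^4 + z^8 .
\end{align*}
% If R_7 = P/Q is a degree seven tetrahedral map from (3.14), then
% (p_+ p_- P)/(p_+ p_- Q) has polynomials of degree 15 but a common
% factor of degree 8, so as a map its degree is 15 - 8 = 7.
```

The eight roots of 1 + 14z^4 + z^8 are the stereographic images of the vertices of a cube, which is consistent with the statement in the text that this is the vertex polynomial of a cube.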
3.8. B = 16

B = 16 is another interesting situation where it appears that the minimum energy rational map may not in fact be the minimum energy Skyrmion. The minimizing map of degree 16 has D3 symmetry and takes the form
$$ R = \sum_{j=0}^{5} a_j z^{3j+1} \Big/ \sum_{j=0}^{5} a_{5-j} z^{3j} , \tag{3.17} $$
where a5 = 1 and a0, ..., a4 are complex parameters. The minimum is I = 332.9 which is attained when the parameters take the values presented in Table 4. The associated polyhedron can be constructed by first taking two sets of 3 hexagons, each of which is almost flat. Now connect a total of 6 pentagons and 3 hexagons, in 3-fold cyclic order around the flat structure, such that each of the gaps between the original hexagons is filled by a pentagon flanked on one side by another pentagon and on the other side by a hexagon. A further 6 pentagons, split into 3 pairs, are used to fuse the two halves; each of the pairs connecting sets of 2 pentagons. Within the structure, there are a number of symmetrically placed groupings of polygons which are exactly those which were found to lead to symmetry enhancement as observed for B = 9 and B = 13. However, in this case a close examination of the baryon density isosurface shows that all the bonds remain trivalent and the associated polyhedron is of the fullerene type, containing 12 pentagons and 18 hexagons.

Table 4. The coefficients of the minimal D3 map with B = 16.

           a0       a1       a2       a3       a4
Re(a)     5.4    −14.6     35.9   −125.2     −5.2
Im(a)    −0.4    −69.3    165.9     77.4     34.2
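The form (3.17) builds the symmetry into the map for any choice of coefficients: a rotation by 2π/3 about the z axis multiplies R by the same phase, and z → 1/z sends R to 1/R. The following is a minimal numerical sketch of that statement using the Table 4 values; the function names `d3_map` and `symmetry_residuals` are illustrative choices, not taken from the paper.

```python
import cmath

# a_0 .. a_4 from Table 4 (Re + i Im); a_5 = 1 as stated in the text.
A = [5.4 - 0.4j, -14.6 - 69.3j, 35.9 + 165.9j,
     -125.2 + 77.4j, -5.2 + 34.2j, 1.0 + 0.0j]


def d3_map(z, a=A):
    """Evaluate R(z) = sum_j a_j z^(3j+1) / sum_j a_(n-j) z^(3j), n = len(a) - 1."""
    n = len(a) - 1
    num = sum(a[j] * z ** (3 * j + 1) for j in range(n + 1))
    den = sum(a[n - j] * z ** (3 * j) for j in range(n + 1))
    return num / den


def symmetry_residuals(z, a=A):
    """Return |R(wz) - w R(z)| with w = exp(2 pi i / 3), and |R(1/z) R(z) - 1|."""
    omega = cmath.exp(2j * cmath.pi / 3)
    r = d3_map(z, a)
    rotation = abs(d3_map(omega * z, a) - omega * r)
    inversion = abs(d3_map(1 / z, a) * r - 1)
    return rotation, inversion


if __name__ == "__main__":
    print(symmetry_residuals(0.7 + 0.3j))  # both entries should be at rounding level
```

Both residuals vanish identically in exact arithmetic, so the check only confirms that the coefficients were arranged in the pattern of (3.17); it says nothing about whether the Table 4 values minimize I.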
There exists a family of D2 symmetric rational maps with B = 16 of the form
$$ R = \sum_{j=0}^{8} a_j z^{2j} \Big/ \sum_{j=0}^{8} a_{8-j} z^{2j} , \tag{3.18} $$
where a8 = 1 and a0, ..., a7 are complex parameters. The minimum in this class is I = 333.4, very close to that with D3, and is attained when the parameters take the values presented in Table 5. Since the solution has very little symmetry it is difficult to describe the associated polyhedron, as was the case for B = 14. It is of the fullerene type, comprising 12 pentagons and 18 hexagons, and can be thought of as being formed from two identical half shells connected together. To construct each of the two shells, start with a hexagon and attach to it, in cyclic order, two pentagons, a hexagon, another two pentagons and another hexagon. Then connect a pentagon, pointing downwards, to the edge of the two hexagons, and add three hexagons to each side connecting the two pentagons.
Table 5. The coefficients of the minimal D2 map with B = 16.

           a0      a1      a2       a3      a4       a5      a6      a7
Re(a)     0.0    −4.2     6.9     39.8   −76.4   −201.0    −5.9    −9.7
Im(a)     0.5    19.9     4.2   −105.0    64.8    −41.0    27.8    −2.7
Given that there are at least two rational maps with very similar values of I, it is interesting to see if we can create them both via the relaxation of initially well-separated clusters. To this end we have tried a number of different initial conditions (7 + 9, 12 + 4 and 13 + 3). In contrast to the B = 10 case, where we were able to create both the D3 and D3d configurations, in each case the end-product of the relaxation process had D2 symmetry. This strongly suggests, but does not prove, that this is the global minimum energy solution and that the D3 configuration may be a saddle point solution. This is interesting since it suggests that the relative ordering of the D3 and D2 solutions is probably different for the full non-linear energy functional and for the rational map ansatz.

3.9. B = 17

The case of B = 17 is interesting since it was conjectured in [5], on the basis of the fullerene hypothesis, that there might exist a Skyrmion configuration of this charge with the same structure as that of Buckminsterfullerene, C60, which comprises 20 hexagons and 12 pentagons in an icosahedral configuration. This is well known as the standard design for a football since it is almost spherical, and also in civil engineering where it was championed by Buckminster Fuller as a candidate for a geodesic dome. It is the isolated pentagon structure (each pentagon is isolated by connecting it to 5 hexagons) with the lowest number of vertices, and appears to be the most stable carbon cage, having been the subject of much interest in chemistry in the recent past.

An icosahedrally symmetric rational map was found in [18] and this exact same map is reproduced by minimizing over all degree 17 maps. The value I = 363.4 is given by the Yh symmetric buckyball map (footnote i)
$$ R = \frac{17z^{15} - 187z^{10} + 119z^{5} - 1}{z^2 (z^{15} + 119z^{10} + 187z^{5} + 17)} . \tag{3.19} $$
Further confirmation that this is indeed the minimum energy configuration at this charge comes from non-linear field theory simulations. We have performed relaxations of initial configurations of 12 + 5, 13 + 4 and 5 + 7 + 5, all of which relax very quickly to the buckyball structure.

Footnote i. In [18] the value of I quoted for the B = 17 buckyball map was the result of a typographical error and should read 363.41, not 367.41.
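The face centres of the associated polyhedron are given by the zeros of the Wronskian w = p'q − q'p of the numerator and denominator, which for the buckyball should account for 2B − 2 = 32 faces (12 pentagons and 20 hexagons). The sketch below computes w exactly for the map (3.19) in integer arithmetic. The helper names are illustrative, and counting a drop in polynomial degree as a root at z = ∞ is the usual Riemann sphere convention, stated here as an assumption about how the count is meant.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_der(a):
    """Differentiate a polynomial given as an ascending coefficient list."""
    return [k * a[k] for k in range(1, len(a))]


def wronskian(p, q):
    """Return w = p'q - q'p with trailing zero coefficients removed."""
    left = poly_mul(poly_der(p), q)
    right = poly_mul(poly_der(q), p)
    n = max(len(left), len(right))
    left = left + [0] * (n - len(left))
    right = right + [0] * (n - len(right))
    w = [l - r for l, r in zip(left, right)]
    while len(w) > 1 and w[-1] == 0:
        w.pop()
    return w


def sparse(terms, degree):
    """Build an ascending coefficient list from a {power: coefficient} dict."""
    coeffs = [0] * (degree + 1)
    for power, c in terms.items():
        coeffs[power] = c
    return coeffs


B = 17
# Numerator and denominator of the buckyball map (3.19).
p = sparse({0: -1, 5: 119, 10: -187, 15: 17}, 15)
q = sparse({2: 17, 7: 187, 12: 119, 17: 1}, 17)

w = wronskian(p, q)
finite_roots = len(w) - 1                      # degree of w, counted with multiplicity
roots_at_infinity = (2 * B - 2) - finite_roots

if __name__ == "__main__":
    print(finite_roots, roots_at_infinity, w[0], w[1], w[-1])
```

With this orientation the Wronskian has degree 31 and vanishes at z = 0, so one face centre sits at each pole of the sphere, in line with the five-fold axis suggested by the powers of z^5 in (3.19).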
In addition to the buckyball map there is also a non-fullerene map which has a low value of I. This map has Oh symmetry, and again we shall describe its group theory construction by embedding it into the tetrahedral group. The tetrahedral decomposition in this case reads
$$ \underline{18}|_T = 3E' + 3E'_1 + 3E'_2 , \tag{3.20} $$
with the two-parameter family of T maps corresponding to the 3E' component, reducing to a one-parameter family of O maps by setting one of the two parameters to zero. If the remaining parameter is real then the symmetry extends to Oh and the map is given by R = p/q where
$$ p = (129 + a) + (2380 + 116a)z^{4} + (24310 + 286a)z^{8} + (6188 - 156a)z^{12} + (17 + 9a)z^{16} , \tag{3.21} $$
and q(z) = z^17 p(1/z). The minimal Oh map in this class has I = 367.2 when a = 280.9.

Given that the buckyball map is the minimal map with B = 17 and is reproduced in the numerical field theory relaxations, we conclude that the fullerene hypothesis and the conjecture of [5] are spectacularly confirmed at this charge; S17 has Yh symmetry and the associated polyhedron is the buckyball.

3.10. B = 18

After the particularly high symmetry of the B = 17 solution, the minimizing map of degree 18 is relatively unremarkable, having only D2 symmetry, and takes the form
$$ R = \sum_{j=0}^{9} a_j z^{2j} \Big/ \sum_{j=0}^{9} a_{9-j} z^{2j} , \tag{3.22} $$
where a9 = 1 and a0, ..., a8 are complex parameters. The minimum value is I = 418.7, which is attained when the parameters take the values given in Table 6. The associated polyhedron, which is of fullerene type but is not an isolated pentagon structure (footnote j), is difficult to describe for similar reasons to the B = 14 and B = 16 solutions with D2 symmetry. It contains 12 pentagons and 22 hexagons, and can best be described, as for B = 16, in terms of two half shells which fit together to create the whole polyhedron. Each shell can be created by first taking a hexagon and connecting to it, in cyclic order, two hexagons, a pentagon, two hexagons and another pentagon. Now connect a pentagon, pointing downward, to each of the hexagons, then fill in the gaps, of which there are 6, with hexagons.

Footnote j. In fact, no isolated pentagon structures exist with 12 pentagons and 22 hexagons [14].

Table 6. The coefficients of the minimal D2 map with B = 18.

           a0      a1      a2      a3      a4      a5       a6      a7      a8
Re(a)    −0.1    −5.6     3.2    51.5   −35.9   −50.9   −168.6    −0.3   −10.8
Im(a)    −0.5   −16.4    −0.8   104.8   −73.2    10.7     51.4    −3.3     1.4

Given the rather low symmetry of the minimum energy rational map, one might think that, as with the cases of B = 14 and B = 16, there might be some confusion. However, by relaxing initial conditions consisting of 9 + 9 and 17 + 1 we have reproduced the D2 configuration. Hence, we conclude that S18 is of the fullerene type and can be well approximated by the D2 rational map above.
3.11. B = 19

The minimum value at degree 19 is I = 467.9 and is attained by a D3 symmetric map of the form
$$ R = \sum_{j=0}^{6} a_j z^{3j+1} \Big/ \sum_{j=0}^{6} a_{6-j} z^{3j} , \tag{3.23} $$
where a6 = 1 and a0, ..., a5 are complex parameters given in Table 7. The associated polyhedron can be constructed by taking two sets of three hexagons which are almost flat. To each, connect a pentagon in the gap connecting two hexagons and another pentagon next to it connected to only one of the hexagons in the triple. These two structures are C3 symmetric and can be thought of as forming the top and bottom of the polyhedron. They are connected together by 9 sets of two hexagons, one which connects to the top and the other which connects to the bottom. This is a fullerene type polyhedron containing 12 pentagons and 24 hexagons.

Table 7. The coefficients of the minimal D3 map with B = 19.
a0
a1
a2
a3
a4
Re(a)
5.2
Im(a)
0.9
−0.9
71.4
−325.4
−116.0
73.6
41.8
−96.7
95.9
a5 0.83 −32.5
There is a more symmetric map, with Th symmetry, whose I value is only slightly higher than the minimum at I = 469.8. Computing the relevant decomposition
$$ \underline{20}|_T = 4E' + 3E'_1 + 3E'_2 , \tag{3.24} $$
shows that there is a three parameter family of tetrahedral maps of degree 19 corresponding to the first component in (3.24). This family of maps is given by R = p/q where
$$ \begin{aligned}
p ={}& (239 - 9b)z + (503a - 25c)z^{3} + (-5508 + 460b)z^{5} + (-1300a + 284c)z^{7} \\
&+ (-4862 - 286b)z^{9} + (1794a + 210c)z^{11} + (9996 - 196b)z^{13} \\
&+ (-484a + 44c)z^{15} + (135 + 31b)z^{17} + (-a - c)z^{19} ,
\end{aligned} \tag{3.25} $$
and q(z) = z^19 p(1/z). The minimum in this family is Th symmetric with a = 5.5, b = 6.3, c = 37.3 and produces the I value given above. The associated polyhedron contains many four-valent bonds and is most definitely not of the fullerene type. In fact a cursory glance at the configuration might convince one that the polyhedron has cubic symmetry, there being eight sets of three fused pentagons effectively situated at the corners of a cube. However, the two pentagons which are situated at the centre of each face break the cubic symmetry since they point alternately in different directions. In total the configuration comprises 36 pentagons.

The D3 configuration was reproduced in the collision and subsequent relaxation of 17 + 2 and therefore we conclude that it corresponds to S19. It is of the fullerene type and can be approximated using the rational map (3.23).

3.12. B = 20

For B = 20 the minimum value is I = 519.6 and is attained by the D6d symmetric map
$$ R = \frac{z^{2}(ia + bz^{6} + icz^{12} + z^{18})}{1 + icz^{6} + bz^{12} + iaz^{18}} , \tag{3.26} $$
with a = −16.8, b = −288.3 and c = 215.8. The associated polyhedron can be thought of as being created from two half shells connected together and rotated relative to each other by 30° to give D6d symmetry. Each half can be constructed from a single hexagon, surrounded by another six forming an almost flat hexagonal structure, which are then surrounded by 6 hexagons and 6 pentagons. The flat structure has 12 positions for attaching another polygon, 6 places to connect to one hexagon and 6 places to connect to two. The pentagons connect to two and the hexagons connect to one, forming a structure which is of the isolated pentagon fullerene type (corresponding to configuration 72:1 in [14]). It contains 12 pentagons and 26 hexagons, was reproduced in the relaxation of 17 + 3 and, hence, we conclude that it is S20.

3.13. B = 21

The B = 21 minimizing map is Td symmetric. From the decomposition
$$ \underline{22}|_T = 3E' + 4E'_1 + 4E'_2 \tag{3.27} $$
there is a three parameter family of T symmetric maps corresponding to the 4E'_1 component and this family of maps is given by R = p/q where
$$ \begin{aligned}
p ={}& 1025 + 3a + b + c + i(210 + 890a + 74b - 10c)\sqrt{3}\,z^{2} \\
&+ (5985 + 6327a + 1433b - 75c)z^{4} + i(54264 - 7752a - 680b + 392c)\sqrt{3}\,z^{6} \\
&+ (203490 + 5814a - 1598b + 690c)z^{8} + i(352716 + 16796a + 2652b + 260c)\sqrt{3}\,z^{10} \\
&+ (293930 - 25194a + 442b + 130c)z^{12} + i(116280 - 7752a - 1768b - 120c)\sqrt{3}\,z^{14} \\
&+ (20349 + 14535a + 221b - 243c)z^{16} + i(1330 - 646a + 234b - 10c)\sqrt{3}\,z^{18} \\
&+ (21 + 51a + 13b + 9c)z^{20} ,
\end{aligned} \tag{3.28} $$
and q(z) = z^21 p(1/z). The minimum of I = 569.9 is obtained when a = 20.8, b = −102.0, and c = 570.1, for which the symmetry extends to Td since a, b, c are all real. The associated polyhedron can be thought of in terms of four copies of two different pseudo-faces, one set placed on the vertices of a tetrahedron and the other on the vertices of a tetrahedron dual to the first. In this respect it is very similar to the Td configuration with B = 9, and different to the T configuration with B = 15. One set of pseudo-faces consists of hexagon triples, whereas the other consists of a hexagon surrounded alternately by hexagons and pentagons.

Note that this map is the latest in an infinite family of tetrahedral maps, corresponding to charges B = 6n + 3, where n = 0, 1, 2, 3, .... This is because
$$ \underline{6n + 4}|_T = nE' + (n + 1)E'_1 + (n + 1)E'_2 \tag{3.29} $$
so there is an n parameter family of tetrahedral maps corresponding to the middle component in the above. For n = 0, 2, 3 (B = 3, 15, 21) we have seen that this family includes the minimal map, and for n = 1 (B = 9) this family includes a map which is very close to the minimal value. Thus it seems possible that other members of this family will be minimal maps, for example, for B = 27, although this configuration must have only T symmetry if it is to be of the fullerene type and will therefore be similar to that with B = 15.

This solution was reproduced in the relaxation of 17 + 4 and 20 + 1, and hence we conclude that it corresponds to S21. It is of the isolated pentagon fullerene type (corresponding to configuration 76:2 in [14]), comprising 12 pentagons and 28 hexagons, and can be reproduced by the rational map (3.28).
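The tetrahedral decompositions quoted in this section can be sanity-checked by counting dimensions: the restriction of the (B + 1)-dimensional representation must have total dimension B + 1. The sketch below does this bookkeeping, assuming the usual dimensions for the irreducible representations of the binary tetrahedral group (1 for A, A_1, A_2; 2 for E', E'_1, E'_2; 3 for F). Those dimensions are not stated in this section and are an assumption of the example, as are the dictionary and function names.

```python
# Assumed dimensions of the irreducible representations.
DIM = {"A": 1, "A_1": 1, "A_2": 1, "E'": 2, "E'_1": 2, "E'_2": 2, "F": 3}

# Multiplicities as quoted in (3.13), (3.14), (3.15), (3.20), (3.24), (3.27),
# keyed by the dimension B + 1 of the representation being restricted.
DECOMPOSITIONS = {
    8: {"E'": 2, "E'_1": 1, "E'_2": 1},
    9: {"A": 1, "A_1": 1, "A_2": 1, "F": 2},
    16: {"E'": 2, "E'_1": 3, "E'_2": 3},
    18: {"E'": 3, "E'_1": 3, "E'_2": 3},
    20: {"E'": 4, "E'_1": 3, "E'_2": 3},
    22: {"E'": 3, "E'_1": 4, "E'_2": 4},
}


def total_dimension(multiplicities):
    """Sum of (multiplicity x dimension) over the irreducible components."""
    return sum(DIM[name] * m for name, m in multiplicities.items())


def family(n):
    """Multiplicities in (3.29) for the restriction of dimension 6n + 4."""
    return {"E'": n, "E'_1": n + 1, "E'_2": n + 1}


def family_charges(count):
    """Charges B = 6n + 3 for n = 0 .. count - 1."""
    return [6 * n + 3 for n in range(count)]


if __name__ == "__main__":
    for dim, mult in DECOMPOSITIONS.items():
        print(dim, total_dimension(mult))
    print(family_charges(5))
```

A component with multiplicity m contributes an (m − 1) parameter family of maps, which is how the n parameter family attached to (n + 1)E'_1 in (3.29) arises.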
3.14. B = 22

The minimum value at degree 22 is I = 621.6, obtained from a D5d symmetric map
$$ R = \frac{az^{2} + ibz^{7} + cz^{12} + idz^{17} + z^{22}}{1 + idz^{5} + cz^{10} + ibz^{15} + az^{20}} , \tag{3.30} $$
where a = 24.8, b = −814.6, c = −2000.3, d = 320.3. The polyhedron associated with this configuration can be constructed in two halves, which fit together as with many of the solutions already described. To construct each half, take a pentagon and surround it by 5 hexagons. There are 10 places to position another polygon, 5 of which connect to two hexagons and 5 which connect to just one. Place 5 pentagons in the gaps connecting to two hexagons and 5 hexagons just connecting to one. Then place a further 5 hexagons, connecting to the pentagons. This configuration, which comprises 12 pentagons and 30 hexagons, is of the isolated pentagon fullerene type (it corresponds to configuration 80:1 in [14]).

For B = 22 we find that there is an interesting phenomenon in that an icosahedrally symmetric fullerene polyhedron exists [14], but no corresponding rational map generated Skyrmion. The Yh symmetric C80 fullerene is constructed in a similar manner to the D5d fullerene described above, except that one interchanges the pentagons and hexagons at the point at which there was a choice in inserting polygons into the 10 positions. It is easy to check that there are no Y symmetric rational maps of degree 22, since 23|_Y contains only representations of dimension three and higher. Thus there are symmetric fullerene polyhedra which do not correspond to symmetric rational maps. This apparent puzzle can be understood (footnote k) by realizing that there are rational map generated Skyrmions whose baryon (and energy) density has more symmetry than the Skyrme field itself. As mentioned earlier, the baryon density of a Skyrmion is localized around the edges of a polyhedron with the face centres of the polyhedron given by the vanishing of the derivative of the rational map, or more accurately by the zeros of the Wronskian of the numerator and the denominator
$$ w(z) = p'(z)q(z) - q'(z)p(z) , \tag{3.31} $$
which is in general a degree 2B − 2 polynomial in z. For B = 22 the Wronskian is, therefore, a degree 42 polynomial, and although there are no Y symmetric degree 22 rational maps there is a Y symmetric degree 42 polynomial, given by the product of the Klein polynomials corresponding to the edges and vertices of an icosahedron [19]. Therefore, it appears that the existence of a symmetric fullerene polyhedron coincides with the existence of a rational map whose Wronskian has this symmetry, but that the existence of such a Wronskian does not imply the existence of a symmetric rational map itself. Fortunately, we have not encountered this situation in our study of minimal energy rational maps and Skyrmions for the other charges we have studied, since it would make the problem of identifying and constructing a particular rational map a much more difficult exercise.

Footnote k. We thank Conor Houghton for pointing this out to us.

There exists a family of D3 symmetric rational maps with B = 22 of the form
$$ R = \sum_{j=0}^{7} a_j z^{3j+1} \Big/ \sum_{j=0}^{7} a_{7-j} z^{3j} , \tag{3.32} $$
where a7 = 1 and a0, ..., a6 are complex parameters. The minimum value of I in this class of maps is I = 623.4, which is very close to that for the D5d map. The coefficients at this minimum are presented in Table 8. This configuration, which is also of the isolated pentagon fullerene type, corresponds to configuration 80:4 in [14]. It can be constructed by first taking two hexagon triples, each of which is surrounded by 3 pentagons and 6 hexagons in a C3 arrangement, hexagons filling the gaps which connect to 2 of the hexagons in the original triples. Then connect 3 more pentagons to each 'half' in between each of the C3 symmetric hexagon triples. The two 'halves' should then be connected by a band of 12 hexagons around the centre, which is split up into 3 lots of 4 by the C3 symmetry. The whole polyhedron contains 12 pentagons and 30 hexagons, as for the D5d configuration.

Table 8. The coefficients of the minimal D3 map with B = 22.

           a0      a1       a2      a3      a4       a5      a6
Re(a)     4.5   −75.4   −393.4   270.5    26.1    123.8    41.5
Im(a)    −3.2    54.9     62.3   391.5   872.7   −177.2    13.8
Relaxations of clusters containing 17 + 5, 20 + 2 and 21 + 1 all lead to the same structure, that with D3 symmetry. Therefore, we conclude that S22 is that approximated by the rational map (3.32). This removes, from the point of view of this paper at least, further motivation for attempting to create the Skyrmion with the Y symmetric baryon density isosurface.

3.15. Summary

The main conclusion of this section on Skyrmion identification is that the fullerene hypothesis appears to apply for a wide range of charges, and that the rational map ansatz can be used to make a good approximation to the solutions. In particular, we have concluded (footnote l) that SB is of the fullerene type for 7 ≤ B ≤ 22, except when B = 9 and B = 13. For these charges the associated polyhedron contains four-valent bonds, but as we shall discuss in Sec. 5.2, even these solutions can be related to fullerene polyhedra via symmetry enhancement. Clearly, there is a strong correlation between the structure of multi-Skyrmions and that of fullerene polyhedra.

Footnote l. We should note that we have also confirmed the results of [5, 18] for 1 ≤ B ≤ 8.
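The polygon counts quoted for the fullerene-type solutions follow a simple pattern, which can be tabulated as a cross-check. The sketch below assumes the counting rules implied by the numbers in this section: 2(B − 1) faces and 4(B − 2) vertices (the GEM rules), with 12 pentagons and the remaining faces hexagons. The edge count is derived here from trivalency and is not quoted in the text; the function name `fullerene_counts` is illustrative.

```python
def fullerene_counts(B):
    """Face, vertex and edge counts for a trivalent fullerene-type polyhedron of charge B."""
    faces = 2 * (B - 1)          # GEM rule quoted in the text
    vertices = 4 * (B - 2)       # e.g. 44 predicted for B = 13
    edges = 3 * vertices // 2    # every vertex trivalent, every edge shared by two vertices
    pentagons = 12
    hexagons = faces - pentagons  # equals 2(B - 7)
    return {
        "faces": faces,
        "vertices": vertices,
        "edges": edges,
        "pentagons": pentagons,
        "hexagons": hexagons,
    }


def euler_characteristic(counts):
    """V - E + F, which must equal 2 for a polyhedron with spherical topology."""
    return counts["vertices"] - counts["edges"] + counts["faces"]


if __name__ == "__main__":
    for B in (14, 17, 20, 21, 22):
        print(B, fullerene_counts(B))
```

The vertex count 4(B − 2) matches the first number in the configuration labels of [14] used above, for example 48 for B = 14 and 80 for B = 22.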
For B = 17, B = 20, B = 21 and B = 22, where there are fullerene polyhedra in which all the pentagons are surrounded by just hexagons, this type of configuration is picked out as that with minimum energy. In fullerene chemistry the isolated pentagon isomers are thought to minimize energy by placing the pentagonal defects as far as possible from each other, and it is likely that this is also taking place here. It is interesting to speculate that these highly spherical, isolated pentagon configurations will be the minima at higher charge, in a similar way to our suggestion of the fullerene hypothesis just on the basis of the B = 7, B = 8 minima and the B = 9 saddle point. Although there will no doubt be caveats, as we have reported in this paper for the fullerene hypothesis for B = 9 and 13, we believe they are likely to be the exception rather than the rule.

Notwithstanding these successes we have turned up a few oddities. Firstly, we have seen that in the cases of B = 10, B = 14, B = 16 and B = 22 the minimum energy rational maps might not be SB. For B = 10 the situation is particularly interesting since the minimum energy rational map is not of the fullerene type, yet only fullerenes are produced in the relaxation of initially well-separated clusters. For B = 14, by contrast, we have been unable to produce a rational map which accurately reproduces the configuration which is readily produced in the full relaxation, probably since it is elongated and has little symmetry. The cases of B = 16 and B = 22 are probably more mundane since the differences in the I values for the two fullerene-like configurations are very small, and it is not difficult to imagine that using I as the energy functional rather than the true energy manages to swap around the two configurations. These special cases vindicate our approach of using two different methods to make our identifications.
4. Skyrmion Energies

It is usual when presenting numerically generated minimum energy Skyrmion configurations to discuss their energy. Before we present what we believe are good representations of the energies, we should point out that identifying the symmetry is probably a better way of judging success than just this single number. In particular, making comparisons between different approaches for computing the energy presented in the literature is extremely hazardous, whereas the symmetry identification should be universal.

The approach that we shall discuss here is based on using the full non-linear dynamics to relax a solution generated using the rational map ansatz for a given charge and symmetry. We have already seen that the rational map ansatz systematically overestimates the ratio of the energy to baryon number (E/B) for a given configuration and it is clear that the numerical relaxation is likely to reduce this somewhat. Specifically, we have performed relaxations on numerical grids for all the solutions listed in Table 1 and Table 2 with (a) ∆x = 0.02 and N = 100 and (b) ∆x = 0.01 and N = 200, both using Dirichlet (fixed) boundary conditions
at the edge of the grid. These two discretized grids have exactly the same spatial extent and so when computing the initial profile function for the rational map ansatz we have set the profile function to zero at r∞ = 10.

The results of this extensive set of simulations are presented in Table 9 for the solutions which are the minimum energy rational maps with respect to the energy functional I and in Table 10 for the other critical points of I, some of which we have concluded in the previous section are, in fact, the minimum energy Skyrmions. The total simulation time (the number of timesteps multiplied by ∆t) for N = 100 is twice that for N = 200, although making an exact comparison can be misleading since the rate at which the relaxation takes place is somewhat arbitrary due to the periodic removal of kinetic energy. Suffice to say, in both cases we believe that we have run the code for long enough for it to settle down to the minimum.

Table 9. The results of relaxing the rational map solutions which minimize the function I. The first set correspond to N = 100 and ∆x = 0.02, while the second set have N = 200 and ∆x = 0.01. In both cases the profile function was set to be zero at r∞ = 10. Notice that the E/B values are very close for the two different size grids suggesting that this figure is universal with an error ±0.001.

                 N = 100, ∆x = 0.02              N = 200, ∆x = 0.01
 B    G        Edis      Bdis      E/B         Edis      Bdis      E/B
 1    O(3)    1.1591    0.9407   1.2322       1.2137    0.9849   1.2322
 2    D∞h     2.2335    1.8935   1.1796       2.3260    1.9726   1.1791
 3    Td      3.2773    2.8573   1.1470       3.3960    2.9627   1.1462
 4    Oh      4.2683    3.8091   1.1205       4.4265    3.9519   1.1201
 5    D2d     5.3308    4.7708   1.1174       5.5199    4.9409   1.1172
 6    D4d     6.3391    5.7230   1.1077       6.5692    5.9296   1.1079
 7    Yh      7.3243    6.6889   1.0950       7.5766    6.9210   1.0947
 8    D6d     8.3796    7.6441   1.0962       8.6690    7.9100   1.0960
 9    D4d     9.4026    8.5984   1.0936       9.7322    8.8990   1.0936
10    D4d    10.4212    9.5579   1.0903      10.7826    9.8893   1.0903
11    D3h    11.4464   10.5129   1.0888      11.8457   10.8788   1.0889
12    Td     12.4533   11.4721   1.0855      12.8888   11.8723   1.0856
13    O      13.4689   12.4304   1.0835      13.9311   12.8585   1.0834
14    D2     14.5057   13.3819   1.0840      15.0139   13.8480   1.0842
15    T      15.5214   14.3403   1.0824      16.0635   14.8387   1.0825
16    D3     16.5274   15.2969   1.0804      17.0167   15.8283   1.0808
17    Yh     17.5275   16.2677   1.0774      18.1205   16.8185   1.0774
18    D2     18.5677   17.2152   1.0786      19.2134   17.8094   1.0788
19    D3     19.6234   18.1913   1.0787      20.2717   18.7951   1.0786
20    D6d    20.6414   19.1607   1.0773      21.3198   19.7781   1.0779
21    Td     21.7056   20.1351   1.0780      22.3781   20.7580   1.0780
22    D5d    22.7349   21.1146   1.0767      23.4183   21.7525   1.0766

Table 10. Same as Table 9 but for the other saddle points of I listed in Table 2.

                 N = 100, ∆x = 0.02              N = 200, ∆x = 0.01
 B    G        Edis      Bdis      E/B         Edis      Bdis      E/B
 9*   Td      9.4281    8.5935   1.0971       9.7636    8.8975   1.0973
10*   D3     10.4223    9.5592   1.0903      10.7826    9.8891   1.0904
10*   D3d    10.4171    9.5586   1.0898      10.7791    9.8890   1.0900
10*   D3h    10.4623    9.5548   1.0950      10.8308    9.8882   1.0953
13*   D4d    13.4674   12.4295   1.0835      13.9311   12.8585   1.0834
13*   Oh     13.6152   12.4264   1.0957      14.0939   12.8548   1.0964
15*   Td     15.5792   14.3429   1.0862      16.1226   14.8369   1.0867
16*   D2     16.5294   15.2964   1.0806      17.1092   15.8281   1.0809
17*   Oh     17.5351   16.2654   1.0781      18.1389   16.8215   1.0783
19*   Th     19.6461   18.1854   1.0803      20.2996   18.7935   1.0801
22*   D3     22.7455   21.1162   1.0772      23.4175   21.7509   1.0766

The first thing to notice is that the computed values of the baryon number for the discrete grid, Bdis, are less than the relevant integer, the value for N = 200 being
closer than that for N = 100. There are two possible sources for this error: the first is numerical discretization error from the computation of the spatial derivatives (and the resulting numerical integration) and the other is that the physical size of the box r∞ is not large enough to encompass the whole solution and we have underestimated the gradient energy. In order to understand which is the dominant effect we have repeated the same relaxations with r∞ = 20 (∆x = 0.02 and N = 200) for B = 1–4, 9, 13, 17 and the results are presented in Table 11 for a total simulation time which is exactly the same as that for ∆x = 0.02, N = 100 and r∞ = 10. The values of B_dis agree to the third decimal place for B = 1–4 with the agreement being less good for larger values of B. Clearly discretization error is the dominant effect for the small values of B, and the truncation error due to r∞ being finite becomes important as B increases. Remarkably, the value of E_dis/B_dis appears to remain constant at the level of around ±0.001 for all the relaxations, irrespective of what is the main source of the

Table 11. The results of relaxing the rational map solutions for selected charges using N = 200, ∆x = 0.02 and r∞ = 20. Notice that the numerical values are almost identical to those listed in Table 9 for the same value of ∆x.

B     G      E_dis     B_dis     E/B
1     O(3)    1.1587    0.9410   1.2314
2     D∞h     2.2324    1.8936   1.1789
3     Td      3.2748    2.8562   1.1466
4     Oh      4.2683    3.8095   1.1204
9     D4d     9.3906    8.5870   1.0936
13    O      13.4572   12.4197   1.0835
17    Yh     17.5511   16.2862   1.0777

error in computing the individual values for the sensible parameters ∆x and N that we have used. Therefore, the main conclusion we will draw from these simulations is the value of E_dis/B_dis, which we will equate with the true value of E/B. Clearly, knowledge of E/B to a certain level of accuracy allows one to compute the energy of the solution, E_B, to the same level of accuracy, and the values of the energies based upon this hypothesis are presented in Table 12 and Table 13 for the minima of I and the other critical points respectively. For the subsequent discussions we will take the value of E/B to be that computed when ∆x = 0.01, N = 200 and r∞ = 10 subject to an assumed error of ±0.001, with the relative difference between adjacent values of B probably being even more accurate. This error budget is used to include the many other possible systematic uncertainties in computing E/B which have not already been discussed and the spread of the values computed.
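As a minimal sketch of how an energy E_B follows from the discretized quantities, the snippet below forms the ratio E_dis/B_dis and multiplies it by the integer charge B. The function name `energy_from_discrete` and the choice to round the ratio to four decimal places before multiplying are assumptions made for illustration (this rounding appears to reproduce the tabulated values). The two sample rows are the B = 19 and B = 22 entries from the final three columns of Table 9.

```python
def energy_from_discrete(B, e_dis, b_dis, digits=4):
    """Return (E/B, E_B) estimated from the discretized energy and charge.

    The ratio E_dis / B_dis is rounded first (an assumption that mirrors how
    the tabulated values appear to be formed), then multiplied by the
    integer baryon number B.
    """
    ratio = round(e_dis / b_dis, digits)
    return ratio, round(B * ratio, digits)


# (E_dis, B_dis) pairs copied from the last column group of Table 9.
rows = {
    19: (20.2717, 18.7951),
    22: (23.4183, 21.7525),
}

for B, (e_dis, b_dis) in rows.items():
    ratio, e_b = energy_from_discrete(B, e_dis, b_dis)
    print(f"B = {B}: E/B = {ratio:.4f}, E_B = {e_b:.4f}")
```

For these two rows the sketch gives E/B = 1.0786 and 1.0766, and E_B = 20.4934 and 23.6852, in agreement with the corresponding entries of Table 12.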
Table 12. The actual computed energies, E_B, of the solutions presented in Table 9 deduced from E_B = B × E_dis/B_dis.

B     G      E/B      E_B       E/B      E_B
1     O(3)   1.2322    1.2322   1.2322    1.2322
2     D∞h    1.1796    2.3592   1.1791    2.3582
3     Td     1.1470    3.4410   1.1462    3.4386
4     Oh     1.1205    4.4820   1.1201    4.4804
5     D2d    1.1174    5.5870   1.1172    5.5860
6     D4d    1.1077    6.6462   1.1079    6.6474
7     Yh     1.0950    7.6650   1.0947    7.6629
8     D6d    1.0962    8.7696   1.0960    8.7680
9     D4d    1.0936    9.8424   1.0936    9.8424
10    D4d    1.0903   10.9030   1.0903   10.9030
11    D3h    1.0888   11.9768   1.0889   11.9779
12    Td     1.0855   13.0260   1.0856   13.0272
13    O      1.0835   14.0855   1.0834   14.0842
14    D2     1.0840   15.1760   1.0842   15.1788
15    T      1.0824   16.2360   1.0825   16.2375
16    D3     1.0804   17.2864   1.0808   17.2928
17    Yh     1.0774   18.3158   1.0774   18.3158
18    D2     1.0786   19.4148   1.0788   19.4184
19    D3     1.0787   20.4953   1.0786   20.4934
20    D6d    1.0773   21.5460   1.0779   21.5580
21    Td     1.0780   22.6380   1.0780   22.6380
22    D5d    1.0767   23.6874   1.0766   23.6852

Table 13. The actual computed energies, E_B, of the solutions presented in Table 10 deduced from E_B = B × E_dis/B_dis.

B     G      E/B      E_B       E/B      E_B
9*    Td     1.0971    9.8739   1.0973    9.8757
10*   D3     1.0903   10.9030   1.0904   10.9040
10*   D3d    1.0898   10.8980   1.0900   10.9000
10*   D3h    1.0950   10.9500   1.0953   10.9530
13*   D4d    1.0835   14.0855   1.0834   14.0842
13*   Oh     1.0957   14.2441   1.0964   14.2532
15*   T      1.0862   16.2930   1.0867   16.3005
16*   D2     1.0806   17.2896   1.0809   17.2944
17*   Oh     1.0781   18.3277   1.0783   18.3311
19*   Th     1.0803   20.5257   1.0801   20.5219
22*   D3     1.0772   23.6984   1.0766   23.6852

The precise values of E/B we have computed here differ from those quoted for B = 1–9 in our earlier work [5] (see footnote m), and for B = 1–4 in recent work which used a simulated annealing algorithm on the full field dynamics [15]. By making comparison with [5] at the level of accuracy suggested there (∼ 1%), we see that only the B = 2 and B = 3 values are discrepant, and then the difference is only of the order of an extra 1%, although the actual values quoted are very different at the level of 3 decimal places. We believe the earlier method has a tendency to underestimate the true value of E/B since the value of the field at the boundary, which is kept fixed during the relaxation runs, is sensitive to the initial conditions, which are well-separated in contrast to the situation here. This is borne out on inspection of the E/B values computed for the relaxed solutions from well-separated clusters for large B used to identify the minima. The comparison with the results of [15] is less good, their results being systematically about 1% higher than those presented here and 2% higher than those of [5]. Although it is difficult to draw strong conclusions, we believe that these higher values could well be a consequence of using periodic boundary conditions rather than fixed. This assumption was used to make their computed values of B almost exactly an integer, but it is clear that such an assumption will modify the scale of the solution. In particular, their quoted value for the energy of a single Skyrmion is larger than the known value (E = 1.232), and the solution assuming periodic boundary conditions with a domain of finite extent will be somewhat different. We should comment on the computed values of E/B for values of B where we have more than one rational map which is of low energy with respect to I.

Footnote m: Note that the B = 9 solution of [5] had Td symmetry and so one should be careful to make the correct comparison.

Previously in a number of cases (B = 10, 16 and 22) we had concluded that the minimum energy rational map is not necessarily the minimum energy Skyrmion and one might hope that the relaxation process might confirm this with their energies being separated by a significant amount. For B = 9, 15, 17 and 19 it is clear that the computed values of E/B are systematically much higher for the solutions which are not minima of the energy functional I and therefore we conclude that these solutions are definitely not the minima of the full Skyrme energy functional (remember that on the basis of relaxation of well-separated clusters we have concluded that the minima with respect to I are in fact the minima). For B = 13 the solution with D4d symmetry has exactly the same value as that with O, while that with Oh is much higher. It might seem remarkable that the values for the D4d and O symmetric solutions are exactly the same, but one should remember that the two solutions are very similar and can be related by symmetry enhancement (see Sec. 5.2). Since the O solution was produced from the relaxation of clusters and the I values for the two solutions are very close anyway, we conclude that the O solution is probably the minimum. For B = 10, 16 and 22, we are unable to tell the different candidate minima apart based on the computed energies since our quoted error of ±0.001 for each of the values of E/B encompasses the different solutions under consideration. For B = 10 the D3h solution is of higher energy and is clearly not the minimum, but the other 3 are well within the range of uncertainty. This is also the case for B = 16 and B = 22, where each of the candidate minima is within the quoted range for E/B. In Table 14 we have summarized the computed values of E/B and E_B for the solutions which we have identified as the minima in Sec. 3. 
Included also are the ionization energy I_B = E_{B−1} + E_1 − E_B, the energy required to remove a single Skyrmion, and the binding energy per baryon given by ∆E/B = E_1 − (E/B), which is the energy required to separate the solution up into single Skyrmions divided by the total baryon number. The accuracy of ∆E/B will be exactly that of the computed value of E/B since the value of E_1 we compute by this method appears to be exact within the quoted limits, but the errors in computing I_B could theoretically be larger since it is the difference of two energies. For B > 10 the worst case errors in computing I_B could be as much as ±0.02 (a significant amount on inspection of the quoted values), but since we have already commented that we believe the difference in energies for adjacent values of B will be even more accurate than the absolute errors in the energy we suspect that things will be much better. We will comment on this in subsequent sections. The values of E/B computed for these relaxed solutions and also for the original rational map are plotted against B in Fig. 6. Both start at approximately the same value for B = 1, and both appear to asymptote for large B, albeit at different values. For the relaxed solutions this appears to be about 6–7% above the Faddeev–Bogomolny bound, which is compatible with that computed for the hexagonal Skyrme lattice [6], which can be thought of as the infinite limit of a shell-like
Table 14. A summary of the symmetry and energy of the Skyrmion configurations which we have identified as the minima. Included also are the ionization energy I_B (that required to remove one Skyrmion) and the binding energy per Skyrmion ∆E/B (the energy required to split the charge B Skyrmion into B Skyrmions of charge one, divided by the total number of Skyrmions). (*) These correspond to the minimum energy Skyrme solutions which are not minimum energy solutions within the rational map ansatz. (**) The values quoted for B = 14 are computed using the initial configuration with D2 symmetry since we have been unable to derive the rational map with C2 symmetry.

B      G      E/B      E_B       I_B      ∆E/B
1      O(3)   1.2322    1.2322   0.0000   0.0000
2      D∞h    1.1791    2.3582   0.1062   0.0531
3      Td     1.1462    3.4386   0.1518   0.0860
4      Oh     1.1201    4.4804   0.1904   0.1121
5      D2d    1.1172    5.5860   0.1266   0.1150
6      D4d    1.1079    6.6474   0.1708   0.1243
7      Yh     1.0947    7.6629   0.2167   0.1375
8      D6d    1.0960    8.7680   0.1271   0.1362
9      D4d    1.0936    9.8424   0.1578   0.1386
10*    D3     1.0904   10.9040   0.1706   0.1418
11     D3h    1.0889   11.9779   0.1583   0.1433
12     Td     1.0856   13.0272   0.1829   0.1466
13     O      1.0834   14.0842   0.1752   0.1488
14**   C2     1.0842   15.1788   0.1376   0.1480
15     T      1.0825   16.2375   0.1735   0.1497
16*    D2     1.0809   17.2944   0.1753   0.1513
17     Yh     1.0774   18.3158   0.2108   0.1548
18     D2     1.0788   19.4184   0.1296   0.1534
19     D3     1.0786   20.4934   0.1572   0.1536
20     D6d    1.0779   21.5580   0.1676   0.1543
21     Td     1.0780   22.6380   0.1522   0.1542
22*    D3     1.0766   23.6852   0.1850   0.1556

Skyrmion, while the asymptote for the rational map ansatz is higher at around 9%. The curve for the relaxed solutions is smoother than that for the rational maps, which has notable dips associated with the highly symmetric solutions with B = 4, 7, 13 and 17. Although these deviations from what appears to be approximately a smooth curve do not totally disappear after relaxation, one can deduce that the other solutions, not being particularly spherical, do not fit the rational map ansatz as well, but that the relaxation using the full non-linear field dynamics softens these effects. The binding energy per baryon, ∆E/B, is plotted against B in Fig. 7, its shape just being the inversion of Fig. 6. Interestingly, it increases to an asymptote just as
[Figure 6: plot of E/B (vertical axis, 1.05 to 1.25) against B (horizontal axis, 0 to 20).]

Fig. 6. The computed values of E/B as a function of B for the configurations which we have identified as the minimum energy solutions, that is, those summarized in Table 14. The solid line is that after the process of relaxation, and the dashed line that from before, that is, the value for the appropriate rational map. For B = 14 where we have no rational map to represent the minimum energy solution we have used the values for the solution with D2 symmetry.

one might expect in a simple model of nuclei which excludes the Coulomb interaction within the nucleus. We shall return to this issue in Sec. 5.6. The ionization energy, I_B, is plotted against B in Fig. 8. We have already commented that our quoted errors in E/B might lead to substantial errors in computing I_B, but that systematic errors in computing the energy of a particular solution are likely to be similar and therefore our computed values for I_B could probably be more accurate than one might naively expect. This is borne out by a cursory inspection of Fig. 8: we see that the most stable solutions with respect to the removal of a single Skyrmion are those with the most symmetry, B = 4, 7 and 17, while those with the least symmetry, B = 5, 8, 14 and 18, are much less stable. This is very much as one might expect.
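The ionization energy I_B = E_{B−1} + E_1 − E_B and the binding energy per baryon ∆E/B = E_1 − E/B can be recomputed directly from the E_B column of Table 14. The following is only an illustrative sketch: the function names are hypothetical, E_0 is taken to be zero so that I_1 = 0, and results are rounded to four decimal places to match the table.

```python
# E_B for B = 1..22, copied from the E_B column of Table 14.
E_VALUES = [1.2322, 2.3582, 3.4386, 4.4804, 5.5860, 6.6474, 7.6629, 8.7680,
            9.8424, 10.9040, 11.9779, 13.0272, 14.0842, 15.1788, 16.2375,
            17.2944, 18.3158, 19.4184, 20.4934, 21.5580, 22.6380, 23.6852]
E_B = {B: e for B, e in enumerate(E_VALUES, start=1)}
E_1 = E_B[1]


def ionization_energy(B):
    """I_B = E_{B-1} + E_1 - E_B, taking E_0 = 0 (assumption) so I_1 = 0."""
    previous = E_B[B - 1] if B > 1 else 0.0
    return round(previous + E_1 - E_B[B], 4)


def binding_energy_per_baryon(B):
    """Delta E / B = E_1 - E_B / B."""
    return round(E_1 - E_B[B] / B, 4)


# Rank the charges by how much energy it costs to remove one Skyrmion.
ranked = sorted(E_B, key=ionization_energy, reverse=True)
most_stable = ranked[:3]
# B = 1 and B = 2 are left out of the "least stable" list here (a choice
# made for this sketch) so that it can be compared with the charges
# singled out in the text.
least_stable = sorted(b for b in ranked if b >= 3)
least_stable = sorted(least_stable, key=ionization_energy)[:4]

print("most stable:", most_stable)
print("least stable (B >= 3):", sorted(least_stable))
```

With the tabulated energies the three largest ionization energies occur at B = 7, 17 and 4, and the four smallest (for B ≥ 3) at B = 5, 8, 14 and 18, which is the pattern described above.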
5. Discussion

There are some remarkable aspects of the solutions which we have created. In this section we point out, discuss and attempt to explain some of them.
[Figure 7: plot of ∆E/B (vertical axis, 0 to 0.15) against B (horizontal axis, 0 to 20).]

Fig. 7. As for Fig. 6, but ∆E/B is plotted instead of E/B. For large B the values appear to level out at around 0.15–0.16 as one might expect in a simple model of nuclei excluding the effect of the Coulomb interaction.

5.1. Platonic symmetries

It had been known for some time that Skyrmion solutions existed whose associated polyhedra are platonic solids (B = 3, 4 and 7) and it had been conjectured that the minimum energy solution with B = 17 had Yh symmetry. These platonic symmetry groups are the groups with the highest symmetry (most generators) which are discrete subgroups of O(3), the dihedral groups having considerably fewer generators. Here, we have shown that platonic symmetries are even more prevalent in Skyrmion solutions, be they minimum energy solutions (B = 12, 13, 15 and 21) or low energy saddle points (B = 9, 13, 15, 17 and 19). The existence of these solutions is truly remarkable. This could have been expected within the fullerene hypothesis since the platonic groups T and Y are compatible with the associated polyhedron comprising pentagons and hexagons. But we have also seen that the existence of a highly symmetric fullerene polyhedron compatible with the fullerene hypothesis at a particular charge does not necessarily imply that it is a minimum energy configuration. In particular, tetrahedral fullerene structures are compatible with B = 13, 16 and 19, and the group theory decomposition is consistent with the existence of appropriate rational maps. However, in each of these cases the minimum energy Skyrmion has much less
[Figure 8: plot of I_B (vertical axis, 0 to 0.2) against B (horizontal axis, 0 to 20).]

Fig. 8. As for Fig. 6, but I_B is plotted instead of E/B. Notice that the most stable solutions are those with the most symmetry, B = 4, 7 and 17, while the least stable are those with little symmetry, B = 5, 8, 14 and 18.

symmetry. Clearly, the solution being highly symmetric is not the sole criterion in minimizing the Skyrme energy functional. We have also encountered one case, icosahedral symmetry for B = 22, in which a platonic fullerene polyhedron exists, but no corresponding platonic rational map (and probably no Skyrmion either). However, it appears that a rational map exists (and hence a corresponding Skyrmion) for which the baryon density surface has more symmetry than the Skyrme field and is icosahedrally symmetric. But again this very symmetric structure appears not to be the minimum energy Skyrmion. For B > 7 the octahedral group is incompatible with the fullerene hypothesis since it requires four-fold symmetry which is impossible in a polyhedron comprising only pentagons and hexagons. However, the minimum energy Skyrmion with B = 13 has O symmetry and we have also been able to find a Skyrmion with Oh symmetry with B = 17 which is a low-energy saddle point. Many such solutions in which there is a four-valent bond connecting 4 pentagons can be related to a fullerene polyhedron by symmetry enhancement as we shall discuss in the next section. This produces an extra twist to the fullerene hypothesis which allows octahedral Skyrmion solutions.
5.2. Symmetry enhancement

We have seen that, in keeping with the fullerene hypothesis and the expectations of [5], most links in the polyhedra associated with Skyrmion solutions are trivalent. In particular, the baryon density isosurface of what we have identified as the minimum energy solutions consists of a trivalent polyhedron for all cases except B = 9 and B = 13. In these two cases the polyhedra contain four-valent vertices which means that they are not fullerene Skyrmions, since by definition all links are trivalent. However, it turns out that these two exceptional cases can be obtained from fullerenes by the application of a simple rule, which we refer to as symmetry enhancement, and shall now explain. Consider part of a fullerene which has the form shown in the first figure in Fig. 9, consisting of two pentagons and two hexagons with a C2 symmetry. The symmetry enhancement operation is to shrink the edge which is common to the two hexagons (the thick line) until it has zero length, which results in the coalescence of two vertices. The final object formed is shown in the second figure in Fig. 9. It has a four-valent vertex connecting four pentagons and the symmetry is enhanced from C2 to C4. We find, empirically, that symmetry enhancement operations appear to take place in pairs, with a particular operation always being accompanied by the same operation on an opposite face of the associated polyhedron.
Fig. 9. On the left is a configuration comprising two pentagons separated by two hexagons. Such configurations are prevalent in many fullerene polyhedra. On the right is what can be created by a single symmetry enhancement operation as appears to take place for B = 9 and B = 13. It is usual for such an operation to be accompanied by a similar operation on a diametrically opposite face of the associated polyhedron.
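The bookkeeping behind symmetry enhancement can be tallied with a short sketch. It assumes the counts F = 2B − 2 and V = 4(B − 2) for a trivalent polyhedron made of twelve pentagons and 2B − 14 hexagons, and it assumes that each operation contracts one edge, merging two vertices and turning the two hexagons that shared that edge into pentagons, as in Fig. 9. The function names are illustrative only.

```python
def fullerene_counts(B):
    """(V, E, F) for the trivalent polyhedron associated with charge B.

    Uses V = 4(B - 2) and F = 2B - 2; the edge count then follows from
    Euler's formula V - E + F = 2.
    """
    V = 4 * (B - 2)
    F = 2 * B - 2
    E = V + F - 2
    return V, E, F


def symmetry_enhance(counts, operations):
    """Contract `operations` edges; each removes one vertex and one edge."""
    V, E, F = counts
    return V - operations, E - operations, F


def face_types(B, operations=0):
    """(pentagons, hexagons) after the given number of operations.

    Assumes twelve pentagons initially and that every operation converts
    two hexagons into pentagons.
    """
    pentagons = 12 + 2 * operations
    hexagons = (2 * B - 2) - 12 - 2 * operations
    return pentagons, hexagons


for B, ops in [(9, 2), (13, 6)]:
    V, E, F = symmetry_enhance(fullerene_counts(B), ops)
    pent, hexa = face_types(B, ops)
    # Euler's formula and the face-edge incidence count must both hold.
    assert V - E + F == 2
    assert 5 * pent + 6 * hexa == 2 * E
    print(B, (V, E, F), (pent, hexa))
```

Under these assumptions the B = 9 polyhedron starts with 28 vertices and the B = 13 polyhedron with 44, and after two and six operations respectively the counts become (V, E, F) = (26, 40, 16) and (38, 60, 24), with every face a pentagon in both cases.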
There is a C28 fullerene with D2 symmetry (denoted 28:1 in [14]) that contains two of the structures shown in the first figure in Fig. 9. If the above symmetry enhancement operation is performed on both these structures then the resulting object is precisely the D4d configuration of the B = 9 Skyrmion described earlier. There are also D2 symmetric C44 (denoted 44:75 and 44:89 in [14]) fullerenes to which similar statements apply. In this case, which is B = 13, there are an equal number of pentagons and hexagons (12 of each) and so a very symmetric configuration can be
obtained by applying the symmetry enhancement operation at all possible vertices (in this case 6) which results in the cubic Skyrmion. Selective application of the symmetry enhancement rule to these fullerenes allows one to create the associated polyhedron for the D4d symmetric Skyrmion at this charge, and also that for a D4 symmetric Skyrmion, whose rational map we have been unable to deduce since the minimum energy rational map in this class of maps has D4d symmetry. In the context of fullerenes it is, of course, impossible for vertices to coalesce since they correspond to the positions of the carbon atoms, but for Skyrmions the vertices represent local maxima of the baryon density and so there is no restriction that they be distinct; it just appears that in most cases it is energetically favourable to have distinct vertices. Note that, by an examination of the baryon density isosurface by eye, it can often be difficult to identify whether a given vertex is trivalent or four-valent, since the edge length required to be zero for symmetry enhancement could be small, but non-zero.

5.3. Vertices, faces and rational maps

We will now attempt to explain the various features of the Skyrmions we have created by considering the basic properties of the rational map ansatz. Recall that the baryon density of a Skyrmion is localized around the edges of a polyhedron. The face centres of this polyhedron are given by the zeros of the Wronskian of the numerator and the denominator

$$ w(z) = p'(z)\,q(z) - q'(z)\,p(z), \tag{5.1} $$

which is in general a degree 2B − 2 polynomial in z. All the solutions which we have created have the property that all the roots of (5.1) are distinct and hence within the rational map ansatz this is a vindication of one of the GEM rules, in that it explains why the number of faces of the polyhedron is F = 2B − 2. Often (though not always, as we shall discuss further below) the positions of the vertices correspond to local maxima of the density which occurs in the integrand defining I in equation (1.7). This density depends on the modulus of the rational map and its derivative so in general it is not possible to obtain such a simple characterization of the location of its maxima. However, in particularly symmetric cases they can be identified with the zeros of a polynomial constructed from the Hessian of the Wronskian [17]. Explicitly, the polynomial is

$$ H = (2B-2)\,w(z)\,w''(z) - (2B-3)\,w'(z)^2, \tag{5.2} $$

and has degree 4(B − 2). As an example in which the above formula does work, consider the degree 7 rational map describing the icosahedrally symmetric minimal energy charge 7 Skyrmion [18]

$$ R = \frac{z^7 - 7z^5 - 7z^2 - 1}{z^7 + 7z^5 - 7z^2 + 1}. \tag{5.3} $$

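The Wronskian (5.1) and the polynomial (5.2) can be evaluated mechanically for the map (5.3). The sketch below does this with integer coefficient lists (lowest power first) and no external libraries; the helper names are hypothetical and chosen only for this illustration, with p and q taken to be the numerator and denominator of R.

```python
def p_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def p_sub(a, b):
    """Difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def p_diff(a):
    """Derivative with respect to z."""
    return [k * x for k, x in enumerate(a)][1:]


def p_trim(a):
    """Drop vanishing leading coefficients."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a


def wronskian(p, q):
    """w = p' q - q' p, as in (5.1)."""
    return p_trim(p_sub(p_mul(p_diff(p), q), p_mul(p_diff(q), p)))


def hessian_poly(w, B):
    """H = (2B - 2) w w'' - (2B - 3) (w')^2, as in (5.2)."""
    w1 = p_diff(w)
    w2 = p_diff(w1)
    first = [(2 * B - 2) * c for c in p_mul(w, w2)]
    second = [(2 * B - 3) * c for c in p_mul(w1, w1)]
    return p_trim(p_sub(first, second))


def nonzero(a):
    """Map power -> coefficient for the non-vanishing terms."""
    return {k: c for k, c in enumerate(a) if c != 0}


# Numerator and denominator of the B = 7 map (5.3), lowest power first.
p = [-1, 0, -7, 0, 0, -7, 0, 1]
q = [1, 0, -7, 0, 0, 7, 0, 1]

w = wronskian(p, q)
H = hessian_poly(w, B=7)
print(nonzero(w))
print(nonzero(H))
```

The non-vanishing coefficients of w are −28, 308 and 28 at powers 1, 6 and 11, and those of H are −8624 times (1, 228, 494, −228, 1) at powers 0, 5, 10, 15 and 20, so H has degree 20 = 4(B − 2) as expected.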
The Wronskian and the Hessian in this case are given by

$$ w = 28z\,(z^{10} + 11z^5 - 1), \qquad H = -8624\,(z^{20} - 228z^{15} + 494z^{10} + 228z^5 + 1), \tag{5.4} $$

which are proportional to the Klein polynomials associated with the faces and vertices of a dodecahedron respectively [19]. Note that if the zeros of the Hessian (5.2) could always be identified with the vertices of the polyhedron then this would explain another of the GEM rules, that is, the fact that the number of vertices is equal to V = 4(B − 2). In this case the number of edges is E = 6(B − 2) by the Euler formula, and hence E = 3V/2, that is, the polyhedron is trivalent. Unfortunately, this is not true in general, though for the minimal energy maps it may be the case that the zeros of (5.2) give good approximations to the positions of the maxima. We have already discussed explicit cases where the number of vertices and edges is affected by symmetry enhancement, and therefore it comes as no surprise that we cannot make general statements about E, F and V based solely on the rational map ansatz. Although we do not have a general global characterization of the vertices it is possible, by a local analysis of the rational map, to check whether a given point is a local maximum and to obtain its valency. [Footnote n: We thank Nick Manton for suggesting this possibility.] By using the freedom to perform rotations of both the domain and target spheres it is always possible to choose the point we are considering to be given by z = 0 and the rational map to have a local expansion about this point of the form

$$ R = \alpha\left(z + \beta z^{p+1} + O(z^{p+2})\right), \tag{5.5} $$

where α and β are real positive constants. A possible exception to this is if the derivative of the map is zero at the point we are considering, but since such points correspond to face centres they are clearly not maxima and may be ignored for our purposes here. Substituting the expansion (5.5) into the expression for the baryon density one can obtain the following result. If p > 2 then z = 0 is a p-valent vertex if α > 1. If p = 1 then z = 0 is not a vertex. The remaining case of p = 2 is a little more subtle. In many cases there is a one-to-one correspondence between the vertices and the local maxima of the baryon density polyhedron. However, this is not always true and in some cases (the lowest charge example being B = 5) only some of the local maxima are vertices, whilst others correspond to midpoints of an edge. In this situation some of the edges may appear thicker than others, reflecting their local maxima nature. The rational map description of such a bivalent maximum is the final p = 2 case, where a local maximum requires the more restrictive condition that $\alpha > \sqrt{1 + 3\beta}$. As an example of this analysis, consider the B = 9 map with D4d symmetry given by (3.1). Expanding this map about the point z = 0 gives

$$ R = az + ib(1-a)z^5 + \cdots \tag{5.6} $$

which can be rotated into the form (5.5) with p = 4, α = −a = 3.38 > 1 and β = −b(1 − a). Thus the point z = 0 is a four-valent vertex, as we have observed from the baryon density plot. The other minimizing map with four-valent vertices (this time six of them) is the B = 13 O map (3.10), which can be checked in a similar way.

5.4. Isomerism: local minima and saddles

We have argued strongly that there is an analogy between Skyrmion solutions and polyhedra found in carbon chemistry. Moreover, at some charges we have found more than one solution which has very low energy, and therefore it might seem sensible within the analogy to chemistry to describe these solutions as isomers, whether they be saddle point solutions or local minima. [Footnote o: The existence of degenerate minima would probably require some kind of symmetry between the solutions. Although we cannot rule it out, it appears to be very unlikely.] In most cases, for example B = 9, 16, 17, 19 and 22, the symmetries and structures of the known isomers are unrelated to those of the minimal energy Skyrmions and in these cases it has been easy to identify the minimum energy configurations using the relaxation of initially well-separated clusters. The cases of B = 10 and 13 are interesting since there are known configurations whose associated polyhedra are related by symmetry enhancement. We have already commented that the polyhedron associated with O symmetry at B = 13 can be created by 6 symmetry enhancement operations from a D2 fullerene polyhedron, and that with D4d symmetry requires just 2. However, we have not discussed the B = 10 solutions in this context. The D4d and D3h solutions do not appear to be related to this concept, but one can understand the D3 and D3d solutions in terms of a more symmetric polyhedron which can be created from either by 3 symmetry enhancement operations. Clearly, this highly symmetric configuration, which is likely to be of higher energy, can be thought of as a saddle point in configuration space, with the two minima on either side. It is interesting to speculate that the true minimum energy Skyrmion is more closely associated with a polyhedron in which partial symmetry enhancement has taken place, that is, the bond lengths have shrunk, but not totally to a four-valent bond. This might explain our difficulty in identifying the symmetry of the true minima using our methods.

5.5. Skyrmion architecture

One of the most important reasons for performing full field simulations in our study of Skyrmions is to verify that the minimal energy Skyrmion (at least for B ≤ 22) consists of a single shell structure, which is the main assumption in the rational map ansatz. In this section we speculate on the kinds of structures which may form for Skyrmions of higher charge.
The lowest known value for the energy per baryon in the Skyrme model arises from an infinite three-dimensional cubic crystal [9], with an energy [6] only 3.6% above the Faddeev–Bogomolny bound. In considering a large single-shell fullerene Skyrmion, where hexagons are dominant, the twelve pentagons may be viewed as defects, inserted into a flat hexagonal structure, in order to generate the required curvature necessary to close the shell. Energetically the optimum structure of this form is an infinite hexagonal lattice and this was constructed in one of our earlier papers [6], and found to have an energy per baryon which is 6.1% above the Faddeev–Bogomolny bound. This is therefore the value to which a bigger and bigger single-shell structure will asymptote. Since this value is higher than for the Skyrme crystal it is reasonable to expect that above some critical charge B^*, the minimal energy Skyrmion will resemble a portion cut from the crystal rather than a single shell. However, it is very difficult to estimate the value of B^* (note that we have seen that at least B^* > 22) since it relies on a delicate comparison of the surface to volume energy of a finite portion of the crystal and this is very sensitive to the way in which the portion of the crystal is smoothed off at the edges. As the crystal is basically composed of stacking B = 4 cubes together then B = 32 is the first charge at which any sizeable chunk of the crystal can emerge, and even then it has a large area to volume ratio, so perhaps the charge will need to be even larger than this before a crystal regime takes over from the shell regime. An intermediate between a single-shell and a crystal is a multi-shell structure and this has recently been studied by Manton and Piette [24]. 
For the charges they consider (B = 12, 13, 14) the relaxation of an initial multi-shell structure produces a single-shell configuration which has relatively high energy in comparison with the minimal energy single-shell Skyrmions we have found. As a shell may be thought of as a spherical domain wall, connecting the two vacua U = ±1, then all configurations with an odd number of shells have U = −1 at the centre, whereas if there are an even number of shells then U = 1 at the centre. Thus a single shell structure obtained from an initial odd number of shells can relax to one of the configurations we have found. In fact, as we have already mentioned, the B = 13 Skyrmion obtained by Manton and Piette is the Oh symmetric saddle point solution whereas the minimal energy Skyrmion is only O symmetric. In summary, there are a number of alternatives to a single-shell structure for higher charge Skyrmions and what is remarkable is that none of these alternatives appears to arise, at least for B ≤ 22. It seems reasonably clear that single shells cannot be the whole story for large enough charge, but whether this charge is so large as to be irrelevant in applications to nuclear physics has yet to be determined.
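To put the quoted percentages side by side, the sketch below converts a few E/B values from Table 14 into a percentage excess over the Faddeev–Bogomolny bound and compares them with the 6.1% (hexagonal lattice) and 3.6% (cubic crystal) figures. It assumes energy units in which the bound reads E/B ≥ 1, and the function name is hypothetical.

```python
def percent_above_bound(e_per_b):
    """Excess of E/B over the bound E/B = 1 (assumed units), in percent."""
    return 100.0 * (e_per_b - 1.0)


# E/B values copied from Table 14 for a few representative charges.
table_14 = {4: 1.1201, 7: 1.0947, 17: 1.0774, 22: 1.0766}

HEXAGONAL_LATTICE = 6.1  # percent above the bound, quoted from [6]
CUBIC_CRYSTAL = 3.6      # percent above the bound, quoted from [6]

for B, ratio in table_14.items():
    excess = percent_above_bound(ratio)
    print(f"B = {B:2d}: {excess:5.2f}% above the bound, "
          f"{excess - HEXAGONAL_LATTICE:4.2f} points above the shell limit")

# Gap between the single-shell asymptote and the crystal, in percentage points.
print(round(HEXAGONAL_LATTICE - CUBIC_CRYSTAL, 1))
```

On these numbers even the B = 22 Skyrmion sits roughly 1.6 percentage points above the hexagonal-lattice value, which in turn lies 2.5 points above the crystal.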
5.6. Relation to applications

In the introduction we commented on two diverse motivations for creating Skyrmion solutions, namely from a purely mathematical point of view, to study an interesting class of maps between 3-spheres which generalize the harmonic map equations,
and from a physical perspective to investigate a phenomenological model of nuclei. Here, we comment briefly on the relevance of our results to these two applications and suggest interesting avenues for future research which we have opened up with this work. We have already noted that the Skyrme model is the simplest model in which one finds stable solitonic solutions which correspond to maps from S^3 to S^3, and so our solutions may have some generality to other extensions. An interesting feature of the solutions which we have found is that, in some sense, they can be thought of as being close to a conformal map between the two 3-spheres, for which the three eigenvalues of the strain tensor, λ_1^2, λ_2^2, λ_3^2, would all be equal. The rational map ansatz, which we have seen provides a good approximation to the true solutions, has two of these eigenvalues equal and it has been observed [22] that the shape of the profile function appears to be such that the deviation from a conformal map is minimized when averaged over space. For a conformal map which is locally an isometry the values of E and B would be exactly equal and so any deviations from the map being locally isometric can be visualized by plotting E − B. When one does this the relevant isosurface is highly localized around the edges of the associated polyhedron and also in the centre of each face where B is close to zero and E is large in comparison. Such an isosurface is a plot of second order effects due to curvature. The fact that the associated polyhedra are generally of the fullerene type is also interesting because in chemistry such structures arise since they minimize what is called steric strain, the overall strain of the delocalized electron distribution. Using this analogy we suggest that the effects of the strain tensor for maps between 3-spheres can be thought of as being analogous to steric strain in fullerene molecules. 
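The statement that E and B coincide only for a local isometry can be illustrated pointwise. The sketch below uses the standard expression of the Skyrme energy and baryon densities in terms of the strain eigenvalues, which is not quoted in this section; the normalization (chosen so that the bound reads e ≥ |b| at each point) and the function names are assumptions made for illustration.

```python
def strain_densities(l1, l2, l3):
    """Energy and baryon densities from the strain eigenvalues l_i.

    e = (sum of l_i^2 + sum of (l_i l_j)^2) / 6 and b = l1 l2 l3, in a
    normalization (assumed here) for which e >= |b| pointwise.
    """
    quadratic = l1 ** 2 + l2 ** 2 + l3 ** 2
    quartic = (l1 * l2) ** 2 + (l2 * l3) ** 2 + (l1 * l3) ** 2
    return (quadratic + quartic) / 6.0, l1 * l2 * l3


def excess(l1, l2, l3):
    """Local value of e - b, the quantity plotted to show non-isometry."""
    e, b = strain_densities(l1, l2, l3)
    return e - b


def excess_as_squares(l1, l2, l3):
    """Same quantity written as a sum of squares, which makes e >= b manifest."""
    return ((l1 - l2 * l3) ** 2 + (l2 - l1 * l3) ** 2 + (l3 - l1 * l2) ** 2) / 6.0


print(excess(1.0, 1.0, 1.0))   # local isometry: no excess
print(excess(2.0, 2.0, 2.0))   # conformal but not isometric
print(excess(1.2, 1.2, 0.7))   # two equal eigenvalues, as in the rational map ansatz
```

Only the first case, with all eigenvalues equal to one, gives zero excess; a conformal but non-isometric point and a point with two equal eigenvalues both give a positive value.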
Finally, we should note that the existence of a Skyrmion with a particular symmetry, which can be described by a rational map, implies that there also exists an SU(2) BPS monopole with the same symmetry, although, of course, all BPS monopoles of a given charge have the same energy. The fields and Lagrangians of monopoles and Skyrmions are very different but the structures which arise in each case are remarkably similar. This suggests that these types of configurations may be generic as low energy states in a variety of 3-dimensional soliton models and elsewhere. Although the original motivation for studying the Skyrme model was to make quantitative predictions for the properties of nuclei based on a model which is derived in some limit from QCD, this has historically been a tricky business. Our hope, based on the extensive results that we have presented here, is that some progress can be made in this direction. Part of the problem in achieving such a goal is how one should understand the model in the context of nuclei. Based on the idea that it is a low energy effective action for QCD several studies have attempted to quantize Skyrmion solutions as rigidly rotating spinning tops for B = 1, 2 and 3 (see, for example, [1, 10, 11]) quantifying corrections in terms of their position in the 1/Nc expansion. This is not only complicated but probably is also too simplified
an approach, as demonstrated by the more sophisticated quantization of the B = 2 Skyrmion in [23]. Here we would like to make a few interesting points in terms of thinking of the solitons as a phenomenological model for nuclei which incorporates classical isospin. The first thing that we would like to discuss is the shell structure of the soliton solutions which we have created. Naively, one might think that such a structure is incompatible with the solutions modelling nuclei, which are usually assumed to be solid lumps. What one has to appreciate to reconcile this is that, until the solutions are quantized, the charge can be thought of as a fluid. If a nucleus were assumed to be composed of a simple positively charged fluid then a shell structure would be expected under the action of the strong force since there is a long range, but weak, attraction and a short distance repulsion due to nucleon-nucleon repulsion. Therefore, the hollow shell structure of the multi-Skyrmions which we have observed is largely due to the continuum version of nucleon-nucleon repulsion. Following on from this point we should note that some of the features of the classical values of $I_B$, the ionization energy, are very much in line with expectations based on nuclei. Let us focus in detail on the solutions for B = 4 and B = 5. The B = 4 solution is highly symmetric and $I_4$ is relatively large, whereas the B = 5 solution has little symmetry and $I_5$ is much smaller. This is exactly as one might have expected since $^4$He is the most stable nucleus whereas there is no naturally occurring stable nucleus with A = 5. Since the packing structure of the solutions is a feature of the symmetry of the solution this suggests that there may be something even more than just a good model of the strong force potential within the Skyrme model.
There is also an interesting trend in ∆E/B, the classical binding energy per baryon, which appears to asymptote to a value defined by that of an infinite Skyrmion lattice. We have plotted this compared to the experimentally determined values for nuclei with A = 1 to 22 [12, 16] in Fig. 10 with an arbitrary normalization factor (which amounts to multiplying the curve in Fig. 7 by about 50) accounting for our ability to define the Skyrmion energy units. Although this is crude, it makes the point that the curve has the correct shape. This is very encouraging and is the subject of on-going research. Since the fullerene polyhedra are clearly very important for our understanding of Skyrmions, and as we have just argued there are a number of appealing features of the model for explaining the properties of nuclei, it is tempting to make an analogy between the delocalized electron distributions in fullerenes and nuclear charge distributions. Although the analogy is not exact, it might be possible to relate the Skyrme model to density functional theory methods (see, for example, [25]) used in the study of electron distributions.
6. Conclusion
We have performed an exhaustive study of minimal energy Skyrmions for all charges up to B = 22, using a variety of methods and involving a substantial amount of CPU
Fig. 10. The binding energy per baryon number for Skyrmions (solid line) compared, using an arbitrary scaling, to the binding energy per nucleon of real nuclei (dashed line). The main thing that one should note is that the approximate shape, an increase to a plateau, is present in both.
time on a parallel machine. At each charge we have discussed in detail the symmetry, structure and energy of the minimal energy Skyrmion (and often several others) in addition to providing an approximate description by presenting its associated rational map. Supplementary to the detailed investigation at specific charges we have found a number of interesting general phenomena. These include the verification of the fullerene hypothesis, which applies to all except two cases (which can be understood in terms of symmetry enhancement), the discovery that there are often several Skyrmions whose symmetries are very different from that of the minimal one but whose energies are nonetheless remarkably close to the minimal value, and finally the confirmation that the shell-like structure of Skyrmions continues to large charges (at least B = 22) with the rational map ansatz providing an effective approximation to the true solution. Hopefully this comprehensive piece of work will provide a useful foundation for further studies on Skyrmions, both mathematical and physical, with the ultimate aim being a comparison with experimental data on nuclei.
Acknowledgments
Many thanks to Conor Houghton, Nick Manton and Tom Weidig for useful discussions. We thank the EPSRC (PMS) and PPARC (RAB) for Advanced Research
Fellowships. PMS acknowledges the EPSRC for the grant GR/M57521. The parallel computations were performed on the COSMOS at the National Cosmology Supercomputing Centre in Cambridge.
References
[1] G. S. Adkins, C. R. Nappi and E. Witten, Nucl. Phys. B228 (1983) 552.
[2] S. L. Altmann and P. Herzig, Point-Group Theory Tables, Clarendon Press, 1994.
[3] M. F. Atiyah and N. J. Hitchin, The Geometry and Dynamics of Magnetic Monopoles, Princeton University Press, 1988.
[4] R. A. Battye and P. M. Sutcliffe, Phys. Lett. 391B (1997) 150.
[5] R. A. Battye and P. M. Sutcliffe, Phys. Rev. Lett. 79 (1997) 363.
[6] R. A. Battye and P. M. Sutcliffe, Phys. Lett. 416B (1998) 385.
[7] R. A. Battye and P. M. Sutcliffe, "Solitonic fullerenes", to appear in Phys. Rev. Lett.
[8] E. Braaten, S. Townsend and L. Carson, Phys. Lett. 235B (1990) 147.
[9] L. Castillejo, P. S. J. Jones, A. D. Jackson, J. J. M. Verbaarschot and A. Jackson, Nucl. Phys. A501 (1989) 801.
[10] L. Carson, 66 (1991) 1406.
[11] L. Carson, Nucl. Phys. A535 (1991) 479.
[12] W. N. Cottingham and D. A. Greenwood, An Introduction to Nuclear Physics, Cambridge University Press, 1986.
[13] W. Y. Crutchfield and J. B. Bell, J. Comp. Phys. 110 (1991) 234.
[14] P. W. Fowler and D. E. Manolopoulos, An Atlas of Fullerenes, Clarendon Press, 1995.
[15] M. Hale, O. Schwindt and T. Weidig, Phys. Rev. E62 (2000) 4333.
[16] P. E. Hodgson, E. Gadioli and E. Gadioli-Erba, Introductory Nuclear Physics, Oxford University Press, 1997.
[17] C. J. Houghton, private communication.
[18] C. J. Houghton, N. S. Manton and P. M. Sutcliffe, Nucl. Phys. B510 (1998) 507.
[19] F. Klein, Lectures on the Icosahedron, London, Kegan Paul, 1913.
[20] V. B. Kopeliovich and B. E. Stern, JETP Lett. 45 (1987) 203.
[21] H. W. Kroto, J. R. Heath, S. C. O'Brien, R. F. Curl and R. E. Smalley, Nature (London) 318 (1985) 354.
[22] S. Krusch, Nonlinearity 13 (2000).
[23] R. A. Leese, N. S. Manton and B. J. Schroers, Nucl. Phys. B442 (1995) 228.
[24] N. S. Manton and B. M. A. G. Piette, hep-th/0008110 (2000).
[25] I. Zh. Petkov and M. V. Stoitsov, Nuclear Density Functional Theory, Clarendon Press, 1991.
[26] B. M. A. G. Piette and W. J. Zakrzewski, J. Comp. Phys. 145 (1998) 359.
[27] T. H. R. Skyrme, Proc. Roy. Soc. A260 (1961) 127.
[28] P. M. Sutcliffe, Int. J. Mod. Phys. A12 (1997) 4663.
[29] P. J. M. Van Laarhoven and E. H. L. Aarts, Simulated Annealing: Theory and Applications, Kluwer Academic Publishers, 1997.
[30] J. J. M. Verbaarschot, Phys. Lett. 195B (1987) 235.
[31] E. Witten, Nucl. Phys. B223 (1983) 422.
December 27, 2001 16:1 WSPC/148-RMP
00110
Reviews in Mathematical Physics, Vol. 14, No. 1 (2002) 87–119 © World Scientific Publishing Company
EDGE CURRENT CHANNELS AND CHERN NUMBERS IN THE INTEGER QUANTUM HALL EFFECT
J. KELLENDONK, T. RICHTER and H. SCHULZ-BALDES∗
Fachbereich Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany
∗Department of Mathematics, University of California at Irvine, Irvine, California, 92697, USA
Received 8 December 2000
Revised 22 May 2001
A quantization theorem for the edge currents is proven for discrete magnetic half-plane operators. Hence the edge channel number is a valid concept also in the presence of a disordered potential. Under a gap condition on the corresponding planar model, this quantum number is shown to be equal to the quantized Hall conductivity as given by the Kubo–Chern formula. For the proof of this equality, we consider an exact sequence of C∗-algebras (the Toeplitz extension) linking the half-plane and the planar problem, and use a duality theorem for the pairings of K-groups with cyclic cohomology.
1. Introduction
In quantum Hall effect (QHE) experiments, one observes the quantization of the Hall conductance of an effectively two-dimensional semiconductor in units of the universal constant $e^2/h$ [29, 38]. As the Hall conductance is a macroscopic quantity, this effect is of a completely different nature than any quantization in atomic physics resulting from Bohr–Sommerfeld rules. Although also a pure quantum effect, the quantum numbers of the QHE rather turn out to be global topological invariants of the underlying magnetic Hamiltonian. These invariants become apparent only in the strong localization regime. For an explanation of the integer QHE, the only situation studied in this work, a one-particle framework is widely accepted to be sufficient. The three main existing theoretical approaches are respectively based on the Laughlin Gedankenexperiment [31], on the edge channel picture introduced by Halperin [26] and Büttiker [15] and on the Kubo–Chern formula for the Hall conductivity first derived by TKN2 [43]. Laughlin's argument was rigorously analyzed by Avron, Seiler, Simon and Yaffe even for multiparticle Hamiltonians and in presence of a disordered potential [6, 8, 7]. Bellissard [9, 10] and Kunz [30] and others [12, 2] generalized the TKN2 work in order to show quantization of the Hall conductivity also in presence of a disordered potential as long as the Fermi level lies in a region of dynamically
localized states. The mathematical connection between these two works is well understood [7, 12, 2]. Although several recent works concern boundary conditions for magnetic half-plane operators [4] as well as spectral questions for these operators [32, 21, 25], there was up to now no microscopic mathematical theory of edge channel conduction, except for the case without disorder, of course. Furthermore, a conceptual understanding of the link of bulk and edge theory was lacking even though the beautiful work of Hatsugai [27, 28] was an important step in this direction. The results of this work, announced in [42], fill these gaps. For a description of our main results in a particular case, let $H_H : \ell^2(\mathbb{Z}^2) \to \ell^2(\mathbb{Z}^2)$ denote the Harper Hamiltonian describing the motion of a tight-binding electron in a plane submitted to a constant (but arbitrary) magnetic flux per unit cell (see Sec. 2.1 for its definition). Further let $V$ be an Anderson type disorder potential, then we consider the stochastic Hamiltonian $H = H_H + V$. In presence of an external electric field, all extended states below the Fermi level undergo the Lorentz drift. The associated (bulk) Hall conductivity $\sigma_\perp^b(\beta,\mu)$ at inverse temperature $\beta$ and Fermi energy $\mu$ is calculated in linear response theory by the Kubo formula. The main result of [12, 2] is that $\sigma_\perp^b(\mu) = \sigma_\perp^b(\infty,\mu)$ is equal to a constant integer multiple of $q^2/h$ as long as the Fermi level $\mu$ varies in a given interval $\Delta$ of dynamically localized states ($q$ is the particle charge and $h$ Planck's constant). In particular, if $\Delta$ is a gap of the spectrum of $H$, the conclusion holds. The appearing integer is actually given by the Chern number of the Fermi projection $P_\mu$ of $H$. It is hence a topological invariant of the planar model. In a macroscopic Hall bar, there is now another conduction mechanism by edge currents.
For a classical two-dimensional electron gas (2DEG), the cyclotron orbits are intercepted by the boundary and this leads to a net current along the boundary. In order to calculate the corresponding quantum mechanical edge current, we will study the restriction $\hat H$ of $H$ to the half-plane Hilbert space $\ell^2(\mathbb{Z}\times\mathbb{N})$, together with a given boundary condition. All operators on the half-plane will carry a hat throughout this work. Let $\hat J_x = q\imath[X,\hat H]/\hbar$ be the electrical current operator along the boundary (here $X$ is the position operator of the direction parallel to the boundary) and $\hat P_\Delta$ the spectral projection of $\hat H$ on an interval $\Delta \subset \mathbb{R}$. The edge current carried by the states in $\Delta$ is then given by
\[ j^e(\Delta) = \mathrm{Tr}_y\,\mathcal{T}_x(\hat P_\Delta \hat J_x) , \qquad (1) \]
where $\mathcal{T}_x$ is the disorder averaged trace per unit volume parallel to the boundary and $\mathrm{Tr}_y$ the trace in the direction perpendicular to the boundary. Note that the calculation of edge currents is mere equilibrium quantum mechanics: no electric field, linear approximation or dissipation mechanism is needed. Our main result is then the following:
Theorem 1.1. Let $\Delta$ be a gap of the (almost sure) spectrum of $H = H_H + V$ in which the integrated density of surface states $\mu \in \Delta \mapsto \mathrm{Tr}_y\,\mathcal{T}_x(\hat P_{[\inf(\Delta),\mu]})$ is
continuous. Then, for any interval $\Delta' \subset \Delta$, one has almost surely
\[ j^e(\Delta') = -\frac{1}{q}\,\sigma_\perp^b(\mu)\,|\Delta'| , \qquad \forall \mu \in \Delta , \]
where $|\Delta'|$ denotes the length of $\Delta'$.
The theorem states that the edge channel number in the sense of [26, 15] of a magnetic Hamiltonian in the half-plane remains a valid concept in presence of a disordered potential, and that this number is moreover equal to the Chern number of the planar Hamiltonian. In the following we will call the limit of the quotient $-q\,j^e(\Delta)/|\Delta|$ as $\Delta \to \{\mu\}$ also the edge Hall conductivity $\sigma_\perp^e(\mu)$ at $\mu$. It is well known [15] that its quantization can only hold for a magnetic operator on a semi-infinite space because for a strip geometry the backscattering, notably tunneling from upper to lower edge, will destroy quantization. For the unperturbed Landau Hamiltonian $H_L$, the continuous analog of the theorem just states that in the $n$th gap of $H_L$, the half-plane operator $\hat H_L$ has exactly $n$ bands (compare Fig. 1). In fact, if $\hat H_L = \int_{\mathbb{R}} dk_x\, \hat H_L(k_x)$ is the Bloch decomposition in the x-direction and $E_j(k_x)$ are the corresponding edge channels, then for an interval $\Delta$ in the $n$th gap, one has
\[ j^e(\Delta) = \frac{q}{h} \sum_{j \ge 0} \int_{\mathbb{R}} dk_x\, \frac{dE_j(k_x)}{dk_x}\, \chi(E_j(k_x) \in \Delta) = -\frac{q}{h}\, n\,|\Delta| , \]
where χ denotes the characteristic function. For the Harper Hamiltonian HH , the theorem asserts that the sum of the Chern numbers of the lowest n bands is equal to the number of edge channels within the nth gap (also Dirichlet bands in [27]) of the half-plane operator multiplied by their orientation. In the commensurate case of rational flux per unit cell, this result is due to Hatsugai [27, 28]. His proof
Fig. 1. Left-hand side: schematic representation of the spectrum of the half-plane operator $\hat H_L$ (solid lines), plotted as energy $E$ against $k_x$; the dashed lines show the Landau bands. Right-hand side: spectrum of $\hat H_H$, plotted as $E$ against $k_x$ between $-\pi$ and $+\pi$; the solid lines are the Dirichlet bands and the shaded regions are the Bloch bands (for numerical results, see [27]).
is completely unrelated to ours though and cannot be generalized to a situation with broken translation invariance due to either an irrational magnetic flux or to a disordered potential. Our proof of the above theorem splits into two parts. In the first purely analytic step, we consider only half-plane operators and prove that the edge Hall conductivity is equal to the Fredholm index of a certain unitary operator. This can be viewed as an odd index theorem resulting from a pairing of an odd cyclic cohomology class with the $K_1$-class of the above unitary operator. Similarly, the quantized Kubo–Hall conductivity is given by a pairing of a certain even cyclic cohomology class with the $K_0$-class defined by the Fermi projection. These two pairings are over two different C∗-algebras which are linked by a short exact sequence called the Toeplitz extension [37]. In the second step, we use the K-theoretic six-term exact sequence of the Toeplitz extension as well as its dual K-homological counterpart to prove the equality of the pairings. This duality theorem reveals that the equality of edge and bulk Hall conductivities is a consequence of a fundamental topological concept, namely Bott periodicity. Hence this work fully exploits (and thereby justifies) the use of the non-commutative C∗-algebraic framework developed by Bellissard [9]. Let us now comment on variations and generalizations of the above theorem. First of all, its continuous counterpart is under preparation. It uses the Wiener–Hopf rather than the Toeplitz extension [17, 14] and the duality theorem corresponding to Connes' Thom isomorphism [24]. Further we believe the consequences of the theorem to hold under the weaker condition that $\Delta$ is an interval containing only dynamically localized states of $H$. Under precisely this condition it is possible to prove quantization of the bulk Hall conductivity [12].
The gap condition imposed above only allows us to deal with the weak disorder regime (insufficient to explain the QHE). In this regime the Hamiltonian $\hat H$ has purely absolutely continuous spectrum, as positive commutator estimates for the boundary current operator show. In fact, the arguments of [21] directly transpose to our discrete case as long as the edge bands of the free magnetic operator have a definite sign. This also implies continuity of the integrated density of surface states. In the regime of intermediate disorder, the gaps of $H$ fill with dynamically localized states. In presence of a boundary, the tunnel effect to the edge states should turn all these localized states into resonances so that we expect the spectrum to remain absolutely continuous. At the same time, the edge channel number should remain an integer equal to the Chern number of the Fermi projection of the planar model. In the high disorder regime, the system completely localizes [33, 3]; in between there is a cascade of metal-insulator transitions [12]. The article is organized as follows. In Sec. 2, we develop the mathematical framework, then state our main results more precisely in Sec. 3. In Sec. 4 we prove the edge current quantization. Only in Sec. 5 do we then use a K-theoretic duality theorem in order to show the equality between edge and bulk Hall conductivities.
This duality theorem was stated by Nest [34], but his proof is technically complicated and incomplete. Using a result by Pimsner [36], the appendix gives a new and transparent proof in a less general situation than the one considered in [34]. For a discussion of the underlying physics, we refer to [26, 38, 15, 16, 42].
2. Observable Algebras
In this section we explicitly construct the observable algebras as the Toeplitz extension of a twisted crossed product. For further details and for physical motivation of the following definitions and subsequent constructions, we refer to [9, 11, 12, 41, 13].
2.1. Homogeneous operators and their hull
As in [13] we define the (geometric) hull of a tight-binding operator in absence of a magnetic field. This seems more adapted than its definition in [12] because the use of magnetic translations leads to a non-trivial hull even for a free magnetic operator such as the Harper Hamiltonian (for the Landau Hamiltonian on continuous physical space this problem does not appear due to its invariance w.r.t. all magnetic translations).
Definition 2.1. Let $U_0(a)$, $a \in \mathbb{Z}^2$, denote the translations on $\ell^2(\mathbb{Z}^2)$ defined by $U_0(a)\psi(n) = \psi(n - a)$ for $n \in \mathbb{Z}^2$ and $\psi \in \ell^2(\mathbb{Z}^2)$. Consider a self-adjoint bounded operator $H : \ell^2(\mathbb{Z}^2) \to \ell^2(\mathbb{Z}^2)$ of the form
\[ H = \sum_{n,m \in \mathbb{Z}^2} H_{n,m}\, |n\rangle\langle m| , \qquad H_{n,m} \in \mathbb{R} , \]
where $\langle m|$ and $|n\rangle$ are the usual bra-ket notations for states localized at $m \in \mathbb{Z}^2$ and $n \in \mathbb{Z}^2$. Then $H$ is called homogeneous if the set $\{U_0(a) H U_0(a)^* \,|\, a \in \mathbb{Z}^2\}$ has a compact closure $\Omega$ in the strong operator topology. By continuity, $U_0$ extends to an action $T$ of $\mathbb{Z}^2$ on $\Omega$. The dynamical system $(\Omega, T, \mathbb{Z}^2)$ is then called the hull of $H$.
Example 2.2. Let $H_0$ be translation invariant, that is $U_0(a) H_0 U_0(a)^* = H_0$ for all $a \in \mathbb{Z}^2$. A typical example is the discrete Laplacian. Let further $\Omega = [-1, 1]^{\otimes \mathbb{Z}^2}$ be endowed with the Tychonov topology as well as the shift action $T$ of $\mathbb{Z}^2$. Let us denote by $V_\omega : \ell^2(\mathbb{Z}^2) \to \ell^2(\mathbb{Z}^2)$ the Anderson-type onsite potential corresponding to the disorder configuration $\omega \in \Omega$, namely $V_\omega |n\rangle = \omega_n |n\rangle$ where $\omega = (\omega_n)_{n \in \mathbb{Z}^2}$. We fix an invariant and ergodic probability measure $\mathbf{P}$ on $\Omega$ with the property $\mathbf{P}(\{\omega \in \Omega \,|\, \omega_n \in I\}) > 0$ for all $n \in \mathbb{Z}^2$ and all intervals $I \subset [-1, 1]$ of non-vanishing length. Then, for $\mathbf{P}$-almost all $\omega \in \Omega$, the operator $H_\omega = H_0 + \lambda V_\omega$, $\lambda \in \mathbb{R}$, is homogeneous and its hull is equal to $(\Omega, T, \mathbb{Z}^2)$. A proof of this fact can be found in [13].
From a mathematical point of view, this article is about the analysis of certain topological invariants of the dynamical system $(\Omega, T, \mathbb{Z}^2)$. Associated to it is the
C∗-dynamical system $(C(\Omega), \alpha, \mathbb{Z}^2)$ where $(\alpha_a f)(\omega) = f(T^{-a}\omega)$ for continuous functions $f \in C(\Omega)$. This gives rise to a natural C∗-algebra, the twisted crossed product [35] with a twisting given by the group cocycle $\mathbb{Z}^2 \times \mathbb{Z}^2 \to S^1 : (a, b) \mapsto \theta\, a \wedge b \ \mathrm{mod}(2\pi)$ where $a \wedge b = a_x b_y - a_y b_x$ and $\theta = qB/\hbar$ is the magnetic flux per unit cell. This algebra has a natural extension, the Toeplitz extension. If moreover an ergodic probability measure $\mathbf{P}$ on $\Omega$ is given, we have a W∗-dynamical system $(L^\infty(\Omega, \mathbf{P}), \alpha, \mathbb{Z}^2)$ with an associated von Neumann crossed product [35]. This also has a Toeplitz extension. These exact sequences describe the link between magnetic operators in the plane and in the half-plane.
2.2. Algebraic twisted crossed product and its Toeplitz extension
The magnetic translations in symmetric gauge $U(a)$, $a = (a_x, a_y) \in \mathbb{Z}^2$, are defined by
\[ U(a)\psi(n) = \exp\Big(\frac{\imath\theta}{2}\, a \wedge n\Big)\, \psi(n - a) , \qquad n = (n_x, n_y) \in \mathbb{Z}^2 , \ \psi \in \ell^2(\mathbb{Z}^2) , \]
where $a \wedge n = a_x n_y - a_y n_x$ and $\theta = qB/\hbar$ is the flux per unit cell. $U$ is a projective unitary representation of the group $\mathbb{Z}^2$ on $\ell^2(\mathbb{Z}^2)$, namely
\[ U(a + b) = \exp\Big(\frac{\imath\theta}{2}\, a \wedge b\Big)\, U(a) U(b) . \]
We denote by $U_x = U(1, 0)$ and $U_y = U(0, 1)$, then $U_y U_x = e^{\imath\theta} U_x U_y$. The C∗-algebra generated by $U_x$ and $U_y$ is called the rotation algebra $\mathcal{A}_\theta$. Now we explicitly construct the algebraic twisted crossed product and its Toeplitz extension. The formulas below are chosen such that the magnetic translation $U(a)$ is the representation of the function $(\omega, n) \mapsto \delta_{n,-a}$. Let $\mathcal{A}_0 = C_c(\Omega \times \mathbb{Z}^2)$ be the continuous complex valued functions on $\Omega \times \mathbb{Z}^2$ with compact support. With the following definitions, $\mathcal{A}_0$ becomes a ∗-algebra ($A, B \in \mathcal{A}_0$, $\omega \in \Omega$, $m \in \mathbb{Z}^2$):
\[ AB(\omega, m) = \sum_{l \in \mathbb{Z}^2} A(\omega, l)\, B(T^{-l}\omega, m - l)\, \exp\Big(\frac{\imath\theta}{2}\, l \wedge m\Big) , \qquad A^*(\omega, m) = \overline{A(T^{-m}\omega, -m)} . \qquad (2) \]
Further let $\mathcal{E}_0 = C_c(\Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N})$ and $T(\mathcal{A})_0 = \mathcal{E}_0 \oplus \mathcal{A}_0$. As $\mathcal{E}_0$ and $T(\mathcal{A})_0$ will describe half-plane operators, their elements will carry a hat.
In order to simultaneously define the ∗-algebra structure on $\mathcal{E}_0$ and $T(\mathcal{A})_0$, we identify elements $(\hat A_1, A_2) \in T(\mathcal{A})_0$ with a function $\hat A \in C(\Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N})$ by means of the formula ($m_x \in \mathbb{Z}$, $n_y, m_y \in \mathbb{N}$):
\[ \hat A(\omega, m_x, n_y, m_y) = \hat A_1(\omega, m_x, n_y, m_y) + A_2(\omega, m_x, m_y - n_y) . \qquad (3) \]
For $\hat A, \hat B \in T(\mathcal{A})_0$, the product and involution are then defined by
\[ \hat A \hat B(\omega, m_x, n_y, m_y) = \sum_{(l_x, l_y) \in \mathbb{Z} \times \mathbb{N}} \hat A(\omega, l_x, n_y, l_y)\, \hat B(T^{(-l_x,\, n_y - l_y)}\omega, m_x - l_x, l_y, m_y)\, \exp\Big(\frac{\imath\theta}{2}\big(l_x (m_y - n_y) - m_x (l_y - n_y)\big)\Big) , \qquad (4) \]
\[ \hat A^*(\omega, m_x, n_y, m_y) = \overline{\hat A(T^{(-m_x,\, n_y - m_y)}\omega, -m_x, m_y, n_y)} . \qquad (5) \]
Let us further introduce the inclusion $i : \mathcal{E}_0 \to T(\mathcal{A})_0$ sending $\hat A \in \mathcal{E}_0$ to $(\hat A, 0) \in T(\mathcal{A})_0$ as well as the projection $\pi : T(\mathcal{A})_0 \to \mathcal{A}_0$ defined by
\[ \pi(\hat A)(\omega, m_x, m_y) = \lim_{k \to \infty} \hat A(\omega, m_x, k, m_y + k) , \qquad \hat A \in T(\mathcal{A})_0 . \qquad (6) \]
One directly verifies $\pi(\hat A) = A_2$ if $\hat A = (\hat A_1, A_2) \in T(\mathcal{A})_0$.
Lemma 2.3. The following sequence of ∗-algebras is exact:
\[ 0 \to \mathcal{E}_0 \xrightarrow{\ i\ } T(\mathcal{A})_0 \xrightarrow{\ \pi\ } \mathcal{A}_0 \to 0 . \]
In particular, $i$ and $\pi$ are ∗-morphisms satisfying $\mathrm{Im}(i) = \mathrm{Ker}(\pi)$. We point out that this exact sequence of ∗-algebras is never split exact even though it is split exact as an exact sequence of vector spaces. More precisely, there does not exist a ∗-morphism $\rho : \mathcal{A}_0 \to T(\mathcal{A})_0$ such that $\pi \circ \rho = 1$.
Proof of Lemma 2.3. First of all, $i : \mathcal{E}_0 \to T(\mathcal{A})_0$ is clearly a ∗-morphism because the operations (4) and (5) are the same in $\mathcal{E}_0$ and $T(\mathcal{A})_0$ and moreover leave $\mathcal{E}_0$ invariant in $T(\mathcal{A})_0 = \mathcal{E}_0 \oplus \mathcal{A}_0$. Let us check that $\pi$ is also a ∗-morphism ($\hat A = (\hat A_1, A_2)$, $\hat B = (\hat B_1, B_2) \in T(\mathcal{A})_0$):
\[ \pi(\hat A \hat B)(\omega, m_x, m_y) = \lim_{k \to \infty} \sum_{l_x \in \mathbb{Z}} \sum_{l_y \ge 0} \hat A(\omega, l_x, k, l_y)\, \hat B(T^{-(l_x,\, l_y - k)}\omega, m_x - l_x, l_y, m_y + k)\, \exp\Big(\frac{\imath\theta}{2}\big(l_x m_y - m_x (l_y - k)\big)\Big) . \]
Now because both $\hat A_1, \hat B_1 \in \mathcal{E}_0$ have compact support, they will vanish for $k$ sufficiently big. Then we make the change of variables $l_y \leftrightarrow l_y - k$ and write out the identification (3):
\[ \pi(\hat A \hat B)(\omega, m_x, m_y) = \lim_{k \to \infty} \sum_{l_x \in \mathbb{Z}} \sum_{l_y \ge -k} A_2(\omega, l_x, l_y)\, B_2(T^{-(l_x, l_y)}\omega, m_x - l_x, m_y - l_y)\, \exp\Big(\frac{\imath\theta}{2}\big(l_x m_y - m_x l_y\big)\Big) . \]
Now because $A_2$ has compact support, we can replace the summation over $l_y \ge -k$ by that over $l_y \in \mathbb{Z}$ for $k$ sufficiently large. Comparing with (2) we see that $\pi(\hat A \hat B) = \pi(\hat A)\pi(\hat B)$ as $\pi(\hat A) = A_2$ and $\pi(\hat B) = B_2$. The identity $\pi(\hat A)^* = \pi(\hat A^*)$ can be immediately checked. This moreover shows that $\mathrm{Im}(i) = \mathrm{Ker}(\pi)$.
$\mathcal{A}_0$ will be called the bulk algebra, $T(\mathcal{A})_0$ its Toeplitz extension and $\mathcal{E}_0$ the edge algebra. The algebra $\mathcal{A}_0$ and its various completions will describe operators which are homogeneous in the plane, $\mathcal{E}_0$ models operators which are homogeneous along the boundary (x-direction), but compact in the y-direction perpendicular to the
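boundary, whereas $T(\mathcal{A})_0$ contains both of these operators, notably homogeneous half-plane operators with compact boundary contributions. By means of the following formulas, we introduce two families of physical representations $\pi_\omega$ and $\hat\pi_\omega$, $\omega \in \Omega$, of the ∗-algebras $\mathcal{A}_0$ and $T(\mathcal{A})_0$ on $\ell^2(\mathbb{Z}^2)$ and $\ell^2(\mathbb{Z} \times \mathbb{N})$ respectively (both $n = (n_x, n_y)$ and $m = (m_x, m_y)$ are in $\mathbb{Z}^2$ or $\mathbb{Z} \times \mathbb{N}$ respectively):
\[ \langle n|\pi_\omega(A)|m\rangle = A(T^{-n}\omega, m - n)\, \exp\Big(\frac{\imath\theta}{2}\, n \wedge m\Big) , \qquad A \in \mathcal{A}_0 , \qquad (7) \]
and
\[ \langle n|\hat\pi_\omega(\hat A)|m\rangle = \hat A(T^{-n}\omega, m_x - n_x, n_y, m_y)\, \exp\Big(\frac{\imath\theta}{2}\, n \wedge m\Big) , \qquad \hat A \in T(\mathcal{A})_0 . \qquad (8) \]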
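As a sanity check on the sign conventions in (2) and (7), the following minimal Python sketch compares matrix elements of $\pi(AB)$ with those of $\pi(A)\pi(B)$, and of $\pi(A^*)$ with those of $\pi(A)^*$. It assumes the simplest possible hull, namely $\Omega$ reduced to a single point, so that elements of $\mathcal{A}_0$ become finitely supported functions on $\mathbb{Z}^2$; the value of `THETA`, the helper names and the test data are illustrative choices and not part of the original text.

```python
import cmath

THETA = 0.7  # flux per unit cell; arbitrary test value (assumption)

def wedge(a, b):
    """a ^ b = a_x b_y - a_y b_x."""
    return a[0] * b[1] - a[1] * b[0]

def phase(a, b):
    """exp(i * theta/2 * a ^ b)."""
    return cmath.exp(0.5j * THETA * wedge(a, b))

def product(A, B):
    """Twisted convolution (2) for a one-point hull:
    AB(m) = sum_l A(l) B(m - l) exp(i theta/2 l ^ m)."""
    out = {}
    for l, a_l in A.items():
        for k, b_k in B.items():
            m = (l[0] + k[0], l[1] + k[1])  # so that k = m - l
            out[m] = out.get(m, 0) + a_l * b_k * phase(l, m)
    return out

def star(A):
    """Involution in (2) for a one-point hull: A*(m) = conj(A(-m))."""
    return {(-m[0], -m[1]): a.conjugate() for m, a in A.items()}

def matel(A, n, m):
    """Matrix element (7): <n|pi(A)|m> = A(m - n) exp(i theta/2 n ^ m)."""
    d = (m[0] - n[0], m[1] - n[1])
    return A.get(d, 0) * phase(n, m)

def matel_of_operator_product(A, B, n, m):
    """<n|pi(A) pi(B)|m>; only intermediate sites k in n + supp(A) contribute."""
    total = 0
    for l in A:
        k = (n[0] + l[0], n[1] + l[1])
        total += matel(A, n, k) * matel(B, k, m)
    return total

def max_defect(A, B, box=3):
    """Largest violation of pi(AB) = pi(A)pi(B) and pi(A*) = pi(A)* on a box of sites."""
    sites = [(x, y) for x in range(-box, box + 1) for y in range(-box, box + 1)]
    AB, A_star = product(A, B), star(A)
    worst = 0.0
    for n in sites:
        for m in sites:
            d1 = abs(matel(AB, n, m) - matel_of_operator_product(A, B, n, m))
            d2 = abs(matel(A_star, n, m) - matel(A, m, n).conjugate())
            worst = max(worst, d1, d2)
    return worst

A = {(1, 0): 1 + 2j, (0, 1): 0.5, (-1, 2): -1j}
B = {(0, 0): 2.0, (1, 1): 1j, (0, -1): 3.0}
print(max_defect(A, B))
```

Because the supports are finite, every sum involved is finite and the comparison is exact up to floating-point rounding. For a genuinely disordered hull the same computation would additionally have to carry the shifted configurations $T^{-n}\omega$.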
It is elementary to verify that $\pi_\omega$ and $\hat\pi_\omega$ are ∗-morphisms from $\mathcal{A}_0$ and $T(\mathcal{A})_0$ to the algebras of bounded operators on $\ell^2(\mathbb{Z}^2)$ and $\ell^2(\mathbb{Z} \times \mathbb{N})$ respectively and further that the following covariance relations hold ($a = (a_x, a_y) \in \mathbb{Z}^2$):
\[ U(a)\pi_\omega(A)U(a)^* = \pi_{T^a\omega}(A) , \qquad \hat U(a_x, 0)\,\hat\pi_\omega(\hat A)\,\hat U(a_x, 0)^* = \hat\pi_{T^{(a_x,0)}\omega}(\hat A) , \qquad (9) \]
where $\hat U$ denotes the restriction of $U$ to $\ell^2(\mathbb{Z} \times \mathbb{N})$. Covariance under $\hat U(0, a_y)$ only holds for $a_y \ge 0$; for $a_y < 0$ there are corrections by operators in $\mathcal{E}_0$.
2.3. Exact sequence of C∗-algebras
Using these representations, we can now introduce semi-norms on $\mathcal{A}_0$ and $T(\mathcal{A})_0$ ($A \in \mathcal{A}_0$ and $\hat A \in T(\mathcal{A})_0$):
\[ \|A\| = \sup_{\omega \in \Omega} \|\pi_\omega(A)\| , \qquad \|\hat A\| = \sup_{\omega \in \Omega} \|\hat\pi_\omega(\hat A)\| . \qquad (10) \]
It is elementary to verify that these are C∗-norms. The C∗-algebras $\mathcal{A}$, $T(\mathcal{A})$ and $\mathcal{E}$ are defined to be the closure of $\mathcal{A}_0$, $T(\mathcal{A})_0$ and $\mathcal{E}_0$ with respect to these norms. Now, because any ∗-morphism between pre-C∗-algebras can be extended by continuity to their C∗-closures, the representations $\pi_\omega$ and $\hat\pi_\omega$ extend to covariant representations of the C∗-algebras and furthermore Lemma 2.3 leads to an exact sequence of C∗-algebras:
\[ 0 \to \mathcal{E} \xrightarrow{\ i\ } T(\mathcal{A}) \xrightarrow{\ \pi\ } \mathcal{A} \to 0 . \qquad (11) \]
We note that $\mathcal{A}$ is the twisted crossed product $C(\Omega) \times_T \mathbb{Z}^2$ of the dynamical system $(\Omega, T, \mathbb{Z}^2)$ associated with the magnetic twisting introduced above. The structure of this exact sequence will be further analysed in Sec. 2.6.
Example 2.4. Let us consider this exact sequence explicitly for the case of the rotation algebra $\mathcal{A}_\theta$, namely the C∗-algebra generated by the unitary operators $U_x$ and $U_y$. Then the Toeplitz extension is generated by $\hat U_x$ and $\hat U_y$ which still satisfy the same commutation relation $\hat U_y \hat U_x = e^{\imath\theta} \hat U_x \hat U_y$, but while $\hat U_x$ remains unitary, $\hat U_y$ satisfies $\hat U_y^* \hat U_y = 1$ and $\hat U_y \hat U_y^* = 1 - \hat\Pi_0$ where $\hat\Pi_0 = \sum_{l \in \mathbb{Z}} |l, 0\rangle\langle l, 0|$.
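The edge C∗-algebra is given by the C∗-tensor product of $C(S^1)$, notably the C∗-algebra generated by $\hat U_x$, with the compact operators $\mathcal{K}$ in y-direction (modulo the gauge transformation discussed in Sec. 2.6). The projection is given by $\pi(\hat U_{x,y}) = U_{x,y}$.
The following lemma shows that the Hamiltonian which we started out with is in the C∗-algebra $\mathcal{A}$. For its proof we refer the reader to [11, 12].
Lemma 2.5. Suppose that $\tilde H$ is a homogeneous bounded self-adjoint operator on $\ell^2(\mathbb{Z}^2)$ the off-diagonal coefficients of which fall off like $|\langle n|\tilde H|m\rangle| \le c\,|n - m|^{-2}$ for some constant $c > 0$. Let $(\Omega, T, \mathbb{Z}^2)$ be its hull. Then there exists an $H \in \mathcal{A}$ and an $\omega_0 \in \Omega$ such that $\pi_{\omega_0}(H) = \tilde H$.
2.4. Von Neumann algebra and its extension
Let $\mathbf{P}$ be a probability measure on $\Omega$ which is invariant and ergodic with respect to both $T_x$ and $T_y$. We suppose that its topological support is all of $\Omega$. All our results hold for any such measure. We now construct the (twisted) von Neumann crossed product $L^\infty(\mathcal{A}, \mathbf{P})$ of the W∗-dynamical system $(L^\infty(\Omega, \mathbf{P}), T, \mathbb{Z}^2)$ by the standard procedure [35, 19] using the direct integral representation:
\[ \pi_\Omega = \int^{\oplus} d\mathbf{P}(\omega)\, \pi_\omega : \mathcal{A} \to \mathcal{B}(L^2(\Omega \times \mathbb{Z}^2, \mathbf{P} \otimes \lambda)) , \]
where $\lambda$ is the counting measure on $\mathbb{Z}^2$ and $\mathcal{B}(\mathcal{H})$ denotes the algebra of bounded operators on the Hilbert space $\mathcal{H} = L^2(\Omega \times \mathbb{Z}^2, \mathbf{P} \otimes \lambda)$. Then $L^\infty(\mathcal{A}, \mathbf{P})$ is defined as the weak closure of $\pi_\Omega(\mathcal{A})$ in $\mathcal{B}(\mathcal{H})$. Elements of $L^\infty(\mathcal{A}, \mathbf{P})$ are weakly-measurable (in $\omega$) covariant operator families [19, 11] and can hence be described by measurable functions on $\Omega \times \mathbb{Z}^2$ satisfying $A(\omega, n) \to 0$ as $|n| \to \infty$. We continue to use the notation $\pi_\omega$ for the (fiberwise) representation of these functions on $\ell^2(\mathbb{Z}^2)$. The C∗-norm on $L^\infty(\mathcal{A}, \mathbf{P})$ is given by
\[ \|A\|_\infty = \mathbf{P}\text{-}\operatorname*{ess\,sup}_{\omega \in \Omega} \|\pi_\omega(A)\| . \]
Similarly, one can construct the W∗-crossed product $L^\infty(C(\Omega) \times_{\alpha_x} \mathbb{Z}, \mathbf{P})$ of the W∗-dynamical system $(L^\infty(\Omega, \mathbf{P}), \alpha_x, \mathbb{Z})$ where $(\alpha_x f)(\omega) = f(T_x^{-1}\omega)$ for $f \in L^\infty(\Omega, \mathbf{P})$. Elements therein can be represented by functions on $\Omega \times \mathbb{Z}$. Let $L^\infty(\mathcal{E}, \mathbf{P})_0$ denote the finite dimensional matrices with entries in $L^\infty(C(\Omega) \times_{\alpha_x} \mathbb{Z}, \mathbf{P})$. We can represent elements of $L^\infty(\mathcal{E}, \mathbf{P})_0$ as measurable functions on $\Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$. Set further $L^\infty(T(\mathcal{A}), \mathbf{P})_0 = L^\infty(\mathcal{E}, \mathbf{P})_0 \oplus L^\infty(\mathcal{A}, \mathbf{P})$. Using the identification (3), the formulas (4) and (5) introduce an algebraic structure on $L^\infty(\mathcal{E}, \mathbf{P})_0$ and $L^\infty(T(\mathcal{A}), \mathbf{P})_0$. Using again the fiberwise representations $\hat\pi_\omega$ identifying the functions on $\Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ with operators on $\mathcal{B}(L^2(\Omega \times \mathbb{Z} \times \mathbb{N}, \mathbf{P} \otimes \lambda))$, a C∗-norm on both of these algebras is given by
\[ \|\hat A\|_\infty = \mathbf{P}\text{-}\operatorname*{ess\,sup}_{\omega \in \Omega} \|\hat\pi_\omega(\hat A)\| , \]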
December 27, 2001 16:1 WSPC/148-RMP
96
00110
J. Kellendonk, T. Richter & H. Schulz-Baldes
and the closure w.r.t. this norm define the C∗ -algebras L∞ (E, P) and L∞ (T (A), P). Finally the inclusion i and projection π on the pre-C∗ algebras L∞ (E, P)0 and L∞ (T (A), P)0 can be defined as in (6) and be extended by continuity to the C∗ -closures, giving rise to another exact sequence of C∗ -algebras. 2.5. Non-commutative analysis tools The algebra L∞ (A, P) admits a differential structure generated by a two-parameter group ρk , k = (kx , ky ) ∈ T 2 = [−π, π)×2 , of ∗-automorphisms (A ∈ L∞ (A, P) and m = (mx , my ) ∈ Z2 ): ρk (A)(ω, m) = exp(ı(kx mx + ky my ))A(ω, m) .
(12)
The associated ∗-derivations $\vec\nabla = (\nabla_x, \nabla_y)$ are densely defined and explicitly given by
$$\nabla_x A(\omega, m) = \imath\, m_x A(\omega, m) , \qquad \nabla_y A(\omega, m) = \imath\, m_y A(\omega, m) , \tag{13}$$
whenever the right hand sides are in $L^\infty(\mathcal A, \mathbf P)$. If $\vec X = (X, Y)$ denotes the position operators on $\ell^2(\mathbb Z^2)$, then one can verify for such differentiable operators the identities:
$$\pi_\omega(\vec\nabla(A)) = \imath\,[\pi_\omega(A), \vec X] . \tag{14}$$
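As a finite-dimensional illustration of (14), the following Python sketch checks that taking ı times the commutator with a position operator multiplies each matrix element by ı times a displacement. The one-dimensional window of sites, the helper names and the sample matrix are assumptions made only for this illustration; the convention relating the functions A(ω, m) to matrix elements is fixed elsewhere in the paper and is not reproduced here.

```python
def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def i_commutator_with_position(op, sites):
    """Return i*[op, X], where X is diagonal with the given site labels."""
    size = len(sites)
    x = [[sites[r] if r == c else 0 for c in range(size)] for r in range(size)]
    ox, xo = matmul(op, x), matmul(x, op)
    return [[1j * (ox[r][c] - xo[r][c]) for c in range(size)]
            for r in range(size)]


# Hypothetical sample: a finite window of lattice sites and a banded matrix.
sites = [-2, -1, 0, 1, 2]
op = [[complex(r + 1, c - r) if abs(r - c) <= 2 else 0j for c in range(5)]
      for r in range(5)]

derivative = i_commutator_with_position(op, sites)

# Entry (r, c) of i*[op, X] should equal i * (x_c - x_r) * op[r][c].
max_deviation = max(
    abs(derivative[r][c] - 1j * (sites[c] - sites[r]) * op[r][c])
    for r in range(5) for c in range(5)
)
print(max_deviation)
```

In this toy model the diagonal entries of the derivative vanish, since the displacement is zero there.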
Similarly, there is a two-parameter family of ∗-automorphisms $\hat\rho_k$ on $L^\infty(T(\mathcal A), \mathbf P)$ given by
$$\hat\rho_k(\hat A)(\omega, m_x, n_y, m_y) = \exp(\imath(k_x m_x + k_y(m_y - n_y)))\,\hat A(\omega, m_x, n_y, m_y) . \tag{15}$$
The associated ∗-derivations are denoted $\vec{\hat\nabla} = (\hat\nabla_x, \hat\nabla_y)$. They satisfy identities similar to (14). For $n \in \mathbb N$, we define $C^n(\mathcal A)$ to be those elements for which all nth order gradients are in the C∗-algebra. Given an invariant and ergodic probability measure $\mathbf P$ with support $\Omega$, a normalized faithful trace $\mathcal T$ on $L^\infty(\mathcal A, \mathbf P)$ is defined by [35]
$$\mathcal T(A) = \int_\Omega d\mathbf P(\omega)\,A(\omega, 0) , \qquad A \in L^\infty(\mathcal A, \mathbf P) . \tag{16}$$
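$\mathcal T$ is the trace per unit volume [11, 12]. For $p \in [1, \infty)$, the Banach space $L^p(\mathcal A, \mathbf P)$ is defined as the closure of $\mathcal A_0$ under the norm $\|A\|_p = (\mathcal T(|A|^p))^{1/p}$. If $\pi_{GNS}$ is the GNS representation of $\mathcal A$ on $L^2(\mathcal A, \mathcal T)$ associated with the state $\mathcal T$, then the von Neumann algebra $\pi_{GNS}(\mathcal A)''$, where $''$ is the bicommutant, can easily be seen to be isomorphic to $L^\infty(\mathcal A, \mathbf P)$ [19]. A trace $\hat{\mathcal T}$ on $L^\infty(T(\mathcal A), \mathbf P)$ is given by $\hat{\mathcal T}(\hat A) = \mathcal T(\pi(\hat A))$. However, $\hat{\mathcal T}$ is obviously not faithful. Let now $L^\infty(T(\mathcal A), \mathbf P)_+$ denote the cone of positive operators in $L^\infty(T(\mathcal A), \mathbf P)$ and define $\hat{\mathcal T}_E : L^\infty(T(\mathcal A), \mathbf P)_+ \to [0, \infty]$ by
$$\hat{\mathcal T}_E(\hat A) = \sum_{m_y \in \mathbb N} \int d\mathbf P(\omega)\,\hat A(\omega, 0, m_y, m_y) . \tag{17}$$
Using covariance, one sees that $\hat{\mathcal T}_E$ is the disorder averaged trace per unit volume in the x-direction, followed by the usual trace in the y-direction. Clearly $\hat{\mathcal T}_E$ is a weight, namely $\hat{\mathcal T}_E(\hat A + \lambda \hat B) = \hat{\mathcal T}_E(\hat A) + \lambda \hat{\mathcal T}_E(\hat B)$ for all $\hat A, \hat B \in L^\infty(T(\mathcal A), \mathbf P)_+$ and $\lambda \ge 0$. Moreover one can directly check that $\hat{\mathcal T}_E(\hat A \hat A^*) = \hat{\mathcal T}_E(\hat A^* \hat A)$ for all $\hat A \in L^\infty(T(\mathcal A), \mathbf P)$. Hence $\hat{\mathcal T}_E$ is a trace [22]. In particular, it is unitary invariant, that is for all $\hat A \in L^\infty(T(\mathcal A), \mathbf P)_+$ and unitary $\hat U \in L^\infty(T(\mathcal A), \mathbf P)$ one has $\hat{\mathcal T}_E(\hat U \hat A \hat U^*) = \hat{\mathcal T}_E(\hat A)$.

As usual (e.g. [22, 35]), one now says that an operator $\hat A \in L^\infty(T(\mathcal A), \mathbf P)$ is $\hat{\mathcal T}_E$-trace-class if $\hat{\mathcal T}_E(|\hat A|) < \infty$. The set of all $\hat{\mathcal T}_E$-trace-class operators forms an ideal in $L^\infty(T(\mathcal A), \mathbf P)$ onto which $\hat{\mathcal T}_E$ can be extended. Further let us define $L^p(\mathcal E, \mathbf P)$ as the closure of $\mathcal E_0$ with respect to the norm $\|\hat A\|_p = (\hat{\mathcal T}_E(|\hat A|^p))^{1/p}$. The ideal of $\hat{\mathcal T}_E$-trace-class operators is then $L^1(\mathcal E, \mathbf P) \cap L^\infty(\mathcal E, \mathbf P)$. It can be directly verified that the traces $\mathcal T$ and $\hat{\mathcal T}_E$ are invariant under the automorphism groups $\rho_k$ and $\hat\rho_k$ respectively. Thus for any $A \in C^1(\mathcal A)$ and any $\hat A \in L^1(\mathcal E, \mathbf P) \cap C^1(\mathcal E)$:
$$\mathcal T(\nabla_{x,y} A) = 0 , \qquad \hat{\mathcal T}_E(\hat\nabla_x \hat A) = 0 . \tag{18}$$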
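2.6. Iterated crossed product and Landau gauge

Let $C(\Omega)\times_{\alpha_x}\mathbb Z$ be the C∗-crossed product without twisting [35] of the C∗-dynamical system $(C(\Omega), \alpha_x, \mathbb Z)$ where $(\alpha_x f)(\omega) = f(T_x^{-1}\omega)$ for $f \in C(\Omega)$. Elements of $C(\Omega)\times_{\alpha_x}\mathbb Z$ can be represented by continuous functions on $\Omega \times \mathbb Z$. This C∗-algebra inherits a dynamics by the action $\alpha_y$ defined by
$$(\alpha_y f)(\omega, m_x) = \exp(-\imath\theta m_x)\, f(T_y^{-1}\omega, m_x) , \qquad f \in C(\Omega)\times_{\alpha_x}\mathbb Z .$$
Hence we obtain an iterated crossed product C∗-algebra $C(\Omega)\times_{\alpha_x}\mathbb Z\times_{\alpha_y}\mathbb Z$. Its elements are given by functions on $\Omega \times \mathbb Z^2$ and the multiplication is explicitly given by
$$AB(\omega, m) = \sum_{l \in \mathbb Z^2} A(\omega, l)\, B(T^{-l}\omega, m - l)\, \exp(\imath\theta\, l_y (l_x - m_x)) ,$$
mimicking the Landau gauge. The iterated crossed product has a Toeplitz extension w.r.t. the action $\alpha_y$ in the sense of Pimsner and Voiculescu [37] and this extension turns out to be isomorphic to the exact sequence (11):
$$\begin{array}{ccccccccc}
0 & \to & C(\Omega)\times_{\alpha_x}\mathbb Z \otimes \mathcal K & \xrightarrow{\ \psi\ } & T & \xrightarrow{\ \pi\ } & C(\Omega)\times_{\alpha_x}\mathbb Z\times_{\alpha_y}\mathbb Z & \to & 0 \\
 & & \downarrow \rho & & \downarrow \kappa & & \downarrow \eta & & \\
0 & \to & \mathcal E & \xrightarrow{\ i\ } & T(\mathcal A) & \xrightarrow{\ \pi\ } & \mathcal A & \to & 0 .
\end{array} \tag{19}$$
Here $\otimes\mathcal K$ denotes the C∗-algebraic tensor product with the compact operators on $\ell^2(\mathbb N)$ and $\psi$, $\pi$ and $T$ are the mappings and the Toeplitz extension as defined in [37]. In fact, it is elementary to verify that
$$(\rho \hat A)(\omega, m_x, n_y, m_y) = \exp\Big(\frac{\imath}{2}\,\theta\, m_x (m_y + n_y)\Big)\, \hat A(T^{(0, n_y)}\omega, m_x, n_y, m_y) , \tag{20}$$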
and
$$(\eta A)(\omega, m) = \exp\Big(-\frac{\imath}{2}\,\theta\, m_x m_y\Big)\, A(\omega, m) , \tag{21}$$
are (gauge) isomorphisms. As it is not used later on, we do not write out an explicit formula for $\kappa$ here.

3. Main Results

Let $\hat H = (\hat K, H) \in T(\mathcal A)$ be some lift of the planar (stochastic) self-adjoint Hamiltonian $H \in \mathcal A$. For the representation corresponding to the disorder configuration $\omega$, this means:
$$\hat\pi_\omega(\hat H) = \Pi\, \pi_\omega(H)\, \Pi + \hat\pi_\omega(\hat K) ,$$
where $\Pi$ denotes the projection from $\ell^2(\mathbb Z^2)$ onto $\ell^2(\mathbb Z \times \mathbb N)$. We suppose $\hat K$ to have non-zero entries only a finite distance from the boundary (see Sec. 4.1 for more precise and general hypotheses). Choosing such a lift means that we fix some boundary condition for the operator on the half-plane which is homogeneous in the x-direction. Following the discussions in the introduction, let us now define the edge Hall conductivity as follows:
$$\sigma^e_\perp(\mu) = -\frac{q^2}{\hbar} \lim_{\Delta \to \{\mu\}} \frac{1}{|\Delta|}\, \hat{\mathcal T}_E(\hat P_\Delta \hat\nabla_x \hat H) . \tag{22}$$
Recall (see e.g. [20]) that the spectrum of $\pi_\omega(H)$, as a set, is $\mathbf P$-almost surely independent of $\omega \in \Omega$. Our results below show that, under the two hypotheses that (i) $H \in C^3(\mathcal A)$ and (ii) $\Delta$ is within a gap of the spectrum of $H$ (this condition is equivalent to asking $\Delta$ to be in a gap of the density of states), the operator $\hat P_\Delta \in L^\infty(\mathcal E, \mathbf P)$ is in the ideal of $\hat{\mathcal T}_E$-trace-class operators. Further we let $\Pi_x$ denote the projection in $\ell^2(\mathbb Z \times \mathbb N)$ onto $\ell^2(\mathbb N \times \mathbb N)$, namely the subspace spanned by the states $|m_x, m_y\rangle$ with $m_x \ge 0$.

Theorem 3.1. Suppose that $H \in C^3(\mathcal A)$ is a self-adjoint element and $\hat H \in C^3(T(\mathcal A))$ some self-adjoint lift of it. Let the closed interval $\Delta \subset \mathbb R$ be in a gap of the spectrum of $H$. Suppose that $\mu \in \Delta \mapsto \hat{\mathcal T}_E(\hat P_{[\inf(\Delta), \mu]})$ is continuous. Set
$$\hat U(\Delta) = \exp\Big(-2\pi\imath\, \frac{\hat H - E'}{|\Delta|}\, \hat P_\Delta\Big) , \qquad E' = \inf(\Delta) . \tag{23}$$
Then the operator $\Pi_x \hat\pi_\omega(\hat U(\Delta))^* \Pi_x$ is $\mathbf P$-almost surely a Fredholm operator on $\ell^2(\mathbb N \times \mathbb N)$ and its index is $\mathbf P$-almost surely constant. For any $\mu \in \Delta$, its value is equal to the edge Hall conductivity:
$$\sigma^e_\perp(\mu) = \frac{q^2}{h}\, \mathrm{Ind}\big(\Pi_x \hat\pi_\omega(\hat U(\Delta)^*) \Pi_x |_{\ell^2(\mathbb N \times \mathbb N)}\big) .$$
The second main result of this work concerns the equality of the edge Hall conductivity to the zero-temperature bulk Hall conductivity $\sigma^b_\perp(\mu)$ at Fermi energy $\mu$ of a gas of independent fermions described by a one-particle Hamiltonian $H \in \mathcal A$. Because the Lorentz drift is diffusive and non-dissipative, Kubo's formula allows one to calculate it in the zero-dissipation limit [12]. The following is sufficient for our present purposes:

Proposition 3.2 ([12, Corollary 2]). Suppose that $H \in C^1(\mathcal A)$ and that the Fermi level $\mu$ is in a gap of the spectrum of $H$. Then the bulk Hall conductivity $\sigma^b_\perp(\mu)$ at zero temperature and zero dissipation is
$$\sigma^b_\perp(\mu) = \frac{q^2}{h}\, 2\pi\imath\, \mathcal T(P_\mu [\nabla_x P_\mu, \nabla_y P_\mu]) . \tag{24}$$

Theorem 3.3. Under the same hypothesis as in Theorem 3.1,
$$\sigma^e_\perp(\mu) = \sigma^b_\perp(\mu) , \tag{25}$$
for any $\mu \in \Delta$. Their common value is constant in $\Delta$ and equal to an integer multiple of $q^2/h$. The Hall conductivity can either be calculated by the index in Theorem 3.1 or by that of reference [12]. Hence this work gives another alternative proof of the quantization of the bulk Hall conductivity, once one accepts (24) as its definition.

4. Edge Current Theory

4.1. Trace-class property near the Fermi surface

The following summability result will be used in the next section. It also implies that the edge current is actually a well-defined mathematical quantity under a gap condition on the plane operator. The precise conditions on the boundary condition of $\hat H = (\hat K, H)$ are the following:

Hypothesis: $\hat K = \hat K_1 + \hat K_2 \in C^3(\mathcal E)$ and there is an $L < \infty$ such that for all $\omega \in \Omega$:
$$\langle n_x, n_y|\hat\pi_\omega(\hat K_1)|m_x, m_y\rangle = 0 \ \text{ for } n_y > L , \qquad \langle n_x, n_y|\hat\pi_\omega(\hat K_2)|m_x, m_y\rangle = 0 \ \text{ for } m_y > L .$$
Let further the x-sign operator on $\ell^2(\mathbb Z \times \mathbb N)$ be defined by
$$\frac{X}{|X|}\,|m_x, m_y\rangle = \mathrm{sgn}(m_x)\,|m_x, m_y\rangle = \begin{cases} |m_x, m_y\rangle , & m_x \ge 0 , \\ -|m_x, m_y\rangle , & m_x < 0 , \end{cases} \qquad (m_x, m_y) \in \mathbb Z \times \mathbb N .$$

Proposition 4.1. Suppose that $\hat H = (\hat K, H) \in C^3(T(\mathcal A))$ where $\hat K$ satisfies the above hypothesis. Let the interval $\Delta \subset \mathbb R$ be contained in a gap of the spectrum of $\pi_\omega(H)$ for all $\omega \in \Omega_g$ where $\mathbf P(\Omega_g) = 1$. Let $f$ be a real $C^4$-function supported in $\Delta$. Then $f(\hat H) \in C^3(\mathcal E) \cap L^1(\mathcal E, \hat{\mathcal T}_E)$ and moreover the Hilbert–Schmidt norm of the operator $[\frac{X}{|X|}, \hat\pi_\omega(f(\hat H))]$ on $\ell^2(\mathbb Z \times \mathbb N)$ is uniformly bounded for all $\omega \in \Omega_g$.

Proof. First recall that every operator $A \in C^\gamma(\mathcal A)$ satisfies
$$|\langle n|\pi_\omega(A)|m\rangle| \le \frac{c}{1 + |n - m|^\gamma} , \tag{26}$$
where we use the notation $|n - m|^\gamma = |n_x - m_x|^\gamma + |n_y - m_y|^\gamma$. A similar bound holds for operators in $C^\gamma(T(\mathcal A))$. As $\hat H \in C^3(T(\mathcal A))$ and $f$ is a $C^4$-function, we deduce that $f(\hat H) \in C^3(T(\mathcal A))$ (this follows from a straightforward generalization of [39, Theorem 3.3.7]) so that (26) holds for $f(\hat H)$ with $\gamma = 3$.

Furthermore, given $E \in \Delta$, let $E' \mapsto g_E(E')$ be some $C^4$-function equal to $(E - E')^{-1}$ outside of $\Delta$. By the same argument as above, the operator $g_E(H)$ is in $C^3(\mathcal A)$ as long as $E \in \Delta$. Maximising over $E \in \Delta$, we conclude that the bound (26) holds for $A = (E - H)^{-1}$ for a constant $c$ which can be chosen uniform in $E \in \Delta$ and $\omega \in \Omega_g$. As a last preliminary, let us note that the operator $(\hat H - \Pi H) = \hat K_1 + (\hat K_2 - \Pi H(1 - \Pi))$ from $\ell^2(\mathbb Z^2)$ to $\ell^2(\mathbb Z \times \mathbb N)$ also satisfies (26) for all $\omega \in \Omega$ and that
$$\langle n|\pi_\omega(\hat K_2 - \Pi H(1 - \Pi))|m\rangle = 0 \qquad \text{for } m_y > L .$$
The main idea is now to use these bounds in Stone's formula for $f(\hat H)$ in which the resolvent is developed according to the geometric resolvent formula. As all the above bounds hold for all $\omega \in \Omega_g$, we suppress the index $\omega \in \Omega_g$ throughout the rest of the proof. We insert the geometric resolvent identity
$$\frac{1}{z\Pi - \hat H} = \Pi\, \frac{1}{z - H} + \frac{1}{z\Pi - \hat H}\, (\hat H - \Pi H)\, \frac{1}{z - H} \tag{27}$$
twice into Stone's formula for $f(\hat H)$ and develop $(E \pm \imath\epsilon - H)^{-1}$ into a power series in $\epsilon$. Of this series only the constant term, namely $(E - H)^{-1}$, remains after the limit $\epsilon \to 0$:
$$f(\hat H) = \frac{1}{2\pi\imath}\, \operatorname*{s-lim}_{\epsilon \to 0} \int_\Delta dE\, f(E) \Big( \frac{1}{E - \imath\epsilon - \hat H} - \frac{1}{E + \imath\epsilon - \hat H} \Big) (\hat H - \Pi H)\, \frac{1}{E - H} , \tag{28}$$
and therefore
$$|\langle n|f(\hat H)|m\rangle| \le c \sum_{l \in \mathbb Z \times \mathbb N} \sum_{k \in \mathbb Z^2} |\langle n|f(\hat H)|l\rangle|\; |\langle l|\hat H - \Pi H|k\rangle|\; \sup_{E \in \Delta} \Big|\langle k|\frac{1}{E - H}|m\rangle\Big| . \tag{29}$$
As pointed out above, the operator $\hat H - \Pi H$ is localized near the boundary. We decompose it into two parts $\hat K_1$ and $\hat K_2 - \Pi H(1 - \Pi)$ for which we do the following bounds separately. The contribution $C_2$ of $\hat K_2 - \Pi H(1 - \Pi)$ can be bounded by
$$C_2 \le c \sum_{l \in \mathbb Z \times \mathbb N}\ \sum_{k_x \in \mathbb Z,\, k_y \le L} \frac{1}{1 + |n - l|^3}\, \frac{1}{1 + |l - k|^3}\, \frac{1}{1 + |k - m|^3} \le c \Big( \frac{1}{1 + |n_x - m_x|^3 + m_y^3} + \frac{1}{1 + |n_x - m_x|^2}\, \frac{1}{1 + m_y^2} \Big) , \tag{30}$$
(here and below, $c$ denotes different constants), and we have used various elementary estimations such as ($\eta, \gamma > 1$):
$$\sum_{k \in \mathbb Z} \frac{1}{|a| + |k|^\eta}\, \frac{1}{|b| + |k - m|^\gamma} \le c \Big( |a|^{-1 + 1/\eta}\, \frac{1}{|b| + |m|^\gamma} + |b|^{-1 + 1/\gamma}\, \frac{1}{|b| + |m|^\eta} \Big) .$$
Similarly, one can bound $C_1$. Hence $\langle n|f(\hat H)|m\rangle$ can be bounded by the r.h.s. of (30). This in particular implies that for all $\omega \in \Omega_g$:
$$|\langle 0, m_y|f(\hat H)|0, m_y\rangle| \le \frac{c}{1 + m_y^2} .$$
As this holds $\mathbf P$-almost surely with a uniform constant, it directly implies that $f(\hat H)$ is $\hat{\mathcal T}_E$-trace-class. Inserting (30) twice into
$$\mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \Big( \Big| \Big[ \frac{X}{|X|}, f(\hat H) \Big] \Big|^2 \Big) = \sum_{n, m \in \mathbb Z \times \mathbb N} (\mathrm{sgn}(m_x) - \mathrm{sgn}(m_x + n_x))^2\, |\langle n|f(\hat H)|m\rangle|^2 ,$$
the second assertion follows from a short calculation.

Remark 4.2. The above proof can be slightly modified in order to prove that $\hat P_\Delta \in L^\infty(\mathcal E, \mathbf P)$ and is $\hat{\mathcal T}_E$-trace-class under the slightly stronger hypothesis that $H \in C^4(\mathcal A)$. For that purpose one has to replace $f$ in (28) by the indicator function on $\Delta$ and then bound $\langle n|\hat P_\Delta|l\rangle$ in (29) trivially by 1. However, the proof of Theorem 3.1 below shows that the $\hat{\mathcal T}_E$-trace-class property of $\hat P_\Delta$ holds whenever $H \in C^3(\mathcal A)$.

Remark 4.3. One can show that, if $\hat A \in \mathcal E$ is $\hat{\mathcal T}_E$-trace-class and an element of $C^2(\mathcal E)$, then $[\frac{X}{|X|}, \hat\pi_\omega(\hat A)]$ is a Hilbert–Schmidt operator on $\ell^2(\mathbb Z \times \mathbb N)$ for $\mathbf P$-almost all $\omega \in \Omega$. This follows from the remarkable identity
$$\frac{1}{4} \int d\mathbf P(\omega)\, \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \Big( \Big| \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big|^2 \Big) = \hat{\mathcal T}_E \Big( (\sigma \hat A)^*\, \frac{1}{\imath}\, \hat\nabla_x \hat A \Big) , \tag{31}$$
holding for any $\hat A \in \mathcal E_0$ where
$$(\sigma \hat A)(\omega, m_x, n_y, m_y) = \mathrm{sgn}(m_x)\, \hat A(\omega, m_x, n_y, m_y) .$$
As $\hat A \in C^2(\mathcal E)$ implies that $(\sigma \hat A) \in \mathcal E$, the formula (31) can be extended to $C^2(\mathcal E) \cap L^1(\mathcal E, \hat{\mathcal T}_E)$, implying hence the above result. Note that this does not give us a uniform bound in $\omega$ though and is thus a weaker result (not sufficient for our purposes below). A proof of (31) can be given along the lines of that of Proposition 4.5.
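4.2. 1-cocycles over the edge algebra

The following expression is well-defined and finite for $\hat A, \hat B \in \mathcal E_0$:
$$\xi_1(\hat A, \hat B) = \imath\, \hat{\mathcal T}_E(\hat A\, \hat\nabla_x \hat B) . \tag{32}$$
Actually $\xi_1$ can be defined on a much wider class of operators. For the purpose of this work, it is sufficient to consider elements in $\mathcal D = C^2(\mathcal E) \cap L^1(\mathcal E, \hat{\mathcal T}_E)$. Note that the product rule for $\hat\nabla_x$ and the ideal property of the $\hat{\mathcal T}_E$-trace-class operators imply that $\mathcal D$ is an algebra. It becomes a normed algebra when endowed with the norm:
$$\|\hat A\|_{\mathcal D} = \|\hat A\| + \hat{\mathcal T}_E(|\hat A|) + 2\|\hat\nabla_x \hat A\| + \|\hat\nabla_x^2 \hat A\| .$$

Lemma 4.4. $\xi_1$ is a cyclic 1-cocycle on $\mathcal D$, notably it is cyclic and closed under the Hochschild boundary operator $b$:
(i) $\xi_1(\hat A, \hat B) = -\xi_1(\hat B, \hat A)$ for all $\hat A, \hat B \in \mathcal D$.
(ii) $0 = b\xi_1(\hat A, \hat B, \hat C) = \xi_1(\hat A \hat B, \hat C) - \xi_1(\hat A, \hat B \hat C) + \xi_1(\hat C \hat A, \hat B)$ for all $\hat A, \hat B, \hat C \in \mathcal D$.

Proof. Both algebraic identities can be verified using the product rule for the derivation $\hat\nabla_x$ and the invariance of the trace $\hat{\mathcal T}_E$ under $\hat\nabla_x$.

Next we introduce another 1-cocycle on $\tilde{\mathcal D}$ by setting
$$\zeta_1(\hat A, \hat B) = \int d\mathbf P(\omega)\, \zeta_1^\omega(\hat A, \hat B) , \qquad \hat A, \hat B \in \tilde{\mathcal D} ,$$
where
$$\zeta_1^\omega(\hat A, \hat B) = \frac{1}{4}\, \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \Big( \frac{X}{|X|} \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] \Big) . \tag{33}$$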
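According to Remark 4.3 of the last section, this cocycle is actually well-defined and finite on $\mathcal D$.

Proposition 4.5. On $\mathcal D$, we have $\zeta_1 = \xi_1$.

Proof. It is sufficient to show the equality for the dense subalgebra $\mathcal E_0 \subset \mathcal D$ because both $\zeta_1$ and $\xi_1$ are continuous with respect to $\|\cdot\|_{\mathcal D}$. A direct calculation shows that for $\hat A, \hat B \in \mathcal E_0$:
$$\zeta_1(\hat A, \hat B) = -\frac{1}{4} \int d\mathbf P(\omega) \sum_{m \in \mathbb Z \times \mathbb N}\ \sum_{l \in \mathbb Z \times \mathbb N} \mathrm{sgn}(m_x)\, (\mathrm{sgn}(m_x) - \mathrm{sgn}(l_x))^2\, \langle m|\hat\pi_\omega(\hat A)|l\rangle\, \langle l|\hat\pi_\omega(\hat B)|m\rangle .$$
Because $\hat A \in \mathcal E_0$, the sum over $m_x \in \mathbb Z$ actually only contains a finite number of elements, and can thus be exchanged with the integral over $\mathbf P$. Then we make the change of variables $n_x = l_x - m_x$ and use the covariance relation (9) in order to obtain:
$$\zeta_1(\hat A, \hat B) = -\frac{1}{4} \sum_{m_x \in \mathbb Z} \int d\mathbf P(\omega) \sum_{m_y, l_y \in \mathbb N}\ \sum_{n_x \in \mathbb Z} \mathrm{sgn}(m_x)\, (\mathrm{sgn}(m_x) - \mathrm{sgn}(m_x + n_x))^2\, \langle 0, m_y|\hat\pi_{T^{(-m_x, 0)}\omega}(\hat A)|n_x, l_y\rangle\, \langle n_x, l_y|\hat\pi_{T^{(-m_x, 0)}\omega}(\hat B)|0, m_y\rangle .$$
Next, by invariance of the measure $\mathbf P$, we can replace $T^{(-m_x, 0)}\omega$ by $\omega$. Then we exchange the sum over $m_x$ and the integral over $\mathbf P$ again and use the identity
$$\sum_{m_x \in \mathbb Z} \mathrm{sgn}(m_x)\, (\mathrm{sgn}(m_x) - \mathrm{sgn}(m_x + n_x))^2 = -4 n_x .$$
By definition of $\hat\nabla_x$, one therefore has
$$\zeta_1(\hat A, \hat B) = \imath \int d\mathbf P(\omega) \sum_{m_y \in \mathbb N} \langle 0, m_y|\hat\pi_\omega(\hat A)\, \hat\pi_\omega(\hat\nabla_x \hat B)|0, m_y\rangle ,$$
which is precisely $\xi_1(\hat A, \hat B)$.

Finally let us introduce a further 1-cocycle on $\tilde{\mathcal D}$ using the projection operator $\Pi_x = (1 + X/|X|)/2$ on $\ell^2(\mathbb Z \times \mathbb N)$:
$$\eta_1(\hat A, \hat B) = \int d\mathbf P(\omega)\, \eta_1^\omega(\hat A, \hat B) , \qquad \hat A, \hat B \in \tilde{\mathcal D} ,$$
where
$$\eta_1^\omega(\hat A, \hat B) = \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \big( \Pi_x \hat\pi_\omega(\hat B) \hat\pi_\omega(\hat A) \Pi_x - \Pi_x \hat\pi_\omega(\hat B) \Pi_x \hat\pi_\omega(\hat A) \Pi_x \big) - \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \big( \Pi_x \hat\pi_\omega(\hat A) \hat\pi_\omega(\hat B) \Pi_x - \Pi_x \hat\pi_\omega(\hat A) \Pi_x \hat\pi_\omega(\hat B) \Pi_x \big) . \tag{34}$$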
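The sign identity used in the proof of Proposition 4.5 can be checked by direct enumeration. The Python sketch below assumes the convention sgn(0) = +1 of the x-sign operator defined in Sec. 4.1; the function names and the cutoff are hypothetical choices made for this illustration.

```python
def sgn(m):
    """Sign convention of the x-sign operator: +1 for m >= 0, -1 for m < 0."""
    return 1 if m >= 0 else -1


def sign_sum(n_x, cutoff=1000):
    """Sum of sgn(m) * (sgn(m) - sgn(m + n_x))**2 over |m| <= cutoff.

    Only the |n_x| values of m for which m and m + n_x lie on opposite
    sides of the cut contribute, so any cutoff >= |n_x| already gives the
    full sum over all integers m.
    """
    return sum(sgn(m) * (sgn(m) - sgn(m + n_x)) ** 2
               for m in range(-cutoff, cutoff + 1))


# Tabulate the sum for a range of displacements n_x.
checked = {n_x: sign_sum(n_x) for n_x in range(-20, 21)}
print(all(value == -4 * n_x for n_x, value in checked.items()))
```

For n_x > 0 the contributing sites are m = -n_x, ..., -1, each with weight -4, and for n_x < 0 they are m = 0, ..., |n_x| - 1, each with weight +4, which reproduces -4 n_x in both cases.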
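The following identity is a special case of Connes' formulas [19].

Proposition 4.6. On $\tilde{\mathcal D}$, both expressions in (34) are finite and one has $\eta_1^\omega = \zeta_1^\omega$ for all $\omega \in \Omega$.

Proof. Some algebra shows
$$\Pi_x \hat\pi_\omega(\hat A) \hat\pi_\omega(\hat B) \Pi_x - \Pi_x \hat\pi_\omega(\hat A) \Pi_x \hat\pi_\omega(\hat B) \Pi_x = -\frac{1}{4}\, \Pi_x \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] . \tag{35}$$
Note that if $[\frac{X}{|X|}, \hat\pi_\omega(\hat A)][\frac{X}{|X|}, \hat\pi_\omega(\hat B)]$ is trace-class, so is the left-hand side (this is the case for $\hat A, \hat B \in \tilde{\mathcal D}$ by Remark 4.3). Hence using the same identity with $\hat A$ and $\hat B$ exchanged, we obtain:
$$\begin{aligned}
\eta_1^\omega(\hat A, \hat B) &= \frac{1}{4}\, \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \Big( \Pi_x \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] - \Pi_x \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big) \\
&= \frac{1}{2} \cdot \frac{1}{4}\, \mathrm{Tr}_{\ell^2(\mathbb Z \times \mathbb N)} \Big( \frac{X}{|X|} \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] - \frac{X}{|X|} \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat B) \Big] \Big[ \frac{X}{|X|}, \hat\pi_\omega(\hat A) \Big] \Big) .
\end{aligned}$$
Note here that the second equality holds because both $[\frac{X}{|X|}, \hat\pi_\omega(\hat A)]$ and $[\frac{X}{|X|}, \hat\pi_\omega(\hat B)]$ are Hilbert–Schmidt (Remark 4.3). From the above we deduce that
$$\eta_1^\omega(\hat A, \hat B) = \frac{1}{2} \big( \zeta_1^\omega(\hat A, \hat B) - \zeta_1^\omega(\hat B, \hat A) \big) ,$$
and the cyclicity property of $\zeta_1^\omega$ allows one to conclude.

Corollary 4.7. On $\mathcal D$, we have $\xi_1 = \zeta_1 = \eta_1$.

4.3. Index theorems

Below we shall use the following well-known result.

Proposition 4.8 (Fedosov's formula). Let $F$ be a bounded operator on a Hilbert space $\mathcal H$. We suppose that $(1 - F^*F)$ and $(1 - FF^*)$ are in the pth Schatten ideal of trace-class operators on $\mathcal H$. Then $F$ is a Fredholm operator and for every integer $n \ge p$, its index satisfies
$$\mathrm{Ind}(F) = \mathrm{Tr}((1 - F^*F)^n) - \mathrm{Tr}((1 - FF^*)^n) .$$

Proposition 4.9. Suppose (only for the purpose of this proposition) that $\mathbf P$ is ergodic w.r.t. the $\mathbb Z$-action $T_x$. Let $\hat A \in \tilde{\mathcal D}$ be unitary. Then $\Pi_x \hat\pi_\omega(\hat A) \Pi_x$ is $\mathbf P$-almost surely a Fredholm operator on $\ell^2(\mathbb N \times \mathbb N)$ the index of which is $\mathbf P$-almost surely independent of $\omega \in \Omega$. Its common value is equal to $\xi_1(\hat A - \alpha, \hat A^* - \bar\alpha)$ whenever $\hat A - \alpha \in \mathcal D$, $\alpha \in \mathbb C$, $\alpha \ne 0$.

Proof. Because $\hat A \in \tilde{\mathcal D}$, Proposition 4.1 implies that $\eta_1^\omega(\hat A, \hat A^*) < \infty$ for $\mathbf P$-almost all $\omega \in \Omega$. Thus (34) and the Fedosov formula (we can take $p = 1$) imply that $\Pi_x \hat\pi_\omega(\hat A) \Pi_x$ is a Fredholm operator on $\Pi_x \ell^2(\mathbb Z \times \mathbb N) = \ell^2(\mathbb N \times \mathbb N)$ and that its index is equal to $\eta_1^\omega(\hat A, \hat A^*)$. Because
$$\Pi_x \hat\pi_{T^{(a, 0)}\omega}(\hat A) \Pi_x |_{\ell^2(\mathbb N \times \mathbb N)} = \big( \Pi_x \hat\pi_\omega(\hat A) \Pi_x + K \big) |_{\hat U_{(a, 0)} \ell^2(\mathbb N \times \mathbb N) \cong \ell^2(\mathbb N \times \mathbb N)} ,$$
where $K$ is a compact operator on $\ell^2(\mathbb N \times \mathbb N)$ and the Fredholm index is invariant under compact perturbations, we see that the index is $T_x$-translation invariant in $\omega \in \Omega$. Hence it is $\mathbf P$-almost surely constant by the ergodicity of $\mathbf P$ with respect to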
$T_x$. As $\eta_1^\omega(\hat A, \hat A^*) = \eta_1^\omega(\hat A - \alpha, \hat A^* - \bar\alpha)$, Corollary 4.7 implies that the almost sure index is equal to $\xi_1(\hat A - \alpha, \hat A^* - \bar\alpha)$.

In our context, the measure $\mathbf P$ is only ergodic w.r.t. the $\mathbb Z^2$-action $T$. However, this is sufficient to give an almost sure index for certain elements in $\tilde{\mathcal D}$, notably those in the image of the exponential map.

Proposition 4.10. Let $\hat H$ satisfy the hypothesis of Proposition 4.1 and let $g$ be a real $C^4$ function with values in $[0, 1]$, equal to 0 or 1 outside of $\Delta$. Set $\hat U = \exp(-2\pi\imath\, g(\hat H))$. Then $\Pi_x \hat\pi_\omega(\hat U) \Pi_x$ is $\mathbf P$-almost surely a Fredholm operator on $\ell^2(\mathbb N \times \mathbb N)$ the index of which is $\mathbf P$-almost surely independent of $\omega \in \Omega$. The almost sure value is equal to $\xi_1(\hat U - 1, \hat U^{-1} - 1)$.

Proof. By Proposition 4.1, $\hat U \in \tilde{\mathcal D}$. Thus it follows from the proof of Proposition 4.9 that $\Pi_x \hat\pi_\omega(\hat U) \Pi_x$ is $\mathbf P$-almost surely a Fredholm operator and that its index is $T_x$-invariant. To conclude, we have to show its $T_y$-invariance. To this end, let $Q : \ell^2(\mathbb Z \times \mathbb N) \to \ell^2(\mathbb Z \times \mathbb N)$ denote the projection on the subspace spanned by $\{|m_x, 0\rangle \mid m_x \in \mathbb Z\}$. Now, if $\hat H = (\hat K, H) \in T(\mathcal A)$, we can use the equation $\hat\pi_\omega(\hat H) = \Pi \pi_\omega(H) \Pi + \hat\pi_\omega(\hat K)$ and the covariance relation $U_y \pi_{T_y^{-1}\omega}(H) U_y^* = \pi_\omega(H)$ in order to obtain
$$\hat\pi_\omega(\hat H) = \hat U_y \hat\pi_{T_y^{-1}\omega}(\hat H) \hat U_y^* + \hat R_\omega ,$$
where $\hat R_\omega$ is an operator family, covariant in the x-direction and compact in the y-direction and given by
$$\hat R_\omega = \hat\pi_\omega(\hat H) Q + Q \hat\pi_\omega(\hat H)(1 - Q) + (1 - Q) \hat\pi_\omega(\hat K)(1 - Q) + \hat U_y \hat\pi_{T_y^{-1}\omega}(\hat K) \hat U_y^* .$$
Now $\hat R$ is actually an operator satisfying the hypothesis of Sec. 4.1. Hence the operator $\hat U_y \hat\pi_{T_y^{-1}\omega}(\hat H) \hat U_y^* + \lambda \hat R_\omega$ is a lift of $\pi_\omega(H)$ and by Proposition 4.1
$$\hat U_\omega(\lambda) = \exp\big( -2\pi\imath\, f(\hat U_y \hat\pi_{T_y^{-1}\omega}(\hat H) \hat U_y^* + \lambda \hat R_\omega) \big) ,$$
is $\mathbf P$-almost surely a norm-continuous family (in $\lambda$) of Fredholm operators. Clearly $\hat U_\omega(1) = \hat\pi_\omega(\hat U)$ and moreover, because $\hat U_y^* \hat U_y = 1$, we have $\hat U_\omega(0) = \hat U_y \hat\pi_{T_y^{-1}\omega}(\hat U) \hat U_y^*$. Therefore the homotopy invariance of the Fredholm index implies:
$$\mathrm{Ind}(\Pi_x \hat\pi_\omega(\hat U) \Pi_x) = \mathrm{Ind}(\Pi_x \hat U_y \hat\pi_{T_y^{-1}\omega}(\hat U) \hat U_y^* \Pi_x) = \mathrm{Ind}(\Pi_x \hat\pi_{T_y^{-1}\omega}(\hat U) \Pi_x) ,$$
which concludes the proof.

4.4. Calculation of the edge current

Proof of Theorem 3.1. Let $G \in C^2(\mathbb R)$ be a positive function supported on an interval $\Delta' \subset \Delta$. We set
$$g(E) = 1 - \frac{1}{\int_{\mathbb R} dE'\, G(E')} \int_{-\infty}^{E} dE'\, G(E') ,$$
and the unitary operator $\hat W = \exp(-2\pi\imath\, g(\hat H))$. By Proposition 4.10, $\hat\pi_\omega(\hat W)$ and $\hat\pi_\omega(\hat W^*)$ are $\mathbf P$-almost surely Fredholm operators with a $\mathbf P$-almost sure index. Changing $\Delta'$ within $\Delta$ (or changing $G$) does not change their index by the homotopy invariance of the Fredholm index. Hence their indices are actually associated with the gap $\Delta$ and can moreover be calculated as the index of $\hat U(\Delta)$ and $\hat U(\Delta)^*$ respectively. We denote that of $\hat U(\Delta)^*$ by Ind so that
$$\mathrm{Ind} = \imath\, \hat{\mathcal T}_E((\hat W^* - 1)\, \hat\nabla_x \hat W) . \tag{36}$$
Now recall Duhamel's formula for $\hat\nabla_x \hat W$:
$$\hat\nabla_x \hat W = -2\pi\imath \int_0^1 ds\, \hat W^{1-s}\, (\hat\nabla_x g(\hat H))\, \hat W^{s} ,$$
where the integral is defined as a norm convergent Riemann sum in $L^\infty(\mathcal E, \hat{\mathcal T}_E)$. We insert this into (36), use the continuity of the map $A \in \mathcal A \mapsto \hat{\mathcal T}_E((W^* - 1)A)$ in order to exchange the trace with the integral over $s$ and finally use the unitary invariance of $\hat{\mathcal T}_E$ to deduce
$$\mathrm{Ind} = 2\pi\, \hat{\mathcal T}_E((1 - \hat W)\, \hat\nabla_x g(\hat H)) .$$
To calculate this, we develop $g(E) = \sum_k c_k T_k(E)$ in a series of Tchebychev polynomials over $\Delta$ (slightly enlarged if necessary). Because $g \in C^3(\mathbb R)$, this series converges absolutely and uniformly on $\Delta' \subset \Delta$, so the sum over $k$ can be exchanged with $\hat{\mathcal T}_E$. On each polynomial in $\hat H$ we now apply the gradient and then use the cyclicity of the trace and $[\hat W, \hat H] = 0$ in order to regroup all polynomials in $\hat H$ to the left of $\hat\nabla_x \hat H$. Because $G(E) = -\sum_k c_k T_k'(E) \cdot (\int_{\mathbb R} dE'\, G(E'))$, we conclude that
$$\mathrm{Ind} = \frac{2\pi}{\int_{\mathbb R} dE'\, G(E')}\, \hat{\mathcal T}_E((\hat W - 1)\, G(\hat H)\, \hat\nabla_x \hat H) .$$
Now let $G_j \in C^3(\mathbb R)$ be a sequence of positive functions converging (pointwise) from below to the indicator function onto $\Delta'$. If $\hat W_j$ denotes the unitary constructed from $G = G_j$ as above, then $(\hat W_j - 1)\, G_j(\hat H)$ converges in norm to $\hat U(\Delta') - 1$. Therefore, for any $\Delta' \subset \Delta$,
$$\mathrm{Ind} = \frac{2\pi}{|\Delta'|}\, \hat{\mathcal T}_E((\hat U(\Delta') - 1)\, \hat\nabla_x \hat H) .$$
For a given $k \in \mathbb N$, $k \ge 1$, we divide $\Delta$ into $k$ intervals $\Delta_j = [E_{j-1}, E_j]$, $j = 1, \dots, k$, of equal length $|\Delta|/k$. Note that $E_0 = E' = \inf(\Delta)$. Let us set
$$\hat U_j - 1 = \exp\Big( -2\pi\imath\, \frac{\hat H - E_j}{|\Delta_j|}\, \hat P_{\Delta_j} \Big) - 1 = (\hat U(\Delta)^k - 1)\, \hat P_{\Delta_j} .$$
Now the Fredholm index of each $\hat U_j^*$ is equal to that of $\hat U(\Delta)^*$, namely Ind. Hence it follows that
$$\mathrm{Ind} = \frac{1}{k} \sum_{j=1}^{k} \frac{2\pi}{|\Delta_j|}\, \hat{\mathcal T}_E((\hat U_j - 1)\, \hat\nabla_x \hat H) = \frac{2\pi}{|\Delta|}\, \hat{\mathcal T}_E((\hat U(\Delta)^k - 1)\, \hat\nabla_x \hat H) . \tag{37}$$
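The relation between the unitaries attached to the subintervals and the kth power of the unitary attached to the whole gap rests on a periodicity of the scalar phase, which the following Python sketch checks at sample energies. The gap parameters and function names are hypothetical; only the scalar functions of the energy are compared, not the operators themselves.

```python
import cmath


def u_full_power(energy, e_low, width, k):
    """Scalar value of U(Delta)^k at an energy in Delta = [e_low, e_low + width]."""
    return cmath.exp(-2j * cmath.pi * k * (energy - e_low) / width)


def u_sub(energy, e_low, width, k, j):
    """Scalar value of U_j at an energy in the j-th subinterval of Delta."""
    sub_width = width / k
    e_j = e_low + j * sub_width  # right endpoint E_j of Delta_j
    return cmath.exp(-2j * cmath.pi * (energy - e_j) / sub_width)


# Hypothetical gap Delta = [0.3, 2.0], cut into k = 5 equal pieces.
e_low, width, k = 0.3, 1.7, 5
worst = 0.0
for j in range(1, k + 1):
    left = e_low + (j - 1) * width / k
    for step in range(11):
        energy = left + step * (width / k) / 10
        gap = abs(u_full_power(energy, e_low, width, k)
                  - u_sub(energy, e_low, width, k, j))
        worst = max(worst, gap)
print(worst)
```

The two phases differ by exp(2πıj), which equals 1 for integer j, so the deviation is at the level of floating-point rounding.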
Introducing a sign in the above definition of the $\hat U_j$'s, it can be verified that (37) also holds for negative integers, so it holds for all $k \in \mathbb Z$ except $k = 0$. Finally we make a Fourier analysis on the closed interval $\Delta$. Let $\Delta' \subset \Delta$ be a closed interval such that $\min(\Delta') > E'$ and set $\Delta'' = \Delta \setminus \Delta'$. The indicator function $\chi_{\Delta''}$ on $\Delta''$ is given by the following pointwise convergent Fourier series:
$$\chi_{\Delta''}(E) = -\frac{1}{2}\, \chi_{\partial\Delta'}(E) + \sum_{k \in \mathbb Z} a_k \exp\Big( -\frac{2\pi\imath}{|\Delta|}\, k (E - E') \Big) ,$$
where $\chi_{\partial\Delta'}$ is the characteristic function on the boundary points of $\Delta'$ and
$$a_k = \int_\Delta \frac{dE}{|\Delta|}\, \chi_{\Delta''}(E) \exp\Big( \frac{2\pi\imath}{|\Delta|}\, k (E - E') \Big) .$$
One has:
$$\sum_{k \in \mathbb Z} a_k = 1 , \qquad a_0 = \frac{|\Delta''|}{|\Delta|} , \qquad \sum_{k \ne 0} a_k = \frac{|\Delta'|}{|\Delta|} .$$
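The sum rules for the Fourier coefficients can be checked numerically. The Python sketch below works with the pair sums a_k + a_{-k}, which are real and do not depend on the sign convention chosen in the exponent; the interval endpoints, the truncation order and the function name are hypothetical choices.

```python
import math


def paired_coefficient(k, e_low, width, a, b):
    """Return a_k + a_{-k} for k >= 1.

    Delta = [e_low, e_low + width], Delta' = [a, b] and Delta'' is the rest.
    Integrating the cosine over Delta'' equals minus its integral over
    Delta', because the integral over a full period vanishes.
    """
    def phase(energy):
        return 2 * math.pi * k * (energy - e_low) / width
    return -(math.sin(phase(b)) - math.sin(phase(a))) / (math.pi * k)


# Hypothetical data: Delta = [0, 2] and Delta' = [0.5, 1.25].
e_low, width = 0.0, 2.0
a, b = 0.5, 1.25

a_zero = (width - (b - a)) / width  # |Delta''| / |Delta|
tail = sum(paired_coefficient(k, e_low, width, a, b)
           for k in range(1, 20001))  # symmetric partial sum over k != 0

print(a_zero, tail, a_zero + tail)
```

The symmetric partial sums converge only conditionally, so the truncated values approach |∆'|/|∆| and 1 slowly, with an error that decays roughly like the inverse of the truncation order.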
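It follows that
$$\hat P_\Delta \sum_{k \ne 0} a_k\, \hat U(\Delta)^k = \frac{|\Delta'|}{|\Delta|}\, \hat P_\Delta - \hat P_{\Delta'} - \frac{1}{2}\, \hat P_{\partial\Delta'} .$$
By hypothesis $\hat{\mathcal T}_E(\hat P_{\partial\Delta'}) = 0$. Hence, summing (37) one obtains:
$$\mathrm{Ind} = \frac{|\Delta|}{|\Delta'|} \sum_{k \ne 0} a_k\, \frac{2\pi}{|\Delta|}\, \hat{\mathcal T}_E(\hat P_\Delta \hat U(\Delta)^k\, \hat\nabla_x \hat H) - \frac{2\pi}{|\Delta|}\, \hat{\mathcal T}_E(\hat P_\Delta \hat\nabla_x \hat H) = -\frac{2\pi}{|\Delta'|}\, \hat{\mathcal T}_E(\hat P_{\Delta'} \hat\nabla_x \hat H) .$$
$\Delta'$ being an arbitrary closed interval in $\Delta$, the result is thus proven.

5. Equality between Edge and Bulk Hall Conductivity

In this section, we give the proof of Theorem 3.3. It makes use of the results and notations of the Appendix. According to Sec. 4 and Proposition 3.2, the definitions in Sec. A.3 show that edge and bulk Hall conductivity result from the following pairings of cyclic cohomology with K-theory over $\mathcal E$ and $\mathcal A$ respectively:
$$\sigma^e_\perp(\Delta) = -\frac{q^2}{h}\, 2\pi\imath\, \langle [\hat U(\Delta)]_1, [\xi_1] \rangle , \qquad \sigma^b_\perp(\mu) = \frac{q^2}{h}\, 4\pi^2\imath\, \langle [P_\mu]_0, [\xi_2] \rangle .$$
Here $\hat U(\Delta)$ and $\xi_1$ are defined in (23) and (32), $P_\mu$ is the Fermi projection of the plane operator on a Fermi level $\mu \in \Delta$ and $\xi_2$ is the 2-cocycle over $\mathcal A$ defined by
$$\xi_2(A, B, C) = \imath\, \mathcal T(A \nabla_x B \nabla_y C - A \nabla_y B \nabla_x C) .$$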
We now use the isomorphisms $\rho$ and $\eta$ defined in (20) and (21) in order to transform these two pairings to pairings over the algebras $C(\Omega)\times_{\alpha_x}\mathbb Z \otimes \mathcal K$ and $C(\Omega)\times_{\alpha_x}\mathbb Z\times_{\alpha_y}\mathbb Z$ of the Pimsner–Voiculescu exact sequence (19):
$$\sigma^e_\perp(\Delta) = -\frac{q^2}{h}\, 2\pi\imath\, \langle [\rho^{-1}(\hat U(\Delta))]_1, [\rho^*\xi_1] \rangle , \qquad \sigma^b_\perp(\mu) = \frac{q^2}{h}\, 4\pi^2\imath\, \langle [\eta^{-1}(P_\mu)]_0, [\eta^*\xi_2] \rangle .$$
Now we may apply the duality Theorem A.10 for Pimsner–Voiculescu exact sequences by setting $\mathcal B = C(\Omega)\times_{\alpha_x}\mathbb Z$ on which the action is given by $\alpha = \alpha_y$. Hence Theorem 3.3 is proven once we have verified that $\exp([\eta^{-1}(P_\mu)]_0) = [\rho^{-1}(\hat U(\Delta))]_1$ and $\eta^*\xi_2 = \#_{\alpha_y} \rho^*\xi_1$. Concerning the latter equality one can immediately verify that, if one defines $\xi_1$ and $\xi_2$ by the same formulas on the corresponding algebras of the Pimsner–Voiculescu exact sequence, then $\eta^*\xi_2 = \xi_2$ and $\rho^*\xi_1 = \xi_1$ (for this, use that $\eta$ and $\rho$ are isomorphisms leaving the traces invariant); then going back to the definition of $\#_{\alpha_y}$ in Sec. A.5.2, the identity $\xi_2 = \#_{\alpha_y}\xi_1$ can be directly checked.

For the first equality, it is clearly sufficient to show that $\exp([P_\mu]_0) = [\hat U(\Delta)]_1$. We first note that $P_\mu$ is equal to the continuous function of the Hamiltonian $g(H) = P_{E'} - (H - E') P_\Delta / |\Delta|$ where $E' = \inf(\Delta)$. A self-adjoint lift of $P_\mu$ is thus given by $g(\hat H)$. Now using $[\hat P_{E'}, \hat P_\Delta] = 0$ and the fact that $\exp(2\pi\imath \hat P) = 1$ for any projection $\hat P$, we thus obtain from the definition of the exponential map recalled in the Appendix:
$$\exp([P_\mu]_0) = [\exp(-2\pi\imath\, g(\hat H))]_1 = [\hat U(\Delta)]_1 .$$
This concludes the proof of Theorem 3.3.

Appendix A. Duality of Pairings of K-theory with Cyclic Cohomology

In this appendix we prove a duality theorem for pairings over crossed products. This result (Theorem A.10 below) was announced by Nest [34, Proposition 12.6], but the proof given there is incomplete. We close that gap using arguments along the lines of [24]. Using a result of Pimsner [36], we provide in fact a new and conceptually more transparent proof in the situation adequate to our context.

A.1. Some help for cooking with K-theory

For the convenience of the reader, we give a very brief summary of the main K-theoretic notions used in this work. We refer to [14, 44] for all the details and even for the definition of the groups $K_0(\mathcal B)$ and $K_1(\mathcal B)$ of a given involutive Banach algebra $\mathcal B$.
Roughly, K0 is the Grothendieck group generated by homotopy classes of idempotents in matrix algebras over B, and K1 (B) is formed by homotopy classes of invertibles (or equivalently unitaries) therein. However, great care needs to be taken when B does not have a unit.
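The suspension $S\mathcal B$ of $\mathcal B$ is the algebra of continuous functions $f : S^1 \to \mathcal B$ which vanish at a distinguished point that we choose to be 1. This construction provides a link between the K-groups. First, the following map is a group isomorphism:
$$\Theta : K_1(\mathcal B) \to K_0(S\mathcal B) , \qquad [V]_1 \mapsto \Big[ W_t \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} W_t^* \Big]_0 - \Big[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \Big]_0 .$$
Here $V \in M_n(\mathcal B)$ is a unitary representative and $W_t \in M_{2n}(\mathcal B)$ a homotopy from $\begin{pmatrix} V & 0 \\ 0 & V^* \end{pmatrix}$ to 1. The blocksize in the matrix is $n \times n$. One of the key properties of K-theory is that
$$\beta : K_0(\mathcal B) \to K_1(S\mathcal B) , \qquad [P]_0 \mapsto [e^{2\pi\imath t P}]_1 = [e^{2\pi\imath t} P + (1 - P)]_1$$
is also an isomorphism called the Bott map. It shows that $K_0(\mathcal B) \cong K_0(SS\mathcal B)$, making K-theory a periodic theory. Another crucial property of K-theory is that any short exact sequence of Banach algebras,
$$0 \to \mathcal J \xrightarrow{\ i\ } \mathcal B \xrightarrow{\ \pi\ } \mathcal B/\mathcal J \to 0 ,$$
gives rise to a six-term exact sequence of K-groups:
$$\begin{array}{ccccc}
K_0(\mathcal J) & \xrightarrow{\ i_*\ } & K_0(\mathcal B) & \xrightarrow{\ \pi_*\ } & K_0(\mathcal B/\mathcal J) \\
\uparrow \mathrm{ind} & & & & \downarrow \exp \\
K_1(\mathcal B/\mathcal J) & \xleftarrow{\ \pi_*\ } & K_1(\mathcal B) & \xleftarrow{\ i_*\ } & K_1(\mathcal J) .
\end{array}$$
The definition of the push-forward maps $i_*$ and $\pi_*$ is immediate. Let us recall those of the index and exponential map. Given a unitary $V \in M_n(\mathcal B/\mathcal J)$ defining a class in $K_1(\mathcal B/\mathcal J)$, let $W \in M_{2n}(\mathcal B)$ be a unitary lift of $\begin{pmatrix} V & 0 \\ 0 & V^* \end{pmatrix}$. Then
$$\mathrm{ind}([V]_1) = \Big[ W \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} W^* \Big]_0 - \Big[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \Big]_0 .$$
The exponential map is defined by $\exp := \Theta^{-1} \circ \mathrm{ind} \circ \beta$. Explicitly, if $P \in M_n(\mathcal B/\mathcal J)$ defines a class in $K_0(\mathcal B/\mathcal J)$ and $L(P) \in \mathcal B$ is a self-adjoint lift for it, then $\exp([P]_0) = [\exp(-2\pi\imath L(P))]_1$.

We furthermore recall some facts about the Toeplitz extension (see [37]) of the C∗-crossed product $\mathcal B \times_\alpha \mathbb Z$ associated with the C∗-dynamical system $(\mathcal B, \mathbb Z, \alpha)$ (for unital $\mathcal B$):
$$0 \to \mathcal B \otimes \mathcal K \xrightarrow{\ \psi\ } T(\mathcal B) \xrightarrow{\ \pi\ } \mathcal B \times_\alpha \mathbb Z \to 0 . \tag{A.1}$$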
Here $\mathcal K$ are the compact operators on $\ell^2(\mathbb N)$ generated by the finite rank operators $e_{n,m}$, $n, m \in \mathbb N$, the matrix units, and $\psi$ is the homomorphism defined in [37]. We have met such an extension in Sec. 2.6. The imbedding $j : \mathcal B \to \mathcal B \otimes \mathcal K$, $j(b) = b \otimes e_{11}$ induces an isomorphism $j_* : K(\mathcal B) \cong K(\mathcal B \otimes \mathcal K)$ and in [37] Pimsner and Voiculescu have shown that $K(T(\mathcal B)) \cong K(\mathcal B)$ as well. Keeping carefully track of these isomorphisms, they obtained from the six-term exact sequence associated with (A.1) what is nowadays called the Pimsner–Voiculescu six-term exact sequence:
$$\begin{array}{ccccc}
K_0(\mathcal B) & \xrightarrow{\ \mathrm{id}_* - \alpha_*^{-1}\ } & K_0(\mathcal B) & \xrightarrow{\ i_*\ } & K_0(\mathcal B \times_\alpha \mathbb Z) \\
\uparrow \mathrm{ind} & & & & \downarrow \exp \\
K_1(\mathcal B \times_\alpha \mathbb Z) & \xleftarrow{\ i_*\ } & K_1(\mathcal B) & \xleftarrow{\ \mathrm{id}_* - \alpha_*^{-1}\ } & K_1(\mathcal B) .
\end{array}$$
Strictly speaking the maps ind and exp in this diagram are not the same as the ones in the original sequence, but have to be composed with j∗−1 . We now produce a direct proof of the following result by Elliott and Natsume [23] used in Sec. A.6, which is a discrete (one-way) analog of Connes’ Thom isomorphism [17]. We denote the unitary generator of the action in B ×α Z by U , notably α(A) = U AU ∗ for all A ∈ B. Proposition A.1. Let P ∈ Mn (B) be a projection defining a class in K0 (B) which is in the image of the index map, and let W ∈ Mn (B) be a unitary such that α(P ) = W P W ∗ . Then a preimage of [P ]0 is the class defined by the unitary X = U P + W (1 − P ). Proof. First note that the unitary W exists (possibly after enlarging n) because [P ]0 = [α(P )]0 by the Pimsner–Voiculescu exact sequence. In order to compute 0 ind([X]1 ), we have to find a lift W ∈ M2n (T (B)) of X 0 X ∗ and to compute " !# ! " ! # 1 0 1 0 −1 ∗ − W W . j∗ 0 0 0 0 0 0 ˆ = ˆ ∗U ˆ ∈ T (B) of U with the following properties: U ˆU ˆ ∗ = 1, U There exists a lift U ∗ ˆ ˆ 1 − P0 , where P0 = i(1 ⊗ e11 ) and U AU = α(A) for all A ∈ B. In particular, P commutes with P0 and it is not difficult to verify that ! ˆ P + W (1 − P ) U 0 W = ˆ ∗ + (1 − P )W ∗ PU P P0 0 is a lift of X 0 X ∗ . Furthermore, " !# " # !# " ! 1 0 0 0 1 0 ∗ − = = j∗ ([P ]0 ) . W W 0 0 0 P P 0 0 0 0 0 0 A.2. Definition of cyclic cohomology Given a complex algebra B, let Cλn (B) be the set of n + 1-linear functionals on B which are cyclic in the sense that ϕ(A1 , . . . , An , A0 ) = (−1)n ϕ(A0 , . . . , An ) .
Definition A.2. The cyclic cohomology $HC(B)$ of $B$ is the cohomology of the complex

$$
0 \to C_\lambda^0(B) \to \cdots \to C_\lambda^n(B) \xrightarrow{\; b \;} C_\lambda^{n+1}(B) \to \cdots
$$

with

$$
b\varphi(A_0, \ldots, A_{n+1}) = \sum_{j=0}^{n} (-1)^j \varphi(A_0, \ldots, A_j A_{j+1}, \ldots, A_{n+1}) + (-1)^{n+1} \varphi(A_{n+1} A_0, \ldots, A_n)\,. \tag{A.2}
$$
An element $\varphi \in C_\lambda^n(B)$ satisfying $b\varphi = 0$ is called a cyclic $n$-cocycle.

Example A.3. Let $\nabla_j$, $j = 1, \ldots, n$, be $n$ commuting derivations on $B$ (not necessarily fully defined) and $T$ an invariant trace ($T \circ \nabla_j = 0$), not necessarily finite. Then $\xi_n : B^{\otimes (n+1)} \to \mathbb{C}$ defined by

$$
\xi_n(A_0, \ldots, A_n) = \sum_{\sigma \in S_n} (-1)^{\mathrm{sgn}(\sigma)}\, T(A_0 \nabla_{\sigma(1)} A_1 \cdots \nabla_{\sigma(n)} A_n) = T(A_0 \nabla_{[1} A_1 \cdots \nabla_{n]} A_n)
$$

(the second expression is an abbreviation) is an $n$-cocycle. This will be proved below. The domain of definition of $\xi_n$ depends on the domains of definition of the derivations and that of $T$.

A.3. Pairing with K-groups

Now let $B$ be a Banach algebra. The elements of cyclic cohomology of degree $j$ pair with $K_j(B)$ ($j$ mod 2 in the latter case) in the following way. Let $\xi$ be a $2k$-cocycle, $[\xi]$ its class. Then

$$
\langle [P]_0 - [P_m]_0, [\xi] \rangle = c_{2k}\, \mathrm{Tr} \otimes \xi(P, \ldots, P)\,, \qquad P \in M_{m+l}(B)\,,
$$

is a well-defined pairing between $K_0(B)$ and $HC^{\mathrm{ev}}(B)$ [19]. If $\xi$ is a $(2k+1)$-cocycle, then

$$
\langle [V]_1, [\xi] \rangle = c_{2k+1}\, \mathrm{Tr} \otimes \xi(V^{-1} - 1, V - 1, \ldots, V - 1)\,, \qquad V \in M_m(B)\,,
$$

(the entries alternate) is a well-defined pairing between $K_1(B)$ and $HC^{\mathrm{odd}}(B)$. Here the normalization constants are (chosen as in [18] and [36], but not as in [19]):

$$
c_{2k} = \frac{1}{(2\pi\imath)^k}\, \frac{1}{k!}\,, \qquad
c_{2k+1} = \frac{1}{(2\pi\imath)^{k+1}}\, \frac{1}{2^{2k+1}}\, \frac{1}{\left(k + \frac{1}{2}\right)\left(k - \frac{1}{2}\right) \cdots \frac{1}{2}}\,.
$$
A.4. Cycles

There is a convenient reformulation of the description of cyclic cocycles in terms of graded differential algebras $(\Omega, d)$ with graded closed traces (we use the widespread notation $\Omega$ here, because misinterpretation as the hull of Sec. 2.1 seems excluded). A graded algebra is a graded vector space $\Omega = \bigoplus_{n \in \mathbb{Z}} \Omega_n$ (we denote by $\partial A$ the degree of a homogeneous element $A$) such that $\partial(AB) = \partial(A) + \partial(B)$ for homogeneous elements. It is called a graded differential algebra if there exists a graded differential $d$ (a derivation whose square vanishes) of degree 1. A graded trace on $\Omega_n$ is a linear functional $\int$ which satisfies $\int w_1 w_2 = (-1)^{\partial w_1 \partial w_2} \int w_2 w_1$. It is closed if $\int$ vanishes on $d(\Omega_{n-1})$.

Definition A.4. An $n$-dimensional cycle over $B$ is a graded differential algebra $(\Omega, d)$ of highest non-trivial degree $n$ together with a closed graded trace $\int$ on $\Omega_n$ and an algebra homomorphism $B \to \Omega_0$.

We will assume here that the homomorphism $B \to \Omega_0$ is injective and hence identify $B$ with a subalgebra of $\Omega_0$. See [19] for the proof of the following result.

Proposition A.5. Any cycle of dimension $n$ over $B$ defines a cyclic $n$-cocycle through what is called its character:

$$
\xi(A_0, \ldots, A_n) = \int A_0\, dA_1 \cdots dA_n\,.
$$

Example A.6. Consider an action of $\mathbb{R}^n$ on the algebra $B$ by commuting derivations $\nabla_j$ (thus view $\mathbb{R}^n$ as Lie algebra) and suppose that $T$ is an invariant trace, i.e. $T(\exp(t\nabla_j) A) = T(A)$ for all $t$. Let $\Omega$ be the tensor product $B \otimes \Lambda\mathbb{C}^n$ of $B$ with the Grassmann algebra $\Lambda\mathbb{C}^n$ with generators $e_j$, $j = 1, \ldots, n$, and define

$$
d(A \otimes v) = \sum_{j=1}^{n} \nabla_j A \otimes e_j v\,.
$$

It is straightforward to check that $(\Omega, d)$ is a differential algebra. Note that $\Lambda^n \mathbb{C}^n \cong \mathbb{C}$ (as vector spaces) and any isomorphism is a graded trace. Taking $\imath(e_1 \cdots e_n) = 1$ as such an isomorphism, we define $\int = T \otimes \imath$, that is

$$
\int A_0\, dA_1 \cdots dA_n = T(A_0 \nabla_{[1} A_1 \cdots \nabla_{n]} A_n)\,,
$$

a cycle of dimension $n$ associated with our second example. Let us prove that this is actually closed and graded. The graded cyclicity follows directly from the graded cyclicity of $\imath$ and the cyclicity of $T$. To prove closedness write $T \otimes \imath = \imath \circ (T \otimes \mathrm{id})$ on the highest degree. Since $d$ acts trivially on $\mathbb{C} \otimes \Lambda\mathbb{C}^n$, the invariance of $T$ implies $(T \otimes \mathrm{id}) \circ d = 0$. This proves the claim.
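For $n = 1$ the character of Example A.6 reduces to $\xi(A_0, A_1) = T(A_0 \nabla A_1)$, and its defining properties can be checked by hand in a finite-dimensional toy model. The sketch below is only an illustration under assumptions that are not in the text: $B$ is taken to be the $2 \times 2$ integer matrices, $T$ the matrix trace, and $\nabla$ the commutator with a fixed, hypothetical matrix `H`. It checks the invariance $T \circ \nabla = 0$, the cyclicity condition, and the cocycle condition $b\xi = 0$ from (A.2).

```python
# Toy model (illustrative assumptions): B = 2x2 integer matrices,
# T = matrix trace, nabla = commutator with a fixed matrix H.

def mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    """Entrywise difference of two 2x2 matrices."""
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

H = [[1, 2], [3, -1]]  # hypothetical generator of the inner derivation

def nabla(A):
    """Inner derivation A -> [H, A]; the trace of a commutator vanishes."""
    return sub(mul(H, A), mul(A, H))

def xi(A0, A1):
    """Character for n = 1: xi(A0, A1) = T(A0 * nabla(A1))."""
    return trace(mul(A0, nabla(A1)))

def b_xi(A0, A1, A2):
    """Coboundary (A.2) for n = 1 applied to xi."""
    return xi(mul(A0, A1), A2) - xi(A0, mul(A1, A2)) + xi(mul(A2, A0), A1)

A0 = [[2, -1], [0, 5]]
A1 = [[1, 4], [-2, 3]]
A2 = [[0, 1], [7, -3]]

invariance = trace(nabla(A1))        # T o nabla = 0
cyclicity = xi(A1, A0) + xi(A0, A1)  # xi(A1, A0) = (-1)^1 xi(A0, A1)
cocycle = b_xi(A0, A1, A2)           # b xi = 0
print(invariance, cyclicity, cocycle)
```

Integer entries keep the arithmetic exact, so the three quantities vanish identically rather than up to rounding. In this finite-dimensional setting the cocycle carries no interesting topology; the example only exercises the algebraic identities.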
A.5. Constructing new cycles from old ones

We next extend Example A.6. Suppose we have the same situation as in that example except that in place of an invariant trace there is an invariant $k$-cycle $(\Omega, \delta, \int)$. By this we mean that $\mathbb{R}^n$ acts on $\Omega$ by derivations such that $\delta$ commutes with this action and the graded trace $\int$ is invariant. The aim is to construct a $(k+n)$-cycle from these data. Before we do so let us recall that the graded tensor product $\Omega \hat{\otimes} \Lambda$ of two graded algebras is the graded tensor product of the vector spaces with the product $(w_1 \hat{\otimes} v_1)(w_2 \hat{\otimes} v_2) = (-1)^{\partial w_2 \partial v_1}\, w_1 w_2 \hat{\otimes} v_1 v_2$. In particular, it reduces to the usual tensor product if one of the algebras is trivially graded.

Proposition A.7. Let $(\Omega, \delta, \int)$ be a $k$-cycle over the algebra $B$ with an action of $\mathbb{R}^n$ by $n$ derivations $\nabla_i$ which commute with $\delta$ and leave $\int$ invariant. Taking $\Omega' = \Omega \hat{\otimes} \Lambda\mathbb{C}^n$, the graded tensor product, $d' = \delta \hat{\otimes} 1 + d$ with $d(w \hat{\otimes} v) = (-1)^{\partial w} \sum_j \nabla_j w \hat{\otimes} e_j v$, and $\int' = \int \hat{\otimes}\, \imath$, one obtains a $(k+n)$-cycle $(\Omega', d', \int')$ over $B$.

Proof. $\Omega'$ is naturally bigraded and one defines the total degree to be the sum of the components of the bidegree. A straightforward calculation shows that $\delta$ and $d$ are anti-commuting graded differentials which are of total degree 1. Therefore $d' = d + \delta$ is a differential of total degree one and $(\Omega', d')$ a graded differential algebra with respect to the total degree. It remains thus to check that $\int'$ is a closed graded trace. First note that $\int'$ is non-zero precisely at the highest non-trivial (total) degree. Further

$$
\begin{aligned}
\int' (w_1 \hat{\otimes} v_1)(w_2 \hat{\otimes} v_2)
&= (-1)^{\partial w_2 \partial v_1} \int w_1 w_2\; \imath(v_1 v_2) \\
&= (-1)^{\partial w_2 \partial v_1 + \partial w_1 \partial w_2 + \partial v_1 \partial v_2} \int w_2 w_1\; \imath(v_2 v_1) \\
&= (-1)^{(\partial w_1 + \partial v_1)(\partial w_2 + \partial v_2)} \int' (w_2 \hat{\otimes} v_2)(w_1 \hat{\otimes} v_1)
\end{aligned}
$$

shows that $\int'$ is a graded trace. We have

$$
\int' d'(w \hat{\otimes} v) = \int \delta w\; \imath(v) + (-1)^{\partial w} \sum_j \int \nabla_j w\; \imath(e_j v)\,.
$$

The first term vanishes due to closedness of $\int$ and the second due to its invariance. Thus $\int'$ is closed.

A.5.1. Suspension of cyclic cocycles
We now consider a Banach algebra $B$ with an $n$-cycle $(\Omega, d, \int)$ and look at $C(S^1, B)$, the algebra of continuous functions over $S^1 = \mathbb{R}/2\pi\mathbb{Z}$ with values in $B$. $S^1$ is an abelian Lie group thus acting on itself and we take this action and extend it trivially to $C(S^1, B)$. Then the Lie algebra $\mathbb{R}$ of $S^1$ acts by differentiation along the parameter of $S^1$ which we denote $t$. We define for $C(S^1, B)$ an $n$-cycle $(\Omega', d', \int')$ as follows. $\Omega' = C(S^1, \Omega)$, $d'$ is the trivial extension of $d$ and for $\int'$ we combine the old $n$-cycle for $B$ with integration (Lebesgue measure) over $S^1$, namely $\int' = \int_{S^1} dt \int$. Now Proposition A.7 provides us with an $(n+1)$-cycle $(\Omega'', d'', \int'')$ for $C(S^1, B)$. Its restriction to the suspension $SB$, a sub-algebra of $C(S^1, B)$, plays an important role for us. Concretely, for $A_i : S^1 \to B$, we get

$$
\int'' A_0\, d''A_1 \cdots d''A_{n+1} = \sum_{j=1}^{n+1} (-1)^{n+1-j} \int_{S^1} dt \int A_0\, dA_1 \cdots dA_{j-1}\, \dot{A}_j\, dA_{j+1} \cdots dA_{n+1}\,.
$$

If $\xi$ denotes the character of $(\Omega, d, \int)$ in this example then we denote by $\xi^s$ the character on the suspension $SB$ obtained from $(\Omega'', d'', \int'')$ in the above way. Note that this suspension construction differs from Connes' (double) suspension [18, 19].

Theorem A.8 ([36]). The map $HC^n(B) \to HC^{n+1}(SB) : [\xi] \mapsto [\xi^s]$ is dual to the Bott map $\beta$ for even cocycles and dual to the suspension map $\Theta$ for odd cocycles, i.e.

$$
\langle [P]_0 - [P_m]_0, [\xi] \rangle = \langle \beta([P]_0 - [P_m]_0), [\xi^s] \rangle
$$

for even cocycles $\xi$ over $B$ and

$$
\langle [V]_1, [\xi] \rangle = -\langle \Theta([V]_1), [\xi^s] \rangle
$$

for odd ones.

A.5.2. Cyclic cocycles for crossed products
In the second application, we consider an $n$-cycle $(\Omega, d, \int)$ over an algebra $B$ on which is given a $\mathbb{Z}$-action $\alpha$ (such that $d$ commutes with this action and $\int$ is invariant under it). The aim is to construct an $(n+1)$-cycle over $B \times_\alpha \mathbb{Z}$, the algebraic crossed product. To do so we first construct an $n$-cycle over $B \times_\alpha \mathbb{Z}$ and then apply Proposition A.7. The $n$-cycle over $B \times_\alpha \mathbb{Z}$ is given by $(\Omega', d', \int')$ where $\Omega' = \Omega \times_\alpha \mathbb{Z}$, the algebraic crossed product, $d'$ the trivial extension of $d$ to functions $A : \mathbb{Z} \to \Omega$, i.e. $(d'A)(m) = dA(m)$, and $\int' A = \int A(0)$. To show that $d'$ is a derivation one needs indeed that it commutes with the $\mathbb{Z}$-action $\alpha$ and the latter is also essential to show that $\int'$ is a graded trace. Now on $B \times_\alpha \mathbb{Z}$ and on $\Omega'$ we have an action of the dual group of $\mathbb{Z}$ which is $S^1$. Its Lie algebra acts by the derivation

$$
\nabla A(m) = \imath m A(m)\,.
$$

This action commutes with $d'$ and leaves $\int'$ invariant, the latter simply because $\nabla A(0) = 0$. We thus can apply Proposition A.7 to obtain an $(n+1)$-cycle $(\Omega'', d'', \int'')$. More explicitly, for $A_i : \mathbb{Z} \to B$ we get

$$
\int'' A_0\, d''A_1 \cdots d''A_{n+1} = \sum_{j=1}^{n+1} (-1)^{n+1-j} \int \left( A_0\, dA_1 \cdots dA_{j-1}\, \nabla A_j\, dA_{j+1} \cdots dA_{n+1} \right)(0)\,.
$$
Now, if $\xi$ is the character of $(\Omega, d, \int)$, we denote by $\#_\alpha \xi$ the character corresponding to the above $(n+1)$-cycle $(\Omega'', d'', \int'')$. The proof of the following lemma is a straightforward algebraic calculation.

Lemma A.9. Let $B$ be an algebra furnished with a $\mathbb{Z}$-action $\alpha$, further let $(\Omega, d, \int)$ be an $\alpha$-invariant $n$-cycle over $B$ and $\xi$ its character. Then $\#_\alpha(\xi^s) = (\#_\alpha \xi)^s$.

A.6. Duality theorem for crossed products

Theorem A.10. Let $(B, \mathbb{Z}, \alpha)$ be a C∗-dynamical system, $(\Omega, d, \int)$ be an $\alpha$-invariant $n$-cycle over $B$ and $\xi$ its character. The map $HC^n(B) \to HC^{n+1}(B \times_\alpha \mathbb{Z}) : [\xi] \mapsto [\#_\alpha \xi]$ is dual to the connecting map of the Pimsner–Voiculescu exact sequence, i.e.

$$
\langle \exp([P]_0 - [P_m]_0), [\xi] \rangle = 2\pi \langle [P]_0 - [P_m]_0, [\#_\alpha \xi] \rangle\,,
$$

for odd $n$, and

$$
\langle \mathrm{ind}([V]_1), [\xi] \rangle = -2\pi \langle [V]_1, [\#_\alpha \xi] \rangle\,,
$$

for even $n$.

Proof. We denote the unitary generating the action $\alpha$ in $B \times_\alpha \mathbb{Z}$ by $U$, that is $\alpha(A) = U A U^*$ for all $A \in B$. Due to Lemma A.9 and Theorem A.8, it suffices to consider only the case of even $n$ because the crossed product commutes with the suspension, notably $SB \times_\alpha \mathbb{Z} = S(B \times_\alpha \mathbb{Z})$.

Now let $[P]_0 \in K_0(B)$ be in the image of the index map. From the Pimsner–Voiculescu exact sequence we then deduce that $[\alpha(P)]_0 = [P]_0$. Thus there exists a unitary $W \in B$ which is homotopic to the identity and such that $\alpha(P) = W P W^*$ (if necessary, one passes to matrix algebras over $B$). According to Proposition A.1 of the Appendix, the preimage of $[P]_0$ under the index map is then $[U P + W(1 - P)]_1 \in K_1(B \times_\alpha \mathbb{Z})$. Therefore the second statement of the theorem is equivalent to

$$
\langle [P]_0 - [P_m]_0, [\xi] \rangle = -2\pi \langle [U P + W(1 - P)]_1, [\#_\alpha \xi] \rangle\,,
$$

for an $n = 2k$-cocycle $\xi$ of $B$. Suppose first that $W = 1$. Then

$$
\langle [U P + 1 - P]_1, [\#_\alpha \xi] \rangle = c_{2k+1} \int'' P(U^* - 1)\, d''((U - 1)P) \cdots d''((U - 1)P)
$$

with the cycle for the character as above. Since the action $\alpha$ commutes with $d'$ and $[U, P] = 0$, we can collect all $U$'s to the left. Furthermore, $P(d'P)^m P$ vanishes if $m$ is odd and is equal to $P(d'P)^m$ if $m$ is even. Using this the r.h.s. becomes

$$
\begin{aligned}
\imath c_{2k+1}(k+1) \int' U (U - 1)^k (U^{-1} - 1)^{k+1} P (d'P)^{2k}
&= -\imath c_{2k+1}(k+1) \sum_{l=0}^{k} \binom{k}{l} \binom{k+1}{l+1} \int P (dP)^{2k} \\
&= -\imath c_{2k+1}(k+1) \binom{2k+1}{k} \int P (dP)^{2k} \\
&= -\frac{1}{2\pi} \langle [P]_0, [\xi] \rangle\,.
\end{aligned}
$$
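Two purely arithmetic facts enter the last chain of equalities: the binomial sum collapses by the Vandermonde identity, $\sum_{l=0}^{k} \binom{k}{l}\binom{k+1}{l+1} = \binom{2k+1}{k}$, and the normalization constants of Sec. A.3 satisfy $\imath\, c_{2k+1} (k+1) \binom{2k+1}{k} = c_{2k}/2\pi$. The sketch below checks both for small $k$ in exact rational arithmetic. The function names are illustrative, and the powers of $2\pi\imath$ are stripped by hand before comparing, since $\imath (2\pi\imath)^{-(k+1)} = (2\pi)^{-1} (2\pi\imath)^{-k}$; signs are not addressed.

```python
from fractions import Fraction
from math import comb, factorial

def a_even(k):
    """c_{2k} with the factor (2*pi*i)^(-k) stripped, i.e. 1/k!."""
    return Fraction(1, factorial(k))

def a_odd(k):
    """c_{2k+1} with the factor (2*pi*i)^(-(k+1)) stripped, i.e.
    1 / (2^(2k+1) * (k+1/2)(k-1/2)...(1/2))."""
    prod = Fraction(1)
    for j in range(k + 1):
        prod *= Fraction(2 * j + 1, 2)  # the factor j + 1/2
    return 1 / (Fraction(2) ** (2 * k + 1) * prod)

def vandermonde_sum(k):
    """Left-hand side of the binomial identity used in the proof."""
    return sum(comb(k, l) * comb(k + 1, l + 1) for l in range(k + 1))

def check(kmax):
    """Return True if both identities hold for k = 0, ..., kmax."""
    return all(
        vandermonde_sum(k) == comb(2 * k + 1, k)
        and a_odd(k) * (k + 1) * comb(2 * k + 1, k) == a_even(k)
        for k in range(kmax + 1)
    )

print(check(10))
```

Using `Fraction` avoids any floating-point tolerance: the stripped constants are rational, so the comparison is exact.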
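To treat now the case $W \neq 1$, we use the new action $\alpha'(A) = W^* \alpha(A) W$ and the associated crossed product $B \times_{\alpha'} \mathbb{Z}$. Further let us introduce an automorphism $\gamma$ of $M_2(B)$ by

$$
\gamma \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} \alpha(A) & \alpha(B) W \\ W^* \alpha(C) & W^* \alpha(D) W \end{pmatrix}.
$$

Then there are two embeddings

$$
\rho_1 : B \times_\alpha \mathbb{Z} \to M_2(B) \times_\gamma \mathbb{Z}\,, \qquad \rho_2 : B \times_{\alpha'} \mathbb{Z} \to M_2(B) \times_\gamma \mathbb{Z}\,,
$$

onto the upper left and lower right corner of the $2 \times 2$-matrix respectively. Finally we introduce the $C^\infty$-path $\mathrm{Ad}_{V_t} : M_2(B) \times_\gamma \mathbb{Z} \to M_2(B) \times_\gamma \mathbb{Z}$, $t \in [0, 1]$, where

$$
V_t = \begin{pmatrix} \cos\frac{\pi t}{2} & -\sin\frac{\pi t}{2} \\ \sin\frac{\pi t}{2} & \cos\frac{\pi t}{2} \end{pmatrix}.
$$

As $\mathrm{Ad}_{V_1}$ interchanges the upper left and lower right corners, we can use the following commutative diagram to define an isomorphism $i : B \times_\alpha \mathbb{Z} \to B \times_{\alpha'} \mathbb{Z}$:

$$
\begin{array}{ccc}
B \times_\alpha \mathbb{Z} & \xrightarrow{\; \rho_1 \;} & M_2(B) \times_\gamma \mathbb{Z} \\
{\scriptstyle i} \big\downarrow & & \big\downarrow {\scriptstyle \mathrm{Ad}_{V_1}} \\
B \times_{\alpha'} \mathbb{Z} & \xrightarrow{\; \rho_2 \;} & M_2(B) \times_\gamma \mathbb{Z}\,.
\end{array}
$$

One can now check that $i(A) = A$ for all $A \in B$ and $i(U) = W U'$ where $U'$ is the generator of the $\alpha'$-action in $B \times_{\alpha'} \mathbb{Z}$. Hence $i(U P + W(1 - P)) = W(U' P + (1 - P))$. Further one checks that

$$
\#_\alpha \xi = \rho_1^*(\#_\gamma(\mathrm{Tr}_2 \otimes \xi))\,, \qquad \#_{\alpha'} \xi = \rho_2^*(\#_\gamma(\mathrm{Tr}_2 \otimes \xi))\,.
$$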
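The role of the rotation path can be seen already for scalar entries. The following sketch is a toy illustration, assuming the convention $\mathrm{Ad}_V(M) = V M V^*$ and replacing $B$ by the real numbers; it confirms numerically that the path starts at the identity and that its endpoint moves an element of the upper left corner to the lower right corner.

```python
import math

def V(t):
    """Rotation matrix by the angle pi*t/2, for t in [0, 1]."""
    c, s = math.cos(math.pi * t / 2), math.sin(math.pi * t / 2)
    return [[c, -s], [s, c]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def ad(t, M):
    """Ad_{V_t}(M) = V_t M V_t^T (assumed convention; V_t is real orthogonal)."""
    Vt = V(t)
    return matmul(matmul(Vt, M), transpose(Vt))

upper_left = [[3.0, 0.0], [0.0, 0.0]]  # the scalar 3 in the upper left corner
at_start = ad(0.0, upper_left)         # V_0 = 1 leaves the matrix unchanged
at_end = ad(1.0, upper_left)           # V_1 moves the entry to the lower right
print(at_start)
print(at_end)
```

Because `math.cos(math.pi / 2)` is only approximately zero in floating point, the off-corner entries of `at_end` are tiny rather than exactly zero, so comparisons should use a tolerance.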
To verify that the pairings with $[\#_\alpha \xi]_1$ and $[\#_{\alpha'} \xi]_1$ coincide, we now use Connes' homotopy invariance of cyclic cohomology [19] extended to non-unital algebras in [24]. As $V_0 = 1$, it tells us that $[\#_\gamma(\mathrm{Tr}_2 \otimes \xi)]_1 = [\mathrm{Ad}_{V_1}^* \#_\gamma(\mathrm{Tr}_2 \otimes \xi)]_1$ in $HC^{\mathrm{odd}}(M_2(B) \times_\gamma \mathbb{Z})$. Therefore we obtain:

$$
\begin{aligned}
\langle [U P + W(1 - P)]_1, [\#_\alpha \xi]_1 \rangle
&= \langle [\rho_1(U P + W(1 - P))]_1, [\#_\gamma(\mathrm{Tr}_2 \otimes \xi)] \rangle \\
&= \langle [\rho_1(U P + W(1 - P))]_1, [\mathrm{Ad}_{V_1}^*(\#_\gamma(\mathrm{Tr}_2 \otimes \xi))] \rangle \\
&= \langle [i(U P + W(1 - P))]_1, [\rho_2^*(\#_\gamma(\mathrm{Tr}_2 \otimes \xi))] \rangle \\
&= \langle [U' P + (1 - P)]_1, [\#_{\alpha'} \xi] \rangle\,,
\end{aligned}
$$

where in the last step we use $[W(U' P + (1 - P))]_1 = [U' P + (1 - P)]_1$ because $W$ is homotopic to the identity in $B$. Now this last pairing is in $B \times_{\alpha'} \mathbb{Z}$ where the above calculation can be applied in order to conclude.

Acknowledgments

We would like to express our deep gratitude to our teachers Jean Bellissard and Ruedi Seiler. Their works set the stage for the present article which hereby inherits their spirit. We moreover profited from discussions with many colleagues, among them M. Aizenman, Y. Avron, J.-M. Combes, S. De Bièvre, G. Elliott, F. Germinet, S. Jitomirskaya, A. Klein and M. Seifert. We acknowledge support of the SFB 288.

References

[1] M. Aizenman, “Localization at weak disorder: some elementary bounds”, Rev. Math. Phys. 6 (1994) 1163–1182.
[2] M. Aizenman and G. Graf, “Localization bounds for an electron gas”, J. Phys. A: Math. Gen. 31 (1998) 6783–6806.
[3] M. Aizenman and S. Molchanov, “Localization at large disorder and at extreme energies: an elementary derivation”, Commun. Math. Phys. 157 (1993) 245–278.
[4] E. Akkermans, J. E. Avron, R. Narevich and R. Seiler, “Boundary conditions for bulk and edge states in quantum Hall systems”, European Phys. J. B 1 (1998) 117–121.
[5] B. W. Alphenaar, P. McEuen, R. Wheeler and R. Sacks, “Selective equilibration among the current-carrying states in the quantum Hall regime”, Phys. Rev. Lett. 64 (1990) 677–680.
[6] J. E. Avron and R. Seiler, “Quantization of the Hall conductance for general, multiparticle Schrödinger Hamiltonians”, Phys. Rev. Lett. 54 (1985) 259–262.
[7] J. E. Avron, R. Seiler and B. Simon, “Charge Deficiency, Charge Transport and Comparison of Dimensions”, Commun. Math. Phys. 159 (1994) 399–422.
[8] J. E. Avron, R. Seiler and A. Yaffe, “Adiabatic theorems and applications to the quantum Hall effect”, Commun. Math. Phys. 110 (1987) 33–49.
[9] J. Bellissard, “K-theory of C∗-algebras in solid state physics”, pp. 95–156 in Statistical Mechanics and Field Theory: Mathematical Aspects, Lecture Notes in Physics 257, eds. T. Dorlas, M. Hugenholtz and M. Winnink, Springer-Verlag, Berlin, 1986.
[10] J. Bellissard, “Ordinary quantum Hall effect and non-commutative cohomology”, in Proc. of the Bad Schandau Conference on Localization, 1986, eds. Ziesche and Weller, Teubner Texte Phys. 16, Teubner-Verlag, Leipzig, 1988.
[11] J. Bellissard, “Gap labelling theorems for Schrödinger operators”, pp. 538–630 in From Number Theory to Physics, Springer, Berlin, 1992.
[12] J. Bellissard, A. van Elst and H. Schulz-Baldes, “The non-commutative geometry of the quantum Hall effect”, J. Math. Phys. 35 (1994) 5373–5451.
[13] J. Bellissard, D. J. Herrmann and M. Zarrouati, “Hull of aperiodic solids and gap labelling theorems”, in Directions in Mathematical Quasicrystals, eds. M. Baake and R. V. Moody, Amer. Math. Soc., Providence, RI, 2000.
[14] B. Blackadar, K-Theory for Operator Algebras, Springer-Verlag, New York, 1986.
[15] M. Büttiker, “Absence of backscattering in the quantum Hall effect in multiprobe conductors”, Phys. Rev. B 38 (1988) 9375–9389.
[16] M. Büttiker, “The quantum Hall effect in open conductors”, pp. 191–277 in Semiconductors and Semimetals Vol. 35, ed. M. Reed, Academic Press, San Diego, 1992.
[17] A. Connes, “An analogue of the Thom isomorphism”, Adv. Math. 39 (1981) 31–55.
[18] A. Connes, “Non-Commutative Differential Geometry: Part I and II”, Publ. IHES 62 (1985) 41–144.
[19] A. Connes, Non-Commutative Geometry, Acad. Press, San Diego, 1994.
[20] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators, Springer, Berlin, 1987.
[21] S. De Bièvre and J. V. Pulé, “Propagating edge states for a magnetic Hamiltonian”, Elect. J. Math. Phys. 5 (1999).
[22] J. Dixmier, C∗-Algebras, North Holland, Amsterdam, 1977.
[23] G. Elliott and T. Natsume, “A Bott periodicity map for crossed products of C∗-algebras by discrete groups”, K-Theory 1 (1987) 423–435.
[24] G. Elliott, T. Natsume and R. Nest, “Cyclic cohomology for one-parameter smooth crossed products”, Acta Math. 160 (1988) 285–305.
[25] J. Fröhlich, G. M. Graf and J. Walcher, “On the extended nature of edge states of quantum Hall Hamiltonians”, Ann. H. Poincaré 1 (2000).
[26] B. I. Halperin, “Quantized Hall conductance, current-carrying edge states, and the existence of extended states in a two-dimensional disordered potential”, Phys. Rev. B 25 (1982) 2185–2190.
[27] Y. Hatsugai, “Edge states in the integer quantum Hall effect and the Riemann surface of the Bloch function”, Phys. Rev. B 48 (1993) 11851–11862.
[28] Y. Hatsugai, “The Chern number and edge states in the integer quantum Hall effect”, Phys. Rev. Lett. 71 (1993) 3697–3700.
[29] K. v. Klitzing, G. Dorda and M. Pepper, “New method for high-accuracy determination of the fine-structure constant based on quantized Hall resistance”, Phys. Rev. Lett. 45 (1980) 494–497.
[30] H. Kunz, “The quantum Hall effect for electrons in a random potential”, Commun. Math. Phys. 112 (1987) 121–145.
[31] R. B. Laughlin, “Quantized Hall conductivity in two dimensions”, Phys. Rev. B 23 (1981) 5632–5633.
[32] N. Macris, P. A. Martin and J. V. Pulé, “On edge states in semi-infinite quantum Hall systems”, J. Phys. A 32 (1999) 1985–1996.
[33] S. Nakamura and J. Bellissard, “Low energy bands do not contribute to the quantum Hall effect”, Commun. Math. Phys. 131 (1990) 282–305.
[34] R. Nest, “Cyclic cohomology of crossed products with Z”, J. Funct. Anal. 80 (1988) 235–283.
[35] G. K. Pedersen, C∗-Algebras and Their Automorphism Groups, Academic Press, London, 1979.
[36] M. Pimsner, “Ranges of traces on K0 of reduced crossed products by free groups”, pp. 374–408 in Lecture Notes in Mathematics Vol. 1132, Springer, Berlin, 1985.
[37] M. Pimsner and D. Voiculescu, “Exact sequences for K-groups of certain cross-products of C∗-algebras”, J. Op. Theory 4 (1980) 93–118.
[38] R. Prange and S. Girvin (eds), The Quantum Hall Effect, 2nd Edition, Springer-Verlag, Berlin, 1990.
[39] S. Sakai, Operator Algebras in Dynamical Systems, Cambridge University Press, Cambridge, 1991.
[40] H. Schulz-Baldes and J. Bellissard, “Anomalous transport: a mathematical framework”, Rev. Math. Phys. 10 (1998) 1–46.
[41] H. Schulz-Baldes and J. Bellissard, “A kinetic theory for quantum transport in aperiodic media”, J. Stat. Phys. 91 (1998) 991–1027.
[42] H. Schulz-Baldes, J. Kellendonk and T. Richter, “Simultaneous quantization of the edge and bulk Hall conductivity”, J. Phys. A: Math. Gen. 33 (2000) L27–L32.
[43] D. J. Thouless, M. Kohmoto, M. P. Nightingale and M. den Nijs, “Quantized Hall conductance in a two-dimensional periodic potential”, Phys. Rev. Lett. 49 (1982) 405–408.
[44] Wegge-Olsen, K-theory and C∗-Algebras, Oxford University Press, Oxford, 1993.
February 4, 2002 10:11 WSPC/148-RMP
00112
Reviews in Mathematical Physics, Vol. 14, No. 2 (2002) 121–171 © World Scientific Publishing Company
LINDSTEDT SERIES FOR PERTURBATIONS OF ISOCHRONOUS SYSTEMS: A REVIEW OF THE GENERAL THEORY
MICHELE BARTUCCELLI Department of Mathematics and Statistics University of Surrey, Guildford, GU2 5XH, UK
[email protected] GUIDO GENTILE Dipartimento di Matematica, Università di Roma Tre, Roma, I-00146, Italy
[email protected]
Received 8 December 2000 We give a proof of the persistence of invariant tori for analytic perturbations of isochronous systems by using the Lindstedt series expansion for the solutions of the equations of motion. With respect to the case of anisochronous systems, there is the additional problem of finding the set of allowed rotation vectors, because they cannot be given a priori simply by looking at the unperturbed system. By considering the involved parameters (size of the perturbation, rotation vector and average action of a persisting invariant torus) as independent parameters we can introduce a function which is analytic in such parameters and only when the latter satisfy some constraint it becomes a solution: this can be regarded as a sort of singular implicit function problem. Therefore, although the dependence of the parameters, hence of the solution, upon the size of the perturbation is not smooth, in this way we construct explicitly the solution by using an absolutely convergent power series.
1. Introduction

1.1. Lindstedt series and KAM theorem. The KAM theorem assures the persistence of a large number of invariant tori under perturbations of integrable systems. For analytic Hamiltonians a posteriori one can consider the equations of motion and look directly for analytic quasi-periodic solutions, by writing them as formal power series, Lindstedt series, in the size of the perturbation: when such solutions exist, the series representing them must converge. This is a quite natural approach, and in fact it was the first to be attempted, for instance by Poincaré [14], who, however, doubted that, in general, the series could converge. The problem was then solved by Kolmogorov [12], and it gave rise to a large amount of literature about what has become known as KAM theory. Exponential bounds on the coefficients of the Lindstedt series are obtained in the proof from the analysis of an implicit function problem, but the mechanism which remained unclear was how
the single terms arising in the perturbative expansion of the Lindstedt series and separately growing much faster than exponentially still admit an exponential bound when summed together [13], and the problem was referred to (improperly) as the “problem of a direct proof of KAM theorem”. This was the state of the art until very recently, when the problem was solved by Eliasson [6], for the anisochronous case. In fact in such a case the non-degeneracy condition for the free Hamiltonian allows us to construct for the perturbed system an invariant torus run with a rotation vector chosen among those of the free system: more precisely, if, with obvious notations, {α(t) = ωt, A(t) = A0 } is an orbit on an invariant torus for the integrable anisochronous Hamiltonian H0 (A), one considers the perturbed system H(α, A) = H0 (A) + f (α, A, ε), with f (α, A, ε) = O(ε), and, for |ε| < ε0 , with ε0 small enough, and ω satisfying a Diophantine condition, one looks for a quasiperiodic solution of the equations of motion which has the same rotation vector ω. In the isochronous case an approach of this kind is not so straightforward. In fact for H0 (A) = ω · A all tori have the same rotation vector ω, while for the perturbed system H(α, A) = ω · A + f (α, A, ε), with f (α, A, ε) = O(ε), the KAM theorem states the existence of invariant tori for |ε| < ε0 , with ε0 small enough, but with rotation vectors different from that of the unperturbed system: in addition the latter depend (in general) on ε and the dependence is not a smooth one. So it is not clear at all how to extend the Lindstedt series, mostly because there is no hope to obtain an analytic dependence on ε. Here we address the just described question. We prefer to consider a particular case, rather simplified with respect to the most general one can conceive, but still retaining the most important features of the general case. 
We consider a two-dimensional case, in which the frequency of one harmonic oscillator is fixed once and for all (i.e. it is simply a clock). We follow the spirit of the approach by Gallavotti [8], where Eliasson's work was revisited by considering a simplified model (Thirring model), in order to separate the general strategy of the proof from the technical intricacies which would make it less terse. Moreover such a case is already interesting for physical applications [1], for example in the study of the stability of the upside-down pendulum whose point of support is subjected to a fast oscillation in the vertical direction. Extensions to more general systems will be discussed at the end (see also Sec. 1.10 below).

1.2. The model. For $m \in \mathbb{N}$, for $\bar{x} \in \mathbb{R}^m$ and for $\bar{A}_1 \in \mathbb{C}$ define the following domains:

$$
\begin{aligned}
\Sigma_\kappa &= \{\alpha \in \mathbb{C}^2 : \mathrm{Re}\, \alpha_j \in \mathbb{T}\,,\ |\mathrm{Im}\, \alpha_j| < \kappa\,,\ j = 1, 2\}\,, \\
B_r(\bar{x}) &= \{x \in \mathbb{C}^m : |x - \bar{x}| < r\}\,, \\
A_\rho(\bar{A}_1) &= \{A \in \mathbb{C}^2 : |A_1 - \bar{A}_1| < \rho\}\,.
\end{aligned}
\tag{1.1}
$$

Consider the system described by the Hamiltonian

$$
H = \omega \cdot A + f(\alpha, A_1, \varepsilon)\,, \tag{1.2}
$$
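where $A = (A_1, A_2) \in \mathbb{R}^2$ and $\alpha = (\alpha_1, \alpha_2) \in \mathbb{T}^2$ are conjugate variables, $\omega = (\mu, 1)$, with $\mu \in (0, 1)$, $\cdot$ denotes the inner product in $\mathbb{R}^2$ and $f$ is a function real analytic in the variables $\alpha$, $A_1$ and $\varepsilon$, and holomorphic in a complex domain

$$
\mathcal{D} = \Sigma_\kappa \times D \times B_{\varepsilon_1}(0)\,, \tag{1.3}
$$

with $D \subset \mathbb{C}$ an open subset, and such that $f = 0$ for $\varepsilon = 0$; so one can write in (1.3)

$$
f(\alpha, A_1, \varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k f^{(k)}(\alpha, A_1)\,. \tag{1.4}
$$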
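As a rough numerical companion to the model (1.2), the sketch below integrates Hamilton's equations $\dot{\alpha} = \partial_A H$, $\dot{A} = -\partial_\alpha H$ with a classical fourth-order Runge–Kutta step. Everything specific in it is an assumption made for illustration and is not taken from the text: the perturbation $f = \varepsilon\,(A_1^2/2 + \cos(\alpha_1 + \alpha_2))$, the values of $\mu$ and $\varepsilon$, the initial data and the step size. It only checks two elementary features, namely that $\alpha_2$ advances like a clock and that $H$ is conserved up to integration error.

```python
import math

MU = (math.sqrt(5) - 1) / 2  # illustrative frequency in (0, 1)
EPS = 0.1                    # illustrative size of the perturbation

def f(a1, a2, A1):
    """Hypothetical perturbation, vanishing for EPS = 0."""
    return EPS * (0.5 * A1 * A1 + math.cos(a1 + a2))

def hamiltonian(state):
    a1, a2, A1, A2 = state
    return MU * A1 + A2 + f(a1, a2, A1)

def rhs(state):
    """Hamilton's equations for H = MU*A1 + A2 + f(alpha, A1)."""
    a1, a2, A1, A2 = state
    s = EPS * math.sin(a1 + a2)  # equals -df/dalpha_1 and -df/dalpha_2
    return (MU + EPS * A1, 1.0, s, s)

def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shift(u, k, h):
        return tuple(x + h * y for x, y in zip(u, k))
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, dt / 2))
    k3 = rhs(shift(state, k2, dt / 2))
    k4 = rhs(shift(state, k3, dt))
    return tuple(x + dt / 6 * (p + 2 * q + 2 * r + w)
                 for x, p, q, r, w in zip(state, k1, k2, k3, k4))

def integrate(state, dt, steps):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

start = (0.0, 0.0, 1.0, 0.5)        # (alpha_1, alpha_2, A_1, A_2) at t = 0
end = integrate(start, 0.01, 1000)  # integrate up to t = 10
print("alpha_2(10) =", end[1])
print("energy drift =", abs(hamiltonian(end) - hamiltonian(start)))
```

A non-symplectic integrator is used for brevity; over long times a symplectic scheme would be the more careful choice, and nothing here bears on the convergence questions treated in the paper.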
The system (1.2) represents two harmonic oscillators interacting through a potential depending only on the angles and on the action variable $A_1$. The corresponding equations of motion are

$$
\begin{cases}
\dot{\alpha}_1 = \mu + \partial_{A_1} f\,, \\
\dot{\alpha}_2 = 1\,, \\
\dot{A}_1 = -\partial_{\alpha_1} f\,, \\
\dot{A}_2 = -\partial_{\alpha_2} f\,,
\end{cases}
\tag{1.5}
$$

so that the angle $\alpha_2$ rotates with constant angular velocity, i.e. $\alpha_2(t) = t$. Here and henceforth $\partial_x$ denotes the partial derivative with respect to $x$: if $x = (x_1, x_2)$ then $\partial_x = (\partial_{x_1}, \partial_{x_2})$.

1.3. Results. Assume that $\omega$ in (1.2) satisfies the Diophantine condition

$$
|\omega \cdot \nu| > C |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}\,, \tag{1.6}
$$

with Diophantine constants $C > 0$ and $\tau > 1$. We shall prove the following results:

(1) if $\omega_0$ is a rotation vector close enough to $\omega$ and with comparable Diophantine properties (i.e. with Diophantine constants $C_0 = bC$ and $\tau$, for some constant $b \in (0, 1)$), then for all $A_0$ close enough to a prefixed $\bar{A}$, one can fix a value $\varepsilon$ for which there exists an invariant torus with rotation vector $\omega_0$ and average action $A_0$ for the Hamiltonian with perturbative parameter suitably fixed (close to 0) and depending analytically on $A_0$;

(2) instead of fixing the average action, we can fix the value of the perturbative parameter $\varepsilon$ and look for the invariant tori persisting under perturbation: we have that infinitely many of them persist, with rotation vectors $\omega_0$ close enough to $\omega$ and with average actions depending on $\omega_0$.

The first result is peculiar to isochronous systems, while the second one is quite analogous to the anisochronous KAM theorem and it is the usual form in which it is stated in the literature, up to a quantitative characterization of the “infinitely many tori” and an estimate of the relative measure of the points in phase space lying on invariant tori. If we allow a much weaker Diophantine condition, that is we let $b$ be a power of $\varepsilon$, such a measure can be shown to tend to 1 for $\varepsilon \to 0$; we shall come back to such a problem in Sec. 6 (see also Sec. 1.10 below). We can state more formally the above results as follows.

1.4. Theorem. Fix $\bar{A} = (\bar{A}_1, \bar{A}_2)$, with $\bar{A}_1 \in D$, and $\rho > 0$ such that $B_\rho(\bar{A}_1) \subset D$. Consider the equations of motion (1.5), corresponding to the Hamiltonian (1.2), with $\omega = (\mu, 1)$ satisfying the Diophantine condition (1.6), and suppose that

$$
\int_{\mathbb{T}^2} d\alpha\; \partial_{A_1} f^{(1)}(\alpha, A_1) \neq 0 \qquad \forall\, A_1 \in B_\rho(\bar{A}_1)\,. \tag{1.7}
$$
There is a universal constant $b \in (0, 1)$ and three $b$-dependent constants $a > 0$, $\rho_0 \in (0, \rho)$ and $\kappa_0 \in (0, \kappa)$ such that for all $\mu_0 \in (\mu - aC, \mu + aC)$, with $\omega_0 = (\mu_0, 1)$ satisfying the Diophantine condition

$$
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}\,, \qquad C_0 = bC\,, \tag{1.8}
$$

and for all $A_0 \in A_{\rho_0}(\bar{A}_1)$, there is a value $\varepsilon = \varepsilon(\mu_0, A_0) \in B_{\varepsilon_1}(0)$, depending analytically on $A_0 \in A_{\rho_0}(\bar{A}_1)$, and two functions $\bar{h}(\psi, A_0, \mu_0)$ and $\bar{H}(\psi, A_0, \mu_0)$, analytic in $(\psi, A_0) \in \Sigma_{\kappa_0} \times A_{\rho_0}(\bar{A}_1)$ and with zero $\psi$-average, such that

$$
\begin{cases}
\alpha(t) = \omega_0 t + \bar{h}(\omega_0 t, A_0, \mu_0)\,, \\
A(t) = A_0 + \bar{H}(\omega_0 t, A_0, \mu_0)
\end{cases}
\tag{1.9}
$$

is a solution of (1.5). The constant $a$ depends on $b$, but it is independent of $C$.

1.5. Remarks. (1) The condition (1.7) is not really needed in order to prove the theorem: it is imposed just for simplicity, but it could be considerably weakened. See also Remark 2.13. (2) Since the function $H$ has zero average, the vector $A_0$ represents the average (over time) of the action variable for the quasi-periodic motion with rotation vector $\omega_0$: this means that we are looking for an invariant torus whose average action equals that of an unperturbed Diophantine one.

1.6. Theorem. Fix $\bar{A} = (\bar{A}_1, \bar{A}_2)$, with $\bar{A}_1 \in D$. Consider the equations of motion (1.5) corresponding to the Hamiltonian (1.2), with $\omega = (\mu, 1)$ satisfying the Diophantine condition

$$
|\omega \cdot \nu| > C |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}\,, \tag{1.10}
$$

with Diophantine constants $C > 0$ and $\tau > 1$, and suppose that

$$
M \equiv \int_{\mathbb{T}^2} d\alpha\; \partial_{A_1}^2 f^{(1)}(\alpha, \bar{A}) \neq 0\,. \tag{1.11}
$$

There is a universal constant $b \in (0, 1)$ and two $b$-dependent constants $\bar{\varepsilon} \in (0, \varepsilon_1)$ and $\kappa_0 \in (0, \kappa)$ such that for all $\varepsilon \in B_{\bar{\varepsilon}}(0) \setminus \{0\}$ there are a constant $a > 0$, infinitely many $\mu_0 \in (\mu - a\varepsilon, \mu + a\varepsilon)$ with $\omega_0 = (\mu_0, 1)$ satisfying the Diophantine condition

$$
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}\,, \qquad C_0 = bC\,, \tag{1.12}
$$
and pairs of functions $h^*(\psi, \varepsilon, \mu_0)$ and $H^*(\psi, \varepsilon, \mu_0)$, analytic for $\psi \in \Sigma_{\kappa_0}$ and with zero $\psi$-average, and a vector $A^*(\varepsilon, \mu_0)$ such that

$$
\begin{cases}
\alpha(t) = \omega_0 t + h^*(\omega_0 t, \varepsilon, \mu_0)\,, \\
A(t) = A^*(\varepsilon, \mu_0) + H^*(\omega_0 t, \varepsilon, \mu_0)
\end{cases}
\tag{1.13}
$$

is a solution of (1.5). One has $A_2^*(\varepsilon, \mu_0) = \bar{A}_2$ and $A_1^*(\varepsilon, \mu_0) \in D$; the constant $a$ depends on $b$, but it is independent of $\varepsilon$.

1.7. Remark. (1) The condition (1.11) could be considerably weakened. See also Remark 1.5 (1), and Remark 2.15. (2) We are fixing $\varepsilon$ and $\omega_0$ and we want to detect an invariant torus with rotation vector $\omega_0$: to achieve such a goal we are forced to change (with respect to the unperturbed system) the average action into a value $A^*(\varepsilon, \mu_0)$: this is the meaning of the vector $A^*(\varepsilon, \mu_0)$ appearing in the statement of Theorem 1.6.

1.8. Idea of the proof. For $\varepsilon = 0$ and for all $A_0 \in \mathbb{R}^2$ there is a solution

$$
\{\alpha(t) = \omega t\,,\ A(t) = A_0\}\,, \tag{1.14}
$$

lying on an invariant torus. Because of the isochrony of the unperturbed system we are not able to fix a priori the rotation vectors $\omega_0$ of the quasi-periodic solutions for $\varepsilon \neq 0$. So we proceed by adopting the splitting

$$
\mu_0 + \eta(A_0, \varepsilon, \mu_0) = \mu\,, \tag{1.15}
$$

where $A_0$ is the same as in (1.14) and, to begin, $\mu_0$ is fixed in a rather arbitrary way. The above prescription leads to the system of equations

$$
\begin{cases}
\dot{\alpha}_1 = \mu_0 + \partial_{A_1} f + \eta(A_0, \varepsilon, \mu_0)\,, \\
\dot{\alpha}_2 = 1\,, \\
\dot{A}_1 = -\partial_{\alpha_1} f\,, \\
\dot{A}_2 = -\partial_{\alpha_2} f\,,
\end{cases}
\tag{1.16}
$$

hence, by ignoring the constraint given by Eq. (1.15), we look for a quasi-periodic solution of the modified system (1.16), where $\eta(A_0, \varepsilon, \mu_0)$ is to be determined, with rotation vector $\omega_0$, i.e. for a solution of the form

$$
\begin{cases}
\alpha(t) = \omega_0 t + h(\omega_0 t, A_0, \varepsilon, \mu_0)\,, \\
A(t) = A_0 + H(\omega_0 t, A_0, \varepsilon, \mu_0)\,,
\end{cases}
\tag{1.17}
$$

where $\omega_0 = (\mu_0, 1)$. The functions $h$ and $H$ are called the conjugating functions, while $\eta$ is called the counterterm. If $\eta(A_0, \varepsilon, \mu_0)$ is replaced with 0 in (1.15), so that $\omega_0$ becomes $\omega$, it is well known from KAM theory that there are not (in general) quasi-periodic solutions with rotation vector $\omega$; on the other hand, as it is defined, $\eta(A_0, \varepsilon, \mu_0)$ depends
February 4, 2002 10:11 WSPC/148-RMP
126
00112
M. Bartuccelli & G. Gentile
only on µ0 and it is not obvious how µ0 has to be chosen nor if it can be chosen at all so that (1.15) is satisfied. Note that, if for some µ0 (1.15) is verified, then a solution of the true equations of motions (1.5), quasiperiodic with rotation vector ω 0 = (µ0 , 1), will have been found. We shall show that, by neglecting the constraint (1.15), for fixed µ0 it will be possible to choose in a unique way η(A0 , ε, µ0 ) as an analytic function of ε and of A0 in such a way that there exists a solution of (1.16) the form (1.17) with zero average: this will be the content of Lemma 2.2. We note that as long as the constraint (1.15) is neglected the solutions (1.17) of the equation of motion (and the corresponding counterterm) are analytic in ε as well in A0 : so we shall use the powerful machinery of the Lindstedt series for the analytic KAM theory also in a case in which the solution of the original equations of motion cannot be expected to be analytic. The idea of introducing suitable counterterms in the equations of motion in order to make them soluble is not new, and dates back to [13], where it was successfully used in order to give (among other things) a proof of the KAM theorem. Also in [10] (see also [9]) the existence of whiskers for hyperbolic tori (of codimension one) was proved for a class of almost integrable systems by studying a modified Hamiltonian obtained by introducing suitable counterterms. Of course one had to prove eventually that the original Hamiltonian could be recovered: in [10] the proof of such an assertion was a simple application of the implicit function theorem. Also in the present case, because of the introduction of the counterterm, we study a different system of equations, and, since the counterterm is uniquely determined, in general there is no hope to obtain, for ε, A0 and µ0 arbitrarily chosen, that such a counterterm satisfies (1.15). 
In our case (1.15) is not quite an implicit function problem, because the dependence of η(A0, ε, µ0) on µ0 is not even smooth (the counterterm is defined on a Cantor-like set, as far as the dependence on µ0 is concerned). Nevertheless (1.15), seen either as an equation for ε for fixed A0 and µ0 or as an equation for A0 for fixed ε and µ0, can be solved: this will lead to, respectively, Theorem 1.4 and Theorem 1.6, as will be shown in Sec. 2. This means that we shall look for solutions of the form (1.17) where ε, A0 and µ0 are not independent of each other: we can choose ε as a function of µ0 and A0, in such a way that the constraint (1.15) is satisfied, so that we can write h = h̄(ψ, A0, µ0) = h(ψ, A0, ε(µ0, A0), µ0), and analogous expressions hold for the other quantities H and η. Then, for fixed µ0 and A0, ε is not a free parameter, and in principle we could not study the dependence of h on ε by varying ε without changing µ0 and A0. Nevertheless, if ε and (A0, µ0) are considered as independent parameters, then h(ψ, A0, ε, µ0) turns out to be analytic in ε, and we shall see that we can use the analyticity of such a dependence in order to write the solution as a power series in ε. Analogously we can fix ε and µ0 (in an appropriate way) and choose A0 = A∗(ε, µ0) as a function of both parameters, again in such a way that the constraint (1.15) is satisfied, so that we can write the solution as h = h∗(ψ, ε, µ0) =
h(ψ, A∗(ε, µ0), ε, µ0), and the same can be done for H and η. Note that in such a case even if h(ψ, A0, ε, µ0) is analytic in ε, the solution h∗(ψ, ε, µ0) is not, as A∗(ε, µ0) does not depend analytically on ε.

1.9. Comments about the statement of the theorems. Theorem 1.4 deals with the problem of fixing a rotation vector ω0 = (µ0, 1), with µ0 close to µ, and looking for a value ε = ε(µ0, A0) such that the Hamiltonian (1.2) admits an invariant torus run with rotation vector ω0. Such a problem is of physical relevance in studying the stability and the persistence of KAM tori near elliptic equilibrium points; applications are discussed in [1]. On the other hand one could also consider the following (different) problem: set Bρ(A0) = {A : |A − A0| < ρ} ⊂ D × C, fix ε small enough and look for values µ0 close to µ and, correspondingly, values A∗ ∈ Bρ(A0) such that, for that value of ε, the Hamiltonian (1.2) admits a solution parameterized by the action A∗ (i.e. such that the average of the action variables is A∗) and run with rotation vector ω0 = (µ0, 1). Theorem 1.6 deals exactly with such a problem.

1.10. Invariant tori for fixed ε. One can also ask how many tori persist under perturbations, that is, with the same notations as in Sec. 1.9, which fraction of phase space in Bρ(A0) × T², once ε has been fixed to some value (small enough), corresponds to invariant tori (run with some rotation vector) persisting under perturbation for that value of ε: the answer is that, if the condition (1.11) of Theorem 1.6 is replaced with a non-degeneracy condition like
\[
\inf_{A_1 \in D} \left| \int_{\mathbb{T}^2} d\alpha \, \partial_{A_1}^2 f^{(1)}(\alpha, A_1) \right| > 0, \tag{1.18}
\]
then the set of initial data (A, α) ∈ Bρ(A0) × T² for trajectories which lie on invariant tori persisting under perturbation forms a set of relative measure tending to 1 as ε → 0. This can be proved with standard arguments of KAM theory (see for instance [5] and [15]): it could also be studied by using the same techniques introduced in the present paper; we shall briefly (and informally) discuss such an aspect in Sec. 6 below.

1.11. Comments about the proof of the theorem. In studying the functions h, H, η in which ε and µ0 are seen as independent parameters, neither the special form of the interaction nor the fact that the dimension is d = 2 plays any rôle. So Lemma 2.2 below can be extended (essentially with no change) to any perturbation of Hamiltonian isochronous systems in any dimension. The notations we shall use will make such an extension trivial: simply interpret the vectors as vectors in R^d, and note (while reading the proof) that the special form of the interaction is not really used; for this reason we shall write f(α, A) = f(α, A1), even if in our case the perturbation does not depend explicitly on A2; see also Remark 3.5.
On the contrary, in order to solve the compatibility condition (1.15), the discussion in Sec. 2 applies only to the Hamiltonians of the special form (1.2). The methods extend to the general situation, but some further arguments become necessary; in Sec. 6 we briefly discuss how such an extension can be carried out.

1.12. Contents of the paper. In Sec. 2 we state the main technical result of the paper (Lemma 2.2), which deals with the modified system (1.16), and we show how it implies Theorems 1.4 and 1.6. The immediately following sections are devoted to the proof of Lemma 2.2: in Sec. 3 we introduce the tree formalism which will be used in Sec. 4 to prove that Eqs. (1.16) are formally soluble, for a suitable choice of the counterterm, and in Sec. 5 to prove that the formal series defining the conjugating functions and the counterterm converge: in particular this will imply that all quantities are analytic in the perturbative parameter ε (the more technical aspects of the proofs will be relegated to the Appendices). Finally in Sec. 6 we discuss possible extensions and generalizations of the results, particularly those concerning the problem of studying the measure of the persisting invariant tori in phase space. For the proof of Lemma 2.2 we could have relied on the existing literature, such as [3] and [11], and simply outlined the main differences with respect to it. However we preferred to give the proof in full detail, both because the paper is meant as a review of the techniques (hence a self-contained discussion makes it more readable) and because in this way we take the opportunity to clarify the graphic construction through a number of examples and pictures; furthermore some technical improvements are presented with respect to the quoted papers.
2. Persistence of Invariant Tori

2.1. The modified model. As outlined in Sec. 1.8, for the moment we study Eqs. (1.16), where µ0 is fixed, and η(A0, ε, µ0) is a function to be determined. Of course this is not the original model, so at the end we shall have to show that the results we find can be fruitfully used in order to draw conclusions also for the model in which the constraint (1.15) is taken into account. We shall prove the following result for the modified model, given by (1.16) without the constraint (1.15).

2.2. Lemma. Fix Ā = (Ā1, Ā2), with Ā1 ∈ D, and ρ > 0 such that Bρ(Ā1) ⊂ D. Given the equations of motion
\[
\dot\alpha_1 = \mu_0 + \partial_{A_1} f, \qquad \dot\alpha_2 = 1, \qquad \dot A_1 = -\partial_{\alpha_1} f, \qquad \dot A_2 = -\partial_{\alpha_2} f, \tag{2.1}
\]
with ω0 = (µ0, 1) satisfying the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.2}
\]
with Diophantine constants C0 > 0 and τ > 1, there exist three constants ε0 ∈ (0, ε1), ρ0 ∈ (0, ρ) and κ0 ∈ (0, κ), two functions h(ψ, A, ε, µ0) and H(ψ, A, ε, µ0), analytic in (ψ, A, ε) for ψ ∈ Σκ0, A ∈ Aρ0(A01) and |ε| < ε0 and with vanishing ψ-average on T², and a unique function η(A, ε, µ0), analytic in ε for |ε| < ε0 and in A for |A1 − Ā1| < ρ0, such that all three functions are vanishing for ε = 0, and, for all A0 ∈ Aρ0(Ā1) and |ε| < ε0,
\[
\begin{cases}
\alpha(t) = \omega_0 t + h(\omega_0 t, A_0, \varepsilon, \mu_0), \\
A(t) = A_0 + H(\omega_0 t, A_0, \varepsilon, \mu_0)
\end{cases} \tag{2.3}
\]
is a solution of
\[
\dot\alpha_1 = \mu_0 + \partial_{A_1} f + \eta(A_0, \varepsilon, \mu_0), \qquad \dot\alpha_2 = 1, \qquad \dot A_1 = -\partial_{\alpha_1} f, \qquad \dot A_2 = -\partial_{\alpha_2} f. \tag{2.4}
\]
Moreover one has ε0 = min{E0 C0, ε1} for some constant E0 independent of µ0.

2.3. Proof of Theorem 1.4. The proof of Lemma 2.2 will be performed in the following Secs. 3–5 (and in the Appendices). Now we come back to the original problem (1.5), and we show how the above lemma can be used in order to prove Theorem 1.4. Lemma 2.2 shows that, for ω0 satisfying the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.5}
\]
the series defining the conjugating functions and the counterterm converge and have a radius of convergence in ε bounded from below by ε0 = E0 C0, for some constant E0 (assume for simplicity E0 C0 ≤ ε1). Now we want to use such a property to prove the following result, which immediately yields Theorem 1.4.

2.4. Proposition. Let η(A0, ε, µ0) be given by Lemma 2.2, such that the functions (1.17) solve (1.16), and suppose (see (1.7)) that
\[
\partial_{A_1} f_0^{(1)}(A_0) \equiv \int_{\mathbb{T}^2} d\alpha \, \partial_{A_1} f^{(1)}(\alpha, A_{01}) \neq 0. \tag{2.6}
\]
Then given µ ∈ (0, 1) such that ω = (µ, 1) satisfies the Diophantine condition
\[
|\omega \cdot \nu| > C |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.7}
\]
with Diophantine constants C > 0 and τ > 1, there exists a > 0 such that it is possible to fix µ0 ∈ B_{aC}(µ) such that ω0 = (µ0, 1) satisfies the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.8}
\]
with C0 = bC, for some positive constant b, and to fix ε ≡ ε(µ0, A0), with |ε| < ε0, such that
\[
\mu = \mu_0 + \eta(A_0, \varepsilon, \mu_0) \tag{2.9}
\]
holds for ε = ε(µ0, A0).

2.5. Continued fractions and approximants. We shall prove Proposition 2.4 through a series of (elementary) lemmata. We need some preliminary notations. Given µ ∈ (0, 1) denote by [a0, a1, a2, ...] its continued fraction expansion and by {pk/qk} its best approximants. Then if ω = (µ, 1) and νk = (qk, pk) one has [17]
\[
\frac{1}{q_{k+1}} > |\omega \cdot \nu_k| > \frac{1}{2 q_{k+1}}, \tag{2.10}
\]
and
\[
|\omega \cdot \nu| > |\omega \cdot \nu_k| \qquad \forall\, \nu = (q, p) \text{ such that } q_k < q < q_{k+1}. \tag{2.11}
\]
Note also that
\[
q_k < |\nu_k| < 2 q_k, \tag{2.12}
\]
for all k ∈ N.

2.6. Lemma. If ω = (µ, 1) satisfies the Diophantine condition
\[
|\omega \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.13}
\]
then
\[
\frac{q_{k+1}}{q_k^{\tau}} < \frac{2^{\tau}}{C_0}, \tag{2.14}
\]
for any k ∈ N.

2.7. Proof of Lemma 2.6. For any k one has by (2.10) and (2.13)
\[
\frac{1}{q_{k+1}} > |\omega \cdot \nu_k| > C_0 |\nu_k|^{-\tau}, \tag{2.15}
\]
so that by (2.12)
\[
(2 q_k)^{\tau} > |\nu_k|^{\tau} > C_0 \, q_{k+1}, \tag{2.16}
\]
and the assertion follows.
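The bounds (2.10) can be checked numerically. The following is a minimal sketch, not part of the original argument: it uses exact rational arithmetic, takes as µ a long finite truncation of the continued fraction [0, 1, 1, 1, ...] (an assumption made only so that the arithmetic is exact), and writes the small divisor as |µ q_k − p_k| with p_k > 0, which agrees with |ω · ν_k| up to the sign convention chosen for p_k. The function names are hypothetical.

```python
from fractions import Fraction


def convergents(quotients):
    """Return the convergents (p_k, q_k) of the continued fraction [a0, a1, ...]."""
    p_prev, q_prev = 1, 0            # (p_{-1}, q_{-1})
    p, q = quotients[0], 1           # (p_0, q_0)
    out = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out


def cf_value(quotients):
    """Exact value of the finite continued fraction [a0, a1, ..., an]."""
    x = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        x = a + 1 / x
    return x


# Assumption: mu is a depth-60 truncation of [0, 1, 1, 1, ...].
quotients = [0] + [1] * 60
mu = cf_value(quotients)
conv = convergents(quotients)

# Check 1/q_{k+1} > |mu q_k - p_k| > 1/(2 q_{k+1}) for the first 30 approximants.
bounds_ok = []
for k in range(30):
    p_k, q_k = conv[k]
    q_next = conv[k + 1][1]
    small_divisor = abs(mu * q_k - p_k)
    bounds_ok.append(
        Fraction(1, q_next) > small_divisor > Fraction(1, 2 * q_next)
    )

print(all(bounds_ok))
```

Only approximants well below the truncation depth are tested, since the last convergent of a finite expansion reproduces µ exactly.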
2.8. Lemma. If one has
\[
\frac{q_{k+1}}{q_k^{\tau}} < \frac{1}{2 C_0}, \tag{2.17}
\]
for any k ∈ N, then ω = (µ, 1) satisfies the Diophantine condition (2.13).

2.9. Proof of Lemma 2.8. For ν = νk one has by (2.10), (2.12) and (2.17)
\[
|\omega \cdot \nu_k| > \frac{1}{2 q_{k+1}} > \frac{C_0}{q_k^{\tau}} > \frac{C_0}{|\nu_k|^{\tau}}, \tag{2.18}
\]
while for ν ≠ νk one can reason as follows. If ν = (qk, p), with p ≠ pk, then |ω · ν| > 1/2 and (2.13) is trivially satisfied. If ν = (q, p), with q_{k−1} < q < q_k, then by (2.10) and (2.11)
\[
|\omega \cdot \nu| > |\omega \cdot \nu_{k-1}| > \frac{1}{2 q_k} > \frac{C_0}{q_{k-1}^{\tau}} > \frac{C_0}{q^{\tau}} > \frac{C_0}{|\nu|^{\tau}}, \tag{2.19}
\]
so that (2.13) follows.

2.10. Lemma. Given a rotation vector ω = (µ, 1), with µ ∈ (0, 1), satisfying the Diophantine condition
\[
|\omega \cdot \nu| > C |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.20}
\]
and fixed any interval I ⊂ R with center in µ, there exist infinitely many µ0 ∈ I ∩ (0, 1) such that ω0 = (µ0, 1) satisfies the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.21}
\]
with C0 = bC, for some positive constant b.

2.11. Proof of Lemma 2.10. Given µ = [a0, a1, a2, ...] and any interval I with center in µ define µ0 = [a′0, a′1, a′2, ...] in the following way:
\[
\begin{cases}
a'_k = a_k, & \text{if } k \le k_0, \\
a'_k \le N, & \text{if } k > k_0,
\end{cases} \tag{2.22}
\]
where k0 is so large that µ0 ∈ I and N is an integer. Then q′k = qk for all k ≤ k0, so that
\[
\frac{q'_{k+1}}{q_k'^{\,\tau}} < \frac{2^{\tau}}{C}, \tag{2.23}
\]
by Lemma 2.6, while
\[
q'_{k+1} = a'_k q'_k + q'_{k-1} \le N q'_k + q'_{k-1} \le 2N q'_k < 2N q_k'^{\,\tau} \tag{2.24}
\]
for all k > k0. Then, by Lemma 2.8, ω0 = (µ0, 1) satisfies the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \tag{2.25}
\]
with C0 = min{C/2^{τ+1}, 1/(4N)}. If we set a′k ∈ {1, ..., N} for all k > k0, we obtain an infinite set of values satisfying the Diophantine condition (2.25). So, as one has C ≤ µ + 1 ≤ 2, the lemma is proved, with b = min{1/2^{τ+1}, 1/(16N)}.

2.12. Proof of Proposition 2.4. Lemma 2.2 shows that the function η(A, ε, µ0) is analytic in ε, with radius of convergence ε0 = E0 C0, so that one has $\eta(A_0, \varepsilon, \mu_0) = \varepsilon \eta^{(1)}(A_0, \mu_0) + O(\varepsilon^2)$, for |ε| < ε0 small enough. The condition (1.7) assures that one has $\eta^{(1)}(A_0, \mu_0) = -\partial_{A_1} f_0^{(1)}(A_0) \neq 0$ (see Remark 4.4): then there exists a positive constant η1 such that $|\eta^{(1)}(A_0, \mu_0)| > \eta_1$, for any µ0 ∈ B_{aC}(µ); in fact $\eta^{(1)}(A_0, \mu_0) \equiv \eta^{(1)}(A_0)$ does not depend on µ0. Therefore, varying ε in (−ε0, ε0), with ε0 = E0 C0 = E0 bC, the function η(A0, ε, µ0) covers an interval B_{dC}(0), for some positive constant d (depending on b). As |µ − µ0| is bounded by aC, one can choose a < d so that by moving ε in (−ε0, ε0) there is at least one value ε = ε(µ0, A0) such that η(A0, ε(µ0, A0), µ0) = µ − µ0.
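The construction used in the proof of Lemma 2.10 can be illustrated as follows. This is a sketch under explicit assumptions: the partial quotients chosen for µ, the cut-off k0, the bound N and the tail length are all hypothetical, and the tails are truncated to finite length so that the arithmetic is exact. Finite truncations are rational, hence they only illustrate how the candidates cluster around µ; the Diophantine property itself requires infinite bounded tails, as in the lemma.

```python
from fractions import Fraction
from itertools import product


def cf_value(quotients):
    """Exact value of the finite continued fraction [a0, a1, ..., an]."""
    x = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        x = a + 1 / x
    return x


def cf_denominator(quotients):
    """Denominator q_k of the last convergent of [a0, a1, ..., ak]."""
    q_prev, q = 0, 1
    for a in quotients[1:]:
        q, q_prev = a * q + q_prev, q
    return q


# Hypothetical data: a finite stand-in for the expansion of mu.
mu_quotients = [0, 2, 1, 3, 1, 1, 4, 1, 2, 1, 1, 1, 5, 1, 1, 2, 1, 1, 1, 1]
mu = cf_value(mu_quotients)

k0 = 6          # keep a_0, ..., a_{k0} unchanged
N = 2           # later partial quotients are taken in {1, ..., N}
tail_len = 4    # finite tail length (illustration only)

head = mu_quotients[: k0 + 1]
q_k0 = cf_denominator(head)

# Every choice of bounded tail gives a different candidate mu_0.
candidates = [
    cf_value(head + list(tail))
    for tail in product(range(1, N + 1), repeat=tail_len)
]

# Numbers sharing the first k0 + 1 partial quotients differ by at most 1 / q_{k0}^2.
max_distance = max(abs(c - mu) for c in candidates)
print(len(set(candidates)), float(max_distance), float(Fraction(1, q_k0**2)))
```

Increasing k0 shrinks the interval around µ in which the candidates fall, which is how k0 is chosen "so large that µ0 ∈ I" in the proof.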
2.13. Remark. The condition $\partial_{A_1} f_0^{(1)} \neq 0$ imposed on f is not really necessary; it is simply aimed at assuring that the counterterm is not identically vanishing. In fact under such a weaker condition, if $\eta^{(k_0)}$ is the first nonvanishing coefficient in the expansion (3.1) for η(A0, ε, µ0), then, at worst, when ε is varied in (−ε0, ε0), the counterterm η(A0, ε, µ0) covers an interval of width at least $2dC^{k_0}$, for some d > 0, so that a result analogous to Proposition 2.4 follows, provided the width of the interval I is chosen as $|I| = 2aC^{k_0}$, for some a < d. Note that the condition that the counterterm is not identically vanishing to first order amounts to a genericity condition on the perturbation f.

2.14. Proof of Theorem 1.6. Now suppose that, instead of the condition (1.7), one requires
\[
M \equiv \partial_{A_1}^2 f_0^{(1)}(\bar A) \equiv \int_{\mathbb{T}^2} d\alpha \, \partial_{A_1}^2 f^{(1)}(\alpha, \bar A_1) \neq 0. \tag{2.26}
\]
Then, instead of fixing the action variables and moving ε until the compatibility condition (1.15) is satisfied (as in Sec. 2.12), we can fix ε small enough (say smaller than a value ε̄ to be determined) and slightly change Ā1 into a nearby value A01 such that one still has
\[
\mu_0 + \eta(A_0, \varepsilon, \mu_0) = \mu, \qquad A_0 = (A_{01}, \bar A_2), \tag{2.27}
\]
for some µ0 such that ω0 = (µ0, 1) satisfies the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_0 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \qquad \omega_0 = (\mu_0, 1), \tag{2.28}
\]
with C0 = bC, for some constant b ∈ (0, 1). This can be easily seen by reasoning as follows. Henceforth we denote A = (A1, Ā2), where Ā2 is fixed once and for all. Fix Ā1 ∈ D and ρ such that Bρ(Ā1) ⊂ D; let us fix 0 < δ < ρ0, where ρ0 is given by Lemma 2.2, such that for all A ∈ Aδ(Ā1) one has $|\partial_{A_1}^2 f_0^{(1)}(A)| > M/2$: therefore for A varying in Aδ/2(Ā1) the quantity $\varepsilon \partial_{A_1} f_0^{(1)}(A)$ covers an interval J(ε) such that |J(ε)| = O(ε); more precisely one has |J(ε)| > M ε δ/4.
Let ε̄ ≤ ε0 be such that for any |ε| < ε̄ the interval µ + J(ε) contains at least one value µ0 such that ω0 = (µ0, 1) verifies the Diophantine condition
\[
|\omega_0 \cdot \nu| > C_1 |\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^2 \setminus \{0\}, \qquad \omega_0 = (\mu_0, 1), \tag{2.29}
\]
with C1 = b1 C, for some constant b1 ∈ (0, 1); by reasoning as in the proof of Lemma 2.10 it is easy to realize that this is possible. By applying once more Lemma 2.10 we can conclude that for all a′ > 0 the interval B_{a′C}(µ0) contains values µ0 such that ω0 = (µ0, 1) satisfies (2.28), for some constant b such that C0 = b2 C1 = b2 b1 C ≡ bC: in particular this yields that J(ε) contains infinitely many such µ0. Fix ε such that |ε| ≤ ε̄ and choose A′0 ∈ Aδ/2(Ā1) such that
\[
\mu_0 = \mu - \varepsilon \eta^{(1)}(A'_0), \qquad \eta^{(1)}(A'_0) = -\partial_{A_1} f^{(1)}(A'_0); \tag{2.30}
\]
we can suppose that A′0 is such that $\eta^{(1)}(A'_0) \neq 0$ (if not simply choose a nearby value µ0 ≠ 0 and use the property (2.26)). Then write, for any A ∈ Aδ(Ā1),
\[
\eta(A, \varepsilon, \mu_0) = \varepsilon\, \eta^{(1)}(A) + \varepsilon^2 \xi(A, \varepsilon, \mu_0), \qquad \sup_{A \in A_\delta(\bar A_1)} |\xi(A, \varepsilon, \mu_0)| \le \Xi, \tag{2.31}
\]
with Ξ a suitable constant: this follows again from Lemma 2.2, by taking into account that δ < ρ0. Then define A0 implicitly as the solution of the equation
\[
\mu = \mu_0 - \varepsilon \partial_{A_1} f^{(1)}(A'_0) = \mu_0 - \varepsilon \partial_{A_1} f^{(1)}(A_0) + \varepsilon^2 \xi(A_0, \varepsilon, \mu_0) \equiv \mu_0 + \eta(A_0, \varepsilon, \mu_0): \tag{2.32}
\]
if such a solution exists with A01 in Bρ0(Ā1), then we have proved (2.27). The solution A0 of (2.32) can be found as a simple consequence of the implicit function theorem. In fact the function
\[
F(A_{01}, \varepsilon) \equiv \partial_{A_1} f^{(1)}(A'_0) - \partial_{A_1} f^{(1)}(A_0) + \varepsilon\, \xi(A_0, \varepsilon, \mu_0) \tag{2.33}
\]
is analytic both in A01 and in ε (for A01 ∈ Bρ0(Ā1) and ε ∈ Bε0(0)). As
\[
F(A'_{01}, 0) = 0, \qquad \partial_{A_{01}} F(A'_{01}, 0) = -\partial_{A_1}^2 f^{(1)}(A'_0) \neq 0, \tag{2.34}
\]
there exists a value A01 = A01(ε) such that
\[
F(A_{01}(\varepsilon), \varepsilon) = 0; \tag{2.35}
\]
moreover one has
\[
|\partial_{A_1} f^{(1)}(A_0) - \partial_{A_1} f^{(1)}(A'_0)| > \frac{M}{2} |A_0 - A'_0|, \tag{2.36}
\]
so that there exists A01 ∈ Bδ(Ā1) ⊂ Bρ0(Ā1), provided that ε̄ is so small that one has 4Ξ ε̄ < M δ, and the assertion is proved.

2.15. Remark. As already noted in Remark 1.7 (1), the condition (1.11) is not really necessary in order to prove Theorem 1.6: what one has to require is that the counterterm is not constant in the action variable A1, for A1 ∈ D.
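3. Perturbation Theory

3.1. Lindstedt series. In the following we assume that A0 and µ0 are fixed once and for all, and we shall not write explicitly the dependence on A0 and µ0: so we shall write η(A0, ε, µ0) = η(ε), and so on. We look for a solution of the form (2.3), with
\[
h(\psi, \varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}^2} e^{i\nu\cdot\psi} h_\nu^{(k)}, \qquad
H(\psi, \varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}^2} e^{i\nu\cdot\psi} H_\nu^{(k)}, \qquad
\eta(\varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k \eta^{(k)}. \tag{3.1}
\]
The formal series (3.1) are called the Lindstedt series. Note that writing h = (h1, h2) one has h2 = 0 identically as α2(t) = t for any ε. More generally, for any function F = F(ψ, A0, ε) analytic in its arguments and 2π-periodic in ψ, we denote by $[F]_\nu^{(k)}$ the coefficient $F_\nu^{(k)}$ with Fourier label ν and Taylor label k in its expansion
\[
F(\psi, A_0, \varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}^2} e^{i\nu\cdot\psi} F_\nu^{(k)}(A_0) \equiv \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}^2} e^{i\nu\cdot\psi} F_\nu^{(k)}. \tag{3.2}
\]
If we put (2.3) into (2.4) by using the expansion (3.1) we obtain, for ν ≠ 0,
\[
h_\nu^{(k)} = g(\omega_0\cdot\nu)\, [\partial_A f]_\nu^{(k)}, \qquad
H_\nu^{(k)} = -g(\omega_0\cdot\nu)\, [\partial_\alpha f]_\nu^{(k)}, \tag{3.3}
\]
with
\[
g(\omega_0\cdot\nu) = \frac{1}{i\,\omega_0\cdot\nu}, \tag{3.4}
\]
provided that, for ν = 0, one has
\[
\eta^{(k)} + [\partial_{A_1} f]_0^{(k)} = 0, \qquad [\partial_\alpha f]_0^{(k)} = 0. \tag{3.5}
\]
We can write (3.3) as
\[
\begin{aligned}
h_\nu^{(k)} &= g(\omega_0\cdot\nu) \sum\nolimits^{*} \frac{1}{p!}\frac{1}{q!}
\Big( \prod_{p'=1}^{p} i\nu_0 \cdot h_{\nu_{p'}}^{(k_{p'})} \Big)
\Big( \prod_{q'=p+1}^{p+q} H_{\nu_{q'}}^{(k_{q'})} \cdot \partial_A \Big)\,
\partial_A f_{\nu_0}^{(k_0)}(A_0), \\
H_\nu^{(k)} &= -g(\omega_0\cdot\nu) \sum\nolimits^{*} \frac{1}{p!}\frac{1}{q!}
\Big( \prod_{p'=1}^{p} i\nu_0 \cdot h_{\nu_{p'}}^{(k_{p'})} \Big)
\Big( \prod_{q'=p+1}^{p+q} H_{\nu_{q'}}^{(k_{q'})} \cdot \partial_A \Big)\,
(i\nu_0)\, f_{\nu_0}^{(k_0)}(A_0),
\end{aligned} \tag{3.6}
\]
where $\sum^{*}$ is a shorthand for
\[
\sum\nolimits^{*} = \sum_{k_0=1}^{\infty} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty}
\sum_{\substack{k_1 \ge 1, \ldots, k_{p+q} \ge 1 \\ k_0 + k_1 + \cdots + k_{p+q} = k}}
\;\sum_{\substack{\nu_1 \in \mathbb{Z}^2, \ldots, \nu_{p+q} \in \mathbb{Z}^2 \\ \nu_0 + \nu_1 + \cdots + \nu_{p+q} = \nu}} ; \tag{3.7}
\]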
for k = 1 the formulae in (3.6) have to be interpreted in the appropriate (obvious) way. Note in (3.6) that $H_\nu^{(k)}$ is given by a sum of contributions which have always (at least) one derivative with respect to α, whereas $h_\nu^{(k)}$ is given by the sum of contributions which have always (at least) one derivative with respect to A. Then we can introduce the following notation: $H_\nu^{(k)}$ is given by a sum of terms which are of the form H ← h, where H denotes that they contribute to $H_\nu^{(k)}$ and h that there is always a derivative with respect to the angle variables; in the same way, $h_\nu^{(k)}$ is given by a sum of terms which are of the form h ← H, where h denotes that they contribute to $h_\nu^{(k)}$ and H that there is always a derivative with respect to the action variables. So we have two kinds of problems: first to show the formal solubility of Eqs. (2.4), i.e. to show that to each perturbative order (3.5) are satisfied so that no division by zero is performed; then to show that the formal series (3.1) defining h, H and η converge.

3.2. Notations. We introduce some notations which will be used in the following. Given a set of elements S we denote by |S| the number of elements in S. Recall that ∂x denotes the partial derivative with respect to x. If a function F depends only on one argument, F = F(x), then we shall sometimes write ∂F(x) for ∂x F(x), as no ambiguity can arise in such a case. Given a vector v = (v1, v2) ∈ R² we set |v| = |v1| + |v2|. If v ∈ R² then for any p ∈ N the quantity $v^p$ denotes the tensor with entries $v_{i_1} \cdots v_{i_p}$, where $i_j \in \{1, 2\}$ for all j = 1, ..., p. Likewise $\partial_A^q$ will denote the tensor with entries $\partial_{A_{i_1}} \cdots \partial_{A_{i_q}}$, where $i_j \in \{1, 2\}$ for all j = 1, ..., q.

3.3. Tree expansion. An unlabeled tree θ is a partially ordered set of points and lines connecting the points. The partial ordering relation between the nodes is from right to left and it is denoted by ≼.

The leftmost point r is called the root of the tree; all the other points are called nodes and are denoted by v. The lines are denoted by ℓ; they carry an arrow oriented towards the root. If a line ℓ connects a node v2 to a node v1 ≻ v2, we shall say that the line is attached to the nodes v1 and v2, and write ℓ = ℓ_{v2} and v2′ = v1: we say also that the line enters v1 and exits from v2, and that v1 is the node immediately following v2. The line ℓ entering the root is the root line. The node v0 which it exits from is called the last node of the tree: one has ℓ0 = ℓ_{v0}. See Fig. 1. We shall call V(θ) the set of nodes in θ and Λ(θ) the set of lines in θ; one has |V(θ)| = |Λ(θ)|.
Fig. 1. An unlabeled tree θ with 18 nodes.
Two trees are said to be equivalent if they are obtained from each other by continuously deforming the lines in such a way that the latter do not cross each other: in the following we shall always identify equivalent trees. Given a tree θ and any node v ≺ v0, the set θ′ of nodes w ≼ v and of lines connecting them forms with the line ℓv a tree with root v′ and root line ℓv: we say that θ′ is a subtree of θ. We denote by θ \ θ′ the set of nodes in V(θ) \ V(θ′) and of lines connecting them; we shall write also V(θ) \ V(θ′) = V(θ \ θ′). We shall write the perturbation f as f(α, A1, ε) = f(α, A, ε), even if the dependence is only through the first component A1 of A; see Remark 3.5. Then f can be expanded as
\[
f(\alpha, A, \varepsilon) = \sum_{k=1}^{\infty} \varepsilon^k f^{(k)}(\alpha, A) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}^2} e^{i\nu\cdot\alpha} f_\nu^{(k)}(A); \tag{3.8}
\]
note that, by the analyticity assumptions on f, one has
\[
|f_\nu^{(k)}(A)| < F_1' F_2'^{\,k} e^{-\kappa|\nu|}, \qquad
F_1' = \max_{\substack{|\varepsilon| = \varepsilon' \\ |A_1 - A_{01}| = \rho'}} |f(\alpha, A, \varepsilon)|, \qquad
F_2' = \varepsilon'^{-1}, \tag{3.9}
\]
for any 0 < ε′ < ε1 and 0 < ρ′ < ρ. Then to each node v we associate a mode label νv ∈ Z² and an order label kv ∈ N. We define the order k of a tree as the sum of the values of the order labels of the nodes:
\[
k = \sum_{v \in V(\theta)} k_v. \tag{3.10}
\]
Note that if kv = 1 for all v ∈ V(θ) then k = |V(θ)|. Define the momentum flowing through a line ℓv as
\[
\nu_{\ell_v} = \sum_{w \preceq v} \nu_w. \tag{3.11}
\]
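The definitions (3.10) and (3.11) translate directly into a recursive computation on a tree. The following is a minimal sketch; the class `Node` and the sample labels are hypothetical and serve only to illustrate the bookkeeping. Each node stores its mode label, its order label and the list of nodes immediately preceding it (those whose exiting line enters it).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    mode: tuple                                   # mode label nu_v in Z^2
    order: int = 1                                # order label k_v >= 1
    children: list = field(default_factory=list)  # nodes w with w' = v


def tree_order(v):
    """Order of the (sub)tree with last node v: sum of k_w over all w <= v, as in (3.10)."""
    return v.order + sum(tree_order(w) for w in v.children)


def line_momentum(v):
    """Momentum flowing through the line exiting v: sum of nu_w over all w <= v, as in (3.11)."""
    total = [v.mode[0], v.mode[1]]
    for w in v.children:
        m = line_momentum(w)
        total[0] += m[0]
        total[1] += m[1]
    return tuple(total)


# Hypothetical example: a tree with four nodes.
inner = Node(mode=(-1, 2), order=1, children=[Node(mode=(3, -1))])
v0 = Node(mode=(1, 0), order=1, children=[Node(mode=(0, 1), order=2), inner])

print(tree_order(v0))       # order k of the tree
print(line_momentum(v0))    # momentum through the root line
print(line_momentum(inner)) # momentum through the line exiting the inner node
```

Since every line is the root line of the subtree preceding it, the same function gives the momentum of any line of the tree.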
To each node v we associate a node factor Fv, which is a function of νv, while to each line ℓ we associate a propagator Gℓ, which is a function of ω0 · νℓ. We distinguish between two kinds of lines, lines h ← H and lines H ← h, and we assign to them and to the nodes they exit from different values for Gℓ and Fv, if ℓ = ℓv, in the following way. We associate to each node v three non-negative integer labels pv, qv and mv with the constraint pv + qv = mv: mv is the number of lines entering v, while the labels pv and qv denote, respectively, the number of lines h ← H and H ← h entering it. Given a node v let us denote by ℓ = ℓv the line exiting from it. Then the node factor Fv and the propagator Gℓ are defined as
\[
\begin{array}{c|c|c}
\ell & h \leftarrow H & H \leftarrow h \\ \hline
F_v & \dfrac{1}{p_v!}\dfrac{1}{q_v!} (i\nu_v)^{p_v}\, \partial_A^{q_v+1} f_{\nu_v}^{(k_v)}(A_0) & \dfrac{1}{p_v!}\dfrac{1}{q_v!} (i\nu_v)^{p_v+1}\, \partial_A^{q_v} f_{\nu_v}^{(k_v)}(A_0) \\[2mm]
G_\ell & g(\omega_0\cdot\nu_\ell) & -g(\omega_0\cdot\nu_\ell)
\end{array} \tag{3.12}
\]
Note that for f(α, A, ε) = f(α, A1, ε) the only nonvanishing entry of the tensor $\partial_A^q f_\nu^{(k)}(A_0)$ is $\partial_{A_1}^q f_\nu^{(k)}(A_0)$. If we introduce a label δv such that δv = 1 if ℓv is a line h ← H and δv = 0 if ℓv is a line H ← h, then Fv can be written as
\[
F_v = \frac{1}{p_v!}\frac{1}{q_v!} (i\nu_v)^{p_v + (1-\delta_v)}\, \partial_A^{q_v+\delta_v} f_{\nu_v}^{(k_v)}(A_0) \tag{3.13}
\]
in both cases. Note that one has
\[
\frac{1}{q!} |\partial_A^q f_\nu^{(k)}(A)| < F_1 F_2^k e^{-\kappa|\nu|} F_3^q, \qquad
F_1 = \max_{\substack{|\varepsilon| = \varepsilon_0 \\ |A_1 - A_{01}| = \rho_0}} |f(\alpha, A, \varepsilon)|, \qquad
F_2 = \varepsilon_0^{-1}, \qquad F_3 = \rho_0^{-1}, \tag{3.14}
\]
by (3.9) and by the assumptions on the dependence on A. We have that X ∈ {h, H} can be written as
\[
X_\nu^{(k)} = \sum_{\theta \in T_{k,\nu}(X)} \mathrm{Val}(\theta), \qquad
\mathrm{Val}(\theta) = \Big( \prod_{v \in V(\theta)} F_v \Big) \Big( \prod_{\ell \in \Lambda(\theta)} G_\ell \Big), \tag{3.15}
\]
where $T_{k,\nu}(X)$ is the set of all labeled trees of order k with momentum ν flowing through the root line and such that if X = h then the root line is a line h ← H, while if X = H then the root line is a line H ← h. The proof of (3.15) can be performed by induction on the order k, by using (3.3) and expanding the functions in the square brackets. Define also
\[
\mathrm{Val}'(\theta) = \Big( \prod_{v \in V(\theta)} F_v \Big) \Big( \prod_{\ell \in \Lambda(\theta) \setminus \ell_0} G_\ell \Big), \tag{3.16}
\]
where ℓ0 is the root line. If we introduce the vector $\boldsymbol\eta = (\eta, 0)$, so that $\boldsymbol\eta^{(k)} = (\eta^{(k)}, 0)$, then also $\boldsymbol\eta$ admits a representation
\[
\boldsymbol\eta^{(k)} = -\sum_{\theta \in T_{k,0}(h)} \mathrm{Val}'(\theta), \tag{3.17}
\]
where $T_{k,0}(h)$ means that the trees over which the sum is performed have the constraint that the root line is a line h ← H; see (3.5). We denote also by $T_{k,\nu}$ the set of all labeled trees of order k and with momentum ν flowing through the root line, with no condition on the kind of root line.

3.4. Formal solubility and solubility. We shall prove first of all that the coefficients defining the formal series (3.1) are finite to each perturbative order, i.e. that one has $|h_\nu^{(k)}| < \infty$ for all k ∈ N and for all ν ∈ Z². Then we shall deal with the problem of proving the convergence of the series: this will require a more careful analysis of the tree values, inspired by the renormalization group approach in quantum field theory.

3.5. Remark. The tree representation given in this section can be carried out, essentially unchanged, for any analytic perturbation of isochronous systems of any dimension d. This explains why we used the notation (3.8) for the perturbation and the vectorial notation (3.17) for the counterterms: of course in general all the components of the counterterms are nonvanishing.

4. Proof of the Formal Solubility of the Equations of Motion

4.1. Formal solubility. To show that there exists a formal solution (2.3) of the equations of motion (2.4) we have to show that for any θ no division by zero occurs in Val(θ) and in Val′(θ). Recall that, given any tree θ, any line ℓ ∈ Λ(θ) can be considered as the root line of the subtree formed by the nodes and lines preceding ℓ. So we have to show that the sum of all trees contributing to $[\partial_\alpha f]_0^{(k)}$ is vanishing, provided that $\eta^{(k)}$ is chosen in such a way that $\eta^{(k)} + [\partial_A f]_0^{(k)} = 0$. Then the formal solubility follows from the following result.

4.2. Lemma. There is a unique choice for the coefficients $\eta^{(k)}$ such that, for all k ≥ 1, one has $[\partial_\alpha f]_0^{(k)} = 0$ and $\eta^{(k)} + [\partial_A f]_0^{(k)} = 0$. Such a choice is given by (3.17).

4.3. Proof of Lemma 4.2. The proof can be done by induction.

For k = 1 the assertion is trivially satisfied, as
\[
[\partial_\alpha f]_\nu^{(1)} = i\nu f_\nu^{(1)}(A_0), \tag{4.1}
\]
which is vanishing for ν = 0, while imposing
\[
[\partial_{A_1} f]_0^{(1)} + \eta^{(1)} = 0 \tag{4.2}
\]
fixes $\eta^{(1)}$ as in (3.17).
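To first order the relations (3.3), (4.1) and (4.2) are explicit, and can be evaluated directly for a given perturbation. The following sketch does so for a toy perturbation, f^(1)(α, A1) = A1³/3 + A1² cos(α1 + α2), which is an assumption made purely for illustration and is not taken from the paper; the values of µ0 and A01 are likewise hypothetical. The first-order coefficients are η^(1) = −∂_{A1} f_0^(1)(A0), h_ν^(1) = g(ω0 · ν) ∂_{A1} f_ν^(1)(A0) (first component) and H_ν^(1) = −g(ω0 · ν) iν f_ν^(1)(A0).

```python
import math

# Hypothetical parameters: golden-mean rotation number and a base action.
mu0 = (math.sqrt(5) - 1) / 2
omega0 = (mu0, 1.0)
A01 = 0.5

# Fourier coefficients of the toy f^(1): nu -> (f_nu(A1), d f_nu / d A1).
fourier = {
    (0, 0): (lambda a: a**3 / 3, lambda a: a**2),
    (1, 1): (lambda a: a**2 / 2, lambda a: a),
    (-1, -1): (lambda a: a**2 / 2, lambda a: a),
}


def g(x):
    """Propagator g(x) = 1 / (i x), as in (3.4); only used for x != 0."""
    return 1 / (1j * x)


# Counterterm to first order: it cancels the nu = 0 component of d f / d A1, see (4.2).
eta1 = -fourier[(0, 0)][1](A01)

h1 = {}   # first component of h_nu^(1)
H1 = {}   # both components of H_nu^(1)
for nu, (f_nu, df_nu) in fourier.items():
    if nu == (0, 0):
        continue  # the nu = 0 mode is handled by the counterterm
    x = omega0[0] * nu[0] + omega0[1] * nu[1]   # omega_0 . nu
    h1[nu] = g(x) * df_nu(A01)
    H1[nu] = tuple(-g(x) * (1j * n) * f_nu(A01) for n in nu)

print(eta1)
print(h1[(1, 1)], H1[(1, 1)])
```

The coefficients with labels ν and −ν come out complex conjugate to each other, as they must for real-valued conjugating functions.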
If the assertion holds for all k′ < k then we can show that it holds also for k. By the inductive hypothesis all lines in θ which are not the root line have a nonvanishing momentum (as they are the root lines of subtrees of order strictly less than k), so that Val′(θ) is well defined. Then consider all contributions arising from the trees θ ∈ $T_{k,0}(H)$, hence having as root line a line H ← h: we group together all trees obtained from each other by shifting the root line, i.e. by changing the node which the root line exits and orienting the arrows in such a way that they still point towards the root. We call F(θ) such a class of trees (here θ is any element inside the class). See Fig. 2.

Fig. 2. The family F(θ) = {θ, θ′, θ′′} for a tree θ ∈ T_{3,0}. The labels are not explicitly shown.

The values Val′(θ′) of such trees θ′ ∈ F(θ) differ as (1) there is a factor iνv depending on the node v which the root line is attached to (see the definition (3.12) of Fv for the lines H ← h), and (2) some arrows change their directions. More precisely, when the root line is detached from the node v0 and reattached to the node v, if P(v0, v) = {w ∈ V(θ) : v0 ≽ w ≽ v} denotes the path joining the node v0 to the node v, all the lines h ← H along the path P(v0, v) become lines H ← h and vice versa. As a consequence the signs of the momenta flowing through them and the factorials of the node factors corresponding to the nodes joined by them can change. The change of the signs of the momenta simply follows from the fact that
\[
\sum_{v \in V(\theta)} \nu_v = 0, \tag{4.3}
\]
as θ ∈ $T_{k,0}$: this means that some propagators Gℓ change from g(ω0 · νℓ) to −g(−ω0 · νℓ) (compare the definitions of the propagators for lines h ← H and H ← h in (3.12)), but, by the definition of propagator (3.4), one has g(ω0 · νℓ) = −g(−ω0 · νℓ). The change of the node factors is due to the fact that for the nodes along the path P(v0, v), an entering line can become an exiting line and vice versa, so that the labels pv and qv can be transformed into pv ± 1 and qv ± 1, respectively: this does not modify the factor $(i\nu_v)^{p_v+(1-\delta_v)} \partial_A^{q_v+\delta_v} f_{\nu_v}^{(k_v)}(A_0)$ in (3.13), as one immediately checks, but it can produce a change of the factorials. If we neglect the change of the factorials, i.e. if we assume that all combinatorial factors are the same, by summing over all possible trees inside the class F(θ) we obtain a common value times i times (4.3), and the sum gives zero. One can easily show that a correct counting of the trees implies that all factorials are in fact equal:
simply reason as in [2, Sec. 3], by using topological trees. Therefore the above argument proves the second equation in (3.5). In order to make soluble the equation for h to order k one has to impose that $\eta^{(k)}$ deletes the Fourier component with label ν = 0 arising from $[\partial_{A_1} f]^{(k)}$ (the one arising from $[\partial_{A_2} f]^{(k)}$ is automatically vanishing as f does not depend on A2): this gives the condition (3.17). The summation on the trees can be easily performed, as the summability over the Fourier labels is assured by the Diophantine condition (which is not the optimal condition under which the formal solubility can be proved; see also [2] for the case of the maps on the cylinder), while all the other labels can assume only a finite number of values. So the proof of the lemma is concluded.

4.4. Remark. The assumption (1.7) on the perturbation f implies, by (4.2), that one has $\eta^{(1)} \neq 0$.

5. Proof of Convergence of the Perturbative Expansion

5.1. Bound on the node factors. To prove the solubility of the equations of motion, i.e. the summability of the Lindstedt series (3.1), we shall prove the bounds
\[
|h_\nu^{(k)}| \le B^k e^{-\kappa'|\nu|}, \qquad |H_\nu^{(k)}| \le B^k e^{-\kappa'|\nu|}, \qquad |\eta^{(k)}| \le B^k, \tag{5.1}
\]
for suitable constants B > 0 and κ′ ∈ (0, κ). The sum over the labeled trees in $T_{k,\nu}$ can be written as the sum over all the unlabeled trees and over all the ways to assign the mode and order labels to the nodes of the unlabeled trees with the constraints
\[
\sum_{v \in V(\theta)} k_v = k, \qquad \sum_{v \in V(\theta)} \nu_v = \nu. \tag{5.2}
\]
Define
\[
M(\theta) = \sum_{v \in V(\theta)} |\nu_v|. \tag{5.3}
\]
Of course M(θ) ≥ |ν| for θ ∈ $T_{k,\nu}$. The number of unlabeled trees with V nodes is bounded by $2^{2V}$. We can see that, if it were possible to neglect the propagators in the definition of Val(θ) and Val′(θ), a bound like (5.1) would immediately follow. In fact suppose for the time being that we neglect the propagators, i.e. that we replace Gℓ with 1 in the definition of Val(θ) and Val′(θ) in Sec. 3.3. Then the sum over the labels can be performed as follows: the sum over the mode labels is controlled through
\[
\begin{aligned}
\sum_{\substack{\{\nu_v\}_{v \in V(\theta)} \\ \sum_{v \in V(\theta)} \nu_v = \nu}} \;
\prod_{v \in V(\theta)} \frac{1}{q_v!}\frac{1}{p_v!}\, |\nu_v|^{p_v+(1-\delta_v)}\, |\partial_A^{q_v+\delta_v} f_{\nu_v}^{(k_v)}(A_0)|
&\le e^{-\kappa M(\theta)} F_1^{|V(\theta)|} F_2^{k} F_3^{2k}\, \frac{M(\theta)^{2|V(\theta)|}}{(2|V(\theta)|)!} \\
&\le e^{-\kappa M(\theta)} F_1^{|V(\theta)|} \left( \frac{F_2 F_3^2}{\kappa_1^2} \right)^{k} e^{\kappa_1 M(\theta)},
\end{aligned} \tag{5.4}
\]
February 4, 2002 10:11 WSPC/148-RMP
00112
Lindstedt Series for Perturbations of Isochronous Systems
for any $\kappa_1 \in (0, \kappa)$, where we used that
$$ \sum_{v\in V(\theta)} m_v = |V(\theta)| - 1, \qquad \sum_{v\in V(\theta)} p_v \le |V(\theta)| - 1, \qquad \sum_{v\in V(\theta)} q_v \le |V(\theta)| - 1, \qquad |V(\theta)| \le k, \tag{5.5} $$
while the sum over the unlabeled trees and over the contraction of the indices of the node factors gives a constant to the power $V$, say $F_4^V$. So we are left with the sum over the order labels, which is controlled through
$$ \sum_{V=1}^{k} \; \sum_{k_1+\cdots+k_V=k} 2^{2V} F_1^V F_4^V \le 2^{2k} F_1^k F_4^k \sum_{V=1}^{k} \frac{k^V}{k!} \le (2^2 e F_1 F_4)^k. \tag{5.6} $$
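The counting behind (5.6) can be checked numerically. The following Python sketch is only an illustration, under the assumption (ours) that the order labels $k_v$ are positive integers, so that the inner sum in (5.6) runs over the compositions of $k$ into $V$ positive parts. The function names and the shorthand `x` for $2^2 F_1 F_4$ are hypothetical, and the final bound is tested only for $x \ge 1$, which is what the step $2^{2V}F_1^V F_4^V \le 2^{2k}F_1^k F_4^k$ implicitly requires.

```python
import math
from itertools import product


def count_compositions(k, V):
    """Number of ways to write k = k_1 + ... + k_V with all k_i >= 1."""
    return math.comb(k - 1, V - 1) if 1 <= V <= k else 0


def count_compositions_brute(k, V):
    """Same count by direct enumeration (only sensible for small k, V)."""
    return sum(1 for ks in product(range(1, k + 1), repeat=V) if sum(ks) == k)


def order_label_sum(k, x):
    """Sum over V = 1..k and over compositions of k into V parts of x**V.

    Here x plays the role of 2^2 * F_1 * F_4 (an assumption of this sketch).
    """
    return sum(count_compositions(k, V) * x ** V for V in range(1, k + 1))


if __name__ == "__main__":
    k, x = 5, 4.0
    total = order_label_sum(k, x)
    # For x >= 1 the sum is dominated by (e * x)^k, as in (5.6).
    print(total, (math.e * x) ** k)
```

Since the sum equals $x(1+x)^{k-1}$ in closed form, for $x \ge 1$ it is bounded by $(2x)^k \le (ex)^k$, in agreement with the right-hand side of (5.6).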
Then the bound (5.1) follows with $\kappa' = \kappa - \kappa_1$. So we have to handle the propagators. We shall see that not all propagators can give problems; more exactly only the accumulation of propagators with the same momenta can be a source of problems, as Lemma 5.4 below shows.

5.2. Multi-scale decomposition and clusters. We introduce a partition of unity through characteristic functions
$$ 1 = \sum_{n=-\infty}^{1} \chi_n(\omega_0\cdot\nu), \tag{5.7} $$
where $\chi_n(x)$ has support on $|x| \in [C_0 2^{n-1}, C_0 2^{n})$ for $n \le 0$, while $\chi_1(x)$ has support on $|x| \in [C_0, \infty)$; note that $\chi_n(x) = \chi(2^{-n}x)$ if $\chi(x)$ is the characteristic function of the interval $[C_0/2, C_0)$. For each propagator we write
$$ G_\ell = \sum_{n_\ell=-\infty}^{1} \chi_{n_\ell}(\omega_0\cdot\nu_\ell)\, G_\ell \equiv \sum_{n_\ell=-\infty}^{1} G_\ell^{(n_\ell)}. \tag{5.8} $$
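As a numerical illustration of the decomposition (5.7) and (5.8), the following Python sketch computes, for a given value $x = \omega_0\cdot\nu$, the unique scale $n \le 1$ for which $\chi_n(x) \neq 0$. It is a minimal sketch: the function names are hypothetical, and the default value `C0 = 1.0` is an arbitrary choice made only for the example.

```python
import math


def chi(n, x, C0=1.0):
    """Characteristic function chi_n of (5.7): 1 on the support, 0 elsewhere."""
    if n > 1:
        raise ValueError("scales are labeled by integers n <= 1")
    ax = abs(x)
    if n == 1:
        return 1 if ax >= C0 else 0
    return 1 if C0 * 2.0 ** (n - 1) <= ax < C0 * 2.0 ** n else 0


def scale_label(x, C0=1.0):
    """Return the unique n <= 1 such that chi(n, x) == 1 (x must be nonzero)."""
    ax = abs(x)
    if ax == 0.0:
        raise ValueError("x = 0 does not belong to any scale")
    if ax >= C0:
        return 1
    n = math.floor(math.log2(ax / C0)) + 1
    # Guard against floating-point rounding at the dyadic boundaries.
    while ax >= C0 * 2.0 ** n:
        n += 1
    while ax < C0 * 2.0 ** (n - 1):
        n -= 1
    return n


if __name__ == "__main__":
    for x in (2.0, 0.5, 0.3, -0.01):
        print(x, scale_label(x))
```

For any nonzero `x` exactly one term of the sum over `n` is nonvanishing, which is the content of the remark following (5.9).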
We say that $n_\ell$ is the scale label of the line $\ell$ and $G_\ell^{(n_\ell)}$ is a propagator on scale $n_\ell$: note that, given the momentum $\nu_\ell$ flowing through the line $\ell$, there is only one scale $n$ such that either
$$ C_0 2^{n-1} \le |\omega_0\cdot\nu_\ell| < C_0 2^{n}, \qquad n \le 0, \tag{5.9} $$
or $|\omega_0\cdot\nu_\ell| \ge C_0$, so that, even if (5.8) is written as an infinite series, in fact only one term is really nonvanishing. We shall say that the scale $n$ for which (5.9) holds is the scale compatible with the line $\ell$. Once the scale labels have been assigned to the lines one has a natural decomposition of the tree into clusters. A cluster $T$ on scale $n$ is a maximal set of nodes and lines connecting them such that all the lines have scales $n' \ge n$ and there is at least one line on scale $n$. The $m_T \ge 0$ lines entering the cluster $T$ and the (only one or zero) exiting line are called the external lines of the cluster $T$. Given a cluster $T$ on scale $n$, we shall denote by $n_T = n$ the scale of the cluster. We call $\mathcal{T}(\theta)$ the
February 4, 2002 10:11 WSPC/148-RMP
142
00112
M. Bartuccelli & G. Gentile
set of all clusters in a tree $\theta$; given a cluster $T \in \mathcal{T}(\theta)$ call $V(T)$ and $\Lambda(T)$ the set of nodes and the set of lines of $T$, respectively.

5.3. Resonances. We call resonance a cluster $T$ with only one entering line $\ell_T^2$ such that
$$ \sum_{v\in V(T)} \nu_v = 0, \qquad \sum_{v\in V(T)} |\nu_v| \le \bigl(2 \cdot 2^{(n+3)/\tau}\bigr)^{-1}, \tag{5.10} $$
if $n$ is the scale of the exiting line $\ell_T^1$. Note that the entering line $\ell_T^2$ must have, by the first condition in (5.10), the same momentum as the exiting line $\ell_T^1$ and, by construction, a scale $n_{\ell_T^2} = n_{\ell_T^1} = n$. We say that the line $\ell_T^1$ exiting a resonance $T$ is a resonant line. We call nonresonant line a line which is not a resonant line. For any resonance $T$ and any line $\ell \in \Lambda(T)$ one can write, by setting $\ell = \ell_v$,
$$ \nu_\ell = \nu_\ell^0 + \sigma_\ell \nu, \tag{5.11} $$
where
$$ \nu_\ell^0 = \sum_{\substack{w\in V(T) \\ w \preceq v}} \nu_w, \tag{5.12} $$
$\nu \equiv \nu_{\ell_T^2}$ is the momentum flowing through the line $\ell_T^2$ entering $T$, and $\sigma_\ell$ is defined as follows: writing $\ell \equiv \ell_v$ then $\sigma_\ell = 1$ if $\ell_T^2$ enters a node $w \preceq v$ and $\sigma_\ell = 0$ otherwise. Given a resonance $T$, define the resonance value as
$$ V_T(\omega_0\cdot\nu) = \Biggl(\prod_{v\in V(T)} F_v\Biggr) \Biggl(\prod_{\ell\in\Lambda(T)} G_\ell^{(n_\ell)}\Biggr), \tag{5.13} $$
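seen as a function of $\omega_0\cdot\nu$, if $\nu \equiv \nu_{\ell_T^2} = \nu_{\ell_T^1}$ is the momentum flowing through the external lines of the resonance $T$. We can have four types of resonances:
$$ \begin{array}{lcc} & \ell_T^1 & \ell_T^2 \\ 1. & H \leftarrow h & h \leftarrow H, \\ 2. & H \leftarrow h & H \leftarrow h, \\ 3. & h \leftarrow H & h \leftarrow H, \\ 4. & h \leftarrow H & H \leftarrow h. \end{array} \tag{5.14} $$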
Given a tree $\theta$, define
$$ N_n(\theta) = \{\ell \in \Lambda(\theta) : n_\ell = n\}, \qquad p_n(\theta) = \{T \subset \mathcal{T}(\theta) : n_T = n\}. \tag{5.15} $$
Call $N_n^*(\theta)$ the number of nonresonant lines on scale $\le n$ and call $R_n^j(\theta)$ the number of resonant lines on scale $\le n$ exiting from resonances of type $j$. Of course
$$ N_n(\theta) = N_n^*(\theta) + \sum_{j=1}^{4} R_n^j(\theta). \tag{5.16} $$
In Appendix 1 we prove the following result (note that the bound (5.17) is a version of the Siegel–Bryuno lemma).

5.4. Lemma. For any tree $\theta \in \mathcal{T}_{k,\nu}$ one has
$$ N_n^*(\theta) + p_n(\theta) \le c\, M(\theta)\, 2^{n/\tau}, \tag{5.17} $$
and
$$ R_n^4(\theta) \le c\, M(\theta)\, 2^{n/\tau} + R_n^1(\theta), \tag{5.18} $$
for some constant $c$.

5.5. Bound on the nonresonant lines. Define $\Lambda^*(\theta)$ as the set of nonresonant lines in $\Lambda(\theta)$. Then one has
$$ \prod_{\ell\in\Lambda^*(\theta)} |G_\ell^{(n_\ell)}| \le (2C_0^{-1})^{|\Lambda^*(\theta)|} \prod_{n=-\infty}^{1} 2^{-nN_n^*(\theta)}. \tag{5.19} $$
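Let $n_0 = n_0(\kappa)$ be a negative integer to be fixed later (see (5.21) below). One has in (5.19)
$$ \prod_{n=-\infty}^{1} 2^{-nN_n^*(\theta)} \le 2^{-2n_0 k} \prod_{n=-\infty}^{n_0} 2^{-nN_n^*(\theta)} \le 2^{-2n_0 k} \prod_{n=-\infty}^{n_0} 2^{-cnM(\theta)2^{n/\tau}}, \tag{5.20} $$
where (5.17) has been used for the lines on scale $\le n_0$. Choose $n_0$ so that
$$ c \log 2 \sum_{p=|n_0|}^{\infty} p\, 2^{-p/\tau} \le \kappa_2, \tag{5.21} $$
for some $\kappa_2 \in (0, \kappa - \kappa_1)$; then (5.19) and (5.20) give
$$ \prod_{\ell\in\Lambda^*(\theta)} |G_\ell^{(n_\ell)}| \le (2C_0^{-1})^{|\Lambda^*(\theta)|}\, 2^{-2n_0 k}\, e^{\kappa_2 M(\theta)}. \tag{5.22} $$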
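The choice of $n_0$ in (5.21) can be made explicit, since the tail of the series has a closed form: with $r = 2^{-1/\tau}$ one has $\sum_{p\ge N} p\,r^p = r^N\,(N(1-r)+r)/(1-r)^2$. The following Python sketch, given only as an illustration, finds the negative integer $n_0$ with smallest $|n_0|$ satisfying (5.21); the function names and the numerical values of `c`, `tau` and `kappa2` used in the example are arbitrary assumptions, not values taken from the paper.

```python
import math


def tail_sum(N, tau):
    """Closed form of sum_{p >= N} p * 2**(-p / tau)."""
    r = 2.0 ** (-1.0 / tau)
    return r ** N * (N * (1.0 - r) + r) / (1.0 - r) ** 2


def choose_n0(c, tau, kappa2):
    """Negative integer n0 with the smallest |n0| >= 1 such that (5.21) holds."""
    N = 1
    # The tail is strictly decreasing in N, so the loop terminates.
    while c * math.log(2.0) * tail_sum(N, tau) > kappa2:
        N += 1
    return -N


if __name__ == "__main__":
    print(choose_n0(c=1.0, tau=1.0, kappa2=0.1))
```

The smaller $\kappa_2$ is chosen, the larger $|n_0|$ becomes, and hence the larger the factor $2^{-2n_0 k}$ in (5.22).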
Together with the bounds (5.4) and (5.6), taking into account the product of the node factors, this shows that, by neglecting the resonances, a bound like (5.1) still holds with $\kappa' = \kappa - \kappa_1 - \kappa_2$. So we have to prove that the presence of the resonances does not destroy the results implied by the above discussion.

5.6. Localization and renormalization operators. For any resonance $T$ we define
$$ V_T(\omega_0\cdot\nu) = \mathcal{L}V_T(\omega_0\cdot\nu) + \mathcal{R}V_T(\omega_0\cdot\nu), \tag{5.23} $$
where for resonances of type either 2 or 3 one has
$$ \mathcal{L}V_T(\omega_0\cdot\nu) \equiv V_T(0), \qquad \mathcal{R}V_T(\omega_0\cdot\nu) \equiv (\omega_0\cdot\nu) \int_0^1 dt_T\, \partial V_T(t_T\,\omega_0\cdot\nu), \tag{5.24} $$
while for resonances of type 1 one has
$$ \mathcal{L}V_T(\omega_0\cdot\nu) \equiv V_T(0) + (\omega_0\cdot\nu)\,\partial V_T(0), \qquad \mathcal{R}V_T(\omega_0\cdot\nu) \equiv (\omega_0\cdot\nu)^2 \int_0^1 dt_T\, (1-t_T)\, \partial^2 V_T(t_T\,\omega_0\cdot\nu), \tag{5.25} $$
and for resonances of type 4 one has simply
$$ \mathcal{L}V_T(\omega_0\cdot\nu) \equiv 0, \qquad \mathcal{R}V_T(\omega_0\cdot\nu) = V_T(\omega_0\cdot\nu). \tag{5.26} $$
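The splittings (5.24) and (5.25) are the Taylor formula with integral remainder, to first and second order respectively. The following Python sketch checks this numerically on a toy function; it is only an illustration, and the function $V(x) = 1/(2+x)$, the helper names and the Simpson quadrature are our assumptions, not the actual resonance value $V_T$.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0


def split_first_order(V, dV, x):
    """Localized and renormalized parts as in (5.24)."""
    localized = V(0.0)
    renormalized = x * simpson(lambda t: dV(t * x), 0.0, 1.0)
    return localized, renormalized


def split_second_order(V, dV, d2V, x):
    """Localized and renormalized parts as in (5.25)."""
    localized = V(0.0) + x * dV(0.0)
    renormalized = x * x * simpson(lambda t: (1.0 - t) * d2V(t * x), 0.0, 1.0)
    return localized, renormalized


# Toy stand-in for a resonance value (hypothetical, for illustration only).
def V(x):
    return 1.0 / (2.0 + x)


def dV(x):
    return -1.0 / (2.0 + x) ** 2


def d2V(x):
    return 2.0 / (2.0 + x) ** 3


if __name__ == "__main__":
    x = 0.3
    print(sum(split_first_order(V, dV, x)), V(x))
    print(sum(split_second_order(V, dV, d2V, x)), V(x))
```

In both cases the localized and renormalized parts add up to the full value, while the renormalized part carries the explicit factor $x$ or $x^2$ that is exploited in the bounds of Sec. 5.15.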
Here $\partial V_T$ and $\partial^2 V_T$ denote the first and the second derivatives of $V_T$ with respect to its argument (see Sec. 3.2). We shall call $\mathcal{L}$ the localization operator and $\mathcal{R}$ the renormalization operator: correspondingly $\mathcal{L}V_T(\omega_0\cdot\nu)$ and $\mathcal{R}V_T(\omega_0\cdot\nu)$ are called the localized part and the renormalized part of the resonance value. The quantity $V_T(0)$ is obtained from $V_T(\omega_0\cdot\nu)$ by replacing $\nu_\ell$ with $\nu_\ell^0$ in the argument of each propagator $G_\ell^{(n_\ell)}$, while $\partial V_T(0)$ is obtained from $V_T(\omega_0\cdot\nu)$ by differentiating it with respect to $x = \omega_0\cdot\nu$, and then replacing $\nu_\ell$ with $\nu_\ell^0$ in the argument of each propagator $G_\ell^{(n_\ell)}$. Analogously $\partial V_T(t_T\,\omega_0\cdot\nu)$ and $\partial^2 V_T(t_T\,\omega_0\cdot\nu)$ are obtained from $V_T(\omega_0\cdot\nu)$ by differentiating it with respect to $x = \omega_0\cdot\nu$, once and twice, respectively, and then replacing $\nu_\ell$ with $\nu_\ell^0 + t_T\sigma_\ell\nu$ in the argument of each propagator $G_\ell^{(n_\ell)}$: in such a case we shall write
$$ \nu_\ell(t_T) = \nu_\ell^0 + t_T\,\sigma_\ell\,\nu, \tag{5.27} $$
where, as usual, we denote $\nu = \nu_{\ell_T^2}$. The expressions for the renormalized parts of the resonance values are explicitly written in (5.35) and (5.36) below. Note that, given a resonance $T$, even if the renormalization procedure can change the compatible scales of its external lines, nevertheless the two scales have to remain equal to each other: in fact the momenta flowing through the external lines are still equal as their difference is left zero.

5.7. Remark. Because of the renormalization procedure it is no longer true that there can be only one compatible scale per line (such that the corresponding propagator is not vanishing). For instance, if $T$ is a resonance and $\nu_\ell$ is the momentum flowing through a line $\ell \in \Lambda(T)$, let $n_\ell$ be the scale compatible with $\ell$ before renormalizing the resonance, i.e. the scale $n_\ell$ such that $\chi_{n_\ell}(\omega_0\cdot\nu_\ell) \neq 0$. In the localized part of the resonance value the momentum $\nu_\ell$ has to be replaced with $\nu_\ell^0$, which is in general different from $\nu_\ell$. So it can happen that $\chi_{n_\ell}(\omega_0\cdot\nu_\ell^0) = 0$, while $\chi_{n'}(\omega_0\cdot\nu_\ell^0) \neq 0$ for some scale $n' \neq n_\ell$: in such a case the scale label compatible with the line is no longer $n_\ell$ but $n'$. Moreover in the renormalized part of the resonance value, even if the mode labels are fixed, the arguments of the propagators can change: they can assume any value reachable by varying $t_T \in [0, 1]$, i.e. one has
$$ \bigl|\, |\omega_0\cdot\nu_\ell^0| - |\omega_0\cdot\nu| \,\bigr| \le |\omega_0\cdot\nu_\ell| \le |\omega_0\cdot\nu_\ell^0| + |\omega_0\cdot\nu|, \tag{5.28} $$
so that the sums in (5.8) are (in principle) really infinite sums. The second condition in (5.10) has been introduced exactly with the aim of preventing the number of compatible scales from being too large: see Sec. 5.10 and Lemma 5.11.

5.8. Resummation families. The reason for splitting the resonance values as in (5.23) is that the contributions arising from the localized parts of the resonance values, when summed over all trees, give a vanishing contribution. In order to prove such a (remarkable) property, we need to introduce suitable resummation families. Given a tree θ containing a resonance T, we can consider all trees obtained by changing the location of the nodes internal to T to which the external lines of T are attached: we denote by FT (θ) the set of trees so obtained, and call it the resummation family associated to the resonance T. We shall refer to the operation of detaching and reattaching the external lines by saying that we are shifting such lines. See Fig. 3.
Fig. 3. The resummation family FT (θ) = {θ, θ1 , θ2 , θ3 } obtained by shifting the external lines of the resonance T . The black balls represent the remaining parts of the trees. The labels are not explicitly shown.
Of course shifting the external lines of a resonance produces a change of the propagators of the trees. In particular as all arrows have to point towards the root, some lines can revert their arrows: correspondingly some lines h ← H become lines H ← h and vice versa. Moreover the momentum can change, as a reversal of the arrow implies a change of the partial ordering of the nodes inside the resonance and a shifting of the entering
line can add or subtract the contribution of the momentum flowing through it. More precisely, if the external lines of a resonance T are detached then reattached to some other nodes in V (T ), the momentum flowing through the line ` ∈ Λ(T ) can be changed into ±ν 0` + σν, with σ ∈ {0, 1}: if we call V1 and V2 the two disjoint sets into which ` divides T , such that the arrow superposed on ` is directed from V2 to V1 (before detaching the external lines), then the sign is + if the exiting line is reattached to a node inside V1 and it is − otherwise, while σ = 1 if the entering line is reattached to a node inside V2 when the sign is + and to a node inside V1 when the sign is −, and σ = 0 otherwise. See Fig. 4.
Fig. 4. The sets $V_1$ and $V_2$ in a resonance $T$; note that, even if they are drawn like circles, the sets $V_1$ and $V_2$ are not clusters. One has $\nu_\ell^0 = \nu_{v_3} + \nu_{v_4} + \nu_{v_5} + \nu_{v_6}$ and $\nu = \nu_{\ell_T^2}$; of course $\nu_{\ell_T^1} = \nu_{\ell_T^2}$ and $\nu_\ell^0 = -(\nu_{v_1} + \nu_{v_2})$ by definition of resonance. The black balls represent the remaining parts of the trees. The labels are not explicitly shown.
Then the following result follows. The proof is in Appendix 2.

5.9. Lemma. For any resonance $T \in \mathcal{T}(\theta)$ one has
$$ \sum_{\theta' \in \mathcal{F}_T(\theta)} \mathcal{L}V_T(\omega_0\cdot\nu) = 0, \tag{5.29} $$
where $\nu = \nu_{\ell_T^2}$ and the sum is over the resummation family associated to $T$.

5.10. Changing the scales. When considering separately the localized parts and the renormalized parts of the resonance values, as we said before in Remark 5.7, the scales are no longer uniquely fixed by assigning the mode labels. This means that the sum over the scale labels is no longer a fictitious sum as in the case in which no renormalization is performed (in such a case the scale associated to each line is simply determined by the momentum flowing through it by (5.9), so that no sum has to be really done).
Anyway the number of scale labels compatible with a line is not arbitrarily large, as the following result shows (the proof is in Appendix 3).

5.11. Lemma. When shifting the lines external to the resonances of a tree $\theta$, for any line $\ell \in \Lambda(\theta)$ the scale compatible with $\ell$ can change at most by one unit.

5.12. Renormalization of the maximal resonances. If we have several resonances which are external to each other then we apply Lemma 5.9 to each of them: so for all of them we can replace the resonance values with their renormalized parts. The situation is a little more involved when one has to consider a tree $\theta$ in which some resonances are contained inside some other resonances. In such a case we define the depth $D(T)$ of a resonance $T$ recursively as follows: given a resonance $T$, we set $D(T) = 1$ if there is no resonance containing $T$, and set $D(T) = D(T') + 1$ if $T$ is contained inside a resonance $T'$ and all the other resonances inside $T'$ (if there are any) do not contain $T$. Then consider the maximal resonances, i.e. the resonances $T \in \mathcal{T}(\theta)$ with depth $D(T) = 1$. We call $\mathcal{T}_1(\theta)$ the set of such resonances; likewise we call $\mathcal{T}_D(\theta)$ the set of the resonances with depth $D$. For each $T \in \mathcal{T}_1(\theta)$ let $V_T(\omega_0\cdot\nu_{\ell_T^2})$ be its resonance value. By Lemma 5.9 we can neglect the localized part $\mathcal{L}V_T(\omega_0\cdot\nu_{\ell_T^2})$, as it gives a vanishing contribution when the values of all trees are summed together, so that we have to consider only the renormalized value $\mathcal{R}V_T(\omega_0\cdot\nu_{\ell_T^2})$: we say then that the resonance is a renormalized resonance. Note that the scale compatible with the line $\ell_T^2$ has not been changed by the renormalization procedure (as the momentum flowing through the line entering the resonance remains the same), so that
$$ |\omega_0\cdot\nu_{\ell_T^2}| \le C_0 2^{n}, \tag{5.30} $$
if $n = n_{\ell_T^2}$, while, by shifting $\ell_T^2$, the scales compatible with the lines internal to $T$ can change at most by one unit (by Lemma 5.11). Then for any line $\ell \in \Lambda(T)$ one has, setting $n = n_{\ell_T^2} = n_{\ell_T^1}$, and, by the second condition in (5.10),
$$ |\nu_\ell^0| \le \sum_{v\in V(T)} |\nu_v| \le \bigl(2\cdot 2^{(n+3)/\tau}\bigr)^{-1}, \tag{5.31} $$
so that
$$ |\omega_0\cdot\nu_\ell^0| > C_0 |\nu_\ell^0|^{-\tau} \ge C_0\, 2^{\tau}\, 2^{n+3}, \tag{5.32} $$
hence, with the notation (5.27),
$$ |\omega_0\cdot\nu_\ell(t_T)| > C_0\, 2^{\tau}\, 2^{n+3} - C_0 2^{n} > C_0 2^{n+3}, \tag{5.33} $$
so that the scales compatible with any line $\ell \in \Lambda(T)$ have to be strictly larger than $n$: the cluster structure imposed by the presence of the resonance is preserved by
the renormalization procedure (recall that in order to define the resonances, even before stating the conditions (5.10), we have required them to be clusters!). As all the lines internal to the resonances in $\mathcal{T}_1(\theta)$ have a scale larger than the scales of the lines external to the resonances themselves, we can reason as in [3] (or [4]) and consider all the trees having the same structure as the tree $\theta$ just considered, but with different scale labels associated to the lines internal to the resonances in $\mathcal{T}_1(\theta)$, i.e. the trees obtained by assigning to the lines in $\Lambda(T)$, for $T \in \mathcal{T}_1(\theta)$, scale labels $n' \ge n + 1$, if $n = n_{\ell_T^2}$. In particular this means that all the considered trees $\theta'$ have the same sets of maximal resonances $\mathcal{T}_1(\theta')$. So we have the sum over the scale labels compatible with the resonance structure (see comments about (5.37) below) of renormalized resonance values which (setting $\nu_{\ell_T^2} = \nu$) are given by
$$ \mathcal{R}V_T(\omega_0\cdot\nu) = V_T(\omega_0\cdot\nu), \tag{5.34} $$
for resonances of type 4, by
$$ \mathcal{R}V_T(\omega_0\cdot\nu) = \sum_{\ell\in\Lambda(T)} \int_0^1 dt_T \Biggl(\prod_{v\in V(T)} F_v\Biggr)\, \partial G_\ell^{(n_\ell)}(\omega_0\cdot\nu_\ell(t_T)) \Biggl(\prod_{\ell'\in\Lambda(T)\setminus\ell} G_{\ell'}^{(n_{\ell'})}(\omega_0\cdot\nu_{\ell'}(t_T))\Biggr), \tag{5.35} $$
for resonances of type 2 and 3, and by
$$ \begin{aligned} \mathcal{R}V_T(\omega_0\cdot\nu) ={}& \sum_{\ell\in\Lambda(T)} \int_0^1 dt_T\,(1-t_T) \Biggl(\prod_{v\in V(T)} F_v\Biggr) \bigl(\partial^2 G_\ell^{(n_\ell)}(\omega_0\cdot\nu_\ell(t_T))\bigr) \Biggl(\prod_{\ell'\in\Lambda(T)\setminus\ell} G_{\ell'}^{(n_{\ell'})}(\omega_0\cdot\nu_{\ell'}(t_T))\Biggr) \\ &+ \sum_{\ell\neq\ell'\in\Lambda(T)} \int_0^1 dt_T\,(1-t_T) \Biggl(\prod_{v\in V(T)} F_v\Biggr) \bigl(\partial G_\ell^{(n_\ell)}(\omega_0\cdot\nu_\ell(t_T))\, \partial G_{\ell'}^{(n_{\ell'})}(\omega_0\cdot\nu_{\ell'}(t_T))\bigr) \Biggl(\prod_{\ell''\in\Lambda(T)\setminus\{\ell,\ell'\}} G_{\ell''}^{(n_{\ell''})}(\omega_0\cdot\nu_{\ell''}(t_T))\Biggr), \end{aligned} \tag{5.36} $$
for resonances of type 1. Here we have explicitly written the argument of the propagators and the symbol $\partial$ is meant as the partial derivative with respect to the argument (see Sec. 3.2). Note that we can write $\omega_0\cdot\nu_\ell(t_T) = \omega_0\cdot\nu_\ell^0 + t_T\sigma_\ell\,\omega_0\cdot\nu$, by (5.27). The sum over the scale labels is such that for each line the characteristic functions $\chi_n$ reconstruct a step function: each line $\ell \in \Lambda(T)$ can have any scale
$n' \ge n_T \ge n_{\ell_T^2} + 1$, and
$$ \sum_{n'=n+1}^{1} \chi_{n'}(\omega_0\cdot\nu_\ell(t_T)) = \vartheta\bigl(|\omega_0\cdot\nu_\ell(t_T)| - C_0 2^{n}\bigr), \qquad n = n_{\ell_T^2}, \tag{5.37} $$
where $\vartheta$ denotes the step function, i.e. $\vartheta(x) = 1$ if $x > 0$ and $\vartheta(x) = 0$ otherwise. For each summand contributing to the renormalized part of the resonance value there are either zero (see (5.34)) or one (see (5.35)) or two (see (5.36)) derived propagators.

5.13. Remarks. (1) First of all note that for any value of the interpolation parameter $t_T$ the arguments of the step functions lie inside the region in which they are positive, so that no contribution arises from the derivatives of the step functions. In fact for any line $\ell \in \Lambda(T)$ one has (5.33), while the discontinuity of the step function is at $C_0 2^{n}$ by (5.37).

(2) For any line $\ell \in \Lambda(T)$, if $n$ is the scale compatible with it before renormalizing, i.e.
$$ C_0 2^{n-1} < |\omega_0\cdot\nu_\ell| \le C_0 2^{n}, \tag{5.38} $$
then one has
$$ C_0 2^{n-2} < |\omega_0\cdot\nu_\ell(t_T)| \le C_0 2^{n+1}, \tag{5.39} $$
by Lemma 5.11.

5.14. Iterative renormalization of the resonances. By using Remark 5.13 (1), we redecompose the step functions, so obtaining again characteristic functions. This means that we have to study expressions like (5.35) and (5.36), in which no derivative can act on the characteristic functions (by the argument just given). Consider explicitly the case in which only one propagator is derived, i.e. the case (5.35) of resonances of type 2 and 3. Then the derived propagator in (5.35) can correspond to a line ℓ contained inside some resonance T' ⊂ T with depth D(T') = 2. If this is the case we do not split the resonance value of T' into the sum of a localized and a renormalized part: we say then that such a resonance is not renormalized. Let T'' be the resonance with highest depth containing ℓ. Then we do not renormalize any resonance containing T'' and contained inside T. Then we pass to consider T'' and we repeat the above analysis, i.e. we apply once more Lemma 5.9, hence we study the renormalized value as in Sec. 5.12. The reason why we proceed in this way will become clear later (see Remark 5.16). Otherwise, if the line ℓ corresponding to the derived propagator is external to all resonances T' ∈ T2 (θ), we pass to consider such resonances and we iterate the above argument, i.e. we apply again Lemma 5.9, so getting rid of the localized parts of the resonance values for all resonances T' ∈ T2 (θ) such that T' ⊂ T (which then become renormalized resonances), then, by summing over the scale labels, we group
together all trees having the same cluster structure imposed by the set $\mathcal{T}_2(\theta)$, and proceed as above. The only differences with respect to the previous case are that now also the momentum flowing through the line $\ell_{T'}^2$ entering $T'$ can have been changed into $\nu_{\ell_{T'}^2}(t_T)$, and the momenta flowing through the lines internal to $T'$ will depend in general on two interpolation parameters $t_T$ and $t_{T'}$, one for each renormalized resonance. By Remark 5.13 (2) one has that (5.30) has to be replaced with
$$ |\omega_0\cdot\nu_{\ell_{T'}^2}(t_T)| \le C_0 2^{n+1}, \tag{5.40} $$
if $n = n_{\ell_{T'}^2} = n_{\ell_{T'}^1}$ (of course here $n$ has a different value with respect to $n$ in (5.30), where one had $n = n_{\ell_T^2}$). Moreover for any line $\ell \in \Lambda(T')$ the momentum flowing through it, setting $t \equiv \{t_T, t_{T'}\}$, can be written as
$$ \nu_\ell(t) = \nu_\ell^0 + t_{T'}\,\sigma_\ell \bigl(\nu^0_{\ell_{T'}^2} + t_T\,\sigma_{\ell_{T'}^2}\,\nu_{\ell_T^2}\bigr). \tag{5.41} $$
Then for any line $\ell \in \Lambda(T')$ one has, setting again $n = n_{\ell_{T'}^2}$ and using the second condition in (5.10),
$$ |\omega_0\cdot\nu_\ell(t_{T'})| > C_0\, 2^{\tau}\, 2^{n+3} - C_0 2^{n+1} > C_0 2^{n+3}, \tag{5.42} $$
so that the same conclusions as before can be drawn. Note that with respect to the original tree it can happen that the renormalization procedure, by changing the scales compatible with the lines, makes some resonances disappear, while some new resonances can appear: recall that the definition of resonance depends on the scale of the resonant line. But this is not a problem at all: the bound on the number of nonresonant lines (which is of course equal to the number of resonances) is derived in Appendix 1 by using that if a line has a momentum $\nu_\ell(t)$ and a scale $n$, then
$$ C_0 2^{n-2} < |\omega_0\cdot\nu_\ell(t)| \le C_0 2^{n+1}, \tag{5.43} $$
and such a bound is satisfied also for the trees with renormalized resonances by Lemma 5.11. One deals in a similar way also with the resonances of type 1: in this case if both lines corresponding to the derived propagators are inside some resonance T 0 ∈ T2 (θ), then the resonance T 0 is not renormalized, if only one line is inside T 0 then T 0 is renormalized to one order less (i.e. to zeroth order if of type 2 or 3 and to first order if of type 1), while if both lines are external to T 0 then T 0 is renormalized to its proper order ( i.e. to first order if of type 2 or 3 and to second order if of type 1). 5.15. Final result of the renormalization procedure. In this way we have obtained a sum of contributions such that for each of them the following situation
arises for a given tree $\theta$. We do not give the full details, as the analysis can be performed as in [3]. All localized parts of the resonance values cancel out, so that only renormalized parts have to be considered for the resonance values (by Lemma 5.9). As some resonances contained in the tree $\theta$ are renormalized (not all, as discussed in Sec. 5.14), we say then that the tree is a renormalized tree. For each renormalized resonance $T$ we have an interpolation parameter $t_T$. Defining the set of interpolation parameters as
$$ t = \{t_T : T \text{ is a renormalized resonance in } \theta\}, \tag{5.44} $$
the momenta of the lines $\ell \in \Lambda(\theta)$ become functions of $t$, $\nu_\ell = \nu_\ell(t)$. The explicit dependence on such parameters is obtained as follows: if a line $\ell = \ell_v$ is contained inside the renormalized resonances $T_1 \subset T_2 \subset \cdots \subset T_p$ and $t_{T_1}, \ldots, t_{T_p}$ are the corresponding interpolation parameters, then
$$ \nu_\ell(t) = \sum_{w \preceq v} \nu_w \prod_{T\in\{T_1,\ldots,T_p\}\,:\,\ell_w \notin \Lambda(T)} t_T. \tag{5.45} $$
Of course (5.45) generalizes (5.41) to the case of more than two resonances contained inside each other. See Fig. 5 for an example.
Fig. 5. The momenta as functions of the interpolation parameters in the case of two resonances $T' \subset T$. One has $\nu_\ell = \nu_\ell^0 + t_{T'}\nu'' + t_{T'}t_T\,\nu'$, where $\nu_\ell^0 = \nu_{v_1} + \nu_{v_2}$ and $\nu'' = \nu_{v_3} + \nu_{v_4} + \nu_{v_5} + \nu_{v_6}$. The black balls represent the remaining parts of the trees. The labels are not explicitly shown.
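The dependence of the momenta on the interpolation parameters described by (5.41) and (5.45) can be sketched as follows. This is a minimal Python illustration under our own bookkeeping assumptions: the mode labels preceding the line are grouped by the resonance through which they enter, so that `parts[0]` is the sum of the mode labels inside the innermost resonance $T_1$, `parts[j]` (for $j \ge 1$) is the sum of those lying outside $T_j$ but inside $T_{j+1}$ (or outside all resonances for the last entry), and `ts[j-1]` is $t_{T_j}$. The function name and this grouping are hypothetical.

```python
def interpolated_momentum(parts, ts):
    """Momentum nu_ell(t) for nested renormalized resonances T_1 in T_2 in ...

    parts: list of integer (or float) vectors, innermost contribution first.
    ts:    interpolation parameters t_{T_1}, t_{T_2}, ..., each in [0, 1].
    """
    if len(parts) != len(ts) + 1:
        raise ValueError("need exactly one more contribution than parameters")
    total = [float(c) for c in parts[0]]
    weight = 1.0
    for part, t in zip(parts[1:], ts):
        # A contribution entering through T_j is damped by t_{T_1} ... t_{T_j}.
        weight *= t
        for i, c in enumerate(part):
            total[i] += weight * c
    return tuple(total)


if __name__ == "__main__":
    # Two nested resonances T' in T, as in Fig. 5 (the numbers are made up).
    nu0, nu2, nu1 = (1, -1), (2, 0), (0, 3)
    print(interpolated_momentum([nu0, nu2, nu1], [0.5, 0.5]))
```

Setting all parameters equal to 1 gives back the original momentum, while setting them all equal to 0 leaves only the contribution of the nodes inside the innermost resonance.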
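Furthermore, by construction, each propagator is derived at most twice, and one has
$$ \partial^p G_\ell^{(n_\ell)}(\omega_0\cdot\nu_\ell(t)) \sim (-1)^p\, \frac{p!\;\chi_{n_\ell}(\omega_0\cdot\nu_\ell(t))}{(i\omega_0\cdot\nu_\ell(t))^{p+1}} = (-1)^p\, \frac{p!}{(i\omega_0\cdot\nu_\ell(t))^{p}}\, G_\ell^{(n_\ell)}(\omega_0\cdot\nu_\ell(t)), \tag{5.46} $$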
where $\sim$ means that the two quantities differ by the derivatives of the characteristic functions, which, however, have to be discarded in the evaluation of the value of the renormalized tree (see Remark 5.13 (1)). Note that
$$ \left|\frac{1}{i\omega_0\cdot\nu_\ell(t)}\right| \le C_0^{-1}\, 2^{-(n_T-2)}, \tag{5.47} $$
if $T$ is the resonance with highest depth such that $\ell \in \Lambda(T)$, as the compatible scales of $\ell$ could have been changed at most by one unit with respect to that associated to $\ell$ before the renormalization procedure was applied, again by Lemma 5.11. Then for each renormalized resonance, if $\nu(t)$ denotes the momentum flowing through the entering line, we have an extra factor
$$ \frac{i\omega_0\cdot\nu(t)}{i\omega_0\cdot\nu_\ell(t)}\; \frac{i\omega_0\cdot\nu(t)}{i\omega_0\cdot\nu_{\ell'}(t)}, \tag{5.48} $$
(possibly times 2, when arising from a propagator derived twice; see (5.46) with $p = 2$) if $T$ is of type 1, and an extra factor
$$ \frac{i\omega_0\cdot\nu(t)}{i\omega_0\cdot\nu_\ell(t)}, \tag{5.49} $$
if $T$ is of type 2 or 3. Here $\ell$ and $\ell'$ denote the lines corresponding to the propagators which have been derived by renormalizing $T$. This means that for each resonance of type 2 and 3 one has a factor $i\omega_0\cdot\nu(t)$, which deletes the propagator of the corresponding resonant line. This does not happen for resonances of type 4; on the other hand one has a factor $(i\omega_0\cdot\nu(t))^2$ for resonances of type 1, and we can take advantage of such a fact through (5.18) (see (5.54) below). As for any resonance $T$ and for any line $\ell \in \Lambda(T)$ one has $|\omega_0\cdot\nu_\ell(t)| > C_0 2^{n_T-2}$ (see (5.47)) and $|\omega_0\cdot\nu(t)| \le C_0 2^{n+1}$, if $\nu(t) = \nu_{\ell_T^2}(t)$ and $n = n_{\ell_T^2}$, then (5.48) and (5.49) give a “factor gain” $\Gamma_T^2$ or $\Gamma_T$, where
$$ \Gamma_T = O(2^{n-n_T}). \tag{5.50} $$
This is evident for resonances which are renormalized, but it is not difficult to realize that it holds also for nonrenormalized resonances. For simplicity (and for expository clarity) we explicitly discuss only the case in which only resonances of type 2 or 3 are involved, but the argument can be easily extended to cover also the case in which some resonances are of type 1. Suppose that a resonance $T_1 \in \mathcal{T}_D(\theta)$, for some $D \ge 1$, is renormalized and that the derived propagator corresponds to a line $\ell \in \Lambda(T_p)$, with $T_p \in \mathcal{T}_{D+p}(\theta)$ such that $T_p$ is the resonance with highest depth containing $\ell$. Call $T_2, \ldots, T_{p-1}$ the (not renormalized) resonances such that $T_p \subset T_{p-1} \subset \cdots \subset T_2 \subset T_1$. If $\ell_{T_j}^2$ denotes the line entering the resonance $T_j$, for $j = 1, \ldots, p$, one has $\nu(t) = \nu_{\ell_{T_1}^2}(t)$ in (5.49).
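Then
$$ \frac{i\omega_0\cdot\nu(t)}{i\omega_0\cdot\nu_\ell(t)} = \Biggl(\prod_{j=1}^{p-1} \frac{i\omega_0\cdot\nu_{\ell_{T_j}^2}(t)}{i\omega_0\cdot\nu_{\ell_{T_{j+1}}^2}(t)}\Biggr) \frac{i\omega_0\cdot\nu_{\ell_{T_p}^2}(t)}{i\omega_0\cdot\nu_\ell(t)}, \tag{5.51} $$
and, as each line $\ell_{T_{j+1}}^2$ is a line internal to the cluster $T_j$, hence on scale $\ge n_{T_j}$, we can bound (5.51) by
$$ \prod_{j=1}^{p} O(2^{n_j - n_{T_j}}) = \prod_{j=1}^{p} \Gamma_{T_j}, \qquad n_j = n_{\ell_{T_j}^2}, \tag{5.52} $$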
which proves the assertion.

5.16. Remark. Finally we have that, by construction, there are at most two derived propagators per cluster. It is exactly with this aim that no renormalization (or a renormalization to one order less) is performed on a cluster $T' \subset T$ when a derived line obtained by renormalizing $T$ is contained inside $T'$. In fact if all resonances were renormalized we could have some lines derived arbitrarily many times, and this would give dangerous factorials (see (5.46)).

5.17. Bound on the propagators. In conclusion, as the effect of the renormalization procedure, we obtain a sum of terms each of which can be bounded as follows. If $\Lambda_1(\theta)$ and $\Lambda_2(\theta)$ denote the sets of lines corresponding to the propagators which are derived once and twice, respectively, we have that each term is bounded by a constant to the power $k$ times
$$ C_0^{-k} \Biggl(\prod_{\ell\in\Lambda_1(\theta)} 2^{-n_\ell}\Biggr) \Biggl(\prod_{\ell\in\Lambda_2(\theta)} 2^{-2n_\ell}\Biggr) \Biggl(\prod_{n=-\infty}^{1} 2^{-nN_n^*(\theta)}\Biggr) \Biggl(\prod_{n=-\infty}^{1} 2^{-nR_n^4(\theta)}\Biggr) \Biggl(\prod_{n=-\infty}^{1} 2^{nR_n^1(\theta)}\Biggr), \tag{5.53} $$
where the last two products can be bounded by using (5.18), i.e.
$$ \Biggl(\prod_{n=-\infty}^{1} 2^{-nR_n^4(\theta)}\Biggr) \Biggl(\prod_{n=-\infty}^{1} 2^{nR_n^1(\theta)}\Biggr) \le \prod_{n=-\infty}^{1} 2^{-cnM(\theta)2^{(n+3)/\tau}}, \tag{5.54} $$
while the product taking into account the nonresonant lines can be bounded as in Sec. 5.5 (by using of course also Lemma 5.11). Finally the first two products can be bounded by
$$ \Biggl(\prod_{\ell\in\Lambda_1(\theta)} 2^{-n_\ell}\Biggr) \Biggl(\prod_{\ell\in\Lambda_2(\theta)} 2^{-2n_\ell}\Biggr) \le \prod_{n=-\infty}^{1} 2^{-2np_n(\theta)}, \tag{5.55} $$
since, as we noted, there are at most two derivatives per cluster.
So we are left with the problem of counting how many terms we have to sum over. More precisely we have to sum over the possible choices of the propagators to be derived times a sum over all the possible scale labels compatible with the lines. The first sum is bounded by a constant to the power $k$ as there can be at most two derived propagators per cluster (as already remarked above), while the second sum is bounded by $3^{|V(\theta)|} \le 3^k$: once the mode labels have been fixed, for each line the scale label can assume only 3 values, by Lemma 5.11. Then the bounds (5.1) are proven for the first two quantities $h^{(k)}_\nu$ and $H^{(k)}_\nu$.

5.18. Remark. As a matter of fact there are only 2 compatible scales per line. In fact the change of the momentum $\nu(t)$ flowing through a line $\ell \in \Lambda(T)$ is such that at most $|\omega_0\cdot\nu| \le C_0 2^{n+1}$, if $n = n_{\ell_T^2}$, while the momentum was originally contained in an interval of width at least $C_0 2^{n+3}$: this means that $\omega_0\cdot\nu(t)$ can fall in one of the two contiguous intervals $[C_0 2^{n'-1}, C_0 2^{n'})$, with $|n' - n| = 1$, but not in both of them.

5.19. Bounds for the counterterms. Repeating the discussion of the previous sections for the counterterms we obtain a bound of the same kind, the only difference being that there is no factor $C_0^{-1}$ associated to the root line (as there is no propagator corresponding to the root line): such a property has been used in Sec. 2.12. So the proof of Lemma 2.2 is complete.

6. Conclusions and Extensions of the Results

6.1. General perturbations and higher dimensions. In this section we briefly review the possible generalizations and extensions of the results discussed in the previous sections. We confine ourselves to giving some ideas of how the proofs could be carried out, as we think that the main interest of the present paper lies in the techniques described in the previous sections rather than in the results (which are essentially well known from the standard KAM theory).
One can consider more general systems described by Hamiltonians of the form
$$ H = \omega\cdot A + f(\alpha, A, \varepsilon), \tag{6.1} $$
where $(\alpha, A) \in \mathbb{T}^d \times \mathbb{R}^d$, with $d \ge 2$, are conjugate variables, the rotation vector $\omega$ satisfies the Diophantine condition
$$ |\omega\cdot\nu| > C|\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^d \setminus \{0\}, \tag{6.2} $$
with Diophantine constants $C > 0$ and $\tau > d - 1$, and $f$ is a real analytic function, holomorphic in a domain $\mathcal{D} = \Sigma_\kappa^{(d)} \times D \times B_{\varepsilon_1}(0)$, where $\Sigma_\kappa^{(d)} = \{\alpha \in \mathbb{C}^d : \mathrm{Re}\,\alpha_j \in \mathbb{T},\ |\mathrm{Im}\,\alpha_j| < \kappa,\ j = 1, \ldots, d\}$ and $D \subset \mathbb{C}^d$ is an open subset. With respect to the Hamiltonian (1.2) we allow the perturbation to depend on all the action variables and no restriction is made on the dimension $d$. Then one can ask if a result analogous to Theorem 1.6 holds for the Hamiltonian (6.1).
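To get a feeling for the Diophantine condition (6.2), one can estimate numerically the largest constant $C$ compatible with it on a finite box of integer vectors. The following Python sketch is only an illustration under assumptions of ours: the norm $|\nu|$ is taken to be $\sum_j |\nu_j|$, only vectors with $|\nu_j| \le N$ are inspected (so the result is an upper bound on any admissible $C$, not a proof that $\omega$ is Diophantine), and the function name is hypothetical.

```python
import math
from itertools import product


def diophantine_constant(omega, tau, N):
    """Minimum of |omega . nu| * |nu|**tau over nonzero nu with |nu_j| <= N.

    Any constant C in (6.2) must be smaller than the returned value.
    """
    best = math.inf
    for nu in product(range(-N, N + 1), repeat=len(omega)):
        norm = sum(abs(c) for c in nu)
        if norm == 0:
            continue
        small_divisor = abs(sum(w * c for w, c in zip(omega, nu)))
        best = min(best, small_divisor * norm ** tau)
    return best


if __name__ == "__main__":
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    # d = 2, tau = 1: the golden-mean rotation vector (1, golden).
    print(diophantine_constant((1.0, golden), tau=1.0, N=30))
```

For the golden-mean vector with $\tau = 1$ the minimum over the box is attained at $\nu = (\pm 1, 0)$, while along the Fibonacci vectors the product $|\omega\cdot\nu|\,|\nu|^{\tau}$ stays bounded away from zero, as expected for a Diophantine vector.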
The reason why we considered the condition (2.26) for the Hamiltonian (1.2) is that the argument given in Sec. 2.14 can be repeated also for the Hamiltonian (6.1), with some minor (obvious) changes: this represents the first step in order to study the problem outlined in Sec. 1.10. So, for the Hamiltonian (6.1), given $\bar A \in D$, let $\rho$ be such that $B_\rho(\bar A) \subset D$. By setting
$$ M \equiv \det \partial_A^2 f_0^{(1)}(\bar A) \equiv \det \int_{\mathbb{T}^d} d\alpha\; \partial_A^2 f^{(1)}(\alpha, \bar A) \neq 0, \tag{6.3} $$
then, if one fixes $\varepsilon$ small enough (and nonvanishing), one can find infinitely many rotation vectors $\omega_0$ close enough to $\omega$ and satisfying the Diophantine condition
$$ |\omega_0\cdot\nu| > C_0|\nu|^{-\tau} \qquad \forall\, \nu \in \mathbb{Z}^d \setminus \{0\}, \tag{6.4} $$
with $C_0 = bC$, for some constant $b$, and, for each of them, a value $A_0 = A_0(\varepsilon) \in B_\delta(\bar A)$, for a suitable $\delta < \rho' < \rho$, such that one has solutions of the form
$$ \begin{cases} \alpha(t) = \omega_0 t + h(\omega_0 t, A_0(\varepsilon), \varepsilon, \omega_0), \\ A(t) = A_0(\varepsilon) + H(\omega_0 t, A_0(\varepsilon), \varepsilon, \omega_0), \end{cases} \tag{6.5} $$
and $A_0(\varepsilon)$ verifies the equation
$$ \omega_0 + \eta(A_0(\varepsilon), \varepsilon, \omega_0) = \omega, \tag{6.6} $$
and the functions $h(\psi, A, \varepsilon, \omega_0)$ and $H(\psi, A, \varepsilon, \omega_0)$ are analytic for $(\psi, A, \varepsilon) \in \Sigma_{\kappa'}^{(d)} \times B_{\rho'}(\bar A) \times B_{\varepsilon_0}(0)$, with suitable $\kappa' < \kappa$, $\rho' < \rho$ and $\varepsilon_0 < \varepsilon_1$. (Of course $A_0(\varepsilon)$ will depend also on $\omega_0$, even if we are not explicitly writing such a dependence.) Here $\eta(A_0(\varepsilon), \varepsilon, \omega_0)$ is the counterterm naturally arising when trying to look for solutions of the equations of motion with rotation vector $\omega_0$ (see Lemma 2.2 and Remark 3.5). The only difference with respect to the simplified problem studied in Sec. 2.14 is that now one has to prove that, given $\omega$ verifying the Diophantine condition (6.2), there exists $\bar\varepsilon \in B_{\varepsilon_1}(0)$ such that, for all $|\varepsilon| < \bar\varepsilon$, any neighborhood of radius $O(\varepsilon)$ close enough to $\omega$ contains infinitely many $\omega_0$ satisfying the Diophantine condition (6.4): this can also be easily proved. Then one can reason as in Sec. 2.14, and a result analogous to Theorem 1.6 holds for the Hamiltonian (6.1).

6.2. Measure of the persisting tori in phase space. Under the condition
$$ M \equiv \inf_{A\in D} \bigl|\det \partial_A^2 f_0^{(1)}(A)\bigr| > 0, \tag{6.7} $$
one can ask how many tori persist for perturbed Hamiltonian systems of the form (6.1). If one wants to use the Lindstedt series instead of the usual KAM techniques (see [5] and [15]), one can reason in the following way. For simplicity let us assume $\varepsilon > 0$ (small enough).
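First of all note that it is straightforward to prove that if $\omega$ satisfies the Diophantine condition (6.2), then, for any constant $a > 0$, the Lebesgue measure of the set $I(\omega)$ of vectors $\omega' \in B_{a\varepsilon}(\omega)$ satisfying the Diophantine condition
$$ |\omega' \cdot \nu| > C \varepsilon^{\beta} |\nu|^{-\tau} \quad \forall\, \nu \in \mathbb{Z}^d \setminus \{0\} , \tag{6.8} $$
for some constant $\beta \in (0, 1)$, satisfies the bound
$$ m(I(\omega)) \ge m(B_{a\varepsilon}(\omega)) \, (1 - c\varepsilon^{\alpha}) , \qquad \alpha = \frac{\tau - d + 1}{\tau + 1} + \beta - 1 , \tag{6.9} $$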
for some constant $c$, so that $m(I(\omega))/m(B_{a\varepsilon}(\omega)) \to 1$ for $\varepsilon \to 0$, provided that one has $\beta > d/(\tau + 1)$. The condition $\beta > d/(\tau + 1)$ assures that $\alpha$ in (6.9) is strictly positive, while the condition $\beta < 1$ is required as a consequence of the bound on the radius of convergence (see below); we shall see in Sec. 6.4 that in fact such a condition is highly improvable. Then one can prove that, for fixed $\varepsilon$ small enough, most of the invariant tori persist under perturbation, in the sense that the fraction of initial data in phase space for trajectories lying on invariant tori tends to 1 for $\varepsilon \to 0$. The discussion proceeds as follows.

Fix $\bar A \in D$ and $\rho$ such that $B_\rho(\bar A) \subset D$. Consider $\omega' \in B_{a\varepsilon}(\omega)$, for some constant $a$ independent of $\varepsilon$ (to be fixed), such that (6.8) is satisfied for $\omega'$. Then the radius of convergence (in $\varepsilon$) of the series defining the functions $h$, $H$, $\eta$ is of the form $\varepsilon_0 = E_0 C_0 \varepsilon^{\beta}$ (simply use the extension of Lemma 2.2 discussed in Sec. 6.1, with $C_0$ replaced with $C_0 \varepsilon^{\beta}$). As $\beta < 1$, if $\varepsilon$ is small enough, one has $\varepsilon < \varepsilon_0$, so that the series for $h$, $H$, $\eta$ converge for that value of $\varepsilon$ and for $A \in B_{\rho'}(\bar A)$, for some $\rho' \in (0, \rho)$; in particular $\eta(A, \varepsilon, \omega')$ depends weakly on $\omega'$, provided $\omega' \in I(\omega)$.

More precisely one can choose $\varepsilon$ small enough and $\delta < \rho'$ so that, for all $A \in B_\delta(\bar A)$,
$$ \eta(A, \varepsilon, \omega') = \varepsilon \, \eta^{(1)}(A) + \varepsilon^{1+\gamma} \xi(A, \varepsilon, \omega') , \qquad \sup_{\omega' \in I} \, \sup_{A \in B_\delta(\bar A)} |\xi(A, \varepsilon, \omega')| \le \Xi , \tag{6.10} $$
with $\gamma = 1 - \beta > 0$ and $\Xi$ a suitable constant (we are using that the radius of convergence in $\varepsilon$ for the counterterm is $O(\varepsilon^{\beta})$). By the analyticity in $A$ of the counterterm, by (6.10) and by the condition (6.7) one has that, for $\varepsilon$ small enough, for $A, A' \in B_\delta(\bar A)$ and for all $\omega' \in I$,
$$ |\eta(A', \varepsilon, \omega') - \eta(A, \varepsilon, \omega')| > \frac{M}{2} \, \varepsilon \, |A' - A| . \tag{6.11} $$
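The relations among the exponents in (6.9) and (6.10) can be checked with exact rational arithmetic. The following is only a small sketch: the function names and the sample values of $d$, $\tau$, $\beta$ are illustrative assumptions, and the only inputs taken from the text are the formula for $\alpha$ in (6.9) and the definition $\gamma = 1 - \beta$.

```python
from fractions import Fraction


def alpha(d, tau, beta):
    """Exponent alpha of (6.9): (tau - d + 1)/(tau + 1) + beta - 1."""
    return Fraction(tau - d + 1) / Fraction(tau + 1) + beta - 1


def gamma(beta):
    """Exponent gamma of (6.10): gamma = 1 - beta."""
    return 1 - beta


# Hypothetical sample values (not from the paper): d = 2, tau = 3.
d, tau = 2, Fraction(3)
beta_min = Fraction(d) / (tau + 1)   # alpha > 0 exactly when beta > d/(tau + 1)
beta = Fraction(7, 8)

a = alpha(d, tau, beta)
g = gamma(beta)
# The formula in (6.9) simplifies to alpha = beta - d/(tau + 1).
simplified = beta - beta_min
```

With these sample values $\alpha = 3/8$ and $\gamma = 1/8$, and $\alpha$ vanishes exactly at $\beta = d/(\tau+1)$, which is the threshold stated after (6.9).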
For $A \in B_{\delta/2}(\bar A)$ define $\omega' \equiv \omega'(A) \equiv \omega - \varepsilon \, \eta^{(1)}(A)$. By varying $A \in B_{\delta/2}(\bar A)$, the corresponding values $\omega'$ cover a set $\Omega$, whose distance from $\omega$ is of order $\varepsilon$: this follows from (6.7), as $\eta^{(1)}(A) = -\partial_A f_0^{(1)}(A)$. Then we can suppose that $a$ is such that $\Omega \subset B_{a\varepsilon}(\omega)$.

Call $\Omega' \subset \Omega$ the set of vectors in $\Omega$ satisfying (6.8). By the condition (6.7) and by the bound (6.9) one has that the set of values $A \in B_{\delta/2}(\bar A)$ such that $\omega'(A) \in \Omega'$ has complement with measure bounded by $m(B_{\delta/2}(\bar A)) \, d \, \varepsilon^{\alpha}$, for some constant $d$. Then one has, for $A \in B_{\delta/2}(\bar A)$ and $A' \in B_\delta(\bar A)$,
$$ \begin{aligned} \omega' + \eta(A', \varepsilon, \omega') &= \omega - \varepsilon \, \eta^{(1)}(A) + \eta(A', \varepsilon, \omega') \\ &= \omega - \varepsilon \, \eta^{(1)}(A) + \eta(A, \varepsilon, \omega') + \eta(A', \varepsilon, \omega') - \eta(A, \varepsilon, \omega') \\ &= \omega + \varepsilon^{1+\gamma} \xi(A, \varepsilon, \omega') + \eta(A', \varepsilon, \omega') - \eta(A, \varepsilon, \omega') , \end{aligned} \tag{6.12} $$
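where $\xi(A, \varepsilon, \omega')$ admits the bound in (6.10), while $\eta(A', \varepsilon, \omega') - \eta(A, \varepsilon, \omega')$ is bounded from below through (6.11). Choose $\beta$ so that $\alpha > \gamma$ and $a$ as said above: for $\varepsilon$ small enough one has $\delta > 4 \Xi \varepsilon^{\gamma} / M$, so that we conclude that, when varying $A'$ in $B_{\delta/2}(A) \subset B_\delta(\bar A)$, one reaches a value $A'$ such that
$$ \varepsilon^{1+\gamma} \xi(A, \varepsilon, \omega') + \eta(A', \varepsilon, \omega') - \eta(A, \varepsilon, \omega') = \varepsilon \left( \eta^{(1)}(A') - \eta^{(1)}(A) \right) + \varepsilon^{1+\gamma} \xi(A', \varepsilon, \omega') = 0 \tag{6.13} $$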
so that (6.12) gives $\omega' + \eta(A', \varepsilon, \omega') = \omega$.

Of course the value $A'$ depends on $A$: we want to prove that if we take two different values $A$ and $\tilde A$ and we denote by $A'$ and $\tilde A'$, respectively, the corresponding values that one finds by following the above procedure, then one has
$$ A' - \tilde A' = (1 + O(\varepsilon^{\gamma})) (A - \tilde A) + O((A - \tilde A)^2) . \tag{6.14} $$
To prove the last assertion we shall use that, to each perturbative order $k$ and for all $\omega', \omega'' \in I$, if $\partial_\omega \eta^{(k)}(A, \omega'')$ denotes the formal derivative of $\eta^{(k)}(A, \omega)$ with respect to $\omega$ computed in $\omega''$, then one has
$$ \eta^{(k)}(A, \omega') - \eta^{(k)}(A, \omega'') = \partial_\omega \eta^{(k)}(A, \omega'') (\omega' - \omega'') + O((\omega' - \omega'')^2) , \tag{6.15} $$
uniformly in $A \in B_{\rho'}(\bar A)$. This follows from the analysis in [5] (where in fact a much stronger result is proved, i.e. $C^\infty$ differentiability in the sense of Whitney, [18]), and it can likely be proved also with the techniques used in the present paper; see Sec. 6.3 below.

Then choose two values $A$ and $\tilde A$, and call $\omega'$ and $\tilde\omega'$ the corresponding values $\omega' = \omega - \varepsilon \, \eta^{(1)}(A)$ and $\tilde\omega' = \omega - \varepsilon \, \eta^{(1)}(\tilde A)$: suppose that $A$ and $\tilde A$ are both such that $\omega', \tilde\omega' \in \Omega'$. The values $A'$ and $\tilde A'$ are defined so that
$$ \begin{aligned} \left( \eta^{(1)}(A') - \eta^{(1)}(A) \right) + \varepsilon^{\gamma} \xi(A', \varepsilon, \omega') &= 0 , \\ \left( \eta^{(1)}(\tilde A') - \eta^{(1)}(\tilde A) \right) + \varepsilon^{\gamma} \xi(\tilde A', \varepsilon, \tilde\omega') &= 0 , \end{aligned} \tag{6.16} $$
(see (6.13)). The difference between the two equations gives
$$ \left( \eta^{(1)}(A') - \eta^{(1)}(\tilde A') \right) + \varepsilon^{\gamma} \left( \xi(A', \varepsilon, \omega') - \xi(\tilde A', \varepsilon, \tilde\omega') \right) = \left( \eta^{(1)}(A) - \eta^{(1)}(\tilde A) \right) , \tag{6.17} $$
where one can write
$$ \begin{aligned} \eta^{(1)}(A') - \eta^{(1)}(\tilde A') &= \partial_A \eta^{(1)}(A'_*) (A' - \tilde A') , \\ \eta^{(1)}(A) - \eta^{(1)}(\tilde A) &= \partial_A \eta^{(1)}(A_*) (A - \tilde A) , \end{aligned} \tag{6.18} $$
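and
$$ \begin{aligned} \xi(A', \varepsilon, \omega') - \xi(\tilde A', \varepsilon, \tilde\omega') &= \xi(A', \varepsilon, \omega') - \xi(\tilde A', \varepsilon, \omega') + \xi(\tilde A', \varepsilon, \omega') - \xi(\tilde A', \varepsilon, \tilde\omega') \\ &= \partial_A \xi(A'_\bullet, \varepsilon, \omega') (A' - \tilde A') + \partial_\omega \xi(\tilde A', \varepsilon, \tilde\omega') (\omega' - \tilde\omega') + O((\omega' - \tilde\omega')^2) , \end{aligned} \tag{6.19} $$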
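for suitable $A'_*$, $A_*$, $A'_\bullet$ (for instance $A'_*$ is a value between $A'$ and $\tilde A'$, and so on); by construction one has
$$ \omega' - \tilde\omega' = \varepsilon \, \eta^{(1)}(\tilde A) - \varepsilon \, \eta^{(1)}(A) = \varepsilon \, \partial_A \eta^{(1)}(A_*) (\tilde A - A) , \tag{6.20} $$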
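and $\partial_\omega \xi(\tilde A', \varepsilon, \tilde\omega')$ is the derivative (in the sense of Whitney) of $\xi$ with respect to the third argument, according to (6.15). Therefore (6.17) reads as
$$ \begin{aligned} & \partial_A \eta^{(1)}(A'_*) (A' - \tilde A') + \varepsilon^{\gamma} \partial_A \xi(A'_\bullet, \varepsilon, \omega') (A' - \tilde A') \\ & \qquad + \varepsilon^{1+\gamma} \partial_\omega \xi(\tilde A', \varepsilon, \tilde\omega') \, \partial_A \eta^{(1)}(A_*) (\tilde A - A) + O((\tilde A - A)^2) \\ & \quad = \partial_A \eta^{(1)}(A_*) (A - \tilde A) , \end{aligned} \tag{6.21} $$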
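so that we obtain (up to corrections $O((\tilde A - A)^2)$)
$$ A' - \tilde A' = \left( \partial_A \eta^{(1)}(A'_*) + \varepsilon^{\gamma} \partial_A \xi(A'_\bullet, \varepsilon, \omega') \right)^{-1} \left( 1 + \varepsilon^{1+\gamma} \partial_\omega \xi(\tilde A', \varepsilon, \tilde\omega') \right) \partial_A \eta^{(1)}(A_*) (A - \tilde A) , \tag{6.22} $$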
which implies (6.14): hence the map $A \to A'$ is differentiable where it is defined (a property which indeed follows by the smoothness in the sense of Whitney).

As we have seen before, the set of values $A \in B_{\delta/2}(\bar A)$ for which $\omega' = \omega'(A)$ belongs to $\Omega'$ has measure bounded from below by $m(B_\delta(\bar A))(1 - d\varepsilon^{\alpha})$; the condition (6.14) then implies that also the set of values $A'$ representing the average values of the action variables along the tori with rotation vector in $\Omega'$ has measure bounded from below by $m(B_\delta(\bar A))(1 - d'\varepsilon^{\alpha})$, for some constant $d'$.

Therefore we can reason as follows. Fix $\delta_0 = \bar\delta_0 \varepsilon^{\alpha}$, for some constant $\bar\delta_0$ independent of $\varepsilon$, and define $\mathcal{A}$ as the set of values $\bar A \in D$ such that $B_{\delta + \delta_0}(\bar A) \subset D$; of course the complement of the set $\mathcal{A}$ has measure bounded by $m(D) \, c \, \varepsilon^{\alpha}$. For each such $\bar A$ we can repeat the above discussion, by choosing $A = \bar A$, so that we obtain the following result: the set of the values $A' \equiv \bar A'$ which we find by solving the sequence of Eqs. (6.13) (with $A$, $A'$ replaced with $\bar A$, $\bar A'$) has complement (in $D$) with measure bounded by $m(D) \, c' \varepsilon^{\alpha}$, for some constant $c'$ independent of $\varepsilon$.

By using the fact that the solution can be written in the form (6.5) (and recalling that $A_0(\varepsilon)$ depends also on $\omega'$) we can reason as above to prove that, for fixed $\varepsilon$, the actions $A(t)$ are differentiable in $A_0 = A_0(\varepsilon)$ (in the sense of Whitney) and with derivative close to 1, so that also the action variables corresponding to motions on invariant tori fill the set $D$ up to a set of measure of order $O(\varepsilon^{\alpha})$. This answers the question we asked above about the measure in phase space of the persisting invariant tori.

6.3. Differentiability in the sense of Whitney. The functions $h$, $H$, $\eta$ are well defined for $\omega' \in I$ and admit bounds which are uniform for such $\omega'$; they are known to be extendible to functions which are $C^\infty$ in the sense of Whitney. In particular they can be extended to continuous functions: this simply means that (6.15) as well as the equivalent expressions for the functions $h$ and $H$ hold for any $\omega', \omega'' \in I$.

To prove this one has to compare the Lindstedt series expansions for $\eta^{(k)}(A, \omega')$ and $\eta^{(k)}(A, \omega'')$. Here we simply sketch how the analysis could be performed. One can write both $\eta^{(k)}(A, \omega')$ and $\eta^{(k)}(A, \omega'')$ as sums over trees (see (3.17)), so that the difference $\eta^{(k)}(A, \omega') - \eta^{(k)}(A, \omega'')$ can be written as a sum of differences of pairs of tree values, the first computed with the rotation vector $\omega'$ and the second with the rotation vector $\omega''$. For the trees of each pair there is a difference between the propagators appearing in the corresponding values, while the remaining factors of the tree values are equal to each other. Therefore each difference can be decomposed as a sum of values corresponding to trees whose lines $\ell$ all have propagators of the form either
$$ \frac{1}{\omega' \cdot \nu_\ell} , \tag{6.23} $$
or
$$ \frac{1}{\omega'' \cdot \nu_\ell} , \tag{6.24} $$
up to one, say $\tilde\ell$, which has a new propagator given by the difference
$$ \frac{1}{\omega' \cdot \nu_{\tilde\ell}} - \frac{1}{\omega'' \cdot \nu_{\tilde\ell}} . \tag{6.25} $$
Furthermore one can arrange the decomposition in such a way that the set $\Lambda'$ of the lines with propagator (6.23) is connected and contains the root line (except when $\tilde\ell$ is the root line: in such a case there is no line with propagator of the form (6.23)), while the set $\Lambda''$ of the lines with propagator (6.24) is formed by disjoint sets $\Lambda''_1, \ldots, \Lambda''_N$, each of which is connected. In particular this means that, if for $j = 1, \ldots, N$ we call $\theta''_j$ the set of lines in $\Lambda''_j$ and of the nodes connected by such lines, and, analogously, $\theta'$ the set of lines in $\Lambda'$ and of the nodes connected by such lines, then each set $\theta''_j$ is a subtree which has as root either a node of $\theta'$ or the node which $\tilde\ell$ exits from; for each node $v$ in $\theta'$ call $\theta_{j_{v1}}, \ldots, \theta_{j_{vN_v}}$ the subtrees which have $v$ as root.

For the time being let us neglect the line $\tilde\ell$. If we denote by $\nu_j$ the momentum flowing through the root line of $\theta''_j$ and by $k_j$ its order, we see that when we sum over all the possible subtrees in $\mathcal{T}_{k_j, \nu_j}$ we obtain a quantity which admits the bound (2.5), with $k = k_j$ and $\nu_j$: in fact all the lines of such subtrees have a propagator of the form (6.24), so that the analysis of the previous sections applies. Then we can consider the set $\theta'$, which can be considered as a tree, provided that for each node $v$ we replace the mode label $\nu_v$ with $\nu_v + \nu_{j_{v1}} + \cdots + \nu_{j_{vN_v}}$. Again if we sum over all the possible trees $\theta'$, we obtain a bound like (2.5).

The newly introduced propagator (6.25) can be written as
$$ \frac{1}{\omega' \cdot \nu_{\tilde\ell}} - \frac{1}{\omega'' \cdot \nu_{\tilde\ell}} = - \frac{(\omega' - \omega'') \cdot \nu_{\tilde\ell}}{(\omega' \cdot \nu_{\tilde\ell}) (\omega'' \cdot \nu_{\tilde\ell})} , \tag{6.26} $$
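when
$$ |\nu_{\tilde\ell}| < \left( \frac{C}{2 |\omega' - \omega''|} \right)^{1/(\tau + 1)} , \tag{6.27} $$
while it can be only bounded with
$$ \left| \frac{1}{\omega' \cdot \nu_{\tilde\ell}} - \frac{1}{\omega'' \cdot \nu_{\tilde\ell}} \right| \le \left| \frac{1}{\omega' \cdot \nu_{\tilde\ell}} \right| + \left| \frac{1}{\omega'' \cdot \nu_{\tilde\ell}} \right| , \tag{6.28} $$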
when $\nu_{\tilde\ell}$ does not satisfy (6.27). In the first case we can add a line $\tilde\ell$ both to the set $\Lambda'$ and to the set $\Lambda''$, and reason as above: simply the bound (5.17) has to be multiplied by a factor 2 as the line $\tilde\ell$ has to be counted twice. In the latter case one can think of using the exponential decay of the Fourier coefficients in order to obtain a bound small in $|\omega' - \omega''|$; note that there is only one line with propagator of the form (6.25), so that we can use, say, the square root of the product of the Fourier coefficients of all the nodes preceding the line $\tilde\ell$ in order to create a quantity bounded by $\exp[-\kappa |\nu_{\tilde\ell}| / 2]$.

Of course the above analysis is only heuristic: but we think that it can be made easily rigorous by using the analysis of Sec. 5, with some minor changes. With similar arguments we can prove differentiability to all orders, i.e. that the functions $h$, $H$, $\eta$ are $C^\infty$ in the sense of Whitney.

6.4. Improving the results. In all the paper we assumed $\omega$ to be Diophantine. In that case the measure of the complement of the set of vectors $\omega'$ whose invariant tori persist under the perturbation, for fixed $\varepsilon$, is known to be exponentially small. This is obtained in the usual KAM theory by performing suitably many Birkhoff transformations before applying the KAM theorem. To see such a feature with the Lindstedt series one can reason in a way similar to what was done in [9] about the problem of the persistence of KAM tori for three time scale systems. Again we shall proceed at a heuristic level.

Suppose for simplicity $f$ to be a trigonometric polynomial in $\alpha$ of order $N$, so that to order $k$ one has $|\nu_\ell| \le kN$ for all $\ell \in \Lambda(\theta)$ and for all trees $\theta$ of order $k$. By setting $\omega' = \omega + \tilde\omega$, with $|\tilde\omega| < a\varepsilon$, one has
$$ |\omega' \cdot \nu| \ge |\omega \cdot \nu| - |\tilde\omega \cdot \nu| \ge \frac{C}{2} |\nu|^{-\tau} , \tag{6.29} $$
for all $\nu$ such that
$$ |\nu| \le N_0 \equiv \left( \frac{C}{2 a \varepsilon} \right)^{1/(\tau + 1)} , \tag{6.30} $$
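The estimate (6.29) under the cutoff (6.30) can be probed numerically on a finite set of integer vectors. The sketch below is an illustration under stated assumptions, not part of the argument: it takes $d = 2$, $\tau = 1$ and $\omega = (1, (1+\sqrt5)/2)$, uses Euclidean norms, and replaces the Diophantine constant $C$ by an empirical minimum over a finite box (so it checks the triangle-inequality step only on that box). All function names are hypothetical.

```python
import itertools
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def integer_vectors(dim, box):
    """All nonzero integer vectors with entries in [-box, box]."""
    for nu in itertools.product(range(-box, box + 1), repeat=dim):
        if any(nu):
            yield nu


def empirical_constant(omega, tau, box):
    """Largest C with |omega.nu| >= C |nu|^(-tau) on the finite box (not a proof)."""
    return min(abs(dot(omega, nu)) * norm(nu) ** tau
               for nu in integer_vectors(len(omega), box))


def check_bound(omega, omega_tilde, C, tau, a_eps, box):
    """Check (6.29) for every nonzero integer nu in the box with |nu| <= N0."""
    n0 = (C / (2.0 * a_eps)) ** (1.0 / (tau + 1.0))   # cutoff N0 of (6.30)
    omega_prime = [w + t for w, t in zip(omega, omega_tilde)]
    ok = all(abs(dot(omega_prime, nu)) >= 0.5 * C * norm(nu) ** (-tau)
             for nu in integer_vectors(len(omega), box)
             if norm(nu) <= n0)
    return ok, n0


random.seed(0)
omega = [1.0, (1.0 + math.sqrt(5.0)) / 2.0]   # assumed sample frequency vector
tau, box, a_eps = 1.0, 25, 1e-3               # assumed sample parameters
C = empirical_constant(omega, tau, box)

# A perturbation of the frequency with |omega_tilde| strictly below a*eps.
direction = [random.uniform(-1.0, 1.0) for _ in omega]
shrink = 0.999 * a_eps / norm(direction)
omega_tilde = [shrink * x for x in direction]

ok, n0 = check_bound(omega, omega_tilde, C, tau, a_eps, box)
```

The box is chosen so that it contains every integer vector with $|\nu| \le N_0$; otherwise the check would silently skip part of the range covered by (6.30).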
So for all $k$ such that $k < k_0 \equiv N_0 / N$ one has that (6.29) applies in order to bound all propagators, and only at order $k = k_0$ can a line $\ell$ really appear for which one has to use the bound
$$ |\omega' \cdot \nu_\ell| > C \varepsilon^{\beta} |\nu_\ell|^{-\tau} . \tag{6.31} $$
To such an order $k_0$, by summing over all trees $\theta \in \mathcal{T}_{k_0, \nu}$, one obtains the bound
$$ B^{k_0} C^{-k_0} \varepsilon^{-\beta} , \tag{6.32} $$
for some constant $B$, and this suggests that to any order $k$ one has the bound
$$ \left( B C^{-1} \varepsilon^{-\beta N / N_0} \right)^{k} , \tag{6.33} $$
which requires
$$ B C^{-1} \varepsilon^{-\beta N / N_0} \, \varepsilon < 1 , \tag{6.34} $$
that is $\beta N / N_0 < 1$. The proof of (6.33) can be performed by reasoning as in [9]. Therefore we can take
$$ \beta \le \beta_0 = \frac{N_0}{2N} = c \left( \frac{1}{\varepsilon} \right)^{1/(\tau + 1)} , \tag{6.35} $$
for some constant $c$.

Then, by using the analysis of Sec. 5, the discussion could be extended to analytic perturbations which are not necessarily trigonometric polynomials: it would be interesting to do so, and to compare the results arising by using the Lindstedt series (in particular the exponent $1/(\tau + 1)$) with the ones known from the usual KAM theory.

Appendix 1. Proof of Lemma 5.4

A1.1. Remark. In the following we shall use that if a line $\ell$ is on a scale $n$ then
$$ C_0 2^{n-2} \le |\omega' \cdot \nu_\ell| \le C_0 2^{n+1} . \tag{A1.1} $$
Without considering the renormalization procedure then (5.9) holds, and it implies (A1.1). We shall use (A1.1) instead of (5.9), as we have seen that the renormalization procedure can make the number of scales compatible with some lines increase (see Sec. 5.10), but, by Lemma 5.11, the change of the compatible scales with respect to the scale originally (i.e. before renormalizing) associated to the line can be at most by one unit, i.e. $n_\ell$ can be changed into $n'_\ell = n_\ell \pm 1$, so that (A1.1) will continue to hold also when the renormalization procedure is applied. This means that all the following analysis will be still valid for renormalized trees (as it has to be).
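The claim that (A1.1) survives a shift of the scale by one unit can be checked directly. The sketch below assumes that, before renormalization, a line is assigned the scale $n$ with $C_0 2^{n-1} < |\omega' \cdot \nu_\ell| \le C_0 2^{n}$, which is the form used in (A3.1); the precise statement of (5.9) is not reproduced here, so this assignment rule, the function names and the sample values are assumptions.

```python
import math


def scale_of(x, c0):
    """Integer n with c0 * 2**(n-1) < x <= c0 * 2**n (assumed assignment rule)."""
    n = math.ceil(math.log2(x / c0))
    # Correct possible floating-point rounding of log2 near exact powers of two.
    while x > c0 * 2.0 ** n:
        n += 1
    while x <= c0 * 2.0 ** (n - 1):
        n -= 1
    return n


def satisfies_a1_1(x, n, c0):
    """Two-sided bound (A1.1) for a line with |omega'.nu| = x placed on scale n."""
    return c0 * 2.0 ** (n - 2) <= x <= c0 * 2.0 ** (n + 1)


c0 = 0.5  # hypothetical value of the constant C_0
samples = [c0 * 2.0 ** e * f for e in range(-20, 1) for f in (0.51, 0.75, 1.0)]

# (A1.1) holds on the original scale and after a shift by one unit either way.
shift_ok = all(satisfies_a1_1(x, scale_of(x, c0) + s, c0)
               for x in samples for s in (-1, 0, 1))

# A downward shift by two units already violates the upper bound in (A1.1).
two_down_fails = not any(satisfies_a1_1(x, scale_of(x, c0) - 2, c0)
                         for x in samples)
```

The factor-of-two slack on each side of (A1.1), compared with the assumed assignment rule, is exactly what absorbs the change $n_\ell \to n_\ell \pm 1$.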
A1.2. Proof of Lemma 5.4. We prove inductively on the number of nodes of the trees the bounds
$$ \begin{aligned} N_n^*(\theta) &\le \max\{ 0, \, 2 M(\theta) 2^{(n+3)/\tau} - 1 \} , \\ p_n(\theta) &\le \max\{ 0, \, 2 M(\theta) 2^{(n+3)/\tau} - 1 \} , \\ R_n^4(\theta) &\le \max\{ 0, \, 2 M(\theta) 2^{(n+3)/\tau} - 1 + R_n^1(\theta) \} , \end{aligned} \tag{A1.2} $$
where $M(\theta)$ is defined in (5.3). The proof is inspired by [16] (see also [8]), and gives (5.14) and (5.15) with $c = 2^{2 + 3/\tau}$.

First of all note that if $M(\theta) < 2^{-(n+3)/\tau}$ then $N_n(\theta) = 0$ as in such a case for any line $\ell \in \Lambda(\theta)$ one has
$$ |\omega' \cdot \nu_\ell| > C_0 |\nu_\ell|^{-\tau} > C_0 M(\theta)^{-\tau} > C_0 2^{n+3} , \tag{A1.3} $$
by the Diophantine hypothesis (2.2).

A1.3. Bound on $N_n^*(\theta)$. If $\theta$ has only one node the bound is trivially satisfied as, if $v$ is the only node in $V(\theta)$, one must have $M(\theta) = |\nu_v| \ge 2^{-(n+3)/\tau}$ in order that the line exiting from $v$ be on scale $\le n$: then $2 M(\theta) 2^{(n+3)/\tau} \ge 2$.

If $\theta$ is a tree with $V > 1$ nodes, we assume that the bound holds for all trees having $V' < V$ nodes. Define $E_n = (2 \cdot 2^{(n+3)/\tau})^{-1}$: so we have to prove that $N_n^*(\theta) \le \max\{ 0, M(\theta) E_n^{-1} - 1 \}$.

If the root line $\ell$ of $\theta$ is either on scale $> n$ or resonant and on scale $\le n$, call $\theta_1, \ldots, \theta_m$ the $m \ge 1$ subtrees entering the last node of $\theta$ and with $M(\theta_i) \ge 2^{-(n+3)/\tau}$; see Fig. 6. Then
$$ N_n^*(\theta) = \sum_{i=1}^{m} N_n^*(\theta_i) , \tag{A1.4} $$
hence the bound follows by the inductive hypothesis.

If the root line $\ell$ is nonresonant and on scale $\le n$, call $\ell_1, \ldots, \ell_m$ the $m \ge 0$ lines on scale $\le n$ which are the nearest to $\ell$ (this means that no other line along the paths connecting the lines $\ell_1, \ldots, \ell_m$ to the root line is on scale $\le n$). Note that in such a case $\ell_1, \ldots, \ell_m$ are the entering lines of a cluster $T$ on scale $> n$; see Fig. 7.

Fig. 6. A tree $\theta$ consisting of a node $v_0$ and $m$ subtrees $\theta_1, \ldots, \theta_m$ with $M(\theta_i) \ge 2^{-(n+3)/\tau}$ entering $v_0$. The subtrees are represented by black balls. The subtree $\theta'$ has $M(\theta') < 2^{-(n+3)/\tau}$. The labels are not explicitly shown.

Fig. 7. A tree $\theta$ consisting of a line (root line) exiting from a cluster $T$ with $m$ entering lines $\ell_1, \ldots, \ell_m$. Each of such lines is the root line of a subtree which is represented as a black ball. The labels are not explicitly shown.

One has
$$ N_n^*(\theta) = 1 + \sum_{i=1}^{m} N_n^*(\theta_i) , \tag{A1.5} $$
so that the bound becomes trivial if either $m = 0$ or $m \ge 2$. If $m = 1$ then $T = \theta \setminus \theta_1$, $\ell$ and $\ell_1$ are both on scales $\le n$ and $\ell_1$ is not entering a resonance, so that
$$ |\omega' \cdot \nu_\ell| \le C_0 2^{n+1} , \qquad |\omega' \cdot \nu_{\ell_1}| \le C_0 2^{n+1} , \tag{A1.6} $$
and either $\nu_\ell = \nu_{\ell_1}$ and one has
$$ \sum_{v \in V(T)} |\nu_v| > (2 \cdot 2^{(n+3)/\tau})^{-1} = E_n , \tag{A1.7} $$
or $\nu_\ell \neq \nu_{\ell_1}$, otherwise $T$ would be a resonance. If $\nu_\ell \neq \nu_{\ell_1}$, then, by (A1.6) one has $|\omega' \cdot (\nu_\ell - \nu_{\ell_1})| \le C_0 2^{n+2}$, which, by the Diophantine condition (2.2), implies $|\nu_\ell - \nu_{\ell_1}| > 2^{-(n+2)/\tau}$, so that again
$$ \sum_{v \in V(T)} |\nu_v| \ge |\nu_\ell - \nu_{\ell_1}| > 2^{-(n+2)/\tau} > (2 \cdot 2^{(n+3)/\tau})^{-1} = E_n , \tag{A1.8} $$
as in (A1.7). Therefore in both cases we get
$$ M(\theta) - M(\theta_1) = \sum_{v \in T} |\nu_v| > E_n , \tag{A1.9} $$
which, inserted into (A1.5) with $m = 1$, gives, by using the inductive hypothesis,
$$ \begin{aligned} N_n^*(\theta) = 1 + N_n^*(\theta_1) &\le 1 + M(\theta_1) E_n^{-1} - 1 \\ &\le 1 + (M(\theta) - E_n) E_n^{-1} - 1 \le M(\theta) E_n^{-1} - 1 , \end{aligned} \tag{A1.10} $$
hence the bound is proved also if the root line is nonresonant and on scale ≤ n.
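The two elementary facts used in this step, the last inequality in (A1.8) and the arithmetic of the inductive step (A1.10), can be verified exactly by comparing base-2 exponents. The following is a minimal sketch with hypothetical helper names; the ranges of $n$ and $\tau$ are illustrative assumptions.

```python
from fractions import Fraction


def log2_E(n, tau):
    """log2 of E_n = (2 * 2**((n+3)/tau))**(-1), kept exact with Fractions."""
    return -1 - Fraction(n + 3) / Fraction(tau)


def a1_8_gap(n, tau):
    """log2 of 2**(-(n+2)/tau) minus log2 of E_n; positive means (A1.8) holds."""
    return -Fraction(n + 2) / Fraction(tau) - log2_E(n, tau)


def induction_step(m_over_e):
    """Chain of (A1.10) in units of E_n: 1 + (M/E_n - 1) - 1."""
    return 1 + (m_over_e - 1) - 1


# The gap is 1 + 1/tau for every scale n, hence strictly positive when tau > 0.
gaps = {(n, tau): a1_8_gap(n, tau)
        for n in range(-10, 5)
        for tau in (1, Fraction(3, 2), 5)}
```

Since the gap equals $1 + 1/\tau$ independently of $n$, the strict inequality $2^{-(n+2)/\tau} > E_n$ holds on every scale, and the chain in (A1.10) reduces to $M(\theta) E_n^{-1} - 1$ as stated.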
A1.4. Bound on $p_n(\theta)$. The bound is trivial if $M(\theta) < 2^{-(n+3)/\tau}$, as $N_n(\theta) = 0$ in such a case. Otherwise one can reason as follows.

If the last node $v_0$ of $\theta$ is not in a cluster on scale $n$ one has
$$ p_n(\theta) = \sum_{i=1}^{m} p_n(\theta_i) , \tag{A1.11} $$
if $\theta_1, \ldots, \theta_m$ are the subtrees entering $v_0$ and with $M(\theta_i) \ge 2^{-(n+3)/\tau}$, so that the bound follows from the inductive hypothesis.

If the last node $v_0$ of $\theta$ is inside a cluster $T$ on scale $n$ one has
$$ p_n(\theta) = 1 + \sum_{i=1}^{m_T} p_n(\theta_i) , \tag{A1.12} $$
where now $\theta_1, \ldots, \theta_{m_T}$ are the subtrees entering the cluster $T$. The only nontrivial case is the one with $m_T = 1$: in such a case there is only one line $\ell_1$ entering the cluster $T$, and its scale is strictly smaller than $n$, i.e. $n_{\ell_1} \le n - 1$. But then one has
$$ \sum_{v \in V(T)} |\nu_v| > 2^{-(n+3)/\tau} , \tag{A1.13} $$
as we are going to show. First note that $T$ has to contain at least one line on scale $n$. Call $\ell$ such a line: then
$$ C_0 2^{n-2} \le |\omega' \cdot \nu_\ell| \le C_0 2^{n+1} , \tag{A1.14} $$
by (A1.1). We can write $\nu_\ell = \nu_\ell^0 + \sigma_\ell \nu_1$, where $\nu_\ell^0$ is the sum of the mode labels of the nodes $w$ inside the resonance such that $w \preceq v$, if $\ell = \ell_v$, and $\nu_1$ is the momentum flowing through $\ell_1$ (see (5.11) and (5.12)). Therefore if (A1.13) did not hold one would have
$$ |\omega' \cdot \nu_\ell^0| \ge C_0 |\nu_\ell^0|^{-\tau} \ge C_0 \Big( \sum_{v \in V(T)} |\nu_v| \Big)^{-\tau} > C_0 2^{n+3} , \tag{A1.15} $$
and, as
$$ |\omega' \cdot \nu_1| \le C_0 2^{n_{\ell_1} + 1} \le C_0 2^{n} , \tag{A1.16} $$
one would have
$$ |\omega' \cdot \nu_\ell| \ge |\omega' \cdot \nu_\ell^0| - |\omega' \cdot \nu_1| \ge C_0 2^{n+3} - C_0 2^{n} \ge C_0 2^{n+2} , \tag{A1.17} $$
which is not consistent with (A1.14). Then note that (A1.13) implies $M(\theta_1) \le M(\theta) - E_n$, so that the bound follows.

A1.5. Bound on $R_n^4(\theta)$. If $\theta$ has only one node the bound is trivially satisfied. Let $\theta$ be a tree with $V > 1$ nodes and suppose that the bound on $R_n^4(\theta)$ holds for all trees with $V' < V$ nodes.
If the root line $\ell$ of $\theta$ is not the exiting line of a resonance or it is the exiting line of a resonance of type either 1 or 2 or 3, then call $\theta_1, \ldots, \theta_m$ the $m \ge 1$ subtrees entering the last node of $\theta$ and with $M(\theta_i) \ge 2^{-(n+3)/\tau}$. By the inductive hypothesis one has
$$ R_n^4(\theta_i) \le 2 M(\theta_i) 2^{(n+3)/\tau} - 1 + R_n^1(\theta_i) , \tag{A1.18} $$
for all $i = 1, \ldots, m$. Moreover
$$ R_n^1(\theta) = \sum_{i=1}^{m} R_n^1(\theta_i) , \tag{A1.19} $$
if the line $\ell$ does not exit from a resonance of type 1, while
$$ R_n^1(\theta) = 1 + \sum_{i=1}^{m} R_n^1(\theta_i) \tag{A1.20} $$
otherwise, so that
$$ \begin{aligned} R_n^4(\theta) = \sum_{i=1}^{m} R_n^4(\theta_i) &\le 2 \Big( \sum_{i=1}^{m} M(\theta_i) \Big) 2^{(n+3)/\tau} - m + R_n^1(\theta) \\ &\le 2 M(\theta) 2^{(n+3)/\tau} - 1 + R_n^1(\theta) , \end{aligned} \tag{A1.21} $$
as $m \ge 1$.

If the root line $\ell$ is the exiting line of a resonance $T$ of type 4, then $\ell$ is a line $h \leftarrow H$ and there is only one subtree $\theta_0$ entering $T$. One has
$$ R_n^1(\theta) = R_n^1(\theta_0) , \qquad R_n^4(\theta) = 1 + R_n^4(\theta_0) , \tag{A1.22} $$
and the root line $\ell_0$ of $\theta_0$ is a line $H \leftarrow h$ with momentum $\nu_{\ell_0}$ equal to $\nu_\ell$ and on scale $\le n$ (as $T$ is a resonance of type 4). Call $\ell_1, \ldots, \ell_m$ the $m \ge 0$ lines on scale $\le n$ preceding $\ell_0$ and which are the nearest to $\ell_0$: by construction they enter a cluster $T_0$ having $\ell_0$ as exiting line; see Fig. 8.

If either $m = 0$ or $m \ge 2$ the bound follows easily from the inductive hypothesis. If $m = 1$, denote by $\ell_1$ the line entering $T_0$; see Fig. 9. If $\nu_{\ell_1} = \nu_{\ell_0}$ and
$$ \sum_{v \in V(T_0)} |\nu_v| < E_n , \tag{A1.23} $$
then $T_0$ is a resonance, and in such a case it is a resonance of type 1 if $\ell_1$ is a line $h \leftarrow H$ and of type 2 if $\ell_1$ is a line $H \leftarrow h$; otherwise $T_0$ is not a resonance. In the latter case (i.e. if $T_0$ is not a resonance) one has
$$ \sum_{v \in V(T_0)} |\nu_v| > E_n , \tag{A1.24} $$
Fig. 8. A tree $\theta$ consisting of a line (root line) exiting from a resonance $T$ of type 4 whose entering line $\ell_0$ is the exiting line of a cluster $T_0$ with $m$ entering lines $\ell_1, \ldots, \ell_m$. Each of such lines is the root line of a subtree which is represented as a black ball. The labels are not explicitly shown.

Fig. 9. A tree $\theta$ consisting of a line (root line) exiting from a resonance $T$ of type 4 whose entering line $\ell_0$ is the exiting line of a cluster $T_0$ with only one entering line $\ell_1$. The subtree $\theta_1$ having $\ell_1$ as root line is represented as a black ball. The labels are not explicitly shown.
which is obvious if $\nu_{\ell_1} = \nu_{\ell_0}$ (by definition of resonance) and follows from the Diophantine condition if $\nu_{\ell_1} \neq \nu_{\ell_0}$ (see (A1.8)); then if $T_0$ is not a resonance one has
$$ M(\theta) - M(\theta_1) \ge M(\theta_0) - M(\theta_1) > E_n . \tag{A1.25} $$
Note also that if $T_0$ is not a resonance then
$$ R_n^1(\theta) = R_n^1(\theta_0) = R_n^1(\theta_1) , \qquad R_n^4(\theta) = 1 + R_n^4(\theta_0) = 1 + R_n^4(\theta_1) , \tag{A1.26} $$
so that, by the inductive hypothesis (applied to $\theta_1$) and by (A1.25), one has
$$ \begin{aligned} R_n^4(\theta) &\le 1 + M(\theta_1) E_n^{-1} - 1 + R_n^1(\theta_1) \\ &\le 1 + (M(\theta) - E_n) E_n^{-1} - 1 + R_n^1(\theta) \le M(\theta) E_n^{-1} - 1 + R_n^1(\theta) , \end{aligned} \tag{A1.27} $$
so that the assertion is proved.

If $T_0$ is a resonance of type 1, then
$$ R_n^1(\theta) = R_n^1(\theta_0) = 1 + R_n^1(\theta_1) , \qquad R_n^4(\theta) = 1 + R_n^4(\theta_0) = 1 + R_n^4(\theta_1) , \tag{A1.28} $$
so that, by the inductive hypothesis (applied to $\theta_1$), one has
$$ \begin{aligned} R_n^4(\theta) = 1 + R_n^4(\theta_1) &\le 1 + \left( 2 M(\theta_1) 2^{(n+3)/\tau} - 1 + R_n^1(\theta_1) \right) \\ &\le 2 M(\theta_1) 2^{(n+3)/\tau} - 1 + R_n^1(\theta) \le 2 M(\theta) 2^{(n+3)/\tau} - 1 + R_n^1(\theta) , \end{aligned} \tag{A1.29} $$
and the bound follows.
Finally, if $T_0$ is a resonance of type 2, then
$$ R_n^1(\theta) = R_n^1(\theta_0) = R_n^1(\theta_1) , \qquad R_n^4(\theta) = 1 + R_n^4(\theta_0) = 1 + R_n^4(\theta_1) , \tag{A1.30} $$
and the line $\ell_1$ entering $T_0$ is again a line $H \leftarrow h$ on scale $\le n$, so that one can repeat for $\theta_1$ the above argument (simply $\ell_1$ plays the rôle of $\ell_0$) and, if needed, iterate the analysis further until one of the two following possibilities arises: either one obtains a case for which the bound follows or one reaches a tree $\tilde\theta$ having only one node, so not containing any resonances at all (in particular no other resonances of type 1 and 4). Note that if the first possibility never arises, then the second one sooner or later is unavoidable as at each iterative step the number of the nodes is decreased. In such a case $R_n^4(\theta) = R_n^4(\tilde\theta) + 1 = 1$ and $R_n^1(\theta) = R_n^1(\tilde\theta) = 0$, and the bound holds as $M(\theta) 2^{(n+3)/\tau} \ge 1$ in order that there be at least one line on scale $\le n$ in $\theta$. So at the end the assertion follows in all cases.

Appendix 2. Proof of Lemma 5.9

A2.1. Introduction. We consider separately the first three types of resonances (for the type 4 there is nothing to prove). As in Sec. 3 we ignore the problem of the factorials, which, again, is solved by reasoning as in [2].

A2.2. Resonance of type 1. First we prove that $\sum_{\theta' \in \mathcal{F}_T(\theta)} V_T(0) = 0$. Given a tree $\theta$ consider all trees which can be obtained by shifting the entering line $\ell_T^2$; see Fig. 10. Note that the trees so obtained are contained in the resummation family $\mathcal{F}_T(\theta)$ introduced in Sec. 5.8.
Fig. 10. The trees obtained by shifting the line entering a two-node resonance. The black balls represent the remaining parts of the trees. The labels are not explicitly shown.

Corresponding to such an operation $V_T(0)$ changes by a factor $i\nu_v$ if $v$ is the node which the entering line is attached to, as all node factors and propagators do not change. By (5.10) the sum of all such values is zero.

Then consider $\partial V_T(0)$. By construction
$$ \partial V_T(0) = \sum_{\ell \in \Lambda(T)} \Big( \prod_{v \in V(T)} F_v \Big) \Big( \prod_{\ell' \in \Lambda(T) \setminus \ell} G_{\ell'}^{(n_{\ell'})} \Big) \, \partial G_\ell^{(n_\ell)} , \tag{A2.1} $$
where all propagators have to be computed for $\omega' \cdot \nu = 0$, and
$$ \partial G_\ell^{(n_\ell)} = \left. \frac{d}{dx} G_\ell^{(n_\ell)} (\omega' \cdot \nu_\ell^0 + x) \right|_{x=0} , \qquad x = \omega' \cdot \nu . \tag{A2.2} $$
The line $\ell$ divides $V(T)$ into two disjoint sets of nodes $V_1$ and $V_2$, such that $\ell_T^1$ exits from a node inside $V_1$ and $\ell_T^2$ enters a node inside $V_2$: if $\ell = \ell_v$ one has $V_2 = \{ w \in V(T) : w \preceq v \}$ and $V_1 = V(T) \setminus V_2$; see Fig. 4. By (5.10) if
$$ \nu_1 = \sum_{v \in V_1} \nu_v , \qquad \nu_2 = \sum_{v \in V_2} \nu_v , \tag{A2.3} $$
one has $\nu_1 + \nu_2 = 0$.

Then consider the families $\mathcal{F}_1(\theta)$ and $\mathcal{F}_2(\theta)$ of trees obtained as follows: $\mathcal{F}_1(\theta)$ is obtained from $\theta$ by detaching $\ell_T^1$ then reattaching it to all the nodes $w \in V_1$ and by detaching $\ell_T^2$ then reattaching it to all the nodes $w \in V_2$, while $\mathcal{F}_2(\theta)$ is obtained from $\theta$ by reattaching the line $\ell_T^1$ to all the nodes $w \in V_2$ and by reattaching the line $\ell_T^2$ to all the nodes $w \in V_1$, and, simultaneously, by replacing all lines $h \leftarrow H$ with lines $H \leftarrow h$ and vice versa; note that $\mathcal{F}_1(\theta) \cup \mathcal{F}_2(\theta) \subset \mathcal{F}_T(\theta)$. See Fig. 11 for a simple example.
Fig. 11. The two families $\mathcal{F}_1(\theta) = \{ \theta_1, \theta'_1 \}$ and $\mathcal{F}_2(\theta) = \{ \theta_2, \theta'_2 \}$ for a three-node resonance. Here $V_1 = \{ v_2, v_3 \}$ and $V_2 = \{ v_1 \}$. The black balls represent the remaining parts of the trees. The labels are not explicitly shown.
As a consequence of such an operation the arrows of some lines $\ell' \in \Lambda(\theta) \setminus \ell$ change their directions: this means that some line $h \leftarrow H$ becomes $H \leftarrow h$ and vice versa, and, correspondingly, some propagator $g(\nu)$ becomes $-g(-\nu)$, but $g(-\nu) = -g(\nu)$, so that no overall change is produced by such factors. On the other hand one has a derived propagator $\pm g'(\nu_\ell)$ for the trees in $\mathcal{F}_1(\theta)$ and a derived propagator $\mp g'(-\nu_\ell)$ for the trees in $\mathcal{F}_2(\theta)$, and $g'(\nu) = g'(-\nu)$. Then by summing over all the possible trees in $\mathcal{F}_1(\theta)$ we obtain a value $i^2 \nu_1 \nu_2$ times a common factor, while by summing over all the possible trees in $\mathcal{F}_2(\theta)$ we obtain $-i^2 \nu_1 \nu_2$ times the same common factor, so that the sum of the two sums gives zero.

A2.3. Resonance of type 3. To prove that $\sum_{\theta' \in \mathcal{F}_T(\theta)} V_T(0) = 0$ simply reason as for $V_T(0)$ in the previous case, by using that the entering line is a line $h \leftarrow H$.

A2.4. Resonance of type 2. We want to show that also in such a case $\sum_{\theta' \in \mathcal{F}_T} V_T(0) = 0$. Given a tree $\theta$ consider all trees obtained by detaching the exiting line, then reattaching it to all the nodes $v \in V(T)$; see Fig. 12. Note again that the trees so obtained are contained in the resummation family $\mathcal{F}_T(\theta)$.
Fig. 12. The trees obtained by shifting the line entering a two-node resonance. The black balls represent the remaining parts of the trees. The labels are not explicitly shown.
In such a case again some propagators can change, i.e. a line $h \leftarrow H$ can become $H \leftarrow h$ and vice versa, but the corresponding propagator does not change (reason as above for resonances of type 1). So at the end we obtain a common factor times $i\nu_v$, where $v$ is the node which the exiting line is attached to. By (5.10) again we obtain $\sum_{\theta' \in \mathcal{F}_T} V_T(0) = 0$. So the lemma is proved.

Appendix 3. Proof of Lemma 5.11

A3.1. Notations. As in Sec. 5.12 we define the depth $D(T)$ of a resonance $T$ by setting $D(T) = 1$ if there is no resonance containing $T$, and setting $D(T) = D(T') + 1$ if $T$ is contained inside a resonance $T'$ and all the other resonances inside $T'$ (if there are any) do not contain $T$. Given a resonance $T$, we denote here by $T_0$ the set of nodes and lines internal to $T$ not contained in any resonance inside $T$.

Given a resonance $T$ and a line $\ell \in T$ we write $\nu_\ell = \nu_\ell^0 + \sigma_\ell \nu$ as in (5.27), with $\nu = \nu_{\ell_T^2}$. By shifting the lines external to the resonances a momentum $\nu_\ell$ can be changed into $\pm \nu_\ell^0 + \sigma \nu$ (as noted in Sec. 5.8).

A3.2. Proof of Lemma 5.11. The proof is by induction on the depth $D$ of the resonances. If $T$ is a resonance with depth $D(T) = 1$ and $\ell \in T_0$, denoting by $n$ the scale of the line $\ell_T^2$, then one must have $n_T \ge n + 3$ by definition of resonance, as can be easily proved. Moreover $n_\ell \ge n_T$ for all $\ell \in T$, by definition of cluster.
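The recursive definition of the depth $D(T)$ in A3.1 can be illustrated on a toy nested family. In the sketch below each resonance is modelled simply as a set of node labels and containment is set inclusion; this encoding, the function name and the sample family are assumptions made for illustration (lines and scales are ignored), and the family is assumed to be nested, so that the smallest resonance containing $T$ is unique.

```python
def depth(T, resonances):
    """Depth D(T): 1 if no resonance contains T, else D(T') + 1 with T' the
    smallest resonance strictly containing T (assumed unique)."""
    containing = [R for R in resonances if T < R]   # proper supersets of T
    if not containing:
        return 1
    parent = min(containing, key=len)
    return depth(parent, resonances) + 1


# Hypothetical nested family of resonances, given by their node labels.
outer = frozenset({1, 2, 3, 4, 5})
middle = frozenset({2, 3})
inner = frozenset({3})
separate = frozenset({6, 7})
family = [outer, middle, inner, separate]

depths = {name: depth(T, family)
          for name, T in (("outer", outer), ("middle", middle),
                          ("inner", inner), ("separate", separate))}
```

For a nested family the depth is one plus the number of resonances strictly containing $T$, which is why an induction on $D$ proceeds from the outermost resonances inwards.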
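Then, by denoting by $\nu$ the momentum of the line entering $T$, one has
$$ C_0 2^{n_\ell} \ge |\omega' \cdot \nu_\ell| > C_0 2^{n_\ell - 1} , \tag{A3.1} $$
while
$$ |\omega' \cdot \nu| \le C_0 2^{n} \le C_0 2^{n_T - 3} \le C_0 2^{n_\ell - 3} , \tag{A3.2} $$
so that
$$ \begin{aligned} |\omega' \cdot \nu_\ell^0| &\ge |\omega' \cdot \nu_\ell| - |\omega' \cdot \nu| \ge C_0 2^{n_\ell - 1} - C_0 2^{n_\ell - 3} \ge C_0 2^{n_\ell - 2} , \\ |\omega' \cdot \nu_\ell^0| &\le |\omega' \cdot \nu_\ell| + |\omega' \cdot \nu| \le C_0 2^{n_\ell} + C_0 2^{n_\ell - 3} \le C_0 2^{n_\ell + 1} , \end{aligned} \tag{A3.3} $$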
which proves the assertion for lines in $T_0$ with $D(T) = 1$.

Fix $D > 1$. Then suppose that the assertion holds for all resonances of depth $D' < D$: we show that then it holds also for resonances of depth $D$. Let $\ell$ be a line in $T_0$, for some resonance $T \in \mathcal{T}(\theta)$ of depth $D(T) = D$. By the inductive hypothesis $\ell_T^2$ is contained inside a resonance of depth $D - 1$, so that its scale can be changed at most by one unit, i.e., if one had $n_{\ell_T^2} = n$ before shifting the lines, the scale $n_{\ell_T^2}$ has become $n'_{\ell_T^2}$ with $n_{\ell_T^2} - 1 \le n'_{\ell_T^2} \le n_{\ell_T^2} + 1$: then $n'_{\ell_T^2} \le n_T - 2$. Therefore one has
$$ C_0 2^{n_\ell} \ge |\omega' \cdot \nu_\ell| > C_0 2^{n_\ell - 1} , \tag{A3.4} $$
while
$$ |\omega' \cdot \nu| \le C_0 2^{n+1} \le C_0 2^{n_T - 2} \le C_0 2^{n_\ell - 2} , \tag{A3.5} $$
so that again
$$ \begin{aligned} |\omega' \cdot \nu_\ell^0| &\ge |\omega' \cdot \nu_\ell| - |\omega' \cdot \nu| \ge C_0 2^{n_\ell - 1} - C_0 2^{n_\ell - 2} \ge C_0 2^{n_\ell - 2} , \\ |\omega' \cdot \nu_\ell^0| &\le |\omega' \cdot \nu_\ell| + |\omega' \cdot \nu| \le C_0 2^{n_\ell} + C_0 2^{n_\ell - 2} \le C_0 2^{n_\ell + 1} , \end{aligned} \tag{A3.6} $$
which proves the assertion for the lines in $T_0$ with $D(T) = D$.

Acknowledgments

We are indebted to L. Chierchia and especially to G. Gallavotti for many discussions.

References

[1] M. V. Bartuccelli, K. V. Georgiou and G. Gentile, “KAM theory, Lindstedt series and the stability of the upside-down pendulum”, to appear in Discrete Contin. Dyn. Syst. Ser. A.
[2] A. Berretti and G. Gentile, “Scaling properties for the radius of convergence of Lindstedt series: generalized standard maps”, J. Math. Pures Appl. (9) 79(7) (2000) 691–713.
[3] F. Bonetto, G. Gallavotti, G. Gentile and V. Mastropietro, “Lindstedt series, ultraviolet divergences and Moser’s theorem”, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 26(3) (1998) 545–593.
[4] F. Bonetto, G. Gallavotti, G. Gentile and V. Mastropietro, “Quasi linear flows on tori: regularity of their linearization”, Comm. Math. Phys. 192(3) (1998) 707–736.
[5] L. Chierchia and G. Gallavotti, “Smooth prime integrals for quasi-integrable Hamiltonian systems”, Nuovo Cimento B 67(2) (1982) 277–295.
[6] L. H. Eliasson, “Absolutely convergent series expansions for quasi-periodic motions”, Math. Phys. Electron. J. 2 (1996) 1–33.
[7] G. Gallavotti, “Quasi-integrable mechanical systems”, pp. 539–624 in Critical Phenomena, Random Systems, Gauge Theories, Les Houches, Session XLIII (1984), Vol. II, eds. K. Osterwalder and R. Stora, North Holland, Amsterdam, 1986.
[8] G. Gallavotti, “Twistless KAM tori”, Comm. Math. Phys. 164(1) (1994) 145–156.
[9] G. Gallavotti, G. Gentile and V. Mastropietro, “A field theory approach to Lindstedt series for hyperbolic tori in three time scales problems”, J. Math. Phys. 40(12) (1999) 6430–6472.
[10] G. Gentile, “Whiskered tori with prefixed frequencies and Lyapunov spectrum”, Dynamics and Stability of Systems 10(3) (1995) 269–308.
[11] G. Gentile and V. Mastropietro, “Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in Classical Mechanics: A review with some applications”, Rev. Math. Phys. 8(3) (1996) 393–444.
[12] A. N. Kolmogorov, “On the preservation of conditionally periodic motions”, Dokl. Akad. Nauk 96 (1954) 527–530; English translation in G. Casati and J. Ford, Stochastic behavior in classical and quantum Hamiltonians, Lecture Notes in Physics 93, Springer, Berlin, 1979.
[13] J. Moser, “Convergent series expansions for quasi-periodic motions”, Math. Ann. 169 (1967) 136–176.
[14] H. Poincaré, Les méthodes nouvelles de la mécanique céleste, Gauthier-Villars, Paris, Vol. I, 1892, Vol. II, 1893, Vol. III, 1899.
[15] J. Pöschel, Über Invariante Tori in Differenzierbaren Hamiltonschen Systemen, Bonner Mathematische Schriften 120, Universität Bonn, Mathematisches Institut, Bonn, 1980; “Integrability of Hamiltonian systems on Cantor sets”, Comm. Pure Appl. Math. 35(5) (1982) 653–696.
[16] J. Pöschel, “Invariant manifolds of complex analytic mappings near fixed points”, in Critical Phenomena, Random Systems, Gauge Theories, Les Houches, Session XLIII (1984), Vol. II, eds. K. Osterwalder and R. Stora, North Holland, Amsterdam, 1986.
[17] W. M. Schmidt, Diophantine Approximation, Lecture Notes in Mathematics 785, Springer, Berlin, 1980.
[18] H. Whitney, “Analytic extensions of differentiable functions defined in closed sets”, Trans. Amer. Math. Soc. 36(1) (1934) 63–89.
February 4, 2002 10:52 WSPC/148-RMP
00111
Reviews in Mathematical Physics, Vol. 14, No. 2 (2002) 173–198 © World Scientific Publishing Company
GROUND STATE PROPERTIES OF THE NELSON HAMILTONIAN: A GIBBS MEASURE-BASED APPROACH
VOLKER BETZ, FUMIO HIROSHIMA∗, JÓZSEF LŐRINCZI, ROBERT A. MINLOS† and HERBERT SPOHN
Zentrum Mathematik, Technische Universität München, Gabelsbergerstr. 49, 80290 München, Germany
∗Department of Mathematics and Physics, Setsunan University, Osaka 572-8508, Japan
†Dobrushin Mathematical Laboratory of the Institute for Information Transmission Problems of the Russian Academy of Sciences, Bol’shoy Karetny per. 19, Moscow, 101447, Russia
Received 25 June 2001 The Nelson model describes a quantum particle coupled to a scalar Bose field. We study properties of its ground state through functional integration techniques in case the particle is confined by an external potential. We obtain bounds on the average and the variance of the Bose field both in position and momentum space, on the distribution of the number of bosons, and on the position space distribution of the particle.
1. Introduction

Ground states in quantum mechanics can be analysed through two in essence distinct techniques. The obvious choice is the eigenvalue equation, $H\psi = E\psi$, which after all serves as a definition of the ground state. The second route is more indirect and uses positivity properties of the semigroup $e^{-tH}$, $t \ge 0$, which happen to be valid for many models. Through a Trotter type formula one can then represent ground state expectation values as averages with respect to a certain probability measure on a function space. This measure has the structure of a Gibbs measure and methods from statistical mechanics become available. The standard folklore is that for systems with a few degrees of freedom the eigenvalue equation is the more powerful approach, whereas for quantum fields with an infinite number of degrees of freedom one should employ functional integration. In fact in the latter case, the Hamiltonian $H$ is in general not available as a self-adjoint operator on some Hilbert space, and one uses functional integration techniques to define $H$ in the first place.

In this paper we investigate the Nelson model of a quantum particle confined by an external potential and coupled to a scalar Bose field. This is a borderline case: the model has a well-defined Hamiltonian $H$, cf. Sec. 2 below, as well as a
natural functional measure. There has been growing interest in this model recently in connection with a rigorous control of resonances and radiation damping [2]. Here we take up the technique of functional integration with the goal of establishing bounds on ground state expectations of physical interest. A basic qualitative question is how the coupling to the field modifies the localization of the particle. We will prove a pointwise exponential bound, even a superexponential bound if the potential is sufficiently confining, for the ground state density of the particle. They support the physical picture that coupling enhances localization. For the Bose field we study the fluctuations, which turn out to increase through the presence of the particle, and the average density in position and momentum space. For the latter we prove upper and lower bounds which are sharp enough to pin down the infrared divergent behaviour. Finally we establish superexponential bounds on the distribution of the boson number. The core of our paper is a “dictionary” which translates ground state expectations in Fock space into averages over the Gibbs measure. When this translation is applied to quantities of physical interest, the aforementioned bounds turn out to be a consequence of elementary inequalities. The only extra tool that we need is a diamagnetic type inequality for estimating the position density of the particle. Some of our bounds, possibly in weaker form, have been proved before by other means; we refer to Sec. 6 for a discussion. The Nelson model has the special feature that, as first observed by Feynman [4], one can integrate over the field degrees of freedom resulting in an effective action for the particle. Nelson [16] used this method in a study of the ultraviolet limit, which turned out to be the gateway to his famous work on Markov random fields. 
Since then the understanding of the probabilistic structure of the functional measure corresponding to Nelson’s model has improved considerably; we use the occasion to provide a concise and self-contained framework in Secs. 3 and 4.

The Nelson model with massless bosons is both infrared and ultraviolet divergent. As proved by Nelson [17] through operator techniques, the latter is of a rather mild nature, since only the energy has to be renormalized. In this work we simply assume the appropriate cutoffs at small and large $k$ to hold so that the Hamiltonian $H$ of (2.1) is a self-adjoint operator in Fock space with a unique ground state.

The functional integral for the Nelson model with massless bosons in dimension $d \ge 3$ is studied in [14]. The construction of the appropriate functional measure relies on a cluster expansion for the effective Gibbs measure on particle trajectories [13]. This model is infrared divergent in $d = 3$ and convergent for $d > 3$. Infrared divergence means in the language of functional integration that the time $t = 0$ path measure is singular with respect to the free $t = 0$ measure. In fact, this measure is absolutely continuous with respect to an appropriately shifted Gaussian measure, which then leads to a renormalized Hamiltonian $H_{\mathrm{ren}}$ in Fock space [15]. Arai [1] studies $H_{\mathrm{ren}}$ through operator techniques. We also refer to the monumental work of J. Fröhlich [5, 6] where ground state properties of the Nelson model with zero external potential are studied, including the removal of ultraviolet and infrared cutoffs.
2. Representation in Fock Space

The Hamiltonian of the model in Fock space is the operator
\[
H := H_p \otimes 1 + 1 \otimes H_f + H_I \tag{2.1}
\]
in $L^2(\mathbb{R}^d) \otimes \mathcal{F}$. We use $L^2(\mathbb{R}^d)$ to denote the set of the square integrable functions with respect to Lebesgue measure on $\mathbb{R}^d$, while we will write $L^2(\mu)$ for the square integrable functions with respect to any other measure $\mu$. $\mathcal{F}$ denotes the symmetric Fock space over $\mathbb{R}^d$, and
\[
H_p = -\frac{1}{2}\Delta + V , \qquad
H_f = \int \omega(k)\, a_k^* a_k \, dk , \qquad
H_I = \int \frac{1}{\sqrt{2\omega(k)}} \left( \hat\varrho(k) e^{ikq} a_k + \hat\varrho(-k) e^{-ikq} a_k^* \right) dk .
\]
We require the potential $V : \mathbb{R}^d \to \mathbb{R}$ of the Schrödinger operator $H_p$ to be of the form $V^+ - V^-$ with $V^+, V^- > 0$, $V^-$ in the Kato class $K_d$ (see [21]) and $V^+$ locally in $K_d$. In particular, $V$ can be the sum of a continuous function that is bounded below and a function having Coulomb singularities. In addition, we assume that $V$ is chosen such that $H_p$ has a unique ground state, i.e. $\inf \operatorname{spec}(H_p)$ is an eigenvalue of multiplicity one. As for $H_f$ and $H_I$, we require
\[
\omega(k) = \bar\omega(k) = \omega(-k) , \qquad \varrho(k) = \bar\varrho(k) , \tag{2.2}
\]
\[
0 < \omega(k) \ \text{except on a set of Lebesgue measure zero} , \tag{2.3}
\]
\[
\frac{\hat\varrho}{\sqrt{\omega}} \in L^2(\mathbb{R}^d) , \qquad \frac{\hat\varrho}{\omega} \in L^2(\mathbb{R}^d) . \tag{2.4}
\]
Here and henceforth $\hat f$ denotes Fourier transform, $f^\vee$ will be used for the inverse Fourier transform of $f$, and $\bar f$ denotes the complex conjugation of $f$.

For the convenience of the reader, we briefly recall the notions concerning symmetric Fock space involved in the above formulas. Denote by $L^2(\mathbb{R}^d)^{\hat\otimes n}$ the space of $L^2(\mathbb{R}^{dn})$-functions $f$ that are symmetric in the sense that for each $k_1, \ldots, k_n \in \mathbb{R}^d$ and each permutation $\pi$ of $\{1, \ldots, n\}$, we have $f(k_1, \ldots, k_n) = f(k_{\pi(1)}, \ldots, k_{\pi(n)})$. The symmetric Fock space $\mathcal{F}$ is the set of all $F = (f_0, f_1, \ldots) \in \bigoplus_{n=0}^\infty L^2(\mathbb{R}^d)^{\hat\otimes n}$ for which the direct sum norm,
\[
\|F\|_{\mathcal{F}}^2 = \sum_{n=0}^\infty \|f_n\|^2_{L^2(\mathbb{R}^d)^{\hat\otimes n}} , \tag{2.5}
\]
converges. Putting $\mathcal{F}^{(n)} = L^2(\mathbb{R}^d)^{\hat\otimes n}$, it follows from the polarisation formula for multilinear maps (see [19]) that $\mathcal{F}^{(n)}$ is spanned by linear combinations of functions of the form $f^{\otimes n} = f^{\hat\otimes n}$ with $f \in L^2(\mathbb{R}^d)$ (where we use the convention $f^{\otimes 0} \in \mathbb{C}$).
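Thus for defining linear operators on $\mathcal{F}$ it is sufficient to specify their action on these elements. Given such an $f^{\otimes n} \in \mathcal{F}^{(n)}$ and $g \in L^2(\mathbb{R}^d)$, we define
\[
a^*(g) f^{\otimes n} \equiv \int a_k^* g(k)\,dk \, f^{\otimes n} = \sqrt{n+1}\, f^{\otimes n} \hat\otimes g \in \mathcal{F}^{(n+1)} ,
\]
\[
a(g) f^{\otimes n} \equiv \int a_k g(k)\,dk \, f^{\otimes n} = \sqrt{n}\, \langle \bar g, f \rangle_{L^2(\mathbb{R}^d)} f^{\otimes (n-1)} \in \mathcal{F}^{(n-1)} \quad \text{for } n > 0 ,
\]
and $(a(g))(\mathcal{F}^{(0)}) = 0$. Here, $f^{\otimes n} \hat\otimes g$ is given by
\[
(f^{\otimes n} \hat\otimes g)(k_1, \ldots, k_{n+1}) = \frac{1}{n+1} \sum_{i=1}^{n+1} \Bigg( \prod_{j \ne i}^{n+1} f(k_j) \Bigg) g(k_i) .
\]
$a^*$ is called the Bose creation operator and $a$ the annihilation operator. Both of them are defined on the common domain $\{(f_0, f_1, \ldots) \in \mathcal{F} : \sum_{n=0}^\infty n \|f_n\|^2_{L^2(\mathbb{R}^d)^{\hat\otimes n}} < \infty\}$. Furthermore, $\langle F, a(g) G \rangle_{\mathcal{F}} = \langle a^*(\bar g) F, G \rangle_{\mathcal{F}}$ with $F, G$ in the above domain.

The operator $\int \omega(k) a_k^* a_k \, dk$ is the differential second quantisation of the multiplication operator $f \mapsto \omega f$ in $L^2(\mathbb{R}^d)$. In general, given an operator $B$ in $L^2(\mathbb{R}^d)$, the second quantisation $\Gamma(B)$ of $B$ is the operator in $\mathcal{F}$ with $\Gamma(B) f^{\otimes n} = (Bf)^{\otimes n}$. If $(B_t)_{t \ge 0}$ is a contraction semigroup on $L^2(\mathbb{R}^d)$ with generator $A$, then it is easy to see that $(\Gamma(B_t))_{t \ge 0}$ is a contraction semigroup on $\mathcal{F}$. The generator of this semigroup is then called the differential second quantisation of $A$. Explicitly,
\[
d\Gamma(A) f^{\otimes n} = \sum_{i=1}^n (Af) \hat\otimes f^{\otimes (n-1)} \tag{2.6}
\]
for all $f \in D(A)$. It follows that
\[
d\Gamma(\omega) f^{\otimes n} \equiv \int \omega(k) a_k^* a_k \, dk \, f^{\otimes n} = \sum_{i=1}^n (\omega f) \hat\otimes f^{\otimes (n-1)} . \tag{2.7}
\]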
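The definitions of $a^*(g)$ and $a(g)$ above can be tried out on a toy model. The following is a minimal sketch, not part of the original text: it assumes that $L^2(\mathbb{R}^d)$ is replaced by real vectors on a finite grid of `m` points (so that $\bar g = g$), and all function names are hypothetical. It evaluates the symmetrised product $f^{\otimes n} \hat\otimes g$ literally from its defining formula, so that the adjoint relation $\langle F, a(g) G \rangle_{\mathcal{F}} = \langle a^*(\bar g) F, G \rangle_{\mathcal{F}}$ can be checked numerically on product vectors.

```python
import itertools
import math


def inner(f, g):
    """Real inner product on R^m, standing in for the L^2 inner product."""
    return sum(a * b for a, b in zip(f, g))


def tensor_power(f, n):
    """n-fold tensor power of f, stored as {(k_1, ..., k_n): value}."""
    m = len(f)
    return {ks: math.prod(f[k] for k in ks)
            for ks in itertools.product(range(m), repeat=n)}


def sym_tensor_with(f, n, g):
    """Symmetrised product of the n-fold tensor power of f with g:
    value at (k_1..k_{n+1}) is 1/(n+1) * sum_i g(k_i) * prod_{j != i} f(k_j)."""
    m = len(f)
    out = {}
    for ks in itertools.product(range(m), repeat=n + 1):
        total = 0.0
        for i in range(n + 1):
            total += g[ks[i]] * math.prod(
                f[ks[j]] for j in range(n + 1) if j != i)
        out[ks] = total / (n + 1)
    return out


def creation(g, f, n):
    """a*(g) applied to the n-fold tensor power of f."""
    return {ks: math.sqrt(n + 1) * v
            for ks, v in sym_tensor_with(f, n, g).items()}


def annihilation(g, f, n):
    """a(g) applied to the n-fold tensor power of f (n > 0, real g)."""
    coeff = math.sqrt(n) * inner(g, f)
    return {ks: coeff * v for ks, v in tensor_power(f, n - 1).items()}


def fock_inner(F, G):
    """Inner product of two n-particle components given as dicts."""
    return sum(F[ks] * G[ks] for ks in F)
```

For example, with three-point vectors `f`, `h`, `g`, the numbers `fock_inner(tensor_power(h, 2), annihilation(g, f, 3))` and `fock_inner(creation(g, h, 2), tensor_power(f, 3))` should agree up to rounding. The brute-force enumeration of index tuples is chosen for transparency, not efficiency.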
For a self-adjoint operator $A$, both $\Gamma(A)$ and $d\Gamma(A)$ are self-adjoint. For every $\varepsilon > 0$ there exists $b > 0$ such that
\[
\|H_I g\| \le \varepsilon \left\| \frac{\hat\varrho}{\omega} \right\|_{L^2} \|H_f g\| + b \left\| \frac{\hat\varrho}{\sqrt{\omega}} \right\|_{L^2} \|g\| \tag{2.8}
\]
for all $g \in D(H_f)$. Thus by the Kato–Rellich theorem, conditions (2.4) ensure that $H$ is self-adjoint on $D(\Delta \otimes 1) \cap D(1 \otimes H_f)$ and bounded from below.

3. Representation in Function Space

In this section, we develop the Schrödinger representation of $H$, i.e. we find an operator $\tilde H$ which is unitarily equivalent to $H$ and acts in an $L^2$-space. Moreover, $\tilde H$ will be the generator of a Markov process.
We begin our construction by applying the so-called ground state transformation to $H_p$. Let us write $\psi_0$ for the strictly positive, unique ground state of $H_p$. The operator of multiplication with $\psi_0$ will be denoted by $\psi_0$ as well. $\psi_0$ is a unitary map from $L^2(\psi_0^2\,dq)$ to $L^2(\mathbb{R}^d)$, and thus the ground state transform $\tilde H_p = \psi_0^{-1} H_p \psi_0$ of $H_p$ is unitarily equivalent to $H_p$. $\tilde H_p$ acts in $L^2(\psi_0^2\,dq)$, and $\tilde H_p$ is the generator of a stationary $\mathbb{R}^d$-valued $P(\phi)_1$-process, i.e. the stationary solution of the SDE
\[
dq_t = (\nabla(\log \psi_0))(q_t)\,dt + dB_t .
\]
We will denote the path measure of the $P(\phi)_1$-process by $\mathcal{N}^0$, and its stationary measure by $\mathsf{N}^0$. Note that $d\mathsf{N}^0(q) = \psi_0^2(q)\,dq$.

In order to construct an $L^2$-space for the bosonic field, consider the space $\mathcal{S}' = \mathcal{S}'(\mathbb{R}^d)$ of tempered distributions and the space $\mathcal{S} = \mathcal{S}(\mathbb{R}^d)$ of (real-valued) Schwartz functions. Write $\mathcal{G}$ for the Gaussian measure on paths $\xi = \{\xi_t : t \in \mathbb{R}\}$ ($\xi_t \in \mathcal{S}'$) with mean 0 and covariance
\[
\mathcal{G}(\xi_s(f)\xi_t(g)) = \int \overline{\hat f(k)}\,\hat g(k)\, \frac{1}{2\omega(k)}\, e^{-\omega(k)|t-s|}\,dk \tag{3.1}
\]
for all $f \in \mathcal{S}$ with $\int |\hat f|^2(k)/\omega(k)\,dk < \infty$. $\mathcal{G}$ is the measure of an $\mathcal{S}'$-valued Ornstein–Uhlenbeck process, i.e. a stationary Gaussian Markov process with state space contained in $\mathcal{S}'$. The stationary measure of $\mathcal{G}$ will be denoted by $\mathsf{G}$. It is the Gaussian measure on $\mathcal{S}'$ with mean 0 and covariance obtained by setting $t = s$ in (3.1).

To get some information about support properties and path continuity of $\mathcal{G}$, we define a Hilbert seminorm on $\mathcal{S}$ by
\[
\|f\|_{B_D}^2 = \int \overline{\hat f(k)} \max\{\omega(k), 1\}\, D(k, k') \max\{\omega(k'), 1\}\, \hat f(k')\,dk\,dk' ,
\]
where $D(k, k')$ is the integral kernel of $(-\Delta + |k|^2)^{-(d+1)}$. The completion of $\mathcal{S}$ with respect to $\|\cdot\|_{B_D}$ will be denoted by $B_D$. For $\mathcal{G}$-almost all $\xi$, the map $t \mapsto \xi_t$ takes its values from $B_D$ and is continuous with respect to the topology generated by $B_D$ [13].

When working with the measure $\mathsf{G}$ it is convenient to introduce the Hilbert space $\mathcal{K}$ obtained by completing $\mathcal{S}$ with respect to the (complex) scalar product
\[
\langle f, g \rangle_{\mathcal{K}} = \int \overline{\hat f(k)}\,\hat g(k)\, \frac{1}{2\omega(k)}\,dk \qquad (f, g \in \mathcal{S}) . \tag{3.2}
\]
Extending the action of $\xi$ to complex-valued functions by putting $\xi(f + ig) = \xi(f) + i\xi(g)$, we find that
\[
\int \overline{\xi(f)}\,\xi(g)\,d\mathsf{G}(\xi) = \langle f, g \rangle_{\mathcal{K}} \qquad \text{for all } f, g \in \mathcal{K} .
\]
In particular, the map $\xi \mapsto \xi(f)$ is a well-defined element of $L^2(\mathsf{G})$ for each $f \in \mathcal{K}$; we will denote it by $\xi(f)$.
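The time dependence in (3.1) is that of an Ornstein–Uhlenbeck process, one for each momentum $k$. As a small illustration that is not part of the original text, the sketch below isolates a single mode with frequency `omega` (an assumed scalar; function names are hypothetical) and checks two elementary consequences of the kernel $e^{-\omega|t-s|}/(2\omega)$: the factorisation of the covariance through an intermediate time, which is the Gaussian form of the Markov property, and the covariance obtained after conditioning on the value at time 0, which has the same structure as the conditional covariance used in Sec. 5.

```python
import math


def ou_cov(s, t, omega):
    """Single-mode analogue of (3.1): exp(-omega * |t - s|) / (2 * omega)."""
    return math.exp(-omega * abs(t - s)) / (2.0 * omega)


def markov_factorised_cov(s, t, u, omega):
    """Covariance between times s and u routed through an intermediate time t.

    For a stationary Gaussian Markov process and s <= t <= u this must
    coincide with ou_cov(s, u, omega).
    """
    return ou_cov(s, t, omega) * ou_cov(t, u, omega) / ou_cov(t, t, omega)


def cov_given_time_zero(s, t, omega):
    """Covariance of (xi_s, xi_t) conditional on the value of xi_0.

    Standard Gaussian conditioning: c(s, t) - c(0, s) * c(0, t) / c(0, 0).
    """
    return (ou_cov(s, t, omega)
            - ou_cov(0.0, s, omega) * ou_cov(0.0, t, omega)
            / ou_cov(0.0, 0.0, omega))
```

With these definitions, `cov_given_time_zero(s, t, omega)` reduces to $(e^{-\omega|t-s|} - e^{-\omega(|t|+|s|)})/(2\omega)$, which can be compared directly with the closed form.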
The connection between the Fock space $\mathcal{F}$ and $L^2(\mathsf{G})$ is given by the Wiener–Itô–Segal isomorphism. In order to describe this isomorphism, we need Wick polynomials. The Wick polynomial of order $n$ with respect to $\mathsf{G}$ is defined recursively by
\[
{:}\xi(f)^0{:} = 1 , \qquad {:}\xi(f){:} = \xi(f) ,
\]
\[
\sqrt{n}\, {:}\xi(f_1) \cdots \xi(f_n){:} = {:}\xi(f_1) \cdots \xi(f_{n-1}){:}\; \xi(f_n) - \frac{1}{\sqrt{n-1}} \sum_{i=1}^{n-1} \int \xi(f_i)\xi(f_n)\,d\mathsf{G}(\xi)\; {:}\prod_{j \ne i}^{n-1} \xi(f_j){:} . \tag{3.3}
\]
Note that this differs from the convention used e.g. in white noise theory by a factor of $\sqrt{n!}$ in the $n$th Wick polynomial. The reason is that there, the Fock space norm is usually defined with an $n!$ in front of the $n$th term in (2.5), while we use the convention from [17]. The Wiener–Itô–Segal isomorphism now is the map
\[
\theta : \mathcal{F} \to L^2(\mathsf{G}) , \qquad f_1 \hat\otimes \cdots \hat\otimes f_n \mapsto {:}\prod_{i=1}^n \xi\big((\sqrt{2\omega}\, f_i)^\vee\big){:} . \tag{3.4}
\]
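To see what the normalisation in (3.3) amounts to, one can specialise to a single test function, $f_1 = \cdots = f_n = f$, and write $\sigma^2 = \|f\|_{\mathcal{K}}^2$ for the variance of $\xi(f)$. The recursion then reads $\sqrt{n}\,{:}\xi^n{:} = \xi\,{:}\xi^{n-1}{:} - \sqrt{n-1}\,\sigma^2\,{:}\xi^{n-2}{:}$. The sketch below is an illustration under this one-dimensional assumption, with hypothetical function names. It builds the polynomials from the recursion and evaluates Gaussian expectations exactly from the moments of $N(0, \sigma^2)$, so that the orthogonality relation $\langle {:}\xi^m{:}, {:}\xi^n{:} \rangle = \delta_{mn}\,\sigma^{2n}$, which matches $\|f^{\otimes n}\|^2 = \|f\|^{2n}$ on the Fock space side, can be checked.

```python
import math


def wick_polys(n_max, sigma2):
    """Coefficient lists (lowest degree first) of :xi^n: for n = 0..n_max,
    from sqrt(n) W_n = x W_{n-1} - sqrt(n-1) * sigma2 * W_{n-2}."""
    polys = [[1.0], [0.0, 1.0]]
    for n in range(2, n_max + 1):
        prev, prev2 = polys[n - 1], polys[n - 2]
        new = [0.0] * (n + 1)
        for k, c in enumerate(prev):      # multiply :xi^{n-1}: by xi
            new[k + 1] += c
        for k, c in enumerate(prev2):     # subtract the contraction term
            new[k] -= math.sqrt(n - 1) * sigma2 * c
        polys.append([c / math.sqrt(n) for c in new])
    return polys[: n_max + 1]


def gaussian_moment(k, sigma2):
    """E[x^k] for x ~ N(0, sigma2): zero for odd k, (k-1)!! sigma2^(k/2) else."""
    if k % 2:
        return 0.0
    double_factorial = 1.0
    for j in range(1, k, 2):
        double_factorial *= j
    return double_factorial * sigma2 ** (k // 2)


def expect_product(p, q, sigma2):
    """E[p(x) q(x)] for polynomials given as coefficient lists."""
    return sum(a * b * gaussian_moment(i + j, sigma2)
               for i, a in enumerate(p) for j, b in enumerate(q))
```

Calling `expect_product(polys[m], polys[n], sigma2)` on the output of `wick_polys` should return `sigma2 ** n` for `m == n` and zero otherwise, up to rounding.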
A carefully done proof of the fact that $\theta$ is indeed an isomorphism can be found in [10], although there a different norm convention is used for the Fock space.

Remark 3.1. The fact that the Fourier transform is part of our version of the Wiener–Itô–Segal isomorphism is somewhat inconvenient and leads to aesthetically slightly unsatisfactory formulas. We could have avoided this by defining the Gaussian process $\mathcal{G}$ on distributions that produce real numbers when applied to Fourier transforms of real-valued functions, and omitting the hats in (3.1). However, since this also does not seem to be the most natural thing to do, we decided to stick to the established convention [13, 14].

Let us now describe the images of $H_f$ and $H_I$ under $\theta$. From (3.4) and (2.7) it is easy to see that
\[
\tilde H_f\, {:}\xi(f_1) \cdots \xi(f_n){:} \equiv (\theta H_f \theta^{-1})\, {:}\xi(f_1) \cdots \xi(f_n){:} = \sum_{i=1}^n {:}\xi\big((\omega \hat f_i)^\vee\big) \prod_{j \ne i}^n \xi(f_j){:} . \tag{3.5}
\]
Note that $\tilde H_f$ is the generator of the process $\mathcal{G}$ [18]. On the other hand, the unitary map $\psi_0 \otimes 1$ commutes with $H_I$. Thus writing $\Theta = \psi_0^{-1} \otimes \theta$, we easily see from (3.3) and (3.4) that for $g \in L^2(\mathsf{N}^0)$ and $f \in \mathcal{K}$,
\[
\tilde H_I (g \otimes {:}\xi(f)^n{:}) \equiv (\Theta H_I \Theta^{-1})(g \otimes {:}\xi(f)^n{:}) = (g \otimes {:}\xi(f)^n{:})\, \xi(\varrho(\cdot - q)) = (g \otimes {:}\xi(f)^n{:}) \cdot (\xi * \varrho)(q) . \tag{3.6}
\]
Of course, $\xi(\varrho(\cdot - q))$ in the above means the map $(\xi, q) \mapsto \xi(\varrho(\cdot - q))$. Extending (3.6) by linearity, we find that $\tilde H_I$ is the operator of multiplication with $(q, \xi) \mapsto (\xi * \varrho)(q)$. In sum, we find
\[
\tilde H \equiv \Theta H \Theta^{-1} = \tilde H_p \otimes 1 + 1 \otimes \tilde H_f + \tilde H_I . \tag{3.7}
\]
The operator $\tilde H_p \otimes 1 + 1 \otimes \tilde H_f$ acting in $L^2(\mathsf{N}^0 \otimes \mathsf{G}) = L^2(\mathsf{N}^0) \otimes L^2(\mathsf{G})$ is the generator of a stationary Markov process. We will denote the measure $\mathcal{N}^0 \otimes \mathcal{G}$ corresponding to this Markov process by $\mathcal{P}^0$, and its stationary measure $\mathsf{N}^0 \otimes \mathsf{G}$ by $\mathsf{P}^0$.

From (3.7) we see that $\tilde H$ is the sum of the generator of a Markov process and a multiplication operator. Modulo technical assumptions (see below), this implies
\[
\langle F, e^{-t\tilde H} G \rangle_{L^2(\mathsf{P}^0)} = \int F(q_0, \xi_0)\, e^{-\int_0^t (\xi_s * \varrho)(q_s)\,ds}\, G(q_t, \xi_t)\,d\mathcal{P}^0 \qquad (F, G \in L^2(\mathsf{P}^0)) . \tag{3.8}
\]
(3.8) is called the Feynman–Kac–Nelson formula. Nelson [16] proved it by explicit approximation of $\tilde H_I$. However, since we have path continuity of $\mathcal{P}^0$ and $H_I$ is infinitesimally bounded with respect to $H_p \otimes 1 + 1 \otimes H_f$ (see (2.8)), the standard proof using the Trotter formula (see e.g. [20]) also works.

4. Gibbs Measures

The factor $\exp(-\int_0^t (\xi_s * \varrho)(q_s)\,ds)\,d\mathcal{P}^0$ appearing in (3.8) defines a finite measure on path space $C(\mathbb{R}, \mathbb{R}^d \times B_D)$. Normalizing it results in a probability measure with a Gibbsian structure for finite intervals (or in “finite volume”). We are going to investigate the existence of the infinite volume limit (i.e. $t \to \infty$) of this measure. The method we use here to prove existence relies on the following

Main assumption: $\tilde H$ has a normalized, positive ground state $\Psi \in L^2(\mathsf{P}^0)$.

We will require this assumption to be fulfilled throughout the remainder of the paper. Sufficient conditions for the existence and uniqueness of an $L^2$-ground state of $\tilde H$ are [22]

(i) $\hat\varrho/\sqrt{\omega} \in L^2$, $\hat\varrho/\omega \in L^2$,
(ii) $\hat\varrho/\omega^{3/2} \in L^2$,
(iii) $\displaystyle \Sigma - E_p > \int \frac{|\hat\varrho(k)|^2 k^2}{\omega(k)(2\omega(k) + k^2)}\,dk$,

where $\Sigma$ is the infimum of the essential spectrum of $H_p$, and $E_p = \inf \operatorname{spec} H_p$. In [8], more general particle-field couplings are allowed. When specialized to our setting, the assumptions in [8] correspond to $\Sigma = \infty$.

Let us briefly comment on the above conditions: (i) appears in Sec. 2 as Assumption (2.4) and was needed there to ensure existence and self-adjointness of $H$. In the context of Gibbs measures (i) is required for
the existence of the free energy $\lim_{T \to \infty} \frac{1}{T} \log(Z_T)$; see the definition of $Z_T$ below. (ii) is called the infrared cut-off condition. Under additional assumptions on $V$ and on the coupling strength $\int |\hat\varrho|^2/\omega\,dk$, (ii) is also necessary for the existence of an $L^2$ ground state [14]. Thus although we will explicitly assume (ii) to hold only after Theorem 4.1, implicitly it plays a role also there. (iii) is needed for currently available proofs. For the Pauli–Fierz model with external potential, Griesemer et al. [9] prove the existence of a ground state without such an extra assumption. Thus one would expect (i) and (ii) to suffice. Note that if $\lim_{|q| \to \infty} V(q) = \infty$, then $\Sigma = \infty$ and (iii) follows from (i).

Let us write $X = \{X_t : t \in \mathbb{R}\} = \{(q_t, \xi_t) : t \in \mathbb{R}\}$ for elements of $C(\mathbb{R}, \mathbb{R}^d \times B_D)$, and
\[
d\mathcal{P}_T(X) = \frac{1}{Z_T} \exp\left( -\int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds \right) d\mathcal{P}^0(X) \tag{4.1}
\]
for the finite volume Gibbs measure. Here $Z_T = \int \exp(-\int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds)\,d\mathcal{P}^0(X)$ is the partition function. In order to state our theorem about existence of the $T \to \infty$ limit of $\mathcal{P}_T$, we still need some preparations.

First let us recall the notion of local weak convergence: For a topological space $Y$ and an interval $S \subset \mathbb{R}$, denote by $\mathcal{F}_S$ the $\sigma$-field over $C(\mathbb{R}, Y)$ generated by the point evaluations with points in $S$. A sequence of probability measures $(\mu_n)$ on $C(\mathbb{R}, Y)$ is said to converge locally weakly to a measure $\mu$ if for each compact interval $S \subset \mathbb{R}$ and each bounded, $\mathcal{F}_S$-measurable function $F$, $\lim_{n \to \infty} \mu_n(F) = \mu(F)$.

Secondly, let $\Psi$ be the ground state of $\tilde H$, and put $\bar H = \tilde H - E_0$, where $E_0 = \inf \operatorname{spec}(\tilde H)$ is the ground state energy of $\tilde H$. Denote by $\mathcal{P}$ the unique probability measure on $C(\mathbb{R}, \mathbb{R}^d \times B_D)$ characterized by the conditions
\[
\int F\,d\mathcal{P} = \langle \Psi, f_1 e^{-(t_2 - t_1)\bar H} f_2 \cdots e^{-(t_n - t_{n-1})\bar H} f_n \Psi \rangle_{L^2(\mathsf{P}^0)} \tag{4.2}
\]
for all $F(X) = f_1(X_{t_1}) \cdots f_n(X_{t_n})$ with $f_1, \ldots, f_n \in L^\infty(\mathbb{R}^d \times B_D)$, $t_1 < \cdots < t_n$. Note that the r.h.s. of (4.2) in fact defines a probability measure because of $e^{-t\bar H}\Psi = \Psi$, $\|\Psi\|_{L^2(\mathsf{P}^0)} = 1$ and Kolmogorov’s consistency theorem.

Theorem 4.1. $\mathcal{P}_T \to \mathcal{P}$ in the topology of local weak convergence as $T \to \infty$. Moreover, $\mathcal{P}$ fulfills the DLR-equations with respect to the family $\{\mathcal{P}_T : T > 0\}$, i.e. for $F \in \mathcal{F}_{[-T,T]}$ and $\mathcal{P}$-almost all $\bar X \in C(\mathbb{R}, \mathbb{R}^d \times B_D)$,
\[
\mathcal{P}(F \,|\, \mathcal{F}_{[-T,T]^c})(\bar X) = \frac{1}{Z_T} \int F(X) \exp\left( -\int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds \right) d\mathcal{P}^{0,T}_{\bar X_{-T}, \bar X_T}(X) , \tag{4.3}
\]
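where $\mathcal{P}^{0,T}_{\bar X_{-T}, \bar X_T}$ is $\mathcal{P}^0$ conditional on $\{X_{\pm T} = \bar X_{\pm T}\}$. Hence, $\mathcal{P}$ is a Gibbs measure with respect to $\mathcal{P}^0$ for the interaction given by $\int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds$.

Proof. Let $S > 0$ and $F \in \mathcal{F}_{[-S,S]}$ be bounded. Since
\[
Z_T = \langle 1, e^{-2T\tilde H} 1 \rangle_{L^2(\mathsf{P}^0)} = \|e^{-T\tilde H} 1\|^2_{L^2(\mathsf{P}^0)} ,
\]
by using the Feynman–Kac formula and the Markov property of $\mathcal{P}^0$ we find that, for $T > S$,
\[
\begin{aligned}
\int F\,d\mathcal{P}_T = \frac{1}{\|e^{-T\tilde H} 1\|^2_{L^2(\mathsf{P}^0)}} \iint & (e^{-(T-S)\tilde H} 1)(X_{-S})\, (e^{-(T-S)\tilde H} 1)(X_S) \\
& \times \left( \int \exp\left( -\int_{-S}^{S} (\xi_s * \varrho)(q_s)\,ds \right) F(X)\,d\mathcal{P}^{0,S}_{X_{-S}, X_S}(X) \right) \\
& \times d\mathsf{P}^0(X_{-S})\,d\mathsf{P}^0(X_S) .
\end{aligned} \tag{4.4}
\]
By spectral theory, for any $\tau \in \mathbb{R}$ we have $e^{-(T-\tau)\bar H} 1 \to \langle \Psi, 1 \rangle \Psi$ as $T \to \infty$ in $L^2(\mathsf{P}^0)$. $\Psi$ is strictly positive, therefore $\langle \Psi, 1 \rangle > 0$ and
\[
\frac{1}{\|e^{-T\bar H} 1\|_{L^2(\mathsf{P}^0)}}\, e^{-(T-\tau)\bar H} 1 \xrightarrow{T \to \infty} \Psi \quad \text{in } L^2(\mathsf{P}^0) , \text{ for every fixed } \tau \in \mathbb{R} . \tag{4.5}
\]
Since $e^{-t\tilde H} = e^{-tE_0} e^{-t\bar H}$, it follows that
\[
\frac{1}{\|e^{-T\tilde H} 1\|_{L^2(\mathsf{P}^0)}}\, e^{-(T-\tau)\tilde H} 1 \xrightarrow{T \to \infty} e^{\tau E_0} \Psi \quad \text{in } L^2(\mathsf{P}^0) , \text{ for every fixed } \tau \in \mathbb{R} , \tag{4.6}
\]
and thus
\[
\begin{aligned}
\lim_{T \to \infty} \int F\,d\mathcal{P}_T = \iint & \Psi(X_{-S}) \Psi(X_S)\, e^{2SE_0} \left( \int e^{-\int_{-S}^{S} (\xi_s * \varrho)(q_s)\,ds} F(X)\,d\mathcal{P}^{0,S}_{X_{-S}, X_S}(X) \right) \\
& \times d\mathsf{P}^0(X_{-S})\,d\mathsf{P}^0(X_S) = \int F\,d\mathcal{P} .
\end{aligned}
\]
This shows local weak convergence, and (4.2) as well as (4.3) now follow from the last equation by using the Feynman–Kac formula and the Markov property of $\mathcal{P}^0$.

From (4.2) it is immediate that $\mathcal{P}$ is the measure of a stationary Markov process and
\[
\frac{d\mathsf{P}}{d\mathsf{P}^0} = \Psi^2 , \tag{4.7}
\]
where $\mathsf{P}$ is the stationary measure of $\mathcal{P}$.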
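The spectral-theory step behind (4.5) can be illustrated on a finite-dimensional toy model. The sketch below is not taken from the paper: it assumes that $\tilde H$ is replaced by a small symmetric matrix with a strictly positive ground state (here minus the adjacency matrix of a three-site chain, chosen only because its ground state is known in closed form), and the function names and the stepping scheme are hypothetical. It applies $e^{-TH}$ to the constant vector $1$ by short Taylor steps and normalises. Since normalisation removes the scalar factor $e^{-TE_0}$, this is the same as normalising $e^{-T\bar H} 1$, and the result approaches the ground state as $T$ grows.

```python
import math

# Toy Hamiltonian: minus the adjacency matrix of a three-site chain.
# Its lowest eigenvalue is -sqrt(2) with the positive eigenvector below.
TOY_H = [[0.0, -1.0, 0.0],
         [-1.0, 0.0, -1.0],
         [0.0, -1.0, 0.0]]
TOY_GROUND_STATE = [0.5, 1.0 / math.sqrt(2.0), 0.5]


def matvec(H, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


def apply_semigroup(H, v, t, steps=200, order=12):
    """Approximate exp(-t H) v by `steps` Taylor steps of length t / steps."""
    dt = t / steps
    for _ in range(steps):
        term, acc = list(v), list(v)
        for k in range(1, order + 1):
            # term becomes (-dt H)^k v / k!
            term = [-dt * x / k for x in matvec(H, term)]
            acc = [a + b for a, b in zip(acc, term)]
        v = acc
    return v


def normalized_flow(H, t):
    """exp(-t H) 1 divided by its norm."""
    w = apply_semigroup(H, [1.0] * len(H), t)
    n = norm(w)
    return [x / n for x in w]
```

Comparing `normalized_flow(TOY_H, T)` with `TOY_GROUND_STATE` for increasing `T` shows the convergence; the rate is governed by the spectral gap of the matrix, in line with the role of the isolated ground state in the proof.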
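From now on we assume that
\[
C_\varrho = 2 \sup_{q \in C(\mathbb{R}, \mathbb{R}^d)} \int_{-\infty}^{0} ds \int_0^\infty dt\, |W(q_s - q_t, s - t)| = \int \frac{|\hat\varrho|^2}{2\omega^3}\,dk < \infty , \tag{4.8}
\]
with the pair potential $W$ as defined below. Note that (4.8) is actually the infrared condition. As discussed above, in general we will not have $\Psi \in L^2(\mathsf{P}^0)$ if the infrared condition is violated, and thus it is clear that (4.8) will be essential for the results of Secs. 5 and 6 to hold. Here we use (4.8) to describe some nice additional structure of $\mathcal{P}$.

Fix $\bar q \in C(\mathbb{R}, \mathbb{R}^d)$ and denote by $\mathcal{P}_T^{\bar q}$ the measure $\mathcal{P}_T$ conditional on $\{q = \bar q\}$. Note that the condition $\bar q$ appears as an upper index here, as opposed to the lower indices used in Theorem 4.1. The convention we use throughout is that conditioning on a path is denoted by upper indices, while conditioning on points is denoted by lower ones. $\mathcal{P}_T^{\bar q}$ is a Gaussian measure on $C(\mathbb{R}, B_D)$ with mean
\[
\int \xi_t(f)\,d\mathcal{P}_T^{\bar q}(\xi) = M^T_{t,\bar q}(f) = -\int_{-T}^{T} ds \int dk\, \frac{\hat f(k) \hat\varrho(k) e^{ik\bar q_s}}{2\omega(k)}\, e^{-\omega(k)|t-s|}
\]
($f \in \mathcal{K}$, $f$ real-valued, $t \in \mathbb{R}$) and covariance equal to that of $\mathcal{G}$. Since
\[
\left| \int_{-T}^{T} ds \int \frac{\hat f(k) \hat\varrho(k) e^{ik\bar q_s}}{2\omega(k)}\, e^{-\omega(k)|t-s|}\,dk \right|
\le \int_{-\infty}^{\infty} ds \int \frac{|\hat f|\,|\hat\varrho|\, e^{-\omega|t-s|}}{2\omega}\,dk
= \int \frac{|\hat f|\,|\hat\varrho|}{\omega^2}\,dk
\le \|f\|_{\mathcal{K}} \left( \int \frac{2|\hat\varrho|^2}{\omega^3}\,dk \right)^{1/2} < \infty , \tag{4.9}
\]
$M_{t,\bar q}(f) = \lim_{T \to \infty} M^T_{t,\bar q}(f)$ exists for all $\bar q$, $t$ and $f$. By the convergence theory for Gaussian measures it follows that $\mathcal{P}^{\bar q} = \lim_{T \to \infty} \mathcal{P}_T^{\bar q}$ exists in the topology of weak convergence, and is a Gaussian measure with mean $M_{t,\bar q}$ and the same covariance as $\mathcal{G}$.

Knowing the structure of $\mathcal{P}^{\bar q}$ for each $\bar q$, in order to understand $\mathcal{P}$ we need only study the distribution $\mathcal{N}$ of $\bar q$ under $\mathcal{P}$. This will then give us a convenient representation of $\mathcal{P}$ as a mixture of Gaussian measures that we will use in the next section. To obtain $\mathcal{N}$, let $f \in L^1(\mathcal{N}^0)$. Then, since $\xi \mapsto \int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds$ is linear and $\mathcal{G}$ is a Gaussian measure, the integration with respect to $\xi$ can be explicitly done and we obtain
\[
\int f(q)\, e^{-\int_{-T}^{T} (\xi_s * \varrho)(q_s)\,ds}\,d\mathcal{P}^0 = \int f(q) \exp\left( -\int_{-T}^{T} \int_{-T}^{T} W(q_s - q_t, s - t)\,ds\,dt \right) d\mathcal{N}^0
\]
with
\[
W(q, t) = -\frac{1}{2} \int \frac{|\hat\varrho(k)|^2}{2\omega(k)} \cos(kq)\, e^{-\omega(k)|t|}\,dk .
\]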
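As a consistency check that is not spelled out in the original text, the second equality in (4.8) can be recovered from this expression for $W$. Since $|\cos(kq)| \le 1$, one has $|W(q, t)| \le |W(0, t)|$ for every $q$, and $|s - t| = t - s$ for $s \le 0 \le t$, so that for any path $q$
\[
2 \int_{-\infty}^{0} ds \int_0^\infty dt\, |W(q_s - q_t, s - t)|
\le \int \frac{|\hat\varrho(k)|^2}{2\omega(k)} \left( \int_{-\infty}^{0} ds \int_0^\infty dt\, e^{-\omega(k)(t - s)} \right) dk
= \int \frac{|\hat\varrho(k)|^2}{2\omega(k)} \cdot \frac{1}{\omega(k)^2}\,dk
= \int \frac{|\hat\varrho|^2}{2\omega^3}\,dk ,
\]
with equality for constant paths, where the cosine equals one. The exchange of the $k$-integral with the time integrals is justified by Tonelli's theorem, all integrands being non-negative.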
From now on we thus look at the Gibbs measure
\[
d\mathcal{N}_T = \frac{1}{Z_T} \exp\left( -\int_{-T}^{T} \int_{-T}^{T} W(q_s - q_t, s - t)\,ds\,dt \right) d\mathcal{N}^0 .
\]
By taking $F = f \otimes 1$ in Theorem 4.1 we have the following

Corollary 4.2. There exists a probability measure $\mathcal{N}$ on $C(\mathbb{R}, \mathbb{R}^d)$ such that $\mathcal{N}_T \to \mathcal{N}$ in the topology of local weak convergence. $\mathcal{N}$ is the measure of a stationary $\mathbb{R}^d$-valued process. The stationary measure of $\mathcal{N}$ will be denoted by $\mathsf{N}$. We have
\[
\int f\,d\mathcal{N} = \int f \otimes 1\,d\mathcal{P}
\]
for each $f \in L^1(\mathcal{N})$, and
\[
\int F(q, \xi)\,d\mathcal{P}(q, \xi) = \int \left( \int F(q, \xi)\,d\mathcal{P}^q(\xi) \right) d\mathcal{N}(q) \tag{4.10}
\]
for each $F \in L^1(\mathcal{P})$.

Both $\mathcal{P}_T$ and $\mathcal{N}_T$ are finite volume Gibbs measures relative to the reference measures $\mathcal{P}^0$ and $\mathcal{N}^0$, respectively. Checking that $\mathcal{N}$ admits a DLR representation (i.e. is a Gibbs measure) is, however, slightly more involved than it was for $\mathcal{P}$ because $\mathcal{N}$ is no longer the measure of a Markov process. This is due to the long range pair potential $W$ that is picked up by integration over the field. The next theorem states that $\mathcal{N}$ does however fulfill the DLR-equations with respect to the family of measures
\[
d\mathcal{N}_T^{\bar q}(q) = \frac{1}{Z_T^{\bar q}} \exp\left( -\iint_{\Lambda_T} W(q_s - q_t, s - t)\,ds\,dt \right) d\mathcal{N}_T^{0,\bar q}(q) , \tag{4.11}
\]
where $\mathcal{N}_T^{0,\bar q}$ is the $P(\phi)_1$-measure conditioned on $\{q(s) = \bar q(s)\ \forall\, |s| > T\}$, and $\Lambda_T = ([-T, T] \times \mathbb{R}) \cup (\mathbb{R} \times [-T, T])$.

Theorem 4.3. $\mathcal{N}$ is a Gibbs measure for the family $\{\mathcal{N}_T^{\bar q} : T > 0\}$.

The proof of this theorem would interrupt the main line of the paper and is therefore moved to the Appendix. To conclude this section, let us note that if the infrared condition (condition (ii) above) is fulfilled, then the interaction energy between the path on the left and right half-lines is uniformly bounded. Thus [13] we expect $\mathcal{N}$ to be unique on the set of paths with at most logarithmic increase in this case.

5. Ground State Expectations as Gibbs Averages

We now establish an explicit formula for writing expressions of the form $\langle \Psi, B\Psi \rangle_{L^2(\mathsf{P}^0)}$ as Gibbs averages with respect to $\mathcal{N}$. Remember that $\Psi$ is the ground
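state of $\tilde H$. For stating our results we need Wick exponentials, which for $g \in \mathcal{K}$ are given by
\[
\mathcal{S}' \ni \xi \mapsto {:}\exp(\xi(g)){:} = \sum_{n=0}^\infty \frac{1}{\sqrt{n!}}\, {:}\xi(g)^n{:} . \tag{5.1}
\]
The following formulas hold for all $f, g \in \mathcal{K}$ and all bounded, self-adjoint operators $A$ in $L^2(\mathbb{R}^d)$:
\[
{:}\exp(\xi(g)){:} = \exp\left( \xi(g) - \frac{1}{2} \|g\|_{\mathcal{K}}^2 \right) , \tag{5.2}
\]
\[
\langle {:}\exp(\xi(f)){:}, {:}\exp(\xi(g)){:} \rangle_{L^2(\mathsf{G})} = \exp(\langle f, g \rangle_{\mathcal{K}}) , \tag{5.3}
\]
\[
\tilde\Gamma(A)\, {:}\exp(\xi(g)){:} = {:}\exp\Big( \xi\Big( \Big( \sqrt{\omega}\, A\, \frac{1}{\sqrt{\omega}}\, \hat g \Big)^{\!\vee} \Big) \Big){:} , \tag{5.4}
\]
\[
\langle {:}\exp(\xi(f)){:}, d\tilde\Gamma(A)\, {:}\exp(\xi(g)){:} \rangle_{L^2(\mathsf{G})} = \Big\langle f, \Big( \sqrt{\omega}\, A\, \frac{1}{\sqrt{\omega}}\, \hat g \Big)^{\!\vee} \Big\rangle_{\mathcal{K}} \exp(\langle f, g \rangle_{\mathcal{K}}) . \tag{5.5}
\]
Here we put $\tilde\Gamma(A) = \theta \Gamma(A) \theta^{-1}$ and $d\tilde\Gamma(A) = \theta\, d\Gamma(A)\, \theta^{-1}$. Moreover, for each $T \in [0, \infty]$, $q \in C(\mathbb{R}, \mathbb{R}^d)$ we define
\[
\hat f^+_{T,q}(k) = -\int_0^T \hat\varrho(k) e^{ikq_s} e^{-\omega(k)|s|}\,ds , \qquad
\hat f^-_{T,q}(k) = -\int_{-T}^0 \hat\varrho(k) e^{ikq_s} e^{-\omega(k)|s|}\,ds ,
\]
and write $\hat f_q^\pm(k)$ for $\hat f^\pm_{\infty,q}(k)$. Note that
\[
\langle f_q^-, f_q^+ \rangle_{\mathcal{K}} = -2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s) , \tag{5.6}
\]
and
\[
\|f_q^\pm\|_{\mathcal{K}}^2 \le \int \frac{|\hat\varrho(k)|^2}{2\omega(k)^3}\,dk = C_\varrho . \tag{5.7}
\]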
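Formulas (5.2) and (5.3) can be sanity-checked in the simplest situation of a single real Gaussian variable $x = \xi(g)$ with variance $\sigma^2 = \|g\|_{\mathcal{K}}^2$. The sketch below is an illustration under that one-dimensional assumption and is not part of the original text; the function names, the truncation order and the quadrature grid are arbitrary choices. It sums the series (5.1), using the normalised recursion from (3.3), and integrates against the Gaussian density by the trapezoidal rule.

```python
import math


def wick_exponential_series(x, sigma2, n_max=60):
    """Partial sum of (5.1) for one Gaussian variable with variance sigma2.

    Uses sqrt(n) W_n = x W_{n-1} - sqrt(n-1) * sigma2 * W_{n-2}.
    """
    w_prev2, w_prev = 1.0, x          # :xi^0: and :xi^1:
    total = w_prev2 + w_prev          # n = 0 and n = 1 terms
    sqrt_fact = 1.0                   # sqrt(n!) for n = 1
    for n in range(2, n_max + 1):
        w = (x * w_prev - math.sqrt(n - 1) * sigma2 * w_prev2) / math.sqrt(n)
        sqrt_fact *= math.sqrt(n)
        total += w / sqrt_fact
        w_prev2, w_prev = w_prev, w
    return total


def wick_exponential_closed(x, sigma2):
    """Right-hand side of (5.2): exp(x - sigma2 / 2)."""
    return math.exp(x - 0.5 * sigma2)


def gaussian_expectation(h, sigma2, half_width=12.0, n=20001):
    """Trapezoidal approximation of E[h(x)] for x ~ N(0, sigma2)."""
    step = 2.0 * half_width / (n - 1)
    norm_const = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    total = 0.0
    for i in range(n):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * h(x) * math.exp(-x * x / (2.0 * sigma2))
    return total * step * norm_const
```

The series value should agree with `wick_exponential_closed`, and the Gaussian expectation of the squared Wick exponential should be close to `math.exp(sigma2)`, which is (5.3) with $f = g$.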
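Theorem 5.1. Let $B$ be a bounded operator on $L^2(\mathsf{G})$. Then
\[
\langle \Psi, (1 \otimes B) \Psi \rangle_{L^2(\mathsf{P}^0)} = \int \langle {:}\exp(\xi(f_q^-)){:}, B\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})} \exp\left( 2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s) \right) d\mathcal{N}(q) . \tag{5.8}
\]

Proof. Put
\[
\Psi_T := \frac{1}{\|e^{-T\tilde H} 1\|}\, e^{-T\tilde H} 1 .
\]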
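Then by (3.8), we have in $L^2$-sense
\[
\Psi_T(\bar q, \bar\xi) = \frac{1}{\sqrt{Z_T}} \int \exp\left( -\int_0^T (\xi_s * \varrho)(q_s)\,ds \right) d\mathcal{P}^0_{\bar q, \bar\xi}(q, \xi) , \tag{5.9}
\]
where $\mathcal{P}^0_{\bar q, \bar\xi} = \mathcal{N}^0_{\bar q} \otimes \mathcal{G}_{\bar\xi}$ denotes the measure $\mathcal{P}^0 = \mathcal{N}^0 \otimes \mathcal{G}$ conditional on $\{q_0 = \bar q, \xi_0 = \bar\xi\}$. $\mathcal{G}_{\bar\xi}$ is a Gaussian measure with mean
\[
M_{\bar\xi,t}(f) \equiv \int \xi_t(f)\,d\mathcal{G}_{\bar\xi}(\xi) = \bar\xi\big( (e^{-|t|\omega} \hat f)^\vee \big) \qquad (t \in \mathbb{R},\ f \in \mathcal{S}) \tag{5.10}
\]
and covariance
\[
\int \xi_t(f) \xi_s(g)\,d\mathcal{G}_{\bar\xi}(\xi) - M_{\bar\xi,t}(f) M_{\bar\xi,s}(g) = \int \frac{\bar{\hat f}\, \hat g}{2\omega} \left( e^{-\omega|t-s|} - e^{-\omega(|t|+|s|)} \right) dk \qquad (s, t \in \mathbb{R}) . \tag{5.11}
\]
The proof of these formulas can be found in the Appendix. Now the integration with respect to $\mathcal{G}_{\bar\xi}$ in (5.9) can be carried out with the result
\[
\Psi_T(\bar q, \bar\xi) = \frac{1}{\sqrt{Z_T}} \int \exp(\bar\xi(f^+_{T,q})) \exp\left( \frac{1}{2} \int_0^T ds \int_0^T dt \int dk\, \frac{|\hat\varrho(k)|^2}{2\omega(k)} \cos(k(q_s - q_t)) \left( e^{-\omega(k)|t-s|} - e^{-\omega(k)(t+s)} \right) \right) d\mathcal{N}^0_{\bar q} . \tag{5.12}
\]
By (5.2) we have
\[
\exp(\bar\xi(f^+_{T,q})) = {:}\exp(\bar\xi(f^+_{T,q})){:}\, \exp\left( \frac{1}{2} \int_0^T ds \int_0^T dt \int dk\, \frac{|\hat\varrho(k)|^2}{2\omega(k)} \cos(k(q_s - q_t))\, e^{-\omega(k)(t+s)} \right) ,
\]
and hence
\[
\Psi_T(\bar q, \bar\xi) = \frac{1}{\sqrt{Z_T}} \int {:}\exp(\bar\xi(f^+_{T,q})){:}\, \exp\left( -\int_0^T ds \int_0^T dt\, W(q_s - q_t, s - t) \right) d\mathcal{N}^0_{\bar q} . \tag{5.13}
\]
By the time reversibility of $\mathcal{N}^0_{\bar q}$, also
\[
\Psi_T(\bar q, \bar\xi) = \frac{1}{\sqrt{Z_T}} \int {:}\exp(\bar\xi(f^-_{T,q})){:}\, \exp\left( -\int_{-T}^0 ds \int_{-T}^0 dt\, W(q_s - q_t, s - t) \right) d\mathcal{N}^0_{\bar q} \tag{5.14}
\]
holds. Now we write (5.14) for the left entry and (5.13) for the right entry of the scalar product $\langle \Psi_T, (1 \otimes B) \Psi_T \rangle$ and use the fact that for $\mathcal{F}_{[0,\infty[}$-measurable $f, g \in L^1(\mathcal{N}^0)$,
\[
\int \left( \int f\,d\mathcal{N}^0_{\bar q} \right) \left( \int g\,d\mathcal{N}^0_{\bar q} \right) d\mathsf{N}^0(\bar q) = \int f(q^+) g(q^-)\,d\mathcal{N}^0(q)
\]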
(with $q_s^+ = q_s$ and $q_s^- = q_{-s}$ for $s \ge 0$), to write $\langle \Psi_T, (1 \otimes B) \Psi_T \rangle$ as an integral with respect to $\mathcal{N}^0$. Then we add and subtract the term $2 \int_{-T}^0 ds \int_0^T dt\, W(q_s - q_t, s - t)$ in the exponent and incorporate the term with the minus sign into the measure $\mathcal{N}_T$. The result reads
\[
\langle \Psi_T, (1 \otimes B) \Psi_T \rangle_{L^2(\mathsf{N}^0 \otimes \mathsf{G})} = \int \langle {:}\exp(\xi(f^-_{T,q})){:}, B\, {:}\exp(\xi(f^+_{T,q})){:} \rangle_{L^2(\mathsf{G})} \exp\left( 2 \int_{-T}^0 ds \int_0^T dt\, W(q_t - q_s, t - s) \right) d\mathcal{N}_T(q) . \tag{5.15}
\]
This is the finite $T$ version of (5.8). It remains to justify the passing to the limit $T \to \infty$. On the left hand side of (5.15), this is immediate since $\Psi_T \to \Psi$ in $L^2(\mathsf{N}^0 \otimes \mathsf{G})$ and $B$ is continuous. On the right hand side, we already know that $\mathcal{N}_T \to \mathcal{N}$ in the topology of local weak convergence, and thus it only remains to show that the integrand converges uniformly in $q \in C(\mathbb{R}, \mathbb{R}^d)$. For the second factor of the integrand this is a consequence of (4.8). As for the first factor, we find that for $k \ne 0$,
\[
|\hat f^\pm_{T,q}(k)| \le \frac{|\hat\varrho(k)|}{\omega(k)} \quad \text{uniformly in } T \text{ and } q ,
\]
and
\[
\hat f^\pm_{T,q}(k) \xrightarrow{T \to \infty} \hat f^\pm_q(k) \quad \text{uniformly in } q .
\]
Thus ${:}\exp(\xi(f_q^+)){:}$ is well defined, and ${:}\exp(\xi(f^+_{T,q})){:} \to {:}\exp(\xi(f_q^+)){:}$ in $L^2(\mathsf{G})$ and uniformly in $q$ by dominated convergence. Since the same argument applies to $f^-_{T,q}$ and $B$ is continuous, the claim follows.
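Most operators of physical interest are not bounded. Therefore we need to extend formula (5.8) to unbounded operators.

Proposition 5.2. Let $B$ be a self-adjoint operator in $L^2(\mathsf{G})$ with
\[
\int \|B\, {:}\exp(\xi(f_q^\pm)){:}\|^2_{L^2(\mathsf{G})}\,d\mathcal{N}(q) < \infty . \tag{5.16}
\]
Then $\Psi \in D(1 \otimes B)$, and (5.8) holds.

Proof. Let $E$ be the projection valued measure corresponding to $B$, and let $B_N = \int_{-N}^{N} \lambda\,dE(\lambda)$ for $N \in \mathbb{N}$. Then $B_N$ is a bounded operator, hence (5.8) holds for $B_N$. Using (4.8) and the Cauchy–Schwarz inequality, we have
\[
\begin{aligned}
\|(1 \otimes B_N) \Psi\|^2_{L^2(\mathsf{N}^0 \otimes \mathsf{G})}
&= \int \langle {:}\exp(\xi(f_q^-)){:}, B_N^2\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})}\, e^{2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s)}\,d\mathcal{N}(q) \\
&\le e^{C_\varrho} \int \|B_N\, {:}\exp(\xi(f_q^-)){:}\|_{L^2(\mathsf{G})}\, \|B_N\, {:}\exp(\xi(f_q^+)){:}\|_{L^2(\mathsf{G})}\,d\mathcal{N}(q) \\
&\le e^{C_\varrho} \int \|B\, {:}\exp(\xi(f_q^-)){:}\|_{L^2(\mathsf{G})}\, \|B\, {:}\exp(\xi(f_q^+)){:}\|_{L^2(\mathsf{G})}\,d\mathcal{N}(q) ,
\end{aligned}
\]
which is finite according to (5.16). This shows that $\Psi \in D(1 \otimes B)$ and $(1 \otimes B_N) \Psi \to (1 \otimes B) \Psi$ as $N \to \infty$. On the other hand, it follows from (5.16) that
\[
{:}\exp(\xi(f_q^\pm)){:} \in D(B) \quad \text{for } \mathcal{N}\text{-almost all } q .
\]
From this we conclude
\[
\langle {:}\exp(\xi(f_q^-)){:}, B_N\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})} \to \langle {:}\exp(\xi(f_q^-)){:}, B\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})}
\]
for $\mathcal{N}$-almost all $q$ as $N \to \infty$. Since by (5.3) we have
\[
\langle {:}\exp(\xi(f_q^-)){:}, B_N\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})} \le \|{:}\exp(\xi(f_q^-)){:}\|_{L^2(\mathsf{G})}\, \|B_N\, {:}\exp(\xi(f_q^+)){:}\|_{L^2(\mathsf{G})} \le e^{2C_\varrho}\, \|B\, {:}\exp(\xi(f_q^+)){:}\|_{L^2(\mathsf{G})}
\]
for all $q$, and the right hand side of the above is $\mathcal{N}$-integrable, the dominated convergence theorem implies
\[
\int \langle {:}\exp(\xi(f_q^-)){:}, B_N\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})}\, e^{2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s)}\,d\mathcal{N}(q)
\to \int \langle {:}\exp(\xi(f_q^-)){:}, B\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})}\, e^{2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s)}\,d\mathcal{N}(q)
\]
as $N \to \infty$. This finishes the proof.

We now present one minor extension and two important special cases of (5.8).

Corollary 5.3. Let $g \in L^\infty(\mathbb{R}^d)$, and suppose $B$ satisfies the assumptions of Proposition 5.2. Then
\[
\langle \Psi, (g \otimes B) \Psi \rangle_{L^2(\mathsf{P}^0)} = \int \langle {:}\exp(\xi(f_q^-)){:}, B\, {:}\exp(\xi(f_q^+)){:} \rangle_{L^2(\mathsf{G})}\; g(q_0) \exp\left( 2 \int_{-\infty}^0 ds \int_0^\infty dt\, W(q_t - q_s, t - s) \right) d\mathcal{N}(q) .
\]
Here $g$ is again used to denote the operator of multiplication with $g$. Note that if $B$ is chosen to be the identity operator, then we arrive at $\langle \Psi, g \Psi \rangle_{L^2(\mathsf{P}^0)} = \int g(q_0)\,d\mathcal{N}$, a formula that also follows from Corollary 4.2.

Corollary 5.4. For $\beta > 0$ and $g \in \mathcal{K}$, put $M(\beta) = \langle \Psi, e^{\beta\xi(g)} \Psi \rangle_{L^2(\mathsf{P}^0)}$.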
M is the moment generating function for the random variable ξ 7→ ξ0 (g) under P, and Z M (β) = eβξ0 (g) dP(q, ξ)
Z =
exp
β2 2
Z
|ˆ g |2 dk − β 2ω
Z
Z
∞
ds −∞
%ˆ(k)ˆ g (k)eikqs −ω(k)|s| e dN . (5.17) dk 2ω(k)
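By (4.9), $M(\beta)$ is finite for all $\beta$, and hence
\[
\langle \Psi, \xi(g)^n \Psi \rangle_{L^2(\mathsf{P}^0)} = \frac{d^n}{d\beta^n} M(\beta)\Big|_{\beta = 0} \quad \text{for all } n \in \mathbb{N} . \tag{5.18}
\]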
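To make the use of (5.18) concrete, the first two derivatives can be worked out directly from (5.17). The following is only a sketch: it assumes that differentiation under the $\mathcal{N}$-integral is permitted and that $\mathcal{N}$ is a probability measure, and the shorthand $a$ and $X_q$ is introduced here purely for illustration (it does not appear in the original text).

```latex
% Illustrative shorthand (an assumption of this sketch, not notation from the paper):
%   a   = \int \frac{|\hat g(k)|^2}{2\omega(k)}\,dk ,
%   X_q = \int_{-\infty}^{\infty} ds \int dk\,
%         \frac{\hat\varrho(k)\,\hat g(k)\,e^{ikq_s}}{2\omega(k)}\,e^{-\omega(k)|s|} .
\begin{align*}
M(\beta)   &= \int \exp\!\Big(\tfrac{\beta^2}{2}\,a - \beta X_q\Big)\,d\mathcal{N}(q), \\
M'(\beta)  &= \int (\beta a - X_q)\,\exp\!\Big(\tfrac{\beta^2}{2}\,a - \beta X_q\Big)\,d\mathcal{N}(q),
  & M'(0)  &= -\int X_q\,d\mathcal{N}(q), \\
M''(\beta) &= \int \big(a + (\beta a - X_q)^2\big)\exp\!\Big(\tfrac{\beta^2}{2}\,a - \beta X_q\Big)\,d\mathcal{N}(q),
  & M''(0) &= a + \int X_q^2\,d\mathcal{N}(q).
\end{align*}
```

By (5.18), $M'(0)$ and $M''(0)$ are the first and second moments of $\xi(g)$ in the state $\Psi$.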
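Note that, although (5.17) can in principle be deduced from Proposition 5.2, it can be obtained more easily by using (4.10), i.e. by fixing $q \in C(\mathbb{R}, \mathbb{R}^d)$ and integrating the function $\xi \mapsto e^{\xi_0(g)}$ with respect to the corresponding Gaussian measure. The next statement deals with second quantisation and differential second quantisation of operators.

Corollary 5.5. Let $A$ be a bounded self-adjoint operator on $L^2(\mathbb{R}^d)$, and write $\tilde\Gamma(A) = \theta \Gamma(A) \theta^{-1}$ and $d\tilde\Gamma(A) = \theta\, d\Gamma(A)\, \theta^{-1}$. Then $\Psi \in D(\tilde\Gamma(A))$, $\Psi \in D(d\tilde\Gamma(A))$ and
\[
\begin{aligned}
\langle \Psi, \tilde\Gamma(A) \Psi \rangle_{L^2(\mathsf{P}^0)}
&= \int \exp\!\Big( \Big\langle f_q^-,\, \Big( \sqrt{\omega}\, A\, \tfrac{1}{\sqrt{\omega}}\, \hat f_q^+ \Big)^{\vee} \Big\rangle_{K} \Big)\,
e^{2 \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t,\, s-t)}\, d\mathcal{N}(q) , \\
\langle \Psi, d\tilde\Gamma(A) \Psi \rangle_{L^2(\mathsf{P}^0)}
&= \int \Big\langle f_q^-,\, \Big( \sqrt{\omega}\, A\, \tfrac{1}{\sqrt{\omega}}\, \hat f_q^+ \Big)^{\vee} \Big\rangle_{K}\, d\mathcal{N}(q) .
\end{aligned} \tag{5.19}
\]

Proof. First note that by (5.7) and boundedness of $A$, $\| (A \hat f_q^\pm)^{\vee} \|_{K}$ is uniformly bounded in $q$. On the other hand,
\[
\big\| \tilde\Gamma(A) :\exp(\xi(f_q^\pm)): \big\|^2_{L^2(\mathsf{G})}
= \exp\!\Big( \Big\| \Big( \sqrt{\omega}\, A\, \tfrac{1}{\sqrt{\omega}}\, \hat f_q^\pm \Big)^{\vee} \Big\|^2_{K} \Big)
\]
follows directly from (5.3) and (5.4). Furthermore,
\[
\big\| d\tilde\Gamma(A) :\exp(\xi(f_q^\pm)): \big\|^2_{L^2(\mathsf{G})}
= \Big( \Big\| \Big( \sqrt{\omega}\, A\, \tfrac{1}{\sqrt{\omega}}\, \hat f_q^\pm \Big)^{\vee} \Big\|^2_{K}
+ \Big\langle \Big( \sqrt{\omega}\, A\, \tfrac{1}{\sqrt{\omega}}\, \hat f_q^\pm \Big)^{\vee},\, f_q^\pm \Big\rangle^2_{K} \Big)\, e^{\| f_q^\pm \|^2_{K}}
\]
can be obtained from the definitions of differential second quantisation (2.6), Wick exponentials (5.1) and of $d\tilde\Gamma(A)$ above. Thus (5.16) is fulfilled, and Proposition 5.2 now gives $\Psi \in D(\tilde\Gamma(A))$ and $\Psi \in D(d\tilde\Gamma(A))$. Now that this is established, formulas (5.19) follow directly from (5.8) and (5.3) to (5.5).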
6. Bounds on Ground State Expectations

We are now ready to apply the results of the previous section in order to investigate some qualitative properties of the ground state of $H$.

Example 6.1 (Boson number distribution). Let $P_n$ be the projection onto the $n$th Fock space component (or $n$-boson sector). Then $\tilde P_n = \theta P_n \theta^{-1}$ is the projection onto the closure of the subspace spanned by $\{ :\xi(f)^n:,\ f \in K \} \subset L^2(\mathsf{G})$. By (5.1), we have
\[
\langle :\exp(\xi(f_q^-)):,\, \tilde P_n :\exp(\xi(f_q^+)): \rangle_{L^2(\mathsf{G})} = \frac{1}{n!}\, \langle f_q^+, f_q^- \rangle^n_{K} ,
\]
and with (5.6) and Theorem 5.1 we find
\[
p_n \equiv \langle \Psi, 1 \otimes \tilde P_n \Psi \rangle_{L^2(\mathsf{P}^0)}
= \frac{1}{n!} \int \Big( -2 \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t,\, s-t) \Big)^{n}
\exp\!\Big( 2 \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t,\, s-t) \Big)\, d\mathcal{N} .
\]
Obviously, $p_n$ is the probability of finding $n$ bosons in the ground state of $\tilde H$.
\[
p_n \le \frac{C_\varrho^{\,n}}{n!}\, e^{C_\varrho} . \tag{6.1}
\]
Denoting by $N = d\Gamma(1)$ the number operator, the superexponential bound (6.1) implies $\langle \Psi, e^{\alpha N} \Psi \rangle_{L^2(\mathsf{P}^0)} < \infty$ for each $\alpha > 0$ and is useful in the context of scattering theory [12]. Let us now assume in addition that $W(q, t) < 0$ for all $q$ and all $t$. This is true e.g. for the massive Nelson model with mass parameter $\kappa > 0$ and ultraviolet cutoff parameter $K \gg 1$, i.e. $\omega(k) = \sqrt{k^2 + \kappa^2}$, $\hat\varrho(k) = 1_{\{|k| \le K\}}$. Then there exists $D \le C_\varrho$ with
\[
\frac{D^n}{n!}\, e^{-C_\varrho} \le p_n \le \frac{C_\varrho^{\,n}}{n!} . \tag{6.2}
\]
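The step from (6.1) to finite exponential moments of the boson number can be spelled out as follows. This is a sketch under the assumption that the number operator acts on the $n$-boson sector as multiplication by $n$, so that $\langle \Psi, e^{\alpha N} \Psi \rangle = \sum_n e^{\alpha n} p_n$; all terms are non-negative, so the order of summation is not an issue.

```latex
% Exponential moments from the Poisson-type bound (6.1); alpha > 0 is arbitrary.
\begin{align*}
\langle \Psi, e^{\alpha N} \Psi \rangle_{L^2(\mathsf{P}^0)}
  = \sum_{n=0}^{\infty} e^{\alpha n}\, p_n
  \le e^{C_\varrho} \sum_{n=0}^{\infty} \frac{(e^{\alpha} C_\varrho)^n}{n!}
  = \exp\!\big( C_\varrho\,(1 + e^{\alpha}) \big) < \infty .
\end{align*}
```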
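The right hand side of (6.2) is again obvious, and the left hand side follows from
\[
\begin{aligned}
p_n &\ge \frac{1}{n!}\, e^{-C_\varrho} \int \Big( -2 \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t,\, s-t) \Big)^{n} d\mathcal{N} \\
&\ge \frac{1}{n!}\, e^{-C_\varrho} \Big( -2 \int \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, W(q_s - q_t,\, s-t)\, d\mathcal{N} \Big)^{n} .
\end{aligned}
\]
$D$ is then the expectation of the double integral above.

In the next two examples we will look at the mean value and variance of the random variable $\xi \mapsto \xi_0(g)$ under $\mathcal{P}$ for $g \in K$, using the results of Corollary 5.4.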
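Example 6.2 (Average field strength). For $n = 1$, (5.18) yields
\[
\begin{aligned}
\langle \Psi, \xi(g) \Psi \rangle_{L^2(\mathsf{P}^0)}
&= - \int \int \int_{-\infty}^{\infty} \frac{\hat\varrho(k)\, \hat g(k)}{2\omega(k)}\, e^{-\omega(k)|s|}\, e^{ikq_s}\, dk\, ds\, d\mathcal{N}(q) \\
&= - \int dk \int dq\, \frac{\hat\varrho(k)\, \hat g(k)\, e^{ikq}}{\omega(k)^2}\, \psi_0^2(q)\, \lambda^2(q) ,
\end{aligned} \tag{6.3}
\]
where $\lambda^2(q) = \int \Psi^2(\xi, q)\, d\mathsf{G}(\xi)$ is the stationary density of $\mathcal{N}$ with respect to $\mathsf{N}^0$, and $\psi_0^2$ is the density of $\mathsf{N}^0$ with respect to Lebesgue measure. Writing $\chi = \psi_0^2 \lambda^2$ for the position density of the particle, and taking $g$ to be a delta function in momentum space and in position space, respectively, we find
\[
\langle \Psi, \xi(k) \Psi \rangle_{L^2(\mathsf{P}^0)} = - \frac{\hat\varrho(k)\, \hat\chi(k)}{(2\pi)^{d/2}\, \omega^2(k)} \qquad (k \in \mathbb{R}^d) ,
\]
and
\[
\langle \Psi, \xi(q) \Psi \rangle_{L^2(\mathsf{P}^0)} = (\chi * V_\omega * \varrho)(q) \qquad (q \in \mathbb{R}^d) , \tag{6.4}
\]
respectively. Here $V_\omega$ denotes the Fourier transform of $-1/\omega^2$ and is the Coulomb potential for massless bosons, i.e. for $\omega(k) = |k|$. (6.4) is the classical field generated by a particle with position distribution $\chi(q)\, dq$. Note that equality (6.3) follows also from the equations of motion and the stationarity of $\Psi_0$.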
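The second equality in (6.3) combines an elementary time integral with stationarity of the path measure. The sketch below assumes that the order of integration may be exchanged (Fubini) and that the single-time distribution of $q_s$ under $\mathcal{N}$ has Lebesgue density $\psi_0^2 \lambda^2$ for every $s$, as stated in the example.

```latex
% Step 1: the time integral, for fixed k with \omega(k) > 0.
\begin{align*}
\int_{-\infty}^{\infty} e^{-\omega(k)|s|}\, ds = \frac{2}{\omega(k)} .
\end{align*}
% Step 2: stationarity, for every fixed s.
\begin{align*}
\int e^{ikq_s}\, d\mathcal{N}(q) = \int e^{ikq}\, \psi_0^2(q)\, \lambda^2(q)\, dq .
\end{align*}
% Combining both steps:
\begin{align*}
- \int dk\, \frac{\hat\varrho(k)\, \hat g(k)}{2\omega(k)} \cdot \frac{2}{\omega(k)} \int e^{ikq}\, \psi_0^2(q)\, \lambda^2(q)\, dq
= - \int dk \int dq\, \frac{\hat\varrho(k)\, \hat g(k)\, e^{ikq}}{\omega(k)^2}\, \psi_0^2(q)\, \lambda^2(q) .
\end{align*}
```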
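Example 6.3 (Field fluctuations). For $n = 2$, (5.18) becomes
\[
\langle \Psi, \xi(g)^2 \Psi \rangle_{L^2(\mathsf{P}^0)}
= \int \frac{|\hat g(k)|^2}{2\omega(k)}\, dk
+ \int \Big( \int_{-\infty}^{\infty} ds \int dk\, \frac{\hat\varrho(k)\, \hat g(k)\, e^{ikq_s}}{2\omega(k)}\, e^{-\omega(k)|s|} \Big)^{2} d\mathcal{N} .
\]
By using the previous result and the Cauchy–Schwarz inequality, we find that
\[
\langle \Psi, \xi(g)^2 \Psi \rangle_{L^2(\mathsf{P}^0)} - \langle \Psi, \xi(g) \Psi \rangle^2_{L^2(\mathsf{P}^0)}
\ge \int \frac{|\hat g|^2}{2\omega}\, dk = \int \xi(g)^2\, d\mathsf{G}(\xi) .
\]
The latter term represents the fluctuations of the free field. We thus see that fluctuations increase by coupling the field to the particle.

We now consider special cases of Corollary 5.5.

Example 6.4 (Average number of bosons at given momentum). For real-valued $g \in L^\infty$ consider
\[
\int a_k^* a_k\, g(k)\, dk = d\Gamma(g) .
\]
By Corollary 5.5, we have $\Psi \in D(d\tilde\Gamma(g))$. With $g$ chosen to be the indicator function of some set $B \subset \mathbb{R}^d$, $\langle \Psi, d\tilde\Gamma(g) \Psi \rangle_{L^2(\mathsf{P}^0)}$ is the expected number of bosons with momentum within $B$. By (5.19),
\[
\langle \Psi, d\tilde\Gamma(g) \Psi \rangle_{L^2(\mathsf{P}^0)}
= \int dk\, \frac{|\hat\varrho(k)|^2}{2\omega(k)}\, g(k) \int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, e^{-\omega(k)(t-s)} \int \cos(k(q_t - q_s))\, d\mathcal{N} . \tag{6.5}
\]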
On the one hand, from $\cos(kx) \le 1$ we get
\[
\langle \Psi, d\tilde\Gamma(g) \Psi \rangle_{L^2(\mathsf{P}^0)} \le \int \frac{|\hat\varrho(k)|^2}{2\omega(k)^3}\, g(k)\, dk . \tag{6.6}
\]
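The power $\omega(k)^{-3}$ in (6.6) comes from an explicit double time integral. A short sketch of this step, assuming $g \ge 0$ (so that replacing the cosine by $1$ gives an upper bound) and that $\mathcal{N}$ is a probability measure:

```latex
% The double time integral factorises because t - s = t + |s| for s <= 0 <= t.
\begin{align*}
\int_{-\infty}^{0} ds \int_{0}^{\infty} dt\, e^{-\omega(k)(t-s)}
  = \Big( \int_{-\infty}^{0} e^{\omega(k) s}\, ds \Big) \Big( \int_{0}^{\infty} e^{-\omega(k) t}\, dt \Big)
  = \frac{1}{\omega(k)^2} .
\end{align*}
% Inserting this into (6.5) with \int \cos(k(q_t - q_s))\, d\mathcal{N} \le 1:
\begin{align*}
\int dk\, \frac{|\hat\varrho(k)|^2}{2\omega(k)}\, g(k) \cdot \frac{1}{\omega(k)^2}
  = \int \frac{|\hat\varrho(k)|^2}{2\omega(k)^3}\, g(k)\, dk .
\end{align*}
```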
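(6.6) is proven in [2] using the pullthrough formula. On the other hand, from $1 - (k^2 x^2)/2 \le \cos(kx)$ we get
\[
\int \cos(k(q_t - q_s))\, d\mathcal{N} \ge 1 - \frac{k^2}{2} \int (q_t^2 + q_s^2 - 2 q_t q_s)\, d\mathcal{N}
\ge 1 - k^2 \int q^2\, \psi_0^2(q)\, \lambda^2(q)\, dq .
\]
The last inequality above follows from
\[
\int q_s q_t\, d\mathcal{N} = \langle \Psi q, e^{-|t-s| \bar H}\, \Psi q \rangle_{L^2(\mathsf{P}^0)} = \big\| e^{-(|t-s|/2) \bar H}\, q \Psi \big\|^2_{L^2(\mathsf{P}^0)} \ge 0 .
\]
Writing $C = \int q^2\, \psi_0^2(q)\, \lambda^2(q)\, dq$, we have for $g \ge 0$ that
\[
\langle \Psi, d\tilde\Gamma(g) \Psi \rangle_{L^2(\mathsf{P}^0)} \ge \int \frac{|\hat\varrho(k)|^2}{2\omega(k)^3}\, (1 - C k^2)\, g(k)\, dk .
\]
The above results can be compactly (and somewhat formally) written as
\[
\frac{|\hat\varrho(k)|^2}{2\omega(k)^3}\, (1 - C k^2) \le \langle \Psi, 1 \otimes a_k^* a_k \Psi \rangle_{L^2(\mathsf{N}^0 \otimes \mathsf{G})} \le \frac{|\hat\varrho(k)|^2}{2\omega(k)^3} . \tag{6.7}
\]
Here, $a_k^* a_k$ denotes the formal expression $d\tilde\Gamma(\delta(\cdot - k))$. The quantity in the middle of (6.7) is the expected number of bosons with momentum $k$. In particular, for the massless Nelson model, one can see from the lower bound how the infrared divergence occurs. In this model, $\omega(k) = |k|$, $d = 3$, and ϱ̂(k) = 1{κ<|k|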
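\[
\int_{-\infty}^{0} ds \int_{0}^{\infty} dt \int dk \int dk'\, \frac{\hat\varrho(k)\, \hat\varrho(k')}{2 \sqrt{\omega(k)}\, \sqrt{\omega(k')}}\; e^{i(k q_s - k' q_t)}\, \hat g(k - k')\, e^{-\omega(k)|s| - \omega(k')|t|} .
\]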
Thus for $g \in L^1$, we find
\[
\langle \Psi, \tilde A_g \Psi \rangle_{L^2(\mathsf{P}^0)}
\le \frac{1}{(2\pi)^{d/2}} \int dk \int dk'\, \frac{|\hat\varrho(k)|\, |\hat\varrho(k')|}{2\, \omega(k)^{3/2}\, \omega(k')^{3/2}}\, |\hat g(k - k')|
\le \frac{1}{2 (2\pi)^{d}}\, C_1 C_2\, \| g \|_{L^1} ,
\]
with $C_n = \int |\hat\varrho(k)|^2 / \omega(k)^n\, dk$ for $n = 1, 2$. Taking $g$ to be the indicator of some bounded set $B \subset \mathbb{R}^d$, $\langle \Psi, \tilde A_g \Psi \rangle_{L^2(\mathsf{P}^0)}$ measures the expected number of bosons with position within $B$. From the above estimate we see that this number is bounded by a multiple of the volume of $B$. Moreover, it is interesting to note that this bound is insensitive to formally removing the infrared cutoff. On the other hand, the total number of bosons in the ground state is obtained by taking $g = 1$ in this or the previous example, and we see from (6.7) that this quantity diverges when the infrared cutoff is formally removed.
Example 6.6 (Localization of the particle). We conclude this section by showing exponential decay of the Lebesgue density of the stationary measure of $\mathcal{N}$. We will need the following property of $\tilde H$:

Proposition 6.7 (Diamagnetic inequality). For $f, g \in L^2(\mathsf{P}^0)$ we have
\[
\langle f, e^{-t \tilde H} g \rangle_{L^2(\mathsf{P}^0)} \le e^{t V_{\mathrm{eff}}}\, \big\langle \| f \|_{L^2(\mathsf{G})},\, e^{-t \tilde H_p}\, \| g \|_{L^2(\mathsf{G})} \big\rangle_{L^2(\mathsf{N}^0)} ,
\]
where
\[
V_{\mathrm{eff}} = \frac{1}{2} \int \frac{|\hat\varrho(k)|^2}{\omega^2(k)}\, dk < \infty ,
\]
and $\tilde H_p = (1/\psi_0)\, H_p\, \psi_0$.

A proof can be found in [11]. The second ingredient we need is a result due to Carmona [3]. For this result to hold, some mild additional restrictions on the single site potential $V$ are needed. We say that $V : \mathbb{R}^d \to \mathbb{R}$ is in the Carmona class if there exists a breakup $V = V_1 - V_2$, such that

$V_1 \in L^{d/2 + \varepsilon}_{\mathrm{loc}}$ for some $\varepsilon > 0$, and $V_1$ is bounded below,

$V_2 \in L^{p}$ for some $p > \max\{1, d/2\}$, and $V_2 \ge 0$.

Then from the proofs of [3, Lemma 3.1, Propositions 3.1 and 3.2] one can extract the following

Lemma 6.8. Let $V = V_1 - V_2$ be of the Carmona class, and use $W_{\bar q}$ to denote the measure of Brownian motion on $\mathbb{R}^d$ starting in $\bar q$.

(a) Suppose there exist $\gamma > 0$, $m > 0$ such that
\[
V_1(q) \ge \gamma |q|^{2m}
\]
outside a compact set. Put $t(q) = \max\{ |q|^{1-m}, 1 \}$. Then for each $E > 0$ there exist $D > 0$ and $\delta > 0$ such that
\[
\forall\, \bar q \in \mathbb{R}^d : \quad e^{t(\bar q) E} \int e^{-\int_0^{t(\bar q)} V(q_s)\, ds}\, dW_{\bar q}(q) \le D \exp(-\delta |\bar q|^{m+1}) .
\]
(b) Put $\alpha := \liminf_{|q| \to \infty} V(q)$, $t(q) := \beta |q|$ with $\beta > 0$. Then for each $E \in \mathbb{R}$ with $E < \alpha$, there exist $D > 0$, $\delta > 0$ and $\beta > 0$ such that
\[
\forall\, \bar q \in \mathbb{R}^d : \quad e^{t(\bar q) E} \int e^{-\int_0^{t(\bar q)} V(q_s)\, ds}\, dW_{\bar q}(q) \le D \exp(-\delta |\bar q|) .
\]

Recall that $\mathsf{N}$ denotes the stationary measure of $\mathcal{N}$, $\psi_0 \lambda$ equals the square root of the Lebesgue density of $\mathsf{N}$ (cf. Example 6.2) and $E_0$ is the ground state energy of $\tilde H$. Our result now reads:

Theorem 6.9. For any $V$ fulfilling the general conditions given in Sec. 2, we have $\psi_0 \lambda \in L^\infty(\mathbb{R}^d)$. If, in addition, $V = V_1 - V_2$ is of the Carmona class, then there exists a version of $\psi_0 \lambda$ (denoted by $q \mapsto \psi_0(q) \lambda(q)$) for which the following statements hold:

(a) If $V$ satisfies the assumptions of Lemma 6.8(a), then there exist $D, \delta > 0$ with
\[
\forall\, q \in \mathbb{R}^d : \quad \psi_0(q) \lambda(q) \le D \exp(-\delta |q|^{m+1}) . \tag{6.8}
\]
(b) Put $\alpha := \liminf_{|q| \to \infty} V_1(q)$. If $\alpha - (E_0 + V_{\mathrm{eff}}) > 0$, then there exist $D > 0$, $\delta > 0$ such that
\[
\forall\, q \in \mathbb{R}^d : \quad \psi_0(q) \lambda(q) \le D \exp(-\delta |q|) . \tag{6.9}
\]
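Proof. We first show that $\psi_0 \lambda \in L^\infty(\mathbb{R}^d)$. Since $\tilde H \Psi = E_0 \Psi$, for $h \in L^\infty(\mathbb{R}^d)$, $h \ge 0$, the diamagnetic inequality implies
\[
\begin{aligned}
\int h(q)\, \psi_0^2(q)\, \lambda^2(q)\, dq
&= \langle h \Psi, \Psi \rangle_{L^2(\mathsf{P}^0)} = e^{t E_0}\, \langle h \Psi, e^{-t \tilde H} \Psi \rangle_{L^2(\mathsf{P}^0)} \\
&\le e^{t (V_{\mathrm{eff}} + E_0)}\, \langle h \lambda, e^{-t \tilde H_p} \lambda \rangle_{L^2(\mathsf{N}^0)} \\
&= e^{t (V_{\mathrm{eff}} + E_0)} \int h(q)\, \psi_0(q)\, \lambda(q)\, (e^{-t H_p} \lambda \psi_0)(q)\, dq .
\end{aligned} \tag{6.10}
\]
Since we required $V$ to be in the Kato class, $e^{-t H_p}$ takes $L^2(dq)$ into $L^\infty(dq)$ [21]. Thus we can find $C \in \mathbb{R}$ with
\[
\int h(q)\, \psi_0^2(q)\, \lambda^2(q)\, dq \le C \int h(q)\, \psi_0(q)\, \lambda(q)\, dq , \tag{6.11}
\]
which implies $\psi_0 \lambda \in L^\infty$. Using this result in (6.10) and the Feynman–Kac formula to express the kernel of $e^{-t H_p}$, we get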
\[
\begin{aligned}
\int h(q)\, \psi_0^2(q)\, \lambda^2(q)\, dq
&\le e^{t (V_{\mathrm{eff}} + E_0)} \int d\bar q\, h(\bar q)\, \psi_0(\bar q)\, \lambda(\bar q) \int e^{-\int_0^t V(q_s)\, ds}\, \psi_0(q_t)\, \lambda(q_t)\, dW_{\bar q}(q) \\
&\le e^{t (V_{\mathrm{eff}} + E_0)}\, \| \psi_0 \lambda \|^2_{L^\infty} \int d\bar q\, h(\bar q) \int e^{-\int_0^t V(q_s)\, ds}\, dW_{\bar q}(q) .
\end{aligned} \tag{6.12}
\]
The version of $\psi_0 \lambda$ mentioned above can now be explicitly defined by
\[
\psi_0^2(q)\, \lambda^2(q) = \limsup_{n \to \infty} \int h_{q,n}(x)\, \psi_0^2(x)\, \lambda^2(x)\, dx ,
\]
where $h_{q,n}$ is any fixed sequence of $L^1$-functions converging in $L^1$ to a delta peak at $q$. We now use this sequence in (6.12). Since $\int \exp(-\int_0^t V(q_s)\, ds)\, dW_{\bar q}$ is continuous in $\bar q$ and finite for all $\bar q$, the right hand side of (6.12) converges and we have
\[
\psi_0^2(\bar q)\, \lambda^2(\bar q) \le e^{t (V_{\mathrm{eff}} + E_0)}\, \| \psi_0 \lambda \|^2_{L^\infty} \int e^{-\int_0^t V(q_s)\, ds}\, dW_{\bar q}(q) .
\]
This inequality is valid for each $t > 0$, and therefore in case $V$ is in the Carmona class, we can use Lemma 6.8 with $E$ replaced by $V_{\mathrm{eff}} + E_0$ to conclude the proof.

A version of the preceding result already appears in [2]. There it is shown that $\psi_0(q) \lambda(q) \exp(\alpha q) \in L^1(dq)$ for some $\alpha > 0$, while the present results (when applicable) imply $\psi_0(q) \lambda(q) \exp(\alpha q) \in L^\infty(dq)$ in case of a decaying external potential $V$ and superexponential localization in case of growing potentials.

Appendix A

A.1. Conditional Gaussian measures

In the first part of this Appendix we prove formulas (5.10) and (5.11). In fact, we give a simple and powerful method for explicitly calculating certain conditional Gaussian measures. This method must be known in some form, but we could not find it in the literature.

Complete the space $S(\mathbb{R}^{d+1})$ with respect to a Hilbert seminorm $\| \cdot \|_{\mathcal{K}}$ and denote its closure by $\mathcal{K}$. Consider on $S'(\mathbb{R}^{d+1})$ the Gaussian measure $\gamma$ with mean 0 and covariance
\[
\int \eta(f)\, \eta(g)\, d\gamma(\eta) = \langle f, g \rangle_{\mathcal{K}} \qquad (f, g \in \mathcal{K}) . \tag{A.1}
\]
The $\sigma$-field for $\gamma$ is generated by $\{ \eta \mapsto \eta(f) : f \in \mathcal{K} \}$. Consider now a closed subspace $\mathcal{K}_0 \subset \mathcal{K}$ and denote by $\mathcal{F}_0$ the $\sigma$-field generated by $\{ \eta \mapsto \eta(f) : f \in \mathcal{K}_0 \}$. Moreover, write $P_0$ for the projection onto $\mathcal{K}_0$. By writing $f \in \mathcal{K}$ as $f = P_0 f + f^\perp$,
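we find
\[
\begin{aligned}
E_\gamma(e^{i \eta(f)} \,|\, \mathcal{F}_0)(\bar\eta)
&= E_\gamma(e^{i \eta(P_0 f)}\, e^{i \eta(f^\perp)} \,|\, \mathcal{F}_0)(\bar\eta)
= e^{i \bar\eta(P_0 f)}\, E_\gamma(e^{i \eta(f^\perp)} \,|\, \mathcal{F}_0)(\bar\eta) \\
&= e^{i \bar\eta(P_0 f)}\, E_\gamma(e^{i \eta(f^\perp)})
= \exp\!\Big( i \bar\eta(P_0 f) - \frac{1}{2}\, \| f^\perp \|^2_{\mathcal{K}} \Big) .
\end{aligned} \tag{A.2}
\]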
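The equalities above are in $L^2(\gamma)$, and the third equality is due to the fact that independence with respect to $\gamma$ is equivalent to orthogonality in $L^2(\gamma)$. For $\bar\eta \in S'(\mathbb{R}^{d+1})$, we denote by $\gamma_{\bar\eta}$ the Gaussian measure with mean $\bar\eta(P_0 f)$ and covariance $(f, g) \mapsto \langle f^\perp, g^\perp \rangle_{\mathcal{K}} = \langle f, g \rangle_{\mathcal{K}} - \langle P_0 f, P_0 g \rangle_{\mathcal{K}}$. It follows from (A.2) that the map
\[
(S'(\mathbb{R}^d), L^1(\gamma)) \to \mathbb{R} , \qquad (\bar\eta, F) \mapsto \int F(\eta)\, d\gamma_{\bar\eta}(\eta)
\]
is a version of the regular conditional probability $E_\gamma(\,\cdot\,|\, \mathcal{F}_0)$.

To specialize to our context, we take for $\mathcal{K}$ the closure of $S'(\mathbb{R}^{d+1})$ with respect to the norm associated with the scalar product
\[
\langle f, g \rangle_{\mathcal{K}} = \int \hat f(k, \kappa)\, \hat g(k, \kappa)\, \frac{1}{\omega^2(k) + \kappa^2}\, dk\, d\kappa \qquad (k \in \mathbb{R}^d,\ \kappa \in \mathbb{R})
\]
and derive from this the Gaussian measure $\gamma$ according to (A.1). By performing the corresponding Fourier integration, it can be checked that a distribution of the form $f \otimes \delta_t$ (with $f \in K$, cf. (3.2), and $\delta_t$ denoting the delta peak at $t \in \mathbb{R}$) is an element of $\mathcal{K}$, and that the $S'(\mathbb{R}^d)$-valued stochastic process $\{ \xi_t(f) = \eta(f \otimes \delta_t),\ f \in K,\ t \in \mathbb{R} \}$ coincides in law with the process $\mathcal{G}$. Moreover, by taking $\mathcal{K}_0$ to be the closure of the set $\{ f \otimes \delta_0 : f \in K \}$, we find that
\[
\widehat{P_0 (g \otimes \delta_t)} = e^{-t \omega}\, \widehat{g \otimes \delta_0} \qquad \text{for } g \in K .
\]
This can be used in the above general result to obtain (5.10) and (5.11).

A.2. Proof of Theorem 4.3

Before we prove the theorem, by showing compatibility [7] we first make sure that the family of measures $\{ \mathcal{N}_T^{\bar q} : T > 0 \}$ (cf. (4.11)) has a chance to fulfill the DLR equations. Remember that $\Lambda_T = ([-T, T] \times \mathbb{R}) \cup (\mathbb{R} \times [-T, T])$.

Lemma A.1. The family $\{ \mathcal{N}_T^{\bar q} : T > 0 \}$ is compatible.

Proof. We have to check that for $T > S$, measurable $A \subset C(\mathbb{R}, \mathbb{R}^d)$ and $\bar q \in C(\mathbb{R}, \mathbb{R}^d)$:
\[
\mathcal{N}_T^{\bar q}(\mathcal{N}_S^{\bullet}(A)) \equiv \int \mathcal{N}_S^{q}(A)\, d\mathcal{N}_T^{\bar q}(q) = \mathcal{N}_T^{\bar q}(A) .
\]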
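Here and henceforth we write $\mathcal{N}_T^{\bar q}(f)$ instead of $E_{\mathcal{N}_T^{\bar q}}(f)$ in order to avoid too many subscript levels. By a monotone class argument, we may assume $A$ to be of the form $A = A_1 \cap A_2 \cap A_3$ with $A_1 \in \mathcal{F}_{[-S,S]}$, $A_2 \in \mathcal{F}_{[-T,T] \setminus [-S,S]}$ and $A_3 \in \mathcal{F}_{[-T,T]^c}$. Writing $\mathcal{T}_T$ for $\mathcal{F}_{\mathbb{R} \setminus [-T,T]}$, it is clear from the definition of $\mathcal{N}_T^{0,\bar q}$ that for $S < T$ and $f \in L^1(\mathcal{N}_T^{0,\bar q})$,
\[
\mathcal{N}_T^{0,\bar q}(f \,|\, \mathcal{T}_S) = \mathcal{N}_S^{0,\bar q}(f) \quad \text{for } \mathcal{N}_T^{0,\bar q}\text{-almost all } \bar q \in C(\mathbb{R}, \mathbb{R}^d) . \tag{A.3}
\]
Plugging
\[
f(q) = \mathcal{N}_S^{q}(A_1)\, \exp\!\Big( - \iint_{\Lambda_T} W(q_s - q_t,\, s-t)\, ds\, dt \Big)\, 1_{A_2}(q)
\]
into this equality, and writing
\[
\mathcal{W}_\Lambda(q) := - \iint_{\Lambda} W(q_s - q_t,\, s-t)\, ds\, dt
\]
for $\Lambda \subset \mathbb{R}^2$, we find
\[
\begin{aligned}
\mathcal{N}_T^{\bar q}(\mathcal{N}_S^{\bullet}(A_1)\, 1_{A_2})
&= \mathcal{N}_T^{0,\bar q}(f) = \mathcal{N}_T^{0,\bar q}\big( \mathcal{N}_T^{0,\bullet}(f \,|\, \mathcal{T}_S) \big) \\
&= \mathcal{N}_T^{0,\bar q}\big( \mathcal{N}_S^{\bullet}(A_1)\, 1_{A_2}\, e^{\mathcal{W}_{(\Lambda_T \setminus \Lambda_S)}}\, \mathcal{N}_S^{0,\bullet}(e^{\mathcal{W}_{\Lambda_S}}) \big) \\
&= \mathcal{N}_T^{0,\bar q}\Big( \mathcal{N}_S^{0,\bullet}(1_{A_1} e^{\mathcal{W}_{\Lambda_S}})\, \frac{1}{\mathcal{N}_S^{0,\bullet}(e^{\mathcal{W}_{\Lambda_S}})}\, 1_{A_2}\, e^{\mathcal{W}_{(\Lambda_T \setminus \Lambda_S)}}\, \mathcal{N}_S^{0,\bullet}(e^{\mathcal{W}_{\Lambda_S}}) \Big) \\
&= \mathcal{N}_T^{0,\bar q}\big( \mathcal{N}_T^{0,\bullet}(1_{A_1} e^{\mathcal{W}_{\Lambda_S}} \,|\, \mathcal{T}_S)\, 1_{A_2}\, e^{\mathcal{W}_{(\Lambda_T \setminus \Lambda_S)}} \big) \\
&= \mathcal{N}_T^{0,\bar q}\big( e^{\mathcal{W}_{\Lambda_S}}\, e^{\mathcal{W}_{(\Lambda_T \setminus \Lambda_S)}}\, 1_{A_1} 1_{A_2} \big) = \mathcal{N}_T^{\bar q}(1_{A_1 \cap A_2}) .
\end{aligned}
\]
Since furthermore $\mathcal{N}_T^{\bar q}(A_3) = 1_{A_3}(\bar q)$, the lemma is proved.

Proof of Theorem 4.3. Let $S < T$, put $\Lambda_{S,T} := ([-T, T] \times [-S, S]) \cup ([-S, S] \times [-T, T])$, and define
\[
\mathcal{W}_{\Lambda_{S,T}} := - \iint_{\Lambda_{S,T}} W(q_s - q_t,\, s-t)\, ds\, dt ,
\]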
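\[
d\mathcal{N}_{S,T}^{\bar q}(q) := \frac{1}{Z_{S,T}^{\bar q}}\, \exp(-\mathcal{W}_{\Lambda_{S,T}}(q))\, d\mathcal{N}_S^{0,\bar q} .
\]
We claim that
\[
\mathcal{N}_T(\,\cdot\,|\, \mathcal{T}_S)(\bar q) = \mathcal{N}_{S,T}^{\bar q} \quad \text{for } \mathcal{N}_T\text{-almost all } \bar q .
\]
To see this, note that $\mathcal{N}^0(\,\cdot\,|\, \mathcal{T}_S)(\bar q) = \mathcal{N}_S^{0,\bar q}(\,\cdot\,)$ and proceed exactly as in the proof of Lemma A.1. As a consequence, if $A \in \mathcal{F}_R$ for some $R > 0$, we have
\[
\mathcal{N}_T(A) = \mathcal{N}_T(\mathcal{N}_T(A \,|\, \mathcal{T}_S)) = \mathcal{N}_T(\mathcal{N}_{S,T}^{\bullet}(A)) . \tag{A.4}
\]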
As a last ingredient, we have for every $q \in C(\mathbb{R}, \mathbb{R}^d)$ and $T > S$
\[
| \mathcal{W}_{\Lambda_{S,T}}(q) - \mathcal{W}_{\Lambda_S}(q) |
\le 4 \int_{-S}^{S} ds \int_{T}^{\infty} dt \int dk\, \frac{|\hat\varrho(k)|^2}{2\omega(k)}\, e^{-\omega(k)|t-s|}
\le 8 S \int \frac{|\hat\varrho(k)|^2}{2\omega^2(k)}\, e^{-\omega(k)(T-S)}\, dk \xrightarrow{\,T \to \infty\,} 0
\]
by dominated convergence and (2.4). Thus,
\[
\sup_{\bar q \in C(\mathbb{R}, \mathbb{R}^d)} | \mathcal{N}_{S,T}^{\bar q}(A) - \mathcal{N}_S^{\bar q}(A) | \xrightarrow{\,T \to \infty\,} 0 ,
\]
and by taking $T \to \infty$ on both sides of (A.4), we arrive at $\mathcal{N}(A) = \mathcal{N}(\mathcal{N}_S^{\bullet}(A))$, which is what we wanted to show.

Acknowledgments

The author, R. A. Minlos, thanks Zentrum Mathematik of Technische Universität München for warm hospitality and financial support. He also thanks the Russian Fundamental Research Foundation (grants 99-01-00284 and 00-01-00271), CRDF (grant NRM 1-2085) and DFG (grant 436 RUS 113/485/5) for financial support. The author, J. Lőrinczi, thanks Schwerpunktprogramm "Interagierende stochastische Systeme von hoher Komplexität" (grant SP 181/12). The author, F. Hiroshima, thanks Technische Universität München for kind hospitality. This work was partially supported by the Graduiertenkolleg "Mathematik in ihrer Wechselbeziehung zur Physik" of the LMU Munich and Grant-in-Aid 13740106 for Encouragement of Young Scientists from the Japanese Ministry of Education, Science, Sports and Culture.

References

[1] A. Arai, "Ground state of the massless Nelson model without infrared cutoff in a non-Fock representation", Rev. Math. Phys. 13 (2001) 1075–1094.
[2] V. Bach, J. Fröhlich and I. M. Sigal, "Quantum electrodynamics of confined nonrelativistic particles", Adv. Math. 137 (1998) 299–395.
[3] R. Carmona, "Pointwise bounds for Schrödinger eigenstates", Commun. Math. Phys. 62 (1978) 97–106.
[4] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw Hill, 1965.
[5] J. Fröhlich, "On the infrared problem in a model of scalar electrons and massless scalar bosons", Ann. Inst. H. Poincaré 19 (1973) 1–103.
[6] J. Fröhlich, "Existence of dressed one electron states in a class of persistent models", Fortschr. Phys. 22 (1974) 159–198.
[7] H.-O. Georgii, Gibbs Measures and Phase Transitions, Berlin, New York, de Gruyter, 1988.
[8] C. Gérard, "On the existence of ground states for massless Pauli–Fierz Hamiltonians", Ann. Henri Poincaré 1 (2000) 443–459.
[9] M. Griesemer, E. Lieb and M. Loss, "Ground states in non-relativistic quantum electrodynamics", to appear in Inv. Math.
[10] T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise, Dordrecht, Boston, London, Kluwer Academic Publishers, 1993.
[11] F. Hiroshima, "Diamagnetic inequalities for systems of nonrelativistic particles with a quantized field", Rev. Math. Phys. 8 (1996) 185–203.
[12] M. Hübner and H. Spohn, "Radiative decay: nonperturbative approaches", Rev. Math. Phys. 7 (1995) 363–387.
[13] J. Lőrinczi and R. A. Minlos, "Gibbs measures for Brownian paths under the effect of an external and a small pair potential", J. Stat. Phys. 105 (2001) 607–649.
[14] J. Lőrinczi, R. A. Minlos and H. Spohn, "The infrared behaviour in Nelson's model of a quantum particle coupled to a massless scalar field", preprint.
[15] J. Lőrinczi, R. A. Minlos and H. Spohn, "Infrared regular representation of the three dimensional massless Nelson model", preprint.
[16] E. Nelson, "Schrödinger particles interacting with a quantized scalar field", p. 87 in Proceedings of a Conference on Analysis in Function Space, eds. W. T. Martin and I. Segal, MIT Press, Cambridge 1964.
[17] E. Nelson, "Interaction of nonrelativistic particles with a quantized scalar field", J. Math. Phys. 5 (1964) 1990–1997.
[18] E. Nelson, "The free Markoff field", J. Funct. Anal. 12 (1973) 211–227.
[19] N. Obata, White Noise Calculus and Fock Space, Berlin, Heidelberg, Springer, 1994.
[20] B. Simon, Functional Integration and Quantum Physics, New York, San Francisco, London, Academic Press, 1979.
[21] B. Simon, "Schrödinger semigroups", Bull. Amer. Math. Soc. 7 (1982) 447–526.
[22] H. Spohn, "Ground state of a quantum particle coupled to a scalar boson field", Lett. Math. Phys. 44 (1998) 9–16.
February 4, 2002 11:31 WSPC/148-RMP
00113
Reviews in Mathematical Physics, Vol. 14, No. 2 (2002) 199–240 © World Scientific Publishing Company
ON SPECTRAL AND SCATTERING THEORY FOR N-BODY SCHRÖDINGER OPERATORS IN A CONSTANT MAGNETIC FIELD
TADAYOSHI ADACHI Department of Mathematics, Faculty of Science, Kobe University 1-1, Rokkodai-cho, Nada-ku, Kobe-shi, Hyogo 657-8501, Japan
[email protected]
Received 26 February 2001 We consider an N -body quantum system in a constant magnetic field which consists of just one charged and the other N − 1 neutral particles. We prove the existence of a conjugate operator for the Hamiltonian which governs the system, and show the asymptotic completeness of the system under short-range assumptions on the pair potentials.
1. Introduction

The scattering theory for N-body quantum systems in a constant magnetic field has been studied by Gérard–Laba [8, 9, 10]. But they have assumed that all particles in the systems are charged, that is, there is no neutral particle in the systems under consideration, even if the systems consist of only two particles (see also [14, 15]). Under this assumption, if there is no neutral proper subsystem, one has only to observe the behavior of all subsystems parallel to the magnetic field. Skibsted [21, 22] studied the scattering theory for N-body quantum systems in combined constant electric and magnetic fields, but his result needs the asymptotic completeness for the systems in a constant magnetic field.

Recently we studied the scattering theory for a two-body quantum system, which consists of one neutral and one charged particle, in a constant magnetic field (see [1]). Showing how to choose a conjugate operator for the Hamiltonian which governs the system was one of the ingredients in [1]. By virtue of this, we obtained the Mourre estimate and used it in order to obtain the so-called minimal velocity estimate, which is one of the useful propagation estimates.

Throughout this paper, we consider an N-body quantum system which has N − 1 neutral particles and just one charged particle in a constant magnetic field. Our goal is to prove the asymptotic completeness of this system under short-range assumptions on the pair potentials. To achieve this, it is useful to obtain the Mourre estimate for the Hamiltonian which governs this system. The Mourre estimate is also powerful in studying spectral properties of the Hamiltonian.
Finding a conjugate operator for the Hamiltonian is one of the ingredients in this paper.

We consider a system of $N$ particles moving in a given constant magnetic field $\mathbf{B} = (0, 0, B) \in \mathbb{R}^3$, $B > 0$. For $j = 1, \ldots, N$, let $m_j > 0$, $q_j \in \mathbb{R}$ and $x_j \in \mathbb{R}^3$ be the mass, charge and position vector of the $j$th particle, respectively. Throughout this paper, we assume that the last particle is charged and the rest are neutral, that is,
\[
q_j = 0 \ \text{ if } 1 \le j \le N-1 , \qquad q_N \ne 0 . \tag{1.1}
\]
In particular, the total charge $q = \sum_j q_j$ of the system is non-zero in this case. The total Hamiltonian for the system is defined by
\[
\tilde H = \sum_{j=1}^{N-1} \frac{1}{2 m_j}\, D_{x_j}^2 + \frac{1}{2 m_N}\, (D_{x_N} - q_N A(x_N))^2 + V \tag{1.2}
\]
acting on $L^2(\mathbb{R}^{3 \times N})$, where the potential $V$ is the sum of the pair potentials $V_{jk}(x_j - x_k)$, that is,
\[
V = \sum_{1 \le j < k \le N} V_{jk}(x_j - x_k) ,
\]
$D_{x_j} = -i \nabla_{x_j}$, $j = 1, \ldots, N$, is the momentum operator of the $j$th particle, and $A(r)$ is the vector potential. Using the Coulomb gauge, the vector potential $A(r)$ is given by
\[
A(r) = \frac{B}{2}\, (-r_2, r_1, 0) , \qquad r = (r_1, r_2, r_3) . \tag{1.3}
\]
As is well-known, it is easy to remove the center of mass motion of the system parallel to the field from the Hamiltonian $\tilde H$ (see e.g. [3]). In order to achieve this, we write the position $x_j$ of the $j$th particle as $x_j = (y_j, z_j)$ with $y_j \in \mathbb{R}^2$ and $z_j \in \mathbb{R}$. Moreover we identify the vector potential $A(x_j) \in \mathbb{R}^3$ with $A(y_j) \equiv (B/2)(-y_{j,2}, y_{j,1}) \in \mathbb{R}^2$ because $A(x_j)$ can be written as $(A(y_j), 0)$. Thus we study the spectral and scattering theory for the following Hamiltonian:
\[
H = \Big( \sum_{j=1}^{N-1} \frac{1}{2 m_j}\, D_{y_j}^2 \Big) + \frac{1}{2 m_N}\, (D_{y_N} - q_N A(y_N))^2 - \frac{1}{2}\, \Delta_{z^{a_{\max}}} + V \tag{1.4}
\]
acting on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}})$, where $Z^{a_{\max}}$ is defined by
\[
Z^{a_{\max}} = \Big\{ z = (z_1, \ldots, z_N) \in \mathbb{R}^N \ \Big|\ \sum_{j=1}^{N} m_j z_j = 0 \Big\} ,
\]
which is equipped with the metric
\[
\langle z, \tilde z \rangle = \sum_{j=1}^{N} m_j z_j \tilde z_j , \qquad |z|_1 = \sqrt{\langle z, z \rangle}
\]
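for $z = (z_1, \ldots, z_N) \in \mathbb{R}^N$ and $\tilde z = (\tilde z_1, \ldots, \tilde z_N) \in \mathbb{R}^N$, and $\Delta_{z^{a_{\max}}}$ is the Laplace–Beltrami operator on $Z^{a_{\max}}$. Moreover, introducing the total pseudomomentum $k_{\mathrm{total}}$ of the system perpendicular to the field $\mathbf{B}$, which is defined by
\[
k_{\mathrm{total}} = \Big( \sum_{j=1}^{N-1} D_{y_j} \Big) + (D_{y_N} + q_N A(y_N)) , \tag{1.5}
\]
one can remove the dependence on $k_{\mathrm{total}}$ from the Hamiltonian $H$: It is well-known that $k_{\mathrm{total}}$ commutes with $H$, and that since the total charge $q = q_N$ of this system is non-zero, the two components of the total pseudomomentum $k_{\mathrm{total}}$ cannot commute with each other, but satisfy the Heisenberg commutation relation (see e.g. [3]). Now we introduce the unitary operator
\[
U = e^{-i y_{\mathrm{cm}} \cdot q A(y_{\mathrm{cc}})}\, e^{i q B\, y_{\mathrm{cm},1} y_{\mathrm{cm},2} / 2}\, e^{i D_{y_{\mathrm{cm}},1} D_{y_{\mathrm{cm}},2} / (q B)} \tag{1.6}
\]
on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}})$ with
\[
y_{\mathrm{cm}} = \frac{1}{M} \sum_{j=1}^{N} m_j y_j , \qquad y_{\mathrm{cc}} = \frac{1}{q} \sum_{j=1}^{N} q_j y_j , \tag{1.7}
\]
where $M = \sum_j m_j$ is the total mass of the system. We note that $y_{\mathrm{cc}} = y_N$ holds in this case. Then we obtain
\[
U^* k_{\mathrm{total},1} U = D_{y_{\mathrm{cm}},1} , \qquad U^* k_{\mathrm{total},2} U = q B\, y_{\mathrm{cm},1} , \tag{1.8}
\]
and see that $U^* H U$ is independent of $(D_{y_{\mathrm{cm}},1}, q B\, y_{\mathrm{cm},1})$ (see [8, 9, 10, 21, 22 and 1]). Here the dot $\cdot$ means the usual Euclidean metric, and we wrote $k_{\mathrm{total}} = (k_{\mathrm{total},1}, k_{\mathrm{total},2})$, $y_{\mathrm{cm}} = (y_{\mathrm{cm},1}, y_{\mathrm{cm},2})$ and $D_{y_{\mathrm{cm}}} = (D_{y_{\mathrm{cm}},1}, D_{y_{\mathrm{cm}},2})$. Thus one can identify the Hamiltonian $U^* H U$ acting on $U^* L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}})$ with an operator acting on $\mathcal{H} = L^2(Y^{a_{\max}} \times \mathbb{R}_{y_{\mathrm{cm},2}} \times Z^{a_{\max}})$, where $Y^{a_{\max}}$ is defined by
\[
Y^{a_{\max}} = \Big\{ y = (y_1, \ldots, y_N) \in \mathbb{R}^{2 \times N} \ \Big|\ \sum_{j=1}^{N} m_j y_j = 0 \Big\} ,
\]
which is equipped with the metric
\[
\langle y, \tilde y \rangle = \sum_{j=1}^{N} m_j\, y_j \cdot \tilde y_j , \qquad |y|_1 = \sqrt{\langle y, y \rangle}
\]
for $y = (y_1, \ldots, y_N) \in \mathbb{R}^{2 \times N}$ and $\tilde y = (\tilde y_1, \ldots, \tilde y_N) \in \mathbb{R}^{2 \times N}$. We denote this reduced Hamiltonian acting on $\mathcal{H}$ by $\hat H$. It is a part of our goal to study the spectral theory for $\hat H$.

Now we state the assumptions on the pair potentials $V_{jk}$. For $r = (r_1, r_2, r_3) \in \mathbb{R}^3$, we denote $(r_1, r_2)$ by $r_\perp$ and write $\nabla_{r_\perp} = \nabla^\perp$. For any interval $I \subset \mathbb{R}$, we denote the characteristic function of $I$ on $\mathbb{R}$ by $1_I$.

(V.1) $V_{jk} = V_{jk}(r) \in L^2(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$ ($1 \le j < k \le N$) is a real-valued function.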
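(V.2) If $j$ and $k$ satisfy $1 \le j < k \le N-1$, then $r \cdot \nabla V_{jk}$ is $-\Delta$-bounded and satisfies
\[
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, r \cdot \nabla V_{jk}\, (-\Delta + 1)^{-1} \Big\| = O(R^{-\mu}) , \qquad R \to \infty ,
\]
for some $\mu > 0$. Otherwise, that is, if $l$ satisfies $1 \le l \le N-1$, then $\nabla^\perp V_{lN}$, $|\nabla^\perp V_{lN}|^2$ and $r \cdot \nabla V_{lN}$ are all $-\Delta$-bounded, and satisfy
\[
\begin{aligned}
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, \nabla^\perp V_{lN}\, (-\Delta + 1)^{-1} \Big\| &= O(R^{-\mu}) , \qquad R \to \infty , \\
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, |\nabla^\perp V_{lN}|^2\, (-\Delta + 1)^{-1} \Big\| &= O(R^{-\mu}) , \qquad R \to \infty , \\
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, r \cdot \nabla V_{lN}\, (-\Delta + 1)^{-1} \Big\| &= O(R^{-\mu}) , \qquad R \to \infty ,
\end{aligned}
\]
for some $\mu > 0$.

(V.3) If $j$ and $k$ satisfy $1 \le j < k \le N-1$, then $(r \cdot \nabla)^2 V_{jk}$ is $-\Delta$-bounded. Otherwise, that is, if $l$ satisfies $1 \le l \le N-1$, then $(\nabla^\perp)^2 V_{lN}$, $(r \cdot \nabla)^2 V_{lN}$, $\nabla^\perp (r \cdot \nabla V_{lN})$ and $r_\perp \cdot \nabla^\perp V_{lN}$ are all $-\Delta$-bounded.

(SR) $V_{jk}$ satisfies that $\nabla V_{jk}$ is $-\Delta$-bounded and
\[
\begin{aligned}
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, V_{jk}\, (-\Delta + 1)^{-1} \Big\| &= O(R^{-\mu_{S1}}) , \\
\Big\| 1_{[1,\infty)}\Big( \frac{|r|}{R} \Big)\, \nabla V_{jk}\, (-\Delta + 1)^{-1} \Big\| &= O(R^{-1-\mu_{S2}})
\end{aligned}
\]
as $R \to \infty$, with $\mu_{S1} > 1$ and $\mu_{S2} > 0$.

Under these assumptions, the Hamiltonians $H$ and $\hat H$ are self-adjoint.

To formulate the main result of this paper precisely, we introduce some notation from many-body scattering theory: A non-empty subset of the set $\{1, \ldots, N\}$ is called a cluster. Let $C_j$, $1 \le j \le j_0$, be clusters. If $\bigcup_{1 \le j \le j_0} C_j = \{1, \ldots, N\}$ and $C_j \cap C_k = \emptyset$ for $1 \le j < k \le j_0$, then $a = \{C_1, \ldots, C_{j_0}\}$ is called a cluster decomposition. We denote by $\#(a)$ the number of clusters in $a$. Let $\mathcal{A}$ be the set of all cluster decompositions. Suppose $a, b \in \mathcal{A}$. If $b$ is a refinement of $a$, that is, if each cluster in $b$ is a subset of a certain cluster in $a$, we say $b \subset a$, and its negation is denoted by $b \not\subset a$. Any cluster decomposition $a$ can be regarded as a refinement of itself. If, in particular, $b$ is a strict refinement of $a$, that is, if $b \subset a$ and $b \ne a$, we write $b \subsetneq a$. We identify the pair $(j, k)$ with the $(N-1)$-cluster decomposition $\{\{j, k\}, \{1\}, \ldots, \{\check{j}\}, \ldots, \{\check{k}\}, \ldots, \{N\}\}$. We denote by $a_{\max}$ and $a_{\min}$ the 1- and $N$-cluster decompositions, respectively. In this paper, we often use the following notation
\[
\mathcal{A}(a_{\max}) = \mathcal{A} \setminus \{ a_{\max} \} .
\]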
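As an illustration of the cluster notation, the smallest non-trivial case can be listed explicitly. The choice $N = 3$ (two neutral particles and the charged particle $3$) is made here only as an example and is not singled out in the original text.

```latex
% Cluster decompositions of {1,2,3}: there are five in total.
\begin{align*}
a_{\max} &= \{\{1,2,3\}\}, & \#(a_{\max}) &= 1, \\
(1,2) &= \{\{1,2\},\{3\}\}, \quad (1,3) = \{\{1,3\},\{2\}\}, \quad (2,3) = \{\{2,3\},\{1\}\}, & \#((j,k)) &= 2, \\
a_{\min} &= \{\{1\},\{2\},\{3\}\}, & \#(a_{\min}) &= 3 .
\end{align*}
% Refinement relations: a_min is a refinement of every pair, and every pair refines a_max,
% while two different pairs are not comparable.
\begin{align*}
a_{\min} \subsetneq (j,k) \subsetneq a_{\max}, \qquad (1,2) \not\subset (1,3), \qquad
\mathcal{A}(a_{\max}) = \{ (1,2),\ (1,3),\ (2,3),\ a_{\min} \} .
\end{align*}
```

In this example the pairs $(1,3)$ and $(2,3)$ place a neutral particle in the same cluster as the charged one, while $(1,2)$ keeps the charged particle alone.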
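For $a \in \mathcal{A}$, the intercluster potential $I_a$ is defined by
\[
I_a = \sum_{(j,k) \not\subset a} V_{jk}(x_j - x_k) , \tag{1.9}
\]
and the cluster Hamiltonian $H_a$ is given by
\[
H_a = \Big( \sum_{j=1}^{N-1} \frac{1}{2 m_j}\, D_{y_j}^2 \Big) + \frac{1}{2 m_N}\, (D_{y_N} - q_N A(y_N))^2 - \frac{1}{2}\, \Delta_{z^{a_{\max}}} + V^a , \qquad
V^a = \sum_{(j,k) \subset a} V_{jk}(x_j - x_k) \tag{1.10}
\]
acting on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}})$. We also define the innercluster Hamiltonian $H^{C_j}$ on $L^2(\mathbb{R}^{2 \times \#(C_j)} \times Z^{C_j})$ for each cluster $C_j = \{ c_j(1), \ldots, c_j(\#(C_j)) \}$ in $a$, where $\#(C_j)$ is the number of the elements in the cluster $C_j$:
\[
\begin{aligned}
H^{C_j} &= \Big( \sum_{l_1 \in C_j^n} \frac{1}{2 m_{l_1}}\, D_{y_{l_1}}^2 \Big) + \Big( \sum_{l_2 \in C_j^c} \frac{1}{2 m_{l_2}}\, (D_{y_{l_2}} - q_{l_2} A(y_{l_2}))^2 \Big) - \frac{1}{2}\, \Delta_{z^{C_j}} + V^{C_j} , \\
V^{C_j} &= \sum_{\{l_1, l_2\} \subset C_j,\ l_1 < l_2} V_{l_1 l_2}(x_{l_1} - x_{l_2}) .
\end{aligned} \tag{1.11}
\]
Here $C_j^n = C_j \cap \{1, \ldots, N-1\}$, $C_j^c = C_j \cap \{N\}$, the configuration space $Z^{C_j}$ is defined by
\[
Z^{C_j} = \Big\{ (z_{c_j(1)}, \ldots, z_{c_j(\#(C_j))}) \in \mathbb{R}^{\#(C_j)} \ \Big|\ \sum_{l=1}^{\#(C_j)} m_{c_j(l)}\, z_{c_j(l)} = 0 \Big\} ,
\]
which is equipped with the metric defined by
\[
\langle \zeta, \tilde\zeta \rangle = \sum_{l=1}^{\#(C_j)} m_{c_j(l)}\, z_{c_j(l)}\, \tilde z_{c_j(l)} , \qquad |\zeta|_1 = \sqrt{\langle \zeta, \zeta \rangle}
\]
for $\zeta = (z_{c_j(1)}, \ldots, z_{c_j(\#(C_j))}) \in \mathbb{R}^{\#(C_j)}$ and $\tilde\zeta = (\tilde z_{c_j(1)}, \ldots, \tilde z_{c_j(\#(C_j))}) \in \mathbb{R}^{\#(C_j)}$, and $\Delta_{z^{C_j}}$ is the Laplace–Beltrami operator on $Z^{C_j}$. We also define two subspaces $Z^a$ and $Z_a$ of $Z^{a_{\max}}$ by
\[
Z^a = \Big\{ z \in Z^{a_{\max}} \ \Big|\ \sum_{l \in C_j} m_l z_l = 0 \text{ for each cluster } C_j \in a \Big\} , \qquad Z_a = Z^{a_{\max}} \ominus Z^a .
\]
And we denote by $\Delta_{z^a}$ and $\Delta_{z_a}$ the Laplace–Beltrami operators on $Z^a$ and $Z_a$, respectively. As is well-known, one can identify $Z^a$ with $Z^{C_1} \oplus \cdots \oplus Z^{C_{\#(a)}}$. What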
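we notice here is that the cluster Hamiltonian $H_a$ is decomposed into the sum of all the innercluster Hamiltonians $H^{C_j}$ and $-\Delta_{z_a}/2$:
\[
H_a = \sum_{j=1}^{\#(a)} \mathrm{Id} \otimes \cdots \otimes \mathrm{Id} \otimes H^{C_j} \otimes \mathrm{Id} \otimes \cdots \otimes \mathrm{Id}
+ \mathrm{Id} \otimes \cdots \otimes \mathrm{Id} \otimes \Big( -\frac{1}{2}\, \Delta_{z_a} \Big) \tag{1.12}
\]
on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}}) = L^2(\mathbb{R}^{2 \times \#(C_1)} \times Z^{C_1}) \otimes \cdots \otimes L^2(\mathbb{R}^{2 \times \#(C_{\#(a)})} \times Z^{C_{\#(a)}}) \otimes L^2(Z_a)$.

Let $a = \{C_1, \ldots, C_{\#(a)}\} \in \mathcal{A}$. Choose $j_1$ such that $1 \le j_1 \le \#(a)$ and $\{N\} \subset C_{j_1}$. Of course, this $j_1$ associated with $a$ exists uniquely. If necessary, by renumbering the clusters in $a$, one can put $j_1 = \#(a)$ without loss of generality. What we should emphasize here is that $H^{C_{\#(a)}}$ is just the $\#(C_{\#(a)})$-body Hamiltonian under consideration. We consider the sum of all the innercluster Hamiltonians except $H^{C_{\#(a)}}$
\[
K(a) = \sum_{j=1}^{\#(a)-1} \mathrm{Id} \otimes \cdots \otimes \mathrm{Id} \otimes H^{C_j} \otimes \mathrm{Id} \otimes \cdots \otimes \mathrm{Id} \tag{1.13}
\]
on $\mathcal{K}(a) = L^2(\mathbb{R}^{2 \times \#(C_1)} \times Z^{C_1}) \otimes \cdots \otimes L^2(\mathbb{R}^{2 \times \#(C_{\#(a)-1})} \times Z^{C_{\#(a)-1}})$. Here we note that if one removes the center of mass motion perpendicular to the field $\mathbf{B}$ of this $(N - \#(C_{\#(a)}))$-body system from $K(a)$, the obtained Hamiltonian is an $(N - \#(C_{\#(a)}))$-body Schrödinger operator without external electromagnetic fields in the center of mass frame. Now we equip $\mathbb{R}^{2 \times \#(C_j)}$, $j = 1, \ldots, \#(a) - 1$, with the metric
\[
\langle \eta, \tilde\eta \rangle = \sum_{l=1}^{\#(C_j)} m_{c_j(l)}\, y_{c_j(l)} \cdot \tilde y_{c_j(l)} , \qquad |\eta|_1 = \sqrt{\langle \eta, \eta \rangle}
\]
for $\eta = (y_{c_j(1)}, \ldots, y_{c_j(\#(C_j))}) \in \mathbb{R}^{2 \times \#(C_j)}$ and $\tilde\eta = (\tilde y_{c_j(1)}, \ldots, \tilde y_{c_j(\#(C_j))}) \in \mathbb{R}^{2 \times \#(C_j)}$, and define two subspaces $Y^{C_j}$ and $Y_{C_j}$ of $\mathbb{R}^{2 \times \#(C_j)}$ by
\[
Y^{C_j} = \Big\{ (y_{c_j(1)}, \ldots, y_{c_j(\#(C_j))}) \in \mathbb{R}^{2 \times \#(C_j)} \ \Big|\ \sum_{l=1}^{\#(C_j)} m_{c_j(l)}\, y_{c_j(l)} = 0 \Big\} , \qquad
Y_{C_j} = \mathbb{R}^{2 \times \#(C_j)} \ominus Y^{C_j} .
\]
And we put $X^{C_j} = Y^{C_j} \times Z^{C_j}$ and $X^{a,n} = X^{C_1} \times \cdots \times X^{C_{\#(a)-1}}$, and define two subspaces $Y^{a,n}$ and $Y_{a,n}$ of $\mathbb{R}^{2 \times (N - \#(C_{\#(a)}))}$ by $Y^{a,n} = Y^{C_1} \times \cdots \times Y^{C_{\#(a)-1}}$ and $Y_{a,n} = \mathbb{R}^{2 \times (N - \#(C_{\#(a)}))} \ominus Y^{a,n}$, which are equipped with the metric $\langle\ ,\ \rangle$. Then $K(a)$ can be decomposed into
\[
K(a) = K^a \otimes \mathrm{Id} + \mathrm{Id} \otimes \Big( -\frac{1}{2}\, \Delta_{y_{a,n}} \Big) \tag{1.14}
\]
on $\mathcal{K}(a) = L^2(X^{a,n}) \otimes L^2(Y_{a,n})$, where $\Delta_{y_{a,n}}$ is the Laplace–Beltrami operator on $Y_{a,n}$. As we mentioned above, this Hamiltonian $K^a$ is an $(N - \#(C_{\#(a)}))$-body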
Schrödinger operator without external electromagnetic fields in the center of mass frame. Thus we have
\[
\begin{aligned}
H_a ={}& K^a \otimes \mathrm{Id} \otimes \mathrm{Id} \otimes \mathrm{Id} + \mathrm{Id} \otimes H^{C_{\#(a)}} \otimes \mathrm{Id} \otimes \mathrm{Id} \\
&+ \mathrm{Id} \otimes \mathrm{Id} \otimes \Big( -\frac{1}{2}\, \Delta_{y_{a,n}} \Big) \otimes \mathrm{Id} + \mathrm{Id} \otimes \mathrm{Id} \otimes \mathrm{Id} \otimes \Big( -\frac{1}{2}\, \Delta_{z_a} \Big)
\end{aligned} \tag{1.15}
\]
on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}}) = L^2(X^{a,n}) \otimes L^2(\mathbb{R}^{2 \times \#(C_{\#(a)})} \times Z^{C_{\#(a)}}) \otimes L^2(Y_{a,n}) \otimes L^2(Z_a)$. Denoting by $\tilde P^a$ and $\hat P^a$ the eigenprojections for $K^a$ on $L^2(X^{a,n})$ and for $H^{C_{\#(a)}}$ on $L^2(\mathbb{R}^{2 \times \#(C_{\#(a)})} \times Z^{C_{\#(a)}})$, respectively, we put $P^a = \tilde P^a \otimes \hat P^a \otimes \mathrm{Id} \otimes \mathrm{Id}$ on $L^2(\mathbb{R}^{2 \times N} \times Z^{a_{\max}}) = L^2(X^{a,n}) \otimes L^2(\mathbb{R}^{2 \times \#(C_{\#(a)})} \times Z^{C_{\#(a)}}) \otimes L^2(Y_{a,n}) \otimes L^2(Z_a)$. Then the usual wave operators $W_a^\pm$, $a \in \mathcal{A}(a_{\max})$, are defined by
\[
W_a^\pm = \operatorname*{s-lim}_{t \to \pm\infty} e^{itH}\, e^{-itH_a}\, P^a . \tag{1.16}
\]
The main result of this paper is the following theorem.

Theorem 1.1. Assume that (V.1), (V.2), (V.3) and (SR) are fulfilled. Then the usual wave operators $W_a^\pm$, $a \in \mathcal{A}(a_{\max})$, exist and are asymptotically complete:
\[
L^2_c(H) = \sum_{a \in \mathcal{A}(a_{\max})}^{\oplus} \operatorname{Ran} W_a^\pm .
\]
Here $L^2_c(H)$ is the continuous spectral subspace of the Hamiltonian $H$.

The problem of asymptotic completeness for N-body quantum systems has been studied, and in many cases solved, by many mathematicians. For example, for N-body Schrödinger operators without external electromagnetic fields, this problem was first solved by Sigal–Soffer [19] for a large class of short-range potentials, and some alternative proofs appeared (see e.g. Graf [11] and Yafaev [23]). On the other hand, for the long-range case, Dereziński [5] solved this problem with arbitrary $N$ for the class of potentials decaying like $O(|x_j - x_k|^{-\mu_L})$ with some $\mu_L > \sqrt{3} - 1$ (see also e.g. [6]). As for the results for the systems in external electromagnetic fields, see e.g. the references in [6].

Throughout this paper, we assume that the number of charged particles $L$ in the system under consideration is just one. In fact, we have not yet solved the problem in the case when $L \ge 2$. In the case when $L \ge 2$, by virtue of the constant magnetic field, the physical situation in $\mathbb{R}^3$ seems quite different from the one in $\mathbb{R}^2$: Imagine N-body quantum scattering pictures both in $\mathbb{R}^3$ and in $\mathbb{R}^2$ under the influence of a constant magnetic field. Suppose that the last $L$ particles are charged. Put $C^n = \{1, \ldots, N-L\}$ and $C^c = \{N-L+1, \ldots, N\}$, and introduce the set of cluster decompositions
\[
\mathcal{B} = \{ a = \{C_1, \ldots, C_{\#(a)}\} \in \mathcal{A} \ |\ C^c \subset C_{\#(a)} \}
\]
with renumbering the clusters in $a$ if necessary. For simplicity of the argument below, we suppose that the pair potentials are "short-range". As in the case when $L = 1$, one can also introduce the Hamiltonian $H$, cluster Hamiltonians $H_a$ and the wave operators $W_a^\pm$. Then one expects that the statement of the asymptotic completeness says that
\[
L^2_c(H) = \sum^{\oplus}_{a \in \mathscr{A}(a_{\max})} \operatorname{Ran} W_a^\pm
\]
when the space dimension is three. As is well-known, this is equivalent to the statement that the time evolution of any scattering state $\psi \in L^2_c(H)$ is asymptotically represented as
\[
e^{-itH}\psi = \sum_{a \in \mathscr{A}(a_{\max})} e^{-itH_a} P^a \psi_a^\pm + o(1) \quad \text{as } t \to \pm\infty \tag{1.17}
\]
with some $\psi_a^\pm \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$. We note that each summand $e^{-itH_a} P^a \psi_a^\pm$ describes the motion of the particles in which those in the clusters in $a$ form bound states and the centers of mass of the clusters in $a$ move freely. Since the motion of the particles parallel to the magnetic field $B$ is not influenced by $B$, we need to take a superposition of $e^{-itH_a} P^a \psi_a^\pm$ whose index $a$ ranges over the whole of $\mathscr{A}(a_{\max})$ in general, as in the case when $H$ is a usual $N$-body Schrödinger operator without external electromagnetic fields. On the other hand, when the space dimension is two, the statement of the asymptotic completeness may be
\[
L^2_c(H) = \sum^{\oplus}_{a \in \mathscr{B}(a_{\max})} \operatorname{Ran} W_a^\pm,
\]
where $\mathscr{B}(a_{\max}) = \mathscr{B} \setminus \{a_{\max}\} \subset \mathscr{A}(a_{\max})$. This says that the time evolution of any scattering state $\psi \in L^2_c(H)$ is asymptotically represented by a superposition of $e^{-itH_a} P^a \psi_a^\pm$, $a \in \mathscr{B}(a_{\max})$, which in particular means that the particles in the only charged cluster $C_{\#(a)}$ in $a \in \mathscr{B}(a_{\max})$ form bound states. The reason why we should take this $\mathscr{B}(a_{\max})$ instead of $\mathscr{A}(a_{\max})$ is as follows: All charged particles are bound in the directions perpendicular to the magnetic field $B$ by the influence of $B$. So one expects that the distance among all charged particles is bounded with respect to time $t$, and one can suppose that all charged particles belong to one cluster. Hence we need not consider cluster decompositions $a \in \mathscr{A}(a_{\max})$ which have at least two charged clusters. Moreover, neutral particles can move freely without being influenced by the magnetic field $B$ even when the space dimension is two. Thus one should study the motion of particles in the directions perpendicular to $B$ more carefully in the case when $L \ge 2$. Now what we would like to emphasize here is that our case, that is, the case when $L = 1$, is the unique one in which $\mathscr{B}(a_{\max}) = \mathscr{A}(a_{\max})$ holds, because $C^c = \{N\}$ only when $L = 1$. In fact, our argument can also be applied to studying the problem in $\mathbb{R}^2$ when $L = 1$, because the motion of the only
charged particle in the directions perpendicular to $B$ can be controlled by the total pseudomomentum $k_{\mathrm{total}}$, which does commute with the Hamiltonian $H$. This fact is a key in order to prove the main result.

The plan of this paper is as follows: In Sec. 2, we find a conjugate operator for the Hamiltonian $\hat H$, and prove the Mourre estimate. A conjugate operator for $\hat H$ under consideration was not found before. Recently we found it for the two-body case in [1]. But since, by virtue of the merit of two-body systems, we used the relative coordinates and the center of mass coordinates in its representation, its definition was slightly complicated and unsuitable for finding it for general $N$ even if $L = 1$. Its definition in this paper may be suggestive in considering general cases. As mentioned above, the Mourre estimate is powerful in studying spectral and scattering theory for the Hamiltonian, and so finding a conjugate operator for the Hamiltonian itself is of interest to us. In Sec. 3, we show some propagation estimates which are useful for proving the asymptotic completeness, and which one can utilize even for studying long-range scattering for the system under consideration. One of the useful propagation estimates is called the minimal velocity estimate, which can be obtained by virtue of the Mourre estimate in Sec. 2. It seems new to show how to analyze the motions of neutral and charged particles in the directions perpendicular to $B$ simultaneously, although we can deal with the case when $L = 1$ only up to the present. In Sec. 4, we prove the main result, Theorem 1.1, of this paper. We give the proofs in the case $t \to \infty$ only. The case $t \to -\infty$ can be proved in the same way.

2. The Mourre Estimate

In this section, we find a conjugate operator for the Hamiltonian $\hat H$, and prove the Mourre estimate.

First we define the set of thresholds $\Theta$ for $H$ (or $\hat H$) by induction in the number of neutral particles in the system. If $N = 2$, we put $\Theta = \tau_2$ (see [1]). Here
\[
\tau_N = \left\{ \frac{|q_N| B}{m_N} \left( n + \frac{1}{2} \right) \Bigm| n \in \mathbb{N} \cup \{0\} \right\}. \tag{2.1}
\]
Next let $N \ge 3$ and suppose that the sets of thresholds are defined for all $k$-body systems in which the number of charged particles is just one, with $2 \le k \le N-1$. Let $a = \{C_1, \dots, C_{\#(a)}\} \in \mathscr{A}(a_{\max})$ with $\{N\} \subset C_{\#(a)}$. As we emphasized above, $H^{C_{\#(a)}}$ is just the $\#(C_{\#(a)})$-body Hamiltonian under consideration. Then one can define the set of thresholds $\tau_{a,c}$ for $H^{C_{\#(a)}}$ by the assumption of induction. Here it seems convenient that in the case when $C_{\#(a)} = \{N\}$ one puts $\tau_{a,c} = \emptyset$. Put $\sigma_{a,c} = \sigma_{pp}(H^{C_{\#(a)}})$. Next we consider $K(a)$ on $\mathscr{K}(a)$. As we noted above, (1.14) holds, and $K^a$ is an $(N-\#(C_{\#(a)}))$-body Schrödinger operator without external electromagnetic fields in the center of mass frame. Thus one can define the set of thresholds $\tau_{a,n}$ for $K^a$ in the usual way. Put $\sigma_{a,n} = \sigma_{pp}(K^a)$. And set $\tilde\tau_{a,n} = \tau_{a,n} \cup \sigma_{a,n}$ and $\tilde\tau_{a,c} = \tau_{a,c} \cup \sigma_{a,c}$. Now we define the set of thresholds $\Theta$ for
$H$ (or $\hat H$) by
\[
\Theta = \bigcup_{a \in \mathscr{A}(a_{\max})} (\tilde\tau_{a,n} + \tilde\tau_{a,c}). \tag{2.2}
\]
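As a small numerical illustration of (2.1) and (2.2), the following Python sketch lists the first few Landau levels and forms the sum set $\tilde\tau_{a,n} + \tilde\tau_{a,c}$ for a single cluster decomposition. The function names, the truncation parameter `n_max` and the sample sets are assumptions introduced here for illustration only; the actual threshold sets are those defined inductively above.

```python
from fractions import Fraction

def landau_levels(q, B, m, n_max):
    """First n_max Landau levels (|q| B / m)(n + 1/2), n = 0, ..., n_max - 1, cf. (2.1)."""
    omega = Fraction(abs(q)) * Fraction(B) / Fraction(m)
    return [omega * (n + Fraction(1, 2)) for n in range(n_max)]

def sum_set(tau_n, tau_c):
    """Sorted sum set {s + t : s in tau_n, t in tau_c}, one term of the union in (2.2)."""
    return sorted({s + t for s in tau_n for t in tau_c})

# Hypothetical data: tau_c is a truncated list of Landau levels,
# tau_n stands in for a (finite piece of a) neutral threshold set.
tau_c = landau_levels(q=1, B=2, m=1, n_max=3)
tau_n = [Fraction(-1), Fraction(0)]
theta_a = sum_set(tau_n, tau_c)
```

Exact rational arithmetic is used so that coinciding sums are merged reliably when the sum set is formed.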
Now we find the origin operator $A$ of a conjugate operator $\hat A$ for the Hamiltonian $\hat H$. In order to achieve this, we recall the argument in [1] for a two-body system. Begin with the following self-adjoint operator $A_1$ on $L^2(\mathbb{R}^{2\times 2} \times Z^{a_{\max}})$ for $H$:
\[
A_1 = \frac{1}{2} \bigl\{ (\langle z^{a_{\max}}, D_{z^{a_{\max}}} \rangle + \langle D_{z^{a_{\max}}}, z^{a_{\max}} \rangle) + (y_1 \cdot D_{y_1} + D_{y_1} \cdot y_1) \bigr\}. \tag{2.3}
\]
Putting $H_0 = H_{a_{\min}}$, one can obtain the following commutation relation by a straightforward computation:
\[
i[H_0, A_1] = -\Delta_{z^{a_{\max}}} + \frac{1}{m_1} D_{y_1}^2 = 2 \left( H_0 - \frac{1}{2m_2} (D_{y_2} - q_2 \mathbf{A}(y_2))^2 \right). \tag{2.4}
\]
As is well-known, the spectrum of the last term consists of the Landau levels $\tau_2$. The commutation relation (2.4) seems nice for studying the spectral theory for the reduced Hamiltonian $\hat H$. However, since $A_1$ does not commute with $k_{\mathrm{total}}$, $U^* A_1 U$ cannot be reduced on $\mathscr{H}$. In order to overcome this difficulty, we introduce the self-adjoint operator $\hat A_1$ on $\mathscr{H}$, which is obtained by removing the dependence on the total pseudomomentum $(D_{y_{cm,1}}, qB y_{cm,1})$ from the operator $U^* A_1 U$. This $\hat A_1$ is a conjugate operator for the reduced Hamiltonian $\hat H$. In [1], using the relative coordinates and the center of mass coordinates, we obtained this $\hat A_1$, but its representation was slightly complicated and unsuitable for generalizations to $N$-body systems. Now we review the argument in [1]: We see that the self-adjoint operator $U(\hat A_1 \otimes \mathrm{Id})U^*$ on $L^2(\mathbb{R}^{2\times 2} \times Z^{a_{\max}}) = U(\mathscr{H} \otimes L^2(\mathbb{R}_{y_{cm,1}}))$ can be written as
\[
U(\hat A_1 \otimes \mathrm{Id})U^* = \frac{1}{2} \bigl\{ (\langle z^{a_{\max}}, D_{z^{a_{\max}}} \rangle + \langle D_{z^{a_{\max}}}, z^{a_{\max}} \rangle) + (w_1 \cdot D_{y_1} + D_{y_1} \cdot w_1) \bigr\} \tag{2.5}
\]
with
\[
w_1 = y_1 + \kappa_0, \qquad \kappa_0 = \frac{2}{qB^2} \mathbf{A}(k_{\mathrm{total}}), \tag{2.6}
\]
where $\mathrm{Id}$ is the identity operator on $L^2(\mathbb{R}_{y_{cm,1}})$ and $\mathbf{A}$ denotes the vector potential. In this case, one knows that $q = q_2$, of course. Now we note that
\[
y_{cc} + \kappa_0 = y_2 + \kappa_0 = \frac{2}{qB^2} \mathbf{A}\bigl( D_{y_1} + (D_{y_2} - q_2 \mathbf{A}(y_2)) \bigr) \tag{2.7}
\]
is $H$-bounded. Since $y_{cc} + \kappa_0$ commutes with the total pseudomomentum $k_{\mathrm{total}}$, $U^*(y_{cc} + \kappa_0)U$ is $\hat H$-bounded, where we regarded $U^*(y_{cc} + \kappa_0)U$ as the reduced one acting on $\mathscr{H}$. We notice that one can write
\[
i[V_{12}, \hat A_1] = -(x_1 - x_2) \cdot \nabla V_{12}(x_1 - x_2) - (U^*(y_2 + \kappa_0)U) \cdot \nabla^{\perp} V_{12}(x_1 - x_2)
\]
on $\mathscr{H}$, noting that $V_{12}$ commutes with $k_{\mathrm{total}}$. Under the assumptions (V.1) and (V.2), we see that $(\hat H_0 + 1)^{-1} i[V_{12}, \hat A_1] (\hat H_0 + 1)^{-1}$ is bounded on $\mathscr{H}$, and that for any $\varepsilon > 0$ and real-valued $f \in C_0^\infty(\mathbb{R})$ there exists a compact operator $K$ on $\mathscr{H}$ such that
\[
f(\hat H)\, i[V_{12}, \hat A_1]\, f(\hat H) \ge -\varepsilon f(\hat H)^2 + K
\]
holds. Here we used the fact that $U^*(y_{cc} + \kappa_0)U$ is $\hat H$-bounded, which was mentioned above. Since both $D_{y_1}$ and $k_{\mathrm{total}}$ commute with $H_0$, it is clear that
\[
i[\hat H_0, \hat A_1] = 2 \left( \hat H_0 - U^* \left\{ \frac{1}{2m_2} (D_{y_2} - q_2 \mathbf{A}(y_2))^2 \right\} U \right) \tag{2.8}
\]
holds by virtue of (2.4), where $\hat H_0$ and $U^*\{(1/2m_2)(D_{y_2} - q_2 \mathbf{A}(y_2))^2\}U$ are the reduced operators acting on $\mathscr{H}$ of $H_0$ and $(1/2m_2)(D_{y_2} - q_2 \mathbf{A}(y_2))^2$, respectively. By virtue of these two estimates, we obtained the desired Mourre estimate in [1] (see also Theorem 2.1 in this section).

Now we return to the present problem. We define the origin operator $A$ of a conjugate operator $\hat A$ for the reduced Hamiltonian $\hat H$:
\[
A = \frac{1}{2} \Bigl\{ (\langle z^{a_{\max}}, D_{z^{a_{\max}}} \rangle + \langle D_{z^{a_{\max}}}, z^{a_{\max}} \rangle) + \sum_{j=1}^{N-1} (w_j \cdot D_{y_j} + D_{y_j} \cdot w_j) \Bigr\} \tag{2.9}
\]
with
\[
w_j = y_j + \kappa_0, \qquad \kappa_0 = \frac{2}{qB^2} \mathbf{A}(k_{\mathrm{total}}), \qquad j = 1, \dots, N-1. \tag{2.10}
\]
We see that $A$ commutes with the total pseudomomentum $k_{\mathrm{total}}$, by taking account of the fact that $D_{y_j}$ and $w_j$, $j = 1, \dots, N-1$, commute with $k_{\mathrm{total}}$. Here we note that $q = q_N$ and $y_{cc} = y_N$ in this case, and that
\[
y_{cc} + \kappa_0 = \frac{2}{qB^2} \mathbf{A}\Bigl( \sum_{j=1}^{N-1} D_{y_j} + (D_{y_N} - q_N \mathbf{A}(y_N)) \Bigr) \tag{2.11}
\]
is $H_0$-bounded. We also notice that $U^*(y_{cc} + \kappa_0)U$ is $\hat H_0$-bounded, since $y_{cc} + \kappa_0$ commutes with $k_{\mathrm{total}}$ and is $H_0$-bounded as we mentioned just now, where we regarded $U^*(y_{cc} + \kappa_0)U$ as the reduced one acting on $\mathscr{H}$. Since $D_{y_j}$, $j = 1, \dots, N-1$, and $k_{\mathrm{total}}$ all commute with $H_0$, it is clear that
\[
i[H_0, A] = -\Delta_{z^{a_{\max}}} + \sum_{j=1}^{N-1} \frac{1}{m_j} D_{y_j}^2 = 2 \left( H_0 - \frac{1}{2m_N} (D_{y_N} - q_N \mathbf{A}(y_N))^2 \right) \tag{2.12}
\]
holds. And we define a conjugate operator $\hat A$ for the reduced Hamiltonian $\hat H$ as the reduced operator on $\mathscr{H}$ of $A$. Nelson's commutator theorem guarantees the self-adjointness of $A$ and $\hat A$ (see e.g. [18]). Moreover, by virtue of the fact that $U^*(y_{cc} + \kappa_0)U$ is $\hat H_0$-bounded, one can check that $(\hat H_0 + 1)^{-1} i[V, \hat A] (\hat H_0 + 1)^{-1}$ is bounded on $\mathscr{H}$ in the same way as in the two-body case which we mentioned above, under the assumptions (V.1) and (V.2). We have only to keep in mind that $w_{j_1} - w_{j_2} = y_{j_1} - y_{j_2}$ with $1 \le j_1, j_2 \le N-1$.
Then we have the following main result of this section by virtue of the abstract Mourre theory (see e.g. [16] and [4]) and the HVZ theorem for the reduced Hamiltonian $\hat H$ (it is well-known that the HVZ theorem for $H$ cannot hold, since $H$ has the so-called Landau degeneracy which was proved in [3]):

Theorem 2.1. Suppose that the potential $V$ satisfies the conditions (V.1) and (V.2). Put $d(\lambda) = \operatorname{dist}(\lambda, \Theta \cap (-\infty, \lambda])$ for $\lambda \ge \inf \Theta$, where $\Theta$ is as in (2.2). Then for any $\lambda \ge \inf \Theta$ and any $\varepsilon > 0$, there exists a $\delta > 0$ such that for any real-valued $f \in C_0^\infty(\mathbb{R})$ supported in the open interval $(\lambda - \delta, \lambda + \delta)$, there exists a compact operator $K$ on $\mathscr{H}$ such that
\[
f(\hat H)\, i[\hat H, \hat A]\, f(\hat H) \ge 2(d(\lambda) - \varepsilon) f(\hat H)^2 + K \tag{2.13}
\]
holds. Moreover, eigenvalues of $\hat H$ can accumulate only at $\Theta$, and $\Theta \cup \sigma_{pp}(\hat H)$ is a closed countable set.

Proof. We follow the argument of Froese–Herbst [7]. We introduce a partition of unity $\{j_a\}_{a \in \mathscr{A}(a_{\max})}$ of the configuration space $X^{a_{\max}} = Y^{a_{\max}} \times Z^{a_{\max}}$: Let $j_a$ be a real-valued smooth function on $X^{a_{\max}}$ which is homogeneous of degree 0 outside the unit ball $\{x^{a_{\max}} \in X^{a_{\max}} \mid |x^{a_{\max}}| \le 1\}$, with the properties that on the support of $j_a$, $|x_j - x_k| \ge C_{(j,k)} |x^{a_{\max}}|$ holds outside $\{x^{a_{\max}} \in X^{a_{\max}} \mid |x^{a_{\max}}| \le 1\}$ for any pair $(j,k) \not\subset a$, where $x^{a_{\max}} = (y^{a_{\max}}, z^{a_{\max}})$, and that $\sum_{a \in \mathscr{A}(a_{\max})} j_a^2 \equiv 1$. Inserting $1 \equiv \sum_{a \in \mathscr{A}(a_{\max})} j_a^2$, we have
\[
f(\hat H)\, i[\hat H, \hat A]\, f(\hat H) \ge \sum_{a \in \mathscr{A}(a_{\max})} f(\hat H)\, j_a\, i[\hat H, \hat A]\, j_a\, f(\hat H) + K_1 \tag{2.14}
\]
with some compact operator $K_1$ on $\mathscr{H}$. Here we used the fact that for any function $g(x^{a_{\max}})$ such that $|g(x^{a_{\max}})| \to 0$ as $|x^{a_{\max}}| \to \infty$, $g(x^{a_{\max}})(\hat H + i)^{-1}$ is compact on $\mathscr{H}$, which was proved by Avron–Herbst–Simon [3] (see also [8, 9, 10] and [15]). Next we note that $\{f(\hat H) j_a - j_a f(\hat H_a)\}(\hat H_0 + 1)$ is compact on $\mathscr{H}$, which follows from the fact that $\{(\hat H - \zeta)^{-1} j_a - j_a (\hat H_a - \zeta)^{-1}\}(\hat H_0 + 1)$, $\zeta \in \mathbb{C} \setminus \mathbb{R}$, is compact on $\mathscr{H}$, which can be shown by the proof in [3] and a standard argument (see e.g. [7]). And also noticing that $(\hat H_0 + 1)^{-1} i[\hat H_a, \hat A] (\hat H_0 + 1)^{-1}$ is bounded on $\mathscr{H}$, we obtain
\[
f(\hat H)\, i[\hat H, \hat A]\, f(\hat H) \ge \sum_{a \in \mathscr{A}(a_{\max})} j_a\, f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a)\, j_a - \varepsilon f(\hat H)^2 + K_2 \tag{2.15}
\]
with some compact operator $K_2$ on $\mathscr{H}$, by virtue of (V.2). Now we claim that for each $a \in \mathscr{A}(a_{\max})$,
\[
f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a) \ge 2 \left( d(\lambda) - \frac{1}{2}\varepsilon \right) f(\hat H_a)^2 \tag{2.16}
\]
holds for sufficiently small $\delta > 0$, by virtue of the results for many body Schrödinger operators without external electromagnetic fields (see e.g. [4, 6, 7 and 17]) and in [1]:
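Let $a = \{C_1, \dots, C_{\#(a)}\} \in \mathscr{A}(a_{\max})$ with $\{N\} \subset C_{\#(a)}$. By a simple computation, we have
\[
i[H_a, A] = \sum_{j=1}^{\#(a)} \Bigl( \sum_{l \in C_j^n} \frac{1}{m_l} D_{y_l}^2 - \Delta_{z^{C_j}} + i[V^{C_j}, A] \Bigr) - \Delta_{z_a}. \tag{2.17}
\]
We notice that for $1 \le j < k \le N-1$,
\[
i[V_{jk}, A] = -(x_j - x_k) \cdot (\nabla V_{jk})(x_j - x_k)
\]
holds, and that for $1 \le j \le N-1$,
\[
i[V_{jN}, A] = -(x_j - x_N) \cdot (\nabla V_{jN})(x_j - x_N) - \frac{1}{2} \bigl\{ (y_{cc} + \kappa_0) \cdot (\nabla^{\perp} V_{jN})(x_j - x_N) + (\nabla^{\perp} V_{jN})(x_j - x_N) \cdot (y_{cc} + \kappa_0) \bigr\}
\]
holds. Since $H_a$ and $A$ commute with the total pseudomomentum $k_{\mathrm{total}}$, $U^* i[H_a, A] U = i[\hat H_a, \hat A]$ holds. Here we regarded $U^* i[H_a, A] U$ as the reduced one acting on $\mathscr{H}$. Taking account of the $\hat H_a$-boundedness of $U^*(y_{cc} + \kappa_0)U$, we obtain
\[
f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a) \ge f(\hat H_a)\, U^* \Bigl( \Bigl( \sum_{j=1}^{\#(a)} B_j \Bigr) - \Delta_{z_a} \Bigr) U f(\hat H_a) \tag{2.18}
\]
with, for $1 \le j \le \#(a) - 1$,
\[
B_j = \Bigl( \sum_{l \in C_j} \frac{1}{m_l} D_{y_l}^2 \Bigr) - \Delta_{z^{C_j}} - \sum_{\substack{\{l_1, l_2\} \subset C_j \\ l_1 < l_2}} (x_{l_1} - x_{l_2}) \cdot (\nabla V_{l_1 l_2})(x_{l_1} - x_{l_2}), \tag{2.19}
\]
and
\[
B_{\#(a)} = \Bigl( \sum_{l \in C^n_{\#(a)}} \frac{1}{m_l} D_{y_l}^2 \Bigr) - \Delta_{z^{C_{\#(a)}}} - \sum_{\substack{\{l_1, l_2\} \subset C_{\#(a)} \\ l_1 < l_2}} (x_{l_1} - x_{l_2}) \cdot (\nabla V_{l_1 l_2})(x_{l_1} - x_{l_2}) - \frac{1}{32}\varepsilon - C_\varepsilon \sum_{l \in C^n_{\#(a)}} |(\nabla^{\perp} V_{lN})(x_l - x_N)|^2, \tag{2.20}
\]
where $C_\varepsilon > 0$ depends on $\varepsilon > 0$ and $\lambda$ only, as long as $\delta > 0$ runs over the interval $(0, 1]$. Here we regard $U^*((\sum_{j=1}^{\#(a)} B_j) - \Delta_{z_a})U$ as the reduced one acting on $\mathscr{H}$, and notice that $C_j^n = C_j$ if $1 \le j \le \#(a) - 1$. First we note that
\[
\sum_{j=1}^{\#(a)-1} B_j = i[K^a, A_{K^a}] - \Delta_{y_{a,n}} \tag{2.21}
\]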
holds, where $A_{K^a}$ is the generator of dilations on $L^2(X^{a,n})$, which is one of the conjugate operators for $K^a$. And we also notice that each $B_j$ acts on $L^2(\mathbb{R}^{2\times\#(C_j)} \times Z^{C_j})$, and that $\Delta_{z_a}$, $K^a$ and $A_{K^a}$ are invariant under the transformation $U$, that is, $U^* \Delta_{z_a} U = \Delta_{z_a}$, $U^* K^a U = K^a$ and $U^* A_{K^a} U = A_{K^a}$ hold. Though we used both new and old coordinates at the same time here, we think that this does not confuse the reader. And we use the fact that if $C_{\#(a)} = \{N\}$, we have $B_{\#(a)} = -\varepsilon/32$ in the argument below.

Taking account of the form of $B_{\#(a)}$, we prove both the statement of the theorem and the following statement by induction with respect to the number $N-1$ of neutral particles in the systems: For sufficiently small $\delta > 0$,
\[
f(\hat H)\, U^* B_0 U f(\hat H) \ge 2 \left( \tilde d(\lambda) - \frac{1}{8}\varepsilon \right) f(\hat H)^2 \tag{2.22}
\]
holds, where
\[
\tilde d(s) =
\begin{cases}
\operatorname{dist}(s, (-\infty, s] \cap (\Theta \cup \sigma_{pp}(\hat H))) & \text{if } s \ge \inf(\Theta \cup \sigma_{pp}(\hat H)), \\
s + C & \text{if } s < \inf(\Theta \cup \sigma_{pp}(\hat H))
\end{cases}
\]
for some large constant $C > 0$ such that $-C < \inf(\Theta \cup \sigma_{pp}(\hat H))$, and
\[
B_0 = \Bigl( \sum_{j=1}^{N-1} \frac{1}{m_j} D_{y_j}^2 \Bigr) - \Delta_{z^{a_{\max}}} - \sum_{1 \le l_1 < l_2 \le N} (x_{l_1} - x_{l_2}) \cdot (\nabla V_{l_1 l_2})(x_{l_1} - x_{l_2}) - \frac{1}{32}\varepsilon - C_\varepsilon \sum_{j=1}^{N-1} |(\nabla^{\perp} V_{jN})(x_j - x_N)|^2 \tag{2.23}
\]
for some $C_\varepsilon > 0$. Here we took account of the fact that $B_0$ commutes with the total pseudomomentum $k_{\mathrm{total}}$ and regarded $U^* B_0 U$ as the reduced one acting on $\mathscr{H}$.

The theorem for $N = 2$ was proved in [1]. Then one can also prove the above estimate (2.22) for $N = 2$ as follows: When $N = 2$, $\{f(\hat H) - f(\hat H_0)\}(\hat H_0 + 1)$ is compact on $\mathscr{H}$ by (V.1). By taking account of
\[
B_0 = i[H_0, A] - (x_1 - x_2) \cdot (\nabla V_{12})(x_1 - x_2) - \frac{1}{32}\varepsilon - C_\varepsilon |(\nabla^{\perp} V_{12})(x_1 - x_2)|^2
\]
and following the same way as in [1], one can obtain
\[
f(\hat H)\, U^* B_0 U f(\hat H) \ge 2 \left( d(\lambda) - \frac{1}{16}\varepsilon \right) f(\hat H)^2 + K_3 \tag{2.24}
\]
with some compact operator $K_3$ on $\mathscr{H}$, for sufficiently small $\delta > 0$. Then, by virtue of the theorem for $N = 2$, we have
\[
f(\hat H)\, U^* B_0 U f(\hat H) \ge 2 \left( \tilde d\left( \lambda + \frac{1}{64}\varepsilon \right) - 2 \times \frac{1}{64}\varepsilon - \frac{1}{16}\varepsilon \right) f(\hat H)^2 \tag{2.25}
\]
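for sufficiently small $\delta > 0$, by following the idea of Froese–Herbst [7] (see also [4] and [6]). Since $\tilde d(\lambda + \varepsilon/64) \ge \tilde d(\lambda) - \varepsilon/64$, we obtain the estimate (2.22) for $N = 2$.

Now let $N_1 \ge 3$ and assume that the theorem and the above estimate (2.22) hold for any $N$ such that $2 \le N \le N_1 - 1$. Then we prove the theorem and the estimate (2.22) for $N = N_1$. Since (2.21) holds and (2.22) holds for any $N$ such that $2 \le N \le N_1 - 1$, by following the argument in [4], one can obtain the estimate (2.16). We sketch the proof of this: $H_a$ is decomposed into
\[
H_a = K^a \otimes \mathrm{Id} \otimes \mathrm{Id} + \mathrm{Id} \otimes T^a \otimes \mathrm{Id} + \mathrm{Id} \otimes \mathrm{Id} \otimes \left( -\tfrac{1}{2} \Delta_{z_a} \right)
\]
on $L^2(\mathbb{R}^{2\times N_1} \times Z^{a_{\max}}) = L^2(X^{a,n}) \otimes \mathscr{T}^a \otimes L^2(Z_a)$, where
\[
T^a = \left( -\tfrac{1}{2} \Delta_{y_{a,n}} \right) \otimes \mathrm{Id} + \mathrm{Id} \otimes H^{C_{\#(a)}} = T_1^a + T_2^a
\]
acting on $\mathscr{T}^a = L^2(Y_{a,n}) \otimes L^2(\mathbb{R}^{2\times\#(C_{\#(a)})} \times Z^{C_{\#(a)}})$. Here we recall that $K^a$ and $-\Delta_{z_a}/2$ are invariant under the transformation $U$. Introducing the reduced operator $\hat T_j^a$, $j = 1, 2$, on $\mathscr{H}$ of $U^*(\mathrm{Id} \otimes T_j^a \otimes \mathrm{Id})U$ and putting $\hat T^a = \hat T_1^a + \hat T_2^a$, we use a direct integral representation of $\hat H_a = K^a + \hat T^a + (-\Delta_{z_a}/2)$. In a direct integral decomposition
\[
\mathscr{H} = \int_0^\infty \int^{\oplus}_{\sigma(\hat T^a)} \mathscr{H}(t_1, t_2)\, d\mu(t_1)\, dt_2,
\]
$\hat H_a$ is represented by $K^a + t_1 + t_2$ on each fiber $\mathscr{H}(t_1, t_2)$. Since $K^a$, $\hat T^a$ and $-\Delta_{z_a}/2$ are all bounded from below, $f(K^a + t_1 + t_2)$ is nonzero in some compact subset $M \subset \sigma(\hat T^a) \times [0, \infty)$ only. Then by using the results for many body Schrödinger operators without external electromagnetic fields, in particular the result of [7], we see that for sufficiently small $\delta > 0$
\begin{align*}
f(K^a + t_1 + t_2)\, i[K^a, A_{K^a}]\, f(K^a + t_1 + t_2)
&\ge 2 \left( d_1\left( \lambda + \frac{1}{64}\varepsilon - (t_1 + t_2) \right) - 3 \times \frac{1}{64}\varepsilon \right) f(K^a + t_1 + t_2)^2 \\
&\ge 2 \left( d_1(\lambda - (t_1 + t_2)) - \frac{1}{64}\varepsilon - \frac{3}{64}\varepsilon \right) f(K^a + t_1 + t_2)^2 \tag{2.26}
\end{align*}
holds uniformly in $(t_1, t_2) \in M$. Here
\[
d_1(s) =
\begin{cases}
\operatorname{dist}(s, (-\infty, s] \cap \tilde\tau_{a,n}) & \text{if } s \ge \inf \tilde\tau_{a,n}, \\
s + C & \text{if } s < \inf \tilde\tau_{a,n}
\end{cases}
\]
for some large constant $C > 0$ such that $-C < \inf \tilde\tau_{a,n}$, and we used the fact that $d_1(s + \varepsilon/64) \ge d_1(s) - \varepsilon/64$. Thus we obtain
\[
f(\hat H_a)\, i[K^a, A_{K^a}]\, f(\hat H_a) \ge 2 \left( d_1\left( \lambda - \left( \hat T^a - \tfrac{1}{2} \Delta_{z_a} \right) \right) - \frac{1}{16}\varepsilon \right) f(\hat H_a)^2
\]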
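\begin{align*}
&= 2 \left( d_1(K^a + (\lambda - \hat H_a)) - \frac{1}{16}\varepsilon \right) f(\hat H_a)^2 \\
&\ge 2 \left( d_1(K^a) - \frac{1}{16}\varepsilon - \frac{1}{16}\varepsilon \right) f(\hat H_a)^2, \tag{2.27}
\end{align*}
where we took $\delta > 0$ as $\delta < \varepsilon/16$ and used the fact that $d_1(s_1 + s_2) \ge d_1(s_1) - |s_2|$.

Let $a \in \mathscr{A}(a_{\max})$ be such that $\{N_1\} \subsetneq C_{\#(a)}$. By virtue of (2.22) and the assumption of induction, one can obtain the following estimate similarly:
\[
f(\hat H_a)\, U^* B_{\#(a)} U f(\hat H_a) \ge 2 \left( d_2(\hat T_2^a) - \frac{1}{8}\varepsilon \right) f(\hat H_a)^2, \tag{2.28}
\]
where
\[
d_2(s) =
\begin{cases}
\operatorname{dist}(s, (-\infty, s] \cap \tilde\tau_{a,c}) & \text{if } s \ge \inf \tilde\tau_{a,c}, \\
s + C & \text{if } s < \inf \tilde\tau_{a,c}
\end{cases}
\]
for some large constant $C > 0$ such that $-C < \inf \tilde\tau_{a,c}$. Summing up these two estimates (2.27) and (2.28), and using (2.21), we have
\[
f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a) \ge 2 \left( d_1(K^a) + d_2(\hat T_2^a) + \hat T_1^a - \frac{1}{2} \Delta_{z_a} - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2. \tag{2.29}
\]
Now we show that for $\lambda_1 \in \sigma(K^a)$, $\lambda_2 \in \sigma(\hat T_2^a)$, $t_1 \in \sigma(\hat T_1^a) = [0, \infty)$ and $t_2 \in \sigma(-\Delta_{z_a}/2) = [0, \infty)$, we have
\[
d_1(\lambda_1) + d_2(\lambda_2) + t_1 + t_2 \ge d(\lambda_1 + \lambda_2 + t_1 + t_2). \tag{2.30}
\]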
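The inequality (2.30) can be checked numerically on toy data. The following Python sketch is only an illustration under simplifying assumptions made here: the threshold sets are replaced by finite lists, $\Theta$ is replaced by the single sum set of the two lists, and only energies at or above the lowest threshold are sampled. The names `dist_below` and `check_2_30` are hypothetical.

```python
import bisect
import random

def dist_below(s, thresholds):
    """dist(s, thresholds ∩ (-inf, s]) for s at or above the smallest threshold."""
    ts = sorted(thresholds)
    k = bisect.bisect_right(ts, s)
    if k == 0:
        raise ValueError("s lies below every threshold")
    return s - ts[k - 1]

def check_2_30(tau_n, tau_c, trials=1000, seed=0):
    """Sample (lambda_1, lambda_2, t_1, t_2) and test the inequality (2.30)."""
    rng = random.Random(seed)
    theta = sorted({s + t for s in tau_n for t in tau_c})  # toy stand-in for Theta
    for _ in range(trials):
        lam1 = min(tau_n) + rng.uniform(0.0, 5.0)
        lam2 = min(tau_c) + rng.uniform(0.0, 5.0)
        t1, t2 = rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0)
        lhs = dist_below(lam1, tau_n) + dist_below(lam2, tau_c) + t1 + t2
        rhs = dist_below(lam1 + lam2 + t1 + t2, theta)
        if lhs < rhs - 1e-9:  # small tolerance for floating-point rounding
            return False
    return True
```

The check succeeds for the same reason as in the argument that follows: the sum of the two thresholds realizing the distances on the left is itself an element of the sum set lying below the total energy.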
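We note that $d_1(\lambda_1) = \lambda_1 - \theta_1$ for some $\theta_1 \in \tilde\tau_{a,n} \cap (-\infty, \lambda_1]$, and $d_2(\lambda_2) = \lambda_2 - \theta_2$ for some $\theta_2 \in \tilde\tau_{a,c} \cap (-\infty, \lambda_2]$. Then the left-hand side is equal to $\lambda_1 + \lambda_2 + t_1 + t_2 - (\theta_1 + \theta_2)$. However, by the definition of the set of thresholds of $\hat H$, $\theta_1 + \theta_2 \in \Theta \cap (-\infty, \lambda_1 + \lambda_2 + t_1 + t_2]$ since $t_1, t_2 \ge 0$. This implies the inequality (2.30) by the definition of $d(s)$. Combining these inequalities (2.29) and (2.30), we have
\begin{align*}
f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a)
&\ge 2 \left( d(\hat H_a) - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \\
&= 2 \left( d(\lambda + (\hat H_a - \lambda)) - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \\
&\ge 2 \left( d(\lambda) - \frac{1}{4}\varepsilon - \frac{1}{16}\varepsilon \right) f(\hat H_a)^2. \tag{2.31}
\end{align*}
Therefore we get the estimate (2.16) for $a \in \mathscr{A}(a_{\max})$ such that $\{N_1\} \subsetneq C_{\#(a)}$.

Let $a \in \mathscr{A}(a_{\max})$ be such that $C_{\#(a)} = \{N_1\}$. We first notice that
\[
H^{C_{\#(a)}} = \frac{1}{2m_{N_1}} (D_{y_{N_1}} - q_{N_1} \mathbf{A}(y_{N_1}))^2,
\]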
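whose spectrum is the set of Landau levels $\tau_{N_1}$. Then, by following the same way as above (see also [1]), one can get
\[
f(\hat H_a)\, \hat T_1^a f(\hat H_a) = f(\hat H_a) (\hat T^a - \hat T_2^a) f(\hat H_a) \ge \left( d_3(\hat T^a) - \frac{1}{8}\varepsilon \right) f(\hat H_a)^2 \tag{2.32}
\]
for sufficiently small $\delta > 0$ (such that $\delta < \varepsilon/4$). Here
\[
d_3(s) =
\begin{cases}
\operatorname{dist}(s, (-\infty, s] \cap \tau_{N_1}) & \text{if } s \ge \inf \tau_{N_1}, \\
s & \text{if } s < \inf \tau_{N_1}.
\end{cases}
\]
Summing up the estimates (2.27) and (2.32), we have
\begin{align*}
f(\hat H_a)\, i[\hat H_a, \hat A]\, f(\hat H_a)
&\ge 2 \left( d_1(K^a) + d_3(\hat T^a) - \frac{1}{2} \Delta_{z_a} - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \\
&\ge 2 \left( d(\hat H_a) - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \\
&= 2 \left( d(\lambda + (\hat H_a - \lambda)) - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \\
&\ge 2 \left( d(\lambda) - \frac{1}{4}\varepsilon - \frac{1}{4}\varepsilon \right) f(\hat H_a)^2 \tag{2.33}
\end{align*}
in a way quite similar to the one above. Therefore we get the estimate (2.16) for $a \in \mathscr{A}(a_{\max})$ such that $C_{\#(a)} = \{N_1\}$, too.

By combining (2.15) with (2.16), and using the fact that $\{f(\hat H) j_a - j_a f(\hat H_a)\}(\hat H_0 + 1)$ is compact on $\mathscr{H}$ again, we have the Mourre estimate (2.13). The other statement in the theorem follows from the Mourre estimate (2.13) and the abstract Mourre theory, by taking account of the HVZ theorem for $\hat H$, which says that
\[
\inf \sigma_{\mathrm{ess}}(\hat H) = \inf_{a \in \mathscr{A}(a_{\max})} \sigma(\hat H_a) = \inf \Theta. \tag{2.34}
\]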
After getting the statement of the theorem for N = N1 , one can also prove the estimate (2.22) for N = N1 in the way quite similar to the one in the proof of the Mourre estimate (2.13). Therefore the theorem is proved. Remark 2.2. By regarding the conjugate operator Aˆ as Aˆ = 0 when N − 1 = 0, maybe one can start from N = 1 in the induction argument and the proof may become simpler. However, the situation of systems which have some neutral particles seems different from the one of systems that have no neutral particle. Hence we started from N − 1 = 1 in order to present the difference in the above proof. As one has already seen, the above argument in the proof is valid also for the case when the space dimension is two.
In order to study the scattering theory for the Hamiltonian $H$, the following corollary seems useful, which follows immediately from the fact that $\hat H$ is the reduced operator on $\mathscr{H}$ of $H$ and a standard argument (cf. [1]):

Corollary 2.3. Suppose that the potential $V$ satisfies the conditions (V.1) and (V.2). Then for any $\lambda \in \mathbb{R} \setminus (\Theta \cup \sigma_{pp}(H))$, there exist $\delta > 0$ and $c > 0$ such that for any real-valued $f \in C_0^\infty(\mathbb{R})$ supported in the open interval $(\lambda - \delta, \lambda + \delta)$,
\[
f(H)\, i[H, A]\, f(H) \ge c f(H)^2 \tag{2.35}
\]
holds.

3. Propagation Estimates

In this section, we prove some propagation estimates which are useful for showing the asymptotic completeness for the system under consideration. Throughout this section, we assume that the potential $V$ satisfies the following condition (LR) as well as (V.1), (V.2) and (V.3).

(LR) $V_{jk}$ is decomposed as $V_{jk} = V_{jk,S} + V_{jk,L}$, where $V_{jk,L} \in C^\infty(\mathbb{R}^3)$ is real-valued and satisfies $|\partial_r^\alpha V_{jk,L}(r)| \le C_\alpha \langle r \rangle^{-|\alpha| - \mu_L}$ with $0 < \mu_L \le 1$, and $V_{jk,S}$ is real-valued and such that $\nabla V_{jk,S}$ is $-\Delta$-bounded and
\[
\left\| 1_{[1,\infty)}\!\left( \frac{|r|}{R} \right) V_{jk,S} (-\Delta + 1)^{-1} \right\| = O(R^{-\mu_{S1}}), \qquad
\left\| 1_{[1,\infty)}\!\left( \frac{|r|}{R} \right) \nabla V_{jk,S} (-\Delta + 1)^{-1} \right\| = O(R^{-1-\mu_{S2}})
\]
as $R \to \infty$, with $\mu_{S1} > 1$ and $\mu_{S2} > 0$.

One can use this condition (LR) in the study of long-range scattering for $N$-body quantum systems in a constant magnetic field under the condition that the number of charged particles in the systems is only one. We note that by putting $V_L \equiv 0$, (LR) implies (SR).

Inspired by [1], we first introduce the configuration space $\mathscr{X} = \mathbb{R}^{2\times(N-1)} \times Z^{a_{\max}}$, which is equipped with the metric
\[
\langle \Xi, \tilde\Xi \rangle = \Bigl( \sum_{j=1}^{N-1} m_j\, y_j \cdot \tilde y_j \Bigr) + \langle z^{a_{\max}}, \tilde z^{a_{\max}} \rangle, \qquad |\Xi|_1 = \sqrt{\langle \Xi, \Xi \rangle}
\]
for $\Xi = (y_1, \dots, y_{N-1}, z^{a_{\max}}) \in \mathscr{X}$ and $\tilde\Xi = (\tilde y_1, \dots, \tilde y_{N-1}, \tilde z^{a_{\max}}) \in \mathscr{X}$. We denote the velocity operator associated with $\Xi$ by $p_\Xi = -i\nabla_\Xi$. Now, for $a = \{C_1, \dots, C_{\#(a)}\} \in \mathscr{A}$ with $\{N\} \subset C_{\#(a)}$, we introduce two subspaces $\mathscr{X}^a$ and $\mathscr{X}_a$ of $\mathscr{X}$ as follows:
\[
\mathscr{X}^a = \Bigl\{ (y_1, \dots, y_{N-1}) \in \mathbb{R}^{2\times(N-1)} \Bigm| \sum_{k \in C_j} m_k y_k = 0 \text{ for any } j = 1, \dots, \#(a) - 1 \Bigr\} \times Z^a,
\]
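\[
\mathscr{X}_a = \bigl\{ (y_1, \dots, y_{N-1}) \in \mathbb{R}^{2\times(N-1)} \bigm| y_{l_1} = y_{l_2} \text{ if } l_1, l_2 \in C_j, \text{ for any } j = 1, \dots, \#(a) - 1;\ y_k = 0 \text{ if } k \in C_{\#(a)} \bigr\} \times Z_a.
\]
We see that these two subspaces are mutually orthogonal, and that $\mathscr{X}^a \oplus \mathscr{X}_a = \mathscr{X}$. We denote by $\pi^a$ and $\pi_a$ the orthogonal projections of $\mathscr{X}$ onto $\mathscr{X}^a$ and $\mathscr{X}_a$, respectively. And we write $\Xi^a = \pi^a \Xi$ and $\Xi_a = \pi_a \Xi$. Denoting the velocity operators associated with $\Xi^a$ and $\Xi_a$ by $p_{\Xi^a} = -i\nabla_{\Xi^a}$ and $p_{\Xi_a} = -i\nabla_{\Xi_a}$, respectively, we see that $p_{\Xi^a} = \pi^a p_\Xi$ and $p_{\Xi_a} = \pi_a p_\Xi$. For $a, b \in \mathscr{A}$, we denote the smallest cluster decomposition $c \in \mathscr{A}$ with $a \subset c$ and $b \subset c$ by $a \cup b$, whose existence and uniqueness are well-known. Then we note that for $a, b \in \mathscr{A}$, $\mathscr{X}_{a \cup b} = \mathscr{X}_a \cap \mathscr{X}_b$ holds, which can be seen easily. For $\delta > 0$, we put
\[
\mathscr{X}_a(\delta) = \{ \Xi \in \mathscr{X} \mid |\Xi^a|_1 < \delta \}. \tag{3.1}
\]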
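The two projections can be made concrete in a simplified setting. The Python sketch below is an illustration under assumptions made here, not part of the construction: it keeps only the $y$-coordinates (the factors $Z^a$ and $Z_a$ are dropped), and the function names are hypothetical. The $y$-part of $\pi_a$ replaces every particle of a neutral cluster by the center of mass of that cluster and sets to zero the neutral particles sitting in the charged cluster; $\pi^a$ is the complement.

```python
def mass_inner(u, v, masses):
    """Mass-weighted inner product sum_j m_j u_j . v_j on R^{2 x (N-1)}."""
    return sum(m * (a[0] * b[0] + a[1] * b[1]) for m, a, b in zip(masses, u, v))

def project_external(y, masses, neutral_clusters):
    """y-part of pi_a: centers of mass on neutral clusters, zero elsewhere."""
    out = [(0.0, 0.0) for _ in y]
    for cluster in neutral_clusters:
        total = sum(masses[k] for k in cluster)
        cx = sum(masses[k] * y[k][0] for k in cluster) / total
        cy = sum(masses[k] * y[k][1] for k in cluster) / total
        for k in cluster:
            out[k] = (cx, cy)
    return out

def project_internal(y, masses, neutral_clusters):
    """y-part of pi^a = Id - pi_a."""
    ext = project_external(y, masses, neutral_clusters)
    return [(p[0] - e[0], p[1] - e[1]) for p, e in zip(y, ext)]

# Hypothetical example: four neutral particles; {0, 1} and {2} are neutral
# clusters, while particle 3 belongs to the charged cluster.
masses = [1.0, 3.0, 2.0, 5.0]
y = [(1.0, 0.0), (-1.0, 2.0), (0.5, 0.5), (2.0, -1.0)]
clusters = [[0, 1], [2]]
xi_ext = project_external(y, masses, clusters)
xi_int = project_internal(y, masses, clusters)
```

In this example the two components are orthogonal with respect to the mass-weighted inner product, and the internal component has vanishing mass-weighted sum over each neutral cluster, as required by the definition of $\mathscr{X}^a$.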
Now we would like to introduce the so-called Graf vector field. We recall how to construct it in [11] and [5] (see also [6]): Let $\rho = \{\rho_a \mid a \in \mathscr{A}\}$ be a sequence of non-negative numbers indexed by elements of $\mathscr{A}$ such that $\rho_{a_{\min}} = 0$. Then we introduce the set
\[
\Omega_a(\rho) = \{ \Xi \in \mathscr{X} \mid |\Xi_a|_1^2 + \rho_a > |\Xi_b|_1^2 + \rho_b \text{ for any } b \in \mathscr{A} \text{ such that } b \subsetneq a \text{ or } a \subsetneq b \} \tag{3.2}
\]
for $a \in \mathscr{A}$, by combining the idea of Graf [11] with the one of Dereziński [5] (see also [6]).

Since the seminorm $(|\Xi^a|_1^2 + |\Xi^b|_1^2)^{1/2}$ is a norm on $\mathscr{X}^{a \cup b}$, we have
\[
10 r \times |\Xi^{a \cup b}|_1^2 \le |\Xi^a|_1^2 + |\Xi^b|_1^2 \tag{3.3}
\]
for all $a, b \in \mathscr{A}$ with sufficiently small $r > 0$. We also require $r \le 1/5$. Now we take a sequence $\rho$ such that
\[
\frac{1}{2} r^{\#(a)} \le \rho_a \le r^{\#(a)}, \qquad a \ne a_{\min}. \tag{3.4}
\]
And, for simplicity of notation, we put $r^{\#(a_{\min})} \equiv 0$. The following lemma and proposition can be obtained as in [11, 5 and 6], and so we omit the proofs.

Lemma 3.1. Let $a, b \in \mathscr{A}$. Assume that $a \not\subset b$ and $b \not\subset a$. If $|\Xi^{a \cup b}|_1^2 \ge r^{\#(a \cup b)}/2 - r^{\#(a)}$ and $|\Xi^a|_1^2 \le 2 r^{\#(a)}$, then one has
\[
|\Xi^b|_1^2 \ge r^{\#(b)}.
\]
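Proposition 3.2. Let $a, b \in \mathscr{A}$.
(1) For $a \ne b$, the intersection $\overline{\Omega_a(\rho)} \cap \overline{\Omega_b(\rho)}$ is a set of measure zero. Here $\overline{\Omega_a(\rho)}$ and $\overline{\Omega_b(\rho)}$ are the closures of the sets $\Omega_a(\rho)$ and $\Omega_b(\rho)$, respectively. The family of sets $\{\Omega_a(\rho) \mid a \in \mathscr{A}\}$ is a family of disjoint open sets in $\mathscr{X}$ and one has
\[
\bigcup_{a \in \mathscr{A}} \overline{\Omega_a(\rho)} = \mathscr{X}.
\]
(2) For $\Xi \in \Omega_a(\rho)$ and $b \not\subset a$, one has
\[
|\Xi^b|_1^2 \ge \frac{3}{10} r^{\#(b)} \ge \frac{3}{10} r^{N-1}.
\]
Moreover, for $\delta > 0$ such that $\delta < \sqrt{3 r^{N-1}/10}$,
\[
\mathscr{X}_a(\delta) \subset \bigcup_{b \in \tilde{\mathscr{A}}_a} \overline{\Omega_b(\rho)}
\]
holds with $\tilde{\mathscr{A}}_a = \{ b \in \mathscr{A} \mid a \subset b \}$.

Such a family of sets $\{\Omega_a(\rho) \mid a \in \mathscr{A}\}$ should be called a Graf partition of $\mathscr{X}$. Now we put
\[
R_\rho(\Xi) = \frac{1}{2} \max_{a \in \mathscr{A}} (|\Xi_a|_1^2 + \rho_a), \tag{3.5}
\]
\[
q_{a,\rho}(\Xi) = 1_{\Omega_a(\rho)}(\Xi), \qquad a \in \mathscr{A}, \tag{3.6}
\]
where $1_{\Omega_a(\rho)}$ is the characteristic function of the set $\Omega_a(\rho)$. The following proposition can be shown as in [5, 6], by virtue of Proposition 3.2. Hence we omit the proof.

Proposition 3.3. $R_\rho(\Xi)$ is a continuous convex function on $\mathscr{X}$. Moreover, one has the following:
\[
R_\rho(\Xi) = \frac{1}{2} \sum_{a \in \mathscr{A}} q_{a,\rho}(\Xi) (|\Xi_a|_1^2 + \rho_a)
\]
holds up to a set of measure zero.
\[
(\nabla_\Xi R_\rho)(\Xi) = \sum_{a \in \mathscr{A}} \Xi_a\, q_{a,\rho}(\Xi), \qquad
(\nabla_\Xi^2 R_\rho)(\Xi) \ge \sum_{a \in \mathscr{A}} \pi_a\, q_{a,\rho}(\Xi),
\]
\[
\langle \xi, (\nabla_\Xi^2 R_\rho)(\Xi) \xi \rangle - \langle \xi, (\nabla_\Xi R_\rho)(\Xi) \rangle - \langle (\nabla_\Xi R_\rho)(\Xi), \xi \rangle + 2 R_\rho(\Xi) \ge \sum_{a \in \mathscr{A}} q_{a,\rho}(\Xi)\, |\xi_a - \Xi_a|_1^2
\]
for $\xi \in \mathscr{X}$ hold as distributions.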
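\[
\max\{ |\Xi|_1^2, C_1 \} \le 2 R_\rho(\Xi) \le |\Xi|_1^2 + C_2 \qquad \text{for some } C_1, C_2 > 0,
\]
and for any $a \in \mathscr{A}$, $R_\rho$ depends on $\Xi_a$ only in some neighborhood of $\mathscr{X}_a$.

Next we fix a function $f_0 \in C_0^\infty\bigl( \times_{a \ne a_{\min}} [r^{\#(a)}/2, r^{\#(a)}] \bigr)$ such that
\[
f_0 \ge 0, \qquad \int f_0(\rho)\, d\rho = 1,
\]
where $d\rho = \bigotimes_{a \ne a_{\min}} d\rho_a$. Then we define
\[
R(\Xi) = \int f_0(\rho) R_\rho(\Xi)\, d\rho, \tag{3.7}
\]
\[
\tilde q_a(\Xi) = \int f_0(\rho)\, q_{a,\rho}(\Xi)\, d\rho, \qquad a \in \mathscr{A}, \tag{3.8}
\]
\[
q_a(\Xi) = \frac{\tilde q_a(\Xi)}{\sqrt{\sum_{b \in \mathscr{A}} \tilde q_b^{\,2}(\Xi)}}, \qquad a \in \mathscr{A}. \tag{3.9}
\]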
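The following proposition can also be shown in the same way as in [5] and [6], by virtue of Proposition 3.3. So we omit the proof.

Proposition 3.4. $R(\Xi)$ is a smooth convex function on $\mathscr{X}$. $\tilde q_a(\Xi)$ and $q_a(\Xi)$, $a \in \mathscr{A}$, are all bounded smooth functions on $\mathscr{X}$ with bounded derivatives. Moreover, one has
\[
\sum_{a \in \mathscr{A}} \tilde q_a(\Xi) \equiv 1, \qquad \sum_{a \in \mathscr{A}} q_a^2(\Xi) \equiv 1,
\]
\[
\max\{ |\Xi|_1^2, C_1 \} \le 2 R(\Xi) \le |\Xi|_1^2 + C_2 \qquad \text{for some } C_1, C_2 > 0,
\]
\[
(\nabla_\Xi R)(\Xi) = \sum_{a \in \mathscr{A}} \Xi_a\, \tilde q_a(\Xi), \qquad
(\nabla_\Xi^2 R)(\Xi) \ge \sum_{a \in \mathscr{A}} \pi_a\, \tilde q_a(\Xi),
\]
\[
\langle \xi, (\nabla_\Xi^2 R)(\Xi) \xi \rangle - \langle \xi, (\nabla_\Xi R)(\Xi) \rangle - \langle (\nabla_\Xi R)(\Xi), \xi \rangle + 2 R(\Xi) \ge \sum_{a \in \mathscr{A}} \tilde q_a(\Xi)\, |\xi_a - \Xi_a|_1^2, \qquad \xi \in \mathscr{X},
\]
and that for any $a \in \mathscr{A}$, $R$ depends on $\Xi_a$ only in some neighborhood of $\mathscr{X}_a$. For any multi-index $\alpha$, $\partial_\Xi^\alpha (2R(\Xi) - |\Xi|_1^2)$, $\partial_\Xi^\alpha (\langle \Xi, (\nabla_\Xi R)(\Xi) \rangle - |\Xi|_1^2)$ and $\partial_\Xi^\alpha (\langle \Xi, (\nabla_\Xi^2 R)(\Xi) \Xi \rangle - |\Xi|_1^2)$ are all bounded functions on $\mathscr{X}$.

The following lemma is also needed for proving some propagation estimates in this section.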
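Lemma 3.5. There exists $\sigma_0 > 0$ such that for any $\sigma \ge \sigma_0$, one has
\[
q_a(\Xi) = q_a(\Xi) \sum_{b \in \mathscr{A}_a} \tilde q_b(\sigma \Xi), \qquad a \in \mathscr{A},
\]
where $\mathscr{A}_a = \{ b \in \mathscr{A} \mid b \subset a \}$.

Proof. By virtue of Proposition 3.4, it is sufficient to show that there exists $\sigma_0 > 0$ such that for any $\sigma \ge \sigma_0$ and $b \not\subset a$, $\operatorname{supp} q_a(\Xi) \cap \operatorname{supp} \tilde q_b(\sigma \Xi) = \emptyset$. By the definition of $q_a(\Xi)$ and Proposition 3.2, we see that for $\Xi \in \operatorname{supp} q_a(\Xi)$ and $b \not\subset a$,
\[
|\Xi^b|_1^2 \ge \frac{3}{10} r^{\#(b)}
\]
holds. On the other hand, for $\Xi \in \operatorname{supp} \tilde q_b(\sigma \Xi)$,
\[
|\sigma \Xi^b|_1^2 = |\sigma \Xi_{a_{\min}}|_1^2 - |\sigma \Xi_b|_1^2 \le r^{\#(b)}, \qquad b \ne a_{\min},
\]
holds, where we took account of (3.4). Thus taking $\sigma_0 > 0$ as $\sigma_0 > \sqrt{10/3}$, we obtain this lemma.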
Following the argument of [3], we introduce the creation operator $\beta^*$ by using the total pseudomomentum $k_{\mathrm{total}} = (k_{\mathrm{total},1}, k_{\mathrm{total},2})$ as follows (see also [1]):
\[
\beta^* = \frac{1}{\sqrt{2}} \left( \frac{1}{qB} k_{\mathrm{total},2} - i k_{\mathrm{total},1} \right). \tag{3.10}
\]
Here we took account of (1.8). In the argument below, we use the localization of the number operator $N_0 = \beta^* \beta$ in addition to the localization of the energy.

Now we show the following important propagation estimate, which is due to Graf [11] in the case of $N$-body Schrödinger operators without external electromagnetic fields (see also [5, 6]).

Theorem 3.6. Let $a \in \mathscr{A}$, let $J \in C_0^\infty(\mathscr{X})$ be a cut-off function such that $J = 1$ on $\{\Xi \in \mathscr{X} \mid |\Xi|_1 \le \theta\}$ and $J \ge 0$, and let $f, h \in C_0^\infty(\mathbb{R})$ be real-valued. Suppose that $\max\{(1 + \mu_{S2})^{-1}, (1 + \mu_L)^{-1}\} < \nu \le 1$. Then, for sufficiently large $\theta > 0$, there exists a constant $C > 0$ such that for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$
\[
\int_1^\infty \left\| \left| \frac{\Xi_a}{t} - p_{\Xi_a} \right|_1 q_a\!\left( \frac{\Xi}{t^\nu} \right) J\!\left( \frac{\Xi}{t} \right) f(H) h(N_0) e^{-itH} \psi \right\|^2 \frac{dt}{t} \le C \|\psi\|^2
\]
holds.
In order to prove this theorem, we need the following lemma.

Lemma 3.7. Let $f, h \in C_0^\infty(\mathbb{R})$ and $\nu > 0$. Then for any $\varepsilon > 0$ and $M \in \mathbb{N}$, one has
$$\Bigl\| 1_{[\varepsilon,\infty)}\Bigl(\frac{|y_N|}{t^\nu}\Bigr) f(H) h(N_0) \Bigr\| = O(t^{-M\nu})$$
as $t \to \infty$.

Proof. First we recall that $|y_N|^M f(H) h(N_0)$, $M \in \mathbb{N}$, is bounded on $L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, because of the relation
$$2qA(y_N) = k_{\mathrm{total}} - \sum_{j=1}^{N-1} D_{y_j} - (D_{y_N} - qA(y_N))$$
and the boundedness of the operators $(k_{\mathrm{total}})^\alpha h(N_0)$ and $\{(\sum_{j=1}^{N-1} D_{y_j}) + (D_{y_N} - qA(y_N))\}^\alpha f(H)$, $\alpha \in (\mathbb{N} \cup \{0\})^2$. For $u_1 \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, we put $u = 1_{[\varepsilon,\infty)}(|y_N|/t^\nu) f(H) h(N_0) u_1$. Then we have
$$(\varepsilon t^\nu)^M \|u\| \le \| |y_N|^M u \| \le C\|u_1\|$$
with some $C > 0$ independent of $t \ge 1$. This implies the lemma.

We also need the following maximal velocity estimate in order to prove Theorem 3.6.

Proposition 3.8. For any real-valued $f \in C_0^\infty(\mathbb{R})$ there exists $M > 0$ such that for any $M_2 > M_1 \ge M$,
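$$\int_1^\infty \Bigl\| 1_{[M_1,M_2]}\Bigl(\frac{|\Xi|_1}{t}\Bigr) f(H) e^{-itH}\psi \Bigr\|^2 \frac{dt}{t} \le C\|\psi\|^2$$
for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, with $C > 0$ independent of $\psi$. Moreover, for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$ such that $(1 + |\Xi|_1)^{1/2}\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$,
$$\int_1^\infty \Bigl\| 1_{[M_1,\infty)}\Bigl(\frac{|\Xi|_1}{t}\Bigr) f(H) e^{-itH}\psi \Bigr\|^2 \frac{dt}{t} < \infty$$
holds.

Proof. The proof is done in a way similar to the one in [20] (see also e.g. [5, 6, 11 and 1]). We sketch the proof. Let $G_1$ be defined by
$$G_1(s) = \int_{-\infty}^s g_1(u)^2\, du,$$
with real-valued $g_1 \in C_0^\infty(\mathbb{R})$ supported in $[M_0, \infty)$ such that $g_1 = 1$ on $[M_1, M_2]$. We use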
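$$\Phi_1(t) = -G_1\Bigl(\frac{|\Xi|_1}{t}\Bigr)$$
as a propagation observable. We note that $\Phi_1(t)$ is uniformly bounded in $t \ge 1$. The Heisenberg derivative $D_H(\Phi_1(t)) = \partial_t \Phi_1(t) + i[H, \Phi_1(t)]$ of $\Phi_1(t)$ is calculated as follows:
$$D_H(\Phi_1(t)) = \Bigl\{ \frac{|\Xi|_1}{t^2}\, g_1\Bigl(\frac{|\Xi|_1}{t}\Bigr)^2 - \frac{1}{2t}\Bigl( \Bigl\langle g_1\Bigl(\frac{|\Xi|_1}{t}\Bigr)^2 \frac{\Xi}{|\Xi|_1}, p_\Xi \Bigr\rangle + \Bigl\langle p_\Xi, \frac{\Xi}{|\Xi|_1}\, g_1\Bigl(\frac{|\Xi|_1}{t}\Bigr)^2 \Bigr\rangle \Bigr) \Bigr\}.$$
By virtue of the boundedness of $p_\Xi f(H)$, which can be seen easily, we have
$$f(H) D_H(\Phi_1(t)) f(H) \ge \frac{M_0 - C}{t}\, f(H)\, g_1\Bigl(\frac{|\Xi|_1}{t}\Bigr)^2 f(H) + O(t^{-2})$$
for some $C > 0$ which depends on $f$. Then if we take $M_0 > 0$ so large that $M_0 > C$, we obtain the first estimate. Next we put
$$G_2(s) = \int_{-\infty}^s g_2(u)^2\, du$$
with real-valued $g_2 \in C_0^\infty(\mathbb{R})$ which is supported in $[M_0, M_0 + 1]$ and satisfies $\int_{-\infty}^\infty g_2(u)^2\, du = 1$. We use
$$\Phi_2(t) = -\Bigl(\frac{|\Xi|_1}{t} - M_0\Bigr) G_2\Bigl(\frac{|\Xi|_1}{t}\Bigr)$$
as a propagation observable. We note that $(1 + |\Xi|_1)^{-1/2} e^{itH} f(H) \Phi_2(t) f(H) e^{-itH} (1 + |\Xi|_1)^{-1/2}$ is uniformly bounded in $t \ge 1$. In the same way as above, we have
$$f(H) D_H(\Phi_2(t)) f(H) \ge f(H) \Bigl\{ \frac{M_0 - C_1}{t}\, G_2\Bigl(\frac{|\Xi|_1}{t}\Bigr) - \frac{C_2}{t}\, g_2\Bigl(\frac{|\Xi|_1}{t}\Bigr)^2 + O(t^{-2}) \Bigr\} f(H).$$
Thus if we take $M_0$ so large that $M_0 > C_1$ and the first estimate holds, we obtain the second estimate.

Proof of Theorem 3.6. We follow the argument of [11, 5 and 6]. First we take $\nu_1 > 0$ such that $\max\{(1 + \mu_{S2})^{-1}, (\mu_L + 1)^{-1}\} < \nu_1 < \nu \le 1$. We define
$$B_t = t^{\nu_1 - 1}\, \frac{1}{2} \Bigl( \Bigl\langle (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr), p_\Xi \Bigr\rangle + \Bigl\langle p_\Xi, (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle \Bigr) - t^{2\nu_1 - 2} R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) + \nu_1 t^{2\nu_1 - 2} \Bigl( 2R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) - \Bigl\langle \frac{\Xi}{t^{\nu_1}}, (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle \Bigr),$$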
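$$C_t = \Bigl\langle p_\Xi - \nu_1 \frac{\Xi}{t}, (\nabla_\Xi^2 R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( p_\Xi - \nu_1 \frac{\Xi}{t} \Bigr) \Bigr\rangle - (1 - \nu_1) t^{\nu_1 - 1} \Bigl( \Bigl\langle (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr), p_\Xi - \nu_1 \frac{\Xi}{t} \Bigr\rangle + \Bigl\langle p_\Xi - \nu_1 \frac{\Xi}{t}, (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle \Bigr) + 2(1 - \nu_1)^2 t^{2\nu_1 - 2} R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr).$$
Then, by a straightforward computation, we have
$$B_t = D_{H_0}\Bigl( t^{2\nu_1 - 1} R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr), \tag{3.11}$$
$$D_{H_0} B_t = t^{-1} C_t - \frac{1}{4}\, t^{-2\nu_1 - 1} (\Delta_\Xi^2 R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) + t^{2\nu_1 - 3} \nu_1 (\nu_1 - 1) \Bigl( 2R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) - \Bigl\langle \frac{\Xi}{t^{\nu_1}}, (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle \Bigr). \tag{3.12}$$
One can write $C_t$ as
$$C_t = (1 - \nu_1)^2 t^{2\nu_1 - 2} \Bigl\{ \Bigl\langle \tau, (\nabla_\Xi^2 R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \tau \Bigr\rangle - \Bigl\langle (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr), \tau \Bigr\rangle + 2R\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) - \Bigl\langle \tau, (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle \Bigr\}$$
with
$$\tau = (1 - \nu_1)^{-1} t^{-\nu_1 + 1} \Bigl( p_\Xi - \nu_1 \frac{\Xi}{t} \Bigr).$$
Taking account of
$$\tau - \frac{\Xi}{t^{\nu_1}} = (1 - \nu_1)^{-1} t^{-\nu_1 + 1} \Bigl( p_\Xi - \frac{\Xi}{t} \Bigr),$$
we obtain
$$C_t \ge \sum_{a \in \mathcal{A}} \Bigl( p_{\Xi_a} - \frac{\Xi_a}{t} \Bigr) \tilde q_a\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( p_{\Xi_a} - \frac{\Xi_a}{t} \Bigr) \tag{3.13}$$
by Proposition 3.4. Now we introduce a propagation observable
$$\Phi(t) = h(N_0) f(H) J\Bigl(\frac{\Xi}{t}\Bigr) B_t J\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0).$$
For simplicity of notation, we write $J_t = J(\Xi/t)$. We note that $\Phi(t)$ is uniformly bounded in $t$. We compute the Heisenberg derivative $D_H \Phi(t)$ of $\Phi(t)$:
$$\begin{aligned}
D_H \Phi(t) &= h(N_0) f(H) (D_{H_0} J_t) B_t J_t f(H) h(N_0) + h(N_0) f(H) J_t B_t (D_{H_0} J_t) f(H) h(N_0) \\
&\quad - t^{\nu_1 - 1} h(N_0) f(H) \Bigl\langle \nabla_\Xi V(x), (\nabla_\Xi R)\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigr\rangle J_t^2 f(H) h(N_0) \\
&\quad + h(N_0) f(H) J_t (D_{H_0} B_t) J_t f(H) h(N_0) \\
&=: I_1(t) + I_2(t) + I_3(t) + I_4(t).
\end{aligned} \tag{3.14}$$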
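The third term in (3.14) comes from the commutator of the potential with $B_t$. A minimal sketch of that computation follows. It assumes that $p_\Xi = -i\nabla_\Xi$, that $V$ acts as a multiplication operator, and that components are taken in coordinates orthonormal for $\langle\cdot,\cdot\rangle$; the shorthand $g = (\nabla_\Xi R)(\Xi/t^{\nu_1})$ is introduced here only for illustration.

```latex
% [V, p_{\Xi,j}] = i\,\partial_j V, and V commutes with every function of \Xi.
i\Bigl[V, \tfrac12\bigl(\langle g, p_\Xi\rangle + \langle p_\Xi, g\rangle\bigr)\Bigr]
  = \tfrac{i}{2}\sum_j \bigl( g_j\,[V, p_{\Xi,j}] + [V, p_{\Xi,j}]\, g_j \bigr)
  = -\langle \nabla_\Xi V, g\rangle ,
% The remaining terms of B_t are functions of \Xi and t only, hence commute with V:
i[V, B_t] = -t^{\nu_1-1}\,
  \bigl\langle \nabla_\Xi V(x), (\nabla_\Xi R)(\Xi/t^{\nu_1}) \bigr\rangle .
```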
We take $j \in C_0^\infty(X)$ such that $j \ge 0$, $j = 1$ on $\operatorname{supp} \nabla_\Xi J$ and $\operatorname{supp} j \subset \{\Xi \in X \mid \theta/2 \le |\Xi|_1 \le 2\theta\}$. Since
$$I_1(t) + I_2(t) = t^{-1} h(N_0) f(H) j\Bigl(\frac{\Xi}{t}\Bigr) B(t) j\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0) + O(t^{-2})$$
for some uniformly bounded observable $B(t)$, by using Proposition 3.8, we have for sufficiently large $\theta > 0$
$$\int_1^\infty |(e^{-itH}\psi, (I_1(t) + I_2(t)) e^{-itH}\psi)|\, dt \le C\|\psi\|^2. \tag{3.15}$$
By Proposition 3.4, one can rewrite $I_3(t)$ as
$$I_3(t) = -\sum_{a \in \mathcal{A}} h(N_0) f(H) \Bigl\langle \nabla_{\Xi_a} I_a(x), \frac{\Xi_a}{t} \Bigr\rangle \tilde q_a\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) J_t^2 f(H) h(N_0).$$
In fact, one can identify $\mathbb{R}^{2\times N} \times Z^{a_{\max}}$ with $X \times \mathbb{R}^2_{y_N}$, and then write $V(x)$ as $V(\Xi, y_N)$. Hence we have only to show that for $(j,k) \subset a$, $V_{jk}(x_j - x_k)$ depends on $\Xi^a$ only for any fixed $y_N$. We write $a = \{C_1, \ldots, C_{\#(a)}\}$ with $\{N\} \subset C_{\#(a)}$. We note that $X^a$ can be written as
$$X^a = Y^{C_1} \times \cdots \times Y^{C_{\#(a)-1}} \times \mathbb{R}^{2\times\#(C_{\#(a)})} \times Z^a. \tag{3.16}$$
When $\{j,k\} \subset C_l$ with $l = 1, \ldots, \#(a)-1$, it is well-known that $V_{jk}(x_j - x_k)$ depends on $\Xi^a$ only, as in the case of many body Schrödinger operators without external electromagnetic fields. On the other hand, when $\{j,k\} \subset C_{\#(a)}$, $V_{jk}(x_j - x_k)$ can be written as $V_{jk}(y_j - y_k, z^{(j,k)})$. So we see that this $V_{jk}(x_j - x_k)$ also depends on $\Xi^a$ only for any fixed $y_N$, by taking account of (3.16). Thus we obtain
$$\langle \nabla_\Xi V(x), \Xi_a \rangle = \langle \nabla_{\Xi_a} V(x), \Xi_a \rangle = \langle \nabla_{\Xi_a} I_a(x), \Xi_a \rangle.$$
Next we consider $(j,k) \not\subset a$. Taking account of (3.16), when $j < k < N$, we see that $|x_j - x_k| \ge c$ holds with some $c > 0$ on $\operatorname{supp} \tilde q_a(\Xi)$, because $|\Xi^{(j,k)}|_1 \ge \sqrt{3r^{N-1}/10}$ holds on $\operatorname{supp} \tilde q_a(\Xi)$, by the definition of $\tilde q_a$ and Proposition 3.2. Similarly, when $j < k = N$, we see that $|y_j| + |z^{(j,N)}|_1 \ge c_1$ holds with some $c_1 > 0$ on $\operatorname{supp} \tilde q_a(\Xi)$. Thus $|x_j - x_N| \ge c$ holds with some $c > 0$ on $\operatorname{supp}(\tilde q_a(\Xi) 1_{[0,\varepsilon]}(|y_N|))$ for any $\varepsilon > 0$ such that $\varepsilon < c_1$. By virtue of these facts and Lemma 3.7, we obtain
$$\int_1^\infty |(e^{-itH}\psi, I_3(t) e^{-itH}\psi)|\, dt \le C\|\psi\|^2 \tag{3.17}$$
for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, because $I_3(t) = O(t^{-\nu_1(1+\mu_{S2})}) + O(t^{-\nu_1(1+\mu_L)}) + O(t^{-\nu_1 M})$ is integrable in $t \in [1, \infty)$ for sufficiently large $M \in \mathbb{N}$ such that
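$-\nu_1 M < -1$, by the assumptions on the pair potentials. Here we used the facts that $\Xi_a$ is bounded on $\operatorname{supp} J(\Xi)$, and that $-\nu_1(1 + \mu_{S2}) < -1$ and $-\nu_1(1 + \mu_L) < -1$. Finally, by (3.12), (3.13) and Proposition 3.4, we see
$$\begin{aligned}
I_4(t) &= t^{-1} h(N_0) f(H) J_t C_t J_t f(H) h(N_0) + O(t^{-2\nu_1 - 1}) + O(t^{2\nu_1 - 3}) \\
&\ge t^{-1} \sum_{b \in \mathcal{A}} h(N_0) f(H) J_t \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) \tilde q_b\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) J_t f(H) h(N_0) \\
&\quad + O(t^{-2\nu_1 - 1}) + O(t^{2\nu_1 - 3}).
\end{aligned} \tag{3.18}$$
Here we note that $-2\nu_1 - 1 < -1$ and $2\nu_1 - 3 < -1$ because of $0 < \nu_1 < 1$. Taking account of $t^{-\nu_1} = t^{-\nu} \times t^{\nu - \nu_1}$ with $\nu - \nu_1 > 0$, by virtue of Lemma 3.5 and its proof, we have for $a \in \mathcal{A}$ and sufficiently large $t \ge 1$,
$$\begin{aligned}
& t^{-1} \sum_{b \in \mathcal{A}} h(N_0) f(H) J_t \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) \tilde q_b\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) J_t f(H) h(N_0) \\
&\ge t^{-1} \sum_{b \in \mathcal{A}} h(N_0) f(H) J_t \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) q_a^2\Bigl(\frac{\Xi}{t^\nu}\Bigr) \tilde q_b\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) J_t f(H) h(N_0) \\
&= t^{-1} \sum_{b \in \mathcal{A}_a} h(N_0) f(H) J_t \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) q_a^2\Bigl(\frac{\Xi}{t^\nu}\Bigr) \tilde q_b\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr) J_t f(H) h(N_0) \\
&\ge t^{-1} \sum_{b \in \mathcal{A}_a} h(N_0) f(H) J_t \Bigl( \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr) q_a^2\Bigl(\frac{\Xi}{t^\nu}\Bigr) \tilde q_b\Bigl(\frac{\Xi}{t^{\nu_1}}\Bigr) \Bigl( \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr) J_t f(H) h(N_0) \\
&= t^{-1} h(N_0) f(H) J_t \Bigl( \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr) q_a^2\Bigl(\frac{\Xi}{t^\nu}\Bigr) \Bigl( \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr) J_t f(H) h(N_0) \\
&= t^{-1} h(N_0) f(H) J_t\, q_a\Bigl(\frac{\Xi}{t^\nu}\Bigr) \Bigl| \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr|_1^2 q_a\Bigl(\frac{\Xi}{t^\nu}\Bigr) J_t f(H) h(N_0) + O(t^{-1-2\nu}),
\end{aligned}$$
where we used the fact that $\pi_a \le \pi_b$ for $b \subset a$. Thus, combining (3.14), (3.15), (3.17) and (3.18) with this, we obtain the theorem.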
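The last step of the proof, passing from the lower bound on $D_H\Phi(t)$ to the integral estimate of Theorem 3.6, is the standard propagation-observable argument. The following is a sketch under stated assumptions. The symbols $\psi_t = e^{-itH}\psi$, $A_t = |\Xi_a/t - p_{\Xi_a}|_1\, q_a(\Xi/t^\nu) J_t f(H) h(N_0)$, $R(t)$ and $C_1$ are introduced here for illustration, with $R(t)$ collecting all the error terms whose expectations in $\psi_t$ were shown to be integrable.

```latex
\frac{d}{dt}(\psi_t, \Phi(t)\psi_t) = (\psi_t, D_H\Phi(t)\,\psi_t)
  \;\ge\; t^{-1}\|A_t\psi_t\|^2 - |(\psi_t, R(t)\psi_t)| ,
\qquad
\int_1^\infty |(\psi_t, R(t)\psi_t)|\,dt \le C_1\|\psi\|^2 .
% Integrate from 1 to T and use \sup_{t\ge1}\|\Phi(t)\| < \infty:
\int_1^T \|A_t\psi_t\|^2\,\frac{dt}{t}
  \;\le\; (\psi_T,\Phi(T)\psi_T) - (\psi_1,\Phi(1)\psi_1) + C_1\|\psi\|^2
  \;\le\; \Bigl(2\sup_{t\ge1}\|\Phi(t)\| + C_1\Bigr)\|\psi\|^2 .
% The right-hand side is independent of T, so T \to \infty gives the estimate.
```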
When we take $\nu = 1$ in Theorem 3.6, one can obtain an improvement of Theorem 3.6 as follows:

Theorem 3.9. Let $a \in \mathcal{A}$, let $J \in C_0^\infty(X)$ be a cut-off function such that $J = 1$ on $\{\Xi \in X \mid |\Xi|_1 \le \theta\}$ and $J \ge 0$, and let $f, h \in C_0^\infty(\mathbb{R})$ be real-valued. Then, for sufficiently large $\theta > 0$, there exists a constant $C > 0$ such that for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$
$$\int_1^\infty \Bigl\| \Bigl| \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr|_1^{1/2} q_a\Bigl(\frac{\Xi}{t}\Bigr) J\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0) e^{-itH}\psi \Bigr\|^2 \frac{dt}{t} \le C\|\psi\|^2$$
holds.

Proof. Since the proof is quite similar to the one in [11], we sketch it. Put
$$\Lambda_a(t) = \Bigl| \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr|_1^2 + t^{-2\gamma}$$
with $0 < \gamma < 1$. We compute the Heisenberg derivative of
$$h(N_0) f(H) J\Bigl(\frac{\Xi}{t}\Bigr) q_a\Bigl(\frac{\Xi}{t}\Bigr) \Lambda_a^{1/2}(t)\, q_a\Bigl(\frac{\Xi}{t}\Bigr) J\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0),$$
which is bounded uniformly in $t$. For simplicity of notation, we write $J_t = J(\Xi/t)$ and $q_{a,t} = q_a(\Xi/t)$. One can first prove that the terms arising from $D_H J_t$ are integrable in $t \in [1, \infty)$. In fact, taking $j$ as in the proof of Theorem 3.6, as in [11], we have
$$h(N_0) f(H) J_t q_{a,t} \Lambda_a^{1/2}(t) q_{a,t} (D_H J_t) f(H) h(N_0) = t^{-1} h(N_0) f(H) j\Bigl(\frac{\Xi}{t}\Bigr) B(t) j\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0) + O(t^{-2+\gamma}).$$
Thus, by Proposition 3.8, we obtain
$$\int_1^\infty |(e^{-itH}\psi, h(N_0) f(H) J_t q_{a,t} \Lambda_a^{1/2}(t) q_{a,t} (D_H J_t) f(H) h(N_0) e^{-itH}\psi)|\, dt \le C\|\psi\|^2. \tag{3.19}$$
Next we consider the terms arising from
$$D_H q_{a,t} = -t^{-1} \Bigl\langle (\nabla_\Xi q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi}{t} - p_\Xi \Bigr\rangle - \frac{i}{2}\, t^{-2} (\Delta_\Xi q_a)\Bigl(\frac{\Xi}{t}\Bigr).$$
Now we introduce another partition of unity $\{\hat q_a(\Xi) \mid a \in \mathcal{A}\}$, which is defined in the same way as $\{q_a(\Xi) \mid a \in \mathcal{A}\}$ with $r$ replaced by a sufficiently small $r_1 > 0$. Then, by the proof of Lemma 3.5, we see that if $r_1 < 3r/10$,
$$\nabla_\Xi q_a(\Xi) = \nabla_\Xi q_a(\Xi) \sum_{b \in \mathcal{A}_a} \hat q_b^2(\Xi).$$
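The formula for $D_H q_{a,t}$ used above can be checked by a direct computation. The sketch below assumes that the only part of $H$ that fails to commute with functions of $\Xi$ is $\frac12\langle p_\Xi, p_\Xi\rangle$ with $p_\Xi = -i\nabla_\Xi$, written in coordinates orthonormal for $\langle\cdot,\cdot\rangle$; this normalization is an assumption made for illustration.

```latex
\partial_t\, q_a(\Xi/t)
  = -t^{-1}\bigl\langle (\nabla_\Xi q_a)(\Xi/t), \Xi/t \bigr\rangle ,
\\
i\bigl[\tfrac12\langle p_\Xi,p_\Xi\rangle, q_a(\Xi/t)\bigr]
  = \tfrac{1}{2t}\Bigl(\bigl\langle p_\Xi,(\nabla_\Xi q_a)(\Xi/t)\bigr\rangle
      + \bigl\langle(\nabla_\Xi q_a)(\Xi/t),p_\Xi\bigr\rangle\Bigr)
  = t^{-1}\bigl\langle(\nabla_\Xi q_a)(\Xi/t),p_\Xi\bigr\rangle
      - \tfrac{i}{2}\,t^{-2}(\Delta_\Xi q_a)(\Xi/t) ,
\\
% Adding the two lines:
D_H q_{a,t}
  = -t^{-1}\bigl\langle(\nabla_\Xi q_a)(\Xi/t), \Xi/t - p_\Xi\bigr\rangle
      - \tfrac{i}{2}\,t^{-2}(\Delta_\Xi q_a)(\Xi/t) .
```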
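Moreover we claim the following lemma. The reason why we follow not the argument of [5] and [6] but the idea of Graf [11] in the definition of $\Omega_a(\rho)$, $a \in \mathcal{A}$, is that the definition (3.2) makes the proof of this lemma much simpler. We think that this lemma presents one of the important properties of $\{q_a(\Xi) \mid a \in \mathcal{A}\}$ and $\{\hat q_a(\Xi) \mid a \in \mathcal{A}\}$.

Lemma 3.10. If one takes $r_1 > 0$ in the definition of $\{\hat q_a(\Xi) \mid a \in \mathcal{A}\}$ as $r_1 < 3r/10$, one has for $b \subset a$
$$\hat q_b(\Xi) \nabla_\Xi q_a(\Xi) = \hat q_b(\Xi) \nabla_{\Xi_b} q_a(\Xi).$$

Proof. The proof is quite similar to the one in [11]. First we have to prove that $q_{a,\rho}(\Xi)$ depends on $\Xi_b$ only on $\operatorname{supp} \hat q_b(\Xi)$ with $b \subset a$. In order to achieve it, we have only to show that on $\operatorname{supp} \hat q_b(\Xi)$ with $b \subset a$,
$$q_{a,\rho}(\Xi) = 1_{\Omega_{a,b}(\rho)}(\Xi) \tag{3.20}$$
with
$$\Omega_{a,b}(\rho) = \{\Xi \in X \mid |\Xi_a|_1^2 + \rho_a > |\Xi_{b_1}|_1^2 + \rho_{b_1} \text{ for any } b_1 \in \mathcal{A} \text{ such that } b \subset b_1 \subsetneq a \text{ or } a \subsetneq b_1\}.$$
One can assume that $b \ne a_{\min}$ without loss of generality, because $\Omega_{a,a_{\min}}(\rho) = \Omega_a(\rho)$. Put $\omega_{a,b}(\rho) = \{\Xi \in X \mid |\Xi_a|_1^2 + \rho_a > |\Xi_b|_1^2 + \rho_b\}$ for $a, b \in \mathcal{A}$. Then we see that
$$q_{a,\rho}(\Xi) = \prod_{b_1 \in \bar{\mathcal{A}}_a} 1_{\omega_{a,b_1}(\rho)}(\Xi), \qquad \bar{\mathcal{A}}_a = \{b_1 \in \mathcal{A} \mid b_1 \subsetneq a \text{ or } a \subsetneq b_1\}, \tag{3.21}$$
$$1_{\Omega_{a,b}(\rho)}(\Xi) = \prod_{b_1 \in \mathcal{A}_{a,b}} 1_{\omega_{a,b_1}(\rho)}(\Xi), \qquad \mathcal{A}_{a,b} = \{b_1 \in \mathcal{A} \mid b \subset b_1 \subsetneq a \text{ or } a \subsetneq b_1\}. \tag{3.22}$$
As for $b_1 \subsetneq a$ with $b \not\subset b_1$ and $b_1 \not\subset b$, we have either i) $b \cup b_1 = a$ or ii) $b \cup b_1 \subsetneq a$. On $\operatorname{supp} \hat q_b(\Xi)$, we have in the case i)
$$1_{\omega_{a,a_{\min}}(\rho)}(\Xi)\, 1_{\omega_{a,b_1}(\rho)}(\Xi) = 1_{\omega_{a,a_{\min}}(\rho)}(\Xi), \tag{3.23}$$
while in the case ii)
$$1_{\omega_{a,a_{\min}}(\rho)}(\Xi)\, 1_{\omega_{a,b \cup b_1}(\rho)}(\Xi)\, 1_{\omega_{a,b_1}(\rho)}(\Xi) = 1_{\omega_{a,a_{\min}}(\rho)}(\Xi)\, 1_{\omega_{a,b \cup b_1}(\rho)}(\Xi). \tag{3.24}$$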
Assume that there exists a $\Xi \in \operatorname{supp} \hat q_b(\Xi)$ in the support of the right-hand side of one of the above two equalities (3.23) and (3.24), such that $|\Xi_a|_1^2 + \rho_a \le |\Xi_{b_1}|_1^2 + \rho_{b_1}$. In the case i), this is written as $|\Xi_{b \cup b_1}|_1^2 + \rho_{b \cup b_1} \le |\Xi_{b_1}|_1^2 + \rho_{b_1}$. In the case ii), we have
$$|\Xi_{b_1}|_1^2 - |\Xi_{b \cup b_1}|_1^2 = (|\Xi_{b_1}|_1^2 - |\Xi_a|_1^2) - (|\Xi_{b \cup b_1}|_1^2 - |\Xi_a|_1^2) > (\rho_a - \rho_{b_1}) - (\rho_a - \rho_{b \cup b_1}) = \rho_{b \cup b_1} - \rho_{b_1}.$$
Thus both cases imply
$$|\Xi^{b \cup b_1}|_1^2 \ge |\Xi^{b \cup b_1}|_1^2 - |\Xi^{b_1}|_1^2 = |\Xi_{b_1}|_1^2 - |\Xi_{b \cup b_1}|_1^2 \ge \rho_{b \cup b_1} - \rho_{b_1} \ge r^{\#(b \cup b_1)}/2 - r^{\#(b_1)}.$$
On the other hand, we have
$$|\Xi^{b_1}|_1^2 = (|\Xi_a|_1^2 - |\Xi_{b_1}|_1^2) + (|\Xi_{a_{\min}}|_1^2 - |\Xi_a|_1^2) < (\rho_{b_1} - \rho_a) + (\rho_a - \rho_{a_{\min}}) = \rho_{b_1} \le r^{\#(b_1)}.$$
Hence, by Lemma 3.1, we have $|\Xi^b|_1^2 \ge r^{\#(b)}$ with $b \ne a_{\min}$. This contradicts $\Xi \in \operatorname{supp} \hat q_b(\Xi)$, because, for $\Xi \in \operatorname{supp} \hat q_b(\Xi)$, $|\Xi^b|_1^2 \le r_1^{\#(b)}$ holds with $r_1 < 3r/10$ and $b \ne a_{\min}$. Therefore we have (3.23) and (3.24). As for $b_1 \subsetneq a$ with $b_1 \subsetneq b$, we have on $\operatorname{supp} \hat q_b(\Xi)$
$$1_{\omega_{a,b}(\rho)}(\Xi)\, 1_{\omega_{a,b_1}(\rho)}(\Xi) = 1_{\omega_{a,b}(\rho)}(\Xi). \tag{3.25}$$
In fact, for $\Xi \in \operatorname{supp} \hat q_b(\Xi) \cap \omega_{a,b}(\rho)$, we have
$$|\Xi_{b_1}|_1^2 - |\Xi_a|_1^2 = (|\Xi_b|_1^2 - |\Xi_a|_1^2) + (|\Xi_{b_1}|_1^2 - |\Xi_b|_1^2) < (\rho_a - \rho_b) + (r_1^{\#(b)} - r_1^{\#(b_1)}/2).$$
If we take $r_1 > 0$ as $r_1 < 3r/10$, we have $r_1^{\#(b)} - r_1^{\#(b_1)}/2 < r^{\#(b)}/2 - r^{\#(b_1)}$ for any $b_1 \subsetneq b$. In fact, we have $r_1^{\#(b)}(1 - r_1^{\#(b_1) - \#(b)}/2) \le r_1^{\#(b)}$ and $r^{\#(b)}(1/2 - r^{\#(b_1) - \#(b)}) \ge 3r^{\#(b)}/10$ because $\#(b_1) - \#(b) \ge 1$ and $r \le 1/5$. Thus it is enough to show $r_1^{\#(b)} < 3r^{\#(b)}/10$ for $\#(b) = 1, \ldots, N-1$, which holds if $r_1 < 3r/10$. Therefore we have
$$|\Xi_{b_1}|_1^2 - |\Xi_a|_1^2 < (\rho_a - \rho_b) + (r^{\#(b)}/2 - r^{\#(b_1)}) \le (\rho_a - \rho_b) + (\rho_b - \rho_{b_1}) = \rho_a - \rho_{b_1}$$
by (3.4), which implies (3.25). By using (3.23), (3.24) and (3.25), we obtain (3.20) on $\operatorname{supp} \hat q_b(\Xi)$ with $b \subset a$, taking account of (3.21) and (3.22). Thus we see that $\tilde q_a(\Xi)$ depends on $\Xi_b$ only on $\operatorname{supp} \hat q_b(\Xi)$ with $b \subset a$. We prove that $q_a(\Xi)$ also depends on $\Xi_b$ only on $\operatorname{supp} \hat q_b(\Xi)$ with $b \subset a$. As in the proof of Lemma 3.5, for $\Xi \in \operatorname{supp} \tilde q_c(\Xi)$ and $b \not\subset c$, $|\Xi^b|_1^2 \ge 3r^{\#(b)}/10$ holds. On the other hand, for $\Xi \in \operatorname{supp} \hat q_b(\Xi)$, $|\Xi^b|_1^2 \le r_1^{\#(b)}$ holds. Since $r_1 < 3r/10$, we see that $\operatorname{supp} \tilde q_c(\Xi) \cap \operatorname{supp} \hat q_b(\Xi) = \emptyset$ for $b \not\subset c$. Thus we have on $\operatorname{supp} \hat q_b(\Xi)$
$$q_a(\Xi) = \frac{\tilde q_a(\Xi)}{\sqrt{\sum_{c \in \tilde{\mathcal{A}}_b} \tilde q_c^2(\Xi)}}, \qquad \tilde{\mathcal{A}}_b = \{c \in \mathcal{A} \mid b \subset c\},$$
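which implies the lemma because $\tilde q_c(\Xi)$, $c \in \tilde{\mathcal{A}}_b$, depends on $\Xi_b$ only on $\operatorname{supp} \hat q_b(\Xi)$.

We continue the proof of Theorem 3.9. Since, by virtue of Lemma 3.10,
$$\begin{aligned}
(D_H q_{a,t}) J_t f(H) &= -t^{-1} \sum_{b \in \mathcal{A}_a} \hat q_b^2\Bigl(\frac{\Xi}{t}\Bigr) \Bigl\langle (\nabla_{\Xi_b} q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr\rangle J_t f(H) + O(t^{-2}) \\
&= -t^{-1} \sum_{b \in \mathcal{A}_a} \hat q_b\Bigl(\frac{\Xi}{t}\Bigr) \Bigl\langle (\nabla_{\Xi_b} q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr\rangle \hat q_b\Bigl(\frac{\Xi}{t}\Bigr) J_t f(H) + O(t^{-2}),
\end{aligned}$$
we obtain
$$\begin{aligned}
& |(e^{-itH}\psi, h(N_0) f(H) J_t q_{a,t} \Lambda_a^{1/2}(t) (D_H q_{a,t}) J_t f(H) h(N_0) e^{-itH}\psi)| \\
&\le C t^{-1} \sum_{b \in \mathcal{A}_a} \|\Lambda_a^{1/2}(t) q_{a,t} J_t f(H) h(N_0) e^{-itH}\psi\| \times \Bigl\| \Bigl| \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr|_1 \hat q_b\Bigl(\frac{\Xi}{t}\Bigr) J_t f(H) h(N_0) e^{-itH}\psi \Bigr\| + O(t^{-2}) \|\psi\|^2.
\end{aligned}$$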
Combining this with Theorem 3.6, we obtain
$$\int_1^\infty |(e^{-itH}\psi, h(N_0) f(H) J_t q_{a,t} \Lambda_a^{1/2}(t) (D_H q_{a,t}) J_t f(H) h(N_0) e^{-itH}\psi)|\, dt \le C\|\psi\|^2.$$
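Now we consider the term arising from
$$D_H \Lambda_a^{1/2}(t) = D_{H_a} \Lambda_a^{1/2}(t) + i[I_a(x), \Lambda_a^{1/2}(t)].$$
Taking account of
$$e^{itH_a} (D_{H_a} \Lambda_a^{1/2}(t)) e^{-itH_a} = \frac{d}{dt}\bigl( e^{itH_a} \Lambda_a^{1/2}(t) e^{-itH_a} \bigr) = \frac{d}{dt} \Bigl( \frac{|\Xi_a|_1^2}{t^2} + t^{-2\gamma} \Bigr)^{1/2}, \tag{3.26}$$
we have
$$\begin{aligned}
D_{H_a} \Lambda_a^{1/2}(t) &= -t^{-1} \Lambda_a^{-1/2}(t) \Bigl( \Bigl| \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr|_1^2 + \gamma t^{-2\gamma} \Bigr) \\
&= -t^{-1} \Lambda_a^{1/2}(t) + (1 - \gamma) t^{-1-2\gamma} \Lambda_a^{-1/2}(t).
\end{aligned}$$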
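The two expressions for $D_{H_a}\Lambda_a^{1/2}(t)$ follow from differentiating the scalar function in (3.26) and substituting back. A sketch of the arithmetic, where $s$ abbreviates $|\Xi_a|_1^2$ (a shorthand introduced here, not used in the original):

```latex
\frac{d}{dt}\Bigl(\frac{s}{t^2}+t^{-2\gamma}\Bigr)^{1/2}
  = \tfrac12\Bigl(\frac{s}{t^2}+t^{-2\gamma}\Bigr)^{-1/2}
    \Bigl(-\frac{2s}{t^3}-2\gamma t^{-2\gamma-1}\Bigr)
  = -t^{-1}\Bigl(\frac{s}{t^2}+t^{-2\gamma}\Bigr)^{-1/2}
    \Bigl(\frac{s}{t^2}+\gamma t^{-2\gamma}\Bigr),
\\
% After conjugating back, s/t^2 becomes
% |\Xi_a/t - p_{\Xi_a}|_1^2 = \Lambda_a(t) - t^{-2\gamma}, so that
-t^{-1}\Lambda_a^{-1/2}(t)\bigl(\Lambda_a(t)-(1-\gamma)t^{-2\gamma}\bigr)
  = -t^{-1}\Lambda_a^{1/2}(t) + (1-\gamma)\,t^{-1-2\gamma}\Lambda_a^{-1/2}(t).
```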
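Because of $\Lambda_a^{-1/2}(t) = O(t^\gamma)$, we see that the term arising from $(1 - \gamma) t^{-1-2\gamma} \Lambda_a^{-1/2}(t)$ is integrable in $t \in [1, \infty)$. On the other hand, we claim
$$h(N_0) f(H) J_t q_{a,t} [I_a(x), \Lambda_a^{1/2}(t)] q_{a,t} J_t f(H) h(N_0) = O(t^{\max\{-1-\mu_0, -2\} + 2\gamma}) \tag{3.27}$$
with $\mu_0 = \min\{\mu_{S1}, \mu_{S2}, \mu_L\}$. In order to carry it out, we have only to show that
$$h(N_0) (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} [I_a(x), \Lambda_a^{1/2}(t)] q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} h(N_0) = O(t^{\max\{-1-\mu_0, -2\} + 2\gamma}). \tag{3.28}$$
Since
$$\begin{aligned}
& [(N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} I_a(x) q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1}, \Lambda_a^{1/2}(t)] \\
&= (N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} [I_a(x), \Lambda_a^{1/2}(t)] q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1} \\
&\quad + [(N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t}, \Lambda_a^{1/2}(t)]\, I_a(x) q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1} \\
&\quad + (N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} I_a(x) [q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1}, \Lambda_a^{1/2}(t)],
\end{aligned} \tag{3.29}$$
we have only to prove that the left-hand side and the last two terms on the right-hand side of (3.29) are $O(t^{\max\{-1-\mu_0, -2\} + 2\gamma})$. By a straightforward computation, we have $[(N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t}, \Lambda_a(t)] = O(t^{-1})$. Moreover, taking account of
$$i[I_a(x), \Lambda_a(t)] = \Bigl\langle \frac{\Xi_a}{t} - p_{\Xi_a}, \nabla_{\Xi_a} I_a(x) \Bigr\rangle + \Bigl\langle \nabla_{\Xi_a} I_a(x), \frac{\Xi_a}{t} - p_{\Xi_a} \Bigr\rangle,$$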
we obtain
$$(N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} [I_a(x), \Lambda_a(t)] q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1} = O(t^{\max\{-1-\mu_{S2}, -1-\mu_L, -2\}}),$$
as in the proof of Theorem 3.6. Here we used
$$\Bigl\| 1_{[\varepsilon,\infty)}\Bigl(\frac{|y_N|}{t}\Bigr) (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1} \Bigr\| = O(t^{-2}), \qquad t \to \infty,$$
instead of Lemma 3.7, which can be proved as Lemma 3.7 because $|y_N|^2 (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1}$ is bounded on $L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$. Thus we have
$$[(N_0 + 1)^{-1} (H_{a_{\min}} + 1)^{-1} J_t q_{a,t} I_a(x) q_{a,t} J_t (H_{a_{\min}} + 1)^{-1} (N_0 + 1)^{-1}, \Lambda_a(t)] = O(t^{\max\{-1-\mu_0, -2\}})$$
by (3.29) with $\Lambda_a^{1/2}(t)$ replaced by $\Lambda_a(t)$. As in [11], one can prove that if $[\Lambda_a(t), B(t)]$ is bounded for fixed $t$, then
$$\|[\Lambda_a(t)^{1/2}, B(t)]\| \le C t^{2\gamma} \|[\Lambda_a(t), B(t)]\|.$$
Using this, we have (3.28), which implies (3.27). Taking $\gamma$ as $2\gamma < \min\{\mu_0, 1\}$, we see that the term arising from $i[I_a(x), \Lambda_a^{1/2}(t)]$ is integrable in $t \in [1, \infty)$. By using these facts and taking account of (3.26), we obtain
$$\int_1^\infty (e^{-itH}\psi, h(N_0) f(H) J_t q_{a,t} \Lambda_a^{1/2}(t) q_{a,t} J_t f(H) h(N_0) e^{-itH}\psi)\, \frac{dt}{t} \le C\|\psi\|^2.$$
This implies the theorem.

Next we prove the following minimal velocity estimate, which can be shown by virtue of the Mourre estimate in Corollary 2.2.

Theorem 3.11. Let $\lambda$, $\delta$, $c$ and $f$ be as in Corollary 2.2. Then for any real-valued $h \in C_0^\infty(\mathbb{R})$, there exists $\varepsilon_0 > 0$ such that
$$\int_1^\infty \Bigl\| 1_{[0,\varepsilon_0]}\Bigl(\frac{|\Xi|_1}{t}\Bigr) f(H) h(N_0) e^{-itH}\psi \Bigr\|^2 \frac{dt}{t} \le C\|\psi\|^2$$
for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, with $C > 0$ independent of $\psi$.

This theorem can be obtained by Proposition 3.12 and Lemma 3.13 below. First we have the following propagation estimate associated with the observable $A$, which is defined by (2.9), by following a standard argument in the $N$-body scattering theory originating with the works of Sigal–Soffer (see e.g. [19, 20]).

Proposition 3.12. Let $\lambda$, $\delta$, $c$ and $f$ be as in Corollary 2.2. Let $c_0$ be such that $0 < c_0 < c$. Then there exists $C > 0$ such that
$$\int_1^\infty \Bigl\| 1_{[-c_0,c_0]}\Bigl(\frac{A}{t}\Bigr) f(H) e^{-itH}\psi \Bigr\|^2 \frac{dt}{t} \le C\|\psi\|^2$$
for any $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$.
Proof. The proof is done in exactly the same way as in [20]. We sketch it (see also e.g. [10] and [1]). Take $g \in C_0^\infty(\mathbb{R})$ such that $g$ is real-valued, $g = 1$ on $[-c_0, c_0]$ and $\operatorname{supp} g \subset (-\infty, c)$. Let $G$ be defined by
$$G(s) = \int_{-\infty}^s g(u)^2\, du,$$
so that $G'(s) = g(s)^2 \in C_0^\infty(\mathbb{R})$ with $g$ being real-valued. We use
$$\Phi_1(t) = G\Bigl(\frac{A}{t}\Bigr)$$
as a propagation observable. We note that $\Phi_1(t)$ is uniformly bounded in $t \ge 1$. If we take $f_1 \in C_0^\infty(\mathbb{R})$ such that $f_1 = 1$ on the support of $f$, then
$$f(H)\, i[H, \Phi_1(t)]\, f(H) = f(H)\, i[f_1(H) H, \Phi_1(t)]\, f(H).$$
By using the almost analytic extension method due to Helffer–Sjöstrand [12] in order to calculate several commutators, we have
$$f(H)\, i[H, \Phi_1(t)]\, f(H) = g\Bigl(\frac{A}{t}\Bigr) f(H)\, i\Bigl[H, \frac{A}{t}\Bigr] f(H)\, g\Bigl(\frac{A}{t}\Bigr) + O(t^{-2}).$$
Here we note that $[(H + i)^{-1}, A]$ and $[[(H + i)^{-1}, A], A]$ are bounded on $L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$ by the assumptions on the pair potentials. Hence it follows from the Mourre estimate (2.35) that
$$\begin{aligned}
f(H)\, i[H, \Phi_1(t)]\, f(H) &\ge \frac{c}{t}\, g\Bigl(\frac{A}{t}\Bigr) f(H)^2 g\Bigl(\frac{A}{t}\Bigr) + O(t^{-2}) \\
&\ge \frac{c}{t}\, f(H)\, g\Bigl(\frac{A}{t}\Bigr)^2 f(H) + O(t^{-2}).
\end{aligned}$$
On the other hand, we have
$$f(H)\, \partial_t \Phi_1(t)\, f(H) \ge -\frac{c_1}{t}\, f(H)\, g\Bigl(\frac{A}{t}\Bigr)^2 f(H)$$
for some $c_1$ such that $c_0 < c_1 < c$, which depends on the support of $g$. Thus we obtain
$$f(H) D_H(\Phi_1(t)) f(H) \ge \frac{c - c_1}{t}\, f(H)\, g\Bigl(\frac{A}{t}\Bigr)^2 f(H) + O(t^{-2}).$$
This proves the proposition.

Lemma 3.13. Let $\lambda$, $\delta$, $c$ and $f$ be as in Corollary 2.2. Let $c_0$ be such that $0 < c_0 < c$. Then for any real-valued $h \in C_0^\infty(\mathbb{R})$, there exists $\varepsilon_0 > 0$ such that for any real-valued $F \in C_0^\infty(X)$ supported in $\{\Xi \in X \mid |\Xi|_1 \le 2\varepsilon_0\}$ such that $F = 1$ on $\{\Xi \in X \mid |\Xi|_1 \le \varepsilon_0\}$, one has
$$\Bigl\| 1_{[c_0,\infty)}\Bigl(\frac{|A|}{t}\Bigr) F\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0) \Bigr\| = O(t^{-1})$$
as $t \to \infty$.
Proof. We follow the argument in [1]. First we introduce the operator
$$\tilde A = \frac{1}{2} \Bigl\{ \bigl( \langle z^{a_{\max}}, D_{z^{a_{\max}}} \rangle + \langle D_{z^{a_{\max}}}, z^{a_{\max}} \rangle \bigr) + \sum_{j=1}^{N-1} (y_j \cdot D_{y_j} + D_{y_j} \cdot y_j) \Bigr\}. \tag{3.30}$$
Then we note that the following equality holds, which can be checked by a straightforward computation:
$$\tilde A - A = -\frac{1}{2} \sum_{j=1}^{N-1} (\kappa_0 \cdot D_{y_j} + D_{y_j} \cdot \kappa_0), \tag{3.31}$$
where $\kappa_0$ is as in (2.10). Then it is obvious that $(\tilde A - A) f(H) h(N_0)$ is bounded on $L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$. We take $F_1 \in C^\infty(\mathbb{R})$ such that $F_1 \ge 0$, $F_1 = 1$ on $[c_0, \infty)$ and $\operatorname{supp} F_1 \subset [c_0/2, \infty)$, and put $F_A(t) = F_1(|A|/t)$. We also put $F_\Xi(t) = F(\Xi/t)$, where $F$ is as in the statement of the lemma. For $u_1 \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, we put $u = F_A(t) F_\Xi(t) f(H) h(N_0) u_1$. Since $(\tilde A - A) f(H) h(N_0)$ is bounded on $L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$, by controlling some commutators, we have
$$\| |A| u \| \le C_1 (\| |\tilde A| u \| + \|u_1\|) \tag{3.32}$$
with some $C_1 > 0$ independent of $t \ge 1$. Now we introduce $G \in C_0^\infty(X)$ such that $G = 1$ on $\operatorname{supp} F$ and $\operatorname{supp} G \subset \{\Xi \in X \mid |\Xi|_1 \le 3\varepsilon_0\}$ and put $G_\Xi(t) = G(\Xi/t)$. Taking account of the definition (3.30) of the operator $\tilde A$, controlling some commutators, and using $G_\Xi(t)$, we obtain
$$\| |A| u \| \ge \frac{c_0}{2}\, t\, \|F_A(t) F_\Xi(t) f(H) h(N_0) u_1\|, \qquad \| |\tilde A| u \| \le C_2 \varepsilon_0 t\, \|F_A(t) F_\Xi(t) f(H) h(N_0) u_1\| + C_3 \|u_1\|$$
with some $C_j > 0$ ($j = 2, 3$) independent of $t \ge 1$. Hence, combining these inequalities with (3.32), if we take $\varepsilon_0 > 0$ so small that $C_1 C_2 \varepsilon_0 < c_0/2$, we obtain the lemma.

4. Proof of Theorem 1.1

Throughout this section, we assume the conditions (V.1), (V.2), (V.3) and (SR). First we prove the existence of the Deift–Simon wave operators
$$\check W_a^+ = \operatorname*{s-lim}_{t \to \infty} e^{itH_a} \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH}, \qquad a \in \mathcal{A}. \tag{4.1}$$
We note that $N_0$ commutes with $H$. By a density argument, for $\psi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$ such that
$$\psi = f(H)\psi, \qquad \psi = h(N_0)\psi$$
with $f, h \in C_0^\infty(\mathbb{R})$, we have only to prove the existence of
$$\check W_a^+ \psi = \lim_{t \to \infty} e^{itH_a} \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH}\psi, \qquad a \in \mathcal{A}. \tag{4.2}$$
In order to carry it out, by taking $f_1, h_1 \in C_0^\infty(\mathbb{R})$ such that $f_1 f = f$ and $h_1 h = h$, we have only to show the existence of
$$\lim_{t \to \infty} e^{itH_a} h_1(N_0) f_1(H_a) \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH}\psi, \qquad a \in \mathcal{A}. \tag{4.3}$$
Here we note that
$$h_1(N_0) f_1(H_a) \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) - \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) f_1(H) h_1(N_0) = O(t^{\max\{-1, -\mu_{S1}\}}) = O(t^{-1}),$$
which can be proved as in the proof of Theorem 3.6 by virtue of Lemma 3.7. As is well-known, Proposition 3.8 implies
$$\operatorname*{s-lim}_{t \to \infty} \Bigl( 1 - J\Bigl(\frac{\Xi}{t}\Bigr)^2 \Bigr) e^{-itH} f(H) = 0, \tag{4.4}$$
where $J \in C_0^\infty(X)$ is a cut-off function such that $J = 1$ on $\{\Xi \in X \mid |\Xi|_1 \le \theta\}$ and $J \ge 0$ with sufficiently large $\theta > 0$ (see e.g. [1]). By virtue of (4.4), we have only to show the existence of
$$\lim_{t \to \infty} e^{itH_a} h_1(N_0) f_1(H_a) J\Bigl(\frac{\Xi}{t}\Bigr) \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) J\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH}\psi, \qquad a \in \mathcal{A}. \tag{4.5}$$
Since the proof is quite similar to the one in [11], we sketch it. Putting
$$\check W_a(t) = e^{itH_a} h_1(N_0) f_1(H_a) J\Bigl(\frac{\Xi}{t}\Bigr) \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) J\Bigl(\frac{\Xi}{t}\Bigr) f(H) h(N_0) e^{-itH},$$
we compute
$$\frac{d\check W_a}{dt}(t) = e^{itH_a} h_1(N_0) f_1(H_a) D_H(J_t \tilde q_{a,t} J_t) f(H) h(N_0) e^{-itH} - i e^{itH_a} h_1(N_0) f_1(H_a) J_t \tilde q_{a,t} J_t I_a(x) f(H) h(N_0) e^{-itH}$$
with $J_t = J(\Xi/t)$ and $\tilde q_{a,t} = \tilde q_a(\Xi/t)$, where the last term on the right-hand side is integrable in $t$ in the norm sense because of $\mu_{S1} > 1$, by virtue of Lemma 3.7. We used the fact that $N_0$ commutes with $H_a$, $a \in \mathcal{A}$. We have to compute
$$D_H(J_t \tilde q_{a,t} J_t) = J_t (D_H \tilde q_{a,t}) J_t + (D_H J_t) \tilde q_{a,t} J_t + J_t \tilde q_{a,t} (D_H J_t).$$
As for the terms arising from $D_H J_t$, by taking $j \in C_0^\infty(X)$ such that $j \ge 0$, $j = 1$ on $\operatorname{supp} \nabla_\Xi J$ and $\operatorname{supp} j \subset \{\Xi \in X \mid \theta/2 \le |\Xi|_1 \le 2\theta\}$, we have
$$|(\Phi, h_1(N_0) f_1(H_a) J_t \tilde q_{a,t} (D_H J_t) f(H) h(N_0) \psi)| \le C t^{-1} \|j_t f_1(H_a) h_1(N_0) \Phi\| \|j_t (H + i) f(H) h(N_0) \psi\| + O(t^{-2}) \|\Phi\| \|\psi\|$$
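with $j_t = j(\Xi/t)$ for any $\Phi \in L^2(\mathbb{R}^{2\times N} \times Z^{a_{\max}})$. Thus we have as $t_1, t_2 \to \infty$
$$\begin{aligned}
& \int_{t_1}^{t_2} |(\Phi, e^{itH_a} h_1(N_0) f_1(H_a) J_t \tilde q_{a,t} (D_H J_t) f(H) h(N_0) e^{-itH}\psi)|\, dt \\
&\le C \Bigl( \int_1^\infty \|j_t f_1(H_a) h_1(N_0) e^{-itH_a}\Phi\|^2 \frac{dt}{t} \Bigr)^{1/2} \times \Bigl( \int_{t_1}^{t_2} \|j_t (H + i) f(H) h(N_0) e^{-itH}\psi\|^2 \frac{dt}{t} \Bigr)^{1/2} + o(1) \|\Phi\| \|\psi\| \\
&\le o(1) \|\Phi\| + o(1) \|\Phi\| \|\psi\| = o(1) \|\Phi\|,
\end{aligned} \tag{4.6}$$
by virtue of Proposition 3.8. Next we consider
$$\begin{aligned}
J_t (D_H \tilde q_{a,t}) J_t &= -t^{-1} J_t \Bigl\langle (\nabla_\Xi \tilde q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi}{t} - p_\Xi \Bigr\rangle J_t + O(t^{-2}) \\
&= -t^{-1} \sum_{b \in \mathcal{A}_a} J_t\, \hat q_b^2\Bigl(\frac{\Xi}{t}\Bigr) \Bigl\langle (\nabla_{\Xi_b} \tilde q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr\rangle J_t + O(t^{-2}) \\
&= -t^{-1} \sum_{b \in \mathcal{A}_a} J_t\, \hat q_b\Bigl(\frac{\Xi}{t}\Bigr) \Bigl\langle (\nabla_{\Xi_b} \tilde q_a)\Bigl(\frac{\Xi}{t}\Bigr), \frac{\Xi_b}{t} - p_{\Xi_b} \Bigr\rangle \hat q_b\Bigl(\frac{\Xi}{t}\Bigr) J_t + O(t^{-2}),
\end{aligned} \tag{4.7}$$
by virtue of the proof of Theorem 3.9, where we used the same notation as in Sec. 3. Since
$$|(\Phi, h_1(N_0) f_1(H_a) J_t (D_H \tilde q_{a,t}) J_t f(H) h(N_0) \psi)| \le C t^{-1} \sum_{b \in \mathcal{A}_a} \|\Lambda_b^{1/4}(t) \hat q_{b,t} J_t f_1(H_a) h_1(N_0) \Phi\| \|\Lambda_b^{1/4}(t) \hat q_{b,t} J_t f(H) h(N_0) \psi\| + O(t^{-2}) \|\Phi\| \|\psi\|$$
with $\hat q_{b,t} = \hat q_b(\Xi/t)$ and the same notation as in Sec. 3, we obtain as $t_1, t_2 \to \infty$
$$\int_{t_1}^{t_2} |(\Phi, e^{itH_a} h_1(N_0) f_1(H_a) J_t (D_H \tilde q_{a,t}) J_t f(H) h(N_0) e^{-itH}\psi)|\, dt = o(1) \|\Phi\| \tag{4.8}$$
by virtue of Theorem 3.9. Combining (4.6) with (4.8), we see that $\{\check W_a(t)\psi \mid t \ge 1\}$ is a Cauchy sequence, which implies that (4.5) exists. Therefore we get the existence of the Deift–Simon wave operators $\check W_a^+$, $a \in \mathcal{A}$.

Next we prove the existence of the usual wave operators $W_a^+$, $a \in \mathcal{A}(a_{\max})$, which are defined by (1.16). First we note that, by the same argument as the one used to show the existence of the Deift–Simon wave operators $\check W_a^+$, $a \in \mathcal{A}$, there exist the limits
$$\operatorname*{s-lim}_{t \to \infty} e^{itH} \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH_a}, \qquad a \in \mathcal{A}. \tag{4.9}$$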
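The passage from the weak estimates (4.6) and (4.8) to the Cauchy property of $\check W_a(t)\psi$ is a duality argument. A sketch follows, under the assumption that the $o(1)$ terms in (4.6) and (4.8) are uniform in $\Phi$ with $\|\Phi\| = 1$, and with the norm-integrable $I_a(x)$ term handled separately:

```latex
\|\check W_a(t_2)\psi - \check W_a(t_1)\psi\|
  = \sup_{\|\Phi\|=1}\Bigl|\int_{t_1}^{t_2}
      \Bigl(\Phi, \frac{d\check W_a}{dt}(t)\,\psi\Bigr)\,dt\Bigr|
  \;\le\; \sup_{\|\Phi\|=1}\int_{t_1}^{t_2}
      \Bigl|\Bigl(\Phi, \frac{d\check W_a}{dt}(t)\,\psi\Bigr)\Bigr|\,dt
  \;\longrightarrow\; 0
  \qquad (t_1, t_2 \to \infty).
% Completeness of the Hilbert space then gives the existence of the limit.
```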
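Next we notice that finite linear combinations of tensor products of eigenstates of $K^a$ and $H^{C_{\#(a)}}$ are dense in $\operatorname{Ran}(\tilde P^a \otimes \hat P^a)$. So it is sufficient to show the existence of $W_a^+$ with assuming that $\tilde P^a \otimes \hat P^a$ is the eigenprojection of $\tilde K^a$ for an eigenstate such that $\tilde K^a (\tilde P^a \otimes \hat P^a) = E (\tilde P^a \otimes \hat P^a)$, where $\tilde K^a = K^a \otimes \mathrm{Id} + \mathrm{Id} \otimes H^{C_{\#(a)}}$ on $L^2(X^{a,n}) \otimes L^2(\mathbb{R}^{2\times\#(C_{\#(a)})} \times Z^{C_{\#(a)}})$. Put
$$D_r = \operatorname{Ran} \prod_{b_1 \in \bar{\mathcal{A}}_{a,1}} 1_{\tilde\omega_{a,b_1}(r)}(p_\Xi), \qquad \bar{\mathcal{A}}_{a,1} = \{b \in \mathcal{A} \mid a \subsetneq b\}$$
with $r > 0$ and
$$\tilde\omega_{a,b_1}(r) = \{\Xi \in X \mid |\Xi_a|_1^2 + r^{\#(a)}/2 \ge |\Xi_{b_1}|_1^2 + r^{\#(b_1)}\}.$$
Then we have only to prove the existence of $W_a^+$ on states in the dense set $D = \bigcup_{r>0} D_r$. Now we note that we have only to consider sufficiently small $r > 0$, and we see that
$$\operatorname*{s-lim}_{t \to \infty} 1_{X \setminus \tilde\omega_{a,b_1}(r)}\Bigl(\frac{\Xi}{t}\Bigr) e^{it(\Delta_{y_{a,n}} + \Delta_{z_a})/2}\, 1_{\tilde\omega_{a,b_1}(r)}(p_\Xi) = 0, \qquad a \subsetneq b_1,$$
$$\operatorname*{s-lim}_{t \to \infty} 1_{X \setminus \tilde\omega_{a,b}(r)}\Bigl(\frac{\Xi}{t}\Bigr) = 0, \qquad b \subsetneq a,$$
as $t \to \infty$, where as for the first limit, see [18] (see also e.g. [11]). Here we used the facts that the definition of $\tilde\omega_{a,b_1}(r)$, $a \subsetneq b_1$, is independent of $\Xi^a$, and that $-(\Delta_{y_{a,n}} + \Delta_{z_a}) = \langle p_{\Xi_a}, p_{\Xi_a} \rangle$. Thus, by taking account of (1.15), we have for $\psi \in D_r$ with $r > 0$
$$\begin{aligned}
e^{-itH_a} P^a \psi &= \Bigl( \prod_{b \in \bar{\mathcal{A}}_a} 1_{\tilde\omega_{a,b}(r)}\Bigl(\frac{\Xi}{t}\Bigr) \Bigr) e^{-itH_a} P^a \psi + o(1) \\
&= \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH_a} P^a \psi + o(1)
\end{aligned}$$
as $t \to \infty$, where we used the facts that $\tilde q_a(\Xi) = 1$ on the set
$$\tilde\Omega_a(r) = \{\Xi \in X \mid |\Xi_a|_1^2 + r^{\#(a)}/2 \ge |\Xi_b|_1^2 + r^{\#(b)} \text{ for any } b \in \mathcal{A} \text{ such that } b \subsetneq a \text{ or } a \subsetneq b\},$$
and $\prod_{b \in \bar{\mathcal{A}}_a} 1_{\tilde\omega_{a,b}(r)}(\Xi)$ is the characteristic function of $\tilde\Omega_a(r)$. Thus, by virtue of the existence of (4.9), we see that $W_a^+$ exists. Now we note that one can prove the closedness of the ranges of $W_a^+$, $a \in \mathcal{A}(a_{\max})$, their mutual orthogonality and
$$\sum_{a \in \mathcal{A}(a_{\max})} \oplus \operatorname{Ran} W_a^\pm \subset L_c^2(H)$$
in the same way as in the case of many body Schrödinger operators without external electromagnetic fields.

Finally we prove the asymptotic completeness. We first claim that letting $f \in C_0^\infty(\mathbb{R})$ be as in Corollary 2.2, we have for any real-valued $h \in C_0^\infty(\mathbb{R})$
$$\check W_{a_{\max}}^+ f(H) h(N_0) = 0 \tag{4.10}$$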
with sufficiently small $r > 0$ in the definition of $\{\tilde q_a(\Xi) \mid a \in \mathcal{A}\}$. In fact, by virtue of Theorem 3.11, we have only to take $r > 0$ so small that $r < \varepsilon_0^2$, where $\varepsilon_0 > 0$ is as in Theorem 3.11.

Now we prove the asymptotic completeness by induction with respect to $N \ge 2$. First we note that in the case when $N = 2$, the asymptotic completeness was proved in [1]. Assume that the asymptotic completeness holds for $M$-body systems in which there exists only one charged particle with $2 \le M < N$. By a density argument, we have only to consider $\psi \in L_c^2(H)$ such that
$$\psi = h(N_0)\psi, \qquad \psi = f(H)\psi$$
with $h \in C_0^\infty(\mathbb{R})$ and $f \in C_0^\infty(\mathbb{R})$ as in Corollary 2.2. Here we also notice that $\Theta \cup \sigma_{pp}(H)$ is a closed countable set (see Theorem 2.1). If we take $r > 0$ so small that $r < \varepsilon_0^2$, we see that
$$\begin{aligned}
e^{-itH}\psi &= \sum_{a \in \mathcal{A}} \tilde q_a\Bigl(\frac{\Xi}{t}\Bigr) e^{-itH}\psi \\
&= \sum_{a \in \mathcal{A}(a_{\max})} e^{-itH_a} \check W_a^+ \psi + o(1) \\
&= \sum_{a \in \mathcal{A}(a_{\max})} e^{-itH_a} P^a \check W_a^+ \psi + \sum_{a \in \mathcal{A}(a_{\max})} e^{-itH_a} (\mathrm{Id} - P^a) \check W_a^+ \psi + o(1)
\end{aligned} \tag{4.11}$$
as $t \to \infty$. Here we used Proposition 3.4, the existence of the Deift–Simon wave operators $\check W_a^+$, and (4.10). For any $\varepsilon > 0$, there exist a finite number of $\tilde\psi_j^a \in L^2(X^{a,n})$, $\hat\psi_j^a \in L^2(\mathbb{R}^{2\times\#(C_{\#(a)})} \times Z^{C_{\#(a)}})$, $\psi_{a,j} \in L^2(Y_{a,n}) \otimes L^2(Z_a)$ such that
$$\Bigl\| \check W_a^+ \psi - \sum_{j:\text{finite}} \tilde\psi_j^a \otimes \hat\psi_j^a \otimes \psi_{a,j} \Bigr\| < \varepsilon. \tag{4.12}$$
Now one can apply the asymptotic completeness for $K^a$ and $H^{C_{\#(a)}}$, where we recall that $K^a$ is an $(N - \#(C_{\#(a)}))$-body Schrödinger operator without external electromagnetic fields in the center of mass frame, and $H^{C_{\#(a)}}$ is the $\#(C_{\#(a)})$-body Hamiltonian under consideration. We also note that the asymptotic completeness for $K^a$ under the condition (SR) was already obtained by several authors (see e.g. [19], [11] and [23]). For any $a = \{C_1, \ldots, C_{\#(a)}\} \in \mathcal{A}(a_{\max})$ with $\{N\} \subset C_{\#(a)}$, we put $a^n = \{C_1, \ldots, C_{\#(a)-1}\}$ and $a^c = \{C_{\#(a)}\}$. Let $\mathcal{A}_a^n$ be the set of all cluster decompositions $b^n$ of $\bigcup_{j=1}^{\#(a)-1} C_j$ such that $b^n \subset a^n$, and $\mathcal{A}_a^c$ be the set of all cluster decompositions $b^c$ of $C_{\#(a)}$ such that $b^c \subset a^c$. Put $\mathcal{A}_a^n(a^n) = \mathcal{A}_a^n \setminus \{a^n\}$ and $\mathcal{A}_a^c(a^c) = \mathcal{A}_a^c \setminus \{a^c\}$. Taking account of the fact that the asymptotic completeness for $H^{C_{\#(a)}}$, $a \in \mathcal{A}(a_{\max})$, holds by the assumption of induction, we have
$$\operatorname{Ran}(\mathrm{Id} - \tilde P^a) = \sum_{b^n \in \mathcal{A}_a^n(a^n)} \oplus \operatorname{Ran} \tilde W^+(K^a, K^{a^n}_{b^n}) \tag{4.13}$$
with ˜ + (K a , Kban ) = s - lim eitK a e−itKban P˜ban W t→∞
an bn
on L2 (X a,n ), where Kban = K a − I˜ n I˜ban =
with X Vl1 l2 (xl1 − xl2 ) ,
(l1 ,l2 )⊂an (l1 ,l2 )6⊂bn
P˜ban = P˜ban ⊗ Id is the eigenprojection for the subsystem Hamiltonian associated with Kban , as well as X M ˆ + (H C#(a) , H Cc#(a) ) (4.14) Ran (Id − Pˆ a ) = Ran W b bc ∈Aca (ac )
with ˆ + (H C#(a) , H Cc#(a) ) = s - lim eitH W b t→∞
C
C#(a)
C#(a)
e−itHbc
Pˆbac
c
on L2 (R2×#(C#(a) ) × Z C#(a) ), where Hbc#(a) = H C#(a) − Iˆbac with X c Vl1 l2 (xl1 − xl2 ) , Iˆbac = (l1 ,l2 )⊂ac (l1 ,l2 )6⊂bc C Pˆbac is the eigenprojection for Hbc#(a) , which is defined in the same way as P a ˜ bn ,j ∈ L2 (X a,n ), bn ∈ An (an ), such that associated with Ha . Thus there exist Φ a X ˜ bn ,j ˜ + (K a , K an )Φ (4.15) W (Id − P˜ a )ψ˜ja = b n bn ∈An a (a )
ˆ bc ,j ∈ L2 (R2×#(C#(a) ) × Z C#(a) ), bc ∈ Ac (ac ), such that by (4.13), and there exist Φ a X C ˆ bc ,j ˆ + (H C#(a) , H c#(a) )Φ (4.16) W (Id − Pˆ a )ψˆja = b bc ∈Aca (ac )
by (4.14). Thus, taking account of O O O O O (Id− Pˆ a )+(Id− P˜ a ) Pˆ a + P˜ a Id Id− P˜ a Pˆ a = (Id− P˜ a ) (Id− Pˆ a ) , we have as t → ∞ X ˇ a+ ψ + o(1) + O(ε) e−itH ψ = e−itHa P a W a∈A(amax )
+
X
X
n n n b ∈Aa (a ) a∈A(amax ) j:finite
O
˜ + (K a , Kban )Φ ˜ bn ,j e−itHa (W
bc ∈Aca (ac )
ˆ bc ,j ⊗ ψa,j ) ˆ + (H C#(a) , H Cc#(a) )Φ W b
+
X
˜ + (K a , Kban )Φ ˜ bn ,j e−itHa (W
O
Pˆ a ψˆja
O
ψa,j )
n bn ∈An a (a )
+
X
e−itHa (P˜ a ψ˜ja
O
ˆ bc ,j ˆ + (H C#(a) , H Cc#(a) )Φ W b
bc ∈Aca (ac )
O
ψa,j ) . (4.17)
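For $b^n = \{B_1^n, \ldots, B_{\#(b^n)}^n\} \in \mathcal{A}_a^n$ and $b^c = \{B_1^c, \ldots, B_{\#(b^c)}^c\} \in \mathcal{A}_a^c$, we write
$$b^n + b^c = \{B_1^n, \ldots, B_{\#(b^n)}^n, B_1^c, \ldots, B_{\#(b^c)}^c\} \in \mathcal{A}_a = \{b \in \mathcal{A} \mid b \subset a\}.$$
We note that, for $b^n \in \mathcal{A}_a^n$ and $b^c \in \mathcal{A}_a^c$, we see that $b^n + b^c$, $b^n + a^c$, $a^n + b^c \in \mathcal{A}(a) = \{b_1 \in \mathcal{A} \mid b_1 \subsetneq a\} = \mathcal{A}_a \setminus \{a\}$. Taking account of the definition of $\tilde W^+(K^a, K^{a^n}_{b^n})$ and $\hat W^+(H^{C_{\#(a)}}, H^{C_{\#(a)}}_{b^c})$, and rearranging some terms in (4.17) with respect to $b \in \mathcal{A}(a)$, we have as $t \to \infty$
$$e^{-itH}\psi = \sum_{a \in \mathcal{A}(a_{\max})} e^{-itH_a} P^a \check W_a^+ \psi + \sum_{a \in \mathcal{A}(a_{\max})} \sum_{b \in \mathcal{A}(a)} \sum_{j:\text{finite}} e^{-itH_b} P^b (\psi_j^b \otimes \psi_{a,j}) + o(1) + O(\varepsilon) \tag{4.18}$$
with some $\psi_j^b \in L^2(X^{a,n}) \otimes L^2(\mathbb{R}^{2\times\#(C_{\#(a)})} \times Z^{C_{\#(a)}})$. Multiplying both sides of (4.18) by $e^{itH}$ and taking $t \to \infty$, we have
$$\psi = \sum_{a \in \mathcal{A}(a_{\max})} W_a^+ \check W_a^+ \psi + \sum_{a \in \mathcal{A}(a_{\max})} \sum_{b \in \mathcal{A}(a)} \sum_{j:\text{finite}} W_b^+ (\psi_j^b \otimes \psi_{a,j}) + O(\varepsilon). \tag{4.19}$$
Since one can take $\varepsilon > 0$ arbitrarily, this implies
$$\psi \in \sum_{a \in \mathcal{A}(a_{\max})} \oplus \operatorname{Ran} W_a^+,$$
by virtue of the closedness of the ranges of $W_a^+$, $a \in \mathcal{A}(a_{\max})$. The proof is completed.

Remark 4.1. If one follows the argument of [5] and [6], one can first prove the existence of the asymptotic velocity
$$\operatorname*{s-}C_\infty\operatorname{-lim}_{t \to \infty} e^{itH}\, \frac{\Xi}{t}\, e^{-itH},$$
and after that, one can show the asymptotic completeness for the system under consideration by using this asymptotic velocity. Though we have not used this asymptotic observable here, it seems to be very useful in the study of long-range scattering for the system under consideration.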
Acknowledgments

I would like to express my deep gratitude to Professor Christian Gérard for his valuable comments as well as his hospitality, since part of this paper was developed during my stay at the École Polytechnique. This work was supported by the Ministry of Education, Science, Sports, and Culture, the Government of Japan.

References

[1] T. Adachi, “Scattering theory for a two-body quantum system in a constant magnetic field”, J. Math. Sci. Univ. Tokyo 8 (2001) 243–274.
[2] J. Avron, I. W. Herbst and B. Simon, “Schrödinger operators with magnetic fields. I. General interactions”, Duke Math. J. 45 (1978) 847–883.
[3] J. Avron, I. W. Herbst and B. Simon, “Separation of center of mass in homogeneous magnetic fields”, Ann. Phys. 114 (1978) 431–451.
[4] H. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry, Springer-Verlag, 1987.
[5] J. Dereziński, “Asymptotic completeness of long-range N-body quantum systems”, Ann. Math. 138 (1993) 427–476.
[6] J. Dereziński and C. Gérard, Scattering Theory of Classical and Quantum N-Particle Systems, Springer-Verlag, 1997.
[7] R. Froese and I. W. Herbst, “A new proof of the Mourre estimate”, Duke Math. J. 49 (1982) 1075–1085.
[8] C. Gérard and I. Laba, “Scattering theory for N-particle systems in constant magnetic fields”, Duke Math. J. 76 (1994) 433–465.
[9] C. Gérard and I. Laba, “Scattering theory for N-particle systems in constant magnetic fields, II. Long-range interactions”, Comm. Partial Differential Equations 20 (1995) 1791–1830.
[10] C. Gérard and I. Laba, “Scattering theory for 3-particle systems in constant magnetic fields: Dispersive case”, Ann. Inst. Fourier, Grenoble 46 (1996) 801–876.
[11] G. M. Graf, “Asymptotic completeness for N-body short-range quantum systems: a new proof”, Comm. Math. Phys. 132 (1990) 73–101.
[12] B. Helffer and J. Sjöstrand, Equation de Schrödinger avec Champ Magnétique et Équation de Harper, Lecture Notes in Physics 345, Springer-Verlag 1989, 118–197.
[13] A. Jensen and S. Nakamura, “The 2D Schrödinger equation for a neutral pair in a constant magnetic field”, Ann. Inst. Henri Poincaré Phys. Théor. 67 (1997) 387–410.
[14] I. Laba, “Scattering for hydrogen-like systems in a constant magnetic field”, Comm. Partial Differential Equations 20 (1995) 741–762.
[15] I. Laba, “Multiparticle quantum systems in constant magnetic fields”, pp. 147–215 in Multiparticle Quantum Scattering with Applications to Nuclear, Atomic and Molecular Physics, Minneapolis, MN, 1995, IMA Vol. Math. Appl., 89, Springer-Verlag 1997.
[16] E. Mourre, “Absence of singular continuous spectrum for certain self-adjoint operators”, Comm. Math. Phys. 78 (1981) 391–408.
[17] P. Perry, I. M. Sigal and B. Simon, “Spectral analysis of N-body Schrödinger operators”, Ann. Math. 114 (1981) 517–567.
[18] M. Reed and B. Simon, Methods of Modern Mathematical Physics I–IV, Academic Press.
[19] I. M. Sigal and A. Soffer, “The N-particle scattering problem: asymptotic completeness for short-range systems”, Ann. Math. 125 (1987) 35–108.
[20] I. M. Sigal and A. Soffer, “Long-range many body scattering: Asymptotic clustering for Coulomb type potentials”, Invent. Math. 99 (1990) 115–143.
[21] E. Skibsted, “On the asymptotic completeness for particles in constant electromagnetic fields”, pp. 286–320 in Partial Differential Equations and Mathematical Physics, Copenhagen, 1995; Lund, 1995, Progr. Nonlinear Differential Equations Appl., 21, Birkhäuser 1996.
[22] E. Skibsted, “Asymptotic completeness for particles in combined constant electric and magnetic fields, II”, Duke Math. J. 89 (1997) 307–350.
[23] D. Yafaev, “Radiation conditions and scattering theory for N-particle Hamiltonians”, Comm. Math. Phys. 154 (1993) 523–554.
March 19, 2002 11:43 WSPC/148-RMP
00117
Reviews in Mathematical Physics, Vol. 14, No. 3 (2002) 241–272 © World Scientific Publishing Company
QUANTUM STOCHASTIC ANALYSIS VIA WHITE NOISE OPERATORS IN WEIGHTED FOCK SPACE
DONG MYUNG CHUNG∗ Department of Mathematics, Sogang University Seoul, 121-742 Korea UN CIG JI∗ Department of Mathematics, College of Natural Science Chungbuk National University, Cheongju, 361-763 Korea NOBUAKI OBATA† Graduate School of Information Sciences, Tohoku University Sendai, 980-8579 Japan
Received 9 May 2000
Revised 17 August 2001

White noise theory allows one to formulate quantum white noises explicitly as elemental quantum stochastic processes. A traditional quantum stochastic differential equation of Itô type is brought into a normal-ordered white noise differential equation driven by lower powers of quantum white noises. The class of normal-ordered white noise differential equations covers quantum stochastic differential equations with highly singular noises such as higher powers or higher order derivatives of quantum white noises, which are far beyond the traditional Itô theory. For a general normal-ordered white noise differential equation, existence and uniqueness of a solution are proved in the sense of white noise distributions. Its regularity properties are investigated by means of weighted Fock spaces interpolating spaces of white noise distributions and the associated characterization theorems for the S-transform and for operator symbols.

Keywords: White noise theory; Fock space; quantum stochastic differential equation; normal-ordered white noise differential equation; quantum white noise; Wick product.

Mathematics Subject Classification 2000: Primary: 60H40; Secondary: 34G10, 46A32, 46F25, 81S25.
∗ Partially supported by Korea Research Foundation Grant (KRF-2000-015-DP0016).
† Partially supported by Grant-in-Aid for Scientific Research (No. 12440036), Ministry of Education, Japan.

Introduction

Quantum stochastic differential equations have attracted the interest of both physicists and mathematicians for many years. They serve as a useful tool for the study of
noisy quantum systems appearing in many branches of quantum physics, e.g. quantum open systems, quantum measurement theory, quantum optics, quantum information, and so on, see e.g. [1, 9]. The mathematical basis of quantum stochastic differential equations was first given by Hudson and Parthasarathy [16], where the role of Brownian motion in the classical Itô theory was played by three quantum stochastic processes (annihilation, creation and number processes) and the concept of quantum stochastic integration of Itô type was established. Since then the Hudson–Parthasarathy approach has developed considerably as a quantum extension of the traditional Itô theory; for a comprehensive account see the books by Meyer [22] and by Parthasarathy [30], see also [29].

Meanwhile, white noise theory (Hida calculus), originally aiming at an extension of Itô theory while keeping contact with Lévy's stochastic variational calculus [11], has developed considerably into an infinite dimensional analysis with a wide range of applications, see [12, 17, 21] and references therein. Among other things, the significant feature that the pointwise defined creation and annihilation operators (the quantum white noise process) are formulated as continuous operators on white noise functions developed into the white noise operator theory [24] and brought a new approach to quantum stochastic analysis [5, 14, 15, 25, 26]. This approach enables us to discuss very singular (nonlinear) noises involving higher powers or higher order derivatives of quantum white noises, as well as nonlinear extensions of stochastic analysis. Moreover, together with the stochastic limit of quantum theory due to Accardi, Lu and Volovich [2], which reveals a mechanism by which a quantum stochastic differential equation (in a broad sense) emerges from a standard Hamiltonian model, the white noise approach is expected to be a clue to go beyond the traditional Itô theory. In this paper we focus on the white noise approach to quantum stochastic analysis.
A typical quantum stochastic differential equation (QSDE) of Hudson–Parthasarathy type is of the form:
$$dX = (L_1\,dA_t + L_2\,dA_t^* + L_3\,d\Lambda_t + L_4\,dt)X, \qquad X(0) = I, \tag{0.1}$$
where $\{A_t\}$, $\{A_t^*\}$ and $\{\Lambda_t\}$ are respectively the annihilation process, the creation process and the number process. These processes are originally (unbounded) operators acting in the Fock space $\Gamma(L^2(\mathbb{R}))$; on a suitably prepared Gelfand triple
$$\mathcal{W} \subset \Gamma(L^2(\mathbb{R})) \subset \mathcal{W}^*$$
they can be formulated as continuous operators from $\mathcal{W}$ into $\mathcal{W}^*$, i.e. within the space $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$. In general, a member of $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ is called a white noise operator, which is expected to play the role of a “generalized observable” in the Fock space $\Gamma(H_{\mathbb{C}})$. This idea shares a common spirit with the so-called rigged Hilbert space formalism for quantum mechanics, see e.g. [4]. It is noteworthy that
$$\frac{d}{dt}A_t = a_t, \qquad \frac{d}{dt}A_t^* = a_t^*, \qquad \frac{d}{dt}\Lambda_t = a_t^* a_t,$$
hold in $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$, where $a_t$ and $a_t^*$ are the annihilation and creation operators at a point $t \in \mathbb{R}$. With this machinery (0.1) is translated into an equation for white noise operators:
$$\frac{dX}{dt} = (L_1 a_t + L_2 a_t^* + L_3 a_t^* a_t + L_4) \diamond X, \qquad X(0) = I,$$
where $\diamond$ stands for the Wick product (normal-ordered product). The significance here is that the stochastic equation (0.1) is equivalent to a simple ordinary differential equation for white noise operators. From the mathematical point of view it is more natural to consider a linear differential equation for white noise operators of the form:
$$\frac{d\Xi}{dt} = L_t \diamond \Xi, \qquad \Xi(0) = I, \tag{0.2}$$
where $\{L_t\}$ is a quantum stochastic process in the sense of white noise operators, i.e. we only assume that $t \mapsto L_t \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ is a continuous map defined on an interval. An equation of the form (0.2) is called a normal-ordered white noise differential equation (NOWNDE). We emphasize that the seemingly simple linear equation (0.2) already covers quantum stochastic differential equations with very singular driving noises that are not handled by the traditional Itô theory.

As we have established general existence of a unique solution to (0.2) as white noise operators [6, 27, 28], regularity properties of the solution are to be discussed next. In particular, we need a systematic answer to the question of when the “distribution-valued” solution becomes a process with values in operators acting in a certain Hilbert space. In this paper, introducing weighted Fock spaces (a particular class of interacting Fock spaces [2]) interpolating spaces of white noise distributions, we shall discuss regularity properties of a solution and improve the preliminary results in [7].

This paper is organized as follows. In Sec. 1 we recall standard notation in white noise distribution theory due to Cochran, Kuo and Sengupta [8]. Moreover, we state, in terms of its S-transform, a sufficient criterion for a white noise distribution to belong to a weighted Fock space. In Sec. 2 we study white noise operators. A sufficient condition for a general white noise operator to be an operator in a certain weighted Fock space is obtained by refining the well-known characterization theorem for operator symbols. In Sec. 3 we introduce the Wick product and explain the relation between quantum stochastic differential equations and normal-ordered white noise differential equations. In Sec. 4 regularity properties of the Wick exponential of a white noise operator are studied in detail. The results are applied to a normal-ordered white noise differential equation and a proper Hilbert space is specified for the solution.

General Notation. For locally convex spaces $\mathfrak{X}$, $\mathfrak{Y}$ let $\mathcal{L}(\mathfrak{X}, \mathfrak{Y})$ denote the space of continuous operators from $\mathfrak{X}$ into $\mathfrak{Y}$. It is always assumed that $\mathcal{L}(\mathfrak{X}, \mathfrak{Y})$ is equipped with the topology of uniform convergence on every bounded subset. If $\mathfrak{X}$ is a real space, its complexification is denoted by $\mathfrak{X}_{\mathbb{C}}$.
1. White Noise Distribution Theory

1.1. Gaussian space and Brownian motion

In order to realize a Brownian motion we start with the real Gelfand triple:
$$E = \mathcal{S}(\mathbb{R}) \subset H = L^2(\mathbb{R}, dt) \subset E^* = \mathcal{S}'(\mathbb{R}), \tag{1.1}$$
where $\mathcal{S}(\mathbb{R})$ is the space of rapidly decreasing functions, and $\mathcal{S}'(\mathbb{R})$ the space of tempered distributions. The canonical bilinear form on $E^* \times E$ and the inner product of $H$ are denoted by the common symbol $\langle \cdot, \cdot \rangle$ since they are compatible. The norm of $H$ is denoted by $|\cdot|_0$. We shall introduce the canonical topology of $E = \mathcal{S}(\mathbb{R})$ by means of Hilbertian norms. For $p \in \mathbb{R}$ we put
$$|\xi|_p = |A^p \xi|_0, \quad \xi \in H, \qquad A = 1 + t^2 - \frac{d^2}{dt^2}.$$
Then, for $p \ge 0$ the set $E_p = \{\xi \in H;\ |\xi|_p < \infty\}$ becomes a Hilbert space with norm $|\cdot|_p$, while $E_{-p}$ denotes the completion of $H$ with respect to the norm $|\cdot|_{-p}$. Note that $E_p$ and $E_{-p}$ are dual to each other. With this notation we have
$$E = \mathcal{S}(\mathbb{R}) = \operatorname*{proj\,lim}_{p \to \infty} E_p, \qquad E^* = \mathcal{S}'(\mathbb{R}) = \operatorname*{ind\,lim}_{p \to \infty} E_{-p}. \tag{1.2}$$
There exists an orthonormal basis $\{e_i\}_{i=0}^{\infty} \subset E$ of $H$ such that $A e_i = (2i+2) e_i$, $i \ge 0$. The constants
$$\rho = \|A^{-1}\|_{\mathrm{OP}} = \frac{1}{2}, \qquad \|A^{-q}\|_{\mathrm{HS}}^2 = \sum_{i=0}^{\infty} \frac{1}{(2i+2)^{2q}} < \infty, \quad q > \frac{1}{2},$$
play an important role in norm estimates throughout. It follows from the Bochner–Minlos theorem that there exists a probability measure $\mu$ on $E^*$ such that
$$e^{-|\xi|_0^2/2} = \int_{E^*} e^{i\langle x, \xi\rangle}\,\mu(dx), \qquad \xi \in E.$$
This $\mu$ is called the standard Gaussian measure and the probability space $(E^*, \mu)$ the Gaussian space. By the standard $L^2$-approximation we can define a family of random variables:
$$B_t(x) = \langle x, 1_{[0,t]} \rangle, \qquad x \in E^*, \quad t \ge 0. \tag{1.3}$$
As is easily verified, $\{B_t;\ t \ge 0\} \subset L^2(E^*, \mu)$ and
$$B_0(x) = 0, \qquad \mathbf{E}(B_t) = 0, \qquad \mathbf{E}(B_s B_t) = \min\{s, t\},$$
that is, $\{B_t\}$ is a (realization of) Brownian motion. The white noise process $\{W_t\}$ is formally defined as the time derivative of Brownian motion:
$$W_t(x) = \frac{d}{dt} B_t(x) = \langle x, \delta_t \rangle.$$
The rigorous meaning of the above expression will be given in Sec. 1.3.
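The covariance of the Brownian motion (1.3) can be checked numerically. The following is a minimal sketch, assuming only the standard identity that the expectation of $\langle x, \xi\rangle\langle x, \eta\rangle$ under $\mu$ equals $\langle \xi, \eta\rangle$, so that the covariance of $B_s$ and $B_t$ reduces to the $L^2$ inner product of two indicator functions. The helper names `indicator`, `inner_product` and `brownian_covariance` are illustrative and do not come from the paper.

```python
def indicator(a, b):
    """Indicator function of the interval [a, b]."""
    return lambda u: 1.0 if a <= u <= b else 0.0


def inner_product(f, g, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the L^2(R, dt) inner product on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) * g(lo + (k + 0.5) * h)
                   for k in range(steps))


def brownian_covariance(s, t):
    """Approximate E(B_s B_t) as <1_[0,s], 1_[0,t]> (assumed Gaussian identity)."""
    return inner_product(indicator(0.0, s), indicator(0.0, t), 0.0, max(s, t))


# The two numbers should agree up to the quadrature step size.
approx = brownian_covariance(0.3, 0.7)
exact = min(0.3, 0.7)
```

The quadrature is deliberately crude; it only illustrates that the overlap of $[0,s]$ and $[0,t]$ has length $\min\{s,t\}$.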
1.2. Weighted Fock space

For $n \ge 0$ let $H^{\hat\otimes n}$ be the $n$-fold symmetric tensor power of $H$; their norms are denoted by the common symbol $|\cdot|_0$. Given a sequence $\alpha = \{\alpha(n)\}_{n=0}^{\infty}$ of positive numbers we put
$$\Gamma_\alpha(H) = \Big\{ \phi = (f_n)_{n=0}^{\infty};\ f_n \in H^{\hat\otimes n},\ \|\phi\|_{0,+}^2 \equiv \sum_{n=0}^{\infty} n!\,\alpha(n)\,|f_n|_0^2 < \infty \Big\}.$$
Then $\Gamma_\alpha(H)$ becomes a Hilbert space and is called a weighted Fock space. The (Boson) Fock space is the special case of $\alpha(n) \equiv 1$ and is denoted by $\Gamma(H)$. For two sequences of positive numbers $\alpha = \{\alpha(n)\}$ and $\beta = \{\beta(n)\}$ we write $\beta \prec \alpha$ if there exists a positive number $C > 0$ such that $\beta(n) \le C\alpha(n)$ for all $n \ge 0$. With this notation, we easily see the following.

Lemma 1.1. Assume that a Hilbert space $K_2$ is imbedded in another Hilbert space $K_1$ and the inclusion map $K_2 \hookrightarrow K_1$ is a contraction. Let $\alpha = \{\alpha(n)\}$ and $\beta = \{\beta(n)\}$ be two positive sequences such that $\beta \prec \alpha$. Then we have continuous inclusions with dense images:
$$\Gamma_\alpha(K_2) \hookrightarrow \Gamma_\beta(K_2) \hookrightarrow \Gamma_\beta(K_1).$$
Moreover, the second inclusion is a contraction.

1.3. CKS-space

For a positive sequence $\alpha = \{\alpha(n)\}$ we consider the following four conditions:
(A1) $\alpha(0) = 1$ and $\inf_{n \ge 0} \alpha(n)\sigma^n > 0$ for some $\sigma \ge 1$;
(A2) $\lim_{n \to \infty} \{\alpha(n)/n!\}^{1/n} = 0$;
(A3) $\alpha$ is equivalent to a positive sequence $\gamma = \{\gamma(n)\}$ such that $\{\gamma(n)/n!\}$ is log-concave;
(A4) there exists a constant $C_{1\alpha} > 0$ such that $\alpha(m)\alpha(n) \le C_{1\alpha}^{m+n}\,\alpha(m+n)$ for all $m, n$.

The generating function of the sequence $\alpha = \{\alpha(n)\}$ defined by
$$G_\alpha(t) = \sum_{n=0}^{\infty} \frac{\alpha(n)}{n!}\, t^n$$
is entire due to condition (A2). Condition (A3) is necessary and sufficient for the power series
$$\tilde G_\alpha(t) = \sum_{n=0}^{\infty} \frac{n^{2n}}{n!\,\alpha(n)} \Big\{ \inf_{s>0} \frac{G_\alpha(s)}{s^n} \Big\}\, t^n$$
to have a positive radius of convergence $R_\alpha > 0$. Condition (A4) implies that there exists a constant $C_{3\alpha} > 0$ such that
$$\alpha(n) \le C_{3\alpha}^{m}\,\alpha(m), \qquad 0 \le n \le m, \tag{1.4}$$
see [3, 19]. The following results are then easily verified.
Proposition 1.2. Let $\alpha = \{\alpha(n)\}$ be a positive sequence satisfying (A2), and $G_\alpha(t)$ the generating function defined above. Then,
(1) $G_\alpha(0) = 1$ and $G_\alpha(s) \le G_\alpha(t)$ for $0 \le s \le t$;
(2) $\gamma[G_\alpha(t) - 1] \le G_\alpha(\gamma t) - 1$ for any $\gamma \ge 1$ and $t \ge 0$.
With the additional condition (A4), we have
(3) $e^s G_\alpha(t) \le G_\alpha(C_{3\alpha}(s+t))$ for $s, t \ge 0$; in particular, $e^t \le G_\alpha(C_{3\alpha} t)$ for $t \ge 0$, where the constant $C_{3\alpha}$ is given in (1.4);
(4) $G_\alpha(s) G_\alpha(t) \le G_\alpha(C_{1\alpha}(s+t))$ for $s, t \ge 0$.

From now on we always assume that a weight sequence $\alpha = \{\alpha(n)\}$ satisfies conditions (A1)–(A3). In fact, conditions (A1)–(A3) are necessary for the basic characterization theorems for the S-transform and for operator symbols, and condition (A4) for other essential properties of white noise operators. However, these conditions are not minimal, and an almost optimal (though somewhat implicit) description has been investigated in terms of the function $G_\alpha$ or another function controlling the growth rate, see [3, 10].

Now we shall construct the space of white noise distributions. For the Hilbert space $E_p$ defined in Sec. 1.1 consider the weighted Fock space $\Gamma_\alpha(E_p)$ and, according to (1.2), define
$$\Gamma_\alpha(E) = \operatorname*{proj\,lim}_{p \to \infty} \Gamma_\alpha(E_p).$$
It is easily shown that $\Gamma_\alpha(E)$ is a nuclear space whose topology is given by the family of norms:
$$\|\phi\|_{p,+}^2 = \sum_{n=0}^{\infty} n!\,\alpha(n)\,|f_n|_p^2, \qquad \phi = (f_n), \quad p \ge 0.$$
By a standard argument we see that
$$\Gamma_\alpha(E)^* \cong \operatorname*{ind\,lim}_{p \to \infty} \Gamma_{\alpha^{-1}}(E_{-p}),$$
where $\Gamma_\alpha(E)^*$ carries the strong dual topology and $\cong$ stands for a topological isomorphism. Then, by taking the complexification, we obtain a complex Gelfand triple:
$$\mathcal{W}_\alpha \equiv \Gamma_\alpha(E_{\mathbb{C}}) \subset \Gamma(H_{\mathbb{C}}) \subset \Gamma_\alpha(E_{\mathbb{C}})^* \equiv \mathcal{W}_\alpha^*, \tag{1.5}$$
where the middle space is the usual Fock space over $H_{\mathbb{C}}$.

Recall the well-known Wiener–Itô–Segal isomorphism between $\Gamma(H_{\mathbb{C}})$ and $L^2(E^*, \mu)$, which is a unitary isomorphism uniquely determined by the correspondence:
$$\Big(1, \xi, \frac{\xi^{\otimes 2}}{2!}, \ldots, \frac{\xi^{\otimes n}}{n!}, \ldots\Big) \leftrightarrow \phi_\xi(x) = e^{\langle x, \xi\rangle - \langle \xi, \xi\rangle/2}, \qquad \xi \in E_{\mathbb{C}}.$$
The above element is called an exponential vector or a coherent state (though not normalized to have norm one), and is denoted by the common symbol $\phi_\xi$. The Gelfand triple obtained from (1.5) through the Wiener–Itô–Segal isomorphism is also denoted by
$$\mathcal{W}_\alpha \subset L^2(E^*, \mu) \subset \mathcal{W}_\alpha^*. \tag{1.6}$$
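In this context, elements of $\mathcal{W}_\alpha$ and of $\mathcal{W}_\alpha^*$ are called white noise test functions and white noise distributions, respectively. The Gelfand triple (1.6) is referred to as the Cochran–Kuo–Sengupta space (or CKS-space for short) with weight sequence $\alpha = \{\alpha(n)\}$. When there is no danger of confusion, we write $\mathcal{W} = \mathcal{W}_\alpha$ for simplicity. The canonical bilinear form on $\mathcal{W}^* \times \mathcal{W}$ is denoted by $\langle\!\langle \cdot, \cdot \rangle\!\rangle$. Then
$$\langle\!\langle \Phi, \phi \rangle\!\rangle = \sum_{n=0}^{\infty} n!\,\langle F_n, f_n \rangle, \qquad \Phi = (F_n) \in \mathcal{W}^*, \quad \phi = (f_n) \in \mathcal{W},$$
and it holds that
$$|\langle\!\langle \Phi, \phi \rangle\!\rangle| \le \|\Phi\|_{-p,-}\,\|\phi\|_{p,+},$$
where
$$\|\Phi\|_{-p,-}^2 = \sum_{n=0}^{\infty} \frac{n!}{\alpha(n)}\,|F_n|_{-p}^2, \qquad \Phi = (F_n).$$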
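The bound on the canonical bilinear form is a direct consequence of the duality between $E_p$ and $E_{-p}$ and the Cauchy–Schwarz inequality. The short derivation below is a sketch of that step, written under the assumption that the duality estimate $|\langle F_n, f_n\rangle| \le |F_n|_{-p}\,|f_n|_p$ carries over to the symmetric tensor powers; it is not quoted from the paper.

```latex
\begin{align*}
|\langle\!\langle \Phi, \phi \rangle\!\rangle|
  &\le \sum_{n=0}^{\infty} n!\,|\langle F_n, f_n\rangle|
   \le \sum_{n=0}^{\infty} n!\,|F_n|_{-p}\,|f_n|_{p} \\
  &= \sum_{n=0}^{\infty}
     \Big(\sqrt{\tfrac{n!}{\alpha(n)}}\;|F_n|_{-p}\Big)
     \Big(\sqrt{n!\,\alpha(n)}\;|f_n|_{p}\Big) \\
  &\le \Big(\sum_{n=0}^{\infty} \frac{n!}{\alpha(n)}\,|F_n|_{-p}^2\Big)^{1/2}
       \Big(\sum_{n=0}^{\infty} n!\,\alpha(n)\,|f_n|_{p}^2\Big)^{1/2}
   = \|\Phi\|_{-p,-}\,\|\phi\|_{p,+}.
\end{align*}
```

The weight $\alpha(n)$ is split evenly between the two factors, which is why the dual norm carries $1/\alpha(n)$.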
The following result, whose proof is easy, highlights one of the essential features of the space of white noise distributions (1.6).

Proposition 1.3. Let $\{B_t\}$ be the Brownian motion defined in (1.3). Then, for any $\alpha$ satisfying (A1)–(A3), the map $t \mapsto B_t \in \mathcal{W}_\alpha^*$ is infinitely many times differentiable in $\mathcal{W}_\alpha^*$. Thus the time derivative of Brownian motion:
$$W_t = \frac{d}{dt} B_t$$
is defined in $\mathcal{W}_\alpha^*$ and $t \mapsto W_t \in \mathcal{W}_\alpha^*$ is also infinitely many times differentiable. In that case, $\{W_t\}$ is called the white noise process.

Here are some important examples of CKS-spaces $\mathcal{W}_\alpha$ with $\alpha$ satisfying conditions (A1)–(A4). The CKS-space corresponding to $\alpha(n) \equiv 1$ is called the Hida–Kubo–Takenaka space [20] and is denoted by
$$\mathcal{W}_\alpha = (E), \qquad \alpha(n) \equiv 1.$$
The one corresponding to $\alpha(n) = \tilde\beta(n) = (n!)^\beta$, $0 \le \beta < 1$, is called the Kondratiev–Streit space [18] and is denoted by
$$\mathcal{W}_{\tilde\beta} = (E)_\beta, \qquad \tilde\beta(n) = (n!)^\beta, \quad 0 \le \beta < 1.$$
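For the Kondratiev–Streit weight $(n!)^\beta$, conditions (A2) and (A4) can be spot-checked numerically. The sketch below is only a finite numerical check under stated assumptions, not a proof: it assumes the candidate constant $C_{1\alpha} = 1$ in (A4), motivated by $m!\,n! \le (m+n)!$, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def log_alpha(n, beta):
    """Logarithm of the weight alpha(n) = (n!)^beta."""
    return beta * lgamma(n + 1)


def a2_quantity(n, beta):
    """(alpha(n)/n!)^(1/n) for n >= 1; condition (A2) asks this to tend to 0."""
    return exp((log_alpha(n, beta) - lgamma(n + 1)) / n)


def a4_holds(beta, c1=1.0, upto=60):
    """Check alpha(m) alpha(n) <= c1^(m+n) alpha(m+n) for 0 <= m, n <= upto."""
    tol = 1e-9  # slack for floating-point rounding in lgamma
    return all(
        log_alpha(m, beta) + log_alpha(n, beta)
        <= (m + n) * log(c1) + log_alpha(m + n, beta) + tol
        for m in range(upto + 1)
        for n in range(upto + 1)
    )


# For beta < 1 the (A2) quantity decreases; for beta = 1 it stays at 1,
# which is consistent with the restriction 0 <= beta < 1 in the text.
decay = [a2_quantity(n, 0.5) for n in (10, 100, 1000)]
```

Working with logarithms avoids overflow in $n!$ for large $n$.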
Other important examples are constructed from the $k$th order Bell numbers $\{B_k(n)\}$ defined by
$$G_{\mathrm{Bell}(k)}(t) = \frac{\overbrace{\exp(\exp(\cdots(\exp t)\cdots))}^{k\text{-times}}}{\exp(\exp(\cdots(\exp 0)\cdots))} = \sum_{n=0}^{\infty} \frac{B_k(n)}{n!}\, t^n;$$
for more details see [8]. We here only mention a simple recurrence formula:
$$G_{\mathrm{Bell}(k+1)}(t) = \exp \gamma_k \{ G_{\mathrm{Bell}(k)}(t) - 1 \}, \quad k \ge 1; \qquad G_{\mathrm{Bell}(1)}(t) = e^t, \tag{1.7}$$
where
$$\gamma_{k+1} = \exp \gamma_k, \quad k \ge 1; \qquad \gamma_1 = 1.$$
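The recurrence (1.7) can be tested on the first nontrivial case. With $\gamma_1 = 1$ it gives $G_{\mathrm{Bell}(2)}(t) = \exp(e^t - 1)$, the exponential generating function of the classical Bell numbers. The sketch below, with illustrative helper names, expands this series with exact rational arithmetic and compares the coefficients with the Bell numbers obtained from the standard binomial recurrence.

```python
from fractions import Fraction
from math import comb, factorial


def series_exp(a, order):
    """Taylor coefficients of exp(a(t)) for a power series a with a[0] == 0.

    Uses b' = a' b, i.e. n * b[n] = sum_{k=1}^{n} k * a[k] * b[n-k].
    """
    b = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        b[n] = sum(k * a[k] * b[n - k] for k in range(1, n + 1)) / n
    return b


def bell2_series(order):
    """Coefficients of G_Bell(2)(t) = exp(gamma_1 * (G_Bell(1)(t) - 1)), gamma_1 = 1."""
    g1 = [Fraction(1, factorial(n)) for n in range(order + 1)]  # e^t
    return series_exp([Fraction(0)] + g1[1:], order)


def bell_numbers(count):
    """Classical Bell numbers via B(n+1) = sum_k C(n, k) B(k)."""
    b = [1]
    for n in range(count - 1):
        b.append(sum(comb(n, k) * b[k] for k in range(n + 1)))
    return b


ORDER = 8
second_order_bell = [int(c * factorial(n))
                     for n, c in enumerate(bell2_series(ORDER))]
```

Higher orders could be obtained by iterating the same step with $\gamma_{k+1} = \exp\gamma_k$, although the constants are then no longer rational.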
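1.4. Weighted Fock spaces interpolating CKS-space

In addition to the Gelfand triple (1.1) we consider further Hilbert spaces in order to construct weighted Fock spaces interpolating the CKS-space $\mathcal{W}_\alpha \subset \Gamma(H_{\mathbb{C}}) \subset \mathcal{W}_\alpha^*$. Let $K^\pm$ be Hilbert spaces with norms $|\cdot|_\pm$ and assume that the inclusions
$$E \subset K^+ \subset H \subset K^- \subset E^*$$
are all continuous, the inclusion $K^+ \hookrightarrow H$ is a contraction, and $K^\pm$ are dual to each other with respect to $H$. For example, $K^+ = E_p$ has this property for all $p \ge 0$.

Lemma 1.4. Let $\beta$ be a positive sequence such that $1 \prec \beta \prec \alpha$. Then we have continuous inclusions with dense images:
$$\mathcal{W}_\alpha \subset \Gamma_\beta(K_{\mathbb{C}}^+) \subset \Gamma(H_{\mathbb{C}}) \subset \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-) \subset \mathcal{W}_\alpha^*. \tag{1.8}$$
Moreover, $\Gamma_\beta(K_{\mathbb{C}}^+)$ and $\Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-)$ are dual to each other with respect to $\Gamma(H_{\mathbb{C}})$.

Proof. Since $E \hookrightarrow K^+$ is continuous, there exist $C \ge 0$ and $p_0 \ge 0$ such that
$$|\xi|_+ \le C|\xi|_{p_0} \le C\rho^{p-p_0}|\xi|_p, \qquad \xi \in E, \quad p \ge p_0.$$
Hence for a sufficiently large $p \ge 0$ we have $|\xi|_+ \le |\xi|_p$, $\xi \in E$; in other words, $E_p \hookrightarrow K^+$ is a contraction. It then follows from Lemma 1.1 that $\Gamma_\alpha(E_p) \hookrightarrow \Gamma_\beta(K^+)$ is continuous, and hence so is $\mathcal{W}_\alpha \hookrightarrow \Gamma_\beta(K_{\mathbb{C}}^+)$. Again by Lemma 1.1 we see that $\Gamma_\beta(K_{\mathbb{C}}^+) \hookrightarrow \Gamma(H_{\mathbb{C}})$ is also continuous for $1 \prec \beta$.

For simplicity the norms of $\Gamma_{\beta^{\pm 1}}(K_{\mathbb{C}}^\pm)$ are denoted by $\|\cdot\|_\pm$, namely,
$$\|\phi\|_\pm^2 = \sum_{n=0}^{\infty} n!\,\beta^{\pm 1}(n)\,|f_n|_\pm^2, \qquad \phi = (f_n) \in \Gamma_{\beta^{\pm 1}}(K_{\mathbb{C}}^\pm).$$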
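The estimate $|\xi|_{p_0} \le \rho^{p-p_0}|\xi|_p$ used in the proof of Lemma 1.4 can be illustrated in coordinates. The sketch below assumes that $\xi$ is a finite linear combination of the eigenvectors $e_i$ of $A$, so that the norm $|\xi|_p$ becomes a weighted $\ell^2$ norm of the coefficients. The names `hilbertian_norm` and `contraction_index` are hypothetical, and the coefficients are arbitrary test data.

```python
import random
from math import log2

RHO = 0.5  # rho = ||A^{-1}||_OP, since the smallest eigenvalue of A is 2


def hilbertian_norm(coeffs, p):
    """|xi|_p = |A^p xi|_0 for xi = sum_i c_i e_i with A e_i = (2i + 2) e_i."""
    return sum(((2 * i + 2) ** p * abs(c)) ** 2
               for i, c in enumerate(coeffs)) ** 0.5


def contraction_index(c, p0):
    """Smallest p >= p0 with c * RHO**(p - p0) <= 1, so that |xi|_+ <= |xi|_p."""
    return p0 + max(0.0, log2(c)) if c > 0 else p0


rng = random.Random(0)  # fixed seed, arbitrary test coefficients
coeffs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
p0, p = 1.0, 3.0
lhs = hilbertian_norm(coeffs, p0)
rhs = RHO ** (p - p0) * hilbertian_norm(coeffs, p)
```

The inequality holds coefficient by coefficient because every eigenvalue $2i+2$ is at least $2 = 1/\rho$.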
1.5. S-transform

For $\Phi \in \mathcal{W}^*$, the S-transform is defined by
$$S\Phi(\xi) = \langle\!\langle \Phi, \phi_\xi \rangle\!\rangle, \qquad \xi \in E_{\mathbb{C}}.$$
Since the exponential vectors $\{\phi_\xi;\ \xi \in E_{\mathbb{C}}\}$ span a dense subspace of $\mathcal{W}$, each $\Phi$ is uniquely specified by its S-transform. Obviously, the S-transform $F = S\Phi$ possesses the following properties:
(F1) for each $\xi, \eta \in E_{\mathbb{C}}$, the function $z \mapsto F(z\xi + \eta)$ is entire holomorphic on $\mathbb{C}$;
(F2) there exist $C \ge 0$ and $p \ge 0$ such that
$$|F(\xi)|^2 \le C\,G_\alpha(|\xi|_p^2), \qquad \xi \in E_{\mathbb{C}}.$$
In fact, (F2) follows from $\|\phi_\xi\|_{p,+}^2 = G_\alpha(|\xi|_p^2)$. More importantly, the converse assertion is also true. This well-known characterization theorem for the S-transform was first proved for the Hida–Kubo–Takenaka space by Potthoff and Streit [31]. The following result is due to Cochran, Kuo and Sengupta [8].

Theorem 1.5. Let $F$ be a $\mathbb{C}$-valued function on $E_{\mathbb{C}}$. Then $F$ is the S-transform of some $\Phi \in \mathcal{W}^*$ if and only if $F$ satisfies conditions (F1) and (F2). In that case, for any $q > 1/2$ with $\|A^{-q}\|_{\mathrm{HS}}^2 < R_\alpha$ we have
$$\|\Phi\|_{-(p+q),-}^2 \le C\,\tilde G_\alpha(\|A^{-q}\|_{\mathrm{HS}}^2).$$
(Such a choice of $q$ is always possible since $\lim_{q \to \infty} \|A^{-q}\|_{\mathrm{HS}} = 0$.)

We then ask how the intermediate space $\Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-)$ in (1.8) is characterized. For this question we have only a sufficient condition, which is, however, enough for our discussion later.
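The choice of $q$ in Theorem 1.5 can be made concrete, since the Hilbert–Schmidt norm is an explicit series. The sketch below evaluates $\|A^{-q}\|_{\mathrm{HS}}^2 = \sum_{i \ge 0} (2i+2)^{-2q}$ by a partial sum plus an integral estimate of the tail and then searches a grid for an admissible $q$. The radius $0.5$ is a purely hypothetical stand-in for $R_\alpha$, whose actual value depends on the weight sequence, and the function names are illustrative.

```python
from math import pi


def hs_norm_sq(q, terms=20_000):
    """Approximate sum_{i>=0} (2i+2)^(-2q) for q > 1/2.

    The tail beyond `terms` is estimated by the integral of (2x+2)^(-2q)
    over [terms - 1/2, infinity), a midpoint-type correction.
    """
    if q <= 0.5:
        raise ValueError("the Hilbert-Schmidt norm is finite only for q > 1/2")
    partial = sum((2 * i + 2) ** (-2 * q) for i in range(terms))
    tail = (2 * terms + 1) ** (1 - 2 * q) / (2 * (2 * q - 1))
    return partial + tail


def smallest_q(radius, start=0.75, step=0.25, max_steps=200):
    """First q on the grid start, start+step, ... with hs_norm_sq(q) < radius."""
    q = start
    for _ in range(max_steps):
        if hs_norm_sq(q) < radius:
            return q
        q += step
    raise ValueError("no admissible q found on the search grid")


# For q = 1 the series equals zeta(2)/4 = pi^2/24.
value_at_one = hs_norm_sq(1.0)
q_star = smallest_q(0.5)  # 0.5 is a hypothetical value of R_alpha
```

Because the series decreases to zero as $q$ grows, the search terminates for any positive radius, in line with the parenthetical remark after Theorem 1.5.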
Theorem 1.6. Let $\beta = \{\beta(n)\}$ be a positive sequence satisfying conditions (A1)–(A3). Let $F : E_{\mathbb{C}} \to \mathbb{C}$ be a function satisfying (F1) and assume that there exist $C \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $K_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < R_\beta$ such that
$$|F(\xi)|^2 \le C\,G_\beta(Q(\xi, \xi)), \qquad \xi \in E_{\mathbb{C}}.$$
Then there exists a unique $\Phi \in \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-)$ such that $F = S\Phi$. Moreover, in that case,
$$\|\Phi\|_-^2 \le C\,\tilde G_\beta(\operatorname{Tr} Q).$$
The proof is a simple modification of that of the usual characterization theorems for the S-transform [8, 31].

Remark 1.7. In fact, the “if part” of Theorem 1.5 (this is the essential part since the “only if part” is obvious) follows from Theorem 1.6. Fix $q > 1/2$ and put $K^+ = E_{p+q}$. Let $\{e_i\} \subset E$ be a complete orthonormal basis of $H$ such that $A e_i = (2i+2) e_i$. Define an operator $Q_0$ by
$$Q_0 \xi = \sum_i \langle \xi, e_i \rangle (2i+2)^{-2q} e_i$$
and a sesquilinear form on $E_{p+q}$ by
$$Q(\xi, \eta) = \langle A^{p+q} Q_0 \xi, A^{p+q} \bar\eta \rangle.$$
Noting that $\{(2i+2)^{-(p+q)} e_i\}$ is a complete orthonormal basis of $E_{p+q}$, we may show easily that
$$Q(\xi, \xi) = |\xi|_p^2, \qquad \operatorname{Tr} Q = \|A^{-q}\|_{\mathrm{HS}}^2 < \infty, \quad q > \frac{1}{2}.$$
Hence condition (F2) is equivalent to $|F(\xi)|^2 \le C\,G_\alpha(Q(\xi, \xi))$, and the “if part” of Theorem 1.5 follows immediately from Theorem 1.6.

Remark 1.8. As for the Kondratiev–Streit space $\mathcal{W}_\alpha = (E)_\beta$, $0 \le \beta < 1$, the function $G_\alpha(t)$ in Theorems 1.5 and 1.6 can be replaced with $G(t) = \exp t^{1/(1-\beta)}$, see [8].

2. White Noise Operators

A continuous operator $\Xi : \mathcal{W} \to \mathcal{W}^*$ is called a white noise operator and the space of such operators is denoted by $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$. Note that $\mathcal{L}(\mathcal{W}, \mathcal{W})$ and $\mathcal{L}(\Gamma(H_{\mathbb{C}}), \Gamma(H_{\mathbb{C}}))$ are subspaces of $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$. Moreover, $\mathcal{L}(\mathcal{W}^*, \mathcal{W}^*)$ is isomorphic to $\mathcal{L}(\mathcal{W}, \mathcal{W})$ by duality. A general theory for white noise operators has been extensively developed in [5, 24, 26]. In this section we shall focus on regularity properties of a white noise operator in terms of weighted Fock spaces.

2.1. Quantum white noise process

Let $a_t$ and $a_t^*$ be the annihilation and creation operators at a point $t \in \mathbb{R}$. For $\phi \in \mathcal{W}$ we have
$$a_t \phi(x) = \lim_{\theta \to 0} \frac{\phi(x + \theta\delta_t) - \phi(x)}{\theta},$$
where the limit always exists for all $t \in \mathbb{R}$ and $x \in E^*$. It is known that $a_t \in \mathcal{L}(\mathcal{W}, \mathcal{W})$ and $a_t^* \in \mathcal{L}(\mathcal{W}^*, \mathcal{W}^*)$. Moreover, the maps $t \mapsto a_t$ and $t \mapsto a_t^*$ are both infinitely many times differentiable. The pair $\{a_t, a_t^*\}$ is referred to as the quantum white noise process. The annihilation, creation and number processes are defined by
$$A_t = \int_0^t a_s\,ds, \qquad A_t^* = \int_0^t a_s^*\,ds, \qquad \Lambda_t = \int_0^t a_s^* a_s\,ds.$$
(A precise definition follows from the argument in Sec. 2.2.) These are white noise operators and
$$\frac{d}{dt}A_t = a_t, \qquad \frac{d}{dt}A_t^* = a_t^*, \qquad \frac{d}{dt}\Lambda_t = a_t^* a_t,$$
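hold in $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$. Moreover,
$$B_t = A_t + A_t^*, \qquad W_t = a_t + a_t^*,$$
where the left hand sides are regarded as multiplication operators from $\mathcal{W}$ into $\mathcal{W}^*$. Thus, in white noise theory it is reasonable to call a continuous map $t \mapsto \Xi_t \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ defined on an interval a quantum stochastic process. A “classical” stochastic process is then a continuous map $t \mapsto \Phi_t \in \mathcal{W}^*$ defined on an interval. In that case, by means of multiplication operators, a classical stochastic process is regarded as a quantum stochastic process, see also Sec. 4.3. The quantum white noise process plays the role of “elemental” quantum stochastic processes; in this connection see [25].

2.2. Integral kernel operators

For $\kappa \in (E_{\mathbb{C}}^{\otimes(l+m)})^*$ we shall define a white noise operator called an integral kernel operator with kernel distribution $\kappa$:
$$\Xi_{l,m}(\kappa) = \int_{\mathbb{R}^{l+m}} \kappa(s_1, \ldots, s_l, t_1, \ldots, t_m)\, a_{s_1}^* \cdots a_{s_l}^* a_{t_1} \cdots a_{t_m}\, ds_1 \cdots ds_l\, dt_1 \cdots dt_m,$$
though the integral is understood in a formal sense. For the precise definition we need the right $m$-contraction of $\kappa \in (E_{\mathbb{C}}^{\otimes(l+m)})^*$ and $f \in E_{\mathbb{C}}^{\otimes(n+m)}$ defined by
$$\kappa \otimes_m f = \sum_{\mathbf{j},\mathbf{k}} \Big( \sum_{\mathbf{i}} \langle \kappa, e(\mathbf{j}) \otimes e(\mathbf{i}) \rangle \langle f, e(\mathbf{k}) \otimes e(\mathbf{i}) \rangle \Big)\, e(\mathbf{j}) \otimes e(\mathbf{k}),$$
where $e(\mathbf{i}) = e_{i_1} \otimes \cdots \otimes e_{i_m}$, $e(\mathbf{j}) = e_{j_1} \otimes \cdots \otimes e_{j_l}$, $e(\mathbf{k}) = e_{k_1} \otimes \cdots \otimes e_{k_n}$ are respectively complete orthonormal bases of $H_{\mathbb{C}}^{\otimes m}$, $H_{\mathbb{C}}^{\otimes l}$ and $H_{\mathbb{C}}^{\otimes n}$. With this notation we define
$$\Xi_{l,m}(\kappa)\phi = \frac{(n+m)!}{n!}\, \kappa \otimes_m f_{n+m}, \qquad \phi = (f_n) \in \mathcal{W}.$$
An integral kernel operator is always a white noise operator, that is, $\Xi_{l,m}(\kappa) \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ for an arbitrary kernel $\kappa \in (E_{\mathbb{C}}^{\otimes(l+m)})^*$. Moreover, if the weight sequence $\alpha(n)$ fulfills conditions (A1)–(A4), then $\Xi_{l,m}(\kappa) \in \mathcal{L}(\mathcal{W}, \mathcal{W})$ if and only if $\kappa \in E_{\mathbb{C}}^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^*$. For the proof of this result, in [5] the authors used the following property of the sequence $\{\alpha(n)\}$: there exists a constant $C_{2\alpha} > 0$ such that for all $m$ and $n$,
$$\alpha(m+n) \le C_{2\alpha}^{m+n}\, \alpha(m)\alpha(n),$$
which is derived from condition (A3), see [3].

2.3. Operator symbol

The symbol of a white noise operator $\Xi \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ is by definition a $\mathbb{C}$-valued function on $E_{\mathbb{C}} \times E_{\mathbb{C}}$ defined by
$$\hat\Xi(\xi, \eta) = \langle\!\langle \Xi\phi_\xi, \phi_\eta \rangle\!\rangle, \qquad \xi, \eta \in E_{\mathbb{C}}.$$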
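Every white noise operator is uniquely specified by its symbol. The symbol is related to the S-transform in an obvious manner:
$$\hat\Xi(\xi, \eta) = S(\Xi\phi_\xi)(\eta) = S(\Xi^*\phi_\eta)(\xi).$$
As is easily verified, the symbol $\Theta = \hat\Xi$ of a white noise operator $\Xi \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ possesses the following properties:
(O1) for any $\xi, \xi_1, \eta, \eta_1 \in E_{\mathbb{C}}$ the function $(z, w) \mapsto \Theta(z\xi + \xi_1, w\eta + \eta_1)$ is entire holomorphic on $\mathbb{C} \times \mathbb{C}$;
(O2) there exist constants $C \ge 0$ and $p \ge 0$ such that
$$|\Theta(\xi, \eta)|^2 \le C\,G_\alpha(|\xi|_p^2)\,G_\alpha(|\eta|_p^2), \qquad \xi, \eta \in E_{\mathbb{C}}.$$
As in the case of the S-transform, the characterization theorem for symbols, which was first proved in [23] for the Hida–Kubo–Takenaka space, is a significant consequence of white noise theory. The following result was proved in [5].

Theorem 2.1. A function $\Theta : E_{\mathbb{C}} \times E_{\mathbb{C}} \to \mathbb{C}$ is the symbol of a white noise operator $\Xi \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ if and only if $\Theta$ satisfies conditions (O1) and (O2). In that case
$$\|\Xi\phi\|_{-(p+q),-}^2 \le C\,\tilde G_\alpha^2(\|A^{-q}\|_{\mathrm{HS}}^2)\,\|\phi\|_{p+q,+}^2, \qquad \phi \in \mathcal{W},$$
where $q > 1/2$ is taken such that $\|A^{-q}\|_{\mathrm{HS}}^2 < R_\alpha$.

For regularity properties of a white noise operator we prove the following.

Theorem 2.2. Let $\alpha = \{\alpha(n)\}$ and $\beta = \{\beta(n)\}$ be two sequences satisfying conditions (A1)–(A3). Let $\Theta : E_{\mathbb{C}} \times E_{\mathbb{C}} \to \mathbb{C}$ be a function satisfying condition (O1). Assume that there exist $C \ge 0$, $p \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $K_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < R_\beta$ such that
$$|\Theta(\xi, \eta)|^2 \le C\,G_\alpha(|\xi|_p^2)\,G_\beta(Q(\eta, \eta)), \qquad \xi, \eta \in E_{\mathbb{C}}.$$
Then there exists a unique $\Xi \in \mathcal{L}(\mathcal{W}_\alpha, \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-))$ such that $\Theta = \hat\Xi$. Moreover, in that case, for any $q > 1/2$ with $\|A^{-q}\|_{\mathrm{HS}}^2 < R_\alpha$ we have
$$\|\Xi\phi\|_-^2 \le C\,\tilde G_\alpha(\|A^{-q}\|_{\mathrm{HS}}^2)\,\tilde G_\beta(\operatorname{Tr} Q)\,\|\phi\|_{p+q,+}^2.$$

Proof. Let $\eta \in E_{\mathbb{C}}$ be fixed and define a function $F_\eta : E_{\mathbb{C}} \to \mathbb{C}$ by $F_\eta(\xi) = \Theta(\xi, \eta)$, $\xi \in E_{\mathbb{C}}$. Then $F_\eta$ satisfies conditions (F1) and (F2). In fact, for any $\xi, \eta \in E_{\mathbb{C}}$,
$$|F_\eta(\xi)|^2 \le C\,G_\beta(Q(\eta, \eta))\,G_\alpha(|\xi|_p^2).$$
Therefore by Theorem 1.5 there exists a unique $\Phi_\eta \in \mathcal{W}_\alpha^*$ such that
$$S(\Phi_\eta)(\xi) = F_\eta(\xi) = \Theta(\xi, \eta).$$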
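Moreover, for each $q > 1/2$ with $\|A^{-q}\|_{\mathrm{HS}}^2 < R_\alpha$ we have
$$\|\Phi_\eta\|_{-(p+q),-}^2 \le C\,\tilde G_\alpha(\|A^{-q}\|_{\mathrm{HS}}^2)\,G_\beta(Q(\eta, \eta)).$$
Now, fix $\phi \in \mathcal{W}_\alpha$ and define a function $G_\phi : E_{\mathbb{C}} \to \mathbb{C}$ by
$$G_\phi(\eta) = \langle\!\langle \Phi_\eta, \phi \rangle\!\rangle, \qquad \eta \in E_{\mathbb{C}}.$$
Then we can easily show that $G_\phi$ satisfies (F1) and the assumption in Theorem 1.6. Therefore, by Theorem 1.6, there exists a unique $\Psi_\phi \in \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-)$ such that
$$S(\Psi_\phi)(\eta) = G_\phi(\eta) = \langle\!\langle \Phi_\eta, \phi \rangle\!\rangle, \qquad \eta \in E_{\mathbb{C}}.$$
Moreover, we have
$$\|\Psi_\phi\|_-^2 \le C\,\tilde G_\alpha(\|A^{-q}\|_{\mathrm{HS}}^2)\,\tilde G_\beta(\operatorname{Tr} Q)\,\|\phi\|_{p+q,+}^2. \tag{2.1}$$
Define a linear operator $\Xi : \mathcal{W}_\alpha \to \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-)$ by $\Xi\phi = \Psi_\phi$, $\phi \in \mathcal{W}_\alpha$. It then follows from (2.1) that $\Xi \in \mathcal{L}(\mathcal{W}_\alpha, \Gamma_{\beta^{-1}}(K_{\mathbb{C}}^-))$. That $\Theta$ is the symbol of $\Xi$ is clear, which completes the proof.

Remark 2.3. As in the case of the S-transform, Theorem 2.1 follows from Theorem 2.2. Moreover, for the Kondratiev–Streit space $(E)_\beta$ with $0 \le \beta < 1$ the function $G_\alpha(t)$ in the above theorems can be replaced with $\exp t^{1/(1-\beta)}$.

3. Normal-Ordered White Noise Equations

3.1. Wick product of white noise operators

Lemma 3.1. Assume that the weight sequence $\alpha = \{\alpha(n)\}$ satisfies (A1)–(A4). For two white noise operators $\Xi_1, \Xi_2 \in \mathcal{L}(\mathcal{W}_\alpha, \mathcal{W}_\alpha^*)$ there exists a unique operator $\Xi \in \mathcal{L}(\mathcal{W}_\alpha, \mathcal{W}_\alpha^*)$ such that
$$\hat\Xi(\xi, \eta) = \hat\Xi_1(\xi, \eta)\,\hat\Xi_2(\xi, \eta)\,e^{-\langle \xi, \eta \rangle}, \qquad \xi, \eta \in E_{\mathbb{C}}. \tag{3.1}$$

Proof. This is a simple application of Theorem 2.1. It is sufficient to show that the function $\Theta$ defined by
$$\Theta(\xi, \eta) = \hat\Xi_1(\xi, \eta)\,\hat\Xi_2(\xi, \eta)\,e^{-\langle \xi, \eta \rangle}, \qquad \xi, \eta \in E_{\mathbb{C}},$$
satisfies conditions (O1) and (O2). In fact, (O1) being obvious, we shall obtain an estimate of $\Theta$. By assumption there exist $C_j \ge 0$ and $p_j \ge 0$ such that
$$|\hat\Xi_j(\xi, \eta)|^2 \le C_j\,G_\alpha(|\xi|_{p_j}^2)\,G_\alpha(|\eta|_{p_j}^2), \qquad j = 1, 2.$$
Since $|\xi|_p \le \rho^q |\xi|_{p+q} \le |\xi|_{p+q}$, we have
$$|\hat\Xi_1(\xi, \eta)\,\hat\Xi_2(\xi, \eta)|^2 \le C\,G_\alpha^2(|\xi|_p^2)\,G_\alpha^2(|\eta|_p^2),$$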
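where $C = C_1 C_2$ and $p = \max\{p_1, p_2\}$. Moreover, in view of
$$|e^{-\langle \xi, \eta \rangle}|^2 \le e^{2|\xi|_0 |\eta|_0} \le e^{|\xi|_0^2 + |\eta|_0^2} \le e^{|\xi|_p^2 + |\eta|_p^2},$$
we obtain
$$|\Theta(\xi, \eta)|^2 \le C\,e^{|\xi|_p^2}\,G_\alpha^2(|\xi|_p^2)\,e^{|\eta|_p^2}\,G_\alpha^2(|\eta|_p^2).$$
By using Proposition 1.2 the above becomes
$$|\Theta(\xi, \eta)|^2 \le C\,G_\alpha(C_{3\alpha}(2C_{1\alpha} + 1)|\xi|_p^2)\,G_\alpha(C_{3\alpha}(2C_{1\alpha} + 1)|\eta|_p^2). \tag{3.2}$$
Choose $q \ge 0$ with $C_{3\alpha}(2C_{1\alpha} + 1)\rho^{2q} \le 1$ so that
$$C_{3\alpha}(2C_{1\alpha} + 1)|\xi|_p^2 \le C_{3\alpha}(2C_{1\alpha} + 1)\rho^{2q}|\xi|_{p+q}^2 \le |\xi|_{p+q}^2.$$
Then (3.2) becomes
$$|\Theta(\xi, \eta)|^2 \le C\,G_\alpha(|\xi|_{p+q}^2)\,G_\alpha(|\eta|_{p+q}^2),$$
which shows that $\Theta(\xi, \eta)$ satisfies condition (O2).

The operator $\Xi$ defined in (3.1) is called the Wick product of $\Xi_1$ and $\Xi_2$, and is denoted by $\Xi = \Xi_1 \diamond \Xi_2$. We note some simple properties:
$$(\Xi_1 \diamond \Xi_2) \diamond \Xi_3 = \Xi_1 \diamond (\Xi_2 \diamond \Xi_3), \qquad I \diamond \Xi = \Xi \diamond I = \Xi,$$
$$(\Xi_1 \diamond \Xi_2)^* = \Xi_2^* \diamond \Xi_1^*, \qquad \Xi_1 \diamond \Xi_2 = \Xi_2 \diamond \Xi_1.$$
Namely, equipped with the Wick product, $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ becomes a commutative $*$-algebra. As for the annihilation and creation operators we have
$$a_s \diamond a_t = a_s a_t, \qquad a_s^* \diamond a_t = a_s^* a_t, \qquad a_s \diamond a_t^* = a_t^* a_s, \qquad a_s^* \diamond a_t^* = a_s^* a_t^*. \tag{3.3}$$
More generally, it holds that
$$a_{s_1}^* \cdots a_{s_l}^*\, \Xi\, a_{t_1} \cdots a_{t_m} = \Xi \diamond (a_{s_1}^* \cdots a_{s_l}^* a_{t_1} \cdots a_{t_m}), \qquad \Xi \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*). \tag{3.4}$$
In fact, the Wick product is the unique bilinear map from $\mathcal{L}(\mathcal{W}, \mathcal{W}^*) \times \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ into $\mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ which is (i) separately continuous; (ii) associative; and (iii) satisfies (3.3).

3.2. Passage from QSDE to NOWNDE

A typical quantum stochastic differential equation of Itô type has the form:
$$dX(t) = (L_1\,dA_t + L_2\,dA_t^* + L_3\,d\Lambda_t + L_4\,dt)X(t), \qquad X(0) = I. \tag{3.5}$$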
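In fact, (3.5) is a short-hand notation for a stochastic integral equation of Itô type and is equivalent to
$$\langle\!\langle X(t)\phi_\xi, \phi_\eta \rangle\!\rangle = \langle\!\langle \phi_\xi, \phi_\eta \rangle\!\rangle + \int_0^t \langle\!\langle (\xi(s)L_1 X(s) + \eta(s)L_2 X(s) + \xi(s)\eta(s)L_3 X(s) + L_4 X(s))\phi_\xi, \phi_\eta \rangle\!\rangle\, ds, \tag{3.6}$$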
where $\xi, \eta$ run over a certain dense subset of $L^2(\mathbb{R})$. We here suppose that $\xi, \eta \in E_{\mathbb{C}}$. Since $a_t\phi_\xi = \xi(t)\phi_\xi$, the integral in (3.6) becomes
$$\int_0^t \langle\!\langle (L_1 X(s) a_s + a_s^* L_2 X(s) + a_s^* L_3 X(s) a_s + L_4 X(s))\phi_\xi, \phi_\eta \rangle\!\rangle\, ds.$$
Then by reversing the above argument and by using the smoothness of $s \mapsto a_s \in \mathcal{L}(\mathcal{W}, \mathcal{W})$ we come to the differential equation:
$$\frac{dX(t)}{dt} = L_1 X(t) a_t + L_2 a_t^* X(t) + L_3 a_t^* X(t) a_t + L_4 X(t), \qquad X(0) = I. \tag{3.7}$$
Note that (3.7) is meaningful whenever $t \mapsto X(t) \in \mathcal{L}(\mathcal{W}, \mathcal{W}^*)$ is differentiable. Moreover, the right hand side is expressed by means of the Wick product. In fact, in view of (3.4) we see that (3.7) is equivalent to the following ordinary differential equation for white noise operators:
$$\frac{dX(t)}{dt} = (L_1 a_t + L_2 a_t^* + L_3 a_t^* a_t + L_4) \diamond X(t), \qquad X(0) = I.$$
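The passage from (3.6) to (3.7) rests on the fact that exponential vectors are eigenvectors of the annihilation operator. The sketch below checks this in a one-mode toy model, which is an assumption made purely for illustration: the test function $\xi$ is replaced by a real number, a Fock vector by its sequence of coefficients, and the annihilation operator by the rule $(a\phi)_n = (n+1)\phi_{n+1}$, matching the factor $(n+m)!/n!$ of Sec. 2.2 with $m = 1$. All function names are hypothetical.

```python
from math import exp, factorial


def exponential_vector(xi, order):
    """One-mode analogue of phi_xi = (xi^n / n!), truncated at `order`."""
    return [xi ** n / factorial(n) for n in range(order + 1)]


def annihilate(phi):
    """One-mode annihilation operator: (a phi)_n = (n + 1) * phi_{n+1}."""
    return [(n + 1) * phi[n + 1] for n in range(len(phi) - 1)]


def pairing(phi, psi):
    """Bilinear form <<phi, psi>> = sum_n n! phi_n psi_n over the common range."""
    return sum(factorial(n) * a * b for n, (a, b) in enumerate(zip(phi, psi)))


xi, eta, order = 0.3, -0.7, 30
phi_xi = exponential_vector(xi, order)
phi_eta = exponential_vector(eta, order)

lowered = annihilate(phi_xi)               # a phi_xi
scaled = [xi * c for c in phi_xi[:-1]]     # xi * phi_xi on the same range
symbol = pairing(lowered, phi_eta)         # toy symbol of a at (xi, eta)
expected = xi * exp(xi * eta)              # xi * <<phi_xi, phi_eta>>
```

In this toy setting the symbol of the annihilation operator is $\xi e^{\xi\eta}$, which is the one-mode counterpart of replacing $a_s$ by the scalar $\xi(s)$ under the pairing in (3.6).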
More generally, it is natural to consider an equation of the form: dΞ = Lt Ξ , dt
Ξ(0) = I ,
(3.8)
where $\{L_t\}$ is a quantum stochastic process, i.e. $t \mapsto L_t \in \mathcal{L}(\mathcal{W},\mathcal{W}^*)$ is continuous. Equation (3.8) is generally called a normal-ordered white noise differential equation (NOWNDE). Consequently, a quantum stochastic differential equation of Itô type (3.5) is brought into a normal-ordered white noise differential equation with coefficients involving only lower powers of quantum white noises.

3.3. Unique existence of a solution as white noise operators

Recall that the space $\mathcal{L}(\mathcal{W},\mathcal{W}^*)$ is closed under the Wick product. Hence, a formal solution to (3.8) is given by the Wick exponential:
\[ \Xi_t = \operatorname{wexp}\left\{\int_0^t L_s\,ds\right\} = \sum_{n=0}^\infty \frac{1}{n!}\left(\int_0^t L_s\,ds\right)^{\diamond n} , \tag{3.9} \]
and our first task is to check its convergence in the sense of white noise operators. Several studies of the convergence of Wick exponentials can be found in [28], see also [27]. As a general result, we have the following

Theorem 3.2. Let $\alpha$ and $\omega$ be two weight sequences satisfying (A1)–(A4) and assume that their generating functions are related in such a way that
\[ G_\omega(t) = \exp\gamma\{G_\alpha(t) - 1\} , \tag{3.10} \]
where $\gamma > 0$ is a certain constant. If $t \mapsto L_t \in \mathcal{L}(\mathcal{W}_\alpha,\mathcal{W}_\alpha^*)$ is continuous, the solution is given by (3.9) and lies in $\mathcal{L}(\mathcal{W}_\omega,\mathcal{W}_\omega^*)$.
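To see what relation (3.10) encodes, here is a worked instance under two assumptions that are not restated in this section: the generating function is taken to be $G_\alpha(t) = \sum_{n\ge 0} \alpha(n)t^n/n!$, and we pick the trivial weight $\alpha(n) \equiv 1$ with $\gamma = 1$.

```latex
\begin{align*}
G_\alpha(t) &= \sum_{n=0}^\infty \frac{t^n}{n!} = e^t, \\
G_\omega(t) &= \exp\{G_\alpha(t)-1\} = \exp(e^t - 1)
             = \sum_{n=0}^\infty \frac{B_n}{n!}\,t^n ,
\qquad (B_0,B_1,B_2,B_3,\ldots) = (1,1,2,5,\ldots).
\end{align*}
```

Here $B_n$ are the classical Bell numbers, whose exponential generating function is $\exp(e^t-1)$. Under these assumptions the passage from $\alpha$ to $\omega$ in (3.10) turns the constant weight into the Bell-number weight.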
Proof. For each $t \ge 0$ we put
\[ M_t = \int_0^t L_s\,ds \]
and consider a $\mathbb{C}$-valued function $\Theta_t$ on $E_{\mathbb{C}} \times E_{\mathbb{C}}$ defined by
\[ \Theta_t(\xi,\eta) = e^{\langle\xi,\eta\rangle}\exp\{e^{-\langle\xi,\eta\rangle}\hat M_t(\xi,\eta)\}, \qquad \xi,\eta \in E_{\mathbb{C}} . \]
Then condition (O1) is obviously satisfied. Condition (O2) is verified by an argument similar to (but more tedious than) the one in the proof of Lemma 3.1, together with Proposition 1.2. It then follows from Theorem 2.1 that there exists $Y_t \in \mathcal{L}(\mathcal{W}_\omega,\mathcal{W}_\omega^*)$ such that $\hat Y_t = \Theta_t$. Finally, $Y_t = \operatorname{wexp} M_t$ is easily verified, and the assertion follows.

The relation (3.10) is motivated by the Bell numbers, see (1.7).

4. Regularity Properties in Terms of Weighted Fock Spaces

4.1. Main results

Given a continuous map $t \mapsto L_t \in \mathcal{L}(\mathcal{W},\mathcal{W}^*)$ defined on an interval (i.e. $\{L_t\}$ is a quantum stochastic process), we shall discuss regularity properties of a solution to the initial value problem
\[ \frac{d\Xi}{dt} = L_t \diamond \Xi, \qquad \Xi(0) = I . \tag{4.1} \]
Assume that $L_t$ is a finite or infinite sum of integral kernel operators:
\[ L_t = \sum_{l,m} \Xi_{l,m}(\lambda_{l,m}(t)) . \tag{4.2} \]
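(Every white noise operator $L \in \mathcal{L}(\mathcal{W}_\alpha,\mathcal{W}_\alpha^*)$ admits such an expansion if $\alpha$ satisfies (A1)–(A4), for the proof see [5].) In that case, the map $t \mapsto \lambda_{l,m}(t) \in (E_{\mathbb{C}}^{\otimes(l+m)})^*$ is continuous, and so is
\[ t \mapsto \kappa_{l,m}(t) \equiv \int_0^t \lambda_{l,m}(s)\,ds \in (E_{\mathbb{C}}^{\otimes(l+m)})^* . \]
Since the (formal) solution of (4.1) is given by
\[ \Xi_t = \operatorname{wexp}\left\{\sum_{l,m} \Xi_{l,m}(\kappa_{l,m}(t))\right\} , \]
see (3.9), regularity properties of $\Xi_t$ are described in terms of $\kappa_{l,m}(t)$ instead of $\lambda_{l,m}(t)$. In order to pick up $\kappa_{l,m}(t)$ from (4.1) the following formula is useful:
\[ e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds = \sum_{l,m} \langle\kappa_{l,m}(t), \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle , \tag{4.3} \]
which is straightforward from (4.2).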
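The step from (4.2) to (4.3) can be written out as follows. This is a formal sketch only: it assumes the standard symbol formula for an integral kernel operator, $\widehat{\Xi_{l,m}(\kappa)}(\xi,\eta) = \langle\kappa, \eta^{\otimes l}\otimes\xi^{\otimes m}\rangle e^{\langle\xi,\eta\rangle}$ (not restated in this section), and it interchanges the sum and the integral without justification.

```latex
\begin{align*}
\hat L_s(\xi,\eta)
 &= \sum_{l,m}\langle \lambda_{l,m}(s),\eta^{\otimes l}\otimes\xi^{\otimes m}\rangle\, e^{\langle\xi,\eta\rangle}, \\
e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds
 &= \sum_{l,m}\Big\langle \int_0^t\lambda_{l,m}(s)\,ds,\ \eta^{\otimes l}\otimes\xi^{\otimes m}\Big\rangle
  = \sum_{l,m}\langle\kappa_{l,m}(t),\eta^{\otimes l}\otimes\xi^{\otimes m}\rangle .
\end{align*}
```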
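Theorem 4.1. Assume that $L_t$ is given by
\[ L_t = \sum_{l=0}^{1}\sum_{m=0}^{n} \Xi_{l,m}(\lambda_{l,m}(t)), \qquad \kappa_{l,m}(t) \equiv \int_0^t \lambda_{l,m}(s)\,ds \in (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^* . \]
Then, the unique solution to (4.1) lies in $\mathcal{L}((E), \Gamma(\mathcal{K}_{\mathbb{C}}^-))$ if $0 \le n \le 1$; and in $\mathcal{L}((E)_\beta, \Gamma(\mathcal{K}_{\mathbb{C}}^-))$ with $\beta = 1 - 1/n$ if $n \ge 1$.

For $l, m \ge 0$ let $\mathcal{K}_{l,m}$ be the space of $\kappa \in (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^*$ for which there exist a complete orthonormal basis $\{\zeta_n\}$ of $\mathcal{K}_{\mathbb{C}}^+$, a non-negative sequence $\{a_n\} \in \ell^1$, and constant numbers $C \ge 0$, $p \ge 0$ such that
\[ |\langle\kappa \otimes_m \xi^{\otimes m}, \zeta_{i_1} \otimes \cdots \otimes \zeta_{i_l}\rangle| \le C a_{i_1} \cdots a_{i_l} |\xi|_p^m , \qquad \xi \in E_{\mathbb{C}} . \tag{4.4} \]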
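For example, a finite linear combination of elements of the form $f_1 \otimes \cdots \otimes f_l \otimes F$, where $f_1, \ldots, f_l \in \mathcal{K}_{\mathbb{C}}^-$, $F \in (E_{\mathbb{C}}^{\otimes m})^*$, belongs to $\mathcal{K}_{l,m}$.

Theorem 4.2. If $L_t$ is of the form:
\[ L_t = \sum_{l=0}^{k}\sum_{m=0}^{n} \Xi_{l,m}(\lambda_{l,m}(t)), \qquad \kappa_{l,m}(t) \equiv \int_0^t \lambda_{l,m}(s)\,ds \in \mathcal{K}_{l,m} , \]
then the unique solution to (4.1) lies in $\mathcal{L}((E)_\beta, \Gamma_{\tilde\beta^{-1}}(\mathcal{K}_{\mathbb{C}}^-))$, where $0 \le \beta < 1$ is chosen in such a way that $\max\{k+1, k+n, 2n\} = 2/(1-\beta)$.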
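As a quick illustration of how the index $\beta$ in Theorem 4.2 is determined, the condition $\max\{k+1, k+n, 2n\} = 2/(1-\beta)$ can be solved for a few sample values of $k$ and $n$ (chosen here purely for illustration):

```latex
\begin{align*}
k=n=1:&\quad \max\{2,2,2\}=2=\frac{2}{1-\beta}\ \Longrightarrow\ \beta=0,\\
k=n=2:&\quad \max\{3,4,4\}=4=\frac{2}{1-\beta}\ \Longrightarrow\ \beta=\tfrac12,\\
k=1,\ n=3:&\quad \max\{2,4,6\}=6=\frac{2}{1-\beta}\ \Longrightarrow\ \beta=\tfrac23 .
\end{align*}
```

In general $\beta = 1 - 2/d$ with $d = \max\{k+1, k+n, 2n\}$, so higher powers of the quantum white noises force a larger $\beta$.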
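The proofs of Theorems 4.1 and 4.2 are deferred to Secs. 5.1 and 5.2, respectively. For the case where the coefficient $\{L_t\}$ is an infinite series of integral kernel operators, we have

Theorem 4.3. Let $t \mapsto L_t \in \mathcal{L}(\mathcal{W}_{\mathrm{Bell}(k)}, \mathcal{W}_{\mathrm{Bell}(k)}^*)$ be a continuous map and assume that
\[ L_t = \sum_{l,m} \Xi_{l,m}(\lambda_{l,m}(t)), \qquad \kappa_{l,m}(t) \equiv \int_0^t \lambda_{l,m}(s)\,ds \in (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^* . \]
If there exist $C \ge 0$, $p \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < R_{\mathrm{Bell}(k+1)}$ such that
\[ |\hat L_t(\xi,\eta)e^{-\langle\xi,\eta\rangle}|^2 \le C\,G_{\mathrm{Bell}(k)}(|\xi|_p^2)\,G_{\mathrm{Bell}(k)}(Q(\eta,\eta)), \qquad \xi,\eta \in E_{\mathbb{C}} , \]
then the unique solution to (4.1) lies in $\mathcal{L}(\mathcal{W}_{\mathrm{Bell}(k+1)}, \Gamma_{\mathrm{Bell}(k+1)^{-1}}(\mathcal{K}_{\mathbb{C}}^-))$.

The proof is given in Sec. 5.3. Similar results can be obtained with our method in many other cases; for some relevant results see the next section.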
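4.2. Examples

Throughout this section formula (4.3) is taken into account. We start with a quantum stochastic differential equation of Hudson–Parthasarathy type (3.5).

Proposition 4.4. The initial value problem (4.1) with $L_t = L_1a_t^*a_t + L_2a_t + L_3a_t^* + L_4$ has a unique solution in $\mathcal{L}((E), \Gamma(H_{\mathbb{C}}))$.

Proof. For any $\xi, \eta \in E_{\mathbb{C}}$, we have
\begin{align*}
e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds
 &= L_1\int_0^t \xi(s)\eta(s)\,ds + L_2\int_0^t \xi(s)\,ds + L_3\int_0^t \eta(s)\,ds + L_4t \\
 &= L_1\langle M_{[0,t]}\xi, \eta\rangle + L_2\langle 1_{[0,t]}, \xi\rangle + L_3\langle 1_{[0,t]}, \eta\rangle + L_4t \\
 &\equiv \langle\kappa_{1,1}(t), \eta \otimes \xi\rangle + \langle\kappa_{0,1}(t), \xi\rangle + \langle\kappa_{1,0}(t), \eta\rangle + \kappa_{0,0}(t) ,
\end{align*}
where $M_{[0,t]}$ is the multiplication operator by the indicator function $1_{[0,t]}$ of $[0,t]$. Since $M_{[0,t]} \in \mathcal{L}(E_{\mathbb{C}}, H_{\mathbb{C}}) \cong H_{\mathbb{C}} \otimes E_{\mathbb{C}}^*$ by the kernel theorem, we have $\kappa_{1,1}(t) \in H_{\mathbb{C}} \otimes E_{\mathbb{C}}^*$. Obviously, $\kappa_{0,1}(t) \in E_{\mathbb{C}}^*$ and $\kappa_{1,0}(t) \in H_{\mathbb{C}}$. Therefore it follows from Theorem 4.1 that the unique solution lies in $\mathcal{L}((E), \Gamma(H_{\mathbb{C}}))$.

Proposition 4.5. Let $n \ge 1$. The initial value problem (4.1) with
\[ L_t = \sum_{m=0}^{n} L_{0m}a_t^m + \sum_{m=0}^{n} L_{1m}a_t^*a_t^m \]
has a unique solution in $\mathcal{L}((E)_\beta, \Gamma(H_{\mathbb{C}}))$, where $\beta = 1 - 1/n$.

Proof. For any $\xi, \eta \in E_{\mathbb{C}}$, we have
\begin{align*}
e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds
 &= \sum_{m=0}^{n} L_{0m}\int_0^t \xi(s)^m\,ds + \sum_{m=0}^{n} L_{1m}\int_0^t \eta(s)\xi(s)^m\,ds \\
 &= \sum_{m=0}^{n} L_{0m}\langle 1_{[0,t]}, \xi^m\rangle + \sum_{m=0}^{n} L_{1m}\langle M_{[0,t]}\xi^m, \eta\rangle .
\end{align*}
Then the assertion follows by an argument similar to that in Proposition 4.4.

Proposition 4.6. Let $n \ge 1$ and assume that $\{L_t\}$ involves higher order derivatives of quantum white noises in such a way that
\[ L_t = \sum_{m=0}^{n} L_{0m}\nabla_t^m a_t + \sum_{m=0}^{n} L_{1m}a_t^*\nabla_t^m a_t , \]
where $\nabla_t^m a_t = (-1)^m\,\Xi_{0,1}(\delta_t^{(m)})$. Then the unique solution lies in $\mathcal{L}((E), \Gamma(H_{\mathbb{C}}))$.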
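Proof. For any $\xi, \eta \in E_{\mathbb{C}}$, we have
\begin{align*}
e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds
 &= \sum_{m=0}^{n} L_{0m}\int_0^t \xi^{(m)}(s)\,ds + \sum_{m=0}^{n} L_{1m}\int_0^t \eta(s)\xi^{(m)}(s)\,ds \\
 &= \sum_{m=0}^{n} L_{0m}\langle 1_{[0,t]}, \xi^{(m)}\rangle + \sum_{m=0}^{n} L_{1m}\langle M_{[0,t]}\xi^{(m)}, \eta\rangle .
\end{align*}
In view of
\[ \frac{d^m}{ds^m} \in \mathcal{L}(E_{\mathbb{C}}, E_{\mathbb{C}}), \qquad M_{[0,t]} \circ \frac{d^m}{ds^m} \in \mathcal{L}(E_{\mathbb{C}}, H_{\mathbb{C}}) , \]
we may derive the assertion by an argument similar to that in Proposition 4.4.

We now consider the case where $\{L_t\}$ involves higher powers of quantum white noises in such a way that
\[ L_t = \sum_{l=0}^{k}\sum_{m=0}^{n} L_{l,m}\,a_t^{*l}a_t^m . \tag{4.5} \]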
Then, by a similar argument as above we see that Theorem 4.2 with $\mathcal{K}^- = E_{-p}$ for a suitably chosen $p > 0$ can be applied. Below we prove a sharper result by a direct application of Theorem 2.2.

For a finite sum of integral kernel operators $\Xi = \sum_{l,m} \Xi_{l,m}(\kappa_{l,m})$ we put
\[ \deg\Xi = \max\{l + m;\ \Xi_{l,m}(\kappa_{l,m}) \ne 0\} < \infty . \]
Such a white noise operator belongs to $\mathcal{L}(\mathcal{W}_\alpha, \mathcal{W}_\alpha^*)$ for all $\alpha$ satisfying (A1)–(A3).

Proposition 4.7. Let $\{\Xi_t\}$ be the unique solution to the initial value problem (4.1) with coefficient $\{L_t\}$ given as in (4.5). Choose $0 \le \beta < 1$ such that $\deg L_t \le 2/(1-\beta)$. Then for any $T > 0$ there exists $p > 1/2$ such that $\{\Xi_t\}_{0 \le t \le T} \subset \mathcal{L}((E)_\beta, \Gamma_{\tilde\beta^{-1}}(E_{-p}))$.

Proof. For any $\xi, \eta \in E_{\mathbb{C}}$, we have
\[ e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds = \sum_{l=0}^{k}\sum_{m=0}^{n} L_{l,m}\langle 1_{[0,t]}\xi^m, \eta^l\rangle . \]
Since the pointwise multiplication of $E = \mathcal{S}(\mathbb{R})$ is continuous, for each $l \ge 1$ there exist $C_l \ge 0$ and $p_l \ge 0$ such that $|\xi^l|_0 \le C_l|\xi|_{p_l}^l$. Therefore, for any $l, m \ge 1$ and $\epsilon' > 0$ we obtain
\[ |L_{l,m}\langle 1_{[0,t]}\xi^m, \eta^l\rangle| \le |L_{l,m}|C_lC_m|\eta|_{p_l}^l|\xi|_{p_m}^m \le |\xi|_{r_m}^{l+m} + \epsilon'|\eta|_{q_l}^{l+m} , \]
where we choose $r_m \ge p_m$ and $q_l \ge p_l$ in such a way that $|L_{l,m}|C_lC_m\rho^{(r_m-p_m)m} \le 1$ and $\rho^{(q_l-p_l)(l+m)} \le \epsilon'$. Similarly, for any $l \ge 1$ choose $q_l' \ge p_l$ such that
\[ |L_{l,0}\langle 1_{[0,t]}, \eta^l\rangle| \le |L_{l,0}|\sqrt{T}\,C_l|\eta|_{p_l}^l \le \epsilon'|\eta|_{q_l'}^l , \]
and for any $m \ge 1$ choose $r_m' \ge p_m$ such that
\[ |L_{0,m}\langle 1_{[0,t]}, \xi^m\rangle| \le |L_{0,m}|\sqrt{T}\,C_m|\xi|_{p_m}^m \le |\xi|_{r_m'}^m . \]
Hence for any $l, m \ge 0$ and $\epsilon' > 0$ there exist $C \ge 0$ and $r, q \ge 0$ such that
\[ |L_{l,m}\langle 1_{[0,t]}\xi^m, \eta^l\rangle| \le C + |\xi|_r^d + \epsilon'|\eta|_q^d , \]
where $d = \deg L_t$ and we used the fact that $x^\alpha \le 1 + x^\beta$ for $x \ge 0$ and $0 < \alpha \le \beta$. Now we define a sesquilinear form $Q$ on $E_{s+q,\mathbb{C}}$, $s > 1/2$, by
\[ Q(\xi,\eta) = (\epsilon'(k+1)(n+1))^{2/d}\langle A^q\xi, A^q\eta\rangle . \]
Then $Q$ is a non-negative sesquilinear form with $\operatorname{Tr} Q = (\epsilon'(k+1)(n+1))^{2/d}\|A^{-s}\|_{HS}^2$. For a given $\epsilon > 0$, by taking $\epsilon' > 0$ satisfying $(\epsilon'(k+1)(n+1))^{2/d}\|A^{-s}\|_{HS}^2 < \epsilon$ we obtain $\operatorname{Tr} Q < \epsilon$ and
\[ \left|e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds\right| \le \sum_{l=0}^{k}\sum_{m=0}^{n} |L_{l,m}\langle 1_{[0,t]}\xi^m, \eta^l\rangle| \le C' + |\xi|_{r+r'}^d + Q(\eta,\eta)^{d/2} , \]
where $C' \ge 0$ and $r' \ge 0$ with $(k+1)(n+1)\rho^{r'd/2} \le 1$. The assertion is now immediate from Theorem 2.2.

With a similar argument as in Theorem 4.2 we come to the following

Proposition 4.8. The unique solution of
\[ \frac{d\Xi}{dt} = (a_t^2 + a_t^{*2}) \diamond \Xi, \qquad \Xi(0) = I , \]
lies in $\mathcal{L}((E), \Gamma(E_{-p}))$ with $p \ge 2$.

The last example is the case of $\deg L_t = \infty$.

Proposition 4.9. Let $\{\Xi_t\}$ be the unique solution to the initial value problem (4.1) with $L_t = \operatorname{wexp}(a_t^2 + a_t^{*2})$. Then there exists $p > 2$ such that $\{\Xi_t\} \subset \mathcal{L}(\mathcal{W}_{\mathrm{Bell}(2)}, \Gamma_{\mathrm{Bell}(2)^{-1}}(E_{-p}))$.

The proof is straightforward with the help of Theorem 2.2. Moreover, it is not difficult to determine $p$ after tedious computation.
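For Proposition 4.8 the kernels can be read off from (4.3) in the same way as in the preceding proofs. The following is a sketch only, writing $\xi^2$ for the pointwise square as in the proof of Proposition 4.7 and assuming the usual identification $(E)_0 = (E)$:

```latex
\begin{align*}
e^{-\langle\xi,\eta\rangle}\int_0^t \hat L_s(\xi,\eta)\,ds
 &= \int_0^t \xi(s)^2\,ds + \int_0^t \eta(s)^2\,ds
  = \langle 1_{[0,t]},\xi^2\rangle + \langle 1_{[0,t]},\eta^2\rangle ,\\
\deg L_t &= 2 \le \frac{2}{1-\beta}\qquad\text{already for }\beta=0 .
\end{align*}
```

So the degree condition of Proposition 4.7 holds with $\beta = 0$, which is consistent with the domain $(E)$ appearing in the statement of Proposition 4.8.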
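4.3. A classical case

In [13] classical stochastic differential equations are extensively studied by means of the Wick product, and unique existence results are established in a distribution sense. To clarify the classical–quantum relationship we introduce a multiplication operator by a white noise distribution. For each $\Phi \in \mathcal{W}^*$ we define a multiplication operator $M_\Phi$ by
\[ \langle\langle M_\Phi\phi, \psi\rangle\rangle = \langle\langle\Phi, \phi\psi\rangle\rangle , \qquad \phi, \psi \in \mathcal{W} . \]
Since pointwise multiplication of test functions is a continuous bilinear map, we see easily that $M_\Phi \in \mathcal{L}(\mathcal{W},\mathcal{W}^*)$ and the map $\Phi \mapsto M_\Phi$ is a continuous injection. Recall that the Wick product of white noise distributions $\Phi, \Psi \in \mathcal{W}^*$ is defined by
\[ S(\Phi \diamond \Psi)(\xi) = S\Phi(\xi) \cdot S\Psi(\xi) , \qquad \xi \in E_{\mathbb{C}} . \]
It then follows by definition that
\[ M_{\Phi\diamond\Psi} = M_\Phi \diamond M_\Psi , \qquad \Phi, \Psi \in \mathcal{W}^* . \]
We now consider a normal-ordered white noise equation for white noise distributions:
\[ \frac{dX_t}{dt} = \Phi_t \diamond X_t , \qquad X_0 = \Psi , \tag{4.6} \]
where $\{\Phi_t\}$ is a classical stochastic process in a white noise sense, i.e. $t \mapsto \Phi_t \in \mathcal{W}^*$ is continuous, and $\Psi \in \mathcal{W}^*$. An equation of the form (4.6) is studied in [13]. Associated with (4.6) we consider a NOWNDE given by
\[ \frac{d\Xi_t}{dt} = M_{\Phi_t} \diamond \Xi_t , \qquad \Xi_0 = I . \tag{4.7} \]
Since
\[ \langle\langle (M_{\Phi_t} \diamond \Xi_t)\phi_0, \phi_\xi\rangle\rangle = \langle\langle M_{\Phi_t}\phi_0, \phi_\xi\rangle\rangle\langle\langle \Xi_t\phi_0, \phi_\xi\rangle\rangle = \langle\langle \Phi_t, \phi_\xi\rangle\rangle\langle\langle \Xi_t\phi_0, \phi_\xi\rangle\rangle = \langle\langle \Phi_t \diamond (\Xi_t\phi_0), \phi_\xi\rangle\rangle , \]
we have
\[ (M_{\Phi_t} \diamond \Xi_t)\phi_0 = \Phi_t \diamond (\Xi_t\phi_0) . \]
Therefore if a quantum stochastic process $\Xi_t$ satisfies (4.7), then the classical process
\[ X_t \equiv (\Xi_t\phi_0) \diamond \Psi \tag{4.8} \]
satisfies (4.6). Thus our regularity results can be applied to the classical case (4.6) too.

It is known [24] that the Fock expansion of a multiplication operator $M_\Phi$ is given by
\[ M_\Phi = \sum_{l,m=0}^{\infty} \frac{(l+m)!}{l!\,m!}\,\Xi_{l,m}(F_{l+m}) , \tag{4.9} \]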
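where $\Phi = (F_n) \in \mathcal{W}^*$. On the other hand, for each $\xi, \eta \in E_{\mathbb{C}}$
\[ \hat M_\Phi(\xi,\eta) = \langle\langle\Phi, \phi_\xi\phi_\eta\rangle\rangle = \langle\langle\Phi, \phi_{\xi+\eta}\rangle\rangle e^{\langle\xi,\eta\rangle} = S\Phi(\xi+\eta)e^{\langle\xi,\eta\rangle} . \tag{4.10} \]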
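The middle equality in (4.10) rests on a product rule for exponential vectors. As a sketch, assuming the usual Gaussian realization $\phi_\xi(x) = \exp(\langle x,\xi\rangle - \tfrac12\langle\xi,\xi\rangle)$ (which this section does not restate):

```latex
\begin{align*}
\phi_\xi(x)\phi_\eta(x)
 &= \exp\Big(\langle x,\xi+\eta\rangle - \tfrac12\langle\xi,\xi\rangle - \tfrac12\langle\eta,\eta\rangle\Big) \\
 &= \exp\Big(\langle x,\xi+\eta\rangle - \tfrac12\langle\xi+\eta,\xi+\eta\rangle\Big)\, e^{\langle\xi,\eta\rangle}
  = \phi_{\xi+\eta}(x)\, e^{\langle\xi,\eta\rangle} .
\end{align*}
```

The second line only uses $\tfrac12\langle\xi+\eta,\xi+\eta\rangle = \tfrac12\langle\xi,\xi\rangle + \langle\xi,\eta\rangle + \tfrac12\langle\eta,\eta\rangle$.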
With the help of (4.9) and (4.10) our conditions introduced in the previous sections are paraphrased in terms of $\Phi_t = (F_n(t))$, which is the coefficient of the classical equation (4.6). For instance, assume that the map $t \mapsto \Phi_t \in \mathcal{W}_{\mathrm{Bell}(k)}^*$ is continuous and for each $n \ge 0$
\[ G_n(t) \equiv \int_0^t F_n(s)\,ds \in (\mathcal{K}_{\mathbb{C}}^-)^{\otimes n} , \]
and that there exist $C \ge 0$, $p \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < R_{\mathrm{Bell}(k+1)}$ such that
\[ |S\Phi_t(\xi+\eta)|^2 \le C\,G_{\mathrm{Bell}(k)}(|\xi|_p^2)\,G_{\mathrm{Bell}(k)}(Q(\eta,\eta)) , \qquad \xi, \eta \in E_{\mathbb{C}} . \]
Applying Theorem 4.3, we see that $\Xi_t\phi_0$ belongs to $\Gamma_{\mathrm{Bell}(k+1)^{-1}}(\mathcal{K}_{\mathbb{C}}^-)$. Then the regularity of the unique solution to (4.6) given by (4.8) can be specified according to the initial value $\Psi$. In this way, a classical stochastic differential equation involving singular noises such as quadratic powers of white noise can be discussed within our framework, beyond the traditional Itô theory. The solution does not necessarily lie in the original Fock space but can be found in a weighted Fock space which is “singular” to the original Fock space.
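5. Proofs of Main Theorems

5.1. Proof of Theorem 4.1

Lemma 5.1. Let $m \ge 1$. For any $\kappa \in (E_{\mathbb{C}}^{\otimes m})^*$ there exist constant numbers $C \ge 0$ and $p \ge 0$ such that
\[ |\langle\kappa, \xi^{\otimes m}\rangle| \le C + |\xi|_p^{2m} , \qquad \xi \in E_{\mathbb{C}} . \]
Proof. Taking $p \ge 0$ with $|\kappa|_{-p} < \infty$ we see that
\[ |\langle\kappa, \xi^{\otimes m}\rangle| \le |\kappa|_{-p}|\xi|_p^m \le |\kappa|_{-p}^2 + |\xi|_p^{2m} , \]
as desired.

Lemma 5.2. Let $m \ge 0$. For any $\kappa \in (\mathcal{K}_{\mathbb{C}}^-) \otimes (E_{\mathbb{C}}^{\otimes m})^*$ and $\epsilon > 0$, there exist constant numbers $C \ge 0$, $p \ge 0$, and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < \epsilon$ such that
\[ |\langle\kappa, \eta \otimes \xi^{\otimes m}\rangle| \le C + |\xi|_p^{2m} + Q(\eta,\eta) , \qquad \xi, \eta \in E_{\mathbb{C}} . \]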
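Proof. We first consider the case of $\kappa \in \mathcal{K}_{\mathbb{C}}^-$. Since $\eta \mapsto \langle\kappa, \eta\rangle$ is a continuous linear form on $\mathcal{K}_{\mathbb{C}}^+$, there exists $\kappa_0 \in \mathcal{K}_{\mathbb{C}}^+$ such that
\[ \langle\kappa, \eta\rangle = \langle\eta, \kappa_0\rangle_+ , \qquad \eta \in \mathcal{K}_{\mathbb{C}}^+ , \]
where the right hand side is the (hermitian) inner product of $\mathcal{K}_{\mathbb{C}}^+$. Note that $|\kappa_0|_+ = |\kappa|_-$. With this $\kappa_0 \in \mathcal{K}_{\mathbb{C}}^+$ we define
\[ Q_0\eta = \langle\kappa, \eta\rangle\kappa_0 = \langle\eta, \kappa_0\rangle_+\kappa_0 , \qquad \eta \in \mathcal{K}_{\mathbb{C}}^+ . \]
Obviously, $Q_0$ is a trace class operator on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q_0 = |\kappa_0|_+^2 = |\kappa|_-^2$. Define a sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ by
\[ Q(\xi,\eta) = \epsilon_1\langle\xi, Q_0\eta\rangle_+ , \qquad \xi, \eta \in \mathcal{K}_{\mathbb{C}}^+ , \]
with $\epsilon_1 < \epsilon/|\kappa|_-^2$. Then $\operatorname{Tr} Q = \epsilon_1\operatorname{Tr} Q_0 < \epsilon$ and
\[ |\langle\kappa, \eta\rangle| \le \frac{1}{\epsilon_1} + \epsilon_1|\langle\kappa, \eta\rangle|^2 = \frac{1}{\epsilon_1} + Q(\eta,\eta) , \]
as desired.

We next consider the case of $\kappa \in \mathcal{K}_{\mathbb{C}}^- \otimes (E_{\mathbb{C}}^{\otimes m})^*$, where $m \ge 1$. Let $L \in \mathcal{L}(E_{\mathbb{C}}^{\otimes m}, \mathcal{K}_{\mathbb{C}}^-)$ be defined by
\[ \langle\kappa, \eta \otimes \xi^{\otimes m}\rangle = \langle L\xi^{\otimes m}, \eta\rangle , \qquad \xi, \eta \in E_{\mathbb{C}} . \]
Then, inserting $\xi = \sum_i \langle\xi, e_i\rangle e_i$, we obtain
\[ \langle\kappa, \eta \otimes \xi^{\otimes m}\rangle = \sum_{i_1,\ldots,i_m} \langle\xi, e_{i_1}\rangle \cdots \langle\xi, e_{i_m}\rangle\langle L(e_{i_1} \otimes \cdots \otimes e_{i_m}), \eta\rangle . \]
Hence, for any $p \ge 0$ and $\epsilon_1 > 0$ it follows from the Schwarz inequality that
\begin{align*}
|\langle\kappa, \eta \otimes \xi^{\otimes m}\rangle|
 &\le \Big(\sum_{i_1,\ldots,i_m} |\langle\xi, e_{i_1}\rangle|^2(2i_1+2)^{2p} \cdots |\langle\xi, e_{i_m}\rangle|^2(2i_m+2)^{2p}\Big)^{1/2} \\
 &\qquad \times \Big(\sum_{i_1,\ldots,i_m} |\langle L(e_{i_1} \otimes \cdots \otimes e_{i_m}), \eta\rangle|^2(2i_1+2)^{-2p} \cdots (2i_m+2)^{-2p}\Big)^{1/2} \\
 &\le \frac{1}{\epsilon_1}|\xi|_p^{2m} + \epsilon_1\sum_{i_1,\ldots,i_m} |\langle L(A^{-p}e_{i_1} \otimes \cdots \otimes A^{-p}e_{i_m}), \eta\rangle|^2 . \tag{5.1}
\end{align*}
Define a non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ by
\[ Q(\xi,\eta) = \epsilon_1\sum_{i_1,\ldots,i_m} \langle L(A^{-p}e_{i_1} \otimes \cdots \otimes A^{-p}e_{i_m}), \xi\rangle\,\overline{\langle L(A^{-p}e_{i_1} \otimes \cdots \otimes A^{-p}e_{i_m}), \eta\rangle} . \]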
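Let $\{\zeta_j\}$ be a complete orthonormal basis of $\mathcal{K}_{\mathbb{C}}^+$. Then,
\begin{align*}
\operatorname{Tr} Q = \sum_j Q(\zeta_j, \zeta_j)
 &= \epsilon_1\sum_j\sum_{i_1,\ldots,i_m} |\langle L(A^{-p}e_{i_1} \otimes \cdots \otimes A^{-p}e_{i_m}), \zeta_j\rangle|^2 \\
 &= \epsilon_1\sum_j\sum_{i_1,\ldots,i_m} |\langle\kappa, \zeta_j \otimes A^{-p}e_{i_1} \otimes \cdots \otimes A^{-p}e_{i_m}\rangle|^2 = \epsilon_1|\kappa|_{1,m;-,-p}^2 ,
\end{align*}
where note that $\{A^{-p}e_i\}$ is a complete orthonormal basis of $E_p$. By assumption one may choose $p \ge 0$ such that $|\kappa|_{1,m;-,-p} < \infty$. Moreover, we choose $\epsilon_1 > 0$ and $q \ge 0$ in such a way that $\epsilon_1 < \epsilon/|\kappa|_{1,m;-,-p}^2$ and $\rho^{2qm}/\epsilon_1 \le 1$. Then, $\operatorname{Tr} Q < \epsilon$ and (5.1) becomes
\[ |\langle\kappa, \eta \otimes \xi^{\otimes m}\rangle| \le |\xi|_{p+q}^{2m} + Q(\eta,\eta) , \]
which completes the proof.

Lemma 5.3. Let $p > r_0 + 1/2$, where $r_0 \ge 0$ is chosen in such a way that the inclusion $E_{r_0} \hookrightarrow \mathcal{K}^+$ is a contraction, and define
\[ Q(\xi,\eta) = \langle A^{-p}\xi, \overline{A^{-p}\eta}\rangle . \]
Then $Q$ is a sesquilinear form on $\mathcal{K}_{\mathbb{C}}^+$ with
\[ 0 \le \operatorname{Tr} Q \le \|A^{-(p-r_0)}\|_{HS}^2 . \]
Proof. We first note that such a choice of $r_0 \ge 0$ is always possible under the assumption posed in Sec. 1.4, see also the proof of Lemma 1.4. Then, in view of the chain of Hilbert spaces
\[ E_p \hookrightarrow E_{r_0} \hookrightarrow \mathcal{K}^+ \hookrightarrow H \hookrightarrow \mathcal{K}^- \hookrightarrow E_{-r_0} \hookrightarrow E_{-p} , \]
we see that the natural inclusion map $T : \mathcal{K}^+ \to E_{-r_0}$ is a contraction. Moreover, $S : E_p \to E_{r_0}$ is of Hilbert–Schmidt type with $\|S\|_{HS} = \|A^{-(p-r_0)}\|_{HS}$, hence so is $S^* : E_{-r_0} \to E_{-p}$. Let $\{\zeta_j\}$ be a complete orthonormal basis of $\mathcal{K}_{\mathbb{C}}^+$. Then
\[ \operatorname{Tr} Q = \sum_j Q(\zeta_j, \zeta_j) = \sum_j |A^{-p}\zeta_j|_0^2 = \sum_j |\zeta_j|_{-p}^2 . \tag{5.2} \]
With the help of
\[ |\zeta_j|_{-p}^2 = |S^*T\zeta_j|_{-p}^2 = \sum_i |\langle S^*T\zeta_j, e_i\rangle|^2(2i+2)^{-2p} = \sum_i |\langle\zeta_j, T^*Se_i\rangle|^2(2i+2)^{-2p} , \]
we see that (5.2) becomes
\[ \operatorname{Tr} Q = \sum_i\sum_j |\langle\zeta_j, T^*Se_i\rangle|^2(2i+2)^{-2p} = \sum_i |T^*Se_i|_-^2(2i+2)^{-2p} . \tag{5.3} \]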
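Since $T^*$ is also a contraction, (5.3) is bounded by
\[ \sum_i |Se_i|_{r_0}^2(2i+2)^{-2p} = \sum_i (2i+2)^{2r_0}(2i+2)^{-2p} = \|A^{-(p-r_0)}\|_{HS}^2 , \]
from which the assertion follows.

Proposition 5.4. Let $\Xi$ be a white noise operator of the form:
\[ \Xi = \sum_{l=0}^{1}\sum_{m=0}^{n} \Xi_{l,m}(\kappa_{l,m}) , \qquad \kappa_{l,m} \in (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^* . \]
Then the Wick exponential $\operatorname{wexp}\Xi$ belongs to $\mathcal{L}((E), \Gamma(\mathcal{K}_{\mathbb{C}}^-))$ for $0 \le n \le 1$; and belongs to $\mathcal{L}((E)_\beta, \Gamma(\mathcal{K}_{\mathbb{C}}^-))$, where $\beta = 1 - 1/n$ for $n \ge 2$.

Proof. Assume that $n \ge 2$. The proof for $0 \le n \le 1$ is similar. The symbol of $\operatorname{wexp}\Xi$ is given by
\[ \widehat{\operatorname{wexp}\Xi}(\xi,\eta) = e^{\langle\xi,\eta\rangle}\exp\{e^{-\langle\xi,\eta\rangle}\hat\Xi(\xi,\eta)\} = \exp\Big\{\langle\xi,\eta\rangle + \sum_{l=0}^{1}\sum_{m=0}^{n} \langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle\Big\} , \qquad \xi, \eta \in E_{\mathbb{C}} . \]
Taking Theorem 2.2 and Remark 2.3 into account, we need only to derive an estimate of the form:
\[ |\widehat{\operatorname{wexp}\Xi}(\xi,\eta)|^2 \le M\exp\{|\xi|_p^{2/(1-\beta)} + Q(\eta,\eta)\} , \qquad \xi, \eta \in E_{\mathbb{C}} , \tag{5.4} \]
where $M \ge 0$ and $p \ge 0$ are constant numbers, and $Q$ is a non-negative sesquilinear form on $\mathcal{K}_{\mathbb{C}}^+$ with an arbitrarily small $\operatorname{Tr} Q$.

Given $\epsilon > 0$ arbitrarily small, we take $0 < \epsilon_1 < \epsilon/(2n+4)$. By Lemma 5.1, for each $0 \le m \le n$ there exist $C_{0m} \ge 0$ and $p_m \ge 0$ such that
\[ |\langle\kappa_{0,m}, \xi^{\otimes m}\rangle| \le C_{0m} + |\xi|_{p_m}^{2m} , \qquad \xi \in E_{\mathbb{C}} . \]
Similarly, by Lemma 5.2, there exist $C_{1m} \ge 0$, $q_m \ge 0$ and a bounded, non-negative sesquilinear form $Q_m$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q_m < \epsilon_1$ such that
\[ |\langle\kappa_{1,m}, \eta \otimes \xi^{\otimes m}\rangle| \le C_{1m} + |\xi|_{q_m}^{2m} + Q_m(\eta,\eta) , \qquad \xi, \eta \in E_{\mathbb{C}} . \]
Choose $r_0 \ge 0$ such that the inclusion $E_{r_0} \hookrightarrow \mathcal{K}^+$ is a contraction and fix $q > r_0 + 1/2$ with $\|A^{-(q-r_0)}\|_{HS}^2 < \epsilon_1$. Define $p = \max\{q, p_m, q_m;\ 0 \le m \le n\}$. Note an obvious inequality
\[ |\xi|_r^{2m} \le |\xi|_p^{2m} \le 1 + |\xi|_p^{2n} , \qquad 0 \le r \le p , \quad 0 \le m \le n . \]
Then, replacing the constant numbers $C_{0m}$, $C_{1m}$ with a larger $C$, we obtain
\[ |\langle\kappa_{0,m}, \xi^{\otimes m}\rangle| \le C + |\xi|_p^{2n} , \qquad 0 \le m \le n , \tag{5.5} \]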
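and
\[ |\langle\kappa_{1,m}, \eta \otimes \xi^{\otimes m}\rangle| \le C + |\xi|_p^{2n} + Q_m(\eta,\eta) , \qquad 0 \le m \le n . \tag{5.6} \]
Moreover, note that
\[ |\langle\xi, \eta\rangle| \le |\xi|_p|\eta|_{-p} \le |\xi|_p^2 + |\eta|_{-p}^2 \le 1 + |\xi|_p^{2n} + |\eta|_{-p}^2 . \tag{5.7} \]
Thus, combining (5.5)–(5.7), we obtain
\[ \Big|\langle\xi,\eta\rangle + \sum_{l=0}^{1}\sum_{m=0}^{n} \langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle\Big| \le 1 + 2(n+1)C + (2n+3)|\xi|_p^{2n} + |\eta|_{-p}^2 + \sum_{m=0}^{n} Q_m(\eta,\eta) . \tag{5.8} \]
Now we define a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ by
\[ Q(\xi,\eta) = 2\sum_{m=0}^{n} Q_m(\xi,\eta) + 2\langle A^{-p}\xi, \overline{A^{-p}\eta}\rangle , \qquad \xi, \eta \in \mathcal{K}_{\mathbb{C}}^+ . \]
Then with some constant numbers $M_1 \ge 0$ and $r \ge 0$, (5.8) becomes
\[ 2\Big|\langle\xi,\eta\rangle + \sum_{l=0}^{1}\sum_{m=0}^{n} \langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle\Big| \le M_1 + |\xi|_{p+r}^{2n} + Q(\eta,\eta) , \tag{5.9} \]
where
\[ \operatorname{Tr} Q \le 2\sum_{m=0}^{n} \operatorname{Tr} Q_m + 2\|A^{-(q-r_0)}\|_{HS}^2 < 2(n+1)\epsilon_1 + 2\epsilon_1 < \epsilon , \]
which follows from Lemma 5.3. Then (5.4) follows immediately from (5.9).

Theorem 4.1 is a direct consequence of Proposition 5.4.

5.2. Proof of Theorem 4.2

Lemma 5.5. Let $l \ge 1$. For any $\kappa \in \mathcal{K}_{l,0} \subset (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l}$ and $\epsilon > 0$, there exist $C \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < \epsilon$ such that
\[ |\langle\kappa, \eta^{\otimes l}\rangle| \le C + Q(\eta,\eta)^{(l+1)/2} , \qquad \eta \in E_{\mathbb{C}} . \]
Proof. By definition we take a complete orthonormal basis $\{\zeta_n\}$ of $\mathcal{K}_{\mathbb{C}}^+$, a non-negative sequence $\{a_n\} \in \ell^1$, and a constant number $C \ge 0$ such that
\[ |\langle\kappa, \zeta_{i_1} \otimes \cdots \otimes \zeta_{i_l}\rangle| \le C a_{i_1} \cdots a_{i_l} . \]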
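Hence by using $\eta = \sum_n \langle\eta, \zeta_n\rangle_+\zeta_n$,
\begin{align*}
|\langle\kappa, \eta^{\otimes l}\rangle|
 &= \Big|\sum_{i_1,\ldots,i_l} \langle\kappa, \zeta_{i_1} \otimes \cdots \otimes \zeta_{i_l}\rangle\langle\eta, \zeta_{i_1}\rangle_+ \cdots \langle\eta, \zeta_{i_l}\rangle_+\Big| \\
 &\le C\sum_{i_1,\ldots,i_l} a_{i_1} \cdots a_{i_l}|\langle\eta, \zeta_{i_1}\rangle_+| \cdots |\langle\eta, \zeta_{i_l}\rangle_+| = C\Big(\sum_i a_i|\langle\eta, \zeta_i\rangle_+|\Big)^l \\
 &\le C\Big(\sum_i a_i\Big)^{l/2}\Big(\sum_i a_i|\langle\eta, \zeta_i\rangle_+|^2\Big)^{l/2} . \tag{5.10}
\end{align*}
Put $M = \sum_i a_i < \infty$ and suppose we are given an arbitrary $\epsilon_1 > 0$ with $\epsilon_1 < \epsilon/M$. With the help of an obvious inequality $b^lc^m \le b^{l+m} + c^{l+m}$ for $b, c \ge 0$, (5.10) becomes
\[ (\epsilon_1^{-l}C^2M^l)^{1/2}\Big(\epsilon_1\sum_i a_i|\langle\eta, \zeta_i\rangle_+|^2\Big)^{l/2} \le (\epsilon_1^{-l}C^2M^l)^{(l+1)/2} + \Big(\epsilon_1\sum_i a_i|\langle\eta, \zeta_i\rangle_+|^2\Big)^{(l+1)/2} . \tag{5.11} \]
Define a sesquilinear form by
\[ Q(\xi,\eta) = \epsilon_1\sum_i a_i\langle\xi, \zeta_i\rangle_+\langle\zeta_i, \eta\rangle_+ , \qquad \xi, \eta \in \mathcal{K}_{\mathbb{C}}^+ . \tag{5.12} \]
Obviously, $Q$ is a non-negative sesquilinear form on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q = \epsilon_1M < \epsilon$. Thus (5.10) and (5.11) become
\[ |\langle\kappa, \eta^{\otimes l}\rangle| \le (\epsilon_1^{-l}C^2M^l)^{(l+1)/2} + Q(\eta,\eta)^{(l+1)/2} , \]
which completes the proof.

Lemma 5.6. Let $l \ge 1$ and $m \ge 1$. For any $\kappa \in \mathcal{K}_{l,m} \subset (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^*$ and $\epsilon > 0$, there exist $p \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < \epsilon$ such that
\[ |\langle\kappa, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle| \le |\xi|_p^{l+m} + Q(\eta,\eta)^{(l+m)/2} , \qquad \xi, \eta \in E_{\mathbb{C}} . \tag{5.13} \]
Proof. With the property stated as in (4.4), the proof is a simple modification of the proof of Lemma 5.5.

Proposition 5.7. Assume that $k + n \ge 2$ and consider a white noise operator $\Xi$ which is a finite sum of integral kernel operators of the form:
\[ \Xi = \sum_{l=0}^{k}\sum_{m=0}^{n} \Xi_{l,m}(\kappa_{l,m}) , \qquad \kappa_{l,m} \in \mathcal{K}_{l,m} \subset (\mathcal{K}_{\mathbb{C}}^-)^{\otimes l} \otimes (E_{\mathbb{C}}^{\otimes m})^* . \]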
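Then the Wick exponential $\operatorname{wexp}\Xi$ belongs to $\mathcal{L}((E)_\beta, \Gamma_{\tilde\beta^{-1}}(\mathcal{K}_{\mathbb{C}}^-))$, where $0 \le \beta < 1$ is chosen in such a way that $\max\{k+1, k+n, 2n\} = 2/(1-\beta)$.

Proof. The symbol of $\operatorname{wexp}\Xi$ is given by
\[ \widehat{\operatorname{wexp}\Xi}(\xi,\eta) = \exp\Big(\langle\xi,\eta\rangle + \sum_{l=0}^{k}\sum_{m=0}^{n} \langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle\Big) , \tag{5.14} \]
to which we shall apply Theorem 2.2. Suppose $\epsilon > 0$ and $\epsilon_1 > 0$ are given arbitrarily. By Lemma 5.1, for each $m \ge 0$ there exist $C_{0,m} \ge 0$ and $p_{0,m} \ge 0$ such that
\[ |\langle\kappa_{0,m}, \xi^{\otimes m}\rangle| \le C_{0,m} + |\xi|_{p_{0,m}}^{2m} . \tag{5.15} \]
By Lemma 5.5, for each $l \ge 1$ there exist $C_l \ge 0$ and a bounded, non-negative sesquilinear form $Q_{l,0}$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q_{l,0} < \epsilon_1$ such that
\[ |\langle\kappa_{l,0}, \eta^{\otimes l}\rangle| \le C_l + Q_{l,0}(\eta,\eta)^{(l+1)/2} . \tag{5.16} \]
By Lemma 5.6, for each pair $l \ge 1$ and $m \ge 1$ there exist $p_{l,m} \ge 0$ and a bounded, non-negative sesquilinear form $Q_{l,m}$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q_{l,m} < \epsilon_1$ such that
\[ |\langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle| \le |\xi|_{p_{l,m}}^{l+m} + Q_{l,m}(\eta,\eta)^{(l+m)/2} , \qquad \xi, \eta \in E_{\mathbb{C}} . \tag{5.17} \]
Fix $q > r_0 + 1/2$ with $\|A^{-(q-r_0)}\|_{HS}^2 < \epsilon_1$, where $r_0 \ge 0$ is chosen in such a way that $E_{r_0} \hookrightarrow \mathcal{K}^+$ is a contraction. We put $p = \max\{q, p_{l,m};\ 0 \le l \le k,\ 0 \le m \le n\}$ and define
\[ Q(\xi,\eta) = \sum_{l=1}^{k}\sum_{m=0}^{n} Q_{l,m}(\xi,\eta) + \langle A^{-p}\xi, \overline{A^{-p}\eta}\rangle , \qquad \xi, \eta \in E_{\mathbb{C}} . \tag{5.18} \]
Then $Q$ becomes a sesquilinear form on $\mathcal{K}_{\mathbb{C}}^+$ with
\[ \operatorname{Tr} Q \le \sum_{l=1}^{k}\sum_{m=0}^{n} \operatorname{Tr} Q_{l,m} + \|A^{-(p-r_0)}\|_{HS}^2 \le k(n+1)\epsilon_1 + \epsilon_1 . \]
Thus, given $\epsilon > 0$ we may choose $0 < \epsilon_1 < \epsilon/(kn+k+1)$ to have $\operatorname{Tr} Q < \epsilon$.

Now we estimate (5.14). Put $d_1 = \max\{2n, k+n\}$. By using an obvious inequality $x^\alpha \le 1 + x^\beta$ for $x \ge 0$ and $0 < \alpha \le \beta$, we sum up (5.15)–(5.17) to obtain
\begin{align*}
|\langle\xi,\eta\rangle| + \sum_{l=0}^{k}\sum_{m=0}^{n} |\langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle|
 &\le M_1 + (1+n+kn)|\xi|_p^{d_1} + \sum_{l=1}^{k} Q_{l,0}(\eta,\eta)^{(l+1)/2} \\
 &\qquad + \sum_{l=1}^{k}\sum_{m=1}^{n} Q_{l,m}(\eta,\eta)^{(l+m)/2} + |\eta|_{-p}^2 ,
\end{align*}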
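where $M_1$ is a constant. Note an inequality: for $x_1, \ldots, x_n \ge 0$ and $\alpha_1, \ldots, \alpha_n > 0$ with $\alpha = \max\{\alpha_1, \ldots, \alpha_n\} \ge 1$ it holds that
\[ x_1^{\alpha_1} + \cdots + x_n^{\alpha_n} \le (1 + x_1^\alpha) + \cdots + (1 + x_n^\alpha) \le n + (x_1 + \cdots + x_n)^\alpha . \]
Then with $d_2 = \max\{k+1, k+n\}$ and (5.18) we come to
\begin{align*}
\sum_{l=1}^{k} Q_{l,0}(\eta,\eta)^{(l+1)/2} + \sum_{l=1}^{k}\sum_{m=1}^{n} Q_{l,m}(\eta,\eta)^{(l+m)/2} + |\eta|_{-p}^2
 &\le (k+kn+1) + \Big(\sum_{l=1}^{k}\sum_{m=0}^{n} Q_{l,m}(\eta,\eta) + |\eta|_{-p}^2\Big)^{d_2/2} \\
 &= (k+kn+1) + Q(\eta,\eta)^{d_2/2} .
\end{align*}
Consequently, with some constant numbers $M \ge 0$ and $r \ge 0$ we have
\[ 2\Big|\langle\xi,\eta\rangle + \sum_{l=0}^{k}\sum_{m=0}^{n} \langle\kappa_{l,m}, \eta^{\otimes l} \otimes \xi^{\otimes m}\rangle\Big| \le M + |\xi|_{p+r}^{d_1} + Q(\eta,\eta)^{d_2/2} . \]
Finally, taking $d = \max\{d_1, d_2\} = \max\{k+1, k+n, 2n\}$, we have
\[ |\widehat{\operatorname{wexp}\Xi}(\xi,\eta)|^2 \le C\exp\{|\xi|_{p+r}^d + Q(\eta,\eta)^{d/2}\} , \]
and the assertion follows by Theorem 2.2 and Remark 2.3.

Theorem 4.2 is direct from Proposition 5.7.

5.3. Proof of Theorem 4.3

It is sufficient to prove the following

Proposition 5.8. Let $\Xi \in \mathcal{L}(\mathcal{W}_{\mathrm{Bell}(k)}, \mathcal{W}_{\mathrm{Bell}(k)}^*)$ and assume that there exist $C \ge 0$, $p \ge 0$ and a bounded, non-negative sesquilinear form $Q$ on $\mathcal{K}_{\mathbb{C}}^+$ with $\operatorname{Tr} Q < R_{\mathrm{Bell}(k+1)}$ such that
\[ |\hat\Xi(\xi,\eta)e^{-\langle\xi,\eta\rangle}|^2 \le C\,G_{\mathrm{Bell}(k)}(|\xi|_p^2)\,G_{\mathrm{Bell}(k)}(Q(\eta,\eta)) , \qquad \xi, \eta \in E_{\mathbb{C}} . \tag{5.19} \]
Then the Wick exponential $\operatorname{wexp}\Xi$ belongs to $\mathcal{L}(\mathcal{W}_{\mathrm{Bell}(k+1)}, \Gamma_{\mathrm{Bell}(k+1)^{-1}}(\mathcal{K}_{\mathbb{C}}^-))$.

Proof. For simplicity we write $G_k = G_{\mathrm{Bell}(k)}$. The symbol of $\operatorname{wexp}\Xi$ is given by
\[ \widehat{\operatorname{wexp}\Xi}(\xi,\eta) = e^{\langle\xi,\eta\rangle}\exp\{\hat\Xi(\xi,\eta)e^{-\langle\xi,\eta\rangle}\} . \]
By the Schwarz inequality and (5.19) we see that for any $q \ge 0$,
\begin{align*}
|\widehat{\operatorname{wexp}\Xi}(\xi,\eta)|^2
 &\le e^{2|\langle\xi,\eta\rangle|}\exp\{2\sqrt{C}\,G_k^{1/2}(|\xi|_p^2)\,G_k^{1/2}(Q(\eta,\eta))\} \\
 &\le e^{|\xi|_q^2}e^{|\eta|_{-q}^2}\exp\{C\,G_k(|\xi|_p^2)\}\exp\{G_k(Q(\eta,\eta))\} . \tag{5.20}
\end{align*}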
We first consider $C\,G_k(|\xi|_p^2) + |\xi|_q^2$. Without loss of generality we may assume that $C \ge \gamma_k$, where $\gamma_k$ is defined in (1.7). In view of Proposition 1.2 we obtain
\begin{align*}
C\,G_k(|\xi|_p^2) + |\xi|_q^2 &= C\{G_k(|\xi|_p^2) - 1\} + C + |\xi|_q^2 \\
 &\le \gamma_k\Big\{G_k\Big(\frac{C}{\gamma_k}|\xi|_p^2\Big) - 1\Big\} + C + |\xi|_q^2 \\
 &\le \gamma_kG_k(|\xi|_r^2) - \gamma_k + C + |\xi|_r^2 , \tag{5.21}
\end{align*}
where $r \ge \max\{p, q\}$ is suitably chosen. By means of an obvious inequality $s + G_k(t) \le G_k(s+t)$ for $s, t \ge 0$, (5.21) becomes
\[ C\,G_k(|\xi|_p^2) + |\xi|_q^2 \le \gamma_kG_k\Big(\Big(1 + \frac{1}{\gamma_k}\Big)|\xi|_r^2\Big) - \gamma_k + C \le \gamma_k\{G_k(|\xi|_s^2) - 1\} + C , \]
where $s \ge r$ is chosen in such a way that $(1 + 1/\gamma_k)\rho^{2(s-r)} \le 1$. Then,
\[ \exp\{C\,G_k(|\xi|_p^2) + |\xi|_q^2\} \le e^C G_{k+1}(|\xi|_s^2) . \]
On the other hand, we obtain easily
\[ G_k(Q(\eta,\eta)) + |\eta|_{-q}^2 \le \gamma_k\{G_k(Q(\eta,\eta) + |\eta|_{-q}^2) - 1\} + \gamma_k \tag{5.22} \]
and hence,
\[ \exp\{G_k(Q(\eta,\eta)) + |\eta|_{-q}^2\} \le e^{\gamma_k}G_{k+1}(Q(\eta,\eta) + |\eta|_{-q}^2) . \tag{5.23} \]
By combining (5.22) and (5.23), (5.20) is brought into
\[ |\widehat{\operatorname{wexp}\Xi}(\xi,\eta)|^2 \le C_4\,G_{k+1}(|\xi|_s^2)\,G_{k+1}(Q(\eta,\eta) + |\eta|_{-q}^2) , \tag{5.24} \]
where $C_4 > 0$ is a constant. With a bounded non-negative sesquilinear form $Q_1$ on $\mathcal{K}_{\mathbb{C}}^+$ defined by
\[ Q_1(\xi,\eta) = Q(\xi,\eta) + \langle A^{-q}\xi, \overline{A^{-q}\eta}\rangle , \qquad \xi, \eta \in \mathcal{K}_{\mathbb{C}}^+ , \]
(5.24) becomes
\[ |\widehat{\operatorname{wexp}\Xi}(\xi,\eta)|^2 \le C_4\,G_{k+1}(|\xi|_s^2)\,G_{k+1}(Q_1(\eta,\eta)) . \]
We choose a sufficiently large $q \ge 0$ in such a way that
\[ \operatorname{Tr} Q_1 \le \operatorname{Tr} Q + \|A^{-(q-r_0)}\|_{HS}^2 < R_{\mathrm{Bell}(k+1)} , \]
where $r_0 \ge 0$ is chosen so that $E_{r_0} \hookrightarrow \mathcal{K}^+$ is a contraction. (This is guaranteed by Lemma 5.3.) The assertion then follows by Theorem 2.2.

Acknowledgments

This joint work was accomplished during the second named author’s stay at Nagoya University in 1999. He is most grateful to the Graduate School of Mathematics, in particular to Professor Nobuaki Obata, for their warm hospitality. The authors thank the referee for useful comments that improved this paper.
References

[1] L. Accardi, “Noise and dissipation in quantum theory”, Rev. Math. Phys. 2 (1990) 127–176.
[2] L. Accardi, Y.-G. Lu and I. Volovich, “Non-linear extensions of classical and quantum stochastic calculus and essentially infinite dimensional analysis”, pp. 1–33 in Probability Towards 2000, eds. L. Accardi and C. C. Heyde, Lect. Notes in Stat. 128, Springer-Verlag, 1998.
[3] N. Asai, I. Kubo and H.-H. Kuo, “General characterization theorems and intrinsic topologies in white noise analysis”, Hiroshima Math. J. 31 (2001) 299–330.
[4] A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors and Gel’fand Triples, Lect. Notes in Phys. 348, Springer-Verlag, 1989.
[5] D. M. Chung, U. C. Ji and N. Obata, “Higher powers of quantum white noises in terms of integral kernel operators”, Infinite Dimen. Anal. Quantum Prob. 1 (1998) 533–559.
[6] D. M. Chung, U. C. Ji and N. Obata, “Normal-ordered white noise differential equations I: Existence of solutions as Fock space operators”, pp. 115–135 in Trends in Contemporary Infinite Dimensional Analysis and Quantum Probability, eds. L. Accardi et al., Istituto Italiano di Cultura, Kyoto, 2000.
[7] D. M. Chung, U. C. Ji and N. Obata, “Normal-ordered white noise differential equations II: Regularity properties of solutions”, pp. 157–174 in Prob. Theory and Math. Stat., eds. B. Grigelionis et al., VSP/TEV Ltd., 1999.
[8] W. G. Cochran, H.-H. Kuo and A. Sengupta, “A new class of white noise generalized functions”, Infinite Dimen. Anal. Quantum Prob. 1 (1998) 43–67.
[9] C. W. Gardiner, Quantum Noise, Springer-Verlag, 1991.
[10] R. Gannoun, R. Hachaichi, H. Ouerdiane and A. Rezgui, “Un théorème de dualité entre espaces de fonctions holomorphes à croissance exponentielle”, J. Funct. Anal. 171 (2000) 1–14.
[11] T. Hida, Analysis of Brownian Functionals, Carleton Math. Lect. Notes 13, Carleton University, Ottawa, 1975.
[12] T. Hida, H.-H. Kuo, J. Potthoff and L. Streit, White Noise: An Infinite Dimensional Calculus, Kluwer Academic Publishers, 1993.
[13] H. Holden, B. Øksendal, J. Ubøe and T. Zhang, Stochastic Partial Differential Equations, Birkhäuser, 1996.
[14] Z.-Y. Huang, “Quantum white noises — White noise approach to quantum stochastic calculus”, Nagoya Math. J. 129 (1993) 23–42.
[15] Z.-Y. Huang and S.-L. Luo, “Wick calculus of generalized operators and its applications to quantum stochastic calculus”, Infinite Dimen. Anal. Quantum Prob. 1 (1998) 455–466.
[16] R. L. Hudson and K. R. Parthasarathy, “Quantum Itô’s formula and stochastic evolutions”, Commun. Math. Phys. 93 (1984) 301–323.
[17] Yu. G. Kondratiev, P. Leukert and L. Streit, “Wick calculus in Gaussian analysis”, Acta Appl. Math. 44 (1996) 269–294.
[18] Yu. G. Kondratiev and L. Streit, “Spaces of white noise distributions: Constructions, descriptions, applications I”, Rep. Math. Phys. 33 (1993) 341–366.
[19] I. Kubo, H.-H. Kuo and A. Sengupta, “White noise analysis on a new space of Hida distributions”, Infinite Dimen. Anal. Quantum Prob. 2 (1999) 315–335.
[20] I. Kubo and S. Takenaka, “Calculus on Gaussian white noise I”, Proc. Japan Acad. 56A (1980) 376–380.
[21] H.-H. Kuo, White Noise Distribution Theory, CRC Press, 1996.
[22] P.-A. Meyer, Quantum Probability for Probabilists, Lect. Notes in Math. 1538, Springer-Verlag, 1993.
[23] N. Obata, “An analytic characterization of symbols of operators on white noise functionals”, J. Math. Soc. Japan 45 (1993) 421–445.
[24] N. Obata, White Noise Calculus and Fock Space, Lect. Notes in Math. 1577, Springer-Verlag, 1994.
[25] N. Obata, “Generalized quantum stochastic processes on Fock space”, Publ. RIMS 31 (1995) 667–702.
[26] N. Obata, “Integral kernel operators on Fock space — Generalizations and applications to quantum dynamics”, Acta Appl. Math. 47 (1997) 49–77.
[27] N. Obata, “Quantum stochastic differential equations in terms of quantum white noise”, Nonlinear Anal. 30 (1997) 279–290.
[28] N. Obata, “Wick product of white noise operators and quantum stochastic differential equations”, J. Math. Soc. Japan 51 (1999) 613–641.
[29] K. R. Parthasarathy, “Quantum Ito’s formula”, Rev. Math. Phys. 1 (1989) 89–112.
[30] K. R. Parthasarathy, An Introduction to Quantum Stochastic Calculus, Birkhäuser, 1992.
[31] J. Potthoff and L. Streit, “A characterization of Hida distributions”, J. Funct. Anal. 101 (1991) 212–229.
March 19, 2002 12:4 WSPC/148-RMP
00119
Reviews in Mathematical Physics, Vol. 14, No. 3 (2002) 273–302
© World Scientific Publishing Company
ANDERSON LOCALIZATION FOR A MULTIDIMENSIONAL MODEL INCLUDING LONG RANGE POTENTIALS AND DISPLACEMENTS

HERIBERT ZENK
Institut für Theoretische Physik der Universität Wien
Boltzmanngasse 5, A-1090; Austria

Received 12 March 2001
Revised 19 November 2001

We give a short summary on how to combine and extend results of Combes and Hislop [2] (short range Anderson model with additional displacements), Kirsch, Stollmann and Stolz [13] and [14] (long range Anderson model without displacements) to get localization in an energy interval above the infimum of the almost sure spectrum for a continuous multidimensional Anderson model including long range potentials and displacements.

Keywords: Random Schrödinger Operators; Anderson Localization; Long Range Potentials, Random Displacements.

Mathematics Subject Classification 2000: 81Q10, 35J10, 47A10
1. Introduction

We treat the multidimensional Anderson model including displacements as in [2], using a long range single site potential $\wp$ as in [14]. To be more precise, we consider a continuous function $\wp : \mathbb{R}^d \to \mathbb{R}_+$ and two independent families $(\lambda_j : (\Omega', P') \to \mathbb{R})_{j\in\mathbb{Z}^d}$ and $(\xi_j : (\Omega'', P'') \to \mathbb{R}^d)_{j\in\mathbb{Z}^d}$ of independent identically distributed random variables satisfying the following conditions:

(1) The joint probability distribution
\[ F = F_{P'\circ\lambda_j^{-1}} : \mathbb{R} \to [0,1] , \qquad t \mapsto P'\circ\lambda_j^{-1}(]-\infty, t[) = \int_{-\infty}^{t} g(x)\,dx \]
satisfies $F(t) > 0$ for $t > 0$, $F$ has a density $g \in L^\infty(\mathbb{R})$ with respect to Lebesgue measure $\mu$, and there is a finite interval $[0, g_0]$ containing the support of $g$.
(2) There is an $R \in\, ]0, \tfrac12[$ such that $|\xi_j(\omega'')| < R$ for every $j \in \mathbb{Z}^d$ and $\omega'' \in \Omega''$. Here $|\cdot|$ denotes the maximum norm in $\mathbb{R}^d$.
(3) We further assume that there is a $c_0 > 0$ such that $c_0 1^0_{1+2R} \le \wp$ holds, where $1^x_l$ denotes the characteristic function of the box $\Lambda(x, l)$ with sidelength $l > 0$
March 19, 2002 12:4 WSPC/148-RMP
274
00119
H. Zenk
centered at a point x ∈ Rd . We consider two conditions for ℘(x), as |x| tends to ∞: (a) the “long range case”: There is a c1 > 0 and ς > 2d such that ℘(x) ≤ c1 hxi−ς := c1 (1 + |x|2 )−ς/2 holds, for every x ∈ Rd . As a special case we consider (b) the “short range case”: ℘ has compact support. Using these properties of ℘, (λj )j∈Zd and (ξj )j∈Zd , we define a bounded potential X λj (ω 0 )℘(x − j − ξj (ω 00 )) V (ω 0 , ω 00 , x) := j∈Zd
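and arrive at the selfadjoint random Schrödinger operator $H(\omega', \omega'') := -\Delta + V(\omega', \omega'')$ in $L^2(\mathbb{R}^d)$ on the domain $D(H(\omega', \omega'')) = H^2(\mathbb{R}^d)$. Thus the operator $H$ is measurable (a proof can be found in [4, Proposition V.3.1]) in the sense that $\omega \mapsto \langle \varphi, (H(\omega) - z)^{-1}\Psi\rangle$ is measurable for all $\varphi, \Psi \in L^2(\mathbb{R}^d)$ and all $z \in \mathbb{C}\setminus\mathbb{R}$. If we denote the path of $(\lambda_j)_{j\in\mathbb{Z}^d}$ as
$$\lambda_\bullet : \Omega' \to \mathbb{R}^{\mathbb{Z}^d}, \qquad \omega' \mapsto \lambda_\bullet(\omega') = (\lambda_j(\omega'))_{j\in\mathbb{Z}^d},$$
we get an equivalent stochastic process
$$\left(\mathrm{pr}^{(1)}_j : \big(\mathbb{R}^{\mathbb{Z}^d}, \mathcal{B}(\mathbb{R}^{\mathbb{Z}^d}), P'\circ\lambda_\bullet^{-1}\big) \to \mathbb{R},\ (x_k)_{k\in\mathbb{Z}^d} \mapsto x_j\right)_{j\in\mathbb{Z}^d}$$
for the family $(\lambda_j)_{j\in\mathbb{Z}^d}$ and another process $\big(\mathrm{pr}^{(2)}_j : ((\mathbb{R}^d)^{\mathbb{Z}^d}, P''\circ\xi_\bullet^{-1}) \to \mathbb{R}^d\big)_{j\in\mathbb{Z}^d}$ equivalent to $(\xi_j)_{j\in\mathbb{Z}^d}$. Using independence and identical distribution one easily verifies
$$P'\circ\lambda_\bullet^{-1} = P_0^{\otimes\mathbb{Z}^d} := (P'\circ\lambda_0^{-1})^{\otimes\mathbb{Z}^d} \quad\text{and}\quad P''\circ\xi_\bullet^{-1} = P_1^{\otimes\mathbb{Z}^d} := (P''\circ\xi_0^{-1})^{\otimes\mathbb{Z}^d}$$
for all cylinder sets in $\mathcal{B}(\mathbb{R}^{\mathbb{Z}^d})$ respectively $\mathcal{B}((\mathbb{R}^d)^{\mathbb{Z}^d})$, hence for all elements of $\mathcal{B}(\mathbb{R}^{\mathbb{Z}^d})$ respectively $\mathcal{B}((\mathbb{R}^d)^{\mathbb{Z}^d})$. As $(\lambda_j)_{j\in\mathbb{Z}^d}$ and $(\xi_j)_{j\in\mathbb{Z}^d}$ are independent families, we construct a new probability space
$$(\Omega, \mathcal{B}(\Omega), \mathbb{P}) = \big(\mathbb{R}^{\mathbb{Z}^d} \times (\mathbb{R}^d)^{\mathbb{Z}^d},\ \mathcal{B}(\Omega),\ P_0^{\otimes\mathbb{Z}^d} \otimes P_1^{\otimes\mathbb{Z}^d}\big),$$
where $\mathcal{B}(\Omega)$ is the completion of the product-σ-algebra on $\Omega$ with respect to the measure $\mathbb{P}$ containing all $\mathbb{P}$-zerosets, and an equivalent stochastic process (the canonical representant)
$$\left(\mathrm{pr}_j : \Omega \to \mathbb{R}\times\mathbb{R}^d,\ \omega = (\omega_k)_{k\in\mathbb{Z}^d} \mapsto \omega_j\right)_{j\in\mathbb{Z}^d}$$
for $((\lambda_j, \xi_j) : \Omega'\times\Omega'' \to \mathbb{R}\times\mathbb{R}^d)_{j\in\mathbb{Z}^d}$, see e.g. [12] for more details about equivalent stochastic processes. In this way we identify $\omega = (\omega_j)_{j\in\mathbb{Z}^d} = (\omega_j', \omega_j'')_{j\in\mathbb{Z}^d} \equiv (\lambda_j(\omega'), \xi_j(\omega''))_{j\in\mathbb{Z}^d} \in \Omega$ with $(\omega', \omega'') \in \Omega'\times\Omega''$ and thus the random potential reads as
$$V(\omega, x) = \sum_{j\in\mathbb{Z}^d} \omega_j'\,\wp(x - j - \omega_j'')\,.$$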
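The product structure of $\Omega$ also immediately shows $\lim_{n\to\infty} \mathbb{P}(\theta_j^{-n} A \cap B) = \mathbb{P}(A)\mathbb{P}(B)$ for all $j \in \mathbb{Z}^d\setminus\{0\}$ and all cylinder sets $A, B$ in $\mathcal{B}(\Omega)$, where $\theta_j : \Omega \to \Omega$ is the measure preserving transformation defined by $\theta_j((\omega_k)_{k\in\mathbb{Z}^d}) = (\omega_{j+k})_{k\in\mathbb{Z}^d}$.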
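This equality extends to all $A, B \in \mathcal{B}(\Omega)$, cf. [23, Theorem 1.17], hence shows the mixing property and especially the ergodicity of the system $(\Omega, \mathbb{P}, (\theta_j)_{j\in\mathbb{Z}^d})$. As $V(\theta_x\omega, y) = V(\omega, x+y)$, the existence of subsets $\Sigma, \Sigma_c, \Sigma_s, \Sigma_{ac}, \Sigma_{sc}, \Sigma_{ess}, \Sigma_{pp} \subseteq \mathbb{R}$ such that $\Sigma = \sigma(H(\omega))$, $\Sigma_c = \sigma_c(H(\omega))$, etc. holds for $\mathbb{P}$-almost all $\omega \in \Omega$, is a standard result in the theory of random operators; see e.g. [4, Proposition V.2.4]. Adapting an idea of [13] for the proof of Eq. (1.1) therein, as a warm up in handling displacements, we prove:

Lemma 1.1. Under the assumptions specified above, $\inf\Sigma = 0$.

Proof. $V(\omega) \ge 0$ implies $H(\omega) \ge 0$ for $\mathbb{P}$-almost all $\omega \in \Omega$, hence $\inf\Sigma \ge 0$ and it is enough to show $0 \in \Sigma$. For $0 \in \sigma(-\Delta)$ choose a singular sequence $(\phi_n)_{n\in\mathbb{N}}$ in $C_0^\infty(\mathbb{R}^d)$, i.e. $\|\phi_n\| = 1$ and $-\Delta\phi_n \to 0$, and a sequence $(l_n)_{n\in\mathbb{N}}$ of odd numbers, such that $\operatorname{supp}\phi_n \subseteq \Lambda(0, \tfrac{l_n}{2})$ and $l_n \to \infty$ as $n\to\infty$. The events
$$\Omega_n := \Big\{\omega \in \Omega : \omega_k' \in \big[0, \tfrac1n\big] \text{ for each } k \in \Lambda(0, l_n)\cap\mathbb{Z}^d\Big\}$$
satisfy $\mathbb{P}(\Omega_n) = P_0([0, \tfrac1n])^{l_n^d} > 0$, so the invariant set $A_n := \bigcup_{x\in\mathbb{Z}^d} \theta_x^{-1}\Omega_n$ has full $\mathbb{P}$ measure due to the ergodicity of $(\theta_x)_{x\in\mathbb{Z}^d}$ with respect to $\mathbb{P}$. Now $A := \bigcap_{n\in\mathbb{N}} A_n$ satisfies $\mathbb{P}(A) = 1$ and for each $\omega \in A$ and $n \in \mathbb{N}$, there is $x_n(\omega) \in \mathbb{Z}^d$ with
$$\omega \in \theta^{-1}_{-x_n(\omega)}\Omega_n = \Big\{\omega \in \Omega : \omega_k' \in \big[0, \tfrac1n\big] \text{ for } k \in \Lambda(x_n(\omega), l_n)\cap\mathbb{Z}^d\Big\}\,.$$
As a consequence
$$\|H(\omega)\phi_n(\cdot - x_n(\omega))\| \le \|{-\Delta}\phi_n\| + \frac1n \Big\| \sum_{j\in\Lambda(x_n(\omega), l_n)\cap\mathbb{Z}^d} \wp(\cdot - j - \omega_j'') \Big\|_\infty + g_0 \Big\| 1^{x_n(\omega)}_{l_n/2} \sum_{j\in\mathbb{Z}^d\setminus\Lambda(x_n(\omega), l_n)} \wp(\cdot - j - \omega_j'') \Big\|_\infty . \tag{1}$$
For the estimation of the last two terms, we see that for each $x \in \mathbb{R}^d$, $j_0, j \in \mathbb{Z}^d$, $|x - j_0| \le \tfrac12$ and $\mathbb{P}$-almost every $\omega \in \Omega$, we obtain
$$|j - j_0| \le |j_0 - x| + |x - \omega_j'' - j| + |\omega_j''| < |x - \omega_j'' - j| + 1\,.$$
As there are $(2k+1)^d - (2k-1)^d$ elements $j \in \mathbb{Z}^d$ satisfying $|j_0 - j| = k > 0$, we may estimate
$$\sum_{j\in\mathbb{Z}^d} \wp(x - \omega_j'' - j) \le c_1 \sum_{k=0}^{\infty} \sum_{\substack{j\in\mathbb{Z}^d \\ |j - j_0| = k}} \langle x - \omega_j'' - j\rangle^{-\varsigma} \le c_1 + c_1 \sum_{k=1}^{\infty} \big[(2k+1)^d - (2k-1)^d\big]\big(1 + (k-1)^2\big)^{-\varsigma/2} < \infty \tag{2}$$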
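as an upper bound for the second norm in (1). If we choose some $x \in \Lambda(x_n(\omega), \tfrac{l_n}{2})$ and $j \in \mathbb{Z}^d\setminus\Lambda(x_n(\omega), l_n)$ for an estimate of the third term in (1), then the sum in (2) is restricted to $k \ge \tfrac{l_n}{4} - 1$, hence we get a constant $c < \infty$ with
$$\Big\| 1^{x_n(\omega)}_{l_n/2} \sum_{j\in\mathbb{Z}^d\setminus\Lambda(x_n(\omega), l_n)} \wp(\cdot - j - \omega_j'') \Big\|_\infty < c\, l_n^{d-\varsigma} \xrightarrow{\,n\to\infty\,} 0\,.$$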
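The counting argument behind (2) can be checked numerically. The following is only a sketch under stated assumptions: the function names, the cut-off `k_max` and the sample values of `d` and `sigma` (standing for $\varsigma$) are illustrative choices and are not taken from the paper. It verifies the shell count $(2k+1)^d - (2k-1)^d$ in the maximum norm by brute force and evaluates a truncated tail of the series, which shrinks as the lower summation index grows.

```python
from itertools import product


def shell_count(d, k):
    """Number of lattice points j in Z^d with maximum norm |j| = k."""
    if k == 0:
        return 1
    return (2 * k + 1) ** d - (2 * k - 1) ** d


def brute_force_shell(d, k):
    """Same count obtained by enumerating the cube {-k, ..., k}^d."""
    return sum(
        1
        for j in product(range(-k, k + 1), repeat=d)
        if max(abs(c) for c in j) == k
    )


def series_tail(d, sigma, k_min, k_max=5000):
    """Truncated tail of the series in (2), summed over k_min <= k < k_max.

    k_max is an illustrative cut-off; the true tail is an infinite sum.
    """
    return sum(
        shell_count(d, k) * (1 + (k - 1) ** 2) ** (-sigma / 2)
        for k in range(max(k_min, 1), k_max)
    )


if __name__ == "__main__":
    d, sigma = 2, 4.5  # sample values with sigma > 2d
    print(shell_count(d, 3), brute_force_shell(d, 3))
    for k_min in (1, 10, 100):
        print(k_min, series_tail(d, sigma, k_min))
```

For $\varsigma > d$ the summand behaves like $k^{d-1-\varsigma}$, so the tail starting at $k_{\min}$ is of order $k_{\min}^{d-\varsigma}$, in line with the bound $c\,l_n^{d-\varsigma}$ used above.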
So $(\phi_n(\cdot - x_n(\omega)))_{n\in\mathbb{N}}$ is a singular sequence for $0 \in \sigma(H(\omega))$ for each $\omega \in A$, which yields $0 \in \Sigma$.

Now we state the main theorem of this paper:

Theorem 1.2. Let $H(\omega)$ be the random Schrödinger operator specified above. There exists an $E_X > \inf\Sigma$, such that $\sigma(H(\omega)) \cap [\inf\Sigma, E_X]$ is pure point spectrum for $\mathbb{P}$-almost all $\omega \in \Omega$. In the short range case the corresponding eigenfunctions are exponentially decaying; in the long range case, the eigenfunctions decay faster than every inverse polynomial.

The general scheme for a proof of such a theorem includes a Wegner estimate, multiscale analysis, spectral averaging and a criterion to detect pure point spectrum. The first three techniques originate from papers in the eighties, dealing with a discrete model in $l^2(\mathbb{Z}^d)$. In the fundamental work [9], Fröhlich and Spencer proved exponential decay of the Green’s function for large disorder and fixed energy. Then Fröhlich, Martinelli, Scoppola and Spencer [10], Delyon, Levy and Souillard [6] and Simon and Wolff [20] gave three independent proofs for the existence of dense pure point spectrum. Later on these methods were extended and simplified, see e.g. [5] or [4], but these proofs seemed to be restricted to the discrete model. Combes and Hislop extended these ideas to continuous random Schrödinger operators in [2]. They treat the continuous Anderson model with displacements for a compactly supported single site potential $\wp$ and in the case $\wp(x) \le c_1\langle x\rangle^{-\varsigma}$, $\varsigma > \tfrac92 d$. But the proof of this long range case is considered to be “incomplete” by the authors in [3]. Kirsch, Stollmann and Stolz studied a model without random displacements but with an additional periodic potential in [13], discussing effects of the random perturbation on the band structure. In [14] they extended their model to single site potentials $\wp$ satisfying $\varsigma > 2d$. In [15] the case $\varsigma > 4d$ is considered; under this stronger requirement on $\wp$ they proved that the eigenfunctions are exponentially decaying.

A textbook account treating both discrete and continuous random operators is [21]. As Theorem 1.2 also covers the long range case, it is an improvement of the results in [2]. In Sec. 2 we start with a proof of the Wegner estimate on the resolvent $(H_\Lambda(\omega) - E - i\varepsilon)^{-1}$ of operators $H_\Lambda(\omega)$, where $H_\Lambda(\omega)$ is obtained as the restriction of $H(\omega)$ to a bounded open box $\Lambda$ with Dirichlet boundary conditions. Using the geometric resolvent equation (6), we may view the resolvent of $H_\Lambda(\omega)$ as the restriction
of the resolvent of $H_{\Lambda'}(\omega)$, with $\Lambda' \supseteq \Lambda$, to functions with support in $\Lambda$ modulo a boundary term. So provided we control these boundary terms, the geometric resolvent equation enables us to pass from $H_\Lambda(\omega)$ to $H_{\Lambda'}(\omega)$, i.e. to enlarge the boxes under consideration. The multiscale analysis in Sec. 4 consists of constructing a sequence $(l_n)_{n\in\mathbb{N}}$, $l_n \to \infty$ as $n\to\infty$, of length scales and events $(A_n)_{n\in\mathbb{N}}$ with $(1 - \mathbb{P}(A_n))_{n\in\mathbb{N}} \in l^1$ and in finding a lower bound $E_X > \inf\Sigma$, such that we get enough control on all the boundary terms arising from the geometric resolvent equation, when we pass inductively from the operators $(H_{\Lambda(z,l_n)}(\omega) - E - i\varepsilon)^{-1}$, $\omega \in A_n$, $z \in \mathbb{R}^d$, $E \in [\inf\Sigma, E_X]$ to those restricted to boxes of the next length scale $l_{n+1} > l_n$. This result of multiscale analysis allows us to use the Borel–Cantelli lemma, yielding a key step of the proof in Sec. 5: We obtain estimates on the decay of the norm of some localized versions $1_y(H(\omega) - E - i\varepsilon)^{-1}1_x$ and $1_x(H(\omega) - E - i\varepsilon)^{-1}\wp(\cdot - z)$ of the resolvent for $\mathbb{P}$-almost all $\omega \in \Omega$ and for $\mu$-almost all $E \in [\inf\Sigma, E_X]$ as $\varepsilon \searrow 0$ and $|x| \to \infty$. Using these estimates we establish the decay properties of Theorem 1.2 for distributional solutions $\Psi \in H^1_{-\varsigma/4}(\mathbb{R}^d)$ of the “perturbed” eigenvalue equation $H(\omega)\Psi + \alpha\wp(\cdot - z)\Psi = E\Psi$ for $\alpha \ne 0$, $\mathbb{P}$-almost all $\omega \in \Omega$ and $\mu$-almost all $E \in [\inf\Sigma, E_X]$. Due to the fact that with respect to a spectral measure $\nu_{\alpha,z,\omega}$ of $H(\omega) + \alpha\wp(\cdot - z)$ for almost every $E \in \mathbb{R}$ such a distributional solution $\Psi$ exists, we get pure point spectrum and the decay properties of the eigenfunctions for those “perturbed” operators $H(\omega) + \alpha\wp(\cdot - z)$. The proof of Theorem 1.2 follows in Sec. 6.4, where we write “unperturbed” operators $H(\tilde\omega)$ for $\mathbb{P}$-almost all $\tilde\omega \in \Omega$ as appropriate “perturbed” operators $H(\omega) + \alpha\wp(\cdot - z)$, using several relations between sets of $\mathbb{P}$, $\mu$, respectively, $\nu_{\alpha,z,\omega}$ measure 0.

2. Spectral Averaging and Wegner Estimate

In this section we make use of the product structure of $\mathbb{P}$ to estimate expectation values via Fubini’s theorem. We fix $j \in \mathbb{Z}^d$ and perform the integration with respect to $\omega_j'$ first by some inequalities from [2] and then integrate the remaining variables. We begin with the following lemma from [2], cf. Lemma 4.1 and Corollary 4.2 there.

Lemma 2.1. Let $\mathcal{H}$ be a separable Hilbert space and $H_0$ a selfadjoint operator in $\mathcal{H}$. Let $B, V \ge 0$ be bounded operators on $\mathcal{H}$, such that there is a constant $c_2 > 0$ satisfying $0 \le c_2 B^2 \le V$. For $\alpha \in \mathbb{R}$, let $\pi_\alpha$ be the spectral decomposition of the operator $H(\alpha) := H_0 + \alpha V$ and fix a bounded function $h : \mathbb{R} \to \mathbb{R}$ of compact support. Then for every measurable set $M \subseteq \mathbb{R}$ of finite Lebesgue measure, $\mu(M) < \infty$, one has:

(1) $\displaystyle \Big\| \int_{\mathbb{R}} \frac{1}{1 + t\alpha^2} \langle u, B\pi_\alpha(M)Bu\rangle\, d\alpha \Big\| \le \frac{\mu(M)\|u\|^2}{c_2}$, for $t > 0$ and $u \in \mathcal{H}$,

(2) $\displaystyle \Big\| \int_{\mathbb{R}} h(\alpha) \langle u, B\pi_\alpha(M)Bu\rangle\, d\alpha \Big\| \le \frac{\mu(M)\|u\|^2\|h\|_\infty}{c_2}$, for $u \in \mathcal{H}$.
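Theorem 2.2 (Wegner Estimate). Let $\Lambda \subseteq \mathbb{R}^d$ be an open finite box and let $H_\Lambda(\omega) := -\Delta + V(\omega)|_\Lambda$ be the restriction of $H(\omega)$ to $L^2(\Lambda)$ with Dirichlet boundary conditions. Then there is a function $C_W : \mathbb{R} \to \mathbb{R}_+$ satisfying
• for each $E_0 \in \mathbb{R}$ the function $C_W$ is bounded on $]-\infty, E_0]$,
• $\mathbb{P}\{\omega \in \Omega : \operatorname{dist}(\sigma(H_\Lambda(\omega)), E_0) < \eta\} \le C_W(E_0)\,\eta\, l^d$ holds for every box $\Lambda$ of sidelength $l \ge 1$.

Proof. For a box $\Lambda \subseteq \mathbb{R}^d$ we fix $J := \{j \in \mathbb{Z}^d : \Lambda_j := \Lambda(j, 1) \cap \Lambda \ne \emptyset\}$. Let $\chi_j$ be the characteristic function of $\Lambda_j$ and define $H_{N,\Lambda} := -\bigoplus_{j\in J} \Delta_N^{\Lambda_j} \le H_\Lambda(\omega)$ as the direct sum of the Laplacians on $L^2(\Lambda_j)$ with Neumann boundary conditions. Then we use the proof of [4, Proposition 4.5] to get
$$\mathbb{P}\{\operatorname{dist}(\sigma(H_\Lambda(\omega)), E_0) < \eta\} \le e^{E_0+\eta}\, \mathbb{E}\big(\operatorname{tr}(\pi_\Lambda(]E_0 - \eta, E_0 + \eta[)\, e^{-H_{N,\Lambda}})\big) \le e^{E_0+\eta} \sum_{n\in\mathbb{N}} \sum_{j\in J} \|\mathbb{E}(\chi_j \pi_\Lambda(]E_0 - \eta, E_0 + \eta[)\chi_j)\|\, e^{-E_{n,j}} \tag{3}$$
where $\pi_\Lambda$ is the spectral decomposition of $H_\Lambda$ and $(E_{n,j})_{n\in\mathbb{N}}$ are the eigenvalues of $-\Delta_N^{\Lambda_j}$. These eigenvalues are explicitly known (see e.g. [18]), so one gets
$$\sum_{n\in\mathbb{N}} \sum_{j\in J} e^{-E_{n,j}} \le C\, l^d\,. \tag{4}$$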
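The bound (4) can be made concrete in the simplest situation. The sketch below assumes that every cell $\Lambda_j$ is a full unit cube, so that the Neumann eigenvalues are $\pi^2(n_1^2 + \dots + n_d^2)$ with $n_i \in \{0, 1, 2, \dots\}$; cells cut off by the boundary of $\Lambda$ have larger non-zero eigenvalues and only make the sum smaller. The function names and the crude cell count $(\lfloor l\rfloor + 2)^d$ are illustrative assumptions, not the paper's constants.

```python
import math


def theta(terms=50):
    """One-dimensional sum over k >= 0 of exp(-pi^2 k^2), truncated."""
    return sum(math.exp(-((math.pi * k) ** 2)) for k in range(terms))


def neumann_trace_unit_cube(d):
    """Sum of exp(-E_n) over the Neumann eigenvalues of -Laplace on [0,1]^d.

    The eigenvalues are pi^2 (n_1^2 + ... + n_d^2), so the sum factorizes.
    """
    return theta() ** d


def sum_bound(l, d):
    """Upper bound for the double sum in (4), assuming at most
    (floor(l) + 2)^d unit cells meet a box of sidelength l."""
    cells = (math.floor(l) + 2) ** d
    return cells * neumann_trace_unit_cube(d)


if __name__ == "__main__":
    for l in (1, 5, 20):
        d = 3
        # For l >= 1 we have (l + 2)^d <= 3^d l^d, so C = (3 theta)^d works here.
        print(l, sum_bound(l, d), (3 * theta()) ** d * l ** d)
```

Under these assumptions one admissible constant is $C = (3\theta)^d$ with $\theta = \sum_{k\ge 0} e^{-\pi^2 k^2}$, which is only marginally larger than $3^d$.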
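For a given $j \in J$, we now estimate $\|\mathbb{E}(\chi_j \pi_\Lambda(]E_0 - \eta, E_0 + \eta[)\chi_j)\|$. In order to do so, we choose $\varphi \in L^2(\Lambda)$ satisfying $\|\varphi\| = 1$ and decompose $(\Omega, \mathbb{P})$ as a product of
$$(\hat\Omega_j, \hat{\mathbb{P}}) = \big(\mathbb{R}^{\mathbb{Z}^d\setminus\{j\}} \times (\mathbb{R}^d)^{\mathbb{Z}^d},\ P_0^{\otimes(\mathbb{Z}^d\setminus\{j\})} \otimes P_1^{\otimes\mathbb{Z}^d}\big)$$
and $(\mathbb{R}, P_0)$. Writing $\omega = (\hat\omega, \alpha)$, Fubini’s theorem yields
$$\langle\varphi, \mathbb{E}(\chi_j\pi_\Lambda(]E_0 - \eta, E_0 + \eta[)\chi_j)\varphi\rangle = \int_{\hat\Omega_j}\int_{\mathbb{R}} \langle\varphi, \chi_j\pi_\Lambda(]E_0 - \eta, E_0 + \eta[, (\hat\omega, \alpha))\chi_j\varphi\rangle\, dP_0(\alpha)\, d\hat{\mathbb{P}}(\hat\omega)$$
$$= \int_{\hat\Omega_j}\int_{\mathbb{R}} g(\alpha)\langle\varphi, \chi_j\pi_\Lambda(]E_0 - \eta, E_0 + \eta[, (\hat\omega, \alpha))\chi_j\varphi\rangle\, d\alpha\, d\hat{\mathbb{P}}(\hat\omega)\,.$$
Defining $\hat V_j(\hat\omega, x) := \sum_{i\in\mathbb{Z}^d,\, i\ne j} \omega_i'\,\wp(x - i - \omega_i'')$ one gets
$$V(\omega, x) = V((\hat\omega, \alpha), x) = \hat V_j(\hat\omega, x) + \alpha\wp(x - j - \omega_j'')$$
and
$$H_\Lambda(\omega) = (-\Delta^\Lambda + \hat V_j(\hat\omega))|_\Lambda + \alpha\wp(\cdot - j - \omega_j'')|_\Lambda\,.$$
As the assumptions on the model imply that $g$ is a bounded function of compact support, $\wp(\cdot - j - \omega_j'')|_\Lambda \ge 0$, and $c_0(\chi_j)^2 \le \wp(\cdot - j - \omega_j'')|_\Lambda$, we may apply Lemma 2.1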
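with $h = g$, $c_2 = c_0$, $\mathcal{H} = L^2(\Lambda)$, $H_0 = (-\Delta^\Lambda + \hat V_j(\hat\omega))|_\Lambda$, $V = \wp(\cdot - j - \omega_j'')|_\Lambda$ and $B = \chi_j$. This application yields
$$\int_{\hat\Omega_j}\int_{\mathbb{R}} g(\alpha)\langle\varphi, \chi_j\pi_\Lambda(]E_0 - \eta, E_0 + \eta[, (\hat\omega, \alpha))\chi_j\varphi\rangle\, d\alpha\, d\hat{\mathbb{P}}(\hat\omega) \le \int_{\hat\Omega_j} \frac{\mu(]E_0 - \eta, E_0 + \eta[)\,\|g\|_\infty}{c_0}\, d\hat{\mathbb{P}}(\hat\omega) = \frac{2\eta\|g\|_\infty}{c_0}\,. \tag{5}$$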
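Combining (3), (4) and (5), we arrive at the desired result.

The next result is close to [13, Proposition 4.2].

Proposition 2.3. For any $\zeta > 0$ and $b \in\, ]0, 2[$ there is an $l_1^* = l_1^*(\zeta, b)$, such that for all boxes $\Lambda$ of sidelength $l \ge l_1^*$ the following inequality is valid:
$$\mathbb{P}\{\omega \in \Omega : \operatorname{dist}(\sigma(H_\Lambda(\omega)), \inf\Sigma) \le l^{b-2}\} \le l^{-\zeta}\,.$$

Proof. The proof consists of several steps; in the first step, we give a lower bound for the lowest eigenvalue $\mu_1(H_\Lambda^N(\omega))$ of the operator $H(\omega)$ restricted to $\Lambda$ with Neumann boundary conditions. Fix $z \in \mathbb{Z}^d$, take an odd number $l$, so $\operatorname{Card}(\Lambda(z, l)\cap\mathbb{Z}^d) = l^d$, and define
$$E := \frac{\pi^2}{8c_0(1 + 4c_0)\,l^2}\,, \qquad \tilde\omega_j := \begin{cases} \min\{8E, \omega_j'\} & \text{if } j \in \Lambda(z, l)\cap\mathbb{Z}^d, \\ 0 & \text{if } j \in \mathbb{Z}^d\setminus\Lambda(z, l)\,. \end{cases}$$
Due to our general assumptions $c_0 1^0_{1+2R} \le \wp$ and $|\omega_j''| < R$ for $\mathbb{P}$-almost all $\omega \in \Omega$, the potential $\tilde V(\tilde\omega) := c_0 \sum_{j\in\mathbb{Z}^d} \tilde\omega_j 1^j_1$ satisfies $\tilde V(\tilde\omega)|_{\Lambda(z,l)} \le V(\omega)|_{\Lambda(z,l)}$. Hence the lowest eigenvalue $\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega))$ of $-\Delta + \tilde V(\tilde\omega)$ restricted to $\Lambda(z, l)$ with Neumann boundary conditions is a lower bound for $\mu_1(H^N_{\Lambda(z,l)}(\omega))$. Note that $\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega))$ does not depend on the random displacements.

In the second step, we have a closer look at the event $\{\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega)) < E\}$. The definition of $E$ ensures that we may apply Temple’s inequality, see [19, Lemma 4.4], which yields
$$\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega)) \ge \frac34\, l^{-d} \sum_{j\in\Lambda(z,l)\cap\mathbb{Z}^d} \tilde\omega_j\,,$$
hence $\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega)) < E$ implies that $\operatorname{Card}\{j \in \Lambda(z, l)\cap\mathbb{Z}^d : \tilde\omega_j < 4E\} > \tfrac{l^d}{2}$. Due to
$$\mu_1(\tilde H^N_{\Lambda(z,l)}(\tilde\omega)) \le \mu_1(H^N_{\Lambda(z,l)}(\omega))\,,$$
we obtain the inclusion
$$\{\omega \in \Omega : \mu_1(H^N_{\Lambda(z,l)}(\omega)) < 4E\} \subseteq \Big\{\omega \in \Omega : \operatorname{Card}\{j \in \Lambda(z, l)\cap\mathbb{Z}^d : \omega_j' < 4E\} > \frac{l^d}{2}\Big\}\,.$$
As step three, we apply Cramer’s theorem, see [22, Theorem 1.3.13], which yields
$$\mathbb{P}\Big\{\omega \in \Omega : \operatorname{Card}\{j \in \Lambda(z, l)\cap\mathbb{Z}^d : \omega_j' < 4E\} > \frac{l^d}{2}\Big\} \le e^{-l^d \frac12 \log(\frac43)}\,,$$
provided $\tfrac34 < P_0([4E, \infty[) < 1$. Hence, if $l$ is big enough, so that $E$ satisfies this condition, we obtain
$$\mathbb{P}\Big\{\omega \in \Omega : \mu_1(H^N_{\Lambda(z,l)}(\omega)) < \frac{\pi^2}{2c_0(1 + 4c_0)}\, l^{-2}\Big\} \le e^{-l^d \frac12 \log(\frac43)}\,.$$
The last step, the estimation of $H_\Lambda^N(\omega)$ by a sum of $H^N_{\Lambda(z,l)}(\omega)$, $\Lambda(z, l) \subseteq \Lambda$, is done as in [13, Proposition 4.2]. One obtains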
$$\mathbb{P}\{\omega \in \Omega : \mu_1(H_\Lambda^N(\omega)) \le l^{b-2}\} \le l^{-\zeta}$$
for all boxes $\Lambda$ of sidelength $l$, which has to be large enough. As $H_\Lambda^N(\omega) \le H_\Lambda(\omega)$ this also proves the proposition.

3. Geometric Resolvent Equation

The results of Sec. 2 enable us to control the resolvent for the operator $H_\Lambda(\omega)$ restricted to a finite box $\Lambda$. We state the geometric resolvent equations and some consequences, which were implicitly stated in [2, 13] and [14]. These observations, which allow us to increase the size of the box $\Lambda$, are close to the preparations of multiscale analysis in [7, 8] and [11].

Theorem 3.1. Let $\emptyset \ne \Lambda \subseteq \mathbb{R}^d$ be a finite, open box and $\chi \in C_0^\infty(\Lambda, \mathbb{R})$ a smooth function of compact support in $\Lambda$, and choose as $\Lambda'$ another finite, open box $\Lambda' \supseteq \Lambda$ or $\Lambda' = \mathbb{R}^d$. For a potential $V \in L^\infty(\mathbb{R}^d, \mathbb{R})$ consider the Schrödinger operators
$$H_{\Lambda'} := -\Delta^{\Lambda'} + V|_{\Lambda'} \quad\text{and}\quad H_\Lambda := -\Delta^\Lambda + V|_\Lambda$$
with Dirichlet boundary conditions for $\Lambda$ and for $\Lambda'$ whenever $\Lambda'$ is finite. Suppose that $W_\chi^{\Lambda'}$ is the operator in $L^2(\Lambda')$ with domain $D(W_\chi^{\Lambda'}) = H^1(\Lambda')$ satisfying
$$W_\chi^{\Lambda'} f := -(\Delta\chi)f - 2\nabla\chi\cdot\nabla f\,, \qquad \forall f \in H^1(\Lambda')\,.$$
Then for all $z \in \mathbb{C}\setminus\mathbb{R}$, the following geometric resolvent equations hold true for all $u \in L^2(\Lambda')$ and $v \in L^2(\Lambda)$:
$$\chi(H_{\Lambda'} - z)^{-1}u = (H_\Lambda - z)^{-1}\chi u + (H_\Lambda - z)^{-1}W_\chi^{\Lambda'}(H_{\Lambda'} - z)^{-1}u \tag{6}$$
$$(H_{\Lambda'} - z)^{-1}\chi v = \chi(H_\Lambda - z)^{-1}v - (H_{\Lambda'} - z)^{-1}W_\chi^{\Lambda}(H_\Lambda - z)^{-1}v\,. \tag{7}$$
Proof. We notice that $W_\chi^{\Lambda'}f = -\Delta^{\Lambda'}(\chi f) + \chi\Delta^{\Lambda'}f$ for all $f \in D(\Delta^{\Lambda'})$ and $-\Delta^{\Lambda'}(\chi f) = -\Delta^{\Lambda}(\chi f)$ for all $f \in D(-\Delta^{\Lambda})$ or $f \in D(-\Delta^{\Lambda'})$. Hence
$$(H_\Lambda - z)(\chi f) = \chi(H_{\Lambda'} - z)f + W_\chi^{\Lambda'}f$$
for all $f \in D(\Delta^{\Lambda'})$. Taking $(H_\Lambda - z)^{-1}$ on both sides of this equation and considering $f = (H_{\Lambda'} - z)^{-1}u$ gives (6) and similarly from $\chi(H_\Lambda - z)f = (H_{\Lambda'} - z)(\chi f) - W_\chi^{\Lambda}f$ for $f = (H_\Lambda - z)^{-1}v$ one gets (7).
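The commutator identity used at the start of the proof of Theorem 3.1 is just the product rule. As a worked step (written for a smooth $f$ and with the box superscripts on $\Delta$ and $W_\chi$ dropped for readability, which is a simplification made here and not the paper's notation):

```latex
\begin{align*}
-\Delta(\chi f) + \chi\,\Delta f
  &= -\bigl[(\Delta\chi)\,f + 2\,\nabla\chi\cdot\nabla f + \chi\,\Delta f\bigr] + \chi\,\Delta f \\
  &= -(\Delta\chi)\,f - 2\,\nabla\chi\cdot\nabla f \;=\; W_\chi f, \\[4pt]
(H - z)(\chi f)
  &= -\Delta(\chi f) + (V - z)\,\chi f \\
  &= \chi\,(-\Delta f + (V - z) f) + \bigl[-\Delta(\chi f) + \chi\,\Delta f\bigr] \\
  &= \chi\,(H - z) f + W_\chi f .
\end{align*}
```

Since $\chi$ has compact support in $\Lambda$, the function $\chi f$ satisfies the Dirichlet condition on both boxes, which is why the same expression can be read with $H_\Lambda$ on the left and $H_{\Lambda'}$ on the right.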
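Given a sequence $(f_n)_{n\in\mathbb{N}}$ in $D(-\Delta^{\Lambda'})$ then $f_n \to f$, $-\Delta^{\Lambda'}f_n \to h$ and $W_\chi^{\Lambda'}f_n \to g$ altogether imply $W_\chi^{\Lambda'}f = g$. So, we can state some properties of the operator $W_\chi^{\Lambda}$ resulting from the closed graph theorem.

Lemma 3.2. The operator $W_\chi^{\Lambda}$ has the following properties:
• $W_\chi^{\Lambda}(H_\Lambda - z)^{-1} : L^2(\Lambda) \to L^2(\Lambda) \subseteq L^2(\Lambda')$ is bounded for every $z \in \mathbb{C}\setminus\mathbb{R}$,
• $(H_\Lambda - z)^{-1}W_\chi^{\Lambda'} : H^1(\Lambda') \to L^2(\Lambda)$ has a continuous extension on $L^2(\Lambda')$ with norm $\|(H_\Lambda - z)^{-1}W_\chi^{\Lambda'}\| = \|W_\chi^{\Lambda}(H_\Lambda - \bar z)^{-1}\|$.

In the multiscale analysis procedure we fix $\delta > 0$ and consider boxes $\Lambda(z, l)$ centered at $z$ of sidelength $l > 3\delta$. We also use an Urysohn function $\varphi$ for $\Lambda(z, l)$, which is a function $\varphi \in C_0^\infty(\Lambda(z, l - \tfrac23\delta), [0, 1])$ with $\varphi = 1$ on $\Lambda(z, l - \tfrac43\delta)$. In this situation, for every $\tau \in C^\infty(\mathbb{R}^d)$ satisfying $\tau(x) = 1$ for $x \in \Lambda(z, l - \tfrac23\delta)\setminus\Lambda(z, l - \tfrac43\delta)$ one has $W_\varphi^{\Lambda'} = W_\varphi^{\Lambda'}\circ\tau$ for all $\Lambda' \supseteq \Lambda(z, l)$. To perform the multiscale analysis we note:

Lemma 3.3. Let $\varphi$ be an Urysohn function for $\Lambda = \Lambda(z, L)$ and let $\Lambda' \supseteq \Lambda$ be a finite box. Given $\eta > 0$ and $z \in \mathbb{C}\setminus\mathbb{R}$ with dist(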
where $\tilde\Lambda = \{x \in \Lambda(z, L) : \tfrac{L}{2} - \delta < |z - x| < \tfrac{L}{2}\}$ is the enlarged δ-edge of $\Lambda$ and the function $c_3$ is given as:
r
2 [(1 + 2
c3 (η,
r ·
k4τ1 k∞ + 2
2 [(1 + 2
!
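using a function $\tau_1 \in C_0^\infty(\Lambda(z, L))$ with $\tau_1(x) = 1$ for all $x \in \Lambda(z, L - \tfrac23\delta)$.

Proof. Taking $\tau_1$ specified above, we have $\tau_1 1^z_{L/3} = 1^z_{L/3}$. Thus applying the geometric resolvent equation (6) to $\tau_1(H_{\Lambda'} - z)^{-1}1^z_{L/3}v$ and then applying $W_\varphi^{\Lambda}$ we obtain
$$W_\varphi^{\Lambda}\tau_1(H_{\Lambda'} - z)^{-1}1^z_{L/3}v = W_\varphi^{\Lambda}(H_\Lambda - z)^{-1}1^z_{L/3}v + W_\varphi^{\Lambda}(H_\Lambda - z)^{-1}W_{\tau_1}^{\Lambda'}(H_{\Lambda'} - z)^{-1}1^z_{L/3}v\,.$$
Taking also $(\Delta\varphi)\tau_1 = \Delta\varphi$ and $\nabla\varphi\cdot\nabla\tau_1 = 0$ into account, this equation leads to:
$$W_\varphi^{\Lambda}(H_\Lambda - z)^{-1}1^z_{L/3}v = W_\varphi^{\Lambda}\tau_1(H_{\Lambda'} - z)^{-1}1^z_{L/3}v - W_\varphi^{\Lambda}(H_\Lambda - z)^{-1}W_{\tau_1}^{\Lambda'}(H_{\Lambda'} - z)^{-1}1^z_{L/3}v$$
$$= -(\Delta\varphi)(H_{\Lambda'} - z)^{-1}1^z_{L/3}v - 2(\nabla\varphi)\cdot\nabla\big((H_{\Lambda'} - z)^{-1}1^z_{L/3}v\big) - W_\varphi^{\Lambda}(H_\Lambda - z)^{-1}W_{\tau_1}^{\Lambda'}(H_{\Lambda'} - z)^{-1}1^z_{L/3}v\,. \tag{8}$$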
For χ ∈ C 1 (Λ0 , Rd ), u ∈ D(−4Λ ) and z ∈ C\R, we remark that 1 kχ · grad uk2 + 4ku div χk2 ≥ −2
2 [kχ(HΛ0 − z)uk2 + (1 + 2
k∇(HΛ0 − z)−1 vk2 0
= h(HΛ0 − z)−1 v, (−4Λ )(HΛ0 − z)−1 vi =
1 1 kvk2 + k
Expressing $W_\varphi^{\Lambda}$ and $W_{\tau_1}^{\Lambda'}$ in terms of $\Delta\varphi$ and $\nabla\varphi\cdot\nabla$ respectively $\Delta\tau_1$ and $\nabla\tau_1\cdot\nabla$ in (8) and inserting these estimates, the lemma is proven.

4. Multiscale Analysis

In this section we give a slight modification of the multiscale analysis, which has been used in [2, 13] and [14]. We state the results in a different way to the papers mentioned above, stressing the deterministic or probabilistic nature of the statements. In this way, we can take into account the displacements and the long range case for the single site potential more easily. For the multiscale analysis procedure
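we now fix a $\delta > 0$ and further on, we only consider boxes of sidelength $l > 3\delta$. An Urysohn function of such a box is then always chosen with respect to this fixed $\delta$ in accordance with the conventions made before Lemma 3.3.

Definition 4.1. Given an energy $E$, a configuration $\omega \in \Omega$ and $\gamma > 0$, we call a box $\Lambda = \Lambda(z, l)$ a γ-good box (for $E$ and $\omega$), if
$$\sup_{\varepsilon\ne 0} \|W_\varphi^{\Lambda}(H_\Lambda(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\| \le e^{-\gamma l} \tag{9}$$
holds for an Urysohn function $\varphi$ of $\Lambda$. Otherwise $\Lambda$ is called γ-bad (for $E$ and $\omega$).

We remark that with the help of the first resolvent equation we can change $\sup_{\varepsilon\ne 0}$ in (9) into $\sup_{\varepsilon\in\mathbb{Q},\,\varepsilon\ne 0}$. Using measurability of $H_\Lambda$ and the geometric resolvent equation to express $W_\varphi^{\Lambda}(H_\Lambda(\omega) - E - i\varepsilon)^{-1}$ in terms of $\varphi$ and $H_\Lambda(\omega)$, we get $\{\omega \in \Omega : \Lambda \text{ is γ-good for } \omega\} \in \mathcal{B}(\Omega)$. The aim of the multiscale analysis is to find sequences $(l_n)_{n\in\mathbb{N}}$ in $\mathbb{R}_+$ and $(A_n)_{n\in\mathbb{N}}$ in $\mathcal{B}(\Omega)$, such that $l_n \to \infty$, $\Lambda(z, l_n)$ is γ-good for $E$ and $\omega \in A_n$ and $\mathbb{P}(A_n) \ge 1 - l_n^{-\zeta}$ for a $\zeta > 0$. In order to perform this “induction process” there are these steps to do:

4.1. Finding γ-good boxes for an initial length scale $l_0$

Lemma 4.2. Let $V$ be a potential on $\mathbb{R}^d$ which is locally uniformly in $L^p$, $p = 2$ for $d \le 3$ and $p > \tfrac{d}{2}$ for $d \ge 4$. Let $\Lambda$ be a finite box and $H_\Lambda = -\Delta^\Lambda + V|_\Lambda$ be the Schrödinger operator with Dirichlet boundary conditions. Let $]r, s[$ be a spectral gap of $H_\Lambda$ and take $\chi, \tilde\chi \in L^\infty(\Lambda)$ satisfying $\|\chi\|_\infty, \|\tilde\chi\|_\infty \le 1$ and $\operatorname{dist}(\operatorname{supp}\chi, \partial\Lambda), \operatorname{dist}(\operatorname{supp}\tilde\chi, \partial\Lambda) \ge \delta_0 > 0$. Then there are positive constants $C_0$, $C_1$ and $C_2$ only depending on $\delta_0$ and $\sup_{x\in\mathbb{R}^d}\|V\|_{L^p(\Lambda(x,1))}$, such that for each $E \in\, ]r, s[$, each $j \in \{1, \ldots, d\}$, and $\eta := \operatorname{dist}(E, \sigma(H_\Lambda))$ the following estimates hold:
$$\|\tilde\chi(H_\Lambda - E - i\varepsilon)^{-1}\chi\| \le \frac{C_2(C_0 + s)}{\eta}\, e^{-\frac{C_1}{C_0+s}\,\eta\,\operatorname{dist}(\operatorname{supp}\chi,\,\operatorname{supp}\tilde\chi)}\,,$$
$$\|\tilde\chi\,\partial_j((H_\Lambda - E - i\varepsilon)^{-1}\chi)\| \le \frac{C_2(C_0 + s)}{\eta}\, e^{-\frac{C_1}{C_0+s}\,\eta\,\operatorname{dist}(\operatorname{supp}\chi,\,\operatorname{supp}\tilde\chi)}\,.$$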
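A proof of this lemma can be found in [13, Appendix A]; using this lemma, we prove the “initial length scale estimate”:

Theorem 4.3. For $\xi \in\, ]0, 1[$ there is a minimal length scale $l_2^* = l_2^*(\xi)$, such that for all $l_0 > l_2^*$ there is a $\gamma_0 = \gamma_0(l_0, \xi) > 0$ such that for all $z \in \mathbb{R}^d$, for all Urysohn functions $\varphi^z_{l_0}$ of $\Lambda(z, l_0)$ with $\delta \ge 1$ and for all $E \in I := [\inf\Sigma, \inf\Sigma + \tfrac{l_0^{-\xi}}{2}]$ and all configurations $\omega \in A := \{\omega \in \Omega : \operatorname{dist}(\inf\Sigma, \sigma(H_{\Lambda(z,l_0)}(\omega))) \ge l_0^{-\xi}\}$ the box $\Lambda(z, l_0)$ is $\gamma_0$-good (for $E \in I$ and $\omega \in A$).

Proof. We take $\tilde\chi = 1_{\tilde\Lambda(z,l_0)}$ and $\chi = 1^z_{l_0/3}$ to get $\operatorname{dist}(\operatorname{supp} 1_{\tilde\Lambda(z,l_0)}, \operatorname{supp} 1^z_{l_0/3}) = \tfrac{l_0}{3} - \delta$. If we consider that $I \subseteq \rho(H_{\Lambda(z,l_0)}(\omega))$ for each $\omega \in A$ and $\eta = \operatorname{dist}(E, \sigma(H_{\Lambda(z,l_0)}(\omega))) \ge \tfrac{l_0^{-\xi}}{2}$ for all $E \in I$ and $\omega \in A$ using this spectral gap $I$, we get from Lemma 4.2:
$$\|W^{\Lambda(z,l_0)}_{\varphi^z_{l_0}}(H_{\Lambda(z,l_0)}(\omega) - E - i\varepsilon)^{-1}1^z_{l_0/3}\|$$
$$\le \|\Delta\varphi^z_{l_0}\|_\infty \|1_{\tilde\Lambda(z,l_0)}(H_{\Lambda(z,l_0)}(\omega) - E - i\varepsilon)^{-1}1^z_{l_0/3}\| + 2\|\nabla\varphi^z_{l_0}\|_\infty \|1_{\tilde\Lambda(z,l_0)}\nabla((H_{\Lambda(z,l_0)}(\omega) - E - i\varepsilon)^{-1}1^z_{l_0/3})\|$$
$$\le (\|\Delta\varphi^z_{l_0}\|_\infty + 2\|\nabla\varphi^z_{l_0}\|_\infty)\, C_2\big[2l_0^{\xi}(C_0 + \inf\Sigma) + 1\big] \exp\!\left(-\,l_0^{-\xi}\,\frac{C_1(\frac{l_0}{3} - \delta)}{2\big(C_0 + \inf\Sigma + \frac{l_0^{-\xi}}{2}\big)}\right).$$
As the norms $\|\Delta\varphi^z_{l_0}\|_\infty$ and $\|\nabla\varphi^z_{l_0}\|_\infty$ are bounded with respect to $l_0$, and $\tfrac{l_0^{-\xi}}{2} \le C_0 + \inf\Sigma$ if $l_0$ is large enough, there are constants $C_3$, $C_4$ and $C_5$ such that
$$\|W^{\Lambda(z,l_0)}_{\varphi^z_{l_0}}(H_{\Lambda(z,l_0)}(\omega) - E - i\varepsilon)^{-1}1^z_{l_0/3}\| \le C_3 l_0^{\xi}\, e^{-l_0^{-\xi}(C_4 l_0 - C_5)} = \Big[C_3 l_0^{\xi}\, e^{C_5 l_0^{-\xi}}\, e^{-(\frac{C_4}{2}l_0^{-\xi})l_0}\Big]\, e^{-(\frac{C_4}{2}l_0^{-\xi})l_0}$$
holds, provided $l_0$ is big enough. As $C_3 l_0^{\xi}\, e^{C_5 l_0^{-\xi}}\, e^{-(\frac{C_4}{2}l_0^{-\xi})l_0} \to 0$ for $l_0 \to \infty$, there is an $l_2^* = l_2^*(\xi)$ with $C_3 l_0^{\xi}\, e^{C_5 l_0^{-\xi}}\, e^{-(\frac{C_4}{2}l_0^{-\xi})l_0} \le 1$ for $l_0 \ge l_2^*$. If we take $\gamma_0 = \gamma_0(l_0, \xi) := \tfrac{C_4}{2}l_0^{-\xi}$ and choose any $l_0 \ge l_2^*(\xi)$, we get
$$\|W^{\Lambda(z,l_0)}_{\varphi^z_{l_0}}(H_{\Lambda(z,l_0)}(\omega) - E - i\varepsilon)^{-1}1^z_{l_0/3}\| \le e^{-\gamma_0 l_0}\,.$$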
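The choice of $l_2^*$ at the end of this proof can be illustrated numerically. The sketch below scans integer values of $l_0$ and returns the smallest one from which the bracketed prefactor stays below 1 up to a cut-off. The constants `C3`, `C4`, `C5`, the exponent `xi` and the cut-off `l_max` are purely hypothetical sample values, since the paper does not specify them, and a finite scan is of course no substitute for the limit argument.

```python
import math


def log_prefactor(l0, xi, C3, C4, C5):
    """Logarithm of C3 * l0^xi * exp(C5 * l0^-xi) * exp(-(C4/2) * l0^(1-xi))."""
    return (math.log(C3) + xi * math.log(l0)
            + C5 * l0 ** (-xi) - 0.5 * C4 * l0 ** (1.0 - xi))


def find_l2_star(xi, C3, C4, C5, l_max=10**5):
    """Smallest integer L such that the prefactor is <= 1 for all integers
    l0 in [L, l_max]. Returns None if the prefactor exceeds 1 at l_max."""
    if log_prefactor(l_max, xi, C3, C4, C5) > 0.0:
        return None
    L = l_max
    while L > 1 and log_prefactor(L - 1, xi, C3, C4, C5) <= 0.0:
        L -= 1
    return L


def gamma0(l0, xi, C4):
    """Initial decay rate gamma_0 = (C4 / 2) * l0^(-xi)."""
    return 0.5 * C4 * l0 ** (-xi)


if __name__ == "__main__":
    xi, C3, C4, C5 = 0.5, 10.0, 1.0, 1.0  # hypothetical constants
    L = find_l2_star(xi, C3, C4, C5)
    print("l2* (on the integer grid):", L)
    print("gamma_0 at l2*:", gamma0(L, xi, C4))
```

Note that $\gamma_0$ decreases like $l_0^{-\xi}$, so a larger initial scale buys a better probability estimate at the price of a weaker initial decay rate.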
4.2. Box counting or how to get a bigger good box

In this section we are looking for fixed $E$ how to get for some typical realizations $\omega$ from a γ-good box $\Lambda(z, l)$ to a γ′-good bigger box $\Lambda(z, l')$. For fixed $z \in \mathbb{R}^d$ and $l < l'$ we consider the lattice $\Gamma_{z,l/3} := \{z + \tfrac{l}{3}y : y \in \mathbb{Z}^d\}$ and the cluster $Q(\omega, z, \gamma, l, l') := \{x \in \Gamma_{z,l/3} : \Lambda(x, l) \subseteq \Lambda(z, l') \text{ and } \Lambda(x, l) \text{ is γ-bad for } \omega\}$ of γ-bad boxes centered at a point of $\Gamma_{z,l/3}$ contained in $\Lambda(z, l')$.
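Lemma 4.4. Fix an energy $E$, let $l$ be big enough, $l' > 13l$, $\beta > 0$ and choose a configuration $\omega \in \Omega$ satisfying $\delta Q(\omega, z, \gamma, l, 3l') \le \tfrac{11}{3}l$ for the diameter of the cluster of bad boxes, $\operatorname{dist}(E, \sigma(H_{\Lambda(z,l')}(\omega))) \ge (l')^{-\beta}$ and $\operatorname{dist}(E, \sigma(H_{\Lambda(z,3l')}(\omega))) \ge (3l')^{-\beta}$. If we choose
$$\gamma' := \Big(1 - \frac{13l}{l'}\Big)\Big(\gamma - \frac{\log(3^d - 1)}{l}\Big) - \frac{(2d + 2\beta)\log l'}{l'}\,,$$
then $\Lambda(z, l')$ is a γ′-good box for $\omega$ and $E$.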
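To get a feeling for how much decay rate is lost in one step of Lemma 4.4, the formula for $\gamma'$ can be evaluated directly. The helper below is a minimal sketch; the function name and the sample parameter values are illustrative assumptions only.

```python
import math


def gamma_prime(gamma, l, l_prime, d, beta):
    """Decay rate gamma' of Lemma 4.4 for the bigger box of sidelength l_prime."""
    if l_prime <= 13 * l:
        raise ValueError("Lemma 4.4 requires l' > 13 l")
    shrink = 1.0 - 13.0 * l / l_prime
    return (shrink * (gamma - math.log(3 ** d - 1) / l)
            - (2 * d + 2 * beta) * math.log(l_prime) / l_prime)


if __name__ == "__main__":
    # sample values: gamma = 1, l = 100, l' = 2000, d = 2, beta = 3
    print(gamma_prime(1.0, 100.0, 2000.0, 2, 3.0))
```

The factor $1 - 13l/l'$ dominates the loss unless $l'$ is much larger than $l$, which is why the induction later uses super-linearly growing scales.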
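Proof. Taking $w, z \in \Lambda(z, 3l')$ with $|w - z| \ge \tfrac23 l$ and using the notation $\varphi^w_l$ for an Urysohn function of $\Lambda(w, l)$ we get $\varphi^w_l 1^z_{l/3} = 0$ and $1^w_{l/3}\varphi^w_l = 1^w_{l/3}$, so
$$1^w_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3} = 1^w_{l/3}(H_{\Lambda(w,l)}(\omega) - E - i\varepsilon)^{-1}W^{\Lambda(z,3l')}_{\varphi^w_l}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3} \tag{10}$$
is a special case of the geometric resolvent equation (6). Choosing $w \in \Gamma_{z,l/3}\cap\Lambda(z, 3l')$ we take the $3^d - 1$ boxes of sidelength $\tfrac{l}{3}$ centered at $\gamma_w := (\Gamma_{z,l/3}\cap\Lambda(w, l))\setminus\{w\}$ to get
$$\tau = \tau\cdot\sum_{w'\in\gamma_w} 1^{w'}_{l/3}$$
as an equality in $L^2(\Lambda(w, l))$ for every function $\tau \in C_0^\infty(\tilde\Lambda(w, l))$ with $\tau|_{\Lambda(w, l - \frac23\delta)\setminus\Lambda(w, l - \frac43\delta)} = 1$. As such a function $\tau$ satisfies $W^{\Lambda(z,3l')}_{\varphi^w_l} = W^{\Lambda(z,3l')}_{\varphi^w_l}\circ\tau$ and due to Lemma 3.2 we have
$$\|1^w_{l/3}(H_{\Lambda(w,l)}(\omega) - E - i\varepsilon)^{-1}W^{\Lambda(z,3l')}_{\varphi^w_l}\| = \|W^{\Lambda(w,l)}_{\varphi^w_l}(H_{\Lambda(w,l)}(\omega) - E + i\varepsilon)^{-1}1^w_{l/3}\|\,,$$
for $w_1 \in \gamma_w$ with
$$\|1^{w_1}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\| = \max_{w'\in\gamma_w}\|1^{w'}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|\,,$$
we obtain from (10)
$$\|1^w_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|$$
$$= \Big\|1^w_{l/3}(H_{\Lambda(w,l)}(\omega) - E - i\varepsilon)^{-1}W^{\Lambda(z,3l')}_{\varphi^w_l}\,\tau\sum_{w'\in\gamma_w}1^{w'}_{l/3}\cdot(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\Big\|$$
$$\le (3^d - 1)\,\|W^{\Lambda(w,l)}_{\varphi^w_l}(H_{\Lambda(w,l)}(\omega) - E + i\varepsilon)^{-1}1^w_{l/3}\|\cdot\max_{w'\in\gamma_w}\|1^{w'}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|$$
$$= (3^d - 1)\,\|W^{\Lambda(w,l)}_{\varphi^w_l}(H_{\Lambda(w,l)}(\omega) - E + i\varepsilon)^{-1}1^w_{l/3}\|\cdot\|1^{w_1}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|\,.$$
If $|w_1 - z| \ge \tfrac23 l$, we can iterate this procedure to get a sequence $w = w_0, w_1, \ldots, w_n$ in $\Gamma_{z,l/3}\cap\Lambda(z, 3l')$ with $|w_j - z| \ge \tfrac23 l$ for $j = 0, \ldots, n-1$ and
$$\|1^w_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\| \le (3^d - 1)^n\prod_{j=0}^{n-1}\|W^{\Lambda(w_j,l)}_{\varphi^{w_j}_l}(H_{\Lambda(w_j,l)}(\omega) - E + i\varepsilon)^{-1}1^{w_j}_{l/3}\|\cdot\|1^{w_n}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|\,. \tag{11}$$
Restricting our considerations to the case $w_0, w_1, \ldots, w_{n-1} \notin Q(\omega, z, \gamma, l, 3l')$, we get from (11)
$$\|1^w_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\| \le (3^d - 1)^n(e^{-\gamma l})^n\,\|1^{w_n}_{l/3}(H_{\Lambda(z,3l')}(\omega) - E - i\varepsilon)^{-1}1^z_{l/3}\|\,. \tag{12}$$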
286
00119
H. Zenk
This procedure which allows us to get further exponential factors e−γl in (12) stops, if either |wn − z| = 3l or Λ(wn+1 , l) * Λ(z, 3l0 ) or wn ∈ Q(ω, z, γ, l, 3l0) is satisfied. Provided we get a pair wi = wj with 0 ≤ i < j the procedure can be iterated as many times we like, by setting wj+1 := wi+1 , “running through this 0 path wi , . . . , wj ” once more. Take x ∈ S1 := {x ∈ Γz, l : Λ(x, 3l ) ∩ Λ(z, l3 ) 6= ∅} and 3 ˜ l0 ) 6= ∅}, then we have |x − y| ≥ l0 −l − δ > 11 l y ∈ S2 := {y ∈ Γ l : Λ(y, l ) ∩ Λ(z, z, 3
3
3
3
and consequently Λ(x, l) or Λ(y, l) is a γ-good box. So we can start the construction above with w = x and z = y whenever Λ(x, l) is γ-good and with w = y, z = x otherwise. In the case where Λ(x, l) and Λ(y, l) are γ-good and we end up with wn ∈ Q(ω, z, γ, l, 3l0), we increase the number of exponential factors e−γl by starting the above construction once again with w = y and z = wn . Now we have to estimate the minimum number of e−γl -factors in terms of |x−y| for each possible combination of these stopping conditions to get k1xl (HΛ(z,l0 ) (ω) − E − iε)−1 1yl k 3
3
≤e
d 0 −(γ− log(3l −1) )3(|x−y|−max{ 4l 3 ,δQ(ω,z,γ,l,3l )})
k(HΛ(z,3l0 ) (ω) − E − iε)−1 k (13)
for all lattice sites x ∈ S1 and y ∈ S2 , where δQ(ω, z, γ, l, 3l0) denotes the diameter 0 of the cluster Q(ω, z, γ, l, 3l0). As we have 3|x − y| − max{ 4l 3 , δQ(ω, z, γ, l, 3l )} ≥ 0 l − 13l > 0 and −1 z 1 l0 k k1Λ(z,l 0 ) (HΛ(z,3l0 ) (ω) − E − iε) ˜ 3
≤ Card S1 Card S2 max k1 l (HΛ(z,3l0 ) (ω) − E − iε)−1 1xl k y
x∈S1 y∈S2
3
3
the last task is to count the elements of S1 and S2 and then to combine this with the estimate of Lemma 3.3, using c3 ((l0 )−β , E) ≤ c4 (l0 )β to get a c ≥ 0 with Λ(z,l0 )
kWϕz
l0
(HΛ(z,l0 ) (ω) − E − iε)−1 1zl0 k
≤ cl−2d e(l
3 0
log(3d −1) −13l)( −γ)+(2d+2β) log l0 l
yielding the proposed γ 0 , provided l is big enough. 4.3. An important probabilistic trick in multiscale analysis for long range single site potentials Taking a finite set ∅ = 6 Υ ⊂ Zd and A ⊆ Ω, we find a set B ⊇ prΥ (A) in the d+1 Υ σ-algebra B(R ) with (P0 ⊗ P1 )⊗Υ (B) = [(P0 ⊗ P1 )⊗Υ ]∗ (prΥ (A)). Consequently d d ∗ B × (Rd+1 )Z \Υ ∈ B(Ω) satisfies B × (Rd+1 )Z \Υ ⊇ pr−1 Υ (prΥ (A)) and P (B × d (Rd+1 )Z \Υ \A) = 0, so due to the completeness of B(Ω) we obtain pr−1 Υ (pr(A)) ∈ B(Ω). Fixing A, B ∈ B(Ω) and finite sets ∅ = 6 Υ1 , Υ2 ⊆ Zd , Υ1 ∩ Υ2 = ∅ then the
March 19, 2002 12:4 WSPC/148-RMP
00119
Anderson Localization for a Multidimensional Model
287
−1 two events pr−1 Υ1 (prΥ1 (A)) and prΥ2 (prΥ2 (B)) are independent. Due to this fact in [14] one finds the following definition.
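Definition 4.5. A box $\Lambda(z, l)$ is called γ-super good for $\omega$ and $E$, if $\Lambda(z, l)$ is γ-good for $E$ and for all $\bar\omega \in \Omega$ satisfying $\mathrm{pr}_{\Upsilon(z,4l)}(\omega) = \mathrm{pr}_{\Upsilon(z,4l)}(\bar\omega)$ with $\Upsilon(z, 4l) := \mathbb{Z}^d\cap\Lambda(z, 4l)$.

Using γ-super good boxes instead of γ-good boxes eliminates all the problems we would have if we wanted to estimate, for noncompact $\wp$, the probability $\mathbb{P}\{\omega \in \Omega : \Lambda(x, l) \text{ and } \Lambda(y, l) \text{ are not γ-good for } E \text{ and } \omega\}$ in terms of $\mathbb{P}\{\omega \in \Omega : \Lambda(x, l) \text{ is not γ-good for } E \text{ and } \omega\}$, even if we assume large $|x - y|$. In order to proceed, we need some $\mathrm{pr}_\Upsilon^{-1}\mathrm{pr}_\Upsilon$-versions of earlier statements.

Lemma 4.6. There is a $c_5$, such that for $L$ large enough we have the inclusions:
$$\mathrm{pr}^{-1}_{\Upsilon(z,4L)}\big(\mathrm{pr}_{\Upsilon(z,4L)}\{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) < \eta\}\big) \subseteq \{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) < \eta + c_5L^{-(\varsigma - d)}\}\,,$$
$$\mathrm{pr}^{-1}_{\Upsilon(z,4L)}\big(\mathrm{pr}_{\Upsilon(z,4L)}\{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) > \eta\}\big) \subseteq \{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) > \eta - c_5L^{-(\varsigma - d)}\}\,.$$

Proof. We take $A = \{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) < \eta\}$ for the first inclusion and similarly $A = \{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,L)}(\omega))) > \eta\}$ for the second. Then we take $\bar\omega \in \mathrm{pr}^{-1}_{\Upsilon(z,4L)}(\mathrm{pr}_{\Upsilon(z,4L)}(A))$ and choose $\omega \in A$ satisfying $\mathrm{pr}_{\Upsilon(z,4L)}(\omega) = \mathrm{pr}_{\Upsilon(z,4L)}(\bar\omega)$. As the $\omega_j$ and $\bar\omega_j$ coincide for all sites $j \in \Upsilon(z, 4L)$ and $\wp$ decreases at least polynomially as $c_1\langle x\rangle^{-\varsigma}$, one gets $|V(\omega, x) - V(\bar\omega, x)| \le c_5L^{-(\varsigma - d)}$. So the Minimax theorem shows that the absolute value of the differences of the $n$th eigenvalues of $H_{\Lambda(z,L)}(\omega)$ and $H_{\Lambda(z,L)}(\bar\omega)$ is bounded above by $c_5L^{-(\varsigma - d)}$ and this lemma follows.

Lemma 4.7. If $l' > 13l$ and $\mathbb{P}\{\omega \in \Omega : \Lambda(x, l) \text{ is not γ-super good for } \omega\} \le \eta$ for each $x \in S(z, l, l') := \{x \in \Gamma_{z,l/3} : \Lambda(x, l) \subseteq \Lambda(z, 3l')\}$ are satisfied, one gets:
$$\mathbb{P}\Big(\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\Big(\mathrm{pr}_{\Upsilon(z,4l')}\Big\{\omega \in \Omega : \delta Q(\omega, z, \gamma, l, 3l') > \frac{11}{3}l\Big\}\Big)\Big) \le 9^{2d}\Big(\frac{l'}{l}\Big)^{2d}\eta^2\,.$$

Proof. We remark that for finite sets $\emptyset \ne \Upsilon_1 \subseteq \Upsilon_2 \subset \mathbb{Z}^d$ and every $A \subseteq \Omega$ we get the “reversed” $\mathrm{pr}^{-1}\mathrm{pr}$-relation $\mathrm{pr}_{\Upsilon_1}^{-1}(\mathrm{pr}_{\Upsilon_1}(A)) \supseteq \mathrm{pr}_{\Upsilon_2}^{-1}(\mathrm{pr}_{\Upsilon_2}(A))$. So
$$\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\Big(\mathrm{pr}_{\Upsilon(z,4l')}\Big\{\omega \in \Omega : \delta Q(\omega, z, \gamma, l, 3l') > \frac{11}{3}l\Big\}\Big)$$
$$= \bigcup_{\substack{x,y\in S(z,l,l') \\ |x - y| > \frac{11}{3}l}} \mathrm{pr}^{-1}_{\Upsilon(z,4l')}\big(\mathrm{pr}_{\Upsilon(z,4l')}\{\omega \in \Omega : \Lambda(x, l) \text{ and } \Lambda(y, l) \text{ are γ-bad for } \omega\}\big)$$
$$\subseteq \bigcup_{\substack{x,y\in S(z,l,l') \\ |x - y| > \frac{11}{3}l}} \Big[\mathrm{pr}^{-1}_{\Upsilon(x,4l)}\big(\mathrm{pr}_{\Upsilon(x,4l)}\{\omega \in \Omega : \Lambda(x, l) \text{ is γ-bad for } \omega\}\big) \cap \mathrm{pr}^{-1}_{\Upsilon(y,4l)}\big(\mathrm{pr}_{\Upsilon(y,4l)}\{\omega \in \Omega : \Lambda(y, l) \text{ is γ-bad for } \omega\}\big)\Big]$$
and as $\Upsilon(x, 4l)\cap\Upsilon(y, 4l) = \emptyset$ for all $x, y \in S(z, l, l')$ with $|x - y| > \tfrac{11}{3}l$, we can use the independence of the two events $\mathrm{pr}^{-1}_{\Upsilon(x,4l)}(\mathrm{pr}_{\Upsilon(x,4l)}\{\omega \in \Omega : \Lambda(x, l) \text{ is γ-bad for } \omega\})$ and $\mathrm{pr}^{-1}_{\Upsilon(y,4l)}(\mathrm{pr}_{\Upsilon(y,4l)}\{\omega \in \Omega : \Lambda(y, l) \text{ is γ-bad for } \omega\})$ to estimate
$$\mathbb{P}\Big(\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\Big(\mathrm{pr}_{\Upsilon(z,4l')}\Big\{\omega \in \Omega : \delta Q(\omega, z, \gamma, l, 3l') > \frac{11}{3}l\Big\}\Big)\Big)$$
$$\le \sum_{\substack{x,y\in S(z,l,l') \\ |x - y| > \frac{11}{3}l}} \mathbb{P}(\{\omega \in \Omega : \Lambda(x, l) \text{ not γ-super good for } \omega\})\cdot\mathbb{P}(\{\omega \in \Omega : \Lambda(y, l) \text{ not γ-super good for } \omega\})$$
$$\le (\operatorname{Card}S(z, l, l'))^2\eta^2 \le 9^{2d}\Big(\frac{l'}{l}\Big)^{2d}\eta^2\,.$$

Theorem 4.8. Let $l$ be big enough and $l' > 13l$ and suppose that for each $x \in S(z, l, l')$ the estimate $\mathbb{P}\{\omega \in \Omega : \Lambda(x, l) \text{ is γ-super good for } \omega\} \ge 1 - \eta$ holds, then there is a $c_6$, such that $\mathbb{P}\{\omega \in \Omega : \Lambda(z, l') \text{ is γ′-super good for } \omega\} \ge 1 - \eta'$, where we can choose γ′ and η′ as follows:
$$\gamma' = \Big(1 - \frac{13l}{l'}\Big)\Big(\gamma - \frac{\log(3^d - 1)}{l}\Big) - \frac{(2d + 2\beta)\log l'}{l'}\,,$$
$$\eta' = c_6\left[\Big(\frac{l'}{l}\Big)^{2d}\eta^2 + (l')^{d-\beta} + (l')^{-(\varsigma - 2d)}\right].$$

Proof. We only look for configurations $\omega \in A_1$ with
$$A_1 := \Big[\Omega\setminus\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\Big(\mathrm{pr}_{\Upsilon(z,4l')}\Big\{\omega \in \Omega : \delta Q(\omega, z, \gamma, l, 3l') > \frac{11}{3}l\Big\}\Big)\Big]$$
$$\cap\ \big[\Omega\setminus\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\big(\mathrm{pr}_{\Upsilon(z,4l')}\{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,3l')}(\omega))) < (3l')^{-\beta}\}\big)\big]$$
$$\cap\ \big[\Omega\setminus\mathrm{pr}^{-1}_{\Upsilon(z,4l')}\big(\mathrm{pr}_{\Upsilon(z,4l')}\{\omega \in \Omega : \operatorname{dist}(E, \sigma(H_{\Lambda(z,l')}(\omega))) < (l')^{-\beta}\}\big)\big]\,.$$
Fixing $\omega \in A_1$ and $\bar\omega \in \Omega$ with $\mathrm{pr}_{\Upsilon(z,4l')}(\omega) = \mathrm{pr}_{\Upsilon(z,4l')}(\bar\omega)$, one easily verifies that this $\bar\omega$ satisfies the conditions of Lemma 4.4, so one obtains the γ′ and that $\Lambda(z, l')$ is a γ′-good box for $\bar\omega$. A combination of Lemma 4.6, Theorem 2.2 and Lemma 4.7 gives the estimate $\mathbb{P}(A_1) \ge 1 - \eta'$.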
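4.4. The induction process

Theorem 4.9. There are $l^*,\zeta>0$ and $E_X>\inf\Sigma$, such that if we choose $l_0>l^*$ and $\alpha\in\,]1,\frac{2d+2\zeta}{2d+\zeta}[$ and define $l_n:=l_0^{\alpha^n}$ there is a $\gamma>0$, such that
\[
\mathbb{P}\{\omega\in\Omega : \Lambda(z,l_n)\ \text{is not $\gamma$-super good for $\omega$ and $E$}\}\le l_n^{-\zeta} \tag{14}
\]
for each $z\in\mathbb{R}^d$ and $E\in[\inf\Sigma,E_X]$.

Proof. Choosing $b>1$, we get $\xi\in\,]2-b,1[$ such that $l_0^{b-2}-c_5 l_0^{-(\varsigma-d)}\ge l_0^{-\xi}$ if $l_0$ is large enough. Now we conclude from Lemma 4.6 that
\[
\mathrm{pr}^{-1}_{\Upsilon(z,4l_0)}\big(\mathrm{pr}_{\Upsilon(z,4l_0)}\{\omega\in\Omega : \mathrm{dist}(\inf\Sigma,\sigma(H_{\Lambda(z,l_0)}(\omega)))>l_0^{b-2}\}\big)
\subseteq \{\omega\in\Omega : \mathrm{dist}(\inf\Sigma,\sigma(H_{\Lambda(z,l_0)}(\omega)))>l_0^{b-2}-c_5 l_0^{-(\varsigma-d)}\} ,
\]
so the initial length scale estimate shows that for all $l_0>l_2^*(\xi)$ the box $\Lambda(z,l_0)$ is $\gamma_0=\gamma_0(l_0,\xi)=\frac{C_4}{2}l_0^{-\xi}$ good for all $E\in[\inf\Sigma-\frac{l_0^{-\xi}}{2},\inf\Sigma+\frac{l_0^{-\xi}}{2}]$ and all
\[
\omega\in\mathrm{pr}^{-1}_{\Upsilon(z,4l_0)}\big(\mathrm{pr}_{\Upsilon(z,4l_0)}\{\omega\in\Omega : \mathrm{dist}(\inf\Sigma,\sigma(H_{\Lambda(z,l_0)}(\omega)))>l_0^{b-2}\}\big) .
\]
This implies that for $\omega\in\{\omega\in\Omega : \mathrm{dist}(\inf\Sigma,\sigma(H_{\Lambda(z,l_0)}(\omega)))>l_0^{b-2}\}$ the box $\Lambda(z,l_0)$ is $\gamma_0$-super good, and so using Proposition 2.3 we get
\[
\mathbb{P}\{\omega\in\Omega : \Lambda(z,l_0)\ \text{is $\gamma_0$-super good for $\omega$ and $E$}\}\ge 1-l_0^{-\zeta}
\]
if we choose a suitable lower bound for $l_0$. Now the induction works; we define
\[
\gamma_{k+1}(l_0,\gamma_0,\alpha)=\gamma_{k+1}:=\gamma_k\Big(1-\frac{13l_k}{l_{k+1}}\Big)-c_k ,
\qquad
c_k(l_0)=c_k:=\frac{\log(3^{d}-1)}{l_k}\Big(1-\frac{13l_k}{l_{k+1}}\Big)+(2d+2\beta)\frac{\log l_{k+1}}{l_{k+1}} ,
\]
and use the proof of Lemma A3 in [2] to express $\gamma_{k+1}$ in terms of $l_j$ and $c_j$, $j=1,\dots,k$, to find $\kappa=\kappa(l_0,\alpha)=\prod_{j\in\mathbb{N}}(1-13\,l_0^{-(\alpha-1)\alpha^{j}})$ satisfying
\[
0<\frac{\gamma_0(l_0,\xi)}{2}\,\kappa(l_0,\alpha)\le\gamma_k
\]
for all $k\in\mathbb{N}$ if we choose $l_0$ big enough. In order to check the $\mathbb{P}$-estimate (14), we use the expression for $\eta'$ in Theorem 4.8 and optimize the parameters $\alpha$, $\beta$ and $\zeta$ as in the proof of [14, Proposition 5.4], which says: We can choose $\zeta\in\,]0,\varsigma-2d[$, then $\alpha\in\,]1,\frac{2d+2\zeta}{2d+\zeta}[$ and $\beta>d+\zeta$ such that if we only consider large values of $l_0$, then for all $k\in\mathbb{N}$ we will get the inequalities:
\[
c_6\Big(\frac{l_{k+1}}{l_k}\Big)^{2d}\eta_k^{2}\le c_6\,l_{k+1}^{2d}\,l_{k+1}^{-\frac{2d}{\alpha}}\,l_{k+1}^{-\frac{2\zeta}{\alpha}}
= c_6\,l_{k+1}^{2d(1-\frac{1}{\alpha})-\frac{2\zeta}{\alpha}}\le\frac{1}{3}\,l_{k+1}^{-\zeta},
\qquad
c_6\,l_{k+1}^{d-\beta}\le\frac{1}{3}\,l_{k+1}^{-\zeta},
\qquad
c_6\,l_{k+1}^{2d-\varsigma}\le\frac{1}{3}\,l_{k+1}^{-\zeta} .
\]
With this choice of $l_0$ and for $E_X:=\inf\Sigma+\frac{l_0^{-\zeta}}{2}>\inf\Sigma$ Theorem 4.9 follows.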
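The role of the interval $]1,\frac{2d+2\zeta}{2d+\zeta}[$ for $\alpha$ can be made concrete with a short numerical check. With $\eta_k=l_k^{-\zeta}$ and $l_k=l_{k+1}^{1/\alpha}$, the first inequality above needs the exponent $2d(1-\frac1\alpha)-\frac{2\zeta}{\alpha}$ to lie strictly below $-\zeta$, and this fails exactly at the upper endpoint of the interval. The sketch below only illustrates that arithmetic; the function names and the sample values of $d$, $\zeta$, $\alpha$ and $l_0$ are assumptions made for the example.

```python
def alpha_upper_bound(d, zeta):
    """Upper endpoint (2d + 2 zeta) / (2d + zeta) of the admissible alpha interval."""
    return (2 * d + 2 * zeta) / (2 * d + zeta)


def eta_exponent(d, zeta, alpha):
    """Exponent of l_{k+1} in (l_{k+1}/l_k)^{2d} * eta_k^2,
    using eta_k = l_k^{-zeta} and l_k = l_{k+1}^{1/alpha}."""
    return 2 * d * (1 - 1 / alpha) - 2 * zeta / alpha


def length_scales(l0, alpha, n):
    """Length scales l_k = l0^(alpha^k) for k = 0, ..., n."""
    return [l0 ** (alpha ** k) for k in range(n + 1)]


# Illustrative values only: d = 3, zeta = 1, alpha = 1.1 inside ]1, 8/7[.
d, zeta, alpha = 3, 1.0, 1.1
upper = alpha_upper_bound(d, zeta)
inside = eta_exponent(d, zeta, alpha)     # strictly below -zeta
boundary = eta_exponent(d, zeta, upper)   # equals -zeta up to rounding
scales = length_scales(10.0, alpha, 3)
```

Inside the interval the exponent is strictly smaller than $-\zeta$, which leaves room to absorb the constant $c_6$ and the factor $\frac13$ once $l_0$ is large.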
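5. From the Resolvent on Finite Boxes to the Localized Resolvent on the Whole Space

Theorem 5.1. Let $E_X$ be as in Theorem 4.9, then for every $E\in[\inf\Sigma,E_X]$ there is a measurable set $B_1=B_1(E)$ with $\mathbb{P}(B_1)=1$, such that for every bounded set $K\subseteq\mathbb{R}^d$ and each $\omega\in B_1$ there is $r_1(\omega,E,K)\in\,]0,\infty[$ such that
\[
\sup_{\varepsilon\neq 0}\|(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}_K\|=\sup_{\varepsilon\neq 0}\|\mathbf{1}_K(H(\omega)-E-i\varepsilon)^{-1}\|\le r_1(\omega,E,K) .
\]

Proof. Setting $\Lambda_k:=\Lambda(0,l_k)$, $\varepsilon_k:=e^{-\gamma l_k}$ and $W_k:=W^{\Lambda_k}_{\varphi_k}$ for an Urysohn function $\varphi_k$ of $\Lambda_k$ and $l_k:=l_0^{\alpha^k}$ as in Theorem 4.9, we define
\[
M_k=M_k(E):=\Big\{\omega\in\Omega : \sup_{\varepsilon\neq 0}\|W_k(H_{\Lambda_k}(\omega)-E-i\varepsilon)^{-1}\mathbf{1}_{\frac{l_k}{3}}\|\le\varepsilon_k\Big\},
\qquad
N_k=N_k(E):=\{\omega\in\Omega : \mathrm{dist}(\sigma(H_{\Lambda_k}(\omega)),E)>\varepsilon_{k-1}\} .
\]
Combining Theorems 4.9 and 2.2 we get $(1-\mathbb{P}(M_k))_{k\in\mathbb{N}}\in l^1$ and $(1-\mathbb{P}(N_k))_{k\in\mathbb{N}}\in l^1$. Hence the Borel–Cantelli lemma implies that $B_1:=B_1(E)=\liminf_{k\to\infty}M_k(E)\cap N_k(E)$ has measure 1. For each $\omega\in B_1(E)$ we have a lower bound $k_0(\omega)$, such that $\omega\in M_k(E)\cap N_k(E)$ for all $k\ge k_0(\omega)$. For the given bounded set $K$ we choose $m$ with $K\subseteq\Lambda(0,\frac{l_m}{3})$ and then we treat indices $j\ge\max\{k_0(\omega),m\}=:n$. If we take the Urysohn functions $\varphi_k$ of $\Lambda_k$ and use $\varphi_{k+1}W_k=W_k$, we get for $v=\mathbf{1}_K v$, applying the geometric resolvent equation (7) several times,
\[
\begin{aligned}
(H(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_K v
&=(H(\omega)-E-i\varepsilon_j)^{-1}\varphi_n v \\
&=\varphi_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}v-(H(\omega)-E-i\varepsilon_j)^{-1}W_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}v \\
&=\varphi_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}v-\varphi_{n+1}(H_{\Lambda_{n+1}}(\omega)-E-i\varepsilon_j)^{-1}W_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_{\frac{l_n}{3}}v \\
&\quad+(H(\omega)-E-i\varepsilon_j)^{-1}W_{n+1}(H_{\Lambda_{n+1}}(\omega)-E-i\varepsilon_j)^{-1}W_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_{\frac{l_n}{3}}v \\
&=\varphi_n(H_{\Lambda_n}(\omega)-E-i\varepsilon_j)^{-1}v \\
&\quad+\sum_{s=0}^{j-n-1}(-1)^{s+1}\varphi_{n+s+1}(H_{\Lambda_{n+s+1}}(\omega)-E-i\varepsilon_j)^{-1}\Big(\prod_{t=0}^{s}W_{n+t}(H_{\Lambda_{n+t}}(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_{\frac{l_{n+t}}{3}}\Big)v \\
&\quad+\Big[(-1)^{n-j+1}(H(\omega)-E-i\varepsilon_j)^{-1}\prod_{t=0}^{j-n}W_{n+t}(H_{\Lambda_{n+t}}(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_{\frac{l_{n+t}}{3}}\Big]v .
\end{aligned}
\tag{15}
\]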
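As $\omega\in M_k(E)\cap N_k(E)$ we estimate $\|(H_{\Lambda_k}(\omega)-E-i\varepsilon_j)^{-1}\|\le\frac{1}{\mathrm{dist}(E,\sigma(H_{\Lambda_k}(\omega)))}\le\frac{1}{\varepsilon_{k-1}}$ and $\|W_k(H_{\Lambda_k}(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_{\frac{l_k}{3}}\|\le\varepsilon_k$ to get
\[
\|(H(\omega)-E-i\varepsilon_j)^{-1}\mathbf{1}_K v\|
\le\|v\|\Big(\frac{1}{\varepsilon_{n-1}}+\sum_{s=0}^{j-n-1}\frac{1}{\varepsilon_{n+s}}\prod_{t=0}^{s}\varepsilon_{n+t}+\frac{1}{\varepsilon_j}\prod_{t=0}^{j-n}\varepsilon_{n+t}\Big)
\le\|v\|\Big(e^{\gamma l_{n-1}}+\sum_{k\in\mathbb{N}}e^{-\gamma l_k}\Big) .
\]
Using the first resolvent equation we pass from $(\varepsilon_j)_{j\in\mathbb{N}}$ to $\varepsilon\neq 0$, if we take $r_1(\omega,E,K):=2(e^{\gamma l_{n-1}}+\sum_{k\in\mathbb{N}}e^{-\gamma l_k})$. As $\varphi_n|_K=1$ the proof is complete.

Lemma 5.2. Let $E_X$ and $\gamma$ be given as in Theorem 4.9.

(1) Taking a bounded function $Q$ of compact support and an $E\in[\inf\Sigma,E_X]$, there is a $B_2=B_2(E,Q)\in\mathcal{B}(\Omega)$ of full measure $\mathbb{P}(B_2)=1$ and for each $\omega\in B_2$ there are constants $s_2=s_2(\omega,E,Q)$, $s_3=s_3(\omega,E,Q)$ and $s_4(\omega,E,Q)$, such that for all $x\in\mathbb{R}^d$, $|x|\ge s_3(\omega,E,Q)$ and for all $\omega\in B_2(E,Q)$ one gets:
\[
\sup_{\varepsilon\neq 0}\|(H(\omega)-E-i\varepsilon)^{-1}Q\|\le s_2(\omega,E,Q)<\infty , \tag{16}
\]
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}Q\|\le s_4(\omega,E,Q)\,e^{-\frac{\gamma}{5}|x|} . \tag{17}
\]

(2) Taking $d<D<r$ and an $E\in[\inf\Sigma,E_X]$, there is a set $B_2=B_2(E,D,r)\in\mathcal{B}(\Omega)$ with $\mathbb{P}(B_2(E,D,r))=1$ and for each $\omega\in B_2(E,D,r)$ there are $t_2=t_2(\omega,E,r)$, $t_3=t_3(\omega,E,D,r)$ and $t_4=t_4(\omega,E,D,r)\in\mathbb{R}_+$, such that for all $x\in\mathbb{R}^d$ satisfying $|x|\ge t_3(\omega,E,D,r)$ and for all $\omega\in B_2(E,D,r)$ the following inequalities are valid:
\[
\sup_{\varepsilon\neq 0}\|(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot\rangle^{-r}\|\le t_2(\omega,E,r)<\infty , \tag{18}
\]
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot\rangle^{-r}\|\le t_4(\omega,E,D,r)\,\langle x\rangle^{D-r} . \tag{19}
\]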
Proof. Let $(l_k)_{k\in\mathbb{N}}$ be the sequence of length scales from Theorem 4.9 and let $\zeta(D,r)$ be the positive solution of $X+d-r\frac{X+2d}{2X+2d}=D-r$. Choose $\zeta>0$ as in Theorem 4.9 if $\tilde Q=Q$ has compact support and choose some $\zeta\in\,]0,\zeta(D,r)[$, which fulfills the requirements $0<\zeta<\varsigma-2d$ of Theorem 4.9 if $\tilde Q=\langle\cdot\rangle^{-r}$. As $x\mapsto x+d-r\frac{x+2d}{2x+2d}$ is increasing, we get
\[
\zeta+d-\frac{r}{\alpha}<\zeta+d-r\,\frac{\zeta+2d}{2\zeta+2d}
\le\zeta(D,r)+d-r\,\frac{\zeta(D,r)+2d}{2\zeta(D,r)+2d}=D-r<0 \tag{20}
\]
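for the choice $\alpha\in\,]1,\frac{2\zeta+2d}{\zeta+2d}[$ in Theorem 4.9. For $k\in\mathbb{N}$, $E\in[0,E_X]$ define
\[
\begin{aligned}
M_k(E):={}&\Big[\Omega\setminus\mathrm{pr}^{-1}_{\Upsilon(0,4l_k)}\big(\mathrm{pr}_{\Upsilon(0,4l_k)}\{\omega\in\Omega : \delta Q(\omega,0,\gamma,l_{k-1},27l_k)>\tfrac{11}{3}l_{k-1}\}\big)\Big] \\
&\cap\Big\{\omega\in B_1(E) : \sup_{\varepsilon\neq 0}\|(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\|\le l_k^{\zeta+d}\ \text{and}\ \sup_{\varepsilon\neq 0}\|\mathbf{1}^{0}_{9l_k}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}W^{\mathbb{R}^d}_{\varphi_{27l_k}}\|\le e^{-\gamma l_k}\Big\},
\end{aligned}
\]
then the multiscale analysis and the Wegner estimate imply $(1-\mathbb{P}(M_k(E)))_{k\in\mathbb{N}}\in l^1$. So due to the Borel–Cantelli lemma and Theorem 5.1 the event $B_2(E):=B_1(E)\cap\liminf_{k\to\infty}M_k(E)$ satisfies $\mathbb{P}(B_2(E))=1$. For each $\omega\in B_2(E)$ there is $k_0=k_0(\omega)$, such that $\omega\in M_k(E)$ for $k\ge k_0(\omega)$. As
\[
\sup_{\varepsilon\neq 0}\|(H(\omega)-E-i\varepsilon)^{-1}\tilde Q\|
\le\sup_{\varepsilon\neq 0}\|(1-\mathbf{1}^{0}_{9l_{k_0}})(H(\omega)-E-i\varepsilon)^{-1}\tilde Q\|
+\sup_{\varepsilon\neq 0}\|\mathbf{1}^{0}_{9l_{k_0}}(H(\omega)-E-i\varepsilon)^{-1}\tilde Q\| ,
\]
Theorem 5.1 implies that it is enough to estimate $\sup_{\varepsilon\neq 0}\|(1-\mathbf{1}^{0}_{9l_{k_0}})(H(\omega)-E-i\varepsilon)^{-1}\tilde Q\|$ for the proof of (16) and (18). Define $\Theta_k:=\mathbf{1}^{0}_{9l_k}-\mathbf{1}^{0}_{9l_{k-1}}$, then an Urysohn function $\varphi_{27l_k}$ for $\Lambda(0,27l_k)$ satisfies $\Theta_k=\Theta_k\cdot\varphi_{27l_k}$. Choosing $\tau_{27l_k}$ as described after Lemma 3.2, such that $W^{\mathbb{R}^d}_{\varphi_{27l_k}}\circ\tau_{27l_k}=W^{\mathbb{R}^d}_{\varphi_{27l_k}}$, an application of the geometric resolvent equation yields:
\[
\begin{aligned}
\|(1-\mathbf{1}^{0}_{9l_{k_0}})(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|^{2}
&=\sum_{k>k_0}\|\Theta_k(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|^{2} \\
&\le 2\sum_{k>k_0}\Big(\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}\tilde Q u\|^{2} \\
&\qquad\quad+\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}W^{\mathbb{R}^d}_{\varphi_{27l_k}}\|^{2}\,\|\tau_{27l_k}(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|^{2}\Big) .
\end{aligned}
\tag{21}
\]
The estimate $\sup_{\varepsilon\neq 0}\|\mathbf{1}^{0}_{9l_k}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}W^{\mathbb{R}^d}_{\varphi_{27l_k}}\|\le e^{-\gamma l_k}$ is satisfied for $\omega\in M_k(E)$ and from the construction of $l_k=l_0^{\alpha^k}$ we may assume $\tau_{27l_k}=\Theta_{k+1}\tau_{27l_k}$. As a consequence of (21) we obtain
\[
(1-2e^{-2\gamma l_{k_0}})\,\|(1-\mathbf{1}^{0}_{9l_{k_0}})(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|^{2}
\le 2\sum_{k>k_0}\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}\tilde Q u\|^{2}
\]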
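for $\omega\in B_2(E)$. Inserting $1=\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}+(1-\mathbf{1}^{0}_{\frac{l_{k-1}}{3}})$, we are led to an estimate of
\[
\sum_{k>k_0}\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}\tilde Q u\|^{2}
+\sum_{k>k_0}\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}(1-\mathbf{1}^{0}_{\frac{l_{k-1}}{3}})\tilde Q u\|^{2} . \tag{22}
\]
Using $\delta Q(\omega,0,\gamma,l_{k-1},27l_k)\le\frac{11}{3}l_{k-1}$ and $\|(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\|\le l_k^{\zeta+d}$ for $\omega\in B_2(E)$, $k\ge k_0$, then
\[
\|\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}\|
\le e^{-(\gamma-\frac{\log b_d}{l_{k-1}})(3|y|-11l_{k-1})}\,l_k^{\zeta+d} \tag{23}
\]
follows analogously to (13) for
\[
y\in S_k:=\Big\{y\in\frac{l_{k-1}}{3}\mathbb{Z}^d : \Lambda\Big(y,\frac{l_{k-1}}{3}\Big)\cap\big(\Lambda(0,9l_k)\setminus\Lambda(0,9l_{k-1})\big)\neq\emptyset\Big\} .
\]
Counting the elements $y\in S_k$ and using $\frac{9}{2}l_{k-1}\le|y|\le\frac{9}{2}l_k+\frac{l_{k-1}}{6}$, we obtain
\[
\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}\|\le l_k^{d(1-\frac{1}{\alpha})}\,e^{-\frac{9}{4}\gamma l_{k-1}} ,
\]
hence the first sum in (22) is finite. Estimating the second sum of (22), we may assume $\tilde Q=\langle\cdot\rangle^{-r}$, hence as a result of the Wegner estimate and the decay properties of $\langle\cdot\rangle^{-r}$,
\[
\begin{aligned}
\sum_{k>k_0}\|\Theta_k(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}(1-\mathbf{1}^{0}_{\frac{l_{k-1}}{3}})\tilde Q u\|^{2}
&\le\|u\|^{2}\sum_{k>k_0}\|(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\|^{2}\Big(1+\Big(\frac{l_{k-1}}{6}\Big)^{2}\Big)^{-r} \\
&\le 6^{2r}\|u\|^{2}\sum_{k\ge k_0}l_k^{2(\zeta+d-\frac{r}{\alpha})} ,
\end{aligned}
\]
which is finite due to (20).

For the proof of (17) and (19), we may assume $\Lambda(x,1)\subseteq\Lambda(y,\frac{l_{k-1}}{3})\cap\Lambda(0,9l_k)$ for some $k\ge k_0(\omega)$, $\omega\in B_2(E)$ and $y\in S_k$. Hence we estimate $\sup_{\varepsilon\neq 0}\|\mathbf{1}^{0}_{9l_k}\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|$. Observing $\mathbf{1}^{0}_{9l_k}=\mathbf{1}^{0}_{9l_k}\varphi_{27l_k}$, an application of the geometric resolvent equation yields:
\[
\begin{aligned}
\|\mathbf{1}^{0}_{9l_k}\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|
&\le\|\mathbf{1}^{0}_{9l_k}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}W^{\mathbb{R}^d}_{\varphi_{27l_k}}\|\,\|(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\| \\
&\quad+\|\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}\tilde Q u\| \\
&\quad+\|\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\varphi_{27l_k}(1-\mathbf{1}^{0}_{\frac{l_{k-1}}{3}})\tilde Q u\| .
\end{aligned}
\tag{24}
\]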
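Due to the definition of $B_2(E)$, we have
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{0}_{9l_k}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}W^{\mathbb{R}^d}_{\varphi_{27l_k}}\|\le e^{-\gamma l_k} ,
\]
and the proof of (16) and (18) shows that $\sup_{\varepsilon\neq 0}\|(H(\omega)-E-i\varepsilon)^{-1}\tilde Q u\|$ is finite. From (23) we obtain $\|\mathbf{1}^{y}_{\frac{l_{k-1}}{3}}(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{0}_{\frac{l_{k-1}}{3}}\|\le e^{-\frac{\gamma}{2}|y|}$ if $l_{k-1}$ is big enough. If $\tilde Q$ has compact support, the third term in (24) vanishes for big $k$, hence we obtain (17). If $\tilde Q=\langle\cdot\rangle^{-r}$, then due to (20), the definition of $B_2(E)$ and the decay properties of $\langle\cdot\rangle^{-r}$, we obtain
\[
\|(H_{\Lambda(0,27l_k)}(\omega)-E-i\varepsilon)^{-1}\|\cdot\|(1-\mathbf{1}^{0}_{\frac{l_{k-1}}{3}})\tilde Q\|
\le l_k^{\zeta+d}\Big(1+\Big(\frac{l_{k-1}}{6}\Big)^{2}\Big)^{-\frac{r}{2}}\le 6^{r}\,l_k^{D-r} ,
\]
which implies (19).

Theorem 5.3. Choosing $E_X$ and $\gamma$ as in Theorem 4.9 there is a $Z_0\in\mathcal{B}(\Omega)$ with $\mathbb{P}(Z_0)=1$, such that for each $\omega\in Z_0$ there is an $I(\omega)\in\mathcal{B}([\inf\Sigma,E_X])$ with $\mu(I(\omega))=\mu([\inf\Sigma,E_X])$, such that the following statements hold:

(1) For all $y\in\mathbb{Q}^d$, $\omega\in Z_0$ and $E\in I(\omega)$, there are constants $r_3(\omega,E,y)$, $r_4(\omega,E,y)\in\mathbb{R}_+$, such that for all $x\in\mathbb{R}^d$ satisfying $|x|\ge r_3(\omega,E,y)$ one gets:
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|\le r_4(\omega,E,y)\,e^{-\frac{\gamma}{5}|x|} .
\]

(2) If $\wp$ is of compact support, then for all $\omega\in Z_0$ and $E\in I(\omega)$ there are constants $r_3(\omega,E,\wp)$, $r_4(\omega,E,\wp)$ such that for all $x,z\in\mathbb{R}^d$ with $|x|\ge r_3(\omega,E,\wp)$ and $|z|<R$ we have:
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}\wp(\cdot-z)\|\le r_4(\omega,E,\wp)\,e^{-\frac{\gamma}{5}|x|} .
\]

(3) For $n\in\mathbb{N}$, $\omega\in Z_0$ and $E\in I(\omega)$ there are $r_3(\omega,E,\varsigma,n)$, $r_4(\omega,E,\varsigma,n)\in\mathbb{R}_+$, such that for all $x,z\in\mathbb{R}^d$ with $|x|\ge r_3(\omega,E,\varsigma,n)$ and $|z|<R$ the following inequality is valid:
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot-z\rangle^{-\frac{\varsigma}{2}-n\frac{\varsigma}{4}}\|\le r_4(\omega,E,\varsigma,n)\,\langle x\rangle^{-n\frac{\varsigma}{4}} .
\]

Proof. Using $Q=\mathbf{1}^{y}$ in Lemma 5.2 we obtain $B_2(E,y)\in\mathcal{B}(\Omega)$ with $\mathbb{P}(B_2(E,y))=1$ and $r_3(\omega,E,y)$, $r_4(\omega,E,y)\in\mathbb{R}_+$ for each $\omega\in B_2(E,y)$, such that for $|x|\ge r_3(\omega,E,y)$ we get
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|\le r_4(\omega,E,y)\,e^{-\frac{\gamma}{5}|x|} .
\]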
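In the case that $\wp$ has compact support, we take $M:=\bigcup_{|z|<R}\mathrm{supp}(\wp(\cdot-z))$ and $Q:=\|\wp\|_\infty\mathbf{1}_M$. Using Lemma 5.2 again we get
\[
\begin{aligned}
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}\wp(\cdot-z)\|
&=\sup_{\varepsilon\neq 0}\|\wp(\cdot-z)(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|
\le\sup_{\varepsilon\neq 0}\|Q(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\| \\
&=\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}Q\|\le r_4(\omega,E,\wp)\,e^{-\frac{\gamma}{5}|x|}
\end{aligned}
\]
for $\omega\in B_2(E,Q)$, $|x|\ge r_3(\omega,E,\wp)$ with $\mathbb{P}(B_2(E,Q))=1$ and $r_3(\omega,E,\wp)$, $r_4(\omega,E,\wp)\in\mathbb{R}_+$. As $1+|x-z|^2\ge\frac12(1+|x|^2)$ for $|x|\ge 4R$, we estimate $\langle\cdot-z\rangle^{-r}\le c\langle\cdot\rangle^{-r}$ for all $|z|<R$. Taking $D=\frac{\varsigma}{2}$ and $r=\frac{\varsigma}{2}+n\frac{\varsigma}{4}$ we get sets $B_2(E,\frac{\varsigma}{2},\frac{\varsigma}{2}+n\frac{\varsigma}{4})\in\mathcal{B}(\Omega)$ with $\mathbb{P}(B_2(E,\frac{\varsigma}{2},\frac{\varsigma}{2}+n\frac{\varsigma}{4}))=1$ and $r_3(\omega,E,\varsigma,n)$, $r_4(\omega,E,\varsigma,n)\in\mathbb{R}_+$ such that for $|z|<R$ and $|x|\ge r_3(\omega,E,\varsigma,n)$:
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{x}(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot-z\rangle^{-\frac{\varsigma}{2}-n\frac{\varsigma}{4}}\|
=\sup_{\varepsilon\neq 0}\|\langle\cdot-z\rangle^{-\frac{\varsigma}{2}-n\frac{\varsigma}{4}}(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|
\le r_4(\omega,E,\varsigma,n)\,\langle x\rangle^{-n\frac{\varsigma}{4}} .
\]
Hence for all $E\in[\inf\Sigma,E_X]$ the set
\[
B_3(E):=\bigcap_{n\in\mathbb{N}}B_2\Big(E,\frac{\varsigma}{2},\frac{\varsigma}{2}+\frac{n\varsigma}{4}\Big)\cap\bigcap_{y\in\mathbb{Q}^d}B_2(E,y)
\]
or
\[
B_3(E):=\bigcap_{y\in\mathbb{Q}^d}B_2(E,y)\cap B_2(E,Q)
\]
has full measure $\mathbb{P}(B_3(E))=1$. So
\[
Y:=\bigcup_{E\in[\inf\Sigma,E_X]}B_3(E)\times\{E\}\subseteq\Omega\times[\inf\Sigma,E_X]
\]
has full outer measure $(\mathbb{P}\otimes\mu)^*(Y)=E_X-\inf\Sigma$, because for all $X\in\mathcal{B}(\Omega)\otimes\mathcal{B}([\inf\Sigma,E_X])$ with $X\supseteq Y$ every section $X(E):=\{\omega\in\Omega : (\omega,E)\in X\}$ contains $B_3(E)$, so
\[
E_X-\inf\Sigma\ge(\mathbb{P}\otimes\mu)(X)=\int_{\inf\Sigma}^{E_X}\mathbb{P}(X(E))\,dE\ge\int_{\inf\Sigma}^{E_X}\mathbb{P}(B_3(E))\,dE=E_X-\inf\Sigma .
\]
This calculation provides us with examples of sets $X\in\mathcal{B}(\Omega)\otimes\mathcal{B}([\inf\Sigma,E_X])$ with $X\supseteq Y$ and $(\mathbb{P}\otimes\mu)(X)=(\mathbb{P}\otimes\mu)^*(Y)$. Fixing such a set $X$ and due to $(\mathbb{P}\otimes\mu)^*(X\setminus Y)=0$, we can choose an $N\in\mathcal{B}(\Omega)\otimes\mathcal{B}([\inf\Sigma,E_X])$ with $N\supseteq X\setminus Y$ and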
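$(\mathbb{P}\otimes\mu)(N)=0$. Defining $I:=X\setminus N\subseteq Y$ the sections $I(\omega):=\{E\in[\inf\Sigma,E_X] : (\omega,E)\in I\}$ and the function $\Omega\to\mathbb{R}_+$, $\omega\mapsto\mu(I(\omega))$ are measurable. Due to Fubini's Theorem we get
\[
\int_\Omega\mu(I(\omega))\,d\mathbb{P}(\omega)=(\mathbb{P}\otimes\mu)(I)=E_X-\inf\Sigma ,
\]
so $\mathbb{P}\{\omega\in\Omega : \mu(I(\omega))<E_X-\inf\Sigma\}=0$ and $Z_0:=\Omega\setminus\{\omega\in\Omega : \mu(I(\omega))<E_X-\inf\Sigma\}\in\mathcal{B}(\Omega)$ has full $\mathbb{P}$-measure, which completes the proof.

6. Proof of Localization

Following the ideas of [13, 14] and [2], we note three results used for the proof of localization, before turning to the proof of our main theorem.

6.1. "P-almost sure absolute continuity"

Theorem 6.1. Let $V\in L^\infty(\mathbb{R}^d,\mathbb{R})$ and $Q\in L^\infty(\mathbb{R}^d,\mathbb{R}_+)$ and suppose that there is a nonempty open set $K\subseteq\mathbb{R}^d$ with $Q\cdot\mathbf{1}_K\ge\mathbf{1}_K$. Define $H_0:=-\Delta+V$ and for each $\alpha\in\mathbb{R}$ take $H(\alpha):=H_0+\alpha Q$ and let $\pi_\alpha$ be the spectral decomposition of $H(\alpha)$. Then for each $M\in\mathcal{B}(\mathbb{R})$ with $\mu(M)=0$ there is an $N\in\mathcal{B}(\mathbb{R})$ with $\mu(N)=0$, such that $\pi_\alpha(M)=0$ for all $\alpha\in\mathbb{R}\setminus N$.

Proof. Choosing $V=Q$ and $B=\sqrt{Q}$ in Part 1 of Lemma 2.1 we get
\[
\int_{\mathbb{R}}\frac{1}{1+\alpha^{2}}\,\langle\Phi,\sqrt{Q}\,\pi_\alpha(M)\sqrt{Q}\,\Phi\rangle\,d\alpha\le\mu(M)\|\Phi\|^{2}
\]
for all $\Phi\in L^2(\mathbb{R}^d)$. As $\mu(M)=0$ we get $N(M,\Phi)\in\mathcal{B}(\mathbb{R})$ with $\mu(N(M,\Phi))=0$ and $\langle\Phi,\sqrt{Q}\,\pi_\alpha(M)\sqrt{Q}\,\Phi\rangle=\|\pi_\alpha(M)\sqrt{Q}\,\Phi\|^{2}=0$ for all $\alpha\in\mathbb{R}\setminus N(M,\Phi)$. Taking a countable dense subset $F\subseteq L^2(\mathbb{R}^d)$, we have $\pi_\alpha(M)\sqrt{Q}\,\Phi=0$ and $0=g(H(\alpha))\pi_\alpha(M)\sqrt{Q}\,\Phi=\pi_\alpha(M)g(H(\alpha))\sqrt{Q}\,\Phi$ for all $\Phi\in L^2(\mathbb{R}^d)$, $\alpha\in\mathbb{R}\setminus\bigcup_{\phi\in F}N(M,\phi)$ and $g\in L^\infty(\pi_\alpha)$. As $Q\cdot\mathbf{1}_K\ge\mathbf{1}_K$ the subspace $\mathrm{lin}\{g(H(\alpha))\sqrt{Q}\,\Phi : g\in L^\infty(\pi_\alpha),\ \Phi\in L^2(\mathbb{R}^d)\}$ is dense in $L^2(\mathbb{R}^d)$ due to [2, Proposition A2.2] and Theorem 6.1 is therefore proven.

6.2. Distributional eigenfunctions

Theorem 6.2. Let $s>\frac{d}{2}$ and $V\in L^\infty(\mathbb{R}^d,\mathbb{R})$. For the Schrödinger operator $H=-\Delta+V$ we take a spectral control measure $\nu$. Then for $\nu$-almost every $E\in\mathbb{R}$ there is a distributional solution $\Psi$ of the eigenvalue equation $H\Psi=E\Psi$ with $\Psi\in H^{1}_{-s}(\mathbb{R}^d)\setminus\{0\}=\{f:\mathbb{R}^d\to\mathbb{C} : f\neq 0\ \text{measurable},\ \langle\cdot\rangle^{-s}f\in H^{1}(\mathbb{R}^d)\}$.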
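Proof. With $S=\sqrt{-\Delta+1}$, $T=\langle\cdot\rangle^{s}$ and $\gamma(\alpha):=(\alpha-z)^{-m}$, $\alpha\in\mathbb{R}$, for a given $z\in\mathbb{C}\setminus\mathbb{R}$, $m\in\mathbb{N}$, $m>\frac12+\frac{d}{4}$, [16, Theorem 3.6] shows that the assumptions of [17, Theorems 1 and 2] are satisfied, so this theorem follows.

6.3. Perturbation theory

Theorem 6.3. Let $Z_0$ be the set constructed in Theorem 5.3 and for $\omega\in Z_0$ take $E\in J(\omega):=\{E\in I(\omega) : E\ \text{is not an eigenvalue of}\ H(\omega)\}$, $|z|<R$, $\alpha\in\mathbb{R}\setminus\{0\}$ and a distributional solution $\Psi\in H^{1}_{-\frac{\varsigma}{4}}(\mathbb{R}^d)$ of the equation
\[
H(\omega)\Psi+\alpha\wp(\cdot-z)\Psi=E\Psi . \tag{25}
\]
Then $\Psi$ decays as follows:
• If $\wp$ is of compact support and $\gamma$ as in Theorem 4.9, then there is a $C=C(\Psi,\alpha)>0$, such that for all $y\in\mathbb{R}^d$ one gets: $|\Psi(y)|\le Ce^{-\frac{\gamma}{5}|y|}$.
• In the long range case for all $t>0$ there is a $C_t=C_t(\Psi,\alpha)>0$ such that $|\Psi(y)|\le C_t\langle y\rangle^{-t}$ for all $y\in\mathbb{R}^d$.

Proof. All potentials are bounded, hence the subsolution estimate (see [1, Theorem 2.4]) implies for distributional solutions $\Psi\in H^{1}_{-\frac{\varsigma}{4}}(\mathbb{R}^d)$ of (25), that $\Psi$ is a continuous function and there is a constant $c$ independent of $y$, such that
\[
|\Psi(y)|\le c\int_{|y-x|<\frac12}|\Psi(x)|\,dx .
\]
Application of the Hölder inequality shows $|\Psi(y)|\le c\|\mathbf{1}^{y}\Psi\|_{L^2}$. To estimate $\|\mathbf{1}^{y}\Psi\|_{L^2}$ we note that the spectral calculus implies
\[
\lim_{\varepsilon\to 0}(H(\omega)-E-i\varepsilon)^{-1}(H(\omega)-E)\Phi=\Phi-\pi(\omega,\{E\})\Phi
\]
for all $E\in\mathbb{R}$ and $\Phi\in D(H(\omega))$, where $\pi(\omega,\cdot)$ is the spectral decomposition of $H(\omega)$. Next we choose an Urysohn function $\varphi_n$ of the box $\Lambda(0,2n+1)$ with $\varphi_n|_{\Lambda(0,2n-1)}=1$. Due to $\Psi\in H^{1}_{-\frac{\varsigma}{4}}(\mathbb{R}^d)$ we get $\langle\cdot\rangle^{-\frac{\varsigma}{4}}\partial_j\Psi\in L^2(\mathbb{R}^d)$, hence $\nabla\varphi_n\cdot\nabla\Psi\in L^2(\mathbb{R}^d)$. Inserting the distributional eigenvalue equation $-\Delta\Psi=E\Psi-V(\omega)\Psi-\alpha\wp(\cdot-z)\Psi$ in the Leibniz product rule
\[
\Delta(\varphi_n\Psi)=\sum_{k=1}^{d}\big[\Psi(\partial_k^{2}\varphi_n)+\varphi_n\partial_k^{2}\Psi+2\,\partial_k\varphi_n\,\partial_k\Psi\big]
=(\Delta\varphi_n)\Psi+\varphi_n\Delta\Psi+2\nabla\varphi_n\cdot\nabla\Psi
\]
implies $\varphi_n\Psi\in D(-\Delta)$. As $E\in J(\omega)$ is not an eigenvalue of $H(\omega)$, we conclude
\[
\lim_{\varepsilon\to 0}(H(\omega)-E-i\varepsilon)^{-1}(H(\omega)-E)\varphi_n\Psi=\varphi_n\Psi .
\]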
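Given $y\in\mathbb{Q}^d$ we choose $n\in\mathbb{N}$ with $\Lambda(y,1)\subseteq\Lambda(0,2n-1)$, then inserting $(H(\omega)-E)\varphi_n\Psi$:
\[
\begin{aligned}
\|\mathbf{1}^{y}\Psi\|=\|\mathbf{1}^{y}\varphi_n\Psi\|
&=\big\|\mathbf{1}^{y}\lim_{\varepsilon\to 0}(H(\omega)-E-i\varepsilon)^{-1}(H(\omega)-E)(\varphi_n\Psi)\big\| \\
&=\big\|\mathbf{1}^{y}\lim_{\varepsilon\to 0}(H(\omega)-E-i\varepsilon)^{-1}\big(-(\Delta\varphi_n)\Psi-2\nabla\varphi_n\cdot\nabla\Psi-\alpha\wp(\cdot-z)\varphi_n\Psi\big)\big\| \\
&\le\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}(\Delta\varphi_n)\Psi\|
+2\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\nabla\varphi_n\cdot\nabla\Psi\| \\
&\quad+|\alpha|\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\wp(\cdot-z)\varphi_n\Psi\| .
\end{aligned}
\tag{26}
\]
As $\mathrm{supp}\,\partial^\beta\varphi_n\subseteq\Lambda(0,2n+1)\setminus\Lambda(0,2n-1)=:M(n)$ for all $\beta\in\mathbb{N}_0^d$ with $|\beta|\ge 1$ we get
\[
\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\Delta\varphi_n\Psi\|
\le\|\Delta\varphi_n\|_\infty\sum_{\substack{x\in\mathbb{Z}^d\\ |x|=n}}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|\,\|\mathbf{1}_{M(n)}\Psi\| ,
\]
and
\[
\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\nabla\varphi_n\cdot\nabla\Psi\|
\le d\,\|\nabla\varphi_n\|_\infty\sum_{\substack{x\in\mathbb{Z}^d\\ |x|=n}}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\mathbf{1}^{x}\|\,\|\mathbf{1}_{M(n)}\partial^{k}\Psi\| .
\]
As $\|\mathbf{1}_{M(n)}\partial^{k}\Psi\|,\ \|\mathbf{1}_{M(n)}\Psi\|\le(1+(n+\frac12)^{2})^{\frac{\varsigma}{8}}\|\Psi\|_{H^{1}_{-\frac{\varsigma}{4}}}$, we obtain a polynomial $p(n)$, such that after inserting the estimates of Theorem 5.3 for $|x|=n\ge r_3(\omega,E,y)$
\[
\|\mathbf{1}^{y}\Psi\|\le p(n)\,\|\Psi\|_{H^{1}_{-\frac{\varsigma}{4}}}\,e^{-\frac{\gamma}{5}n}+|\alpha|\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\varphi_n\wp(\cdot-z)\Psi\| . \tag{27}
\]
Estimating the last term of (27) we consider first the short range case for $|y|\ge r_3(\omega,E,\wp)$ and for $n$ big enough, such that $\varphi_n\wp(\cdot-z)=\wp(\cdot-z)$:
\[
\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\varphi_n\wp(\cdot-z)\Psi\|
\le\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\wp(\cdot-z)\|\,\|\mathbf{1}_{\mathrm{supp}\,\wp(\cdot-z)}\Psi\|
\le r_4(\omega,E,\wp)\,e^{-\frac{\gamma}{5}|y|}\,\|\mathbf{1}_{\mathrm{supp}\,\wp(\cdot-z)}\Psi\| .
\]
Combining the results for $n\to\infty$ we get
\[
|\Psi(y)|\le c\|\mathbf{1}^{y}\Psi\|\le c|\alpha|\,r_4(\omega,E,\wp)\,\|\mathbf{1}_{\mathrm{supp}\,\wp(\cdot-z)}\Psi\|\,e^{-\frac{\gamma}{5}|y|} \tag{28}
\]
and this result extends to all $y\in\mathbb{R}^d$ by continuity of $\Psi$. In the long range case we have to do a sort of induction process. First note that $\langle\cdot-z\rangle^{-\frac{\varsigma}{4}}\Psi,\ \wp(\cdot-z)\Psi\in L^2(\mathbb{R}^d)$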
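hence
\[
\begin{aligned}
\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\varphi_n\wp(\cdot-z)\Psi\|
&\le\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}(\wp(\cdot-z)\langle\cdot-z\rangle^{\frac{\varsigma}{4}})\|\,\|\langle\cdot-z\rangle^{-\frac{\varsigma}{4}}\varphi_n\Psi\| \\
&\le c_1\|\langle\cdot-z\rangle^{-\frac34\varsigma}(H(\omega)-E+i\varepsilon)^{-1}\mathbf{1}^{y}\|\,\|\Psi\|_{H^{1}_{-\frac{\varsigma}{4}}} .
\end{aligned}
\]
Using Theorem 5.3 we estimate
\[
\sup_{\varepsilon\neq 0}\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot-z\rangle^{-\frac34\varsigma}\|\le r_4(\omega,E,\varsigma,1)\,\langle y\rangle^{-\frac{\varsigma}{4}}
\]
for $|y|\ge r_3(\omega,E,\varsigma,1)$. Thus for $n\to\infty$ Eq. (27) implies
\[
|\Psi(y)|\le c\|\mathbf{1}^{y}\Psi\|\le c\,c_1|\alpha|\,r_4(\omega,E,\varsigma,1)\,\|\Psi\|_{H^{1}_{-\frac{\varsigma}{4}}}\,\langle y\rangle^{-\frac{\varsigma}{4}} . \tag{29}
\]
This estimate yields $\Psi\in L^2(\mathbb{R}^d)$, so we are able to improve the estimate
\[
\begin{aligned}
\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\varphi_n\wp(\cdot-z)\Psi\|
&\le\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\wp(\cdot-z)\|\,\|\Psi\| \\
&\le c_1\|\langle\cdot-z\rangle^{-\varsigma}(H(\omega)-E+i\varepsilon)^{-1}\mathbf{1}^{y}\|\,\|\Psi\|
=c_1\|\mathbf{1}^{y}(H(\omega)-E-i\varepsilon)^{-1}\langle\cdot-z\rangle^{-\varsigma}\|\,\|\Psi\|
\end{aligned}
\]
in order to get
\[
|\Psi(y)|\le c\|\mathbf{1}^{y}\Psi\|\le c|\alpha|c_1\,r_4(\omega,E,\varsigma,2)\,\|\Psi\|\,\langle y\rangle^{-2\frac{\varsigma}{4}}
\]
for $|y|\ge r_3(\omega,E,\varsigma,2)$. Continuing we end at
\[
|\Psi(y)|\le c|\alpha|c_1\,\|\langle\cdot\rangle^{(n-2)\frac{\varsigma}{4}}\Psi\|\,r_4(\omega,E,\varsigma,n)\,\langle y\rangle^{-n\frac{\varsigma}{4}}
\]
for $|y|\ge r_3(\omega,E,\varsigma,n)$ and for $n\frac{\varsigma}{4}\ge t$. Choosing
\[
C_t(\Psi,\alpha)\ge c|\alpha|c_1\,r_4(\omega,E,\varsigma,n)\,\|\langle\cdot\rangle^{(n-2)\frac{\varsigma}{4}}\Psi\| ,
\]
we get $|\Psi(y)|\le C_t(\Psi,\alpha)\langle y\rangle^{-t}$ for all $y\in\mathbb{R}^d$.

6.4. Proof of the main theorem

We first gather some facts from the last sections, which imply pure point spectrum for operators "perturbed" at site 0: As it was seen in the construction of $Z_0$, for each $\omega\in Z_0$ we have $\mu([\inf\Sigma,E_X]\setminus J(\omega))=0$. So due to Theorem 6.1 we get $S(\omega,z)\subseteq\mathbb{R}$ with $\mu(\mathbb{R}\setminus S(\omega,z))=0$ and $\nu_{\alpha,z,\omega}([\inf\Sigma,E_X]\setminus J(\omega))=0$ for every $\omega\in Z_0$, $|z|<R$ and $\alpha\in S(\omega,z)$, where $\nu_{\alpha,z,\omega}$ is a spectral control measure of the operator $H(\omega)+\alpha\wp(\cdot-z)$. On the other hand for $\nu_{\alpha,z,\omega}$-almost all $E\in\mathbb{R}$ there is a distributional solution $\Psi\in H^{1}_{-\frac{\varsigma}{4}}(\mathbb{R}^d)$ of the eigenvalue equation $(H(\omega)+\alpha\wp(\cdot-z))\Psi=E\Psi$. Applying Theorem 6.3 for $|z|<R$ and $E\in J(\omega)$ these distributional eigensolutions are $L^2$-eigenfunctions, decaying as stated in Theorem 6.3. Hence the interval $[\inf\Sigma,E_X]$ consists $\nu_{\alpha,z,\omega}$-almost sure of eigenvalues of $H(\omega)+\alpha\wp(\cdot-z)$ and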
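$\sigma(H(\omega)+\alpha\wp(\cdot-z))\cap[\inf\Sigma,E_X]$ is therefore pure point. In the second step of the proof, we construct a set $Z_1$ of full $\mathbb{P}$-measure, such that for each $\bar\omega\in Z_1$ the operator $H(\bar\omega)$ equals such an operator perturbed at site 0 of pure point spectrum as we have just constructed. To do this we consider $\Omega$ as a product of the probability spaces $\mathbb{R}=(\mathbb{R},\mathbb{P}_0)$, $\mathbb{R}^d=(\mathbb{R}^d,\mathbb{P}_1)$ and $(\hat\Omega,\hat{\mathbb{P}})=((\mathbb{R}\times\mathbb{R}^d)^{\otimes(\mathbb{Z}^d\setminus\{0\})},(\mathbb{P}_0\otimes\mathbb{P}_1)^{\otimes(\mathbb{Z}^d\setminus\{0\})})$, writing $\omega=(\omega_0',\omega_0'',\hat\omega)\in\Omega$ with $\omega_0'\in\mathbb{R}$, $\omega_0''\in\mathbb{R}^d$ and $\hat\omega\in\hat\Omega$. For each $\alpha\in\mathbb{R}$ let $M(\alpha):=\{(\hat\omega,z)\in\hat\Omega\times\mathbb{R}^d : (\alpha,z,\hat\omega)\in Z_0\}$ denote the section through $Z_0$ at $\alpha$. As
\[
1=\mathbb{P}(Z_0)=\int_{\mathbb{R}}(\hat{\mathbb{P}}\otimes\mathbb{P}_1)(M(\alpha))\,d\mathbb{P}_0(\alpha) ,
\]
we choose $\alpha_0\in\mathbb{R}$ with $(\hat{\mathbb{P}}\otimes\mathbb{P}_1)(M(\alpha_0))=1$. For $(\hat\omega,z)\in\hat\Omega\times\mathbb{R}^d$ respectively $(\omega,z)\in\Omega\times\mathbb{R}^d$ define $\bar\omega(z)=(\bar\omega(z)_j)_{j\in\mathbb{Z}^d}$ and $\delta(\alpha,z)=(\delta(\alpha,z)_j)_{j\in\mathbb{Z}^d}\in\Omega$ taking
\[
\bar\omega(z)_j:=\begin{cases}(\alpha_0,z) & : j=0,\\ \omega_j & : j\neq 0,\end{cases}
\qquad\text{and}\qquad
\delta(\alpha,z)_j:=\begin{cases}(\alpha,z) & : j=0,\\ 0 & : j\neq 0.\end{cases} \tag{30}
\]
Then we define
\[
\begin{aligned}
Z_1&:=\{(\bar\omega(z)+\delta(\alpha,z)) : \bar\omega(z)\in Z_0,\ |z|<R,\ \alpha\in S(\bar\omega(z),z)\}\subseteq\Omega ,\\
Z_2&:=\{\omega\in\Omega : H(\omega)\ \text{is pure point in}\ [\inf\Sigma,E_X]\ \text{and the eigenfunctions decay as in Theorem 6.3}\}\subseteq\Omega
\end{aligned}
\]
and for $\hat\omega\in\hat\Omega$, $|z|<R$ the section through $Z_1$ at $(\hat\omega,z)$
\[
S_1(\hat\omega,z):=\{\alpha\in\mathbb{R} : \bar\omega(z)+\delta(\alpha,z)\in Z_1\} .
\]
Using the definition of $Z_1$
\[
\begin{aligned}
S_1(\hat\omega,z)&=\{\alpha\in\mathbb{R} : (\bar\omega(z)+\delta(\alpha,z))\in Z_1\}
=\{\alpha\in\mathbb{R} : \bar\omega(z)\in Z_0,\ \alpha\in S(\bar\omega(z),z)\} \\
&=\begin{cases}\emptyset & : \bar\omega(z)\notin Z_0\\ S(\bar\omega(z),z) & : \bar\omega(z)\in Z_0\end{cases}
=\begin{cases}\emptyset & : (\hat\omega,z)\notin M(\alpha_0),\\ S(\bar\omega(z),z) & : (\hat\omega,z)\in M(\alpha_0),\end{cases}
\end{aligned}
\]
and as $\hat{\mathbb{P}}(M(\alpha_0))=1$ and $\mathbb{P}_0(S(\bar\omega(z),z))=1$ for all $(\hat\omega,z)\in M(\alpha_0)$ we compute
\[
\begin{aligned}
\mathbb{P}(Z_1)&=\int_{\hat\Omega\times\mathbb{R}^d}\int_{\mathbb{R}}\mathbf{1}_{Z_1}(\alpha_0+\alpha,z,\hat\omega)\,d\mathbb{P}_0(\alpha)\,d\mathbb{P}_1(z)\,d\hat{\mathbb{P}}(\hat\omega) \\
&=\int_{M(\alpha_0)}\int_{\mathbb{R}}\mathbf{1}_{S_1(\hat\omega,z)}(\alpha)\,d\mathbb{P}_0(\alpha)\,d\mathbb{P}_1(z)\,d\hat{\mathbb{P}}(\hat\omega) \\
&=\int_{M(\alpha_0)}\int_{\mathbb{R}}\mathbf{1}_{S(\bar\omega(z),z)}(\alpha)\,d\mathbb{P}_0(\alpha)\,d\mathbb{P}_1(z)\,d\hat{\mathbb{P}}(\hat\omega)
\end{aligned}
\]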
\[
=\int_{M(\alpha_0)}\mathbb{P}_0(S(\bar\omega(z),z))\,d\mathbb{P}_1(z)\,d\hat{\mathbb{P}}(\hat\omega)
=\int_{M(\alpha_0)}d\hat{\mathbb{P}}(\hat\omega)\,d\mathbb{P}_1(z)=1 .
\]
In view of
\[
H(\bar\omega(z))+\alpha\wp(\cdot-z)=-\Delta+\sum_{\substack{j\in\mathbb{Z}^d\\ j\neq 0}}\omega_j'\,\wp(\cdot-j-\omega_j'')+(\alpha_0+\alpha)\wp(\cdot-z)=H(\bar\omega(z)+\delta(\alpha,z)) ,
\]
we get $Z_1\subseteq Z_2$ and hence the conclusion.

Acknowledgments

This paper is a summary of my diploma thesis [25] in mathematics and physics at Universität Regensburg. I thank W. Hackenbroch, who initiated the theme; K. Barbey, who spent a lot of time in discussions of the forthcoming mathematical problems, which were a great help in making some arguments rigorous; U. Krey for having a look at the physical side of the problem; and V. Bach for various suggestions in the preparation of this article.

References

[1] H. Cycon, R. Froese, W. Kirsch and B. Simon, Schrödinger Operators, with Application to Quantum Mechanics and Global Geometry, Texts and Monographs in Physics, Springer, 1987.
[2] J. M. Combes and P. D. Hislop, "Localization for some continuous random Hamiltonians in d-dimensions", J. Funct. Anal. 124 (1994) 149–180.
[3] J. M. Combes and P. D. Hislop, "Landau Hamiltonians with random potentials: localization and the density of states", Comm. Math. Phys. 177 (1996) 603–629.
[4] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators, Probability and Its Applications, Birkhäuser, 1990.
[5] H. von Dreifus and A. Klein, "A new proof of localization in the Anderson tight binding model", Comm. Math. Phys. 124 (1989) 285–299.
[6] F. Delyon, Y. Lévy and B. Souillard, "Anderson localization for multidimensional systems at large disorder or low energy", Comm. Math. Phys. 100 (1985) 463–470.
[7] A. Figotin and A. Klein, "Localization of classical waves I: acoustic waves", Comm. Math. Phys. 180 (1996) 439–482.
[8] W. Fischer, H. Leschke and P. Müller, "Spectral localization by Gaussian random potentials in multi-dimensional continuous space", J. Statistical Phys. 101(5/6) (2000) 935–985.
[9] J. Fröhlich and T. Spencer, "Absence of diffusion in the Anderson tight binding model for large disorder or low energy", Comm. Math. Phys. 88 (1983) 151–184.
[10] J. Fröhlich, F. Martinelli, E. Scoppola and T. Spencer, "A constructive proof of localization in Anderson tight binding model", Comm. Math. Phys. 101 (1985) 21–46.
[11] F. Germinet and A. Klein, "Bootstrap multiscale analysis and localization in random media", Comm. Math. Phys. 222 (2001) 415–448.
[12] W. Hackenbroch and A. Thalmeier, Stochastische Analysis, Teubner, 1994.
[13] W. Kirsch, P. Stollmann and G. Stolz, "Localization for random perturbations of periodic Schrödinger operators", to appear in Random Operators and Stochastic Equations.
[14] W. Kirsch, P. Stollmann and G. Stolz, "Anderson localization for random Schrödinger operators with long range interactions", preprint, 1997.
[15] W. Kirsch, P. Stollmann and G. Stolz, "Anderson localization for random Schrödinger operators with long range interactions", Comm. Math. Phys. 195 (1998) 495–507.
[16] T. Poerschke and G. Stolz, "On eigenfunction expansions and scattering theory", Math. Z. 212 (1993) 337–357.
[17] T. Poerschke, G. Stolz and J. Weidmann, "Expansions in generalized eigenfunctions of selfadjoint operators", Math. Z. 202 (1989) 397–408.
[18] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV. Analysis of Operators, Academic Press, New York, 1979.
[19] B. Simon, "Lifschitz tails for the Anderson model", J. Stat. Phys. 38(1/2) (1985) 65–76.
[20] B. Simon and T. Wolff, "Singular continuous spectrum under rank one perturbation and localization for random Hamiltonians", Comm. Pure Appl. Math. 39 (1986) 75–90.
[21] P. Stollmann, Caught by Disorder, Bound States in Random Media, Birkhäuser Boston, 2001.
[22] D. W. Stroock, Probability Theory, An Analytic View, Cambridge University Press, 1993.
[23] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics 79, Springer, 1982.
[24] J. Weidmann, Linear Operators in Hilbert Spaces, Graduate Texts in Mathematics Vol. 68, Springer Verlag, Berlin, 1980.
[25] H. Zenk, Anderson–Lokalisierung bei Random Schrödinger Operatoren in mehreren Dimensionen, Diplomarbeit, Regensburg, 1997 and Spektraleigenschaften von Schrödingeroperatoren in ungeordneten Systemen untersucht mit Methoden der mathematischen Physik, Diplomarbeit, Regensburg, 1998.
March 19, 2002 12:11 WSPC/148-RMP
00118
Reviews in Mathematical Physics, Vol. 14, No. 3 (2002) 303–316 c World Scientific Publishing Company
FORMAL AND ANALYTIC DEFORMATIONS FROM WITT TO VIRASORO
L. GUERRINI
Università degli Studi di Bologna, Dipartimento di Matematica per le Scienze Economiche e Sociali, Viale Filopanti 5, I-40126, Bologna, Italy
[email protected]
Received 2 March 2001 We introduce a new family WF of deformations of the Witt algebra W, F varying in the space of all polynomials with vanishing constant terms, and show the existence of an isomorphism of its formal and analytic completions with those of the Witt algebra. Central extensions of this algebra are considered and the existence of an isomorphism between their formal and analytic completions with those of the Virasoro algebra is proved. Keywords: Deformations; Witt algebra; Virasoro algebra.
1. Introduction

Let $(G,[\cdot,\cdot])$ be a complex Lie algebra. A formal deformation $G_t$ of $G$ is a formal power series
\[
[g,h]_t=[g,h]+\sum_{n\ge 1}F_n(g,h)\,t^{n}
\]
with $F_n:G\times G\to G$ bilinear antisymmetric maps, such that
\[
\Big[\sum_{n=0}^{\infty}g_n t^{n},\sum_{m=0}^{\infty}h_m t^{m}\Big]_t=\sum_{j=0}^{\infty}t^{j}\sum_{i+n+m=j}F_i(g_n,h_m)
\]
is a Lie algebra structure on $G[[t]]$, the space of formal power series on $G$ in the complex parameter $t$. In particular the map $F_1$ is a 2-cocycle on $G$ with coefficients in the adjoint representation of $G$. Two deformations $[\cdot,\cdot]_t$ and $[\cdot,\cdot]'_t$ of $G$ are said to be formally equivalent if there exists a transformation
\[
\Phi_t:G\to G_t ,\qquad \Phi_t=1+\sum_{i=1}^{\infty}t^{i}\varphi_i
\]
with $\varphi_i:G\to G$ linear, such that for each $g,h\in G_t$
\[
\Phi_t([g,h]'_t)=[\Phi_t(g),\Phi_t(h)]_t .
\]
A formal deformation equivalent to the null deformation is called trivial. One has the obvious notion of a deformation of order $k$, and equivalence simply means that one replaces the formal power series by formal power series truncated at the order $k$. For $k=1$, one speaks of infinitesimal deformations. Since two deformations beginning with $[\cdot,\cdot]+tF_1$ and $[\cdot,\cdot]+tG_1$ are equivalent up to order one if and only if $F_1-G_1$ is a coboundary, the second cohomology group $H^2(G,G)$ classifies equivalence classes of infinitesimal deformations. A Lie algebra is called formally rigid if each of its formal deformations is trivial, that is if there exists a Lie algebra isomorphism
\[
G_t\simeq G[[t]]\quad\text{over } \mathbb{C}[[t]] .
\]
It is a standard result [4] that $H^2(G,G)=0$ implies $G$ is formally rigid. One has an actual deformation of the Lie algebra $G$ if by replacing the parameter $t$ by a number in a formal or order $k$ deformation, one gets a Lie algebra structure on $G$. A Lie algebra $G$ is called analytically rigid if, for small $|t|$, one has a Lie algebra isomorphism
\[
G_t\simeq G\quad\text{over } \mathbb{C} .
\]
The question whether formal rigidity implies analytic rigidity was investigated in [5–7] for certain special deformations $W_f$ of the Witt algebra $W$, where the parameter $f$ runs in the space of even polynomials with vanishing constant terms. Since $W_f$ is a deformation of $W$, and $W$ is rigid because $H^2(W,W)=0$ [2], there should exist, for each $f$, a formal isomorphism $S_f:W_f\to W$. The construction of this expected isomorphism however showed that one should work with suitable completions (adic and analytic) of the Lie algebras $W_f$ and $W$. Moreover, it was proved [7] that this map is analytic in $f$, in the sense that there exists a neighborhood $N$ of 0 in $C^\infty(S^1)$ such that if $M$ is any finite dimensional subspace of $C^\infty(S^1)$, then, for $f\in N\cap M$, $S_f$ is holomorphic in $f$. Motivated by the fact that the Witt algebra $W$ has a one-dimensional central extension, the Virasoro algebra $\mathrm{Vir}$, and that $H^2(\mathrm{Vir},\mathrm{Vir})=0$ [2], one could define central extensions of the family $W_f$ and see if it is possible to extend the above results to the case of the centrally extended algebras, that is to construct explicitly the corresponding isomorphisms.

In Sec. 2, a new family $W_F$ of deformations of the Witt algebra is introduced, the parameter $F$ varying in the space of all polynomials with vanishing constant terms. Adic and analytic completions of this family are studied and it is shown that they are isomorphic to those of the family $W_f$. In Sec. 3, an isomorphism between the completions of the Lie algebras $W_f$ and $W_F$ is shown. The study of central extensions of the family $W_f$ is therefore reduced to those of the family $W_F$.
In Sec. 4, a one-dimensional central extension $\mathrm{Vir}_F$ of $W_F$ is defined and a formal isomorphism between the adic completions of $\mathrm{Vir}$ and $\mathrm{Vir}_F$ is explicitly constructed. In Sec. 5, the results of the previous section are extended to the analytic completions of the Lie algebras $\mathrm{Vir}$ and $\mathrm{Vir}_F$.

2. The Algebra $W_F$ and Its Formal and Analytic Completions

Let
\[
X=\bigoplus_{n\in\mathbb{Z}}\mathbb{C}L_n
\]
be an infinite dimensional vector space over $\mathbb{C}$ with basis $\{L_n\}_{n\in\mathbb{Z}}$. One can make $X$ into a Lie algebra by defining multiplication through
\[
[L_m,L_n]=(n-m)L_{m+n}
\]
for all $m,n\in\mathbb{Z}$. This Lie algebra is called the Witt algebra and is denoted by $W$. This algebra arises naturally when one considers the space of complex vector fields on the unit circle $S^1=\{e^{2\pi i\theta},\ \theta\in\mathbb{R}\}$, that is the space of linear operators $g\,d/d\theta$ on the space of complex valued $C^\infty$-functions on $S^1$. $W$ is in fact the subspace of vector fields $g\,d/d\theta$ for which $g$ has a finite Fourier series expansion, the Lie algebra structure given by
\[
[g\,d/d\theta,\ h\,d/d\theta]=(g\,dh/d\theta-h\,dg/d\theta)\,d/d\theta .
\]
Let $z=e^{2\pi i\theta}$, then, through the assignment $g(z)\to g(z)\,d/dz$, one can identify the Lie algebra $W$ with the space of Laurent polynomials
\[
g=\sum_{|m|\le M}c_m z^{m}\qquad(c_m\in\mathbb{C})
\]
with bracket
\[
[g,h]=gh'-g'h .
\]
A basis for this space is clearly given by $\hat l_n=z^{n+1}$ ($n\in\mathbb{Z}$). Let $P$ be the space of polynomials with vanishing constant terms and let $F\in P$. Let $W_F$ be the Lie algebra whose vector space structure is the same as the algebra $W$ and bracket
\[
[g,h]_F=(1+F)[g,h] .
\]
Since $W_0=W$, the family $\{W_F\}_{F\in P}$ is a deformation of $W$ with the infinite dimensional parameter space $P$.

Let $\hat W$ be the adic completion of $W$, that is one replaces the underlying vector space of Laurent polynomials by the vector space of formal Laurent series
\[
\sum_{m\ge -M}c_m z^{m}
\]
and bracket structure is the same. Let $\hat P$ be the space $\{\hat u_1 z + \hat u_2 z^2 + \cdots\ (\hat u_j \in \mathbb{C})\}$. The definition of $[\cdot\,,\cdot]_F$ makes sense also for $g, h \in \hat W$, $F \in \hat P$ and $[\cdot\,,\cdot]_F$ is continuous in the adic topology.

Let $T_k : \hat W \to \hat W$ ($k \ge 1$) be the continuous linear map
$$T_k g = a_k F^k g$$
with $\{a_k\}_{k\ge 1}$ scalars uniquely determined from the power series identity in $z$
$$(1 + z)\prod_{k=1}^{\infty}(1 + a_k z^k) = 1 .$$
Since [7]
$$a_k = \frac{(-1)^k}{k} + \sum_{\substack{1\le d<k \\ d\mid k}} (-1)^{k/d}\,\frac{d}{k}\,a_d^{\,k/d}, \qquad \text{for all } k \ge 1,$$
one has
$$a_k = \begin{cases} -1, & \text{if } k = 1, \\ 1, & \text{if } k = 2^r,\ r \ge 1, \\ 0, & \text{if } 1 < k \ne 2^r . \end{cases} \tag{2.1}$$
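The closed form (2.1) can be checked numerically. The following is a minimal sketch, not taken from [7]: it solves the product identity order by order, evaluates the divisor recursion for $a_k$, and compares both with (2.1). The function names are illustrative only.

```python
from fractions import Fraction


def product_coefficients(n_terms):
    """Solve (1 + z) * prod_{k>=1} (1 + a_k z^k) = 1 for a_1..a_n, order by order."""
    # Coefficients of the running partial product, truncated at z^n_terms.
    series = [Fraction(0)] * (n_terms + 1)
    series[0] = Fraction(1)
    series[1] = Fraction(1)  # start from the factor (1 + z)
    a = {}
    for k in range(1, n_terms + 1):
        # All coefficients of z^1..z^(k-1) already vanish, so the factor
        # (1 + a_k z^k) has to cancel the current coefficient of z^k.
        a[k] = -series[k]
        # Multiply the truncated series by (1 + a_k z^k), highest power first.
        for j in range(n_terms, k - 1, -1):
            series[j] += a[k] * series[j - k]
    return a


def recursion_coefficients(n_terms):
    """a_k = (-1)^k / k + sum over proper divisors d of k of (-1)^(k/d) (d/k) a_d^(k/d)."""
    a = {}
    for k in range(1, n_terms + 1):
        total = Fraction((-1) ** k, k)
        for d in range(1, k):
            if k % d == 0:
                total += Fraction((-1) ** (k // d) * d, k) * a[d] ** (k // d)
        a[k] = total
    return a


def closed_form(k):
    """Formula (2.1): -1 for k = 1, 1 for k a power of two >= 2, 0 otherwise."""
    if k == 1:
        return -1
    return 1 if k & (k - 1) == 0 else 0
```

For instance, `product_coefficients(8)` gives $a_1 = -1$, $a_2 = a_4 = a_8 = 1$ and zero for the remaining indices, reflecting the telescoping $(1+z)(1-z)(1+z^2)(1+z^4)\cdots$.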
The next result shows the existence of an isomorphism between $\hat W$ and $\hat W_F$.

Theorem 2.1. Let
$$S_F =\ :\prod_{k=1}^{\infty}(1 + T_k):$$
denote the limit (in the adic topology)
$$\lim_{N\to\infty}(1 + T_1)(1 + T_2)\cdots(1 + T_N) .$$
Then $S_F$ is an isomorphism between $\hat W$ and $\hat W_F$.

Proof. Same proof as [7, Theorem 14].

Let $\hat W^\infty$ (respectively $\hat W_F^\infty$) be the analytic analogue of the Lie algebra $\hat W$ (respectively $\hat W_F$), that is the underlying vector space $\hat W$ is replaced by the space
$$\hat W^\infty = C^\infty(S^1) .$$
Let the parameter $F \in C^\infty(S^1)$. $C^\infty(S^1)$ is a Fréchet space with topology determined by the norms $\|\cdot\|_{(m)}$ ($m \ge 1$)
$$\|g\|_{(m)} = \sum_{0\le r\le m}\sup\left|\frac{d^r}{d\theta^r}\,g(e^{2\pi i\theta})\right|$$
with $\|\cdot\|_{(1)} \le \|\cdot\|_{(2)} \le \cdots \le \|\cdot\|_{(m)} \le \cdots$. In this topology $g_n \to g$ if and only if $(d^r/d\theta^r)g_n \to (d^r/d\theta^r)g$ uniformly for $r = 0, 1, 2, \ldots$. One speaks of norms $\|\cdot\|_{(m)}$ instead of seminorms since this is sufficient for the application one has in mind. Let $C^\infty(S^1)_{(m)}$ be the completion of $C^\infty(S^1)$ by $\|\cdot\|_{(m)}$. Then
$$C^\infty(S^1) \subset \cdots \subset C^\infty(S^1)_{(m)} \subset C^\infty(S^1)_{(m-1)} \subset \cdots \subset C^\infty(S^1)_{(1)}$$
and
$$C^\infty(S^1) = \bigcap_{m=1}^{\infty} C^\infty(S^1)_{(m)} .$$
Sobolev's classical result [1] says that the above can be achieved by another set of norms
$$\|g\|^{(m)} = \sum_{0\le r\le m}\sum_{n} |n|^r\,|\hat g(n)| , \qquad \|g\|^{(0)} = \|\hat g\|_1 ,$$
with $\hat g(n)$ the Fourier coefficients of $g$, $g(e^{2\pi i\theta}) = \sum_{n\in\mathbb{Z}} \hat g(n)e^{2\pi i n\theta}$. The $l^1$-norm will be used since one needs the fact that $l^1$ is a Banach algebra.

The next result says that under some restrictions on $F$ the formal isomorphism $S_F$ of Theorem 2.1 is analytic.

Theorem 2.2. If $\sup|F| < 1$ and $\|F\|^{(0)} < 1$, then the map
$$S_F =\ :\prod_{k=1}^{\infty}(1 + T_k):$$
is an isomorphism between $\hat W^\infty$ and $\hat W_F^\infty$.

Proof. Same proof as [7, Theorem 25].

Corollary 2.1. Let $\mathcal{M} = \{F \in C^\infty(S^1) : \sup|F| < 1 \text{ and } \|F\|^{(0)} < 1\}$ and let $\mathcal{F}$ be any finite dimensional subspace of $C^\infty(S^1)$. Then the map $F \to S_F$ is analytic from $\mathcal{M} \cap \mathcal{F}$ into the space of continuous linear maps $C^\infty(S^1) \to C^\infty(S^1)$.
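The Banach algebra property of the $l^1$-norm, which is what makes the condition $\|F\|^{(0)} < 1$ useful, can be illustrated on Laurent polynomials. The sketch below is only an illustration under the assumption that a Laurent polynomial is stored as a dictionary mapping the exponent $n$ to the coefficient $\hat g(n)$; the helper names are hypothetical.

```python
from fractions import Fraction


def l1_norm(g, m=0):
    """||g||^(m) = sum_{0<=r<=m} sum_n |n|^r |g_hat(n)| for g = {n: g_hat(n)}."""
    return sum(abs(n) ** r * abs(c) for r in range(m + 1) for n, c in g.items())


def multiply(f, g):
    """Product of two Laurent polynomials (convolution of Fourier coefficients)."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def power(f, k):
    """k-th power of f for k >= 1."""
    result = dict(f)
    for _ in range(k - 1):
        result = multiply(result, f)
    return result
```

With these helpers one can check that `l1_norm(multiply(f, g)) <= l1_norm(f) * l1_norm(g)`, and hence that `l1_norm(power(F, k))` is bounded by `l1_norm(F) ** k`, which decays geometrically when $\|F\|^{(0)} < 1$.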
3. Relations Between the Lie Algebras $W_f$ and $W_F$

In [7], a family of deformations $W_f$ of the Witt algebra $W$ parametrized by the space $E$ of even polynomials with vanishing constant terms was defined. The vector space structure is the same as $W$ and the Lie bracket is
$$[g, h]_f = (1 + f)[g, h] , \qquad (g, h \text{ odd})$$
$$[g, h]_f = [g, h] , \qquad (g, h \text{ even})$$
$$[g, h]_f = (1 + f)[g, h] + \tfrac{1}{2} f' g h , \qquad (g \text{ even},\ h \text{ odd})$$
where $'$ denotes differentiation with respect to $z$, $f \in E$ and $g, h \in W$. Since $\hat W_0 = W$, the family $\{\hat W_f\}_{f\in E}$ is a deformation of $W$ with the infinite dimensional parameter space $E$.

The relation between the completions of the algebras $W_F$ and $W_f$ is as follows.

Theorem 3.1. For any $f \in E$ and $F \in P$, one has
$$\hat W_f \simeq \hat W_F .$$

Proof. This follows from $\hat W_f \simeq \hat W$ ([7]) and $\hat W \simeq \hat W_F$ (Theorem 2.1).

Theorem 3.2. If $f$ and $F$ are close to 0 in the $C^\infty(S^1)$-topology, then
$$\hat W_f^\infty \simeq \hat W_F^\infty .$$

Proof. This follows from $\hat W_f^\infty \simeq \hat W^\infty$ ([7]) and $\hat W^\infty \simeq \hat W_F^\infty$ (Theorem 2.2).

These two theorems in particular say that the study of central extensions of the algebras $\hat W_f$ and $\hat W_f^\infty$ can be done in terms of those of $\hat W_F$ and $\hat W_F^\infty$.

4. Adic Completions $\widehat{\mathrm{Vir}}_F$ of $\mathrm{Vir}_F$ and the Isomorphism $\widehat{\mathrm{Vir}}_F \simeq \widehat{\mathrm{Vir}}$

Let $\mathrm{Vir}_F = W_F \oplus \mathbb{C}\lambda$ be the one-dimensional central extension of $W_F$ defined for all $g, h \in W$ by
$$[g, h]_{F,\mathrm{Vir}} = (1 + F)[g, h] \oplus \tilde\chi(g, h; F)\lambda , \qquad [g, \lambda]_{F,\mathrm{Vir}} = 0 , \qquad [\lambda, h]_{F,\mathrm{Vir}} = 0 ,$$
with
$$\tilde\chi(g, h; F) = \chi((1 + F)g, (1 + F)h) + \mathrm{Res}_0\left(\left(\tfrac{3}{4}F^2 - \tfrac{5}{2}F\right) z^{-2}[g, h]\right) ,$$
$\mathrm{Res}_0$ the residue in zero and $\chi$ the 2-cocycle of the Witt algebra
$$\chi(g, h) = -\frac{13}{12\pi i}\oint g' h''\,dz .$$
On the basis elements this means
$$\chi(\hat l_m, \hat l_n) = -\frac{13}{6}(m^3 - m)\,\delta^0_{m+n} .$$
If $F = 0$, the family $\mathrm{Vir}_F$ reduces to the Virasoro algebra, that is it is a deformation of $\mathrm{Vir}$. Introducing a parameter $t$, that is replacing $F$ with $tF$, the central extension can be rewritten as
$$[g, h]_t = (1 + tF)[g, h] \oplus \tilde\chi(g, h; tF) = [g, h]_{\mathrm{Vir}} + t\,C_1^1(g, h; F) + t^2 C_2^2(g, h; F) ,$$
that is it can be viewed as a formal deformation of $\mathrm{Vir}$ in the parameter $t$.

These definitions immediately extend to the adic completions $\widehat{\mathrm{Vir}}$, $\widehat{\mathrm{Vir}}_F$ of the algebras $\mathrm{Vir}$, $\mathrm{Vir}_F$.

In order to construct the expected formal isomorphism between $\widehat{\mathrm{Vir}}$ and $\widehat{\mathrm{Vir}}_F$ one needs first to define some linear maps.

Let $\tilde T_k : \widehat{\mathrm{Vir}} \to \widehat{\mathrm{Vir}}$ ($k \ge 1$) be the continuous linear map
$$\tilde T_k(\lambda) = 0 , \qquad \tilde T_k(g) = T_k(g) \oplus \mathrm{Res}_0(A_k F^k z^{-2} g) ,$$
where $\{A_k\}_{k\ge 1}$ are the scalars uniquely determined from the following power series identity in $z$
$$\sum_{k=1}^{\infty} \frac{A_k z^k}{\prod_{r=1}^{k}(1 + a_r z^r)\,(1 + z)} = \left(\tfrac{3}{4}z^2 - \tfrac{5}{2}z\right)\frac{1}{(1 + z)^2} . \tag{4.1}$$
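Lemma 4.1. Let $k \ge 2$. Then
$$A_k = \begin{cases} \displaystyle\sum_{\substack{i\ge 1 \\ 2^{i+1}\mid(k-2^i)}} (-A_{2^i}) + \sum_{r=1}^{s-2}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} (-A_N) + \frac{13}{4}k - \frac{3}{4} , & \text{if } k \text{ even} , \\[3ex] \displaystyle\sum_{r=1}^{s-2}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} (-A_N) - \frac{13}{4}(k - 1) , & \text{if } k \text{ odd} , \end{cases}$$
where $s$ denotes the smallest positive integer such that $k < 2^j$ for all $j \ge s$.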
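Proof. By (2.1) one has
$$\sum_{N=1}^{\infty} \frac{A_N z^N}{\prod_{r=1}^{N}(1 + a_r z^r)(1 + z)} = \frac{A_1 z}{1 - z^2} + \sum_{\substack{N\ge 2 \\ N=2^i,\ i\ge 1}} \frac{A_N z^N}{1 - z^{2N}} + \sum_{\substack{N\ge 3 \\ N\ne 2^i,\ i\ge 1}} \frac{A_N z^N}{\prod_{r=1}^{N}(1 + a_r z^r)(1 + z)}$$
$$= \frac{A_1 z}{1 - z^2} + \sum_{\substack{N=2^i \\ i\ge 1}} \frac{A_N z^N}{1 - z^{2N}} + \sum_{r\ge 1}\ \sum_{2^r<N<2^{r+1}} \frac{A_N z^N}{\prod_{r=1}^{N}(1 + a_r z^r)(1 + z)}$$
$$= \frac{A_1 z}{1 - z^2} + \sum_{\substack{N=2^i \\ i\ge 1}} \frac{A_N z^N}{1 - z^{2N}} + \sum_{r\ge 1}\ \sum_{2^r<N<2^{r+1}} \frac{A_N z^N}{1 - z^{2^{r+1}}} .$$
Since
$$\left(\tfrac{3}{4}z^2 - \tfrac{5}{2}z\right)\frac{1}{(1 + z)^2} = \sum_{k=1}^{\infty} C_k z^k ,$$
where
$$C_k = \begin{cases} -\dfrac{5}{2} , & \text{if } k = 1 , \\[1.5ex] \dfrac{13}{4}k - \dfrac{3}{4} , & \text{if } k \text{ even},\ k \ge 2 , \\[1.5ex] -\dfrac{13}{4}k + \dfrac{3}{4} , & \text{if } k \text{ odd},\ k \ge 3 , \end{cases}$$
then it follows from (4.1) that if $k$ is odd, $k \ge 3$,
$$A_1 + \sum_{r\ge 1}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} A_N = -\frac{13}{4}k + \frac{3}{4} .$$
Therefore
$$A_k = -\sum_{r=1}^{s-2}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} A_N - \frac{13}{4}(k - 1) ,$$
with $s$ denoting the smallest positive integer such that $k < 2^j$ for all $j \ge s$. If $k$ is even, $k \ge 2$, then
$$\sum_{\substack{i\ge 1 \\ 2^{i+1}\mid(k-2^i)}} A_{2^i} + \sum_{r=1}^{s-2}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} A_N + A_k = \frac{13}{4}k - \frac{3}{4}$$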
and so
$$A_k = -\sum_{\substack{i\ge 1 \\ 2^{i+1}\mid(k-2^i)}} A_{2^i} - \sum_{r=1}^{s-2}\ \sum_{\substack{2^r<N<2^{r+1} \\ 2^{r+1}\mid(k-N)}} A_N + \frac{13}{4}k - \frac{3}{4} ,$$
with $s$ denoting the smallest positive integer such that $k < 2^j$ for all $j \ge s$.

Corollary 4.1.
(1) If $k$ is odd, $k \ge 3$, then $A_k = 2^{M-2}A_3$, where $M$ is the smallest positive integer such that $k < 2^M$ and
$$A_3 = -\frac{13}{2} .$$
In particular, $A_i = A_j$ for all $i, j$ odd such that $2^s < i, j < 2^{s+1}$.
(2) If $k$ is even, $k = 2^s$, $s \ge 1$, then
$$A_k = \frac{13}{4}(k - 1) .$$
(3) If $k$ is even, $2^s < k < 2^{s+1}$, then $A_k = -A_i$ for all odd $i$, $2^s < i < 2^{s+1}$.
If $k$ is odd, $k \ge 1$, then $A_k < 0$ and if $k$ is even, $k \ge 2$, then $A_k > 0$.

Proof. By the previous lemma using induction and the fact $A_3 = -(13/2)$.

In particular, for all $k \ge 1$ one has
$$|A_k| \le \frac{13}{4}k^2 . \tag{4.2}$$

Theorem 4.1. Let
$$\psi_k = 1 + t^k \tilde T_k$$
and let $\widehat{\mathrm{Vir}}{}^k_F$ be the Lie algebra structure of $\widehat{\mathrm{Vir}}$ defined inductively in $k$ by
$$[g, h]^k_t = \psi_k^{-1}[\psi_k(g), \psi_k(h)]^{k-1}_t = [g, h]_{\mathrm{Vir}} + t^{k+1}C_1^{k+1}(g, h; F) + t^{k+2}C_2^{k+2}(g, h; F) + \cdots ,$$
with
$$C_1^{k+1}(g, h; F) = \omega_{k+1}(g, h; F) \oplus V_1^{k+1}(g, h; F) ,$$
and $\omega_{k+1}$ defined by $\delta T_{k+1}(g, h) = \omega_{k+1}(g, h; F)$, $\delta$ the coboundary operator. Then the map $\tilde T_{k+1}$ satisfies
$$\delta\tilde T_{k+1}(g, h) = C_1^{k+1}(g, h; F) \qquad (g, h \in \widehat{\mathrm{Vir}}) .$$

Proof. It is by induction on $k$. The case $k = 1$ is straightforward. Suppose the statement is true for $r = 2, 3, \ldots, k$; we prove it is true for $k + 1$. By the inductive hypothesis and the fact
$$[g, h]^k_t = \psi_k^{-1}[\psi_k(g), \psi_k(h)]^{k-1}_t = \psi_k^{-1}\psi_{k-1}^{-1}\cdots\psi_1^{-1}[\psi_1\cdots\psi_{k-1}\psi_k(g),\ \psi_1\cdots\psi_{k-1}\psi_k(h)]_t$$
$$= \prod_{r=1}^{k}(1 + t^r\tilde T_r)^{-1}\left[\prod_{r=1}^{k}(1 + t^r\tilde T_r)g,\ \prod_{r=1}^{k}(1 + t^r\tilde T_r)h\right]_t$$
$$= \prod_{r=1}^{k}(1 + t^r\tilde T_r)^{-1}\left(\prod_{r=1}^{k}(1 + t^r\tilde T_r)^2\big((1 + tF)[g, h]\big) \oplus \tilde\chi\left(\prod_{r=1}^{k}(1 + t^r\tilde T_r)g,\ \prod_{r=1}^{k}(1 + t^r\tilde T_r)h\right)\right)$$
it follows
$$[g, h]^k_t = \prod_{r=1}^{k}(1 + a_r F^r t^r)(1 + tF)[g, h] \oplus \chi\left(\prod_{r=1}^{k}(1 + a_r F^r t^r)(1 + tF)g,\ \prod_{r=1}^{k}(1 + a_r F^r t^r)(1 + tF)h\right)$$
$$+\ \mathrm{Res}_0\left(\sum_{r=1}^{k}\frac{-A_r t^r F^r}{\prod_{s=1}^{r}(1 + a_s t^s F^s)(1 + tF)}\,z^{-2}[g, h]\right) + \mathrm{Res}_0\left(\left(\tfrac{3}{4}t^2F^2 - \tfrac{5}{2}tF\right)\prod_{r=1}^{k}(1 + a_r F^r t^r)^2\,z^{-2}[g, h]\right) .$$
Therefore
$$[g, h]^k_t = [g, h]_{\mathrm{Vir}} + t^{k+1}C_1^{k+1}(g, h; F) + t^{k+2}C_2^{k+2}(g, h; F) + \cdots ,$$
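where
$$C_1^{k+1}(g, h; F) = \omega_{k+1}(g, h; F) \oplus \mathrm{Res}_0(A_{k+1}F^{k+1}z^{-2}[g, h]) - a_{k+1}\big(\chi(F^{k+1}g, h) + \chi(g, F^{k+1}h)\big) ,$$
having used Lemma 4.1. It is now immediate that $\delta\tilde T_{k+1}(g, h) = C_1^{k+1}(g, h; F)$.

The previous theorem essentially says
$$\widehat{\mathrm{Vir}}_F \overset{\psi_1^{-1}}{\simeq} \widehat{\mathrm{Vir}}{}^1_F \overset{\psi_2^{-1}}{\simeq} \widehat{\mathrm{Vir}}{}^2_F \simeq \cdots \overset{\psi_k^{-1}}{\simeq} \widehat{\mathrm{Vir}}{}^k_F .$$
As $k \to \infty$ one should expect to get an isomorphism between $\widehat{\mathrm{Vir}}_F$ and $\widehat{\mathrm{Vir}}$. This is in fact what the next result says.

Theorem 4.2. Let
$$\tilde S_F =\ :\prod_{k=1}^{\infty}(1 + \tilde T_k):$$
be
$$\lim_{N\to\infty}\ :\prod_{k=1}^{N}(1 + \tilde T_k):$$
where $:\ :$ means that the product is taken as $(1 + \tilde T_1)(1 + \tilde T_2)\cdots(1 + \tilde T_N)$ and the limit is in the adic topology. Then $\tilde S_F$ is an isomorphism between the Lie algebras $\widehat{\mathrm{Vir}}$ and $\widehat{\mathrm{Vir}}_F$.

Proof. It follows from the definitions and the above as $k \to \infty$ and $t = 1$.

5. Analytic Completions $\widehat{\mathrm{Vir}}{}^\infty_F$ and the Isomorphism $\widehat{\mathrm{Vir}}{}^\infty_F \simeq \widehat{\mathrm{Vir}}{}^\infty$

Let $\widehat{\mathrm{Vir}}{}^\infty$ be the smooth analogue of the Virasoro algebra, that is the Lie algebra whose vector space structure is $C^\infty(S^1) \oplus \mathbb{C}\lambda$, where
$$C^\infty(S^1) = \left\{ g = \sum_{n\in\mathbb{Z}} g_n z^n ,\ |g_n| = O(|n|^{-M}) \text{ for all } M > 0 \right\}$$
and bracket $[\cdot\,,\cdot]_{\mathrm{Vir}}$. Similarly let $\widehat{\mathrm{Vir}}{}^\infty_F$ be the vector space $C^\infty(S^1) \oplus \mathbb{C}\lambda$ with bracket $[\cdot\,,\cdot]_{F,\mathrm{Vir}}$. Let $F \in C^\infty(S^1)$. The aim is now to make sense of the map $\tilde S_F$, defined in Theorem 4.2, as a continuous invertible transformation.

Let $B^{[m]}$ be the Banach space $C^\infty(S^1)_{(m)} \oplus \mathbb{C}\lambda$ with norm $\|\cdot\|^{[m]} = \|\cdot\|^{(m)} + \|\cdot\|_{\mathbb{C}}$.

Proposition 5.1. Let $k \ge 1$, $m \ge 1$ and $F \in C^\infty(S^1)$ be such that $\sup|F| < 1$. Then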
(1) the operators $\tilde T_k : B^{[m]} \to B^{[m]}$ are continuous;
(2) the operators $1 + \tilde T_k : B^{[m]} \to B^{[m]}$ are continuous and invertible.

Proof. (1) Let $g \oplus \alpha\lambda \in B^{[m]}$. By definition, (2.1) and (4.2) one has
$$\|\tilde T_k g\|^{(m)} \le \|F^k g\|^{(m)} = \sum_{0\le r\le m}\|D^r(F^k g)\|^{(0)} = \sum_{0\le r\le m}\left\|\sum_{0\le s\le r}\binom{r}{s}D^s(F^k)\,D^{r-s}(g)\right\|^{(0)}$$
$$\le \sum_{0\le r\le m} 2^r\,\|F^k\|^{(r)}\,\|g\|^{(r)} \le 2^{m+1}(m + 1)\,\|F^k\|^{(m)}\,\|g\|^{(m)} ,$$
and
$$|\mathrm{Res}_0(A_k F^k z^{-2} g)| \le 4k^2(\sup|F|)^k\,\|g\|^{(m)} .$$
Therefore
$$\|\tilde T_k(g \oplus \alpha\lambda)\|^{[m]} = \|T_k g\|^{(m)} + |\mathrm{Res}_0(A_k F^k z^{-2} g)| \le \big(2^{m+1}(m + 1)\|F^k\|^{(m)} + 4k^2(\sup|F|)^k\big)\,\|g \oplus \alpha\lambda\|^{[m]} .$$
(2) The map $1 + \tilde T_k$ is clearly continuous by (1) and its inverse is
$$(1 + \tilde T_k)^{-1}(g \oplus \alpha\lambda) = \frac{1}{1 + a_k F^k}\,g \oplus \mathrm{Res}_0\left(\frac{-A_k F^k}{1 + a_k F^k}\,z^{-2} g\right) ,$$
the element $1 + a_k F^k$ being never 0 on $S^1$ ($k \ge 1$) since $\sup|F| < 1$.

Proposition 5.2. Let $m \ge 1$ and $F \in C^\infty(S^1)$, $\sup|F| < 1$ and $\|F\|^{(0)} < 1$. Then
$$\sum_{k=1}^{\infty}\|\tilde T_k\|^{[m]} < \infty .$$

Proof. By Proposition 5.1
$$\sum_{k=1}^{\infty}\|\tilde T_k\|^{[m]} \le 2^{m+1}(m + 1)\sum_{k=1}^{\infty}\|F^k\|^{(m)} + 4\sum_{k=1}^{\infty}k^2(\sup|F|)^k .$$
Since $\sup|F| < 1$, by the Ratio test
$$\sum_{k=1}^{\infty}k^2(\sup|F|)^k < \infty .$$
Since the behavior of a series does not change if one deletes a finite number of its terms, let us consider the series
$$\sum_{k=m+1}^{\infty}\|F^k\|^{(m)} .$$
Now one has
$$\sum_{k=m+1}^{\infty}\|F^k\|^{(m)} = \sum_{k=m+1}^{\infty}\ \sum_{0\le r\le m}\|D^r(F^k)\|^{(0)} = \sum_{k=m+1}^{\infty}\ \sum_{0\le r\le m}\left\|\sum_{\substack{s_1+\cdots+s_k=r \\ s_j\ge 0}}\frac{r!}{s_1!\cdots s_k!}(D^{s_1}F)\cdots(D^{s_k}F)\right\|^{(0)}$$
$$\le \sum_{k=m+1}^{\infty}\ \sum_{0\le r\le m}\ \sum_{\substack{s_1+\cdots+s_k=r \\ s_j\ge 0}}\frac{r!}{s_1!\cdots s_k!}\,(\|F\|^{(m)})^m(\|F\|^{(0)})^{k-m}$$
$$\le \sum_{k=m+1}^{\infty}\ \sum_{0\le r\le m} k^m(\|F\|^{(m)})^m(\|F\|^{(0)})^{k-m} \le (m + 1)(\|F\|^{(m)})^m\sum_{k=m+1}^{\infty}k^m(\|F\|^{(0)})^{k-m} .$$
So again by the Ratio test, since $\|F\|^{(0)} < 1$, the series is convergent. The fact $k > m$ is used to say that in each $(s_1, \ldots, s_k)$ at least $k - m$ of the $s_j$ have to be 0.

Theorem 5.1. Let $m \ge 1$ and $F \in C^\infty(S^1)$, $\sup|F| < 1$ and $\|F\|^{(0)} < 1$. Let
$$\tilde S_N = (1 + \tilde T_1)(1 + \tilde T_2)\cdots(1 + \tilde T_N) .$$
Then, as $N \to \infty$, the map $\tilde S_N : B^{[m]} \to B^{[m]}$ is convergent in the norm topology to a continuous invertible linear operator $\tilde S_F = \lim_{N\to\infty}(1 + \tilde T_1)(1 + \tilde T_2)\cdots(1 + \tilde T_N)$. Its inverse is given by
$$\tilde S_F^{-1} = \lim_{N\to\infty}(1 + \tilde T_N)^{-1}(1 + \tilde T_{N-1})^{-1}\cdots(1 + \tilde T_1)^{-1} .$$

Proof. Same proof as [7, Lemma 20].

Theorem 5.2. Let $F \in C^\infty(S^1)$ be such that $\sup|F| < 1$ and $\|F\|^{(0)} < 1$. Then the operator
$$\tilde S_F =\ :\prod_{k=1}^{\infty}(1 + \tilde T_k):\ \ \widehat{\mathrm{Vir}}{}^\infty \to \widehat{\mathrm{Vir}}{}^\infty_F$$
is an isomorphism.
Proof. Same proof as [7, Lemma 22].

If $F$ is a Laurent polynomial, that is $F = \sum_{|n|\le M} c_n z^n$, then it is enough to have
$$\|F\|^{(0)} = \sum_{|n|\le M}|c_n| < 1 .$$

Corollary 5.1. Let $\mathcal{M}$ be a finite dimensional subspace of $C^\infty(S^1)$ and
$$\mathcal{N} = \{F \in C^\infty(S^1) : \sup|F| < 1 \text{ and } \|F\|^{(0)} < 1\} .$$
Let $\mathcal{B}$ be the space of all continuous linear operators $C^\infty(S^1) \oplus \mathbb{C}\lambda \to C^\infty(S^1) \oplus \mathbb{C}\lambda$. Then the map $\Psi : \mathcal{M} \cap \mathcal{N} \to \mathcal{B}$, $\Psi(F) = \tilde S_F$ is an analytic map.

Proof. This follows from the previous results and Morera's theorem.

Acknowledgments

I would like to thank all those people who helped me in many different ways during the most testing period of my life (1998–2000).

References

[1] R. A. Adams, Sobolev Spaces, Academic Press, 1975.
[2] A. Fialowski, "Deformations of some infinite-dimensional Lie algebras", J. Math. Phys. 31(6) (1990) 1340–1343.
[3] M. Flato and D. Sternheimer, "Deformations of Poisson brackets, separate and joint analyticity in group representations, nonlinear group representations and physical applications", Harmonic Analysis and Representations of Semisimple Lie Groups, eds. J. A. Wolf, M. Cahen and M. De Wilde, Dordrecht, 1980, Mathematical Physics and Applied Mathematics, Vol. 5, pp. 385–448.
[4] M. Gerstenhaber, "On the deformation of rings and algebras", Ann. Math. 79 (1964) 59–103.
[5] L. Guerrini, "Construction and deformation of infinite dimensional Lie algebras", doctoral thesis, University of California, Los Angeles, 1998.
[6] L. Guerrini, "Formal and analytic deformations of the Witt algebra", Lett. Math. Phys. 46 (1998) 121–129.
[7] L. Guerrini, "Formal and analytic rigidity of the Witt algebra", Rev. Math. Phys. 11(3) (1999) 303–320.
March 19, 2002 12:23 WSPC/148-RMP
00115
Reviews in Mathematical Physics, Vol. 14, No. 3 (2002) 317–342 c World Scientific Publishing Company
REPRESENTATIONS OF REFLECTION ALGEBRAS
A. I. MOLEV School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia
[email protected] E. RAGOUCY LAPTH, Chemin de Bellevue, BP 110, F-74941 Annecy-le-Vieux cedex, France
[email protected]
Received 30 July 2001 Revised 19 October 2001 We study a class of algebras B(n, l) associated with integrable models with boundaries. These algebras can be identified with coideal subalgebras in the Yangian for gl(n). We construct an analog of the quantum determinant and show that its coefficients generate the center of B(n, l). We develop an analog of Drinfeld’s highest weight theory for these algebras and give a complete description of their finite-dimensional irreducible representations.
1. Introduction

A central role in the theory of integrable models in statistical mechanics is played by the Yang–Baxter equation
$$R_{12}(u - v)R_{13}(u - w)R_{23}(v - w) = R_{23}(v - w)R_{13}(u - w)R_{12}(u - v) ; \tag{1.1}$$
see Baxter [1]. Here $R(u)$ is a linear operator $R(u) : V \otimes V \to V \otimes V$ on the tensor square of a vector space depending on the spectral parameter $u$. Both sides of the Yang–Baxter equation are linear operators on the triple tensor product $V \otimes V \otimes V$ and the indices of $R(u)$ indicate the copies of $V$ where $R(u)$ acts; e.g., $R_{12}(u) = R(u) \otimes 1$. A simplest nontrivial solution of the equation is provided by the Yang R-matrix
$$R(u) = 1 - Pu^{-1} , \tag{1.2}$$
where $P$ is the permutation operator $P : \xi \otimes \eta \mapsto \eta \otimes \xi$ in the space $\mathbb{C}^n \otimes \mathbb{C}^n$. The Yang R-matrix emerges in the XXX or six vertex model [1]. It gives rise to an algebra with the defining relations given by the RTT relation
$$R(u - v)T_1(u)T_2(v) = T_2(v)T_1(u)R(u - v) \tag{1.3}$$
(we discuss the precise meaning of this relation below in Sec. 2). The algebraic structures associated with the Yang–Baxter equation were studied in the works of
Faddeev's school in the 70s and 80s in relation with the quantum inverse scattering method; see e.g. Takhtajan–Faddeev [30], Kulish–Sklyanin [12]. In particular, a central element called the quantum determinant in the algebra defined by (1.3) was introduced by Izergin and Korepin [10] in the case of two dimensions. The basic ideas and formulas associated with the quantum determinant for an arbitrary n are given in the paper Kulish–Sklyanin [12]. Tarasov [31, 32] described irreducible representations (monodromy matrices) in the case n = 2.

It was independently realized by Drinfeld and Jimbo around 1985 that the algebraic structures associated with the quantum inverse scattering method are naturally described by the language of Hopf algebras. This marked the beginning of the theory of quantum groups [8] (a historic background of this theory is given in the book by Chari and Pressley [4]). In his paper [7] Drinfeld introduced a remarkable class of quantum groups called the Yangians. For any simple Lie algebra a the Yangian Y(a) is a canonical deformation of the universal enveloping algebra U(a[x]) for the polynomial current Lie algebra a[x]. The Hopf algebra defined by the RTT relation (1.3) with the Yang R-matrix (1.2) is called the Yangian for the general linear Lie algebra gl(n) and denoted Y(gl(n)) or Y(n).

The significance of the Yangians was explained by Drinfeld [7] who showed that the rational solutions of the Yang–Baxter equation (1.1) are described by the Yangian representations. The irreducible finite-dimensional representations were classified in his subsequent paper [9]. The results turned out to be parallel to those for the semisimple Lie algebras. Every irreducible finite-dimensional representation of Y(a) is a quotient of the corresponding universal highest weight module where the components of the highest weight satisfy some dominance conditions.
Explicit constructions of all such representations for the Yangians Y(sl(2)) and Y(2) are given in Tarasov [31, 32] and Chari–Pressley [3]. However, apart from this case, the explicit structure of the Yangian representations remains unknown even in the case of gl(n) (a description of a class of generic and tame representations is given in [19] and [22] via Gelfand–Tsetlin bases).

Sklyanin [29] introduced a class of algebras associated with integrable models with boundaries (we call them the reflection algebras in this paper). His approach was inspired by Cherednik's scattering theory [6] for factorized particles on the half-line. Instead of the RTT relation the algebras are defined by the reflection equation
$$R(u - v)B_1(u)R(u + v)B_2(v) = B_2(v)R(u + v)B_1(u)R(u - v) . \tag{1.4}$$
In [29] commutative subalgebras in the reflection algebras (in the case of two dimensions) were constructed and the algebraic Bethe ansatz was described. Moreover, an analog of the quantum determinant for these algebras was introduced and some properties of the highest weight representations were discussed. Different versions of (1.4) were employed in the works Reshetikhin–Semenov-Tian-Shansky [28], Olshanski [24] and Noumi [23]. Similar classes of algebras were studied in Kulish–Sklyanin [13], Kuznetsov–Jørgensen–Christiansen [14],
Koornwinder and Kuznetsov [11, 15]. Recently, algebras of this kind were discussed in the physics literature in connection with the NLS model, where they describe the integrals of motion of the model; see Liguori–Mintchev–Zhao [16], Mintchev–Ragoucy–Sorba [17].

In this paper we consider a family of reflection algebras $B(n, l)$. They are defined as associative algebras whose generators satisfy two types of relations: the reflection equation and the unitary condition; see (2.5) and (2.6) below. The unitary condition allows us to identify $B(n, l)$ with a subalgebra in the gl(n)-Yangian Y(n); see Theorem 3.1. This condition was not explicitly used in [29], but it appears e.g. in [11]. If we omit it we get a larger algebra $\tilde B(n, l)$ such that $B(n, l)$ is a quotient of $\tilde B(n, l)$ by an ideal generated by some central elements. The consequence of this fact is that the finite-dimensional irreducible representations of both algebras $B(n, l)$ and $\tilde B(n, l)$ are essentially the same.

On the other hand, the subalgebra $B(n, l)$ turns out to be a (left) coideal in the Hopf algebra Y(n); see Proposition 3.3. This allows us to regard the tensor product $L \otimes V$ of a Y(n)-module $L$ and a $B(n, l)$-module $V$ as a $B(n, l)$-module; cf. [29, Proposition 2]. We show that the center of $B(n, l)$ is generated by the coefficients of an analog of the quantum determinant which we call, following [21], the Sklyanin determinant. We derive a formula which expresses the Sklyanin determinant in terms of the quantum determinant for the Yangian Y(n).

The aforementioned properties of the algebras $B(n, l)$ exhibit much analogy with the twisted Yangians introduced by Olshanski [24]; see also [21] for a detailed exposition. Moreover, in two dimensions the algebras $B(2, 0)$ and $B(2, 1)$ turn out to be respectively isomorphic to the symplectic and orthogonal twisted Yangians; see Sec. 4.2. This analogy also extends to the representation theory; cf. [20].
We prove here that, as for the twisted Yangians, the Drinfeld highest weight theory [9] is applicable to the algebras B(n, l). Every finite-dimensional irreducible representation of B(n, l) is highest weight, and given an irreducible highest weight module, we produce necessary and sufficient conditions for it to be finite-dimensional. These conditions are expressed in terms of the Drinfeld polynomials in a way similar to [9] and [20] for the Yangians and twisted Yangians. Note, however, an essential difference: here all Drinfeld polynomials must satisfy a symmetry condition; see Theorem 4.6. In particular, all of them have even degree.

In conclusion, we would like to emphasize the common feature of the three classes of algebras, the Yangian, twisted Yangians and reflection algebras: the defining relations in all the cases can be presented in a special matrix form. This allows special algebraic techniques (the so-called R-matrix formalism) to be used to describe the algebraic structure and study representations of these algebras. On the other hand, the close relationship of these "quantum" algebras with the matrix Lie algebras leads to applications in the classical representation theory; see e.g. [18] and references therein. The Yangian symmetries have been found in various areas of physics including the theory of integrable models in statistical mechanics, conformal
field theory, quantum gravity. We note a surprising connection of the Yangian and twisted Yangians with the finite W-algebras recently discovered in [26, 27, 25]; see also [2].

2. Definitions and Preliminaries

Recall first the definition of the gl(n)-Yangian Y(n); see e.g. [12, 9]. We follow the notation from [21] where a detailed account of the properties of Y(n) is given. The Yangian Y(n) is the complex associative algebra with the generators $t_{ij}^{(1)}, t_{ij}^{(2)}, \ldots$, where $1 \le i, j \le n$, and the defining relations
$$[t_{ij}(u), t_{rs}(v)] = \frac{1}{u - v}\big(t_{rj}(u)t_{is}(v) - t_{rj}(v)t_{is}(u)\big) , \tag{2.1}$$
where
$$t_{ij}(u) = \delta_{ij} + t_{ij}^{(1)}u^{-1} + t_{ij}^{(2)}u^{-2} + \cdots \in \mathrm{Y}(n)[[u^{-1}]]$$
and $u$ is a formal (commutative) variable. Introduce the matrix
$$T(u) = \sum_{i,j=1}^{n} t_{ij}(u) \otimes E_{ij} \in \mathrm{Y}(n)[[u^{-1}]] \otimes \mathrm{End}\,\mathbb{C}^n , \tag{2.2}$$
where the $E_{ij}$ are the standard matrix units. Then the relations (2.1) are equivalent to the single RTT relation (1.3). Here $T_1(u)$ and $T_2(u)$ are regarded as elements of $\mathrm{Y}(n)[[u^{-1}]] \otimes \mathrm{End}\,\mathbb{C}^n \otimes \mathrm{End}\,\mathbb{C}^n$, the subindex of $T(u)$ indicates to which copy of $\mathrm{End}\,\mathbb{C}^n$ this matrix corresponds, and
$$R(u) = 1 - Pu^{-1} , \qquad P = \sum_{i,j=1}^{n} E_{ij} \otimes E_{ji} \in (\mathrm{End}\,\mathbb{C}^n)^{\otimes 2} .$$
Now we introduce the reflection algebras. Fix a decomposition of the parameter $n$ into the sum of two nonnegative integers, $n = k + l$. Denote by $G$ the diagonal $n \times n$-matrix
$$G = \mathrm{diag}(\varepsilon_1, \ldots, \varepsilon_n) \tag{2.3}$$
where $\varepsilon_i = 1$ for $1 \le i \le k$, and $\varepsilon_i = -1$ for $k + 1 \le i \le n$. The reflection algebra $B(n, l)$ is a unital associative algebra with the generators $b_{ij}^{(r)}$ where $r$ runs over positive integers and $i$ and $j$ satisfy $1 \le i, j \le n$. To write down the defining relations introduce formal series
$$b_{ij}(u) = \sum_{r=0}^{\infty} b_{ij}^{(r)}u^{-r} , \qquad b_{ij}^{(0)} = \delta_{ij}\varepsilon_i \tag{2.4}$$
and combine them into the matrix
$$B(u) = \sum_{i,j=1}^{n} b_{ij}(u) \otimes E_{ij} \in B(n, l)[[u^{-1}]] \otimes \mathrm{End}\,\mathbb{C}^n .$$
The defining relations are given by the reflection equation [29]
$$R(u - v)B_1(u)R(u + v)B_2(v) = B_2(v)R(u + v)B_1(u)R(u - v) \tag{2.5}$$
together with the unitary condition
$$B(u)B(-u) = 1 , \tag{2.6}$$
where we have used the notation of (1.3). Rewriting (2.5) and (2.6) in terms of the matrix elements we obtain, respectively,
$$[b_{ij}(u), b_{rs}(v)] = \frac{1}{u - v}\big(b_{rj}(u)b_{is}(v) - b_{rj}(v)b_{is}(u)\big) + \frac{1}{u + v}\left(\delta_{rj}\sum_{a=1}^{n} b_{ia}(u)b_{as}(v) - \delta_{is}\sum_{a=1}^{n} b_{ra}(v)b_{aj}(u)\right)$$
$$-\ \frac{1}{u^2 - v^2}\,\delta_{ij}\left(\sum_{a=1}^{n} b_{ra}(u)b_{as}(v) - \sum_{a=1}^{n} b_{ra}(v)b_{as}(u)\right) \tag{2.7}$$
and
$$\sum_{a=1}^{n} b_{ia}(u)b_{aj}(-u) = \delta_{ij} . \tag{2.8}$$
We shall also be using the algebra $\tilde B(n, l)$ which is defined in the same way as $B(n, l)$ but with the unitary condition (2.6) dropped. (This corresponds to Sklyanin's original definition [29]). We use the same notation $b_{ij}^{(r)}$ for the generators of $\tilde B(n, l)$. As we shall see in the following proposition, the algebra $B(n, l)$ is isomorphic to a quotient of $\tilde B(n, l)$ by an ideal whose generators are central elements; see also [17].

Proposition 2.1. In the algebra $\tilde B(n, l)$ the product $B(u)B(-u)$ is a scalar matrix
$$B(u)B(-u) = f(u)1 , \tag{2.9}$$
where $f(u)$ is an even series in $u^{-1}$ all of whose coefficients are central in $\tilde B(n, l)$.

Proof. Multiply both sides of (2.7) by $u^2 - v^2$ and put $v = -u$. We obtain
$$2u\left(\delta_{rj}\sum_{a=1}^{n} b_{ia}(u)b_{as}(-u) - \delta_{is}\sum_{a=1}^{n} b_{ra}(-u)b_{aj}(u)\right) = \delta_{ij}\left(\sum_{a=1}^{n} b_{ra}(u)b_{as}(-u) - \sum_{a=1}^{n} b_{ra}(-u)b_{as}(u)\right) . \tag{2.10}$$
Choosing appropriate indices $i, j, r, s$, it is easy to see that $B(u)B(-u) = B(-u)B(u)$ and that this matrix is scalar. Thus, (2.9) holds for an even series
$f(u)$. Now multiply both sides of (2.5) by $B_2(-v)$ from the right:
$$R(u - v)B_1(u)R(u + v)f(v) = B_2(v)R(u + v)B_1(u)R(u - v)B_2(-v) . \tag{2.11}$$
Applying (2.5) to the right hand side we write it as
$$B_2(v)R(u + v)B_1(u)R(u - v)B_2(-v) = B_2(v)B_2(-v)R(u - v)B_1(u)R(u + v) = f(v)R(u - v)B_1(u)R(u + v) . \tag{2.12}$$
This shows that $f(u)$ is central.

We would like to comment on the relevance of the choice of the initial matrix $G$ in the expansion
$$B(u) = G + \sum_{r=1}^{\infty} B^{(r)}u^{-r} ; \tag{2.13}$$
see also [17]. We could take $G$ to be an arbitrary nondegenerate matrix. However, as the proof of Proposition 2.1 shows, the reflection equation implies that $G^2$ is a scalar matrix. Since $G$ is nondegenerate, the scalar is nonzero. On the other hand, as can be easily seen, for any constant $c$ and any nondegenerate matrix $A$ the transformations
$$B(u) \mapsto cB(u) \qquad \text{and} \qquad B(u) \mapsto AB(u)A^{-1} \tag{2.14}$$
preserve the reflection equation. Accordingly, the matrix $G$ is then transformed as $G \mapsto cG$ and $G \mapsto AGA^{-1}$. Therefore, we may assume that $G^2 = 1$ and using an appropriate matrix $A$ we can bring $G$ to the form (2.3). Note also that both the reflection equation and unitary condition are preserved by the change of sign $B(u) \mapsto -B(u)$. This implies that the algebras $B(n, l)$ and $B(n, n - l)$ are isomorphic. In what follows we assume that the parameter $l$ satisfies $0 \le l \le n/2$.

It is immediate from the definitions of the algebras $B(n, l)$ and $\tilde B(n, l)$ that given any formal series $g(u) \in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$ the mapping
$$B(u) \mapsto g(u)B(u) \tag{2.15}$$
is an automorphism of the algebra $\tilde B(n, l)$. If $g(u)$ satisfies $g(u)g(-u) = 1$ then (2.15) is an automorphism of $B(n, l)$.

3. Algebraic Structure of $B(n, l)$

Here we show that each $B(n, l)$ can be identified with a coideal subalgebra in the Yangian Y(n) and prove an analog of the Poincaré–Birkhoff–Witt theorem for the algebra $B(n, l)$. Then we describe its center using an analog of the quantum determinant.
3.1. Embedding $B(n, l) \hookrightarrow \mathrm{Y}(n)$

Denote by $T^{-1}(u)$ the inverse matrix for $T(u)$; see (2.2). It can be easily seen that the matrix $T^{-1}(-u)$ satisfies the RTT relation (1.3). This implies that the mapping $T(u) \to T^{-1}(-u)$ defines an algebra automorphism of the Yangian Y(n); cf. [21]. The following connection between $B(n, l)$ and Y(n) was indicated in [29].

Theorem 3.1. The mapping
$$\varphi : B(u) \mapsto T(u)\,G\,T^{-1}(-u) \tag{3.1}$$
defines an embedding of the algebra $B(n, l)$ into the Yangian Y(n).

Proof. First we verify that $\varphi$ is an algebra homomorphism. Denote the matrix $T(u)GT^{-1}(-u)$ by $\tilde B(u)$. We obviously have $\tilde B(u)\tilde B(-u) = 1$ and so (2.6) is satisfied. Further, we have
$$R(u - v)\tilde B_1(u)R(u + v)\tilde B_2(v) = R(u - v)T_1(u)G_1T_1^{-1}(-u)R(u + v)T_2(v)G_2T_2^{-1}(-v) . \tag{3.2}$$
We find from the RTT relation (1.3) that
$$T_1^{-1}(-u)R(u + v)T_2(v) = T_2(v)R(u + v)T_1^{-1}(-u) . \tag{3.3}$$
Therefore, since $G_i$ commutes with $T_j(u)$ for $i \ne j$ we bring the expression (3.2) to the form
$$R(u - v)T_1(u)T_2(v)G_1R(u + v)G_2T_1^{-1}(-u)T_2^{-1}(-v) . \tag{3.4}$$
One easily verifies that
$$R(u - v)G_1R(u + v)G_2 = G_2R(u + v)G_1R(u - v) . \tag{3.5}$$
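Relation (3.5), which says that the constant matrix $G$ itself solves the reflection equation, can be checked with explicit matrices. The following sketch uses hypothetical helper names and takes $G_1 = G \otimes 1$, $G_2 = 1 \otimes G$ with $R(u) = 1 - Pu^{-1}$ over the rationals. It also illustrates the remark after (2.13): a diagonal matrix whose square is not scalar fails the same identity.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def diag(entries):
    n = len(entries)
    return [[Fraction(entries[i]) if i == j else Fraction(0) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def yang_r(u, n):
    """R(u) = 1 - P / u on C^n (x) C^n, with P(e_i (x) e_j) = e_j (x) e_i."""
    r = identity(n * n)
    for i in range(n):
        for j in range(n):
            r[j * n + i][i * n + j] -= Fraction(1) / u
    return r


def constant_reflection_holds(g, u, v):
    """Check R(u-v) G1 R(u+v) G2 == G2 R(u+v) G1 R(u-v) for a constant matrix g."""
    n = len(g)
    g1 = kron(g, identity(n))
    g2 = kron(identity(n), g)
    r_minus = yang_r(u - v, n)
    r_plus = yang_r(u + v, n)
    lhs = matmul(r_minus, matmul(g1, matmul(r_plus, g2)))
    rhs = matmul(g2, matmul(r_plus, matmul(g1, r_minus)))
    return lhs == rhs
```

For example, `constant_reflection_holds(diag([1, 1, -1]), Fraction(3), Fraction(1))` corresponds to $n = 3$, $k = 2$, $l = 1$ in (2.3); the values of $u$ and $v$ must satisfy $u \ne \pm v$.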
Applying (1.3) and (3.5) we write (3.4) as
$$T_2(v)T_1(u)G_2R(u + v)G_1T_2^{-1}(-v)T_1^{-1}(-u)R(u - v) , \tag{3.6}$$
which coincides with $\tilde B_2(v)R(u + v)\tilde B_1(u)R(u - v)$ due to (3.3). Thus, $\tilde B(u)$ satisfies the reflection Eq. (2.5).

We now show that $\varphi$ has trivial kernel. The Yangian Y(n) admits two natural filtrations; see [21]. Here we use the one defined by setting $\deg_1 t_{ij}^{(r)} = r$. Similarly, we define the filtration on the algebra $B(n, l)$ by $\deg_1 b_{ij}^{(r)} = r$. Let us first verify that the homomorphism $\varphi$ is filtration-preserving. By definition, the matrix elements of $\tilde B(u)$ are expressed as
$$\tilde b_{ij}(u) = \sum_{a=1}^{n}\varepsilon_a\,t_{ia}(u)\,t'_{aj}(-u) , \tag{3.7}$$
where the $t'_{ij}(u)$ denote the matrix elements of $T^{-1}(u)$. Inverting the matrix $T(u)$ we come to the following expression for $t'_{ij}(u)$:
$$t'_{ij}(u) = \delta_{ij} + \sum_{k=1}^{\infty}(-1)^k\sum_{a_1,\ldots,a_{k-1}=1}^{n} t^{\circ}_{ia_1}(u)\,t^{\circ}_{a_1a_2}(u)\cdots t^{\circ}_{a_{k-1}j}(u) , \tag{3.8}$$
where $t^{\circ}_{ij}(u) = t_{ij}(u) - \delta_{ij}$. Taking the coefficient at $u^{-r}$ for $r \ge 1$ we get
$$t'^{(r)}_{ij} = \sum_{k=1}^{r}(-1)^k\sum_{a_1,\ldots,a_{k-1}=1}^{n}\ \sum_{r_1+\cdots+r_k=r} t^{(r_1)}_{ia_1}\,t^{(r_2)}_{a_1a_2}\cdots t^{(r_k)}_{a_{k-1}j} , \tag{3.9}$$
with the last sum taken over positive integers $r_i$. Therefore, we find from (3.7) that the degree of $\tilde b_{ij}^{(r)}$ does not exceed $r$. Hence, $\varphi$ is filtration-preserving and it defines a homomorphism of the corresponding graded algebras
$$\mathrm{gr}_1 B(n, l) \to \mathrm{gr}_1\mathrm{Y}(n) . \tag{3.10}$$
Let $\bar t_{ij}^{(r)}$ denote the image of $t_{ij}^{(r)}$ in the $r$th component of $\mathrm{gr}_1\mathrm{Y}(n)$. The algebra $\mathrm{gr}_1\mathrm{Y}(n)$ is obviously commutative, and it was proved in [21, Theorem 1.22] that the elements $\bar t_{ij}^{(r)}$ are its algebraically independent generators.

On the other hand, by the defining relations (2.7), the algebra $\mathrm{gr}_1 B(n, l)$ is also commutative. Denote by $\bar b_{ij}^{(r)}$ the image of $b_{ij}^{(r)}$ in the $r$th component of $\mathrm{gr}_1 B(n, l)$. We find from the unitary condition (2.8) that the elements
$$\bar b_{ij}^{(2p-1)} , \quad 1 \le i, j \le k \ \text{ or } \ k + 1 \le i, j \le n , \qquad \bar b_{ij}^{(2p)} , \quad 1 \le i \le k < j \le n \ \text{ or } \ 1 \le j \le k < i \le n , \tag{3.11}$$
with $p$ running over positive integers, generate the algebra $\mathrm{gr}_1 B(n, l)$. Now, by (3.7), the image of $\bar b_{ij}^{(r)}$ under the homomorphism (3.10) has the form
$$\bar b_{ij}^{(r)} \mapsto \big(\varepsilon_i(-1)^{r-1} + \varepsilon_j\big)\,\bar t_{ij}^{(r)} + (\cdots) , \tag{3.12}$$
where $(\cdots)$ indicates a linear combination of monomials in the elements $\bar t_{ij}^{(s)}$ with $s < r$. This implies that the elements (3.11) are algebraically independent which completes the proof.
The theorem implies the following analog of the Poincaré–Birkhoff–Witt theorem for the algebra $B(n, l)$.

Corollary 3.2. Given any total ordering on the elements
$$b_{ij}^{(2p-1)} , \quad 1 \le i, j \le k \ \text{ or } \ k + 1 \le i, j \le n , \qquad b_{ij}^{(2p)} , \quad 1 \le i \le k < j \le n \ \text{ or } \ 1 \le j \le k < i \le n ,$$
with $p = 1, 2, \ldots$, the ordered monomials in these elements constitute a basis of the algebra $B(n, l)$.
Due to Proposition 3.1, we may regard $B(n,l)$ as a subalgebra of $Y(n)$ so that the generators $b_{ij}(u)$ are identified with the elements $\tilde b_{ij}(u)$ given by (3.7). Recall that the Yangian $Y(n)$ is a Hopf algebra with the coproduct $\Delta : Y(n) \to Y(n) \otimes Y(n)$ defined by
\[ \Delta(t_{ij}(u)) = \sum_{a=1}^{n} t_{ia}(u) \otimes t_{aj}(u). \tag{3.13} \]
Proposition 3.3. The subalgebra B(n, l) is a left coideal in Y(n): ∆(B(n, l)) ⊆ Y(n) ⊗ B(n, l) .
(3.14)
Proof. It suffices to show that the images of the generators of $B(n,l)$ under the coproduct are contained in $Y(n) \otimes B(n,l)$. Since $\Delta$ is an algebra homomorphism, the images of the matrix elements $t'_{ij}(u)$ of $T^{-1}(u)$ are given by
\[ \Delta(t'_{ij}(u)) = \sum_{a=1}^{n} t'_{aj}(u) \otimes t'_{ia}(u). \tag{3.15} \]
Therefore, using (3.7) we calculate that
\[ \Delta(b_{ij}(u)) = \sum_{a,c=1}^{n} t_{ia}(u)\, t'_{cj}(-u) \otimes b_{ac}(u), \tag{3.16} \]
completing the proof.

3.2. Sklyanin determinant

The quantum determinant $\mathrm{qdet}\,T(u)$ of the matrix $T(u)$ is a formal series in $u^{-1}$,
\[ \mathrm{qdet}\,T(u) = 1 + d_1 u^{-1} + d_2 u^{-2} + \cdots, \qquad d_i \in Y(n), \tag{3.17} \]
defined by
\[ \mathrm{qdet}\,T(u) = \sum_{p \in S_n} \mathrm{sgn}\, p \cdot t_{p(1)1}(u) \cdots t_{p(n)n}(u-n+1); \tag{3.18} \]
see Izergin–Korepin [10], Kulish–Sklyanin [12]. The elements $d_1, d_2, \dots$ are algebraically independent generators of the center of the algebra $Y(n)$; see e.g. [21] for the proof. The quantum determinant can be equivalently defined in an R-matrix form [12, 5]; see also [21]. Consider the tensor product
\[ Y(n)[[u^{-1}]] \otimes \mathrm{End}\,\mathbb{C}^n \otimes \cdots \otimes \mathrm{End}\,\mathbb{C}^n \tag{3.19} \]
with $n$ copies of $\mathrm{End}\,\mathbb{C}^n$. For complex parameters $u_1, \dots, u_n$ set
\[ R(u_1,\dots,u_n) = (R_{n-1,n})(R_{n-2,n} R_{n-2,n-1}) \cdots (R_{1n} \cdots R_{12}), \tag{3.20} \]
where we abbreviate $R_{ij} = R_{ij}(u_i - u_j)$, and the subindices enumerate the copies of $\mathrm{End}\,\mathbb{C}^n$ in (3.19). If we specialize to
\[ u_i = u - i + 1, \qquad i = 1, \dots, n, \tag{3.21} \]
then $R(u_1,\dots,u_n)$ becomes the one-dimensional anti-symmetrization operator $A_n$ in the space $(\mathrm{End}\,\mathbb{C}^n)^{\otimes n}$. The quantum determinant $\mathrm{qdet}\,T(u)$ is defined by the relation
\[ A_n T_1(u) \cdots T_n(u-n+1) = A_n\, \mathrm{qdet}\,T(u). \tag{3.22} \]
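As a small illustration of the explicit formula (3.18), here is the case $n = 2$ written out; this is only an unpacking of the sum over the two permutations in $S_2$ and adds nothing beyond (3.18).

```latex
\[
\mathrm{qdet}\,T(u) \;=\; t_{11}(u)\,t_{22}(u-1) \;-\; t_{21}(u)\,t_{12}(u-1)
\qquad (n = 2).
\]
```

The identity permutation contributes the first term and the transposition, with sign $-1$, contributes the second; note the shift $u \mapsto u-1$ in the second factor of each term.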
Now, using (2.5) and the relation
\[ R_{ij} R_{ir} R_{jr} = R_{jr} R_{ir} R_{ij} \tag{3.23} \]
(the Yang–Baxter equation) we derive the following relation for the B-matrices:
\[ \begin{aligned} &R(u_1,\dots,u_n)\, B_1(u_1) \tilde R_{12} \cdots \tilde R_{1n}\, B_2(u_2) \tilde R_{23} \cdots \tilde R_{2n}\, B_3(u_3) \cdots \tilde R_{n-1,n}\, B_n(u_n) \\ &\quad = B_n(u_n) \tilde R_{n-1,n} \cdots B_3(u_3) \tilde R_{2n} \cdots \tilde R_{23}\, B_2(u_2) \tilde R_{1n} \cdots \tilde R_{12}\, B_1(u_1)\, R(u_1,\dots,u_n), \end{aligned} \tag{3.24} \]
where $\tilde R_{ij} = R_{ij}(u_i + u_j)$. When we specialize the parameters $u_i$ as in (3.21), the element (3.24) will be equal to the product of the anti-symmetrizer $A_n$ and a series in $u^{-1}$ with coefficients in $B(n,l)$. We call this series the Sklyanin determinant and denote it by $\mathrm{sdet}\,B(u)$. That is, $\mathrm{sdet}\,B(u)$ is defined by
\[ A_n B_1(u) \tilde R_{12} \cdots \tilde R_{1n}\, B_2(u-1) \cdots \tilde R_{n-1,n}\, B_n(u-n+1) = A_n\, \mathrm{sdet}\,B(u). \tag{3.25} \]
As follows from the definition, the constant term of $\mathrm{sdet}\,B(u)$ is $\det G = (-1)^l$, so
\[ \mathrm{sdet}\,B(u) = (-1)^l + c_1 u^{-1} + c_2 u^{-2} + \cdots, \qquad c_i \in B(n,l). \tag{3.26} \]
In the next theorem we regard $B(n,l)$ as a subalgebra in $Y(n)$; see Theorem 3.1.

Theorem 3.4. We have the identity
\[ \mathrm{sdet}\,B(u) = \theta(u)\, \mathrm{qdet}\,T(u)\, (\mathrm{qdet}\,T(-u+n-1))^{-1}, \tag{3.27} \]
where
\[ \theta(u) = (-1)^l \prod_{i=1}^{k} (2u - 2n + 2i) \prod_{i=1}^{l} (2u - 2n + 2i) \prod_{i=1}^{n} \frac{1}{2u - 2n + i + 1}. \tag{3.28} \]
In particular, all the coefficients of $\mathrm{sdet}\,B(u)$ are central in $B(n,l)$. Moreover, the odd coefficients $c_1, c_3, \dots$ are algebraically independent and generate the center of the algebra $B(n,l)$.

Proof. Substitute $B(u) = T(u) G T^{-1}(-u)$ into (3.25). Applying relation (3.3) repeatedly, we bring the left hand side of (3.25) to the form
\[ A_n T_1(u) \cdots T_n(u-n+1)\, G_1 \tilde R_{12} \cdots \tilde R_{1n}\, G_2 \cdots \tilde R_{n-1,n}\, G_n\, T_1^{-1}(-u) \cdots T_n^{-1}(-u+n-1). \tag{3.29} \]
Now use (3.22), and then note that
\[ A_n G_1 \tilde R_{12} \cdots \tilde R_{1n}\, G_2 \cdots \tilde R_{n-1,n}\, G_n = A_n\, \theta(u), \tag{3.30} \]
for some scalar function $\theta(u)$. Indeed, this follows e.g. from (3.24) where we specialize $u_i$ as in (3.21) and consider the trivial representation of $B(n,l)$ such that $B(u) \mapsto G$. Furthermore, we have
\[ A_n T_1^{-1}(-u) \cdots T_n^{-1}(-u+n-1) = A_n\, (\mathrm{qdet}\,T(-u+n-1))^{-1}. \tag{3.31} \]
This follows from (3.22), if we first multiply both sides by $T_n^{-1}(u-n+1) \cdots T_1^{-1}(u)$ from the right, replace $u$ with $-u+n-1$ and then conjugate the left hand side by the permutation of the indices $1, \dots, n$ which sends $i$ to $n-i+1$.

To complete the proof of (3.27) we need to calculate $\theta(u)$. It suffices to find a diagonal matrix element of the operator on the left hand side of (3.30) corresponding to the vector $e_1 \otimes \cdots \otimes e_n$, where the $e_i$ denote the canonical basis of $\mathbb{C}^n$. We have $(n-1)!\, A_n = A_n A'_{n-1}$, where $A'_{n-1}$ is the anti-symmetrizer in the tensor product of the copies of $\mathrm{End}\,\mathbb{C}^n$ corresponding to the indices $2, \dots, n$. Note that $A'_{n-1}$ commutes with $G_1$. Furthermore, we have the identity
\[ A'_{n-1} \tilde R_{12} \cdots \tilde R_{1n} = \tilde R_{1n} \cdots \tilde R_{12}\, A'_{n-1}, \tag{3.32} \]
which easily follows from (3.23). This allows us to use induction on $n$ to find the matrix element and to perform the calculation of $\theta(u)$, which is now straightforward.

Using (3.27) we can conclude that all the coefficients of $\mathrm{sdet}\,B(u)$ belong to the center of the Yangian $Y(n)$, and hence to the center of its subalgebra $B(n,l)$. Furthermore, if we put
\[ c(u) = \theta\Big(u + \frac{n-1}{2}\Big)^{-1} \mathrm{sdet}\,B\Big(u + \frac{n-1}{2}\Big), \tag{3.33} \]
then by (3.27) we have the identity
\[ c(u) = d(u)\, d(-u)^{-1}, \qquad d(u) = \mathrm{qdet}\,T\Big(u + \frac{n-1}{2}\Big). \tag{3.34} \]
The coefficients of $d(u)$ are algebraically independent generators of the center of $Y(n)$. Since $c(u) c(-u) = 1$, repeating the argument of the second part of the proof of Theorem 3.1 for the case of $B(1,0)$ we find that all the even coefficients of $c(u)$ can be expressed in terms of the odd ones, and the latter are algebraically independent. Since
\[ \mathrm{sdet}\,B(u) = \theta(u)\, c\Big(u - \frac{n-1}{2}\Big), \tag{3.35} \]
the same holds for the coefficients of $\mathrm{sdet}\,B(u)$.

Finally, let us show that the center of $B(n,l)$ is generated by the coefficients of $\mathrm{sdet}\,B(u)$. We use another filtration on $B(n,l)$ defined by setting $\deg_2 b^{(r)}_{ij} = r - 1$. We first verify that the corresponding graded algebra $\mathrm{gr}_2 B(n,l)$ is isomorphic to the universal enveloping algebra for a twisted polynomial current Lie algebra. Consider
the involution $\sigma$ of the Lie algebra $\mathfrak{gl}(n)$ given by $\sigma : E_{ij} \mapsto \varepsilon_i \varepsilon_j E_{ij}$. Denote by $\mathfrak{a}_0$ and $\mathfrak{a}_1$ the eigenspaces of $\sigma$ corresponding to the eigenvalues $1$ and $-1$, respectively. In particular, $\mathfrak{a}_0$ is a Lie subalgebra of $\mathfrak{gl}(n)$ isomorphic to $\mathfrak{gl}(k) \oplus \mathfrak{gl}(l)$. Denote by $\mathfrak{gl}(n)[x]^{\sigma}$ the Lie algebra of polynomials in a variable $x$ of the form
\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m, \qquad a_{2i} \in \mathfrak{a}_0, \quad a_{2i-1} \in \mathfrak{a}_1. \tag{3.36} \]
We claim that the following is an algebra isomorphism:
\[ \mathrm{gr}_2 B(n,l) \cong U(\mathfrak{gl}(n)[x]^{\sigma}). \tag{3.37} \]
Indeed, denote by $\bar b^{(r)}_{ij}$ the image of $b^{(r)}_{ij}$ in the $(r-1)$th component of $\mathrm{gr}_2 B(n,l)$. Then by the unitary condition we have for $r \ge 1$
\[ (\varepsilon_i + (-1)^r \varepsilon_j)\, \bar b^{(r)}_{ij} = 0, \tag{3.38} \]
while the reflection equation gives
\[ [\bar b^{(r)}_{ij}, \bar b^{(s)}_{kl}] = \delta_{kj} (\varepsilon_i (-1)^{r-1} + \varepsilon_j)\, \bar b^{(r+s-1)}_{il} - \delta_{il} (\varepsilon_i + \varepsilon_j (-1)^{r-1})\, \bar b^{(r+s-1)}_{kj}. \tag{3.39} \]
This shows that the mapping
\[ \bar b^{(r)}_{ij} \mapsto (\varepsilon_i + (-1)^{r-1} \varepsilon_j)\, E_{ij}\, x^{r-1} \tag{3.40} \]
defines an algebra homomorphism $\mathrm{gr}_2 B(n,l) \to U(\mathfrak{gl}(n)[x]^{\sigma})$. Corollary 3.2 ensures that its kernel is trivial. Further, it is easily deduced from the definition (3.25) of the Sklyanin determinant that the images of $\bar c_{2m+1}$ under the isomorphism (3.37) are given by
\[ \bar c_{2m+1} \mapsto (-1)^l\, 2\, (E_{11} + \cdots + E_{nn})\, x^{2m}, \qquad m \ge 0. \tag{3.41} \]
The theorem will be proved if we show that the center of $U(\mathfrak{gl}(n)[x]^{\sigma})$ is generated by the elements $(E_{11} + \cdots + E_{nn}) x^{2m}$ with $m \ge 0$. This is equivalent to the claim that the center of $U(\mathfrak{sl}(n)[x]^{\sigma})$ is trivial. Here $\mathfrak{sl}(n)[x]^{\sigma}$ is the Lie algebra of polynomials of the form (3.36) where now $\mathfrak{a}_0 = \mathfrak{sl}(n) \cap (\mathfrak{gl}(k) \oplus \mathfrak{gl}(l))$. However, this follows from a slight modification of a more general result; see [21, Proposition 4.10]. Namely, if $\mathfrak{a}$ is any Lie algebra and $\sigma$ is its involution, we define $\mathfrak{a}[x]^{\sigma}$ as the Lie algebra of polynomials of type (3.36). Then an argument similar to [21] proves that if the center of $\mathfrak{a}$ is trivial, and the $\mathfrak{a}_0$-module $\mathfrak{a}_1$ has no nontrivial invariant elements, then the center of $U(\mathfrak{a}[x]^{\sigma})$ is trivial.

Next we give some remarks.

(1) The algebra $B(n,l)$ can also be regarded as a deformation of the universal enveloping algebra $U(\mathfrak{gl}(n)[x]^{\sigma})$. To see this, introduce the deformation parameter $h$ and rewrite the defining relations for $B(n,l)$ in terms of the re-scaled generators $b^{\prime(r)}_{ij} = b^{(r)}_{ij} h^{r-1}$. These define a family of algebras $B(n,l)_h$. If $h \ne 0$ the algebra $B(n,l)_h$ is isomorphic to $B(n,l)$, while for $h = 0$ one obtains the universal enveloping algebra $U(\mathfrak{gl}(n)[x]^{\sigma})$.

(2) It would be interesting to find an explicit formula for $\mathrm{sdet}\,B(u)$ in terms of the generators $b_{ij}(u)$.
Recall that the Yangian $Y(\mathfrak{sl}(n))$ for the special linear Lie algebra $\mathfrak{sl}(n)$ can be defined as the subalgebra of $Y(n)$ which consists of the elements stable under all automorphisms of the form $T(u) \mapsto h(u) T(u)$, where $h(u)$ is a series in $u^{-1}$ with the constant term 1; see [21]. Then one has the tensor product decomposition
\[ Y(n) = Z(n) \otimes Y(\mathfrak{sl}(n)), \tag{3.42} \]
where $Z(n)$ is the center of $Y(n)$. Define the special reflection algebra $SB(n,l)$ by
\[ SB(n,l) = B(n,l) \cap Y(\mathfrak{sl}(n)). \tag{3.43} \]
In other words, $SB(n,l)$ consists of the elements of $B(n,l)$ which are stable under all automorphisms (2.15). It is implied by (3.42) (cf. [21, Proposition 4.14]) that the following decomposition holds:
\[ B(n,l) = Z(n,l) \otimes SB(n,l), \tag{3.44} \]
where $Z(n,l)$ is the center of $B(n,l)$.

4. Representations of B(n, l)

Here we show that the Drinfeld highest weight theory [9] applies to the representations of the algebras $B(n,l)$; see also [20] for the case of twisted Yangians. We then give a complete description of the finite-dimensional irreducible representations of $B(n,l)$.

4.1. Highest weight representations

Recall first Drinfeld's classification results for representations of the Yangian $Y(n)$ [9]; see also [20]. A representation $L$ of the Yangian $Y(n)$ is called highest weight if there exists a nonzero vector $\xi \in L$ such that $L$ is generated by $\xi$,
\[ \begin{aligned} t_{ij}(u)\, \xi &= 0 && \text{for } 1 \le i < j \le n, \\ t_{ii}(u)\, \xi &= \lambda_i(u)\, \xi && \text{for } 1 \le i \le n, \end{aligned} \tag{4.1} \]
for some formal series $\lambda_i(u) \in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$. The vector $\xi$ is called the highest vector of $L$ and the set $\lambda(u) = (\lambda_1(u), \dots, \lambda_n(u))$ is the highest weight of $L$.

Let $\lambda(u) = (\lambda_1(u), \dots, \lambda_n(u))$ be an $n$-tuple of formal series. Then there exists a unique, up to an isomorphism, irreducible highest weight module $L(\lambda(u))$ with the highest weight $\lambda(u)$. Any finite-dimensional irreducible representation of $Y(n)$ is isomorphic to $L(\lambda(u))$ for some $\lambda(u)$. The representation $L(\lambda(u))$ is finite-dimensional if and only if there exist monic polynomials $Q_1(u), \dots, Q_{n-1}(u)$ in $u$ such that
\[ \frac{\lambda_i(u)}{\lambda_{i+1}(u)} = \frac{Q_i(u+1)}{Q_i(u)}, \qquad i = 1, \dots, n-1. \tag{4.2} \]
These relations are analogs of the dominance conditions in the representation theory of semisimple Lie algebras. Similar relations involving monic polynomials are
given by Drinfeld [9] to describe finite-dimensional irreducible representations of the Yangian $Y(\mathfrak{a})$ for any simple Lie algebra $\mathfrak{a}$. The polynomials $Q_i(u)$ are called the Drinfeld polynomials (usually denoted by $P_i(u)$).

Let us now turn to the reflection algebras. A representation $V$ of the algebra $B(n,l)$ is called highest weight if there exists a nonzero vector $\xi \in V$ such that $V$ is generated by $\xi$,
\[ \begin{aligned} b_{ij}(u)\, \xi &= 0 && \text{for } 1 \le i < j \le n, \\ b_{ii}(u)\, \xi &= \mu_i(u)\, \xi && \text{for } 1 \le i \le n, \end{aligned} \tag{4.3} \]
for some formal series $\mu_i(u) \in \varepsilon_i + u^{-1}\mathbb{C}[[u^{-1}]]$. The vector $\xi$ is called the highest vector of $V$ and the set $\mu(u) = (\mu_1(u), \dots, \mu_n(u))$ is the highest weight of $V$.

Theorem 4.1. Every finite-dimensional irreducible representation $V$ of the algebra $B(n,l)$ is a highest weight representation. Moreover, $V$ contains a unique (up to a constant factor) highest vector.

Proof. Our approach is quite standard; cf. [4, Proposition 12.2.3], [20]. Introduce the subspace of $V$
\[ V^0 = \{ \eta \in V \mid b_{ij}(u)\, \eta = 0, \ 1 \le i < j \le n \}. \tag{4.4} \]
We show first that $V^0$ is nonzero. The defining relations (2.7) give
\[ [b^{(1)}_{ij}, b_{rs}(u)] = (\varepsilon_i + \varepsilon_j)(\delta_{rj}\, b_{is}(u) - \delta_{is}\, b_{rj}(u)). \tag{4.5} \]
This implies that $V^0$ can be equivalently defined as
\[ V^0 = \{ \eta \in V \mid b_{i,i+1}(u)\, \eta = 0, \ i = 1, \dots, n-1 \}. \tag{4.6} \]
The operators $\varepsilon_i b^{(1)}_{ii}$ are pairwise commuting and so they have a common eigenvector $\zeta \ne 0$ in $V$. If we suppose that $V^0 = 0$ then there exists an infinite sequence of nonzero vectors in $V$,
\[ \zeta, \qquad b^{(r_1)}_{i_1,i_1+1}\, \zeta, \qquad b^{(r_2)}_{i_2,i_2+1}\, b^{(r_1)}_{i_1,i_1+1}\, \zeta, \ \dots. \tag{4.7} \]
By (4.5) they are all eigenvectors for the operators $\varepsilon_i b^{(1)}_{ii}$, $i = 1, \dots, n$, of different weights. Thus, they are linearly independent, which contradicts the assumption $\dim V < \infty$. So, $V^0$ is nontrivial.

Next, we show that all the operators $b_{rr}(u)$ preserve the subspace $V^0$. We use a reverse induction on $r$. For $r = n$ we see from (2.7) that if $i < j < n$ then $b_{ij}(u) b_{nn}(v) \equiv 0$, where the equivalence is modulo the left ideal in $B(n,l)$ generated by the coefficients of $b_{ij}(u)$ with $i < j$. Similarly, if $i < n$ then the assertion follows from the equivalence
\[ b_{in}(u)\, b_{nn}(v) \equiv \frac{1}{u+v}\, b_{in}(u)\, b_{nn}(v) \tag{4.8} \]
which is immediate from (2.7). Now let $r < n$. Note the following obvious consequence of (2.7): if $i < m$ and $i \ne j$ then
\[ b_{ij}(u)\, b_{jm}(v) \equiv \frac{1}{u+v} \sum_{a=m}^{n} b_{ia}(u)\, b_{am}(v). \tag{4.9} \]
In particular, the right hand side is independent of $j$ and so, if $i < m$ then for any indices $j, j' \ne i$ we have
\[ b_{ij}(u)\, b_{jm}(v) \equiv b_{ij'}(u)\, b_{j'm}(v). \tag{4.10} \]
If $i + 1 < r$ then it is immediate from (2.7) that $b_{i,i+1}(u) b_{rr}(v) \equiv 0$. Further, using (4.9) we derive from (2.7) that
\[ b_{r-1,r}(u)\, b_{rr}(v) \equiv \frac{n-r+1}{u+v}\, b_{r-1,r}(u)\, b_{rr}(v), \tag{4.11} \]
which gives $b_{r-1,r}(u) b_{rr}(v) \equiv 0$. Next, by (2.7),
\[ b_{r,r+1}(u)\, b_{rr}(v) \equiv \frac{1}{u-v} \big( b_{r,r+1}(u)\, b_{rr}(v) - b_{r,r+1}(v)\, b_{rr}(u) \big) - \frac{1}{u+v} \sum_{a=r+1}^{n} b_{ra}(v)\, b_{a,r+1}(u). \tag{4.12} \]
By (4.10), the second sum here is $-\frac{n-r}{u+v}\, b_{r,r+1}(v)\, b_{r+1,r+1}(u)$, which is equivalent to zero by the induction hypothesis. So we get
\[ \frac{u-v-1}{u-v}\, b_{r,r+1}(u)\, b_{rr}(v) + \frac{1}{u-v}\, b_{r,r+1}(v)\, b_{rr}(u) \equiv 0. \tag{4.13} \]
Swapping $u$ and $v$ we obtain
\[ -\frac{1}{u-v}\, b_{r,r+1}(u)\, b_{rr}(v) + \frac{u-v+1}{u-v}\, b_{r,r+1}(v)\, b_{rr}(u) \equiv 0. \tag{4.14} \]
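The relations (4.13) and (4.14) can be read as a homogeneous linear system in the two products $X = b_{r,r+1}(u) b_{rr}(v)$ and $Y = b_{r,r+1}(v) b_{rr}(u)$. The following short computation is a sketch of why this system is nondegenerate; the shorthand $d = u - v$ and the letters $X$, $Y$ are introduced only for this remark.

```latex
\[
\begin{pmatrix}
\dfrac{d-1}{d} & \dfrac{1}{d} \\[2mm]
-\dfrac{1}{d} & \dfrac{d+1}{d}
\end{pmatrix}
\begin{pmatrix} X \\ Y \end{pmatrix} \equiv 0,
\qquad
\det = \frac{(d-1)(d+1) + 1}{d^{2}} = \frac{d^{2}}{d^{2}} = 1 .
\]
```

Since the determinant equals 1 identically in $u$ and $v$, the coefficient matrix is invertible over the rational functions in $u$ and $v$.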
The system of linear Eqs. (4.13) and (4.14) has only the zero solution, which proves the assertion in this case. Finally, for $i > r$ we get from (2.7)
\[ b_{i,i+1}(u)\, b_{rr}(v) \equiv \frac{1}{u-v} \big( b_{r,i+1}(u)\, b_{ir}(v) - b_{r,i+1}(v)\, b_{ir}(u) \big), \tag{4.15} \]
and
\[ b_{r,i+1}(u)\, b_{ir}(v) \equiv \frac{1}{u-v} \big( b_{i,i+1}(u)\, b_{rr}(v) - b_{i,i+1}(v)\, b_{rr}(u) \big) - \frac{1}{u+v} \sum_{a=i+1}^{n} b_{ia}(v)\, b_{a,i+1}(u). \tag{4.16} \]
By (4.10), the second sum here is equivalent to $-\frac{n-i}{u+v}\, b_{i,i+1}(v)\, b_{i+1,i+1}(u)$, which is equivalent to zero by the induction hypothesis. Therefore, (4.16) implies that $b_{r,i+1}(u) b_{ir}(v)$ is symmetric in $u$ and $v$, which shows that the left hand side of (4.15) is equivalent to 0.
Our next step is to show that all the operators $b_{rr}(u)$ on $V^0$ commute. For any $1 \le r \le n$, the defining relations (2.7) give
\[ \Big( 1 - \frac{1}{u+v} \Big) [b_{rr}(u), b_{rr}(v)] \equiv \frac{1}{u+v}\, \alpha_r(u,v), \tag{4.17} \]
where
\[ \alpha_r(u,v) = \sum_{a=r+1}^{n} \big( b_{ra}(u)\, b_{ar}(v) - b_{ra}(v)\, b_{ar}(u) \big). \tag{4.18} \]
Using (2.7) again we obtain for $a \ge r+1$
\[ b_{ra}(u)\, b_{ar}(v) - b_{ra}(v)\, b_{ar}(u) \equiv \frac{1}{u+v} \big( [b_{rr}(u), b_{rr}(v)] + [b_{aa}(u), b_{aa}(v)] + \alpha_r(u,v) + \alpha_a(u,v) \big). \tag{4.19} \]
Taking the sum over $a$ gives
\[ (u+v-n+r)\, \alpha_r(u,v) \equiv (n-r)\, [b_{rr}(u), b_{rr}(v)] + \sum_{a=r+1}^{n} \big( [b_{aa}(u), b_{aa}(v)] + \alpha_a(u,v) \big). \tag{4.20} \]
Using (4.17) we easily prove by a reverse induction on $r$ that $[b_{rr}(u), b_{rr}(v)] \equiv 0$ and $\alpha_r(u,v) \equiv 0$. Finally, if $i < r$ then by (2.7)
\[ [b_{ii}(u), b_{rr}(v)] \equiv -\frac{1}{u^2 - v^2} \big( [b_{rr}(u), b_{rr}(v)] + \alpha_r(u,v) \big), \tag{4.21} \]
which is equivalent to 0 as was shown above. We can now conclude that the subspace $V^0$ contains a common eigenvector $\xi \ne 0$ for the operators $b_{rr}(u)$, that is, (4.3) holds for some formal series $\mu_i(u)$. Since $V$ is irreducible, the submodule $B(n,l)\xi$ must coincide with $V$. The uniqueness of $\xi$ now follows from Corollary 3.2.
Given any $n$-tuple $\mu(u) = (\mu_1(u), \dots, \mu_n(u))$, where $\mu_i(u) \in \varepsilon_i + u^{-1}\mathbb{C}[[u^{-1}]]$, we define the Verma module $M(\mu(u))$ as the quotient of $B(n,l)$ by the left ideal generated by all the coefficients of the series $b_{ij}(u)$ with $1 \le i < j \le n$, and $b_{ii}(u) - \mu_i(u)$ for $i = 1, \dots, n$. However, contrary to the case of the Yangian $Y(n)$, the module $M(\mu(u))$ can be trivial for some $\mu(u)$. If $M(\mu(u))$ is nontrivial we denote by $V(\mu(u))$ its unique irreducible quotient. Any irreducible highest weight module with the highest weight $\mu(u)$ is clearly isomorphic to $V(\mu(u))$.

Theorem 4.2. The Verma module $M(\mu(u))$ is nontrivial (i.e. $V(\mu(u))$ exists) if and only if
\[ \mu_n(u)\, \mu_n(-u) = 1, \tag{4.22} \]
and for each $i = 1, \dots, n-1$ the following conditions hold
\[ \tilde\mu_i(u)\, \tilde\mu_i(-u+n-i) = \tilde\mu_{i+1}(u)\, \tilde\mu_{i+1}(-u+n-i), \tag{4.23} \]
where
\[ \tilde\mu_i(u) = (2u - n + i)\, \mu_i(u) + \mu_{i+1}(u) + \cdots + \mu_n(u). \tag{4.24} \]
Proof. Suppose first that $V(\mu(u))$ exists. For each $i = 1, \dots, n$ set
\[ \beta_i(u,v) = \sum_{a=i}^{n} b_{ia}(u)\, b_{ai}(v). \tag{4.25} \]
The highest vector of $V(\mu(u))$ is an eigenvector of $\beta_n(u,v)$ with the eigenvalue $\mu_n(u)\mu_n(v)$. Due to the unitary condition (2.8) this eigenvalue must be equal to 1 if $v = -u$, which proves (4.22). Using the notation of the proof of Theorem 4.1, we derive from (2.7) that
\[ \frac{u+v-n+i}{u+v}\, \beta_i(u,v) \equiv b_{ii}(u)\, b_{ii}(v) - \frac{1}{u+v} \sum_{a=i+1}^{n} \beta_a(v,u) + \frac{1}{u-v} \sum_{a=i+1}^{n} \big( b_{aa}(u)\, b_{ii}(v) - b_{aa}(v)\, b_{ii}(u) \big). \tag{4.26} \]
Therefore,
\[ \frac{u+v-n+i}{u+v}\, \big( \beta_i(u,v) - \beta_i(v,u) \big) \equiv \frac{1}{u+v} \sum_{a=i+1}^{n} \big( \beta_a(u,v) - \beta_a(v,u) \big). \tag{4.27} \]
Since clearly $\beta_n(u,v) \equiv \beta_n(v,u)$, an easy induction implies that $\beta_i(u,v) \equiv \beta_i(v,u)$ for all $i$. Now consider (4.26) with $i$ replaced by $i+1$ and subtract this from (4.26). This gives
\[ \begin{aligned} \frac{u+v-n+i}{u+v}\, \big( \beta_i(u,v) - \beta_{i+1}(u,v) \big) \equiv{} & b_{ii}(u)\, b_{ii}(v) + \frac{1}{u-v} \sum_{a=i+1}^{n} \big( b_{aa}(u)\, b_{ii}(v) - b_{aa}(v)\, b_{ii}(u) \big) \\ & - b_{i+1,i+1}(u)\, b_{i+1,i+1}(v) - \frac{1}{u-v} \sum_{a=i+2}^{n} \big( b_{aa}(u)\, b_{i+1,i+1}(v) - b_{aa}(v)\, b_{i+1,i+1}(u) \big). \end{aligned} \tag{4.28} \]
Apply both sides to the highest vector of $V(\mu(u))$ and put $u + v = n - i$. We then get the following condition for the components of $\mu(u)$,
\[ \begin{aligned} & \mu_i(u)\, \mu_i(v) + \frac{1}{u-v} \sum_{a=i+1}^{n} \big( \mu_a(u)\, \mu_i(v) - \mu_a(v)\, \mu_i(u) \big) \\ & \quad = \mu_{i+1}(u)\, \mu_{i+1}(v) + \frac{1}{u-v} \sum_{a=i+2}^{n} \big( \mu_a(u)\, \mu_{i+1}(v) - \mu_a(v)\, \mu_{i+1}(u) \big), \end{aligned} \tag{4.29} \]
where $v = -u + n - i$. It is now a straightforward calculation to verify that this condition is equivalent to (4.23).
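The calculation relating (4.29) to (4.23) can be sketched as follows. With $v = -u + n - i$ one has $2u - n + i = u - v$. The shorthand $S(u) = \mu_{i+1}(u) + \cdots + \mu_n(u)$ and $S'(u) = \mu_{i+2}(u) + \cdots + \mu_n(u)$ is introduced here only for this remark and is not part of the original text.

```latex
\begin{align*}
\tilde\mu_i(u) &= (u-v)\,\mu_i(u) + S(u), &
\tilde\mu_i(v) &= -(u-v)\,\mu_i(v) + S(v), \\
\tilde\mu_{i+1}(u) &= (u-v)\,\mu_{i+1}(u) + S(u), &
\tilde\mu_{i+1}(v) &= -(u-v)\,\mu_{i+1}(v) + S(v),
\end{align*}
and therefore
\begin{align*}
\tilde\mu_i(u)\,\tilde\mu_i(v) &= S(u)S(v) - (u-v)^2
 \Big[ \mu_i(u)\mu_i(v) + \tfrac{1}{u-v}\big( S(u)\mu_i(v) - S(v)\mu_i(u) \big) \Big], \\
\tilde\mu_{i+1}(u)\,\tilde\mu_{i+1}(v) &= S(u)S(v) - (u-v)^2
 \Big[ \mu_{i+1}(u)\mu_{i+1}(v) + \tfrac{1}{u-v}\big( S'(u)\mu_{i+1}(v) - S'(v)\mu_{i+1}(u) \big) \Big].
\end{align*}
```

The two square brackets are exactly the left and right hand sides of (4.29). In the second line the terms $\mu_{i+1}(u)\mu_{i+1}(v)$ coming from $S = \mu_{i+1} + S'$ cancel. Since $u - v = 2u - n + i$ is not identically zero, the two products agree if and only if the two brackets agree.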
Conversely, suppose that the conditions (4.22) and (4.23) hold. We shall demonstrate that there exists a highest weight module $L(\lambda(u))$ over the Yangian $Y(n)$ such that the $B(n,l)$-cyclic span of the highest vector is a $B(n,l)$-module with the highest weight $\mu(u)$. This will prove the existence of $V(\mu(u))$.

Suppose first that $L(\lambda(u))$ is an arbitrary irreducible highest weight module. The quantum comatrix $\hat T(u) = (\hat t_{ij}(u))$ is defined by
\[ \mathrm{qdet}\,T(u) = \hat T(u)\, T(u-n+1). \tag{4.30} \]
It can be deduced from (3.22) that the matrix element $\hat t_{ij}(u)$ equals $(-1)^{i+j}$ times the quantum determinant of the submatrix of $T(u)$ obtained by removing the $i$th column and $j$th row. Therefore, the matrix element $t'_{ij}(u)$ of the inverse matrix $T^{-1}(u)$ can be expressed as
\[ t'_{ij}(u) = (\mathrm{qdet}\,T(u+n-1))^{-1}\, \hat t_{ij}(u+n-1). \tag{4.31} \]
This implies that the highest vector $\xi$ of the $Y(n)$-module $L(\lambda(u))$ is annihilated by the elements $t'_{ij}(u)$ with $i < j$, and that $\xi$ is an eigenvector for the $t'_{ii}(u)$. The corresponding eigenvalues are found from the formula (3.18) and given by
\[ t'_{ii}(u)\, \xi = \frac{\lambda_{i+1}(u+n-i) \cdots \lambda_n(u+1)}{\lambda_i(u+n-i) \cdots \lambda_n(u)}\, \xi. \tag{4.32} \]
The following relations are easily derived from (1.3); see also [21, Sec. 7]:
\[ [t_{ij}(u), t'_{rs}(v)] = \frac{1}{u-v} \Big( \delta_{rj} \sum_{a=1}^{n} t_{ia}(u)\, t'_{as}(v) - \delta_{is} \sum_{a=1}^{n} t'_{ra}(v)\, t_{aj}(u) \Big). \tag{4.33} \]
Hence modulo the left ideal in $Y(n)$ generated by the elements $t_{ij}(u)$ with $i < j$ we can write for $a > i$
\[ t_{ia}(u)\, t'_{ai}(-u) \equiv \frac{1}{2u} \sum_{c=i}^{n} t_{ic}(u)\, t'_{ci}(-u) - \frac{1}{2u} \sum_{c=a}^{n} t'_{ac}(-u)\, t_{ca}(u). \tag{4.34} \]
Due to (3.7) this implies that
\[ b_{ii}(u) \equiv \varepsilon_i\, t_{ii}(u)\, t'_{ii}(-u) + \frac{1}{2u} \sum_{a=i+1}^{n} \varepsilon_a \Big( \sum_{c=i}^{n} t_{ic}(u)\, t'_{ci}(-u) - \sum_{c=a}^{n} t'_{ac}(-u)\, t_{ca}(u) \Big). \tag{4.35} \]
Now suppose that $i \ge k+1$ and set
\[ f_{ii}(u) = -\sum_{c=i}^{n} t'_{ic}(-u)\, t_{ci}(u). \tag{4.36} \]
Then (4.35) can be written as
\[ \frac{2u-n+i}{2u}\, b_{ii}(u) \equiv -t_{ii}(u)\, t'_{ii}(-u) - \frac{1}{2u} \sum_{a=i+1}^{n} f_{aa}(u). \tag{4.37} \]
A similar transformation for $f_{ii}(u)$ gives
\[ \frac{2u-n+i}{2u}\, f_{ii}(u) \equiv -t'_{ii}(-u)\, t_{ii}(u) - \frac{1}{2u} \sum_{a=i+1}^{n} b_{aa}(u). \tag{4.38} \]
Since $b_{nn}(u) \equiv f_{nn}(u)$, an easy induction proves that $b_{ii}(u) \equiv f_{ii}(u)$ for all $i = k+1, \dots, n$. By (4.37), we have for those $i$,
\[ \frac{2u-n+i}{2u}\, b_{ii}(u) + \frac{1}{2u} \sum_{a=i+1}^{n} b_{aa}(u) \equiv -t_{ii}(u)\, t'_{ii}(-u). \tag{4.39} \]
In the same way for $i = 1, \dots, k$ we get
\[ \frac{2u-n+i}{2u-2l}\, b_{ii}(u) + \frac{1}{2u-2l} \sum_{a=i+1}^{n} b_{aa}(u) \equiv t_{ii}(u)\, t'_{ii}(-u). \tag{4.40} \]
Now, applying both sides of (4.39) and (4.40) to $\xi$ and using (4.32) we obtain
\[ \mu_n(u) = \begin{cases} \lambda_n(u)\, \lambda_n(-u)^{-1}, & \text{if } l = 0, \\ -\lambda_n(u)\, \lambda_n(-u)^{-1}, & \text{if } l > 0. \end{cases} \tag{4.41} \]
Moreover,
\[ \frac{\tilde\mu_i(u)}{\tilde\mu_{i+1}(u)} = \frac{\lambda_i(u)\, \lambda_{i+1}(-u+n-i)}{\lambda_{i+1}(u)\, \lambda_i(-u+n-i)}, \tag{4.42} \]
if $i \ne k$; while
\[ \frac{\tilde\mu_k(u)}{\tilde\mu_{k+1}(u)} = \frac{l-u}{u} \cdot \frac{\lambda_k(u)\, \lambda_{k+1}(-u+l)}{\lambda_{k+1}(u)\, \lambda_k(-u+l)}. \tag{4.43} \]
The proof of the theorem is now completed as follows. By (4.22) there exists a series $\lambda_n(u) \in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$ such that (4.41) holds. Similarly, (4.23) ensures that there exist series $\lambda_1(u), \dots, \lambda_{n-1}(u)$ satisfying (4.42) and (4.43). Thus, $V(\mu(u))$ is isomorphic to the irreducible quotient of $B(n,l)\xi$.

4.2. Representations of B(2, 0) and B(2, 1)

The algebras $B(2,0)$ and $B(2,1)$ turn out to be isomorphic to the symplectic and orthogonal twisted Yangians $Y^-(2)$ and $Y^+(2)$, respectively (see the remark at the end of this section). Thus the representations of $B(2,0)$ and $B(2,1)$ can be described by using the results of [20] for the twisted Yangians. In our argument we use the corresponding isomorphisms of the extended algebras (Proposition 4.3). Note also that this similarity between the reflection algebras and the twisted Yangians does not extend to higher dimensions.

Recall (see e.g. [21]) that the twisted Yangian $Y^{\pm}(2)$ is an associative algebra with generators $s^{(1)}_{ij}, s^{(2)}_{ij}, \dots$, where $i, j \in \{1, 2\}$. Introduce the generating series
\[ s_{ij}(u) = \delta_{ij} + s^{(1)}_{ij} u^{-1} + s^{(2)}_{ij} u^{-2} + \cdots \tag{4.44} \]
and combine them into the matrix $S(u) = (s_{ij}(u))$. Then the defining relations have the form of a reflection equation analogous to (2.5),
\[ R(u-v)\, S_1(u)\, R^t(-u-v)\, S_2(v) = S_2(v)\, R^t(-u-v)\, S_1(u)\, R(u-v), \tag{4.45} \]
as well as the symmetry relation
\[ S^t(-u) = S(u) \pm \frac{S(u) - S(-u)}{2u}. \tag{4.46} \]
Here we have used the notation of Sec. 2, and
\[ R^t(u) = 1 - Q u^{-1}, \qquad Q = \sum_{i,j=1}^{2} E^t_{ij} \otimes E_{ji} \in (\mathrm{End}\,\mathbb{C}^2)^{\otimes 2}, \]
where the matrix transposition is defined by
\[ A^t = \begin{pmatrix} a_{22} & a_{12} \\ a_{21} & a_{11} \end{pmatrix} \qquad\text{and}\qquad A^t = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}, \tag{4.47} \]
for $Y^+(2)$ and $Y^-(2)$, respectively.

Another pair of associative algebras $\tilde Y^+(2)$ and $\tilde Y^-(2)$ is defined in the same way as $Y^{\pm}(2)$, but with the symmetry relation (4.46) dropped. Then the algebra $Y^{\pm}(2)$ is isomorphic to the quotient of $\tilde Y^{\pm}(2)$ by the ideal generated by all coefficients of the series $\delta(u)$ defined by the relation
\[ \Big( 1 \mp \frac{1}{2u} \Big)\, \delta(u)\, Q = Q\, S_1(u)\, R(2u)\, S_2^{-1}(-u). \tag{4.48} \]
Moreover, these coefficients belong to the center of $\tilde Y^{\pm}(2)$; see [21, Theorems 6.3 and 6.4] for the proofs of these assertions. Note also that $\delta(u)$ satisfies $\delta(u)\delta(-u) = 1$, which is easy to deduce from (4.48).

Recall that $\tilde B(n,l)$ is the algebra with the defining relations (2.5); see Sec. 2. Here we let $G = \mathrm{diag}(1, -1)$.

Proposition 4.3. The mappings
\[ S(u) \mapsto B\Big(u + \frac{1}{2}\Big) \qquad\text{and}\qquad S(u) \mapsto B\Big(u + \frac{1}{2}\Big)\, G \tag{4.49} \]
define algebra isomorphisms $\tilde Y^-(2) \to \tilde B(2,0)$ and $\tilde Y^+(2) \to \tilde B(2,1)$, respectively.

Proof. We have to show that the matrix $B(u + \frac{1}{2})$ or $B(u + \frac{1}{2})G$, respectively, satisfies the relation (4.45). Note that in the case of $\tilde Y^-(2)$ the operator $P + Q$ is the identity on $(\mathrm{End}\,\mathbb{C}^2)^{\otimes 2}$. This implies that
\[ R^t(-u-v) = \frac{u+v+1}{u+v}\, R(u+v+1) \tag{4.50} \]
proving the assertion. In the case of $\tilde Y^+(2)$ both $G_1 Q G_1 + P$ and $G_2 Q G_2 + P$ are the identity operators, which implies that
\[ G_1 R^t(-u-v)\, G_1 = G_2 R^t(-u-v)\, G_2 = \frac{u+v+1}{u+v}\, R(u+v+1). \tag{4.51} \]
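The identities (4.50) and (4.51) involve only explicit 4×4 matrices, so they can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic. It assumes the conventions $R(u) = 1 - Pu^{-1}$ with $P = \sum E_{ij} \otimes E_{ji}$ from Sec. 2 (not reproduced in this part of the text) together with the transpositions (4.47). All function and variable names are illustrative, and the single value `w` stands in for $u + v$.

```python
from fractions import Fraction

def unit(i, j):
    """2x2 matrix unit E_ij (indices 0 or 1)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(2)] for r in range(2)]

def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def t_plus(a):
    """Transposition attached to Y^+(2) in (4.47)."""
    return [[a[1][1], a[0][1]], [a[1][0], a[0][0]]]

def t_minus(a):
    """Transposition attached to Y^-(2) in (4.47)."""
    return [[a[1][1], -a[0][1]], [-a[1][0], a[0][0]]]

I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)

def tensor_sum(transpose):
    """Sum over i, j of transpose(E_ij) (x) E_ji."""
    total = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            total = add(total, kron(transpose(unit(i, j)), unit(j, i)))
    return total

P = tensor_sum(lambda a: a)   # permutation operator (assumed convention)
Q_PLUS = tensor_sum(t_plus)
Q_MINUS = tensor_sum(t_minus)

def R(w):
    """R(w) = 1 - P / w."""
    return add(I4, scale(-1 / Fraction(w), P))

def R_t(w, q):
    """R^t(w) = 1 - Q / w for the given Q."""
    return add(I4, scale(-1 / Fraction(w), q))

G = [[1, 0], [0, -1]]
G1, G2 = kron(G, I2), kron(I2, G)

w = Fraction(7, 3)            # sample value playing the role of u + v
rhs = scale((w + 1) / w, R(w + 1))

symplectic_ok = add(P, Q_MINUS) == I4 and R_t(-w, Q_MINUS) == rhs
orthogonal_ok = (mul(mul(G1, R_t(-w, Q_PLUS)), G1) == rhs
                 and mul(mul(G2, R_t(-w, Q_PLUS)), G2) == rhs)
print(symplectic_ok, orthogonal_ok)
```

A check at one sample value of $u + v$ is of course not a proof. The point is only that both identities reduce to $Q = 1 - P$ in the symplectic case and to $G_a Q G_a = 1 - P$ in the orthogonal case.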
It remains to note that $G_1 G_2$ commutes with $R(u-v)$.

Suppose now that $V$ is an irreducible finite-dimensional representation of the twisted Yangian $Y^{\pm}(2)$. Then $V$ is naturally extended to $\tilde Y^{\pm}(2)$ and, by Proposition 4.3, to the algebra $\tilde B(2,0)$ or $\tilde B(2,1)$, respectively. The central element $f(u)$ defined in (2.9) acts in $V$ as a scalar series. Then there exists a series $g(u) \in 1 + u^{-1}\mathbb{C}[[u^{-1}]]$ such that $g(u) g(-u) f(u) = 1$ as an operator in $V$. This means that the composition of $V$ with the corresponding automorphism (2.15) of $\tilde B(2,0)$ or $\tilde B(2,1)$ can be regarded as an irreducible representation of the algebra $B(2,0)$ or $B(2,1)$, respectively.

Conversely, any irreducible finite-dimensional representation $V$ of $B(2,0)$ or $B(2,1)$ is naturally extended to the corresponding algebra $\tilde B(2,0)$ or $\tilde B(2,1)$ and hence to $\tilde Y^{\pm}(2)$. The composition of $V$ with an appropriate automorphism $S(u) \mapsto h(u) S(u)$ can be regarded as an irreducible representation of the twisted Yangian $Y^{\pm}(2)$. Indeed, the central element $\delta(u)$ defined by (4.48) acts as a scalar series on $V$, and since $\delta(u)\delta(-u) = 1$, the series $h(u)$ is found from the relation $h(u) h(-u)^{-1} \delta(u) = 1$.

This argument allows us to carry over the description of representations of $Y^{\pm}(2)$ to the case of the algebras $B(2,0)$ or $B(2,1)$. The following proposition is easily derived from [21, Theorems 5.4 and 6.4]. Suppose that two formal series $\mu_1(u)$ and $\mu_2(u)$ satisfy the conditions of Theorem 4.2 so that the irreducible highest weight module $V(\mu(u))$ exists.

Proposition 4.4. (i) The $B(2,0)$-module $V(\mu(u))$ is finite-dimensional if and only if there exists a monic polynomial $P(u)$ in $u$ such that $P(-u+2) = P(u)$ and
\[ \frac{(2u-1)\, \mu_1(u) + \mu_2(u)}{2u\, \mu_2(u)} = \frac{P(u+1)}{P(u)}. \tag{4.52} \]
In this case $P(u)$ is unique.

(ii) The $B(2,1)$-module $V(\mu(u))$ is finite-dimensional if and only if there exist $\gamma \in \mathbb{C}$ and a monic polynomial $P(u)$ in $u$ such that $P(-u+2) = P(u)$, $P(\gamma) \ne 0$ and
\[ \frac{(2u-1)\, \mu_1(u) + \mu_2(u)}{2u\, \mu_2(u)} = \frac{P(u+1)}{P(u)} \cdot \frac{\gamma - u}{\gamma + u - 1}. \tag{4.53} \]
In this case the pair $(P(u), \gamma)$ is unique.

Given $\alpha, \beta \in \mathbb{C}$ denote by $L(\alpha, \beta)$ the irreducible $\mathfrak{gl}(2)$-module with the highest weight $(\alpha, \beta)$. It is well-known that any finite-dimensional irreducible representation
of $Y(2)$ is isomorphic to a tensor product of the form
\[ L = L(\alpha_1, \beta_1) \otimes \cdots \otimes L(\alpha_k, \beta_k), \tag{4.54} \]
up to the twisting by an automorphism $T(u) \mapsto h(u) T(u)$ of $Y(2)$, where $h(u)$ is a formal series [32, 3]; see also [20]. The corresponding analogs of this result for the algebras $B(2,0)$ and $B(2,1)$ are also immediate from [20] due to the above argument. For any $\gamma \in \mathbb{C}$ denote by $V(\gamma)$ the one-dimensional representation of $B(2,1)$ such that the generators act by
\[ b_{11}(u) \mapsto \frac{u+\gamma}{u-\gamma}, \qquad b_{22}(u) \mapsto -1, \qquad b_{12}(u) \mapsto 0, \qquad b_{21}(u) \mapsto 0. \tag{4.55} \]
Representations of this kind were constructed in the pioneering paper [6]. Given any $Y(n)$-module $L$ and any $B(n,l)$-module $V$ we shall regard $L \otimes V$ as a $B(n,l)$-module by using Proposition 3.3.

Proposition 4.5. Any finite-dimensional irreducible $B(2,0)$-module is isomorphic to the restriction of a $Y(2)$-module of the form (4.54). Any finite-dimensional irreducible $B(2,1)$-module is isomorphic to the tensor product $L \otimes V(\gamma)$, where $L$ is a $Y(2)$-module of the form (4.54).

We give some remarks here.

(1) It is proved in [21, Proposition 4.14] that the twisted Yangian admits the decomposition
\[ Y^{\pm}(2) = Z^{\pm} \otimes SY^{\pm}(2), \tag{4.56} \]
where $Z^{\pm}$ is the center of $Y^{\pm}(2)$, and $SY^{\pm}(2)$ is a subalgebra called the special twisted Yangian. Using the decomposition (3.44) we conclude from Proposition 4.3 that the algebras $SY^+(2)$ and $SY^-(2)$ are respectively isomorphic to $SB(2,1)$ and $SB(2,0)$. Indeed, each special subalgebra is the quotient of the corresponding algebra $\tilde Y^{\pm}(2)$ or $\tilde B(2,l)$ by the ideal of relations sending the generators of the center to zero. On the other hand, both centers $Z^{\pm}$ and $Z(2,l)$ are polynomial algebras in countably many variables; see Theorem 3.4 above and [21, Theorem 4.11]. This implies the isomorphisms $Y^+(2) \cong B(2,1)$ and $Y^-(2) \cong B(2,0)$.

(2) Criteria of irreducibility of the modules $L$ and $L \otimes V(\gamma)$ (with $L$ given by (4.54)) over the algebras $B(2,0)$ and $B(2,1)$, respectively, can also be obtained from the corresponding results in [20].

4.3. Classification theorem for B(n, l)-modules

Theorem 4.1 together with the following result will complete the description of finite-dimensional irreducible representations of $B(n,l)$. Let $V(\mu(u))$ be an irreducible highest weight module over the algebra $B(n,l)$, so that the conditions of Theorem 4.2 for the components $\mu_i(u)$ hold. We shall keep using the notation (4.24).
Theorem 4.6. (i) The $B(n,0)$-module $V(\mu(u))$ is finite-dimensional if and only if there exist monic polynomials $P_1(u), \dots, P_{n-1}(u)$ in $u$ such that $P_i(-u+n-i+1) = P_i(u)$ and
\[ \frac{\tilde\mu_i(u)}{\tilde\mu_{i+1}(u)} = \frac{P_i(u+1)}{P_i(u)}, \qquad i = 1, \dots, n-1. \tag{4.57} \]
(ii) The $B(n,l)$-module $V(\mu(u))$ (with $l > 0$) is finite-dimensional if and only if there exist an element $\gamma \in \mathbb{C}$ and monic polynomials $P_1(u), \dots, P_{n-1}(u)$ in $u$ such that $P_i(-u+n-i+1) = P_i(u)$, $P_k(\gamma) \ne 0$ and
\[ \frac{\tilde\mu_i(u)}{\tilde\mu_{i+1}(u)} = \frac{P_i(u+1)}{P_i(u)}, \qquad i = 1, \dots, n-1, \quad i \ne k, \tag{4.58} \]
while
\[ \frac{\tilde\mu_k(u)}{\tilde\mu_{k+1}(u)} = \frac{P_k(u+1)}{P_k(u)} \cdot \frac{\gamma - u}{\gamma + u - l}. \tag{4.59} \]
Proof. Set $V = V(\mu(u))$ and suppose first that $\dim V < \infty$. We shall use induction on $n$. In the case $n = 2$ the result holds by Proposition 4.4. Suppose now that $n \ge 3$ and consider the subspace
\[ V_+ = \{ \eta \in V \mid b_{1i}(u)\, \eta = 0 \ \text{for } i = 2, \dots, n \}. \tag{4.60} \]
Calculations similar to those used in the proof of Theorem 4.1 show that $V_+$ is stable under the operators $b_{ij}(u)$ with $2 \le i, j \le n$. Moreover, the operators
\[ b^{\circ}_{ij}(u) = b_{i+1,j+1}(u), \qquad i, j = 1, \dots, n-1 \tag{4.61} \]
form a representation of the algebra $B(n-1, l)$ in $V_+$ (recall that by our assumptions, $l \le n/2$). The cyclic span $B(n-1,l)\xi$ of the highest vector $\xi \in V$ is a module with the highest weight $\mu_+(u) = (\mu_2(u), \dots, \mu_n(u))$. Since this module is finite-dimensional, the conditions of the theorem must be satisfied for the components of $\mu_+(u)$.

Similarly, consider the subspace
\[ V_- = \{ \eta \in V \mid b_{in}(u)\, \eta = 0 \ \text{for } i = 1, \dots, n-1 \}. \tag{4.62} \]
Now the operators $b_{nn}(u)$ and $b_{ij}(u)$ with $1 \le i, j \le n-1$ preserve $V_-$. Moreover, the operators
\[ b^{\circ}_{ij}(u) = b_{ij}\Big(u + \frac{1}{2}\Big) + \frac{\delta_{ij}}{2u} \cdot b_{nn}\Big(u + \frac{1}{2}\Big), \qquad i, j = 1, \dots, n-1 \tag{4.63} \]
form a representation of the algebra $B(n-1, 0)$ or $B(n-1, l-1)$ in $V_-$, respectively for $l = 0$ and $l > 0$. Again, the cyclic span of $\xi$ is a finite-dimensional module
over $B(n-1, 0)$ or $B(n-1, l-1)$, respectively, with the highest weight $\mu_-(u) = (\mu^{\circ}_1(u), \dots, \mu^{\circ}_{n-1}(u))$, where
\[ \mu^{\circ}_i(u) = \mu_i\Big(u + \frac{1}{2}\Big) + \frac{1}{2u} \cdot \mu_n\Big(u + \frac{1}{2}\Big). \tag{4.64} \]
By the induction hypothesis, the components of $\mu_-(u)$ must satisfy the conditions of the theorem. Rewriting them in terms of the components of $\mu(u)$ we complete the proof of the “only if” part.

Conversely, suppose that the conditions of the theorem hold for the components of the highest weight $\mu(u)$. Since $P_i(u) = P_i(-u+n-i+1)$ for each $i$, there exist monic polynomials $Q_i(u)$ in $u$ such that
\[ P_i(u) = (-1)^{\deg Q_i}\, Q_i(u)\, Q_i(-u+n-i+1). \tag{4.65} \]
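To make the relation between $Q_i$ and $P_i$ in (4.65) concrete, the following sketch builds $P$ from a monic $Q$ given by its roots and checks that the result is monic and invariant under $u \mapsto -u + m$, where $m$ stands for $n - i + 1$. The helper names and the sample roots are purely illustrative and do not come from the paper. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def from_roots(roots):
    """Monic polynomial with the given roots."""
    p = [Fraction(1)]
    for a in roots:
        p = poly_mul(p, [-Fraction(a), Fraction(1)])
    return p

def reflect(p, m):
    """Coefficients of p(-u + m)."""
    out = [Fraction(0)] * len(p)
    power = [Fraction(1)]                      # (-u + m)^0
    for c in p:
        for k, x in enumerate(power):
            out[k] += c * x
        power = poly_mul(power, [Fraction(m), Fraction(-1)])
    return out

def symmetric_partner(roots, m):
    """P(u) = (-1)^{deg Q} Q(u) Q(-u + m) for the monic Q with the given roots."""
    q = from_roots(roots)
    sign = Fraction((-1) ** len(roots))
    return [sign * c for c in poly_mul(q, reflect(q, m))]

M = 3                                          # n - i + 1 for the sample choice n = 3, i = 1
P_EXAMPLE = symmetric_partner([0, Fraction(1, 2)], M)
print(P_EXAMPLE[-1] == 1, reflect(P_EXAMPLE, M) == P_EXAMPLE)
```

For instance, $Q(u) = u - 2$ with $m = 4$ gives $P(u) = (u-2)^2$. In general the sign $(-1)^{\deg Q}$ is exactly what makes $P$ monic, since $Q(-u+m)$ has leading coefficient $(-1)^{\deg Q}$.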
Then there exists a representation $L(\lambda(u))$ of the Yangian $Y(n)$ such that the components $\lambda_i(u)$ of $\lambda(u)$ satisfy the conditions (4.2) for the polynomials $Q_i(u)$. The argument of the proof of Theorem 4.2 shows that the $B(n, 0)$-span of the highest vector $\xi$ of $L(\lambda(u))$ is a module with the highest weight $\mu(u)$ such that the relations (4.42) hold. Since $V(\mu(u))$ is isomorphic to the irreducible quotient of this span we conclude that $V(\mu(u))$ is finite-dimensional.
Suppose now that $l > 0$. It is easy to verify that for any $\gamma \in \mathbb{C}$ the assignment
\[ b_{ij}(u) \mapsto \delta_{ij}\, \frac{u + \gamma}{\varepsilon_i u - \gamma} \tag{4.66} \]
defines a one-dimensional representation of $B(n, l)$ which we denote by $V(\gamma)$. Now consider the $B(n, l)$-module $L(\lambda(u)) \otimes V(l - \gamma)$; see Proposition 3.3. Let $\eta$ be a basis vector of $V(l - \gamma)$. Repeating again the calculation of the proof of Theorem 4.2 we find that the $B(n, l)$-span of the vector $\xi \otimes \eta$ is a module with the highest weight $\mu(u)$ such that the relations (4.42) hold for $i \ne k$, while
\[ \frac{\tilde\mu_k(u)}{\tilde\mu_{k+1}(u)} = \frac{\gamma - u}{\gamma + u - l} \cdot \frac{\lambda_k(u)\,\lambda_{k+1}(-u + l)}{\lambda_{k+1}(u)\,\lambda_k(-u + l)}. \tag{4.67} \]
This implies that $V(\mu(u))$ is finite-dimensional.
Using the decomposition (3.44) we can deduce the following parametrization of the representations of the special reflection algebra $SB(n, l)$.
Corollary 4.7.
(i) The finite-dimensional irreducible representations of the special reflection algebra $SB(n, 0)$ are in a one-to-one correspondence with the families of monic polynomials $(P_1(u), \ldots, P_{n-1}(u))$ such that $P_i(-u + n - i + 1) = P_i(u)$.
(ii) The finite-dimensional irreducible representations of the special reflection algebra $SB(n, l)$ with $l > 0$ are in a one-to-one correspondence with the families $(P_1(u), \ldots, P_{n-1}(u), \gamma)$, where $\gamma \in \mathbb{C}$ and the $P_i(u)$ are monic polynomials such that $P_i(-u + n - i + 1) = P_i(u)$ and $P_k(\gamma) \ne 0$.
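The symmetry condition $P(-u + c) = P(u)$ with $c = n - i + 1$ and the factorisation of the form (4.65) can be checked on a small example. The following Python sketch is only an illustration: the coefficient-list representation of polynomials and the helper names `poly_mul`, `reflect` and `symmetric_from` are assumptions made here and do not come from the paper.

```python
# Polynomials are lists of integer coefficients, lowest degree first
# (an illustrative convention, not taken from the paper).

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reflect(p, c):
    """Coefficients of p(-u + c), computed by Horner's scheme."""
    out = [0]
    for coeff in reversed(p):
        out = poly_mul(out, [c, -1])   # multiply by (c - u)
        out[0] += coeff
    while len(out) > 1 and out[-1] == 0:
        out.pop()                      # strip trailing zero coefficients
    return out

def symmetric_from(q, c):
    """P(u) = (-1)^{deg Q} Q(u) Q(-u + c), the shape of Eq. (4.65)."""
    sign = (-1) ** (len(q) - 1)
    return [sign * a for a in poly_mul(q, reflect(q, c))]

# Hypothetical data: Q(u) = u - 2 and c = n - i + 1 = 3 (e.g. n = 3, i = 1).
P = symmetric_from([-2, 1], 3)
print(P)                      # [2, -3, 1], i.e. u^2 - 3u + 2
print(reflect(P, 3) == P)     # True: P(-u + 3) = P(u)
```

In this example $P$ is monic and invariant under $u \mapsto -u + 3$, as the condition of the corollary requires.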
Acknowledgment
This work was done during the first author’s visit to the Laboratoire d’Annecy-le-Vieux de Physique Théorique, Annecy, France. He would like to thank the Laboratoire for the warm hospitality.
References
[1] R. J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, New York, 1982.
[2] C. Briot and E. Ragoucy, “RTT presentation of finite W-algebras”, J. Phys. A34 (2001) 7287–7310.
[3] V. Chari and A. Pressley, “Yangians and R-matrices”, L’Enseign. Math. 36 (1990) 267–302.
[4] V. Chari and A. Pressley, A Guide to Quantum Groups, Cambridge University Press, 1994.
[5] I. V. Cherednik, “Quantum groups as hidden symmetries of classic representation theory”, pp. 47–54 in Differential Geometric Methods in Physics, ed. A. I. Solomon, World Scientific, Singapore, 1989.
[6] I. V. Cherednik, “Factorizing particles on a half line and root systems”, Theoret. Math. Phys. 61 (1984) 977–983.
[7] V. G. Drinfeld, “Hopf algebras and the quantum Yang–Baxter equation”, Soviet Math. Dokl. 32 (1985) 254–258.
[8] V. G. Drinfeld, “Quantum Groups”, pp. 798–820 in Proc. Int. Congress Math., Berkeley, 1986, AMS, Providence RI, 1987.
[9] V. G. Drinfeld, “A new realization of Yangians and quantized affine algebras”, Soviet Math. Dokl. 36 (1988) 212–216.
[10] A. G. Izergin and V. E. Korepin, “A lattice model related to the nonlinear Schrödinger equation”, Sov. Phys. Dokl. 26 (1981) 653–654.
[11] T. H. Koornwinder and V. B. Kuznetsov, “Gauss hypergeometric function and quadratic R-matrix algebras”, St. Petersburg Math. J. 6 (1994) 161–184.
[12] P. P. Kulish and E. K. Sklyanin, “Quantum spectral transform method: recent developments”, pp. 61–119 in Integrable Quantum Field Theories, Lecture Notes in Phys. 151, Springer, Berlin-Heidelberg, 1982.
[13] P. P. Kulish and E. K. Sklyanin, “Algebraic structures related to reflection equations”, J. Phys. A25 (1992) 5963–5975.
[14] V. B. Kuznetsov, M. F. Jørgensen and P. L. Christiansen, “New boundary conditions for integrable lattices”, J. Phys. A28 (1995) 4639–4654.
[15] V. B. Kuznetsov, “3F2(1) hypergeometric function and quadratic R-matrix algebra”, pp. 185–197 in Symmetries and integrability of difference equations, CRM Proc. Lecture Notes, 9, Amer. Math. Soc., Providence, RI, 1996.
[16] A. Liguori, M. Mintchev and L. Zhao, “Boundary exchange algebras and scattering on the half line”, Comm. Math. Phys. 194 (1998) 569–589.
[17] M. Mintchev, E. Ragoucy and P. Sorba, “Spontaneous symmetry breaking in the gl(N)-NLS hierarchy on the half line”, to appear in J. Phys. A.
[18] A. I. Molev, “Yangians and their applications”, to appear in Handbook of Algebra, Vol. 3, ed. M. Hazewinkel, Elsevier.
[19] A. I. Molev, “Gelfand–Tsetlin basis for representations of Yangians”, Lett. Math. Phys. 30 (1994) 53–60.
[20] A. I. Molev, “Finite-dimensional irreducible representations of twisted Yangians”, J. Math. Phys. 39 (1998) 5559–5600.
[21] A. Molev, M. Nazarov and G. Olshanski, “Yangians and classical Lie algebras”, Russian Math. Surveys 51(2) (1996) 205–282.
[22] M. Nazarov and V. Tarasov, “Representations of Yangians with Gelfand–Zetlin bases”, J. Reine Angew. Math. 496 (1998) 181–212.
[23] M. Noumi, “Macdonald’s symmetric polynomials as zonal spherical functions on quantum homogeneous spaces”, Adv. Math. 123 (1996) 16–77.
[24] G. Olshanski, “Twisted Yangians and infinite-dimensional classical Lie algebras”, pp. 103–120 in Quantum Groups, ed. P. P. Kulish, Lecture Notes in Math. 1510, Springer, Berlin-Heidelberg, 1992.
[25] E. Ragoucy, “Twisted Yangians and folded W-algebras”, Internat. J. Modern Phys. A16 (2001) 2411–2433.
[26] E. Ragoucy and P. Sorba, “Yangians and finite W-algebras”, Quantum groups and integrable systems (Prague, 1998). Czechoslovak J. Phys. 48 (1998) 1483–1487.
[27] E. Ragoucy and P. Sorba, “Yangian realisations from finite W-algebras”, Comm. Math. Phys. 203 (1999) 551–572.
[28] N. Yu. Reshetikhin and M. A. Semenov-Tian-Shansky, “Central extensions of quantum current groups”, Lett. Math. Phys. 19 (1990) 133–142.
[29] E. K. Sklyanin, “Boundary conditions for integrable quantum systems”, J. Phys. A21 (1988) 2375–2389.
[30] L. A. Takhtajan and L. D. Faddeev, “Quantum inverse scattering method and the Heisenberg XYZ-model”, Russian Math. Surv. 34(5) (1979) 11–68.
[31] V. O. Tarasov, “Structure of quantum L-operators for the R-matrix of the XXZ-model”, Theor. Math. Phys. 61 (1984) 1065–1071.
[32] V. O. Tarasov, “Irreducible monodromy matrices for the R-matrix of the XXZ-model and lattice local quantum Hamiltonians”, Theor. Math. Phys. 63 (1985) 440–454.
April 25, 2002 17:1 WSPC/148-RMP
00120
Reviews in Mathematical Physics, Vol. 14, No. 4 (2002) 343–374 © World Scientific Publishing Company
REFLEXIVE POLYHEDRA, WEIGHTS AND TORIC CALABI–YAU FIBRATIONS
MAXIMILIAN KREUZER
Institut für Theoretische Physik, Technische Universität Wien, Wiedner Hauptstraße 8–10, A-1040 Wien, Austria
HARALD SKARKE
Mathematical Institute, University of Oxford, 24-29 St. Giles’, Oxford OX1 3LB, England
Received 24 April 2001
Revised 19 November 2001
Over the last years we have generated a large amount of data related to Calabi–Yau hypersurfaces in toric varieties which can be described by reflexive polyhedra. We classified all reflexive polyhedra in three dimensions leading to K3 hypersurfaces and have also completed the four-dimensional case relevant to Calabi–Yau threefolds. In addition, we have analysed for many of the resulting spaces whether they allow fibration structures of the types that are relevant in the context of superstring dualities. In this survey we want to give background information both on how we obtained these data, which can be found at our web site, and on how they may be used. We give a complete exposition of our classification algorithm at a mathematical (rather than algorithmic) level. We also describe how fibration structures manifest themselves in terms of toric diagrams and how we managed to find the respective data. Both for our classification scheme and for simple descriptions of fibration structures the concept of weight systems plays an important role.
1. Introduction
A few years after Calabi–Yau manifolds had found their way into physics it was conjectured that they should actually come in pairs with opposite Euler number, since an exchange of complex structure and Kähler moduli in physics corresponds to a change of sign in the definition of the charge, or, equivalently, an exchange of particles and anti-particles [1, 2]. This phenomenon is called mirror symmetry. Although the situation is complicated by the fact that there are rigid Calabi–Yau manifolds whose “mirror string compactifications” do not have a straightforward geometrical interpretation [3, 4], the search for the mirror manifolds proved to be an extremely fruitful enterprise from both the physicists’ and the mathematicians’ perspective [5, 7].
The first systematic constructions of large classes of Calabi–Yau threefolds as complete intersections in products of projective spaces [8] did not seem to support the mirror hypothesis because the resulting manifolds all had negative Euler numbers. But when the attention was extended to weighted projective spaces, it turned out that the blow-up parameters of the quotient singularities can provide large positive contributions. The first substantial list of pairs of Hodge numbers resulting from constructions of this type [9] was almost mirror symmetric in the sense that only for a few percent of the Hodge data the respective mirror pair was not in the list. A complete classification [10, 11], however, made the picture worse, and abelian quotients [12], which make a subclass of these spaces perfectly symmetric [13, 14], did not help with this problem either. Batyrev’s construction of toric Calabi–Yau hypersurfaces [15], which is manifestly mirror symmetric while generalising the above results, provided a solution to this puzzle. In this framework the geometrical data is encoded by a reflexive polyhedron (in the present context polyhedra are always convex subsets of a real vector space), i.e. a lattice polyhedron whose facets are all at distance 1 from the unique interior point (see below). Toric geometry turned out to provide a very efficient tool for the analysis of many physical aspects of Calabi–Yau compactifications, including the physics of perturbative [16] and non-perturbative [17]–[20] topology changing transitions, as well as fibration structures that are important in string dualities [21, 22]. This made a constructive classification of reflexive polyhedra a useful and interesting enterprise.
Our approach to this problem [23] was partly inspired by our experience with the classification of weighted projective spaces that admit transversal quasi-homogeneous polynomials [24]. Indeed, as it turned out, the Newton polyhedra that correspond to polynomials defining CY hypersurfaces in weighted $\mathbb{P}^4$ are all reflexive [25, 26] and provide a canonical resolution of the ambient space singularities (this is no longer true in higher dimensions). Actually, regardless of the transversality condition, a diophantine equation of the form $\sum n_i a_i = d$ with positive coefficients $n_i$, $d = \sum n_i$, and with the set of solutions restricted to $a_i \ge 0$ gives a simple way to produce lattice polyhedra with at most one interior point (this is a necessary condition for reflexivity): We may regard this as an embedding of the lattice into a higher-dimensional space with the polyhedron being contained in the finite intersection of an affine subspace with the non-negative half-spaces. All lattice points, except for the candidate interior point, whose coordinates are all equal to 1, are located on some coordinate hyperplane $a_i = 0$. We may then ask ourselves if all reflexive polyhedra are contained in polyhedra that can be embedded in this way. In the next section we will show that the answer is affirmative provided that we allow for an embedding with higher codimension $k - n$, i.e. we also consider solutions to more than one equation of the above form,
\[ \sum_{i=1}^{k} a_i\, n_i^{(j)} = d^{(j)}, \qquad d^{(j)} = \sum_{i=1}^{k} n_i^{(j)}, \qquad j = 1, \ldots, k - n, \tag{1} \]
but with some of the coefficients $n_i^{(j)}$ equal to zero according to a certain pattern. What then makes our construction work is the fact that there is only a finite set of coefficients that lead to lattice polytopes with an interior point. The collections $n_i^{(j)}$ of non-negative numbers are called weight systems in the case of a single equation and combined weight systems if $k - n > 1$. If we shift our coordinates to $x_i = a_i - 1$ the resulting polyhedron lies in a linear subspace of the embedding space determined by $\sum_i n_i^{(j)} x_i = 0$, is bounded by $x_i \ge -1$ and has the origin of the embedding space as its interior point. (These linear coordinates are more useful for many general considerations whereas the affine coordinates $a_i$ are better suited for quickly finding the lattice points in a given example.)
In practice, because of the huge number of solutions, an enumeration of all reflexive polyhedra seems to be possible only up to 4 dimensions. This leads to a further simplification of the procedure because in up to 4 dimensions all polytopes $\Delta$ that correspond to a minimal combined weight system are reflexive [26]. It is easy to see that the dual $\Delta^*$ (defined in Eq. (2) below) of a reflexive polytope $\Delta$ is again a reflexive lattice polytope. Moreover, $\Delta$ is contained in a larger polytope $\hat\Delta$ if and only if the dual $\hat\Delta^*$ is contained in $\Delta^*$. Therefore only minimal polytopes for which $\Delta^*$ does not contain any reflexive subpolytope are necessary ingredients for our classification scheme. We will show that in 4 dimensions there are 308 reflexive polytopes that contain all others as subpolytopes, provided that we also consider sublattices. Finding all relevant lattices is a subtle point and our strategy to solve this problem will be described in the next section. 25 additional maximal reflexive polytopes can be obtained directly from these 308 objects on sublattices. The remaining 7 maximal polytopes could only be found by carrying out the complete classification algorithm.
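As a small illustration of the affine description, the solutions of $\sum n_i a_i = d$ with $a_i \ge 0$ can be enumerated by brute force for a single weight system. The sketch below is only meant to make the statement about the candidate interior point concrete; the function name `affine_solutions` and the choice of the weights $(1, 1, 2)$ are assumptions made for this example.

```python
from itertools import product

def affine_solutions(n):
    """Non-negative integer solutions a of sum_i n_i a_i = d with d = sum_i n_i."""
    d = sum(n)
    # a_i can be at most d // n_i because all other terms are non-negative
    ranges = [range(0, d // ni + 1) for ni in n]
    return [a for a in product(*ranges)
            if sum(ni * ai for ni, ai in zip(n, a)) == d]

points = affine_solutions((1, 1, 2))
# Points that lie on no coordinate hyperplane a_i = 0:
off_hyperplanes = [a for a in points if all(ai >= 1 for ai in a)]
print(len(points))        # 9
print(off_hyperplanes)    # [(1, 1, 1)]
```

For these weights only the point with all coordinates equal to 1 avoids every coordinate hyperplane, in agreement with the general remark above.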
While one of the main insights of the “first superstring revolution” was the fact that Calabi–Yau spaces are crucial for string compactifications, it was found during the “second string revolution” that fibration structures of Calabi–Yau manifolds are essential for understanding various non-perturbative string dualities. In particular, K3 fibrations are required for the duality between heterotic and IIA theories [27, 28] and elliptic fibrations are needed for F-theory compactifications [22, 29, 30]. Again toric geometry provides beautiful tools for studying the respective structures. As we will see, the polytope $\Delta^*_f$ corresponding to the fiber manifests itself as a subpolytope of $\Delta^*$ with the same interior point, whereas the base space is a toric variety whose fan can be determined by projecting the original fan along the linear subspace spanned by $\Delta^*_f$. While we never attempted to give a complete classification of structures of this type, we did create large lists of fibration structures [31, 32]. The geometry of such toric fibrations was analysed in more detail in [33].
Our data are accessible at our web site [34], and we plan to make the source code of our programs available in the near future. Since one of the motivations for writing this contribution was to give useful background material for anyone interested in applying our data, we would like to briefly mention some older results on our web page that will not be discussed in the remainder of this paper. These are mostly related to weighted projective spaces and, in the physical context, to
Landau–Ginzburg models [2, 35]. We classified all 10839 weight systems allowing transversal quasihomogeneous polynomials [24, 10] with singularity index 3, leading to Landau–Ginzburg models with a central charge of c = 9 and computed the corresponding numbers of (anti)chiral states in the superconformal field theories (this includes the 7555 transversal weights for weighted $\mathbb{P}^4$). Vafa’s formulas for these numbers [35] inspired the definition of what Batyrev et al. call string theoretic Hodge numbers [36]. We also extended these results to arbitrary abelian quotients that leave a transversal polynomial invariant [12] (and included the modifications by discrete torsions [37], which correspond to topologically non-trivial background 2-form fields in the physical context [38]). Since the Newton polyhedra are reflexive also for abelian quotients, the resulting Hodge numbers (without discrete torsion) are all recovered in the toric context. Nevertheless our results might be useful when working in weighted projective spaces, since transversal polynomials in general have larger symmetries than the complete Newton polyhedra.
We will not discuss Calabi–Yau data obtained by other groups here. An important class of spaces that we did not consider consists of complete intersection Calabi–Yau varieties. The classification of these objects in products of projective spaces was given in [8], and Klemm has produced a sizeable list of codimension two complete intersections in weighted projective spaces. A number of toric complete intersections [39, 40] were constructed in [41, 42]. All transversal weights for Calabi–Yau 4-folds in weighted $\mathbb{P}^5$ were constructed in [43]. Further web pages with relevant information are [44, 45].
In the next section we give a self-contained exposition of our classification algorithm and of the results in 3 and 4 dimensions. In Sec. 3 we discuss the implications of these results for the geometry of toric K3 and Calabi–Yau hypersurfaces. In Sec. 4 we explain the toric realisation of fibrations where both the fibered space and the fiber have vanishing first Chern classes. We discuss how weight systems can be used to encode such fibrations and how this is related to fibrations in weighted projective spaces. We also provide an appendix with several tables that summarise some of our results.
2. Classification of Reflexive Polyhedra
In this section we give a self-contained exposition of our methods and results on the classification of reflexive polyhedra, without reference to toric geometry. Nevertheless, as we will see in the next section, some of the concepts used here, in particular the concept of weights, have interpretations in terms of geometry.
A polytope in $\mathbb{R}^n$ is the convex hull of a finite set of points in $\mathbb{R}^n$, and for our present purposes a polyhedron is the same thing as a polytope (in particular, it is always bounded, which need not be true if a polyhedron is defined as the intersection of a finite number of half spaces). We will be interested in the case where we have a pair of lattices $M \cong \mathbb{Z}^n$ and $N = \mathrm{Hom}(M, \mathbb{Z}) \cong \mathbb{Z}^n$ and their real extensions $M_{\mathbb{R}} \cong \mathbb{R}^n$ and $N_{\mathbb{R}} \cong \mathbb{R}^n$. A
polyhedron $\Delta \subset M_{\mathbb{R}}$ is called a lattice (or integer) polyhedron if the vertices of $\Delta$ lie in $M$.
Definition 2.1. A polytope $\Delta \subset \mathbb{R}^n$ has the “interior point property” or “IP property”, if 0 (the origin of $\mathbb{R}^n$) is in the interior. A simplex with this property is an IP simplex.
Definition 2.2. For any set $\Delta \subset M_{\mathbb{R}}$ the dual (or polar) set $\Delta^* \subset N_{\mathbb{R}} = M_{\mathbb{R}}^*$ is given by
\[ \Delta^* = \{ y \in N_{\mathbb{R}} : \langle y, x \rangle \ge -1 \ \ \forall\, x \in \Delta \}, \tag{2} \]
where $\langle y, x \rangle$ is the duality pairing between $y \in N_{\mathbb{R}}$ and $x \in M_{\mathbb{R}}$. If $\Delta$ is a polytope with the IP property, then $\Delta^*$ is also a polytope with the IP property and $(\Delta^*)^* = \Delta$.
Definition 2.3. A lattice polyhedron $\Delta \subset M_{\mathbb{R}}$ is called reflexive if its dual $\Delta^* \subset N_{\mathbb{R}}$ is a lattice polyhedron w.r.t. the lattice $N$ dual to $M$.
The main idea of our classification scheme is to construct a set of polyhedra such that every reflexive polyhedron is a subpolyhedron of one of the polyhedra in this set. By duality, every reflexive polyhedron must contain one of the duals of these polyhedra, so we are looking for polyhedra that are minimal in some sense. In the following subsection we will give a definition of minimality that depends only on the way in which a polytope is spanned by its vertices, without reference to a lattice or details of the linear structure. We will see that this allows for a very rough classification with only a few objects in low dimensions. The corresponding characterisation of polyhedra can be refined by specifying explicitly the linear relations between the vertices with the help of weight systems. We will see that these weight systems can be used in a simple way to find the polyhedra dual to the minimal ones and to check whether they can possibly contain reflexive polyhedra; the main criterion here is the existence of a dual pair of lattices such that a minimal polytope is a lattice polyhedron and the convex hull of the lattice points of the dual has the IP property. The classification of the relevant weight systems leads to a finite number of polytopes that contain all reflexive polytopes, with the subtlety that only the linear structure but not the lattice on which some polytope may be reflexive is specified. In the final subsection we solve this problem by showing how to identify all lattices on which a polyhedron given in terms of its linear structure can be reflexive, and present the results of our classification scheme.
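Definitions 2.2 and 2.3 can be made concrete in the simplest nontrivial case $n = 2$ with $M = N = \mathbb{Z}^2$. The Python sketch below is an illustration under stated assumptions, not the classification code referred to in this paper: it assumes a convex lattice polygon whose vertices are listed counter-clockwise, and the names `facet_data` and `is_reflexive` are invented for this example. For every edge it computes the primitive inward normal $y$ and the lattice distance of the origin from that edge; the polygon is reflexive precisely if all distances equal 1, in which case the normals $y$ are the vertices of the dual polygon of Eq. (2).

```python
from math import gcd

def facet_data(vertices):
    """Return (primitive inward normal y, lattice distance of 0 from the edge)
    for each edge of a convex lattice polygon given counter-clockwise.
    Every point x of the polygon satisfies <y, x> >= -distance."""
    data = []
    m = len(vertices)
    for i in range(m):
        (px, py), (qx, qy) = vertices[i], vertices[(i + 1) % m]
        ex, ey = qx - px, qy - py              # edge vector
        g = gcd(abs(ex), abs(ey))
        y = (-ey // g, ex // g)                # inward normal (ccw orientation)
        distance = -(y[0] * px + y[1] * py)
        data.append((y, distance))
    return data

def is_reflexive(vertices):
    """All edges at lattice distance 1 from the origin."""
    return all(distance == 1 for _, distance in facet_data(vertices))

triangle = [(-1, -1), (2, -1), (-1, 2)]        # illustrative example
print(is_reflexive(triangle))                  # True
print([y for y, _ in facet_data(triangle)])    # [(0, 1), (-1, -1), (1, 0)]
print(is_reflexive([(-1, -1), (1, -1), (0, 2)]))   # False: distances 1, 2, 2
```

The second triangle has the IP property but two of its edges are at lattice distance 2 from the origin, so its dual is not a lattice polygon.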
Let us illustrate with the example in Fig. 1 how we can obtain a weight system for a given reflexive polyhedron ∆ with vertices in some n-dimensional lattice M . Reflexivity implies that the dual (or polar) polytope ∆∗ defined in Eq. (2) has its vertices on the dual lattice N = Hom(M, Z). In our case ∆∗ is already minimal in the sense that we lose the interior point (IP) if we drop any of its vertices and take the convex hull of the remaining vertices. The set of vertices of ∇ = ∆∗
Fig. 1. A minimal polyhedron ∇ that corresponds to a combined weight system.
can be decomposed into the two triangles $(V_1, V_2, V_5)$ and $(V_3, V_4, V_5)$ that both contain the IP in their lower-dimensional interior. As we will show later, similar decompositions are always possible for minimal polyhedra. For both triangles the barycentric coordinates of the IP are given by $q = (1/4, 1/4, 1/2)$, i.e. $\sum q_i V_i = 0$ and $\sum q_i = 1$, where the sum is over the indices of the vertices for either of the two triangles. Rescaling the coefficients to integers $n_i^{(j)} = d^{(j)} q_i^{(j)}$ we arrive at the weight system $n_i^{(1)} = (1, 1, 0, 0, 2)$, $n_i^{(2)} = (0, 0, 1, 1, 2)$. We will show that weights obtained in this way can always be used to describe the dual polytope $\Delta$ as in Eq. (1). In the present case, this construction leads to
\[ x_1 + x_2 + 2x_5 = 0, \tag{3} \]
\[ x_3 + x_4 + 2x_5 = 0. \tag{4} \]
Eliminating, for example, $x_2$ and $x_4$ it is easily checked that we indeed reconstructed $\Delta$ (note that $a_i = x_i + 1$ is the lattice distance of a point from the facet dual to $V_i$). If we keep all points with $x_i \ge -1$ then, in our example, $\Delta^*$ is equal to $\nabla$. In general $\Delta^*$ will not be minimal and we first have to drop some vertices of $\Delta^*$ to arrive at a minimal polytope $\nabla$ whose simplex decomposition leads to a weight system. If we drop points from $\Delta$ in such a way that $\Delta' \subset \Delta$ is reflexive, then $\Delta'^*$ becomes larger. The vertices of $\nabla$ remain vertices of $\Delta'^* \supset \nabla$ as long as the bounding hyperplanes $x_i = -1$, which in our case support all facets of $\Delta$, are affinely spanned by facets of $\Delta'$. A different way to generate a “smaller” $\Delta$ is to keep the vertices but to go to a coarser $M$ lattice: We may, for example, demand $x_1 - x_3 \in 2\mathbb{Z}$ or $x_1 + x_5 \in 2\mathbb{Z}$. Correspondingly, the $N$ lattice becomes finer and is no longer generated by the vertices of $\nabla$. In general there will occur additional lattice points in $\nabla$. The coarsest lattice that keeps all vertices of $\Delta$ and the IP is obtained by imposing $x_1 - x_3 \in 4\mathbb{Z}$ and $x_1 + x_5 \in 2\mathbb{Z}$. Actually, in our example, this exchanges $\nabla$ and $\Delta$.
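The lattice points of $\Delta$ in this example can be listed by brute force from Eqs. (3) and (4) together with the bounds $x_i \ge -1$. The following Python sketch does this under the assumption that every coordinate occurs with a positive weight in at least one of the equations; the function name `cws_lattice_points` is hypothetical and the code is not the program used for the classification.

```python
from itertools import product

def cws_lattice_points(cws):
    """Integer points x with x_i >= -1 solving sum_i n_i^(j) x_i = 0 for all j."""
    k = len(cws[0])
    upper = []
    for i in range(k):
        # From n_i x_i = -sum_{m != i} n_m x_m <= d - n_i in every equation with n_i > 0
        upper.append(min((sum(n) - n[i]) // n[i] for n in cws if n[i] > 0))
    ranges = [range(-1, u + 1) for u in upper]
    return [x for x in product(*ranges)
            if all(sum(ni * xi for ni, xi in zip(n, x)) == 0 for n in cws)]

weights = [(1, 1, 0, 0, 2), (0, 0, 1, 1, 2)]   # combined weight system from the text
points = cws_lattice_points(weights)
print(len(points))                              # 35
# The origin is the only point with no coordinate equal to -1:
print([x for x in points if min(x) > -1])       # [(0, 0, 0, 0, 0)]
```

Every lattice point other than the origin lies on at least one of the bounding hyperplanes $x_i = -1$, which is the linear-coordinate version of the statement that the origin is the only candidate interior point.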
2.1. Minimal polyhedra and their structures
We will later give various definitions of minimality, each of which has advantages and disadvantages. Here we define the weakest form of minimality, but the one that is most useful, where we forget for the time being about the lattice structure and concentrate on the vertex structure only.
Definition 2.4. A minimal polyhedron $\nabla \subset \mathbb{R}^n$ is defined by the following properties:
(i) $\nabla$ has the IP property.
(ii) If we remove one of the vertices of $\nabla$, the convex hull of the remaining vertices of $\nabla$ does not have the IP property.
Obviously every polytope $\nabla \subset \mathbb{R}^n$ with the IP property contains at least one minimal polytope spanned by a subset of the vertices of $\nabla$. Before asking ourselves which minimal polytopes can be subpolytopes of reflexive polyhedra, we will now analyse the possible general structures of minimal polytopes.
Lemma 2.1. A minimal polytope $\nabla \subset \mathbb{R}^n$ with vertices $V_1, \ldots, V_k$ is either a simplex or contains an $n'$-dimensional minimal polytope $\nabla' := \mathrm{ConvexHull}\{V_1, \ldots, V_{k'}\}$ and an IP simplex $S := \mathrm{ConvexHull}(R \cup \{V_{k'+1}, \ldots, V_k\})$ with $R \subset \{V_1, \ldots, V_{k'}\}$ such that $k - k' = n - n' + 1 \ge 2$ and $\dim S \le n'$.
Proof. If $\nabla$ is a simplex, there is nothing left to prove. Otherwise, we first note that every vertex $V$ of $\nabla$ must belong to at least one IP simplex: It is always possible to find a triangulation of $\nabla$ such that every $n$-simplex in this triangulation has $V$ as a vertex (just triangulate the cone whose apex is $V$ and whose one-dimensional rays are $V\tilde V$, where the $\tilde V$ are the other vertices of $\nabla$). As 0 must belong to at least one of these simplices, it must lie on some simplicial face which then is an IP simplex. Now consider the set of all IP simplices consisting of vertices of $\nabla$. Any subset of this set will define a lower-dimensional minimal polytope: The fact that 0 is interior to each simplex means that it is a positive linear combination of the vertices of any such simplex, and therefore 0 can also be written as a positive linear combination of all vertices involved. If the corresponding polytope were not minimal, our original $\nabla$ could not be minimal, either. Among all lower-dimensional minimal polytopes, take one (call it $\nabla'$) with the maximal dimension $n'$ smaller than $n$. $\mathbb{R}^n$ factorises into $\mathbb{R}^{n'}$ and $\mathbb{R}^n/\mathbb{R}^{n'} \cong \mathbb{R}^{n-n'}$ (equivalence classes in $\mathbb{R}^n$). The remaining vertices define a polytope $\nabla_{n-n'}$ in $\mathbb{R}^n/\mathbb{R}^{n'}$. If $\nabla_{n-n'}$ were not a simplex, it would contain a simplex of dimension smaller than $n - n'$ which would define, together with the vertices of $\nabla'$, a minimal polytope of dimension $s$ with $n' < s < n$, in contradiction with our assumption. Therefore $\nabla_{n-n'}$ is a simplex. Because of minimality of $\nabla$, each of the $n - n' + 1$ vertices of $\nabla_{n-n'}$ can have only one representative in $\mathbb{R}^n$,
implying $k - k' = n - n' + 1$. The equivalence class of 0 can be described uniquely as a positive linear combination of these vertices. This linear combination defines a vector in $\mathbb{R}^{n'}$, which can be written as a negative linear combination of $\le n'$ linearly independent vertices of $\nabla'$. These vertices, together with those of $\nabla_{n-n'}$, form the simplex $S$. By the maximality assumption about $\nabla'$, $\dim S$ cannot exceed $\dim \nabla'$.
Definition 2.5. For an $n$-dimensional minimal polytope $\nabla$ with $k$ vertices, an IP simplex structure is a collection of subsets $S_i$, $1 \le i \le k - n$, of the set of vertices of $\nabla$, such that: the convex hull of the vertices in each $S_i$ is an IP simplex, $\nabla_j = \mathrm{ConvexHull} \bigcup_{i=1}^{j} S_i$ is a lower-dimensional minimal polytope for every $j \in \{1, \ldots, k - n\}$, $\nabla_{k-n} = \nabla$ and $S_j \setminus \bigcup_{i=1}^{j-1} S_i$ contains at least two vertices.
Corollary 2.1. Every minimal polytope allows an IP simplex structure.
Proof. If $\nabla$ is a simplex, this is obvious. Otherwise one can choose $S_{k-n} = S$ and $\nabla_{k-n-1} = \nabla'$ with $S$ and $\nabla'$ as in Lemma 2.1 and proceed inductively.
Lemma 2.2. Denote by $\{S_i\}$ an IP simplex structure. Then $S_i - \bigcup_{j \ne i} S_j$ never contains exactly one point.
Proof. An IP simplex contains line segments $VV'$ with $V' = -\varepsilon V$, where $\varepsilon$ is a positive number. If a simplex $S = \mathrm{ConvexHull}\{V_1, \ldots, V_{s+1}\}$ has all of its vertices except one ($V_{s+1}$) in common with other simplices, then all points in the linear span of $S$ are nonnegative linear combinations of the $V_j$ and the $-\varepsilon_j V_j$ with $j \le s$, thus showing that $V_{s+1}$ violates the minimality of $\nabla$.
The following example shows that an IP simplex structure need not be unique:
Example 2.1. $n = 5$, $\nabla = \mathrm{ConvexHull}\{V_1, \ldots, V_8\}$ with
\[ \begin{aligned} V_1 &= (1, 1, 0, 0, 0), & V_2 &= (1, -1, 0, 0, 0), & V_3 &= (-1, 0, 1, 0, 0), & V_4 &= (-1, 0, -1, 0, 0), \\ V_5 &= (-1, 0, 0, 1, 0), & V_6 &= (-1, 0, 0, -1, 0), & V_7 &= (1, 0, 0, 0, 1), & V_8 &= (1, 0, 0, 0, -1). \end{aligned} \tag{5} \]
$\nabla$ contains the IP simplices $S_{1234} = V_1 V_2 V_3 V_4$ (in the $x_1 x_2 x_3$-plane), $S_{1256}$ (in the $x_1 x_2 x_4$-plane), $S_{3478}$ (in the $x_1 x_3 x_5$-plane), $S_{5678}$ (in the $x_1 x_4 x_5$-plane) and the 4-dimensional minimal polytopes $\nabla_{123456}$, $\nabla_{123478}$, $\nabla_{125678}$, $\nabla_{345678}$. Any set of three of the four IP simplices defines an IP simplex structure.
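The claims of Example 2.1 about the four IP simplices can be verified with exact arithmetic. The sketch below assumes the vertex coordinates of Eq. (5); the helper names `rank` and `is_ip_simplex_with_equal_weights` are introduced only for this illustration. It uses the fact that a set of $s + 1$ vectors of linear rank $s$ whose sum vanishes spans a simplex with the origin as its barycentre.

```python
from fractions import Fraction
from itertools import combinations

V = {1: (1, 1, 0, 0, 0), 2: (1, -1, 0, 0, 0), 3: (-1, 0, 1, 0, 0), 4: (-1, 0, -1, 0, 0),
     5: (-1, 0, 0, 1, 0), 6: (-1, 0, 0, -1, 0), 7: (1, 0, 0, 0, 1), 8: (1, 0, 0, 0, -1)}

def rank(rows):
    """Rank of a list of integer vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                f = m[r][col] / m[rk][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def is_ip_simplex_with_equal_weights(labels):
    """True if the chosen vertices sum to zero and have rank (#vertices - 1)."""
    pts = [V[i] for i in labels]
    sums_to_zero = all(sum(c) == 0 for c in zip(*pts))
    return sums_to_zero and rank(pts) == len(pts) - 1

simplices = [(1, 2, 3, 4), (1, 2, 5, 6), (3, 4, 7, 8), (5, 6, 7, 8)]
print(all(is_ip_simplex_with_equal_weights(s) for s in simplices))   # True
# Any three of the four simplices together contain all eight vertices:
print(all(set().union(*three) == set(V) for three in combinations(simplices, 3)))  # True
```

Each of the four simplices therefore carries the weight system $(1/4, 1/4, 1/4, 1/4)$, and the full set of vertices has rank 5, so that $k - n = 3$ simplices are needed for an IP simplex structure.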
Lemma 2.3. For dimensions $n = 1, 2, 3, 4$ of $\mathbb{R}^n$ precisely the following IP simplex structures of minimal polyhedra are possible:
n = 1: $\{S_1 = V_1 V_2\}$;
n = 2: $\{S_1 = V_1 V_2 V_3\}$, $\{S_1 = V_1 V_2,\ S_2 = V'_1 V'_2\}$;
n = 3: $\{S_1 = V_1 V_2 V_3 V_4\}$, $\{S_1 = V_1 V_2 V_3,\ S_2 = V'_1 V'_2\}$, $\{S_1 = V_1 V_2 V_3,\ S_2 = V_1 V'_2 V'_3\}$, $\{S_1 = V_1 V_2,\ S_2 = V'_1 V'_2,\ S_3 = V''_1 V''_2\}$;
n = 4: As in the first column of Table 1 in the appendix.
Proof. Recursive application of Lemma 2.1 and use of Lemma 2.2 shows that these are the only possible structures. Explicit realisations of these structures will be presented later.
2.2. Weight systems
Any IP polytope, and therefore any reflexive polyhedron, must obviously contain one of the minimal polyhedra encountered in the last subsection. The structures found there are rather coarse, so now we have to face the task of suitably refining them in such a way that they become useful for our goal of classifying reflexive polyhedra. In particular, we will find that the linear relations between the vertices of minimal polyhedra can be encoded by sets of real numbers called weight systems, and we will address the question of which weight systems can occur if a minimal polyhedron is a subpolyhedron of some reflexive polytope.
The fact that a simplex spanned by vertices $V_i$ contains the origin in its interior is equivalent to the condition that there exist positive real numbers (weights) $q_i$ such that $\sum q_i V_i = 0$. As these numbers are unique up to a common factor, it is convenient to choose some normalisation such as $\sum q_i = 1$.
Definition 2.6. A weight system is a collection of positive real numbers (weights) $q_i$ with $\sum q_i = 1$. A weight system corresponding to an IP simplex with vertices $V_i$ is the normalised set of numbers $q_i$ such that $\sum q_i V_i = 0$. A combined weight system (CWS) corresponding to a minimal polyhedron endowed with an IP simplex structure is the collection of weight systems $q_i^{(j)}$ corresponding to the IP simplices $S_j$ occurring there, with $q_i^{(j)} = 0$ if $V_i \notin S_j$. We call a (combined) weight system rational if all of the $q_i$ are rational numbers.
If a minimal polyhedron $\nabla$ is a lattice polyhedron, a corresponding CWS will always be rational. In this case it is possible to normalise the weights as positive integers $n_i$ with no common divisor; then $q_i = n_i/d$ with $d = \sum n_i$. We will use both conventions for describing weight systems. By the definition of a lattice polyhedron,
any lattice on which a minimal polyhedron $\nabla$ is integer must contain the lattice $N_{\rm coarsest}$ generated by the vertices of $\nabla$.
Definition 2.7. Given a minimal polyhedron $\nabla \subset N_{\mathbb{R}}$, we define the lattice $N_{\rm coarsest}$ as the lattice in $N_{\mathbb{R}}$ generated linearly over $\mathbb{Z}$ by the vertices of $\nabla$ and the lattice $M_{\rm finest} \subset M_{\mathbb{R}}$ as the lattice dual to $N_{\rm coarsest}$.
Lemma 2.4. If $\nabla$ is a minimal polyhedron with vertices $V_i$ and $q$ a CWS corresponding to an IP simplex structure of $\nabla$, then:
(a) The map $M_{\mathbb{R}} \to \mathbb{R}^k$, $X \mapsto x = (x_1, \ldots, x_k)$ with $x_i = \langle V_i, X \rangle$ defines an embedding such that the image of $M_{\mathbb{R}}$ is the subspace defined by $\sum_i q_i^{(j)} x_i = 0$ for all $j$.
(b) $\nabla^*$ is isomorphic to the polyhedron defined in this subspace by $x_i \ge -1$ for $i = 1, \ldots, k$.
(c) If $q$ is rational, then $M_{\rm finest}$ is isomorphic to the sublattice of $\mathbb{Z}^k = \{(x_1, \ldots, x_k)\ \text{integer}\} \subset \mathbb{R}^k$ determined by the equations $\sum_i q_i^{(j)} x_i = 0$.
Proof.
(a) $\sum_i q_i^{(j)} V_i = 0$ implies $\sum_i q_i^{(j)} x_i = 0$. Conversely, the $x_i$ determine $X$ because a point in $M_{\mathbb{R}}$ is uniquely determined by its duality pairings with a set of generators (here, the $V_i$) of the dual space.
(b) Follows from the form of the embedding map and the definition of the dual polytope (2).
(c) If $X$ belongs to any lattice $M$ such that $\nabla$ is integer on the dual lattice $N$, the corresponding $x_i$ must be integer. If the $x_i$ are integer, then $X$ has integer pairings with the generators $V_i$ of $N_{\rm coarsest}$, so $X$ belongs to $M_{\rm finest}$.
Corollary 2.2. An IP simplex structure together with the specification of a CWS uniquely determines a minimal polyhedron up to isomorphism.
Proof. By Lemma 2.4, $\nabla^*$ and hence $\nabla$ is uniquely determined by the CWS.
As Example 2.1 (following Lemma 2.2) shows, an IP simplex structure need not be unique, so it is possible that two different CWS may correspond to the same minimal polytope. In such a situation, the weight systems of one CWS must be linear combinations of those of the other CWS with coefficients that are not all nonnegative.
Since all weights must be positive, this can only happen if there is an IP simplex such that all of its vertices also belong to other IP simplices in the same IP simplex structure. This can happen only for n ≥ 5, as one can see by explicitly checking all cases for n ≤ 4. Thus, for n ≥ 5 it might be preferable to work with equivalence classes of CWS leading to the same minimal polytopes instead of using CWS only.
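As an illustration of Lemma 2.4 (b) and (c), the following minimal Python sketch lists the lattice points of the polytope defined by $\sum_i q_i x_i = 0$, $x_i \ge -1$ for a single weight system. The function name `lattice_points` and the convention of passing the weights as integers $n_i$ with $q_i = n_i/d$, $d = \sum_i n_i$, are assumptions made for this example only; it is not the authors' program.

```python
from itertools import product


def lattice_points(weights):
    """Integer points x with sum_i n_i x_i = 0 and x_i >= -1.

    `weights` holds positive integers n_i (q_i = n_i / d with d = sum n_i).
    """
    d = sum(weights)
    # With a_i = x_i + 1 >= 0 the conditions read sum_i n_i a_i = d,
    # so each a_i is bounded by d // n_i and the search is finite.
    ranges = [range(d // n + 1) for n in weights]
    return [tuple(a - 1 for a in exps)
            for exps in product(*ranges)
            if sum(n * a for n, a in zip(weights, exps)) == d]
```

For the three weight systems with l = 3 the sketch returns 10, 9 and 7 points for (1, 1, 1), (1, 1, 2) and (1, 2, 3), respectively, each list containing the origin.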
Reflexive Polyhedra, Weights and Toric Calabi–Yau Fibrations
Definition 2.8. If $q$ is a rational CWS corresponding to a minimal polyhedron $\nabla$, we define $\Delta(q)$ as the convex hull of $\nabla^* \cap M_{\rm finest}$. We say that $q$ has the IP property if $\Delta(q)$ has the IP property.

Corollary 2.3. If a CWS has the IP property, then every single weight system occurring in it also has the IP property.

Proof. Without loss of generality we can assume that the single weight system is $q^{(1)}$ with $q^{(1)}_i > 0$ for $i \le l$ and $q^{(1)}_i = 0$ for $i > l$. There is a natural projection $\pi$ from $\mathbb Z^k$ as in Lemma 2.4 to $\mathbb Z^l$ by restriction to the first $l$ coordinates. Our construction implies that the projection of the lattice polytope in $\mathbb Z^k$ is a subpolytope of the lattice polytope in $\mathbb Z^l$ determined by $\sum_i q^{(1)}_i x_i = 0$ and $x_i \ge -1$. If $0_l = \pi(0_k)$ were not in the interior of the polytope in $\mathbb Z^l$, then $0_k$ could not be in the interior of the polytope in $\mathbb Z^k$.

Lemma 2.5. Let $l$ denote the number of weights of a weight system. Then the following statements hold:
l = 2: There is a single IP weight system, namely (1, 1).
l = 3: There are three IP weight systems, namely (1, 1, 1), (1, 1, 2) and (1, 2, 3).
l = 4: There are the 95 IP weight systems shown in Table 3.
l = 5: There are 184,026 IP weight systems which can be found at our web site [34].

Proof. The classification of IP weight systems is based on the study of which integer points are allowed by Lemma 2.4. Assume that a weight system $q_1, \dots, q_l$ allows a collection of points with coordinates $x_i \ge -1$ as in Lemma 2.4, including the interior point with $x_i = 0\ \forall i$. If these points fulfill an equation of the type $\sum_{i=1}^{l} a_i x_i = 0$ with $a \ne q$, then the weight system must also allow at least one point with $\sum_{i=1}^{l} a_i x_i > 0$ and at least one point with $\sum_{i=1}^{l} a_i x_i < 0$ to ensure that 0 is really in the interior. The latter inequality is the one that we actually use for the algorithm: Starting with the point 0, we see that unless our weight system is $q = (1/l, \dots, 1/l)$, there must be at least one point with $\sum_{i=1}^{l} x_i < 0$.
For l ≤ 5 there are only a few possibilities, and after choosing some point x1 , we can look for some simple equation fulfilled by 0 and x1 and proceed in the same way. If l = 2, any weight system except (1/2, 1/2) would have to allow an integer point with x1 + x2 < 0, x1 ≥ −1 and x2 ≥ −1. Such a point has no positive coordinate and therefore cannot be allowed by a (positive) weight system. For l = 3 the classification is still easily carried out by hand: Unless q = (1/3, 1/3, 1/3), we need at least one point with x1 + x2 + x3 < 0. As points where no coordinate is greater than 0 would be in conflict with the positivity of the weight system, we need the point (1, −1, −1) (up to a permutation of indices). Now we note that 0 and (1, −1, −1) both fulfill 2x1 + x2 + x3 = 0, so q = (1/2, 1/4, 1/4) or we need a point with 2x1 +x2 +x3 < 0. The only point allowed by this inequality which leads to a sensible weight system is (−1, 2, −1), leading to q = (1/2, 1/3, 1/6).
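The last step of the l = 3 argument can be checked mechanically: two independent points allowed by a weight system fix the normalised weights through the conditions $q \cdot x = 0$ and $\sum_i q_i = 1$. The sketch below is only an illustration under that assumption; the helper name `weights_from_points` is hypothetical, and the use of the cross product is a convenience of this example rather than the method of the authors' program.

```python
from fractions import Fraction


def weights_from_points(p, r):
    """Normalised q = (q1, q2, q3) with q.p = q.r = 0 and q1 + q2 + q3 = 1."""
    # The cross product of p and r is orthogonal to both points.
    c = (p[1] * r[2] - p[2] * r[1],
         p[2] * r[0] - p[0] * r[2],
         p[0] * r[1] - p[1] * r[0])
    total = sum(c)
    if total == 0:
        raise ValueError("the two points do not fix a normalisable weight vector")
    return tuple(Fraction(ci, total) for ci in c)
```

Applied to the points (1, −1, −1) and (−1, 2, −1) from the proof, the sketch reproduces q = (1/2, 1/3, 1/6), i.e. the weight system (1, 2, 3) up to ordering and normalisation.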
For l = 4 and l = 5 we have implemented this strategy in a computer program that produced 99 and 200653 candidates for IP weight systems, respectively. Finally, explicit constructions of ∆(q) show that four of the 99 weight systems with l = 4 and 16627 of the 200653 weight systems with l = 5 do not have the IP property, leading to the results given. Remark 2.1. The 95 IP weight systems for l = 4 are precisely the well known 95 weight systems for weighted P4 ’s that have K3 hypersurfaces [46, 47], whereas for l = 5 the 7555 weight systems corresponding to weighted P4 ’s that allow transverse polynomials [11, 10] are just a small subset of the 184026 different IP weight systems. Lemma 2.6. In dimensions n = 1, 2, 3, 4, the CWS with the IP property are the weight systems with l = n + 1 given in the previous lemma and, in addition, the following CWS: n = 2: {(1, 1, 0, 0), (0, 0, 1, 1)} n = 3: The 21 CWS given in Table 2 n = 4: 17320 CWS (cf. the second column of Table 1) Proof. By explicitly combining the structures of Lemma 2.3 with the IP weight systems of Lemma 2.5 and checking for the IP property of ∆(q). 2.3. The classification As we saw in the previous subsections, every reflexive polyhedron must contain at least one minimal polytope corresponding to one of the CWS found there. Thus, by duality, every reflexive polyhedron must be a subpolyhedron of one of the ∆(q) on some suitable sublattice of the finest possible lattice Mfinest . We start this section with analysing the question of which dual pairs of lattices can be chosen such that a dual pair of polyhedra is reflexive on them. Then we give various refinements of our original definition of minimality, and finally we present our results on the classification of reflexive polyhedra. Given a dual pair of polytopes such that ∆ has nV vertices and nF facets (a facet being a codimension 1 face), the dual polytope has nV facets and nF vertices. Definition 2.9. 
The vertex pairing matrix (VPM) $X$ is the $n_F \times n_V$ matrix whose entries are $X_{ij} = \langle \bar V_i, V_j \rangle$, where $\bar V_i$ and $V_j$ are the vertices of $\Delta^*$ and $\Delta$, respectively. $X_{ij}$ will be $-1$ whenever $V_j$ lies on the $i$'th facet. Note that $X$ is independent of the choice of a dual pair of bases in $N_{\mathbb R}$ and $M_{\mathbb R}$ but depends on the orderings of the vertices. If $\Delta$ is reflexive, then its VPM is obviously integer. In this case there are distinguished lattices $M_{\rm coarsest}$ and $N_{\rm coarsest}$, generated by the vertices of $\Delta$ and $\Delta^*$, respectively, and their duals $N_{\rm finest}$ and $M_{\rm finest}$. Clearly any lattice $M$ on which $\Delta$ is reflexive must fulfill $M_{\rm coarsest} \subseteq M \subseteq M_{\rm finest}$.
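A small worked example may make the definition concrete. The sketch below computes the VPM for the triangle with vertices (−1, −1), (2, −1), (−1, 2) and its dual with vertices (1, 0), (0, 1), (−1, −1); this pair, the vertex ordering and the function name `vertex_pairing_matrix` are choices made for illustration and do not appear in the text.

```python
def vertex_pairing_matrix(dual_vertices, vertices):
    """X[i][j] = <Vbar_i, V_j> for vertices Vbar_i of the dual and V_j of the polytope."""
    return [[sum(a * b for a, b in zip(vbar, v)) for v in vertices]
            for vbar in dual_vertices]


DELTA = [(-1, -1), (2, -1), (-1, 2)]      # assumed example polygon in M
DELTA_DUAL = [(1, 0), (0, 1), (-1, -1)]   # its dual in N
X = vertex_pairing_matrix(DELTA_DUAL, DELTA)
```

Each row of `X` contains the entry −1 exactly twice, once for each of the two vertices lying on the corresponding facet (edge) of the triangle.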
Lemma 2.7. If $\Delta \subset M_{\mathbb R} \simeq \mathbb R^n$ is a polytope with the IP property such that its VPM $X$ is integer, the following statements hold:

$X$ can be decomposed as $X = \tilde W \cdot \tilde D \cdot \tilde U = W \cdot D \cdot U$, where $\tilde W$ is a $GL(n_F, \mathbb Z)$ matrix, $\tilde U$ is a $GL(n_V, \mathbb Z)$ matrix and $\tilde D$ is an $n_F \times n_V$ matrix such that the first $n$ diagonal elements are positive integers whereas all other elements are zero; $W$, $D$ and $U$ are the obvious $n_F \times n$, $n \times n$ and $n \times n_V$ submatrices.

The lattices $M \subset M_{\mathbb R}$ on which $\Delta$ is reflexive are in one to one correspondence with decompositions $D = T \cdot S$, where $T$ and $S$ are upper triangular integer matrices with positive diagonal elements and with $0 \le T_{ji} < T_{ii}$. Then $\Delta$ as a lattice polyhedron on $M$ is isomorphic to the polytope in $\mathbb Z^n$ whose vertices are given by the columns of $S \cdot U$ and $\Delta^*$ is isomorphic to the polytope in $\mathbb Z^n$ whose vertices are given by the lines of $W \cdot T$. In particular, $\Delta$ on $M_{\rm finest}$ corresponds to $D \cdot U$, $\Delta$ on $M_{\rm coarsest}$ corresponds to $U$, $\Delta^*$ on $N_{\rm finest}$ corresponds to $W \cdot D$ and $\Delta^*$ on $N_{\rm coarsest}$ corresponds to $W$.

Proof. By recombining the lines and columns of $X$ in the style of Gauss's algorithm for solving systems of linear equations, we can turn $X$ into an $n_F \times n_V$ matrix $\tilde D$ with non-vanishing elements only along the diagonal. But recombining lines just corresponds to left multiplication with some $GL(n_F, \mathbb Z)$ matrix, whereas recombining columns corresponds to right multiplication with some $GL(n_V, \mathbb Z)$ matrix. Keeping track of the inverses of these matrices, we successively create decompositions $X = \tilde W^{(n)} \cdot \tilde D^{(n)} \cdot \tilde U^{(n)}$ (with $\tilde W^{(0)} = 1$, $\tilde D^{(0)} = X$ and $\tilde U^{(0)} = 1$). We denote the matrices resulting from the last step by $\tilde W$, $\tilde D$ and $\tilde U$. $\tilde W$ and $\tilde U$ being regular matrices and the rank of $X$ being $n$, it is clear that $\tilde D$ has only $n$ non-vanishing elements which can be taken to be the first $n$ diagonal elements.
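One consequence of the decomposition can be checked without carrying out the Gauss-style reduction: row and column operations over $\mathbb Z$ leave the greatest common divisor of all $n \times n$ minors unchanged, so this gcd equals the product of the diagonal elements of $D$, which is the order of $M_{\rm finest}/M_{\rm coarsest}$. The sketch below relies on this standard fact instead of the reduction described in the proof; the helper names `det` and `lattice_index` are hypothetical and the code is not the authors' implementation.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def lattice_index(vpm, n):
    """gcd of all n x n minors of the VPM, i.e. the product of the diagonal of D."""
    g = 0
    for rows in combinations(range(len(vpm)), n):
        for cols in combinations(range(len(vpm[0])), n):
            g = gcd(g, det([[vpm[i][j] for j in cols] for i in rows]))
    return g
```

For the Newton polytope of the quintic in $\mathbb P^4$, whose VPM has entries $X_{ij} = 5\delta_{ij} - 1$, the sketch gives 125; for the analogous triangle in two dimensions, with $X_{ij} = 3\delta_{ij} - 1$, it gives 3.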
In the same way as we defined an embedding of $M_{\mathbb R}$ in $\mathbb R^k$ in Lemma 2.4, we now define an embedding in $\mathbb R^{n_F}$ such that $M_{\rm finest}$ is isomorphic to the sublattice of $\mathbb Z^{n_F}$ determined by the linear relations among the $\bar V_i$. In this context the $X_{ij}$ are just the embedding coordinates of the $V_j$. The $n_F \times n_F$ matrix $\tilde W$ effects a change of coordinates in $\mathbb Z^{n_F}$ so that $\Delta$ now lies in the lattice spanned by the first $n$ coordinates. Thus we can interpret the columns of $D \cdot U$ as the vertices of $\Delta$ on $M_{\rm finest}$. Similarly, the lines of $W \cdot D$ are coordinates of the vertices of $\Delta^*$ on $N_{\rm finest}$, whereas $U$ and $W$ are the corresponding coordinates on the coarsest possible lattices.

Denoting the generators of $M_{\rm coarsest}$ by $\vec E_i$ and the generators of $M_{\rm finest}$ by $\vec e_i$, we have $\vec E_i = \vec e_j D_{ji}$. An intermediate lattice will have generators $\vec{\mathcal E}_i = \vec e_j T_{ji}$ such that the $\vec E_i$ can be expressed in terms of the $\vec{\mathcal E}_j$, amounting to $\vec E_i = \vec{\mathcal E}_j S_{ji} = \vec e_k T_{kj} S_{ji}$ with some integer matrix $S$. This results in the condition $D_{ki} = T_{kj} S_{ji}$. In order to get rid of the redundancy coming from the fact that the intermediate lattices can be described by different sets of generators, one may proceed in the following way: $\vec{\mathcal E}_1$ may be chosen as a multiple of $\vec e_1$ (i.e., $\vec{\mathcal E}_1 = \vec e_1 T_{11}$). Then we choose $\vec{\mathcal E}_2$ as a vector in the $\vec e_1$-$\vec e_2$-plane (i.e., $\vec{\mathcal E}_2 = \vec e_1 T_{12} + \vec e_2 T_{22}$) subject to the condition that the lattice generated by $\vec{\mathcal E}_1$ and $\vec{\mathcal E}_2$ should be a sublattice of the one generated
by $\vec E_1$ and $\vec E_2$, which is equivalent to the possibility of solving $T_{kj} S_{ji} = D_{ki}$ for integer matrix elements of $S$. We may avoid the ambiguity arising from the possibility of adding a multiple of $\vec{\mathcal E}_1$ to $\vec{\mathcal E}_2$ by demanding $0 \le T_{12} < T_{11}$. We can choose the elements of $T$ column by column (in rising order). For each particular column $i$ we first pick $T_{ii}$ such that it divides $D_{ii}$; then $S_{ii} = D_{ii}/T_{ii}$. Then we pick the $T_{ji}$ with $j$ decreasing from $i - 1$ to 1. At each step the $j$'th line of $T \cdot S = D$,

$$T_{ji} S_{ii} + \sum_{j<k<i} T_{jk} S_{ki} + T_{jj} S_{ji} = 0, \qquad (6)$$

must be solved for the unknown $T_{ji}$ and $S_{ji}$ with the extra condition $0 \le T_{ji} < T_{ii}$ ensuring that we get only one representative of each equivalence class of bases.

At this point we have, in principle, all the ingredients that we need for a complete classification of reflexive polyhedra. We simply have to construct all subpolyhedra with integer VPM of all $\Delta(q)$ with $q$ being one of our IP CWS, and apply Lemma 2.7. Both for theoretical and for practical reasons, however, it is interesting to reduce the number of polyhedra used as a starting point in our scheme. To this end we will give various refinements of our original definition of minimality, preceded by a useful lemma on the structure of $\Delta(q)$.

Lemma 2.8. For $n \le 4$, $\Delta(q)$ is reflexive whenever it has the IP property.

Proof. This fact was proved in [26] and later explicitly confirmed by our computer programs.

Definition 2.10. Let $\nabla \subset N_{\mathbb R}$ be a minimal lattice polyhedron such that $\Delta$, the convex hull of $\nabla^* \cap M$, also has the IP property. Then we say that
– $\nabla$ has the span property if the vertices of $\nabla$ are also vertices of $\Delta^*$;
– $\nabla$ is lp-minimal if, whenever we remove one of the vertices of $\nabla$, the convex hull of the remaining set of lattice points of $\nabla$ does not have the IP property;
– $\nabla$ is very minimal if, whenever we remove one of the vertices of $\nabla$ from the set of lattice points of $\Delta^*$, the convex hull of the remaining lattice points of $\Delta^*$ does not have the IP property;
– a CWS $q$ is said to have one of the above properties if the corresponding $\nabla$ on $N_{\rm coarsest}$ has it.
A reflexive polytope $\Delta \subset M_{\mathbb R}$ is called r-maximal (and its dual $\Delta^* \subset N_{\mathbb R}$ r-minimal) if it is not contained in any other reflexive polytope; a CWS $q$ is called r-minimal if $\Delta(q)$ is r-maximal.

The name “span property” refers to the fact that our definition is equivalent to the statement that the hyperplanes in $M_{\mathbb R}$ dual to the vertices of $\nabla$ are spanned by points of $\Delta$.
The following lemma clarifies the relations between the various definitions of minimality and the ways in which these definitions can be used to refine our classification scheme. It also answers the question of how many CWS of the various minimality types exist.
Lemma 2.9.
(a) For every reflexive polytope $\Delta \subset M_{\mathbb R}$, there exists at least one CWS $q$ with the span property such that $\Delta$ is a subpolyhedron of the convex hull of $\nabla^* \cap M$ and $M$ is a sublattice of $M_{\rm finest}$.
(b) For every reflexive polytope $\Delta \subset M_{\mathbb R}$, there exists at least one lp-minimal CWS $q$ such that $\Delta$ is a subpolyhedron of the convex hull of $\nabla^* \cap M$ and $M$ is a sublattice of $M_{\rm finest}$.
(c) If $q$ is very minimal, $\Delta(q)$ is not a subpolyhedron of $\Delta(q')$ for any $q'$ corresponding to a minimal polytope different from the one defined by $q$.
(d) A very minimal polytope is lp-minimal and has the span property.
(e) For every reflexive polytope $\Delta \subset M_{\mathbb R} \simeq \mathbb R^n$ with $n \le 4$, there exists at least one r-minimal CWS $q$ such that $\Delta$ is a subpolyhedron of the convex hull of $\nabla^* \cap M$ and $M$ is a sublattice of $M_{\rm finest}$.
(f) For $n \le 4$, a CWS $q$ is r-minimal if and only if it is very minimal.
(g) For $n \le 3$ (but not for $n = 4$), every lp-minimal CWS has the span property.
(h) The very minimal CWS for $n = 2$ are {(1, 1, 1)}, {(1, 1, 2)} and {(1, 1, 0, 0), (0, 0, 1, 1)}. The remaining IP weight system {(1, 2, 3)} has the span property but is not lp-minimal.
(i) For $n = 3$ the minimality type is indicated in Tables 2 and 3.
(j) For $n = 4$ the numbers of CWS of the different minimality types are given in Table 1.

Proof.
(a) By dropping vertices from $\Delta^*$ one can always arrive at a minimal polytope $\nabla \subseteq \Delta^*$ and the corresponding CWS.
(b) By dropping lattice points from $\Delta^*$ one can always arrive at an lp-minimal (and therefore also minimal) polytope $\nabla \subseteq \Delta^*$ and the corresponding CWS.
(c) If $\Delta(q)$ were a subpolyhedron of $\Delta(q')$ for some $q'$ other than $q$, then $(\Delta(q))^*$ would contain (but not be equal to) $(\Delta(q'))^*$, which is impossible by the definition of $q$ being very minimal.
(d) By definition, $\Delta \subseteq \nabla^*$, implying $\nabla \subseteq \Delta^*$.
Very minimal implies span: If a vertex of $\nabla$ were not a vertex of $\Delta^*$, it would be in the convex hull of the remaining lattice points of $\Delta^*$ which then would be equal to $\Delta^*$ and hence have the IP property, thus violating the assumption that $\nabla$ is very minimal. The fact that very minimal implies lp-minimal is obvious from comparing the different definitions.
(e) With (a), we can find a CWS $q^{(1)}$ such that $\Delta$ is a subpolyhedron of $\Delta(q^{(1)})$, possibly on a sublattice. By Lemma 2.8, $\Delta(q^{(1)})$ is reflexive. If $q^{(1)}$ is not r-minimal, $\Delta(q^{(1)})$ is a proper subpolyhedron of some other reflexive polyhedron $\Delta^{(1)}$ for which we can find a CWS $q^{(2)}$ as before. As the number of lattice
points of $\Delta(q^{(i)})$ increases in every step, this process has to terminate; thus $q^{(i)}$ must be r-minimal for some $i$.
(f) Because of (c), every very minimal CWS is r-minimal. The fact that every r-minimal CWS is very minimal was checked explicitly by our computer programs.
(g)–(j) By explicit checks, for $n \ge 3$ with the help of our computer programs.

To end this section, we now give the results of the application of our classification scheme for various dimensions.

Proposition 2.1. For $n = 2$ there are 16 reflexive polyhedra up to linear isomorphisms. All of them are subpolyhedra of $\Delta(q)$ where $q$ is one of the three very minimal CWS.

Proof. The classification of 2-dimensional reflexive polyhedra has been established for a while (see, e.g., [48, 49]) and is easily reproduced within our scheme. The second fact can be checked explicitly.

Proposition 2.2. For $n = 3$ there are 4319 reflexive polyhedra up to linear isomorphisms. 4318 of them are subpolyhedra of $\Delta(q)$ where $q$ is one of the very minimal CWS of Tables 2 and 3. The remaining one is the convex hull of $\nabla^* \cap M$, where $\nabla$ is determined by the weight system (1, 1, 1, 1) and $M$ is a $\mathbb Z_2$ sublattice of $M_{\rm finest}$.

Proof. We explicitly constructed all subpolyhedra with integer VPMs of the $\Delta(q)$ coming from very minimal CWS with the help of a computer program and checked that polyhedra coming from CWS that are not very minimal are contained in the list of reflexive subpolyhedra of the $\Delta(q)$ for very minimal CWS. Application of Lemma 2.7 produced the last polyhedron (see [50]).

Proposition 2.3. For $n = 4$ there are 473,800,776 reflexive polyhedra up to linear isomorphisms. In addition to $\Delta(q)$ with $q$ one of the 308 r-minimal CWS, there are 32 further r-maximal polyhedra.

Proof. Our computer programs produced 473,800,652 different reflexive subpolyhedra of the 308 polytopes we started with, and another 124 that are subpolytopes on sublattices.
Among these there are 25 r-maximal polyhedra that were obtained by applying Lemma 2.7 to the original 308 r-maximal polytopes and checking for r-minimality of the duals on the various lattices allowed by Lemma 2.7. Another 7 r-maximal polyhedra showed up on sublattices that were generated either by lattice reductions that were required by reflexivity during the search for subpolytopes or in the final search for all allowed lattices for all VPMs (for details see [51]).
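In two dimensions the reflexivity condition underlying these counts is easy to test directly: a lattice polygon with the origin in its interior is reflexive precisely when every edge lies at lattice distance 1 from the origin, which is the statement that the dual vertices are integer. The following sketch checks this for a polygon given by its vertices in counter-clockwise order; the function name `is_reflexive_polygon` and the input convention are assumptions of this example, and it is not the classification program referred to in the proofs.

```python
from math import gcd


def is_reflexive_polygon(vertices):
    """Vertices: lattice points in counter-clockwise order around the origin."""
    n = len(vertices)
    for k in range(n):
        (px, py), (qx, qy) = vertices[k], vertices[(k + 1) % n]
        cross = px * qy - py * qx             # twice the area of the triangle (0, P, Q)
        edge_length = gcd(qx - px, qy - py)   # lattice length of the edge PQ
        # cross = edge_length * (lattice distance of the edge from the origin),
        # so reflexivity requires cross == edge_length on every edge.
        if cross != edge_length:
            return False
    return True
```

For instance, the triangle with vertices (−1, −1), (2, −1), (−1, 2) passes the test, whereas the triangle with vertices (−1, −1), (3, −1), (−1, 3) fails because one of its edges lies at lattice distance 2 from the origin.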
3. Geometric Interpretation of Our Classification Results

We now want to discuss what our results on the classification of reflexive polyhedra imply for Calabi–Yau manifolds that are hypersurfaces in toric varieties. The lattice points of a reflexive polyhedron $\Delta$ encode the monomials occurring in the description of the hypersurface in a variety $V_\Sigma$ whose fan $\Sigma$ is determined by a triangulation of the dual polyhedron $\Delta^*$. For details of what a fan is and how it determines a toric variety, it is best to look up a standard textbook [52, 53]. There is one particular approach to the description of toric varieties, however, which cannot be found there. This is the description in terms of homogeneous coordinates [54], which is the one most useful for applications in physics, and which also exhibits in the clearest way the significance of the weight systems that we used in the context of our classification scheme. We will briefly present this approach and show how Calabi–Yau manifolds are constructed in this setup and then we will proceed to explain some of the consequences of our results in terms of geometry.

Given a fan $\Sigma$ in $N_{\mathbb R}$, it is possible to assign a global homogeneous coordinate system to $V_\Sigma$ in a way similar to the usual construction of $\mathbb P^n$. To this end one assigns a coordinate $z_k$, $k = 1, \dots, K$ to each one-dimensional cone in $\Sigma$. If the primitive generators $v_1, \dots, v_K$ of these one-dimensional cones span $N_{\mathbb R}$, then there must be $K - n$ independent linear relations of the type $\sum_k w_j^k v_k = 0$. These linear relations are used to define equivalence relations of the type

$$(z_1, \dots, z_K) \sim (\lambda^{w_j^1} z_1, \dots, \lambda^{w_j^K} z_K), \qquad j = 1, \dots, K - n \qquad (7)$$
on the space $\mathbb C^K \setminus Z_\Sigma$. The set $Z_\Sigma$ is determined by the fan $\Sigma$ in the following way: It is the union of spaces $\{(z_1, \dots, z_K) : z_i = 0\ \forall i \in I\}$, where the index sets $I$ are those sets for which $\{v_i : i \in I\}$ does not belong to a cone in $\Sigma$. Thus $(\mathbb C^*)^K \subset \mathbb C^K \setminus Z_\Sigma \subset \mathbb C^K \setminus \{0\}$. Then $V_\Sigma = (\mathbb C^K \setminus Z_\Sigma)/((\mathbb C^*)^{K-n} \times G)$, where the $K - n$ copies of $\mathbb C^*$ act by the equivalence relations given above and the finite abelian group $G$ is the quotient of the $N$ lattice by the lattice generated by the $v_k$. We will usually consider the case where $G$ is trivial. In this approach the toric divisors $D_k$ are determined by the equations $z_k = 0$.

The construction of a Calabi–Yau hypersurface from a reflexive polyhedron proceeds in the following way: We take $\Delta$ to be a reflexive polyhedron in $M_{\mathbb R}$, $\Delta^* \subset N_{\mathbb R}$ its dual, and $\Sigma$ a fan defined by a maximal triangulation of $\Delta^*$. This means that the integer generators $v_1, \dots, v_K$ of the one-dimensional cones are just the integer points (except the origin) of $\Delta^*$. The polynomial whose vanishing determines the Calabi–Yau hypersurface takes the form

$$\sum_{x \in \Delta \cap M} a_x \prod_{k=1}^{K} z_k^{\langle v_k, x \rangle + 1}. \qquad (8)$$

It is easily checked that it is quasihomogeneous with respect to all $K - n$ relations of (7) with degrees $d_j = \sum_{k=1}^{K} w_j^k$, $j = 1, \dots, K - n$. Note how the reflexivity of the polyhedron ensures that the exponents are nonnegative.
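The two statements just made about Eq. (8), quasihomogeneity and nonnegative exponents, can be verified in the simplest case. The sketch below uses the generators (1, 0), (0, 1), (−1, −1), which satisfy the single relation $v_1 + v_2 + v_3 = 0$ with weights (1, 1, 1), together with the lattice points of the dual triangle; this example and all names in the code are illustrative assumptions, not taken from the text.

```python
def monomial_exponents(points_of_delta, generators):
    """Exponent vectors (<v_k, x> + 1)_k, one per lattice point x of Delta, as in Eq. (8)."""
    return [tuple(sum(a * b for a, b in zip(v, x)) + 1 for v in generators)
            for x in points_of_delta]


GENERATORS = [(1, 0), (0, 1), (-1, -1)]   # v1 + v2 + v3 = 0
WEIGHTS = (1, 1, 1)                       # the single relation of type (7)
# Lattice points of Delta = {x : <v_k, x> >= -1 for all k}.
POINTS = [(x, y) for x in range(-1, 3) for y in range(-1, 3) if x + y <= 1]

EXPONENTS = monomial_exponents(POINTS, GENERATORS)
DEGREES = {sum(w * e for w, e in zip(WEIGHTS, exps)) for exps in EXPONENTS}
```

All ten exponent vectors are nonnegative and have weighted degree $d = 1 + 1 + 1 = 3$, so in this example the polynomial is the general homogeneous cubic in $z_1, z_2, z_3$.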
By [15], the Hodge numbers $h_{11}$ and $h_{1,n-2}$ are known, and in [36] the remaining Hodge numbers of the type $h_{1i}$ were calculated. For a hypersurface of dimension $n - 1 \ge 2$ these formulas can be summarised as

$$h_{1i} = \delta_{1i} \Big( l(\Delta^*) - n - 1 - \sum_{{\rm codim}\,\theta^* = 1} l^*(\theta^*) \Big) + \delta_{n-2,i} \Big( l(\Delta) - n - 1 - \sum_{{\rm codim}\,\theta = 1} l^*(\theta) \Big) + \sum_{{\rm codim}\,\theta^* = i+1} l^*(\theta^*)\, l^*(\theta) \qquad (9)$$

for $1 \le i \le n - 2$, where $l$ denotes the number of integer points of a polyhedron and $l^*$ denotes the number of interior integer points of a face. These formulas are invariant under the simultaneous exchange of $\Delta$ with $\Delta^*$ and $h_{1i}$ with $h_{1,n-i}$ so that Batyrev’s construction is manifestly mirror symmetric (at least at the level of Hodge numbers). For $n \le 4$, the generic $(n - 1)$-dimensional Calabi–Yau hypersurface in the family defined by $\Delta$ will be smooth [15] and the meaning of these numbers is unambiguous. For $n \ge 5$, the Calabi–Yau variety may have singularities that do not allow a crepant blow-up. In this case we refer the reader to [36] for a discussion of the precise meaning of the Hodge numbers resulting from Eq. (9).

In the case of a K3 surface there is only one such number, namely $h_{11}$, which is well known always to be equal to 20. Contrary to the case of higher dimensional Calabi–Yau manifolds, this number is not the same as the Picard number, which is given by [15]

$${\rm Pic} = l(\Delta^*) - 4 - \sum_{{\rm facets}\ \theta^*\ {\rm of}\ \Delta^*} l^*(\theta^*) + \sum_{{\rm edges}\ \theta^*\ {\rm of}\ \Delta^*} l^*(\theta^*)\, l^*(\theta). \qquad (10)$$

Mirror symmetry for K3 surfaces is usually interpreted in terms of families of lattice polarised K3 surfaces (see, e.g., [55] or [56]). In this context the Picard number of a generic element of a family and the Picard number of a generic element of the mirror family add up to 20. The fact that the Picard numbers for toric mirror families add up to $20 + \sum l^*(\theta^*)\, l^*(\theta)$ indicates that our toric models occupy rather special loci in the total moduli spaces.

If a polyhedron $\Delta_1$ contains a polyhedron $\Delta_2$, then the definition of duality implies $\Delta_1^* \subset \Delta_2^*$. Therefore the variety determined by the fan over $\Delta_1^*$ may be obtained from the variety determined by the fan over $\Delta_2^*$ by blowing down one or several divisors. If we perform this blow-down while keeping the same monomials (those determined by $\Delta_2$), we obtain a generically singular hypersurface.
This hypersurface can be desingularised by varying the complex structure in such a way that we now allow monomials determined by ∆1 . Thus the classes of Calabi–Yau hypersurfaces determined by polyhedra ∆1 and ∆2 , respectively, can be said to be connected whenever ∆1 contains ∆2 or vice versa. More generally, if there is a chain of polyhedra ∆i such that ∆i and ∆i+1 are connected in the sense defined above, we call the hypersurfaces corresponding to any two elements of the chain connected.
We can easily check for connectedness as a by-product of our classification scheme: For each new CWS $q$ we check explicitly that at least one of the subpolyhedra of $\Delta(q)$ has been found before. Connectedness of the corresponding list of 4318 polytopes in three dimensions follows from the fact that this is always the case. Connectedness of all 3-d reflexive polyhedra follows from the fact that the last polytope that we only found on a sublattice contains 679 reflexive proper subpolytopes that were found before. In the same way all of the four-dimensional polyhedra that we have found so far form a connected web.

As we saw in the previous section, every three-dimensional reflexive polytope $\Delta^* \subset N_{\mathbb R}$ contains one of 16 r-minimal polyhedra as a subpolytope on the same lattice. Therefore, the fan of any toric ambient variety determined by a maximal triangulation of a reflexive polyhedron is a refinement of one of the corresponding 16 fans. In other words, any such toric ambient variety is given by the blow-up of one of the following 16 spaces (cf. Tables 2 and 3 and Proposition 2.2):
– $\mathbb P^3$,
– $\mathbb P^3/\mathbb Z_2$,
– 8 different weighted projective spaces $\mathbb P^2_{(q_1,q_2,q_3)}$,
– $\mathbb P^2 \times \mathbb P^1$,
– $\mathbb P^2_{(1,1,2)} \times \mathbb P^1$,
– 3 further double weighted spaces, and
– $\mathbb P^1 \times \mathbb P^1 \times \mathbb P^1$.
Each of the three spaces with ‘overlapping weights’ allows two distinct bundle structures: The first one can be interpreted as a $\mathbb P^2$ bundle in two distinct ways, the second one as a $\mathbb P^2$ bundle or a $\mathbb P^2_{(1,1,2)}$ bundle, and the third one can be interpreted as a $\mathbb P^2_{(1,1,2)}$ bundle in two distinct ways. In each case the base space is $\mathbb P^1$.

Let us end this section by briefly discussing a few of the most interesting objects in our lists. There are precisely two mirror pairs with Picard numbers 1 and 19, respectively. One of them is the quartic hypersurface in $\mathbb P^3$ with Picard number 1, together with its mirror of Picard number 19, which is also the model whose Newton polytope is the only reflexive polytope with only 5 lattice points. This model corresponds to a blow-up of a $\mathbb Z_4 \times \mathbb Z_4$ orbifold of $\mathbb P^3$. The blow-up of six fixed lines $z_i = z_j$ by three divisors each yields 18 exceptional divisors leading to the total Picard number of 19. The other mirror pair with Picard numbers 1 and 19 consists of the hypersurface in $\mathbb P^3_{(1,1,1,3)}$ of degree 6 and an orbifold of the same model, with Newton polyhedra with 39 and 6 points, respectively. This polyhedron is also one of the two ‘largest’ polyhedra in the sense that there is no reflexive polytope in three dimensions with more than 39 lattice points. The other polyhedron with the maximal number of 39 points is the Newton polytope of the hypersurface of degree 12 in $\mathbb P^3_{(1,1,4,6)}$. This model leads to the description of elliptically fibered K3 surfaces that is commonly used in F-theory applications [22, 29, 30], with the elliptic fiber embedded in a $\mathbb P^2_{(1,2,3)}$ by a Weierstrass equation. The mirror family of this class of
models can be obtained by forcing two $E_8$ singularities into the Weierstrass model and blowing them up. The resulting hypersurface also allows a different fibration structure which can develop an $SO(32)$ singularity; thereby this model is able to describe the F-theory duals of both the $E_8 \times E_8$ and the $SO(32)$ heterotic strings with unbroken gauge groups in 8 dimensions [57].

In four dimensions there is a unique ‘largest’ object, determined by the weight system (1, 1, 12, 28, 42)/84. It has the maximum number, namely 680, of lattice points and the corresponding Calabi–Yau threefold has the Hodge numbers $h_{11} = 11$ and $h_{12} = 491$. The latter is the largest single Hodge number in our list, and the value of $|\chi| = |2(h_{11} - h_{12})| = 960$ is also maximal, the only other object with the same values being the mirror. F-theory compactifications of the latter lead to the theories with the largest known gauge groups in six dimensions [57].

Another interesting object that we encountered is the 24-cell, a self dual polytope with 24 vertices, which leads to a self mirror Calabi–Yau manifold with Hodge numbers (20, 20). It has the maximal symmetry order $1152 = 24 \cdot 48$ among all 4-dimensional reflexive polytopes and arises as a subpolytope of the hypercube. It is a Platonic solid that contains the Archimedean cuboctahedron (with symmetry order 48) as a reflexive section through the origin parallel to one of its 24 bounding octahedra. Note that in our context symmetries are realised as lattice isomorphisms, i.e. as subgroups of $GL(n, \mathbb Z)$, and not as rotations.

The polytope with the largest order, namely 128, of $M_{\rm finest}/M_{\rm coarsest}$ is determined by the weight system (1, 1, 1, 1, 4)/8. For the Newton polytope of the quintic hypersurface in $\mathbb P^4$, this order is 125.
There is a well known $\mathbb Z_5$ orbifold of the quintic with Hodge numbers (1, 21) which is quite peculiar from the lattice point of view: Although the $N$ lattice is not the lattice $N_{\rm coarsest}$ generated by the vertices of $\Delta^*$, the only lattice points of $\Delta^*$ are its vertices and the IP. Thus it provides an example where the $N$ lattice is not even generated by the lattice points of $\Delta^*$. This can only happen in more than 3 dimensions: As a lattice triangle with 3 lattice points is always regular (i.e. it has the minimal volume 1 in lattice units) and there are no lattice hyperplanes between a facet and the IP because of reflexivity, the vertices of any triangle of a maximal triangulation of a 2-dimensional facet of a 3-dimensional polytope provide a lattice basis.
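The statement that a lattice triangle containing only its three vertices as lattice points has the minimal volume 1 in lattice units follows from Pick's theorem, and it can also be confirmed by brute force on a small grid. The sketch below does this; the grid size, the function names and the brute-force approach are choices of this illustration only.

```python
from itertools import combinations


def doubled_area(p, q, r):
    """Twice the Euclidean area of the triangle pqr, i.e. its volume in lattice units."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))


def contains(p, q, r, s):
    """True if the point s lies in the closed triangle pqr."""
    def side(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    signs = (side(p, q, s), side(q, r, s), side(r, p, s))
    return not (min(signs) < 0 and max(signs) > 0)


def empty_triangles(size):
    """Non-degenerate triangles in the grid [-size, size]^2 whose only lattice points are the vertices."""
    grid = [(x, y) for x in range(-size, size + 1) for y in range(-size, size + 1)]
    result = []
    for p, q, r in combinations(grid, 3):
        if doubled_area(p, q, r) == 0:
            continue
        if sum(contains(p, q, r, s) for s in grid) == 3:
            result.append((p, q, r))
    return result
```

Every triangle returned by `empty_triangles(2)` has `doubled_area` equal to 1, in agreement with the regularity claim used in the argument above.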
4. Fibrations In this section we want to discuss fibrations of hypersurfaces of holonomy SU (n−1) in n-dimensional toric varieties where the generic fiber is an (nf − 1)-dimensional variety of holonomy SU (nf − 1). In other words, it will apply to elliptic fibrations of K3 surfaces, CY threefolds, CY fourfolds, etc., to K3 fibrations of CY k-folds with k ≥ 3, to threefold fibrations of fourfolds, and so on. The main message is that the structures occurring in the fibration are reflected in structures in the N lattice: The fiber, being an algebraic subvariety of the whole space, is encoded by a polyhedron ∆∗f which is a subpolyhedron of ∆∗ , whereas the base, which is a
projection of the fibration along the fiber, can be seen by projecting the $N$ lattice along the linear space spanned by $\Delta^*_f$. We will first give a general discussion and then explain how descriptions in terms of CWS may be useful for identifying and/or encoding fibration structures.

4.1. Fibrations and reflexive polyhedra

Assume that $\Delta^*$ contains a lower-dimensional reflexive subpolyhedron $\Delta^*_f = (N_f)_{\mathbb R} \cap \Delta^*$ with the same interior point. This allows us to define a dual pair of exact sequences

$$0 \to N_f \to N \to N_b \to 0 \qquad (11)$$

and

$$0 \to M_b \to M \to M_f \to 0, \qquad (12)$$

and corresponding sequences for the underlying real vector spaces. We can convince ourselves that the image of $\Delta$ under $M_{\mathbb R} \to (M_f)_{\mathbb R}$ is dual to $\Delta^*_f$ in the following way: We choose a basis $e_j$, $j = 1, \dots, n$ of $N$ such that $N_f$ is generated by the $e_j$ with $1 \le j \le n_f$ and define $e^i$ to be the dual basis. Then

$$\Delta_f = \{(x_1, \dots, x_{n_f}) : \exists\, x_{n_f+1}, \dots, x_n\ \text{with}\ (x_1, \dots, x_n) \in \Delta\}, \qquad (13)$$

$$(\Delta^*)_f = \{(y_1, \dots, y_{n_f}) : (y_1, \dots, y_{n_f}, 0) \in \Delta^*\}, \qquad (14)$$
and the duality of these two polytopes is easily checked.

Let us also assume that the image $\Sigma_b$ of $\Sigma$ under $\pi : N \to N_b$ defines a fan in $N_b$. This is certainly not true for arbitrary triangulations of $\Delta^*$. Constructing fibrations, one should rather build a fan $\Sigma_b$ from the images of the one-dimensional cones in $\Sigma$ and try to construct a triangulation of $\Sigma$ and thereby of $\Delta^*$ that is compatible with the projection. It would be interesting to know whether this is always possible whenever the intersection of a reflexive polyhedron with a linear subspace of $N_{\mathbb R}$ is again reflexive.

The set of one-dimensional cones in $\Sigma_b$ is the set of images of one-dimensional cones in $\Sigma$ that do not lie in $N_f$. The image of a primitive generator $v_i$ of a cone in $\Sigma$ is the origin or a positive integer multiple of a primitive generator $\tilde v_j$ of a one-dimensional cone in $\Sigma_b$. Thus we can define a matrix $r_i^j$, most of whose elements are 0, through $\pi v_i = r_i^j \tilde v_j$ with $r_i^j \in \mathbb N$ if $\pi v_i$ lies in the one-dimensional cone defined by $\tilde v_j$ and $r_i^j = 0$ otherwise. Our base space is the multiply weighted space determined by

$$(\tilde z_1, \dots, \tilde z_{\tilde K}) \sim (\lambda^{\tilde w_j^1} \tilde z_1, \dots, \lambda^{\tilde w_j^{\tilde K}} \tilde z_{\tilde K}), \qquad j = 1, \dots, \tilde K - \tilde n, \qquad (15)$$

where $\tilde n = n - n_f$ and the $\tilde w_j^i$ are any integers such that $\sum_i \tilde w_j^i \tilde v_i = 0$. The projection map from $V_\Sigma$ (and, as we will see, from the Calabi–Yau hypersurface) to the base
is given by z˜i =
Y
ri
zj j .
(16)
j j
j
i
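The construction of the matrix $r$ and of the projection map (16) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the basis of $N$ is taken to be adapted to the fibration (so that $\pi$ drops the first $n_f$ coordinates), the function names `primitive` and `base_rays_and_r` are hypothetical, and the list of rays is a made-up two-dimensional example rather than data from this paper.

```python
from functools import reduce
from math import gcd


def primitive(w):
    """Split a nonzero lattice vector into (primitive generator, multiplicity)."""
    g = reduce(gcd, (abs(c) for c in w))
    return tuple(c // g for c in w), g


def base_rays_and_r(rays, n_f):
    """Return the primitive generators of Sigma_b and the matrix r.

    Assumption: N_f is spanned by the first n_f basis vectors, so the
    projection pi simply drops the first n_f coordinates.
    r[i][j] is the multiplicity in pi(v_i) = r[i][j] * base[j].
    """
    images = [tuple(v[n_f:]) for v in rays]
    base = []
    for w in images:
        if any(w):  # rays inside N_f project to the origin
            p, _ = primitive(w)
            if p not in base:
                base.append(p)
    r = []
    for w in images:
        row = [0] * len(base)
        if any(w):
            p, g = primitive(w)
            row[base.index(p)] = g
        r.append(row)
    return base, r


# Hypothetical example: the first coordinate is the fiber direction.
rays = [(1, 0), (-1, 0), (0, 1), (1, -1), (1, -2)]
base, r = base_rays_and_r(rays, n_f=1)

# Exponents of the projection map: tilde z_j = prod_i z_i ** r[i][j].
exponents = [[r[i][j] for i in range(len(rays))] for j in range(len(base))]
print(base)
print(exponents)
```

In this toy example the last two rays are mapped to the same base ray with multiplicities 1 and 2, so the second base coordinate is the product of $z_4$ and $z_5^2$.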
This is well defined: $z_j \to \lambda^{w_k^j} z_j$ leads to $\tilde z_i \to \lambda^{w_k^j r_j^i} \tilde z_i$, which is among the good equivalence relations because applying $\pi$ to $\sum_j w_k^j v_j = 0$ gives $\sum_j w_k^j r_j^i \tilde v_i = 0$. A generic point in the base space will have $\tilde z_i \neq 0$ for all $i$, implying $z_i \neq 0$ for all $v_i \notin \Delta^*_f$. The choice of a specific point in $V_{\Sigma_b}$ and the use of all equivalence relations except for those involving only $v_i \in \Delta^*_f$ allows us to fix all $z_i$ except for those corresponding to $v_i \in \Delta^*_f$. Thus the preimage of a generic point in $V_{\Sigma_b}$ is indeed a variety in the moduli space determined by $\Delta^*_f$.

What we have seen so far is just that $V_\Sigma$ is a fibration over $V_{\Sigma_b}$ with generic fiber $V_{\Sigma_f}$ (this is actually the statement of an exercise in [52, p. 41]) and how this fibration structure manifests itself in terms of homogeneous coordinates. Now we also want to see how this can be extended to hypersurfaces. To this end note that if $v_k \in \Delta^*_f$ then $\langle v_k, x \rangle$ only depends on the equivalence class $[x] \in M_f$ of $x$ under

$$x \sim y \quad \text{if} \quad x - y \in M_b. \tag{17}$$

Thus we may rewrite Eq. (8) as

$$p = \sum_{[x] \in \Delta_f \cap M_f} a'_{[x]} \prod_{v_k \in \Delta^*_f} z_k^{\langle v_k, [x] \rangle + 1} \quad \text{with} \quad a'_{[x]} = \sum_{x \in [x]} a_x \prod_{v_k \notin \Delta^*_f} z_k^{\langle v_k, x \rangle + 1}. \tag{18}$$
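The regrouping of monomials in Eq. (18) amounts to collecting the lattice points of $\Delta$ into classes modulo $M_b$. A minimal sketch, assuming the adapted dual basis of Sec. 4.1 (so that the class $[x] \in M_f$ is given by the first $n_f$ coordinates of $x$), could look as follows; the function name and the square used as input are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def group_by_fiber_class(points, n_f):
    """Group lattice points x of Delta by their class [x] in M_f.

    Assumption: in the adapted dual basis, the map M -> M_f keeps the
    first n_f coordinates, and M_b consists of the points whose first
    n_f coordinates vanish.
    """
    classes = defaultdict(list)
    for x in points:
        classes[tuple(x[:n_f])].append(tuple(x))
    return dict(classes)


# Hypothetical example: the nine lattice points of the square [-1, 1]^2,
# with the first coordinate playing the role of the fiber direction.
square = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
classes = group_by_fiber_class(square, n_f=1)

for cls, members in sorted(classes.items()):
    # Each class contributes one fiber monomial; its coefficient a'_[x]
    # is a sum of len(members) monomials in the remaining coordinates.
    print(cls, len(members))
```

Each key of `classes` corresponds to one term of the outer sum in (18), and the points listed under it are the terms of the inner sum defining $a'_{[x]}$.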
In each coordinate patch for $V_{\Sigma_b}$ this is just an equation for the fiber with coefficients that are polynomial functions of coordinates of the base space.

Whenever a one-dimensional cone (with primitive generator $\tilde v_i$) in $\Sigma_b$ is the image of more than one one-dimensional cone in $\Sigma$, the fiber becomes reducible over the divisor $\tilde z_i = 0$ determined by $\tilde v_i$. Different components of the fiber correspond to different equations $z_j = 0$ with $\pi v_j = r_j^i \tilde v_i$. The intersection patterns of the different components of the reducible fibers are crucial for understanding enhanced gauge symmetries in type IIA string theory [58, 56] and F-theory [22, 29, 30]. Blowing down the corresponding subvarieties (and hence making the Calabi–Yau space itself singular) leads to the appearance of non-perturbative enhanced gauge groups whose Lie algebras are determined by the intersection patterns of the components of the fibers. In terms of the N lattice, the occurrence of enhanced gauge groups can be easily inferred by studying the preimage of a one-dimensional cone in $\Sigma_b$. In particular, as noted by Candelas and Font [59], under favourable circumstances the Dynkin diagram of the corresponding Lie algebra can be seen directly in the toric diagram in the N lattice.

4.2. Fibrations and weight systems

As for polyhedra, weight systems provide a very useful and economic tool for constructing and describing fibrations. We only consider toric CY fibrations which
require a reflexive section through the origin of $\Delta^* \subset N_{\mathbb R}$ whose dimension is equal to $n_f$ for $(n_f - 1)$-dimensional CY fibers. In the M lattice this corresponds to a projection onto the dual reflexive polyhedron along an $(n - n_f)$-dimensional subspace. Hence, on either side, we need to specify a linear subspace $(N_f)_{\mathbb R} \subset N_{\mathbb R}$ or $(M_b)_{\mathbb R} \subset M_{\mathbb R}$. This can be done, for example, by singling out vectors that span the subspace or by representing the subspace as an intersection of hyperplanes.

If the polyhedron is given in terms of some CWS it is natural to try to specify this linear subspace by using some part of the weight information. Note that a lower index $i$ in a CWS $n_i^{(j)}$ corresponds to a vertex of the minimal polytope $\nabla$, or, by duality, to a bounding hyperplane in the M lattice (which, if $\Delta$ is embedded into $\mathbb R^k$, is the intersection of the coordinate hyperplane $a_i = 0$, or $x_i = -1$, with the affine or linear subspace that supports $\Delta$). An upper index $j$ corresponds to a simplex in the CWS and, hence, to the linear subspace spanned by that simplex. Actually this can be regarded as a special case of the former correspondence, since the subspace that is spanned by a simplex $S_j$ is generated by those vectors of $\nabla$ for which $n_i^{(j)} \neq 0$.

The simplest case is therefore the situation where a subset of the IP simplex structure of the minimal polytope $\nabla$ provides a weight system for $\Delta_f$. In turn, we can engineer fibrations where the fiber corresponds to a certain weight system if we start by generating combined weight systems $q$ with a given subsystem $q_f$. As usual, one has to check that $\Delta(q)$ is reflexive, which in up to 4 dimensions is equivalent to the IP property (in the case of Calabi–Yau 4-fold CWS with $\Delta(q)$ IP but not reflexive we can proceed with reflexive subpolytopes $\Delta' \subset \Delta(q)$). Because of the additional equations that come from the extended CWS $q$, the set of solutions to Eq. (1) is smaller than that for $q_f$ when we disregard the additional coordinates. Hence the intersection of $\Delta^*$ with the subspace spanned by the vertices of $\nabla_f$ contains $\Delta(q_f)^*$, but may be larger. We therefore obtain a fibration if the resulting $n_f$-dimensional polytope is reflexive.^b

In the case of elliptic fibrations this is always true: We want to show that $\Delta^*_f = \Delta^* \cap (N_f)_{\mathbb R}$ is reflexive. As we saw in Sec. 4.1, $\Delta_f$ can be identified with the image of $\Delta$ under $M_{\mathbb R} \to (M_f)_{\mathbb R}$. As the image of $0_M$ under $M \to M_f$ is an IP of $\Delta_f$, $\Delta_f$ has at least one IP. Because of the well known fact that a two-dimensional lattice polytope is reflexive if it has precisely one IP, all that is left to show is that $\Delta_f$ cannot have more than one IP. As $\Delta^*_f \supseteq \nabla_f$ implies $\Delta_f \subseteq \nabla^*_f$ and $\nabla_f$ as a lattice polytope with a single IP is reflexive, $\nabla^*_f$ and hence $\Delta_f$ has precisely one IP, i.e. $\Delta_f$ is indeed reflexive. To obtain elliptic fibrations in Weierstrass form, as they are mostly used in F-theory compactifications [22, 29, 30], we thus only need to take the weight system $q_f = (1, 2, 3)/6$ as the fiber part of a combined weight system and check for reflexivity of $\Delta(q)$.

^b If this is not the case we could proceed by dropping vertices of $\Delta$ and trying to find a larger reflexive section, but this soon becomes ugly and in view of the abundance of weight systems it is hardly worth the effort.
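As a small illustration of how the lattice points of $\Delta(q)$ can be enumerated for a single weight system, the following Python sketch lists all nonnegative integer solutions of $\sum_i n_i a_i = d$. It assumes that Eq. (1), which is stated earlier in the paper and not reproduced here, takes this form for a single weight system; the function name `lattice_points` is hypothetical.

```python
def lattice_points(weights, d):
    """All nonnegative integer tuples a with sum(n_i * a_i) == d.

    Assumption: this is the single-weight-system form of Eq. (1);
    the shifted coordinates x_i = a_i - 1 then satisfy sum(n_i * x_i) = 0
    whenever d equals the sum of the weights.
    """
    if not weights:
        return [()] if d == 0 else []
    n, rest = weights[0], weights[1:]
    points = []
    for a in range(d // n + 1):
        for tail in lattice_points(rest, d - n * a):
            points.append((a,) + tail)
    return points


# Fiber weight system q_f = (1, 2, 3)/6 of the Weierstrass form.
fiber_points = lattice_points((1, 2, 3), 6)
print(len(fiber_points))

# Point counts for the first three K3 weight systems of Table 3.
for weights, d in [((1, 1, 1, 1), 4), ((1, 1, 1, 2), 5), ((1, 1, 1, 3), 6)]:
    print(weights, d, len(lattice_points(weights, d)))
```

For the first three K3 weight systems the counts agree with the point numbers 35, 34 and 39 listed for $\Delta$ in Table 3, which is a useful sanity check on the assumed form of the equation. Checking the IP property or reflexivity requires a convex hull computation and is not attempted here.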
It is no surprise that the situation we described does not cover the general case: Given an IP simplex structure of a minimal polytope $\nabla_f \subseteq \Delta^*_f$ it is not always possible to extend it to a simplex structure for a minimal polytope $\nabla \subseteq \Delta^*$. As a simple example in 4 dimensions we consider the points $V_1, \ldots, V_6 \in N$ with coordinates given by the columns of the matrix

$$(V_1, \ldots, V_6) = \begin{pmatrix} -1 & 1 & 1 & -1 & -1 & -1 \\ 0 & 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{pmatrix}. \tag{19}$$
The first three points provide a weight system (2, 1, 1) for an elliptic fiber such that $\Delta^*_f$ is supported by the 1–2 plane. $V_1$ is contained in the convex hull of $V_2, \ldots, V_6$, which is a minimal polytope $\nabla$ as defined in Sec. 2.1. The weight system for $\nabla$ is $q = (2, 2, 2, 1, 1)/8$ and we cannot use $(2, 1, 1)/4$ as part of a CWS corresponding to a minimal polyhedron $\nabla$ with an IP simplex structure in the sense of Sec. 2. In this situation it makes sense to use generalised IP simplex structures where the vertices of the IP simplices are lattice points (but not necessarily vertices) of $\Delta^*$ and we do not insist on the non-redundancy implied by our original definition of an IP simplex structure.

Having made this point we may use the linear relations among the vertices of the simplices $(V_1, V_2, V_3)$ and $(V_2, V_3, V_4, V_5, V_6)$ to arrive at the CWS $q^{(1)} = (2, 1, 1, 0, 0, 0)/4$ and $q^{(2)} = (0, 2, 2, 2, 1, 1)/8$. A CWS of this type was not considered in our classification scheme because $V_1 = -(V_2 + V_3)/2$ is redundant when combined with the vertices that correspond to $q^{(2)}$. It does, however, lead to a perfectly sensible system of equations (1), the convex hull of whose solutions is $\nabla^*$ (in our example all polytopes are simplices). Actually, for $q = (2, 2, 2, 1, 1)/8$ we find that $(\Delta(q))^*$ has the seven lattice points $V_2, \ldots, V_6$, 0 and $(-1, 0, -1, 0)^T$, but no reflexive subpolytope. The CWS $\{q^{(1)}, q^{(2)}\}$, on the other hand, leads to a polytope $(\Delta(q^{(1)}, q^{(2)}))^*$ with nine lattice points and the reflexive subpolytope that we started with: The addition of $V_1$ refines the lattice generated by the vertices of $\nabla$ in such a way that the convex hull of $V_2, \ldots, V_6$ on the finer N lattice now contains the additional lattice points $V_1$ and $-V_1$. As an aside we thus observe that a CWS corresponding to a generalised IP simplex structure may also be used to encode certain sublattices of $M_{\rm finest}$.
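The linear relations used in this example are easy to check by hand, or with a few lines of Python. The sketch below assumes that the columns of the matrix in Eq. (19) read $V_1 = (-1,0,0,0)^T$, $V_2 = (1,1,0,0)^T$, $V_3 = (1,-1,0,0)^T$, $V_4 = (-1,0,1,0)^T$, $V_5 = (-1,0,-1,1)^T$ and $V_6 = (-1,0,-1,-1)^T$; the helper `combination` is a hypothetical name introduced only for this check.

```python
# Columns of the matrix in Eq. (19), as read off under the assumption above.
V = {
    1: (-1, 0, 0, 0),
    2: (1, 1, 0, 0),
    3: (1, -1, 0, 0),
    4: (-1, 0, 1, 0),
    5: (-1, 0, -1, 1),
    6: (-1, 0, -1, -1),
}


def combination(coeffs):
    """Return sum_i coeffs[i-1] * V_i for coeffs = (c_1, ..., c_6)."""
    return tuple(
        sum(c * V[i + 1][k] for i, c in enumerate(coeffs)) for k in range(4)
    )


# Numerators of the two weight vectors of the CWS.
q1 = (2, 1, 1, 0, 0, 0)
q2 = (0, 2, 2, 2, 1, 1)
print(combination(q1))  # relation among V1, V2, V3
print(combination(q2))  # relation among V2, ..., V6

# 4*V1 = 2*V4 + V5 + V6, so V1 is a convex combination of V4, V5, V6.
print(combination((4, 0, 0, -2, -1, -1)))

# The extra lattice point (-1, 0, -1, 0) equals -(V2 + V3 + V4) = (V5 + V6)/2.
print(combination((0, -1, -1, -1, 0, 0)))
```

Both weight vectors annihilate the corresponding points, and the third combination shows explicitly why $V_1$ lies in the convex hull of $V_2, \ldots, V_6$.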
Probably most polytopes can be directly specified by using a generalised IP simplex structure and the corresponding CWS. A counterexample is given by the Z5 quotient of the quintic at the end of Sec. 3, where the N lattice is not generated by ∆∗ ∩ N . But in practice such a representation is only useful if the number of equations is small. In any case combined weight systems provide a simple construction for toric fibrations and can always be used to specify reflexive sections.
4.3. Toric fibrations and weighted projective spaces

There is a more subtle way in which a fibration structure can be encoded in a weight system. It only works for codimension 1 fibers, but it is quite interesting for historical and practical reasons. When string dualities led to interest in K3 fibrations, the first examples were constructed in the context of weighted projective spaces [60, 61]. It turned out that these examples are indeed special cases of toric fibrations in the sense that they correspond to reflexive projections of Newton polyhedra of transversal hypersurfaces in weighted $\mathbb P^3$ or $\mathbb P^4$. Actually, reflexive objects of codimension 1 were first observed on these Newton polyhedra, either as reflexive facets or as reflexive sections through the IP in the M lattice [59]. Since what we really need for a toric fibration is a reflexive section in $N_{\mathbb R}$, the question arises whether there is a reflexive projection of $\Delta$ onto one of its facets. A simple necessary condition for this is provided by the following observations. We work with the embedding space of Lemma 2.4 and do not distinguish between objects in $M_{\mathbb R}$ and their images under the embedding map.

Lemma 4.1.
(a) For a polytope $\Delta$ defined by a weight system $n_i$ only facets that are supported by a lattice hyperplane $x_l = -1$ can have interior points.
(b) If $y = (y_1, \ldots, y_k) \in M$ has $y_l = -1$, $y_i \ge 0$ for $i \neq l$, the map $\pi_y : M_{\mathbb R} \to M_{\mathbb R}$, $\pi_y x = x + (x_l + 1) y$ has the following properties: It is a projection to the affine subspace $x_l = -1$, i.e. $\pi_y^2 = \pi_y$ and $\pi_y M_{\mathbb R} = M_{\mathbb R} \cap \{x_l = -1\}$. It respects the lattice structure, i.e. if $x \in M$ then $\pi_y x \in M \cap \{x_l = -1\}$. The image of $\Delta$ is the corresponding facet of $\Delta$, i.e. $\pi_y \Delta = \Delta \cap \{x_l = -1\}$.
(c) There is a one-to-one correspondence between maps $\pi$ with the same properties as in (b) such that 0 gets mapped to an IP of the facet with $x_l = -1$ and partitions of the weight $n_l$ by the remaining weights, i.e. $n_l = \sum_{i \neq l} y_i n_i$ where the $y_i$ are nonnegative integers.

Proof.
(a) If an interior point of a facet is not on some hyperplane $x_l = -1$ all $x_i$ must be nonnegative, but this is only possible for the interior point of $\Delta$.
(b) The first two statements follow directly from the definition of $\pi_y$. $\pi_y \Delta \supseteq \Delta \cap \{x_l = -1\}$ follows from the fact that $\pi_y$ is a projection. For $\pi_y \Delta \subseteq \Delta \cap \{x_l = -1\}$ we note that $\Delta \cap \{x_l = -1\}$ is the convex hull of the lattice points in $\{x_l = -1\}$ with $x_i \ge -1$ and that every vertex of $\Delta$ gets mapped to such a lattice point.
(c) If $\pi$ is such a map, then we choose $y$ to be the IP of the facet to which 0 is mapped (this implies $y_i \ge 0$ for $i \neq l$), and the partition of $n_l$ follows from the fact that $\sum y_i n_i = 0$. Conversely, if $y$ is defined by such a partition we have to
show that it is interior to the facet. This follows from the facts that 0 is interior to $\Delta$ and $y = \pi_y 0$.

A necessary condition for the existence of a reflexive projection of $\Delta$ onto one of its facets is therefore that one of the weights $n_i$ has a unique partition in terms of the other weights. Using this criterion we found all such projections for single weight systems with $k \le 5$ by first searching for weights with unique partitions and then checking reflexivity of the corresponding facets. The results are given in Table 3 for the case of elliptic K3 surfaces and they are available on our web page [34] for K3-fibered Calabi–Yau manifolds (cf. Table 4).

We can find a set of generators for $(N_f)_{\mathbb R}$ by solving the equation $\langle V, y \rangle = 0$ for a general linear combination $V = \sum c_j V_j$ of the vertices $V_j$ of $\nabla$. Since $\langle V_i, y \rangle = y_i$ we obtain the solutions $V_i' = V_i + y_i V_l$ for $i \neq l$. The linear relations among the $V_i'$ are given by the corresponding subset of the original weights. In general they do not provide a weight system for the fiber because the points $V_i'$ need not belong to $\Delta^*$. This is easy to see for the class of weights $(1, 1, 2n_3, 2n_4, 2n_5)$ that was considered by Klemm, Lerche and Mayr [60]. Here $V_2' = V_1 + V_2$ and $\langle V_2', x \rangle = x_1 + x_2$ for $x \in M$ with coordinates $x_i$. But $\sum x_i n_i = 0$ implies that $x_1 + x_2$ is even, so $V_2'$ is not a primitive lattice vector in $N$ and can be divided by 2, which leads to the weight system $(1, n_3, n_4, n_5)$ for the K3 fiber. The slightly more complicated example $(8, 4, 3, 27, 42)/84$ was given by Hosono, Lian and Yau [61]. The first weight has a unique partition with $y_2 = 2$ and $y_3 = y_4 = y_5 = 0$, so that $(N_f)_{\mathbb R}$ is spanned by $V_2' = V_2 + 2V_1$ and $V_i$ with $i > 2$. This time $8x_1 + 4x_2 + 3x_3 + 27x_4 + 42x_5 = 0$ implies that $x_2 + 2x_1$ is a multiple of 3 and the primitive lattice vector $V_2'/3 \in \Delta^*$ leads to the weight system $(4, 1, 9, 14)$ for the fiber, which agrees with the normalised weights for the fiber given in [61].
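The unique-partition criterion and the projection $\pi_y$ of Lemma 4.1 can be tried out on the example $(8, 4, 3, 27, 42)/84$. The following Python sketch is only an illustration: the function names are hypothetical, the sample lattice point is chosen by hand, and the divisor 3 of $V_2'$ is taken from the text rather than computed.

```python
from functools import reduce
from math import gcd


def partitions(target, weights):
    """All nonnegative integer tuples y with sum(y_i * weights[i]) == target."""
    if not weights:
        return [()] if target == 0 else []
    n, rest = weights[0], weights[1:]
    found = []
    for y in range(target // n + 1):
        for tail in partitions(target - y * n, rest):
            found.append((y,) + tail)
    return found


def project(x, y, l):
    """pi_y x = x + (x_l + 1) y, the map of Lemma 4.1(b)."""
    return tuple(xi + (x[l] + 1) * yi for xi, yi in zip(x, y))


weights = (8, 4, 3, 27, 42)
l = 0  # index of the weight n_l = 8 that is to be partitioned

parts = partitions(weights[l], weights[1:])
print(parts)  # a single entry means the partition is unique

# Build y in M: y_l = -1, the other entries come from the partition.
y = (-1,) + parts[0]

# A hand-picked lattice point x with sum(n_i * x_i) = 0.
x = (2, -1, -4, 0, 0)
px = project(x, y, l)
print(px, project(px, y, l))  # the second application changes nothing

# Fiber weights: 4*V2' + 3*V3 + 27*V4 + 42*V5 = 0 with V2' = V2 + 2*V1;
# the text states that V2'/3 is primitive, so V2' = 3 * (V2'/3).
divisor = 3  # taken from the text, not derived here
raw = (weights[1] * divisor,) + weights[2:]
common = reduce(gcd, raw)
fiber_weights = tuple(n // common for n in raw)
print(fiber_weights)
```

The partition found is $y_2 = 2$, $y_3 = y_4 = y_5 = 0$, and the resulting fiber weights are $(4, 1, 9, 14)$, as in the text.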
If more of the coefficients $y_i$ are nonvanishing, it is, of course, still possible to compute a weight system for the fiber, but this gets more tedious and we would also lose the direct connection with the original weights or we would have to introduce many redundant coordinates in a CWS.

Another strategy for identifying reflexive projections of $\Delta(q)$ that can be used in the codimension 1 case follows from the fact that such a projection either must be along a line parallel to a facet or onto that facet whenever a facet has an interior point. If the number of facets with interior points is large enough this allows us to find all reflexive projections. The result of this analysis is indicated in the next-to-last column of Tables 3 and 4. The K3 surfaces in Table 2 are all elliptic, since their combined weight systems contain 2-dimensional subsystems.

With our strategies to identify reflexive projections (onto facets) we generalised the results of [60, 61] and extended the scope from the transversal case to the complete list of 184026 IP weight systems, where we could identify 124701 fibrations. The efficiency of our approach can be inferred from the fact that we found 5370 fibrations for the 7555 transversal cases, whereas only 628 fibrations yielded to the methods of [61].
Acknowledgements

M.K. is partly supported by the Austrian Research Funds FWF grant Nr. P11582-PHY. The research of H.S. is supported by the European Union TMR project ERBFMRX-CT-96-0045.

Appendix: Various Tables

Table 1. IP simplex structures and numbers of corresponding IP CWS for n = 4 (for n = 3, see Lemma 2.3).

IP simplex structure                                            Total    Span    lp-min.  r-min.
S1 = V1 V2 V3 V4 V5                                            184026   38730    16437     206
S1 = V1 V2 V3 V4, S2 = V1 V2 V3' V4'                            16040    6365      143      51
S1 = V1 V2 V3 V4, S2 = V1 V2' V3'                                1122     727       40      29
S1 = V1 V2 V3, S2 = V1' V2' V3'                                     6       6        3       3
S1 = V1 V2 V3, S2 = V1 V2' V3', S3 = V1 V2'' V3''                  36      36        4       4
S1, ..., S(m-1) as for n = 3, Sm = V1^(m) V2^(m)                  116      79       19      15
Total                                                          201346   45943    16646     308
Table 2. IP CWS for n = 3. The columns indicate the minimality type (‘s’ for span, ‘l’ for lp-minimality and ‘r’ for r-minimality) and point and vertex numbers for ∆ and ∆∗ . As r-minimality implies lp-minimality and the latter implies the span property for n = 3, we have given only the strongest statement in each case. d n1 n2 n3 n4 n5 3 3
1 1
1 0
1 0
0 1
0 1
3 4 3 4
1 2 1 1
1 0 1 0
1 0 1 0
0 1 0 2
0 1 0 1
3 6
1 3
1 0
1 0
0 2
0 1
3 6 3 6
1 2 1 1
1 0 1 0
1 0 1 0
0 3 0 3
0 1 0 2
4 4 4 4
2 2 2 1
1 0 1 0
1 0 1 0
0 1 0 2
0 1 0 1
P
V
P¯ V¯
d n1 n2 n3 n4 n5 n6
r 30
5
6
4 6
1 2
2 0
1 0
0 3
0 1
1 1 3 3
2 0 2 0
1 0 1 0
0 3 0 2
0 2 0 1
5
r 31
6
7
5
s 23
7
8
6
4 6 6 6
5
6 6
2 2
3 0
1 0
0 3
0 1
6 6 6 6
2 1 1 1
3 0 3 0
1 0 2 0
0 3 0 3
0 2 0 2
3 2 4 2
1 0 2 0
1 0 1 0
1 0 1 0
0 1 0 1
0 1 0 1
s 24
6
9
s 21
5
9
5
s 14
7
11
6
r 35
5
7
5
s 23
6
9
5
P
V
P¯ V¯
s 16
6
14
6
s 12
6
14
6
s 21
5
12
5
s 15
5
15
5
s 10
6
20
6
s
9
5
18
5
r 30
6
6
5
r 27
6
7
5
Table 2.
d n1
(Continued)
n2
n3
n4
n5
P
V
P¯
V¯
d n1
s 27
5
9
5
6 2
s 19
5
9
5
s 18
6
12
5
2 2 2
4 6
2 3
1 0
1 0
0 2
0 1
4 4 4 6
1 1 1 3
2 0 2 0
1 0 1 0
0 2 0 2
0 1 0 1
n2
n3
n4
n5
3 0
2 0
1 0
0 1
0 1
1 0 0
1 0 0
0 1 0
0 1 0
0 0 1
P
V
P¯
V¯
s 21
6
9
5
r 27
8
7
6
n6
0 0 1
Table 3. The 95 K3 weight systems: r, l, s denote the minimality type as in Table 2, Π is the number of reflexive projections (if known) and F denotes the number of reflexive projections onto facets. The corresponding weights with unique partitions are indicated with bold face. d
n1 n2 n3 n4
4 5 6 6 7 8 8 9 9 10 10 10 11 12 12 12 12 12 12 13 14 14 14 15 15 15 15 15 16 16 16
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 2 2 1 1 1 2 3 1 1 1
1 1 1 1 1 2 1 2 1 2 2 1 2 2 2 3 3 1 2 3 2 3 2 2 3 3 3 3 2 3 4
1 1 1 2 2 2 2 3 3 2 3 3 3 3 4 4 3 4 3 4 4 4 3 5 4 5 5 4 5 4 5
1 2 3 2 3 3 4 3 4 5 4 5 5 6 5 4 4 6 5 5 7 5 7 7 7 6 5 5 8 8 6
r r r r r s r s r s s r l s s s s r s l s s s s s s s s s s s
P
V
P¯ V¯ Π F
d
35 34 39 30 31 24 35 23 33 28 23 36 24 27 24 21 15 39 17 20 27 13 19 26 22 21 14 12 28 24 19
4 6 4 4 7 6 4 6 5 4 7 5 8 4 5 4 4 4 5 7 5 7 5 6 6 5 6 5 5 5 6
5 6 6 6 8 8 7 8 9 8 11 9 13 9 12 9 9 9 11 15 12 16 11 17 17 15 11 12 14 12 17
21 22 22 22 24 24 24 24 24 24 24 24 25 26 26 26 27 27 28 28 28 30 30 30 30 30 30 30 32 32 33
4 5 4 4 6 5 4 5 5 4 6 5 7 4 5 4 4 4 5 7 5 7 5 6 6 5 5 5 5 5 6
0 0 0 1 1 0 1 1 1 0 1 1 1 1 1 1 2 1 1 1 1 3 1 1 1 1 1 1 1 1 1
0 0 0 1 1 0 1 1 1 0 1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 0 1 1 1 1
n1 n2 n3 n4 3 1 1 2 1 1 2 2 3 3 3 4 4 1 2 2 2 5 1 3 4 1 1 2 2 3 4 5 2 4 3
5 3 4 4 3 6 3 3 4 4 6 5 5 5 3 5 5 6 4 4 6 4 6 3 6 4 5 6 5 5 5
6 7 6 5 8 8 8 7 5 7 7 6 7 7 8 6 9 7 9 7 7 10 8 10 7 10 6 8 9 7 11
7 11 11 11 12 9 11 12 12 10 8 9 9 13 13 13 11 9 14 14 11 15 15 15 15 13 15 11 16 16 14
– – s – s s – s s s – s – – – – – – – s – s s s – – s – – – –
P V
P¯ V¯ Π F
9 25 22 14 27 18 15 16 12 10 9 8 7 21 16 13 11 6 24 12 7 25 21 18 13 10 10 6 13 9 9
21 20 20 19 15 24 27 20 18 26 21 26 32 24 23 23 32 30 24 18 35 20 24 18 23 35 20 39 29 27 39
5 5 6 5 4 5 4 5 5 5 4 5 5 5 5 5 6 5 4 5 4 5 5 4 4 5 5 4 5 5 4
5 5 6 5 4 5 4 5 5 6 4 6 6 5 5 5 6 6 4 5 4 5 5 4 4 5 5 4 5 5 4
? 1 1 1 1 1 1 2 ? 2 ? ? ? 1 1 2 2 ? 1 2 ? 1 1 1 1 2 ? ? 2 ? 2
1 1 1 1 1 1 0 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 0 1 1 0 1 1 0 1
Table 3. d
n1 n2 n3 n4
16 17 18 18 18 18 18 18 19 20 20 20 20 20 21 21 21
2 2 1 1 1 2 2 3 3 1 2 2 2 3 1 1 2
3 3 2 3 4 3 3 4 4 4 3 5 4 4 3 5 3
4 5 6 5 6 4 5 5 5 5 5 6 5 5 7 7 7
7 7 9 9 7 9 8 6 7 10 10 7 9 8 10 8 9
s l s s s s s s l s s – – s – – s
(Continued)
P V
P¯ V¯ Π F
d
14 13 30 24 19 16 14 10 9 23 16 11 13 10 24 18 14
18 20 12 15 20 14 20 17 24 13 14 23 23 22 24 24 23
34 34 36 36 36 38 38 40 42 42 42 44 48 50 54 66
6 8 4 5 6 5 6 6 7 4 5 5 4 6 4 5 6
6 8 4 5 6 5 6 6 8 4 5 5 4 6 4 5 6
2 2 1 1 1 2 2 ? ? 1 2 3 1 ? 1 1 2
1 1 1 1 1 1 1 1 1 1 1 2 1 0 1 1 1
n1 n2 n3 n4 3 4 1 3 7 3 5 5 1 2 3 4 3 7 4 5
4 6 5 4 8 5 6 7 6 5 4 5 5 8 5 6
10 7 12 11 9 11 8 8 14 14 14 13 16 10 18 22
17 17 18 18 12 19 19 20 21 21 21 22 24 25 27 33
– – – – – – – – s – s – – – – –
P V
P¯ V¯ Π F
11 8 24 12 5 10 7 8 24 15 13 9 12 6 10 9
31 31 24 30 35 35 35 28 24 27 26 39 30 39 35 39
6 5 4 4 4 5 5 4 4 4 5 4 4 4 5 4
6 5 4 4 4 5 5 4 4 4 5 4 4 4 5 4
2 ? 1 2 ? 2 ? ? 1 1 2 2 2 ? 2 2
1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 1
Table 4. Examples from our list of 184026 IP weights [34] with various data including Hodge numbers, point and vertex numbers, and numbers of reflexive projections (onto facets). T indicates transversality and M denotes the minimality type. n3
n4
n5
TM
h11
h12
P
V
P¯
V¯
Π
F
4
5
14
21
−ls
26
39
54
18
35
15
?
0
8 8
10 11
19 26
25 45
−ls −ls
59 63
10 15
16 24
13 15
75 71
21 21
? ?
0 1
d
n1
n2
47
3
69 97
7 7
84
1
1
12
28
42
Tr
11
491
680
5
26
5
1
1
280
7
19
40
87
127
−−
491
11
26
5
680
5
2
1
24 26 33 36
3 3 3 3
4 4 6 6
5 5 6 6
6 7 7 10
6 7 11 11
Ts −ls −− T−
10 22 19 19
34 22 37 49
36 31 34 38
8 13 7 7
12 21 22 22
7 10 6 6
? ? ? ?
0 0 0 0
26 36 39 52
3 5 3 4
4 7 6 6
5 7 9 8
6 8 10 11
8 9 11 23
−s −− Ts T−
14 30 17 29
24 12 41 33
32 19 33 34
14 10 12 9
19 28 22 36
10 9 13 8
? ? ? ?
1 1 1 1
34 44 55 63
3 4 3 7
6 8 10 9
7 9 13 14
8 10 14 15
10 13 15 18
−s −− Ts T−
18 29 28 44
20 17 16 8
27 22 23 15
13 9 12 6
23 31 35 37
12 9 14 6
? ? ? ?
2 2 2 2
5 10
1 1
1 1
1 1
1 3
1 4
Tr −r
1 4
101 126
126 165
5 10
6 9
5 7
0 0
0 0
Table 4.
(Continued) P¯
V¯
Π
F
7 9
15 19
7 7
0 0
0 0
51 50 45 60
10 8 7 7
14 11 25 25
8 6 6 6
1 1 1 1
0 0 0 0
86 108 23 18
105 141 33 25
5 12 11 7
7 11 30 20
5 8 8 7
1 1 1 1
1 1 1 1
11 14 31 26
33 44 31 38
43 56 33 39
14 9 6 7
14 13 29 29
9 7 6 6
2 2 2 2
1 1 1 1
Tr −ls −− T−
5 11 33 26
51 39 21 28
57 51 33 42
10 14 6 7
10 16 25 25
7 9 6 6
2 2 2 2
2 2 2 2
−s Ts −− T−
18 23 28 35
20 23 16 11
27 26 23 19
10 6 7 6
18 16 25 23
8 6 6 6
3 3 3 3
2 2 2 2
d
n1
n2
n3
n4
n5
TM
h11
h12
P
25 26
1 1
5 5
5 5
6 7
8 8
T− −l
17 19
49 49
65 65
20 20 30 36
2 2 2 2
3 3 5 5
4 5 6 6
4 5 6 6
7 5 11 17
−s Ts −− T−
13 6 27 24
45 48 39 54
8 13 35 40
1 1 2 4
1 1 7 5
2 2 8 9
2 4 9 10
2 5 9 12
Tr −r −− T−
2 6 35 22
19 27 36 40
2 2 4 4
3 3 4 4
4 4 6 6
5 9 9 9
5 9 13 17
−ls Ts −− T−
14 19 30 35
2 2 3 3
2 3 5 5
3 3 5 5
3 4 6 6
4 7 11 16
28 36 40 42
4 4 4 6
5 6 7 7
5 8 7 7
6 9 10 10
8 9 12 12
V
References

[1] L. Dixon, in Superstrings, Unified Theories, and Cosmology 1987, eds. G. Furlan et al., World Scientific, 1988, p. 67.
[2] W. Lerche, C. Vafa and N. Warner, “Chiral rings in N = 2 superconformal theories”, Nucl. Phys. B324 (1989) 427.
[3] P. Candelas, E. Derrick and L. Parkes, “Generalized Calabi–Yau manifolds and the mirror of a rigid manifold”, Nucl. Phys. B407 (1993) 115.
[4] P. S. Aspinwall and B. R. Greene, “On the geometric interpretation of N = 2 superconformal theories”, Nucl. Phys. B437 (1995) 205.
[5] P. Candelas, X. C. De la Ossa, P. S. Green and L. Parkes, “An exactly soluble superconformal theory from a mirror pair of Calabi–Yau manifolds”, Phys. Lett. B258 (1991) 118.
[6] P. Candelas, X. C. De la Ossa, P. S. Green and L. Parkes, “A pair of Calabi–Yau manifolds as an exactly soluble superconformal theory”, Nucl. Phys. B359 (1991) 21.
[7] A. Strominger, S.-T. Yau and E. Zaslow, “Mirror symmetry is T duality”, Nucl. Phys. B479 (1996) 243.
[8] P. Candelas, A. M. Dale, C. A. Lutken and R. Schimmrigk, “Complete intersection Calabi–Yau manifolds”, Nucl. Phys. B298 (1988) 493.
[9] P. Candelas, M. Lynker and R. Schimmrigk, “Calabi–Yau manifolds in weighted P4”, Nucl. Phys. B341 (1990) 383.
[10] M. Kreuzer and H. Skarke, “No mirror symmetry in Landau–Ginzburg spectra!”, Nucl. Phys. B388 (1992) 113.
[11] A. Klemm and R. Schimmrigk, “Landau–Ginzburg string vacua”, Nucl. Phys. B411 (1994) 559.
[12] M. Kreuzer and H. Skarke, “All abelian symmetries of Landau–Ginzburg potentials”, Nucl. Phys. B405 (1993) 305.
[13] P. Berglund and T. Hübsch, “A generalized construction of mirror manifolds”, Nucl. Phys. B393 (1993) 377.
[14] M. Kreuzer and H. Skarke, “Orbifolds with discrete torsion and mirror symmetry”, Phys. Lett. B357 (1995) 81.
[15] V. V. Batyrev, “Dual polyhedra and mirror symmetry for Calabi–Yau hypersurfaces in toric varieties”, J. Alg. Geom. 3 (1994) 493.
[16] P. S. Aspinwall, B. R. Greene and D. R. Morrison, “Space-time topology change and stringy geometry”, J. Math. Phys. 35 (1994) 5321.
[17] P. Candelas, P. S. Green and T. Hübsch, “Finite distances between distinct Calabi–Yau vacua: (Other worlds are just around the corner)”, Phys. Rev. Lett. 62 (1989) 1956.
[18] P. Candelas, P. S. Green and T. Hübsch, “Rolling among Calabi–Yau vacua”, Nucl. Phys. B330 (1990) 49.
[19] A. Strominger, “Massless black holes and conifolds in string theory”, Nucl. Phys. B451 (1995) 96.
[20] B. R. Greene, D. R. Morrison and A. Strominger, “Black hole condensation and the unification of string vacua”, Nucl. Phys. B451 (1995) 109.
[21] S. Kachru and C. Vafa, “Exact results for N = 2 compactifications of heterotic strings”, Nucl. Phys. B450 (1995) 69.
[22] C. Vafa, “Evidence for F-theory”, Nucl. Phys. B469 (1996) 403.
[23] M. Kreuzer and H. Skarke, “On the classification of reflexive polyhedra”, Commun. Math. Phys. 185 (1997) 495.
[24] M. Kreuzer and H. Skarke, “On the classification of quasihomogeneous functions”, Commun. Math. Phys. 150 (1992) 137.
[25] P. Candelas, X. C. de la Ossa and S. Katz, “Mirror symmetry for Calabi–Yau hypersurfaces in weighted P4 and extensions of Landau–Ginzburg theory”, Nucl. Phys. B450 (1995) 267.
[26] H. Skarke, “Weight Systems for toric Calabi–Yau varieties and reflexivity of Newton polyhedra”, Mod. Phys. Lett. A11 (1996) 1637.
[27] A. Klemm, W. Lerche and P. Mayr, “K3-fibrations and heterotic-type II string duality”, Phys. Lett. B357 (1995) 313.
[28] P. S. Aspinwall and J. Louis, “On the ubiquity of K3 fibrations in string duality”, Phys. Lett. B369 (1996) 233.
[29] D. R. Morrison and C. Vafa, “Compactifications of F-theory on Calabi–Yau threefolds – I”, Nucl. Phys. B473 (1996) 74.
[30] D. R. Morrison and C. Vafa, “Compactifications of F-theory on Calabi–Yau threefolds – II”, Nucl. Phys. B476 (1996) 437.
[31] A. Avram, M. Kreuzer, M. Mandelberg and H. Skarke, “Searching for K3 fibrations”, Nucl. Phys. B494 (1997) 567.
[32] M. Kreuzer and H. Skarke, “Calabi–Yau fourfolds and toric fibrations”, J. Geom. Phys. 466 (1997) 1.
[33] Y. Hu, C. H. Liu and S. T. Yau, “Toric morphisms and fibrations of toric Calabi–Yau hypersurfaces”, math.AG/0010082.
[34] M. Kreuzer and H. Skarke, http://hep.itp.tuwien.ac.at/~kreuzer/CY.html
[35] C. Vafa, “String vacua and orbifoldized LG models”, Mod. Phys. Lett. 4A (1989) 1169.
[36] V. V. Batyrev and D. I. Dais, “Strong McKay correspondence, string-theoretic Hodge numbers and mirror symmetry”, Topology 35 (1996) 901.
[37] M. Kreuzer and H. Skarke, “Landau–Ginzburg orbifolds with discrete torsion”, Mod. Phys. Lett. 10 (1995) 1073.
[38] C. Vafa, “Modular invariance and discrete torsion on orbifolds”, Nucl. Phys. B273 (1986) 592.
[39] V. V. Batyrev and L. A. Borisov, “On Calabi–Yau complete intersections in toric varieties”, in Proceedings of Trento Conference, 1994.
[40] V. V. Batyrev and L. A. Borisov, “Mirror duality and string theoretic Hodge numbers”, Invent. Math. 126 (1996) 183.
[41] M. Kreuzer, E. Riegler and D. Sahakyan, “Toric complete intersections and weighted projective space”, math.AG/0103214.
[42] M. Kreuzer, “Strings on Calabi–Yau spaces and toric geometry”, Nucl. Phys. Proc. Suppl. 102 (2001) 87.
[43] M. Lynker, R. Schimmrigk and A. Wisskirchen, “Landau–Ginzburg vacua of string, M- and F-theory at c = 12”, Nucl. Phys. B550 (1999) 123.
[44] R. Schimmrigk, http://thew02.physik.uni-bonn.de/~netah/cy.html
[45] S. Katz, http://www.math.okstate.edu/~katz/CY/
[46] M. Reid, “Canonical 3-folds”, Proc. Alg. Geom. Anger 1979, Sijthoff and Nordhoff, p. 273.
[47] A. R. Fletcher, “Working with complete intersections”, Bonn preprint MPI/89–35, 1989.
[48] V. V. Batyrev, “Higher-dimensional toric varieties with ample anticanonical class”, Moscow State Univ., Thesis, 1985.
[49] R. J. Koelman, “The number of moduli of families of curves on toric varieties”, Katholieke Universiteit Nijmegen, Thesis, 1990.
[50] M. Kreuzer and H. Skarke, “Classification of reflexive polyhedra in three dimensions”, Adv. Theor. Math. Phys. 2 (1998) 847–864.
[51] M. Kreuzer and H. Skarke, “Complete classification of reflexive polyhedra in four dimensions”, Adv. Theor. Math. Phys. 4 (2000).
[52] W. Fulton, “Introduction to toric varieties”, Princeton Univ. Press, Princeton 1993.
[53] T. Oda, Convex Bodies and Algebraic Geometry, Springer, Berlin Heidelberg 1988.
[54] D. Cox, “The homogeneous coordinate ring of a toric variety”, J. Alg. Geom. 4 (1995) 17.
[55] Igor V. Dolgachev, “Mirror symmetry for lattice polarized K3 surfaces”, J. Math. Sci., New York 81 (1996) 2599.
[56] P. S. Aspinwall, “K3 surfaces and string duality”, in Differential Geometry Inspired by String Theory.
[57] P. Candelas and H. Skarke, “F-theory, SO(32) and toric geometry”, Phys. Lett. B413 (1997) 63.
[58] E. Witten, “String theory dynamics in various dimensions”, Nucl. Phys. B443 (1995) 85.
[59] P. Candelas and A. Font, “Duality between the webs of heterotic and type II vacua”, Nucl. Phys. B511 (1998) 295.
[60] A. Klemm, W. Lerche and P. Mayr, “K3-Fibrations and heterotic-type II string duality”, Phys. Lett. B357 (1995) 313.
[61] S. Hosono, B. H. Lian and S.-T. Yau, “Calabi–Yau varieties and pencils of K3 surfaces”.
April 25, 2002 17:46 WSPC/148-RMP
00121
Reviews in Mathematical Physics, Vol. 14, No. 4 (2002) 375–407 c World Scientific Publishing Company
ABSENCE OF TRANSPORT IN ANDERSON LOCALIZATION
FUMIHIKO NAKANO Mathematical Institute, Tohoku University, Sendai, 980-8578, Japan
[email protected]
Received 11 July 2001 We consider the charge transport in the tight-binding Anderson model. Under a mild condition on the Fermi projection, we show that it is zero almost surely. This result has wider applicability than our previous work [12], while the definition of charge transport is slightly different. It also applies to the computation of non-diagonal component of the conductivity tensor which recovers the famous result of quantization of Hall conductivity in quantum Hall systems.
1. Introduction

Since the pioneering work of Anderson [5], where he argued that a certain amount of disorder may cause materials to become insulating, there has been much development in the theory of Anderson localization. From a mathematical point of view, this phenomenon is represented as the exponential decay of Green's functions, which implies the exponential decay of eigenfunctions and, together with the Kubo formula, the vanishing of the electrical conductivity (e.g., [1, 3, 9, 16]). On the other hand, Bellissard et al. [8, 13, 14] developed a theory to compute the electrical conductivity by using a $C^*$-algebraic approach. They used the random kick-rotor term to introduce a dissipation mechanism. In this paper, we study essentially the same quantity, which we call charge transport to compare it with that treated in [12]. We do not use a dissipation term, which turns out not to be necessary in the study of Anderson localization, and develop the theory by the use of spectral theory and functional analysis. This approach has the advantage of being mathematically simple, while it would not be straightforward to extend to more general systems without suitable improvements in the analysis.

We will show that, when the Fermi energy lies in the “localized states”, the charge transport is equal to zero almost surely. This fact has a clear physical interpretation: the current vanishes in the localized regime. However, proving this fact does not seem to be trivial, as discussed in [12].

Our model is the standard, tight-binding random Hamiltonian on $l^2(\mathbb Z^d)$:

$$(H_\omega \varphi)(x) := \sum_{|y - x| = 1} \varphi(y) + \lambda V_\omega(x) \varphi(x), \qquad \varphi \in l^2(\mathbb Z^d). \tag{1.1}$$
375
April 25, 2002 17:46 WSPC/148-RMP
00121
F. Nakano
$\lambda > 0$ is the coupling constant, and $\{V_\omega(x)\}_{x\in\mathbf{Z}^d}$ are independent, identically distributed random variables on a probability space $(\Omega, \mathcal{B}, \mathbf{P})$. We assume

(1) the probability distribution of $V_\omega(0)$ has the density $r(v)\,dv$; $\operatorname{supp} r$ is bounded from below, and $r \in L^p(\mathbf{R})$ for some $1 < p \le \infty$;
(2) $\mathbf{E}|V_\omega(0)|^2 < \infty$, where $\mathbf{E}$ stands for taking expectations.

The assumption that $\operatorname{supp} r$ is bounded from below is purely technical; it is needed in order to use the almost analytic continuation (Sec. 2, Proposition 2.8). However, this assumption is satisfied in most situations and is natural for the study of conducting properties. The condition $\mathbf{E}|V_\omega(0)|^2 < \infty$ is assumed in order that the Liouville operator of $H_{\omega,E}$ be densely defined. It is known that the spectrum of $H_\omega$ is deterministic almost surely, and $\sigma(H_\omega) = [-2d, 2d] + \operatorname{supp} r$ for a.e. $\omega$ [10]. Moreover, there appears a region in the spectrum, depending on $\lambda$, where we have pure point spectrum with exponentially decaying eigenfunctions, which, as we already mentioned, is a mathematical formulation of Anderson localization (e.g., [1, 3, 9, 16]). In [12], we considered the charge transport caused by a slowly varying potential. We used the adiabatic approximation and proved that this quantity is equal to zero almost surely when $\lambda$ is sufficiently large (i.e., at high disorder). In this paper, we consider the charge transport caused by the discrete analogue of the constant electric field $E\ (\in \mathbf{R})$:

\[ H_{\omega,E} := H_\omega + E \cdot x_1, \tag{1.2} \]

where $x_1$ is the first component of $x = (x_1, x_2, \ldots, x_d) \in \mathbf{Z}^d$. We define the charge transport by carrying out the following three operations successively:

(1) take the thermal average of the current operator at zero temperature,
(2) divide it by $E$ and let $E \downarrow 0$,
(3) take the time average.

That is,

\[ \sigma_1(\omega, F) := \lim_{T\uparrow\infty} \int_0^T \frac{dt}{T} \lim_{E\downarrow 0} \frac{1}{E}\, \mathcal{T}\bigl(e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} P_F\bigr), \tag{1.3} \]

where $\mathcal{T}$ is the trace per volume for an operator $A$:

\[ \mathcal{T}(A) := \lim_{L\uparrow\infty} \frac{1}{|\Lambda_L|} \operatorname{trace}(\chi_{\Lambda_L} A \chi_{\Lambda_L}). \tag{1.4} \]

$\chi_C$ is the characteristic function of a set $C$, and $\Lambda_L$ is a finite box in $\mathbf{Z}^d$: $\Lambda_L := \{x \in \mathbf{Z}^d : |x_j| \le L,\ j = 1, \ldots, d\}$. $e^{-itH_{\omega,E}}$ is the unitary propagator of $H_{\omega,E}$, and $P_F$ is the Fermi projection operator $\chi_{(-\infty,F]}(H_\omega)$, corresponding to the Fermi energy $F \in \mathbf{R}$, which we fix arbitrarily. In the definition of $\sigma_1(\omega, F)$, it might be more natural to take $\lim_{E\downarrow 0}\frac{1}{E}$ after taking the time average $\lim_{T\uparrow\infty}\int_0^T \frac{dt}{T}$. However, we can only deal with the Abel limit of that quantity, which can be regarded as a sort of approximation:

\[ \sigma_2(\omega, F) := \lim_{\delta\downarrow 0} \lim_{E\downarrow 0} \frac{1}{E} \int_0^\infty dt\, \delta e^{-t\delta}\, \mathcal{T}\bigl(e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} P_F\bigr). \tag{1.5} \]
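To make the objects in (1.1) and (1.2) concrete, the following is a minimal sketch, not part of the original analysis, of how $H_{\omega,E}$ acts on a finitely supported function in $d = 1$. The names `make_potential` and `apply_H` are hypothetical, and the uniform distribution on $[0, 1]$ is an assumption chosen only because it satisfies conditions (1) and (2) above.

```python
import random

def make_potential(seed):
    """Return x -> V_omega(x): i.i.d. uniform[0, 1] values on Z, fixed by the seed.

    The uniform law is an illustrative assumption, not taken from the paper.
    """
    cache = {}
    def V(x):
        if x not in cache:
            cache[x] = random.Random(f"{seed}:{x}").random()
        return cache[x]
    return V

def apply_H(phi, V, lam=1.0, E=0.0):
    """Apply H_{omega,E} = H_omega + E * x_1 (d = 1) to a finitely supported phi.

    phi is a dict {site: amplitude}; sites absent from the dict carry amplitude 0.
    """
    out = {}
    for x, amp in phi.items():
        # hopping term: phi(x) contributes to (H phi)(x - 1) and (H phi)(x + 1)
        for y in (x - 1, x + 1):
            out[y] = out.get(y, 0.0) + amp
        # diagonal term: lambda * V_omega(x) + E * x_1
        out[x] = out.get(x, 0.0) + (lam * V(x) + E * x) * amp
    return out

V = make_potential(seed=0)
delta_3 = {3: 1.0}                      # the unit vector delta_x with x = 3
result = apply_H(delta_3, V, lam=2.0, E=0.5)
```

Applied to $\delta_3$, the result is supported on the sites 2, 3, 4, with the value 1 on the two neighbours and $\lambda V_\omega(3) + 3E$ on the diagonal.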
Absence of Transport in Anderson Localization
Remark 1.1. (1) $\sigma_1(\omega, F)$ was originally defined in [8]. Instead of studying $\sigma_1(\omega, F)$, the authors of [8] computed $\sigma_2(\omega, F)$ by using the relaxation time approximation. (2) We are not able to discuss the deep question of whether the discrete analogue of the constant electric field $E \cdot x_1$ is appropriate for computing the electrical conductivity [7, 15]. In fact, the spectrum of $H_{\omega,E}$ is known to be purely discrete. However, we believe that taking the limit $\lim_{E\downarrow 0}\frac{1}{E}$ is not trivial in any case. (3) At non-zero temperature, $P_F$ is replaced by $f_{\beta,F}(H_\omega)$, where $f_{\beta,\mu}(\lambda)$ is the Fermi–Dirac distribution function

\[ f_{\beta,\mu}(\lambda) := (1 + \exp[\beta(\lambda - \mu)])^{-1}, \qquad \lambda \in \mathbf{R}, \]

and $\beta > 0$ is the inverse temperature.

Throughout this paper, we assume the following decay property of the matrix elements of the Fermi projection.

Assumption.

\[ \mathbf{E} \sum_{x\in\mathbf{Z}^d} |x_1|^2 |P_F(\omega; x, 0)|^2 < \infty, \tag{1.6} \]

where $P_F(\omega; x, y) := \langle \delta_x, P_F \delta_y \rangle_{l^2(\mathbf{Z}^d)}$, $\langle\cdot,\cdot\rangle_{l^2(\mathbf{Z}^d)}$ is the inner product on $l^2(\mathbf{Z}^d)$, and $\delta_x(z) = 1$ if $z = x$, $\delta_x(z) = 0$ if $z \ne x$. It is shown in [2] that (1.6) follows from the fractional moment bound of the Green's function

\[ \mathbf{E}|G_{F+i\epsilon}(\omega; x, y)|^s \le C e^{-\mu|x-y|}, \qquad x, y \in \mathbf{Z}^d, \tag{1.7} \]

for some constants $0 < s < 1$, $C > 0$, $\mu > 0$, uniformly w.r.t. $\epsilon \in \mathbf{R}$. Here $G_E(\omega; x, y) := \langle \delta_x, (H_\omega - E)^{-1} \delta_y \rangle_{l^2(\mathbf{Z}^d)}$ is the matrix element of the Green's function. (1.7) is one of the key observations in Anderson localization [1, 3], and is known to hold if (1) $\lambda > 0$ is sufficiently large (high disorder case), or (2) $|F|$ is sufficiently large (extreme energy case), or (3) $F$ is away from the spectrum of the free Laplacian for $|\lambda|$ small (weak disorder case). Under the above Assumption (1.6), we have the following results.

Theorem 1.1. $\sigma_2(\omega, F) = 0$, a.e. $\omega$.

Theorem 1.2. $\sigma_1(\omega, F) = 0$, a.e. $\omega$.
Remark 1.2. (1) We can extend Theorem 1.2 to some long-range hopping Hamiltonians. However, it does not seem to be a trivial task to extend the resolvent equation (Proposition 3.1), which is important for proving Theorem 1.1, to those Hamiltonians. (2) Further study would be required to see under which conditions the various limits involved in the definitions of $\sigma_1(\omega, F)$, $\sigma_2(\omega, F)$ may be interchanged. However, in Theorem 1.1, the exchange of the two operations $\lim_{E\downarrow 0}\frac{1}{E}$ and $\int_0^\infty dt\, \delta e^{-\delta t}$ is allowed (Remark 4.1) if, in addition, we assume the following dynamical localization estimate

\[ \mathbf{E}|e^{-itH_\omega}(\omega; x, y)| \le C e^{-\mu|x-y|}, \qquad x, y \in \mathbf{Z}^d, \tag{1.8} \]
for some constants $C > 0$, $\mu > 0$, independent of $t \in \mathbf{R}$. (1.8) is known to be true when $\lambda > 0$ is sufficiently large (high disorder case [2, 4]). (3) In [12], essentially the same result is proved in a more restricted situation (i.e., bounded random potential, bounded electrical potential, and high disorder). However, there are some differences between the definition of $\sigma_1(\omega, F)$ and the charge transport defined in [12]. For instance, in the definition of $\sigma_1(\omega, F)$, we take the $E \downarrow 0$ limit first and then take the time average, while in [12] we take these two operations at the same time, in a sense. It would not be easy to take these two limits at the same time in the approach of this paper. (4) In [8], it is argued that the conductivity is always zero or infinity if no dissipation mechanism is taken into account. Moreover, it is shown there that the conductivity is always zero if the Hamiltonian is bounded. Our results imply that this is also true when Anderson localization occurs, even if the Hamiltonian is unbounded. On the other hand, in [2], the derivation of the Kubo formula is discussed, and the quantity obtained there is the same as the one we will arrive at in the proofs of the above two theorems.

Theorems 1.1 and 1.2 remain true when we impose a constant magnetic field on $H_\omega$. Moreover, the above results have some implications for the theory of the integral quantum Hall effect. Let $d = 2$, and consider the following Hamiltonian, which has constant magnetic flux $-B\ (\in \mathbf{R})$ penetrating each plaquette of $\mathbf{Z}^2$:

\[ (H^B_\omega \varphi)(x) := \sum_{|y-x|=1} h_B(x, y)\varphi(y) + \lambda V_\omega(x)\varphi(x), \qquad \varphi \in l^2(\mathbf{Z}^2), \]

where

\[ h_B(x, y) := \begin{cases} e^{\frac{iB}{2}\langle y,\, x-y\rangle_{\mathbf{R}^2}}, & \text{if } |x - y| = 1, \\ 0, & \text{otherwise}, \end{cases} \qquad x, y \in \mathbf{Z}^2. \]
$\langle\cdot,\cdot\rangle_{\mathbf{R}^d}$ is the inner product on $\mathbf{R}^d$. We define the Hall conductivities $\sigma_1^H(\omega, F)$, $\sigma_2^H(\omega, F)$ by replacing $x_1$ by $x_2$ in the definitions of $\sigma_1(\omega, F)$, $\sigma_2(\omega, F)$, respectively. We note that it is not necessary to consider a dissipation mechanism in the definition of the Hall conductivity. Under Assumption (1.6), we have the following result, which can be proved by mimicking the proofs of Theorems 1.1 and 1.2.

Theorem 1.3. (1) $\sigma_1(\omega, F) = \sigma_2(\omega, F) = 0$, a.e. $\omega$. (2) $\sigma_1^H(\omega, F) = \sigma_2^H(\omega, F) = i\mathcal{T}(P_F[\partial_1 P_F, \partial_2 P_F]P_F)$, a.e. $\omega$, where $\partial_j P_F := [x_j, P_F]$, $j = 1, 2$.

The quantity $i\mathcal{T}(P_F[\partial_1 P_F, \partial_2 P_F]P_F)$ is known to have an interpretation as a "topological invariant" ([6, 8] and references therein). Therefore, Theorem 1.3 implies the vanishing of the direct conductivity and the quantization of the Hall conductivity if the Fermi energy lies in the localized regime, which is well known in the theory of the integral quantum Hall effect. The key ingredient of the proofs of Theorems 1.1 and 1.2 is the theory developed in [2, 8], namely to consider the Hilbert space (called $\mathcal{L}^2$) which consists of matrix
elements of operators, and to consider the Liouville operator of $H_\omega$. In Sec. 2, we define $\mathcal{L}^2$ and study some of its basic properties. We provide proofs even for elementary facts, because care is needed in the treatment of Liouville operators for unbounded Hamiltonians. In Secs. 3 and 4, we prove Theorems 1.1 and 1.2, respectively. The main part of the proofs is the study of the $E \downarrow 0$ limit $\lim_{E\downarrow 0}\frac{1}{E}$. For this purpose, we prove a resolvent equation (for Theorem 1.1) and a Duhamel formula (for Theorem 1.2).

2. Preliminaries

In this section, we define a Hilbert space $\mathcal{L}^2$, study its basic properties, and consider some operators on it. First of all, we note that the operator $H_\omega$ is covariant w.r.t. translations in the following sense:

\[ U(a) H_\omega U(a)^* = H_{T^a\omega}, \qquad a \in \mathbf{Z}^d,\ \omega \in \Omega, \tag{2.1} \]

where $U(a)$ is the translation operator on $l^2(\mathbf{Z}^d)$: $(U(a)\varphi)(x) := \varphi(x - a)$, $U(a)^*$ is the adjoint operator of $U(a)$, and $T^a : \Omega \to \Omega$ is a measure preserving map defined by $\omega = \{V_\omega(x)\}_{x\in\mathbf{Z}^d} \mapsto T^a\omega = \{V_\omega(x - a)\}_{x\in\mathbf{Z}^d}$. It is well known that $T^a$ is ergodic. Because of (2.1), the matrix element of $H_\omega$ satisfies the following relation:

\[ H(\omega; x, y) = H(T^{-a}\omega; x - a, y - a), \qquad a, x, y \in \mathbf{Z}^d,\ \omega \in \Omega. \tag{2.2} \]

We say that a function $H(\omega; x, y)$ on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$ satisfies (CR) if (2.2) holds. We define $\mathcal{L}^2$ as the space of functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$ which satisfy (CR) and have a certain decay property:

\[ \mathcal{L}^2 := \Bigl\{ A(\omega; x, y) : \text{function on } \Omega \times \mathbf{Z}^d \times \mathbf{Z}^d \text{ which is finite a.e.},\ (1)\ A(\omega; x, y) \text{ satisfies (CR)},\ (2)\ \|A\|^2_{\mathcal{L}^2} := \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} |A(\omega; x, 0)|^2 < \infty \Bigr\}. \]

Remark 2.1. In the above definition, "a.e." is in the sense of $\tilde{\mathbf{P}}$-a.e., where $\tilde{\mathbf{P}}$ is the product measure of $\mathbf{P}$ and the counting measure on $\mathbf{Z}^d \times \mathbf{Z}^d$. Since $\mathbf{Z}^d \times \mathbf{Z}^d$ is countable, "$A(\omega; x, y)$ is finite $\tilde{\mathbf{P}}$-a.e." is equivalent to the statement that "there exists a subset $\Omega_0 \subset \Omega$ with $\mathbf{P}(\Omega_0) = 1$ such that $A(\omega; x, y)$ is finite for $\omega \in \Omega_0$, $x, y \in \mathbf{Z}^d$".

We shall see that there are many functions which belong to $\mathcal{L}^2$. For this purpose, we consider a subspace of $\mathcal{L}^2$:

\[ \mathcal{A} := \Bigl\{ \{A_\omega\}_{\omega\in\Omega} : \text{family of bounded operators on } l^2(\mathbf{Z}^d),\ (1)\ U(a) A_\omega U(a)^* = A_{T^a\omega},\ \omega \in \Omega,\ a \in \mathbf{Z}^d,\ (2)\ \sup_{\omega\in\Omega} \|A_\omega\|_{op} < \infty \Bigr\}. \]

$\|A\|_{op}$ is the operator norm on $l^2(\mathbf{Z}^d)$.
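Remark 2.2. If $A_\omega \in \mathcal{A}$, then its matrix element $A(\omega; x, y)$ belongs to $\mathcal{L}^2$. In fact, since $A_\omega$ is bounded on $l^2(\mathbf{Z}^d)$, $A(\omega; x, y)$ is finite for any $(\omega, x, y) \in \Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. Condition (1) in the definition of $\mathcal{A}$ implies that $A(\omega; x, y)$ satisfies (CR). To check condition (2) in the definition of $\mathcal{L}^2$, we write $A(\omega; x, 0) = (A_\omega \delta_0)(x)$; since $A_\omega$ is bounded on $l^2(\mathbf{Z}^d)$ and $\|\delta_0\|_{l^2(\mathbf{Z}^d)} = 1$ ($\|\cdot\|_{l^2(\mathbf{Z}^d)}$ is the $l^2(\mathbf{Z}^d)$-norm),

\[ \sum_{x\in\mathbf{Z}^d} |A(\omega; x, 0)|^2 = \|A_\omega \delta_0\|^2_{l^2(\mathbf{Z}^d)} \le \|A_\omega\|^2_{op}. \]

Condition (2) in the definition of $\mathcal{A}$ now implies $\|A\|_{\mathcal{L}^2} < \infty$, and thus $A(\omega; x, y) \in \mathcal{L}^2$. Therefore, we will sometimes abuse notation and regard $\mathcal{A}$ as a subspace of $\mathcal{L}^2$, though the former is a space of operators while the latter is a space of functions.

To introduce a Hilbert space structure on $\mathcal{L}^2$, we consider $\tilde{\mathcal{L}}^2 := L^2(\Omega \times \mathbf{Z}^d)$.

Proposition 2.1. We define a map $\pi : \mathcal{L}^2 \to \tilde{\mathcal{L}}^2$ by

\[ (\pi A)(\omega, x) := A(\omega; x, 0), \qquad A \in \mathcal{L}^2. \]

Then $\pi$ is a bijection, and

\[ (\pi^{-1} a)(\omega; x, y) = a(T^{-y}\omega, x - y), \qquad a \in \tilde{\mathcal{L}}^2. \]

Proof. It is clear that $\pi A \in \tilde{\mathcal{L}}^2$ for $A \in \mathcal{L}^2$. Conversely, let $(\pi^{-1} a)(\omega; x, y) := a(T^{-y}\omega, x - y)$, $a \in \tilde{\mathcal{L}}^2$. By Remark 2.1, $(\pi^{-1} a)(\omega; x, y)$ is finite a.e. Moreover,

\[ (\pi^{-1} a)(T^{-b}\omega; x - b, y - b) = a(T^{-b}T^{-(y-b)}\omega, (x - b) - (y - b)) = a(T^{-y}\omega, x - y) = (\pi^{-1} a)(\omega; x, y), \qquad x, y, b \in \mathbf{Z}^d. \]

Hence $(\pi^{-1} a)(\omega; x, y)$ satisfies (CR). Also, $\|\pi^{-1} a\|_{\mathcal{L}^2} = \|a\|_{\tilde{\mathcal{L}}^2} < \infty$. Therefore, $\pi^{-1} a \in \mathcal{L}^2$ for $a \in \tilde{\mathcal{L}}^2$. It then suffices to show that $\pi^{-1} \circ \pi$ and $\pi \circ \pi^{-1}$ are the identities on $\mathcal{L}^2$ and $\tilde{\mathcal{L}}^2$, respectively, which follows from direct computation.

We introduce an inner product on $\mathcal{L}^2$:

\[ \langle A, B \rangle_{\mathcal{L}^2} := \langle \pi(A), \pi(B) \rangle_{\tilde{\mathcal{L}}^2}. \]

$\mathcal{L}^2$ becomes a Hilbert space under the above inner product.

Remark 2.3. To see that $\mathcal{L}^2$ is complete, let $\{A_n\}_{n=1}^\infty \subset \mathcal{L}^2$ be a Cauchy sequence. Then $\pi(A_n)(\omega, x) = A_n(\omega; x, 0)$ is a Cauchy sequence in $\tilde{\mathcal{L}}^2$, and thus there exists an element $a(\omega, x) \in \tilde{\mathcal{L}}^2$ such that $\pi(A_n) \to a$ in $\tilde{\mathcal{L}}^2$ as $n \to \infty$. We define $A := \pi^{-1}(a) \in \mathcal{L}^2$; then $A_n \to A$ in $\mathcal{L}^2$ as $n \to \infty$. We regard $\mathcal{L}^2$ as a complete metric space in this sense.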
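The covariance relation (2.2) and the maps $\pi$, $\pi^{-1}$ of Proposition 2.1 can be checked on a sample in $d = 1$. The sketch below is only an illustration under stated assumptions: a sample $\omega$ is represented by the function $x \mapsto V_\omega(x)$ with uniform $[0, 1]$ values, and the helper names (`make_omega`, `shift`, `pi`, `pi_inv`) are hypothetical.

```python
import random

def make_omega(seed):
    """A sample omega, represented as the function x -> V_omega(x) on Z (d = 1)."""
    return lambda x: random.Random(f"{seed}:{x}").random()

def shift(omega, a):
    """T^a omega, defined by V_{T^a omega}(x) = V_omega(x - a)."""
    return lambda x: omega(x - a)

def H(omega, x, y, lam=1.0):
    """Matrix element H(omega; x, y) of the Hamiltonian (1.1) in d = 1."""
    if abs(x - y) == 1:
        return 1.0
    if x == y:
        return lam * omega(x)
    return 0.0

def pi(A):
    """(pi A)(omega, x) := A(omega; x, 0)."""
    return lambda omega, x: A(omega, x, 0)

def pi_inv(a):
    """(pi^{-1} a)(omega; x, y) := a(T^{-y} omega, x - y)."""
    return lambda omega, x, y: a(shift(omega, -y), x - y)

omega = make_omega(seed=1)
sites = range(-3, 4)

# (CR): H(omega; x, y) = H(T^{-a} omega; x - a, y - a)
cr_holds = all(
    H(omega, x, y) == H(shift(omega, -a), x - a, y - a)
    for a in sites for x in sites for y in sites
)

# pi^{-1}(pi H) reproduces the matrix elements of H on the sampled sites
H_back = pi_inv(pi(H))
roundtrip_holds = all(
    H_back(omega, x, y) == H(omega, x, y) for x in sites for y in sites
)
```

Both checks only sample finitely many sites, so they illustrate the identities rather than prove them.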
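For $A, B \in \mathcal{L}^2$, we define a product $AB$, which is a function on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$:

\[ (AB)(\omega; x, y) := \sum_{z\in\mathbf{Z}^d} A(\omega; x, z) B(\omega; z, y). \tag{2.3} \]

If $a = \pi(A)$, $b = \pi(B)$, it is written as

\[ \pi(AB)(\omega, x) = (AB)(\omega; x, 0) = \sum_{z\in\mathbf{Z}^d} a(T^{-z}\omega, x - z)\, b(\omega, z). \tag{2.4} \]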
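Lemma 2.1. The sum in (2.3) converges absolutely for a.e. $\omega$. Moreover, for any fixed $(x, y) \in \mathbf{Z}^d \times \mathbf{Z}^d$, $|A(\omega; x, z)||B(\omega; z, y)| \in L^1(\Omega \times \mathbf{Z}^d)$ as a function of $(\omega, z) \in \Omega \times \mathbf{Z}^d$, and $(AB)(\omega; x, y) \in L^1(\Omega)$.

Proof.

\[ |A(\omega; x, z)||B(\omega; z, y)| \le \frac{1}{2}\bigl(|A(\omega; x, z)|^2 + |B(\omega; z, y)|^2\bigr). \tag{2.5} \]

On the other hand,

\[ \begin{aligned} \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |A(\omega; x, z)|^2 &= \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |A(T^{-z}\omega; x - z, 0)|^2 \\ &= \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |A(\omega; x - z, 0)|^2 \\ &= \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |A(\omega; z, 0)|^2 = \|A\|^2_{\mathcal{L}^2}. \end{aligned} \tag{2.6} \]

We used the (CR) property of $A$ and Fubini's theorem. Similarly, we have $\int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |B(\omega; z, y)|^2 = \|B\|^2_{\mathcal{L}^2} < \infty$; together these imply the second statement of Lemma 2.1. Moreover, (2.6) implies $\sum_{z\in\mathbf{Z}^d} |A(\omega; x, z)|^2 < \infty$, and similarly $\sum_{z\in\mathbf{Z}^d} |B(\omega; z, y)|^2 < \infty$, for a.e. $\omega$. For such $\omega \in \Omega$, the sum of the RHS of (2.5) w.r.t. $z \in \mathbf{Z}^d$ converges, and

\[ \sum_{z\in\mathbf{Z}^d} |A(\omega; x, z)||B(\omega; z, y)| \le \frac{1}{2} \sum_{z\in\mathbf{Z}^d} \bigl(|A(\omega; x, z)|^2 + |B(\omega; z, y)|^2\bigr). \tag{2.7} \]

The RHS of (2.7) belongs to $L^1(\Omega)$, and thus $(AB)(\omega; x, y) \in L^1(\Omega)$.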
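By using Lemma 2.1, we can show that the trace per volume $\mathcal{T}(AB)$ exists almost surely if $A, B \in \mathcal{L}^2$.

Proposition 2.2. (1) If $A, B \in \mathcal{L}^2$, then

\[ \mathcal{T}(AB) = \int_\Omega d\mathbf{P}\, (AB)(\omega; 0, 0), \qquad \text{for a.e. } \omega. \]

In particular, $\mathcal{T}(AB)$ exists and is finite for a.e. $\omega$. (2) $\mathcal{T}(AB) = \mathcal{T}(BA)$, for a.e. $\omega$.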
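Proof. (1) By Lemma 2.1, $(AB)(\omega; x, y)$ is finite for a.e. $\omega$. For such $\omega \in \Omega$, it is clear that $\operatorname{trace}(\chi_{\Lambda_L} AB \chi_{\Lambda_L})$ makes sense; it is simply the usual trace of the $(2L+1)^d \times (2L+1)^d$ matrix $\{(AB)(\omega; x, y)\}_{x,y\in\Lambda_L}$:

\[ \mathcal{T}(AB) = \lim_{L\uparrow\infty} \frac{1}{|\Lambda_L|} \operatorname{trace}(\chi_{\Lambda_L} AB \chi_{\Lambda_L}) = \lim_{L\uparrow\infty} \frac{1}{|\Lambda_L|} \sum_{x\in\Lambda_L} (AB)(\omega; x, x). \tag{2.8} \]

On the other hand, by using the (CR) property of $\mathcal{L}^2$,

\[ \begin{aligned} (AB)(\omega; x, x) &= \sum_{y\in\mathbf{Z}^d} A(\omega; x, y) B(\omega; y, x) \\ &= \sum_{y\in\mathbf{Z}^d} A(T^{-x}\omega; 0, y - x) B(T^{-x}\omega; y - x, 0) \\ &= (AB)(T^{-x}\omega; 0, 0). \end{aligned} \tag{2.9} \]

By Lemma 2.1, the RHS of (2.9) belongs to $L^1(\Omega)$. Therefore, by Birkhoff's ergodic theorem,

\[ \mathcal{T}(AB) = \int_\Omega d\mathbf{P}\, (AB)(\omega; 0, 0), \qquad \text{for a.e. } \omega. \]

(2) For a.e. $\omega$,

\[ \begin{aligned} \mathcal{T}(AB) &= \int_\Omega d\mathbf{P}\, (AB)(\omega; 0, 0) \\ &= \int_\Omega d\mathbf{P} \sum_{y\in\mathbf{Z}^d} A(\omega; 0, y) B(\omega; y, 0) \\ &= \int_\Omega d\mathbf{P} \sum_{y\in\mathbf{Z}^d} B(T^{y}\omega; 0, y) A(T^{y}\omega; y, 0). \end{aligned} \tag{2.10} \]

By Lemma 2.1, $|B(\omega; 0, y)||A(\omega; y, 0)| \in L^1(\Omega \times \mathbf{Z}^d)$, so that we can use Fubini's theorem:

\[ \text{RHS of (2.10)} = \sum_{y\in\mathbf{Z}^d} \int_\Omega d\mathbf{P}\, B(\omega; 0, y) A(\omega; y, 0) = \int_\Omega d\mathbf{P} \sum_{y\in\mathbf{Z}^d} B(\omega; 0, y) A(\omega; y, 0) = \mathcal{T}(BA). \]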
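As a rough numerical illustration of Proposition 2.2(1), one can take $A = B = H_\omega$ in $d = 1$, for which $(H_\omega^2)(\omega; x, x) = 2 + \lambda^2 V_\omega(x)^2$, and compare the box average in (2.8) with $\int_\Omega d\mathbf{P}\,(H_\omega^2)(\omega; 0, 0)$. The sketch assumes uniform $[0, 1]$ potentials, so that $\mathbf{E}V_\omega(0)^2 = 1/3$; the function name is hypothetical.

```python
import random

def trace_per_volume_H2(L, lam, seed=0):
    """Box average (1/|Lambda_L|) * sum over x in Lambda_L of (H^2)(omega; x, x), d = 1.

    For the Hamiltonian (1.1) in d = 1, (H^2)(omega; x, x) = 2 + lam^2 * V_omega(x)^2:
    two hopping paths x -> x +/- 1 -> x, plus the squared diagonal entry.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(-L, L + 1):
        v = rng.random()              # V_omega(x), i.i.d. uniform on [0, 1] (assumption)
        total += 2.0 + (lam * v) ** 2
    return total / (2 * L + 1)

lam = 1.5
estimate = trace_per_volume_H2(L=20000, lam=lam)
# Integral over Omega of (H^2)(omega; 0, 0), using E[V^2] = 1/3 for uniform[0, 1]
expected = 2.0 + lam ** 2 / 3.0
```

For a large box the two numbers agree up to a statistical fluctuation of order $|\Lambda_L|^{-1/2}$, which is what Birkhoff's ergodic theorem predicts in the limit $L \uparrow \infty$.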
We have the following fact, which will make it easier to study unbounded operators on L2 . Proposition 2.3. A is dense in L2 .
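Proof. Let $A \in \mathcal{L}^2$, $a := \pi(A) \in \tilde{\mathcal{L}}^2$. Define

\[ a_n(\omega, x) := \begin{cases} a(\omega, x), & \text{if } |x| \le n \text{ and } |a(\omega, x)| \le n, \\ 0, & \text{otherwise}, \end{cases} \qquad A_n := \pi^{-1}(a_n). \]

Then $a_n \to a$ and $A_n \to A$ in $\tilde{\mathcal{L}}^2$ and $\mathcal{L}^2$, respectively, as $n \to \infty$. We will show $A_n \in \mathcal{A}$, which completes the proof. For $\varphi, \psi \in l^2(\mathbf{Z}^d)$, we compute, by the Schwarz inequality and the definition of $\pi^{-1}$,

\[ \begin{aligned} |\langle \varphi, A_n \psi \rangle_{l^2(\mathbf{Z}^d)}| &\le \sum_{x,y\in\mathbf{Z}^d} |\varphi(x)||A_n(\omega; x, y)||\psi(y)| \\ &\le \Bigl( \sum_{x,y\in\mathbf{Z}^d} |\varphi(x)|^2 |A_n(\omega; x, y)| \Bigr)^{1/2} \Bigl( \sum_{x,y\in\mathbf{Z}^d} |A_n(\omega; x, y)||\psi(y)|^2 \Bigr)^{1/2} \\ &= \Bigl( \sum_{x\in\mathbf{Z}^d} |\varphi(x)|^2 \sum_{y\in\mathbf{Z}^d} |a_n(T^{-y}\omega, x - y)| \Bigr)^{1/2} \Bigl( \sum_{y\in\mathbf{Z}^d} |\psi(y)|^2 \sum_{x\in\mathbf{Z}^d} |a_n(T^{-y}\omega, x - y)| \Bigr)^{1/2}. \end{aligned} \tag{2.11} \]

We note that $|a_n(T^{-y}\omega, x - y)| \le n$, and $a_n(T^{-y}\omega, x - y) \ne 0$ only if $|x - y| \le n$. Therefore,

\[ \text{RHS of (2.11)} \le \Bigl( n(2n+1)^d \sum_{x\in\mathbf{Z}^d} |\varphi(x)|^2 \Bigr)^{1/2} \Bigl( n(2n+1)^d \sum_{y\in\mathbf{Z}^d} |\psi(y)|^2 \Bigr)^{1/2} = C_n \|\varphi\|_{l^2(\mathbf{Z}^d)} \|\psi\|_{l^2(\mathbf{Z}^d)}, \]

where the constant $C_n > 0$ is independent of $\omega \in \Omega$. Therefore, $A_n$ is bounded on $l^2(\mathbf{Z}^d)$, and $\sup_{\omega\in\Omega} \|A_n\|_{op} < \infty$. Because $A_n = \pi^{-1}(a_n)$, $A_n$ satisfies (CR). Hence $A_n \in \mathcal{A}$.

It will be convenient to introduce another subspace of $\mathcal{L}^2$:

\[ \mathcal{A}_0 := \bigl\{ A(\omega; x, y) : \text{function on } \Omega \times \mathbf{Z}^d \times \mathbf{Z}^d,\ (1)\ A \text{ satisfies (CR)},\ (2)\ |A(\omega; x, y)| \text{ is bounded},\ (3)\ \text{there exists a constant } K > 0 \text{ such that } A(\omega; x, y) = 0 \text{ if } |x - y| > K \bigr\}. \]

Then the proof of Proposition 2.3 also shows $\mathcal{A}_0 \subset \mathcal{A}$, and $\mathcal{A}_0$ is dense in $\mathcal{L}^2$.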
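From now on, we consider some operations on $\mathcal{L}^2$. For a function $A(\omega; x, y)$ on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$, we define $A^*(\omega; x, y)$ by

\[ A^*(\omega; x, y) := \overline{A(\omega; y, x)}, \]

which corresponds to the matrix element of the adjoint operator if $A \in \mathcal{A}$. The corresponding element in $\tilde{\mathcal{L}}^2$ is

\[ a^* := \pi(A^*) = \overline{a(T^{-x}\omega, -x)}, \]

where $a = \pi(A)$.

Lemma 2.2. If $A \in \mathcal{L}^2$, then $A^* \in \mathcal{L}^2$, and $\|A^*\|_{\mathcal{L}^2} = \|A\|_{\mathcal{L}^2}$.

Proof. It is clear that $A^*(\omega; x, y)$ is finite a.e. That $A^*(\omega; x, y)$ satisfies (CR) follows by direct computation. The equality $\|A^*\|_{\mathcal{L}^2} = \|A\|_{\mathcal{L}^2} < \infty$ follows from (CR) and Fubini's theorem, by the same argument as in (2.6) in the proof of Lemma 2.1.

Due to Birkhoff's ergodic theorem, the trace per volume is related to the inner product on $\mathcal{L}^2$.

Proposition 2.4. Let $A, B \in \mathcal{L}^2$. (1) As functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$,

\[ (BA)^*(\omega; x, y) = (A^* B^*)(\omega; x, y), \qquad \text{for a.e. } \omega. \]

(2) $\mathcal{T}(A^* B) = \langle A, B \rangle_{\mathcal{L}^2}$.

Proof. (1) By definition,

\[ (BA)^*(\omega; x, y) = \overline{(BA)(\omega; y, x)} = \sum_{z\in\mathbf{Z}^d} \overline{B(\omega; y, z)}\; \overline{A(\omega; z, x)}. \]

On the other hand,

\[ (A^* B^*)(\omega; x, y) = \sum_{z\in\mathbf{Z}^d} A^*(\omega; x, z) B^*(\omega; z, y) = \sum_{z\in\mathbf{Z}^d} \overline{A(\omega; z, x)}\; \overline{B(\omega; y, z)} = (BA)^*(\omega; x, y). \]

By Lemma 2.1, the above sums converge absolutely for a.e. $\omega$. (2) By Proposition 2.2,

\[ \mathcal{T}(A^* B) = \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} A^*(\omega; 0, x) B(\omega; x, 0) = \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} \overline{A(\omega; x, 0)}\, B(\omega; x, 0), \]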
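for a.e. $\omega$. Therefore, since $A(\omega; x, 0) = (\pi A)(\omega, x)$, $\mathcal{T}(A^* B) = \langle \pi(A), \pi(B) \rangle_{\tilde{\mathcal{L}}^2}$.

The following ideal property of $\mathcal{L}^2$ will be used frequently in this paper.

Proposition 2.5. (1) Let $A \in \mathcal{L}^2$, $B \in \mathcal{A}$. Since $B \in \mathcal{A} \subset \mathcal{L}^2$ (Remark 2.2), we can define the products $BA$, $AB$ as functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. Then $BA, AB \in \mathcal{L}^2$, and

\[ \|BA\|_{\mathcal{L}^2},\ \|AB\|_{\mathcal{L}^2} \le \sup_{\omega\in\Omega} \|B\|_{op} \|A\|_{\mathcal{L}^2}. \]

Hence the multiplication operators on $\mathcal{L}^2$ defined by $A \mapsto BA$, $A \mapsto AB$ are linear and bounded. (2) The adjoint operators of $A \mapsto BA$, $A \mapsto AB$ are $A \mapsto B^* A$ and $A \mapsto AB^*$, respectively.

Proof. (1) By Lemma 2.1, $(BA)(\omega; x, y)$ and $(AB)(\omega; x, y)$ are finite a.e. It is easy to see that $(BA)(\omega; x, y)$ and $(AB)(\omega; x, y)$ satisfy (CR). Since $\int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} |A(\omega; x, 0)|^2 = \|A\|^2_{\mathcal{L}^2} < \infty$, we have $\sum_{x\in\mathbf{Z}^d} |A(\omega; x, 0)|^2 < \infty$ for a.e. $\omega$, which implies $(A\delta_0)(x) = A(\omega; x, 0) \in l^2(\mathbf{Z}^d)$ w.r.t. $x$ for a.e. $\omega$. For such $\omega \in \Omega$, $(BA)(\omega; x, 0) = (BA\delta_0)(x) \in l^2(\mathbf{Z}^d)$, and

\[ \sum_{x\in\mathbf{Z}^d} |BA(\omega; x, 0)|^2 = \|BA\delta_0\|^2_{l^2(\mathbf{Z}^d)} \le \|B\|^2_{op} \|A\delta_0\|^2_{l^2(\mathbf{Z}^d)} \le \sup_{\omega\in\Omega} \|B\|^2_{op} \sum_{x\in\mathbf{Z}^d} |A(\omega; x, 0)|^2. \tag{2.12} \]

In (2.12), we recall that $B = \{B_\omega\}_{\omega\in\Omega}$, where $B_\omega$ is a bounded operator on $l^2(\mathbf{Z}^d)$ for any fixed $\omega \in \Omega$. Since the RHS of (2.12) belongs to $L^1(\Omega)$, we have $BA \in \mathcal{L}^2$ and $\|BA\|_{\mathcal{L}^2} \le \sup_{\omega\in\Omega} \|B\|_{op} \|A\|_{\mathcal{L}^2}$. By Lemma 2.2, $A^* \in \mathcal{L}^2$ and $\|A^*\|_{\mathcal{L}^2} = \|A\|_{\mathcal{L}^2}$. Since $B \in \mathcal{A}$, $B^*$ is the matrix element of the adjoint operator of $B$, if we regard $B$ as a bounded operator on $l^2(\mathbf{Z}^d)$. Moreover, $B^* \in \mathcal{A}$. By Proposition 2.4, $(AB)^*(\omega; x, y) = (B^* A^*)(\omega; x, y)$ as functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. By what we have just shown, $B^* A^* \in \mathcal{L}^2$. Therefore $(AB)^*$, and hence $AB$, belongs to $\mathcal{L}^2$. Since $\|B\|_{op} = \|B^*\|_{op}$ and $\|A\|_{\mathcal{L}^2} = \|A^*\|_{\mathcal{L}^2}$, we have

\[ \|AB\|_{\mathcal{L}^2} = \|(AB)^*\|_{\mathcal{L}^2} = \|B^* A^*\|_{\mathcal{L}^2} \le \sup_{\omega\in\Omega} \|B^*\|_{op} \|A^*\|_{\mathcal{L}^2} = \sup_{\omega\in\Omega} \|B\|_{op} \|A\|_{\mathcal{L}^2}. \]

To prove Proposition 2.5(2), we prepare the following lemma, which is not a priori trivial.

Lemma 2.3. Let $A, C \in \mathcal{L}^2$, and $B \in \mathcal{A}$. Then, as functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$,

\[ ((AB)C)(\omega; x, y) = (A(BC))(\omega; x, y), \qquad \text{a.e. } \omega. \]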
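Proof. By Proposition 2.3, we can take sequences of bounded operators $\{A_n\}_{n=1}^\infty, \{C_n\}_{n=1}^\infty \subset \mathcal{A}$ which satisfy $A_n \to A$, $C_n \to C$ in $\mathcal{L}^2$ as $n \to \infty$. Then, by Proposition 2.5(1), $A_n B \to AB$, $BC_n \to BC$ in $\mathcal{L}^2$ as $n \to \infty$. Moreover, by the Schwarz inequality and (2.6),

\[ \begin{aligned} \int_\Omega d\mathbf{P}\, |((AB)C)(\omega; x, y)| &\le \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |(AB)(\omega; x, z)||C(\omega; z, y)| \\ &\le \Bigl( \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |(AB)(\omega; x, z)|^2 \Bigr)^{1/2} \Bigl( \int_\Omega d\mathbf{P} \sum_{z\in\mathbf{Z}^d} |C(\omega; z, y)|^2 \Bigr)^{1/2} \\ &= \|AB\|_{\mathcal{L}^2} \|C\|_{\mathcal{L}^2} \le \sup_{\omega\in\Omega} \|B\|_{op} \|A\|_{\mathcal{L}^2} \|C\|_{\mathcal{L}^2}. \end{aligned} \]

Therefore, for any fixed $(x, y) \in \mathbf{Z}^d \times \mathbf{Z}^d$, $((A_n B)C_n)(\omega; x, y) \to ((AB)C)(\omega; x, y)$ as $n \to \infty$ in $L^1(\Omega)$. Since $\mathbf{Z}^d \times \mathbf{Z}^d$ is countable, by a diagonal argument we can take a subsequence $n' = n'(k)$ such that $((A_{n'} B)C_{n'})(\omega; x, y) \to ((AB)C)(\omega; x, y)$ pointwise for a.e. $\omega$, as $k \to \infty$, for any $x, y \in \mathbf{Z}^d$. Similarly, by taking a further subsequence $n'' = n''(n'(k))$, $(A_{n''}(BC_{n''}))(\omega; x, y) \to (A(BC))(\omega; x, y)$ pointwise for a.e. $\omega$, as $k \to \infty$. Therefore, it is sufficient to show $((A_n B)C_n)(\omega; x, y) = (A_n(BC_n))(\omega; x, y)$ pointwise, which is clear. In fact, since $A_n$, $B$, $C_n$ are bounded on $l^2(\mathbf{Z}^d)$,

\[ ((A_n B)C_n)(\omega; x, y) = \langle \delta_x, A_n B C_n \delta_y \rangle_{l^2(\mathbf{Z}^d)} = (A_n(BC_n))(\omega; x, y). \]

Proof of Proposition 2.5(2). Let $A, C \in \mathcal{L}^2$, and $B \in \mathcal{A}$. By Proposition 2.4,

\[ \langle BA, C \rangle_{\mathcal{L}^2} = \mathcal{T}((BA)^* C) = \mathcal{T}((A^* B^*)C). \]

By Proposition 2.2 and Lemma 2.3,

\[ \mathcal{T}((A^* B^*)C) = \int_\Omega d\mathbf{P}\, ((A^* B^*)C)(\omega; 0, 0) = \int_\Omega d\mathbf{P}\, (A^*(B^* C))(\omega; 0, 0) = \mathcal{T}(A^*(B^* C)) = \langle A, B^* C \rangle_{\mathcal{L}^2}. \]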
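Therefore, the adjoint of the bounded linear operator $A \mapsto BA$ on $\mathcal{L}^2$ is the map $A \mapsto B^* A$. The adjoint operator of $A \mapsto AB$ can be derived similarly.

We are going to study unbounded operators on $\mathcal{L}^2$. The following operator corresponds to taking the commutator with $x_1$.

Proposition 2.6. For $A \in \mathcal{L}^2$, we define

\[ (\partial_1 A)(\omega; x, y) := (x_1 - y_1) A(\omega; x, y). \]

Then the operator $A \mapsto \partial_1 A$ is self-adjoint with domain $D(\partial_1) := \{A \in \mathcal{L}^2 : \partial_1 A \in \mathcal{L}^2\}$.

Therefore, Assumption (1.6) is equivalent to the statement that $P_F \in D(\partial_1)$.

Proof. Let $a := \pi(A)$. Then $(\pi(\partial_1 A))(\omega, x) = x_1 a(\omega, x)$. Thus, the operator $A \mapsto \partial_1 A$ corresponds to the multiplication operator by $x_1$ on $\tilde{\mathcal{L}}^2$, which is self-adjoint with domain $\{a \in \tilde{\mathcal{L}}^2 : x_1 a \in \tilde{\mathcal{L}}^2\}$.

Next, we consider an operator on $\mathcal{L}^2$ which corresponds to taking the commutator with $H_{\omega,E}$. The matrix element of $H_{\omega,E}$ is given by

\[ H_{\omega,E}(\omega; x, y) = \begin{cases} 1, & \text{if } |x - y| = 1, \\ \lambda V_\omega(x) + E x_1, & \text{if } x = y, \\ 0, & \text{otherwise}. \end{cases} \]

For $A \in \mathcal{L}^2$, we define $(H_{\omega,E} A)(\omega; x, y)$ and $(A H_{\omega,E})(\omega; x, y)$ as in (2.3):

\[ (H_{\omega,E} A)(\omega; x, y) := \sum_{|z-x|=1} A(\omega; z, y) + (\lambda V_\omega(x) + E x_1) A(\omega; x, y), \]

\[ (A H_{\omega,E})(\omega; x, y) := \sum_{|z-y|=1} A(\omega; x, z) + (\lambda V_\omega(y) + E y_1) A(\omega; x, y). \]

We define

\[ (\mathcal{L}_{H_{\omega,E}} A)(\omega; x, y) := (H_{\omega,E} A)(\omega; x, y) - (A H_{\omega,E})(\omega; x, y), \tag{2.13} \]

which makes sense as a function on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. We note that $(\mathcal{L}_{H_{\omega,E}} A)(\omega; x, y)$ satisfies (CR), though $(H_{\omega,E} A)(\omega; x, y)$ and $(A H_{\omega,E})(\omega; x, y)$ do not. The corresponding element in $\tilde{\mathcal{L}}^2$ is

\[ \pi(\mathcal{L}_{H_{\omega,E}} A)(\omega, x) = \sum_{|z-x|=1} a(\omega, z) + (\lambda V_\omega(x) + E x_1) a(\omega, x) - \Bigl\{ \sum_{|z|=1} a(T^{-z}\omega, x - z) + \lambda V_\omega(0) a(\omega, x) \Bigr\}. \tag{2.14} \]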
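Since $\mathbf{E}|V_\omega(0)|^2 < \infty$, $\mathcal{L}_{H_{\omega,E}} A \in \mathcal{L}^2$ if $A \in \mathcal{A}_0$. Hence $\mathcal{L}_{H_{\omega,E}}$ is densely defined on $\mathcal{L}^2$.

Proposition 2.7. (1) $\mathcal{L}_{H_{\omega,E}}$ is self-adjoint with domain

\[ D(\mathcal{L}_{H_{\omega,E}}) := \{A \in \mathcal{L}^2 : (\lambda(V_\omega(x) - V_\omega(y)) + E(x_1 - y_1)) A(\omega; x, y) \in \mathcal{L}^2\}. \]

(2) $e^{it\mathcal{L}_{H_{\omega,E}}} A = e^{itH_{\omega,E}} A e^{-itH_{\omega,E}}$, $A \in \mathcal{L}^2$.

Remark 2.4. $H_{\omega,E}$ has the following property:

\[ U(a) H_{\omega,E} U(a)^* = H_{T^a\omega,E} - E a_1, \qquad \omega \in \Omega,\ a \in \mathbf{Z}^d, \tag{2.15} \]

which implies

\[ U(a) e^{itH_{\omega,E}} U(a)^* = e^{itH_{T^a\omega,E}} e^{-itEa_1}. \tag{2.16} \]

Thus $H_{\omega,E}$ itself is not covariant. Nevertheless, by using (2.16), we can show that $(e^{itH_{\omega,E}} A e^{-itH_{\omega,E}})(\omega; x, y)$ satisfies (CR) (this is straightforward if $A \in \mathcal{A}$; for $A \in \mathcal{L}^2$, we use Proposition 2.3). By using the argument in the proof of Proposition 2.5, $e^{itH_{\omega,E}} A e^{-itH_{\omega,E}} \in \mathcal{L}^2$ for $A \in \mathcal{L}^2$.

Proof of Proposition 2.7(1). It is easy to see that the operators $a(\omega, x) \mapsto \sum_{|z-x|=1} a(\omega, z)$ and $a(\omega, x) \mapsto \sum_{|z|=1} a(T^{-z}\omega, x - z)$, which appear in (2.14), are bounded on $\tilde{\mathcal{L}}^2$. The multiplication operator $a(\omega, x) \mapsto (\lambda(V_\omega(x) - V_\omega(0)) + E x_1) a(\omega, x)$ is clearly self-adjoint with domain

\[ D := \{a \in \tilde{\mathcal{L}}^2 : (\lambda(V_\omega(x) - V_\omega(0)) + E x_1) a(\omega, x) \in \tilde{\mathcal{L}}^2\}. \]

Therefore, $\mathcal{L}_{H_{\omega,E}}$ is self-adjoint with domain $\pi^{-1} D = D(\mathcal{L}_{H_{\omega,E}})$.

To prove Proposition 2.7(2), we prepare the following lemma. Let $D(H_{\omega,E})$ be the domain of the self-adjoint operator $H_{\omega,E}$ on $l^2(\mathbf{Z}^d)$, given by $D(H_{\omega,E}) := \{\varphi \in l^2(\mathbf{Z}^d) : (\lambda V_\omega(x) + E x_1)\varphi \in l^2(\mathbf{Z}^d)\}$.

Lemma 2.4. Suppose $\varphi \in D(H_{\omega,E})$ for any $\omega \in \Omega$, and $A \in \mathcal{A}_0$. Then $A\varphi \in D(H_{\omega,E})$ for any $\omega \in \Omega$.

Proof. We have only to show

\[ (\lambda V_\omega(x) + E x_1)(A\varphi)(x) = (\lambda V_\omega(x) + E x_1) \sum_{y\in\mathbf{Z}^d} A(\omega; x, y)\varphi(y) \in l^2(\mathbf{Z}^d), \]

for any $\omega \in \Omega$. Since $A \in \mathcal{A}_0$, there exist constants $C > 0$, $K > 0$ such that $|A(\omega; x, y)| \le C$, and $A(\omega; x, y) = 0$ if $|x - y| > K$. Then

\[ \Bigl| (\lambda V_\omega(x) + E x_1) \sum_{y\in\mathbf{Z}^d} A(\omega; x, y)\varphi(y) \Bigr| \le C |\lambda V_\omega(x) + E x_1| \sum_{|z|\le K} |\varphi(x + z)|. \]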
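We will show $|(\lambda V_\omega(x) + E x_1)\varphi(x + z)| \in l^2(\mathbf{Z}^d)$ w.r.t. $x$ for any fixed $z$ ($|z| \le K$), which completes the proof. In fact, letting $y = x + z$,

\[ |(\lambda V_\omega(x) + E x_1)\varphi(x + z)| = |\lambda V_{T^{-x+y}\omega}(y) + E y_1 + E(x_1 - y_1)||\varphi(y)| \le |\lambda V_{T^z\omega}(y) + E y_1||\varphi(y)| + |E z_1||\varphi(y)|. \]

By assumption, the first term of the RHS in the above expression belongs to $l^2(\mathbf{Z}^d)$ for any $\omega \in \Omega$.

Proof of Proposition 2.7(2). Let $U(t)A := e^{itH_{\omega,E}} A e^{-itH_{\omega,E}}$, $A \in \mathcal{L}^2$. By Proposition 2.5 and Remark 2.4, $U(t)$ is uniformly bounded on $\mathcal{L}^2$ w.r.t. $t \in \mathbf{R}$. Thus, it suffices to show $e^{it\mathcal{L}_{H_{\omega,E}}} A = U(t)A$ for $A \in \mathcal{A}_0$. Let $\varphi \in D(H_{\omega,E})$ for any $\omega \in \Omega$, and $A \in \mathcal{A}_0$. Then, by Lemma 2.4,

\[ U(t)A\varphi - A\varphi = \int_0^t ds\, e^{isH_{\omega,E}}\, i[H_{\omega,E}, A]\, e^{-isH_{\omega,E}} \varphi \tag{2.17} \]

as elements of $l^2(\mathbf{Z}^d)$. We let $\varphi = \delta_y \in D(H_{\omega,E})$, and take the $l^2(\mathbf{Z}^d)$-inner product of (2.17) with $\delta_x$ ($x, y \in \mathbf{Z}^d$). Then we have

\[ U(t)A - A = \int_0^t ds\, e^{isH_{\omega,E}}\, i[H_{\omega,E}, A]\, e^{-isH_{\omega,E}} \tag{2.18} \]

as elements of $\mathcal{L}^2$, since the LHS of (2.18) belongs to $\mathcal{L}^2$. We will show that $U(t)$ is strongly continuous on $\mathcal{L}^2$. Let $A \in \mathcal{A}_0$. By (2.18),

\[ \begin{aligned} \langle (U(t) - I)A, (U(t) - I)A \rangle_{\mathcal{L}^2} &= \Bigl\langle \int_0^t ds\, e^{isH_{\omega,E}}\, i[H_{\omega,E}, A]\, e^{-isH_{\omega,E}},\ \int_0^t du\, e^{iuH_{\omega,E}}\, i[H_{\omega,E}, A]\, e^{-iuH_{\omega,E}} \Bigr\rangle_{\mathcal{L}^2} \\ &= \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} \int_0^t ds \int_0^t du\, \overline{(V(s))(\omega; x, 0)}\, (V(u))(\omega; x, 0), \end{aligned} \tag{2.19} \]

where $(V(s))(\omega; x, y) := (e^{isH_{\omega,E}}\, i[H_{\omega,E}, A]\, e^{-isH_{\omega,E}})(\omega; x, y)$. By Proposition 2.5, $V(s)$ is bounded in $\mathcal{L}^2$ uniformly w.r.t. $s \in \mathbf{R}$. Thus, we can use Fubini's theorem:

\[ \text{RHS of (2.19)} = \int_0^t ds \int_0^t du \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} \overline{(V(s))(\omega; x, 0)}\, (V(u))(\omega; x, 0) = \int_0^t ds \int_0^t du\, \langle V(s), V(u) \rangle_{\mathcal{L}^2}. \]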
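Since $\mathcal{A}_0$ is dense in $\mathcal{L}^2$, we conclude that $U(t)$ is strongly continuous on $\mathcal{L}^2$. By (2.18),

\[ U(t)A - A = \int_0^t ds\, U(s)(i[H_{\omega,E}, A]). \]

If $A \in \mathcal{A}_0$, then $i[H_{\omega,E}, A] \in \mathcal{L}^2$, so that the integrand on the RHS of the above expression is continuous in $\mathcal{L}^2$ w.r.t. $s \in \mathbf{R}$. Hence, for $A \in \mathcal{A}_0$, $U(t)A$ is differentiable, and we have

\[ \frac{d}{dt} U(t)A = U(t)(i[H_{\omega,E}, A]). \]

By Lemma 2.4, it is easy to see

\[ i\mathcal{L}_{H_{\omega,E}}(U(t)A)(\omega; x, y) = U(t)(i[H_{\omega,E}, A])(\omega; x, y) \in \mathcal{L}^2 \]

for $A \in \mathcal{A}_0$, and hence we have $U(t)A \in D(\mathcal{L}_{H_{\omega,E}})$ and

\[ \frac{d}{dt} U(t)A = i\mathcal{L}_{H_{\omega,E}}(U(t)A), \qquad A \in \mathcal{A}_0. \]

Therefore, by uniqueness of the solution of the Schrödinger equation with a time-independent generator,

\[ U(t)A = e^{it\mathcal{L}_{H_{\omega,E}}} A, \qquad A \in \mathcal{A}_0, \]

as desired.

Proposition 2.8. For $f(\lambda) = f_{\beta,\mu}(\lambda)$, or $\chi_{(-\infty,\mu]}(\lambda)$,

\[ \mathcal{T}(f(H_\omega)\partial_1 H_\omega) = 0, \qquad \text{a.e. } \omega. \tag{2.20} \]

Proof. We divide the proof into several steps.

Step 1. We will show

\[ \mathcal{T}(e^{-itH_\omega}\partial_1 H_\omega) = 0, \qquad \text{a.e. } \omega. \tag{2.21} \]

Let $\nabla_n$ be a bounded operator on $\tilde{\mathcal{L}}^2$ defined by

\[ (\nabla_n a)(\omega, x) := \begin{cases} x_1 a(\omega, x), & \text{if } |x_1| \le n, \\ (\operatorname{sgn} x_1)\, n\, a(\omega, x), & \text{otherwise}. \end{cases} \]

In $\mathcal{L}^2$, $\nabla_n$ corresponds to the following operator, which we also denote by $\nabla_n$:

\[ (\nabla_n A)(\omega; x, y) = \begin{cases} (x_1 - y_1) A(\omega; x, y), & \text{if } |x_1 - y_1| \le n, \\ (\operatorname{sgn}(x_1 - y_1))\, n\, A(\omega; x, y), & \text{otherwise}. \end{cases} \]

Then $e^{isH_\omega}\nabla_n(e^{-isH_\omega}) \in \mathcal{L}^2$, and

\[ (e^{isH_\omega}\nabla_n(e^{-isH_\omega}))(\omega; x, y) = \langle \delta_x, e^{isH_\omega}\nabla_n(e^{-isH_\omega})\delta_y \rangle_{l^2(\mathbf{Z}^d)}. \]

Since $\nabla_n$ corresponds to a multiplication operator by a bounded function, and $\delta_y \in D(H_\omega)$ for any $\omega \in \Omega$, it is easy to show $(\nabla_n(e^{-isH_\omega})\delta_y)(x) \in D(H_\omega)$, for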
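any $\omega \in \Omega$. Therefore,

\[ (e^{itH_\omega}\nabla_n(e^{-itH_\omega}))(\omega; x, y) = i \int_0^t ds\, \langle \delta_x, e^{isH_\omega}\{H_\omega(\nabla_n(e^{-isH_\omega})) - \nabla_n(H_\omega e^{-isH_\omega})\}\delta_y \rangle_{l^2(\mathbf{Z}^d)}. \]

Because both sides in the above equality belong to $l^2(\mathbf{Z}^d)$ w.r.t. $x$, we have

\[ \nabla_n(e^{-itH_\omega}) = i\, e^{-itH_\omega} \int_0^t ds\, e^{isH_\omega}\{H_\omega\nabla_n(e^{-isH_\omega}) - \nabla_n(H_\omega e^{-isH_\omega})\}. \tag{2.22} \]

We take the trace per volume on both sides. By Proposition 2.2,

\[ \mathcal{T}(\nabla_n(e^{-itH_\omega})) = 0, \qquad \text{a.e. } \omega. \tag{2.23} \]

On the other hand, in the proof of Lemma 3.1 in Sec. 3, we will show that

(1) the $\mathcal{L}^2$-norm of $\{H_\omega\nabla_n(e^{-isH_\omega}) - \nabla_n(H_\omega e^{-isH_\omega})\}$ is bounded uniformly in $s$ (3.5);
(2) $\{H_\omega\nabla_n(e^{-isH_\omega}) - \nabla_n(H_\omega e^{-isH_\omega})\} \to -(\partial_1 H_\omega)e^{-isH_\omega}$ in $\mathcal{L}^2$ as $n \to \infty$ (Remark 3.1).

By (1), we can use Fubini's theorem and conclude that

\[ \begin{aligned} \mathcal{T}\Bigl( e^{-itH_\omega} \int_0^t ds\, e^{isH_\omega}\{H_\omega\nabla_n(e^{-isH_\omega}) - \nabla_n(H_\omega e^{-isH_\omega})\} \Bigr) &= \int_0^t ds\, \mathcal{T}\bigl( e^{-i(t-s)H_\omega}\{H_\omega\nabla_n(e^{-isH_\omega}) - \nabla_n(H_\omega e^{-isH_\omega})\} \bigr) \\ &\to -\int_0^t ds\, \mathcal{T}(e^{-itH_\omega}(\partial_1 H_\omega)), \qquad \text{as } n \to \infty. \end{aligned} \tag{2.24} \]

In the last step, we used (2) above and Proposition 2.2. By (2.22), (2.23), and (2.24), we have (2.21).

Step 2. For $z \in \mathbf{C}$, $\operatorname{Im} z \ne 0$, we will show

\[ \mathcal{T}((H_\omega - z)^{-1}\partial_1 H_\omega) = 0, \qquad \text{a.e. } \omega. \tag{2.25} \]

If $z = a + i\lambda$ with $\lambda > 0$,

\[ \mathcal{T}((H_\omega - a - i\lambda)^{-1}\partial_1 H_\omega) = i \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} \int_0^\infty dt\, e^{-\lambda t} e^{ita} (e^{-itH_\omega})(\omega; 0, x)(\partial_1 H_\omega)(\omega; x, 0). \tag{2.26} \]

Since $e^{-itH_\omega} \in \mathcal{A}$ and $\partial_1 H_\omega \in \mathcal{A}_0$, we can use Fubini's theorem to exchange $\int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d}$ with $\int_0^\infty dt$:

\[ \text{RHS of (2.26)} = i \int_0^\infty dt\, e^{-\lambda t} e^{ita}\, \mathcal{T}(e^{-itH_\omega}\partial_1 H_\omega) = 0. \]

The proof of (2.25) for $\lambda < 0$ is similar.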
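Step 3 (completion). We use the following formula, which is norm convergent:

\[ f_{\beta,\mu}(H_\omega) = \frac{1}{2\pi i} \int_{\mathbf{C}} (\partial_{\bar z}\tilde{f}_{\beta,\mu}(z))(H_\omega - z)^{-1}\, dz\, d\bar z, \tag{2.27} \]

where $\tilde{f}_{\beta,\mu}(z)$ is the almost analytic continuation of $f_{\beta,\mu}$, and $(\partial_{\bar z} f)(x + iy) = \frac{1}{2}(\partial_x f + i\partial_y f)(x + iy)$ for $z = x + iy$ ([11]). Then, by (2.25),

\[ \mathcal{T}(f_{\beta,\mu}(H_\omega)\partial_1 H_\omega) = 0, \qquad \text{a.e. } \omega, \]

and thus we have (2.20) for $f(\lambda) = f_{\beta,\mu}(\lambda)$. Since $f_{\beta,\mu}(\lambda) \to \chi_{(-\infty,\mu]}(\lambda)$ pointwise for $\lambda \ne \mu$ as $\beta \to \infty$, and the integrated density of states is continuous,

\[ \mathcal{T}((f_{\beta,\mu}(H_\omega) - P_\mu)^2) = \int_{\mathbf{R}} (f_{\beta,\mu}(\lambda) - \chi_{(-\infty,\mu]}(\lambda))^2\, dN(\lambda) \to 0, \qquad \text{as } \beta \uparrow \infty, \]

where $dN(\lambda)$ is the density of states measure. By the Schwarz inequality,

\[ 0 = \mathcal{T}(f_{\beta,\mu}(H_\omega)\partial_1 H_\omega) \to \mathcal{T}(P_\mu \partial_1 H_\omega), \qquad \text{as } \beta \to \infty, \]

and we obtain (2.20) for $\chi_{(-\infty,\mu]}(\lambda)$.

Remark 2.5. When we impose a constant magnetic field, the analysis in this paper goes through with suitable modifications of the definitions described below. Let $B := \{B_{ij}\}_{1\le i,j\le d}$ be a $d \times d$ antisymmetric matrix whose components are given by $B_{ij} = -B_{ji} \in \mathbf{R}$ if $i < j$, and $0$ if $i = j$. $B_{ij}$ corresponds to the magnetic flux penetrating plaquettes in the $(i, j)$-planes.

(1) The translation operator $U(a)$ defined after (2.1) should be replaced by the magnetic translation given by

\[ (U_B(a)\psi)(x) := e^{\frac{i}{2}\langle a, Bx\rangle_{\mathbf{R}^d}} \psi(x - a), \qquad x, a \in \mathbf{Z}^d. \]

(2) The definition of (CR) in (2.2) should be replaced by

\[ H(\omega; x, y) = e^{\frac{i}{2}\langle a, B(x-y)\rangle_{\mathbf{R}^d}} H(T^{-a}\omega; x - a, y - a), \qquad x, y, a \in \mathbf{Z}^d. \]

(3) $\pi^{-1} : \tilde{\mathcal{L}}^2 \to \mathcal{L}^2$ in Proposition 2.1, and the product of $A, B \in \mathcal{L}^2$ represented in $\tilde{\mathcal{L}}^2$ in (2.4), should be given by the following quantities, respectively:

\[ (\pi^{-1} a)(\omega; x, y) = e^{\frac{i}{2}\langle y, Bx\rangle_{\mathbf{R}^d}} a(T^{-y}\omega, x - y), \]

\[ \pi(AB)(\omega, x) = \sum_{z\in\mathbf{Z}^d} e^{\frac{i}{2}\langle z, Bx\rangle_{\mathbf{R}^d}} a(T^{-z}\omega, x - z)\, b(\omega, z). \]

3. Proof of Theorem 1.1

In this section, we prove Theorem 1.1. The following proposition is important for studying the $E \downarrow 0$ limit.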
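Proposition 3.1. Let $A \in D(\partial_1)$,

\[ D(\partial_1) := \Bigl\{ A \in \mathcal{L}^2 : \int_\Omega d\mathbf{P} \sum_{x\in\mathbf{Z}^d} |x_1|^2 |A(\omega; x, 0)|^2 < \infty \Bigr\}. \]

Then we have the following resolvent equation:

\[ (z - \mathcal{L}_{H_{\omega,E}})^{-1} A - (z - \mathcal{L}_{H_\omega})^{-1} A = (z - \mathcal{L}_{H_{\omega,E}})^{-1} E\partial_1 (z - \mathcal{L}_{H_\omega})^{-1} A, \qquad \operatorname{Im} z \ne 0. \tag{3.1} \]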
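Formally, (3.1) is the second resolvent identity for $\mathcal{L}_{H_{\omega,E}} = \mathcal{L}_{H_\omega} + E\partial_1$; the content of the proposition is that it holds on $D(\partial_1)$ although $\partial_1$ is unbounded. The following sketch checks only the algebraic identity in a finite-dimensional toy setting. The matrices `L0` and `D` are hypothetical stand-ins for $\mathcal{L}_{H_\omega}$ and $\partial_1$, not objects taken from the paper, and the helper functions are written from scratch for the illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (complex entries allowed)."""
    n = len(M)
    aug = [list(map(complex, row)) + [complex(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(L, z):
    """(z - L)^{-1} for a square matrix L."""
    n = len(L)
    return inverse([[(z if i == j else 0) - L[i][j] for j in range(n)]
                    for i in range(n)])

# Hypothetical stand-ins: L0 (symmetric) for L_{H_omega}, D (diagonal) for d_1.
L0 = [[0.3, 1.0, 0.0], [1.0, -0.7, 1.0], [0.0, 1.0, 0.5]]
D = [[-1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
E, z = 0.2, 0.4 + 1.0j
LE = [[L0[i][j] + E * D[i][j] for j in range(3)] for i in range(3)]

R_E, R_0 = resolvent(LE, z), resolvent(L0, z)
lhs = [[R_E[i][j] - R_0[i][j] for j in range(3)] for i in range(3)]
rhs = [[E * v for v in row] for row in matmul(matmul(R_E, D), R_0)]
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
```

In this toy setting `max_error` is at the level of floating-point rounding; the infinite-dimensional statement additionally needs the domain considerations of Lemma 3.1.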
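Before we prove Proposition 3.1, we prepare the following lemma.

Lemma 3.1. (1) Let $C \in \mathcal{L}^2$. Then the $\mathcal{L}^2$-norm of $[\mathcal{L}_{H_\omega}, \nabla_n]C$, which is defined by

\[ [\mathcal{L}_{H_\omega}, \nabla_n]C := [H_\omega, \nabla_n C] - \nabla_n([H_\omega, C]), \]

is bounded uniformly w.r.t. $n$ ($\nabla_n$ is defined in the proof of Proposition 2.8). (2) $\partial_1(z - \mathcal{L}_{H_\omega})^{-1} A \in \mathcal{L}^2$ if $A \in D(\partial_1)$, $\operatorname{Im} z \ne 0$.

Proof. (1) Let $c := \pi(C)$, $h := \pi(H_\omega)$. Then $h(\omega, x)$ is given by

\[ h(\omega, x) = \begin{cases} 1, & \text{if } |x| = 1, \\ \lambda V_\omega(0), & \text{if } x = 0, \\ 0, & \text{otherwise}. \end{cases} \]

By (2.4), the product is given by

\[ hc = \pi(H_\omega C) = \sum_{z\in\mathbf{Z}^d} h(T^{-z}\omega, x - z)\, c(\omega, z). \]

First of all, we compute $(h\nabla_n c)(\omega, x) - (\nabla_n(hc))(\omega, x)$. By the definition of $\nabla_n$, we have

\[ (h\nabla_n c)(\omega, x) = \sum_{z\in\mathbf{Z}^d,\, |z_1|\le n} h(T^{-z}\omega, x - z)\, z_1\, c(\omega, z) + \sum_{z\in\mathbf{Z}^d,\, |z_1|>n} h(T^{-z}\omega, x - z)(\operatorname{sgn} z_1)\, n\, c(\omega, z), \tag{3.2} \]

\[ (\nabla_n(hc))(\omega, x) = \begin{cases} x_1 \displaystyle\sum_{z\in\mathbf{Z}^d} h(T^{-z}\omega, x - z)\, c(\omega, z), & \text{if } |x_1| \le n, \\ (\operatorname{sgn} x_1)\, n \displaystyle\sum_{z\in\mathbf{Z}^d} h(T^{-z}\omega, x - z)\, c(\omega, z), & \text{if } |x_1| > n. \end{cases} \tag{3.3} \]

$h(T^{-z}\omega, x - z) \ne 0$ only if $|x - z| \le 1$, and thus the sums in (3.2), (3.3) are finite sums and make sense for each $(\omega, x) \in \Omega \times \mathbf{Z}^d$. We divide into several cases.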
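Case A ($|x_1| \le n$). In this case, we have
\[ (h\nabla_n c)(\omega, x) - (\nabla_n(hc))(\omega, x) = \sum_{|z_1| \le n} (z_1 - x_1)\, h(T^{-z}\omega, x-z)\, c(\omega, z) + \sum_{|z_1| > n} ((\mathrm{sgn}\, z_1)\, n - x_1)\, h(T^{-z}\omega, x-z)\, c(\omega, z) . \tag{3.4} \]
The second term of the RHS is nonzero only if $x_1 = n$, $z_1 = n+1$, or $x_1 = -n$, $z_1 = -(n+1)$. In either case, $(\mathrm{sgn}\, z_1)\, n - x_1 = 0$. Therefore, if $|x_1| \le n$, we have
\[ |(h\nabla_n c)(\omega, x) - (\nabla_n(hc))(\omega, x)| \le |c(\omega, x + e_1)| + |c(\omega, x - e_1)| , \tag{3.5} \]
where $e_1 := (1, 0, \ldots, 0) \in \mathbf{Z}^d$.

Case B ($|x_1| > n$). In this case, we have
\[ (h\nabla_n c)(\omega, x) - (\nabla_n(hc))(\omega, x) = \sum_{z \in \mathbf{Z}^d,\ |z_1| \le n} (z_1 - (\mathrm{sgn}\, x_1)\, n)\, h(T^{-z}\omega, x-z)\, c(\omega, z) + \sum_{z \in \mathbf{Z}^d,\ |z_1| > n} ((\mathrm{sgn}\, z_1) - (\mathrm{sgn}\, x_1))\, n\, h(T^{-z}\omega, x-z)\, c(\omega, z) . \]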
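The first term of the RHS in the above expression vanishes for the same reason as in Case A. The second term is nonzero only if $|x_1| > n$, $|z_1| > n$, and $|x - z| \le 1$, in which case $\mathrm{sgn}\, x_1 = \mathrm{sgn}\, z_1$. Thus, the second term also vanishes.

Next, we compute $((\nabla_n c)h)(\omega, x) - (\nabla_n(ch))(\omega, x)$. Each term is given by
\[ ((\nabla_n c)h)(\omega, x) = \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| \le n} (x_1 - z_1)\, c(T^{-z}\omega, x-z)\, h(\omega, z) + \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| > n} n\, \mathrm{sgn}(x_1 - z_1)\, c(T^{-z}\omega, x-z)\, h(\omega, z) , \]
\[ (\nabla_n(ch))(\omega, x) = \begin{cases} x_1 \displaystyle\sum_{z \in \mathbf{Z}^d} c(T^{-z}\omega, x-z)\, h(\omega, z) , & \text{if } |x_1| \le n , \\ n\, (\mathrm{sgn}\, x_1) \displaystyle\sum_{z \in \mathbf{Z}^d} c(T^{-z}\omega, x-z)\, h(\omega, z) , & \text{if } |x_1| > n . \end{cases} \]

Case C ($|x_1| \le n$)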
In this case, we have
\[ ((\nabla_n c)h)(\omega, x) - (\nabla_n(ch))(\omega, x) = \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| \le n} (-z_1)\, c(T^{-z}\omega, x-z)\, h(\omega, z) + \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| > n} (n\, \mathrm{sgn}(x_1 - z_1) - x_1)\, c(T^{-z}\omega, x-z)\, h(\omega, z) . \]
The second term is nonzero only if $x_1 = -n$, $z_1 = 1$, or $x_1 = n$, $z_1 = -1$. In either case, the second term vanishes. Hence, if $|x_1| \le n$, we have
\[ |((\nabla_n c)h)(\omega, x) - (\nabla_n(ch))(\omega, x)| \le \sum_{|z| = 1} |c(T^{-z}\omega, x-z)| . \tag{3.6} \]
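Case D ($|x_1| > n$). In this case, we have
\[ ((\nabla_n c)h)(\omega, x) - (\nabla_n(ch))(\omega, x) = \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| \le n} ((x_1 - z_1) - (\mathrm{sgn}\, x_1)\, n)\, c(T^{-z}\omega, x-z)\, h(\omega, z) + \sum_{z \in \mathbf{Z}^d,\ |x_1 - z_1| > n} n\, (\mathrm{sgn}(x_1 - z_1) - \mathrm{sgn}\, x_1)\, c(T^{-z}\omega, x-z)\, h(\omega, z) . \]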
By an argument similar to that in Cases B and C, we see that the RHS in the above expression vanishes. By (3.5), (3.6), we conclude that $[L_{H_\omega}, \nabla_n]C$ is uniformly bounded in $L^2$ w.r.t. $n$.

(2) Let $C := (z - L_{H_\omega})^{-1}A$, for $A \in D(\partial_1)$, $\mathrm{Im}\, z \ne 0$. Then $C \in D(L_{H_\omega})$, and as discussed in the proof of Proposition 2.8, $\nabla_n C \in D(L_{H_\omega})$. Then,
\[ \begin{aligned} \nabla_n (z - L_{H_\omega})^{-1}A = \nabla_n C &= (z - L_{H_\omega})^{-1}(z - L_{H_\omega})\nabla_n C \\ &= (z - L_{H_\omega})^{-1}\nabla_n((z - L_{H_\omega})C) + (z - L_{H_\omega})^{-1}[(z - L_{H_\omega}), \nabla_n]C \\ &= (z - L_{H_\omega})^{-1}\nabla_n A - (z - L_{H_\omega})^{-1}[L_{H_\omega}, \nabla_n]C . \end{aligned} \]
By (1) of Lemma 3.1, the RHS in the above expression is uniformly bounded in $L^2$ w.r.t. $n$. On the other hand, by the monotone convergence theorem,
\[ \|\partial_1 C\|_{L^2}^2 = \lim_{n \to \infty} \|\nabla_n C\|_{L^2}^2 < \infty . \]
Therefore, we have $\|\partial_1 (z - L_{H_\omega})^{-1}A\|_{L^2} < \infty$, and $\partial_1 (z - L_{H_\omega})^{-1}A \in L^2$. Lemma 3.1 is proved.
Remark 3.1. (3.4) shows h(∇n c) − ∇n (hc) → −(∂1 h) c , in L˜2 as n → ∞, which we needed in the proof of Proposition 2.8.
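The bound (3.5) can be checked numerically in a toy setting. The sketch below assumes a one-dimensional, translation-invariant kernel (no randomness, constant on-site value `v`) and a finitely supported `c`; the names `phi`, `h`, and `commutator_defect` are illustrative and not part of the paper.

```python
import random

def phi(n, t):
    """Truncated coordinate used by nabla_n: t if |t| <= n, sgn(t) * n otherwise."""
    return max(-n, min(n, t))

def h(x, v=0.7):
    """Toy kernel (assumption): 1 at |x| = 1, constant v at x = 0, 0 otherwise."""
    if abs(x) == 1:
        return 1.0
    return v if x == 0 else 0.0

def commutator_defect(c, n, x):
    """(h nabla_n c)(x) - (nabla_n (h c))(x) for a finitely supported c given as a dict."""
    left = sum(h(x - z) * phi(n, z) * cz for z, cz in c.items())
    right = phi(n, x) * sum(h(x - z) * cz for z, cz in c.items())
    return left - right

random.seed(0)
c = {z: random.uniform(-1.0, 1.0) for z in range(-12, 13)}

# Largest amount by which |defect| exceeds |c(x+1)| + |c(x-1)| over all tested n and x.
max_violation = max(
    abs(commutator_defect(c, n, x)) - (abs(c.get(x + 1, 0.0)) + abs(c.get(x - 1, 0.0)))
    for n in range(1, 20)
    for x in range(-15, 16)
)
```

In this toy case the defect equals the sum over neighbours $z = x \pm 1$ of $(\varphi_n(z) - \varphi_n(x))\, c(z)$, and the truncated coordinate changes by at most 1 between neighbours, so `max_violation` should not exceed zero beyond rounding, for every `n`.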
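Proof of Proposition 3.1. Let $A \in D(\partial_1)$, $\mathrm{Im}\, z \ne 0$. Then,
\[ \begin{aligned} (z - L_{H_{\omega,E}})^{-1}A - (z - L_{H_\omega})^{-1}A &= (z - L_{H_{\omega,E}})^{-1}(z - L_{H_\omega})(z - L_{H_\omega})^{-1}A - (z - L_{H_{\omega,E}})^{-1}(z - L_{H_\omega} - E\partial_1)(z - L_{H_\omega})^{-1}A \\ &= (z - L_{H_{\omega,E}})^{-1} E\partial_1 (z - L_{H_\omega})^{-1}A . \end{aligned} \]
In the second step, we used Lemma 3.1.

From now on, we will carry out each operation in the definition of $\sigma_2(\omega, F)$.

Lemma 3.2. If $P_F$ satisfies Assumption (1.6),
\[ \lim_{E \downarrow 0} \frac{1}{E} \int_0^\infty dt\, \delta e^{-t\delta}\, T(e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} P_F) = i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P_F \rangle_{L^2} . \]

Proof. By Proposition 2.7, $e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} = -e^{itL_{H_{\omega,E}}}(i\partial_1 H_\omega)$, and by Proposition 2.5 and the fact that the identity operator on $l^2(\mathbf{Z}^d)$ belongs to $L^2$, $P_F \in L^2$. Then,
\[ \begin{aligned} \int_0^\infty dt\, \delta e^{-t\delta}\, T(e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} P_F) &= -\int_0^\infty dt\, \delta e^{-t\delta} \langle e^{itL_{H_{\omega,E}}}(i\partial_1 H_\omega), P_F \rangle_{L^2} \\ &= (-\delta)\langle (-i\delta - L_{H_{\omega,E}})^{-1}\partial_1 H_\omega , P_F \rangle_{L^2} \\ &= (-\delta)\langle \partial_1 H_\omega , (i\delta - L_{H_{\omega,E}})^{-1} P_F \rangle_{L^2} . \end{aligned} \]
By Assumption (1.6), $P_F \in D(\partial_1)$. Then, by Proposition 3.1,
\[ = (-\delta)\langle \partial_1 H_\omega , (i\delta - L_{H_\omega})^{-1} P_F \rangle_{L^2} + (-\delta)E \langle \partial_1 H_\omega , (i\delta - L_{H_{\omega,E}})^{-1}\partial_1 (i\delta - L_{H_\omega})^{-1} P_F \rangle_{L^2} =: I + II . \]
For a bounded Borel measurable function $f(\lambda)$ on $\mathbf{R}$, by definition, it is easy to see that $H_\omega f(H_\omega)(\omega; x, y) = f(H_\omega) H_\omega(\omega; x, y)$, and thus $L_{H_\omega} f(H_\omega) = 0$. Hence,
\[ P_F = (i\delta)^{-1}(i\delta - L_{H_\omega}) P_F . \tag{3.7} \]
By Proposition 2.8 and (3.7),
\[ I = i\langle \partial_1 H_\omega , P_F \rangle_{L^2} = 0 , \]
\[ II = iE\langle \partial_1 H_\omega , (i\delta - L_{H_{\omega,E}})^{-1}\partial_1 P_F \rangle_{L^2} = iE\langle (-i\delta - L_{H_{\omega,E}})^{-1}\partial_1 H_\omega , \partial_1 P_F \rangle_{L^2} . \]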
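Therefore,
\[ \frac{1}{E} \int_0^\infty dt\, \delta e^{-t\delta}\, T(e^{itH_{\omega,E}}\, i[H_\omega, x_1]\, e^{-itH_{\omega,E}} P_F) = i\langle (-i\delta - L_{H_{\omega,E}})^{-1}\partial_1 H_\omega , \partial_1 P_F \rangle_{L^2} . \tag{3.8} \]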
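Since $\partial_1 H_\omega \in \mathcal{A}_0 \subset D(\partial_1)$, we use Proposition 3.1 again:
\[ \text{RHS of (3.8)} = i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P_F \rangle_{L^2} + iE\langle (-i\delta - L_{H_{\omega,E}})^{-1}\partial_1 (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P_F \rangle_{L^2} . \]
For fixed $\delta > 0$, the second term goes to zero as $E \downarrow 0$. Hence, we have Lemma 3.2.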
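The resolvent equation (3.1), used twice in the proof above, has the same algebraic form as the second resolvent identity for matrices. The following sketch checks that identity for 2 × 2 matrices; the matrices `A0` and `B` are arbitrary stand-ins (assumptions) for $L_{H_\omega}$ and $\partial_1$, and all helper names are illustrative.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def resolvent(z, m):
    """(z - m)^{-1} for a 2x2 matrix m."""
    return inv2([[z - m[0][0], -m[0][1]], [-m[1][0], z - m[1][1]]])

A0 = [[0.3, 1.0], [1.0, -0.5]]   # stand-in for the unperturbed generator (assumption)
B = [[1.0, 0.0], [0.0, -1.0]]    # stand-in for the perturbation direction (assumption)
E = 0.2
z = 0.4 + 1.0j                   # Im z != 0 keeps both resolvents well defined

A_E = [[A0[i][j] + E * B[i][j] for j in range(2)] for i in range(2)]
R_E, R_0 = resolvent(z, A_E), resolvent(z, A0)

lhs = [[R_E[i][j] - R_0[i][j] for j in range(2)] for i in range(2)]
EB = [[E * B[i][j] for j in range(2)] for i in range(2)]
rhs = mat_mul(mat_mul(R_E, EB), R_0)

resolvent_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

In finite dimensions the identity is purely algebraic. The extra work in Proposition 3.1 is the domain question, which Lemma 3.1 settles by showing that $\partial_1 (z - L_{H_\omega})^{-1}A$ lies in $L^2$.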
Remark 3.2. The equality $I = 0$ corresponds to the fact that $\sigma_2(\omega, F) = 0$ if $E = 0$.

The following lemma would be useful for studying finite temperature properties of charge transport, though we will not use it in the proof of Theorems 1.1 and 1.2.

Lemma 3.3. If $f_{\beta,F}(H_\omega)$ satisfies the decay estimate
\[ \mathbf{E}|f_{\beta,F}(H_\omega)(\omega; x, y)| \le C\langle x - y \rangle^{-\gamma} , \tag{3.9} \]
for some constant $C > 0$ independent of $x, y \in \mathbf{Z}^d$ and $\beta > 0$, and some $\gamma > d + 2$ ($\langle x \rangle := 1 + |x|$), then we have
\[ f_{\beta,F}(H_\omega) \to P_F , \qquad \partial_1 f_{\beta,F}(H_\omega) \to \partial_1 P_F , \]
in $L^2$, as $\beta \to \infty$.

Proof. Since $\mathbf{P}\{F \text{ is an eigenvalue of } H_\omega\} = 0$,
\[ f_{\beta,F}(H_\omega)(\omega; x, y) \to P_F(\omega; x, y) , \]
as $\beta \to \infty$, for a.e. $\omega$. Then, by the dominated convergence theorem,
\[ \int_\Omega d\mathbf{P}\, |f_{\beta,F}(H_\omega)(\omega; x, 0) - P_F(\omega; x, 0)|^2 \to 0 , \]
as $\beta \to \infty$, for any $x \in \mathbf{Z}^d$. Then, by (3.9) and the dominated convergence theorem again, we arrive at the conclusion.

To study the $\delta \downarrow 0$ limit, we prepare a series of lemmas, borrowing the arguments in [2]. For the sake of simplicity, we write $P$ instead of $P_F$.

Lemma 3.4. If $P \in D(\partial_1)$, we have
\[ \partial_1 P = P(\partial_1 P)Q + Q(\partial_1 P)P , \]
where $Q := 1 - P$.
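Proof. Since $P^2 = P$, we have
\[ (\partial_1 P)(\omega; x, y) = \sum_{z \in \mathbf{Z}^d} (x_1 - z_1) P(\omega; x, z) P(\omega; z, y) + \sum_{z \in \mathbf{Z}^d} (z_1 - y_1) P(\omega; x, z) P(\omega; z, y) . \]
Since $\partial_1 P \in L^2$, the sums in the above expression are absolutely convergent (Lemma 2.1). Hence, as elements of $L^2$, $\partial_1 P = (\partial_1 P)P + P(\partial_1 P)$. Multiplying by $P$, respectively $Q$, from both sides, we have $P(\partial_1 P)P = 0$, $Q(\partial_1 P)Q = 0$. Therefore,
\[ \partial_1 P = (P + Q)(\partial_1 P)(P + Q) = P(\partial_1 P)Q + Q(\partial_1 P)P . \]

Lemma 3.5. If $A \in D(L_{H_\omega})$, then $PAQ \in D(L_{H_\omega})$, and we have
\[ L_{H_\omega}(PAQ) = P L_{H_\omega}(A) Q . \]

Remark 3.3. By Proposition 2.5, $PAQ \in L^2$. $L_{H_\omega}(PAQ)$ is defined as $L_{H_\omega}(PAQ) := \langle \delta_x , [H_\omega, PAQ]\delta_y \rangle_{l^2(\mathbf{Z}^d)}$. $h := \pi(H_\omega)$ satisfies $h(\omega, x) = 0$ if $|x| \ge 2$, and $h(\omega, x) < \infty$. Therefore, $\langle \delta_x , [H_\omega, PAQ]\delta_y \rangle_{l^2(\mathbf{Z}^d)}$ makes sense. We note that $A \in D(L_{H_\omega})$ is equivalent to the statement that $L_{V_\omega}(A)(\omega; x, y) := \langle \delta_x , [V_\omega, A]\delta_y \rangle_{l^2(\mathbf{Z}^d)} \in L^2$.

Proof. Let $a := \pi(A)$, and define
\[ a_n(\omega, x) := \begin{cases} a(\omega, x) , & \text{if } |x| \le n \text{ and } |a(\omega, x)| \le n , \\ 0 , & \text{otherwise} . \end{cases} \]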
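Then, $A_n := \pi^{-1}(a_n) \to A$ in $L^2$. Since $L_{V_\omega}(A) \in L^2$, $L_{V_\omega}(A_n) \to L_{V_\omega}(A)$ in $L^2$ as $n \to \infty$. We write $H_\omega = H_0 + V_\omega$. Since $H_0$, $P$, $Q$, $A_n$ are all bounded on $l^2(\mathbf{Z}^d)$,
\[ L_{H_0}(PA_nQ)(\omega; x, y) = (L_{H_0}(P)A_nQ)(\omega; x, y) + (P L_{H_0}(A_n) Q)(\omega; x, y) + (PA_n L_{H_0}(Q))(\omega; x, y) . \tag{3.10} \]
On the other hand,
\[ \begin{aligned} L_{V_\omega}(PA_nQ)(\omega; x, y) &= \sum_{z_1, z_2 \in \mathbf{Z}^d} P(\omega; x, z_1) A_n(\omega; z_1, z_2) Q(\omega; z_2, y) \\ &\qquad \times (V_\omega(x) - V_\omega(z_1) + V_\omega(z_1) - V_\omega(z_2) + V_\omega(z_2) - V_\omega(y)) \\ &= (L_{V_\omega}(P)A_nQ)(\omega; x, y) + (P L_{V_\omega}(A_n) Q)(\omega; x, y) + (PA_n L_{V_\omega}(Q))(\omega; x, y) . \end{aligned} \tag{3.11} \]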
Since $A, P \in D(L_{H_\omega})$, the sums appearing in the above expression converge absolutely. By (3.10), (3.11), and the equality $L_{H_\omega}(P) = L_{H_0}(P) + L_{V_\omega}(P) = 0$, we have
\[ L_{H_\omega}(PA_nQ) = P L_{H_\omega}(A_n) Q . \tag{3.12} \]
$P L_{H_\omega}(A_n) Q \to P L_{H_\omega}(A) Q$ in $L^2$ ($n \to \infty$), and by taking a subsequence,
\[ (P L_{H_\omega}(A_n) Q)(\omega; x, y) \to (P L_{H_\omega}(A) Q)(\omega; x, y) , \tag{3.13} \]
as $n \to \infty$ for a.e. $\omega$, and for any $x, y \in \mathbf{Z}^d$. On the other hand, we have $(PA_nQ)(\omega; x, y) \to (PAQ)(\omega; x, y)$ as $n \to \infty$ for a.e. $\omega$, for the same reason. Since $h(\omega, x) = 0$ if $|x| \ge 2$, and $|h(\omega, x)| < \infty$ for a.e. $\omega$,
\[ L_{H_\omega}(PA_nQ)(\omega; x, y) \to L_{H_\omega}(PAQ)(\omega; x, y) \tag{3.14} \]
for a.e. $\omega$, as $n \to \infty$. By (3.12), (3.13), and (3.14), we arrive at the conclusion.

Lemma 3.6.
\[ Q\{(-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega\}P = (-i\delta - L_{H_\omega})^{-1}(Q\partial_1 H_\omega P) , \tag{3.15} \]
\[ P\{(-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega\}Q = (-i\delta - L_{H_\omega})^{-1}(P\partial_1 H_\omega Q) . \tag{3.16} \]
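Proof. Let
\[ A := Q\{(-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega\}P - (-i\delta - L_{H_\omega})^{-1}(Q\partial_1 H_\omega P) . \]
The first term of the RHS belongs to $D(L_{H_\omega})$ by Lemma 3.5. Moreover,
\[ (-i\delta - L_{H_\omega})A = Q(\partial_1 H_\omega)P - Q(\partial_1 H_\omega)P = 0 . \]
Therefore, $A = 0$. The proof of (3.16) is similar.

Lemma 3.7. If $P \in D(\partial_1)$, then $(Qx_1P)(\omega; x, y)$ is finite for a.e. $\omega$.

Proof. By the Schwarz inequality, we have
\[ \begin{aligned} \int_\Omega d\mathbf{P}\, |(Qx_1P)(\omega; x, y)| &= \sum_{z \in \mathbf{Z}^d} \int_\Omega d\mathbf{P}\, |Q(\omega; x, z)|\, |z_1|\, |P(\omega; z, y)| \\ &\le \sum_{z \in \mathbf{Z}^d} \Big( \int_\Omega d\mathbf{P}\, |Q(\omega; x, z)|^2 |z_1| \Big)^{1/2} \Big( \int_\Omega d\mathbf{P}\, |P(\omega; z, y)|^2 |z_1| \Big)^{1/2} . \end{aligned} \tag{3.17} \]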
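Since $P, Q \in D(\partial_1)$,
\[ \begin{aligned} \text{RHS of (3.17)} &\le \frac{1}{2} \sum_{z \in \mathbf{Z}^d} \int_\Omega d\mathbf{P}\, |Q(\omega; x, z)|^2 (|x_1 - z_1| + |x_1|) + \frac{1}{2} \sum_{z \in \mathbf{Z}^d} \int_\Omega d\mathbf{P}\, |P(\omega; z, y)|^2 (|z_1 - y_1| + |y_1|) \\ &\le C(\|\partial_1 Q\|_{L^2}^2 + |x_1| \|Q\|_{L^2}^2 + \|\partial_1 P\|_{L^2}^2 + |y_1| \|P\|_{L^2}^2) \le C_{x,y} < \infty , \end{aligned} \]
for some constants $C$, $C_{x,y}$, the latter depending on $x, y \in \mathbf{Z}^d$.

Lemma 3.8. If $P \in D(\partial_1)$, we have
\[ L_{H_\omega}(Qx_1P) = Q L_{H_\omega}(x_1) P . \]
In particular, $L_{H_\omega}(Qx_1P) \in L^2$.

Proof. By Lemma 3.7, $(Qx_1P)(\omega; x, y)$ is finite for a.e. $\omega$. Thus, by Remark 3.3, $L_{H_\omega}(Qx_1P)$ makes sense as a function on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$ for a.e. $\omega$. We will show
\[ L_{V_\omega}(Qx_1P) = L_{V_\omega}(Q)x_1P + Qx_1 L_{V_\omega}(P) , \tag{3.18} \]
\[ L_{H_0}(Qx_1P) = L_{H_0}(Q)x_1P + Q L_{H_0}(x_1) P + Qx_1 L_{H_0}(P) , \tag{3.19} \]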
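as functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. Then, Lemma 3.8 follows by using $L_{H_\omega}(P) = L_{H_\omega}(Q) = 0$. To prove (3.18), we compute
\[ (L_{V_\omega}(Qx_1P))(\omega; x, y) = III + IV , \]
where
\[ III := \sum_{z \in \mathbf{Z}^d} Q(\omega; x, z)\, z_1\, P(\omega; z, y) (V_\omega(x) - V_\omega(z)) , \]
\[ IV := \sum_{z \in \mathbf{Z}^d} Q(\omega; x, z)\, z_1\, P(\omega; z, y) (V_\omega(z) - V_\omega(y)) . \]
We will show that $III$, $IV$ make sense for a.e. $\omega$. In fact, by the Schwarz inequality,
\[ \int_\Omega d\mathbf{P}\, |III| \le \Big( \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |Q(\omega; x, z)|^2 |V_\omega(x) - V_\omega(z)|^2 \Big)^{1/2} \times \Big( \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |z_1|^2 |P(\omega; z, y)|^2 \Big)^{1/2} . \]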
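Since $P$ satisfies Assumption (1.6),
\[ \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |z_1|^2 |P(\omega; z, y)|^2 \le 2 \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |z_1 - y_1|^2 |P(\omega; z, y)|^2 + 2|y_1|^2 \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |P(\omega; z, y)|^2 \le C_y < \infty , \]
for some constant $C_y$ depending only on $y \in \mathbf{Z}^d$. Since $Q \in D(L_{H_\omega})$,
\[ \int_\Omega d\mathbf{P} \sum_{z \in \mathbf{Z}^d} |Q(\omega; x, z)|^2 |V_\omega(x) - V_\omega(z)|^2 = \|L_{V_\omega}(Q)\|_{L^2}^2 < \infty . \]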
Therefore, $III$ makes sense for a.e. $\omega$, and so does $IV$ by the same argument. Since
\[ III + IV = (L_{V_\omega}(Q)x_1P)(\omega; x, y) + (Qx_1 L_{V_\omega}(P))(\omega; x, y) , \]
we have proved (3.18). To prove (3.19), we notice that, by Remark 3.3, $(H_0Qx_1P)(\omega; x, y)$, $(QH_0x_1P)(\omega; x, y)$, $(Qx_1H_0P)(\omega; x, y)$, and $(Qx_1PH_0)(\omega; x, y)$ are all finite for a.e. $\omega$. Then,
\[ \begin{aligned} (L_{H_0}(Qx_1P))(\omega; x, y) &= (H_0Qx_1P)(\omega; x, y) - (Qx_1PH_0)(\omega; x, y) \\ &= (H_0Qx_1P)(\omega; x, y) - (QH_0x_1P)(\omega; x, y) \\ &\quad + (QH_0x_1P)(\omega; x, y) - (Qx_1H_0P)(\omega; x, y) \\ &\quad + (Qx_1H_0P)(\omega; x, y) - (Qx_1PH_0)(\omega; x, y) \\ &= (L_{H_0}(Q)x_1P)(\omega; x, y) + (Q L_{H_0}(x_1) P)(\omega; x, y) + (Qx_1 L_{H_0}(P))(\omega; x, y) . \end{aligned} \]
Thus (3.19) is proved.

The following lemma follows by a simple functional calculus argument.

Lemma 3.9.
\[ \lim_{\delta \downarrow 0} (-i\delta - L_{H_\omega})^{-1} L_{H_\omega} A = -A , \qquad \text{for } A \in (\mathrm{Ker}\, L_{H_\omega})^\perp . \]
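A finite-dimensional diagonal model makes the role of the orthogonality condition in Lemma 3.9 visible. In the sketch below a self-adjoint operator is represented only by its eigenvalues, and a vector by its coefficients in the eigenbasis; the eigenvalues, coefficients, and function names are illustrative assumptions.

```python
def apply_resolvent_times_L(delta, eigenvalues, coeffs):
    """Coefficients of (-i*delta - L)^{-1} L A in the eigenbasis of L."""
    return [lam / (-1j * delta - lam) * a for lam, a in zip(eigenvalues, coeffs)]

def distance_to_minus_A(delta, eigenvalues, coeffs):
    """Sup-norm distance between (-i*delta - L)^{-1} L A and -A."""
    image = apply_resolvent_times_L(delta, eigenvalues, coeffs)
    return max(abs(b + a) for a, b in zip(coeffs, image))

eigenvalues = [-2.0, -0.5, 0.0, 1.0, 3.0]   # includes 0, so Ker L is one-dimensional
A_perp = [1.0, -2.0, 0.0, 0.5, 1j]          # no component along Ker L
A_bad = [1.0, -2.0, 1.0, 0.5, 1j]           # nonzero component along Ker L

# Errors for delta = 1e-1, ..., 1e-6 when A is orthogonal to Ker L.
errors_perp = [distance_to_minus_A(10.0 ** -k, eigenvalues, A_perp) for k in range(1, 7)]
# Error at small delta when A has a kernel component.
error_bad = distance_to_minus_A(1e-6, eigenvalues, A_bad)
```

Each nonzero eigenvalue $\lambda$ contributes an error of size $\delta / \sqrt{\delta^2 + \lambda^2}$ times the coefficient, which tends to zero. The kernel component is mapped to zero rather than to its negative, so `error_bad` stays at the size of that component.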
Lemma 3.10. $[[x_1, P], P] \in (\mathrm{Ker}\, L_{H_\omega})^\perp$.

Proof. By Proposition 2.5,
\[ \langle A , [[x_1, P], P] \rangle_{L^2} = \langle [A, P] , [x_1, P] \rangle_{L^2} . \]
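Thus, it suffices to show
\[ [A, P] = 0 , \qquad \text{if } A \in \mathrm{Ker}\, L_{H_\omega} . \tag{3.20} \]
Let $A_n \in \mathcal{A}_0$ be such that $A_n \to A$ in $L^2$ as $n \to \infty$ (Proposition 2.3). Since $A_n \in \mathcal{A}_0$, $(z - H_\omega)^{-1}(z - H_\omega)A_n = A_n$ ($\mathrm{Im}\, z \ne 0$) makes sense as elements of $L^2$. Then,
\[ [A_n , (z - H_\omega)^{-1}] = (z - H_\omega)^{-1}[A_n , H_\omega](z - H_\omega)^{-1} . \]
Since $A_n \to A$ in $L^2$, and $(z - H_\omega)^{-1}$ is bounded on $l^2(\mathbf{Z}^d)$, the LHS of the above expression converges to $[A , (z - H_\omega)^{-1}]$ in $L^2$ as $n \to \infty$. On the other hand, trivially, $[A_n , H_0] \to [A , H_0]$, and by the construction of $A_n$, $[A_n , V_\omega] = -L_{V_\omega}(A_n) \to -L_{V_\omega}(A) = [A , V_\omega]$. Therefore,
\[ [A , (z - H_\omega)^{-1}] = (z - H_\omega)^{-1}[A , H_\omega](z - H_\omega)^{-1} = 0 . \]
By using the almost analytic continuation of $f_{\beta,F}(\lambda)$ (2.27), which is a norm convergent integral, we have
\[ [A , f_{\beta,F}(H_\omega)] = 0 . \]
Let $B \in \mathcal{A}_0$. By Proposition 2.5, $\langle [A, P] , B \rangle_{L^2} = \langle A , [B, P] \rangle_{L^2}$. In Step 3 of the proof of Proposition 2.8, we saw $f_{\beta,F}(H_\omega) \to P$ in $L^2$ as $\beta \uparrow \infty$. Thus
\[ \langle A , [B, P] \rangle_{L^2} = \lim_{\beta \uparrow \infty} \langle A , [B , f_{\beta,F}(H_\omega)] \rangle_{L^2} = \lim_{\beta \uparrow \infty} \langle [A , f_{\beta,F}(H_\omega)] , B \rangle_{L^2} = 0 . \]
Therefore, $[A, P] = 0$.

We are ready to prove Theorem 1.1.

Proof of Theorem 1.1. By Lemma 3.2, we have
\[ \sigma_2(\omega, F) = \lim_{\delta \downarrow 0} i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P \rangle_{L^2} . \]
By Lemma 3.4, Proposition 2.5, and Lemma 3.6,
\[ \begin{aligned} i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P \rangle_{L^2} &= i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , P(\partial_1 P)Q + Q(\partial_1 P)P \rangle_{L^2} \\ &= i\langle P\{(-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega\}Q + Q\{(-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega\}P , \partial_1 P \rangle_{L^2} \\ &= i\langle (-i\delta - L_{H_\omega})^{-1}(P(\partial_1 H_\omega)Q + Q(\partial_1 H_\omega)P) , \partial_1 P \rangle_{L^2} . \end{aligned} \]
As functions on $\Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$, we have $(\partial_1 H_\omega)(\omega; x, y) = -(L_{H_\omega}(x_1))(\omega; x, y)$. Together with Lemma 3.8,
\[ i\langle (-i\delta - L_{H_\omega})^{-1}(P(\partial_1 H_\omega)Q + Q(\partial_1 H_\omega)P) , \partial_1 P \rangle_{L^2} = (-i)\langle (-i\delta - L_{H_\omega})^{-1} L_{H_\omega}(Qx_1P + Px_1Q) , \partial_1 P \rangle_{L^2} . \]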
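By direct computation, $[[x_1, P], P] \in D(L_{H_\omega})$, and $Qx_1P + Px_1Q = [[x_1, P], P]$. By Lemmas 3.9 and 3.10, we compute
\[ \begin{aligned} i\langle (-i\delta - L_{H_\omega})^{-1}\partial_1 H_\omega , \partial_1 P \rangle_{L^2} &= (-i)\langle (-i\delta - L_{H_\omega})^{-1} L_{H_\omega}([[x_1, P], P]) , \partial_1 P \rangle_{L^2} \\ &\to i\langle [[x_1, P], P] , \partial_1 P \rangle_{L^2} \quad (\delta \downarrow 0) \\ &= i\langle [\partial_1 P , P] , \partial_1 P \rangle_{L^2} \\ &= i\langle (\partial_1 P)P , \partial_1 P \rangle_{L^2} - i\langle P(\partial_1 P) , \partial_1 P \rangle_{L^2} . \end{aligned} \]
By Propositions 2.2 and 2.4,
\[ \langle (\partial_1 P)P , \partial_1 P \rangle_{L^2} = -T(P(\partial_1 P)(\partial_1 P)) = -T((\partial_1 P)P(\partial_1 P)) = \langle P(\partial_1 P) , (\partial_1 P) \rangle_{L^2} . \]
Therefore, we have proved $\sigma_2(\omega, F) = 0$ for a.e. $\omega$.

4. Proof of Theorem 1.2

The following proposition is important for studying the $E \downarrow 0$ limit in the proof of Theorem 1.2.

Proposition 4.1. Suppose $A \in L^2$ satisfies
\[ A \in D(L_{H_\omega}) , \tag{4.1} \]
\[ e^{itL_{H_\omega}}A \in D(\partial_1) , \qquad t \in \mathbf{R} . \tag{4.2} \]
Then, we have the following "Duhamel" formula for $e^{itL_{H_{\omega,E}}}$ and $e^{itL_{H_\omega}}$:
\[ e^{itL_{H_{\omega,E}}}A - e^{itL_{H_\omega}}A = iE \int_0^t ds\, e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}A] . \]

Proof. By (4.1), (4.2), $e^{isL_{H_\omega}}A \in D(L_{H_\omega})$, $e^{isL_{H_\omega}}A \in D(\partial_1)$. Thus $e^{isL_{H_\omega}}A \in D(L_{H_{\omega,E}})$, and $e^{-isL_{H_{\omega,E}}}e^{isL_{H_\omega}}A$ is differentiable w.r.t. $s$. Hence
\[ e^{-itL_{H_{\omega,E}}}e^{itL_{H_\omega}}A - A = \int_0^t ds\, \frac{d}{ds}(e^{-isL_{H_{\omega,E}}}e^{isL_{H_\omega}}A) = -iE \int_0^t ds\, e^{-isL_{H_{\omega,E}}}\partial_1(e^{isL_{H_\omega}}A) . \]
Applying $e^{itL_{H_{\omega,E}}}$ to both sides, we arrive at the conclusion.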
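The Duhamel formula of Proposition 4.1 has a direct matrix analogue, which can be tested numerically. The sketch below assumes 2 × 2 matrices `A0` and `B` standing in for $L_{H_\omega}$ and $\partial_1$, computes propagators by a truncated Taylor series, and evaluates the time integral with Simpson's rule; every name here is illustrative.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    n = len(a)
    return [[a[i][j] + scale * b[i][j] for j in range(n)] for i in range(n)]

def mat_exp(m, terms=40):
    """exp(m) by truncated Taylor series; adequate for small matrices of modest norm."""
    n = len(m)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = mat_add(result, term)
    return result

def propagator(gen, t):
    """exp(i * t * gen)."""
    return mat_exp([[1j * t * x for x in row] for row in gen])

A0 = [[0.3, 1.0], [1.0, -0.5]]   # stand-in for the unperturbed generator (assumption)
B = [[1.0, 0.0], [0.0, -1.0]]    # stand-in for the perturbation (assumption)
E, t = 0.2, 1.0
A_E = mat_add(A0, B, E)

def integrand(s):
    return mat_mul(mat_mul(propagator(A_E, t - s), B), propagator(A0, s))

def simpson(f, a, b, n=200):
    """Composite Simpson rule for a matrix-valued integrand (n must be even)."""
    step = (b - a) / n
    total = mat_add(f(a), f(b))
    for k in range(1, n):
        total = mat_add(total, f(a + k * step), 4.0 if k % 2 else 2.0)
    return [[x * step / 3 for x in row] for row in total]

lhs = mat_add(propagator(A_E, t), propagator(A0, t), -1.0)
rhs = [[1j * E * x for x in row] for row in simpson(integrand, 0.0, t)]
duhamel_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The two sides should agree up to the quadrature error of Simpson's rule. As with the resolvent equation, the matrix case involves no domain conditions, which is what (4.1) and (4.2) supply in the operator setting.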
Lemma 4.1. (1) ∂1 Hω satisfies (4.1), (4.2), (2) [x1 , eisLHω (∂1 Hω )] satisfies (4.1), (4.2). To prove Lemma 4.1, we use the following estimate.
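Theorem 4.1. ([2], Theorem 1.3) Let $F$ be a function analytic and bounded on a strip $S_\eta := \{z \in \mathbf{C} : |\mathrm{Im}\, z| < \eta\}$. Then,
\[ |F(H_\omega)(\omega; x, y)| \le 18\sqrt{2}\, \|F\|_\infty\, e^{-\mu|x-y|} , \tag{4.3} \]
for any $\mu > 0$ such that the inequality $\sum_{|x|=1}(e^{\mu|x|} - 1) \le \eta/2$ holds, and $\|F\|_\infty := \sup_{z \in S_\eta} |F(z)|$.

We will make use of the fact that the bound in (4.3) holds uniformly w.r.t. $\omega \in \Omega$.

Proof of Lemma 4.1. (1) Since $\partial_1 H_\omega \in \mathcal{A}_0$, and $H_\omega \in L^2$, we have $L_{H_\omega}(\partial_1 H_\omega) \in L^2$. Hence $\partial_1 H_\omega \in D(L_{H_\omega})$. By (4.3), for any fixed $\eta > 0$, we have
\[ |e^{itH_\omega}(\omega; x, y)| \le 18\sqrt{2}\, e^{\eta|t|} e^{-\mu|x-y|} , \tag{4.4} \]
where $\mu > 0$ is taken such that $\sum_{|x|=1}(e^{\mu|x|} - 1) \le \frac{\eta}{2}$. By (4.4) and the fact that $\partial_1 H_\omega \in \mathcal{A}_0$, we have
\[ |e^{itL_{H_\omega}}(\partial_1 H_\omega)(\omega; x, y)| = |(e^{itH_\omega}(\partial_1 H_\omega)e^{-itH_\omega})(\omega; x, y)| \le C_t\, e^{-\mu'|x-y|} , \tag{4.5} \]
for some constants $C_t > 0$, $\mu' > 0$, uniformly in $(\omega; x, y) \in \Omega \times \mathbf{Z}^d \times \mathbf{Z}^d$. $C_t$ is uniformly bounded w.r.t. $t$ if $|t| \le T$ for some $T > 0$. Thus, $[x_1 , e^{itL_{H_\omega}}(\partial_1 H_\omega)] \in \mathcal{A} \subset L^2$, because $[x_1 , (e^{itL_{H_\omega}}(\partial_1 H_\omega))](\omega; x, y) = (x_1 - y_1)(e^{itL_{H_\omega}}(\partial_1 H_\omega))(\omega; x, y)$ decays exponentially by (4.5). Hence $\partial_1 H_\omega$ satisfies (4.1), (4.2).

(2) Because $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] \in \mathcal{A}$, and $H_\omega \in L^2$, which follows from the assumption that $\mathbf{E}|V_\omega(0)|^2 < \infty$, we have $L_{H_\omega}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] \in L^2$, and therefore $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] \in D(L_{H_\omega})$. By (4.4), $(e^{itH_\omega})(\omega; x, y)$ decays exponentially, as does $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)](\omega; x, y)$, and so does $(e^{itL_{H_\omega}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)])(\omega; x, y)$. Hence $e^{itL_{H_\omega}}([x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]) \in D(\partial_1)$.

Proof of Theorem 1.2. By the same argument as in the proof of Lemma 3.2, we have
\[ T(e^{itH_{\omega,E}}\, i[H_\omega , x_1]\, e^{-itH_{\omega,E}} P) = -i\langle e^{itL_{H_{\omega,E}}}(\partial_1 H_\omega) , P \rangle_{L^2} . \]
By Lemma 4.1, $\partial_1 H_\omega$ satisfies the conditions of Proposition 4.1, so
\[ -i\langle e^{itL_{H_{\omega,E}}}(\partial_1 H_\omega) , P \rangle_{L^2} = -i\langle e^{itL_{H_\omega}}(\partial_1 H_\omega) , P \rangle_{L^2} + E \int_0^t ds\, \langle e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} =: V + VI . \]
By Propositions 2.2, 2.4, and 2.8,
\[ V = -i\langle e^{itH_\omega}(\partial_1 H_\omega)e^{-itH_\omega} , P \rangle_{L^2} = -iT(e^{itH_\omega}(\partial_1 H_\omega)e^{-itH_\omega}P) = -iT((\partial_1 H_\omega)P) = 0 . \]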
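Therefore,
\[ \frac{1}{E}\, T(e^{itH_{\omega,E}}\, i[H_\omega , x_1]\, e^{-itH_{\omega,E}} P) = \int_0^t ds\, \langle e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} . \]
By Lemma 4.1, we can use Proposition 4.1 for $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]$:
\[ \begin{aligned} \int_0^t ds\, \langle e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} &= \int_0^t ds\, \langle e^{i(t-s)L_{H_\omega}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} \\ &\quad + iE \int_0^t ds \int_0^{t-s} du\, \langle e^{i(t-s-u)L_{H_{\omega,E}}}[x_1 , e^{iuL_{H_\omega}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]] , P \rangle_{L^2} . \end{aligned} \]
By (4.4), $[x_1 , e^{iuL_{H_\omega}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]]$ is uniformly bounded in $L^2$ for $s \in [0, t]$, $u \in [0, t]$. Therefore, for fixed $t$, the second term of the RHS in the above expression goes to zero as $E \downarrow 0$. Then, we compute, using the fact that $e^{itL_{H_\omega}}P = P$ and $P \in D(\partial_1)$,
\[ \begin{aligned} \lim_{E \downarrow 0} \int_0^t ds\, \langle e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} &= \int_0^t ds\, \langle e^{i(t-s)L_{H_\omega}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} \\ &= \int_0^t ds\, \langle [x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} \\ &= \int_0^t ds\, \langle e^{isL_{H_\omega}}(\partial_1 H_\omega) , \partial_1 P \rangle_{L^2} . \end{aligned} \]
We proceed as in the proof of Theorem 1.1, using Lemmas 3.4 and 3.8:
\[ \begin{aligned} \int_0^t ds\, \langle e^{isL_{H_\omega}}(\partial_1 H_\omega) , \partial_1 P \rangle_{L^2} &= \int_0^t ds\, \langle e^{isL_{H_\omega}} L_{H_\omega}([[x_1, P], P]) , \partial_1 P \rangle_{L^2} \\ &= -i \int_0^t ds\, \frac{d}{ds} \langle e^{isL_{H_\omega}}([[x_1, P], P]) , \partial_1 P \rangle_{L^2} \\ &= -i\langle e^{itL_{H_\omega}}([[x_1, P], P]) , \partial_1 P \rangle_{L^2} + i\langle [[x_1, P], P] , \partial_1 P \rangle_{L^2} . \end{aligned} \]
The second term in the above expression vanishes, as we saw in the proof of Theorem 1.1. On the other hand, by the spectral theorem,
\[ \frac{1}{T} \int_0^T dt\, e^{itL_{H_\omega}} \to P_{\{0\}} , \]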
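strongly in $L^2$ as $T \uparrow \infty$, where $P_{\{0\}}$ is the orthogonal projection onto $\mathrm{Ker}(L_{H_\omega})$. Thus, by Lemma 3.10,
\[ \lim_{T \uparrow \infty} \frac{1}{T} \int_0^T dt\, \langle e^{itL_{H_\omega}}([[x_1, P], P]) , \partial_1 P \rangle_{L^2} = 0 , \]
and we have proved $\sigma_1(\omega, F) = 0$ for a.e. $\omega$.

Remark 4.1. If $\lambda$ is sufficiently large, we have the following dynamical localization estimate
\[ \mathbf{E}|e^{itH_\omega}(\omega; x, y)| \le Ce^{-\mu|x-y|} , \tag{4.6} \]
for some constants $C > 0$, $\mu > 0$, independent of $t \in \mathbf{R}$, $x, y \in \mathbf{Z}^d$ [1, 2, 4]. In this case, the operations $\lim_{E \downarrow 0}$ and $\int_0^\infty dt\, e^{-\delta t}$ in the definition of $\sigma_2(\omega, F)$ can be exchanged. In fact, we can show
\[ \frac{1}{E}\, T(e^{itH_{\omega,E}}\, i[H_\omega , x_1]\, e^{-itH_{\omega,E}} P) = O(t) , \tag{4.7} \]
as $t \uparrow \infty$, uniformly in $E$. To show (4.7), we use Proposition 4.1:
\[ \frac{1}{E}\, T(e^{itH_{\omega,E}}\, i[H_\omega , x_1]\, e^{-itH_{\omega,E}} P) = \int_0^t ds\, \langle e^{i(t-s)L_{H_{\omega,E}}}[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)] , P \rangle_{L^2} . \]
(4.6) implies that $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]$ is uniformly bounded in $L^2$. In fact,
\[ |e^{isL_{H_\omega}}(\partial_1 H_\omega)(\omega; x, y)|^2 \le 2d\, |e^{isL_{H_\omega}}(\partial_1 H_\omega)(\omega; x, y)| \le 2d \sum_{z_1, z_2 \in \mathbf{Z}^d} |e^{isH_\omega}(\omega; x, z_1)|\, |\partial_1 H_\omega(\omega; z_1, z_2)|\, |e^{-isH_\omega}(\omega; z_2, y)| . \tag{4.8} \]
We used $|(\partial_1 H_\omega)(\omega; x, y)| \le \|\partial_1 H_\omega\|_{op} \le 2d$. We take the expectation and use Hölder's inequality w.r.t. $\mathbf{E}$:
\[ \mathbf{E}(\text{RHS of (4.8)}) \le 2d \sum_{z_1, z_2 \in \mathbf{Z}^d} (\mathbf{E}|e^{isH_\omega}(\omega; x, z_1)|^3)^{1/3} \times (\mathbf{E}|\partial_1 H_\omega(\omega; z_1, z_2)|^3)^{1/3} (\mathbf{E}|e^{-isH_\omega}(\omega; z_2, y)|^3)^{1/3} . \]
By (4.6) and the fact that $|e^{isH_\omega}(\omega; x, y)| \le \|e^{isH_\omega}\|_{op} = 1$, we have
\[ \mathbf{E}|e^{isL_{H_\omega}}(\partial_1 H_\omega)(\omega; x, y)|^2 \le Ce^{-\mu|x-y|} , \qquad x, y \in \mathbf{Z}^d , \]
for some constants $C > 0$, $\mu > 0$. Therefore, $[x_1 , e^{isL_{H_\omega}}(\partial_1 H_\omega)]$ is uniformly bounded in $L^2$ w.r.t. $s \in \mathbf{R}$, and we conclude (4.7).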
References
[1] M. Aizenman, “Localization at weak disorder: some elementary bounds”, Rev. Math. Phys. 6 (1994) 1163–1182.
[2] M. Aizenman and G. M. Graf, “Localization bounds for an electron gas”, J. Phys. A: Math. Gen. 31 (1998) 6783–6806.
[3] M. Aizenman and S. Molchanov, “Localization at large disorder and at extreme energies: an elementary derivation”, Commun. Math. Phys. 157 (1993) 245–278.
[4] M. Aizenman, J. Schenker, R. Friedrich and D. Hundertmark, “Finite-volume fractional moment criteria for Anderson localization”, Commun. Math. Phys. 224 (2001) 219–253.
[5] P. W. Anderson, “Absence of diffusion in certain random lattices”, Phys. Rev. 109 (1958) 1492–1505.
[6] J. E. Avron, R. Seiler and B. Simon, “Charge deficiency, charge transport and comparison of dimensions”, Commun. Math. Phys. 159 (1994) 399–422.
[7] F. Bentosela, R. Carmona, P. Duclos, B. Simon, B. Souillard, and R. Weder, “Schrödinger operators with an electric field and random or deterministic potentials”, Commun. Math. Phys. 88 (1983) 387–397.
[8] J. Bellissard, A. van Elst and H. Schulz-Baldes, “The non-commutative geometry of the quantum Hall effect”, J. Math. Phys. 35(10) (1994) 5373–5451.
[9] J. Fröhlich and T. Spencer, “Absence of diffusion in the Anderson tight binding model for large disorder or low energy”, Commun. Math. Phys. 88 (1983) 151–184.
[10] H. Kunz and B. Souillard, “Sur le spectre des opérateurs aux différences finies aléatoires”, Commun. Math. Phys. 78 (1980) 201–246.
[11] S. Nakamura, Lectures on Schrödinger Operators, Lectures in Mathematical Sciences, University of Tokyo, 1994.
[12] F. Nakano and M. Kaminaga, “Absence of transport under a slowly varying potential in disordered systems”, J. Stat. Phys. 97 (1999) 917–940.
[13] H. Schulz-Baldes and J. Bellissard, “Anomalous transport: a mathematical framework”, Rev. Math. Phys. 10 (1998) 1–46.
[14] H. Schulz-Baldes and J. Bellissard, “A kinetic theory for quantum transport in aperiodic media”, J. Stat. Phys. 91 (1998) 991–1026.
[15] B. Simon and T. Spencer, “Trace class perturbations and the absence of absolutely continuous spectra”, Commun. Math. Phys. 125 (1989) 113–125.
[16] H. von Dreifus and A. Klein, “A new proof of localization in the Anderson tight binding model”, Commun. Math. Phys. 124 (1989) 285–299.
April 26, 2002 9:13 WSPC/148-RMP
00116
Reviews in Mathematical Physics, Vol. 14, No. 4 (2002) 409–420 c World Scientific Publishing Company
SOLITARY WAVES OF THE NONLINEAR KLEIN GORDON EQUATION COUPLED WITH THE MAXWELL EQUATIONS∗
VIERI BENCI
Dipartimento di Matematica Applicata, Università degli Studi di Pisa, Via Bonanno, 25/b, 50126 PISA, Italy
[email protected]

DONATO FORTUNATO
Dipartimento Interuniversitario di Matematica, Università degli Studi di Bari, Via Orabona, 4, 70125 BARI, Italy
[email protected]

Received 9 February 2001

This paper is divided into two parts. In the first part we construct a model which describes solitary waves of the nonlinear Klein–Gordon equation interacting with the electromagnetic field. In the second part we study the electrostatic case. We prove the existence of infinitely many pairs $(\psi, E)$, where $\psi$ is a solitary wave for the nonlinear Klein–Gordon equation and $E$ is the electric field related to $\psi$.
1. Introduction

The existence of solitary waves for scalar fields in dimension 3 has been extensively studied by many authors (see e.g. [3, 8], and their references). The equation they have considered is a nonlinear perturbation of the Klein–Gordon equation. A typical simple case is the following one:
\[ \frac{\partial^2 \psi}{\partial t^2} - \Delta\psi + m_0^2 \psi - |\psi|^{p-2}\psi = 0 , \tag{1} \]
where $\psi = \psi(x, t) \in \mathbf{C}$ ($x \in \mathbf{R}^3$, $t \in \mathbf{R}$), $m_0$ is a real constant and $p > 2$. The solutions of (1) of the type
\[ \psi(x, t) = u(x)e^{i\omega t} , \qquad u \text{ real function} , \quad \omega \in \mathbf{R} \tag{2} \]
are called standing waves. With this ansatz, Eq. (1) reduces to the following one:
\[ -\Delta u + (m_0^2 - \omega^2)u - |u|^{p-2}u = 0 . \tag{3} \]

∗ Supported by M.U.R.S.T. (ex 40% and 60% funds).
This equation is the Euler–Lagrange equation relative to the functional
\[ f(u) = \frac{1}{2} \int [|\nabla u|^2 + (m_0^2 - \omega^2)u^2]\,dx - \frac{1}{p} \int |u|^p\,dx . \]
Exploiting the critical point theory for even functionals, the following result can be obtained (see [8]).

Theorem 1.1. If $|\omega| < |m_0|$ and $6 > p > 2$, Eq. (3) has infinitely many solutions $u_k$ belonging to the Sobolev space $H^1(\mathbf{R}^3)$.

By the Lorentz invariance of (1), the field
\[ \psi_v(x, t) = u(\gamma(x_1 - vt), x_2, x_3)\, e^{i\omega\gamma(t - vx_1)} \]
with $|v| < 1$ and $\gamma = (1 - v^2)^{-\frac{1}{2}}$ is again a solution of (1), and we obtain a travelling solitary wave.

In this paper we want to investigate the existence of nonlinear Klein–Gordon fields interacting with the electromagnetic field $(E, H)$. Since $E$, $H$ are not assigned, we have to study a system of equations whose unknowns are the Klein–Gordon field $\psi(x, t)$ and the gauge potentials $A = A(x, t)$, $\phi = \phi(x, t)$ related to the electromagnetic field. In order to construct such a system we shall describe, as usual, the interaction between $\psi$ and $E$, $H$ by using the so-called gauge covariant derivatives (see Sec. 2). We shall investigate the case in which $A$ and $\phi$ do not depend on the time $t$ and $\psi(x, t)$ is a standing wave (see Sec. 2). In this situation we can assume $A = 0$ and we are reduced to studying the following system of equations:
\[ -\Delta u + [m_0^2 - (\omega + e\phi)^2]u - |u|^{p-2}u = 0 , \tag{4} \]
\[ -\Delta\phi + e^2 u^2 \phi = -e\omega u^2 , \tag{5} \]
where $e$ denotes the electric charge. Equations (4) and (5) are the Euler–Lagrange equations relative to the functional
\[ F(u, \phi) = \frac{1}{2} \int \{|\nabla u|^2 - |\nabla\phi|^2 + [m_0^2 - (\omega + e\phi)^2]u^2\}\,dx - \frac{1}{p} \int |u|^p\,dx . \]
This functional is strongly indefinite. This means that it is unbounded both from below and from above, and this indefiniteness cannot be removed by compact perturbations. The main result of this paper is the following.

Theorem 1.2. If $|\omega| < |m_0|$ and $6 > p > 4$, the system of Eqs. (4) and (5) has infinitely many solutions $(u, \phi)$, $u \in H^1(\mathbf{R}^3)$, $\phi \in L^6(\mathbf{R}^3)$ with $|\nabla\phi| \in L^2(\mathbf{R}^3)$.

We recall that Maxwell equations coupled with Schrödinger or Dirac equations have been studied respectively in [2, 4] and [5]. Moreover, topological solitary waves interacting with electromagnetic fields have been studied in [7].
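2. Deduction of the Equations

The Lagrangian density relative to (1) is given by
\[ \mathcal{L}_0 = \frac{1}{2} \left[ \left| \frac{\partial\psi}{\partial t} \right|^2 - |\nabla\psi|^2 - m_0^2 |\psi|^2 \right] + \frac{1}{p} |\psi|^p . \tag{6} \]
Assume that $\psi$ is a charged field and let $e$ denote the electric charge. By gauge invariance arguments, the interaction between $\psi$ and the electromagnetic field is usually described (see e.g. [6]) by substituting in (6) the usual derivatives $\frac{\partial}{\partial t}$, $\nabla$ with the gauge covariant derivatives
\[ \frac{\partial}{\partial t} + ie\phi , \qquad \nabla - ieA . \]
Here $A$, $\phi$ are the gauge potentials, which are related to the electromagnetic field $E$, $H$ by the equations
\[ E = -\nabla\phi - A_t , \qquad H = \nabla \times A . \tag{7} \]
Then we have the following Lagrangian density:
\[ \mathcal{L}_0 = \frac{1}{2} \left[ \left| \frac{\partial\psi}{\partial t} + ie\phi\psi \right|^2 - |\nabla\psi - ieA\psi|^2 - m_0^2 |\psi|^2 \right] + \frac{1}{p} |\psi|^p . \tag{8} \]
If we set
\[ \psi(x, t) = u(x, t)e^{iS(x,t)} , \qquad u, S \in \mathbf{R} , \]
the Lagrangian density (8) takes the following form:
\[ \mathcal{L}_0 = \frac{1}{2} \{u_t^2 - |\nabla u|^2 - [|\nabla S - eA|^2 - (S_t + e\phi)^2 + m_0^2]u^2\} + \frac{1}{p} |u|^p . \tag{9} \]
Now we consider the Lagrangian density of the electromagnetic field $E$, $H$,
\[ \mathcal{L}_1 = \frac{1}{2}(|E|^2 - |H|^2) = \frac{1}{2}|A_t + \nabla\phi|^2 - \frac{1}{2}|\nabla \times A|^2 , \]
and so the total action is given by
\[ \mathcal{S} = \iint \mathcal{L}_0 + \mathcal{L}_1 . \]
Making the variation of $\mathcal{S}$ with respect to $\delta u$, $\delta S$, $\delta\phi$ and $\delta A$ respectively, we get
\[ \Box u + [|\nabla S - eA|^2 - (S_t + e\phi)^2 + m_0^2]u - |u|^{p-2}u = 0 , \tag{10} \]
\[ \frac{\partial}{\partial t}[(S_t + e\phi)u^2] - \nabla \cdot [(\nabla S - eA)u^2] = 0 , \tag{11} \]
\[ \nabla \cdot (A_t + \nabla\phi) = e(S_t + e\phi)u^2 , \tag{12} \]
\[ \nabla \times (\nabla \times A) + \frac{\partial}{\partial t}(A_t + \nabla\phi) = e(\nabla S - eA)u^2 , \tag{13} \]
where $\Box u = u_{tt} - \Delta u$.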
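If we set $\rho = e(S_t + e\phi)u^2$, $j = e(\nabla S - eA)u^2$, Eqs. (11)–(13) take the form
\[ \frac{\partial\rho}{\partial t} - \nabla \cdot j = 0 , \tag{14} \]
\[ -\nabla \cdot E = \rho , \tag{15} \]
\[ \nabla \times H - \frac{\partial E}{\partial t} = j . \tag{16} \]
We look for solutions of (10)–(13) of the type
\[ u = u(x) , \qquad S = \omega t , \qquad A = 0 , \qquad \phi = \phi(x) . \]
With this ansatz, Eqs. (11) and (13) are identically satisfied, while (10) and (12) become
\[ -\Delta u + [m_0^2 - (\omega + e\phi)^2]u - |u|^{p-2}u = 0 , \tag{17} \]
\[ \Delta\phi = e(\omega + e\phi)u^2 . \tag{18} \]
Since $e^2 = 1$, it is easy to see that in (17) and (18) we can take $e = 1$. So (17) and (18) become
\[ -\Delta u + [m_0^2 - (\omega + \phi)^2]u - |u|^{p-2}u = 0 , \tag{19} \]
\[ \Delta\phi = (\omega + \phi)u^2 . \tag{20} \]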
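3. The Variational Principle

Now we want to find solutions $u$, $\phi$ of (19) and (20) with
\[ u \in H^1(\mathbf{R}^3) , \qquad \phi \in D . \]
Here $H^1(\mathbf{R}^3)$ is the usual Sobolev space with norm
\[ \|u\|_{H^1} = \left( \int_{\mathbf{R}^3} (u^2 + |\nabla u|^2)\,dx \right)^{\frac{1}{2}} \]
and $D$ denotes the completion of $C_0^\infty(\mathbf{R}^3)$ with respect to the inner product
\[ (v \mid w)_D = \int_{\mathbf{R}^3} (\nabla v \mid \nabla w)\,dx . \tag{21} \]
Consider the functional
\[ F(u, \phi) = \frac{1}{2} \int \{|\nabla u|^2 - |\nabla\phi|^2 + [m_0^2 - (\omega + \phi)^2]u^2\}\,dx - \frac{1}{p} \int |u|^p\,dx . \tag{22} \]
The following proposition holds.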
Proposition 3.1. $F$ is $C^1$ on $H^1(\mathbf{R}^3) \times D$ and its critical points are the solutions of (19) and (20).

Proof. Let $F'_u(u, \phi)$, $F'_\phi(u, \phi)$ denote the partial derivatives of $F$ at $(u, \phi) \in H^1(\mathbf{R}^3) \times D$, namely for any $v \in H^1(\mathbf{R}^3)$ and $w \in D$
\[ F'_u(u, \phi)[v] = \int \{(\nabla u \mid \nabla v) + [m_0^2 - (\omega + \phi)^2]uv - |u|^{p-2}uv\}\,dx , \tag{23} \]
\[ F'_\phi(u, \phi)[w] = -\int \{(\nabla\phi \mid \nabla w) + (\omega + \phi)u^2 w\}\,dx . \tag{24} \]
By Sobolev inequalities, $H^1(\mathbf{R}^3)$ and $D$ are continuously embedded into $L^6(\mathbf{R}^3)$; standard computations then show that $F'_u$ (respectively $F'_\phi$) maps $H^1(\mathbf{R}^3) \times D$ continuously into $H^{-1}$ (respectively $D'$). So we conclude that $F$ is $C^1$ on $H^1(\mathbf{R}^3) \times D$. Moreover
\[ F'_u(u, \phi) = 0 , \qquad F'_\phi(u, \phi) = 0 \]
amounts to saying that $(u, \phi)$ is a weak solution of (19) and (20).

The functional $F$ is bounded neither from below nor from above, and this indefiniteness cannot be removed by a compact perturbation. For this reason the usual tools of critical point theory cannot be used in a direct way. To avoid this difficulty we reduce the study of (22) to the study of a functional of the single variable $u$. We need some technical preliminaries.

Lemma 3.2. Let $g \in L^{\frac{3}{2}}(\mathbf{R}^3)$, $g \ge 0$. Then the bilinear form on $D$ defined by
\[ a_g(v, w) = \int \{(\nabla v \mid \nabla w) + gvw\}\,dx \]
defines an inner product equivalent to the product (21).

Proof. $D$ is continuously embedded into $L^6(\mathbf{R}^3)$, so for $v, w \in D$ we have
\[ \int gvw\,dx \le \|g\|_{L^{\frac{3}{2}}} \|v\|_{L^6} \|w\|_{L^6} \le c\|v\|_D \|w\|_D , \]
where $c$ is a positive constant. Then the conclusion easily follows.

Lemma 3.3. Let $u \in H^1(\mathbf{R}^3)$. Then there exists a unique solution $\phi \in D$ of (20).

Proof. Equation (20) can be written as follows:
\[ -\Delta\phi + u^2\phi = -\omega u^2 . \tag{25} \]
Since $H^1(\mathbf{R}^3)$ is continuously embedded into $L^6(\mathbf{R}^3)$, clearly
\[ u^2 \in L^1(\mathbf{R}^3) \cap L^3(\mathbf{R}^3) , \tag{26} \]
and, by interpolation, we have
\[ u^2 \in L^{\frac{3}{2}}(\mathbf{R}^3) . \]
Then, by Lemma 3.2, we can consider the isomorphism $L_{u^2}$ between $D$ and $D'$ defined by
\[ \langle L_{u^2}\phi , v \rangle = \int \{(\nabla\phi \mid \nabla v) + u^2\phi v\}\,dx , \qquad \phi, v \in D . \]
On the other hand, using again (26), we have
\[ u^2 \in L^{\frac{6}{5}}(\mathbf{R}^3) . \]
Then, since $L^{\frac{6}{5}}(\mathbf{R}^3)$ is continuously embedded into $D'$, we have also
\[ u^2 \in D' . \]
So there exists a unique $\phi \in D$ such that $L_{u^2}\phi = -\omega u^2$. This $\phi$ clearly solves (25).

By Lemma 3.3 we can define the map
\[ \Phi : H^1(\mathbf{R}^3) \to D \tag{27} \]
such that for all $u \in H^1(\mathbf{R}^3)$, $\Phi(u) = \phi$ is the unique solution of (25), namely
\[ \Phi(u) = -\omega L_{u^2}^{-1}(u^2) . \]
The following lemma holds.

Lemma 3.4. The map $\Phi$ is $C^1$ and its graph $G_\Phi$ is given by
\[ G_\Phi = \{(u, \phi) \in H^1(\mathbf{R}^3) \times D \mid F'_\phi(u, \phi) = 0\} . \]

Proof. The proof of the first part can be easily achieved by standard arguments. The second part follows immediately from the definition of $\Phi$.

Now for $u \in H^1(\mathbf{R}^3)$, $\Phi(u)$ solves (25), so
\[ -\Delta\Phi(u) + u^2\Phi(u) = -\omega u^2 , \]
from which, taking the product with $\Phi(u)$, we have
\[ -\int \omega u^2 \Phi(u)\,dx = \int |\nabla\Phi(u)|^2\,dx + \int u^2 \Phi(u)^2\,dx . \tag{28} \]
Now we set
\[ J(u) = F(u, \Phi(u)) . \tag{29} \]
By Proposition 3.1 and Lemma 3.4, $J$ is $C^1$.
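The identity (28) has an exact discrete counterpart, which gives a quick sanity check on the signs. The sketch below is a one-dimensional finite-difference analogue of (25) on an interval with zero boundary values, which is an assumption made purely for illustration (the paper works in $\mathbf{R}^3$); the grid, the profile `u`, and the function names are all hypothetical.

```python
import math

def solve_phi(u, omega, dx):
    """Solve the tridiagonal system (-phi'' + u^2 phi = -omega u^2) with phi = 0 at both ends."""
    n = len(u)
    off = -1.0 / dx ** 2                              # sub- and super-diagonal entries
    diag = [2.0 / dx ** 2 + ui ** 2 for ui in u]
    rhs = [-omega * ui ** 2 for ui in u]
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag[0], rhs[0] / diag[0]
    for i in range(1, n):                             # forward elimination (Thomas algorithm)
        pivot = diag[i] - off * cp[i - 1]
        cp[i] = off / pivot
        dp[i] = (rhs[i] - off * dp[i - 1]) / pivot
    phi = [0.0] * n
    phi[-1] = dp[-1]
    for i in range(n - 2, -1, -1):                    # back substitution
        phi[i] = dp[i] - cp[i] * phi[i + 1]
    return phi

n, half_width, omega = 200, 10.0, 0.5
dx = 2 * half_width / (n + 1)
xs = [-half_width + (i + 1) * dx for i in range(n)]
u = [math.exp(-x * x) for x in xs]                    # illustrative profile (assumption)
phi = solve_phi(u, omega, dx)

padded = [0.0] + phi + [0.0]                          # include the zero boundary values
lhs_28 = -omega * sum(ui ** 2 * pi for ui, pi in zip(u, phi)) * dx
grad_term = sum(((padded[i + 1] - padded[i]) / dx) ** 2 for i in range(n + 1)) * dx
potential_term = sum(ui ** 2 * pi ** 2 for ui, pi in zip(u, phi)) * dx
rhs_28 = grad_term + potential_term
```

Multiplying the discrete equation by `phi` and summing by parts reproduces (28) exactly, so `lhs_28` and `rhs_28` should agree up to rounding. Both sides are nonnegative, which is what makes the reduced functional in (30) easier to handle than $F$.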
Using (22), we have
\[ \begin{aligned} J(u) &= \frac{1}{2} \int (|\nabla u|^2 - |\nabla\Phi(u)|^2)\,dx + \frac{1}{2} \int [m_0^2 - (\omega + \Phi(u))^2]u^2\,dx - \frac{1}{p} \int |u|^p\,dx \\ &= \frac{1}{2} \int (|\nabla u|^2 - |\nabla\Phi(u)|^2 + (m_0^2 - \omega^2)u^2 - u^2\Phi(u)^2)\,dx - \int \omega u^2 \Phi(u)\,dx - \frac{1}{p} \int |u|^p\,dx . \end{aligned} \]
Then, inserting (28), we get
\[ J(u) = \frac{1}{2} \int (|\nabla u|^2 + |\nabla\Phi(u)|^2 + u^2\Phi(u)^2 + (m_0^2 - \omega^2)u^2)\,dx - \frac{1}{p} \int |u|^p\,dx . \tag{30} \]
The following proposition holds.

Proposition 3.5. Let $(u, \phi) \in H^1(\mathbf{R}^3) \times D$. Then the following statements are equivalent:
(a) $(u, \phi)$ is a critical point of $F$;
(b) $u$ is a critical point of $J$ and $\phi = \Phi(u)$.

Proof. By (29) and Lemma 3.4, clearly we have
(b) $\Leftrightarrow$ ($F'_u(u, \phi) + F'_\phi(u, \phi)\Phi'(u) = 0$ and $\phi = \Phi(u)$) $\Leftrightarrow$ ($F'_u(u, \phi) = 0$ and $F'_\phi(u, \phi) = 0$) $\Leftrightarrow$ (a).

4. Proof of the Main Result

Clearly, by Proposition 3.5, Theorem 1.2 is a consequence of the following result.

Proposition 4.1. If $|\omega| < |m_0|$ and $6 > p > 4$, the functional $J$ possesses infinitely many critical points $u \in H^1(\mathbf{R}^3)$.

Since $J$ is invariant under the group of translations $u(x) \to u(x + a)$ ($a \in \mathbf{R}^3$), there is a clear lack of compactness. To overcome this difficulty we restrict ourselves to radial functions $u = u(r)$, $r = |x|$. More precisely, we shall consider the functional $J$ on the subspace
\[ H_r^1 = \{u \in H^1(\mathbf{R}^3) : u = u(r),\ r = |x|\} . \]
We recall (see [8] or [3]) that, for $6 > p > 2$, $H_r^1$ is compactly embedded into $L_r^p$, where $L_r^p = \{u \in L^p(\mathbf{R}^3) : u \text{ radially symmetric}\}$. As a consequence, the restriction $J|_{H_r^1}$ does not exhibit the strong indefiniteness of the functional $F$. In fact $J|_{H_r^1}$ is bounded from below modulo the compact perturbation
\[ u \to \frac{1}{p} \int |u|^p\,dx . \]
V. Benci & D. Fortunato
Moreover, unlike the functional $F$, $J|_{H^1_r}$ is even, and we shall exploit this symmetry property to obtain the multiplicity result. Now $H^1_r$ is a natural constraint for $J$, namely the following lemma holds.

Lemma 4.2. Any critical point $u \in H^1_r$ of $J|_{H^1_r}$ is also a critical point of $J$.

Proof. Consider the $O(3)$ group action $T_g$ on $H^1(\mathbb{R}^3)$ defined by
\[ T_g u(x) = u(g(x)) \quad \text{for } g \in O(3),\ u \in H^1(\mathbb{R}^3). \]
Clearly $H^1_r$ is the set of the fixed points for this action, namely
\[ H^1_r = \{u \in H^1(\mathbb{R}^3) \mid T_g u = u \text{ for all } g \in O(3)\}. \]
Then the conclusion can be achieved by usual arguments (see [8]), if we show that $J$ is invariant under the $T_g$ action, namely if
\[ J(T_g u) = J(u) \quad \text{for any } u \in H^1(\mathbb{R}^3),\ g \in O(3). \tag{31} \]
Now, for $u \in H^1(\mathbb{R}^3)$, $\Phi(u)$ solves the equation
\[ -\Delta\Phi(u) + u^2\Phi(u) = -\omega u^2; \]
then, if $g \in O(3)$, we have
\[ T_g\bigl(-\Delta\Phi(u) + u^2\Phi(u)\bigr) = -\omega T_g u^2, \]
\[ -\Delta(T_g\Phi(u)) + (T_g u)^2\,(T_g\Phi(u)) = -\omega (T_g u)^2. \]
This equality and the definition of $\Phi$ imply that
\[ T_g\Phi(u) = \Phi(T_g u). \tag{32} \]
Therefore, using (32) and the $T_g$ invariance of the norms in $H^1(\mathbb{R}^3)$, $D$, $L^p$, we easily deduce (31).

In order to prove Proposition 4.1 we need the following compactness result.

Lemma 4.3. Let the assumptions of Proposition 4.1 be satisfied. Then the functional $J|_{H^1_r}$ satisfies the Palais–Smale condition, i.e.
\[ \text{any sequence } \{u_n\} \subset H^1_r \text{ s.t. } \{J(u_n)\} \text{ is bounded and } J|'_{H^1_r}(u_n) \to 0 \text{ contains a convergent subsequence.} \tag{33} \]

Proof. Let $\{u_n\} \subset H^1_r$ be such that
\[ J(u_n) = M_n \ \text{bounded}, \tag{34} \]
\[ J|'_{H^1_r}(u_n) = \varepsilon_n, \quad \text{where } \varepsilon_n \to 0 \text{ in the dual } (H^1_r)' = H^{-1}_r. \tag{35} \]
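Then, if we set $\phi_n = \Phi(u_n)$, (35) can be written (see Proposition 3.5) as follows:
\[ F'_u(u_n,\phi_n) + F'_\phi(u_n,\phi_n)\Phi'(u_n) = \varepsilon_n. \]
Since $F'_\phi(u_n,\phi_n) = 0$, we have
\[ F'_u(u_n,\phi_n) = \varepsilon_n. \tag{36} \]
So
\[ \frac1p\langle F'_u(u_n,\phi_n), u_n\rangle = \frac1p\langle\varepsilon_n, u_n\rangle, \tag{37} \]
where $\langle\cdot,\cdot\rangle$ denotes the pairing between $H^1_r$ and its dual. By (5),
\[ F'_u(u_n,\phi_n) = -\Delta u_n + \bigl[m_0^2 - (\omega+\phi_n)^2\bigr]u_n - |u_n|^{p-2}u_n. \]
Using this equality, the left-hand side in (37) can be easily calculated. So, subtracting (37) from (34), we get
\[ \int\bigl(c_1|\nabla u_n|^2 + c_2|u_n|^2\bigr)dx + A_n = M_n - \frac1p\langle\varepsilon_n,u_n\rangle, \tag{38} \]
where
\[ c_1 = \frac12 - \frac1p > 0, \qquad c_2 = \frac{m_0^2-\omega^2}{2} - \frac{m_0^2-\omega^2}{p} > 0 \]
and
\[ A_n = \int\Bigl[\frac12|\nabla\phi_n|^2 + \Bigl(\frac12+\frac1p\Bigr)(u_n\phi_n)^2 + \frac2p\,\omega u_n^2\phi_n\Bigr]dx. \tag{39} \]
Now (see (20)) $\phi_n$ satisfies the equation
\[ \Delta\phi_n = (\omega+\phi_n)u_n^2, \]
then
\[ \langle\Delta\phi_n,\phi_n\rangle = \int(\omega+\phi_n)u_n^2\phi_n\,dx, \]
from which we get
\[ \int\omega u_n^2\phi_n\,dx = -\int\bigl(|\nabla\phi_n|^2 + (u_n\phi_n)^2\bigr)dx. \tag{40} \]
Substituting (40) into (39) gives $A_n = \bigl(\tfrac12-\tfrac2p\bigr)\int|\nabla\phi_n|^2dx + \bigl(\tfrac12-\tfrac1p\bigr)\int(u_n\phi_n)^2dx$, and both coefficients are positive when $p > 4$. Hence there exists a constant $c_3 > 0$ s.t.
\[ A_n \ge c_3\int\bigl(|\nabla\phi_n|^2 + (u_n\phi_n)^2\bigr)dx. \tag{41} \]
Finally from (38) and (41) we deduce that $\{(u_n,\phi_n)\}$ is bounded in $H^1_r\times D$; then, up to a subsequence,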
\[ u_n \rightharpoonup u \ (\text{weakly in } H^1_r), \qquad \phi_n \rightharpoonup \phi \ (\text{weakly in } D). \]
Next we show that
\[ u_n \to u \ (\text{strongly in } H^1_r). \tag{42} \]
First, by (36) we have
\[ -\Delta u_n + (m_0^2-\omega^2)u_n = \phi_n^2u_n + 2\omega\phi_nu_n + |u_n|^{p-2}u_n + \varepsilon_n. \]
Thus, if we denote by $L$ the isomorphism between $H^1_r$ and its dual defined by $Lu = -\Delta u + (m_0^2-\omega^2)u$, we get
\[ u_n = L^{-1}(\phi_n^2u_n + 2\omega\phi_nu_n) + L^{-1}(|u_n|^{p-2}u_n) + L^{-1}(\varepsilon_n). \tag{43} \]
Since $H^1_r$ is compactly embedded into $L^p_r$ (see [8] or [3]), standard arguments show that $L^{-1}(|u_n|^{p-2}u_n)$ strongly converges in $H^1_r$. Also $L^{-1}(\varepsilon_n)$ strongly converges in $H^1_r$; then (42) will be a consequence of (43) if we show that also
\[ L^{-1}(\phi_n^2u_n + 2\omega\phi_nu_n) \ \text{strongly converges in } H^1_r. \tag{44} \]
Now $H^1_r$ is compactly embedded into $L^3_r$. Then, by duality, $L^{3/2}_r$ is compactly embedded into $(H^1_r)'$. As a consequence, in order to prove (44), it is enough to show that
\[ \phi_n^2u_n \ \text{and} \ \phi_nu_n \ \text{are bounded in } L^{3/2}_r. \tag{45} \]
Since $\phi_n$ is bounded in $D$ and $u_n$ is bounded in $H^1_r$, by Sobolev embedding theorems, we deduce that $\phi_n$ is bounded in $L^6$, and $u_n$ is bounded in $L^6\cap L^2\subset L^3$. Moreover, by Hölder inequalities, we have
\[ \|\phi_n^2u_n\|^{3/2}_{L^{3/2}} \le \|\phi_n\|^3_{L^6}\,\|u_n\|^{3/2}_{L^3}, \]
and
\[ \|\phi_nu_n\|_{L^{3/2}} \le \|\phi_n\|_{L^6}\,\|u_n\|_{L^2}, \]
from which we deduce (45).
Now we are ready to prove Proposition 4.1. By Lemma 4.2 it is enough to show that $J|_{H^1_r}$ has infinitely many critical points. Since $J|_{H^1_r}$ is even, we shall use an equivariant version of the mountain pass theorem (see [1, Theorem 2.13]). Clearly from (30) we have
\[ J(u) \ge c_1\|u\|^2_{H^1} - c_2\|u\|^p_{L^p}, \quad u \in H^1_r. \tag{46} \]
Here and in the sequel $c_1, c_2, \dots$ denote positive constants. Since $H^1_r$ is continuously embedded in $L^p$, we easily deduce from (46) that there exist $\alpha, \rho > 0$ such that
\[ J(u) \ge \alpha \quad \text{for } u \in H^1_r,\ \|u\|_{H^1} = \rho. \tag{47} \]
Now let $V$ be an $m$-dimensional ($m$ a positive integer) subspace of $H^1_r$. We need only to show that
\[ J(u) \to -\infty \quad \text{as } \|u\| \to \infty,\ u \in V. \tag{48} \]
In fact, since $J|_{H^1_r}$ satisfies the Palais–Smale condition (see Lemma 4.3) and the geometrical assumptions (47) and (48), we can use [1, Theorem 2.13] to conclude that there exist at least $m$ distinct pairs of critical points of $J|_{H^1_r}$. Since $m$ is arbitrary, we obtain the existence of infinitely many pairs of critical points. Therefore we are left to prove (48). Clearly, if we set $\phi = \Phi(u)$, from (30) we have
\[ J(u) \le c_3\|u\|^2_{H^1} + \frac12\bigl(\|\nabla\phi\|^2_{L^2} + \|u\phi\|^2_{L^2}\bigr) - \frac1p\|u\|^p_{L^p}, \quad u \in H^1_r. \tag{49} \]
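Since $\phi$ satisfies (20), we have $\langle\Delta\phi,\phi\rangle = \langle(\omega+\phi)u^2,\phi\rangle$. Then
\[ \|\nabla\phi\|^2_{L^2} + \|u\phi\|^2_{L^2} = -\int\omega u^2\phi\,dx. \tag{50} \]
By Sobolev and Hölder inequalities we deduce that
\[ -\int\omega u^2\phi\,dx \le c_4\|u\|^2_{L^{12/5}}\|\phi\|_{L^6} \le c_5\|u\|^2_{L^{12/5}}\|\nabla\phi\|_{L^2}. \tag{51} \]
Then (50) and (51) give
\[ \|\nabla\phi\|^2_{L^2} + \|u\phi\|^2_{L^2} \le c_5\|u\|^2_{L^{12/5}}\|\nabla\phi\|_{L^2}, \tag{52} \]
which implies
\[ \|\nabla\phi\|_{L^2} \le c_5\|u\|^2_{L^{12/5}}. \tag{53} \]
From (52) and (53), we have
\[ \|\nabla\phi\|^2_{L^2} + \|u\phi\|^2_{L^2} \le c_5^2\|u\|^4_{L^{12/5}}. \tag{54} \]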
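For the reader's convenience, the exponent $12/5$ in (51) can be traced as follows. This is only a sketch of the routine step, assuming Hölder's inequality with the conjugate exponents $6/5$ and $6$ and the Sobolev inequality $\|\phi\|_{L^6} \le C\|\nabla\phi\|_{L^2}$ in $\mathbb{R}^3$; the constant $C$ is not named in the text.

```latex
\[
\Bigl|\int \omega u^2\phi\,dx\Bigr|
\;\le\; |\omega|\,\|u^2\|_{L^{6/5}}\,\|\phi\|_{L^6}
\;=\; |\omega|\,\|u\|^{2}_{L^{12/5}}\,\|\phi\|_{L^6}
\;\le\; C\,|\omega|\,\|u\|^{2}_{L^{12/5}}\,\|\nabla\phi\|_{L^2},
\qquad \tfrac{5}{6}+\tfrac{1}{6}=1 .
\]
```

Dropping the non-negative term $\|u\phi\|^2_{L^2}$ in (52) and dividing by $\|\nabla\phi\|_{L^2}$ (when it is nonzero) then yields (53).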
Finally (54) and (49) imply
\[ J(u) \le c_3\|u\|^2_{H^1} + c_6\|u\|^4_{L^{12/5}} - \frac1p\|u\|^p_{L^p}. \]
Then, since $p > 4$ and $V$ is finite dimensional (so that all norms on $V$ are equivalent and the term of order $p$ dominates), (48) easily follows.

References

[1] A. Ambrosetti and P. H. Rabinowitz, "Dual variational methods in critical point theory and applications", J. Funct. Anal. 14 (1973) 349–381.
[2] V. Benci and D. Fortunato, "An eigenvalue problem for the Schroedinger–Maxwell equations", Topol. Methods Nonlinear Anal. 11 (1998) 283–293.
[3] H. Berestycki and P. L. Lions, "Nonlinear scalar field equations I and II", Arch. Rat. Mech. Anal. 82 (1983) 313–345 and 347–375.
[4] G. M. Coclite, "Metodi Variazionali applicati allo studio delle equazioni di Schrödinger–Maxwell", Thesis, University of Bari, 1999.
[5] M. J. Esteban, V. Georgiev and E. Sere, "Stationary solutions of the Maxwell–Dirac and the Klein–Gordon–Dirac equations", Calc. Var. 4, 265–281.
[6] B. Felsager, Geometry, Particles and Fields, Odense University Press, 1981.
[7] V. Benci, D. Fortunato, A. Masiello and L. Pisani, "Solitons and electromagnetic field", Math. Z. 232 (1999) 73–102.
[8] W. Strauss, "Existence of solitary waves in higher dimensions", Comm. Math. Phys. 55 (1977) 149–162.
June 4, 2002 11:0 WSPC/148-RMP
00123
Reviews in Mathematical Physics, Vol. 14, No. 5 (2002) 421–467 © World Scientific Publishing Company
SCHRÖDINGER OPERATORS ON HOMOGENEOUS METRIC TREES: SPECTRUM IN GAPS
ALEXANDER V. SOBOLEV Centre for Mathematical Analysis and Its Applications University of Sussex, Falmer, Brighton BN1 9QH, UK
[email protected] MICHAEL SOLOMYAK Department of Mathematics, Weizmann Institute, Rehovot, Israel
[email protected]
Received 2 August 2001
Revised 5 February 2002

The paper studies the spectral properties of the Schrödinger operator $\mathbf{A}_{gV} = \mathbf{A}_0 + gV$ on a homogeneous rooted metric tree, with a decaying real-valued potential $V$ and a coupling constant $g \ge 0$. The spectrum of the free Laplacian $\mathbf{A}_0 = -\Delta$ has a band-gap structure with a single eigenvalue of infinite multiplicity in the middle of each finite gap. The perturbation $gV$ gives rise to extra eigenvalues in the gaps. These eigenvalues are monotone functions of $g$ if the potential $V$ has a fixed sign. Assuming that the latter condition is satisfied and that $V$ is symmetric, i.e. depends on the distance to the root of the tree, we carry out a detailed asymptotic analysis of the counting function of the discrete eigenvalues in the limit $g \to \infty$. Depending on the sign and decay of $V$, this asymptotics is either of the Weyl type or is completely determined by the behaviour of $V$ at infinity.
1. Introduction

Counting the number of eigenvalues of a perturbed operator, appearing in the spectral gaps of the unperturbed one, is a classical problem. It was extensively investigated both in the general operator-theoretic setting [3] and in applications to various specific problems of mathematical physics (the Hill operator, [14, 21]; the Dirac operator, [16, 5]; the periodic Schrödinger and magnetic Schrödinger operators, [1, 11]; waveguide-type operators, [10], etc.). In this paper we study a new problem of this type, which only recently attracted the attention of specialists: our unperturbed operator is the Laplacian on a homogeneous rooted metric tree $\Gamma$. In general, a metric tree is a tree whose edges are viewed as non-degenerate line segments, rather than pairs of vertices, as in the case of the standard (combinatorial) trees. This difference is reflected in the nature of the corresponding Laplacian. For a combinatorial tree this is the discrete Laplacian, whereas the Laplacian $\mathbf{A}_0 = -\Delta$
on a metric tree is represented by a family of the operators $-d^2/dx^2$ on its edges, complemented by the Kirchhoff matching conditions at the vertices. The Laplacian on the homogeneous metric tree has very specific spectral properties which we describe later on in detail. In particular, the spectrum has the band-gap structure, with a single eigenvalue of infinite multiplicity in each finite gap. For some other operators on a homogeneous tree, having similar nature, the band-gap structure of the spectrum was established earlier by R. Carlson [12].

In the present paper we study the properties of the perturbed operator $\mathbf{A}_{gV} = \mathbf{A}_0 + gV$ where $V$ is a decaying real-valued potential, and $g \ge 0$ is a coupling constant. The potential $V$ is assumed to be symmetric, i.e. dependent only on the distance $|x|$ between $x \in \Gamma$ and the root of $\Gamma$. This perturbation may produce extra eigenvalues in the gaps of $\mathbf{A}_0$. For an "observation point" $\lambda$ inside a gap we denote by $M(\lambda; \mathbf{A}_{gV})$ the number of the eigenvalues of the operator $\mathbf{A}_{\alpha V}$, crossing $\lambda$ as $\alpha$ varies from $0$ to $g$. For any two points $\lambda_1, \lambda_2$, $\lambda_1 < \lambda_2$, lying in the same gap, we denote by $N(\lambda_1,\lambda_2; \mathbf{A}_{gV})$ the number of eigenvalues of this operator on the interval $(\lambda_1,\lambda_2)$; see Sec. 4 for more precise definitions. We are interested in the limiting behaviour of these quantities as the coupling constant $g$ tends to infinity. Compared with other problems of this type, mentioned in the beginning of the Introduction, this problem has many new important features.

The starting point of our investigation is a direct decomposition of the Sobolev space $H^{1,0}(\Gamma)$ on the homogeneous tree $\Gamma$. This decomposition is orthogonal with respect to the inner products $\int_\Gamma u'\bar v'\,dx$ and $\int_\Gamma V(x)u\bar v\,dx$ for all symmetric weight functions $V$ simultaneously. Therefore, it reduces the Laplacian and any Schrödinger operator $\mathbf{A}_V$ with a symmetric potential $V$.
This decomposition was constructed in the paper [18] and has proved very useful for spectral theory of this class of operators. Later it was re-discovered by R. Carlson [13]. For the spectral analysis of the discrete Laplacian on combinatorial trees a similar decomposition was used before, see e.g. [20, 2] and references therein.

The parts of the operator $\mathbf{A}_V$ in each component of the said orthogonal decomposition turn out to be unitarily equivalent to second order differential operators $A_{V_k}$, $k = 0, 1, \dots$, of the Sturm–Liouville type in the space $L^2(\mathbb{R}_+)$. The potentials $V_k$ are obtained from the original potential $V$ by "shifting" the variable: $V_k(t) = V(t+k)$, $k = 0, 1, \dots$. The operators $A_{V_k}$ act as $-d^2/dx^2 + V_k$, but in contrast to the standard Sturm–Liouville problem, the description of the operator domain of $A_{V_k}$ involves specific matching conditions at the points $t_n = n$, $n = 1, 2, \dots$ (see Sec. 2). Each component $A_{V_k}$, $k \ge 0$, enters $\mathbf{A}_V$ with the multiplicity
\[ n_0 = 1, \qquad n_k = b^k - b^{k-1}, \quad k \ge 1. \]
Here $b \ge 2$ is the integer-valued parameter (the branching number) which characterizes the homogeneous tree completely, see the definition in Sec. 2.1.

The above orthogonal decomposition plays a central role in our approach. First of all, it allows us to calculate the spectrum of $\mathbf{A}_0$ explicitly (see Theorem 3.3): it
consists of the bands $[(\pi(l-1)+\theta)^2, (\pi l-\theta)^2]$, $\theta = \arccos\bigl(2(b^{1/2}+b^{-1/2})^{-1}\bigr)$, and the eigenvalues $\lambda_l = (\pi l)^2$, $l \in \mathbb{N}$. Besides, for $V = 0$ all the components $A_{V_k} = A_0$ are identical, so that the spectrum is of infinite multiplicity. For the perturbed operator this decomposition leads to the representation
\[ M(\lambda; \mathbf{A}_{gV}) = \sum_{k\ge0} n_k\,M(\lambda; A_{gV_k}). \tag{1.1} \]
A similar formula also holds for $N(\lambda_1,\lambda_2; \mathbf{A}_{gV})$. The presence of the exponentially growing factors $n_k$ hampers the study of these sums. Remembering that the numbers $n_k$ reflect the geometry of the tree, rather than the properties of the potential $V$, we also study the counting functions $\tilde M$ and $\tilde N$ ignoring the exponential multiplicities $n_k$. Precisely, we introduce
\[ \tilde M(\lambda; \mathbf{A}_{gV}) = \sum_{k\ge0} M(\lambda; A_{gV_k}), \tag{1.2} \]
and the quantity $\tilde N$ defined in a similar way.

Clearly, the study of the four functions $M, N, \tilde M, \tilde N$ reduces to that of the individual counting functions $M(\lambda; A_{gV_k})$, $N(\lambda_1,\lambda_2; A_{gV_k})$ for the operators $A_{V_k}$. A similar problem for the classical Hill operator was investigated in [14] and, in a more detailed way, in [21]. The general strategy adopted in [21] applies to the operators $A_{gV_k}$ with only minor changes. However, here a new problem emerges: in order to obtain the asymptotic formulas for the sums (1.1), (1.2) one needs an asymptotics of $M(\lambda; A_{gV_k})$ jointly in two parameters, $g$ and $k$, with a good control of the remainder estimate. In solving this new problem we see the main technical novelty of the paper.

In the paper we obtain several results of rather different type. In this introduction we do not describe them in detail, but concentrate on their principal features. More extended comments are given in the main text. Also, for the sake of discussion we restrict ourselves to the functions $M(\lambda; \mathbf{A}_{gV})$ and $\tilde M(\lambda; \mathbf{A}_{gV})$ only.

First of all, as in the case of the "classical" Hill operator problem (see [21]), we observe that the behaviour of $M$, $\tilde M$ is in general radically different for non-positive and non-negative potentials. More precisely, if $V \le 0$ decays sufficiently quickly at infinity, then the asymptotics is governed by an appropriate Weyl-type formula, and thus it depends on the values of $V(t)$ at all points $t \in \mathbb{R}$, and contains no information on the spectrum of the unperturbed operator. On the contrary, for a non-negative $V$ the asymptotics is determined by the fall-off of $V$ at infinity, and as a rule, depends heavily on some spectral characteristic of the operator $A_0$. For instance, the behaviour of $\tilde M(\lambda)$ for $V \ge 0$ is described by an integral of the density of states for $A_0$. Similar asymptotics is also observed for the potentials $V \le 0$ whose decay at infinity is slow in some specified sense.
In accordance with this general observation our study of the asymptotics is divided in several parts. We begin in Sec. 4 by specifying the conditions on a non-positive symmetric potential V that guarantee the validity of the Weyl-type
asymptotics. Further on, we proceed to the cases when the Weyl formula fails and the asymptotics is determined by the behaviour of $V$ at infinity. Here we investigate two types of potentials: power-like and exponentially decaying. In Sec. 5 we state the results for the functions $\tilde M$, $\tilde N$. A common feature of the asymptotic formulae in Sec. 5 is that virtually all of them contain the density of states for the operator $A_0$. It is also worth pointing out that the power-like potentials induce a power-like growth of $\tilde M$ as $g \to \infty$, whereas the exponential potentials give rise to a logarithmic growth.

The study of the sum (1.1) is postponed until Sec. 9 as it calls for different techniques and is less complete. For the power-like potentials we are able to establish the asymptotics only for the quantity $\ln M(\lambda)$. For the exponential potentials we provide more detailed asymptotic information. This is possible due to the "self-similarity" of the exponential function. This property allows us to rewrite the formula (1.1) for the function $M(\lambda, \mathbf{A}_{gV})$ in a form which can be interpreted as a Renewal Equation (see [15, 17]). Then the Renewal theorem ensures a specific asymptotic behaviour of $M(\lambda)$.

Let us briefly outline the contents of the remaining sections. In Sec. 2 we describe the basic orthogonal decomposition of the space $H^{1,0}(\Gamma)$ and also the parts of the operator $\mathbf{A}_V$ in its components. In Sec. 3 we calculate the spectrum of the Laplacian on $\Gamma$. Here we also carry out a detailed analysis of the density of states for the operator $A_0$. This function is involved in the asymptotic formulae for $\tilde M$ in the non-Weyl situation. As was mentioned earlier, the study of the perturbed operator $\mathbf{A}_V$ starts in Sec. 4 where the Weyl asymptotics is established. The main results on the non-Weyl asymptotics for the functions $\tilde M$ and $\tilde N$ are collected in Sec. 5. Their proofs are given in Sec. 8, preceded by necessary technical preliminaries in Secs. 6, 7. The last section, Sec. 9, is devoted to the analysis of the functions $M$, $N$.

2. Laplacian on a Homogeneous Tree and Its Decomposition

2.1. Homogeneous trees and Laplacians on them

Let $\Gamma$ be a rooted tree with the root $o$, the set of vertices $\mathcal{V}(\Gamma)$ and the set of edges $\mathcal{E}(\Gamma)$. We suppose that the length of each edge $e$ is equal to 1. Given two points $y, z \in \Gamma$, we write $y \preceq z$ if $y$ lies on the unique simple path connecting $o$ with $z$; let $|z|$ stand for the length of this path. We write $y \prec z$ if $y \preceq z$ and $y \ne z$. The relation $\prec$ defines on $\Gamma$ a partial ordering. If $y \prec z$, we denote
\[ \langle y, z\rangle := \{x \in \Gamma : y \preceq x \preceq z\}. \]
In particular, if $e = \langle v, w\rangle$ is an edge, we call $v$ its initial point and say that $e$ emanates from $v$ and terminates at $w$. For any $v \in \mathcal{V}(\Gamma)$ the number $|v|$ is a non-negative integer; we call it the generation of $v$ and denote it $\mathrm{Gen}(v)$. For an edge $e \in \mathcal{E}(\Gamma)$, $\mathrm{Gen}(e)$ is defined as the generation of its initial point.
Let an integer $b > 1$ be given. We suppose that for each vertex $v \ne o$ there are exactly $b$ edges emanating from $v$. We denote them $e_v^1, \dots, e_v^b$ and write $e_v^-$ for the edge terminating at $v$. We call $b$ the branching number of $\Gamma$. We always suppose that only one edge emanates from the root $o$. Thus, the tree $\Gamma$ is fully determined by the parameter $b$, and sometimes we use the notation $\Gamma_b$. We call any tree $\Gamma_b$, with an arbitrary $b$, homogeneous.

The metric topology and the Lebesgue measure on $\Gamma$ are introduced in a natural way. The space $L^2(\Gamma)$ is understood as $L^2$ with respect to this measure. A function $f$ on $\Gamma$ belongs to the Sobolev space $H^1(\Gamma)$ if and only if it is continuous, $f\upharpoonright e \in H^1(e)$ for each edge $e$, and
\[ \|f\|^2_{H^1(\Gamma)} := \int_\Gamma\bigl(|f'|^2 + |f|^2\bigr)dx < \infty. \]
As usual, $H^{1,0}(\Gamma) = \{f \in H^1(\Gamma) : f(o) = 0\}$.

We define the Dirichlet Laplacian $-\Delta$ on $\Gamma$ as the self-adjoint operator in $L^2(\Gamma)$, associated with the quadratic form $\int_\Gamma |f'|^2dx$ considered on the form domain $H^{1,0}(\Gamma)$. It is easy to describe the operator domain $\mathrm{Dom}(\Delta)$ and the action of $\Delta$. Evidently $f \in \mathrm{Dom}(\Delta) \Rightarrow f\upharpoonright e \in H^2(e)$ for each edge $e$, and the Euler–Lagrange equation reduces on $e$ to $\Delta f = f''$. In order to describe the matching conditions at a vertex $v \ne o$, denote by $f_-$ the restriction $f\upharpoonright e_v^-$ and by $f_j$, $j = 1, \dots, b$, the restrictions $f\upharpoonright e_v^j$. The matching conditions at $v$ are
\[ f_-(v) = f_1(v) = \dots = f_b(v); \qquad f_1'(v) + \dots + f_b'(v) = f_-'(v), \]
where the derivatives on each edge are taken in the direction consistent with the ordering on $\Gamma$. The first matching condition comes from the requirement $f \in H^1(\Gamma)$ which includes the continuity of $f$, and the second appears as the natural condition in the sense of calculus of variations. At the root $o$ we have the boundary condition $f(o) = 0$. It is easy to check that the conditions listed are also sufficient for $f \in \mathrm{Dom}(\Delta)$.

Along with the Laplacian $-\Delta$ we shall be interested also in the Schrödinger operators with a real, bounded and symmetric (that is, depending only on $|x|$) potential $V$:
\[ \mathbf{A}_Vf := -\Delta f + V(|x|)f, \quad f \in \mathrm{Dom}(\Delta). \tag{2.1} \]
The operator $\mathbf{A}_V$ is self-adjoint. Its quadratic form is given by
\[ \mathbf{a}_V[f] = \int_\Gamma\bigl(|f'|^2 + V(|x|)|f|^2\bigr)dx, \quad f \in H^{1,0}(\Gamma). \]
2.2. The orthogonal decomposition of $L^2(\Gamma)$

Our technique is based upon the orthogonal decomposition of $L^2(\Gamma)$ into a family of subspaces associated with a class of subtrees of $\Gamma$. Given a subtree $T \subset \Gamma$, we say that a function $f \in L^2(\Gamma)$ belongs to $F_T$ if and only if $f = 0$ outside $T$
and
\[ f(x) = f(y) \quad \text{if } x, y \in T \text{ and } |x| = |y|. \tag{2.2} \]
Evidently $F_T$ is a closed subspace of $L^2(\Gamma)$. It is easy to describe the operator $P_T$ of orthoprojection onto $F_T$. To this end, introduce the function
\[ b_T(t) = \#\{x \in T : |x| = t\}. \]
In particular, since only one edge emanates from the root,
\[ b_\Gamma(t) = b^{k-1} \quad \text{for } k-1 \le t < k,\ k \in \mathbb{N}. \tag{2.3} \]
It is clear that
\[ (P_Tf)(x) = \begin{cases} (b_T(|x|))^{-1}\displaystyle\sum_{y\in T:\,|y|=|x|} f(y) & \text{for } x \in T; \\[2mm] 0 & \text{for } x \notin T. \end{cases} \]
We shall need the subspaces $F_T$ associated with the subtrees of the two following types. Given a vertex $v$, let
\[ T_v = \{x \in \Gamma : x \succeq v\}. \]
Given an edge $e = \langle v, w\rangle$, let
\[ T_e = e \cup T_w. \]
In particular, $T_{e_0} = T_o = \Gamma$. For the sake of brevity, for any $v \ne o$ below we use the notation $F_v$, $F_v^j$ for $F_{T_v}$, $F_{T_{e_v^j}}$. It is clear that the subspaces $F_v^1, \dots, F_v^b$ are mutually orthogonal and their orthogonal sum
\[ \tilde F_v = F_v^1 \oplus \dots \oplus F_v^b \]
contains $F_v$. Denote by $F_v' = \tilde F_v \ominus F_v$ the orthogonal complement.

The next theorem is a direct consequence of [18, Theorem 5.1 and Lemma 5.2], where a more general class of trees was considered. Later the result was re-discovered by R. Carlson [13], in a slightly different setting. A new detailed exposition, most convenient for our purposes, was recently given in [19].

Theorem 2.1. Let $\Gamma = \Gamma_b$ for some $b > 1$.
(i) The subspaces $F_v'$, $o \ne v \in \mathcal{V}(\Gamma)$, are mutually orthogonal and orthogonal to $F_\Gamma$. Moreover,
\[ L^2(\Gamma) = F_\Gamma \oplus \sum_{v\in\mathcal{V}(\Gamma)\setminus\{o\}}{}^{\!\!\oplus}\, F_v'. \tag{2.4} \]
(ii) Let $V(t)$ be a real, measurable and bounded function on $\mathbb{R}_+$. Then the decomposition (2.4) reduces the Schrödinger operator (2.1), and in particular the Laplacian $-\Delta = \mathbf{A}_0$.
2.3. Parts of $\mathbf{A}_V$ in the subspaces $F_\Gamma$, $F_v'$

According to Theorem 2.1, the description of the spectrum $\sigma(\mathbf{A}_V)$ reduces to the similar problem for the parts of $\mathbf{A}_V$ in the components of the decomposition (2.4). Consider at first the part of $\mathbf{A}_V$ in the subspace $F_\Gamma$. It is more convenient (and equivalent) to deal with the quadratic form $\mathbf{a}_V$. It is natural to identify a function $f \in F_\Gamma$ with the function $\varphi$ on $\mathbb{R}_+$ such that $\varphi(t) = f(x)$ for $|x| = t$. The operator $\Pi : f \mapsto \varphi$ acts as an isometry of $F_\Gamma$ onto the weighted space $L^2(\mathbb{R}_+, b_\Gamma)$ with the norm given by
\[ \|\varphi\|^2_{L^2(\mathbb{R}_+,b_\Gamma)} = \int_{\mathbb{R}_+}|\varphi(t)|^2\,b_\Gamma(t)\,dt. \]
Then
\[ \mathbf{a}_V[f] = \int_{\mathbb{R}_+}\bigl(|\varphi'(t)|^2 + V(t)|\varphi(t)|^2\bigr)b_\Gamma(t)\,dt, \quad \varphi = \Pi f. \tag{2.5} \]
Its domain is the weighted Sobolev space $H^{1,0}(\mathbb{R}_+, b_\Gamma)$ whose norm is defined by the quadratic form (2.5) with $V \equiv 1$. The corresponding operator $\mathbf{A}_V\upharpoonright F_\Gamma$ turns into an operator acting in $L^2(\mathbb{R}_+, b_\Gamma)$. It is not difficult to describe it explicitly; however, it is more natural to pass on to the operators acting in the "usual" $L^2(\mathbb{R}_+)$. To this end we make the substitution
\[ y(t) = b_\Gamma(t)^{1/2}\varphi(t). \tag{2.6} \]
Then $\|y\|^2_{L^2(\mathbb{R}_+)} = \|\varphi\|^2_{L^2(\mathbb{R}_+,b_\Gamma)}$. Since $b_\Gamma(t)$ is a step function, we also have
\[ a_V[y] := \mathbf{a}_V[f] = \int_{\mathbb{R}_+}\bigl(|y'(t)|^2 + V(t)|y(t)|^2\bigr)dt. \]
However, the domain of $a_V$ does not coincide with $H^{1,0}(\mathbb{R}_+)$, since the function $y(t)$ may have jumps at the points $n \in \mathbb{N}$. More exactly, it follows from (2.3) and (2.6) that $\mathrm{Dom}(a_V)$ consists of functions
\[ y \in H^1(0,1)\times H^1(1,2)\times\dots\times H^1(n-1,n)\times\cdots \tag{2.7} \]
such that
\[ y(0) = 0; \qquad y(n+) = b^{1/2}y(n-), \quad \forall n \in \mathbb{N}, \tag{2.8} \]
and
\[ \int_{\mathbb{R}_+}\bigl(|y'(t)|^2 + |y(t)|^2\bigr)dt < \infty. \tag{2.9} \]
The self-adjoint operator in $L^2(\mathbb{R}_+)$, associated with this quadratic form, on each interval $(n-1,n)$, $n \in \mathbb{N}$, acts as
\[ A_Vy = -y'' + V(t)y. \]
Its domain $\mathrm{Dom}(A_V)$ consists of all functions $y \in H^2(0,1)\times H^2(1,2)\times\dots\times H^2(n-1,n)\times\cdots$ satisfying the conditions (2.8) and
\[ y'(n+) = b^{-1/2}y'(n-), \quad \forall n \in \mathbb{N}, \tag{2.10} \]
and also
\[ \int_{\mathbb{R}_+}\bigl(|y''(t)|^2 + |y(t)|^2\bigr)dt < \infty. \]
(Here and in (2.9) it would be more accurate to write $\sum_{n=1}^\infty\int_{n-1}^n$ rather than $\int_{\mathbb{R}_+}$.) So we have proved the following
Lemma 2.2. The part of the operator $\mathbf{A}_V$ in the subspace $F_\Gamma$ is unitarily equivalent to the operator $A_V$ in $L^2(\mathbb{R}_+)$.

Now we turn to the operators $\mathbf{A}_V\upharpoonright F_v'$, $v \ne o$. It follows from the symmetry properties of the tree $\Gamma$ and of the potential $V(|x|)$ that all such operators with the same value of $\mathrm{Gen}(v) = k$ can be identified with each other. In order to reduce them to the operators in $L^2(\mathbb{R}_+)$, introduce the "shifted" potentials
\[ V_k(t) = V(t+k), \quad t > 0, \quad k = 0, 1, \dots. \tag{2.11} \]
In particular, $V_0 = V$.

Lemma 2.3. Let $\Gamma = \Gamma_b$ and $v \in \mathcal{V}(\Gamma)$, $\mathrm{Gen}(v) = k > 0$. Then the operator $\mathbf{A}_V\upharpoonright F_v'$ is unitarily equivalent to the orthogonal sum of $(b-1)$ copies of the operator $A_{V_k}$.

For the formal proof, see [19]. On the qualitative level, the result follows from the fact that the restriction of the operator $\mathbf{A}_V$ to the subspace $\tilde F_v$ reduces to the orthogonal sum of $b$ copies of the operator $A_{V_k}$. The passage to the subspace $F_v'$ corresponds to the withdrawal of one of these copies.

2.4. The orthogonal decomposition of the operators $\mathbf{A}_V$

Now we are in a position to present the final result of this section. Below $A^{[r]}$ stands for the orthogonal sum of $r$ copies of a self-adjoint operator $A$.

Theorem 2.4. Let $\Gamma = \Gamma_b$ and the function $V$ be real, measurable and bounded on $\mathbb{R}_+$. Then the Schrödinger operator (2.1) on $\Gamma$ is unitarily equivalent to the orthogonal sum of the operators acting in $L^2(\mathbb{R}_+)$:
\[ \mathbf{A}_V \sim A_V \oplus \sum_{k\in\mathbb{N}}{}^{\!\oplus}\, A_{V_k}^{[b^{k-1}(b-1)]}. \]
In particular, for the Laplacian $-\Delta = \mathbf{A}_0$ we get $-\Delta \sim A_0^{[\infty]}$.

This theorem is a direct consequence of Theorem 2.1 and Lemmas 2.2, 2.3, if one remembers that the total number of vertices of generation $k$ equals $b^{k-1}$.
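The counting behind these multiplicities can be checked mechanically. The following is a minimal sketch, not part of the original argument: it enumerates the generations of $\Gamma_b$ under the stated convention (one edge from the root, $b$ edges from every other vertex) and compares the result with $b^{k-1}$ and with the multiplicity $b^{k-1}(b-1)$. The function names `generation_counts` and `multiplicity` are illustrative choices.

```python
from collections import deque


def generation_counts(b: int, max_gen: int) -> list[int]:
    """Count the vertices of each generation 0..max_gen in the tree Gamma_b.

    Convention from Sec. 2.1: the root (generation 0) has exactly one
    child; every other vertex has exactly b children.
    """
    counts = [0] * (max_gen + 1)
    queue = deque([0])  # only the generation of each vertex is stored
    while queue:
        gen = queue.popleft()
        counts[gen] += 1
        if gen < max_gen:
            children = 1 if gen == 0 else b
            queue.extend([gen + 1] * children)
    return counts


def multiplicity(b: int, k: int) -> int:
    """Multiplicity of the k-th half-line operator: 1 for k = 0,
    (b - 1) copies per vertex of generation k for k >= 1."""
    return 1 if k == 0 else b ** (k - 1) * (b - 1)


if __name__ == "__main__":
    b, max_gen = 3, 6
    counts = generation_counts(b, max_gen)
    for k in range(1, max_gen + 1):
        # b^(k-1) vertices of generation k, each contributing b - 1 copies
        assert counts[k] == b ** (k - 1)
        assert multiplicity(b, k) == (b - 1) * counts[k] == b ** k - b ** (k - 1)
    print(counts)
```

The multiplicities telescope: the sum of `multiplicity(b, k)` over $k = 0, \dots, K$ equals $b^K$, which is one way to see why the factors grow exponentially.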
3. Spectrum of the Laplacian on $\Gamma_b$

3.1. The operator $\mathcal{A}$ on the whole line

Along with the operator $A_0$ in $L^2(\mathbb{R}_+)$ defined as $A_0y = -y''$ with the boundary and matching conditions (2.8) and (2.10), consider the similar operator, say $\mathcal{A}$, in $L^2(\mathbb{R})$:
\[ (\mathcal{A}y)(t) = -y''(t), \quad t \notin \mathbb{Z}, \]
on the analogous domain supplied with the matching conditions
\[ y(n+) = b^{1/2}y(n-), \qquad y'(n+) = b^{-1/2}y'(n-), \quad n \in \mathbb{Z}. \tag{3.1} \]
The spectrum of $\mathcal{A}$ can be found by means of the standard Floquet procedure. The related quasi-periodic problem is
\[ y'' + \mu^2y = 0, \qquad y(1+) = e^{i\xi}y(0+), \qquad y'(1+) = e^{i\xi}y'(0+), \]
with the parameter (quasi-momentum) $\xi \in [0, 2\pi)$. Taking into account the matching conditions at the point $n = 1$, we can re-write this as
\[ y''(t) + \mu^2y(t) = 0, \quad 0 < t < 1; \qquad y(1-) = b^{-1/2}e^{i\xi}y(0+), \qquad y'(1-) = b^{1/2}e^{i\xi}y'(0+). \tag{3.2} \]
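It is quite straightforward to calculate the eigenvalues of the problem (3.2). Introduce the function
\[ \varphi(\xi) = \arccos\frac{\cos\xi}{R}, \qquad R = \frac{b^{1/2}+b^{-1/2}}{2} > 1. \tag{3.3} \]
Then the numbers $\mu_l$ (square roots of eigenvalues) are given by
\[ \mu_l(\xi) = \begin{cases} \pi(l-1) + \varphi(\xi), & l \text{ is odd}, \\ \pi l - \varphi(\xi), & l \text{ is even}, \end{cases} \qquad l \in \mathbb{N}. \tag{3.4} \]
The function $\varphi$ is one-to-one on the interval $[0,\pi]$. Later we shall also need its inverse:
\[ \psi(\mu) = \arccos(R\cos\mu), \quad \mu \in [\varphi(0), \varphi(\pi)] = [\theta, \pi-\theta], \tag{3.5} \]
where $\theta = \arccos(1/R)$. It follows easily from (3.3) that
\[
\begin{aligned}
\psi(\mu) &= 2^{1/2}(R^2-1)^{1/4}(\mu-\theta)^{1/2} + O(\mu-\theta), & \mu &\to \theta+, \\
\psi(\mu) &= \pi - 2^{1/2}(R^2-1)^{1/4}(\pi-\theta-\mu)^{1/2} + O(\pi-\theta-\mu), & \mu &\to \pi-\theta-.
\end{aligned}
\tag{3.6}
\]
Define the segments ("bands")
\[ \mathfrak{b}_l = \bigcup_\xi \mu_l^2(\xi) = [(\pi(l-1)+\theta)^2, (\pi l-\theta)^2], \quad l \in \mathbb{N}, \]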
and the intervals ("gaps")
\[ \mathfrak{l}_l = ((\pi l-\theta)^2, (\pi l+\theta)^2), \quad l \in \mathbb{N}, \qquad \mathfrak{l}_0 = (-\infty, \theta^2). \tag{3.7} \]
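The formulas (3.3), (3.4) and the band edges can be sanity-checked numerically. The sketch below is an illustration only: it inserts the general solution $y = A\cos\mu t + B\sin\mu t$ into the two conditions of (3.2), which gives a $2\times2$ linear system in $(A, B)$, and verifies that its determinant vanishes at $\mu = \mu_l(\xi)$. The helper names (`half_trace`, `mu`, `floquet_det`, `band`) are assumptions of this example, not notation from the text.

```python
import cmath
import math


def half_trace(b: float) -> float:
    """R = (b^{1/2} + b^{-1/2}) / 2 from (3.3)."""
    return (math.sqrt(b) + 1.0 / math.sqrt(b)) / 2.0


def mu(b: float, l: int, xi: float) -> float:
    """mu_l(xi) from (3.4)."""
    phi = math.acos(math.cos(xi) / half_trace(b))
    return math.pi * (l - 1) + phi if l % 2 == 1 else math.pi * l - phi


def floquet_det(b: float, m: float, xi: float) -> complex:
    """Determinant of the system for (A, B) obtained from (3.2) with
    y = A cos(m t) + B sin(m t); it equals 1 - 2 R e^{i xi} cos(m) + e^{2 i xi}."""
    e = cmath.exp(1j * xi)
    return (math.cos(m) - e / math.sqrt(b)) * (math.cos(m) - e * math.sqrt(b)) \
        + math.sin(m) ** 2


def band(b: float, l: int) -> tuple[float, float]:
    """End points of the l-th band [(pi(l-1)+theta)^2, (pi l - theta)^2]."""
    theta = math.acos(1.0 / half_trace(b))
    return (math.pi * (l - 1) + theta) ** 2, (math.pi * l - theta) ** 2


if __name__ == "__main__":
    for b in (2, 3, 5):
        for l in (1, 2, 3, 4):
            for xi in (0.0, 0.7, 1.9, math.pi):
                assert abs(floquet_det(b, mu(b, l, xi), xi)) < 1e-12
            # xi = 0 and xi = pi produce the two band edges
            edges = sorted((mu(b, l, 0.0) ** 2, mu(b, l, math.pi) ** 2))
            lo, hi = band(b, l)
            assert math.isclose(edges[0], lo) and math.isclose(edges[1], hi)
```

The vanishing determinant is equivalent to $\cos\xi = R\cos\mu$, which is the relation inverted in (3.5).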
The gaps are labelled so that $\mathfrak{l}_l$ separates the bands $\mathfrak{b}_l$ and $\mathfrak{b}_{l+1}$. The following statement is a direct consequence of the Floquet theory.

Lemma 3.1. The spectrum of $\mathcal{A}$ coincides with the union of the bands $\mathfrak{b}_l$, $l = 1, 2, \dots$. On this set the spectrum is of the Lebesgue type and of multiplicity two.

We shall need also the spectral decomposition of the operator $\mathcal{A}$. To this end, note that
\[ \zeta_l(t,\xi) = c_l(\xi)\bigl(\cos(\mu_l(\xi)(1-t)) - b^{1/2}e^{i\xi}\cos(\mu_l(\xi)t)\bigr), \qquad c_l(\xi) = \sqrt{2(b+1)^{-1}}\,|\sin\mu_l(\xi)|^{-1}, \quad 0 < t < 1, \]
is the normalized in $L^2(0,1)$ eigenfunction of Eq. (3.2) corresponding to the eigenvalue $\mu_l^2(\xi)$. It follows from (3.4), (3.3) that $\zeta_l(t,\xi)$ is smooth in $\xi$ on each band $\mathfrak{b}_l$. Let us extend each function $\zeta_l(t,\xi)$ to all $t \in \mathbb{R}$ in the following way. Let $\omega_l(t,\xi)$ be the periodic (in $t$) extension of the function $e^{-it\xi}\zeta_l(t,\xi)$ from the interval $[0,1)$ to $\mathbb{R}$. Then we define $\zeta_l(t,\xi)$ on the whole of $\mathbb{R}$ by the equation
\[ \zeta_l(t,\xi) = e^{it\xi}\omega_l(t,\xi), \quad t \in \mathbb{R}. \]
Let $P_l$ be the spectral projection of $\mathcal{A}$ associated with the band $\mathfrak{b}_l$. The map
\[ (U_ly)(\xi) = (2\pi)^{-1/2}\int_{\mathbb{R}}\zeta_l(t,\xi)y(t)\,dt \]
defines the unitary operator from $L^2(\mathbb{R})$ onto $L^2(-\pi,\pi)$ which diagonalizes $\mathcal{A}P_l$, namely
\[ (U_l\mathcal{A}P_ly)(\xi) = \lambda_l(\xi)(U_lP_ly)(\xi), \qquad \lambda_l(\xi) = \mu_l^2(\xi). \]
The adjoint operator $U_l^* : L^2(-\pi,\pi) \to L^2(\mathbb{R})$ is given by
\[ (U_l^*z)(t) = (2\pi)^{-1/2}\int_{-\pi}^{\pi}\zeta_l(t,\xi)z(\xi)\,d\xi = (2\pi)^{-1/2}\int_{-\pi}^{\pi}e^{it\xi}\omega_l(t,\xi)z(\xi)\,d\xi. \]
Denoting by $[m]$ the operator of multiplication by a scalar function $m$, we get the spectral decomposition of $\mathcal{A}$ in the form
\[ \mathcal{A} = \sum_{l\in\mathbb{N}}U_l^*[\lambda_l]U_lP_l. \tag{3.8} \]
3.2. Spectrum of the operators $A_0$ and $\mathbf{A}_0$

Theorem 3.2. The spectrum of the operator $A_0$ consists of the bands $\mathfrak{b}_l$, $l \in \mathbb{N}$, and of the simple eigenvalues $\lambda_l = (\pi l)^2$, $l \in \mathbb{N}$. The corresponding eigenfunctions (normalized in $L^2(\mathbb{R}_+)$) are
\[ y_l(t) = c(b)\,b^{-n/2}\sin(\pi lt), \quad t \in (n-1,n), \quad n \in \mathbb{N}; \qquad c(b) = (2(b-1))^{1/2}. \tag{3.9} \]
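As a quick numerical illustration (a sketch under the stated formulas, not part of the proof), one can integrate the square of the unnormalized profile $b^{-n/2}\sin(\pi lt)$ over $\mathbb{R}_+$ and check the derivative jump (2.10) at the integer points. The squared norm comes out as $1/(2(b-1))$, which is the quantity that determines the normalizing constant $c(b)$. The function names and the truncation parameters below are choices made for this example.

```python
import math


def profile(b: float, l: int, t: float) -> float:
    """Unnormalized eigenfunction b^{-n/2} sin(pi l t) for t in (n-1, n)."""
    n = math.floor(t) + 1
    return b ** (-n / 2) * math.sin(math.pi * l * t)


def norm_squared(b: float, l: int, n_intervals: int = 60, m: int = 400) -> float:
    """Midpoint-rule approximation of the squared L^2(R_+) norm of the
    profile, truncated after n_intervals unit intervals (the tail decays
    like b^{-n})."""
    h = 1.0 / m
    total = 0.0
    for n in range(1, n_intervals + 1):
        for i in range(m):
            t = (n - 1) + (i + 0.5) * h
            total += profile(b, l, t) ** 2 * h
    return total


def one_sided_derivative(b: float, l: int, n: int, side: str) -> float:
    """Derivative of the profile at the integer point n, taken from the
    left ('-', interval (n-1, n)) or from the right ('+', interval (n, n+1))."""
    interval = n if side == "-" else n + 1
    return b ** (-interval / 2) * math.pi * l * math.cos(math.pi * l * n)


if __name__ == "__main__":
    for b, l in ((2, 1), (3, 2), (5, 3)):
        assert math.isclose(norm_squared(b, l), 1.0 / (2 * (b - 1)), rel_tol=1e-9)
        for n in (1, 2, 3):
            left = one_sided_derivative(b, l, n, "-")
            right = one_sided_derivative(b, l, n, "+")
            assert math.isclose(right, left / math.sqrt(b))  # condition (2.10)
```

The jump condition (2.8) for the values is satisfied trivially, since $\sin(\pi ln) = 0$ on both sides of every integer point.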
Proof. The operator $A_0$ is non-negative, so its spectrum lies on $[0,\infty)$.

1. Bands. Let $D$ be the operator in $L^2(\mathbb{R}_+)$, defined as follows: its operator domain coincides with the quadratic domain of $A_0$, i.e. is defined by (2.7)–(2.9), and for $y$ from this domain
\[ (Dy)(t) = -iy'(t), \quad t \notin \mathbb{N}. \tag{3.10} \]
The operator $D$ is closed and its adjoint $D^*$ acts by the same formula (3.10) on the domain consisting of those functions $y$ from the direct product (2.7) which satisfy (2.9) and the matching conditions similar to the ones in (2.8) but with the factor $b^{1/2}$ replaced by $b^{-1/2}$; there is no boundary condition at $t = 0$. It is easy to see that $A_0 = D^*D$. Along with $A_0$, consider the operator $DD^*$. It acts as $(DD^*y)(t) = -y''(t)$, $t \notin \mathbb{N}$, and its domain is described by the boundary condition $y'(0) = 0$ and the matching conditions
\[ y(n+) = b^{-1/2}y(n-), \qquad y'(n+) = b^{1/2}y'(n-), \quad n \in \mathbb{N}. \]
According to the general operator theory, the non-zero spectra of the operators $A_0 = D^*D$ and $DD^*$ coincide.

Now, in the definition of the operator $\mathcal{A}$ let us replace the matching condition at $t = 0$ by the boundary conditions
\[ y(0+) = 0, \qquad y'(0-) = 0. \]
The new operator, say $\mathcal{A}'$, splits into the orthogonal sum $\mathcal{A}' = A_0 \oplus A_0'$, where the operator $A_0'$ acts in $L^2(\mathbb{R}_-)$. Its description is clear from the construction and it is easy to see that the substitution $t \mapsto -t$ reduces $A_0'$ to $DD^*$. The essential spectrum of $\mathcal{A}'$ is the same as that of $\mathcal{A}$, i.e. $\bigcup_{l\in\mathbb{N}}\mathfrak{b}_l$. It also coincides with the union of the essential spectra of the operators $A_0$ and $A_0'$, i.e. with each of them. It follows that the essential spectrum of $A_0$ coincides with the spectrum of $\mathcal{A}$ given by Lemma 3.1.

2. Eigenvalues. The fact that each function $y_l(t)$, cf. (3.9), is the eigenfunction corresponding to the eigenvalue $(\pi l)^2$ can be verified by direct inspection. Any two solutions satisfying the boundary condition $y(0) = 0$ are proportional to each other, so that these eigenvalues are simple. Direct inspection shows also that $\lambda = 0$ is not an eigenvalue. So it remains to show that any number $\lambda = k^2 > 0$ with $\pi^{-1}k \notin \mathbb{N}$ cannot be an eigenvalue. For this purpose we use the explicit formulae for the solutions of the equation
\[ y''(t) + k^2y(t) = 0, \quad t \notin \mathbb{N}, \tag{3.11} \]
under the matching conditions (3.1). Namely, let $q_1, q_2$ be found from the quadratic equation
\[ q^2 - 2Rq\cos k + 1 = 0, \tag{3.12} \]
June 4, 2002 11:0 WSPC/148-RMP
432
00123
A. V. Sobolev & M. Solomyak
where $R$ is defined in (3.3). Suppose that $q_1 \neq q_2$, that is $R|\cos k| \neq 1$. The functions
$$y_j(t) = \bigl(b^{1/2}\sin k(n-t) + q_j \sin k(t-n+1)\bigr)\, q_j^{\,n-1}, \qquad n-1 < t < n, \quad n \in \mathbb{N}, \quad j = 1, 2, \tag{3.13}$$
are solutions of the equation (3.11) under the matching conditions (3.1). Their Wronskian is equal to $y_1 y_2' - y_1' y_2 = b^{1/2}(q_2 - q_1)k\sin k$, so that the solutions $y_1, y_2$ are linearly dependent only if $\pi^{-1}k \in \mathbb{N}$, which is the excluded case. Any solution satisfying the condition $y(0+) = 0$ is proportional to the function
$$y_0(t) = \frac{y_2(t) - y_1(t)}{q_2 - q_1} = b^{1/2}\,\frac{q_2^{\,n-1} - q_1^{\,n-1}}{q_2 - q_1}\,\sin k(n-t) + \frac{q_2^{\,n} - q_1^{\,n}}{q_2 - q_1}\,\sin k(t-n+1), \qquad n-1 < t < n, \quad n \in \mathbb{N}. \tag{3.14}$$
For $\pi^{-1}k \notin \mathbb{N}$ this function does not lie in $L^2(\mathbb{R}_+)$ and hence is not an eigenfunction. If $R|\cos k| = 1$, then $q_1 = q_2 = \pm 1$ and it is easy to see that the problem (3.11), (3.1) again has no $L^2$ solutions, and we are done.

The result for the operator $\mathbf{A}_0$, that is for the Laplacian on the tree, immediately follows from Theorem 2.4 and Theorem 3.2.

Theorem 3.3. The spectrum of the operator $\mathbf{A}_0$ is of infinite multiplicity and consists of the bands $\mathfrak{b}_l$ and the eigenvalues $\lambda_l = (\pi l)^2$, $l \in \mathbb{N}$.

We see that the gap $\mathfrak{l}_0$ of the operator $A$ is also a gap for $\mathbf{A}_0$, and each gap $\mathfrak{l}_l$ of $A$ with $l \ge 1$ splits into two gaps when we turn to the operator $\mathbf{A}_0$:
$$\mathfrak{l}_{l,-} = \bigl((\pi l - \theta)^2, (\pi l)^2\bigr); \qquad \mathfrak{l}_{l,+} = \bigl((\pi l)^2, (\pi l + \theta)^2\bigr).$$
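The explicit solutions (3.13) can be checked numerically. The following is only a sketch under stated assumptions: we take $R = (b^{1/2} + b^{-1/2})/2$ as our reading of (3.3), and we take the matching conditions (3.1) to be $y(n+) = b^{1/2}y(n-)$, $y'(n+) = b^{-1/2}y'(n-)$, which is what the recurrence (3.16) later in this section encodes. The helper names are hypothetical.

```python
import cmath
import math


def roots_q(k, b):
    """Roots q1, q2 of q^2 - 2 R q cos k + 1 = 0, cf. (3.12).

    Assumption: R = (b**0.5 + b**-0.5) / 2, our reading of (3.3).
    """
    R = 0.5 * (math.sqrt(b) + 1.0 / math.sqrt(b))
    p = R * math.cos(k)
    d = cmath.sqrt(p * p - 1.0)  # imaginary inside a band, real inside a gap
    return p - d, p + d


def y_dy(t, n, q, k, b):
    """Value and derivative of the function (3.13) on the interval (n-1, n)."""
    sb = math.sqrt(b)
    scale = q ** (n - 1)
    y = (sb * math.sin(k * (n - t)) + q * math.sin(k * (t - n + 1))) * scale
    dy = k * (-sb * math.cos(k * (n - t)) + q * math.cos(k * (t - n + 1))) * scale
    return y, dy


def matching_defect(n, q, k, b):
    """Violation of the assumed conditions y(n+) = b^{1/2} y(n-), y'(n+) = b^{-1/2} y'(n-)."""
    sb = math.sqrt(b)
    y_left, dy_left = y_dy(n, n, q, k, b)        # limit t -> n-
    y_right, dy_right = y_dy(n, n + 1, q, k, b)  # limit t -> n+
    return abs(y_right - sb * y_left), abs(dy_right - dy_left / sb)


def wronskian(t, n, k, b):
    """y1 y2' - y1' y2 evaluated at a point t of the interval (n-1, n)."""
    q1, q2 = roots_q(k, b)
    y1, dy1 = y_dy(t, n, q1, k, b)
    y2, dy2 = y_dy(t, n, q2, k, b)
    return y1 * dy2 - dy1 * y2
```

Under these assumptions the roots are real with $|q_1| < 1 < |q_2|$ when $k$ lies in a gap and have modulus one when $k$ lies in a band, which is the dichotomy used in the proof.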
3.3. Global quasi-momentum and density of states

Define the density of states for the operators $A$ and $A_0$ as the limit
$$\rho(\lambda) = \lim \frac{N_P(\lambda; \Delta)}{|\Delta|}, \qquad |\Delta| \to \infty. \tag{3.15}$$
Here we denote $\Delta = (0, L)$, $L \in \mathbb{N}$, and $N_P(\lambda) = \#\{j : \mu_j^2 < \lambda\}$ is the counting function for the operator $By = -y''$ which at the points $1, \dots, L-1$ has the same matching conditions as in (3.1), and also satisfies the boundary conditions
$$y(0) = b^{1/2} y(L), \qquad y'(0) = b^{-1/2} y'(L).$$
The subscript $P$ in the notation for the counting function indicates that the operator $B$ has boundary conditions of this type. If the limit (3.15) exists for these conditions, then it will also exist for any other conditions, and its value will not depend on them. Later, in order to calculate the density of states we shall use the
June 4, 2002 11:0 WSPC/148-RMP
00123
Schr¨ odinger Operators on Homogeneous Metric Trees
433
same formula (3.15), but with the counting function of the Dirichlet problem. In this case we do not use any subscripts and simply write $N(\lambda)$.

Let us find the eigenvalues of $B$. Denote $k = \sqrt{\lambda}$, then choose solutions on every interval $(n, n+1)$ in the form $y(t) = \alpha_n \cos k(t-n) + \beta_n \sin k(t-n)$. In view of the matching conditions, we come, with the notations $c = \cos k$, $s = \sin k$, to the equalities
$$\alpha_n = b^{1/2}(\alpha_{n-1}c + \beta_{n-1}s), \qquad \beta_n = b^{-1/2}(-\alpha_{n-1}s + \beta_{n-1}c), \tag{3.16}$$
for $n = 0, \dots, L$. Here we have identified the points with $n = 0$ and $n = L$, so that $\alpha_0 = \alpha_L$ and $\beta_0 = \beta_L$. To solve this system introduce the functions
$$A(z) = \sum_{n=0}^{L-1} \alpha_n z^n, \qquad B(z) = \sum_{n=0}^{L-1} \beta_n z^n,$$
where $z$ runs over the set of all complex numbers such that $z^L = 1$. Then by (3.16)
$$A(z) = b^{1/2} z\,(cA(z) + sB(z)), \qquad B(z) = b^{-1/2} z\,(-sA(z) + cB(z)).$$
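This system of two equations has a non-trivial solution iff its determinant is zero:
$$\det\begin{pmatrix} c\,b^{1/2}z - 1 & s\,b^{1/2}z \\ -s\,b^{-1/2}z & c\,b^{-1/2}z - 1 \end{pmatrix} = c^2z^2 - c\,b^{-1/2}z - c\,b^{1/2}z + 1 + s^2z^2 = z^2 - 2cRz + 1 = 0,$$
whence
$$R\cos k = Rc = \frac{z + z^{-1}}{2} = \cos\frac{2\pi n}{L}, \qquad n = 0, 1, \dots, L-1.$$
It is convenient to write the formulae for the eigenvalues in terms of the function $\varphi$ defined by (3.3), and the formulae for $N_P(\lambda)$ in terms of the "global quasi-momentum" $\omega(\lambda)$ which we now define. Namely, $\omega(\lambda) = \pi l$ if $\lambda \in \mathfrak{l}_l$, and for $\lambda \in \mathfrak{b}_l$
$$\omega(\lambda) = \begin{cases} \pi(l-1) + \psi\bigl(\sqrt{\lambda} - \pi(l-1)\bigr), & l \text{ is odd}, \\ \pi l - \psi\bigl(\pi l - \sqrt{\lambda}\bigr), & l \text{ is even}. \end{cases} \tag{3.17}$$
Here $\psi$ is the function inverse to $\varphi$, cf. (3.5). Evidently
$$c\sqrt{\lambda - \theta^2} \le \omega(\lambda) \le C\sqrt{\lambda}, \qquad \lambda \ge 0.$$
The eigenvalues of the operator $B$ are given by the formulae
$$\mu_l(j) = \pi(l-1) + \varphi\Bigl(\frac{2\pi j}{L}\Bigr), \quad l \text{ odd}; \qquad \mu_l(j) = \pi l - \varphi\Bigl(\frac{2\pi j}{L}\Bigr), \quad l \text{ even}, \qquad l \in \mathbb{N}, \quad j = 0, 1, \dots, L-1. \tag{3.18}$$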
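The determinant identity and the formulae (3.18) can be sanity-checked numerically. The sketch below is illustrative only and rests on assumptions not visible in this section: $R = (b^{1/2} + b^{-1/2})/2$ and $\varphi(x) = \arccos(\cos x / R)$, which is our reading of (3.3) suggested by the relation $R\cos k = \cos(2\pi n/L)$. Function names are hypothetical.

```python
import cmath
import math


def R_of(b):
    # Assumed form of R from (3.3): half the sum of b^{1/2} and b^{-1/2}.
    return 0.5 * (math.sqrt(b) + 1.0 / math.sqrt(b))


def det_system(z, k, b):
    """Determinant of the 2x2 system for A(z), B(z) derived from (3.16)."""
    c, s, sb = math.cos(k), math.sin(k), math.sqrt(b)
    return (c * sb * z - 1) * (c * z / sb - 1) + (s * sb * z) * (s * z / sb)


def phi(x, b):
    # Assumed form of the function phi from (3.3).
    return math.acos(math.cos(x) / R_of(b))


def mu(l, j, L, b):
    """The numbers mu_l(j) of (3.18)."""
    f = phi(2 * math.pi * j / L, b)
    return math.pi * (l - 1) + f if l % 2 == 1 else math.pi * l - f


def residual(l, j, L, b):
    """|det| at z = exp(2 pi i j / L), k = mu_l(j); should vanish up to rounding."""
    z = cmath.exp(2j * math.pi * j / L)
    return abs(det_system(z, mu(l, j, L, b), b))
```

The first check confirms that the determinant collapses to $z^2 - 2cRz + 1$; the second confirms that each $\mu_l(j)$ makes it vanish at the corresponding $L$-th root of unity.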
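The number $N_P(\lambda)$ depends on the location of $\lambda$. For instance, if $\lambda \in \mathfrak{l}_l$, then $N_P(\lambda) = lL$, so that
$$\rho(\lambda) = \frac{N_P(\lambda, \Delta)}{|\Delta|} = l = \frac{1}{\pi}\,\omega(\lambda), \qquad \lambda \in \mathfrak{l}_l.$$
To cover the case $\lambda \in \mathfrak{b}_l$ we shall consider two options: $l$ is odd or $l$ is even. Suppose first that $l$ is odd and denote $\sqrt{\lambda} = \pi(l-1) + \chi$. Then
$$\frac{N_P(\lambda; \Delta)}{L} = l - 1 + \ell_1(\chi), \qquad \ell_1(\chi) = \frac{1}{L}\,\#\Bigl\{ j \in [0, L-1) : \varphi\Bigl(\frac{2\pi j}{L}\Bigr) < \chi \Bigr\}.$$
The term $\ell_1(\chi)$ can be estimated as follows:
$$\Bigl| \ell_1(\chi) - \frac{1}{\pi}\,\psi(\chi) \Bigr| \le 2L^{-1}.$$
Hence by (3.17)
$$\Bigl| \frac{N_P(\lambda; \Delta)}{L} - \frac{1}{\pi}\,\omega(\lambda) \Bigr| \le 2L^{-1}, \qquad \forall\, \lambda > 0. \tag{3.19}$$
Suppose now that $l$ is even. Denote $\sqrt{\lambda} = \pi l - \chi$. Since the eigenvalues of the $l$-th band lying below $\lambda$ are exactly those with $\varphi(2\pi j/L) > \chi$, we have
$$\frac{N_P(\lambda; \Delta)}{L} = l - 1 + \ell_2(\chi), \qquad \ell_2(\chi) = \frac{1}{L}\,\#\Bigl\{ j \in [0, L-1) : \varphi\Bigl(\frac{2\pi j}{L}\Bigr) > \chi \Bigr\}.$$
The term $\ell_2(\chi)$ can be estimated as follows:
$$\Bigl| \ell_2(\chi) + \frac{1}{\pi}\,\psi(\chi) - 1 \Bigr| \le 2L^{-1}.$$
Using (3.17) again, we get (3.19). All this results in the formula
$$\rho(\lambda) = \frac{1}{\pi}\,\omega(\lambda), \tag{3.20}$$
which is well known for the classical Hill operator. Relying upon the estimate (3.19) we shall prove a similar estimate for the counting function of the Dirichlet problem on an arbitrary interval $(R_1, R_2)$, not necessarily with integer $R_1, R_2$.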
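Theorem 3.4. Let $\Delta = (R_1, R_2)$, $R_1, R_2 \in \mathbb{R}$, $R_1 < R_2$. Then the inequality
$$\Bigl| \frac{N(\lambda; \Delta)}{|\Delta|} - \rho(\lambda) \Bigr| \le C\,\frac{1 + \sqrt{\lambda}}{|\Delta|} \tag{3.21}$$
holds for all $\lambda > 0$, with a universal constant $C$.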
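The relation $\rho = \omega/\pi$ and the bound (3.19) for the periodic problem lend themselves to a direct numerical check. The sketch below is not part of the original argument; it assumes $R = (b^{1/2} + b^{-1/2})/2$, $\varphi(x) = \arccos(\cos x/R)$ and hence $\psi(\mu) = \arccos(R\cos\mu)$ as our reading of (3.3) and (3.5), and it enumerates $j = 0, \dots, L-1$. With these assumptions both cases of (3.17) reduce to $\omega(k^2) = \pi l + \arccos(R\cos(k - \pi l))$ with $l = \lfloor k/\pi \rfloor$ and the argument of $\arccos$ clamped to $[-1, 1]$ inside the gaps.

```python
import math


def R_of(b):
    # Assumed form of R from (3.3).
    return 0.5 * (math.sqrt(b) + 1.0 / math.sqrt(b))


def phi(x, b):
    # Assumed form of phi from (3.3).
    return math.acos(math.cos(x) / R_of(b))


def omega(lam, b):
    """Global quasi-momentum (3.17), in the unified form described above."""
    if lam <= 0:
        return 0.0
    k = math.sqrt(lam)
    l = math.floor(k / math.pi)
    x = k - math.pi * l
    v = max(-1.0, min(1.0, R_of(b) * math.cos(x)))  # clamping handles the gaps
    return math.pi * l + math.acos(v)


def count_periodic(lam, L, b):
    """N_P(lam; (0, L)): number of pairs (l, j) with mu_l(j)^2 < lam, cf. (3.18)."""
    if lam <= 0:
        return 0
    k = math.sqrt(lam)
    l_max = int(k // math.pi) + 2  # bands above this index cannot contribute
    count = 0
    for l in range(1, l_max + 1):
        for j in range(L):
            f = phi(2 * math.pi * j / L, b)
            m = math.pi * (l - 1) + f if l % 2 == 1 else math.pi * l - f
            if m * m < lam:
                count += 1
    return count


def density_gap(lam, L, b):
    """|N_P / L - omega / pi|, the left-hand side of (3.19)."""
    return abs(count_periodic(lam, L, b) / L - omega(lam, b) / math.pi)
```

For moderate $L$ one can compare `density_gap(lam, L, b)` with $2/L$ for points in bands and gaps alike; inside a gap the difference vanishes exactly.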
Proof. It is well known that for the Dirichlet realization of the operator $-y''$ on $\Delta$ (with no matching conditions inside!) the counting function is controlled by $C|\Delta|\sqrt{\lambda}$, with a universal constant. Since the number of integer points inside $\Delta$ is not greater than $|\Delta| + 1$, it follows from the decoupling principle that
$$N(\lambda; \Delta) \le C\bigl(|\Delta|\sqrt{\lambda} + |\Delta| + 1\bigr). \tag{3.22}$$
If the length of $\Delta$ is small, say $|\Delta| \le 2$, then (3.21) is implied by (3.18) and (3.22). Let now $|\Delta| > 2$. Without loss of generality, we may assume $0 \le R_1 < 1$. Define $L = [R_2]$ (the integer part of $R_2$), then $L \ge 2$. Let $\Delta_+$ and $\Delta_-$ be the intervals $(0, L+1)$ and $(1, L)$ respectively. Then, clearly,
$$N(\lambda; \Delta_-) \le N(\lambda; \Delta) \le N(\lambda; \Delta_+)$$
by a variational argument. Furthermore, by the decoupling principle,
$$|N(\lambda; \Delta_\pm) - N_P(\lambda; \Delta_\pm)| \le 4,$$
so that
$$\frac{N_P(\lambda; \Delta_-)}{|\Delta|} - \frac{4}{|\Delta|} \le \frac{N(\lambda; \Delta)}{|\Delta|} \le \frac{N_P(\lambda; \Delta_+)}{|\Delta|} + \frac{4}{|\Delta|}.$$
Therefore
$$\Bigl| \frac{N(\lambda; \Delta)}{|\Delta|} - \rho(\lambda) \Bigr| \le \max_{\pm} \Bigl| \frac{N_P(\lambda; \Delta_\pm)}{|\Delta|} - \rho(\lambda) \Bigr| + \frac{4}{|\Delta|}.$$
Let us estimate the r.h.s. with the "−" sign. The modulus equals
$$\Bigl| \frac{N_P(\lambda; \Delta_-)}{L-1}\,\frac{L-1}{|\Delta|} - \rho(\lambda) \Bigr| \le \Bigl| \frac{N_P(\lambda; \Delta_-)}{L-1} - \rho(\lambda) \Bigr| + \frac{2N_P(\lambda; \Delta_-)}{|\Delta|(L-1)}.$$
In view of (3.19), the first term in the r.h.s. is bounded by $2(L-1)^{-1} \le 3L^{-1}$, and in view of (3.22), the second term is bounded by
$$\frac{C\bigl((L-1)\sqrt{\lambda} + L\bigr)}{|\Delta|(L-1)} \le \frac{C(\sqrt{\lambda} + 1)}{|\Delta|}.$$
Repeating the same argument for the "+" sign, we arrive at (3.21).

We conclude this section by discussing the Hölder properties of the global quasi-momentum $\omega(\lambda)$ and thus those of the density of states $\rho(\lambda)$. It is clear from (3.6) that near the edges of the gap $\mathfrak{l}_l = (\lambda_-, \lambda_+)$ the function $\rho$ has the following behaviour:
$$\rho(\lambda) = \rho(\lambda_\pm) \pm \frac{1}{\pi}\,\frac{(R^2 - 1)^{1/4}}{\lambda_\pm^{1/4}}\,|\lambda - \lambda_\pm|^{1/2} + O(\lambda - \lambda_\pm), \qquad \lambda \to \lambda_\pm \pm 0.$$
Together with the formula (3.18) this asymptotics guarantees that
$$|\rho(\lambda) - \rho(\lambda_\pm)| \ge c\,|\lambda - \lambda_\pm|^{1/2}, \qquad \lambda \in \mathfrak{l}_l, \tag{3.23}$$
with a constant $c$ depending on $l$. The formula (3.3) also ensures that the function $\psi$ is $1/2$-Hölder continuous, i.e.
$$|\psi(\mu_2) - \psi(\mu_1)| \le C|\mu_2 - \mu_1|^{1/2}, \qquad \mu_1, \mu_2 \in [\theta, \pi - \theta].$$
Using (3.18), one can immediately extend this information to the function $\omega$:
$$|\omega(\lambda_2) - \omega(\lambda_1)| \le C\bigl( |\lambda_2^{1/2} - \lambda_1^{1/2}|^{1/2} + |\lambda_2^{1/2} - \lambda_1^{1/2}| \bigr), \qquad \lambda_1, \lambda_2 \ge \theta,$$
with a constant $C$ independent of $\lambda_1, \lambda_2$. Later we shall use a less precise, but somewhat more compact consequence of this estimate and (3.20):
$$|\rho(\lambda_2) - \rho(\lambda_1)| \le C|\lambda_2 - \lambda_1|^{1/2}, \qquad \lambda_1, \lambda_2 \in \mathbb{R}, \tag{3.24}$$
with a universal constant $C$.

4. Operator $\mathbf{A}_V$ with a Decaying Potential. Eigenvalues in the Gaps

4.1. Functions $M(\lambda)$ and $N(\lambda_1, \lambda_2)$

Here we turn to the study of the spectrum of the Schrödinger operators $\mathbf{A}_V$, cf. (2.1), with a real-valued and bounded potential $V(|x|)$ which in an appropriate sense decays as $|x| \to \infty$. The essential spectrum of $\mathbf{A}_V$ is the same as for the unperturbed operator $\mathbf{A}_0$ (i.e. the Laplacian) and therefore is given by Theorem 3.3. The spectrum of $\mathbf{A}_V$ may also include eigenvalues lying in the gaps of $\mathbf{A}_0$. For their study, the following quantities are standardly used.

Let $C$ be a self-adjoint operator in a Hilbert space, and let $V$ be its relatively compact perturbation; we denote $C_V = C + V$. Suppose that the interval $(\lambda_-, \lambda_+)$ is a gap in $\sigma(C)$. Let $\lambda \in (\lambda_-, \lambda_+)$. Define the counting function $M(\lambda; C_V)$ as the number of eigenvalues of $C_{\alpha V}$ crossing the point $\lambda$ while $\alpha$ varies from 0 to 1. In other words,
$$M(\lambda; C_V) = \sum_{0 < \alpha < 1} \dim\ker(C + \alpha V - \lambda).$$
If $\lambda$ coincides with one of the ends of a gap, the function $M(\lambda; C_V)$ is defined as the corresponding one-sided limit. If $V$ is a perturbation of fixed sign, that is if $V = \pm q$ with a $q \ge 0$, then the function $M(\lambda)$ is increasing (for $V = -q$) or decreasing (for $V = q$) in $\lambda \in (\lambda_-, \lambda_+)$ and increasing in $q$. For any subinterval $(\lambda_1, \lambda_2) \subset (\lambda_-, \lambda_+)$ the function $N(\lambda_1, \lambda_2; C_V)$ is defined as the total multiplicity of eigenvalues of the operator $C_V$ lying in $(\lambda_1, \lambda_2)$. Note that if $\lambda_- = -\infty$ and $\lambda \le \lambda_+$, then
$$N(\lambda; C_V) := N(-\infty, \lambda; C_V) = M(\lambda; C_V).$$
According to Theorem 3.3, the following equalities hold:
$$M(\lambda; \mathbf{A}_V) = M(\lambda; A_V) + (1 - b^{-1}) \sum_{k \in \mathbb{N}} b^k M(\lambda; A_{V_k}), \tag{4.1}$$
$$N(\lambda_1, \lambda_2; \mathbf{A}_V) = N(\lambda_1, \lambda_2; A_V) + (1 - b^{-1}) \sum_{k \in \mathbb{N}} b^k N(\lambda_1, \lambda_2; A_{V_k}). \tag{4.2}$$
Recall that the potentials $V_k$ appearing in (4.1), (4.2) were defined in (2.11). These formulae show that the key step to understanding the behaviour of the functions $M(\lambda; \mathbf{A}_V)$, $N(\lambda_1, \lambda_2; \mathbf{A}_V)$ consists in studying the individual terms of the series (4.1), (4.2). More precisely, we need detailed information about their behaviour depending on the parameter $k$. The study of the sums (4.1), (4.2) is hampered by the presence of the exponential factors in their r.h.s. These factors reflect the geometry of the tree rather than the properties of the potential $V(t)$. For this reason, it makes sense to investigate, along with the functions $M(\lambda; \mathbf{A}_V)$, $N(\lambda_1, \lambda_2; \mathbf{A}_V)$, also the functions
$$\tilde{M}(\lambda; A_V) = \sum_{k \ge 0} M(\lambda; A_{V_k}), \tag{4.3}$$
$$\tilde{N}(\lambda_1, \lambda_2; A_V) = \sum_{k \ge 0} N(\lambda_1, \lambda_2; A_{V_k}). \tag{4.4}$$
For technical reasons, we shall also need the functions $M$, $N$ for the operators on intervals $\Delta \subseteq \mathbb{R}$. Define $A_{V,\Delta}$ as the operator in $L^2(\Delta)$ acting as $(A_{V,\Delta}y)(t) = -y''(t) + V(t)y(t)$ for $t \notin \mathbb{Z}$, under the zero boundary conditions at each finite end of $\Delta$ and the matching conditions (3.1) at the points $n \in \mathbb{Z} \cap \Delta$. In particular, $A_{V,\mathbb{R}_+} = A_V$. Often we use abbreviated notation for the corresponding functions $M$, $N$, such as $M(\lambda, V; \Delta)$ or even $M(\lambda; \Delta)$ when the potential $V$ is fixed. Note a convenient relation
$$M(\lambda; A_{V_k}, (R_1, R_2)) = M(\lambda; A_V, (R_1 + k, R_2 + k)), \tag{4.5}$$
which is valid for any $0 \le R_1 < R_2 \le \infty$ and integer $k$. This formula is useful when it is more natural to study the dependence of $M$ on the interval $\Delta$ than on the potential.

If $V$ is a function of constant sign, then there is a useful relationship between $M(\lambda; A_{V,\Delta})$ and the spectrum of the compact operator
$$T(\lambda) = T(\lambda, V, \Delta) = |V|^{1/2}(A_{0,\Delta} - \lambda I)^{-1}|V|^{1/2}, \qquad \lambda \notin \sigma(A_{0,\Delta}). \tag{4.6}$$
Namely, if $\lambda$ is a regular point of $A_{0,\Delta}$, then
$$M(\lambda; V, \Delta) = n_+(1, T(\lambda, V, \Delta)), \qquad V \le 0; \tag{4.7}$$
$$M(\lambda; V, \Delta) = n_-(1, T(\lambda, V, \Delta)), \qquad V \ge 0. \tag{4.8}$$
Here $n_\pm(\cdot, T)$ stands for the counting functions of the positive and negative eigenvalues $\pm\lambda_j^\pm(T)$ of a compact, self-adjoint operator $T$, that is
$$n_\pm(s, T) = \#\{j : \lambda_j^\pm(T) > s\}, \qquad s > 0.$$
The equalities (4.7), (4.8) have proved very effective in problems of the type considered, see e.g. [21]. Actually, these are facts of a rather general nature, see e.g. [3, Proposition 1.5].

The following relations have their prototypes in the theory of the perturbed Hill operator, see [21, (2.5)–(2.8)]. For bounded $\Delta$
$$\begin{cases} M(\lambda; V, \Delta) = N(\lambda; V, \Delta) - N(\lambda; 0, \Delta), & V \le 0, \\ M(\lambda; V, \Delta) = N(\lambda+; 0, \Delta) - N(\lambda+; V, \Delta), & V \ge 0. \end{cases} \tag{4.9}$$
Further, for any (bounded or unbounded) $\Delta$
$$\bigl| N(\lambda_1, \lambda_2; V, \Delta) - |M(\lambda_2; V, \Delta) - M(\lambda_1; V, \Delta)| \bigr| \le N(\lambda_1, \lambda_2; 0, \Delta) + 1. \tag{4.10}$$
One can give a more precise formula: for any two points $\lambda_1, \lambda_2$ such that $N(\lambda_1, \lambda_2; 0, \Delta) = 0$, we obtain from (4.9):
$$\begin{cases} N(\lambda_1, \lambda_2; V, \Delta) = M(\lambda_2; V, \Delta) - M(\lambda_1+; V, \Delta), \\ N(\lambda_1, \lambda_2+; V, \Delta) = M(\lambda_1; V, \Delta) - M(\lambda_2; V, \Delta). \end{cases} \tag{4.11}$$
The next two inequalities are usually referred to as the "decoupling principle". Let $\Delta_1 = (a, d)$, $\Delta_2 = (d, c)$, $-\infty \le a < d < c \le \infty$, and $\Delta = (a, c)$. Then
$$|N(\lambda_1, \lambda_2; V, \Delta) - (N(\lambda_1, \lambda_2; V, \Delta_1) + N(\lambda_1, \lambda_2; V, \Delta_2))| \le 2, \tag{4.12}$$
$$|M(\lambda; V, \Delta) - (M(\lambda; V, \Delta_1) + M(\lambda; V, \Delta_2))| \le 2. \tag{4.13}$$
The proofs of the relations (4.9)–(4.13) are either straightforward, or are based upon standard facts from perturbation theory. Note that the number 1 rather than 2 stands in the r.h.s. of the inequalities [21, (2.7) and (2.8)] whose analogs are the above inequalities (4.12), (4.13). This difference appears due to the nature of the matching conditions at the points $n \in \mathbb{Z}$. If $d \notin \mathbb{Z}$, one can replace 2 by 1 in (4.12) and (4.13).

4.2. Individual Weyl asymptotics

The material presented in this subsection is a minor refinement of [21, Theorems 3.2 and 3.3]. We give it here for the operators we need in this paper (that is, the functions in the domains of the operators considered are subject to the matching conditions (3.1)). However, it is useful to keep in mind that the results of Theorem 4.2(i) and of Theorem 4.3 hold also for the usual Hill operator.

For a real-valued function $V$ on $\mathbb{R}_+$, introduce the quantity
$$J(V) = \sum_{n \in \mathbb{Z}} \Bigl( \int_{2^{n-1}}^{2^n} t|V(t)|\,dt \Bigr)^{1/2}.$$
Consider the operator on $L^2(\mathbb{R}_+)$:
$$K_V y = -y'' + Vy, \qquad y \in H^2(\mathbb{R}_+), \qquad y(0) = 0. \tag{4.14}$$
In contrast to the operator $A_V$, the description of $K_V$ involves no matching conditions, and the quadratic domain of $K_V$ is $H^{1,0}(\mathbb{R}_+)$. The following estimate and asymptotics are particular cases of the results of [9, Sec. 6]; see also expositions in [6] and [7]. A close result was obtained earlier in [8, Theorems 4.18 and 4.19].

Proposition 4.1. Let $J(V) < \infty$. Then the negative spectrum of the operator $K_V$ is finite and there exists an absolute constant $C > 0$ such that
$$M(0; K_V) \le C J(V_-). \tag{4.15}$$
Besides, let $g > 0$ be the large parameter and $V_- \not\equiv 0$. The function $M(0; K_{gV})$ satisfies Weyl's asymptotics
$$\lim g^{-1/2} M(0; K_{gV}) = \pi^{-1} \int_{\mathbb{R}_+} \sqrt{V_-(t)}\,dt, \qquad g \to \infty. \tag{4.16}$$
If $|V(t)|$ monotonically decreases, then by Hölder's inequality
$$J(V)/\sqrt{6} \le \int_{\mathbb{R}_+} |V(t)|^{1/2}\,dt \le \sqrt{6}\,J(V). \tag{4.17}$$
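As a numerical illustration of the two-sided bound (4.17), one can evaluate $J(V)$ and $\int |V|^{1/2}\,dt$ for simple decreasing potentials. The sketch below is only a sanity check, not part of the argument: the dyadic sum is truncated to $-40 \le n \le 40$, each piece is integrated by composite Simpson quadrature, and the two sample potentials $e^{-2t}$ and $(1+t)^{-4}$ are our own choices.

```python
import math


def simpson(f, a, b, m=200):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def dyadic_pieces(n_min=-40, n_max=40):
    """Intervals (2^(n-1), 2^n); truncating the range of n is an approximation."""
    return [(2.0 ** (n - 1), 2.0 ** n) for n in range(n_min, n_max + 1)]


def J(V):
    """Truncated version of J(V) = sum_n ( int_{2^(n-1)}^{2^n} t |V(t)| dt )^(1/2)."""
    return sum(
        math.sqrt(simpson(lambda t: t * abs(V(t)), a, b))
        for a, b in dyadic_pieces()
    )


def sqrt_integral(V):
    """Truncated version of the integral of |V|^(1/2) over the half-line."""
    return sum(
        simpson(lambda t: math.sqrt(abs(V(t))), a, b)
        for a, b in dyadic_pieces()
    )


def check_4_17(V):
    """True if J/sqrt(6) <= int |V|^(1/2) <= sqrt(6) J for the truncated quantities."""
    j, i = J(V), sqrt_integral(V)
    return j / math.sqrt(6) <= i <= math.sqrt(6) * j
```

For instance, `check_4_17(lambda t: math.exp(-2 * t))` evaluates both sides for the exponential model potential with $\kappa = 1$.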
Hence, for monotone $|V(t)|$ the function $M(0; K_{gV})$ is controlled by the r.h.s. of its asymptotics given by (4.16).

Along with $J(V)$, introduce the functional
$$\tilde{J}(V) = \Bigl( \int_0^1 |V(t)|\,dt \Bigr)^{1/2} + \sum_{n=1}^{\infty} \Bigl( \int_{2^{n-1}}^{2^n} t|V(t)|\,dt \Bigr)^{1/2}.$$
Clearly $J(V) \le c\tilde{J}(V)$, therefore in the r.h.s. of (4.15) $J(V)$ can be replaced by $\tilde{J}(V)$. Compared with Proposition 4.1, its corollary with $\tilde{J}(V)$ in the r.h.s. ignores the fact that due to the Dirichlet condition at 0 the potential $V$ need not be integrable at this point. Still, this corollary is quite convenient provided one is dealing with $V$ integrable at 0.

We also present an estimate for $M(-1; A_V)$; we need it in the course of the proof of Theorem 4.2 below:
$$M(-1; A_V) \le C\Theta(V) := C \sum_{n \in \mathbb{N}} \Bigl( \int_{n-1}^{n} |V(t)|\,dt \Bigr)^{1/2}. \tag{4.18}$$
For the proof, one splits $\mathbb{R}_+$ into the union of the intervals $(n-1, n]$ and applies to each interval the well-known eigenvalue estimate for the equation $-y'' + y = \lambda Vy$ with the Neumann boundary conditions. This is exactly the way in which the same estimate for $M(-1; K_V)$ was proved in [4]. It follows from Hölder's inequality that $\Theta(V) \le \tilde{J}(V)$. However, the functional $\Theta(V)$ cannot be estimated by $J(V)$.

Note that similarly to (4.17), for a decreasing $|V|$ we have
$$\tilde{J}(V) \le \Bigl( \int_0^1 |V(t)|\,dt \Bigr)^{1/2} + \sqrt{6} \int_{\mathbb{R}_+} \sqrt{|V(t)|}\,dt. \tag{4.19}$$

Theorem 4.2. Let $V$ be a function with a fixed sign and let $\tilde{J}(V) < \infty$.

(i) Suppose that $\lambda \in \mathfrak{l}_l$ where $\mathfrak{l}_l$ is one of the gaps (see (3.7)). Then, given an interval $\Delta \subseteq \mathbb{R}$, the estimate
$$M(\lambda; V, \Delta) \le C(\tilde{J}(V) + 1) \tag{4.20}$$
holds, where the constant $C = C(l)$ does not depend on $\lambda \in \bar{\mathfrak{l}}_l$ and $V$.
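(ii) Suppose in addition that $\Delta = \mathbb{R}_+$ (so that $A_{V,\Delta} = A_V$), and that
$$\lambda \in I \quad \text{where} \quad I \subset \begin{cases} \mathfrak{l}_0, & l = 0, \\ \mathfrak{l}_l \setminus \{\lambda_l\}, & l \ge 1, \end{cases} \quad \text{is a closed interval.} \tag{4.21}$$
Then
$$M(\lambda; A_V) \le C' \tilde{J}(V) \tag{4.22}$$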
with a constant $C'$ uniform in $\lambda \in I$. In particular, for $V \le 0$ the estimate (4.22) is uniform in $\lambda \le \theta^2$.

Proof. The proof of (i) follows the scheme suggested in [21]. For this reason, we only outline the necessary changes in the argument. To be definite, we suppose that $V \le 0$ and that $l$ (the index of the gap) is even. We start with the spectral decomposition (3.8) of the operator $A = A_{0,\mathbb{R}}$ on the whole line. Set
$$\lambda(\xi) = \lambda_{l+1}(\xi), \quad P = P_{l+1}, \quad P^+ = \bigoplus_{j > l} P_j, \quad Q = P^+ - P, \quad U = U_{l+1}.$$
As in [21, Sec. 4], the estimation of $M(\lambda; V, \Delta)$ is reduced to the problem of eigenvalue estimates for the operators
$$T_1(\lambda) = |V|^{1/2}(A - \lambda I)^{-1}P|V|^{1/2}, \qquad T_2(\lambda) = |V|^{1/2}(A - \lambda I)^{-1}Q|V|^{1/2}.$$
Since $\|(A + I)(A - \lambda I)^{-1}Q\| \le C_l$ (actually, $C_l = O(l)$), we have
$$n_+(1, T_2(\lambda)) \le n_+\bigl(C_l^{-1}, |V|^{1/2}(A + I)^{-1}|V|^{1/2}\bigr) = M(-1; A + C_l V).$$
To the latter quantity the estimate (4.18) applies, and we obtain
$$n_+(1, T_2(\lambda)) \le C'\Theta(V). \tag{4.23}$$
To the operator $T_1(\lambda)$ the argument of [21] applies without changes. Indeed, the nature of the operator $U$ in our case is the same as in the case of periodicity coming from a potential. This allows one to reduce the problem to estimating the counting function $N(\lambda; K_V)$ for the operator $K_V$ defined in (4.14). Then using the bound (4.15), we arrive at the inequality $n_+(1, T_1(\lambda)) \le C(\tilde{J}(V) + 1)$ which in combination with (4.23) leads to (4.20).

(ii) Again, for definiteness, we prove the result for non-positive potentials. Let $\lambda = k^2 \in I$, $k > 0$. In the case $l \ge 1$ assume temporarily that $\lambda \neq (\pi l - \theta)^2$ and $\lambda \neq (\pi l + \theta)^2$, so that $k \in (\pi l - \theta, \pi l) \cup (\pi l, \pi l + \theta)$. In the case $l = 0$ assume that $\lambda \in (0, \theta^2)$, so that $k \in (0, \theta)$. The roots $q_1, q_2$ of Eq. (3.12) are real and distinct, and $q_1 q_2 = 1$. Let us label them so that $|q_1| < 1 < |q_2|$ and denote $|q_1| = e^{-\sigma}$, then $\sigma > 0$. Consider the solutions $y_0$ and $y_1$ (cf. (3.14) and (3.13)) of the equation (3.11) under the matching conditions (3.1). Their Wronskian is
$$W(y_0, y_1) = y_0 y_1' - y_0' y_1 = -b^{1/2} k \sin k \neq 0,$$
so that $y_0, y_1$ are linearly independent. On the interval $(n-1, n)$ the function $y_0(t)$ satisfies the inequality
$$|y_0(t)| \le n b^{1/2}(q_2^{\,n-2} + q_2^{\,n-1}) \le 2n b^{1/2} e^{\sigma t}, \quad n > 1; \qquad |y_0(t)| \le |\sin kt| \le kt, \quad n = 1.$$
For $y_1(t)$ we have
$$|y_1(t)| \le (b^{1/2} + 1)\,q_1^{\,n-1} \le q_2 (b^{1/2} + 1) e^{-\sigma t}, \qquad t > 0.$$
Note also that $|q_2| = R|\cos k| + (R^2\cos^2 k - 1)^{1/2} \le 2R$. So we see that the inequalities
$$|y_0(t)| \le c\,t\,e^{\sigma t}, \qquad |y_1(t)| \le c\,e^{-\sigma t}, \qquad t > 0 \tag{4.24}$$
hold uniformly in $\lambda \in I$. The solution $y_0$ satisfies the boundary condition $y_0(0+) = 0$. Given a function $f \in L^2(\mathbb{R}_+)$, the solution of the non-homogeneous equation on $\mathbb{R}_+$:
$$y''(t) + k^2 y(t) = -f(t), \quad t \notin \mathbb{N}; \qquad y(0+) = 0,$$
satisfying the matching conditions (3.1) for $n \in \mathbb{N}$, is given by
$$y(t) = \int_{\mathbb{R}_+} K(t, s) f(s)\,ds,$$
where
$$W(y_0, y_1)\,K(t, s) = \begin{cases} y_1(t)\,y_0(s), & s < t, \\ y_1(s)\,y_0(t), & t < s. \end{cases}$$
It follows from (4.24) that
$$|K(t, s)| \le c^2 e^{-\sigma|t - s|} \min(s, t)\,(b^{1/2} k|\sin k|)^{-1} \le C_1 \sqrt{st}\,(k|\sin k|)^{-1}, \qquad C_1 = c^2 b^{-1/2}. \tag{4.25}$$
The operator (4.6) (for $\lambda = k^2$ and $\Delta = \mathbb{R}_+$) acts as
$$(T(\lambda)f)(t) = \int_{\mathbb{R}_+} |V(t)|^{1/2} K(t, s) |V(s)|^{1/2} f(s)\,ds.$$
Under the assumption $\tilde{J}(V) < \infty$ this operator belongs to the Hilbert–Schmidt class. Indeed, by virtue of (4.25)
$$(k \sin k)^2 \iint_{\mathbb{R}_+^2} |K(t, s)|^2 |V(t)|\,|V(s)|\,dt\,ds \le C_1^2 \iint_{\mathbb{R}_+^2} st\,|V(t)|\,|V(s)|\,dt\,ds = C_1^2 \Bigl( \int_{\mathbb{R}_+} t|V(t)|\,dt \Bigr)^2 \le C_1^2\,\tilde{J}(V)^4.$$
442
00123
A. V. Sobolev & M. Solomyak
Since $\|T\| \le \|T\|_{HS}$ and $n_+(1, T) = 0$ if $\|T\| \le 1$, the last estimate and (4.7) imply that $M(\lambda; A_V) = 0$ if $\tilde{J}(V)^2 \le C_1^{-1} k|\sin k|$. In its turn, this and the estimate (4.20) yield the inequality (4.22) with $C' = C\bigl(1 + (C_1^{-1} k|\sin k|)^{-1/2}\bigr)$. By continuity, (4.22) extends to the ends of the gap, i.e. to $k = \pi l \pm \theta$ ($l \ge 1$) or $k = \theta$ ($l = 0$). Since the function $M(\lambda; A_V)$ is monotone in $\lambda$, the result for $l = 0$ automatically extends to all $\lambda \le \theta^2$.

As in (4.19) one can simplify the estimate (4.20) if one assumes that $|V|$ is decreasing on the interval $\Delta = (R_1, R_2)$:
$$M(\lambda; A_V, \Delta) \le C \Bigl[ \Bigl( \int_{R_1}^{R_1 + 1} |V(t)|\,dt \Bigr)^{1/2} + \sqrt{6} \int_{\Delta} \sqrt{|V(t)|}\,dt + 1 \Bigr]. \tag{4.26}$$

Theorem 4.3. Let the assumptions of Theorem 4.2 be satisfied and $V \le 0$. Then the asymptotics
$$\lim g^{-1/2} M(\lambda; gV, \Delta) = \pi^{-1} \int_{\Delta} \sqrt{|V(t)|}\,dt, \qquad g \to \infty \tag{4.27}$$
(cf. (4.16)) holds uniformly in $\lambda \in \mathfrak{l}_l$.

Proof. Like in [21], the problem reduces to the case of a finite interval $\Delta$. In view of (4.9) we need to study only the term depending on $g$. Removal of the matching conditions inside the interval shifts the function $N(\lambda; gV, \Delta)$ by no more than $2|\Delta| + 2$ and therefore does not affect its asymptotic behaviour. As a result, we come to the operator of the Dirichlet problem on a finite interval for which the asymptotics (4.27) is well known.

4.3. Weyl asymptotics for $M(\lambda; \mathbf{A}_{gV})$ and $\tilde{M}(\lambda; A_{gV})$

The results of this subsection follow immediately from Theorems 3.2 and 3.3.

Theorem 4.4. Let $\Gamma = \Gamma_b$ and let $V(t) \le 0$ be a bounded measurable function on $\mathbb{R}_+$. Let $\lambda$ satisfy (4.21). Then

(i) If
$$\sum_{k \in \mathbb{N}} b^k \tilde{J}(V_k) < \infty, \tag{4.28}$$
then the Weyl asymptotics holds for the function $M(\lambda; \mathbf{A}_{gV})$ of the operator (2.1):
$$\lim g^{-1/2} M(\lambda; \mathbf{A}_{gV}) = \frac{1}{\pi} \int_{\Gamma} \sqrt{|V(|x|)|}\,dx, \qquad g \to \infty. \tag{4.29}$$

(ii) If
$$\sum_{k \in \mathbb{N}} \tilde{J}(V_k) < \infty, \tag{4.30}$$
then
$$\lim g^{-1/2} \tilde{M}(\lambda; A_{gV}) = \frac{1}{\pi} \sum_{k=0}^{\infty} \int_k^{\infty} \sqrt{|V(t)|}\,dt, \qquad g \to \infty. \tag{4.31}$$
The above asymptotic formulae are uniform in $\lambda$ on any closed interval $I$ from (4.21).

Proof. For definiteness, we prove (4.29). The proof of (4.31) is the same. It follows from (4.1) that
$$g^{-1/2} M(\lambda; \mathbf{A}_{gV}) = g^{-1/2} M(\lambda; A_{gV}) + (1 - b^{-1}) \sum_{k \in \mathbb{N}} b^k g^{-1/2} M(\lambda; A_{gV_k}). \tag{4.32}$$
By Theorem 4.3, for each $k \ge 0$
$$g^{-1/2} M(\lambda; A_{gV_k}) \to \pi^{-1} \int_{\mathbb{R}_+} \sqrt{|V_k(t)|}\,dt, \qquad g \to \infty.$$
The series (4.28) dominates the series (4.32) and it follows from Lebesgue's dominated convergence theorem that
$$\pi g^{-1/2} M(\lambda; \mathbf{A}_{gV}) \to \int_{\mathbb{R}_+} \sqrt{|V(t)|}\,dt + (1 - b^{-1}) \sum_{k=1}^{\infty} b^k \int_{\mathbb{R}_+} \sqrt{|V_k(t)|}\,dt, \qquad g \to \infty.$$
This is equivalent to (4.29). Due to Theorem 4.2 the series (4.28) converges uniformly on $I$ from (4.21), and hence the asymptotics (4.29) is uniform in $\lambda \in I$.

5. Power-Like and Exponential Potentials

5.1. Weyl asymptotics

Here we show how the theorems in the previous section apply to potentials with a specified rate of decay at infinity. To have a clear distinction between the cases of non-positive and non-negative potentials, we slightly change our notation: we denote the potential by $V = \pm q$ or $V = sq$ with $s = \pm 1$, always assuming that $q \ge 0$. We are interested in two types of potentials: power-like and exponential. More precisely, suppose that $q(t) \le CQ(t)$ where
$$Q(t) = (1 + t)^{-2\gamma}, \quad \gamma > 0, \qquad \text{or} \qquad Q(t) = e^{-2\kappa t}, \quad \kappa > 0. \tag{5.1}$$
Let us first establish the Weyl type asymptotics for $M(\lambda; \mathbf{A}_{-gq})$ and $\tilde{M}(\lambda; A_{-gq})$.

Theorem 5.1. Let Condition (4.21) be fulfilled. Suppose that $q(t) \le CQ(t)$.

(i) If $Q(t) = (1 + t)^{-2\gamma}$ with $\gamma > 2$, then the asymptotic formula (4.31) holds for $\tilde{M}(\lambda; A_{-gq})$;
(ii) If $Q(t) = e^{-2\kappa t}$, $\kappa > 0$, then the asymptotic formula (4.31) for $\tilde{M}(\lambda; A_{-gq})$ holds. If, in addition, $\kappa > \ln b$, then the asymptotics (4.29) for $M(\lambda; \mathbf{A}_{-gq})$ holds as well.

These results are uniform in $\lambda \in I$ with a closed interval $I$ from (4.21).

The proof of Theorem 5.1 is based on two elementary Lemmas 5.2 and 5.3, which describe individual counting functions. These lemmas will also be useful in the analysis of the non-Weyl behaviour of the function $\tilde{M}$. Recall that $q_k$, $k \ge 0$, denote the "shifted" potentials $q_k(t) = q(t + k)$, $t > 0$. Remembering the relation (4.5) and the comment after it, we sometimes transfer the dependence on $k$ to the interval $\Delta$. This is why some of the estimates below are stated for intervals $\Delta$ depending on an additional parameter $R$, which plays the role of $k$, but is not supposed to be an integer.

Lemma 5.2 (Power-like potentials). Suppose that $q \le CQ$ with $Q(t) = (1 + t)^{-2\gamma}$, $\gamma > 1$.

(i) Then for any $R \ge 0$
$$M(\lambda; \pm gq, (R, \infty)) \le C\bigl(g^{1/2}(1 + R)^{1-\gamma} + 1\bigr), \qquad \forall R \ge 0, \tag{5.2}$$
uniformly in $\lambda \in \bar{\mathfrak{l}}_l$ and $R \ge 0$.

(ii) If the condition (4.21) is satisfied, then
$$M(\lambda; \pm gq_R) \le C g^{1/2}(1 + R)^{1-\gamma}, \qquad \forall R \ge 0, \tag{5.3}$$
uniformly in $\lambda \in I$ with a closed interval $I$ from (4.21).

Lemma 5.3 (Exponential potentials). Suppose that $q \le CQ$ with $Q(t) = e^{-2\kappa t}$, $\kappa > 0$.

(i) Then for any $R > 0$
$$M(\lambda; \pm gq, (R, \infty)) \le C\bigl(g^{1/2} e^{-\kappa R} + 1\bigr), \qquad \forall R \ge 0, \tag{5.4}$$
uniformly in $\lambda \in \bar{\mathfrak{l}}_l$ and $R \ge 0$.

(ii) If the condition (4.21) is satisfied, then
$$M(\lambda; \pm gq_R) \le C g^{1/2} e^{-\kappa R}, \qquad \forall R \ge 0, \tag{5.5}$$
uniformly in $\lambda \in I$ with a closed interval $I$ from (4.21).

Proofs of Lemmas 5.2 and 5.3. Due to the monotonicity of the function $M(\lambda; \pm V)$ in $V$ (see Sec. 4.1), it is sufficient to obtain the estimates for the "model" potential $Q$. For a power-like $Q$, we have
$$\Bigl[ \int_R^{R+1} (1 + t)^{-2\gamma}\,dt \Bigr]^{1/2} + \int_R^{\infty} (1 + t)^{-\gamma}\,dt \le C(1 + R)^{1-\gamma},$$
which implies (5.2) by virtue of (4.26). Similarly, (4.22) leads to (5.3). The proof of Lemma 5.3 is the same.
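The elementary integral bound used for the power-like model potential can be made explicit. Both integrals have closed forms, and since the first integrand is at most $(1+R)^{-2\gamma}$ on $(R, R+1)$, one admissible constant is $C = 1 + 1/(\gamma - 1)$. This particular constant is our own computation, not a value stated in the text. The sketch below evaluates the left-hand side and its ratio to $(1+R)^{1-\gamma}$; the function names are hypothetical.

```python
import math


def lemma_bracket(R, gamma):
    """[int_R^{R+1} (1+t)^(-2g) dt]^(1/2) + int_R^inf (1+t)^(-g) dt, for g = gamma > 1."""
    first = ((1 + R) ** (1 - 2 * gamma) - (2 + R) ** (1 - 2 * gamma)) / (2 * gamma - 1)
    second = (1 + R) ** (1 - gamma) / (gamma - 1)
    return math.sqrt(first) + second


def ratio(R, gamma):
    """Left-hand side divided by (1+R)^(1-gamma); bounded by 1 + 1/(gamma-1)."""
    return lemma_bracket(R, gamma) / (1 + R) ** (1 - gamma)
```

As $R$ grows the ratio tends to $1/(\gamma - 1)$, because the first term decays like $(1+R)^{-\gamma}$, one power faster than the second.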
Proof of Theorem 5.1. According to (5.3) (respectively (5.5)) the series (4.30) is convergent for $\gamma > 2$ (respectively all $\kappa > 0$), which ensures the validity of (4.31). In the exponential case, if $\kappa > \ln b$, then the series (4.28) is also convergent, which leads to the asymptotics (4.29) by Theorem 4.4.

5.2. Non-Weylian asymptotics

The rest of the paper is focused on the situations when the Weyl formula fails, and the asymptotics of the counting functions (4.1)–(4.4) depends on the behaviour of the potential at infinity. We concentrate on bounded potentials $q$ behaving like $Q$ (see (5.1)) at infinity. The precise meaning of this phrase will be made clear later.

As in the previous section, the asymptotics of $M(\lambda; \mathbf{A}_{\pm gq})$ and $\tilde{M}(\lambda; A_{\pm gq})$ will be deduced from the asymptotics of the individual counting functions $M(\lambda; \pm gq_k)$, $k \ge 0$, for the operators $A_{\pm gq_k}$. In the case of the power-like potential $q$ the total number of eigenvalues of $A_{sq}$ in each gap may become infinite. More precisely, if $q = (1 + t)^{-2\gamma}$, $s = -1$ (respectively $s = +1$) and $\gamma \le 1$, then the eigenvalues accumulate at the upper (respectively lower) end of the gap $\mathfrak{l}_l$. In this connection it is convenient to introduce the notion of an admissible point $\lambda \in \bar{\mathfrak{l}}_l$. From now on we fix the number $l \ge 0$ and denote $\mathfrak{l}_l = (\lambda_-, \lambda_+)$. If $l = 0$, then $\lambda_- = -\infty$ and $\lambda_+ = \theta^2$. If $l \ge 1$, then $\lambda_\pm = (\pi l \pm \theta)^2$. In the definition below, to avoid unnecessary repetitions, by $[-\infty, \lambda_+]$, $[-\infty, \lambda_+)$ we understand the intervals $(-\infty, \lambda_+]$, $(-\infty, \lambda_+)$.

Definition 5.4. Let $Q(t) = (1 + t)^{-2\gamma}$. Then a point $\lambda \in [\lambda_-, \lambda_+]$ is said to be $\gamma_0$-admissible, $\gamma_0 > 0$, if
$$\lambda \in \begin{cases} (\lambda_-, \lambda_+], & \gamma \le \gamma_0, \ s = +1; \\ [\lambda_-, \lambda_+), & \gamma \le \gamma_0, \ s = -1; \\ [\lambda_-, \lambda_+], & \gamma > \gamma_0. \end{cases}$$
For $Q(t) = e^{-2\kappa t}$ any point $\lambda \in [\lambda_-, \lambda_+]$ is said to be $\gamma_0$-admissible with any $\gamma_0 > 0$.

Clearly, for any two positive numbers $\gamma_0, \gamma_1$, $\gamma_0 < \gamma_1$, any $\gamma_1$-admissible $\lambda$ is automatically $\gamma_0$-admissible.

For the model potential $Q(t) = (1 + t)^{-2\gamma}$ the number $M(\lambda; \pm gq_n)$ is finite for all $g > 0$ if $\lambda$ is 1-admissible. For the exponential model potential $Q(t) = e^{-2\kappa t}$ the quantity $M(\lambda; \pm gq_n)$ is finite for all $\lambda \in [\lambda_-, \lambda_+]$.

5.3. Results for the functions $\tilde{M}(\lambda; A_{sgq})$, $\tilde{N}(\lambda_1, \lambda_2; A_{sgq})$

This subsection contains the results on the asymptotics of $\tilde{M}(\lambda; A_{\pm gq})$ and $\tilde{N}(\lambda_1, \lambda_2; A_{\pm gq})$. Their proofs require some technical preparations which we give in Secs. 6, 7. The proofs are completed in Sec. 8. Our results for the functions (4.1) and (4.2) require different techniques and are much less complete than those for $\tilde{M}$ and $\tilde{N}$; they are presented in Sec. 9.

Recall that in contrast to the spectrum of the "individual" operator $A_0$ the spectrum of $\mathbf{A}_0$ contains eigenvalues $\lambda_l = (\pi l)^2$ of infinite multiplicity. Thus, when
stating the results we assume that $\lambda, \lambda_1, \lambda_2$ satisfy (4.21) and are 2-admissible. The constants in all the estimates below are
• uniform in $\lambda, \lambda_1, \lambda_2$ varying within any closed interval $I$ of 2-admissible points satisfying (4.21),
• independent of the coupling constant $g$.

We begin with the power-like potentials.

Theorem 5.5. Let $q$ satisfy the condition
$$q(t) = Q(t)(1 + o(1)), \qquad t \to \infty, \qquad Q(t) = (1 + t)^{-2\gamma}, \tag{5.6}$$
and let one of the following two conditions be fulfilled:
(1) $\gamma \in (0, 2)$ and $s = -1$;
(2) the exponent $\gamma > 0$ is arbitrary and $s = +1$.
Suppose that $\lambda$ is 2-admissible and satisfies (4.21). Then
$$\lim_{g \to \infty} g^{-\frac{1}{\gamma}} \tilde{M}(\lambda; A_{\pm gq}) = \pm \int_0^{\infty}\!\!\int_0^{\infty} \bigl[ \rho(\lambda) - \rho(\lambda \mp (s + \sigma)^{-2\gamma}) \bigr]\,ds\,d\sigma, \tag{5.7}$$
where $\rho$ is the density of states for the operator $A_0$.

Remark 5.6. A simple change of variables leads to another expression for the asymptotic coefficient:
$$\lim g^{-\frac{1}{\gamma}} \tilde{M}(\lambda; A_{\pm gq}) = \pm \int_0^{\infty}\!\!\int_{\beta}^{\infty} \bigl[ \rho(\lambda) - \rho(\lambda \mp s^{-2\gamma}) \bigr]\,ds\,d\beta.$$
In Sec. 7 we shall show that the asymptotic coefficients in the r.h.s. of (5.7) and that in Theorem 5.9 below are finite.

Note that in contrast to $s = -1$, the above formula describes the asymptotics of $\tilde{M}(\lambda; A_{sgq})$ with $s = +1$ for all positive $\gamma$. If $s = -1$, then the case $\gamma = 2$ is critical in the sense that for $\gamma > 2$ the Weyl asymptotics is applicable instead of (5.7) (cf. Lemma 5.2). We point out however that for the individual counting function $M(\lambda; -gq_n)$ the critical case is $\gamma = 1$ (see Theorem 4.3).

To find a formula for $\tilde{M}$ in the case $\gamma = 2$ we need to introduce more restrictions on $q$.

Condition 5.7. Let $q \in C^1(\mathbb{R}_+)$ be a function such that
$$cQ(t) \le q(t) \le CQ(t), \qquad \forall\, t \in \mathbb{R}_+, \qquad |q'(t)| \le CQ(t).$$
Now we are in a position to study the critical case:

Theorem 5.8. Suppose that $q$ satisfies (5.6) with $\gamma = 2$ and Condition 5.7. Let $\lambda$ be 2-admissible and satisfy (4.21). Then
$$\lim g^{-1/2} (\ln g)^{-1} \tilde{M}(\lambda; A_{-gq}) = (4\pi)^{-1}, \qquad g \to \infty.$$
The next theorem gives an asymptotic formula for the number $\tilde N(\lambda_1, \lambda_2)$:

Theorem 5.9. Suppose that q satisfies (5.6) with some γ > 0, and that in the case s = −1, γ ≥ 2, Condition 5.7 is also fulfilled. Let λ1, λ2 be 2-admissible and satisfy (4.21). Then
\[ \lim g^{-1/\gamma}\, \tilde N(\lambda_1, \lambda_2, A_{\pm gq}) = \int_0^\infty\!\!\int_0^\infty \bigl[\rho(\lambda_2 \mp (t+\sigma)^{-2\gamma}) - \rho(\lambda_1 \mp (t+\sigma)^{-2\gamma})\bigr]\, dt\, d\sigma , \tag{5.8} \]
as g → ∞.

We point out that the asymptotics of $\tilde N(\lambda_1, \lambda_2)$ is described by the density of states ρ(λ) for all γ > 0. Under the conditions of Theorem 5.5 the asymptotics (5.8) can be immediately deduced from (5.7) with the help of (4.11). On the contrary, for γ > 2 and s = −1 the behaviour of $\tilde N$ cannot be inferred from the asymptotics of $\tilde M(\lambda)$, which is given by the Weyl term, see Lemma 5.2.

Let us proceed to the exponential potentials. From Lemma 5.3 we know that for the case q ≤ CQ, $Q(t) = e^{-2\kappa t}$, s = −1, the asymptotics of $\tilde M(\lambda; A_{sgq})$ is described by the Weyl formula (4.31). The next theorem gives an answer in the case s = +1. Below g0 > e is a constant.

Theorem 5.10. Suppose that
\[ cQ(t) \le q(t) \le CQ(t), \qquad \forall\, t \ge R_0 , \tag{5.9} \]
with $Q(t) = e^{-2\kappa t}$, κ > 0 and some R0 ≥ 0. Let λ, λ1, λ2 be arbitrary numbers satisfying (4.21). Then
\[ \tilde M(\lambda; A_{gq}) = \frac{1}{8\kappa^2}\,\rho(\lambda)(\ln g)^2 + O(\ln g), \qquad g \ge g_0 , \tag{5.10} \]
and
\[ \tilde N(\lambda_1, \lambda_2; A_{gq}) \le C \ln g , \qquad g \ge g_0 . \tag{5.11} \]

The next result complements the Weyl formula (4.31) by providing an estimate for the function $\tilde N(\lambda_1, \lambda_2; A_{-gq})$:

Theorem 5.11. Suppose that q fulfills Condition 5.7 with $Q(t) = e^{-2\kappa t}$. Let λ1, λ2 be arbitrary numbers satisfying (4.21). Then
\[ \tilde N(\lambda_1, \lambda_2; A_{-gq}) \le C(\ln g)^2 , \qquad g \ge g_0 . \tag{5.12} \]
6. Individual Estimates and Weyl Asymptotics with a Remainder

6.1. Individual estimates

Here we obtain further estimates for individual counting functions M(λ; ±gq, ∆). Although our ultimate objective is to establish asymptotic formulae for the counting functions $M(\lambda; \pm gq_k)$ with integer non-negative k's, most of the results in this
June 4, 2002 11:0 WSPC/148-RMP
448
00123
A. V. Sobolev & M. Solomyak
section are uniform with respect to a wide class of potentials, including the shifted potentials $q_R$, R ≥ 0. Unless stated otherwise, in this section we always assume that the points λ, λ1, λ2 ∈ [λ−, λ+] are 1-admissible. The constants in all the estimates obtained below are
• uniform in λ, λ1, λ2 varying within any closed interval I ⊂ [λ−, λ+] of 1-admissible points;
• independent of the coupling constant g.

Whenever possible we treat the power-like and exponential potentials simultaneously. It is convenient to use the notation
\[ \alpha = \begin{cases} g^{\frac{1}{2\gamma}}, & \text{if } Q(t) = (1+t)^{-2\gamma}; \\[2pt] \dfrac{1}{2\kappa}\ln g, & \text{if } Q(t) = e^{-2\kappa t}. \end{cases} \tag{6.1} \]
We always assume that g ≥ g0 > e, so α ≥ α0 with some α0 > 0 in both cases. The following simple lemma will be repeatedly used:

Lemma 6.1. Suppose that q(t) ≤ CQ(t).

(i) If $Q(t) = (1+t)^{-2\gamma}$, then for all R ≥ 1
\[ |M(\lambda; \pm gq, \mathbb{R}_+) - M(\lambda; \pm gq, (0, R\alpha))| \le C'\alpha \tag{6.2} \]
with a constant C' depending only on C. Moreover,
\[ \limsup_{R\to\infty}\ \sup_{g \ge g_0}\ \alpha^{-1}\,|M(\lambda; \pm gq, \mathbb{R}_+) - M(\lambda; \pm gq, (0, R\alpha))| = 0 . \tag{6.3} \]

(ii) If $Q(t) = e^{-2\kappa t}$, then
\[ |M(\lambda; \pm gq, \mathbb{R}_+) - M(\lambda; \pm gq, (0, R\alpha))| \le C' , \tag{6.4} \]
with a constant C' depending only on C.

Proof. In view of the decoupling principle (4.13)
\[ |M(\lambda; \pm gq, \mathbb{R}_+) - M(\lambda; \pm gq, (0, R\alpha))| \le 2 + M(\lambda; \pm gq, (R\alpha, \infty)) . \]
If $Q(t) = (1+t)^{-2\gamma}$ and γ > 1, then by (5.2) the last term in the right hand side does not exceed
\[ g^{1/2}(\alpha R + 1)^{1-\gamma} + 1 \le \alpha R^{1-\gamma} + 1 . \]
This implies (6.2) and (6.3). If γ ≤ 1, then λ < λ+ (for s = −1) or λ > λ− (for s = +1). Thus one can choose $\hat R$ so as to ensure that $M(\lambda; \pm gq, (\hat R\alpha, \infty)) = 0$,
since $|gq(t)| \le g\hat R^{-2\gamma}\alpha^{-2\gamma} = \hat R^{-2\gamma}$ for all $t \ge \hat R\alpha$. Now (6.3) follows. To show (6.2) use the decoupling principle and (4.26) to conclude that
\[ M(\lambda; \pm gq, (R\alpha, \infty)) \le 2 + M(\lambda; \pm gq, (R\alpha, \hat R\alpha)) \le C\Bigl[\alpha\int_R^{R+1}|t|^{-2\gamma}\,dt\Bigr]^{1/2} + \alpha\int_R^{\hat R}|t|^{-\gamma}\,dt + 1 + 2 \le C(1+\alpha) . \]
For the case $Q(t) = e^{-2\kappa t}$, using (5.4) we obtain the estimate:
\[ M(\lambda; \pm gq, (R\alpha, \infty)) \le C(g^{1/2}e^{-\kappa\alpha R} + 1) \le C(g^{1/2 - R/2} + 1) \le C \]
for all R ≥ 1, as required.

Lemma 6.2. Let q(t) ≤ CQ(t).

(i) If $Q(t) = (1+t)^{-2\gamma}$, then
\[ M(\lambda; -gq) \le \begin{cases} Cg^{1/2}, & \gamma > 1, \\ Cg^{1/2}\ln g, & \gamma = 1, \\ C\alpha, & \gamma < 1. \end{cases} \tag{6.5} \]

(ii) Let either $Q(t) = (1+t)^{-2\gamma}$ with arbitrary γ > 0, or $Q(t) = e^{-2\kappa t}$. Then
\[ M(\lambda; gq) \le C\alpha , \tag{6.6} \]
and
\[ N(\lambda_1, \lambda_2; gq) \le C\alpha . \tag{6.7} \]
Proof. (i) The estimate (6.5) for γ > 1 follows from (5.2). If γ ≤ 1, then by (4.26),
\[ M(\lambda; -gq, (0, \alpha)) \le Cg^{1/2}\Bigl(\int_0^1 (1+t)^{-2\gamma}\,dt\Bigr)^{1/2} + Cg^{1/2}\int_0^\alpha t^{-\gamma}\,dt + C \le \begin{cases} Cg^{1/2}\ln\alpha, & \gamma = 1, \\ C\alpha, & \gamma < 1. \end{cases} \]
By virtue of Lemma 6.1 this leads to (6.5).

(ii) It is clear from (4.9) that
\[ M(\lambda; gq, (0, \alpha)) \le N(\lambda; 0, (0, \alpha)) \le C\alpha . \]
Now Lemma 6.1 gives (6.6). The estimate (6.7) follows from (6.6) and (4.10).
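The two scaling identities used repeatedly in this subsection, $g^{1/2}(\alpha R)^{1-\gamma} = \alpha R^{1-\gamma}$ for the power-like case and $g^{1/2}e^{-\kappa\alpha R} = g^{1/2 - R/2}$ for the exponential case, follow directly from the definition (6.1). The sketch below checks them numerically; the function names `alpha_power` and `alpha_exp` and the sample parameter values are illustrative assumptions, not part of the paper.

```python
import math

def alpha_power(g: float, gamma: float) -> float:
    """alpha = g^(1/(2*gamma)) for Q(t) = (1+t)^(-2*gamma), as in (6.1)."""
    return g ** (1.0 / (2.0 * gamma))

def alpha_exp(g: float, kappa: float) -> float:
    """alpha = ln(g)/(2*kappa) for Q(t) = exp(-2*kappa*t), as in (6.1)."""
    return math.log(g) / (2.0 * kappa)

# Illustrative sample values (assumptions, chosen only for the check).
g, gamma, kappa, R = 1.0e6, 0.75, 0.5, 3.0

# Power-like case: g^(1/2) = alpha^gamma, hence g^(1/2)*(alpha*R)^(1-gamma) = alpha*R^(1-gamma).
a_pow = alpha_power(g, gamma)
lhs_power = math.sqrt(g) * (a_pow * R) ** (1.0 - gamma)
rhs_power = a_pow * R ** (1.0 - gamma)

# Exponential case: exp(-kappa*alpha*R) = g^(-R/2).
a_exp = alpha_exp(g, kappa)
lhs_exp = math.sqrt(g) * math.exp(-kappa * a_exp * R)
rhs_exp = g ** (0.5 - R / 2.0)
```

In both cases the coupling constant enters only through α, which is why the bounds of Lemmas 6.1 and 6.2 can be stated uniformly in terms of α.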
From these lemmas we can immediately deduce the asymptotics of M(λ; gq) with an exponential q.

Theorem 6.3. Let q be a bounded function satisfying (5.9) with $Q(t) = e^{-2\kappa t}$. Then
\[ |M(\lambda; gq) - \rho(\lambda)\alpha| \le C , \qquad \forall\, \alpha \ge 1 , \tag{6.8} \]
and
\[ N(\lambda_1, \lambda_2; gq) \le C , \qquad \forall\, \alpha \ge 1 . \tag{6.9} \]

Proof. The bound for N(λ1, λ2; gq) immediately follows from (6.8) by (4.11), since ρ(λ1) = ρ(λ2) for λ1, λ2 ∈ [λ−, λ+]. By the decoupling principle (6.4) it suffices to study the counting functions M(λ; gq, ∆) with ∆ = (0, α). The result for the unperturbed function N(λ; 0, ∆) immediately follows from Theorem 3.4:
\[ N(\lambda; 0, \Delta) = \rho(\lambda)\alpha + O(1) , \qquad \alpha \to \infty . \tag{6.10} \]
To handle the perturbed function split the interval ∆ as follows:
\[ \Delta = \Delta_0 \cup \Delta_1 \cup \Delta_2 , \qquad \Delta_0 = (0, R_0] , \quad \Delta_1 = (R_0, R_1\alpha] , \quad \Delta_2 = (R_1\alpha, \alpha) , \]
where R0 > 0 is defined in (5.9). The number R1 > 0 is found from the requirement
\[ gq(t) \ge \lambda - \lambda_0 , \qquad \forall\, t \in (R_0, R_1\alpha] , \]
where $\lambda_0 = \theta^2 = \inf\sigma(A_0)$. This implies that for R1 we can take
\[ R_1 = 1 - \frac{C'}{2\kappa\alpha} , \tag{6.11} \]
with a sufficiently large C' = C'(λ) > 0. It follows from (6.11) and Theorem 3.4 that
\[ N(\lambda; gq, \Delta_2) \le N(\lambda; 0, \Delta_2) \le (1 - R_1)\alpha + C' \le C'' , \qquad \forall\, \alpha \ge 1 . \]
Since $N(\lambda; gq, \Delta_0) \le N(\lambda; 0, \Delta_0) \le C$ and $N(\lambda; gq, \Delta_1) = 0$, by the decoupling principle (4.12) we have
\[ N(\lambda; gq, \Delta) \le C , \qquad \forall\, \alpha \ge 1 . \]
Now it follows from (6.10) and (4.9) that
\[ |M(\lambda; gq, \Delta) - \rho(\lambda)\alpha| \le C , \qquad \alpha \ge 1 , \]
which implies (6.8).
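The choice (6.11) of R1 in the proof above can be verified in one line. The computation below is a sketch that assumes only the lower bound q(t) ≥ cQ(t) from (5.9), with c the constant appearing there, and the relation $g = e^{2\kappa\alpha}$ from (6.1):

```latex
\[
  g\,q(t) \;\ge\; c\,g\,e^{-2\kappa t} \;\ge\; c\,g\,e^{-2\kappa R_1\alpha}
  \;=\; c\,e^{2\kappa\alpha(1-R_1)} \;=\; c\,e^{C'} ,
  \qquad R_0 < t \le R_1\alpha ,
\]
\[
  |\Delta_2| \;=\; (1-R_1)\,\alpha \;=\; \frac{C'}{2\kappa} .
\]
```

So the requirement $gq(t) \ge \lambda - \lambda_0$ on $(R_0, R_1\alpha]$ holds as soon as $c\,e^{C'} \ge \lambda - \lambda_0$, while the length of $\Delta_2$ stays bounded independently of α, which is what makes $N(\lambda; 0, \Delta_2)$ bounded.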
6.2. Individual asymptotics of the Weyl type with a remainder

Even if a potential V ≤ 0 satisfies the conditions of Theorem 4.3, the formula (4.27) fails to provide an asymptotics for N(λ1, λ2; V), as the leading term in (4.27) does not depend on λ. Below we establish, under certain conditions on q, a Weyl-type asymptotics for M(λ; −gq, ∆) with a remainder, which allows us to obtain bounds on the growth of N(λ1, λ2; −gq) as g → ∞.

We begin with an asymptotic formula for the Schrödinger operator $-d^2/dt^2 - q$ on a bounded interval $\Delta \subset \mathbb{R}_+$ with the Dirichlet boundary conditions but without any matching conditions. Denote the counting function of this operator by #(λ; −q, ∆). The next theorem is a minor modification of a similar statement from [21]:

Theorem 6.4. Let $q \in C^1([0, \infty))$ be a non-negative function, and let ∆ = (R1, R2) with 0 ≤ R1 ≤ R2 < ∞. Then for any $\lambda \in \mathbb{R}$ one has
\[ \Bigl|\#(\lambda; -q; \Delta) - \frac{1}{\pi}\int_\Delta \sqrt{q(t)}\,dt\Bigr| \le \int_\Delta \frac{|q'(t)|}{4\pi(q(t) + |\lambda|)}\,dt + \frac{3\sqrt{|\lambda| + 1}}{\pi}\,|\Delta| + 1 , \tag{6.12} \]
where the constant C does not depend on ∆, g and is uniform in λ on a compact interval.

Proof. Assume without loss of generality that R1 = 0. The idea is to use the fact that the number #(λ) = #(λ; −q, ∆) equals the number of roots of the solution u of the equation
\[ -u'' - qu = \lambda u , \qquad u(0) = 0 , \quad u'(0) = 1 , \tag{6.13} \]
lying strictly inside ∆. To find the number of roots, represent u in the polar form:
\[ u(t) = \beta(t)\sin\xi(t) , \qquad u'(t) = \beta(t)f(t)\cos\xi(t) , \qquad f = \sqrt{q + \lambda_0} , \tag{6.14} \]
with a λ0 > 0. Eq. (6.13) and the above equalities define the real-valued amplitude β and the phase ξ uniquely under the assumption that ξ is continuous. Substituting (6.14) in Eq. (6.13), one obtains the following non-linear equation for ξ:
\[ \xi' = f + \frac{f'}{2f}\sin(2\xi) + \frac{\lambda - \lambda_0}{f}\sin^2\xi , \qquad \xi(0) = 0 , \tag{6.15} \]
and a linear equation for β:
\[ \beta' = -\frac{\beta}{f}\bigl((\lambda - \lambda_0)\sin\xi + f'\cos\xi\bigr)\cos\xi , \qquad \beta(0) = \frac{1}{f(0)} . \]
Since β never vanishes, the number of roots of u equals the number of points t ∈ ∆ where ξ(t) = 0 (mod π). From (6.15) it is clear that ξ'(t) = f(t) > 0 for those t, so that
\[ \pi^{-1}\xi(R_2) - 1 \le \#(\lambda) \le \pi^{-1}\xi(R_2) . \]
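Therefore (6.15) implies that
\[ \Bigl|\#(\lambda) - \frac{1}{\pi}\int_\Delta f\,dt\Bigr| \le \int_\Delta \frac{|f'|}{2\pi f}\,dt + \int_\Delta \frac{|\lambda - \lambda_0|}{\pi f}\,dt + 1 . \]
Since $f \ge \sqrt{\lambda_0}$, $f' = q'(2f)^{-1}$, and $\sqrt{q + \lambda_0} - \sqrt{q} \le \sqrt{\lambda_0}$, this leads to
\[ \Bigl|\#(\lambda) - \frac{1}{\pi}\int_\Delta \sqrt{q}\,dt\Bigr| \le \int_\Delta \frac{|q'|}{4\pi(q + \lambda_0)}\,dt + \int_\Delta \frac{|\lambda| + 2\lambda_0}{\pi\sqrt{\lambda_0}}\,dt + 1 . \]
It remains to take λ0 = |λ| + 1.

Theorem 6.5. Suppose that q satisfies Condition 5.7 and let ∆ = (R1, R2) with 0 ≤ R1 ≤ R2 < ∞. Then for any $\lambda \in \mathbb{R}$ one has
\[ \Bigl|N(\lambda; -gq; \Delta) - \frac{\sqrt{g}}{\pi}\int_\Delta \sqrt{q(t)}\,dt\Bigr| \le C(|\Delta| + 1) , \tag{6.16} \]
for all g ≥ g0, where the constant C does not depend on ∆, g and is uniform in λ on a compact interval.

Proof. Let us split ∆ as follows:
\[ \Delta = \bigcup_{k=l}^{m}\Delta_k , \qquad l = [R_1] , \quad m = [R_2] , \]
\[ \Delta_l = (R_1, l+1] , \qquad \Delta_m = (m, R_2) , \qquad \Delta_k = (k, k+1] , \quad k = l+1, \dots, m-1 . \]
Then by the decoupling principle
\[ \Bigl|N(\lambda; \Delta) - \sum_k N(\lambda; \Delta_k)\Bigr| \le 2(m - l) \le 2(|\Delta| + 2) . \]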
To study each $\Delta_k$ we use Theorem 6.4. Namely, since |q'| ≤ CQ and q ≥ cQ, the estimate (6.12) yields:
\[ \Bigl|N(\lambda; \Delta_k) - \frac{\sqrt{g}}{\pi}\int_{\Delta_k}\sqrt{q(t)}\,dt\Bigr| \le C \]
with a constant independent of k. Adding up these inequalities over k = l, l + 1, . . . , m, we arrive at (6.16).

Let us derive from this theorem the asymptotics for the counting function M(λ; −gq, ∆) with ∆ = (k, ∞), k > 0.

Theorem 6.6. Suppose that Condition 5.7 is satisfied and let ∆ = (k, ∞), $k \in \mathbb{N} \cup \{0\}$. Then
\[ \Bigl|M(\lambda; -gq, \Delta) - \frac{\sqrt{g}}{\pi}\int_k^{k+\alpha}\sqrt{q(t)}\,dt\Bigr| \le C\alpha , \qquad \forall\, g \ge g_0 , \tag{6.17} \]
uniformly in k, and
\[ N(\lambda_1, \lambda_2; -gq, \Delta) \le C\alpha , \qquad \forall\, g \ge g_0 . \tag{6.18} \]
If $Q(t) = (1+t)^{-2\gamma}$, γ > 1, or $Q(t) = e^{-2\kappa t}$, then
\[ \Bigl|M(\lambda; -gq, \Delta) - \frac{\sqrt{g}}{\pi}\int_k^{\infty}\sqrt{q(t)}\,dt\Bigr| \le C\alpha , \qquad \forall\, g \ge g_0 . \tag{6.19} \]

Proof. The estimate (6.18) follows from (6.17) by virtue of (4.10). Let us prove (6.17). Let $\Delta_1 = (k, k+\alpha]$ and $\Delta_2 = (k+\alpha, \infty)$. In view of Lemma 6.1 and the relation (4.5) it suffices to show that the distribution function $M(\lambda; -gq, \Delta_1)$ satisfies (6.17). This is a direct consequence of Theorem 6.5 and the identity (4.9). The formula (6.19) follows from (6.17) in view of the inequality
\[ \sqrt{g}\int_{k+\alpha}^{\infty}\sqrt{q(t)}\,dt \le C\alpha . \]
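The phase-counting idea in the proof of Theorem 6.4 can be tried out numerically. The following is a minimal sketch, not the authors' procedure: it integrates the phase equation (6.15) with a classical fourth-order Runge-Kutta step (an assumed choice of integrator) and counts how many multiples of π the phase has passed. The function name `prufer_zero_count`, the step count, and the two constant test potentials are illustrative assumptions.

```python
import math

def prufer_zero_count(q, dq, lam, length, steps=20000):
    """Count zeros in (0, length) of u with -u'' - q u = lam u, u(0)=0, u'(0)=1,
    by integrating the phase equation (6.15) from xi(0) = 0."""
    lam0 = abs(lam) + 1.0  # the choice lambda_0 = |lambda| + 1 made in the proof

    def f(t):
        return math.sqrt(q(t) + lam0)

    def rhs(t, xi):
        ft = f(t)
        dft = dq(t) / (2.0 * ft)  # f' = q' / (2 f)
        return (ft
                + dft / (2.0 * ft) * math.sin(2.0 * xi)
                + (lam - lam0) / ft * math.sin(xi) ** 2)

    h = length / steps
    t, xi = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, xi)
        k2 = rhs(t + h / 2, xi + h * k1 / 2)
        k3 = rhs(t + h / 2, xi + h * k2 / 2)
        k4 = rhs(t + h, xi + h * k3)
        xi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    # The phase crosses every multiple of pi upwards (xi' = f > 0 there),
    # so the interior zeros are the integers k >= 1 with k*pi < xi(length).
    return math.ceil(xi / math.pi) - 1

# q = 0, lam = 4: u = sin(2t)/2 has zeros at pi/2, pi, 3pi/2 inside (0, 5).
zeros_free = prufer_zero_count(lambda t: 0.0, lambda t: 0.0, 4.0, 5.0)
# q = 5, lam = 4: u = sin(3t)/3 has zeros at k*pi/3, k = 1..4, inside (0, 5).
zeros_const = prufer_zero_count(lambda t: 5.0, lambda t: 0.0, 4.0, 5.0)
```

For constant q the zeros are known in closed form, which makes these two cases a convenient check; for a genuinely varying q one would pass the potential and its derivative as the callables `q` and `dq`.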
7. Individual Asymptotics. Power-Like Potentials

In this section we study the individual counting function $M(\lambda; sgq_k)$ for q satisfying (5.6) with $Q(t) = (1+t)^{-2\gamma}$, where γ > 0 is arbitrary for both cases s = ±1. In contrast to the previous section, here we focus on the asymptotics of this function under the assumption that g and k tend to infinity in a coordinated way (see Lemma 7.3 below). We shall use the following notation:
\[ \alpha = g^{\frac{1}{2\gamma}} , \qquad \beta = \beta_k(g) = \frac{k+1}{\alpha} . \tag{7.1} \]
We emphasise again that the main difference with the asymptotics obtained in Theorem 4.3 is that now it is determined by the density of states for the unperturbed operator $A_0$. We begin with the study of asymptotic coefficients.

7.1. Asymptotic coefficients

Introduce the asymptotic coefficients for $M(\lambda; \pm gq_k)$:
\[ F_\pm(\sigma, \lambda) = \pm\int_0^\infty \bigl[\rho(\lambda) - \rho(\lambda \mp (s+\sigma)^{-2\gamma})\bigr]\,ds \tag{7.2} \]
and for $N(\lambda_1, \lambda_2; \pm gq_k)$:
\[ G_\pm(\sigma, \lambda_1, \lambda_2) = \pm\bigl(F_\pm(\sigma, \lambda_1) - F_\pm(\sigma, \lambda_2)\bigr) = \int_0^\infty \bigl[\rho(\lambda_2 \mp (s+\sigma)^{-2\gamma}) - \rho(\lambda_1 \mp (s+\sigma)^{-2\gamma})\bigr]\,ds . \tag{7.3} \]
It is clear that $F_\pm \ge 0$ and $G_\pm \ge 0$ if λ1 ≤ λ2. Some other useful properties of $F_\pm$, $G_\pm$ are collected in the next lemma:
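Lemma 7.1. Let λ, λ1, λ2 ∈ [λ−, λ+] be 1-admissible numbers. Then the integral $F_\pm(\sigma, \lambda)$ is finite for all σ > 0. Moreover,

(i) If γ > 1, then
\[ F_-(\sigma, \lambda) \le C\sigma^{1-\gamma} , \qquad F_+(\sigma, \lambda) \le C(1+\sigma)^{1-\gamma} , \]
for all σ > 0.

(ii) If γ ≤ 1, then $F_+(\sigma, \lambda) \le C$ and
\[ F_-(\sigma, \lambda) \le \begin{cases} C , & \gamma < 1 , \\ C\ln(\sigma^{-1} + 1) , & \gamma = 1 . \end{cases} \]

(iii) If s = +1 and λ > λ−, or s = −1 and λ < λ+, then $F_\pm(\sigma, \lambda) = 0$ for all $\sigma \ge \sigma_\pm(\lambda) = d_\pm^{-\frac{1}{2\gamma}}$ with $d_\pm = |\lambda - \lambda_\mp|$, and
\[ F_\pm(\sigma, \lambda) \ge C(\sigma_\pm - \sigma)^{3/2} , \qquad \sigma \le \sigma_\pm . \]

(iv) The integral
\[ F^R_\pm(\sigma, \lambda) = \pm\int_0^R \bigl[\rho(\lambda) - \rho(\lambda \mp (s+\sigma)^{-2\gamma})\bigr]\,ds , \qquad R > 0 , \tag{7.4} \]
tends to $F_\pm(\sigma, \lambda)$ as R → ∞ uniformly in σ > 0.

(v) For all γ > 0 one has $G_\pm(\sigma, \lambda_1, \lambda_2) \le C$, ∀ σ > 0.

Proof. (i) By (3.24), the integrand in the definition (7.2) does not exceed $C(s+\sigma)^{-\gamma}$ for s = −1 and $\min\{\rho(\lambda), C(s+\sigma)^{-\gamma}\}$ for s = +1. The required estimates follow immediately.

(ii) Let γ ≤ 1. Let first s = +1, so that λ ∈ (λ−, λ+]. Define R > 0 to be the number such that $\lambda - R^{-2\gamma} = \lambda_-$. Consequently, $\lambda - (s+\sigma)^{-2\gamma} \ge \lambda_-$ for s + σ ≥ R, and hence
\[ \rho(\lambda) - \rho(\lambda - (s+\sigma)^{-2\gamma}) = 0 , \qquad \forall\, s : s + \sigma \ge R . \]
This implies that
\[ F_+(\sigma, \lambda) \le \int_0^R \rho(\lambda)\,ds = R\rho(\lambda) \le C'(\lambda) . \]
Consider now the case s = −1, so that λ ∈ [λ−, λ+). Let R > 0 be the number such that $\lambda + R^{-2\gamma} = \lambda_+$. Consequently, $\lambda + (s+\sigma)^{-2\gamma} \le \lambda_+$ for s + σ ≥ R, and hence
\[ \rho(\lambda + (s+\sigma)^{-2\gamma}) - \rho(\lambda) = 0 , \qquad \forall\, s : s + \sigma \ge R . \]
If σ ≥ R, then $F_-(\sigma, \lambda) = 0$. If σ < R, then
\[ F_-(\sigma, \lambda) \le C'\int_0^{R-\sigma}(\sigma + s)^{-\gamma}\,ds \le \begin{cases} C(R) , & \gamma < 1 , \\ C(R)\ln(\sigma^{-1} + 1) , & \gamma = 1 . \end{cases} \]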
(iii) For brevity consider only the case s = −1, so that λ < λ+. For $s \ge \sigma_- - \sigma$, $\sigma_- = \sigma_-(\lambda)$, the integrand in (7.2) equals zero. Besides, as ρ(λ) = ρ(λ+), in view of (3.23) we have
\[ \rho(\lambda + (s+\sigma)^{-2\gamma}) - \rho(\lambda) \ge c\bigl((s+\sigma)^{-2\gamma} - d_-\bigr)^{1/2} \ge c'(\sigma_- - \sigma - s)^{1/2} , \qquad \forall\, s \in (0, \sigma_- - \sigma) . \]
Integrating this inequality in s, we obtain the required lower bound for $F_-(\sigma, \lambda)$. The analogous bound for $F_+(\sigma, \lambda)$ is obtained in the same way.

(iv) It suffices to notice that for any 1-admissible λ and any R > 0 one has
\[ \pm\int_R^\infty \bigl[\rho(\lambda) - \rho(\lambda \mp (\sigma+s)^{-2\gamma})\bigr]\,ds \le \pm\int_R^\infty \bigl[\rho(\lambda) - \rho(\lambda \mp s^{-2\gamma})\bigr]\,ds \]
for all σ ≥ 0.

(v) Arguing as on the previous step, it suffices to prove that the integral of the form (7.3) over a finite interval is bounded uniformly in σ > 0. By (3.24)
\[ |\rho(\lambda_2 \mp (s+\sigma)^{-2\gamma}) - \rho(\lambda_1 \mp (s+\sigma)^{-2\gamma})| \le C|\lambda_1 - \lambda_2|^{1/2} , \]
which provides the required boundedness.

When studying the sum of the counting functions, we shall need some properties of the sum of asymptotic coefficients $F_\pm(\beta_k, \lambda)$:

Lemma 7.2. (i) Suppose that λ is a 1-admissible number and δ = δ(α) is a bounded function such that
\[ \delta \le 1 , \qquad \alpha\delta^{2\gamma+1} \to \infty , \qquad \text{as } \alpha \to \infty . \tag{7.5} \]
Then for any fixed $A \ge \sup_\alpha \delta(\alpha)$ one has
\[ \lim\ \Bigl|\alpha^{-1}\sum_{[\delta\alpha] \le k \le [A\alpha]} F_\pm(\beta_k, \lambda) - \int_\delta^A F_\pm(\sigma, \lambda)\,d\sigma\Bigr| = 0 , \qquad g \to \infty . \tag{7.6} \]

(ii) If λ, λ1, λ2 are 2-admissible, then the integrals
\[ \int_0^\infty F_+(\sigma, \lambda)\,d\sigma , \qquad \int_0^\infty G_\pm(\sigma, \lambda_1, \lambda_2)\,d\sigma \]
are finite. If in addition γ ∈ (0, 2), then the integral
\[ \int_0^\infty F_-(\sigma, \lambda)\,d\sigma \]
is also finite.

Proof. For brevity we omit λ from the notation of $F_\pm$.

(i) Let σ ∈ (k, k + 1] be an arbitrary number, and let
\[ \lambda_1 = \lambda \pm (s + \beta_k)^{-2\gamma} , \qquad \lambda_2 = \lambda \pm (s + \sigma/\alpha)^{-2\gamma} , \qquad s > 0 . \]
Observe that $|\lambda_1 - \lambda_2| \le 2\gamma\delta^{-2\gamma-1}\alpha^{-1}$. Now it follows from (3.24) that
\[ |\rho(\lambda_1) - \rho(\lambda_2)| \le CE_1(\alpha, \delta) , \qquad E_1(\alpha, \delta) = \alpha^{-1/2}\delta^{-\gamma-1/2} . \]
Thus, by definition (7.4), for each R > 0 one has
\[ \Bigl|F^R_\pm(\beta_k) - \int_k^{k+1} F^R_\pm(\sigma/\alpha)\,d\sigma\Bigr| \le CRE_1(\alpha, \delta) , \]
or, changing the variable under the integral,
\[ \Bigl|F^R_\pm(\beta_k) - \alpha\int_{k/\alpha}^{(k+1)/\alpha} F^R_\pm(\sigma)\,d\sigma\Bigr| \le CRE_1(\alpha, \delta) . \]
Therefore
\[ \Bigl|\alpha^{-1}\sum_{k=[\delta\alpha]}^{[A\alpha]} F^R_\pm(\beta_k) - \int_\delta^A F^R_\pm(\sigma)\,d\sigma\Bigr| \le CARE_1(\alpha, \delta) + \int_{[\delta\alpha]\alpha^{-1}}^{\delta} F^R_\pm(\sigma)\,d\sigma + \int_A^{([A\alpha]+1)\alpha^{-1}} F^R_\pm(\sigma)\,d\sigma . \]
Clearly, the first term tends to zero under the conditions (7.5). The last two integrals tend to zero as α → ∞ by Lemma 7.1(i), (ii). Consequently, for each R > 0
\[ \limsup\ \Bigl|\alpha^{-1}\sum_{k=[\delta\alpha]}^{[A\alpha]} F_\pm(\beta_k) - \int_\delta^A F_\pm(\sigma)\,d\sigma\Bigr| \le A\,\limsup \max_{[\delta\alpha] \le k \le [A\alpha]} |F^R_\pm(\beta_k) - F_\pm(\beta_k)| + \int_\delta^A |F^R_\pm(\sigma) - F_\pm(\sigma)|\,d\sigma , \]
where lim sup is taken under the conditions (7.5). Recall that by Lemma 7.1 $F^R_\pm(\sigma)$ converges to $F_\pm(\sigma)$ as R → ∞ uniformly in σ > 0. Thus the right hand side of the above inequality vanishes as R → ∞. This proves (7.6).

(ii) By Lemma 7.1(i), (v), and also by definition (7.3), the functions $F_+$ and $G_\pm$ are integrable in σ for γ > 2. If γ < 2, then $F_\pm$, $G_\pm$ have compact support due to Lemma 7.1(iii). They are also integrable near σ = 0 by virtue of Lemma 7.1(i). If γ = 2, then the same applies to $F_+$ and $G_\pm$ again by Lemma 7.1(i), (iii), (v).

Lemma 7.2 guarantees that the asymptotic coefficients in Theorems 5.5 and 5.9 are finite.
7.2. Asymptotics of $M(\lambda; \pm gq_k)$

Lemma 7.3. Let q satisfy (5.6), and let λ, λ1, λ2 ∈ [λ−, λ+] be 1-admissible. Then for any function δ = δ(α) satisfying (7.5) and any fixed $A > \sup_\alpha \delta(\alpha)$, one has
\[ \lim \max_{\delta \le \beta_k \le A} |\alpha^{-1}M(\lambda; \pm gq_k) - F_\pm(\beta_k(g), \lambda)| = 0 , \tag{7.7} \]
\[ \lim \max_{\delta \le \beta_k \le A} |\alpha^{-1}N(\lambda_1, \lambda_2; \pm gq_k) - G_\pm(\beta_k(g), \lambda_1, \lambda_2)| = 0 , \tag{7.8} \]
as g → ∞.

Proof. Note that (7.8) is a direct consequence of (7.7) in view of (4.11). Let us concentrate on the proof of (7.7). By (6.3) it suffices to establish the required asymptotic formula for the counting function $M(\lambda; \pm gq_k, \Delta)$ with ∆(g) = (0, Rα] and afterwards take R to infinity. The proof of this fact is an adaptation of the corresponding argument from [21].

Suppose first that q(t) = Q(t). Let us split (0, R] into L identical subintervals $(s_{j-1}, s_j]$, j = 1, 2, . . . , L, so that $s_0 = 0$, $s_L = R$ and $s_{j+1} - s_j = RL^{-1}$, and denote
\[ \Delta_j = (s_{j-1}\alpha, s_j\alpha] , \qquad j = 1, 2, \dots, L . \]
Define step functions $q_{k,1}$, $q_{k,2}$:
\[ q_{k,1}(t) = q_k(s_{j-1}\alpha) = \frac{1}{g(s_{j-1} + \beta)^{2\gamma}} , \qquad t \in \Delta_j , \]
\[ q_{k,2}(t) = q_k(s_j\alpha) = \frac{1}{g(s_j + \beta)^{2\gamma}} , \qquad t \in \Delta_j . \]
Here we have denoted $\beta = \beta_k$. The rest of the proof is given for the case s = −1 only. The other case is done in the same way. Clearly, $q_{k,2} \le q_k \le q_{k,1}$, and hence the counting function of the operator $A_0 - gq_k$ with the Dirichlet conditions at the ends of the interval ∆ satisfies the two-sided estimate
\[ N(\lambda; -gq_{k,2}, \Delta) \le N(\lambda; -gq_k, \Delta) \le N(\lambda; -gq_{k,1}, \Delta) . \]
Let us find the asymptotics of the r.h.s. We are going to use the decoupling principle again:
\[ N(\lambda; -gq_{k,1}, \Delta) \le \sum_{j=1}^{L} N(\lambda; -gq_{k,1}, \Delta_j) + 2(L - 1) . \]
Now, using Theorem 3.4, we get for each j
\[ \alpha^{-1}N(\lambda; -gq_{k,1}, \Delta_j) = (s_j - s_{j-1})|\Delta_j|^{-1}N(\lambda + (s_{j-1} + \beta)^{-2\gamma}; 0, \Delta_j) \le (s_j - s_{j-1})\rho(\lambda + (s_{j-1} + \beta)^{-2\gamma}) + C\Bigl(1 + \sqrt{|\lambda| + (s_{j-1} + \beta)^{-2\gamma}}\Bigr)\alpha^{-1} , \tag{7.9} \]
with a universal constant C. Since β ≥ δ, we obtain from (7.9) that
\[ \Bigl|\alpha^{-1}N(\lambda; -gq_{k,1}, \Delta) - \sum_j (s_j - s_{j-1})\rho(\lambda + (s_{j-1} + \beta)^{-2\gamma})\Bigr| \le (L + C(\lambda) + C\delta^{-\gamma})\alpha^{-1} . \]
This can be rewritten as
\[ \Bigl|\alpha^{-1}N(\lambda; -gq_{k,1}, \Delta) - \sum_j \int_{\Delta_j}\rho(\lambda + (s_{j-1} + \beta)^{-2\gamma})\,ds\Bigr| \le (L + C(\lambda) + C\delta^{-\gamma})\alpha^{-1} . \tag{7.10} \]
To replace the sum in the left hand side by the integral, we use the Hölder property (3.24) with
\[ \lambda_1 = \lambda + (s_{j-1} + \beta)^{-2\gamma} , \qquad \lambda_2 = \lambda + (t + \beta)^{-2\gamma} . \]
Observe that
\[ |\lambda_1 - \lambda_2| = |(s_{j-1} + \beta)^{-2\gamma} - (t + \beta)^{-2\gamma}| \le 2\gamma\delta^{-2\gamma-1}|s_{j-1} - t| \le 2\gamma R\delta^{-2\gamma-1}L^{-1} , \qquad \forall\, t \in \Delta_j . \]
Now we infer from (3.24) that
\[ |\rho(\lambda + (s_{j-1} + \beta)^{-2\gamma}) - \rho(\lambda + (t + \beta)^{-2\gamma})| \le C\delta^{-\gamma-1/2}R^{1/2}L^{-1/2} , \qquad t \in \Delta_j . \]
Substituting this estimate into (7.10), we get
\[ \Bigl|\alpha^{-1}N(\lambda; -gq_{k,1}, \Delta) - \int_0^R \rho(\lambda + (s + \beta)^{-2\gamma})\,ds\Bigr| \le CE(\alpha, \delta, L; R) , \]
\[ E(\alpha, \delta, L; R) = (L + 1 + \delta^{-\gamma})\alpha^{-1} + \delta^{-\gamma-1/2}R^{3/2}L^{-1/2} . \]
Arguing similarly, we arrive at the analogous lower bound for $N(\lambda; -gq_{k,2}, \Delta)$. Consequently,
\[ \Bigl|\alpha^{-1}N(\lambda; -gq_k, \Delta) - \int_0^R \rho(\lambda + (s + \beta)^{-2\gamma})\,ds\Bigr| \le CE(\alpha, \delta, L; R) . \]
In view of Theorem 3.4 we also have
\[ \Bigl|\alpha^{-1}N(\lambda; 0, \Delta) - \int_0^R \rho(\lambda)\,ds\Bigr| \le C\alpha^{-1} . \]
By (4.9), in combination with the previous estimate this gives
\[ |\alpha^{-1}M(\lambda; -gq_k, \Delta) - F^R_-(\beta_k, \lambda)| \le CE(\alpha, \delta, L; R) . \]
The parameter L can be chosen so as to ensure that E → 0 as α → ∞. Indeed, in view of (7.5), $\alpha\delta^{2\gamma+1} \to \infty$ as α → ∞. Therefore, defining
\[ L = [\alpha^{1/2}\delta^{-\gamma-1/2}] , \]
we guarantee that
\[ \delta^{-2\gamma-1}L^{-1} \sim \alpha^{-1/2}\delta^{-\gamma-1/2} \to 0 , \qquad L\alpha^{-1} \sim \alpha^{-1/2}\delta^{-\gamma-1/2} \to 0 , \qquad \alpha \to \infty . \]
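The effect of this choice of L can be seen numerically. The sketch below evaluates the error term $E(\alpha, \delta, L; R)$ defined above for $\delta = \alpha^{-\eta}$ with η < 1/(2γ + 1), which is one admissible family satisfying (7.5). The function name `error_term` and the sample values γ = 1, η = 0.2, R = 1 are illustrative assumptions.

```python
import math

def error_term(alpha: float, gamma: float, eta: float, R: float = 1.0) -> float:
    """E(alpha, delta, L; R) with delta = alpha^(-eta) and L = [alpha^(1/2) delta^(-gamma-1/2)]."""
    delta = alpha ** (-eta)
    L = math.floor(math.sqrt(alpha) * delta ** (-gamma - 0.5))
    first = (L + 1 + delta ** (-gamma)) / alpha            # (L + 1 + delta^-gamma) / alpha
    second = delta ** (-gamma - 0.5) * R ** 1.5 / math.sqrt(L)  # delta^(-gamma-1/2) R^(3/2) L^(-1/2)
    return first + second

gamma, eta = 1.0, 0.2  # eta < 1/(2*gamma + 1) = 1/3, so alpha * delta^(2*gamma+1) -> infinity
values = [error_term(10.0 ** p, gamma, eta) for p in (4, 8, 12)]
```

With these sample values both terms of E behave like negative powers of $\alpha\delta^{2\gamma+1}$, so the decay is slow but monotone; taking η closer to 1/(2γ + 1) makes it slower still.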
Taking R to infinity and referring to Lemma 7.1(iv), we obtain (7.7), thus completing the proof for q(t) = Q(t). It remains to include the potentials satisfying (5.6). To this end note that under the condition (5.6), for any ε > 0
\[ Q_k(t)(1 - \varepsilon) \le q_k(t) \le Q_k(t)(1 + \varepsilon) , \]
if k is sufficiently large. Thus, using the monotonicity of M(λ; V) in V (see Sec. 4) and the asymptotics (7.7) for q = Q we easily deduce (7.7) for the general case.

In conclusion note that we shall not need the asymptotics (7.8) in what follows.

8. Asymptotics of $\tilde M(\lambda, A_{\pm gq})$: Proof of Theorems 5.5 and 5.8–5.11

Throughout this section we assume that λ, λ1, λ2 are 2-admissible and satisfy (4.21).

8.1. Proof of Theorems 5.5 and 5.9

Recall that in Theorem 5.5 we assume that either s = −1, γ < 2, or s = +1 and γ > 0 is arbitrary. In Theorem 5.9, γ > 0 is arbitrary, but if s = −1 and γ ≥ 2, then the potential q satisfies Condition 5.7.

Step 1. To begin with we show that “small” or “large” values of k do not contribute to the sums $\tilde M$ and $\tilde N$.

Suppose that k ≥ Aα with some fixed A > 0. Then for γ ≤ 2 and large A the perturbation $gq_k \le CA^{-2\gamma}$ is small and therefore $M(\lambda, \pm gq_k) = 0$, since λ is strictly inside the gap. For γ > 2, by (5.3) we have
\[ \sum_{k \ge A\alpha} M(\lambda; \pm gq_k) \le Cg^{1/2}(A\alpha)^{2-\gamma} = C'A^{2-\gamma}\alpha^2 . \]
By (4.11) a similar bound holds for the sum of the functions $N(\lambda_1, \lambda_2; \pm gq_k)$ for all γ > 0. These calculations again show that k ≥ Aα do not contribute as A grows.

Suppose that k ≤ δα. If γ < 2 and s = −1, then the bounds (5.3) (for γ > 1) and (6.5) (for γ ≤ 1) ensure that for δ > 0
\[ \sum_{k \le \delta\alpha} M(\lambda; -gq_k) \le \begin{cases} C\delta^{2-\gamma}\alpha^2 , & 1 < \gamma < 2 ; \\ C\delta\alpha^2\ln\alpha , & \gamma = 1 ; \\ C\delta\alpha^2 , & \gamma < 1 . \end{cases} \tag{8.1} \]
This means that the share of this sum becomes small when δ → 0. For γ ≠ 1 we can take δ to be an arbitrarily small constant, independent of α. With γ = 1 we must be more careful. Since we want to obtain the asymptotics of order α², we should “kill” the “ln” term in the estimate by choosing δ to be dependent on α, but in a very mild way: $\delta = \alpha^{-\eta}$ with a parameter $\eta < (1 + 2\gamma)^{-1}$, so that the condition (7.5) from Lemma 7.3 is satisfied. For s = +1 the estimate (6.6) yields:
\[ \sum_{k \le \delta\alpha} M(\lambda; gq_k) \le C_\gamma\delta\alpha^2 , \qquad \forall\, \gamma > 0 . \]
As in the case s = −1 and γ ≠ 1, it is possible to take δ to be an arbitrarily small constant. However, for the sake of uniformity, we take $\delta = \alpha^{-\eta}$ for both signs s = ±1. Consequently,
\[ \sum_{k \le \delta\alpha} M(\lambda; \pm gq_k) = o(\alpha^2) \]
and the condition (7.5) is satisfied. In the case of the function $\tilde N$ the estimate (6.7) guarantees that
\[ \sum_{k \le \delta\alpha} N(\lambda_1, \lambda_2; sgq_k) \le C_\gamma\delta\alpha^2 = o(\alpha^2) , \tag{8.2} \]
for s = +1 and all γ > 0. If s = −1, γ < 2, then the same bound follows from (8.1) and (4.11). In the case s = −1, γ ≥ 2 the estimate (8.2) is a direct consequence of (6.18).

Thus, it remains to study the sums (4.3), (4.4) only over the numbers
\[ [\delta\alpha] \le k \le [A\alpha] , \]
with $\delta = \alpha^{-\eta}$, $\eta < (1 + 2\gamma)^{-1}$, and a fixed $A \ge \sup_\alpha \delta$.
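Step 2. We use the notation (7.1). Estimate using (7.6):
\[ \limsup\ \Bigl|\alpha^{-2}\sum_{k=[\delta\alpha]}^{[A\alpha]} M(\lambda; \pm gq_k) - \int_\delta^A F_\pm(\sigma, \lambda)\,d\sigma\Bigr| \le \limsup\ \alpha^{-1}\sum_{k=[\delta\alpha]}^{[A\alpha]} |\alpha^{-1}M(\lambda; \pm gq_k) - F_\pm(\beta_k, \lambda)| \le A\,\limsup \max_{[\delta\alpha] \le k \le [A\alpha]} |\alpha^{-1}M(\lambda; \pm gq_k) - F_\pm(\beta_k, \lambda)| . \]
The right hand side tends to zero by Lemma 7.3. Consequently
\[ \lim\ \Bigl|\alpha^{-2}\sum_{k=[\delta\alpha]}^{[A\alpha]} M(\lambda; \pm gq_k) - \int_\delta^A F_\pm(\sigma, \lambda)\,d\sigma\Bigr| = 0 , \qquad g \to \infty . \]
By (4.11) and (7.3) this equality implies that
\[ \lim\ \Bigl|\alpha^{-2}\sum_{k=[\delta\alpha]}^{[A\alpha]} N(\lambda_1, \lambda_2; \pm gq_k) - \int_\delta^A G_\pm(\sigma, \lambda_1, \lambda_2)\,d\sigma\Bigr| = 0 , \qquad g \to \infty . \]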
Referring to Step 1 of the proof and Lemma 7.2(ii), we can now replace the lower and upper limits of summation and integration by 0 and ∞ respectively. This completes the proof of Theorems 5.5 and 5.9.

8.2. Proof of Theorem 5.8

By (5.6),
\[ \int_0^\infty \sqrt{q_k(t)}\,dt = \int_0^\infty \frac{1 + \varepsilon_k}{(1 + k + t)^2}\,dt = \frac{1 + o(1)}{k + 1} , \]
where $\varepsilon_k \to 0$ as k → ∞. From here we find by virtue of Theorem 6.6 that the function M(λ) has the following asymptotics:
\[ \Bigl|M(\lambda; -gq_k) - \frac{g^{1/2}}{\pi(k+1)}\Bigr| \le Cg^{1/4} + C'g^{1/2}\frac{\varepsilon_k}{k+1} , \tag{8.3} \]
uniformly in k ≥ 0. Observe that the components with numbers $k \ge Ag^{1/4}$ do not contribute if A is sufficiently large, since $|gq_k(t)| \le CA^{-4}$. Let us turn to the remaining terms:
\[ \Bigl|\tilde M(\lambda, A_{-gq}) - \frac{g^{1/2}\ln g}{4\pi}\Bigr| \le \sum_{k \le Ag^{1/4}} |M(\lambda; -gq_k) - g^{1/2}(k+1)^{-1}\pi^{-1}| + g^{1/2}\pi^{-1}\Bigl|\sum_{k \le Ag^{1/4}} (1 + k)^{-1} - \ln g/4\Bigr| . \]
The second term in the r.h.s. is of order $O(g^{1/2})$ in view of the known formula for the partial sum of the harmonic series. By (8.3) the first term is bounded by
\[ CAg^{1/2} + C'g^{1/2}\sum_{k \le Ag^{1/4}} \frac{\varepsilon_k}{1 + k} . \]
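The harmonic-series step used for the second term can be checked numerically. The sketch below assumes the standard asymptotics $\sum_{k=0}^{n}(1+k)^{-1} = \ln(n+1) + \gamma_E + O(1/n)$, with $\gamma_E$ the Euler constant, and shows that the difference between the partial sum up to $Ag^{1/4}$ and $\ln g/4$ stays bounded, approaching $\ln A + \gamma_E$. The function name `harmonic_gap` and the value A = 2 are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant

def harmonic_gap(g: float, A: float) -> float:
    """Return sum_{0 <= k <= A g^(1/4)} 1/(1+k) - ln(g)/4."""
    n_max = math.floor(A * g ** 0.25)
    partial = sum(1.0 / (1 + k) for k in range(n_max + 1))
    return partial - math.log(g) / 4.0

A = 2.0
gaps = [harmonic_gap(10.0 ** p, A) for p in (8, 12, 16)]
limit = math.log(A) + EULER_GAMMA
```

Since the gap is bounded in g, multiplying by $g^{1/2}\pi^{-1}$ gives a contribution of order $O(g^{1/2})$, which is negligible against the leading term $g^{1/2}\ln g/(4\pi)$.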
Since $\varepsilon_k \to 0$ as k → ∞, this quantity is of order $o(g^{1/2}\ln g)$.

8.3. Proof of Theorems 5.10 and 5.11

Rewrite the sum $\tilde M$ in the form
\[ \tilde M(\lambda, A_{gq}) = \sum_k M(\lambda; ge^{-2\kappa k}\hat q^{(k)}) , \qquad \hat q^{(k)} = e^{2\kappa k}q_k . \]
Since $\hat q^{(k)}$, k ≥ 0, satisfies the bound (5.9) for all t ≥ R0, from Theorem 6.3 we obtain that
\[ |M(\lambda; ge^{-2\kappa k}\hat q^{(k)}) - \rho(\lambda)(\alpha - k)| \le C , \qquad \forall\, k \le \alpha - 1 . \]
Consequently,
\[ \sum_{k \le \alpha - 1} M(\lambda; ge^{-2\kappa k}\hat q^{(k)}) = \rho(\lambda)\sum_{k \le \alpha - 1}(\alpha - k) + O(\alpha) = \frac{1}{2}\rho(\lambda)\alpha^2 + O(\alpha) . \]
On the other hand, by (5.5)
\[ M(\lambda; \pm gq_k) \le Cg^{1/2}e^{-\kappa k} = Ce^{\kappa(\alpha - k)} , \]
so that
\[ \sum_{k > \alpha - 1} M(\lambda; \pm gq_k) \le C . \tag{8.4} \]
The asymptotics (5.10) follows. The estimates (5.11) and (5.12) are proved in the same way. By (6.9) and (8.4), (4.11)
\[ \tilde N(\lambda_1, \lambda_2; A_{gq}) \le \sum_{k \le \alpha - 1} C + C' \le C\alpha . \]
Furthermore, as $\hat q^{(k)}$ satisfies Condition 5.7, by (6.18) and (8.4), (4.11)
\[ \tilde N(\lambda_1, \lambda_2; A_{-gq}) \le \sum_{k \le \alpha - 1} N(\lambda_1, \lambda_2; -ge^{-2\kappa k}\hat q^{(k)}) + C' \le \sum_{k \le \alpha - 1} C(\alpha - k) + C' \le C''\alpha^2 . \]
9. Asymptotics of $M(\lambda, A_{\pm gq})$ and $N(\lambda, A_{\pm gq})$

Here we turn to the study of the sums (4.1) and (4.2). As before, to ensure that they are finite we assume that λ, λ1, λ2 satisfy (4.21) with the same closed interval I. Due to the presence of exponential terms in the sums, their study is more complicated than that of $\tilde M$, $\tilde N$, and hence the asymptotic formulae are less explicit. Another feature is that for the exponential and power-like potentials the results are qualitatively different.

9.1. Exponential potentials

In this subsection we always (except for Theorem 9.3) assume that
\[ q(t) = Q(t) = e^{-2\kappa t} . \tag{9.1} \]
This assumption allows one to obtain asymptotic formulae based on the “self-similarity” property of the function $e^{-2\kappa t}$. Introduce the notation ln b = β > 0.

Theorem 9.1. Assume (9.1). Then the following two statements hold:
(i) Let κ > 0 be arbitrary. Then there exist functions $\varphi_\pm$ that are 2κ-periodic, bounded and separated from zero, such that
\[ \lim\bigl[g^{-\frac{\beta}{2\kappa}}N(\lambda_1, \lambda_2; A_{\pm gq}) - \varphi_\pm(\ln g)\bigr] = 0 , \qquad g \to \infty . \tag{9.2} \]

(ii) Suppose that κ > 0 is arbitrary if s = +1 and κ < β if s = −1. Then there exist two functions $\psi_\pm$ that are 2κ-periodic, bounded and separated from zero, such that
\[ \lim\bigl[g^{-\frac{\beta}{2\kappa}}M(\lambda, A_{\pm gq}) - \psi_\pm(\ln g)\bigr] = 0 , \qquad g \to \infty . \tag{9.3} \]

We precede the proof with an elementary but convenient lemma:

Lemma 9.2. Let n(t), $t \in \mathbb{R}$, be a bounded function such that
\[ \begin{cases} n(t) = 0 & \text{for all } t \le t_0 \text{ with some } t_0 > 0 , \\ n(t) \le Ct^{\frac{\beta}{2\kappa} - \varepsilon} , & t \ge t_0 , \text{ for some } \varepsilon > 0 . \end{cases} \tag{9.4} \]
Then for the function
\[ N(g) = n(g) + (1 - b^{-1})\sum_{k \ge 1} e^{\beta k}n(ge^{-2\kappa k}) \tag{9.5} \]
there exists a function φ which is 2κ-periodic, bounded and separated from zero, such that
\[ \lim\bigl[g^{-\frac{\beta}{2\kappa}}N(g) - \varphi(\ln g)\bigr] = 0 , \qquad g \to \infty . \]

Proof. The sum in the right hand side of (9.5) is finite, since for sufficiently large k we have $ge^{-2\kappa k} \le t_0$. Denote
\[ \xi(g) = g^{-\frac{\beta}{2\kappa}}n(g) , \qquad \Xi(g) = g^{-\frac{\beta}{2\kappa}}N(g) . \]
Then (9.5) yields
\[ \Xi(g) = \xi(g) + (1 - b^{-1})\sum_{k \ge 1}\xi(ge^{-2\kappa k}) , \]
which in its turn implies that
\[ \Xi(g) - \Xi(ge^{-2\kappa}) = \xi(g) - b^{-1}\xi(ge^{-2\kappa}) . \]
Using the notation (6.1) and introducing new functions F(α) = Ξ(g), f(α) = ξ(g), we arrive at the equation
\[ F(\alpha) - F(\alpha - 1) = f(\alpha) - b^{-1}f(\alpha - 1) . \]
Since n(t) satisfies (9.4), f(α) = 0 for $\alpha \le \alpha_0 = (2\kappa)^{-1}\ln t_0$, and $f(\alpha) \le Ce^{-2\kappa\varepsilon\alpha}$, α ≥ α0. Therefore all the conditions of the Renewal theorem are satisfied (see [15, Chapter XI.1], or a modern exposition in [17]), which guarantees the existence of a 1-periodic function $\tilde\varphi$ which is bounded and separated from zero, such that
\[ F(\alpha) = \tilde\varphi(\alpha) + o(1) , \qquad \alpha \to \infty , \]
which leads to (9.2) after the substitution $\varphi(\alpha) = \tilde\varphi(\alpha/(2\kappa))$.
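The difference equation at the heart of the proof of Lemma 9.2 is an exact telescoping identity, and it can be checked for any concrete n satisfying (9.4). The sketch below does this for one sample function; `n_sample`, `big_n`, `xi`, `Xi` and the parameter values β = ln 3, κ = 0.4 are illustrative assumptions, not objects from the paper.

```python
import math

T0 = 1.0

def n_sample(t: float) -> float:
    """A bounded sample function vanishing for t <= T0 (an assumption for illustration)."""
    return 0.0 if t <= T0 else 1.0 - T0 / t

def big_n(g: float, beta: float, kappa: float) -> float:
    """N(g) from (9.5); the sum is finite because n vanishes below T0."""
    b = math.exp(beta)
    total = n_sample(g)
    k = 1
    while g * math.exp(-2.0 * kappa * k) > T0:
        total += (1.0 - 1.0 / b) * math.exp(beta * k) * n_sample(g * math.exp(-2.0 * kappa * k))
        k += 1
    return total

def xi(g: float, beta: float, kappa: float) -> float:
    return g ** (-beta / (2.0 * kappa)) * n_sample(g)

def Xi(g: float, beta: float, kappa: float) -> float:
    return g ** (-beta / (2.0 * kappa)) * big_n(g, beta, kappa)

beta, kappa = math.log(3.0), 0.4
b = math.exp(beta)
shift = math.exp(-2.0 * kappa)
checks = []
for g in (5.0, 50.0, 500.0):
    lhs = Xi(g, beta, kappa) - Xi(g * shift, beta, kappa)
    rhs = xi(g, beta, kappa) - xi(g * shift, beta, kappa) / b
    checks.append((lhs, rhs))
```

The identity holds for every g, not only asymptotically; the Renewal theorem is needed only to pass from this one-step relation to the periodic limit of F(α).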
Proof of Theorem 9.1. (i) The proof is done simultaneously for both signs s = ±1. We use Lemma 9.2 with n(g) = N(λ1, λ2; ±gq). Since $N(\lambda_1, \lambda_2; \pm gq_k) = n(ge^{-2\kappa k})$, by (4.2) the function N(g) in the r.h.s. of (9.5) coincides with $N(\lambda_1, \lambda_2; A_{\pm gq})$. By (5.5) and (4.11) n(t) = 0, t ≤ t0 for a sufficiently small t0 > 0. Moreover, by (6.18) or (6.7), the second condition in (9.4) is also fulfilled for any $\varepsilon < \beta(2\kappa)^{-1}$. Thus the required asymptotics (9.2) follows from Lemma 9.2.

(ii) The cases s = +1 and s = −1 are treated separately. Let first s = +1. Denote now n(g) = M(λ; gq). Then, similarly to the first part of the proof, the total counting function (4.1) coincides with (9.5). By (5.5) and (6.6) the function n satisfies (9.4) for any $\varepsilon < \beta(2\kappa)^{-1}$. Thus Lemma 9.2 guarantees the asymptotics (9.3) for s = +1.

In the case s = −1, κ < β, the first condition in (9.4) is satisfied for n(g) = M(λ; −gq) in view of (5.5). Besides, (5.5) ensures also that the second condition is satisfied with $\varepsilon = \beta(2\kappa)^{-1} - 1/2 > 0$. Again, Lemma 9.2 leads to (9.3) for s = −1.

For s = −1 the cases κ < β and κ > β are described by Theorem 9.1 and Lemma 5.3 respectively. Let us handle the critical case β = κ. We emphasise that this is the only asymptotic formula in this subsection which does not require the exact equality $q(t) = e^{-2\kappa t}$.

Theorem 9.3. Suppose that
\[ q(t) = Q(t)(1 + o(1)) , \qquad t \to \infty , \qquad Q(t) = e^{-2\kappa t} , \]
with κ = β, and that q satisfies Condition 5.7. Then
\[ \lim \frac{M(\lambda; A_{-gq})}{g^{1/2}\ln g} = \frac{1 - b^{-1}}{2\pi\kappa^2} , \qquad g \to \infty . \]

Proof. By (5.5) $M(\lambda; -gq_k) = 0$ for all k ≥ α + A with a sufficiently large A, and
\[ \sum_{k \le A} b^k M(\lambda; -gq_k) + \sum_{\alpha - A \le k \le \alpha + A} b^k M(\lambda; -gq_k) \le CAg^{1/2} . \]
Consequently
\[ \lim \frac{M(\lambda; A_{-gq})}{g^{1/2}\ln g} = \lim \frac{1 - b^{-1}}{g^{1/2}\ln g}\sum_{A < k < \alpha - A} e^{\kappa k}M(\lambda; -gq_k) , \qquad g \to \infty , \tag{9.6} \]
if the limit in the r.h.s. exists. For k ∈ (A, α − A) apply Theorem 6.6 and the relation (4.5) to obtain the asymptotics
\[ M(\lambda; -gq_k) = \frac{g^{1/2}}{\pi}\int_0^\infty \sqrt{q(k + t)}\,dt + O(\alpha) = g^{1/2}e^{-\kappa k}\bigl((\pi\kappa)^{-1} + o_A(1)\bigr) + O(\alpha) , \]
where $o_A(1) \to 0$ as A → ∞ uniformly in k, g. Therefore the sum in the r.h.s. of (9.6) equals
\[ \frac{1}{\pi\kappa}\sum_{A < k < \alpha - A} g^{1/2} + g^{1/2}\alpha\,o_A(1) + O(\alpha)\sum_{A < k < \alpha - A} e^{\kappa k} = \frac{g^{1/2}\alpha}{\pi\kappa} + g^{1/2}O(A) + g^{1/2}\alpha\,o_A(1) + e^{\kappa\alpha}e^{-\kappa A}O(\alpha) . \]
Since α = ln g/(2κ), now it follows from (9.6) that
\[ \limsup_{g \to \infty}\ \Bigl|\frac{M(\lambda; A_{-gq})}{g^{1/2}\ln g} - \frac{1 - b^{-1}}{2\pi\kappa^2}\Bigr| = o_A(1) + O(e^{-\kappa A}) . \]
Since A is arbitrary, the required result follows.

9.2. Power-like potentials

For power-like potentials the asymptotic formulae that we obtain are less informative, since they are established for ln M and ln N. For the sake of illustration we consider here only $M(\lambda; A_{\pm gq})$. The corresponding asymptotics of $N(\lambda; A_{\pm gq})$ can be easily derived using the same argument as well. For simplicity we assume that $q(t) = Q(t) = (1 + t)^{-2\gamma}$. For more general power-like potentials the results follow by monotonicity of M with respect to the potential. Recall that I denotes the interval defined in (4.21). We also use the notation $d_\pm = |\lambda - \lambda_\mp|$, $\sigma_\pm = d_\pm^{-\frac{1}{2\gamma}}$ introduced in Lemma 7.1(iii).

Theorem 9.4. Let I be a closed interval defined in (4.21), which is strictly inside the gap (λ−, λ+). Let $q(t) = (1 + t)^{-2\gamma}$, γ > 0. Then for any λ ∈ I
\[ \lim \alpha^{-1}\ln M(\lambda; A_{\pm gq}) = \beta d_\pm^{-\frac{1}{2\gamma}} , \qquad g \to \infty , \]
where β = ln b.

Proof. Upper bound. A straightforward perturbation argument ensures that $M(\lambda; \pm gq_k) = 0$ if $gq_k(t) \le d_\pm$, ∀ t > 0, i.e. for
\[ k \ge K_1 = K_1(g) = (d_\pm^{-1}g)^{\frac{1}{2\gamma}} = \alpha d_\pm^{-\frac{1}{2\gamma}} . \]
It follows from Lemmas 5.2 and 6.2 that $M(\lambda; \pm gq_k) \le Cg^\omega$ with some ω = ω(γ) > 0. Substituting this bound in (4.1) gives
\[ M(\lambda; A_{\pm gq}) = M(\lambda; \pm gq) + (1 - b^{-1})\sum_{k \in \mathbb{N}} b^k M(\lambda; \pm gq_k) \le Cg^\omega\sum_{k=0}^{K_1} b^k \le C'g^\omega b^{K_1} , \]
with a constant C' depending only on b, γ.
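To see how this bound yields the upper half of the logarithmic asymptotics, one can take logarithms explicitly. The following step is a sketch that uses only $K_1 = \alpha d_\pm^{-1/(2\gamma)}$, $\ln b = \beta$ and $\ln g = 2\gamma\ln\alpha$ from (7.1):

```latex
\[
  \alpha^{-1}\ln M(\lambda; A_{\pm gq})
  \;\le\; \alpha^{-1}\ln C' + \omega\,\alpha^{-1}\ln g + \alpha^{-1}K_1\ln b
  \;=\; \frac{\ln C'}{\alpha} + \frac{2\gamma\omega\ln\alpha}{\alpha}
        + \beta\, d_\pm^{-\frac{1}{2\gamma}} ,
\]
\[
  \text{hence}\qquad
  \limsup_{g\to\infty}\ \alpha^{-1}\ln M(\lambda; A_{\pm gq})
  \;\le\; \beta\, d_\pm^{-\frac{1}{2\gamma}} .
\]
```

The polynomial factor $g^\omega$ is invisible on the logarithmic scale, which is why the precise value of ω plays no role in Theorem 9.4.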
Lower bound. For the lower bound we drop all but one term from the sum (4.1): for any ε ∈ (0, 1) we have
\[ M(\lambda; A_{\pm gq}) \ge (1 - b^{-1})b^k M(\lambda; \pm gq_k) , \qquad k = [(1 - \varepsilon)K_1] . \]
By Lemmas 7.3 and 7.1(iii),
\[ M(\lambda; \pm gq_k) \ge c\alpha F_\pm(\beta_k, \lambda) \ge c'\alpha\bigl(d_\pm^{-\frac{1}{2\gamma}} - \beta_k\bigr)^{3/2} , \]
for sufficiently large α. Since $\beta_k = (k + 1)\alpha^{-1}$, we see that the r.h.s. is bounded from below by $c''\alpha\varepsilon^{3/2}$ with a constant depending only on $d_\pm$. Consequently,
\[ M(\lambda; A_{\pm gq}) \ge c\,\varepsilon^{3/2}\alpha\,b^{K_1(1 - \varepsilon)} , \]
with a constant depending on b, λ and γ. Since ε > 0 is arbitrary, in combination with the upper bound, this gives the required asymptotics.

Acknowledgments

The second author (M.S.) was partly supported by the Minerva center for nonlinear physics and by the Israel Science Foundation, and partly by the EPSRC grant GR/N 37193/01. This paper was essentially completed when M.S. was visiting King's College, London in April–May 2001. The authors are grateful to Yu. Safarov for valuable discussions.

References

[1] S. Alama, P. A. Deift and R. Hempel, Eigenvalue branches of the Schrödinger operator H − λW in a gap of σ(H), Comm. Math. Phys. 121 (1989) 291–321.
[2] C. Allard and R. Froese, A Mourre estimate for a Schrödinger operator on a binary tree, Rev. Math. Phys. 12(12) (2000) 1655–1667.
[3] M. Sh. Birman, Discrete spectrum in the gaps of a continuous one for perturbations with large coupling constant, in Estimates and Asymptotics for Discrete Spectra of Integral and Differential Equations, Leningrad, 1989–90, Adv. Soviet Math., 7, Amer. Math. Soc., Providence, RI, 1991, pp. 57–73.
[4] M. Sh. Birman and V. V. Borzov, On the asymptotics of the discrete spectrum for certain singular differential operators, Probl. Mat. Fiz. 5 (1971), 24–38; English transl. in Topics in Math. Phys. 5 (1972).
[5] M. Sh. Birman and A. Laptev, Discrete spectrum of the perturbed Dirac operator, Ark. Mat. 32(1) (1994) 13–32.
[6] ---, The negative discrete spectrum of a two-dimensional Schrödinger operator, Comm. Pure Appl. Math. 49(9) (1996) 967–997.
[7] M. Sh. Birman, A. Laptev and M. Solomyak, On the eigenvalue behaviour for a class of differential operators on semiaxis, Math. Nachr. 195 (1998) 17–46.
[8] M. Sh. Birman and M. Solomyak, Quantitative analysis in Sobolev imbedding theorems and applications to spectral theory, in Tenth Mathem. School, Izd. Inst. Mat. Akad. Nauk Ukrain, SSSR, Kiev, 1974 (Russian), pp. 5–189; English translation in Amer. Math. Soc. Translations 114(2) (1980) 1–132.
June 4, 2002 11:0 WSPC/148-RMP
00123
Schr¨ odinger Operators on Homogeneous Metric Trees
[9]
[10]
[11]
[12] [13] [14] [15] [16] [17] [18] [19] [20] [21]
467
, Estimates for the number of negative eigenvalues of the Schr¨ odinger operator and its generalizations, in Estimates and Asymptotics for Discrete Spectra of Integral and Differential Equations, Leningrad, 1989–90, Adv. Soviet Math., 7, Amer. Math. Soc., Providence, RI, 1991, pp. 1–5. , On the negative discrete spectrum of a periodic elliptic operator in a waveguide-type domain, perturbed by a decaying potential, J. d’Anal. Math. 83 (2000) 337–391. M. Sh. Birman and G. D. Raikov, Discrete spectrum in the gaps for perturbations of the magnetic Schr¨ odinger operator, Estimates and Asymptotics for Discrete Spectra of Integral and Differential Equations Leningrad, 1989–90, Adv. Soviet Math., 7, Amer. Math. Soc., Providence, RI, 1991, pp. 75–84. R. Carlson, Hill’s equation for a homogeneous tree, Electron. J. Differen. Equations 1997 (23), 30 pp. (electronic). , Nonclassical Sturm–Liouville problems and Schr¨ odinger operators on radial trees, Electron. J. Differen. Equations 2000 (71), 24 pp. (electronic). P. A. Deift and R. Hempel, On the existence of eigenvalues of the Schr¨ odinger operator H − λW in a gap of σ(H), Comm. Math. Phys. 103 (1986) 461–490. W. Feller, An Introduction to the Probability Theory and Its Applications, Vol. II, John Wiley and Sons, Inc., New York-London-Sidney-Toronto, 1971. M. Klaus, On the point spectrum of Dirac operators, Helv. Phys. Acta 53 (1980) 453–462. M. Levitin and D. Vassiliev, Spectral asymptotics, renewal theorem, and the Berry conjecture for a class of fractals, Proc. London Math. Soc. 72(3) (1996) 188–214. K. Naimark and M. Solomyak, Eigenvalue estimates for the weighted Laplacian on metric trees, Proc. London Math. Soc. 80(3) (2000) 690–724. , Geometry of the Sobolev spaces on the regular trees and Hardy’s inequalities, Russian J. Math. Physics 8(3) (2001). R. V. Romanov and G. E. Rudin, Scattering on the Bruhat-Tits tree. I, Phys. Lett. A198 (1995) 113–118. A. V. 
Sobolev, Weyl asymptotics for the discrete spectrum of the perturbed Hill operator, in Estimates and Asymptotics for Discrete Spectra of Integral and Differential Equations, Leningrad, 1989–90, Adv. Soviet Math., 7, Amer. Math. Soc., Providence, RI, 1991, pp. 159–178.
June 4, 2002 11:36 WSPC/148-RMP
00122
Reviews in Mathematical Physics, Vol. 14, No. 5 (2002) 469–510 © World Scientific Publishing Company
EVOLUTION OF CENTRAL MOMENTS FOR A GENERAL-RELATIVISTIC BOLTZMANN EQUATION: THE CLOSURE BY ENTROPY MAXIMIZATION
ZBIGNIEW BANACH Centre of Mechanics, Institute of Fundamental Technological Research Department of Fluid Mechanics, Polish Academy of Sciences Swietokrzyska 21, 00-049 Warsaw, Poland
[email protected] WIESLAW LARECKI Institute of Fundamental Technological Research Department of Theory of Continuous Media Polish Academy of Sciences, Swietokrzyska 21, 00-049 Warsaw, Poland
[email protected]
Received 8 August 2001 Revised 28 January 2002
Beginning from the relativistic Boltzmann equation in a curved space-time, and assuming that there exists a fiducial congruence of timelike world lines with four-velocity vector field u, it is the aim of this paper to present a systematic derivation of a hierarchy of closed systems of moment equations. These systems are found by using the closure by entropy maximization. Our concepts are primarily applied to the formalism of central moments because if an alternative and more familiar theory of covariant moments is taken into account, then the method of maximum entropy is ill-defined in a neighborhood of equilibrium states. The central moments are not covariant in the following sense: two observers looking at the same relativistic gas will, in general, extract two different sets of central moments, not related to each other by a tensorial linear transformation. After a brief review of the formalism of trace-free symmetric spacelike tensors, the differential equations for irreducible central moments are obtained and compared with those of Ellis et al. [Ann. Phys. (NY) 150 (1983) 455]. We derive some auxiliary algebraic identities which involve the set of central moments and the corresponding set of Lagrange multipliers; these identities enable us to show that there is an additional balance law interpreted as the equation of balance of entropy. The above results are valid for an arbitrary choice of the Lorentzian metric g and the four-velocity vector field u. Later, the definition of u as in the well-known theory of Arnowitt, Deser, and Misner is proposed in order to construct a hierarchy of symmetric hyperbolic systems of field equations. Also, the Eckart and Landau–Lifshitz definitions of u are discussed. Specifically, it is demonstrated that they lead, in general, to the systems of nonconservative equations. Keywords: General-relativistic kinetic theory; differential equations for central moments; closure by entropy maximization; symmetric hyperbolic systems. 
PACS Nos.: 04.20.-q, 05.20.Dd, 05.70.Ln, 47.75.+f
1. Introduction

In general-relativistic kinetic theory of classical and quantum ideal gases [1, 2], a traditional way of defining the moments of the distribution function is to multiply this function by tensorial powers of the particle four-momentum and then to integrate the resulting products over the future mass-shell [3, 4]. We call these traditional moments the covariant moments, since the calculations leading to them do not require that there exists a 3 + 1 splitting of space-time determined by some chosen four-velocity vector field. On the other hand, if our space-time is supplied with a preferred four-velocity vector field corresponding to a congruence of timelike world lines (world lines of observers), it will also be possible to introduce the so-called central moments of the distribution function. As noted already by van Kampen [5], these central moments are not covariant in the following sense: two observers looking at the same relativistic gas will, in general, extract two different sets of central moments, not related to each other by a tensorial linear transformation. Within the context of general-relativistic radiative transfer, the idea of using such central moments was explained by Anderson and Spiegel [6], and mathematical and physical aspects of their formalism have been developed by Thorne [7]. For particles with both vanishing and nonvanishing proper mass, a very complete elaboration of the formalism, taking account of the most general definition of a central moment, has been given by Ellis et al. [8, 9]. More recently, Struchtrup [10] considered a new central-moment closure of the relaxation-time model equation of Anderson and Witting [11] as opposed to that based on the covariant moments [12, 13].
Finally, we mention that Banach [14] and Banach and Larecki [15] investigated two preferred sets of central moments, the basic role of which is associated with the unique parametrization of the space of distribution functions and the exact decomposition of the Boltzmann entropy (per unit volume) into equilibrium and nonequilibrium parts. In this context, see also our discussion in Sec. 5.1. Beginning from the relativistic Boltzmann equation in a curved or flat spacetime, it is the aim of this paper to present a systematic derivation of two different hierarchies of closed systems of moment equations. These derivations adopt the closure by entropy maximization, and here we note that an analogous method of truncating the nonrelativistic moment equations has been carried out by Dreyer [16] and Levermore [17]. We consider two moment formalisms. The first formalism applies to covariant moments, while the second formalism starts with essentially the same general program but uses the central moments in place of covariant ones. In the case of covariant moments, a major problem confronts the mathematical procedure for deriving evolution equations by means of a maximum entropy principle. Indeed, for any member of a hierarchy of closed systems of moment equations, approaches of this type impose some restrictions on the possible values of moments or other equivalent variables, and it is physically clear that they are not natural unless the Euler system is the theme of interest. The basic conclusion is as follows. In the formalism of covariant moments, the method of maximum entropy is ill-defined in a neighborhood of equilibrium states. A very similar situation exists concerning the
nonrelativistic gas [17], where only two physically satisfactory members of the hierarchy are possible. The first one is the Euler system, which is based on Maxwellian velocity distributions, while the second is based on the 10 moment, nonisotropic Gaussian closure. Systematically, this closure was first discussed by Banach and Piekarski [18]. Later, it was used as a starting point for a thermodynamic description of ideal gases under shear [19] and for an analysis of the nature of shock profiles [20].

Since the aforementioned difficulties are unavoidable in the case of covariant moments, this paper is primarily concerned with the general structure of a theory of central moments. Precisely speaking, we find that if the relativistic gas consists of particles with proper mass different from zero, then the method of maximum entropy applied to two suitably chosen sets of central moments automatically leads to a well-defined hierarchy of closed systems of field equations. This result explains, for the first time, the idea of Müller and Ruggeri [13] that the closure by entropy maximization really yields a “theory of theories,” which carries its own proof of validity: as soon as an increase of the number of central moments does not improve a result, we know that the extant number is sufficient. Our constructions are also valid for the case of a gas of massless particles, although a fully satisfactory theory of theories cannot be obtained by exploiting the maximum entropy closure in this case. As to the details, see Secs. 5.1 and 7.

Now, in order to prove that each member of the hierarchy is consistent with a supplementary balance law, interpreted as the equation of balance of entropy, we derive some auxiliary algebraic identities which involve the set of central moments and the corresponding set of Lagrange multipliers. These identities and the formulas of Secs. 2–5 can be constructed quite generally in a completely rigorous way for an arbitrary Lorentzian metric g and an arbitrary four-velocity vector field u. However, it is sometimes natural to assume that there exists a “slicing” of the space-time manifold X by spacelike Cauchy surfaces Σ_t, labeled by the time parameter t. According to the well-known formalism of Arnowitt et al. [21], the Lorentzian metric g is then projected along, and orthogonal to, the hypersurfaces of the foliation to give a canonical picture in which u is the unit (future-directed) normal to Σ_t. Our reply to the question “why do we define in Secs. 6.1 and 6.2 the four-velocity vector field u in this way?” is that we also want to obtain a hierarchy of symmetric hyperbolic systems of field equations [22], because they ensure well-posedness of local Cauchy problems [23]. Of course, many interesting problems arise when attempting to implement the alternative definitions of u such as, e.g. the Eckart and Landau–Lifshitz options [24, 25]. As we shall see, in discussing these two options, one is led inevitably to the systems of nonconservative equations. Consequently, no immediate proof of symmetric hyperbolicity of the field equations can be provided, except in the cases mentioned in Sec. 6.3.

Another remark is simply this. Given a choice of a timelike vector field, Ellis et al. [8, 9] proposed a spherical harmonic analysis of the distribution function f, obtaining differential equations for a set of moments of the spherical harmonic components of f, which are the quantities of physical interest and which they represent
as trace-free symmetric spacelike tensors. However, such irreducible equations can also be extracted directly from the equations for central moments by using the method of Banach and Piekarski [26]. These matters are not altogether trivial or immediate, and thus we hope that the alternative and relatively simple derivation given here will prove to be useful.

The layout of this paper is as follows. Section 2 establishes our notation for tensors. Section 3 first defines the relativistic Boltzmann equation in a curved spacetime and then considers the difficulties associated with deriving a hierarchy of closed systems of equations for covariant moments. Sections 4 and 5 develop a theory of central moments, since this theory is superior to the more familiar covariant formalism in that it meshes more naturally with the concept of closure by a maximum entropy principle. Within the framework of a theory of central moments, Sec. 6 discusses the crucial assumptions that enter into the construction of symmetric hyperbolic systems of field equations. Section 7 concludes by summarizing the principal results and indicating the direction of future research. Some intermediate calculations are put into the Appendix.

Our signature convention is (−, +, +, +). Lowercase Latin indices are used for the general basis and uppercase Latin indices for the coordinate basis. If they are taken from the beginning of the alphabet, i.e. (a, b, . . . , h; A, B, . . . , H), their range is {0, 1, 2, 3}, whereas for the middle of the alphabet, i.e. (i, j, . . . , n; I, J, . . . , N), their range is only {1, 2, 3}. In Sec. 6, we adopt the notation of Thorne [7]:

$$M^{\Omega_\alpha} := M^{i_1 i_2 \cdots i_\alpha}, \tag{1.1}$$

where a capital Greek superscript denotes a string of lowercase Latin indices; their number is denoted by a lowercase Greek subscript on the capital superscript ($i_p = 1, 2, 3$; $p = 1, 2, \ldots, \alpha$). The units are such that $c$, $k_B$, and $\hbar$ are equal to 1.

2. Preliminaries

2.1. Basic tensorial definitions

For each point of the space-time manifold X, we denote by $T^\alpha_\beta(x)$ the space of tensors of type (α, β) at x ∈ X. Let $\{e_a\}$ be a basis of vectors at x, and suppose that $\{e^a\}$ is a basis dual to $\{e_a\}$. If we introduce the simple tensors

$$e_{a_1\cdots a_\alpha} := e_{a_1} \otimes \cdots \otimes e_{a_\alpha} \tag{2.1a}$$

and

$$e^{b_1\cdots b_\beta} := e^{b_1} \otimes \cdots \otimes e^{b_\beta}, \tag{2.1b}$$

then an arbitrary tensor $M^\alpha_\beta \in T^\alpha_\beta(x)$ can be expressed in terms of these tensors as

$$M^\alpha_\beta = M^{\alpha\, a_1\cdots a_\alpha}_{\beta\; b_1\cdots b_\beta}\, e_{a_1\cdots a_\alpha} \otimes e^{b_1\cdots b_\beta}, \tag{2.2}$$
where $\{M^{\alpha\, a_1\cdots a_\alpha}_{\beta\; b_1\cdots b_\beta}\}$ are the components of $M^\alpha_\beta$ with respect to the dual bases $\{e_a\}$ and $\{e^a\}$. By definition, we say that $M^\alpha \in T^\alpha_0(x)$ is a tensor of type α at x. The action of the symmetrizer S on a tensor $M^\alpha$ of type α is given by

$$S M^\alpha := M^{\alpha (a_1\cdots a_\alpha)}\, e_{a_1\cdots a_\alpha}, \qquad \alpha \ge 2, \tag{2.3}$$

where parentheses enclosing a set of α indices represent symmetrization of these indices, i.e. the sum of $M^{\alpha a_1\cdots a_\alpha}$ over α! permutations of the indices, divided by α!. For $M^1 \in T^1_0(x)$ and $M^0 \in T^0_0(x)$, we set $SM^1 := M^1$ and $SM^0 := M^0$. Now, suppose that $M^\alpha$ and $M^\beta$ are the tensors of types α and β, respectively. Then the equality

$$M^\alpha \vee M^\beta := S(M^\alpha \otimes M^\beta) \qquad (\alpha \ge 1,\ \beta \ge 1) \tag{2.4}$$

defines the symmetric tensor product of $M^\alpha$ and $M^\beta$. If α ≥ β = 0, we set $M^\alpha \otimes M^0 := M^0 M^\alpha$, $M^0 \otimes M^\alpha := M^0 M^\alpha$, $M^\alpha \vee M^0 := M^0 (SM^\alpha)$, and $M^0 \vee M^\alpha := M^0 (SM^\alpha)$.

Let $g = g_{ab}\, e^a \otimes e^b$ be the metric on X. The trace operator with respect to the pair (µ, ν), denoted by $\mathrm{Tr}_{(\mu,\nu)}$, determines a linear map:

$$\mathrm{Tr}_{(\mu,\nu)} : T^\alpha_0(x) \to T^{\alpha-2}_0(x), \qquad \alpha \ge 2. \tag{2.5}$$

Because of this, we only need to consider how it operates on simple tensors:

$$\mathrm{Tr}_{(\mu,\nu)}\, e_{a_1\cdots a_\alpha} := g_{a_\mu a_\nu}\, e_{a_1\cdots \hat a_\mu \cdots \hat a_\nu \cdots a_\alpha}. \tag{2.6}$$
Here the hat over $a_\mu$ and $a_\nu$ tells us that $e_{a_\mu}$ and $e_{a_\nu}$ do not appear in $e_{a_1\cdots \hat a_\mu \cdots \hat a_\nu \cdots a_\alpha}$. Clearly, $\mathrm{Tr}_{(\mu,\nu)} M^\alpha$ is a tensor of type α − 2 if $M^\alpha$ is a tensor of type α. For symmetric tensors $M^\alpha \in T^\alpha_0(x)$, α ≥ 2, we abbreviate $\mathrm{Tr}_{(\mu,\nu)} M^\alpha$ as $\mathrm{Tr}\, M^\alpha$. In the cases α = 1 and α = 0, it will be convenient to set $\mathrm{Tr}\, M^1 := M^1$ and $\mathrm{Tr}\, M^0 := M^0$. If $M^\alpha \in T^\alpha_0(x)$, α ≥ 2, is a symmetric tensor, we define $\mathrm{Tr}^\beta M^\alpha$, β ≥ 0, by the following formulas:

$$\mathrm{Tr}^\beta M^\alpha := \mathrm{Tr}(\mathrm{Tr}^{\beta-1} M^\alpha), \tag{2.7a}$$

$$\mathrm{Tr}^1 M^\alpha := \mathrm{Tr}\, M^\alpha, \qquad \mathrm{Tr}^0 M^\alpha := M^\alpha. \tag{2.7b}$$

Consequently, $\mathrm{Tr}^\beta M^\alpha$ is the result of the β-fold successive application of the Tr operator to the symmetric tensor $M^\alpha \in T^\alpha_0(x)$, α ≥ 2. We extend the above definition of $\mathrm{Tr}^\beta M^\alpha$ to the cases α = 1 and α = 0 by setting $\mathrm{Tr}^\beta M^1 := M^1$ and $\mathrm{Tr}^\beta M^0 := M^0$.

Let µ := min(α, β). Then the µ-fold contraction of $M^\alpha \in T^\alpha_0(x)$ with $M^\beta \in T^\beta_0(x)$ will be denoted by $M^\alpha \circ M^\beta$. The tensor $M^\alpha \circ M^\beta$ of type α + β − 2µ is usually termed the inner tensor product of $M^\alpha$ and $M^\beta$. However, if $M^\alpha$ and $M^\beta$ are not totally symmetric tensors, then some convention as to which of the 2µ indices are to be contracted must be followed when doing the contraction. Since $M^\alpha \circ M^\beta$ is
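a bilinear operation on $M^\alpha$ and $M^\beta$, this convention can be read directly from the definition of $e_{a_1\cdots a_\alpha} \circ e_{b_1\cdots b_\beta}$:

$$e_{a_1\cdots a_\alpha} \circ e_{b_1\cdots b_\beta} := \begin{cases} g_{a_1 b_1} \cdots g_{a_\beta b_\beta}\, e_{a_{\beta+1}\cdots a_\alpha} & \text{for } \alpha > \beta, \\ g_{a_1 b_1} \cdots g_{a_\beta b_\beta} & \text{for } \alpha = \beta, \\ g_{a_1 b_1} \cdots g_{a_\alpha b_\alpha}\, e_{b_{\alpha+1}\cdots b_\beta} & \text{for } \alpha < \beta. \end{cases} \tag{2.8}$$

These constructions, which imply that $M^\alpha \circ M^\beta = M^\beta \circ M^\alpha$, are only valid for α ≥ 1 and β ≥ 1. In other cases, i.e. if α ≥ β = 0, we simply write $M^\alpha \circ M^0 := M^0 M^\alpha$ and $M^0 \circ M^\alpha := M^0 M^\alpha$. Recalling the definitions of S and $\mathrm{Tr}_{(\mu,\nu)}$, the action of $\sqcup$ on $M^\alpha \in T^\alpha_0(x)$ and $M^\beta \in T^\beta_0(x)$ is characterized by

$$M^\alpha \sqcup M^\beta := S\, \mathrm{Tr}_{(1,\alpha+\beta)} (M^\alpha \otimes M^\beta), \qquad \alpha + \beta \ge 2. \tag{2.9}$$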
Next, if $M^\alpha$ and $\bar M^\alpha$ are the tensors of type α and α ≥ 2, then the action of   on $M^\alpha$ and $\bar M^\alpha$ is described by

$$M^\alpha \;\; \bar M^\alpha := M^{\alpha\, a a_2\cdots a_\alpha}\, \bar M^{\alpha\, b}{}_{a_2\cdots a_\alpha}\, e_a \otimes e_b, \tag{2.10}$$

where

$$\bar M^{\alpha\, b}{}_{a_2\cdots a_\alpha} := g_{a_2 b_2} \cdots g_{a_\alpha b_\alpha}\, \bar M^{\alpha\, b b_2\cdots b_\alpha}. \tag{2.11}$$

As regards $\{M^{\alpha\, a a_2\cdots a_\alpha}\}$ and $\{\bar M^{\alpha\, b b_2\cdots b_\alpha}\}$, these are the components of $M^\alpha$ and $\bar M^\alpha$ with respect to the basis $\{e_a\}$. For $M^1 \in T^1_0(x)$ and $\bar M^1 \in T^1_0(x)$, the tensor $M^1 \;\; \bar M^1$ is given by

$$M^1 \;\; \bar M^1 := M^{1a} \bar M^{1b}\, e_a \otimes e_b. \tag{2.12}$$

For $M^0 \in T^0_0(x)$ and $\bar M^0 \in T^0_0(x)$, it will be natural to identify $M^0 \;\; \bar M^0$ with the zero tensor of type 2. From all these definitions we infer that $M^\alpha \;\; \bar M^\alpha$ is an element of $T^2_0(x)$.

Consider a space-time in which there exists a fiducial congruence of timelike world lines (world lines of “fiducial observers”) with four-velocity vector field $u = u^a e_a$. Using $g_{ab}$ and $u_a := g_{ab} u^b$, we define $I_{ab}$ by $I_{ab} := g_{ab} + u_a u_b$:

$$I^{ab} := g^{ab} + u^a u^b \qquad (g^{ac} g_{cb} = \delta^a{}_b), \tag{2.13a}$$

$$I^a{}_b := g^a{}_b + u^a u_b \qquad (g^a{}_b = \delta^a{}_b). \tag{2.13b}$$
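The tensor $I^a{}_b\, e_a \otimes e^b$ projects orthogonal to u. For any $M^\alpha_\beta \in T^\alpha_\beta(x)$, we can construct $(M^\alpha_\beta)^\perp \in T^\alpha_\beta(x)$ in such a way that the components $\{(M^\alpha_\beta)^{\perp\, a_1\cdots a_\alpha}{}_{b_1\cdots b_\beta}\}$ of $(M^\alpha_\beta)^\perp$ with respect to the dual bases $\{e_a\}$ and $\{e^a\}$ are given by

$$(M^\alpha_\beta)^{\perp\, a_1\cdots a_\alpha}{}_{b_1\cdots b_\beta} := I^{a_1}{}_{c_1} \cdots I^{a_\alpha}{}_{c_\alpha}\, I^{d_1}{}_{b_1} \cdots I^{d_\beta}{}_{b_\beta}\, M^{\alpha\, c_1\cdots c_\alpha}_{\beta\; d_1\cdots d_\beta}. \tag{2.14}$$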
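The projection properties behind Eqs. (2.13) and (2.14) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a flat Minkowski metric with signature (−, +, +, +), and the helper names `four_velocity`, `projector`, and `perp` are hypothetical, not notation from the paper.

```python
import numpy as np

# Assumption: flat Minkowski metric, signature (-, +, +, +).
g = np.diag([-1.0, 1.0, 1.0, 1.0])

def four_velocity(v):
    """Unit timelike vector u^a built from an illustrative 3-velocity v, |v| < 1."""
    v = np.asarray(v, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    return gamma * np.concatenate(([1.0], v))

def projector(u):
    """Mixed components I^a_b = delta^a_b + u^a u_b, cf. Eq. (2.13b)."""
    u_low = g @ u                      # u_b = g_{ab} u^b
    return np.eye(4) + np.outer(u, u_low)

def perp(M, u):
    """Spacelike part of a type-(2,0) tensor M^{ab}, cf. Eq. (2.14)."""
    I = projector(u)
    return np.einsum("ac,bd,cd->ab", I, I, M)

u = four_velocity([0.3, -0.2, 0.5])
I = projector(u)
M_perp = perp(np.arange(16.0).reshape(4, 4), u)
```

With these definitions, `I @ u` vanishes, `I @ I` reproduces `I`, and contracting `M_perp` with `g @ u` on either index gives zero, which is the sense in which the projected tensor is orthogonal to u.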
A tensor $M^\alpha_\beta \in T^\alpha_\beta(x)$ will be referred to as spacelike whenever it is orthogonal to u, i.e. whenever $M^\alpha_\beta = (M^\alpha_\beta)^\perp$. For $M^0 \in T^0_0(x)$, we put $(M^0)^\perp := M^0$ and thus
say that $M^0$ is spacelike. This definition of $M^0$ as a spacelike quantity is valid for all choices of u. Using Eq. (2.13a), we introduce the projection tensor I and its tensorial powers $I^\alpha$:

$$I := I^{ab}\, e_a \otimes e_b, \tag{2.15a}$$

$$I^\alpha := \otimes^\alpha I \qquad (I^0 := 1,\ I^1 := I). \tag{2.15b}$$

If α ≥ 2 and $M^\alpha \in T^\alpha_0(x)$ is a symmetric spacelike tensor, the trace-free part of $M^\alpha$ will be defined by [26]

$$\langle M^\alpha \rangle := \sum_{\beta=0}^{[\alpha/2]} (-1)^\beta\, k(\alpha - 1 - \beta, \beta)\, l(\alpha, \beta)\, [(\mathrm{Tr}^\beta M^\alpha) \vee I^\beta], \tag{2.16}$$

where [α/2] means the largest integer less than or equal to α/2 and

$$k(\alpha, \beta) := \frac{(2\alpha + 1)!!}{(2\alpha + 2\beta + 1)!!}, \qquad l(\alpha, \beta) := \frac{\alpha!}{2^\beta \beta! (\alpha - 2\beta)!}, \tag{2.17a}$$

$$(2\alpha + 1)!! := 1 \cdot 3 \cdot 5 \cdots (2\alpha + 1). \tag{2.17b}$$

With Eqs. (2.16) and (2.17), elementary inspection shows that $\mathrm{Tr}\langle M^\alpha \rangle = 0$, where α ≥ 2 and $0 \in T^{\alpha-2}_0(x)$. For scalars $M^0 \in T^0_0(x)$ and spacelike vectors $M^1 \in T^1_0(x)$, we also define $\langle M^0 \rangle$ and $\langle M^1 \rangle$:

$$\langle M^0 \rangle := M^0, \qquad \langle M^1 \rangle := M^1. \tag{2.18}$$
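As a numerical illustration of Eqs. (2.16) and (2.17), the following sketch builds the trace-free part of a symmetric tensor and confirms that its trace vanishes. It is a minimal sketch under an assumption not made in the text: the computation is done in the rest frame of u, where spacelike tensors carry only the indices 1, 2, 3 and I reduces to the Kronecker delta. The function names are illustrative.

```python
import itertools
import math
import numpy as np

def double_factorial_odd(n):
    """(2n + 1)!! = 1 * 3 * 5 * ... * (2n + 1), cf. Eq. (2.17b)."""
    return math.prod(range(1, 2 * n + 2, 2))

def k(a, b):
    """Coefficient k(alpha, beta) of Eq. (2.17a)."""
    return double_factorial_odd(a) / double_factorial_odd(a + b)

def l(a, b):
    """Coefficient l(alpha, beta) of Eq. (2.17a)."""
    return math.factorial(a) / (2**b * math.factorial(b) * math.factorial(a - 2 * b))

def symmetrize(M):
    """Symmetrizer S of Eq. (2.3): average over all index permutations."""
    if M.ndim < 2:
        return M
    perms = list(itertools.permutations(range(M.ndim)))
    return sum(np.transpose(M, p) for p in perms) / len(perms)

def trace(M):
    """Tr of a symmetric tensor; in the assumed rest frame the metric is delta_ij."""
    return np.trace(M, axis1=0, axis2=1)

def trace_free(M):
    """Trace-free part <M> of a symmetric spacelike tensor, cf. Eq. (2.16)."""
    alpha = M.ndim
    if alpha < 2:
        return M                                   # Eq. (2.18)
    delta = np.eye(3)
    result = np.zeros_like(M)
    T = M                                          # holds Tr^beta M
    for beta in range(alpha // 2 + 1):
        term = np.asarray(T)
        for _ in range(beta):                      # tensor product with I^beta
            term = term[..., None, None] * delta
        coeff = (-1) ** beta * k(alpha - 1 - beta, beta) * l(alpha, beta)
        result = result + coeff * symmetrize(term)
        if beta < alpha // 2:
            T = trace(T)
    return result

rng = np.random.default_rng(0)
M3 = symmetrize(rng.normal(size=(3, 3, 3)))
A3 = trace_free(M3)          # trace(A3) is zero up to rounding
```

For α = 2 the sum reduces to the familiar M − (1/3)(Tr M) I, since k(0, 1) = 1/3 and l(2, 1) = 1.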
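Given the spacelike tensors $M^\alpha \in T^\alpha_0(x)$, the following notation will be useful [see Eqs. (2.4), (2.9), and (2.16)]:

$$M^\alpha \star M^\beta := \langle M^\alpha \vee M^\beta \rangle, \tag{2.19a}$$

$$M^\alpha \sqcap M^\beta := \langle M^\alpha \sqcup M^\beta \rangle. \tag{2.19b}$$

If these spacelike tensors are such that $S M^\alpha = M^\alpha$, then the further useful tensors are defined by

$$\hat M^{\alpha[\beta]} := \mathrm{Tr}^\beta M^\alpha, \qquad \hat M^{\alpha|\beta} := \mathrm{Tr}^\beta M^{\alpha+2\beta}, \tag{2.20a}$$

$$M^{\alpha[\beta]} := \langle \hat M^{\alpha[\beta]} \rangle, \qquad M^{\alpha|\beta} := \langle \hat M^{\alpha|\beta} \rangle. \tag{2.20b}$$

Clearly, using these definitions, we have $\hat M^{\alpha|\beta} = \hat M^{\alpha+2\beta[\beta]}$ and $M^{\alpha|\beta} = M^{\alpha+2\beta[\beta]}$. Here, of course, care has to be taken with the interpretation of $M^{\alpha[\beta]}$ for β = [α/2]:

$$M^{2\alpha[\alpha]} := \langle \mathrm{Tr}^\alpha M^{2\alpha} \rangle := \mathrm{Tr}^\alpha M^{2\alpha}, \tag{2.21a}$$

$$M^{2\alpha+1[\alpha]} := \langle \mathrm{Tr}^\alpha M^{2\alpha+1} \rangle := \mathrm{Tr}^\alpha M^{2\alpha+1}. \tag{2.21b}$$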
Let $M^\alpha \in T^\alpha_0(x)$ and $(M^\alpha)^\perp = M^\alpha = SM^\alpha$. Then there exists the relation [26]

$$M^\alpha = \sum_{\beta=0}^{[\alpha/2]} k(\alpha - 2\beta, \beta)\, l(\alpha, \beta)\, (M^{\alpha[\beta]} \vee I^\beta), \qquad \alpha \ge 0, \tag{2.22}$$

expressing $M^\alpha$ in terms of $M^{\alpha[\beta]}$.

2.2. Differential operations on tensor fields

Let ∇ be the covariant derivative determined by the metric g. This derivative generates a tensor field of type (α, 1) from a tensor field of type (α, 0):

$$\nabla M^\alpha := (\nabla_b M^{\alpha a_1\cdots a_\alpha})\, e_{a_1\cdots a_\alpha} \otimes e^b. \tag{2.23}$$

Note that Eq. (2.23) is valid for the general dual bases $\{e_a\}$ and $\{e^a\}$. For each $\nabla M^\alpha$, we can introduce an associated tensor $(\nabla M^\alpha)'$ of type (α + 1, 0) by

$$(\nabla M^\alpha)' := (g^{ab} \nabla_b M^{\alpha a_1\cdots a_\alpha})\, e_{a_1\cdots a_\alpha} \otimes e_a. \tag{2.24}$$

Consider also the tensorial objects $\nabla \cdot M^\alpha$ and $\nabla \vee M^\alpha$ defined by

$$\nabla \cdot M^\alpha := \mathrm{Tr}_{(\alpha,\alpha+1)} (\nabla M^\alpha)' \qquad (\alpha \ge 1) \tag{2.25a}$$

and

$$\nabla \vee M^\alpha := S(\nabla M^\alpha)'. \tag{2.25b}$$

According to these definitions, $\nabla \cdot M^\alpha$ is a tensor field of type (α − 1, 0) and $\nabla \vee M^\alpha$ is a symmetric tensor field of type (α + 1, 0). Given the four-velocity vector field u, the time derivative of any tensor $M^\alpha$ along the fundamental flow lines is

$$\dot M^\alpha := \mathrm{Tr}_{(1,\alpha+1)} [u \otimes (\nabla M^\alpha)']. \tag{2.26}$$

If $\dot u$ does not vanish, then $(\dot M^\alpha)^\perp$ is, in general, different from $[(M^\alpha)^\perp]^{\cdot}$ when α > 0. In this paper, we denote $\dot u$ by a [see, e.g. Eq. (3.14)]. In view of these definitions [see also Eq. (2.14)], we are now prepared to construct the following additional operations on $M^\alpha$:

$$\nabla^\perp M^\alpha := (\nabla M^\alpha)^\perp, \tag{2.27a}$$

$$(\nabla^\perp M^\alpha)' := [(\nabla M^\alpha)']^\perp, \tag{2.27b}$$

$$\nabla^\perp \cdot M^\alpha := \mathrm{Tr}_{(\alpha,\alpha+1)} (\nabla^\perp M^\alpha)', \tag{2.27c}$$

$$\nabla^\perp \vee M^\alpha := S(\nabla^\perp M^\alpha)', \tag{2.27d}$$

$$\nabla^\perp \star M^\alpha := \langle \nabla^\perp \vee M^\alpha \rangle. \tag{2.27e}$$

In the sequence (2.27), it appears that the operation (2.24) commutes with the taking of the “perp” operation. Using Eqs. (2.25) and (2.27), it is a straightforward matter to verify that

$$\nabla^\perp \vee M^\alpha = (\nabla \vee M^\alpha)^\perp \tag{2.28a}$$
and

$$\nabla^\perp \star M^\alpha = \langle (\nabla \vee M^\alpha)^\perp \rangle. \tag{2.28b}$$

However, $\nabla^\perp \cdot M^\alpha$ cannot be set equal to $(\nabla \cdot M^\alpha)^\perp$, except in the case when $\mathrm{Tr}_{(1,\alpha+1)}(u \otimes M^\alpha)$ and a vanish.

3. Structural Features of a Theory with Covariant Moments

3.1. The Boltzmann equation in a curved space-time

We restrict ourselves to the situation in which the uncharged (i.e. neutral) particles of the gas all have identical proper mass m so that $m^2 = -\bar p \circ \bar p$, where $\bar p$ is the four-momentum of the particle:

$$\bar p = p^a e_a = p^0 e_0 + p^i e_i. \tag{3.1}$$

We mostly assume m ≠ 0, leaving the case m = 0 to be treated as a limit. Let $\tilde X$ be the one-particle phase-space for particles of arbitrary proper masses. Then m given by $(-\bar p \circ \bar p)^{1/2}$ is a scalar function on $\tilde X$ which is constant on each phase orbit. The hypersurface of $\tilde X$ defined by m = const is generated by all those phase orbits which belong to the assigned mass value; we use the symbol $\tilde X_m$ for it. Clearly, $\tilde X_m$ is the phase space for particles of proper mass m; its dimension is 7. A more detailed discussion of these issues is presented in [1, pp. 26–28]. If u is a future directed timelike vector field (u ∘ u = −1), then the future directed particle four-momenta $\bar p$ satisfy the inequality $u \circ \bar p < 0$.

In Eq. (3.1), the basis $\{e_a\}$ is a general tetrad basis. Clearly, such a basis can be obtained from a coordinate basis $\{\partial_A\}$ by determining the functions $e_a{}^A$ which are the components of $e_a$ with respect to $\{\partial_A\}$:

$$e_a = e_a{}^A \partial_A \qquad (\partial_A := \partial/\partial x^A). \tag{3.2}$$

For the dual basis $\{e^a\}$, a consequence of this is that

$$e^a = e^a{}_A\, dx^A, \tag{3.3}$$

where

$$e^a{}_A e_b{}^A = \delta^a{}_b, \qquad e_a{}^A e^a{}_B = \delta^A{}_B. \tag{3.4}$$

An obvious way of describing the particles in a given region of space-time is to specify a one-particle distribution function $f(x^A, p^i)$ which represents the number of particles at event x with four-momentum $\bar p$. For a single particle, the phase space is the seven-dimensional tangent bundle on X, coordinatized by $(x^A, p^i)$, with basic metric g:

$$g = g_{ab}\, e^a \otimes e^b = g_{AB}\, dx^A \otimes dx^B. \tag{3.5}$$

In a curved space-time, the distribution function f satisfies the following Boltzmann equation [8]:

$$p^a \partial_a f - \Gamma^i{}_{ab}\, p^a p^b \frac{\partial f}{\partial p^i} = J(f), \tag{3.6}$$
where J(f) is the collision term. As to the meaning of $\partial_a$ and $\Gamma^a{}_{bc}$, these objects are defined by

$$\partial_a := e_a{}^A \partial_A \qquad (\partial_a = e_a) \tag{3.7a}$$

and

$$\Gamma^a{}_{bc} := e^a{}_A e_c{}^B (\partial_B e_b{}^A + \Gamma^A{}_{DB}\, e_b{}^D) = e^a{}_A e_c{}^B \nabla_B e_b{}^A, \tag{3.7b}$$

where $\Gamma^A{}_{BC}$ are the Christoffel symbols of $g_{AB}$.

In kinetic theory, a knowledge of the distribution function f enables us to determine the entropy four-vector. For each x ∈ X, this entropy four-vector is characterized by

$$\Psi := -y \int \bar p\, H(\bar f)\, \pi_m, \tag{3.8}$$

where

$$\bar f := y^{-1} f, \qquad y := w/(2\pi)^3, \tag{3.9a}$$

$$H(\bar f) := \bar f (|\eta| - 1 + \ln \bar f) - \eta (1 + \eta \bar f) \ln(1 + \eta \bar f), \tag{3.9b}$$

$$\pi_m := -\frac{\sqrt{|g|}}{p_0}\, dp^1 \wedge dp^2 \wedge dp^3, \qquad p_0 := g_{0a} p^a, \tag{3.9c}$$

$$|g| := -\det(g_{ab}). \tag{3.9d}$$

Here the value of η is made to be +1 for bosons, −1 for fermions, and 0 for classical particles. Further, explaining Eqs. (3.9a), the quantity w equals $2\bar s + 1$ for particles with spin $\bar s$ and proper mass m different from zero, w = 2 for particles with $\bar s > 0$ and m = 0, and w = 1 for particles with $\bar s = 0$ and m = 0. Given a four-velocity vector field u, the specification of Ψ is completely equivalent to specifying

$$s := -y \int (-\bar p \circ u)\, H(\bar f)\, \pi_m \tag{3.10a}$$

and

$$\Phi := -y \int p\, H(\bar f)\, \pi_m, \tag{3.10b}$$

where

$$p := \bar p + (\bar p \circ u) u \qquad (u \circ p = 0). \tag{3.11}$$

For essentially obvious reasons, it will be convenient to call s the entropy per unit volume and Φ the entropy flux vector. Now, we come to the entropy law. Multiplying Eq. (3.6) by $-dH/d\bar f = |\eta| \ln(1 + \eta \bar f) - \ln \bar f$ and integrating over momentum space yields

$$\nabla \cdot \Psi = P \qquad (P \ge 0), \tag{3.12}$$
where

$$P := -\int (dH/d\bar f)\, J(f)\, \pi_m. \tag{3.13}$$

We refer to P as the entropy production. Because of the well-known properties of J(f), this entropy production is nonnegative. In order to cast Eq. (3.12) into a form consistent with the decomposition Ψ = su + Φ, we first observe that $(\nabla u)' = (g^{bc} \nabla_c u^a)\, e_a \otimes e_b$ can be written as

$$(\nabla u)' = \sigma + \omega + \frac{1}{3} \theta I - a \otimes u, \tag{3.14}$$

where σ is the shear tensor (Sσ = σ, Tr σ = 0, u ∘ σ = 0), ω is the vorticity tensor (Sω = 0, u ∘ ω = 0), θ is the expansion, and a is the acceleration (u ∘ a = 0). This physical interpretation of (σ, ω, θ, a) primarily applies to the case where u is defined as in the Eckart and Landau–Lifshitz theories [24, 25]. Using the above formula for $(\nabla u)'$ then leads us to convert Eq. (3.12) to

$$\dot s + \theta s + \nabla^\perp \cdot \Phi + a \circ \Phi = P. \tag{3.15}$$

Equation (3.15) is the entropy law in a form convenient for later discussion (see Sec. 5).

At this stage, we need to introduce the so-called covariant moments of the distribution function f:

$$f^\alpha := \int (\otimes^\alpha \bar p)\, f\, \pi_m, \qquad \alpha \ge 0. \tag{3.16}$$

By means of Eq. (3.6), we obtain at once

$$\nabla \cdot f^{\alpha+1} = P^\alpha, \tag{3.17}$$

where

$$P^\alpha := \int (\otimes^\alpha \bar p)\, J(f)\, \pi_m. \tag{3.18}$$

For $P^0$ and $P^1$, we must have

$$P^0 = 0, \qquad P^1 = 0. \tag{3.19}$$

A full derivation of these simple equations may be found in [1]. Notice that since $\bar p \circ \bar p = -m^2$, the following identities are satisfied if α ≥ 2:

$$\mathrm{Tr}\, f^\alpha = -m^2 f^{\alpha-2}, \qquad \mathrm{Tr}\, P^\alpha = -m^2 P^{\alpha-2}, \tag{3.20a}$$

$$\mathrm{Tr}(\nabla \cdot f^{\alpha+1} - P^\alpha) = -m^2 (\nabla \cdot f^{\alpha-1} - P^{\alpha-2}). \tag{3.20b}$$

Consequently, the trace of Eq. (3.17) reduces to $\nabla \cdot f^{\alpha-1} = P^{\alpha-2}$ for α ≥ 2. The basic covariant moments are $f^1$ and $f^2$; these moments represent the particle number current and the energy-momentum tensor, respectively. However, from the viewpoint of an observer with four-velocity u, the most important fields in many
applications are the number density n, the energy per unit volume ε, the pressure p, the number flux vector N, the heat flux q, and the stress deviator Π:

$$n := -u \circ f^1, \qquad \varepsilon := (u \otimes u) \circ f^2, \tag{3.21a}$$

$$p := \frac{1}{3} \mathrm{Tr}\,(f^2)^\perp, \qquad N := (f^1)^\perp, \tag{3.21b}$$

$$q := -(u \circ f^2)^\perp, \qquad \Pi := \langle (f^2)^\perp \rangle. \tag{3.21c}$$
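To see how definitions (3.21) act in practice, the sketch below extracts the six fields from given components of f¹ and f². It is illustrative only: it assumes a flat Minkowski metric, treats f¹ and f² as plain arrays of contravariant components, and the function name `decompose` is hypothetical.

```python
import numpy as np

# Assumption: flat Minkowski metric, signature (-, +, +, +).
g = np.diag([-1.0, 1.0, 1.0, 1.0])

def decompose(f1, f2, u):
    """Return (n, eps, p, N, q, Pi) from f^1 (shape (4,)) and f^2 (shape (4, 4))."""
    u_low = g @ u                                   # u_a
    I_mixed = np.eye(4) + np.outer(u, u_low)        # I^a_b, Eq. (2.13b)
    I_up = np.linalg.inv(g) + np.outer(u, u)        # I^{ab}, Eq. (2.13a)
    n = -u_low @ f1                                 # number density
    eps = u_low @ f2 @ u_low                        # energy per unit volume
    N = I_mixed @ f1                                # number flux vector
    f2_perp = I_mixed @ f2 @ I_mixed.T              # spacelike part of f^2
    p = np.einsum("ab,ab->", g, f2_perp) / 3.0      # pressure
    q = -I_mixed @ (u_low @ f2)                     # heat flux
    Pi = f2_perp - p * I_up                         # trace-free part of f2_perp
    return n, eps, p, N, q, Pi

# Usage: a perfect fluid at rest with number density 2, energy density 5, pressure 1.5.
u = np.array([1.0, 0.0, 0.0, 0.0])
f1 = 2.0 * u
f2 = 5.0 * np.outer(u, u) + 1.5 * (np.linalg.inv(g) + np.outer(u, u))
n, eps, p, N, q, Pi = decompose(f1, f2, u)
```

For the perfect-fluid input the number flux, heat flux, and stress deviator all vanish, as expected when f² contains only energy density and isotropic pressure.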
Definitions (3.21) are valid for both m ≠ 0 and m = 0.

3.2. Maximization of entropy

The set consisting of the first k + 1 equations (3.17) is not a determined system since there appear more variables than equations. It can be made so, however, by using the distribution function f that provides a maximum of s,

$$s = s(f) := -y \int (-\bar p \circ u)\, H(\bar f)\, \pi_m, \tag{3.22}$$

under the constraints of fixed values of

$$f_u^\alpha := -u \circ f^{\alpha+1}, \qquad 0 \le \alpha \le k. \tag{3.23}$$

With the Lagrange multipliers $\Lambda^\alpha$, 0 ≤ α ≤ k, we introduce the following functional of f:

$$Y(f) := s(f) + \sum_{\alpha=0}^{k} \Lambda^\alpha \circ \left[ f_u^\alpha - \int (-\bar p \circ u)(\otimes^\alpha \bar p)\, f\, \pi_m \right]. \tag{3.24}$$

The variation of Y(f) is given by

$$\delta Y(f) = -\int (-\bar p \circ u) \left[ \ln \frac{\bar f}{1 + \eta \bar f} + \kappa \right] \delta f\, \pi_m, \tag{3.25}$$

where

$$\kappa := \sum_{\alpha=0}^{k} \Lambda^\alpha \circ (\otimes^\alpha \bar p). \tag{3.26}$$

The form of f, the one which maximizes s, has to be deduced from the condition δY(f) = 0. Denoting this form by F or F(κ), we immediately find that

$$F = F(\kappa) := \frac{y}{e^\kappa - \eta}. \tag{3.27}$$

Because of the dependence of κ on

$$\Lambda := (\Lambda^0, \Lambda^1, \ldots, \Lambda^k) \tag{3.28}$$

and $\bar p$, the natural variables of F are Λ and $\bar p$. Thus we obtain

$$F = F(\Lambda, \bar p), \tag{3.29}$$
00122
Evolution of Central Moments
481
which means that F depends on x ∈ X through Λ: x 7→ Λ(x) := (Λ0 (x), Λ1 (x), . . . , Λk (x)) .
(3.30)
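The step from the stationarity condition δY(f) = 0 to Eq. (3.27) can be verified numerically for the three statistics. The sketch below works with the reduced function F/y and a scalar value of κ; the function names are illustrative, and nothing here goes beyond Eqs. (3.9b), (3.25), and (3.27).

```python
import numpy as np

def H(fbar, eta):
    """Function H of Eq. (3.9b); eta = +1 (bosons), -1 (fermions), 0 (classical)."""
    value = fbar * (abs(eta) - 1.0 + np.log(fbar))
    if eta != 0:
        value = value - eta * (1.0 + eta * fbar) * np.log(1.0 + eta * fbar)
    return value

def minus_dH(fbar, eta):
    """Closed form -dH/dfbar = |eta| ln(1 + eta fbar) - ln fbar."""
    return abs(eta) * np.log1p(eta * fbar) - np.log(fbar)

def F_bar(kappa, eta):
    """Reduced maximizer F / y = 1 / (exp(kappa) - eta), cf. Eq. (3.27)."""
    return 1.0 / (np.exp(kappa) - eta)

# Stationarity: the bracket in Eq. (3.25) vanishes, i.e. -dH/dfbar equals kappa at F / y.
kappa = np.linspace(0.5, 3.0, 6)
residuals = {eta: minus_dH(F_bar(kappa, eta), eta) - kappa for eta in (1, -1, 0)}
```

Each entry of `residuals` is zero up to rounding, which is the statement that F(κ) makes the integrand of δY(f) vanish for arbitrary δf. For bosons the check assumes κ > 0 so that F stays positive.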
The Lagrange multipliers $\Lambda^\alpha$, 0 ≤ α ≤ k, appearing in the definition of Λ are tensors of type α, i.e. $\Lambda^\alpha(x) \in T^\alpha_0(x)$. For 2 ≤ α ≤ k, these tensors may be assumed without any loss of generality to be symmetric and traceless, since $S(\otimes^\alpha \bar p) = \otimes^\alpha \bar p$ and $\bar p \circ \bar p = -m^2$ [see Eq. (3.26)]. Consequently, we have

$$S\Lambda^\alpha = \Lambda^\alpha, \qquad 0 \le \alpha \le k, \tag{3.31a}$$

$$\mathrm{Tr}\, \Lambda^\alpha = 0, \qquad 2 \le \alpha \le k. \tag{3.31b}$$

Here, an important question arises of how to fit Λ to the actual f. Our construction of F implies that

$$f_u^\alpha = -u \circ F^{\alpha+1}(\Lambda) \qquad (0 \le \alpha \le k), \tag{3.32}$$

where

$$F^\alpha(\Lambda) := \int (\otimes^\alpha \bar p)\, F(\Lambda, \bar p)\, \pi_m \qquad (0 \le \alpha < \infty). \tag{3.33}$$

Mathematically, after specifying the metric g and the four-velocity vector field u, conditions (3.32) are a set of equations for the determination of

$$f_u := (f_u^0, f_u^1, \ldots, f_u^k) \tag{3.34}$$

as a function of Λ. The inverse problem to that just mentioned is that of recovering the relation $\Lambda = \Lambda(f_u)$ if the relation $f_u = f_u(\Lambda)$ is given. For m ≠ 0 and k ≥ 1, it is impossible to invert Eqs. (3.32) analytically so as to obtain $\Lambda = \Lambda(f_u)$; this can only be done for m = 0 and k = 1. However, if we desire to consider $f_u^\alpha$ (0 ≤ α ≤ k) as fundamental gas-state variables in place of $\Lambda^\alpha$ (0 ≤ α ≤ k), we must postulate that Eqs. (3.32) permit us to calculate Λ as a function of $f_u$, at least in principle.

A formal aspect of the closure procedure, which is discussed most conveniently in terms of the Lagrange multipliers, may now be stated very neatly. Define $P_F^\alpha(\Lambda)$, 0 ≤ α < ∞, by

$$P_F^\alpha(\Lambda) := \int (\otimes^\alpha \bar p)\, J(F(\Lambda, \bar p))\, \pi_m. \tag{3.35}$$

Next, focus upon the first k + 1 equations (3.17). Finally, in these equations, replace $f^{\alpha+1}$ by $F^{\alpha+1}(\Lambda)$ and $P^\alpha$ by $P_F^\alpha(\Lambda)$. Accepting this closure procedure, we conclude that the system of equations for Λ is given by

$$\nabla \cdot F^{\alpha+1}(\Lambda) = P_F^\alpha(\Lambda) \qquad (0 \le \alpha \le k). \tag{3.36}$$

On the understanding that k ≥ 1, there are only (k + 1)(k + 2)(2k + 3)/6 independent scalar variables in Λ and the total number of independent scalar equations in the system (3.36) is also (k + 1)(k + 2)(2k + 3)/6 [27]. Thus this system provides a correct number of equations to determine Λ.
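The count (k + 1)(k + 2)(2k + 3)/6 can be reproduced by elementary combinatorics. The sketch below assumes, as in Eqs. (3.31), that each Λ^α is a symmetric tensor in four dimensions and that the trace condition for α ≥ 2 removes as many components as a symmetric tensor of type α − 2 has; the function names are illustrative.

```python
from math import comb

def symmetric_components(alpha, dim=4):
    """Independent components of a symmetric tensor of type alpha in `dim` dimensions."""
    return comb(alpha + dim - 1, dim - 1)

def symmetric_traceless_components(alpha, dim=4):
    """Same count after imposing Tr = 0, which is required only for alpha >= 2."""
    if alpha < 2:
        return symmetric_components(alpha, dim)
    return symmetric_components(alpha, dim) - symmetric_components(alpha - 2, dim)

def multiplier_count(k):
    """Total number of independent scalars in Lambda = (Lambda^0, ..., Lambda^k)."""
    return sum(symmetric_traceless_components(a) for a in range(k + 1))

def closed_form(k):
    """The expression (k + 1)(k + 2)(2k + 3)/6 quoted in the text."""
    return (k + 1) * (k + 2) * (2 * k + 3) // 6

counts = [(k, multiplier_count(k), closed_form(k)) for k in range(1, 6)]
```

In four dimensions each symmetric traceless tensor of type α contributes (α + 1)² components, so the total is a sum of squares; for k = 1 it equals 5, the number of fields in the Euler system.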
June 4, 2002 11:36 WSPC/148-RMP
482
00122
Z. Banach & W. Larecki
As noted already by Müller and Ruggeri in [13, pp. 211 and 212], Eqs. (3.36) have many desirable properties that follow from their so-called potential form, many of which are lacking in traditional closures. First of all, Eqs. (3.36) are consistent with an additional balance law interpreted as the equation of balance of entropy. Another important property of Eqs. (3.36) is that if g is the metric of a curved or flat spacetime and u is defined as in the theory of Arnowitt et al. [21] (see Sec. 6.1), then these equations can be reduced to a symmetric hyperbolic system. Such a system precludes action at a distance and ensures finite propagation speeds. Also, symmetric hyperbolicity is an attractive feature mathematically, because it ensures well-posedness of local Cauchy problems, i.e. existence, uniqueness, and continuous dependence of the solutions on the data. For k = 1, the connection of Eqs. (3.36) with a system of Eulerian equations is obvious. For k > 1, the internal consistency and physical reasonableness of the formalism are less obvious. It would seem that we have derived a hierarchy of closed systems of field equations. However, it is important to understand exactly what freedom is available in the choice of k, i.e. to state precisely what conditions on k are necessary and sufficient for the relation $f_u = f_u(\Lambda)$ to be mathematically well defined. When a gas departs only slightly from local equilibrium and k > 1, there is considerable awkwardness in attempting to express $f_u$ as a function of $\Lambda$. The main reason for this awkwardness will be explained in the text below.

3.3. Requirements placed on the Lagrange multipliers

In order to determine the relation $f_u = f_u(\Lambda)$, the object $\Lambda$ must be such as to ensure that the function $F^\alpha(\Lambda)$ is calculable from Eq. (3.33) for all values of $\alpha$. A set of objects of this kind will be denoted by M.
It is natural to think of M as being a set in a real vector space E of dimension (k + 1)(k + 2)(2k + 3)/6, endowed with its canonical structure of differentiable manifold [28]. Let $M_0$ be the interior of M, i.e. the largest open subset of M. Clearly, $M_0$ is an open submanifold of E and the dimension of $M_0$ is equal to the dimension of E. Now, our main problem is to know whether or not the equilibrium states, which are elements of M, are also elements of $M_0$. If they are not, then the relation $f_u = f_u(\Lambda)$ is ill-defined in a neighborhood of these equilibrium states. As to the definition of equilibrium states, see our discussion directly before and after Eq. (3.48). First, we introduce the following notation:
$$ \Lambda_\beta^{\alpha-\beta} := [\Lambda^\alpha \circ (\otimes^\beta u)]_\perp , \quad \alpha \ge \beta \ge 0 . \tag{3.37} $$
Since $\mathrm{Tr}\,\Lambda^\alpha = 0$ for $\alpha \ge 2$ [see Eq. (3.31b)], it can be demonstrated that
$$ \Lambda_{2\beta}^{\alpha-2\beta} = \mathrm{Tr}^\beta \Lambda_0^\alpha , \quad \Lambda_{2\beta+1}^{\alpha-2\beta-1} = \mathrm{Tr}^\beta \Lambda_1^{\alpha-1} . \tag{3.38} $$
In accordance with the above formulas, a knowledge of $\Lambda_0^\alpha$ and $\Lambda_1^{\alpha-1}$ permits us to calculate $\Lambda_\beta^{\alpha-\beta}$. Mathematically, a more convenient way of evaluating $\Lambda_\beta^{\alpha-\beta}$ is based on the identities referred to in Sec. 2.1. Indeed, by using Eq. (2.22), $\Lambda_0^\alpha$ and $\Lambda_1^{\alpha-1}$ (and hence $\Lambda_\beta^{\alpha-\beta}$) are expressible in terms of
$$ \Lambda_0^{\alpha[\beta]} := \langle \mathrm{Tr}^\beta \Lambda_0^\alpha \rangle \tag{3.39a} $$
and
$$ \Lambda_1^{\alpha-1[\beta]} := \langle \mathrm{Tr}^\beta \Lambda_1^{\alpha-1} \rangle . \tag{3.39b} $$
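Therefore, given the definition (3.26) of $\kappa$, our next task is to relate $\kappa$ to $\Lambda_0^{\alpha[\beta]}$ and $\Lambda_1^{\alpha-1[\beta]}$. The essential step in doing this is to observe that
$$ \otimes^\alpha \bar p = \sum_{\beta=0}^{\alpha} \binom{\alpha}{\beta} (-\bar p \circ u)^\beta\, [(\otimes^\beta u) \vee (\otimes^{\alpha-\beta} p)] , \tag{3.40} $$
where
$$ \binom{\alpha}{\beta} := \frac{\alpha!}{\beta!\,(\alpha-\beta)!} . \tag{3.41} $$
For computational reasons, we introduce the additional quantities by setting
$$ \mathbf{j} := p/(-\bar p \circ u) , \quad j := (\mathbf{j} \circ \mathbf{j})^{1/2} , \tag{3.42a} $$
$$ A^\alpha := \otimes^\alpha \mathbf{j} , \quad C^\alpha := \langle A^\alpha \rangle . \tag{3.42b} $$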
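From Eqs. (2.22) and (3.40), together with the definitions (3.37) and the identities (3.38), it will be possible to obtain a desired expression for $\kappa$. After some tedious algebra, by invoking the formulas (2.17a) and (3.26), we can prove that
$$
\begin{aligned}
\kappa ={}& \sum_{\alpha=0}^{k} \sum_{\beta=0}^{[\alpha/2]} (-\bar p \circ u)^\alpha K_\beta^\alpha \left( \Lambda_0^{\alpha[\beta]} \circ C^{\alpha-2\beta} \right) \\
&+ \sum_{\alpha=1}^{k} \sum_{\beta=0}^{[(\alpha-1)/2]} (-\bar p \circ u)^\alpha L_\beta^\alpha \left( \Lambda_1^{\alpha-1[\beta]} \circ C^{\alpha-2\beta-1} \right) ,
\end{aligned} \tag{3.43}
$$
where
$$ K_\beta^\alpha := \sum_{\nu=0}^{\beta} \frac{\alpha!\,(2\alpha-4\beta+1)!!\; j^{2\nu}}{2^\nu \nu!\,(\alpha-2\beta)!\,(2\beta-2\nu)!\,(2\alpha-4\beta+2\nu+1)!!} , \tag{3.44a} $$
$$ L_\beta^\alpha := \sum_{\nu=0}^{\beta} \frac{\alpha!\,(2\alpha-4\beta-1)!!\; j^{2\nu}}{2^\nu \nu!\,(\alpha-2\beta-1)!\,(2\beta+1-2\nu)!\,(2\alpha-4\beta+2\nu-1)!!} . \tag{3.44b} $$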
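The coefficients (3.44) are polynomials in $j^2$ and can be tabulated exactly. The sketch below is only an illustration: it assumes the reading of Eqs. (3.44a) and (3.44b) with the factor $2^\nu \nu!$ in the denominator, and the helper names are hypothetical. The comparison with the binomial sum in the last function is an observation about the formula as written here, not a statement taken from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def double_factorial(n: int) -> int:
    """n!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def K(alpha: int, beta: int, j2: Fraction) -> Fraction:
    """K^alpha_beta of Eq. (3.44a); j2 stands for j**2, requires alpha >= 2*beta."""
    total = Fraction(0)
    for nu in range(beta + 1):
        num = factorial(alpha) * double_factorial(2 * alpha - 4 * beta + 1)
        den = (2 ** nu * factorial(nu) * factorial(alpha - 2 * beta)
               * factorial(2 * beta - 2 * nu)
               * double_factorial(2 * alpha - 4 * beta + 2 * nu + 1))
        total += Fraction(num, den) * j2 ** nu
    return total


def L(alpha: int, beta: int, j2: Fraction) -> Fraction:
    """L^alpha_beta of Eq. (3.44b); requires alpha >= 2*beta + 1."""
    total = Fraction(0)
    for nu in range(beta + 1):
        num = factorial(alpha) * double_factorial(2 * alpha - 4 * beta - 1)
        den = (2 ** nu * factorial(nu) * factorial(alpha - 2 * beta - 1)
               * factorial(2 * beta + 1 - 2 * nu)
               * double_factorial(2 * alpha - 4 * beta + 2 * nu - 1))
        total += Fraction(num, den) * j2 ** nu
    return total


def binomial_form(k: int, j2: Fraction) -> Fraction:
    """sum_nu C(k, 2 nu) j^(2 nu) / (2 nu + 1), compared with K^k_{k/2} for even k."""
    return sum(Fraction(comb(k, 2 * nu), 2 * nu + 1) * j2 ** nu
               for nu in range(k // 2 + 1))
```

For example, `K(2, 1, Fraction(1))` evaluates $K_1^2 = 1 + j^2/3$ at $j = 1$, the value approached for large $p \circ p$, and `K(alpha, 0, j2)` is identically 1.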
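With the abbreviation
$$ Q_\alpha := \sum_{\beta=0}^{[\alpha/2]} K_\beta^\alpha \left( \Lambda_0^{\alpha[\beta]} \circ C^{\alpha-2\beta} \right) + \sum_{\beta=0}^{[(\alpha-1)/2]} L_\beta^\alpha \left( \Lambda_1^{\alpha-1[\beta]} \circ C^{\alpha-2\beta-1} \right) , \tag{3.45} $$
an alternative expression for $\kappa$ is
$$ \kappa = \Lambda^0 + \sum_{\alpha=1}^{k} Q_\alpha (-\bar p \circ u)^\alpha , \quad k \ge 1 . \tag{3.46} $$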
This form of $\kappa$ is just the one required for the discussion of our main problem. The object $\Lambda$ has (k + 1)(k + 2)(2k + 3)/6 independent components and these components are represented by $\Lambda_0^{\alpha[\beta]}$ and $\Lambda_1^{\alpha-1[\beta]}$. From Eq. (3.43) we see that $\kappa$ is an explicit function of $\Lambda_0^{\alpha[\beta]}$ and $\Lambda_1^{\alpha-1[\beta]}$. For large values of $p \circ p$, the quantities $\mathbf{j}$ and $j$ can be identified with $p/(p \circ p)^{1/2}$ and 1, respectively. Moreover, $-\bar p \circ u = (m^2 + p \circ p)^{1/2}$ approaches $\infty$ as $p \circ p \to \infty$. Let the coefficient $Q_k$ in Eq. (3.46) satisfy the condition $Q_k \ne 0$. (If $k > 1$ and $Q_k = 0$, then $Q_k$ must be replaced by the highest nonvanishing coefficient of $\kappa$.) Evidently, $\kappa$ tends to $Q_k (-\bar p \circ u)^k$ as $p \circ p \to \infty$ and the existence of the functions $F^\alpha(\Lambda)$ depends on the sign of $Q_k$. It is clear, from the way it occurs in Eq. (3.46), that $Q_k$ is positive if $Q_k \ne 0$ and $\Lambda \in M$, and for the convergence of the integrals $\int (\otimes^\alpha \bar p) F(\kappa) \pi_m$ it should obey the condition
$$ Q_k \ge D_k(\Lambda) , \tag{3.47} $$
where $D_k(\Lambda)$ is a positive constant; this constant may depend parametrically on $\Lambda$. Also, the inequality $F > 0$ for bosons ($\eta = 1$) imposes some restrictions on $\Lambda$, but they will not be discussed here [14]. Let $k > 1$. The important subset $M_E$ of M is defined by saying that $\Lambda \in E$ belongs to $M_E$ if and only if $\Lambda$ belongs to M and $\Lambda_0^{\alpha[\beta]}$ and $\Lambda_1^{\alpha-1[\beta]}$ are zero tensors for $1 < \alpha \le k$. We denote the elements of $M_E$ by $\Lambda_E$. If $\Lambda = \Lambda_E$, Eq. (3.43) becomes
$$ \kappa = \kappa_E := \Lambda^0 + \Lambda^1 \circ \bar p \tag{3.48} $$
and $F(\kappa)$ reduces to $F(\kappa_E)$. In relativistic kinetic theory, we regard as being appropriate to local equilibrium any molecular-density function $f$ such as to be left unaltered by collisions [29]:
$$ J(f) = 0 . \tag{3.49} $$
If we recall the definition of $J(f)$ given, e.g. in [30], we conclude that any such $f$ is described by $f = F(\kappa_E)$. In what follows, we refer to $F(\kappa_E)$ as the local Maxwell–Jüttner distribution function [29, 31] and to $\Lambda_E$ as the equilibrium state. Note that for $f = F(\kappa_E)$ the productions $P^\alpha$ in Eqs. (3.17) vanish and the entropy production $P$ in Eq. (3.12) reaches its minimum value, i.e. zero. For a gas that departs only slightly from local equilibrium, we may choose (independently at each point $x$) an arbitrary local equilibrium distribution function $F_E := F(\kappa_E)$ that is close to the actual distribution function $f$ and set $f = F_E(1 + \varphi)$. By linearizing the collision term $J(f)$ with respect to $\varphi$, we obtain the linearized Boltzmann equation for $\varphi$. This equation is inhomogeneous and contains the source term because the local Lagrange multipliers $\Lambda^0(x)$ and $\Lambda^1(x)$
are, in general, arbitrary. However, if equilibrium holds globally in some region, it then follows from Eq. (3.6) that $\nabla \Lambda^0 = 0$ and $\nabla \vee \Lambda^1 = 0$. All these problems are carefully discussed by Israel in [29, pp. 205 and 206]. Let $\Lambda_E \in M_E$ be an arbitrary equilibrium state and let $\vartheta_E \subset E$ stand for an arbitrary neighborhood of $\Lambda_E \in E$. Then there exist elements $\Lambda$ in $\vartheta_E$ such that $\Lambda \notin M$, i.e. such that the integrals in Eqs. (3.33) are not well defined. These elements are characterized by the conditions
$$ \kappa = \kappa_E + (-\bar p \circ u)^k K_{k/2}^k \left[ \mathrm{Tr}^{k/2} (\Lambda^k)_\perp \right] \tag{3.50a} $$
and
$$ \mathrm{Tr}^{k/2} (\Lambda^k)_\perp < 0 \tag{3.50b} $$
or by the conditions
$$ \kappa = \kappa_E + (-\bar p \circ u)^k L_{(k-1)/2}^k \left[ \mathrm{Tr}^{(k-1)/2} (\Lambda^k \circ u)_\perp \right] \tag{3.51a} $$
and
$$ \mathrm{Tr}^{(k-1)/2} (\Lambda^k \circ u)_\perp < 0 , \tag{3.51b} $$
depending on whether $k$ is even or odd. In the case of Eq. (3.50a) where $k = 2l$ and $l \ge 1$, we begin from the general formula (3.43) and postulate that $\Lambda_0^{0[0]}$, $\Lambda_0^{1[0]}$, $\Lambda_1^{0[0]}$, and $\Lambda_0^{k[l]}$ are the only nonvanishing components of $\Lambda$. If $k = 2l + 1$ and $l \ge 1$, the components $\Lambda_0^{0[0]}$, $\Lambda_0^{1[0]}$, $\Lambda_1^{0[0]}$, $\Lambda_1^{k-1[l]}$ play the role previously played by the components $\Lambda_0^{0[0]}$, $\Lambda_0^{1[0]}$, $\Lambda_1^{0[0]}$, $\Lambda_0^{k[l]}$ and Eq. (3.50a) is replaced by Eq. (3.51a). We can now summarize our results. For the system (3.17), we have been attempting to implement the closure by entropy maximization, which involves calculating $F(\kappa)$ from Eqs. (3.26) and (3.27). This method meets a difficulty in that the intersection of $M_E \subset M$ and $M_0 \subset M$ has no elements ($M_E \cap M_0 = \emptyset$). Since $\Lambda_E \notin M_0$, the relation $f_u = f_u(\Lambda)$ is ill-defined in a neighborhood of equilibrium states and we cannot use the system (3.36) with $k > 1$ to describe a situation around these equilibrium states. In order to do so, we need to show that $M_E \subset M_0$. However, it turns out that this is not possible using the formalism of covariant moments, so we will concentrate on the formalism of central moments. In such a case, $M_E$ is a proper subset of $M_0$ (see Sec. 5). Another method of overcoming these difficulties is to retain only the term $\kappa_E$ in $F(\kappa)$, while formally expanding $F(\kappa) - F(\kappa_E)$ in powers of $\kappa - \kappa_E$. A more detailed explanation as to why the equilibrium state $\Lambda_E$ should be an element of an open set $M_0$ is the following. First, in order to derive an explicit system of partial differential equations for $\Lambda$ from Eqs. (3.36), we must calculate (in a classical sense) the derivatives of $F^{\alpha+1}(\Lambda)$ with respect to $\Lambda$. Because of this, the whole procedure leading to such a system is certainly well-defined if $\Lambda \in M_0$ and may be ill-defined if $\Lambda \in M$ and $\Lambda \notin M_0$. Second, thanks to the concavity of entropy, Eqs. (3.36) obtained through the method of maximum entropy can be written as a symmetric hyperbolic system and the region of hyperbolicity for
Eqs. (3.36) is any open convex subset of $M_0$. Now, let $x \mapsto \Lambda_0(x)$ be an arbitrarily fixed mapping such that $\Lambda_0$ is a continuously differentiable function on X and $\Lambda_0(x)$ is an element of $M_0$ for each $x \in X$. By setting $\Lambda = \Lambda_0 + \lambda$, where $\lambda$ is a perturbation of $\Lambda_0$, it is always possible to linearize Eqs. (3.36) around $\Lambda_0$. In this way, we obtain linear equations for $\lambda$, generally with nonvanishing source terms depending on the known function $\Lambda_0$, which are symmetric hyperbolic. At first sight, if we are interested in processes not far from an equilibrium state, it seems reasonable to use the mapping $x \mapsto \Lambda_E(x)$ in place of the mapping $x \mapsto \Lambda_0(x)$. However, since $\Lambda_E(x) \notin M_0$ for each $x \in X$, the linearization procedure based on $\Lambda = \Lambda_E + \lambda$ cannot be automatically expected to yield a well-defined system of partial differential equations for $\lambda$. Also, due to the fact that the matrices of a characteristic equation are ill-defined in a neighborhood of equilibrium states, it is not clear how to interpret the characteristic speeds of disturbances propagating into a region in equilibrium. Thus, we should look for a new formulation of the theory of moments in which the analog of $\Lambda_E$ belongs to the analog of $M_0$.

4. Description in Terms of Central Moments

4.1. Equations of balance for central moments

Using Eqs. (3.11) and (3.42), we define the central moments by
$$ f_r^\alpha := \int (-\bar p \circ u)^r A^\alpha f\, \pi_m , \tag{4.1} $$
where $\alpha$ and $r$ take the values {0, 1, 2, . . .}. Clearly, these moments are symmetric spacelike tensors. It follows from Eqs. (3.16) and (3.40) that the covariant moments $f^\alpha$ are related to the central moments $f_\alpha^{\alpha-\beta}$ by
$$ f^\alpha = \sum_{\beta=0}^{\alpha} \binom{\alpha}{\beta} [(\otimes^\beta u) \vee f_\alpha^{\alpha-\beta}] . \tag{4.2} $$
We also obtain
$$ f_\alpha^{\alpha-\beta} = (-1)^\beta [f^\alpha \circ (\otimes^\beta u)]_\perp , \quad \alpha \ge \beta . \tag{4.3} $$
Because of these relations, a knowledge of $u$ and $f_r^\alpha$ allows us to calculate $f^\alpha$. However, the converse is not true, i.e. a knowledge of $u$ and $f^\alpha$ does not permit us to calculate $f_r^\alpha$ for all values of $\alpha$ and $r$. The key central moments which uniquely determine the number density $n$, the energy per unit volume $\varepsilon$, the pressure $p$, the number flux vector $N$, the heat flux $q$, and the stress deviator $\Pi$ are $f_1^0$, $f_2^0$, $f_1^1$, $f_2^1$, and $f_2^2$:
$$ n = f_1^0 , \quad \varepsilon = f_2^0 , \quad N = f_1^1 , \quad q = f_2^1 , \tag{4.4a} $$
$$ p = \frac{1}{3} \mathrm{Tr}\, f_2^2 , \quad \Pi = f_2^2 - \frac{1}{3} (\mathrm{Tr}\, f_2^2) I . \tag{4.4b} $$
Given Eq. (4.1), it is worthwhile observing that if one changes one's choice of fiducial congruence, and thus of the fiducial four-velocity vector field $u$, then one will change the central moments in a very complicated way. Without pursuing the details, it is clear that the procedure leading to Eq. (4.1) is not covariant. More explicitly, two observers looking at the same relativistic gas will, in general, extract two different sets of central moments, not related to each other by a tensorial linear transformation. The situation is virtually the same as that in the case of the moment formalism for radiative transfer [7]. Defining
$$ \mathbf{i} := p/(p \circ p)^{1/2} \tag{4.5} $$
and
$$ B^\alpha := \otimes^\alpha \mathbf{i} , \tag{4.6} $$
we see that a more general class of central moments is given by
$$ f_r^{\alpha,\beta} := \int (-\bar p \circ u)^r j^\beta B^\alpha f\, \pi_m , \tag{4.7} $$
where $\alpha$, $\beta$, and $r$ take the values {0, 1, 2, . . .}. With Eq. (3.42a) for $j$, the description of $f_r^\alpha$ in terms of $f_r^{\alpha,\beta}$ leads to the formula
$$ f_r^\alpha = f_r^{\alpha,\alpha} . \tag{4.8} $$
Note that $\mathrm{Tr}\, f_r^{\alpha,\beta} = f_r^{\alpha-2,\beta}$ for $\alpha \ge 2$. Consequently, we may prefer to use
$$ \langle f_r^{\alpha,\beta} \rangle = \int (-\bar p \circ u)^r j^\beta \langle B^\alpha \rangle f\, \pi_m \tag{4.9} $$
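instead of $f_r^{\alpha,\beta}$. Then Eq. (2.22) gives
$$ f_r^{\alpha,\beta} = \sum_{\mu=0}^{[\alpha/2]} k(\alpha - 2\mu, \mu)\, l(\alpha, \mu)\, [\langle f_r^{\alpha-2\mu,\beta} \rangle \vee I^\mu] . \tag{4.10} $$
The irreducible moments $\langle f_r^{\alpha,\beta} \rangle$ were first studied by Ellis et al. [8, 9]. However, in order to derive a system of evolution equations for $\langle f_r^{\alpha,\beta} \rangle$, they assumed that $r > \beta \ge 0$. This assumption is not very convenient for investigating the evolution of all fields of physical interest, since there are the following expressions for $n$, $\varepsilon$, $p$, $N$, $q$, and $\Pi$:
$$ n = \langle f_1^{0,0} \rangle = f_1^{0,0} , \quad \varepsilon = \langle f_2^{0,0} \rangle = f_2^{0,0} , \tag{4.11a} $$
$$ p = \frac{1}{3} \langle f_2^{0,2} \rangle = \frac{1}{3} f_2^{0,2} , \quad N = \langle f_1^{1,1} \rangle = f_1^{1,1} , \tag{4.11b} $$
$$ q = \langle f_2^{1,1} \rangle = f_2^{1,1} , \quad \Pi = \langle f_2^{2,2} \rangle . \tag{4.11c} $$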
Fortunately, although the restriction r > β ≥ 0 is given in [8] and [9] for the validity of the evolution equations, they hold for all values of α, β, and r.
Considering the particular central moments $f_r^\alpha$, a special-relativistic discussion of the equations of balance for $f_r^\alpha$ was proposed by Struchtrup [10]. As an illustration of the method, he specialized his results to the case in which the state of the gas is assumed to be described completely by the usual two conserved variables $n$ and $\varepsilon$ supplemented by $p$ and the eleven components of $N$, $q$, and $\Pi$. Here, we return to the problem of deriving the equations of balance for $f_r^\alpha$, but we solve it according to general relativity. The reasons for studying these equations are twofold. First, we hope that our approach will lead to new insights about the irreducible representation of $f_r^\alpha$ based on trace-free symmetric spacelike tensors such as
$$ f_r^{\alpha[\beta]} := \langle \hat f_r^{\alpha[\beta]} \rangle = \langle \mathrm{Tr}^\beta f_r^\alpha \rangle \tag{4.12a} $$
and
$$ f_r^{\alpha|\beta} := \langle \hat f_r^{\alpha|\beta} \rangle = \langle \mathrm{Tr}^\beta f_r^{\alpha+2\beta} \rangle . \tag{4.12b} $$
Concerning the details, see Sec. 4.2. Second, we would like to give a systematic derivation of a hierarchy of closed systems of moment equations. This then brings up the basic question of whether the closure by entropy maximization ensures that every member of the hierarchy is hyperbolic, implies an equation of balance of entropy, and formally recovers the Euler limit (see Secs. 5 and 6). The differential equations satisfied by the central moments $f_r^\alpha$ can be obtained directly by multiplying the Boltzmann equation (3.6) by $(-\bar p \circ u)^{r-1} A^\alpha$ and integrating over the future mass-shell. Doing this and using the decomposition (3.14) of $(\nabla u)_0$, we find that
$$
\begin{aligned}
(\dot f_r^\alpha)_\perp &+ \nabla_\perp \cdot f_r^{\alpha+1} + \alpha\, a \vee f_r^{\alpha-1} + \left( 1 + \frac{\alpha}{3} \right) \theta f_r^\alpha + \alpha (\sigma - \omega) \sqcup f_r^\alpha + (r - \alpha)\, a \circ f_r^{\alpha+1} \\
&+ (r - \alpha - 1) \left( \sigma \circ f_r^{\alpha+2} + \frac{1}{3} \theta\, \mathrm{Tr}\, f_r^{\alpha+2} \right) = P_r^\alpha ,
\end{aligned} \tag{4.13}
$$
where $\alpha$ and $r$ range from 0 to $\infty$ and
$$ P_r^\alpha := \int (-\bar p \circ u)^{r-1} A^\alpha J(f)\, \pi_m . \tag{4.14} $$
In Eq. (4.13), we set $f_r^{\alpha-1} := 0$ if $\alpha = 0$. From the relations (3.19) it follows that $P_1^0$, $P_2^0$, and $P_2^1$ vanish. At this stage, we leave the fiducial congruence of timelike world lines arbitrary, so that the user of the formalism can specify it as needed. Landau and Lifshitz [25] chose as rest frame the one in which $q = f_2^1 = 0$. Eckart [24] assumed that the four-velocity vector field $u$ is parallel to the particle number current $f^1$, i.e. that $N = f_1^1 = 0$. The theory of Ruggeri [32] was founded on the postulate that the entropy flux vector $\Phi$ vanishes [see Eq. (3.10b)]. This option remains feasible if the covariant moments are taken into account. Of course, in a curved space-time, many other choices are possible. In Sec. 6.1, we choose $u$ as in the well-known theory
of Arnowitt et al. [21]; thus there we identify $u$ with the so-called normal vector field. Then the closure by entropy maximization leads directly to a hierarchy of symmetric hyperbolic systems of field equations, i.e. the evolution equations (6.34) for $\Lambda_\Omega$ which are symmetric hyperbolic at every order of truncation (see Sec. 6.2).

4.2. Equations of balance for irreducible central moments

Given a choice of a timelike vector field $u$, the particle distribution function $f$ can be expressed in terms of spherical harmonics; the Boltzmann equation can then be written as a set of equations relating the consecutive harmonic components of $f$. Ellis et al. [8, 9] derived this set of equations and the resulting set of equations for the irreducible moments $\langle f_r^{\alpha,\beta} \rangle$. Originally, they used a spherical harmonic formalism to prove a number of exact results on the anisotropic solutions of the Einstein–Boltzmann system. Later, their method was applied to the study of cosmic microwave background temperature anisotropies arising from inhomogeneities in the early universe [33]. Our purpose here is to provide an alternative to the approach of Ellis et al. [8, 9] which avoids a spherical harmonic decomposition of the Boltzmann equation and uses only a finite number of elementary algebraic operations on Eqs. (4.13). For nonrelativistic classical gases, a very similar formulation of the theory was given by Banach and Piekarski [26]. In order to systematically derive the equations of balance for $f_r^{\alpha|\beta}$, we first observe that
$$ \alpha\, \mathrm{Tr}^\beta (a \vee f_r^{\alpha-1}) = (\alpha - 2\beta)\, a \vee \hat f_r^{\alpha-1[\beta]} + 2\beta\, a \circ \hat f_r^{\alpha-1[\beta-1]} , \tag{4.15a} $$
$$ \alpha\, \mathrm{Tr}^\beta [(\sigma - \omega) \sqcup f_r^\alpha] = (\alpha - 2\beta)(\sigma - \omega) \sqcup \hat f_r^{\alpha[\beta]} + 2\beta\, \sigma \circ \hat f_r^{\alpha[\beta-1]} , \tag{4.15b} $$
$$ \mathrm{Tr}^\beta (\dot f_r^\alpha)_\perp = [(\hat f_r^{\alpha[\beta]})^{\cdot}]_\perp , \tag{4.15c} $$
$$ \mathrm{Tr}^\beta (\nabla_\perp \cdot f_r^{\alpha+1}) = \nabla_\perp \cdot \hat f_r^{\alpha+1[\beta]} , \tag{4.15d} $$
$$ \mathrm{Tr}^\beta (a \circ f_r^{\alpha+1}) = a \circ \hat f_r^{\alpha+1[\beta]} , \tag{4.15e} $$
$$ \mathrm{Tr}^\beta (\sigma \circ f_r^{\alpha+2}) = \sigma \circ \hat f_r^{\alpha+2[\beta]} , \tag{4.15f} $$
where $\alpha \ge 2\beta$ and
$$ \hat f_r^{\alpha[\beta]} := \mathrm{Tr}^\beta f_r^\alpha . \tag{4.16} $$
Before using the above formulas, we need to give further thought to the meaning of some terms in these formulas. As regards the details, a complete explanation of Eqs. (4.15a) and (4.15b) will be obtained if we set $\hat f_r^{\alpha-1[\beta]} := 0$ for $\alpha = 2\beta$ and $\hat f_r^{\alpha-1[\beta-1]} := 0$ and $\hat f_r^{\alpha[\beta-1]} := 0$ for $\alpha \ge 2\beta = 0$. The identities (4.15) follow from an argument similar in spirit to the proof of Eqs. (3.9) in [26], and it is not difficult to transpose this proof to the context of the present theory. With Eqs. (4.15), we see that the result of the $\beta$-fold successive application of the operator
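Tr to Eq. (4.13) may be written as
$$
\begin{aligned}
[(\hat f_r^{\alpha[\beta]})^{\cdot}]_\perp &+ \nabla_\perp \cdot \hat f_r^{\alpha+1[\beta]} + \left( 1 + \frac{\alpha}{3} \right) \theta \hat f_r^{\alpha[\beta]} + (\alpha - 2\beta) \left( a \vee \hat f_r^{\alpha-1[\beta]} + (\sigma - \omega) \sqcup \hat f_r^{\alpha[\beta]} \right) \\
&+ 2\beta \left( a \circ \hat f_r^{\alpha-1[\beta-1]} + \sigma \circ \hat f_r^{\alpha[\beta-1]} \right) + (r - \alpha)\, a \circ \hat f_r^{\alpha+1[\beta]} \\
&+ (r - \alpha - 1) \left( \sigma \circ \hat f_r^{\alpha+2[\beta]} + \frac{1}{3} \theta \hat f_r^{\alpha+2[\beta+1]} \right) = \hat P_r^{\alpha[\beta]} ,
\end{aligned} \tag{4.17}
$$
where
$$ \hat P_r^{\alpha[\beta]} := \mathrm{Tr}^\beta P_r^\alpha . \tag{4.18} $$
Clearly, Eqs. (3.19) imply that $\hat P_1^{0[0]} = \hat P_2^{0[0]} = 0$ and $\hat P_2^{1[0]} = 0$. Remembering that $f_r^{\alpha[\beta]}$ is defined by $f_r^{\alpha[\beta]} := \langle \hat f_r^{\alpha[\beta]} \rangle$, from Eq. (2.16) we now derive the following important formulas:
$$ \langle [(\hat f_r^{\alpha[\beta]})^{\cdot}]_\perp \rangle = (\dot f_r^{\alpha[\beta]})_\perp , \quad \alpha \ge 2\beta , \tag{4.19a} $$
$$ \langle \hat f_r^{\alpha+2[\beta+1]} \rangle = f_r^{\alpha+2[\beta+1]} , \quad \alpha \ge 2\beta , \tag{4.19b} $$
$$ \langle a \vee \hat f_r^{\alpha-1[\beta]} \rangle = a \star f_r^{\alpha-1[\beta]} , \quad \alpha > 2\beta , \tag{4.19c} $$
$$ \langle \nabla_\perp \cdot \hat f_r^{\alpha+1[\beta]} \rangle = \frac{\alpha - 2\beta}{2\alpha - 4\beta + 1} \nabla_\perp \star f_r^{\alpha+1[\beta+1]} + \nabla_\perp \cdot f_r^{\alpha+1[\beta]} , \quad \alpha > 2\beta , \tag{4.19d} $$
$$ \langle a \circ \hat f_r^{\alpha+1[\beta]} \rangle = \frac{\alpha - 2\beta}{2\alpha - 4\beta + 1}\, a \star f_r^{\alpha+1[\beta+1]} + a \circ f_r^{\alpha+1[\beta]} , \quad \alpha > 2\beta , \tag{4.19e} $$
$$ \langle a \circ \hat f_r^{\alpha-1[\beta-1]} \rangle = \frac{\alpha - 2\beta}{2\alpha - 4\beta + 1}\, a \star f_r^{\alpha-1[\beta]} + a \circ f_r^{\alpha-1[\beta-1]} , \quad \alpha > 2\beta > 0 , \tag{4.19f} $$
$$ \langle (\sigma - \omega) \sqcup \hat f_r^{\alpha[\beta]} \rangle = \frac{\alpha - 2\beta - 1}{2\alpha - 4\beta - 1}\, \sigma \star f_r^{\alpha[\beta+1]} + (\sigma - \omega) \sqcap f_r^{\alpha[\beta]} , \quad \alpha > 2\beta + 1 , \tag{4.19g} $$
$$
\begin{aligned}
\langle \sigma \circ \hat f_r^{\alpha[\beta-1]} \rangle ={}& \sigma \circ f_r^{\alpha[\beta-1]} + \frac{(\alpha - 2\beta)(\alpha - 2\beta - 1)}{(2\alpha - 4\beta - 1)(2\alpha - 4\beta + 1)}\, \sigma \star f_r^{\alpha[\beta+1]} \\
&+ \frac{2(\alpha - 2\beta)}{2\alpha - 4\beta + 3}\, \sigma \sqcap f_r^{\alpha[\beta]} , \quad \alpha > 2\beta + 1 > 1 ,
\end{aligned} \tag{4.19h}
$$
$$
\begin{aligned}
\langle \sigma \circ \hat f_r^{\alpha+2[\beta]} \rangle ={}& \sigma \circ f_r^{\alpha+2[\beta]} + \frac{(\alpha - 2\beta)(\alpha - 2\beta - 1)}{(2\alpha - 4\beta - 1)(2\alpha - 4\beta + 1)}\, \sigma \star f_r^{\alpha+2[\beta+2]} \\
&+ \frac{2(\alpha - 2\beta)}{2\alpha - 4\beta + 3}\, \sigma \sqcap f_r^{\alpha+2[\beta+1]} , \quad \alpha > 2\beta + 1 .
\end{aligned} \tag{4.19i}
$$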
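For the proof of these formulas, it is sufficient to use the same reasoning as that used in the proof of [26, Eqs. (3.12)–(3.18)]. Consequently, we will be brief and omit the technical details. In Eq. (2.16), the symbol $\langle\ \rangle$ represents a linear operator which acts on $M^\alpha$. Because of the identities just formulated, the action of this operator on Eq. (4.17) is given by
$$
\begin{aligned}
(\dot f_r^{\alpha|\beta})_\perp &+ \nabla_\perp \cdot f_r^{\alpha+1|\beta} + \frac{\alpha}{2\alpha + 1} \nabla_\perp \star f_r^{\alpha-1|\beta+1} + \frac{1}{3} (\alpha + 2\beta + 3)\, \theta f_r^{\alpha|\beta} \\
&+ \alpha \left( \frac{2\alpha + 4\beta + 3}{2\alpha + 3}\, \sigma \sqcap f_r^{\alpha|\beta} - \omega \sqcap f_r^{\alpha|\beta} \right) + 2\beta \left( a \circ f_r^{\alpha+1|\beta-1} + \sigma \circ f_r^{\alpha+2|\beta-1} \right) \\
&+ \frac{\alpha (2\alpha + 2\beta + 1)}{2\alpha + 1} \left( a \star f_r^{\alpha-1|\beta} + \frac{\alpha - 1}{2\alpha - 1}\, \sigma \star f_r^{\alpha-2|\beta+1} \right) \\
&+ (r - \alpha - 2\beta) \left( a \circ f_r^{\alpha+1|\beta} + \frac{\alpha}{2\alpha + 1}\, a \star f_r^{\alpha-1|\beta+1} \right) \\
&+ (r - \alpha - 2\beta - 1) \left( \sigma \circ f_r^{\alpha+2|\beta} + \frac{2\alpha}{2\alpha + 3}\, \sigma \sqcap f_r^{\alpha|\beta+1} + \frac{\alpha (\alpha - 1)}{(2\alpha - 1)(2\alpha + 1)}\, \sigma \star f_r^{\alpha-2|\beta+2} + \frac{\theta}{3} f_r^{\alpha|\beta+1} \right) = P_r^{\alpha|\beta} ,
\end{aligned} \tag{4.20}
$$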
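where $\alpha$, $\beta$, and $r$ range from 0 to $\infty$. In arriving at Eq. (4.20), we have employed the transformations $\alpha \to \alpha + 2\beta$ and $\beta \to \beta$, which take
$$ f_r^{\alpha[\beta]} := \langle \hat f_r^{\alpha[\beta]} \rangle = \langle \mathrm{Tr}^\beta f_r^\alpha \rangle \tag{4.21a} $$
and
$$ P_r^{\alpha[\beta]} := \langle \hat P_r^{\alpha[\beta]} \rangle = \langle \mathrm{Tr}^\beta P_r^\alpha \rangle \tag{4.21b} $$
into
$$ f_r^{\alpha|\beta} := \langle \hat f_r^{\alpha+2\beta[\beta]} \rangle = \langle \mathrm{Tr}^\beta f_r^{\alpha+2\beta} \rangle \tag{4.22a} $$
and
$$ P_r^{\alpha|\beta} := \langle \hat P_r^{\alpha+2\beta[\beta]} \rangle = \langle \mathrm{Tr}^\beta P_r^{\alpha+2\beta} \rangle , \tag{4.22b} $$
respectively. Naturally, we have $P_1^{0|0} = P_2^{0|0} = 0$ and $P_2^{1|0} = 0$. Moreover, since $0 \le \alpha < \infty$ and $0 \le \beta < \infty$, it is understood that
$$ f_r^{\alpha-2|\beta+1} := f_r^{\alpha-2|\beta+2} := 0 \quad \text{if } \alpha = 0 \text{ or } \alpha = 1 , \tag{4.23a} $$
$$ f_r^{\alpha-1|\beta} := f_r^{\alpha-1|\beta+1} := 0 \quad \text{if } \alpha = 0 , \tag{4.23b} $$
$$ f_r^{\alpha+1|\beta-1} := 0 \quad \text{if } \beta = 0 , \tag{4.23c} $$
$$ f_r^{\alpha+2|\beta-1} := 0 \quad \text{if } \beta = 0 . \tag{4.23d} $$
We refer to Eq. (4.20) as the equation of balance for $f_r^{\alpha|\beta}$. The evolution equations satisfied by the basic physical variables $n$, $\varepsilon$, $p$, $N$, $q$, and $\Pi$ may be obtained directly from Eqs. (4.20) if we make use of the fact that
$$ n = f_1^{0|0} , \quad \varepsilon = f_2^{0|0} , \quad p = \frac{1}{3} f_2^{0|1} , \tag{4.24a} $$
$$ N = f_1^{1|0} , \quad q = f_2^{1|0} , \quad \Pi = f_2^{2|0} . \tag{4.24b} $$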
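A different approach to constructing the differential equations for $n$, $\varepsilon$, $p$, $N$, $q$, and $\Pi$ was taken by Israel and Stewart [12] and Müller and Ruggeri [13]; the idea is to transform and linearize the system (3.36) with k = 2. In the case of a gas of massless particles, we find that
$$ f_r^{\alpha|\beta} = N_r^\alpha := \int (-\bar p \circ u)^r \langle A^\alpha \rangle f\, \pi_0 \tag{4.25a} $$
and
$$ P_r^{\alpha|\beta} = Q_r^\alpha := \int (-\bar p \circ u)^{r-1} \langle A^\alpha \rangle J(f)\, \pi_0 , \tag{4.25b} $$
where
$$ \pi_0 := (\pi_m)_{m=0} . \tag{4.26} $$
Then Eq. (4.20) becomes
$$
\begin{aligned}
(\dot N_r^\alpha)_\perp &+ \nabla_\perp \cdot N_r^{\alpha+1} + \frac{2 + r}{3}\, \theta N_r^\alpha + (r - \alpha)\, a \circ N_r^{\alpha+1} \\
&+ \frac{\alpha}{2\alpha + 1} \left( \nabla_\perp \star N_r^{\alpha-1} + (\alpha + r + 1)\, a \star N_r^{\alpha-1} + \frac{(\alpha - 1)(\alpha + r)}{2\alpha - 1}\, \sigma \star N_r^{\alpha-2} \right) \\
&+ (r - \alpha - 1)\, \sigma \circ N_r^{\alpha+2} + \alpha \left( \frac{1 + 2r}{2\alpha + 3}\, \sigma \sqcap N_r^\alpha - \omega \sqcap N_r^\alpha \right) = Q_r^\alpha .
\end{aligned} \tag{4.27}
$$
The corresponding result of Thorne [7] is recovered on putting r = 2.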
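The passage from Eq. (4.20) to Eq. (4.27) is a matter of collecting coefficients, and this bookkeeping can be checked mechanically. The sketch below is an illustration only. It assumes, as Eq. (4.25a) states, that in the massless case every $f_r^{\alpha|\beta}$ equals $N_r^\alpha$ independently of $\beta$, so that terms of Eq. (4.20) multiplying the same tensor structure add up. The dictionary keys are hypothetical labels for those structures, and the $\nabla_\perp$ and $\omega$ terms are omitted because their coefficients are identical in both equations.

```python
from fractions import Fraction as Fr


def massive_coefficients(alpha: int, beta: int, r: int) -> dict:
    """Coefficients of Eq. (4.20), summed per tensor structure after the
    replacement f_r^{a|b} -> N_r^a of Eq. (4.25a)."""
    a, b = alpha, beta
    return {
        "theta N^a": Fr(a + 2 * b + 3, 3) + Fr(r - a - 2 * b - 1, 3),
        "a o N^(a+1)": Fr(2 * b) + (r - a - 2 * b),
        "a * N^(a-1)": Fr(a * (2 * a + 2 * b + 1), 2 * a + 1)
                       + (r - a - 2 * b) * Fr(a, 2 * a + 1),
        "sigma * N^(a-2)": Fr(a * (2 * a + 2 * b + 1) * (a - 1), (2 * a + 1) * (2 * a - 1))
                           + (r - a - 2 * b - 1) * Fr(a * (a - 1), (2 * a - 1) * (2 * a + 1)),
        "sigma o N^(a+2)": Fr(2 * b) + (r - a - 2 * b - 1),
        "sigma sqcap N^a": a * Fr(2 * a + 4 * b + 3, 2 * a + 3)
                           + (r - a - 2 * b - 1) * Fr(2 * a, 2 * a + 3),
    }


def massless_coefficients(alpha: int, r: int) -> dict:
    """Coefficients read off from Eq. (4.27)."""
    a = alpha
    return {
        "theta N^a": Fr(2 + r, 3),
        "a o N^(a+1)": Fr(r - a),
        "a * N^(a-1)": Fr(a * (a + r + 1), 2 * a + 1),
        "sigma * N^(a-2)": Fr(a * (a - 1) * (a + r), (2 * a + 1) * (2 * a - 1)),
        "sigma o N^(a+2)": Fr(r - a - 1),
        "sigma sqcap N^a": Fr(a * (1 + 2 * r), 2 * a + 3),
    }


def reduction_holds(max_alpha: int = 6, max_beta: int = 4, max_r: int = 5) -> bool:
    """True if the beta-dependence cancels for every sampled (alpha, beta, r)."""
    return all(
        massive_coefficients(a, b, r) == massless_coefficients(a, r)
        for a in range(max_alpha)
        for b in range(max_beta)
        for r in range(max_r)
    )
```

In this reading the $\beta$-dependence drops out of every coefficient, which is consistent with the right-hand side of Eq. (4.27) carrying no index $\beta$.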
5. Maximization of Entropy in the Formalism of Central Moments

5.1. Two privileged sets of central moments

Let $r$ be an integer $\ge 0$. Consider the distribution function $f = f(x^A, p_i)$ such that there exists the following infinite set of central moments:
$$ f_r := \{ f_r^\alpha \mid 0 \le \alpha < \infty \} . \tag{5.1} $$
A set of distribution functions of this kind will be denoted by $A_r$. For each $r \ge 0$ and each $f \in A_r$, put $\vartheta_r(f) := f_r$. Let $B_r$ be the set consisting of $\vartheta_r(f)$ for all $f \in A_r$. If the distribution function $f \in A_r$ is given, all the central moments $f_r^0, f_r^1, f_r^2, \dots$ can, in principle, be computed by straightforward integration. Conversely, if all these central moments are known and the proper mass $m$ differs from zero ($m \ne 0$), the distribution function $f \in A_r$ is uniquely defined by $f_r \in B_r$, except possibly for points $(p_i) := (p_1, p_2, p_3)$ of a set $W \subset \mathbb{R}^3$ of Lebesgue measure $\mu_L(W) = 0$. (The details of the proof, which are purely technical and require the use of the lemma of Dijkstra and van Leeuwen as formulated in [34, p. 468], will be presented in a separate paper.) Thus, if $m \ne 0$, the mapping
$$ A_r \ni f \mapsto \vartheta_r(f) \in B_r \tag{5.2} $$
is one-to-one and onto. Here we assume that $f \in A_1$ and $f \in A_2$, since $f_1 \in B_1$ and $f_2 \in B_2$ are of fundamental importance in the determination of such variables as $n$, $\varepsilon$, $p$, $N$, $q$, and $\Pi$ [see Eqs. (4.4)]. Another explanation of the special role of this assumption follows from the discussion below. We call $f_1$ and $f_2$ two privileged sets of central moments. Unfortunately, it does not seem possible to support all these statements in the case of a gas of massless particles, and a more sophisticated approach must be used. Then the mapping $\vartheta_r : A_r \to B_r$ is not one-to-one. However, the new question arises as to how we can compute the distribution function $f$ from a knowledge of $\{ f_r \mid 1 \le r < \infty \}$. This approach is at present under investigation (see also Sec. 7). Maximization of $s(f)$ subject to the constraints
$$ f_1^\alpha = \int (-\bar p \circ u) A^\alpha f\, \pi_m \quad (0 \le \alpha \le k) \tag{5.3a} $$
and
$$ f_2^\alpha = \int (-\bar p \circ u)^2 A^\alpha f\, \pi_m \quad (0 \le \alpha \le l) \tag{5.3b} $$
implies maximizing the functional
$$
\begin{aligned}
Y(f) := s(f) &+ \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ \left[ f_1^\alpha - \int (-\bar p \circ u) A^\alpha f\, \pi_m \right] \\
&+ \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ \left[ f_2^\alpha - \int (-\bar p \circ u)^2 A^\alpha f\, \pi_m \right] ,
\end{aligned} \tag{5.4}
$$
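where $\Lambda_1^\alpha \in T_0^\alpha(x)$ and $\Lambda_2^\alpha \in T_0^\alpha(x)$ are the Lagrange multipliers corresponding to the aforementioned constraints:
$$ (\Lambda_1^\alpha)_\perp = \Lambda_1^\alpha , \quad (\Lambda_2^\alpha)_\perp = \Lambda_2^\alpha , \tag{5.5a} $$
$$ S\Lambda_1^\alpha = \Lambda_1^\alpha , \quad S\Lambda_2^\alpha = \Lambda_2^\alpha . \tag{5.5b} $$
Equations (5.3)–(5.5) are valid for both $m \ne 0$ and $m = 0$. However, if $m = 0$ and $\alpha \ge 2$, it is also natural to assume that $\mathrm{Tr}\,\Lambda_1^\alpha = 0$ ($k \ge 2$) and $\mathrm{Tr}\,\Lambda_2^\alpha = 0$ ($l \ge 2$), since $\mathrm{Tr}\, f_r^\alpha = f_r^{\alpha-2}$ ($0 \le r < \infty$). Hence, we can replace the constraints (5.3a) and (5.3b) by the constraints
$$ N_1^\alpha = \int (-\bar p \circ u) \langle A^\alpha \rangle f\, \pi_0 \quad (0 \le \alpha \le k) \tag{5.6a} $$
and
$$ N_2^\alpha = \int (-\bar p \circ u)^2 \langle A^\alpha \rangle f\, \pi_0 \quad (0 \le \alpha \le l) . \tag{5.6b} $$
Variation of $Y(f)$ yields the expression
$$ \delta Y(f) = -\int (-\bar p \circ u) \left[ \ln \frac{\bar f}{1 + \eta \bar f} + \kappa \right] \delta f\, \pi_m , \tag{5.7} $$
in which
$$ \kappa := \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ A^\alpha + (-\bar p \circ u) \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ A^\alpha . \tag{5.8} $$
Then the maximum entropy distribution function is easily seen from $\delta Y(f) = 0$ to be
$$ f = F := \frac{y}{e^\kappa - \eta} . \tag{5.9} $$
The key difference between Eqs. (3.27) and (5.9) is in the definition of $\kappa$. Letting $\Lambda$ denote the set $(\Lambda_1^0, \dots, \Lambda_1^k, \Lambda_2^0, \dots, \Lambda_2^l)$, we are now in a position to use the distribution function $F$ to express $(f_r^\alpha, P_r^\alpha)$ or $(N_r^\alpha, Q_r^\alpha)$ in terms of $\Lambda$:
$$ f_r^\alpha = f_r^\alpha(\Lambda) , \quad P_r^\alpha = P_r^\alpha(\Lambda) , \tag{5.10a} $$
$$ N_r^\alpha = N_r^\alpha(\Lambda) , \quad Q_r^\alpha = Q_r^\alpha(\Lambda) . \tag{5.10b} $$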
For $m \ne 0$, provided that $(f_1^0, \dots, f_1^{k+2}, f_2^0, \dots, f_2^{l+2})$ and $(P_1^0, \dots, P_1^k, P_2^0, \dots, P_2^l)$ appear in the form (5.10a), the first $k + 1$ equations (4.13) with $r = 1$ and the first $l + 1$ equations (4.13) with $r = 2$ give a closed set of equations from which the evolution of $\Lambda$ can in principle be determined. After replacing Eqs. (4.13) and (5.10a) by Eqs. (4.27) and (5.10b), a very similar method of closure can be proposed in the case of a gas of massless particles. The local Maxwell–Jüttner distribution function arises when $\Lambda = \Lambda_E$, i.e. when
$$ \kappa = \begin{cases} \kappa_E := \Lambda_1^0 + \Lambda_2^0 (-\bar p \circ u) & \text{for } m \ne 0 , \\ \kappa_E := \Lambda_2^0 (-\bar p \circ u) & \text{for } m = 0 , \end{cases} \tag{5.11} $$
where $\Lambda_2^0 > 0$. In attempting an analysis of the gas in states “infinitesimally near to equilibrium,” it is useful to define the sets E, M, $M_0$, and $M_E$ analogous to those defined previously. To this end, we only need to repeat the construction of Sec. 3.3. Then the advantage of the formalism of central moments over the formalism of covariant moments is that it enables us to prove that $M_E \subset M_0$ if $k \ge 0$ and $l > 0$. Indeed, the values of
$$ U_1 := \sum_{\alpha=1}^{k} \Lambda_1^\alpha \circ \Lambda_1^\alpha \tag{5.12a} $$
and
$$ U_2 := \sum_{\alpha=1}^{l} \Lambda_2^\alpha \circ \Lambda_2^\alpha \tag{5.12b} $$
of interest are those close to 0; the terms
$$ V_1 := \sum_{\alpha=1}^{k} \Lambda_1^\alpha \circ A^\alpha \tag{5.13a} $$
and
$$ V_2 := \sum_{\alpha=1}^{l} \Lambda_2^\alpha \circ A^\alpha \tag{5.13b} $$
are small for these values ($A^\alpha \circ A^\alpha \le 1$, $|V_1| \le \sqrt{k U_1}$, $|V_2| \le \sqrt{l U_2}$), and to the extent that $\Lambda_2^0 > \sqrt{l U_2}$ the quantity $\kappa = \Lambda_1^0 + V_1 + (m^2 + p \circ p)^{1/2} (\Lambda_2^0 + V_2)$ behaves as $\Lambda_2^0 (p \circ p)^{1/2}$ when $p \circ p$ approaches $\infty$. This permits the replacement of $F$ by $y \exp[-\Lambda_2^0 (p \circ p)^{1/2}]$ in the limit $p \circ p \to \infty$. Consequently, there exists a neighborhood $\vartheta_E \subset E$ of $\Lambda_E \in M_E \subset E$ such that if $\Lambda \in \vartheta_E$, then Eqs. (5.10a) and (5.10b) are well defined. This in turn means that $M_E \subset M_0$, as it should be. Having confirmed in detail the correctness of the above deductions from Eqs. (5.8) and (5.9), it is now possible to proceed with confidence to the typical problem in which the system (6.31) is linearized around $\Lambda_E$. The idea is essentially that introduced at the end of Sec. 3.3. The constraints considered so far have all been generated by $f_1$ and $f_2$. There are, in addition to these, the constraints based on $f_r$ ($r \ge 3$). However, after appropriately defining $\Lambda_E \in M_E$ and $M_0$, they give $\Lambda_E \notin M_0$, a result that may be verified directly from the relation $f = F$ in a similar manner, and with the same kind of certainty, to that discussed in connection with the formalism of Sec. 3. The conclusion is simply this. For an adequate investigation of the method of maximum entropy, it is necessary to restrict attention to the constraints generated by $f_1$ and $f_2$.
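The bound $|V_1| \le \sqrt{k U_1}$ used in the argument above follows from the Cauchy–Schwarz inequality together with $A^\alpha \circ A^\alpha \le 1$. The sketch below spot-checks it numerically for $k = 2$. It is an illustration under stated assumptions: the multipliers are represented by arbitrary real components in a three-dimensional spacelike frame, $\circ$ is taken to be full contraction, and the vector $\mathbf{j}$ of Eq. (3.42a) is sampled with norm below 1. All names are hypothetical.

```python
import math
import random


def worst_ratio(trials: int = 2000, seed: int = 0) -> float:
    """Largest observed |V_1| / sqrt(k U_1) for k = 2 over random samples."""
    rng = random.Random(seed)
    k = 2
    worst = 0.0
    for _ in range(trials):
        # Random vector j with norm in [0, 1), as for a massive particle.
        j = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in j))
        scale = rng.random() / norm
        j = [c * scale for c in j]

        # Random rank-1 and rank-2 multipliers (components only).
        lam1 = [rng.gauss(0.0, 1.0) for _ in range(3)]
        lam2 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(3)]

        # V_1 = Lambda^1 o A^1 + Lambda^2 o A^2, with A^alpha the alpha-fold
        # tensor product of j, cf. Eqs. (3.42b) and (5.13a).
        V = sum(lam1[a] * j[a] for a in range(3))
        V += sum(lam2[a][b] * j[a] * j[b] for a in range(3) for b in range(3))

        # U_1 = sum of squared norms of the multipliers, cf. Eq. (5.12a).
        U = sum(c * c for c in lam1)
        U += sum(lam2[a][b] ** 2 for a in range(3) for b in range(3))

        worst = max(worst, abs(V) / math.sqrt(k * U))
    return worst
```

A return value not exceeding 1 is what the inequality predicts. The check is a sanity test on samples, not a proof.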
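5.2. Auxiliary formulas and useful identities

We now derive some auxiliary formulas and useful identities associated with the maximum entropy distribution function $F$. Our reason for deriving these formulas and identities will be seen in Sec. 5.3, where we prove that every solution of the differential equations for $\Lambda$ automatically satisfies the supplementary balance law, interpreted as the equation of balance of entropy. As a direct application of $F$, it will be convenient to introduce the following notation:
$$ F_r^\alpha := \int (-\bar p \circ u)^r A^\alpha F\, \pi_m . \tag{5.14} $$
Then using the definition (3.9b) of $H(\bar f)$ and inserting $f = F$ into Eqs. (3.10a) and (3.10b) gives
$$ s = s_* + \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ F_1^\alpha + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ F_2^\alpha \tag{5.15a} $$
and
$$ \Phi = \Phi_* + \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ F_1^{\alpha+1} + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ F_2^{\alpha+1} , \tag{5.15b} $$
where
$$ s_* := (1 - |\eta|) F_1^0 + \eta y \int (-\bar p \circ u) \ln(1 + \eta \bar F)\, \pi_m , \tag{5.16a} $$
$$ \Phi_* := (1 - |\eta|) F_1^1 + \eta y \int (-\bar p \circ u)\, \mathbf{j} \ln(1 + \eta \bar F)\, \pi_m , \tag{5.16b} $$
and $\bar F := y^{-1} F$. The straightforward calculation of $\dot s_*$ and $\nabla_\perp \cdot \Phi_*$ begins with Eqs. (5.16a) and (5.16b) for $s_*$ and $\Phi_*$, in which the dependence on space-time location is contained in $F_1^0$, $F_1^1$, $\bar p \circ u$, $\bar F$, and $\pi_m$. After some tedious algebra, we find that
$$ \dot s_* = -a \circ Z_1 - \sum_{\alpha=0}^{k} F_1^\alpha \circ \dot\Lambda_1^\alpha - \sum_{\alpha=0}^{l} F_2^\alpha \circ \dot\Lambda_2^\alpha , \tag{5.17a} $$
$$ \nabla_\perp \cdot \Phi_* = -\left\{ \frac{1}{3} \theta [\mathrm{Tr}(S Z_2)] + (\sigma - \omega) \circ Z_2 + \sum_{\alpha=0}^{k} (\nabla_\perp \vee \Lambda_1^\alpha) \circ F_1^{\alpha+1} + \sum_{\alpha=0}^{l} (\nabla_\perp \vee \Lambda_2^\alpha) \circ F_2^{\alpha+1} \right\} , \tag{5.17b} $$
where
$$ Z_1 := \Phi_* - \sum_{\alpha=0}^{k} [\alpha (\Lambda_1^\alpha \circ F_1^{\alpha-1} - \Lambda_1^\alpha \circ F_1^{\alpha+1})] - \sum_{\alpha=0}^{l} [\alpha \Lambda_2^\alpha \circ F_2^{\alpha-1} - (\alpha - 1) \Lambda_2^\alpha \circ F_2^{\alpha+1}] , \tag{5.18a} $$
$$ Z_2 := s_* I - \sum_{\alpha=0}^{k} [\alpha (F_1^\alpha \Lambda_1^\alpha - \Lambda_1^\alpha \circ F_1^{\alpha+2})] - \sum_{\alpha=0}^{l} [\alpha F_2^\alpha \Lambda_2^\alpha - (\alpha - 1) \Lambda_2^\alpha \circ F_2^{\alpha+2}] . \tag{5.18b} $$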
We have written s˙ ∗ and ∇⊥ · Φ∗ in this way so as to be able to calculate s˙ and ∇⊥ ·Φ from Eqs. (5.15a) and (5.15b). However, before presenting these calculations, it remains to show that Z1 = 0 ,
Z2 = 0 .
(5.20)
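The proof of these identities is given in the Appendix. Substituting Eqs. (5.19) into Eqs. (5.17) yields
\[ \dot s_* = -\sum_{\alpha=0}^{k} F_1^\alpha \circ \dot\Lambda_1^\alpha - \sum_{\alpha=0}^{l} F_2^\alpha \circ \dot\Lambda_2^\alpha , \tag{5.20a} \]
\[ \nabla_\perp \cdot \Phi_* = -\sum_{\alpha=0}^{k} (\nabla_\perp \vee \Lambda_1^\alpha) \circ F_1^{\alpha+1} - \sum_{\alpha=0}^{l} (\nabla_\perp \vee \Lambda_2^\alpha) \circ F_2^{\alpha+1} . \tag{5.20b} \]
Also, given Eq. (A.6a), the trace of $s_* I/3$ is
\[ s_* = \frac{1}{3}\sum_{\alpha=0}^{k} [\alpha \Lambda_1^\alpha \circ (F_1^\alpha - \mathrm{Tr}\, F_1^{\alpha+2})] + \frac{1}{3}\sum_{\alpha=0}^{l} \{\Lambda_2^\alpha \circ [\alpha F_2^\alpha - (\alpha - 1)(\mathrm{Tr}\, F_2^{\alpha+2})]\} . \tag{5.21} \]
This may be proved using the fact that $\mathrm{Tr}[S(F_r^\alpha \Lambda_r^\alpha)] = \Lambda_r^\alpha \circ F_r^\alpha$; here, of course, $r = 1$ or $r = 2$. In view of Eqs. (5.15a) and (5.15b), using the results (5.21) and (A.6b), we show that
\[ s = \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ \Bigl[\Bigl(1 + \frac{\alpha}{3}\Bigr) F_1^\alpha - \frac{\alpha}{3}(\mathrm{Tr}\, F_1^{\alpha+2})\Bigr] + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ \Bigl[\Bigl(1 + \frac{\alpha}{3}\Bigr) F_2^\alpha - \frac{1}{3}(\alpha - 1)(\mathrm{Tr}\, F_2^{\alpha+2})\Bigr] , \tag{5.22a} \]
\[ \Phi = \sum_{\alpha=0}^{k} [(1 - \alpha)\Lambda_1^\alpha \circ F_1^{\alpha+1} + \alpha \Lambda_1^\alpha \circ F_1^{\alpha-1}] + \sum_{\alpha=0}^{l} [(2 - \alpha)\Lambda_2^\alpha \circ F_2^{\alpha+1} + \alpha \Lambda_2^\alpha \circ F_2^{\alpha-1}] . \tag{5.22b} \]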
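Also, given Eqs. (5.15) and (5.20), we obtain
\[ \dot s = \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ \dot F_1^\alpha + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ \dot F_2^\alpha , \tag{5.23a} \]
\[ \nabla_\perp \cdot \Phi = \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ (\nabla_\perp \cdot F_1^{\alpha+1}) + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ (\nabla_\perp \cdot F_2^{\alpha+1}) . \tag{5.23b} \]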
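With the aid of these formulas, it will be possible to prove the existence of the equation of balance for $s$.

5.3. Construction of the additional balance law

From the assumption that $f = F$, since $F = F(\Lambda, \bar p)$, we verify that
\[ f_r^\alpha = F_r^\alpha := \int (-\bar p \circ u)^r A^\alpha F \pi_m \tag{5.24a} \]
and
\[ P_r^\alpha = P_r^\alpha := \int (-\bar p \circ u)^{r-1} A^\alpha J(F) \pi_m \tag{5.24b} \]
are functions of $\Lambda$. Evidently, defining $D_r^\alpha$ by
\[ D_r^\alpha := (\dot F_r^\alpha)_\perp + \nabla_\perp \cdot F_r^{\alpha+1} + \alpha a \vee F_r^{\alpha-1} + \Bigl(1 + \frac{\alpha}{3}\Bigr)\theta F_r^\alpha + \alpha(\sigma - \omega) \sqcup F_r^\alpha + (r - \alpha) a \circ F_r^{\alpha+1} + (r - \alpha - 1)\Bigl[\sigma \circ F_r^{\alpha+2} + \frac{1}{3}\theta\, \mathrm{Tr}\, F_r^{\alpha+2}\Bigr] , \tag{5.25} \]
the differential equations for $\Lambda$ turn out to be
\[ D_1^\alpha = P_1^\alpha \quad (0 \le \alpha \le k) , \tag{5.26a} \]
\[ D_2^\alpha = P_2^\alpha \quad (0 \le \alpha \le l) . \tag{5.26b} \]
Before using these equations, it will be convenient to observe that
\[ \sigma \circ I = (\sigma - \omega) \circ I = 0 , \tag{5.27a} \]
\[ \Lambda_r^\alpha \circ (\dot F_r^\alpha)_\perp = \Lambda_r^\alpha \circ \dot F_r^\alpha , \tag{5.27b} \]
\[ \Lambda_r^\alpha \circ (\alpha a \vee F_r^{\alpha-1}) = a \circ (\alpha \Lambda_r^\alpha \circ F_r^{\alpha-1}) , \tag{5.27c} \]
\[ \Lambda_r^\alpha \circ (a \circ F_r^{\alpha+1}) = a \circ (\Lambda_r^\alpha \circ F_r^{\alpha+1}) , \tag{5.27d} \]
\[ \Lambda_r^\alpha \circ [(\sigma - \omega) \sqcup F_r^\alpha] = (\sigma - \omega) \circ (F_r^\alpha \Lambda_r^\alpha) , \tag{5.27e} \]
\[ \Lambda_r^\alpha \circ (\sigma \circ F_r^{\alpha+2}) = \Lambda_r^\alpha \circ [(\sigma - \omega) \circ F_r^{\alpha+2}] = (\sigma - \omega) \circ (\Lambda_r^\alpha \circ F_r^{\alpha+2}) , \tag{5.27f} \]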
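where $r = 1, 2$. With the notation
\[ P := \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ P_1^\alpha + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ P_2^\alpha , \tag{5.28} \]
the combination of Eqs. (5.18b), (5.22), (5.23), (5.26), and (5.27) gives the following formulas:
\[ P = \sum_{\alpha=0}^{k} \Lambda_1^\alpha \circ D_1^\alpha + \sum_{\alpha=0}^{l} \Lambda_2^\alpha \circ D_2^\alpha , \tag{5.29a} \]
\[ P = \dot s + \theta s + \nabla_\perp \cdot \Phi + a \circ \Phi - (\sigma - \omega) \circ Z_2 . \tag{5.29b} \]
Substitution of $Z_2 = 0$ and
\[ \dot s + \theta s + \nabla_\perp \cdot \Phi + a \circ \Phi = \nabla \cdot (s u + \Phi) \tag{5.30} \]
into Eq. (5.29b) then shows that
\[ \nabla \cdot \Psi = P , \tag{5.31} \]
where
\[ \Psi := s u + \Phi . \tag{5.32} \]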
Correspondence with Eq. (3.12) suggests the interpretation of Eq. (5.31) as the entropy law; this entropy law is satisfied if $\Lambda$ is such that Eqs. (5.26) are satisfied. The integral form of $P$ is
\[ P = -\int [dH(\bar F)/d\bar F]\, J(F) \pi_m , \tag{5.33} \]
where
\[ H(\bar F) := \bar F(|\eta| - 1 + \ln \bar F) - \eta(1 + \eta \bar F) \ln(1 + \eta \bar F) . \tag{5.34} \]
Proceeding as in the case of Eq. (3.13), we find that
\[ P \ge 0 , \tag{5.35} \]
the equality holding if and only if $F$ is the Maxwell–Jüttner distribution function, i.e. if and only if $k = 0$ and $l \le 1$.

We have so far concentrated on the situation when $m \neq 0$. However, by repeating and continuing the discussion given at the end of Sec. 4.2, it is a relatively simple matter to extend all our calculations in order to handle a gas composed of massless particles. Consequently, we will not study the case $m = 0$ further here; some interesting comments may be found, for example, in [35].
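The integrand of Eq. (5.33) can be made explicit by a short numerical check. The sketch below is illustrative only: it assumes $\eta \in \{-1, 0, 1\}$ (so that $|\eta| = \eta^2$) and takes $\bar F = [\exp(\kappa) - \eta]^{-1}$, with $\kappa$ standing for the exponent $\Lambda_\Omega A^\Omega$ of Eq. (6.36). Under these assumptions one expects $dH(\bar F)/d\bar F = -\kappa$. The function names are hypothetical and do not appear in the text.

```python
import math

def F_bar(kappa, eta):
    """Assumed form of F-bar: 1 / (exp(kappa) - eta), with eta in {-1, 0, 1}."""
    return 1.0 / (math.exp(kappa) - eta)

def H(Fb, eta):
    """H(F-bar) as written in Eq. (5.34)."""
    value = Fb * (abs(eta) - 1.0 + math.log(Fb))
    if eta != 0:
        value -= eta * (1.0 + eta * Fb) * math.log(1.0 + eta * Fb)
    return value

def dH_dF(Fb, eta, h=1e-6):
    """Central finite-difference approximation of dH/dF-bar."""
    return (H(Fb + h, eta) - H(Fb - h, eta)) / (2.0 * h)

# Compare dH/dF-bar with -kappa for Fermi (-1), Boltzmann (0) and Bose (+1) statistics.
for eta in (-1, 0, 1):
    for kappa in (0.5, 1.0, 2.0):
        print(eta, kappa, dH_dF(F_bar(kappa, eta), eta) + kappa)
```

Each printed residual should be close to zero, which is consistent with reading Eq. (5.33) as $P = \int \kappa\, J(F) \pi_m$ under the stated assumptions.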
6. Field Equations and Hyperbolicity

6.1. A normal vector field

In the framework of general relativity, space-time structure is described by a four-dimensional manifold, $X$, on which there is present a Lorentz metric, $g$. In addition, in general relativity $g$ is related to the matter distribution by Einstein's equations. However, the dynamics of $g$ will not concern us here. Thus, in the following (see especially Sec. 6.2), we shall view the space-time structure $(X, g)$ as being given, and focus exclusively upon the formulation of the theory in this fixed, curved background. It is useful to apply the standard technique of Arnowitt et al. [21] and decompose the metric tensor as
\[ g = (N^K N_K - N^2)\, dx^0 \otimes dx^0 + I_{KL}\, dx^K \otimes dx^L + N_K (dx^0 \otimes dx^K + dx^K \otimes dx^0) , \tag{6.1} \]
where $N$, $N^K$, and $I_{KL}$ are functions of $x^0$ and $x^K$ and where $N_K := I_{KL} N^L$. In the literature [36, 37], $N$ is called the lapse function ($N > 0$) and $N^K$ and $N_K$ are called the shift functions. With the help of Eq. (6.1), we obtain the explicit 3+1 decomposition of space-time into a family of three-dimensional spacelike hypersurfaces parametrized by the value of an arbitrarily chosen time coordinate $x^0$ ($t := x^0$). The natural metric induced on a typical equal-time hypersurface is simply
\[ {}^{(3)}g := I_{KL}\, dx^K \otimes dx^L \tag{6.2a} \]
and the contravariant form of ${}^{(3)}g$ is written as
\[ {}^{(3)}\tilde g := I^{KL} \frac{\partial}{\partial x^K} \otimes \frac{\partial}{\partial x^L} , \tag{6.2b} \]
where
\[ I^{KL} I_{LM} = \delta^K{}_M . \tag{6.3} \]
Introducing the functions
\[ X \ni x \mapsto e_k{}^K(x) \in \mathbb{R} \tag{6.4} \]
such that
\[ e_k{}^K e_l{}^L I_{KL} = \delta_{kl} , \tag{6.5} \]
we define an orthonormal tetrad basis $\{e_0, e_k\}$ by
\[ e_0 := N^{-1} \frac{\partial}{\partial x^0} - N^{-1} N^K \frac{\partial}{\partial x^K} , \tag{6.6a} \]
\[ e_k := e_k{}^K \frac{\partial}{\partial x^K} . \tag{6.6b} \]
After some algebra, we obtain
\[ e_0 \circ e_0 = g(e_0, e_0) = -1 , \tag{6.7a} \]
\[ e_0 \circ e_k = g(e_0, e_k) = 0 , \tag{6.7b} \]
\[ e_k \circ e_l = g(e_k, e_l) = \delta_{kl} . \tag{6.7c} \]
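The orthonormality relations (6.7) can be checked numerically at a single space-time point. The following sketch is illustrative and rests on assumptions that are not part of the text: the sample values of the lapse, shift, and three-metric are arbitrary, and the triad $e_k{}^K$ is obtained from the inverse Cholesky factor of $I_{KL}$, which is only one of many choices satisfying Eq. (6.5). All function names are hypothetical.

```python
import math

def cholesky3(I):
    """Lower-triangular L with I = L L^T for a symmetric positive-definite 3x3 matrix."""
    L = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(I[i][i] - s)
            else:
                L[i][j] = (I[i][j] - s) / L[j][j]
    return L

def lower_inverse3(L):
    """Inverse of a lower-triangular 3x3 matrix, column by column (forward substitution)."""
    inv = [[0.0] * 3 for _ in range(3)]
    for c in range(3):
        for r in range(3):
            rhs = 1.0 if r == c else 0.0
            s = sum(L[r][m] * inv[m][c] for m in range(r))
            inv[r][c] = (rhs - s) / L[r][r]
    return inv

def adm_metric(N, shift_up, I):
    """Components g_{mu nu} of Eq. (6.1) in the coordinates (x^0, x^K)."""
    shift_down = [sum(I[K][M] * shift_up[M] for M in range(3)) for K in range(3)]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = sum(shift_up[K] * shift_down[K] for K in range(3)) - N * N
    for K in range(3):
        g[0][K + 1] = g[K + 1][0] = shift_down[K]
        for M in range(3):
            g[K + 1][M + 1] = I[K][M]
    return g

def tetrad(N, shift_up, I):
    """Coordinate components of e_0 and e_k following Eqs. (6.6a) and (6.6b)."""
    E = lower_inverse3(cholesky3(I))  # rows are e_k^K, so that E I E^T = identity
    e0 = [1.0 / N] + [-s / N for s in shift_up]
    return [e0] + [[0.0] + E[k] for k in range(3)]

def gram(g, frame):
    """Matrix of inner products g(e_a, e_b)."""
    return [[sum(g[m][n] * ea[m] * eb[n] for m in range(4) for n in range(4))
             for eb in frame] for ea in frame]

# Arbitrary sample data (assumed for illustration only).
N = 1.7
shift_up = [0.4, -0.2, 0.1]
I = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.2]]
G = gram(adm_metric(N, shift_up, I), tetrad(N, shift_up, I))
for row in G:
    print([round(v, 12) for v in row])
```

Up to rounding, the printed matrix should be the Minkowski form with diagonal entries $-1, 1, 1, 1$, in agreement with Eqs. (6.7a)–(6.7c).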
It will be of interest to choose $u$ so that
\[ u := e_0 . \tag{6.8} \]
Since $\omega$ vanishes when $u := e_0$,
\[ \omega = 0 , \tag{6.9} \]
there exists a family of three-surfaces everywhere orthogonal to $u$; these are instantaneous surfaces of simultaneity for all the observers associated with $u$. Given the above remarks, we are justified in saying that $u$ is the normal vector field. Let $\{e^0, e^k\}$ be a basis dual to $\{e_0, e_k\}$; thus
\[ e^0 := N\, dx^0 , \tag{6.10a} \]
\[ e^k := e^k{}_K (N^K dx^0 + dx^K) , \tag{6.10b} \]
where the functions
\[ X \ni x \mapsto e^k{}_K(x) \in \mathbb{R} \tag{6.11} \]
are related to $e_k{}^K$ by
\[ e^k{}_M e_l{}^M = \delta^k{}_l , \qquad e^m{}_K e_m{}^L = \delta_K{}^L . \tag{6.12} \]
If $\tilde\nabla_K$ is the covariant derivative based on $I_{KL}$, we can express $\theta$ as
\[ \theta = \frac{1}{2N} I^{KL} (I_{KL,0} - 2\tilde\nabla_K N_L) , \tag{6.13} \]
where $I_{KL,0} := \partial I_{KL}/\partial x^0 = \partial I_{KL}/\partial t$. The objects $a$ and $\sigma$ are characterized by
\[ a = a^k e_k , \qquad \sigma = \sigma^{kl} e_k \otimes e_l , \tag{6.14} \]
where
\[ a^k = \frac{1}{N} e^k{}_K I^{KL} \tilde\nabla_L N , \tag{6.15a} \]
\[ \sigma^{kl} = \frac{1}{2N} e^k{}_K e^l{}_L \Bigl(I^{KM} I^{LN} - \frac{1}{3} I^{KL} I^{MN}\Bigr) (I_{MN,0} - \tilde\nabla_M N_N - \tilde\nabla_N N_M) . \tag{6.15b} \]
For any orthonormal tetrad basis, the rotation coefficients
\[ \Gamma_{abc} := g_{ad} \Gamma^d{}_{bc} \tag{6.16} \]
are skew in $a$ and $b$:
\[ \Gamma_{abc} = -\Gamma_{bac} . \tag{6.17} \]
For the particular basis considered here, we also have
\[ \Gamma^0{}_{k0} = \delta_{kl} a^l , \qquad \Gamma_{k0l} = \frac{1}{3}\theta\, \delta_{kl} + \delta_{km} \delta_{ln} \sigma^{mn} , \tag{6.18a} \]
\[ \Gamma^k{}_{lm} = e^k{}_K e_m{}^M \Bigl( e_l{}^L \Bigl\{ {K \atop LM} \Bigr\} + \partial_M e_l{}^K \Bigr) = e^k{}_K e_m{}^M \tilde\nabla_M e_l{}^K , \tag{6.18b} \]
where
\[ \partial_L := \partial/\partial x^L , \tag{6.19a} \]
\[ \Bigl\{ {K \atop LM} \Bigr\} := \frac{1}{2} I^{KN} (\partial_L I_{NM} + \partial_M I_{NL} - \partial_N I_{LM}) . \tag{6.19b} \]
Clearly, $\bigl\{ {K \atop LM} \bigr\}$ are the Christoffel symbols determined by $I_{KL}$.

One important thing should be noted about the meaning of our construction. In this section, we have assumed ab initio that $X$ can be foliated by a one-parameter family of three-surfaces $\Sigma_t$, i.e. that a time coordinate $t$ can be chosen such that a surface of constant $t$ is $\Sigma_t$. If $(x^K)$ are local coordinates on $\Sigma_t$ for each $t$, then the metric $g$ takes the form (6.1). For any given foliation of $X$, it is possible to define $\{e_0, e_k\}$ by Eqs. (6.6a) and (6.6b). Consequently, there is no reason to believe that the normal vector fields corresponding to different foliations of $X$ will agree. However, in the context of cosmology [36], there will always be a preferred foliation of $X$. Using this foliation, we obtain a preferred family of world lines (the fundamental world lines) representing the motion of typical observers in the universe (fundamental observers). The normal vector $u$ is tangent to these world lines.

6.2. Symmetric hyperbolic systems

Let $M^\alpha$ be a tensor of type $(\alpha, 0)$. If $\{e_a\}$ and $u$ are defined as in Sec. 6.1 and $(M^\alpha)_\perp = M^\alpha$, then it is easy to verify that
\[ M^\alpha = (M^\alpha)^{i_1 i_2 \cdots i_\alpha} e_{i_1 i_2 \cdots i_\alpha} , \tag{6.20} \]
where
\[ e_{i_1 i_2 \cdots i_\alpha} := e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_\alpha} , \tag{6.21a} \]
\[ (M^\alpha)^{i_1 i_2 \cdots i_\alpha} := \delta^{i_1 j_1} \delta^{i_2 j_2} \cdots \delta^{i_\alpha j_\alpha} (M^\alpha \circ e_{j_1 j_2 \cdots j_\alpha}) . \tag{6.21b} \]
We now simplify the notation by setting $\Omega_\alpha := i_1 i_2 \cdots i_\alpha$; hence we obtain
\[ M^\alpha = (M^\alpha)^{\Omega_\alpha} e_{\Omega_\alpha} . \tag{6.22} \]
On the understanding that $m_1 := k$ and $m_2 := l$, the form of Eqs. (5.26) we are interested in is
\[ (D_r^\alpha)^{\Omega_\alpha} = (P_r^\alpha)^{\Omega_\alpha} \quad (0 \le \alpha \le m_r ;\ r = 1, 2) . \tag{6.23} \]
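Our basic purpose here is to write Eqs. (6.23) as a quasilinear symmetric hyperbolic system. To see this, we first introduce a number of useful formulas. Let $\nabla_a$ be a covariant derivative based on $\Gamma^a{}_{bc}$, and define $\partial_a$ by
\[ \partial_0 := N^{-1} \partial_t - N^{-1} N^K \partial_K , \tag{6.24a} \]
\[ \partial_k := e_k{}^K \partial_K , \tag{6.24b} \]
where $\partial_t := \partial/\partial x^0 = \partial/\partial t$. Abbreviating $(F_r^\alpha)^{\Omega_\alpha}$ as $F_r^{\Omega_\alpha}$, we can show that
\[ ((\dot F_r^\alpha)_\perp)^{\Omega_\alpha} = \nabla_0 F_r^{\Omega_\alpha} = \partial_0 F_r^{\Omega_\alpha} + \alpha\, \Gamma^{(i_1}{}_{j0} F_r^{\Omega_{\alpha-1})j} = \frac{1}{N} \partial_t F_r^{\Omega_\alpha} - \frac{1}{N} N^K \partial_K F_r^{\Omega_\alpha} + \alpha\, \Gamma^{(i_1}{}_{j0} F_r^{\Omega_{\alpha-1})j} , \tag{6.25a} \]
\[ (\nabla_\perp \cdot F_r^{\alpha+1})^{\Omega_\alpha} = \nabla_k F_r^{\Omega_\alpha k} = \partial_k F_r^{\Omega_\alpha k} + \Gamma^k{}_{jk} F_r^{\Omega_\alpha j} + \alpha\, \Gamma^{(i_1}{}_{jk} F_r^{\Omega_{\alpha-1})jk} = e_k{}^K \partial_K F_r^{\Omega_\alpha k} + \Gamma^k{}_{jk} F_r^{\Omega_\alpha j} + \alpha\, \Gamma^{(i_1}{}_{jk} F_r^{\Omega_{\alpha-1})jk} , \tag{6.25b} \]
where $\Omega_{\alpha-1} := i_2 i_3 \cdots i_\alpha$. At this stage, it is natural to consider the following objects:
\[ \Phi_*^k := \delta^{ki} (\Phi_* \circ e_i) , \tag{6.26a} \]
\[ \Lambda^r_{\Omega_\alpha} := \Lambda_r^\alpha \circ e_{\Omega_\alpha} \quad (0 \le \alpha \le m_r ;\ r = 1, 2) . \tag{6.26b} \]
Clearly, $s_*$ and $\Phi_*^k$ are functions of $\Lambda^r_{\Omega_\alpha}$, and the results established in Sec. 5.2 and in the Appendix allow us to prove that
\[ F_r^{\Omega_\alpha} = -\frac{\partial s_*}{\partial \Lambda^r_{\Omega_\alpha}} , \qquad F_r^{\Omega_\alpha k} = -\frac{\partial \Phi_*^k}{\partial \Lambda^r_{\Omega_\alpha}} . \tag{6.27} \]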
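The identities (6.27) rest on a pointwise relation that is used in Eqs. (A.4a) and (A.4b): the function $R := (1 - |\eta|)F + \eta y \ln(1 + \eta \bar F)$ of Eq. (A.3) satisfies $\partial R/\partial \kappa = -F$, where $\kappa$ denotes the exponent in $F = y[\exp(\kappa) - \eta]^{-1}$. The sketch below checks this by finite differences. It assumes $\eta \in \{-1, 0, 1\}$ and an arbitrary sample value of $y$; the function names are hypothetical.

```python
import math

def F(kappa, eta, y):
    """F = y / (exp(kappa) - eta), cf. Eq. (6.36) with kappa in place of the exponent."""
    return y / (math.exp(kappa) - eta)

def R(kappa, eta, y):
    """R = (1 - |eta|) F + eta y ln(1 + eta F-bar), cf. Eq. (A.3), with F-bar = F / y."""
    Fb = F(kappa, eta, y) / y
    return (1.0 - abs(eta)) * F(kappa, eta, y) + eta * y * math.log(1.0 + eta * Fb)

def dR_dkappa(kappa, eta, y, h=1e-6):
    """Central finite-difference approximation of dR/dkappa."""
    return (R(kappa + h, eta, y) - R(kappa - h, eta, y)) / (2.0 * h)

# Residual of dR/dkappa + F for the three admissible values of eta (y is an arbitrary sample).
y = 2.5
for eta in (-1, 0, 1):
    for kappa in (0.5, 1.0, 2.0):
        print(eta, kappa, dR_dkappa(kappa, eta, y) + F(kappa, eta, y))
```

Since $s_* = \int R\, dp$ in the adapted tetrad and $\kappa$ is linear in the components $\Lambda^r_{\Omega_\alpha}$, differentiating under the integral sign with this relation reproduces the minus signs in Eq. (6.27).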
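By combining Eqs. (6.25) and (6.27), we get
\[ ((\dot F_r^\alpha)_\perp + \nabla_\perp \cdot F_r^{\alpha+1})^{\Omega_\alpha} = \frac{\partial^2 T}{\partial \Lambda^s_{\Omega_\beta} \partial \Lambda^r_{\Omega_\alpha}} \partial_t \Lambda^s_{\Omega_\beta} + \frac{\partial^2 T^K}{\partial \Lambda^s_{\Omega_\beta} \partial \Lambda^r_{\Omega_\alpha}} \partial_K \Lambda^s_{\Omega_\beta} + \alpha\, \Gamma^{(i_1}{}_{j0} F_r^{\Omega_{\alpha-1})j} + \Gamma^k{}_{jk} F_r^{\Omega_\alpha j} + \alpha\, \Gamma^{(i_1}{}_{jk} F_r^{\Omega_{\alpha-1})jk} , \tag{6.28} \]
where
\[ T := -\frac{1}{N} s_* , \qquad T^K := \frac{1}{N} N^K s_* - e_k{}^K \Phi_*^k . \tag{6.29} \]
In Eq. (6.28), we adopt the summation convention whereby a repeated index implies summation over all values of that index.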
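With the notation
\[ C_r^{\Omega_\alpha} := (P_r^\alpha - D_r^\alpha + (\dot F_r^\alpha)_\perp + \nabla_\perp \cdot F_r^{\alpha+1})^{\Omega_\alpha} - \alpha\, \Gamma^{(i_1}{}_{j0} F_r^{\Omega_{\alpha-1})j} - \Gamma^k{}_{jk} F_r^{\Omega_\alpha j} - \alpha\, \Gamma^{(i_1}{}_{jk} F_r^{\Omega_{\alpha-1})jk} , \tag{6.30} \]
Eqs. (6.23) become
\[ \frac{\partial^2 T}{\partial \Lambda^s_{\Omega_\beta} \partial \Lambda^r_{\Omega_\alpha}} \partial_t \Lambda^s_{\Omega_\beta} + \frac{\partial^2 T^K}{\partial \Lambda^s_{\Omega_\beta} \partial \Lambda^r_{\Omega_\alpha}} \partial_K \Lambda^s_{\Omega_\beta} = C_r^{\Omega_\alpha} . \tag{6.31} \]
Note that since
\[ D_r^\alpha - (\dot F_r^\alpha)_\perp - \nabla_\perp \cdot F_r^{\alpha+1} = \alpha a \vee F_r^{\alpha-1} + \Bigl(1 + \frac{\alpha}{3}\Bigr)\theta F_r^\alpha + \alpha\, \sigma \sqcup F_r^\alpha + (r - \alpha) a \circ F_r^{\alpha+1} + (r - \alpha - 1)\Bigl[\sigma \circ F_r^{\alpha+2} + \frac{1}{3}\theta\, \mathrm{Tr}\, F_r^{\alpha+2}\Bigr] , \tag{6.32} \]
the quantity $C_r^{\Omega_\alpha}$ does not involve the temporal and spatial gradients of $\Lambda^s_{\Omega_\beta}$ ($0 \le \beta \le m_s$; $s = 1, 2$).

In order to make more transparent the dynamical structure of Eqs. (6.31), it will be helpful to introduce $\Lambda_\Omega$ to represent the entire collection of dynamical variables: $\Lambda_\Omega := (\Lambda^r_{\Omega_\alpha})$. Similarly, we introduce $C^\Omega$ to represent the source terms: $C^\Omega := (C_r^{\Omega_\alpha})$. Define $T^{\Omega\Gamma}$ and $T^{K\Omega\Gamma}$ by
\[ T^{\Omega\Gamma} := \frac{\partial^2 T}{\partial \Lambda_\Omega \partial \Lambda_\Gamma} , \qquad T^{K\Omega\Gamma} := \frac{\partial^2 T^K}{\partial \Lambda_\Omega \partial \Lambda_\Gamma} . \tag{6.33} \]
Then Eqs. (6.31) may be written as
\[ T^{\Omega\Gamma} \partial_t \Lambda_\Gamma + T^{K\Omega\Gamma} \partial_K \Lambda_\Gamma = C^\Omega . \tag{6.34} \]
This first-order system of equations for $\Lambda_\Gamma$ is called symmetric because $T^{\Omega\Gamma}$ and $T^{K\Omega\Gamma}$ are symmetric in the indices $\Omega$ and $\Gamma$ (a consequence, in turn, of the fact that partial derivatives commute).

Finally, we obtain an explicit formula for $T^{\Omega\Gamma}$. Let
\[ A^{\Omega_\alpha} := (A^\alpha)^{\Omega_\alpha} = (\otimes^\alpha j)^{\Omega_\alpha} . \tag{6.35} \]
It follows from Eqs. (5.8) and (5.9) that
\[ F = y[\exp(\Lambda_\Omega A^\Omega) - \eta]^{-1} , \tag{6.36} \]
where
\[ \Lambda_\Omega A^\Omega := \sum_{\alpha=0}^{k} \Lambda^1_{\Omega_\alpha} A^{\Omega_\alpha} + p^0 \sum_{\alpha=0}^{l} \Lambda^2_{\Omega_\alpha} A^{\Omega_\alpha} . \tag{6.37} \]
Using Eqs. (5.14), (5.16), (A.1), (A.2), (6.29), and (6.33) yields
\[ T^{\Omega\Gamma} = -\frac{y}{N} \int A^\Omega A^\Gamma \bar F (1 + \eta \bar F)\, dp . \tag{6.38} \]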
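The structure of Eq. (6.38) is that of a weighted Gram matrix with an overall minus sign, and this can be illustrated with a deliberately simplified toy model. The sketch below is not the three-dimensional integral of the text: it assumes a one-dimensional momentum variable, the hypothetical exponent $\kappa = 0.5 + p^0$, unit mass, and the four sample functions $1$, $j$, $p^0$, $p^0 j$ with $j = p/p^0$ in place of the full collection $A^\Omega$. All names are illustrative.

```python
import math

def weight(p, eta, m=1.0):
    """Positive weight F-bar (1 + eta F-bar) for the assumed exponent kappa = 0.5 + p0."""
    p0 = math.sqrt(m * m + p * p)
    Fb = 1.0 / (math.exp(0.5 + p0) - eta)
    return Fb * (1.0 + eta * Fb)

def basis(p, m=1.0):
    """Toy stand-in for the components A^Omega: (1, j, p0, p0 j) with j = p / p0."""
    p0 = math.sqrt(m * m + p * p)
    j = p / p0
    return [1.0, j, p0, p0 * j]

def T_matrix(eta, y=1.0, N=1.0, pmax=30.0, steps=6000):
    """Trapezoidal approximation of -(y/N) * integral of A A^T F-bar (1 + eta F-bar) dp."""
    n = 4
    T = [[0.0] * n for _ in range(n)]
    h = 2.0 * pmax / steps
    for s in range(steps + 1):
        p = -pmax + s * h
        w = weight(p, eta) * h * (0.5 if s in (0, steps) else 1.0)
        A = basis(p)
        for a in range(n):
            for b in range(n):
                T[a][b] -= (y / N) * A[a] * A[b] * w
    return T

def is_negative_definite(T):
    """Try a Cholesky factorization of -T; it succeeds exactly when T is negative definite."""
    n = len(T)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = -T[i][i] - s
                if d <= 0.0:
                    return False
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (-T[i][j] - s) / L[j][j]
    return True

for eta in (-1, 0, 1):
    print(eta, is_negative_definite(T_matrix(eta)))
```

In this toy setting the matrix is negative definite because the weight is strictly positive and the sample functions are linearly independent, which is the same mechanism that operates in the full three-dimensional case.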
If $X_\Omega$ is an arbitrary nonvanishing abstract vector field, then this formula for $T^{\Omega\Gamma}$ implies that
\[ T^{\Omega\Gamma} X_\Omega X_\Gamma = -\frac{y}{N} \int (A^\Omega X_\Omega)^2 \bar F (1 + \eta \bar F)\, dp < 0 . \tag{6.39} \]
Thus the symmetric matrix $[T^{\Omega\Gamma}]$ is negative definite and the system (6.34) is symmetric hyperbolic in an open set of states.

We end this sub-section with the following remark. Let
\[ G_r^\alpha := (\dot F_r^\alpha)_\perp + \nabla_\perp \cdot F_r^{\alpha+1} - D_r^\alpha + P_r^\alpha , \tag{6.40} \]
and focus again upon the formulas of Sec. 5.3. By inspection of Eq. (5.25) with $\omega = 0$, it appears that $G_r^\alpha$ is a function of the unknowns $\Lambda_1^0, \ldots, \Lambda_1^k, \Lambda_2^0, \ldots, \Lambda_2^l$ considered as independent dynamical variables. Because of this property of $G_r^\alpha$, a differential equation of the form
\[ (\dot F_r^\alpha)_\perp + \nabla_\perp \cdot F_r^{\alpha+1} = G_r^\alpha \quad (0 \le \alpha \le m_r ;\ r = 1, 2) \tag{6.41} \]
will be called a conservation equation; it will be called a proper conservation equation if $G_r^\alpha = 0$. (By way of digression, $G_r^\alpha$ equals 0 if and only if $P_r^\alpha = 0$ and $g = -dt \otimes dt + \delta_{KL}\, dx^K \otimes dx^L$.) Mathematically, the procedure adopted here was to start with the transformations obtained by taking the identities (6.27), which arise from the fact that the normal vector field $u$ is not a dynamical variable, and subsequently to show how the conservation equations (6.41) can be reduced to the symmetric hyperbolic system (6.34). In the text below, we will explain why these methods are of little value when we turn to the Eckart and Landau–Lifshitz definitions of $u$.

6.3. Comments on the Eckart and Landau–Lifshitz alternatives

Usually, instead of considering a normal vector field, one agrees to consider the Eckart and Landau–Lifshitz alternatives. Define first a local observer in a simple gas who is at rest with respect to the average motion of the particles. His four-velocity $u$ is such that the number flux vector $N$, given in Eqs. (4.4b), vanishes:
\[ N := f_1^1 := \int (-\bar p \circ u) A^1 f \pi_m = 0 . \tag{6.42} \]
This choice of $u$ is the basis of the relativistic formalism constructed by Eckart [24]. Combining $N = 0$ with $f = F(\Lambda, \bar p)$ yields the following condition for $u$ and $\Lambda$:
\[ \int (-\bar p \circ u) A^1 F(\Lambda, \bar p) \pi_m = 0 . \tag{6.43} \]
We pass now to a second observer; the four-velocity $u$ of this observer is defined by the requirement that he measures no heat flux $q$ in his rest frame:
\[ q := f_2^1 := \int (-\bar p \circ u)^2 A^1 f \pi_m = 0 . \tag{6.44} \]
A formulation of the theory based on $q = 0$ is due to Landau and Lifshitz [25]. Then the corresponding condition for $u$ and $\Lambda$ takes the form
\[ \int (-\bar p \circ u)^2 A^1 F(\Lambda, \bar p) \pi_m = 0 . \tag{6.45} \]
In both these options, the four-velocity vector field $u$ must be coopted as an extra dynamical variable on the same footing as the dynamical variables in $\Lambda$. Thus, if $N = 0$, the full set of equations for $u$ and $\Lambda$ consists of Eqs. (6.43) and (5.26). If $q = 0$, this set is given by Eqs. (6.45) and (5.26). For $m \neq 0$, there are
\[ m_T := 3 + \frac{1}{6} \sum_{r=1}^{2} (m_r + 1)(m_r + 2)(m_r + 3) \]
independent scalar variables in $u$ and $\Lambda$ and the total number of independent scalar equations in the aforementioned sets is also $m_T$. In other words, these sets provide an appropriate number of equations to determine $u$ and $\Lambda$. Clearly, Eqs. (6.41) are still valid, but $u$ is now the dynamical variable and we must observe that $G_r^\alpha$ is a function of $\Lambda$ and $\nabla u$. From this observation it is possible to come to the very important conclusion: for the Eckart and Landau–Lifshitz definitions of $u$, Eqs. (6.41) do not form a system of conservation equations. Also, since $u$ and
\[ \Lambda := (\Lambda_1^0, \ldots, \Lambda_1^k, \Lambda_2^0, \ldots, \Lambda_2^l) \tag{6.46} \]
satisfy the additional condition [see, e.g., Eq. (6.43)], we cannot use $s_*$ and $\Phi_*$ to calculate
\[ F := (F_1^0, \ldots, F_1^{k+1}, F_2^0, \ldots, F_2^{l+1}) \tag{6.47} \]
by means of the formulas similar to those obtained in Sec. 6.2 [see Eqs. (6.27)]. Owing to the above complications, little progress has been made in deriving a hierarchy of symmetric hyperbolic systems of partial differential equations on the basis of these ideas. Nevertheless, as Friedrich [38] and Elst and Ellis [39] first showed, when the matter source for a space-time geometry is described as a perfect fluid ($k = l = 0$), upon introduction of a set of local coordinates and a specific choice to remove the gauge fixing freedom it is possible to find linear combinations of the field variables and their dynamical equations that lead to an evolution system of (autonomous) partial differential equations in the (first-order) symmetric hyperbolic format. Also, in the case $k = l - 2 = 0$, Eq. (6.44) implies $\Lambda_2^1 = 0$ and then the Landau–Lifshitz definition of $u$ can be effectively used. It is clear that, within this approach, the dynamical variables are the Landau–Lifshitz vector field $u$, the number density $n$, the energy per unit volume, the pressure $p$, and the stress deviator $\Pi$. Equivalently, we can consider $(u, \Lambda_1^0, \Lambda_2^0, \Lambda_2^2)$ as dynamical variables in place of these hydrodynamic quantities. These and similar issues, especially the explicit solution of the variational problem in terms of elliptic integrals of the first and second kind (for the case of a gas of massless particles), will be discussed in more detail in a separate paper.
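The count $m_T$ quoted in this section can be cross-checked by enumerating components. The sketch below assumes, as is not stated explicitly in this section, that each $\Lambda_r^\alpha$ is a completely symmetric tensor of rank $\alpha$ orthogonal to $u$, so that it has $(\alpha + 1)(\alpha + 2)/2$ independent components, and that $u$ contributes three independent components. The function names are hypothetical.

```python
import math

def symmetric_components(rank, dim=3):
    """Independent components of a completely symmetric rank-`rank` tensor in `dim` dimensions."""
    return math.comb(rank + dim - 1, dim - 1)

def lambda_components(m):
    """Total number of components of the symmetric tensors of ranks 0, 1, ..., m."""
    return sum(symmetric_components(alpha) for alpha in range(m + 1))

def m_T_by_counting(k, l):
    """Three components of u plus the components of all Lambda_1^alpha and Lambda_2^alpha."""
    return 3 + lambda_components(k) + lambda_components(l)

def m_T_closed_form(k, l):
    """Closed form quoted in the text: 3 + (1/6) * sum over r of (m_r+1)(m_r+2)(m_r+3)."""
    return 3 + sum((m + 1) * (m + 2) * (m + 3) for m in (k, l)) // 6

for k in range(4):
    for l in range(4):
        print(k, l, m_T_by_counting(k, l), m_T_closed_form(k, l))
```

The two columns agree because $\sum_{\alpha=0}^{m} \binom{\alpha+2}{2} = \binom{m+3}{3}$; under the stated assumptions the perfect-fluid case $k = l = 0$ gives five scalar unknowns.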
7. Final Remarks

Whether a description of $f$ in terms of covariant or central moments is preferable must be decided individually for each problem. The covariant moments have the advantage of yielding simpler equations. The central moments are clearly of great help in demonstrating that the closure by entropy maximization gives rise to hyperbolic systems of differential equations, i.e. equations in which there is a finite speed of propagation. For a gas of particles with $m \neq 0$, the initial information appropriate to these differential equations is intermediate between a specification of only standard hydrodynamic variables (such as the number density $n$ and the energy per unit volume) and a specification of $f$. Consequently, the choice of the integers $k$ and $l$ in Eq. (5.8) depends on the degree of approximation which is desired. In particular, if a few central moments in $f_1 := \{f_1^\alpha \mid 0 \le \alpha < \infty\}$ and $f_2 := \{f_2^\alpha \mid 0 \le \alpha < \infty\}$ are taken, the expression (5.22a) is most naturally interpreted as an entropy of extended thermodynamics [13]. If many central moments are taken, it is more natural to consider this expression as an approximation to the Boltzmann entropy itself, but the line of demarcation is to some extent arbitrary.

Surprisingly, the above observations do not apply to the case of a gas of massless particles. An important role of the condition $m \neq 0$ occurs in connection with the following fact: a knowledge of $f_1$ and $f_2$ is equivalent to a knowledge of $f$ if and only if $m \neq 0$. In a sense, the situation becomes more complex when we assume that $m = 0$. Then, in the formalism of central moments, the determination of $f$ requires calculating $f_r := \{f_r^\alpha \mid 0 \le \alpha < \infty\}$ for all values of $r$ ($r = 1, 2, \ldots$). At first thought, it seems reasonable to modify our previous construction of $F$ by simply using the constraints based on a wider class of central moments. However, as noted already at the end of Sec. 5.1, this modification does not guarantee that $M_E \subset M_0$.

To sum up, further investigation of these issues may well represent our best opportunity to gain further insight into the nature of the method of maximum entropy for relativistic gases, and such investigations are presently being pursued.

Appendix A. Derivation of the Identities $Z_1 = 0$ and $Z_2 = 0$

Let us explain the identities (5.19) in more detail. Given a four-velocity vector field $u$, there clearly exists a preferred family of orthonormal tetrad bases $\{e_a\}$ associated with $u$, namely, those in which the timelike vector $e_0$ of the basis $\{e_a\}$ is chosen to be parallel to $u$. For such a tetrad basis, we have
\[ u = e_0 , \qquad \bar p \circ u = -p^0 , \qquad \pi_m = \frac{1}{p^0}\, dp , \tag{A.1a} \]
\[ j = \frac{1}{p^0} p^i e_i = \delta^{ik} \frac{\partial p^0}{\partial p^k} e_i , \tag{A.1b} \]
\[ I = \delta^{ij} e_i \otimes e_j = \frac{\partial p^i}{\partial p^k} \delta^{kj} e_i \otimes e_j , \tag{A.1c} \]
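where
\[ p^0 := \sqrt{m^2 + (p^1)^2 + (p^2)^2 + (p^3)^2} , \tag{A.2a} \]
\[ dp := dp^1 \wedge dp^2 \wedge dp^3 . \tag{A.2b} \]
Setting
\[ R := (1 - |\eta|) F + \eta y \ln(1 + \eta \bar F) \tag{A.3} \]
and using Eqs. (5.16) and (A.1), we further observe that
\[ s_* I = \int R\, dp\, (\delta^{ij} e_i \otimes e_j) = \int R \frac{\partial p^i}{\partial p^k}\, dp\, (\delta^{kj} e_i \otimes e_j) = -\int p^i \frac{\partial R}{\partial p^k}\, dp\, (\delta^{kj} e_i \otimes e_j) = \int p^i F \frac{\partial \kappa}{\partial p^k}\, dp\, \delta^{kj} e_i \otimes e_j \tag{A.4a} \]
and
\[ \Phi_* = \int \frac{1}{p^0} p^i R\, dp\, e_i = \delta^{ik} \int \frac{\partial p^0}{\partial p^k} R\, dp\, e_i = -\delta^{ik} \int p^0 \frac{\partial R}{\partial p^k}\, dp\, e_i = \delta^{ik} \int p^0 F \frac{\partial \kappa}{\partial p^k}\, dp\, e_i , \tag{A.4b} \]
where $\kappa$ is defined by Eq. (5.8). Since
\[ p^i \frac{\partial \kappa}{\partial p^k} (\delta^{kj} e_i \otimes e_j) = \sum_{\alpha=0}^{k} [\alpha(A^\alpha \Lambda_1^\alpha - \Lambda_1^\alpha \circ A^{\alpha+2})] + (-\bar p \circ u) \sum_{\alpha=0}^{l} [\alpha A^\alpha \Lambda_2^\alpha - (\alpha - 1)\Lambda_2^\alpha \circ A^{\alpha+2}] \tag{A.5a} \]
and
\[ \delta^{ik} p^0 \frac{\partial \kappa}{\partial p^k} e_i = \sum_{\alpha=0}^{k} [\alpha(\Lambda_1^\alpha \circ A^{\alpha-1} - \Lambda_1^\alpha \circ A^{\alpha+1})] + (-\bar p \circ u) \sum_{\alpha=0}^{l} [\alpha \Lambda_2^\alpha \circ A^{\alpha-1} - (\alpha - 1)\Lambda_2^\alpha \circ A^{\alpha+1}] , \tag{A.5b} \]
Eqs. (A.4a) and (A.4b) can be written as
\[ s_* I = \sum_{\alpha=0}^{k} [\alpha(F_1^\alpha \Lambda_1^\alpha - \Lambda_1^\alpha \circ F_1^{\alpha+2})] + \sum_{\alpha=0}^{l} [\alpha F_2^\alpha \Lambda_2^\alpha - (\alpha - 1)\Lambda_2^\alpha \circ F_2^{\alpha+2}] \tag{A.6a} \]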
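and
\[ \Phi_* = \sum_{\alpha=0}^{k} [\alpha(\Lambda_1^\alpha \circ F_1^{\alpha-1} - \Lambda_1^\alpha \circ F_1^{\alpha+1})] + \sum_{\alpha=0}^{l} [\alpha \Lambda_2^\alpha \circ F_2^{\alpha-1} - (\alpha - 1)\Lambda_2^\alpha \circ F_2^{\alpha+1}] . \tag{A.6b} \]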
From these formulas for $s_* I$ and $\Phi_*$, recalling the definitions (5.18a) and (5.18b) of $Z_1$ and $Z_2$, we obtain the required identities $Z_1 = 0$ and $Z_2 = 0$.

References

[1] J. Ehlers, General relativity and kinetic theory, in General Relativity and Cosmology, ed. B. K. Sachs, Academic Press, New York, 1971, pp. 1–70.
[2] W. Israel, in General Relativity, ed. L. O'Raifeartaigh, Clarendon Press, Oxford, 1972, p. 201.
[3] N. A. Chernikov, Acta Phys. Pol. 23 (1963) 629.
[4] C. Marle, Ann. Inst. Henri Poincaré Phys. Théor. 10 (1969) 127.
[5] N. G. van Kampen, J. Stat. Phys. 46 (1987) 709.
[6] J. L. Anderson and E. A. Spiegel, Astrophys. J. 171 (1972) 127.
[7] K. S. Thorne, Mon. Not. R. Astr. Soc. 194 (1981) 439.
[8] G. F. R. Ellis, D. R. Matravers and R. Treciokas, Ann. Phys. (NY) 150 (1983) 455.
[9] G. F. R. Ellis, R. Treciokas and D. R. Matravers, Ann. Phys. (NY) 150 (1983) 487.
[10] H. Struchtrup, Physica A 253 (1998) 555.
[11] J. L. Anderson and H. R. Witting, Physica 74 (1974) 466.
[12] W. Israel and J. M. Stewart, Ann. Phys. (NY) 118 (1979) 341.
[13] I. Müller and T. Ruggeri, Rational Extended Thermodynamics, Springer-Verlag, New York, 1998.
[14] Z. Banach, Physica A 275 (2000) 405.
[15] Z. Banach and W. Larecki, Physica A 293 (2001) 485.
[16] W. Dreyer, J. Phys. A: Math. Gen. 20 (1987) 6505.
[17] C. D. Levermore, J. Stat. Phys. 83 (1996) 1021.
[18] Z. Banach and S. Piekarski, Nuovo Cimento D 15 (1993) 1087.
[19] H. Bidar, D. Jou and M. Criado-Sancho, Physica A 233 (1996) 163.
[20] C. D. Levermore and W. J. Morokoff, SIAM J. Appl. Math. 59 (1998) 72.
[21] R. Arnowitt, S. Deser and C. W. Misner, The dynamics of general relativity, in Gravitation: An Introduction to Current Research, ed. L. Witten, Wiley, New York, 1962, p. 227.
[22] K. O. Friedrichs and P. D. Lax, Proc. Nat. Acad. Sci. USA 68 (1971) 1686.
[23] A. E. Fischer and J. E. Marsden, Commun. Math. Phys. 28 (1972) 1.
[24] C. Eckart, Phys. Rev. 58 (1940) 919.
[25] L. Landau and E. M. Lifshitz, Fluid Mechanics, Pergamon, Oxford, 1959.
[26] Z. Banach and S. Piekarski, J. Math. Phys. 30 (1989) 1804.
[27] G. Boillat and T. Ruggeri, Continuum Mech. Thermodyn. 11 (1999) 107.
[28] R. Abraham, J. E. Marsden and T. Ratiu, Manifolds, Tensor Analysis, and Applications, Springer-Verlag, New York, 1988.
[29] W. Israel, Covariant fluid mechanics and thermodynamics: an introduction, in Lectures delivered at C.I.M.E. Session on Relativistic Fluid Dynamics, eds. A. M. Anile and Y. Choquet-Bruhat, Centro Studi Noto, Noto (Syracuse), 1987, pp. 152–210.
[30] S. R. de Groot, W. A. van Leeuwen and Ch. G. van Weert, Relativistic Kinetic Theory, North-Holland, Amsterdam, 1980.
[31] J. L. Synge, The Relativistic Gas, North-Holland, Amsterdam, 1957.
[32] T. Ruggeri, Continuum Mech. Thermodyn. 2 (1990) 163.
[33] T. Gebbie and G. F. R. Ellis, Ann. Phys. (NY) 282 (2000) 285.
[34] J. J. Dijkstra and W. A. van Leeuwen, Physica A 90 (1978) 450.
[35] G. Mascali and V. Romano, Ann. Inst. Henri Poincaré Phys. Théor. 67 (1997) 123.
[36] C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation, Freeman, San Francisco, 1973.
[37] R. Wald, General Relativity, University of Chicago, Chicago, 1984.
[38] H. Friedrich, Phys. Rev. D 57 (1998) 2317.
[39] H. Elst and G. F. R. Ellis, Phys. Rev. D 62 (2000) 104023.
June 4, 2002 11:50 WSPC/148-RMP
00132
Reviews in Mathematical Physics, Vol. 14, No. 5 (2002) 511–517 © World Scientific Publishing Company
ERRATUM TO “HADAMARD STATES, ADIABATIC VACUA AND THE CONSTRUCTION OF PHYSICAL STATES FOR SCALAR QUANTUM FIELDS ON CURVED SPACETIME” IN REV. MATH. PHYS. 8 (1996) 1091–1159
WOLFGANG JUNKER Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut Am Mühlenberg 1, D-14476 Golm, Germany
[email protected] Received 11 October 2001 Revised 19 February 2002
In this erratum we want to correct or modify some of the original statements and proofs in “Hadamard States, Adiabatic Vacua and the Construction of Physical States for Scalar Quantum Fields on Curved Spacetime” in Rev. Math. Phys. 8 (1996) 1091–1159.

E0. Introduction

In the paper in question, we investigated Hadamard and adiabatic vacuum states for linear scalar field theories on curved spacetimes applying methods from microlocal analysis. In this erratum we want to correct or modify some of the original statements and proofs.

One of the main results of the paper was the derivation of a sufficient condition on certain pseudodifferential operators which guarantees that quasifree states of the Klein–Gordon field constructed from such operators are Hadamard states (Theorem 3.12). We have to include here the additional assumption that the spacetime under consideration has a compact Cauchy surface. Moreover, due to an error in the proof of the theorem the statement is slightly modified [Eqs. (74) and (75)], without however affecting its essence. The reason for these modifications and a precise restatement of Theorem 3.12 are presented in Sec. E1.

As an application of this theorem we had claimed in the paper that adiabatic vacuum states on Robertson–Walker spacetimes are Hadamard states (Theorem 3.24). This statement is wrong. In Sec. E2 we explain the origin of this error and prove the weaker statement that in the spatially compact case adiabatic vacua and Hadamard states generate unitarily equivalent GNS-representations. For a microlocal characterization of adiabatic vacuum states on arbitrary spacetimes and a clarification of their precise relation to Hadamard states in terms of wavefront sets we refer to [2].
Finally, in Sec. E3 we correct some further minor mistakes which have come to our attention since the publication of the paper.
E1. Addendum to Theorem 3.12

We have to add to the statement of Theorem 3.12 the assumption that the Cauchy surface $\Sigma$ is compact. We were implicitly using this e.g. in the proof of Lemma 3.13 when concluding that $WF(I_2) \equiv WF(K_1^\Sigma (1 - \chi) \circ (1 - \psi) K_2^\Sigma) = \emptyset$ from the fact that $K_1^\Sigma (1 - \chi)$ and $(1 - \psi) K_2^\Sigma$ are smooth. Since, however, these distributions are not properly supported, this statement implicitly involves an assumption about their global behaviour. One would have to supply the Riemannian manifold $(\Sigma, h)$ with asymptotic conditions to assure that our distributions are sufficiently well-behaved at infinity in order that no additional singularities ensue from their composition. By assuming that $\Sigma$ is compact we avoid this task (in fact, it makes Lemma 3.13 superfluous).

Moreover, in part (v) of the proof of Theorem 3.12 we assumed that $1 \otimes P$ and $P \otimes 1$ with $P := R - iI - n^\alpha \nabla_\alpha$ are “good” pseudodifferential operators possessing the pseudolocal property in order to conclude the $\subset$-relations in Eqs. (80) and (83). This, however, is not correct in two respects: Even when $R - iI$ is a pseudodifferential operator in $L^1_{1,0}$ w.r.t. the spatial variables and $n^\alpha \nabla_\alpha$ a differential operator in $L^1_{1,0}$ w.r.t. the time variable, then $R - iI - n^\alpha \nabla_\alpha$ is not a pseudodifferential operator in $L^1_{1,0}$ w.r.t. all variables unless $R - iI$ is a differential operator, too. Consider as an example $P := -i(-\Delta + \mu^2)^{1/2} - \partial_t$ on $\mathbb{R}^{1+3}$. Its symbol $-i(|\xi|^2 + \mu^2)^{1/2} - i\tau$ is an element of $S^1_{0,0}(\mathbb{R}^4 \times \mathbb{R}^4)$, but not of $S^1_{1,0}(\mathbb{R}^4 \times \mathbb{R}^4)$ since taking derivatives w.r.t. $\xi_i$ does not lower the order w.r.t. the $\tau$-variable below 0. The same reasoning shows that even if $P$ were in $L^1_{1,0}(M)$ then $1 \otimes P$ and $P \otimes 1$ would only be in $L^1_{0,0}(M \times M)$ in general (unless $P$ were a differential operator). However, $L^m_{0,0}$ is a class of rather badly behaved pseudodifferential operators which are excluded in most of the statements of Chap. 2, in particular they do not possess the pseudolocal property in general.
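The failure of the example symbol to lie in $S^1_{1,0}$ can be illustrated numerically. The sketch below uses the standard estimate defining $S^m_{1,0}$, namely that a derivative of order two in the covariables should be bounded by a constant times $(1 + |(\tau, \xi)|)^{m-2}$, and evaluates the second $\xi_1$-derivative of $-i(|\xi|^2 + \mu^2)^{1/2} - i\tau$ in closed form. The function names and the sample points are assumptions made for illustration.

```python
import math

def second_xi1_derivative_modulus(xi, mu):
    """|d^2 a / d xi_1^2| for a = -i sqrt(|xi|^2 + mu^2) - i tau (it does not depend on tau)."""
    bracket = math.sqrt(sum(c * c for c in xi) + mu * mu)
    return (bracket ** 2 - xi[0] ** 2) / bracket ** 3

def weighted_second_derivative(tau, xi, mu):
    """(1 + |(tau, xi)|) * |d^2 a / d xi_1^2|; membership in S^1_{1,0} would keep this bounded."""
    size = 1.0 + math.sqrt(tau * tau + sum(c * c for c in xi))
    return size * second_xi1_derivative_modulus(xi, mu)

mu = 1.0
# Moving to infinity along the tau-axis: the weighted quantity grows without bound.
for tau in (1.0, 10.0, 100.0, 1000.0):
    print("tau =", tau, weighted_second_derivative(tau, (0.0, 0.0, 0.0), mu))
# Moving to infinity along a xi-direction with tau = 0: the weighted quantity stays bounded.
for r in (1.0, 10.0, 100.0, 1000.0):
    print("xi_2 =", r, weighted_second_derivative(0.0, (0.0, r, 0.0), mu))
```

The growth along the $\tau$-axis reflects the statement above that $\xi$-derivatives do not lower the order in the $\tau$-variable, whereas the behaviour in the purely spatial directions is the one expected of a first-order symbol.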
We claim that nevertheless Eqs. (80) and (83) are correct due to the particular properties of the Fourier integral operators $E^+$ and $E^-$. For the precise proof we refer to Lemma 5.6 in [2], but the essential idea is that even when $P$ is only in $L^1_{0,0}(]-T, T[ \times \Sigma)$ then one can find a pseudodifferential operator $X$ in $]-T, T[ \times \Sigma$ such that $\mathrm{char}\, X \cap N = \emptyset$ (where $N := \{(x, \xi);\ g^{\mu\nu}(x) \xi_\mu \xi_\nu = 0\}$ is the light cone in cotangent space) and $XP \in L^1_{1,0}(]-T, T[ \times \Sigma)$: Choose $X = \mathrm{op}\, \chi$, where $\chi$ is microlocally constructed as a real-valued function which is smooth on $T^*(]-T, T[ \times \Sigma)$, zero for $|(\tau, \xi)| \le 1/2$, homogeneous of degree zero for $|(\tau, \xi)| \ge 1$ such that $\chi(t, x; \tau, \xi) = 0$ on a conic neighborhood of $\{\xi = 0\}$ and $\chi(t, x; \tau, \xi) = 1$ outside a larger conic neighborhood of $\{\xi = 0\}$ that does not intersect the light cone. Due to the chosen support of $\chi$ the decrease in the $\tau$-variable of the symbol of $XP$ (and its derivatives w.r.t. $\xi_i$) can be estimated by that of the $\xi$-variables and therefore $XP \in L^1_{1,0}(]-T, T[ \times \Sigma)$. The expressions $(XP \otimes 1)E^\pm$ and $(1 \otimes XP)E^\pm$ can then be considered as compositions of the “good” pseudodifferential operator $XP$ and the Fourier integral operators $E^\pm$ which leave the wavefront sets of $E^\pm$ invariant and differ from $(P \otimes 1)E^\pm$ and $(1 \otimes P)E^\pm$, respectively, only by operators with smooth kernels due to $N \subset \mathrm{char}\,(X - 1)$. A similar argument justifies Eq. (82) (see the proof of Lemma 5.6 in [2], here the additional assumption enters that $Q$ has a real-valued principal symbol).

For the same reasons the operators $Q$ in the proof of Theorem 3.18, $Q_n$ in Eq. (122), and $Q_1$, $Q_2$ on p. 1153 have to be replaced by $iXQ$, $iXQ_n$, $iXQ_1$, and $iXQ_2$, respectively (the factor $i$ is introduced to make their principal symbols real-valued). This replacement changes Eq. (74) to
\[ Q(R - iI - n^\alpha \nabla_\alpha) = iX(\Box_g + \mu^2 + r_1(t) + r_2(t) n^\alpha \nabla_\alpha) \]
and enlarges the characteristic set of $Q$ by $\mathrm{char}\, X$. Since, however, $\mathrm{char}\, X \cap N = \emptyset$ we still have $\mathrm{char}\, Q \cap N_- = \emptyset$ (where $N_- := \{(x, \xi) \in N;\ \xi^0 < 0\}$ is the past light cone) and therefore none of the conclusions is affected by this change.

Let us restate Theorem 3.12 including the mentioned changes (cf. Theorem 5.3 in [2]):

Theorem 3.12′. Let $I(t)$, $R(t)$, $S(t)$, $C(t)$ be pseudodifferential operators on a compact $\Sigma_t$, $t \in\, ]-T, T[$ (satisfying the properties stated in Theorem 3.11) such that $I$ is elliptic, $S \in L^{-\infty}$ (from which follows that $C \in L^0$ is elliptic) and such that there exist pseudodifferential operators $Q$ and $X$ on $]-T, T[ \times \Sigma$ which have the property
\[ Q(R - iI - n^\alpha \nabla_\alpha) = X(\Box_g + \mu^2 + r_1(t) + r_2(t) n^\alpha \nabla_\alpha) \tag{74′} \]
for some $r_1, r_2 \in L^{-\infty}$, and $Q$ possesses a real-valued principal symbol $q$ with
\[ \mathrm{char}\, Q \cap N_- = \emptyset . \tag{75′} \]
Then the quasifree state given by (72) is an Hadamard state, i.e. the wavefront set of the two-point distribution $\Lambda_\Sigma^{(2)} \in \mathcal{D}'(M \times M)$ given by (73) is
\[ WF(\Lambda_\Sigma^{(2)}) = \{(x_1, \xi_1; x_2, -\xi_2) \in T^*(M \times M) \setminus \{0\};\ (x_1, \xi_1) \sim (x_2, \xi_2),\ \xi_1^0 \ge 0\} \]
(see Theorem 3.9).

E2. Correction of Theorem 3.24

The statement of Theorem 3.24 is wrong as it stands. The adiabatic vacuum states on the Robertson–Walker spacetimes are not of Hadamard type in general. This can be seen in the following way:

Suppose that the two-point functions $\Lambda_n^{(2)}$ and $\Lambda_{n+1}^{(2)}$ of two adiabatic vacuum states of order $n$ and $n + 1$, respectively, are of Hadamard type. Then $\Lambda_n^{(2)} - \Lambda_{n+1}^{(2)} \in C^\infty(M \times M)$, and consequently $\Lambda_n^{(2)}|_{\Sigma_t \times \Sigma_t} - \Lambda_{n+1}^{(2)}|_{\Sigma_t \times \Sigma_t} \in C^\infty(\Sigma_t \times \Sigma_t)$ w.r.t. some Cauchy surface $\Sigma_t$ of a Robertson–Walker spacetime $M$ (with flat spatial section, say). However, from the formula for the initial data of $\Lambda^{(2)}$ at the top of p. 1129 and Eq. (121) we note that $\Lambda_n^{(2)}|_{\Sigma_t \times \Sigma_t} - \Lambda_{n+1}^{(2)}|_{\Sigma_t \times \Sigma_t}$ is the kernel of a pseudodifferential operator with symbol
\[ \frac{1}{2} [(\Omega_k^{(n)})^{-1} - (\Omega_k^{(n+1)})^{-1}] . \tag{138} \]
Introducing $\epsilon_k^{(n+1)}$ by
\[ (\Omega_k^{(n+1)})^2 = (\Omega_k^{(n)})^2 (1 + \epsilon_k^{(n+1)}) , \tag{139} \]
noting from Lemma 3.2 of [4] that $\epsilon_k^{(n+1)} \in S^{-2n-2}$ and remembering that $\omega_k := (k^2/a^2 + m^2)^{1/2}$ is the leading term in the asymptotic expansion of $\Omega_k^{(n)}$ for any $n$ we find for (138) the principal symbol
\[ \frac{1}{4} \frac{\epsilon_k^{(n+1)}}{\omega_k} \in S^{-2n-3} . \]
In general, this is not a symbol in $S^{-\infty}$ (as can be explicitly checked from (115)), i.e. the corresponding pseudodifferential operator does not have a smooth kernel, in contradiction to our assumption. This contradiction to the statement of Theorem 3.24 has been noted by Hollands [1] when investigating the same question for Dirac fields, and implicitly also by Lindig [3] when calculating the singularity structure of the energy-momentum tensor of the Klein–Gordon field in adiabatic vacuum states. Let us now state and prove the correct relation between adiabatic vacua and Hadamard states. Due to the additional compactness assumption in Theorem 3.12′ (see Sec. E1) we restrict ourselves to the spatially compact case:

Theorem 3.24′. The adiabatic vacuum states of order n ≥ 0 of the linear Klein–Gordon quantum field (100) of mass µ > 0 on the Robertson–Walker spacetimes (96) with κ = +1 (compact spatial sections) are unitarily equivalent to the Hadamard states.

Proof. The statement of the theorem is a consequence of Theorems 4.7 and 6.3 in [2] and Theorem 3.3 in [4]. For clarity's sake, however, we indicate the necessary corrections and modifications in the proof of Theorem 3.24. The main mistake was that we did not clearly distinguish between the solutions $T_k(t)$ of the differential equation (102) and their initial data $W_k^{(n)}(t)$, given by (114). They (and their first derivatives) are of course identical on the initial surface, say $\Sigma_{t_0}$, but not away from it. Therefore, in the expressions for $\Lambda_n^{(2)}$ on p. 1143, we can everywhere replace $T_k^{(n)}(t_0)$ and $\dot T_k^{(n)}(t_0)$ by $W_k^{(n)}(t_0)$ and $\dot W_k^{(n)}(t_0)$, respectively. But we assumed in the following that the Eqs. (119) are also valid for t in an interval around $t_0$. This is not true in general, and Eq. (119) has to be replaced by

$$|W_k^{(n)}(t)|^2 = \left(a^3\, 2\Omega_k^{(n)}(t)\right)^{-1}, \qquad \frac{\dot W_k^{(n)}(t)}{W_k^{(n)}(t)} = -\frac{3}{2}\frac{\dot a(t)}{a(t)} - \frac{1}{2}\frac{\dot\Omega_k^{(n)}(t)}{\Omega_k^{(n)}(t)} - i\,\Omega_k^{(n)}(t). \tag{119′}$$

Then the definitions (121) and (122) of our operators $R_n$, $I_n$, and $Q_n$ remain correct (when replacing $\dot T_k^{(n)}/T_k^{(n)}$ in the 2nd line of Eq. (122) by $\dot W_k^{(n)}/W_k^{(n)}$ and multiplying the r.h.s. of the definition of $Q_n$ in the 1st line by $iX$, $X = \mathrm{op}\,\chi$, see the
remark in Sec. E1) for t in an interval around $t_0$. The statement of Lemma 3.26 still holds true uniformly in t near $t_0$ when one replaces in (iii) $q(t;\xi)$ by $i\chi q(t;\xi)$ to take into account the redefinition of $Q_n$. (In the proof of Lemma 3.26 line 5 of p. 1146 is incorrect and has to be deleted. In the spatially compact case it follows from Theorem 2.28(b) that D and the operators derived from it are pseudodifferential operators (cf. Lemma 6.5 in [2]).) However, the factorization of the Klein–Gordon operator now reads correctly

$$\begin{aligned}
\left(-\partial_t - 3\frac{\dot a}{a} - \frac{\dot W_k^{(n)}}{W_k^{(n)}}\right)\left(\frac{\dot W_k^{(n)}}{W_k^{(n)}} - \partial_t\right)
&= \partial_t^2 + 3\frac{\dot a}{a}\partial_t - \frac{1}{W_k^{(n)}}\left(\ddot W_k^{(n)} + 3\frac{\dot a}{a}\dot W_k^{(n)}\right) \\
&= \Box_g + \mu^2 - \frac{1}{W_k^{(n)}}\left(\partial_t^2 + 3\frac{\dot a}{a}\partial_t + \omega_k^2\right) W_k^{(n)} \\
&= \Box_g + \mu^2 + (\Omega_k^{(n+1)})^2 - (\Omega_k^{(n)})^2 \\
&= \Box_g + \mu^2 + (\Omega_k^{(n)})^2\, \epsilon_k^{(n+1)}
\end{aligned}$$

where we used (114), (115) and the definition (139) of $\epsilon_k^{(n+1)}$. From Lemma 3.2 in [4] it follows that $(\Omega_k^{(n)})^2 \epsilon_k^{(n+1)} = O(k^{-2n})$ for $k \to \infty$, i.e. we have

$$-Q_n (R_n - iI_n - \partial_t) = iX(\Box_g + \mu^2) \mod L^{-2n} \tag{140}$$

(uniformly for t near $t_0$) and not mod $L^{-\infty}$ as we had claimed. Whereas $Q_n$, $R_n$, and $I_n$ are only defined by truncated asymptotic expansions mod $O(k^{-2n-1})$ and $O(k^{-2n-2})$, respectively, we can add up the asymptotic expansions defining these operators to infinite order (analogous to Lemma 2.3) obtaining operators $Q_\infty$, $R_\infty$, and $I_\infty$ (uniquely mod $L^{-\infty}$) which satisfy (140) for all n ∈ N and hence (74′), precisely as we did in the construction of Hadamard states in Sec. 3.7. Theorem 3.12′ then implies that the states defined by $R_\infty$ and $I_\infty$ (after a possible redefinition mod $L^{-\infty}$ to make them satisfy the assumptions in Theorem 3.11) are Hadamard states.

To show that the adiabatic vacua of order n are unitarily equivalent to the Hadamard states we calculate the Bogoliubov transformation (116) between two such states and find, using (117), that

$$|\beta(k)| = (2a^3)^{-1} \left(I_n(k) I_\infty(k)\right)^{-1/2} \left|R_\infty(k) - R_n(k) - i\left(I_\infty(k) - I_n(k)\right)\right|$$

where $I_n(k)$, $I_\infty(k)$, $R_n(k)$, $R_\infty(k)$ denote the symbols of the corresponding operators (121) defining the states, hence $|\beta(k)| = O(k^{-2n-2})$ for $k \to \infty$. Comparing with Theorem 4.5 of Lüders & Roberts [4] we can conclude that for κ = +1 all adiabatic vacua are unitarily equivalent to Hadamard states.

We do not see any difficulty to extend this analysis to Robertson–Walker spacetimes with flat spatial sections (κ = 0) since in this case all operators are explicitly given as pseudodifferential operators with symbols that are constant in the spatial variable. To treat the case κ = −1, however, the use of a specially adapted pseudodifferential calculus on non-compact spaces seems to be necessary.
Unfortunately, this correction of Theorem 3.24 again leaves open the question of the adiabatic vacuum of order 0 on the open universe, and we have to restrict the statement of Corollary 3.25 correspondingly:

Corollary 3.25′. All Hadamard states and adiabatic vacuum states of order n ≥ 0 of the linear Klein–Gordon quantum field on the closed Robertson–Walker spacetimes (κ = 1) lie in the same local primary folium (quasiequivalence class) of states.

For the special spacetime manifolds chosen here, this already confirms our conjecture at the end of the paper, that, by truncating the asymptotic expansions of the operators, one obtains states that are locally quasiequivalent to Hadamard states. In the case of Dirac fields the same result has been established by Hollands [1].

In [2] Junker & Schrohe extend the definition of adiabatic vacuum states from Robertson–Walker spacetimes to arbitrary globally hyperbolic spacetime manifolds by a generalized wavefront set condition (using the notion of the Sobolev wavefront set). With this definition Hadamard states are the adiabatic vacua of infinite order. A criterion analogous to (140) for quasifree states to be adiabatic in this extended sense can then be given, and the construction of adiabatic states can be carried out using the methods of Sec. 3.7.

E3. Minor Corrections

p. 1107: The last line should read: $WF(v) = -WF^{T}(v)$, where $WF^{T}(v) := \{(x_1, \xi_1; x_2, \xi_2) \in T^*(X \times X);\ (x_2, \xi_2; x_1, \xi_1) \in WF(v)\}$.

p. 1125: Eq. (71) should read $E_F := -i\Lambda^{(2)} + \Delta_A$. On line 8 from the bottom read "timelike" instead of "lightlike".

p. 1127: On lines 13, 20, 22 read "Eq. (73)" instead of "Eq. (2)".

p. 1129: On lines 9 and 15 replace "are lightlike connected" by "lie on a pair of lightlike geodesics that intersect on Σ".

p. 1153: Line 11 should read:

$$\Leftrightarrow \left(-r_t + iA^{1/2} + is_t - \frac{1}{\sqrt{h}}\,\partial_t \sqrt{h}\right)\left(r_t - iA^{1/2} - is_t - \partial_t\right) = P + r_{1t} \quad \text{on } ]-T, T[\,.$$
Acknowledgments

I am grateful to Stefan Hollands for pointing out the contradiction to Theorem 3.24 and discussions about this issue, and to Rainer Verch for indicating the missing compactness assumption in Theorem 3.12.
References

[1] S. Hollands, “The Hadamard condition for Dirac fields and adiabatic states on Robertson–Walker spacetimes”, Comm. Math. Phys. 216 (2001) 635–661.
[2] W. Junker and E. Schrohe, “Adiabatic vacuum states on general spacetime manifolds: Definition, construction, and physical properties”, to appear in Ann. H. Poincaré.
[3] J. Lindig, “Not all adiabatic vacua are physical states”, Phys. Rev. D, 59:064011, 1999.
[4] C. Lüders and J. E. Roberts, “Local quasiequivalence and adiabatic vacuum states”, Comm. Math. Phys. 134 (1990) 29–63.
July 18, 2002 9:28 WSPC/148-RMP
00134
Reviews in Mathematical Physics, Vol. 14, No. 6 (2002) 519–530 © World Scientific Publishing Company
CONSIDERATIONS ON SUPER POINCARÉ ALGEBRAS AND THEIR EXTENSIONS TO SIMPLE SUPERALGEBRAS

S. FERRARA
CERN, Theory Division, CH 1211 Geneva 23, Switzerland
INFN, Laboratori Nazionali di Frascati, Italy
Department of Physics and Astronomy, University of California, Los Angeles, USA

M. A. LLEDÓ
INFN, Sezione di Torino, Italy
Dipartimento di Fisica, Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy

Received 20 December 2001
Revised 8 May 2002
We consider simple superalgebras which are a supersymmetric extension of the spin algebra in the cases where the number of odd generators does not exceed 64. All of them contain a super Poincaré algebra as a contraction and another as a subalgebra. Because of the contraction property, some of these algebras can be interpreted as de Sitter or anti de Sitter superalgebras. However, the number of odd generators present in the contraction is not always minimal due to the different splitting properties of the spinor representations under a subalgebra. We consider the general case, with arbitrary dimension and signature, and examine in detail particular examples with physical implications in dimensions d = 10 and d = 4.

1. Introduction

Super Poincaré algebras [1] are non semisimple superalgebras [2, 3, 4]. Their even part is the Poincaré algebra (plus some extra generators that we will see below) and their odd part carries one or more (for N-extended supersymmetry) spinor representations of the underlying Lorentz algebra. A spinor representation of the Lorentz algebra is an irreducible complex representation whose highest weights are the fundamental weights corresponding to the right extreme nodes in the Dynkin diagram. These are representations of the spin group that do not descend to representations of the orthogonal group. (For a review see [5, 6].) We will call odd charges or spinor charges the generators of the odd part of the super Poincaré algebra. A reality condition must be imposed on the spinor charges to obtain a real Lie superalgebra.
Generically, we can write the anticommutator of two spinor charges as

$$\{Q^I_\alpha, Q^J_\beta\} = \gamma^\mu_{\alpha\beta}\, p_\mu\, G^{IJ} + \sum_k \gamma^{[\mu_1\cdots\mu_k]}_{(\alpha\beta)}\, Z^{[IJ]}_{[\mu_1\cdots\mu_k]}. \tag{1.1}$$

Here the indices α, β run over the spinor representation, and I, J = 1, . . . , N. The $Z^{[IJ]}_{[\mu_1\cdots\mu_k]}$ are even generators that are in an antisymmetric tensorial representation (a representation on the antisymmetric kth tensor product of the fundamental representation space) of the Lorentz group and commute with the translation generators. In general there is a group G that acts on $Q^I$, which depends on the particular properties of the spinor representations in different signatures and dimensions. $G^{IJ}$ is then an invariant tensor under this group and $Z^{IJ}$ is in the two fold (symmetric or antisymmetric) representation of such group. It is called the automorphism group because its action leaves invariant the Lie superbrackets of the Poincaré superalgebra (it acts trivially on the other generators). The symmetry properties of $Z^{IJ}$ are the same as the symmetry properties (in α, β) of $\gamma^{[\mu_1\cdots\mu_k]}_{(\alpha\beta)}$, which in turn depend only on the space-time dimension modulo 8 (and not on the signature).

From (1.1), we see that the even part of a super Poincaré algebra has an abelian ideal that contains the spacetime translation generators and the Z generators. Together with the odd charges they form a central extension of the supertranslation algebra. The physical interpretation of the Z generators is related to p-brane charges [7, 8], that is, to certain configurations of supergravity theories in which the expectation value of the Z-charges is nonzero.

Superconformal algebras are simple supersymmetric extensions of the conformal algebra. In [9] these extensions for N = 1 were studied for arbitrary dimension (d = s + t) and signature (ρ = s − t) and in [10] the case of extended supersymmetry was treated. There are in general two superconformal algebras, a maximal one which is always osp(1|2Nn) (2n is the dimension of the spinor representation of the odd generators) and a minimal one, whose bosonic part is the Spin(s, t)-algebra of the spinor representation.
Only in dimensions d = 3, 4, 5, 6 is it possible to find a simple superalgebra with a bosonic part which factorizes as a direct sum of the orthogonal algebra so(s, t) plus a simple R-symmetry algebra [11], as required by the Coleman–Mandula theorem [12]. The odd part of the superalgebra is a direct sum of spinor representations [13]. For higher dimensions, the bosonic part is also the direct sum of an R-symmetry subalgebra and a spacetime subalgebra, but the spacetime subalgebra is enlarged with extra generators.

In the web of connections between string theory and M-theory [14], or possible generalizations such as F-theory [15] or S-theory [16], it is natural to investigate the role played by the simple superalgebra, even in the cases d > 6 [9, 10, 17, 18, 19, 20]. As in the purely bosonic case, the super Poincaré algebra with Z-charges can be obtained from simple superalgebras in two different ways. One is by contraction [21], the other as sub-superalgebra [17, 18].

Let spacetime be a manifold of dimension d with an (indefinite) metric of signature (s, t). The Poincaré group acts on flat spacetime, $\mathbb{R}^d \simeq ISO(s, t)/SO(s, t)$.
Its Lie algebra, iso(s, t), is a subalgebra of the conformal algebra of $\mathbb{R}^d$, the simple algebra so(s + 1, t + 1) [22]. Other possible backgrounds are the symmetric spaces SO(s + 1, t)/SO(s, t), with signature (s, t). There is a contraction of the isometry algebra so(s + 1, t) which gives the Poincaré algebra iso(s, t). Interchanging the roles of s and t we have a different contraction, from so(s, t + 1). For physical signature, the two spaces are the de Sitter space SO(d, 1)/SO(d − 1, 1) and the anti de Sitter one SO(d − 1, 2)/SO(d − 1, 1). Unitary multiplets of the anti de Sitter superalgebra in dimension 11, osp(1|32), were investigated in [23].

The super Poincaré algebra in dimension d is a subalgebra of the superconformal algebra in the same dimension; it appears with a different number of Z generators in the abelian ideal, depending on whether one looks at the minimal or the maximal superconformal algebra. Reference [24] is an attempt to formulate M-theory as a spontaneously broken phase of its superconformal extension, where the symmetry under the superconformal algebra osp(1|64) is broken to the super Poincaré subalgebra with 2 and 5-brane charges.

Anti de Sitter and de Sitter superalgebras are supersymmetric extensions of so(d − 1, 2) and so(d, 1) respectively. They play an important role in the framework of the AdS and dS/CFT duality [25, 26, 27, 28]. Also, it is possible to obtain the super Poincaré algebra as a contraction of a simple superalgebra. We have the de Sitter and anti de Sitter superalgebras (simple extensions of the de Sitter and anti de Sitter algebras respectively), although the contraction of an N = 1 superalgebra does not always give an N = 1 super Poincaré algebra. In [19] the possibility of using some superalgebra gauge theory which gives M-theory as a particular low energy configuration (contraction) is explored.
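The contraction of so(s + 1, t) to iso(s, t) mentioned above can be illustrated numerically. The following is a minimal sketch, not the construction used in the paper: it assumes the vector-representation convention $(M_{ab})^c{}_d = \delta^c_a \eta_{bd} - \delta^c_b \eta_{ad}$, takes (s, t) = (1, 1) for small matrices, and rescales the generators along the extra direction as $P_\mu = M_{\mu,\mathrm{extra}}/R$. All function names are hypothetical. The commutator of two rescaled generators is then proportional to $1/R^2$ and vanishes in the limit of large R, while the action of the Lorentz generator on $P_\mu$ does not depend on R.

```python
from fractions import Fraction


def generator(eta, a, b):
    """Vector-representation generator M_ab for a diagonal metric eta.

    Assumed convention: (M_ab)^c_d = delta^c_a eta_bd - delta^c_b eta_ad.
    """
    n = len(eta)
    mat = [[Fraction(0)] * n for _ in range(n)]
    mat[a][b] += eta[b]
    mat[b][a] -= eta[a]
    return mat


def commutator(x, y):
    """Matrix commutator xy - yx."""
    n = len(x)

    def mul(p, q):
        return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    xy, yx = mul(x, y), mul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]


def scaled(x, c):
    """Multiply every matrix entry by c."""
    return [[c * entry for entry in row] for row in x]


def translation(eta, mu, radius):
    """Rescaled generator P_mu = M_{mu, extra} / R (extra = last index)."""
    return scaled(generator(eta, mu, len(eta) - 1), Fraction(1, radius))


def contraction_residual(eta, radius):
    """Largest entry (in absolute value) of [P_0, P_1]."""
    c = commutator(translation(eta, 0, radius), translation(eta, 1, radius))
    return max(abs(entry) for row in c for entry in row)


# (s, t) = (1, 1) plus one extra spacelike direction, i.e. so(2, 1).
ETA = [1, -1, 1]

if __name__ == "__main__":
    for radius in (1, 10, 100):
        print(radius, contraction_residual(ETA, radius))
```

Exact rational arithmetic is used so that the $1/R^2$ scaling can be compared without rounding; replacing the last entry of `ETA` by −1 gives the contraction from so(s, t + 1) instead.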
Simple superalgebras embedding ordinary spacetime supersymmetry algebras are also relevant to explore how the theories depend upon the signature of spacetime and provide a clue on the existence of supergravity theories with non Lorentzian spacetime signature, as conjectured in [29] on the basis of time like T duality, with the M, M′ and M*-theories in eleven dimensions.

The paper is organized as follows. In Sec. 2 we enumerate all the (minimal) superconformal algebras with 64, 32, 16 and 8 spinor charges in dimensions d = 3, . . . , 11 and arbitrary signature. We observe that the same superalgebra may be obtained from spacetimes with different signatures ρ, ρ′ if they are congruent mod 8, ρ = ±ρ′ + 8n, which may suggest a duality of the physical theories. In Sec. 3 the de Sitter and anti de Sitter superalgebras in dimensions d = 3, . . . , 12 are considered and their contractions to super Poincaré algebras are studied. In Sec. 4 we consider physically interesting examples in d = 4 and d = 10. In the Appendix we give some basic definitions about Lie superalgebras.

2. Super Conformal Algebras in Diverse Dimensions

Superconformal algebras with up to 64 spinor real charges correspond to different real forms of complex superalgebras whose even part contains so(s + 1, t + 1) and
whose odd part is a direct sum of spinor representations of the same algebra. A spinor in dimension d + 2, d = s + t, has complex dimension $2^{(d+1)/2}$ for d odd and $2^{d/2}$ for d even (chiral spinors). The dimension of the real representation depends on the reality condition (the same as the complex dimension in the real case, twice the complex dimension in the quaternionic and complex case). In even dimension, when the superalgebra is of type sl(m|n) (d + 2 = 2, 6 mod 8), it contains left and right spinors (non chiral algebra) while if the algebra is of type osp(m|n) then it is chiral (in fact, the metric preserving condition halves the number of odd generators with respect to the linear superalgebra). For example, for d = 12 the superconformal algebra is linear (or unitary) and for N = 1 it has already 128 charges. So the maximal dimension that we can consider is d = 11.

d = 11. For ρ = 1, 7 mod 8 we have osp(1|64) with 64 odd charges. It corresponds to spacetimes of type (10, 1) (M-theory), (9, 2) (M*-theory) and (6, 5) (M′-theory) [29].

d = 10. For ρ = 0 mod 8, we have osp(2 − q, q|32) (q = 0, 1) with 64 charges and osp(1|32) with 32 charges. They correspond to spacetimes of type (5, 5) and (9, 1). For ρ = 2, 6 mod 8 we have osp(1|32, C) with 64 odd charges and spacetimes of type (6, 4), (10, 0) and (8, 2). For ρ = 4 we have osp(2*|16, 16) with 64 charges and spacetime (7, 3). These correspond to different forms of Type IIA, IIB and (1, 0) theories studied in [29].

d = 9. For ρ = 1, 7 mod 8 we have osp(2 − q, q|32) (q = 0, 1) with 64 charges and osp(1|32) with 32 charges. They correspond to spacetimes of type (9, 0), (5, 4), (8, 1). For ρ = 3, 5 we have osp(2*|16, 16) with 64 charges and spacetimes of type (6, 3) and (7, 2).

d = 8. For ρ = 0 mod 8 we have sl(16|2) with 64 charges and sl(16|1) with 32 charges. They correspond to spacetimes of types (4, 4) and (8, 0). For ρ = 2, 6 we have su(8, 8|2 − q, q) (q = 0, 1) with 64 charges and su(8, 8|1) with 32 charges.
They correspond to spacetimes of type (5, 3) and (7, 1). For ρ = 4 mod 8 we have su*(16|2) with 64 charges and spacetime of type (6, 2).

d = 7. For ρ = 1, 7 we have osp(8, 8|4) with 64 charges and osp(8, 8|2) with 32 charges. They correspond to spacetimes of type (4, 3) and (7, 0). For ρ = 3, 5 we have osp(16*|4 − 2q, 2q) (q = 0, 1) with 64 charges and osp(16*|2) with 32 charges. They correspond to spacetimes of type (5, 2) and (6, 1).
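The signature classes quoted so far for d = 7, . . . , 11 can be cross-checked by enumerating all (s, t) with s + t = d and sorting them by ρ = s − t mod 8. The sketch below is only an illustration: the function names are hypothetical, and the restriction to s ≥ t is an assumption made here because ρ and −ρ are treated together in the text. The first function reproduces the complex spinor dimension formula stated at the beginning of this section.

```python
def spinor_complex_dim(d):
    """Complex dimension of a spinor in dimension d + 2.

    2^((d+1)/2) for d odd, 2^(d/2) for d even (chiral spinor).
    """
    return 2 ** ((d + 1) // 2) if d % 2 else 2 ** (d // 2)


def spacetimes(d, rho_classes):
    """Signatures (s, t) with s + t = d, s >= t and (s - t) mod 8 in rho_classes."""
    return sorted((d - t, t)
                  for t in range(d // 2 + 1)
                  if (d - 2 * t) % 8 in rho_classes)


if __name__ == "__main__":
    print(spinor_complex_dim(11))      # spinor of the conformal algebra for d = 11
    print(spacetimes(11, {1, 7}))      # signatures listed under d = 11
    print(spacetimes(10, {0}))         # signatures listed under d = 10, rho = 0
```

The same function can be applied to d = 3, . . . , 6.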
d = 6. For ρ = 0 we have osp(4, 4|8) with 64 charges, osp(4, 4|4) with 32 charges and osp(4, 4|2) with 16 charges. They correspond to spacetime of type (3, 3). For ρ = 2, 6 we have $osp(8|4, C)_R$ with 64 charges and osp(8|2, C) with 32 charges. They correspond to spacetimes of type (6, 0), (4, 2). For ρ = 4 we have osp(8*|8 − 2q, 2q) (q = 0, 1, 2) with 64 charges, osp(8*|4 − 2q, 2q) (q = 0, 1, 2) with 32 charges and osp(8*|2) with 16 charges. They correspond to spacetime of type (5, 1).

d = 5. For ρ = 1 we have osp(4, 4|8) with 64 charges, osp(4, 4|4) with 32 charges and osp(4, 4|2) with 16 charges. They correspond to spacetime of type (3, 2). For ρ = 3, 5 we have osp(8*|8 − 2q, 2q) (q = 0, 1, 2) with 64 charges, osp(8*|4 − 2q, 2q) (q = 0, 1) with 32 charges and osp(8*|2) with 16 charges. They correspond to spacetimes of type (5, 0) and (4, 1).

For d = 5 there exists a smaller superalgebra, the exceptional superalgebra $f_4^p$. The integer number p denotes the real form of the complex superalgebra $f_4$, which depends on the signature of spacetime. For ρ = 3, 5 the even part of the superalgebra is spin(7 − p, p) ⊕ su(2) with p = 2, 1. In these signatures, the spinors are quaternionic (pseudoreal), so there exists a pseudoconjugation in the spinor space, which together with the pseudoconjugation in the fundamental of sl(2, C) defining su(2) gives a conjugation defining the real form of the superalgebra. For signatures ρ = 1, 7 we have an even part spin(7 − p, p) ⊕ sl(2, R), with p = 3, 0. In these cases the spinors are real, so the conjugation defining the real form of the superalgebra is formed with the conjugation in the spinor space and the conjugation in the fundamental of sl(2, C) defining sl(2, R). The superalgebra has 16 charges. (For more on conjugations, pseudoconjugations and real forms see [9]. See also [30].)

d = 4. For ρ = 0 we have sl(4|8) with 64 charges, sl(4|4) with 32 charges, sl(4|2) with 16 charges and sl(4|1) with 8 charges.
They correspond to spacetime of type (2, 2). For ρ = 2 we have su(2, 2|8 − q, q) (q = 0, . . . , 4) with 64 charges, su(2, 2|4 − q, q) (q = 0, 1, 2) with 32 charges, su(2, 2|2 − q, q) (q = 0, 1) with 16 charges and su(2, 2|1) with 8 charges. They correspond to spacetime of type (3, 1). For ρ = 4 mod 8 we have su*(4|8) with 64 charges, su*(4|4) with 32 charges, su*(4|2) with 16 charges and spacetime of type (4, 0).

d = 3. For ρ = 1 we have osp(16 − q, q|4) (q = 0, . . . , 8) with 64 charges, osp(8 − q, q|4) (q = 0, . . . , 4) with 32 charges, osp(4 − q, q|4) (q = 0, 1, 2) with 16 charges, osp(2 − q, q|4) (q = 0, 1) with 8 charges and osp(1|4) with 4 charges. They correspond to spacetime of type (2, 1).
For ρ = 3 we have osp(16*|2, 2) with 64 charges, osp(8*|2, 2) with 32 charges, osp(4*|2, 2) with 16 charges and osp(2*|2, 2) with 8 charges. They correspond to spacetime of type (3, 0).

3. De Sitter and Anti de Sitter Superalgebras and Their Contractions

We write the simple superalgebras that are extensions of de Sitter (so(d, 1)) and anti de Sitter (so(d − 1, 2)) algebras in physical signature (d − 1, 1). For d = 4, 5, 12, the de Sitter superalgebra exists only with N even. In d = 6 and d = 10 the de Sitter and anti de Sitter superalgebras coincide. This is because the signatures ±ρ of the (d + 1)-dimensional spaces (where the de Sitter or anti de Sitter algebras are linearly realized) are congruent modulo 8 (3 and 5 for d = 6, 1 and 7 for d = 10). For anti de Sitter we have that the superalgebras of d = 6, 7 and d = 10, 11 coincide. For d = 6 there exists a smaller subalgebra, the exceptional superalgebra $f_4$ with two real forms, $f_4^1$ and $f_4^2$, whose bosonic parts are spin(6, 1) ⊕ su(2) and spin(5, 2) ⊕ su(2) respectively. $f_4$ has a non chiral odd part of type (1, 1). They are the proper de Sitter and anti de Sitter superalgebras. For d ≤ 7 one can find a simple superalgebra whose bosonic part has the de Sitter or anti de Sitter algebra as a factor. For higher dimensions this is not true, as one can check directly from Table 1.

The contractions of these algebras to Poincaré superalgebras were studied in detail in [9] for N = 1. It is of interest to note that the contractions give super Poincaré algebras with a number of odd generators which is not, in general, the minimal one in super Poincaré. In Table 1 we have added the dimension of the odd part of the superalgebra (“odd”) together with the dimension of the odd part of the N-extended super Poincaré algebra for a spacetime of type (d − 1, 1) (“odd SP”). The de Sitter superalgebra gives the correct contraction for d = 4, 5, 12.
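The comparison between the “odd” and “odd SP” columns of Table 1 can be reproduced mechanically. The sketch below transcribes the three odd-dimension columns of Table 1 in units of N (the dictionary and function names are hypothetical) and lists the dimensions in which the contraction has the minimal number of odd generators. With the rows as transcribed here, the anti de Sitter column also matches for d = 3, in addition to the values of d listed in the text, and every non-matching entry is exactly twice the super Poincaré value.

```python
# d: (odd de Sitter, odd anti de Sitter, odd super Poincare), in units of N,
# transcribed from Table 1.
TABLE_1 = {
    3: (4, 2, 2),
    4: (4, 4, 4),
    5: (8, 8, 8),
    6: (16, 16, 8),
    7: (32, 16, 16),
    8: (32, 32, 16),
    9: (32, 32, 16),
    10: (32, 32, 16),
    11: (64, 32, 32),
    12: (64, 64, 64),
}

DE_SITTER, ANTI_DE_SITTER, SUPER_POINCARE = 0, 1, 2


def minimal_contractions(column):
    """Dimensions d where the odd part equals the super Poincare one."""
    return [d for d, row in sorted(TABLE_1.items())
            if row[column] == row[SUPER_POINCARE]]


def doubled_contractions(column):
    """Dimensions d where the odd part is twice the super Poincare one."""
    return [d for d, row in sorted(TABLE_1.items())
            if row[column] == 2 * row[SUPER_POINCARE]]


if __name__ == "__main__":
    print("de Sitter, minimal:", minimal_contractions(DE_SITTER))
    print("anti de Sitter, minimal:", minimal_contractions(ANTI_DE_SITTER))
    print("de Sitter, doubled:", doubled_contractions(DE_SITTER))
    print("anti de Sitter, doubled:", doubled_contractions(ANTI_DE_SITTER))
```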
The anti de Sitter superalgebra gives the correct contraction for d = 4, 5, 7, 11, 12. The remaining cases give twice the number of odd generators.

Table 1. De Sitter and anti de Sitter superalgebras.

d    de Sitter              odd    anti de Sitter          odd    odd SP
3    osp(N|2, C)_R          4N     osp(N − q, q|2)         2N     2N
4    osp(N*|2, 2)           4N     osp(N − q, q|4)         4N     4N
5    su*(4|N)               8N     su(2, 2|N − q, q)       8N     8N
6    osp(8*|2N − 2q, 2q)    16N    osp(8*|2N − 2q, 2q)     16N    8N
7    osp(8|2N, C)           32N    osp(8*|2N − 2q, 2q)     16N    16N
8    osp(8, 8|2N)           32N    osp(16*|2N − 2q, 2q)    32N    16N
9    sl(16|N)               32N    su(8, 8|N − q, q)       32N    16N
10   osp(N − q, q|32)       32N    osp(N − q, q|32)        32N    16N
11   osp(N|32, C)_R         64N    osp(N − q, q|32)        32N    32N
12   osp(N*|32, 32)         64N    osp(N − q, q|64)        64N    64N

For d = 6, 10 one obtains
by contraction a non chiral algebra (in both the de Sitter and the anti de Sitter case), and the same happens if one makes the contraction of $f_4$. Physically, in dimension d ≤ 7 supergravity theories exist with both Minkowski and anti de Sitter supersymmetric solutions.

4. Examples

We consider some examples in d = 10 and d = 4.

Spacetime of type (9, 1). We have d = 2 mod 8 and ρ = 0 mod 8. The chiral spinor modules $S^\pm$ are real of dimension 2n = 16. There is a chiral Poincaré superalgebra. The N = 2 chiral super Poincaré algebra is the IIB algebra. One can also construct a non chiral one with the odd generators in the direct sum $S^+ \oplus S^-$ and it is the IIA algebra. Both have 32 odd spinor charges.

The conformal algebra is so(10, 2). Poincaré superalgebras are subalgebras of the conformal superalgebras. The chiral algebra of type (1, 0) is a subalgebra of osp(1|32), the embedding given by

$$\begin{array}{ccc}
sp(32) & \xleftarrow{\ \supset\ } & so(10, 2) \\
32 & \longrightarrow & S^+ = 32\,.
\end{array}$$

Type IIB algebra (type (2, 0)) is a subalgebra of osp(2 − q, q|32), with q = 0, 1 and q = 0 for compact R-symmetry. It has 64 spinor charges. Finally, this superalgebra is embedded (not as a subalgebra) into another superalgebra with 64 spinor charges, osp(1|64), which has the interpretation of the superconformal algebra of a spacetime of type (10, 1) (that is, one dimension more). The embedding of the even parts and decompositions of the representations which are the odd part are as follows

$$\begin{array}{ccccc}
sp(64) & \xleftarrow{\ \supset\ } & so(2) \oplus sp(32) & \xleftarrow{\ \supset\ } & so(2) \oplus so(10, 2) \\
64 & \longrightarrow & (2, 32) & \longrightarrow & (2, S^+) = (2, 32)\,.
\end{array}$$

Type IIA is also embedded into a superconformal algebra with 64 spinor charges, osp(1|N 2n) = osp(1|64). As before, we have

$$\begin{array}{ccccc}
sp(64) & \xleftarrow{\ \supset\ } & spin(11, 2) & \xleftarrow{\ \supset\ } & so(10, 2) \\
64 & \longrightarrow & S = 64 & \longrightarrow & S^+ \oplus S^- = 32^+ \oplus 32^-\,.
\end{array}$$
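The dimension bookkeeping behind these three embeddings can be spelled out in a few lines. This is only a consistency sketch under stated assumptions: sp(n) is taken to act on an n-dimensional space (so its dimension is n(n + 1)/2), the odd part of osp(m|n) is counted as m · n, and the helper names are hypothetical. Matching dimensions are a necessary condition for the embeddings, not a proof of them.

```python
def dim_so(n):
    """Dimension of so(p, q) with p + q = n."""
    return n * (n - 1) // 2


def dim_sp(n):
    """Dimension of sp(n) acting on an n-dimensional space (n even)."""
    return n * (n + 1) // 2


def odd_dim_osp(m, n):
    """Number of odd generators of osp(m|n)."""
    return m * n


# Odd generators of the superalgebras appearing in the (9, 1) example.
ODD_CHARGES = {
    "osp(1|32)": odd_dim_osp(1, 32),
    "osp(2|32)": odd_dim_osp(2, 32),
    "osp(1|64)": odd_dim_osp(1, 64),
}

# Even-part chains, written from the smallest to the largest algebra.
CHAIN_1_0 = [dim_so(12), dim_sp(32)]                       # so(10,2) in sp(32)
CHAIN_IIB = [1 + dim_so(12), 1 + dim_sp(32), dim_sp(64)]   # so(2)+so(10,2) in so(2)+sp(32) in sp(64)
CHAIN_IIA = [dim_so(12), dim_so(13), dim_sp(64)]           # so(10,2) in spin(11,2) in sp(64)

if __name__ == "__main__":
    print(ODD_CHARGES)
    print(CHAIN_1_0, CHAIN_IIB, CHAIN_IIA)
```

The branchings of the 64-dimensional representation, 64 → (2, 32) and 64 → 32 ⊕ 32, both preserve the total dimension.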
The d = 4 case. It is interesting to consider the case of dimension 4 with all possible signatures ρ = 0, 2, 4. For ρ = 4 (Euclidean case) the superalgebra is su*(4|2N), for ρ = 2 (Lorentzian case) the superalgebra is su(2, 2|N) and for ρ = 0 it is sl(4|N). The superalgebras with 32 charges are su*(4|4), su(2, 2|4) and sl(4|4) (since n = m these algebras have no u(1) or o(1, 1) factor). They correspond to the
underlying symmetries of N = 4 Euclidean Yang–Mills [31, 32], N = 4 ordinary Yang–Mills and N = 4 self dual Yang–Mills [33] considered in the literature. Only the latter two exist with eight charges, corresponding to N = 1 supersymmetry, su(2, 2|1) and sl(4|1).

We note that these minimal superconformal algebras have a further extension into osp(1|8) [9, 34], since sp(8, R) contains both su(2, 2) ⊕ u(1) and sl(4, R) ⊕ R. osp(1|8) can also be viewed as an anti de Sitter superalgebra in d = 5, so by contraction we get the five dimensional super Poincaré algebra with Z-charges. The Z-charges do not appear if we make the contraction from the minimal superalgebra su(2, 2|1).

It is interesting to note that the enlargement of su(2, 2) ⊕ u(1) to sp(8, R) does not change the rank of the algebra, so the number of quantum numbers that label an irreducible unitary representation would be the same. osp(1|8) has (as all superconformal algebras) an o(1, 1) grading

$$osp(1|8) = L_{-1} \oplus Q_{-1/2} \oplus L_0 \oplus Q_{+1/2} \oplus L_{+1},$$

with $L_0 = sl(4, R) \oplus so(1, 1)$. Note that sl(4, R) = spin(3, 3) and that we have

$$so(3, 1) \oplus so(2) \subset so(3, 3), \qquad (\rho = 2)$$

with so(2) being the R-symmetry of su(2, 2|1) and

$$so(2, 2) \oplus so(1, 1) \subset so(3, 3), \qquad (\rho = 0)$$

with so(1, 1) being the R-symmetry of sl(4|1).

Appendix

In this appendix we give some definitions that are used throughout the paper. There are many references where these concepts are treated in great detail. We cite here the ones we have used [35, 36, 3, 37].

Appendix A

A super Lie algebra is a $Z_2$-graded vector space $g = g_0 + g_1$ with a bilinear operation [ , ] : g × g → g satisfying the following properties:

(a) We say that an element a ∈ g is homogeneous of degree $p_a = 0, 1$ if $a \in g_0$ (a is even) or $a \in g_1$ (a is odd) respectively. We then have that for a and b homogeneous elements of g

$$p_{[a,b]} = p_a + p_b \mod 2.$$

(b) The bracket is graded-skew symmetric,

$$[a, b] = -(-1)^{p_a p_b}\,[b, a],$$

with a and b homogeneous in g.
(c) Generalized Jacobi identity,

$$(-1)^{p_a p_c}\,[a, [b, c]] + (-1)^{p_c p_b}\,[c, [a, b]] + (-1)^{p_b p_a}\,[b, [c, a]] = 0$$

if a, b and c are homogeneous in g.

It follows that $g_0$ is an ordinary Lie algebra and that the subspace of the odd elements $g_1$ carries a representation of $g_0$. A superalgebra g is simple if it has no other ideal than 0 and g. If g is simple, then the representation of $g_0$ on $g_1$ is faithful and $[g_1, g_1] = g_0$. If these two conditions are satisfied and, in addition, the representation of $g_0$ on $g_1$ is irreducible, then g is simple.

Appendix B

We give here the definition of some classical complex superalgebras that are used in the text. Let $V = V_0 \oplus V_1$ be a $Z_2$ graded vector space over C with dim $V_0$ = m and dim $V_1$ = n. Then we have that the endomorphisms of V are also a graded vector space. In terms of a basis

$$\mathrm{End}(V) = \left\{ \begin{pmatrix} A_{m\times m} & B_{n\times m} \\ C_{m\times n} & D_{n\times n} \end{pmatrix} \right\}, \quad
\mathrm{End}(V)_0 = \left\{ \begin{pmatrix} A_{m\times m} & 0 \\ 0 & D_{n\times n} \end{pmatrix} \right\}, \quad
\mathrm{End}(V)_1 = \left\{ \begin{pmatrix} 0 & B_{n\times m} \\ C_{m\times n} & 0 \end{pmatrix} \right\}.$$

It is endowed with a super Lie algebra structure with bracket

$$[a, b] = ab - (-1)^{p_a p_b}\, ba.$$

This superalgebra is denoted by gl(m|n, C). We define the supertrace of an element of gl(m|n, C) as

$$\mathrm{str} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \mathrm{tr}(A) - \mathrm{tr}(D).$$

The subspace of elements of gl(m|n, C) that have zero supertrace is a subsuperalgebra denoted by sl(m|n, C). If m ≠ n, then sl(m|n, C) is a simple superalgebra. Its even part is sl(m, C) ⊕ sl(n, C) ⊕ C. In sl(n|n) there is a one-dimensional ideal i generated by the identity matrix $\mathbb{1}_{2n\times 2n}$. The algebra sl(n|n)/i is also simple. Its even part is sl(n, C) ⊕ sl(n, C).

Let F be a non degenerate bilinear form on the graded vector space V. We assume that it is graded symmetric, that is, $F(a, b) = (-1)^{p_a p_b} F(b, a)$. This means that the restriction to $V_0$ is symmetric and the restriction to $V_1$ is antisymmetric. We assume also that F(a, b) = 0 if $a \in V_0$ and $b \in V_1$, so $V_0$ and $V_1$ are orthogonal. Because of the non degeneracy, we have that dim($V_1$) must be an even number. In a certain basis the bilinear form is given by a matrix

$$F = \begin{pmatrix} \mathbb{1}_{m\times m} & 0 \\ 0 & \Omega_{2p\times 2p} \end{pmatrix}, \qquad \Omega_{2p\times 2p} = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}.$$

The subspace of gl(m|2p, C) which satisfies

$$a^t F + F a = 0, \qquad a^t = \begin{pmatrix} A^T & C^T \\ -B^T & D^T \end{pmatrix}$$

(T denotes the usual transpose) is a simple Lie superalgebra whose even part is so(m, C) ⊕ sp(2p, C). It is called the orthosymplectic algebra, osp(m|2p).

Appendix C

The complex Lie superalgebras defined above have real forms that are real simple Lie superalgebras. These real forms are determined by the real form of the even part (see [30, 9, 10]). We list here the ones that are of interest for our paper. The notation that we use for the real forms of Lie algebras is the standard one [38].

Real forms of sl(m|n, C).

(1) sl(m|n, R), with even part sl(m, R) ⊕ sl(n, R) ⊕ R.
(2) su(m|n), with even part su(m) ⊕ su(n) ⊕ u(1).
(3) su(m, n|p, q), with even part su(m, n) ⊕ su(p, q) ⊕ u(1).
(4) su*(m|n), with even part su*(m) ⊕ su*(n) ⊕ so(1, 1).
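The complex superalgebra sl(m|n, C) whose real forms are listed above rests on two facts from Appendices A and B: the bracket $[a, b] = ab - (-1)^{p_a p_b} ba$ satisfies the generalized Jacobi identity, and the supertrace of any such bracket vanishes, so that the supertrace-free matrices close under the bracket. The following is a small numerical sketch of both statements on the elementary matrices of gl(2|1), using the standard supertrace tr(A) − tr(D). The choice m = 2, n = 1 and all names are assumptions of this illustration.

```python
from itertools import product

M, N = 2, 1          # dim V_0 and dim V_1 (illustrative choice)
DIM = M + N


def index_parity(i):
    """Parity of the i-th basis vector: 0 on V_0, 1 on V_1."""
    return 0 if i < M else 1


def unit(i, j):
    """Elementary matrix E_ij together with its parity."""
    e = [[0] * DIM for _ in range(DIM)]
    e[i][j] = 1
    return e, (index_parity(i) + index_parity(j)) % 2


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]


def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(DIM)] for i in range(DIM)]


def scale(a, c):
    return [[c * entry for entry in row] for row in a]


def is_zero(a):
    return all(entry == 0 for row in a for entry in row)


def bracket(x, y):
    """Superbracket of homogeneous elements given as (matrix, parity)."""
    (a, pa), (b, pb) = x, y
    sign = -(-1) ** (pa * pb)
    return combine(matmul(a, b), matmul(b, a), sign), (pa + pb) % 2


def supertrace(a):
    """tr(A) - tr(D) for a block matrix with blocks A, B, C, D."""
    return sum((-1) ** index_parity(i) * a[i][i] for i in range(DIM))


def jacobi_defect(x, y, z):
    """Left-hand side of the generalized Jacobi identity, property (c)."""
    pa, pb, pc = x[1], y[1], z[1]
    t1 = scale(bracket(x, bracket(y, z))[0], (-1) ** (pa * pc))
    t2 = scale(bracket(z, bracket(x, y))[0], (-1) ** (pc * pb))
    t3 = scale(bracket(y, bracket(z, x))[0], (-1) ** (pb * pa))
    return combine(combine(t1, t2, 1), t3, 1)


BASIS = [unit(i, j) for i in range(DIM) for j in range(DIM)]

SUPERTRACE_VANISHES = all(supertrace(bracket(x, y)[0]) == 0
                          for x, y in product(BASIS, repeat=2))
JACOBI_HOLDS = all(is_zero(jacobi_defect(x, y, z))
                   for x, y, z in product(BASIS, repeat=3))

if __name__ == "__main__":
    print(SUPERTRACE_VANISHES, JACOBI_HOLDS)
```

Since the elementary matrices span gl(2|1) and both identities are multilinear, checking them on this basis covers the whole superalgebra.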
Real forms of osp(m|n, C) (n = 2p).

(1) osp(m|n, R), with even part so(m, R) ⊕ sp(n, R).
(2) osp(m, q|n), with even part so(m, q) ⊕ sp(n).
(3) osp(m*|2s, 2t), with even part so*(m) ⊕ usp(2s, 2t).

Additionally, there are other simple Lie superalgebras that are constructed by taking the complex Lie superalgebra and looking at it as a real Lie superalgebra of twice the dimension. Their even parts correspond to the complex even parts taken as real Lie algebras. We denote those by $sl(n|m, C)_R$ and $osp(n|m, C)_R$.

Acknowledgments

S. Ferrara would like to thank the Dipartimento di Fisica, Politecnico di Torino for its kind hospitality during the completion of this work. The work of S. Ferrara has been supported in part by the European Commission RTN network HPRN-CT-2000-00131 (Laboratori Nazionali di Frascati, INFN) and by the D.O.E. grant DE-FG03-91ER40662, Task C. M. A. Lledó would like to thank the Department of Physics and Astronomy of the University of California, Los Angeles for its hospitality during the completion of this work.

References

[1] J. Wess and B. Zumino, Supergauge transformations in four dimensions, Nucl. Phys. B70 (1974) 39.
[2] W. Nahm, V. Rittenberg and M. Scheunert, Classification of all simple graded Lie algebras whose Lie algebra is reductive, J. Math. Phys. 17 (1976) 1626.
[3] V. G. Kac, A sketch of Lie superalgebra theory, Commun. Math. Phys. 53 (1977) 31; Lie Superalgebras, Adv. Math. 26 (1977) 8; J. Math. Phys. 21 (1980) 689.
[4] I. Bars and M. Günaydin, Construction of Lie algebras and Lie superalgebras from ternary algebras, J. Math. Phys. 20(9) (1979) 1977.
[5] J. Strathdee, Extended Poincaré supersymmetry, Int. J. Mod. Phys. A2(1) (1987) 273.
[6] D. Alekseevsky and V. Cortés, Classification of N-(super)-extended Poincaré algebras and bilinear invariants of the spinor representation of Spin(p, q), Commun. Math. Phys. 183 (1997) 477–510.
[7] M. J. Duff, R. R. Khuri and J. X. Lu, String Solitons, Phys. Rept. 259 (1995) 213.
[8] G. W. Gibbons and P. K. Townsend, Vacuum interpolation in supergravity via super P-Branes, Phys. Rev. Lett. 71 (1993) 3754.
[9] R. D'Auria, S. Ferrara, M. A. Lledó and V. S. Varadarajan, Spinor algebras, J. Geom. Phys. 40 (2001) 101–129.
[10] R. D'Auria, S. Ferrara and M. A. Lledó, On the embedding of space-time symmetries into simple superalgebras, Lett. Math. Phys. 57 (2001) 123–133.
[11] W. Nahm, Supersymmetries and their representations, Nucl. Phys. B135 (1978) 149.
[12] S. Coleman and J. Mandula, All possible symmetries of the S matrix, Phys. Rev. 159 (1967) 1251.
[13] R. Haag, J. Lopuszański and M. Sohnius, All possible generators of supersymmetries of the S matrix, Nucl. Phys. B88 (1975) 257.
[14] E. Witten, Five-branes and M-theory on an Orbifold, Nucl. Phys. B463 (1996) 383.
[15] C. Vafa, Evidence for F-theory, Nucl. Phys. B469 (1996) 403.
[16] I. Bars, Two-time physics in field theory, Phys. Rev. D62 (2000) 046007; A case for 14 dimensions, Phys. Lett. B403 (1997) 257–264; S-Theory, Phys. Rev. D55 (1997) 2373–2381.
[17] P. K. Townsend, M(embrane) theory on $T^9$, Nucl. Phys. Proc. Suppl. 68 (1998) 11.
[18] P. K. Townsend, p-brane democracy, in the proceedings of the March 95 PASCOS/Johns Hopkins conference.
[19] P. Horava, M-theory as a holographic field theory, Phys. Rev. D59 (1999) 046004.
[20] E. Bergshoeff and A. Van Proeyen, The Many faces of OSp(1|32), Class. Quant. Grav. 17 (2000) 3277; Symmetries of string, M and F-theories, Class. Quant. Grav. 18 (2001) 3083; The unifying superalgebra OSp(1|32), preprint.
[21] R. D'Auria and P. Frè, Geometric supergravity in D = 11 and its hidden supergroup, Nucl. Phys. B201 (1982) 101.
[22] E. Angelopoulos, M. Flato, C. Fronsdal and D. Sternheimer, Massless particles, conformal group and de Sitter universe, Phys. Rev. D23 (1981) 1278.
[23] M. Günaydin, Unitary supermultiplets of OSp(1/32, R) and M-theory, Nucl. Phys. B528 (1998) 432.
[24] P. West, Hidden superconformal symmetry in M theory, JHEP 0008 (2000) 007.
[25] O. Aharony, S. S. Gubser, J. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, Phys. Rept. 323 (2000) 183. [26] E. Witten, Quantum gravity in de Sitter space, preprint. [27] A. Strominger, The dS/CFT correspondence, JHEP 0110 (2001) 034. [28] C. M. Hull, De Sitter space in supergravity and M theory, JHEP 0111 (2001) 012. [29] C. Hull, Duality and the signature of space-time. JHEP 9811 (1998) 017; Symmetries and compactifications of (4, 0) conformal gravity, JHEP 0012 (2000) 007. [30] A. Van Proeyen, Tools for supersymmetry, preprint. [31] D. G. McKeon, Harmonic superspace with four-dimensional Euclidean space, Can. J. Phys. 78 (2000) 261; the simplest superalgebras in two, three, four and five dimensions, Nucl. Phys. B591 (2000) 591; D. G. McKeon and T. N. Sherry, Extended supersymmetry in four-dimensional Euclidean space, Ann. Phys. 285 (2000) 221; F. T. Brandt, D. G. McKeon and T. N. Sherry, Supersymmetry in 2 + 2 dimensions, Mod. Phys. Lett. A15 (2000) 1349. [32] A. V. Belitsky, S. Vandoren and P. van Nieuwenhuizen, Instantons, Euclidean supersymmetry and Wick rotations, Phys. Lett. B477 (2000) 335. [33] W. Siegel, The N = 2 (4) string is selfdual N = 4 Yang–Mills, Phys. Rev. D46 (1992) 3235; Selfdual N = 8 supergravity as closed N = 2 (N = 4) strings, Phys. Rev. D47 (1993) 2504; Supermulti-instantons in conformal Chiral superspace, Phys. Rev. D52 (1995) 1042. [34] J. W. Van Holten and A. Van Proeyen, N = 1 supersymmetry algebras in D = 2, D = 3, D = 4 mod(8), J. Phys. A15 (1982) 3763. [35] F. A. Berezin, Introduction to Superanalysis. D. Reidel Publishing Company, Kluwer Academic Publishers Group, 1987. [36] B. Kostant, Differential Geometrical Methods in Mathematical Physics (Proc. Sympos., Univ. Bonn, Bonn, 1975). Lecture Notes in Math., Vol. 570, Springer, Berlin, 1977, pp. 177–306. [37] D. A. Leites, Introduction to the theory of supermanifolds, Russian Math. Surveys 35(1) (1980) 1–64. [38] S. 
Helgason, Differential Geometry, Lie Groups and Symmetric Spaces, Academic Press, 1978.
July 18, 2002 9:30 WSPC/148-RMP
00136
Reviews in Mathematical Physics, Vol. 14, No. 6 (2002) 531–568 © World Scientific Publishing Company
WEAKLY REGULAR FLOQUET HAMILTONIANS WITH PURE POINT SPECTRUM
P. DUCLOS∗,†, O. LEV‡, P. ŠŤOVÍČEK‡ and M. VITTOT∗
∗Centre de Physique Théorique, CNRS, Luminy, Case 907, 13288 Marseille Cedex 9, France
†PHYMAT, Université de Toulon et du Var, BP 132, F-83957 La Garde Cedex, France
‡Department of Mathematics, Faculty of Nuclear Science, Czech Technical University, Trojanova 13, 120 00 Prague, Czech Republic
Received 13 February 2002
Revised 4 June 2002
We study the Floquet Hamiltonian $-i\partial_t + H + V(\omega t)$, acting in $L^2([0,T],\mathcal{H},dt)$, as depending on the parameter $\omega = 2\pi/T$. We assume that the spectrum of $H$ in $\mathcal{H}$ is discrete, $\mathrm{Spec}(H) = \{h_m\}_{m=1}^{\infty}$, but possibly degenerate, and that $t \mapsto V(t) \in B(\mathcal{H})$ is a $2\pi$-periodic function with values in the space of Hermitian operators on $\mathcal{H}$. Let $J > 0$ and set $\Omega_0 = [\frac{8}{9}J, \frac{9}{8}J]$. Suppose that for some $\sigma > 0$ it holds true that $\sum_{h_m > h_n} M_m M_n (h_m - h_n)^{-\sigma} < \infty$, where $M_m$ is the multiplicity of $h_m$. We show that in that case there exist a suitable norm to measure the regularity of $V$, denoted $\epsilon_V$, and positive constants $\epsilon_\star$ and $\delta_\star$, with the property: if $\epsilon_V < \epsilon_\star$ then there exists a measurable subset $\Omega_\infty \subset \Omega_0$ such that its Lebesgue measure fulfills $|\Omega_\infty| \ge |\Omega_0| - \delta_\star \epsilon_V$ and the Floquet Hamiltonian has a pure point spectrum for all $\omega \in \Omega_\infty$.
Keywords: KAM theory; small divisors; perturbation theory; Floquet Hamiltonian; pure point spectrum.
Mathematics Subject Classification 2000: 37J40, 58C40, 34L15, 81Q10
1. Introduction
The problem we address in this paper concerns the spectral analysis of so-called Floquet Hamiltonians. The study of stability of non-autonomous quantum dynamical systems is an effective tool for understanding most quantum problems which involve a small number of particles. When these systems are time-periodic, the spectral analysis of the evolution operator over one period can give fairly good information on this stability, see e.g. [1]. In fact this type of result generalizes the celebrated RAGE theorem concerned with time-independent systems (one can consult [2] for a summary). As shown in [3] and [4], the spectral analysis of the evolution operator over one period (the so-called monodromy operator or Floquet operator) is equivalent to the spectral analysis of the corresponding Floquet Hamiltonian (sometimes
called operator of quasi-energy). This is also what we are aiming for in this article. More precisely, we analyze time-periodic quantum systems which are weakly regular in time and "space" in the sense of an appropriately chosen norm, and give sufficient conditions to ensure that the Floquet Hamiltonian has a pure point spectrum. Such a program is not new. In the pioneering work [5] Bellissard considered the so-called pulsed rotor, which is analytic in time and space, using a KAM type algorithm. Then Combescure [6] was able to treat harmonic oscillators driven by sufficiently smooth perturbations by adapting to quantum mechanics the well known Nash–Moser trick (cf. [7] and [8]). Later on these ideas were extended to a wider class of systems in [9]; it was even possible to require no regularity in space by using the so-called adiabatic regularization, originally proposed in [10] and further extended in [11, 12]. However, none of these papers can be considered as optimal in the sense of having found the minimal value of regularity in time below which the Floquet Hamiltonian ceases to be pure point. Though it is impossible to mention all the relevant contributions to the study of stability of time-dependent quantum systems, we would like to mention the following ones. Perturbation theory for a fixed eigenvalue has been extended, in [13], to Floquet Hamiltonians which generically have a dense point spectrum. Bounded quasi-periodic time-dependent perturbations of two level systems are considered in [14], whereas the case of unbounded perturbations of one-dimensional oscillators is studied in [15]. Averaging methods combined with KAM techniques were described in [16] and [17]. In the present paper we attempt to further improve the KAM algorithm, particularly having in mind more optimal assumptions as far as the regularity in time is concerned.
As a thorough analysis of the algorithm has shown this is possible owing to the fact that the algorithm contains several free parameters (for example the choice of norms in auxiliary Banach spaces that are constructed during the algorithm) which may be adjusted. This type of improvements is also illustrated on an example following Theorem 2.1. A more detailed discussion of this topic is postponed to concluding remarks in Sec. 10. Another generalization is that in the present result (Theorem 2.1) we allow degenerate eigenvalues of the unperturbed Hamilton operator (denoted H in what follows). The degeneracy of eigenvalues hm of H can grow arbitrarily fast with m provided the time-dependent perturbation is sufficiently regular. To our knowledge this is a new feature in this context. Previously two conditions were usually imposed, namely bounded degeneracy and a growing gap condition on eigenvalues hm , reducing this way the scope of applications of this theory to one-dimensional confined systems. Owing to the generalization to degenerate eigenvalues we are able to consider also some models in higher dimensions, for example the N -dimensional quantum top, i.e., the N -dimensional version of the pulsed rotor. A short description of this model is given, too, after Theorem 2.1. The article is organized as follows. In Sec. 2 we introduce the notation and formulate the main theorem. The basic idea of the KAM-type algorithm is outlined in
Sec. 3. The algorithm consists in an iterative procedure resulting in diagonalization of the Floquet Hamiltonian. For this purpose one constructs an auxiliary sequence of Banach spaces which form in fact a directed sequence. The procedure itself may formally be formulated in terms of an inductive limit. Sections 4–8 contain some additional results needed for the proof, particularly the details of the construction of the auxiliary Banach spaces and how they are related to Hermitian operators in the given Hilbert space, and a construction of the set of "non-resonant" frequencies for which the Floquet Hamiltonian has a pure point spectrum (the frequency is considered as a parameter). Section 9 is devoted to the proof of Theorem 2.1. In Sec. 10 we conclude our presentation with several remarks concerning comparison of the result stated in Theorem 2.1 with some previous ones.

2. Main Theorem
The central object we wish to study in this paper is a self-adjoint operator of the form $\mathbf{K} + \mathbf{V}$ acting in the Hilbert space
\[ \mathcal{K} = L^2([0,T],dt) \otimes \mathcal{H} \cong L^2([0,T],\mathcal{H},dt) \]
where $T = 2\pi/\omega$, $\omega$ is a positive number (a frequency) and $\mathcal{H}$ is a fixed separable Hilbert space. The operator $\mathbf{K}$ is self-adjoint and has the form
\[ \mathbf{K} = -i\partial_t \otimes 1 + 1 \otimes H \]
where the differential operator $-i\partial_t$ acts in $L^2([0,T],dt)$ and represents the self-adjoint operator characterized by periodic boundary conditions. This means that the eigenvalues of $-i\partial_t$ are $k\omega$, $k \in \mathbb{Z}$, and the corresponding normalized eigenvectors are $\chi_k(t) = T^{-1/2}\exp(ik\omega t)$. $H$ is a self-adjoint operator in $\mathcal{H}$ and is supposed to have a discrete spectrum. Finally, $\mathbf{V}$ is a bounded Hermitian operator in $\mathcal{K}$ determined by a measurable operator-valued function $t \mapsto V(\omega t) \in B(\mathcal{H})$ such that $\sup_{t\in\mathbb{R}} \|V(t)\| < \infty$, $V(t)$ is $2\pi$-periodic, and for almost all $t \in \mathbb{R}$, $V(t)^* = V(t)$. Naturally, $(\mathbf{V}\psi)(t) = V(\omega t)\psi(t)$ in $\mathcal{K} \cong L^2([0,T],\mathcal{H},dt)$. Let
\[ -i\partial_t = \sum_{k\in\mathbb{Z}} k\omega P_k \]
be the spectral decomposition of $-i\partial_t$ in $L^2([0,T],dt)$ and let
\[ H = \sum_{m\in\mathbb{N}} h_m Q_m \]
be the spectral decomposition of $H$ in $\mathcal{H}$. Thus we can write
\[ \mathcal{H} = \sum_{m\in\mathbb{N}}^{\oplus} \mathcal{H}_m \]
where $\mathcal{H}_m = \mathrm{Ran}\, Q_m$ are the eigenspaces. We suppose that the multiplicities are finite,
\[ M_m = \dim \mathcal{H}_m < \infty, \quad \forall m \in \mathbb{N}. \]
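Hence the spectrum of $\mathbf{K}$ is pure point and its spectral decomposition reads
\[ \mathbf{K} = \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} (k\omega + h_m)\, P_k \otimes Q_m, \tag{1} \]
implying a decomposition of $\mathcal{K}$ into a direct sum,
\[ \mathcal{K} = \sum_{(k,m)\in\mathbb{Z}\times\mathbb{N}}^{\oplus} \mathrm{Ran}(P_k \otimes Q_m). \]
Here is some additional notation. Set
\[ V_{knm} = \frac{1}{T}\int_0^T e^{-ik\omega t}\, Q_n V(\omega t) Q_m\, dt = \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt}\, Q_n V(t) Q_m\, dt \in B(\mathcal{H}_m, \mathcal{H}_n). \tag{2} \]
Further,
\[ \Delta_{mn} = h_m - h_n, \quad \text{and} \quad \Delta_0 = \inf_{m\neq n} |\Delta_{mn}|. \]
Now we are able to formulate our main result. Though not indicated explicitly in the notation, the operator $\mathbf{K} + \mathbf{V}$ is considered as depending on the parameter $\omega$.

Theorem 2.1. Fix $J > 0$ and set $\Omega_0 := [\frac{8}{9}J, \frac{9}{8}J]$. Assume that $\Delta_0 > 0$ and that there exists $\sigma > 0$ such that
\[ \Delta_\sigma(J) := J^{\sigma} \sum_{\substack{m,n\in\mathbb{N} \\ \Delta_{mn} > J/2}} \frac{M_m M_n}{(\Delta_{mn})^{\sigma}} < \infty. \]
Then for every $r > \sigma + \frac{1}{2}$ there exist positive constants (depending, as indicated, on $\sigma$, $r$, $\Delta_0$ and $J$ but independent of $V$), $\epsilon_\star(r,\Delta_0,J)$ and $\delta_\star(\sigma,r,J)$, with the property: if
\[ \epsilon_V := \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|V_{knm}\| \max\{|k|^r, 1\} < \min\left\{ \epsilon_\star(r,\Delta_0,J),\ \frac{|\Omega_0|}{\delta_\star(\sigma,r,J)} \right\} \]
(here $|\Omega_*|$ stands for the Lebesgue measure of $\Omega_*$) then there exists a measurable subset $\Omega_\infty \subset \Omega_0$ such that
\[ |\Omega_\infty| \ge |\Omega_0| - \delta_\star(\sigma,r,J)\, \epsilon_V \tag{3} \]
and the operator $\mathbf{K} + \mathbf{V}$ has a pure point spectrum for all $\omega \in \Omega_\infty$.

Remark 2.2. (1) It is sometimes necessary to consider potentials $V$ which depend on the frequency $\omega$ in a more elaborate way. Suppose that $V : \mathbb{R} \times \mathbb{R}_+ \to B(\mathcal{H})$ is a bounded measurable function, which is $2\pi$-periodic with respect to the first variable and such that for almost all $t \in \mathbb{R}$ and $\omega \in \mathbb{R}_+$, $V(t,\omega)^* = V(t,\omega)$; then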
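$(\mathbf{V}\psi)(t) = V(\omega t, \omega)\psi(t)$ defines an operator family which is uniformly bounded on $\mathcal{K}$ with respect to the variable $\omega$. Let now
\[ \epsilon_V := \sup_{\omega,\omega'\in\Omega_0}\ \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z},\, m\in\mathbb{N}} \left( \|V_{knm}(\omega)\| + J\,\|\tilde\partial V_{knm}(\omega,\omega')\| \right) \max\{|k|^r, 1\} \]
where
\[ V_{knm}(\omega) := \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt}\, Q_n V(t,\omega) Q_m\, dt \]
and where the symbol $\tilde\partial$ is the discrete derivative in $\omega$,
\[ \tilde\partial V_{knm}(\omega,\omega') := \frac{V_{knm}(\omega) - V_{knm}(\omega')}{\omega - \omega'}. \]
It is not difficult to check that Theorem 2.1 applies under exactly the same conditions.
(2) In the course of the proof we shall show even more. Namely, for all $\omega \in \Omega_\infty$ and any eigenvalue of $\mathbf{K} + \mathbf{V}$ the corresponding eigen-projector $P$ belongs to the Banach algebra with the norm
\[ \|P\| = \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|P_{knm}\| \max\{|k|^{r-\sigma-\frac{1}{2}}, 1\}. \]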
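This shows that $P$ is $(r - \sigma - 1/2)$-differentiable as a map from $[0,T]$ to the space of bounded operators in $\mathcal{H}$.
(3) The constants $\epsilon_\star(r,\Delta_0,J)$ and $\delta_\star(\sigma,r,J)$ are in fact known quite explicitly and are given by formulae (70), (71), (77) and (78). Setting $\alpha = 2$ and $q^r = e^2$ in these formulae (this is a possible choice) we get
\[ \epsilon_\star(r,\Delta_0,J) = \frac{\min\{4\Delta_0, J\}}{270\, e^3}, \]
and
\[ \delta_\star(\sigma,r,J) = 720\pi e^5\, 2^{\sigma} \left( \frac{2\sigma+1}{(1 - e^{-\frac{2}{r}})\, e} \right)^{\sigma+\frac{1}{2}} \left( \sum_{s=1}^{\infty} s^2 e^{-\frac{2}{r}(r-\sigma-\frac{1}{2})s} \right) \Delta_\sigma(J) \]
\[ \phantom{\delta_\star(\sigma,r,J)} = 720\pi\, 2^{\sigma} e^{3+\frac{2}{r}(\sigma+\frac{1}{2})} \left( \frac{2\sigma+1}{(1 - e^{-\frac{2}{r}})\, e} \right)^{\sigma+\frac{1}{2}} \frac{1 + e^{-2+\frac{2}{r}(\sigma+\frac{1}{2})}}{\left(1 - e^{-2+\frac{2}{r}(\sigma+\frac{1}{2})}\right)^{3}}\, \Delta_\sigma(J) \]
where we used the estimate
\[ \sum_{s=1}^{\infty} s^2 e^{-2xs} = \frac{\cosh(x)}{4\sinh(x)^3} \le \frac{1}{4x^3}. \]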
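The closed form and the bound in the estimate above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the sample values of $x$ are illustrative choices, not part of the paper.

```python
import math

def partial_sum(x, terms=2000):
    """Partial sum of sum_{s>=1} s^2 exp(-2 x s); the tail is negligible for x >= 0.05."""
    return sum(s * s * math.exp(-2.0 * x * s) for s in range(1, terms + 1))

def closed_form(x):
    """cosh(x) / (4 sinh(x)^3), the value of the full series."""
    return math.cosh(x) / (4.0 * math.sinh(x) ** 3)

def upper_bound(x):
    """The bound 1 / (4 x^3) quoted in the text."""
    return 1.0 / (4.0 * x ** 3)

for x in (0.05, 0.3, 1.0, 2.5):
    print(x, partial_sum(x), closed_form(x), upper_bound(x))
```

For small $x$ the bound is nearly attained, which is why the estimate is sharp in the regime where $r$ is close to $\sigma + \frac{1}{2}$.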
We conclude this section with a brief description of two models illustrating the effectiveness of Theorem 2.1. In the first model we set H = L2 ([0, 1], dx), H = −∂x2 with Dirichlet boundary conditions, and V (t) = z(t)x2 where z(t) is a sufficiently regular 2π-periodic function. As shown in [18] the spectral analysis of this simple
model is essentially equivalent to the analysis of the so-called quantum Fermi accelerator. The particularity of the latter model is that the underlying Hilbert space itself is time-dependent, $\mathcal{H}_t = L^2([0,a(t)],dx)$ where $a(t)$ is a strictly positive periodic function. The time-dependent Hamiltonian is $-\partial_x^2$ with Dirichlet boundary conditions. Using a convenient transformation one can pass from the Fermi accelerator to the former model, getting the function $z(t)$ expressed in terms of $a(t)$, $a'(t)$ and $a''(t)$. But let us return to the analysis of our model. Eigenvalues of $H$ are non-degenerate, $h_m = m^2\pi^2$ for $m \in \mathbb{N}$, with normalized eigenfunctions equal to $\sqrt{2}\sin(m\pi x)$. Note that in the notation we are using in the present paper $0 \notin \mathbb{N}$. A straightforward calculation gives
\[ V_{knm} = z_k \times \begin{cases} \dfrac{8(-1)^{m+n}\, mn}{(m^2-n^2)^2\pi^2} & \text{if } m \neq n, \\[2mm] \dfrac{1}{3} - \dfrac{1}{2m^2\pi^2} & \text{if } m = n, \end{cases} \]
where $z_k = \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt} z(t)\, dt$ is the Fourier coefficient of $z(t)$. Hence one derives that
\[ \epsilon_V = \sup_{n\in\mathbb{N}} \left( \frac{1}{3} + \frac{2}{n^2\pi^2} + \frac{4}{\pi^2}\sum_{j=1}^{n-1}\frac{1}{j^2} \right) \sum_{k\in\mathbb{Z}} |z_k| \max\{|k|^r, 1\} = \sum_{k\in\mathbb{Z}} |z_k| \max\{|k|^r, 1\}. \]
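Both formulas of the first model can be verified numerically: the matrix elements of $x^2$ in the Dirichlet basis, and the fact that the row sums increase to the supremum 1. The sketch below uses a hand-written Simpson rule in plain Python; all function names, the grid size and the cutoff are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def x2_element_numeric(n, m):
    """<e_n, x^2 e_m> with e_m(x) = sqrt(2) sin(m pi x) on [0, 1]."""
    integrand = lambda x: 2.0 * math.sin(n * math.pi * x) * x * x * math.sin(m * math.pi * x)
    return simpson(integrand, 0.0, 1.0)

def x2_element_closed(n, m):
    """Closed form of the same matrix element, as quoted in the text."""
    if m == n:
        return 1.0 / 3.0 - 1.0 / (2.0 * m * m * math.pi ** 2)
    return 8.0 * (-1) ** (m + n) * m * n / ((m * m - n * n) ** 2 * math.pi ** 2)

def row_sum_closed(n):
    """1/3 + 2/(n^2 pi^2) + (4/pi^2) sum_{j<n} 1/j^2, the bracket in the formula for the norm."""
    return 1.0 / 3.0 + 2.0 / (n * n * math.pi ** 2) + 4.0 / math.pi ** 2 * sum(1.0 / (j * j) for j in range(1, n))

def row_sum_direct(n, cutoff=100000):
    """Truncated sum over m of |<e_n, x^2 e_m>|."""
    return sum(abs(x2_element_closed(n, m)) for m in range(1, cutoff + 1))

print(x2_element_numeric(2, 3), x2_element_closed(2, 3))
print([row_sum_closed(n) for n in (1, 10, 1000)])
```

The row sums approach 1 from below, which is why the prefactor disappears in the last equality.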
For any $J > 0$, $\Delta_\sigma(J)$ is finite if and only if $\sigma > 1$. On the other hand, to have $\epsilon_V$ finite it is sufficient that $z(t) \in C^s$ where $s > r + 1 > \sigma + \frac{1}{2} + 1 > \frac{5}{2}$. So $z(t) \in C^3$ suffices for the theory to be applicable. This may be compared to an older result in [9, Sec. 4.2], giving a much worse condition, namely $z(t) \in C^{17}$.
The second model is the pulsed rotator in $N$ dimensions. In this case $\mathcal{H} = L^2(S^N, d\mu)$, with $S^N \subset \mathbb{R}^{N+1}$ being the $N$-dimensional unit sphere with the standard (rotationally invariant) Riemann metric and the induced normalized measure $d\mu$, and $H = -\Delta_{LB}$ is the Laplace–Beltrami operator on $S^N$. The spectrum of $H$ is well known, [11]: $\mathrm{Spec}(H) = \{h_m\}_{m=0}^{\infty}$, where $h_m = m(m+N-1)$ and the multiplicities are
\[ M_m = \binom{m+N}{N} - \binom{m+N-2}{N}. \]
The time-dependent operator $V(t)$ in $\mathcal{H}$ acts via multiplication, $(V(t)\varphi)(x) = v(t,x)\varphi(x)$, where $v(t,x)$ is a real measurable bounded function on $\mathbb{R} \times S^N$ which is $2\pi$-periodic in the variable $t$. Consequently, $\mathcal{K} \cong L^2([0,T] \times S^N, dt\, d\mu)$ and $(\mathbf{V}\psi)(t,x) = v(\omega t, x)\psi(t,x)$. Note that the asymptotic behavior of the eigenvalues and the multiplicities, as $m \to \infty$, is $h_m \sim m^2$, $M_m \sim (2/(N-1)!)\, m^{N-1}$. So $\Delta_\sigma(J)$ is finite, for any $J > 0$, if and only if
\[ \sum_{m^2 - n^2 > J/2} \frac{(nm)^{N-1}}{(m^2-n^2)^{\sigma}} < \infty. \]
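The multiplicity formula and its asymptotics for the second model are easy to tabulate. The sketch below is an illustration only; the function names are hypothetical, and the comparison values $2m+1$ for $S^2$ and $(m+1)^2$ for $S^3$ are the familiar degeneracies of spherical harmonics.

```python
from math import comb, factorial

def eigenvalue(m, N):
    """h_m = m (m + N - 1) for the Laplace-Beltrami operator on S^N."""
    return m * (m + N - 1)

def multiplicity(m, N):
    """M_m = C(m+N, N) - C(m+N-2, N); the second term is taken as 0 when m + N - 2 < 0."""
    lower = comb(m + N - 2, N) if m + N - 2 >= 0 else 0
    return comb(m + N, N) - lower

def asymptotic_ratio(m, N):
    """M_m divided by its claimed leading behaviour 2 m^(N-1) / (N-1)!."""
    return multiplicity(m, N) * factorial(N - 1) / (2.0 * m ** (N - 1))

print([multiplicity(m, 2) for m in range(6)])   # degeneracies on the 2-sphere
print(asymptotic_ratio(10**4, 4))               # close to 1 for large m
```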
To ensure this condition we require that $\sigma > 2(N-1) + 1 = 2N - 1$. Let us assume that there exist $s, u \in \mathbb{Z}_+$ such that, for any system of local (smooth) coordinates $(y_1,\dots,y_N)$ on $S^N$, the derivatives $\partial_t^{\alpha}\partial_{y_1}^{\beta_1}\cdots\partial_{y_N}^{\beta_N} v(t,y_1,\dots,y_N)$ exist and are continuous for all $\alpha$, $\beta$ with $\alpha \le s$ and $\beta_1 + \cdots + \beta_N \le u$. If $u \ge 4$ then $[H,[H,V(t)]]$ is a well defined second order differential operator with continuous coefficient functions and the operator $[H,[H,V(t)]](1+H)^{-1}$ is bounded. Clearly,
\[ \frac{(h_m - h_n)^2}{1 + h_m}\, Q_n V(t) Q_m = Q_n [H,[H,V(t)]](1+H)^{-1} Q_m. \]
Using this relation one derives an estimate on $V_{knm}$,
\[ \|V_{knm}\| \le \mathrm{const}\, \frac{1 + \min\{h_n, h_m\}}{|k|^s (h_m - h_n)^2}, \]
valid for $k \neq 0$ and $m \neq n$. The number
\[ \sup_{n\in\mathbb{Z}_+} \sum_{m\in\mathbb{Z}_+,\, m\neq n} \frac{1 + \min\{h_n, h_m\}}{(h_m - h_n)^2} \]
is finite. To see it one can employ the asymptotics of $h_m$ and the fact that the sequence
\[ a_n = \sum_{m\in\mathbb{Z}_+,\, m\neq n} \frac{1 + \min\{n^2, m^2\}}{(m^2 - n^2)^2} = \left(1 + \frac{1}{n^2}\right)\frac{\pi^2}{12} - \frac{3}{16n^2} + \frac{5}{16n^4} - \frac{1}{2n}\sum_{m=1}^{2n-1}\frac{1}{m}, \]
$n = 1, 2, 3, \dots$, is bounded. It follows that the norm $\epsilon_V$ is finite if $s > r + 1 > \sigma + \frac{1}{2} + 1 > 2N - 1 + \frac{3}{2} = 2N + \frac{1}{2}$. Thus the theory is applicable provided $u \ge 4$ and $s > 2N + \frac{1}{2}$. The same example has also been treated by adiabatic methods in [11]. In that case the assumptions are weaker. It suffices that $v(t,x)$ be $(N+1)$-times differentiable in $t$ with all derivatives $\partial_t^{\alpha} v(t,x)$, $0 \le \alpha \le N+1$, uniformly bounded. However the conclusion is somewhat weaker as well. Under this assumption $\mathbf{K} + \mathbf{V}$ has no absolutely continuous spectrum but nothing is claimed about the singular continuous spectrum.

3. Formal Limit Procedure
Suppose there is given a directed sequence of real or complex Banach spaces, $\{X_s\}_{s=0}^{\infty}$, with linear mappings
\[ \iota_{us} : X_s \to X_u \quad \text{if } s \le u, \text{ with } \|\iota_{us}\| \le 1, \]
(and $\iota_{ss}$ is the unit mapping in $X_s$) and such that
\[ \iota_{vu}\iota_{us} = \iota_{vs} \quad \text{if } s \le u \le v. \]
To simplify the notation we set in what follows $\iota_s = \iota_{s+1,s}$.
Denote by $X_\infty$ the norm inductive limit of $\{X_s, \iota_{us}\}$ in the sense of [19, Sec. 1.3.4] or [20, Sec. 1.23] (the algebraic inductive limit is endowed with a seminorm induced by $\limsup_s \|\cdot\|_s$, the kernel of this seminorm is divided out and the result is completed). $X_\infty$ is related to the original directed sequence via the mappings $\iota_{\infty s} : X_s \to X_\infty$ obeying $\|\iota_{\infty s}\| \le 1$ and $\iota_{\infty u}\iota_{us} = \iota_{\infty s}$ if $s \le u$. By the construction, the union $\bigcup_{s\ge s_0} \iota_{\infty s}(X_s)$ is dense in $X_\infty$ for any $s_0 \in \mathbb{Z}_+$. If $\{A_s \in B(X_s)\}$ is a family of bounded operators, defined for $s \ge s_0$ and such that
\[ A_u \iota_{us} = \iota_{us} A_s \quad \text{if } s_0 \le s \le u, \quad \text{and} \quad \sup_s \|A_s\| < \infty, \]
then $A_\infty \in B(X_\infty)$ designates the inductive limit of this family characterized by the property $A_\infty \iota_{\infty s} = \iota_{\infty s} A_s$, $\forall s \ge s_0$. Let $D_\infty \in B(X_\infty)$ be the inductive limit of a family of bounded operators $\{D_s \in B(X_s);\ s \ge 0\}$, with the property
\[ \|D_s\| \le 1, \quad \|1 - D_s\| \le 1, \quad \forall s. \tag{4} \]
We also suppose that there is given a sequence of one-dimensional spaces $\mathbb{k}K_s$, $s = 0, 1, \dots, \infty$, where the $K_s$ are distinguished basis elements. Here the field $\mathbb{k}$ is either $\mathbb{C}$ or $\mathbb{R}$ depending on whether the Banach spaces $X_s$ are complex or real. Set
\[ \tilde X_s = \mathbb{k}K_s \oplus X_s, \quad s = 0, 1, \dots, \infty. \]
Then $\{\tilde X_s\}_{s=0}^{\infty}$ becomes a directed sequence of vector spaces provided one defines $\tilde\iota_{us} : \tilde X_s \to \tilde X_u$ by
\[ \tilde\iota_{us}|_{X_s} = \iota_{us} \quad \text{and} \quad \tilde\iota_{us}(K_s) = K_u \quad \text{if } s \le u. \]
Set
\[ \phi(x) = \frac{1}{x}\left( e^x - \frac{e^x - 1}{x} \right) = \sum_{k=0}^{\infty} \frac{k+1}{(k+2)!}\, x^k. \tag{5} \]
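The two expressions for $\phi$ in (5) can be compared numerically; the power series also makes it evident that $\phi(0) = \frac{1}{2}$ and that $\phi$ is increasing on $[0,\infty)$, a fact used later in the proof of Proposition 3.3. The following is a small sketch with illustrative function names and sample points.

```python
import math

def phi_closed(x):
    """(1/x) * (e^x - (e^x - 1)/x), valid for x != 0."""
    return (math.exp(x) - math.expm1(x) / x) / x

def phi_series(x, terms=40):
    """Truncated power series sum_k (k+1)/(k+2)! * x^k."""
    return sum((k + 1) / math.factorial(k + 2) * x ** k for k in range(terms))

for x in (-2.0, 0.3, 1.0, 3.0):
    print(x, phi_closed(x), phi_series(x))
```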
Proposition 3.1. Suppose that, in addition to the sequences $\{X_s\}_{s=0}^{\infty}$, $\{K_s\}_{s=0}^{\infty}$ and $\{D_s\}_{s=0}^{\infty}$, there are given sequences $\{V_s\}_{s=0}^{\infty}$ and $\{\Theta^s_u\}_{u=s+1}^{\infty}$ such that $V_s \in X_s$, $\Theta^s_u \in B(X_u)$, and
\[ \Theta^s_v \iota_{vu} = \iota_{vu} \Theta^s_u \quad \text{if } s < u \le v. \tag{6} \]
Set
\[ T_s = e^{\Theta^{s-1}_s} e^{\Theta^{s-2}_s} \cdots e^{\Theta^0_s} \in B(X_s) \quad \text{for } s \ge 1. \tag{7} \]
Let $\{W_s\}_{s=0}^{\infty}$ be another sequence, with $W_s \in X_s$, defined recursively: $W_0 = V_0$,
\[ W_{s+1} = \iota_s(W_s) + T_{s+1}(V_{s+1} - \iota_s(V_s)) + \Theta^s_{s+1}\phi(\Theta^s_{s+1})\, \iota_s (1 - D_s)(W_s - \iota_{s-1}(W_{s-1})), \tag{8} \]
where we set, by convention, $X_{-1} = 0$, $W_{-1} = 0$. Extend the mappings $\Theta^s_u$ to $\tilde\Theta^s_u : \tilde X_u \to \tilde X_u$ by
\[ \tilde\Theta^s_u(K_u) = -\Theta^s_u D_u(\iota_{us}(W_s)) - (1 - D_u)(\iota_{us}(W_s) - \iota_{u,s-1}(W_{s-1})), \tag{9} \]
and consequently the mappings $T_s$ to $\tilde T_s : \tilde X_s \to \tilde X_s$,
\[ \tilde T_s = e^{\tilde\Theta^{s-1}_s} e^{\tilde\Theta^{s-2}_s} \cdots e^{\tilde\Theta^0_s} \quad \text{for } s \ge 1, \quad \tilde T_0 = 1. \]
Then it holds
\[ \tilde T_s(K_s + V_s) = K_s + D_s(W_s) + (1 - D_s)(W_s - \iota_{s-1}(W_{s-1})), \quad s = 0, 1, 2, \dots. \tag{10} \]
Remark 3.2. Since $\tilde\Theta^s_u(K_u) \in X_u$ it is easy to observe that
\[ \tilde T_s(K_s) - K_s \in X_s. \]
Furthermore, note that (9) implies that $\tilde\Theta^s_v(K_v) = \iota_{vu}\tilde\Theta^s_u(K_u)$ if $0 \le s < u \le v$, and so the mappings $\tilde\Theta^s_u$ still satisfy
\[ \tilde\Theta^s_v \tilde\iota_{vu} = \tilde\iota_{vu} \tilde\Theta^s_u \quad \text{if } s < u \le v. \]
Proof of Proposition 3.1. By induction in $s$. For $s = 0$ the claim is obvious. In the induction step $s \to s+1$ one may use the induction hypothesis and relations (9) and (8):
\begin{align*}
\tilde T_{s+1}(K_{s+1} + V_{s+1}) &= \tilde T_{s+1}\tilde\iota_s(K_s + V_s) + T_{s+1}(V_{s+1} - \iota_s(V_s)) \\
&= e^{\tilde\Theta^s_{s+1}}\tilde\iota_s \tilde T_s(K_s + V_s) + T_{s+1}(V_{s+1} - \iota_s(V_s)) \\
&= e^{\tilde\Theta^s_{s+1}}\tilde\iota_s\bigl(K_s + D_s(W_s) + (1 - D_s)(W_s - \iota_{s-1}(W_{s-1}))\bigr) + T_{s+1}(V_{s+1} - \iota_s(V_s)) \\
&= K_{s+1} + D_{s+1}(\iota_s(W_s)) + \frac{e^{\Theta^s_{s+1}} - 1}{\Theta^s_{s+1}}\, \tilde\Theta^s_{s+1}\tilde\iota_s(K_s + D_s(W_s)) \\
&\quad + e^{\Theta^s_{s+1}}\iota_s(1 - D_s)(W_s - \iota_{s-1}(W_{s-1})) + T_{s+1}(V_{s+1} - \iota_s(V_s)) \\
&= K_{s+1} - (1 - D_{s+1})\iota_s(W_s) + \iota_s(W_s) + T_{s+1}(V_{s+1} - \iota_s(V_s)) \\
&\quad + \left( e^{\Theta^s_{s+1}} - \frac{e^{\Theta^s_{s+1}} - 1}{\Theta^s_{s+1}} \right)\iota_s(1 - D_s)(W_s - \iota_{s-1}(W_{s-1})) \\
&= K_{s+1} - (1 - D_{s+1})\iota_s(W_s) + W_{s+1} \\
&= K_{s+1} + D_{s+1}(W_{s+1}) + (1 - D_{s+1})(W_{s+1} - \iota_s(W_s)).
\end{align*}
Proposition 3.3. Assume that the sequences $\{V_s\}_{s=0}^{\infty}$, $\{W_s\}_{s=0}^{\infty}$ and $\{\Theta^s_u\}_{u=s+1}^{\infty}$ have the same meaning and obey the same assumptions as in Proposition 3.1. Denote
\[ w_s = \|W_s - \iota_{s-1}(W_{s-1})\| \]
(with $w_0 = \|W_0\|$). Assume, in addition, that there exist a sequence of positive real numbers, $\{F_s\}_{s=0}^{\infty}$, such that
\[ \|\Theta^s_u\| \le F_s w_s, \quad \forall s, u,\ u > s, \tag{11} \]
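a sequence of non-negative real numbers $\{v_s\}_{s=0}^{\infty}$ such that
\[ \|V_s - \iota_{s-1}(V_{s-1})\| \le v_s, \quad \forall s, \]
(for $s = 0$ this means $\|V_0\| \le v_0$) and a constant $A \ge 0$ such that
\[ F_s v_s^2 \le A\, v_{s+1}, \quad \forall s, \tag{12} \]
and that it holds true
\[ B = \sum_{s=0}^{\infty} F_s v_s < \infty. \tag{13} \]
Denote
\[ C = \sup_s F_s v_s. \tag{14} \]
If $d > 0$ obeys
\[ e^{dB} + A\phi(dC)d^2 \le d \tag{15} \]
then
\[ w_s \le d\, v_s, \quad \forall s. \tag{16} \]
Proof. We shall proceed by induction in $s$. If $s = 0$ then $v_0 = w_0 = \|V_0\|$ and (16) holds true since (15) implies that $d \ge 1$. The induction step $s \to s+1$: according to (8), (7), (4) and (15), and owing to the fact that $\phi(x)$ is monotone, we have
\begin{align*}
w_{s+1} &\le \|T_{s+1}\|\, v_{s+1} + \|\Theta^s_{s+1}\|\, \phi(\|\Theta^s_{s+1}\|)\, w_s \\
&\le \exp\Bigl( \sum_{j=0}^{s} F_j w_j \Bigr) v_{s+1} + \phi(F_s w_s) F_s w_s^2 \\
&\le \exp\Bigl( d \sum_{j=0}^{s} F_j v_j \Bigr) v_{s+1} + \phi(dF_s v_s) F_s d^2 v_s^2 \\
&\le e^{dB} v_{s+1} + \phi(dC) d^2 A\, v_{s+1} \le d\, v_{s+1}.
\end{align*}
Remark 3.4. If
\[ B \le \frac{1}{3}\ln 2 \quad \text{and} \quad A\,\phi(3C) \le \frac{1}{9} \]
then (15) holds true with $d = 3$.
Recall that $\Theta^s_\infty \in B(X_\infty)$ is the unique bounded operator on $X_\infty$ such that
\[ \Theta^s_\infty \iota_{\infty u} = \iota_{\infty u}\Theta^s_u, \quad \forall u > s. \]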
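The induction of Proposition 3.3 and the parameter choice of Remark 3.4 can be illustrated on scalars. The sketch below iterates the worst case of the bound used in the proof, $w_{s+1} = \exp(\sum_{j\le s} F_j w_j)\, v_{s+1} + \phi(F_s w_s) F_s w_s^2$, for hypothetical data $F_s = 1$, $v_s = 0.05 \cdot 2^{-s}$ (these numbers are an assumption chosen for illustration and do not come from the paper), and confirms $w_s \le 3 v_s$ on a truncated sequence.

```python
import math

def phi(x, terms=40):
    """Power series of phi from (5)."""
    return sum((k + 1) / math.factorial(k + 2) * x ** k for k in range(terms))

def majorant(F, v):
    """Worst case of the recursive bound in the proof of Proposition 3.3, starting from w_0 = v_0."""
    w = [v[0]]
    acc = 0.0  # running value of sum_{j <= s} F_j w_j
    for s in range(len(v) - 1):
        acc += F[s] * w[s]
        w.append(math.exp(acc) * v[s + 1] + phi(F[s] * w[s]) * F[s] * w[s] ** 2)
    return w

# Hypothetical sample data (not taken from the paper), truncated at S terms.
S = 30
F = [1.0] * S
v = [0.05 * 2.0 ** (-s) for s in range(S)]

A = max(F[s] * v[s] ** 2 / v[s + 1] for s in range(S - 1))  # smallest A satisfying (12)
B = sum(F[s] * v[s] for s in range(S))                      # (13), truncated
C = max(F[s] * v[s] for s in range(S))                      # (14)
d = 3.0
w = majorant(F, v)

print(A, B, C)
print(max(w[s] / v[s] for s in range(S)))  # stays below d
```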
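If (11) is true then its norm is estimated by
\[ \|\Theta^s_\infty\| \le F_s w_s. \tag{17} \]
Corollary 3.5. Under the same assumptions as in Proposition 3.3, if $d > 0$ exists such that condition (15) is satisfied, and
\[ F_{\inf} = \inf_s F_s > 0 \tag{18} \]
then the limits
\[ V_\infty = \lim_{s\to\infty} \iota_{\infty s}(V_s), \quad W_\infty = \lim_{s\to\infty} \iota_{\infty s}(W_s) \]
exist in $X_\infty$, the limit
\[ T_\infty = \lim_{s\to\infty} e^{\Theta^{s-1}_\infty} \cdots e^{\Theta^0_\infty} \]
exists in $B(X_\infty)$, and $T_\infty \in B(X_\infty)$ can be extended to a linear mapping $\tilde T_\infty : \tilde X_\infty \to \tilde X_\infty$ by
\[ \tilde T_\infty(K_\infty) - K_\infty = \lim_{s\to\infty} \iota_{\infty s}(\tilde T_s(K_s) - K_s), \tag{19} \]
with the limit existing in $X_\infty$. These objects obey the equality
\[ \tilde T_\infty(K_\infty + V_\infty) = K_\infty + D_\infty(W_\infty). \tag{20} \]
Proof. If $u \ge s$ then
\[ \|\iota_{\infty u}(V_u) - \iota_{\infty s}(V_s)\| = \Bigl\| \sum_{j=s+1}^{u} \iota_{\infty j}(V_j - \iota_{j-1}(V_{j-1})) \Bigr\| \le \sum_{j=s+1}^{u} v_j. \]
Since
\[ \sum_{s=0}^{\infty} v_s \le \frac{1}{F_{\inf}} \sum_{s=0}^{\infty} F_s v_s < \infty \]
the sequence $\{\iota_{\infty s}(V_s)\}$ is Cauchy in $X_\infty$ and so $V_\infty \in X_\infty$ exists. Under assumption (16) we can apply the same reasoning to the sequence $\{\iota_{\infty s}(W_s)\}$ to conclude that the limit $W_\infty = \lim_{s\to\infty} \iota_{\infty s}(W_s)$ exists in $X_\infty$. Set
\[ \bar T_s = e^{\Theta^{s-1}_\infty} \cdots e^{\Theta^0_\infty} \quad \text{if } s \ge 1, \quad \text{and} \quad \bar T_0 = 1. \]
If $u \ge s$ then, owing to (17) and (16), we have
\begin{align*}
\|\bar T_u - \bar T_s\| &\le \Bigl( \exp\Bigl( \sum_{j=s}^{u-1} \|\Theta^j_\infty\| \Bigr) - 1 \Bigr) \exp\Bigl( \sum_{j=0}^{s-1} \|\Theta^j_\infty\| \Bigr) \\
&\le \exp\Bigl( d \sum_{j=0}^{u-1} F_j v_j \Bigr) - \exp\Bigl( d \sum_{j=0}^{s-1} F_j v_j \Bigr).
\end{align*}
Assumption (13) implies that $\{\bar T_s\}$ is a Cauchy sequence in $B(X_\infty)$ and so $T_\infty \in B(X_\infty)$ exists.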
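To show (19) let us first verify the inequality
\[ \|e^{\tilde\Theta^s_u}(K_u) - K_u\| \le \frac{1 + dB}{F_{\inf}} \left( e^{F_s w_s} - 1 \right), \tag{21} \]
valid for all $u > s$. Actually, using definition (9) and assumption (11), we get
\begin{align*}
\|e^{\tilde\Theta^s_u}(K_u) - K_u\| &\le \frac{e^{\|\Theta^s_u\|} - 1}{\|\Theta^s_u\|}\, \|\tilde\Theta^s_u(K_u)\| \\
&\le \frac{e^{\|\Theta^s_u\|} - 1}{\|\Theta^s_u\|} \left( \|\Theta^s_u\| \|W_s\| + \|W_s - \iota_{s-1}(W_{s-1})\| \right) \\
&\le \left( e^{F_s w_s} - 1 \right) \left( \|W_s\| + \frac{1}{F_s} \right).
\end{align*}
To finish the estimate note that (13) and (16) imply
\[ \|W_s\| = \sum_{j=1}^{s} (\|W_j\| - \|W_{j-1}\|) + \|W_0\| \le \sum_{j=0}^{\infty} d v_j \le \frac{d}{F_{\inf}} \sum_{j=0}^{\infty} F_j v_j = \frac{dB}{F_{\inf}}. \]
With the aid of an elementary identity,
\[ a_j \cdots a_0 - 1 = a_j \cdots a_1 (a_0 - 1) + a_j \cdots a_2 (a_1 - 1) + \cdots + (a_j - 1), \]
we can derive from (21): if $0 \le s \le t < u$ then
\begin{align*}
\|e^{\tilde\Theta^t_u} \cdots e^{\tilde\Theta^s_u}(K_u) - K_u\| &\le e^{\|\Theta^t_u\| + \cdots + \|\Theta^{s+1}_u\|}\, \|e^{\tilde\Theta^s_u}(K_u) - K_u\| \\
&\quad + e^{\|\Theta^t_u\| + \cdots + \|\Theta^{s+2}_u\|}\, \|e^{\tilde\Theta^{s+1}_u}(K_u) - K_u\| + \cdots + \|e^{\tilde\Theta^t_u}(K_u) - K_u\| \\
&\le \frac{1 + dB}{F_{\inf}} \Bigl( e^{F_t w_t + \cdots + F_{s+1} w_{s+1}} \left( e^{F_s w_s} - 1 \right) \\
&\quad + e^{F_t w_t + \cdots + F_{s+2} w_{s+2}} \left( e^{F_{s+1} w_{s+1}} - 1 \right) + \cdots + e^{F_t w_t} - 1 \Bigr) \\
&= \frac{1 + dB}{F_{\inf}} \left( e^{F_t w_t + \cdots + F_s w_s} - 1 \right).
\end{align*}
Set temporarily in this proof
\[ \tau_s = \iota_{\infty s}(\tilde T_s(K_s) - K_s) \in X_\infty. \]
If $t \ge s$ then
\begin{align*}
\tau_t - \tau_s &= \iota_{\infty t}\bigl( e^{\tilde\Theta^{t-1}_t} \cdots e^{\tilde\Theta^0_t}(K_t) - \tilde\iota_{ts}\, e^{\tilde\Theta^{s-1}_s} \cdots e^{\tilde\Theta^0_s}(K_s) \bigr) \\
&= \iota_{\infty t}\bigl( e^{\tilde\Theta^{t-1}_t} \cdots e^{\tilde\Theta^0_t}(K_t) - e^{\tilde\Theta^{s-1}_t} \cdots e^{\tilde\Theta^0_t}(K_t) \bigr) \\
&= \iota_{\infty t}\bigl( (e^{\Theta^{t-1}_t} \cdots e^{\Theta^s_t} - 1)(e^{\tilde\Theta^{s-1}_t} \cdots e^{\tilde\Theta^0_t}(K_t) - K_t) + e^{\tilde\Theta^{t-1}_t} \cdots e^{\tilde\Theta^s_t}(K_t) - K_t \bigr).
\end{align*}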
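Hence
\begin{align*}
\|\tau_t - \tau_s\| &\le \frac{1 + dB}{F_{\inf}} \Bigl( \left( e^{F_{t-1} w_{t-1} + \cdots + F_s w_s} - 1 \right)\left( e^{F_{s-1} w_{s-1} + \cdots + F_0 w_0} - 1 \right) + e^{F_{t-1} w_{t-1} + \cdots + F_s w_s} - 1 \Bigr) \\
&= \frac{1 + dB}{F_{\inf}} \left( e^{F_{t-1} w_{t-1} + \cdots + F_0 w_0} - e^{F_{s-1} w_{s-1} + \cdots + F_0 w_0} \right).
\end{align*}
This shows that the sequence $\{\tau_s\}$ is Cauchy and thus the limit on the RHS of (19) exists. We conclude that it holds true, in virtue of (10), that
\begin{align*}
\tilde T_\infty(K_\infty + V_\infty) &= K_\infty + \lim_{s\to\infty} \iota_{\infty s}(\tilde T_s(K_s) - K_s) + \lim_{s\to\infty} \bar T_s \iota_{\infty s}(V_s) \\
&= K_\infty + \lim_{s\to\infty} \iota_{\infty s}(\tilde T_s(K_s + V_s) - K_s) \\
&= K_\infty + \lim_{s\to\infty} \iota_{\infty s}\bigl( D_s(W_s) + (1 - D_s)(W_s - \iota_{s-1}(W_{s-1})) \bigr) \\
&= K_\infty + \lim_{s\to\infty} \bigl( D_\infty(\iota_{\infty s}(W_s)) + (1 - D_\infty)(\iota_{\infty s}(W_s) - \iota_{\infty,s-1}(W_{s-1})) \bigr) \\
&= K_\infty + D_\infty(W_\infty).
\end{align*}
So equality (20) has been verified as well.

4. Convergence in a Hilbert Space
Let $\{X_s, \iota_{us}\}$ be a directed sequence of real or complex Banach spaces, as introduced in Sec. 3. In this section we assume that $\mathcal{K}$ is a separable complex Hilbert space and $\mathbf{K}$ is a closed (densely defined) operator in $\mathcal{K}$. Suppose that for each $s \in \mathbb{Z}_+$ there is given a bounded linear mapping,
\[ \kappa_s : X_s \to B(\mathcal{K}), \quad \text{with } \|\kappa_s\| \le 1, \]
and such that
\[ \forall s, u,\ 0 \le s \le u, \quad \kappa_u \iota_{us} = \kappa_s. \]
If the Banach spaces $X_s$ are real then the mappings $\kappa_s$ are supposed to be linear over $\mathbb{R}$, otherwise they are linear over $\mathbb{C}$. Then there exists a unique linear bounded mapping $\kappa_\infty : X_\infty \to B(\mathcal{K})$ satisfying, $\forall s \in \mathbb{Z}_+$, $\kappa_\infty \iota_{\infty s} = \kappa_s$. Clearly, $\|\kappa_\infty\| \le 1$. Extend the mappings $\kappa_s$ to $\tilde\kappa_s : \tilde X_s = \mathbb{k}K_s + X_s \to \mathbb{C}\mathbf{K} + B(\mathcal{K})$ by defining
\[ \tilde\kappa_s(K_s) = \mathbf{K}, \quad \forall s \in \mathbb{Z}_+ \cup \{\infty\}. \]
So $\tilde\kappa_s(K_s + X) = \mathbf{K} + \kappa_s(X)$, with $X \in X_s$, is a closed operator in $\mathcal{K}$ with $\mathrm{Dom}(\mathbf{K} + \kappa_s(X)) = \mathrm{Dom}(\mathbf{K})$. Suppose, in addition, that there exists $\mathcal{D} \in B(B(\mathcal{K}))$ such that
\[ \forall s \in \mathbb{Z}_+, \quad \mathcal{D}\kappa_s = \kappa_s D_s. \]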
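Then it holds true, $\forall s \in \mathbb{Z}_+$, $\forall X \in X_s$,
\[ \kappa_\infty D_\infty(\iota_{\infty s} X) = \kappa_\infty \iota_{\infty s} D_s(X) = \kappa_s D_s(X) = \mathcal{D}\kappa_s(X) = \mathcal{D}\kappa_\infty(\iota_{\infty s} X). \]
Since the set of vectors $\{\iota_{\infty s}(X);\ s \in \mathbb{Z}_+,\ X \in X_s\}$ is dense in $X_\infty$ we get $\kappa_\infty D_\infty = \mathcal{D}\kappa_\infty$.

Proposition 4.1. Under the assumptions of Corollary 3.5 and those introduced above in this section, let $\{\mathbf{A}_s\}_{s=0}^{\infty}$ be a sequence of bounded operators in $\mathcal{K}$ such that
\[ \forall s, u,\ 0 \le s < u,\ \forall X \in X_u, \quad \kappa_u(\Theta^s_u(X)) = [\mathbf{A}_s, \kappa_u(X)], \tag{22} \]
\[ \forall s \in \mathbb{Z}_+, \quad \mathbf{A}_s(\mathrm{Dom}\, \mathbf{K}) \subset \mathrm{Dom}\, \mathbf{K}, \]
and
\[ \forall s, u,\ 0 \le s < u, \quad [\mathbf{A}_s, \mathbf{K}] = \kappa_u(\tilde\Theta^s_u(K_u))|_{\mathrm{Dom}(\mathbf{K})}. \]
Moreover, assume that
\[ \sum_{s=0}^{\infty} \|\mathbf{A}_s\| < \infty. \tag{23} \]
Set
\[ \mathbf{V} = \kappa_\infty(V_\infty), \quad \mathbf{W} = \kappa_\infty(W_\infty). \]
Then the limit
\[ \mathbf{U} = \lim_{s\to\infty} e^{\mathbf{A}_{s-1}} \cdots e^{\mathbf{A}_0} \tag{24} \]
exists in the operator norm, the element $\mathbf{U} \in B(\mathcal{K})$ has a bounded inverse, and it holds true that $\mathbf{U}(\mathrm{Dom}\, \mathbf{K}) = \mathrm{Dom}\, \mathbf{K}$ and
\[ \mathbf{U}(\mathbf{K} + \mathbf{V})\mathbf{U}^{-1} = \mathbf{K} + \mathcal{D}(\mathbf{W}). \tag{25} \]
For the proof we shall need a lemma.

Lemma 4.2. Assume that $\mathcal{H}$ is a Hilbert space, $K$ is a closed operator in $\mathcal{H}$, $A, B \in B(\mathcal{H})$,
\[ A(\mathrm{Dom}\, K) \subset \mathrm{Dom}\, K, \quad \text{and} \quad [A, K] = B|_{\mathrm{Dom}(K)}. \]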
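Then it holds, $\forall \lambda \in \mathbb{C}$,
\[ e^{\lambda A}(\mathrm{Dom}\, K) = \mathrm{Dom}\, K \tag{26} \]
and
\[ e^{-\lambda A} K e^{\lambda A} = K + \frac{e^{-\lambda\, \mathrm{ad}_A} - 1}{\mathrm{ad}_A}\, B. \]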
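In finite dimensions all domain questions disappear and the conjugation formula of Lemma 4.2 can be tested directly, reading the quotient as the power series $\sum_{k\ge 1} \frac{(-\lambda)^k}{k!}\, \mathrm{ad}_A^{k-1} B$ with $\mathrm{ad}_A X = [A, X]$. The sketch below does this for small real matrices in plain Python; the helper names and the sample matrices are illustrative assumptions, and the Taylor-series exponential is only adequate for matrices of moderate norm.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(X, Y, c=1.0):
    """Return X + c * Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, c):
    return [[c * x for x in row] for row in X]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    """Matrix exponential by its Taylor series."""
    result, term = identity(len(X)), identity(len(X))
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)
        result = add(result, term)
    return result

def ad(A, X):
    """ad_A X = [A, X] = A X - X A."""
    return add(matmul(A, X), matmul(X, A), -1.0)

def conjugated(A, K, lam):
    """Left-hand side: exp(-lam A) K exp(lam A)."""
    return matmul(matmul(expm(scale(A, -lam)), K), expm(scale(A, lam)))

def lemma_rhs(A, K, lam, terms=40):
    """Right-hand side: K + sum_{k>=1} (-lam)^k / k! * ad_A^{k-1}(B) with B = [A, K]."""
    result, power, coeff = K, ad(A, K), 1.0
    for k in range(1, terms):
        coeff *= -lam / k
        result = add(result, power, coeff)
        power = ad(A, power)
    return result

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Hypothetical sample matrices, chosen only for illustration.
A = [[0.2, -0.5], [0.7, 0.1]]
K = [[1.0, 2.0], [0.0, -3.0]]
print(max_abs_diff(conjugated(A, K, 0.8), lemma_rhs(A, K, 0.8)))
```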
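Remark 4.3. Here and everywhere in what follows we use the standard notation: $\mathrm{ad}_A B = [A, B]$ and so $e^{\lambda\, \mathrm{ad}_A} B = e^{\lambda A} B e^{-\lambda A}$.

Proof of Lemma 4.2. Choose an arbitrary vector $v \in \mathrm{Dom}(K)$ and set
\[ \forall n \in \mathbb{Z}_+, \quad v_n = \sum_{k=0}^{n} \frac{\lambda^k}{k!} A^k v. \]
Then $v_n \in \mathrm{Dom}(K)$ and $v_n \to e^{\lambda A} v$ as $n \to \infty$. On the other hand,
\begin{align*}
K v_n &= \sum_{k=0}^{n} \frac{\lambda^k}{k!} (K A^k - A^k K) v + \sum_{k=0}^{n} \frac{\lambda^k}{k!} A^k K v \\
&= -\sum_{k=1}^{n} \frac{\lambda^k}{k!} \sum_{j=0}^{k-1} A^j B A^{k-1-j} v + \sum_{k=0}^{n} \frac{\lambda^k}{k!} A^k K v.
\end{align*}
So the limit $\lim_{n\to\infty} K v_n$ exists. Consequently, since $K$ is closed, $e^{\lambda A}(\mathrm{Dom}\, K) \subset \mathrm{Dom}\, K$. But $(e^{\lambda A})^{-1} = e^{-\lambda A}$ has the same property and thus equality (26) follows. Furthermore, the above computation also shows that
\[ K e^{\lambda A} = -\sum_{k=1}^{\infty} \frac{\lambda^k}{k!} \sum_{j=0}^{k-1} A^j B A^{k-1-j} + e^{\lambda A} K. \]
Application of the following algebraic identity (easy to verify),
\[ \sum_{k=1}^{\infty} \frac{\lambda^k}{k!} \sum_{j=0}^{k-1} A^j B A^{k-1-j} = e^{\lambda A}\, \frac{1 - e^{-\lambda\, \mathrm{ad}_A}}{\mathrm{ad}_A}\, B, \]
concludes the proof.

Proof of Proposition 4.1. We use the notation of Corollary 3.5. From (22) it follows that, $\forall s, u$, $0 \le s < u$, $\forall X \in X_u$,
\[ \kappa_\infty \Theta^s_\infty(\iota_{\infty u} X) = \kappa_u \Theta^s_u(X) = [\mathbf{A}_s, \kappa_u(X)] = [\mathbf{A}_s, \kappa_\infty(\iota_{\infty u} X)]. \]
Since the set of vectors $\{\iota_{\infty u}(X);\ s < u,\ X \in X_u\}$ is dense in $X_\infty$, we get, $\forall X \in X_\infty$, $\kappa_\infty \Theta^s_\infty(X) = [\mathbf{A}_s, \kappa_\infty(X)]$, and hence
\[ \kappa_\infty\bigl( e^{\Theta^s_\infty}(X) \bigr) = e^{\mathbf{A}_s} \kappa_\infty(X) e^{-\mathbf{A}_s}. \]
Set
\[ \mathbf{U}_s = e^{\mathbf{A}_{s-1}} \cdots e^{\mathbf{A}_0} \quad \text{for } s \ge 1, \quad \mathbf{U}_0 = 1. \]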
Assumption (23) implies that both sequences {Us } and {U−1 s } are Cauchy in B(K) and hence the limit (24) exists in the operator norm, with U−1 = lims→∞ U−1 s ∈ B(K). Moreover, ∀X ∈ X∞ , s−1 0 (27) κ∞ T∞ (X) = κ∞ lim eΘ∞ · · · eΘ∞ X = lim Us κ∞ (X)Us−1 . s→∞
s→∞
˜ s (Ku )) ∈ B(K). Bs Next let us compute κ ˜ s T˜s (Ks ). For 0 ≤ s < u, set Bs = κu (Θ u does not depend on u > s since if 0 ≤ s < u ≤ v then ˜ s (Ku )) = κv (Θ ˜ s (Ku )) = κv (ιvu Θ ˜ s (Kv )) . κu (Θ u u v We can apply Lemma 4.2 to the operators K, As , Bs to conclude that e−As (Dom K) = Dom K and eAs K e−As = K +
eadAs − 1 Bs . adAs
(28)
On the other hand, s s e Θu − 1 ˜ s eadAs − 1 ˜ Θ ˜ u Ku + (K ) =K+ Bs . κ ˜ u eΘu (Ku ) = κ u u s Θu adAs s ˜ Thus κ ˜ u eΘu (Ku ) = eAs Ke−As . Consequently, Us (Dom K) = Dom K and κ ˜ s T˜s (Ks ) = Us KUs−1 .
(29)
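Set $C_s = U_s K U_s^{-1} - K$. According to (28), $C_s \in B(\mathcal{K})$. Now we can compute, using relation (29), a limit in $B(\mathcal{K})$,
$$C = \lim_{s\to\infty} C_s = \lim_{s\to\infty} \kappa_s(\tilde T_s(K_s) - K_s) = \kappa_\infty\Big(\lim_{s\to\infty} \iota_{\infty s}(\tilde T_s(K_s) - K_s)\Big) = \kappa_\infty(\tilde T_\infty(K_\infty) - K_\infty) .$$
So $K + C = \tilde\kappa_\infty(\tilde T_\infty(K_\infty))$. From the closedness of $K$, the equality $U_s K U_s^{-1} = K + C_s$, and from the fact that the sequences $\{U_s^{\pm 1}\}$, $\{C_s\}$ converge one deduces that $U^{\pm 1}(\mathrm{Dom}\,K) \subset \mathrm{Dom}\,K$ and hence, in fact, $U^{\pm 1}(\mathrm{Dom}\,K) = \mathrm{Dom}\,K$. In addition,
$$U K U^{-1} = K + C = \tilde\kappa_\infty(\tilde T_\infty(K_\infty)) . \tag{30}$$
Combining (27) and (30) one finds that
$$\tilde\kappa_\infty(\tilde T_\infty(X)) = U\, \tilde\kappa_\infty(X)\, U^{-1} , \qquad \forall X \in \tilde X_\infty .$$
To conclude the proof it suffices to apply the mapping $\tilde\kappa_\infty$ to equality (20).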
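The algebraic identity invoked at the end of the proof of Lemma 4.2 can be sanity-checked numerically on small matrices. The sketch below is only an illustration: the helper names (`expm_series`, `lhs_double_sum`, `rhs_closed_form`), the truncation orders and the random test matrices are assumptions made for this example and do not appear in the paper. The operator $(1 - e^{-\lambda\,\mathrm{ad}_A})/\mathrm{ad}_A$ is evaluated through its power series $\sum_{i\ge 0} (-1)^i \lambda^{i+1}\,\mathrm{ad}_A^i/(i+1)!$.

```python
import numpy as np


def expm_series(M, terms=60):
    """Matrix exponential by a truncated power series (adequate for small norms)."""
    n = M.shape[0]
    result = np.eye(n, dtype=complex)
    term = np.eye(n, dtype=complex)
    for k in range(1, terms):
        term = term @ M / k              # M^k / k!
        result = result + term
    return result


def lhs_double_sum(A, B, lam, terms=60):
    """sum_{k>=1} lam^k/k! * sum_{j=0}^{k-1} A^j B A^{k-1-j}, truncated in k."""
    n = A.shape[0]
    total = np.zeros((n, n), dtype=complex)
    inner = B.astype(complex)            # inner sum for k = 1 is just B
    A_pow = np.eye(n, dtype=complex)     # holds A^(k-1)
    coeff = 1.0
    for k in range(1, terms):
        coeff *= lam / k                 # lam^k / k!
        total += coeff * inner
        A_pow = A_pow @ A                # now A^k
        inner = A @ inner + B @ A_pow    # inner sum for k + 1
    return total


def rhs_closed_form(A, B, lam, terms=60):
    """e^{lam A} (1 - e^{-lam ad_A}) / ad_A applied to B, via the series in ad_A."""
    n = A.shape[0]
    series = np.zeros((n, n), dtype=complex)
    ad_pow = B.astype(complex)           # ad_A^i (B), starting with i = 0
    coeff = lam                          # (-1)^i lam^(i+1) / (i+1)! at i = 0
    for i in range(terms):
        series += coeff * ad_pow
        ad_pow = A @ ad_pow - ad_pow @ A
        coeff = -coeff * lam / (i + 2)
    return expm_series(lam * A) @ series
```

For commuting $A$ and $B$ both sides reduce to $\lambda e^{\lambda A} B$, which gives a quick independent check of the two helpers.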
Weakly Regular Floquet Hamiltonians with Pure Point Spectrum
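5. Choice of the Directed Sequence of Banach Spaces

Suppose that there are given a decreasing sequence of subsets of the interval $]0, +\infty[$, $\Omega_0 \supset \Omega_1 \supset \Omega_2 \supset \cdots$, a decreasing sequence of positive real numbers $\{\varphi_s\}_{s=0}^{\infty}$ and a strictly increasing sequence of positive real numbers $\{E_s\}_{s=0}^{\infty}$, $1 \le E_1 < E_2 < \cdots$. We construct a complex Banach space ${}^0X_s$, $s \ge 0$, as a subspace
$${}^0X_s \subset L^\infty\Big(\Omega_s \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}} \sum^{\oplus}_{m\in\mathbb{N}} B(H_m, H_n)\Big)$$
formed by those elements $X = \{X_{knm}(\omega)\}$ which satisfy
$$X_{knm}(\omega) \in B(H_m, H_n), \qquad \forall \omega \in \Omega_s, \quad \forall (k, n, m) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N} ,$$
and have finite norm
$$\|X\|_s = \sup_{\substack{\omega, \omega' \in \Omega_s \\ \omega \ne \omega'}}\ \sup_{n\in\mathbb{N}}\ \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \big(\|X_{knm}(\omega)\| + \varphi_s \|\tilde\partial X_{knm}(\omega, \omega')\|\big)\, e^{|k|/E_s} \tag{31}$$
where the symbol $\tilde\partial$ designates the discrete derivative in $\omega$,
$$\tilde\partial X(\omega, \omega') = \frac{X(\omega) - X(\omega')}{\omega - \omega'} .$$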
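To make the structure of the norm (31) concrete, the following sketch evaluates it for a finite truncation. Everything specific here is an assumption made for illustration: the blocks are scalar ($\dim H_n = 1$), the index $k$ runs over a finite list `ks`, the supremum over $\omega \ne \omega'$ is taken over a finite sample `omegas`, and the function name `weighted_norm` is hypothetical.

```python
import numpy as np
from itertools import permutations


def weighted_norm(X, omegas, ks, phi, E):
    """Finite-sample version of the norm (31) for scalar blocks.

    X      : callable omega -> array of shape (len(ks), N, N), entries X_{knm}(omega)
    omegas : finite sample of the frequency set Omega_s
    ks     : integer array of the Fourier indices k that are kept
    phi, E : the parameters phi_s and E_s
    """
    weights = np.exp(np.abs(ks) / E)[:, None, None]      # e^{|k|/E_s}
    best = 0.0
    for w1, w2 in permutations(omegas, 2):               # ordered pairs, w1 != w2
        X1, X2 = X(w1), X(w2)
        diff = np.abs(X1 - X2) / abs(w1 - w2)            # |discrete derivative|
        summand = (np.abs(X1) + phi * diff) * weights
        row_sums = summand.sum(axis=(0, 2))              # sum over k and m, one value per n
        best = max(best, row_sums.max())                 # sup over n and over the pair
    return best
```

Restricting the sample of frequencies, decreasing `phi` or increasing `E` can only lower the computed value, which mirrors why the restriction maps between the spaces have norm at most one.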
In fact, this norm is considered in Appendix B (cf. (85)), and it is shown there that ${}^0X_s$ is an operator algebra with respect to the multiplication rule (87). Let $X_s \subset {}^0X_s$ be a closed real subspace formed by those elements $X \in {}^0X_s$ which satisfy
$$\forall (k, n, m) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N}, \quad \forall \omega \in \Omega_s, \qquad X_{knm}(\omega)^* = X_{-k,m,n}(\omega) \in B(H_n, H_m) . \tag{32}$$
Note, however, that $X_s$ is not an operator subalgebra of ${}^0X_s$. The sequence of Banach spaces, $\{X_s\}_{s=0}^{\infty}$, becomes directed with respect to mappings of restriction in the variable $\omega$: if $u \ge s$ then we set
$$\iota_{us} : X_s \to X_u, \qquad \iota_{us}(X) = X|_{\Omega_u} .$$
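Because of the monotonicity of the sequences $\{\varphi_s\}$ and $\{E_s\}$ we clearly have $\|\iota_{us}\| \le 1$. Next we introduce a bounded operator $D_s \in B(X_s)$ as an operator which extracts the diagonal part of a matrix,
$$D_s(X)_{knm}(\omega) = \delta_{k0}\, \delta_{nm}\, X_{0nn}(\omega) . \tag{33}$$
Clearly, $\|D_s\| \le 1$ and $\|1 - D_s\| \le 1$. Let
$$V \in L^\infty\Big(\mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}} \sum^{\oplus}_{m\in\mathbb{N}} B(H_m, H_n)\Big)$$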
be the element with the components $V_{knm} \in B(H_m, H_n)$ given in (2). Since, by assumption, $V(t)$ is Hermitian for almost all $t$, it holds true that $(V_{knm})^* = V_{-k,m,n}$. We still assume, as in Theorem 2.1, that there exists $r > 0$ such that
$$\epsilon_V = \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|V_{knm}\| \max\{|k|^r, 1\} < \infty . \tag{34}$$
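Let us define elements $V_s \in X_s$, $s \ge 0$, by
$$(V_s)_{knm}(\omega) = \begin{cases} V_{knm} & \text{if } |k| < E_s , \\ 0 & \text{if } |k| \ge E_s . \end{cases} \tag{35}$$
For $s \ge 1$ we get an estimate,
$$\|V_s - \iota_{s-1}(V_{s-1})\|_s = \sup_{n\in\mathbb{N}} \sum_{m\in\mathbb{N}} \sum_{\substack{k\in\mathbb{Z} \\ E_{s-1} \le |k| < E_s}} \|V_{knm}\|\, e^{|k|/E_s} \le e \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|V_{knm}\|\, \frac{\max\{|k|^r, 1\}}{(E_{s-1})^r} = \frac{e\, \epsilon_V}{(E_{s-1})^r} . \tag{36}$$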
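The tail estimate (36) can be illustrated numerically. The sketch below assumes scalar blocks, a finite range of $k$, a hypothetical polynomially decaying family $V_{knm}$ and the geometric choice $E_s = q^{s+1}$ (with the convention $E_{-1} = 1$); the function names `eps_V` and `cutoff_increment` are illustrative only.

```python
import numpy as np


def eps_V(V, ks, r):
    """sup_n sum_{k,m} |V_knm| max(|k|^r, 1), as in (34), for scalar blocks."""
    w = np.maximum(np.abs(ks).astype(float) ** r, 1.0)[:, None, None]
    return (np.abs(V) * w).sum(axis=(0, 2)).max()


def cutoff_increment(V, ks, E_prev, E_cur):
    """sup_n sum_m sum_{E_prev <= |k| < E_cur} |V_knm| e^{|k|/E_cur}, the middle term of (36)."""
    mask = (np.abs(ks) >= E_prev) & (np.abs(ks) < E_cur)
    w = np.exp(np.abs(ks) / E_cur)[:, None, None]
    return (np.abs(V[mask]) * w[mask]).sum(axis=(0, 2)).max()


# Hypothetical example data: V_knm ~ (1+|k|)^-(r+2) (1+|n-m|)^-2 on a finite index box.
r, q, N = 1.5, 2.0, 3
ks = np.arange(-200, 201)
n_idx = np.arange(N)
V = (1.0 / (1.0 + np.abs(ks))[:, None, None] ** (r + 2)
     / (1.0 + np.abs(n_idx[:, None] - n_idx[None, :]))[None, :, :] ** 2)
E = lambda s: q ** (s + 1)          # E_s = q^(s+1), so that E_{-1} = 1
eps = eps_V(V, ks, r)
```

For each $s \ge 1$ the quantity `cutoff_increment(V, ks, E(s - 1), E(s))` should then not exceed `np.e * eps / E(s - 1) ** r`, because $e^{|k|/E_s} \le e$ and $|k|^r \ge (E_{s-1})^r$ on the summation range.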
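Similarly, for $s = 0$, we get $\|V_0\| \le e\, \epsilon_V$. It is convenient to set $E_{-1} = 1$, $V_{-1} = 0$. The sequence $\{K_s\}_{s=0}^{\infty}$ has the same meaning as in Sec. 3, i.e., each $K_s$ is a distinguished basis vector in a one-dimensional vector space $\mathbb{R}K_s$. Furthermore, a sequence $\Theta^s_u \in B(X_u)$, $0 \le s < u$, is supposed to satisfy rule (6). Similarly as in Proposition 3.1 we construct sequences $T_s \in B(X_s)$, $s \ge 1$, and $W_s \in X_s$, $s \ge 0$, using relations (7) and (8), respectively.

Proposition 5.1. Suppose that it holds
$$\|\Theta^s_u\| \le \frac{5}{\varphi_{s+1}}\, \|W_s - \iota_{s-1}(W_{s-1})\|_s , \qquad \forall s, u, \ 0 \le s < u , \tag{37}$$
and set
$$A_\star = 5e \sup_{s\ge 0} \frac{(E_s)^r}{\varphi_{s+1} (E_{s-1})^{2r}} , \qquad B_\star = 5e \sum_{s=0}^{\infty} \frac{1}{\varphi_{s+1} (E_{s-1})^r} , \qquad C_\star = 5e \sup_{s\ge 0} \frac{1}{\varphi_{s+1} (E_{s-1})^r} . \tag{38}$$
If
$$\epsilon_V B_\star \le \frac{1}{3} \ln 2 \qquad \text{and} \qquad \epsilon_V A_\star\, \phi(3 \epsilon_V C_\star) \le \frac{1}{9} \tag{39}$$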
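then the conclusions of Corollary 3.5 hold true, particularly, the objects $V_\infty, W_\infty \in X_\infty$, $T_\infty \in B(X_\infty)$ and $\tilde T_\infty \in B(\tilde X_\infty)$ exist and satisfy the equality
$$\tilde T_\infty(K_\infty + V_\infty) = K_\infty + D_\infty(W_\infty) .$$

Remark 5.2. Respecting estimates (36) and (37) we set in what follows
$$F_s = \frac{5}{\varphi_{s+1}} \qquad \text{and} \qquad v_s = \frac{e\, \epsilon_V}{(E_{s-1})^r} , \qquad s \ge 0 . \tag{40}$$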
Proof of Proposition 5.1. Taking into account the defining relations (40) one finds that the constants $A$, $B$ and $C$ introduced in Proposition 3.3 may be chosen as
$$A = \epsilon_V A_\star , \qquad B = \epsilon_V B_\star \qquad \text{and} \qquad C = \epsilon_V C_\star . \tag{41}$$
The assumption (39) implies that
$$B \le \frac{1}{3} \ln 2 \qquad \text{and} \qquad A\, \phi(3C) \le \frac{1}{9} \tag{42}$$
and so, according to the remark following Proposition 3.3, inequality (15) holds true with $d = 3$. Since $F_{\inf} = 5/\varphi_1 > 0$, assumption (18) of Corollary 3.5 as well as all assumptions of Proposition 3.3 are satisfied and so the conclusions of Corollary 3.5 hold true.
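6. Relation of the Banach Spaces $X_s$ to Hermitian Operators in $\mathcal{K}$

The real Banach spaces $X_s$ have been chosen in the previous section. Set
$$\Omega_\infty = \bigcap_{s=0}^{\infty} \Omega_s .$$
Suppose that $\Omega_\infty \ne \emptyset$ and fix $\omega \in \Omega_\infty$ (so $\omega > 0$). To an operator-valued function $[0, T] \ni t \mapsto X(t) \in B(H)$ there is naturally related an operator $X$ in $\mathcal{K} = L^2([0, T], H, dt)$ defined by $(X\psi)(t) = X(t)\psi(t)$. As is well known, $\|X\| \le \|X\|_{SH}$ where $\|\cdot\|_{SH}$ is the so-called Schur–Holmgren norm,
$$\|X\|_{SH} = \max\Big\{ \sup_{(\ell,n)\in\mathbb{Z}\times\mathbb{N}} \sum_{(k,m)\in\mathbb{Z}\times\mathbb{N}} \|P_\ell \otimes Q_n\, X\, P_k \otimes Q_m\| ,\ \sup_{(k,m)\in\mathbb{Z}\times\mathbb{N}} \sum_{(\ell,n)\in\mathbb{Z}\times\mathbb{N}} \|P_\ell \otimes Q_n\, X\, P_k \otimes Q_m\| \Big\}$$
$$= \max\Big\{ \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|X_{knm}\| ,\ \sup_{m\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{n\in\mathbb{N}} \|X_{knm}\| \Big\} . \tag{43}$$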
Here
$$X_{knm} = \frac{1}{T} \int_0^T e^{-i\omega k t}\, Q_n X(t) Q_m \, dt .$$
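It is also elementary to verify that the Schur–Holmgren norm is an operator norm, $\|XY\|_{SH} \le \|X\|_{SH} \|Y\|_{SH}$, with respect to the multiplication rule (87). If $X(t)$ is Hermitian for (almost) every $t \in [0, T]$ then it holds, $\forall (k, n, m)$, $(X_{knm})^* = X_{-k,m,n}$, and so
$$\|X\|_{SH} = \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|X_{knm}\| .$$
Note also that, $\forall s \in \mathbb{Z}_+$, $\forall X \in X_s$, $\|X(\omega)\|_{SH} \le \|X\|_s$ and, consequently, the same is also true for $s = \infty$.

To an element $X \in {}^0X_s \subset L^\infty\big(\Omega_s \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}} \sum^{\oplus}_{m\in\mathbb{N}} B(H_m, H_n)\big)$ such that $\|X(\omega)\|_{SH} < \infty$ we can relate an operator-valued function defined on the interval $[0, T]$,
$$t \mapsto \sum_{k\in\mathbb{Z}} \sum_{n\in\mathbb{N}} \sum_{m\in\mathbb{N}} e^{ik\omega t}\, X_{knm}(\omega) .$$
The corresponding operator in $\mathcal{K}$ is denoted by $\kappa_s(X)$, with a norm bounded from above by $\|X(\omega)\|_{SH}$. In particular, $\forall X \in X_s$, $\|\kappa_s(X)\| \le \|X(\omega)\|_{SH} \le \|X\|_s$. In addition, if $X \in X_s$ then the operator $\kappa_s(X)$ is Hermitian due to the property (32) of $X$. This way we have introduced the mappings $\kappa_s : X_s \to B(\mathcal{K})$ for $s \in \mathbb{Z}_+$. Another property we shall need is that $\kappa_s$ is an algebra morphism in the sense: if $X, Y \in {}^0X_s$ are such that $\|X(\omega)\|_{SH} < \infty$ and $\|Y(\omega)\|_{SH} < \infty$ then $\|(XY)(\omega)\|_{SH} < \infty$ and
$$\kappa_s(XY) = \kappa_s(X)\, \kappa_s(Y) .$$
Particularly this is true for all $X, Y \in X_s$.

Let $D \in B(B(\mathcal{K}))$ be the operator on $B(\mathcal{K})$ taking the diagonal part of an operator $X \in B(\mathcal{K})$,
$$D(X) = \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} P_k \otimes Q_m\, X\, P_k \otimes Q_m .$$
Clearly, $D\kappa_s = \kappa_s D_s$. Since
$$\|D(X)\| = \sup_{(k,m)\in\mathbb{Z}\times\mathbb{N}} \|P_k \otimes Q_m\, X\, P_k \otimes Q_m\| \le \|X\|$$
we have $\|D\| \le 1$.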
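The three facts used above (the Schur test $\|X\| \le \|X\|_{SH}$, submultiplicativity of $\|\cdot\|_{SH}$, and $\|D\| \le 1$) are easy to check in finite dimensions. The sketch below assumes one-dimensional blocks, so that the block norms reduce to absolute values of matrix entries; the function names are illustrative and not taken from the paper.

```python
import numpy as np


def schur_holmgren(M):
    """max(largest absolute row sum, largest absolute column sum) of a matrix."""
    absM = np.abs(M)
    return max(absM.sum(axis=1).max(), absM.sum(axis=0).max())


def diagonal_part(M):
    """Finite-dimensional analogue of D: keep the diagonal, drop everything else."""
    return np.diag(np.diag(M))


def operator_norm(M):
    """Operator norm on l^2, i.e. the largest singular value."""
    return np.linalg.norm(M, 2)
```

With these helpers one expects `operator_norm(X) <= schur_holmgren(X)`, `schur_holmgren(X @ Y) <= schur_holmgren(X) * schur_holmgren(Y)` and `operator_norm(diagonal_part(X)) <= operator_norm(X)` for arbitrary square matrices `X`, `Y`, up to rounding.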
A consequence of (34) is that $V = \{V_{knm}\}$ has a finite Schur–Holmgren norm, $\|V\|_{SH} < \infty$. Let $V_s \in X_s$, $s \in \mathbb{Z}_+$, be the cut-offs of $V$ defined in (35). Then
$$\|V - V_s\|_{SH} = \sup_{n\in\mathbb{N}} \sum_{\substack{k\in\mathbb{Z} \\ |k| \ge E_s}} \sum_{m\in\mathbb{N}} \|V_{knm}\| \le \frac{1}{(E_s)^r} \sup_{n\in\mathbb{N}} \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|V_{knm}\| \max\{|k|^r, 1\} = \frac{\epsilon_V}{(E_s)^r} .$$
We shall impose an additional condition on the increasing sequence $\{E_s\}$ of positive real numbers that occur in the definition of the norm $\|\cdot\|_s$ in $X_s$ (cf. (31)), namely we shall require
$$\lim_{s\to\infty} E_s = +\infty . \tag{44}$$
In this case $\lim_{s\to\infty} \|V - V_s\|_{SH} = 0$ and so
$$V = \lim_{s\to\infty} \kappa_s(V_s) \quad \text{in the operator norm} . \tag{45}$$
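We also assume that there exist $A_s \in {}^0X_{s+1}$, $s \in \mathbb{Z}_+$, such that
$$(A_s)_{knm}(\omega)^* = -(A_s)_{-k,m,n}(\omega) , \tag{46}$$
and, using these elements, we define mappings ${}^0\Theta^s_u \in B({}^0X_u)$, $u > s$, by
$${}^0\Theta^s_u(X) = [\iota_{u,s+1}(A_s), X] \tag{47}$$
(where the commutator on the RHS makes sense since ${}^0X_u$ is an operator algebra). Clearly, $\|{}^0\Theta^s_u\| \le 2\|A_s\|_{s+1}$. One finds readily that $X_u \subset {}^0X_u$ is an invariant subspace with respect to the mapping ${}^0\Theta^s_u$ and so one may define $\Theta^s_u = {}^0\Theta^s_u|_{X_u} \in B(X_u)$. Since $iA_s \in X_{s+1}$ we can set
$$\mathbf{A}_s = -i\kappa_{s+1}(iA_s) \in B(\mathcal{K})$$
(boldface distinguishes the operator in $\mathcal{K}$ from the element $A_s$). Clearly, $\mathbf{A}_s$ is anti-Hermitian and satisfies $\|\mathbf{A}_s\| \le \|A_s\|_{s+1}$. Note that (47) implies that, $\forall s, u$, $0 \le s < u$, $\forall X \in X_u$,
$$\kappa_u(\Theta^s_u(X)) = [\mathbf{A}_s, \kappa_u(X)] .$$

Lemma 6.1. Let $\{W_s\}_{s=0}^{\infty}$ be a sequence of elements $W_s \in X_s$ and let $\tilde\Theta^s_u : \tilde X_u \to X_u$ be the extension of $\Theta^s_u$, $0 \le s < u$, defined in (9). Assume that the elements $A_s \in {}^0X_{s+1}$, $s \in \mathbb{Z}_+$, satisfy
$$(k\omega - \Delta_{mn})\, (A_s)_{knm}(\omega) = \big(\Theta^s_u(\iota_{us} D_s(W_s)) + \iota_{us}(1 - D_s)(W_s - \iota_{s-1}(W_{s-1}))\big)_{knm}(\omega) , \tag{48}$$
$\forall (k, m, n) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$, $\forall s, u$, $0 \le s < u$. Then it holds true that, $\forall s \in \mathbb{Z}_+$,
$$\mathbf{A}_s(\mathrm{Dom}\,K) \subset \mathrm{Dom}\,K ,$$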
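and, $\forall s, u$, $0 \le s < u$,
$$[\mathbf{A}_s, K] = \kappa_u(\tilde\Theta^s_u(K_u))\big|_{\mathrm{Dom}(K)} .$$

Proof. Here $\mathbf{A}_s = -i\kappa_{s+1}(iA_s)$ denotes the operator in $\mathcal{K}$ associated with $A_s$. Set
$$B_s = -\kappa_u(\tilde\Theta^s_u(K_u)) .$$
Since the RHS of (48) is in fact a matrix entry of $-\tilde\Theta^s_u(K_u)$ (cf. (9)) this assumption may be rewritten as the equality
$$K\, P_\ell \otimes Q_n\, \mathbf{A}_s\, P_k \otimes Q_m = P_\ell \otimes Q_n\, \mathbf{A}_s\, P_k \otimes Q_m\, K + P_\ell \otimes Q_n\, B_s\, P_k \otimes Q_m ,$$
valid for all $(\ell, n), (k, m) \in \mathbb{Z} \times \mathbb{N}$. Since $K$ is closed one easily derives from the last property that it holds true, $\forall (k, m) \in \mathbb{Z} \times \mathbb{N}$,
$$K \mathbf{A}_s\, P_k \otimes Q_m = \mathbf{A}_s\, P_k \otimes Q_m\, K + B_s\, P_k \otimes Q_m . \tag{49}$$
Particularly, $\mathbf{A}_s \mathrm{Ran}(P_k \otimes Q_m) \subset \mathrm{Dom}(K)$. But $\mathrm{Ran}(P_k \otimes Q_m)$ are mutually orthogonal eigenspaces of $K$. Consequently, if $v \in \mathrm{Dom}(K)$, then the sequence $\{v_N\}_{N=1}^{\infty}$,
$$v_N = \sum_{k,\, |k| \le N}\ \sum_{m,\, m \le N} P_k \otimes Q_m\, v ,$$
has the property: $v_N \to v$ and $K v_N \to K v$, as $N \to \infty$. Equality (49) implies that $K \mathbf{A}_s v_N = \mathbf{A}_s K v_N + B_s v_N$, $\forall N$. Again owing to the fact that $K$ is closed one concludes that $\mathbf{A}_s v \in \mathrm{Dom}(K)$ and $K \mathbf{A}_s v = \mathbf{A}_s K v + B_s v$.

Proposition 6.2. Assume that $\omega \in \Omega_\infty$ and the norms $\|\cdot\|_s$ in the Banach spaces $X_s$ satisfy (44). Let $\Theta^s_u \in B(X_u)$, $0 \le s < u$, be the operators defined in (47) with the aid of elements $A_s \in {}^0X_{s+1}$ satisfying (46), and let $W_s \in X_s$, $s \in \mathbb{Z}_+$, be the sequence defined recursively in accordance with (8). Assume that the elements $A_s$, $s \in \mathbb{Z}_+$, satisfy condition (48) and that
$$\|A_s\| \le \frac{5}{2\varphi_{s+1}}\, \|W_s - \iota_{s-1}(W_{s-1})\| , \qquad \forall s \in \mathbb{Z}_+ . \tag{50}$$
Moreover, assume that the numbers $A_\star$, $B_\star$, $C_\star$, as defined in (38), satisfy condition (39). Then there exist, in $\mathcal{K}$, a unitary operator $U$ and a bounded Hermitian operator $W$ such that $U(\mathrm{Dom}\,K) = \mathrm{Dom}\,K$ and
$$U(K + V)U^{-1} = K + D(W) .$$

Proof. The norm of $\Theta^s_u$ may be estimated as
$$\|\Theta^s_u\| \le 2\|A_s\| \le \frac{5}{\varphi_{s+1}}\, \|W_s - \iota_{s-1}(W_{s-1})\| .$$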
This way the assumptions of Proposition 5.1 are satisfied and consequently, according to Proposition 5.1 (and its proof), the same is true for Proposition 3.3 and Corollary 3.5 (with $F_s$ and $v_s$ defined in (40) and the constants $A$, $B$, $C$ defined in (41)). Since it holds $\|\mathbf{A}_s\| \le \|A_s\| \le \frac{1}{2} F_s w_s$ (where $\mathbf{A}_s \in B(\mathcal{K})$ is the operator associated with $A_s$ and $F_s = 5/\varphi_{s+1}$) and, by assumption, condition (15) is satisfied with $d = 3$, we get
$$\sum_{s=0}^{\infty} \|\mathbf{A}_s\| \le \frac{1}{2} \sum_{s=0}^{\infty} F_s w_s \le \frac{3}{2} \sum_{s=0}^{\infty} F_s v_s = \frac{3}{2} B < \infty .$$
This verifies assumption (23) of Proposition 4.1; the other assumptions of this proposition are verified as well, as follows from Lemma 6.1. Note that, in virtue of (45), $\kappa_\infty(V_\infty) = \lim_{s\to\infty} \kappa_s(V_s)$ coincides with the given operator $V$. Furthermore, $W = \kappa_\infty(W_\infty) = \lim_{s\to\infty} \kappa_s(W_s)$ is a limit of Hermitian operators and so is itself Hermitian, and $U = \lim_{s\to\infty} e^{\mathbf{A}_{s-1}} \cdots e^{\mathbf{A}_0}$ is unitary. Equality (25) holds true and this concludes the proof.

7. Set of Non-Resonant Frequencies

Let $J > 0$ be fixed and assume that, $\forall s \in \mathbb{Z}_+$,
$$\Omega_s \subset \Big[\frac{8}{9} J, \frac{9}{8} J\Big] .$$
The following definition concerns indices $(k, n, m)$ corresponding to non-diagonal entries, i.e., those indices for which either $k \ne 0$ or $m \ne n$. The diagonal indices, with $k = 0$ and $m = n$, will always be treated separately and, in fact, in a quite trivial manner.

Definition 7.1. We shall say that a multi-index $(k, n, m) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ is critical if $m \ne n$ and
$$\frac{kJ}{\Delta_{mn}} \in \Big]\frac{1}{2}, 2\Big[ \tag{51}$$
(hence $\mathrm{sgn}(k) = \mathrm{sgn}(h_m - h_n) \ne 0$). In the opposite case the multi-index will be called non-critical.

Definition 7.2. Let $\psi(k, n, m)$ be a positive function defined on non-diagonal indices and $W \in X_s$. A frequency $\omega \in \Omega_s$ will be called $(W, \psi)$-non-resonant if for all non-diagonal indices $(k, n, m) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ it holds
$$\mathrm{dist}\big(\mathrm{Spec}(k\omega - \Delta_{mn} + W_{0nn}(\omega)),\ \mathrm{Spec}(W_{0mm}(\omega))\big) \ge \psi(k, n, m) . \tag{52}$$
In the opposite case $\omega$ will be called $(W, \psi)$-resonant. Note that, in virtue of (32), $W_{0mm}(\omega)$ is a Hermitian operator in $H_m$.

Lemma 7.3. Assume that $\Omega_s \subset [\frac{8}{9} J, \frac{9}{8} J]$, $W \in X_s$ and $\psi$ is a positive function defined on non-diagonal indices and obeying a symmetry condition,
$$\psi(-k, m, n) = \psi(k, n, m) \quad \text{for all } (k, n, m) \text{ non-diagonal} . \tag{53}$$
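If
$$\forall m \in \mathbb{N}, \quad \forall \omega, \omega' \in \Omega_s, \ \omega \ne \omega' , \qquad \|\tilde\partial W_{0mm}(\omega, \omega')\| \le \frac{1}{4} , \tag{54}$$
and if condition (52) is satisfied for all $\omega \in \Omega_s$ and all non-critical indices $(k, n, m)$ then the Lebesgue measure of the set $\Omega_s^{\mathrm{bad}} \subset \Omega_s$ formed by $(W, \psi)$-resonant frequencies may be estimated as
$$|\Omega_s^{\mathrm{bad}}| \le 8 \sum_{\substack{m, n \in \mathbb{N} \\ \Delta_{mn} > \frac{1}{2} J}}\ \sum_{\substack{k \in \mathbb{N} \\ \frac{\Delta_{mn}}{2J} < k < \frac{2\Delta_{mn}}{J}}} \frac{M_m M_n}{k}\, \psi(k, n, m) . \tag{55}$$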
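Proof. Let $\lambda^m_1(\omega) \le \lambda^m_2(\omega) \le \cdots \le \lambda^m_{M_m}(\omega)$ be the increasingly ordered set of eigenvalues of $W_{0mm}(\omega)$, $m \in \mathbb{N}$. Set
$$\Omega_s^{\mathrm{bad}}(k, n, m, i, j) = \{\omega \in \Omega_s;\ |\omega k - \Delta_{mn} + \lambda^n_i(\omega) - \lambda^m_j(\omega)| < \psi(k, n, m)\} .$$
Then
$$\Omega_s^{\mathrm{bad}} = \bigcup_{(k,n,m)}\ \bigcup_{\substack{i, j \\ 1 \le i \le M_n \\ 1 \le j \le M_m}} \Omega_s^{\mathrm{bad}}(k, n, m, i, j) .$$
By assumption, if $(k, n, m)$ is a non-critical index then $\Omega_s^{\mathrm{bad}}(k, n, m, i, j) = \emptyset$ (for any $i$, $j$). Further notice that, due to the symmetry condition (53), $\Omega_s^{\mathrm{bad}}(k, n, m, i, j) = \Omega_s^{\mathrm{bad}}(-k, m, n, j, i)$. According to the Lidskii theorem [21, Chap. II, Sec. 6.5], for any $j$, $1 \le j \le M_m$, $\lambda^m_j(\omega) - \lambda^m_j(\omega')$ may be written as a convex combination (with non-negative coefficients) of eigenvalues of the operator $W_{0mm}(\omega) - W_{0mm}(\omega')$. Consequently,
$$\forall j, \ 1 \le j \le M_m, \quad \forall \omega, \omega' \in \Omega_s, \ \omega \ne \omega' , \qquad |\tilde\partial \lambda^m_j(\omega, \omega')| \le \|\tilde\partial W_{0mm}(\omega, \omega')\| \le \frac{1}{4} .$$
If $\omega, \omega' \in \Omega_s^{\mathrm{bad}}(k, n, m, i, j)$, $\omega \ne \omega'$, then $(k, n, m)$ is necessarily a critical index and
$$\frac{2\psi(k, n, m)}{|\omega - \omega'|} > \left| \frac{(\omega k - \Delta_{mn} + \lambda^n_i(\omega) - \lambda^m_j(\omega)) - (\omega' k - \Delta_{mn} + \lambda^n_i(\omega') - \lambda^m_j(\omega'))}{\omega - \omega'} \right| \ge |k| - \frac{1}{2} \ge \frac{1}{2} |k| .$$
This implies that $|\Omega_s^{\mathrm{bad}}(k, n, m, i, j)| \le 4\psi(k, n, m)/|k|$ and so
$$|\Omega_s^{\mathrm{bad}}| \le 2 \sum_{\substack{(k,n,m),\ k > 0 \\ \frac{\Delta_{mn}}{2J} < k < \frac{2\Delta_{mn}}{J}}}\ \sum_{\substack{i, j \\ 1 \le i \le M_n \\ 1 \le j \le M_m}} \frac{4}{k}\, \psi(k, n, m) .$$
This immediately leads to the desired inequality (55).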
8. Construction of the Sequences $\{\Omega_s\}$ and $\{A_s\}$

For a non-diagonal multi-index $(k, n, m)$ and $s \in \mathbb{Z}_+$ set
$$\psi_s(k, n, m) = \begin{cases} \dfrac{1}{2} \Delta_0 & \text{if } (k, n, m) \text{ is non-critical and } k = 0 , \\[2mm] \dfrac{7}{18} J \Big(|k| - \dfrac{1}{2}\Big) & \text{if } (k, n, m) \text{ is non-critical and } k \ne 0 , \\[2mm] \dfrac{\pi}{2}\, \varphi_{s+1} |k|^{1/2} e^{-\varrho_s |k|/2} & \text{if } (k, n, m) \text{ is critical} , \end{cases} \tag{56}$$
where
$$\varrho_s = \frac{1}{E_s} - \frac{1}{E_{s+1}} .$$
1 ∆0 . 2
Next we specify the way we shall construct the decreasing sequence of sets 8 9 ∞ {Ωs }s=0 . Let Ω0 = 9 J, 8 J . If Ws ∈ Xs has been already defined then we introduce Ωs+1 ⊂ Ωs as the set of (Ws , ψs )–non-resonant frequencies. Recall that the real
Banach space $X_s$ is determined by the choice of data $\varphi_s$, $E_s$ and $\Omega_s$, as explained in Sec. 5. As a next step let us consider, for $s \in \mathbb{Z}_+$, $\omega \in \Omega_{s+1}$ and a non-diagonal index $(k, n, m)$, a commutation equation,
$$(k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega))\, X - X\, (W_s)_{0mm}(\omega) = Y , \tag{58}$$
with an unknown $X \in B(H_m, H_n)$ and a right hand side $Y \in B(H_m, H_n)$. Since $\omega$ is $(W_s, \psi_s)$-non-resonant the spectra $\mathrm{Spec}(k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega))$ and $\mathrm{Spec}((W_s)_{0mm}(\omega))$ do not intersect and so a solution $X$ exists and is unique. This way one can introduce a linear mapping $(\Gamma_s)_{knm}(\omega) : B(H_m, H_n) \to B(H_m, H_n)$ such that $X = (\Gamma_s)_{knm}(\omega) Y$ solves (58). Moreover, according to Appendix A,
$$\|(\Gamma_s)_{knm}(\omega)\| \le \frac{\pi}{2\psi(k, n, m)} \tag{59}$$
in the general case, and provided the spectra $\mathrm{Spec}(k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega))$ and $\mathrm{Spec}((W_s)_{0mm}(\omega))$ are not interlaced it even holds that
$$\|(\Gamma_s)_{knm}(\omega)\| \le \frac{1}{\psi(k, n, m)} . \tag{60}$$
From the uniqueness it is clear that $\mathrm{Ker}((\Gamma_s)_{knm}(\omega)) = 0$. We extend the definition of $(\Gamma_s)_{knm}$ to diagonal indices by letting $(\Gamma_s)_{0nn}(\omega) = 0 \in B(B(H_n, H_n))$. This way we get an element
$$\Gamma_s \in \mathrm{Map}\Big(\Omega_{s+1} \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}} \sum^{\oplus}_{m\in\mathbb{N}} B(B(H_m, H_n))\Big) , \tag{61}$$
which naturally defines a linear mapping, denoted for simplicity by the same symbol, $\Gamma_s : {}^0X_s \to {}^0X_{s+1}$, according to the rule
$$\Gamma_s(Y)_{knm}(\omega) := (\Gamma_s)_{knm}(\omega)(Y_{knm}(\omega)) .$$

Lemma 8.2. Assume that for all non-diagonal indices $(k, n, m)$ and $\omega, \omega' \in \Omega_{s+1}$, $\omega \ne \omega'$, it holds
$$\|\tilde\partial(\Gamma_s)^{-1}_{knm}(\omega, \omega')\| \le |k| + \frac{1}{2} . \tag{62}$$
Assume also that when $\omega \in \Omega_{s+1}$ and $(k, n, m)$ is a non-critical index then the spectra $\mathrm{Spec}(k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega))$ and $\mathrm{Spec}((W_s)_{0mm}(\omega))$ are not interlaced. Assume finally that
$$\varphi_{s+1} \le \min\Big\{\frac{2}{3} \Delta_0, \frac{1}{6} J\Big\} . \tag{63}$$
Then the following upper estimate on the norm of $\Gamma_s \in B({}^0X_s, {}^0X_{s+1})$ holds true:
$$\|\Gamma_s\| \le \frac{5}{2 \varphi_{s+1}} .$$
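Proof. To estimate $\|\Gamma_s\|$ we shall use relation (92) of Proposition B.2. Note that
$$\|\tilde\partial(\Gamma_s)_{knm}(\omega, \omega')\| = \|(\Gamma_s)_{knm}(\omega)\, \tilde\partial(\Gamma_s)^{-1}_{knm}(\omega, \omega')\, (\Gamma_s)_{knm}(\omega')\| \le \|(\Gamma_s)_{knm}(\omega)\|\, \|(\Gamma_s)_{knm}(\omega')\| \Big(|k| + \frac{1}{2}\Big) . \tag{64}$$
If $(k, n, m)$ is critical then we have, according to (59) and (56),
$$\|(\Gamma_s)_{knm}(\omega)\| \le \frac{1}{\varphi_{s+1} |k|^{1/2}}\, e^{\varrho_s |k|/2}$$
and consequently
$$e^{-\varrho_s |k|} \big(\|(\Gamma_s)_{knm}(\omega)\| + \varphi_{s+1} \|\tilde\partial(\Gamma_s)_{knm}(\omega, \omega')\|\big) \le e^{-\varrho_s |k|} \Big(\frac{1}{\varphi_{s+1} |k|^{1/2}}\, e^{\varrho_s |k|/2} + \frac{|k| + \frac{1}{2}}{\varphi_{s+1} |k|}\, e^{\varrho_s |k|}\Big) \le \frac{1}{\varphi_{s+1}} \Big(1 + 1 + \frac{1}{2|k|}\Big) \le \frac{5}{2 \varphi_{s+1}} .$$
If $(k, n, m)$ is non-critical and $k \ne 0$ then we have, according to (60) and (56),
$$\|(\Gamma_s)_{knm}(\omega)\| \le \frac{18}{7J(|k| - \frac{1}{2})}$$
and consequently
$$e^{-\varrho_s |k|} \big(\|(\Gamma_s)_{knm}(\omega)\| + \varphi_{s+1} \|\tilde\partial(\Gamma_s)_{knm}(\omega, \omega')\|\big) \le \frac{18}{7J(|k| - \frac{1}{2})} \Big(1 + \varphi_{s+1} \frac{18(|k| + \frac{1}{2})}{7J(|k| - \frac{1}{2})}\Big) \le \frac{1}{\varphi_{s+1}}\, \frac{1}{6}\, \frac{36}{7} \Big(1 + \frac{1}{6}\, \frac{54}{7}\Big) < \frac{2}{\varphi_{s+1}} .$$
In the case when $(k, n, m)$ is non-critical and $k = 0$ one gets similarly $\|(\Gamma_s)_{knm}(\omega)\| \le 2/\Delta_0$ and
$$e^{-\varrho_s |k|} \big(\|(\Gamma_s)_{knm}(\omega)\| + \varphi_{s+1} \|\tilde\partial(\Gamma_s)_{knm}(\omega, \omega')\|\big) \le \frac{2}{\Delta_0} \Big(1 + \varphi_{s+1} \frac{1}{\Delta_0}\Big) \le \frac{1}{\varphi_{s+1}}\, \frac{4}{3} \Big(1 + \frac{2}{3}\Big) < \frac{5}{2 \varphi_{s+1}} .$$

Now we are able to specify the mappings $\Theta^s_u$. Set
$$A_s = \Gamma_s\big((1 - D_s)(W_s - \iota_{s-1}(W_{s-1}))\big) \in {}^0X_{s+1} . \tag{65}$$
$W_s \in X_s$ satisfies (32) and thus one finds, when taking the Hermitian adjoint of (58), that
$$((\Gamma_s)_{knm}(\omega) Y)^* = -(\Gamma_s)_{-k,m,n}(\omega)(Y^*) .$$
This implies that $A_s$ obeys condition (46). The mappings $\Theta^s_u$, $s < u$, are defined by equality (47) (see also the comment following the equality).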
9. Proof of Theorem 2.1

We start from the specification of the sequences $\{\varphi_s\}$ and $\{E_s\}$,
$$\varphi_s = a s^\alpha q^{-rs} \ \text{ for } s \ge 1, \qquad E_s = q^{s+1} \ \text{ for } s \ge 0 , \tag{66}$$
where $\alpha > 1$ and $q > 1$ are constants that are arbitrary except for the restrictions
$$q^r \ge e^\alpha \qquad \text{and} \qquad q^{-r} \zeta(\alpha) \le 3 \ln 2 \tag{67}$$
($\zeta$ stands for the Riemann zeta function), and
$$a = 45 e q^{2r} \epsilon_V . \tag{68}$$
For example, $\alpha = 2$ and $q^r = e^2$ will do. The value of $\varphi_0 \ge \varphi_1 = a q^{-r}$ does not influence the estimates which follow, and we automatically have $E_{-1} = 1$ (this is a convenient convention). Condition $r \ln(q) \ge \alpha$ guarantees that the sequence $\{\varphi_s\}$ is decreasing. Note also that
$$\varrho_s = \frac{1}{E_s} - \frac{1}{E_{s+1}} = \Big(1 - \frac{1}{q}\Big) q^{-s-1} .$$
Another reason for the choice (66) and (68) is that the constants $A_\star$, $B_\star$ and $C_\star$, as defined in (38), obey assumption (39) of Proposition 5.1. Particularly, a constraint on the choice of $\{\varphi_s\}$ and $\{E_s\}$, namely $\sum_{s=0}^{\infty} 1/(\varphi_{s+1} (E_{s-1})^r) < \infty$, is imposed by requiring $B_\star$ to be finite. However this is straightforward to verify. Actually, the constants may now be expressed explicitly,
$$A_\star = \frac{5 e q^{2r}}{a} , \qquad B_\star = \frac{5 e q^r}{a}\, \zeta(\alpha) , \qquad C_\star = \frac{5 e q^r}{a} ,$$
and thus condition (39) means that
$$\epsilon_V \frac{5 e q^r}{a}\, \zeta(\alpha) \le \frac{1}{3} \ln 2 , \qquad \epsilon_V \frac{5 e q^{2r}}{a}\, \phi\Big(\epsilon_V \frac{15 e q^r}{a}\Big) \le \frac{1}{9} . \tag{69}$$
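The bookkeeping behind (66)–(69) can be replayed numerically for the example $\alpha = 2$, $q^r = e^2$. In the sketch below the value of $\epsilon_V$ is an arbitrary placeholder, all variable names are hypothetical, and only the first condition of (69) is tested, since the function $\phi$ is defined in (5), outside this section. The suprema and the sum in (38) are evaluated on a finite range of $s$ and compared with the closed forms.

```python
import math

alpha = 2.0
q_r = math.exp(2.0)                 # q**r, the example value from the text
eps_V = 1e-3                        # placeholder size of the perturbation (assumption)
zeta_alpha = math.pi ** 2 / 6       # zeta(2)
a = 45 * math.e * q_r ** 2 * eps_V  # (68)


def phi_seq(s):
    """phi_s = a s^alpha q^(-r s) for s >= 1, cf. (66)."""
    return a * s ** alpha / q_r ** s


def E_pow_r(s):
    """(E_s)^r with E_s = q^(s+1); in particular (E_{-1})^r = 1."""
    return q_r ** (s + 1)


# Closed forms stated after (68).
A_star = 5 * math.e * q_r ** 2 / a
B_star = 5 * math.e * q_r * zeta_alpha / a
C_star = 5 * math.e * q_r / a

# The same constants evaluated from the definitions (38) on a finite range of s.
S = range(0, 100)
A_num = 5 * math.e * max(E_pow_r(s) / (phi_seq(s + 1) * E_pow_r(s - 1) ** 2) for s in S)
B_num = 5 * math.e * sum(1.0 / (phi_seq(s + 1) * E_pow_r(s - 1)) for s in S)
C_num = 5 * math.e * max(1.0 / (phi_seq(s + 1) * E_pow_r(s - 1)) for s in S)
```

Here `eps_V * B_star` equals $q^{-r}\zeta(\alpha)/9$ independently of the placeholder `eps_V`, which is why the first condition in (69) reduces to the second restriction in (67).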
The latter condition in (69) is satisfied since the LHS is bounded from above by (cf. (5))
$$\frac{1}{9}\, \phi\Big(\frac{1}{3} q^{-r}\Big) \le \frac{1}{9}\, \phi\Big(\frac{1}{3}\Big) = 1 - \frac{2}{3} e^{1/3} < \frac{1}{9} .$$
Concerning the former condition, the LHS equals $q^{-r} \zeta(\alpha)/9$ and so it suffices to choose $\alpha$ and $q$ so that (67) is fulfilled. An additional reason for the choice (66) will be explained later.

Let us now summarize the construction of the sequences $\{X_s\}$, $\{W_s\}$ and $\{\Theta^s_u\}_{u>s}$ which will finally amount to a proof of Theorem 2.1. Some more details were already given in Sec. 8. We set $\Omega_0 = [\frac{8}{9} J, \frac{9}{8} J]$ and $W_0 = V_0$. Recall that the cut-offs $V_s$ of $V$ were introduced in (35). In every step, numbered by $s \in \mathbb{Z}_+$, we assume that $\Omega_t$ and $W_t$, with $0 \le t \le s$, and $A_t$, with $0 \le t \le s - 1$, have already been defined. The mappings $\Theta^t_u$, with $u > t$, are given by $\Theta^t_u(X) = [\iota_{u,t+1}(A_t), X]$ provided $A_t \in {}^0X_{t+1}$ satisfies condition (46). We define $\Omega_{s+1} \subset \Omega_s$ as the set of
$(W_s, \psi_s)$-non-resonant frequencies, with $\psi_s$ introduced in (56). Consequently, the real Banach space $X_{s+1}$ is defined as well, since its definition depends on the data $\Omega_{s+1}$, $\varphi_{s+1}$ and $E_{s+1}$. Then we are able to introduce an element $\Gamma_s$ (in the sense of (61)) whose definition is based on Eq. (58) and which in turn determines a bounded operator $\Gamma_s \in B({}^0X_s, {}^0X_{s+1})$ (with some abuse of notation). The element $A_s \in {}^0X_{s+1}$ is given by equality (65) and actually satisfies condition (46). Knowing $W_t$, $t \le s$, and $\Theta^t_{s+1}$, $t \le s$, (which is equivalent to knowing $A_t$, $t \le s$) one is able to evaluate the RHS of (8) defining the element $W_{s+1}$. Hence one proceeds one step further.

We choose $\epsilon_\star(r, \Delta_0, J)$ maximal possible so that
$$\frac{3e}{1 - q^{-r}}\, \epsilon_\star(r, \Delta_0, J) \le \min\Big\{\frac{1}{4} \Delta_0, \frac{7}{72} J\Big\} \tag{70}$$
and
$$45 e q^r\, \epsilon_\star(r, \Delta_0, J) \le \min\Big\{\frac{2}{3} \Delta_0, \frac{1}{6} J\Big\} . \tag{71}$$
We claim that this choice guarantees that the construction goes through. Basically this means that $\epsilon_V < \epsilon_\star(r, \Delta_0, J)$ is sufficiently small so that all the assumptions occurring in the preceding auxiliary results are satisfied in every step, with $s \in \mathbb{Z}_+$. This concerns assumption (57) of Lemma 8.1,
$$\|(W_s)_{0mm}(\omega)\| \le \min\Big\{\frac{1}{4} \Delta_0, \frac{7}{72} J\Big\} , \qquad \forall \omega \in \Omega_s, \ \forall m \in \mathbb{N} , \tag{72}$$
assumption (54) of Lemma 7.3,
$$\|\tilde\partial(W_s)_{0mm}(\omega, \omega')\| \le \frac{1}{4} , \qquad \forall \omega, \omega' \in \Omega_s, \ \omega \ne \omega', \ \forall m \in \mathbb{N} , \tag{73}$$
assumptions (62) and (63) of Lemma 8.2,
$$\|\tilde\partial(\Gamma_s)^{-1}_{knm}(\omega, \omega')\| \le |k| + \frac{1}{2} , \qquad \forall (k, n, m), \ \forall \omega, \omega' \in \Omega_s, \ \omega \ne \omega' , \tag{74}$$
and
$$\varphi_{s+1} \le \min\Big\{\frac{2}{3} \Delta_0, \frac{1}{6} J\Big\} , \tag{75}$$
and assumption (50) of Proposition 6.2,
$$\|A_{s-1}\| \le \frac{5}{2\varphi_s}\, \|W_{s-1} - \iota_{s-2}(W_{s-2})\| . \tag{76}$$
We can immediately do some simplifications. As the sequence $\{\varphi_s\}$ is non-increasing, condition (75) reduces to the case $s = 0$. Since $\varphi_1 = 45 e q^r \epsilon_V$ the upper bound (71) implies (75). Note also that (74) is a direct consequence of (73). Actually, one deduces from the definition of $(\Gamma_s)_{knm}(\omega)$ (based on Eq. (58)) that, $\forall Y \in B(H_m, H_n)$,
$$(\Gamma_s)^{-1}_{knm}(\omega) Y = (k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega))\, Y - Y\, (W_s)_{0mm}(\omega) .$$
Hence
$$\tilde\partial(\Gamma_s)^{-1}_{knm}(\omega, \omega') Y = (k + \tilde\partial(W_s)_{0nn}(\omega, \omega'))\, Y - Y\, \tilde\partial(W_s)_{0mm}(\omega, \omega')$$
and, assuming (73),
$$\|\tilde\partial(\Gamma_s)^{-1}_{knm}(\omega, \omega')\| \le |k| + \|\tilde\partial(W_s)_{0nn}(\omega, \omega')\| + \|\tilde\partial(W_s)_{0mm}(\omega, \omega')\| \le |k| + \frac{1}{2} .$$
Let us show that in every step, with $s \in \mathbb{Z}_+$, conditions (72), (73) and (76) are actually fulfilled. For $s = 0$, condition (76) is empty and condition (73) is obvious since $W_0 = V_0$ does not depend on $\omega$. Condition (72) is obvious as well due to assumption (70) and the fact that $\|(W_0)_{0mm}(\omega)\| = \|(V_0)_{0mm}\| \le \epsilon_V$. Assume now that $t \in \mathbb{Z}_+$ and conditions (72), (73) and (76) are satisfied in each step $s \le t$. Recall that in (40) we have set $F_s = 5/\varphi_{s+1}$ and $v_s = e\, \epsilon_V/(E_{s-1})^r$. We also keep the notation $w_s = \|W_s - \iota_{s-1}(W_{s-1})\|_s$, with the convention $W_{-1} = 0$.

We start with condition (76). Using the induction hypothesis, Lemmas 8.1 and 8.2 one finds that $\|\Gamma_t\| \le F_t/2$ and so $\|A_t\| \le \|\Gamma_t\|\, \|W_t - \iota_{t-1}(W_{t-1})\| \le F_t w_t/2$ (cf. (65) and (4)). By the induction hypothesis and the just preceding step, $\|A_s\| \le F_s w_s$ for all $s \le t$. As we already know, the constants $A_\star$, $B_\star$ and $C_\star$ fulfill (39) and so the quantities $A$, $B$ and $C$ given by $A = \epsilon_V A_\star$, $B = \epsilon_V B_\star$ and $C = \epsilon_V C_\star$ (cf. (41)) obey (42) and consequently inequality (15) with $d = 3$. By the very choice of $A$, $B$ and $C$ (cf. (38) and (40)) the quantities also obey relations (12), (13) and (14). This means that all assumptions of Proposition 3.3 are fulfilled for $s \le t$ (recall that $\|\Theta^s_u\| \le 2\|A_s\|$). One easily finds that the conclusion of Proposition 3.3, namely $w_s \le d v_s$, holds as well for all $s$, $s \le t + 1$. Clearly, $\|(W_s)_{0mm}(\omega)\| \le \|W_s\|_s$ for all $s$, and
$$\|W_{t+1}\|_{t+1} \le \sum_{s=0}^{t+1} w_s \le 3 \sum_{s=0}^{\infty} v_s = 3 e\, \epsilon_V \sum_{s=0}^{\infty} q^{-rs} = \frac{3 e\, \epsilon_V}{1 - q^{-r}} .$$
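By (70) we conclude that (72) is true for $s = t + 1$. Finally, using once more that $w_s \le 3 v_s$ for $s \le t + 1$,
$$\|\tilde\partial(W_{t+1})_{0mm}(\omega, \omega')\| \le \sum_{s=0}^{t+1} \|\tilde\partial\big(W_s - \iota_{s-1}(W_{s-1})\big)_{0mm}(\omega, \omega')\| \le \sum_{s=0}^{t+1} \frac{1}{\varphi_s}\, \|W_s - \iota_{s-1}(W_{s-1})\|_s \le \sum_{s=0}^{\infty} \frac{3 v_s}{\varphi_{s+1}} .$$
However, the last sum equals (cf. (40) and (42))
$$\frac{3}{5} \sum_{s=0}^{\infty} F_s v_s = \frac{3}{5} B \le \frac{1}{5} \ln 2 < \frac{1}{4} .$$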
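This verifies (73) for $s = t + 1$ and hence the verification of conditions (72), (73) and (76) is complete.

Set, as before, $\Omega_\infty = \bigcap_{s=0}^{\infty} \Omega_s$. Next we are going to estimate the Lebesgue measure of $\Omega_\infty$,
$$|\Omega_\infty| = |\Omega_0| - |\Omega_0 \setminus \Omega_\infty| = \frac{17}{72} J - \sum_{s=0}^{\infty} |\Omega_s \setminus \Omega_{s+1}| = \frac{17}{72} J - \sum_{s=0}^{\infty} |\Omega_s^{\mathrm{bad}}| .$$
Recalling Lemma 7.3 jointly with Lemma 8.1 showing that the assumptions of Lemma 7.3 are satisfied, and the explicit form of $\psi$ (56) we obtain
$$|\Omega_s^{\mathrm{bad}}| \le 4\pi \varphi_{s+1} \sum_{\substack{m, n \in \mathbb{N} \\ \Delta_{mn} > \frac{1}{2} J}} M_m M_n \sum_{\substack{k \in \mathbb{N} \\ \frac{\Delta_{mn}}{2J} < k < \frac{2\Delta_{mn}}{J}}} k^{-1/2} e^{-\varrho_s k/2}$$
$$\le 4\pi \varphi_{s+1} \sum_{\substack{m, n \in \mathbb{N} \\ \Delta_{mn} > \frac{1}{2} J}} M_m M_n\, \frac{2\Delta_{mn}}{J} \Big(\frac{\Delta_{mn}}{2J}\Big)^{-1/2} e^{-\varrho_s \Delta_{mn}/4J}$$
$$= 16\pi (2J)^\sigma \varphi_{s+1} \sum_{\substack{m, n \in \mathbb{N} \\ \Delta_{mn} > \frac{1}{2} J}} \frac{M_m M_n}{(\Delta_{mn})^\sigma} \Big(\frac{\Delta_{mn}}{2J}\Big)^{\sigma + \frac{1}{2}} e^{-\varrho_s \Delta_{mn}/4J}$$
$$\le 16\pi\, 2^\sigma \varphi_{s+1} \Big(\frac{2\sigma + 1}{e \varrho_s}\Big)^{\sigma + \frac{1}{2}} \Delta_\sigma(J)$$
where we have used that if $\alpha > 0$ and $\beta > 0$ then $\sup_{x>0} x^\alpha e^{-\beta x} = \big(\frac{\alpha}{e\beta}\big)^\alpha$. To complete the estimate we need that the sum $\sum_{s=0}^{\infty} \varphi_{s+1}/(\varrho_s)^{\sigma + \frac{1}{2}}$ should be finite which imposes another restriction on the choice of $\{\varphi_s\}$ and $\{E_s\}$. With our choice (66) this is guaranteed by the condition $r > \sigma + \frac{1}{2}$ since in that case
$$\sum_{s=0}^{\infty} \frac{\varphi_{s+1}}{(\varrho_s)^{\sigma + \frac{1}{2}}} = \frac{a}{\big(1 - \frac{1}{q}\big)^{\sigma + \frac{1}{2}}} \sum_{s=0}^{\infty} (s + 1)^\alpha q^{-(r - \sigma - \frac{1}{2})(s+1)} < \infty .$$
Hence
$$|\Omega_\infty| \ge \frac{17}{72} J - \delta_1(\sigma, r)\, \Delta_\sigma(J)\, \epsilon_V \tag{77}$$
where
$$\delta_1(\sigma, r) = 720 \pi e q^{2r}\, 2^\sigma \Big(\frac{2\sigma + 1}{\big(1 - \frac{1}{q}\big) e}\Big)^{\sigma + \frac{1}{2}} \mathrm{Li}_{-\alpha}\big(q^{-r + \sigma + \frac{1}{2}}\big) . \tag{78}$$
Here $\mathrm{Li}_n(z) = \sum_{k=1}^{\infty} z^k/k^n$ ($|z| < 1$) is the polylogarithm function. This shows (3).

To finish the proof let us assume that $\omega \in \Omega_\infty$. We wish to apply Proposition 6.2. Going through its assumptions one finds that it only remains to make a note concerning equality (48). In fact, this equality is a direct consequence of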
the construction of $A_s \in {}^0X_{s+1}$. Actually, by the definition of $A_s$ (cf. (65)),
$$A_s = \Gamma_s\big((1 - D_s)(W_s - \iota_{s-1}(W_{s-1}))\big) ,$$
which means that for any $\omega \in \Omega_{s+1}$ and all indices $(k, n, m)$,
$$\big(k\omega - \Delta_{mn} + (W_s)_{0nn}(\omega)\big) (A_s)_{knm}(\omega) - (A_s)_{knm}(\omega)\, (W_s)_{0mm}(\omega) = \big((1 - D_s)(W_s - \iota_{s-1}(W_{s-1}))\big)_{knm}(\omega) . \tag{79}$$
On the other hand, by the definition of $\Theta^s_u$ (cf. (47)) and the definition of $D_s$ (cf. (33)), and since $\omega \in \Omega_\infty$, it holds true that, $\forall u$, $u > s$,
$$\Theta^s_u(\iota_{us} D_s(W_s))_{knm}(\omega) = [\iota_{u,s+1}(A_s), \iota_{us} D_s(W_s)]_{knm}(\omega) = (A_s)_{knm}(\omega)\, (W_s)_{0mm} - (W_s)_{0nn}\, (A_s)_{knm}(\omega) . \tag{80}$$
A combination of (79) and (80) gives (48). We conclude that according to Proposition 6.2 the operator $K + V$ is unitarily equivalent to $K + D(W)$ and hence has a pure point spectrum. This concludes the proof of Theorem 2.1.

10. Concluding Remarks

The backbone of the proof of Theorem 2.1 is an iterative procedure loosely called here and elsewhere the quantum KAM method. One of the improvements attempted in the present paper was a sort of optimization of this method, particularly from the point of view of assumptions imposed on the regularity of the perturbation $V$. In this final section we would like to briefly discuss this feature by comparing our presentation to an earlier version of the method. We shall refer to paper [9] but the main points of the discussion apply as well to other papers including the original articles [5, 6] where the quantum KAM method was established.

For the sake of illustration we use a simple but basic model: $H = \sum_{m\in\mathbb{N}} m^{1+\alpha} Q_m$, i.e., $h_m = m^{1+\alpha}$, with $0 < \alpha \le 1$, and $\dim Q_m = 1$; thus any $\sigma > 1/\alpha$ makes $\Delta_\sigma(J)$ finite. The perturbation $V$ is assumed to fulfill (34) for a given $r \ge 0$. According to Theorem 2.1, $r$ is required to satisfy $r > \sigma + 1/2$ which may be compared to reference [9, Theorem 4.1] where one requires
$$r > r_1 = 4\sigma + 6 + \Big[\frac{(4\sigma + 6)\sigma}{1 + \sigma}\Big] + 1 . \tag{81}$$
The reason is that the procedure is done in two steps in the earlier version; in the first step preceding the iterative procedure itself the so-called adiabatic regularization is applied on $V$ in order to achieve a regularity in time and "space" (by the spatial part one means the factor $H$ in $\mathcal{K} = L^2([0, T], dt) \otimes H$) of the type
$$\exists\, r_1, \ r_2 > r_2 = 4\sigma + 6 , \qquad \sup_{k,n,m} |k|^{r_1} |n - m|^{r_2} |V_{knm}| < \infty . \tag{82}$$
The adiabatic regularization brings in the summand $\big[\frac{(4\sigma + 6)\sigma}{1 + \sigma}\big] + 1$. In the present version both the adiabatic regularization and condition (82) are avoided. This is related to the choice of the norm in the auxiliary Banach spaces $X_s$,
$$\|X\|_s = \sup_{\omega \ne \omega'} \sup_{n} \sum_{k, m} F_s(k, n, m) \big(|X_{knm}(\omega)| + \varphi_s |\tilde\partial X_{knm}(\omega, \omega')|\big) .$$
In the earlier version the weights were chosen as $F_s(k, n, m) := \exp((|k| + |n - m|)/E_s)$ in order to compensate small divisors occurring in each step of the iterative method. A more careful control of the small divisors in the present version allows less restrictive weights, namely $F_s(k, n, m) = \exp(|k|/E_s)$. In more detail, indices labeling the small divisors are located in a critical subset of the lattice $\mathbb{Z} \times \mathbb{N} \times \mathbb{N}$. Definition (51) of the critical indices implies a simple estimate,
$$|k| \le |k| + |n - m| \le |k| + |\Delta_{mn}| \le |k| (1 + 2J) ,$$
which explains why we effectively have, in the present version, $r_2 = 0$.

The second remark concerns Diophantine-like estimates of the small divisors governed by the sequence $\{\psi_s\}$. The somewhat complicated definition (56) is caused by the classification of the indices into critical and non-critical ones. However only the critical indices are of importance in this context and thus we can simplify, for the purpose of this discussion, the definition of $\psi_s$ to
$$\psi_s = \gamma_s |k|^{1/2} e^{-\varrho_s |k|/2} , \qquad \varphi_{s+1} \ge \gamma_s > 0 .$$
Let us compare it to the choice made in [9], namely $\psi_s = \gamma_s |k|^{-\sigma}$. The factors $\gamma_s$ then occur in some key estimates; let us summarize them. The norms of the operators $\Gamma_s : X_s \to X_{s+1}$ are estimated as
$$\|\Gamma_s\| \le \mathrm{const}\, \frac{\varphi_{s+1}}{\gamma_s^2}$$
(this is shown in Lemma 8.2 but note that in this lemma we have set $\gamma_s = \varphi_{s+1}$). Another important condition is the convergence of the series
$$B_\star = \mathrm{const} \sum_{s=0}^{\infty} \frac{\varphi_{s+1}}{\gamma_s^2 (E_{s-1})^r} < \infty$$
(cf. (38) but there again $\gamma_s = \varphi_{s+1}$). Finally, the measure of the set of resonant frequencies, $|\bigcup_s \Omega_s^{\mathrm{bad}}|$, is estimated by
$$\sum_{s=0}^{\infty} |\Omega_s^{\mathrm{bad}}| \le \mathrm{const} \sum_{s=0}^{\infty} \frac{\gamma_s}{\varrho_s^{\sigma + \frac{1}{2}}} < \infty , \qquad \varrho_s = \frac{1}{E_s} - \frac{1}{E_{s+1}}$$
(shown in the part of the proof of Theorem 2.1 preceding relation (77)). We recall that $E_s$ denotes the width of the truncation of the perturbation $V$ at step $s$ of the algorithm (cf. (35)). These conditions restrict the choice of the sequences $\{E_s\}$ and $\{\gamma_s\}$ which may also be regarded as parameters of the procedure. Specification (66)
of these parameters, with γs = ϕs+1 , can be compared to a polynomial behavior of Es and γs in the variable s in [9] where one sets ϕs+1 ≡ 1 and Es = const(s + 1)ν−1 ,
ν > 2,
γs = const(s + 1)−µ ,
µ > 1.
The latter definition finally leads to the bound on the order of regularity of V (2σ + 1)ν + 3 . ν −1 Thus in that case the bound varies from r > 4σ + 5 (for ν → 2+; this contributes to r1 in (81)) to r > 2σ + 1 (ν → +∞). This shows why we have chosen here to truncate with exponential Es , see (66). In the last remark let us mention a consequence of the equality γs = ϕs+1 . The S become (notice that %s = const/Es ) conditions for convergence of B? and s Ωbad s X X 1 σ+ 1 < ∞ and ϕs+1 Es 2 < ∞ r ϕs+1 (Es−1 ) s s r>
and are fulfilled for $r > \sigma + \frac12$. There is however a drawback with this choice. Notice the role the coefficients $\varphi_s$ play in the definition (31) of the norm $\|\cdot\|_s$. Since $\varphi_s \to 0$ as $s \to \infty$ one loses control of the Lipschitz regularity in $\omega$ in the limit of the iterative procedure. This means that we have no information about the regularity of the eigenvectors and the eigenvalues of $K + V$ with respect to $\omega$. With $r > 2\sigma + 1$ we could have taken $\varphi_{s+1} = 1$ and obtained that these eigenvalues and eigenvectors are indeed Lipschitz in $\omega$.
Appendix A. Commutation Equation
Suppose that $\mathcal{X}$ and $\mathcal{Y}$ are Hilbert spaces, $A \in \mathcal{B}(\mathcal{Y})$, $B \in \mathcal{B}(\mathcal{X})$, both $A$ and $B$ are self-adjoint, and $V \in \mathcal{B}(\mathcal{X}, \mathcal{Y})$. We want to solve the equation
\[ AW - WB = V \tag{83} \]
in the unknown $W$ under the condition
\[ \mathrm{dist}(\mathrm{Spec}(A), \mathrm{Spec}(B)) > 0 . \tag{84} \]
The proof of the following proposition may be found in the beautiful review paper [22] by Bhatia and Rosenthal and references therein. If in addition to (84) one has $\sup \mathrm{Spec}(A) < \inf \mathrm{Spec}(B)$ or $\sup \mathrm{Spec}(B) < \inf \mathrm{Spec}(A)$ we shall say that $\mathrm{Spec}(A)$ and $\mathrm{Spec}(B)$ are not interlaced.
Proposition A.1. Under (84) the solution to (83) exists and is unique in $\mathcal{B}(\mathcal{X}, \mathcal{Y})$ and
\[ \|W\| \le \frac{\pi}{2}\, \frac{\|V\|}{\mathrm{dist}(\mathrm{Spec}(A), \mathrm{Spec}(B))} . \]
If in addition $\mathrm{Spec}(A)$ and $\mathrm{Spec}(B)$ are not interlaced then
\[ \|W\| \le \frac{\|V\|}{\mathrm{dist}(\mathrm{Spec}(A), \mathrm{Spec}(B))} . \]
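As an illustration of Proposition A.1, the following minimal sketch treats the simplest finite-dimensional case, in which $A$ and $B$ are diagonal 2x2 matrices with non-interlaced spectra. In that case the solution of (83) can be written entrywise as $W_{ij} = V_{ij}/(a_i - b_j)$. The matrices, the helper names `spectral_norm_2x2` and `solve_diagonal`, and the numerical values are assumptions chosen for this example only; they do not come from the paper.

```python
import math

def spectral_norm_2x2(m):
    """Largest singular value of a real 2x2 matrix, via the closed form
    for the eigenvalues of M^T M (trace t, determinant det(M)^2)."""
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = max(t * t - 4.0 * det * det, 0.0)
    return math.sqrt((t + math.sqrt(disc)) / 2.0)

def solve_diagonal(a_eigs, b_eigs, v):
    """Solve A W - W B = V when A = diag(a_eigs) and B = diag(b_eigs)."""
    return [[v[i][j] / (a_eigs[i] - b_eigs[j]) for j in range(2)]
            for i in range(2)]

# Hypothetical data: Spec(A) = {0, 1} lies below Spec(B) = {3, 4}.
a_eigs = [0.0, 1.0]
b_eigs = [3.0, 4.0]
v = [[1.0, 2.0], [3.0, 4.0]]

w = solve_diagonal(a_eigs, b_eigs, v)

# Residual of the commutation equation, entry by entry.
residual = max(abs(a_eigs[i] * w[i][j] - w[i][j] * b_eigs[j] - v[i][j])
               for i in range(2) for j in range(2))

dist = min(abs(a - b) for a in a_eigs for b in b_eigs)
norm_w = spectral_norm_2x2(w)
bound = spectral_norm_2x2(v) / dist   # non-interlaced bound of Proposition A.1
```

Here `norm_w` should not exceed `bound`, in agreement with the second estimate of the proposition; the constant $\pi/2$ is only needed when the spectra are interlaced.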
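Appendix B. Choice of a Norm in a Banach Space
Let
\[ \mathcal{H} = \sum_{n\in\mathbb{N}}^{\oplus} \mathcal{H}_n \]
be a decomposition of a Hilbert space into a direct sum of mutually orthogonal subspaces, and $\Omega \subset \mathbb{R}$. To any couple of positive real numbers, $\varphi$ and $E$, we relate a subspace
\[ \mathcal{A} \subset L^{\infty}\Big( \Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}}^{\oplus} \sum_{m\in\mathbb{N}}^{\oplus} \mathcal{B}(\mathcal{H}_m, \mathcal{H}_n) \Big) \]
formed by those elements $\mathcal{V}$ which satisfy $\mathcal{V}_{knm}(\omega) \in \mathcal{B}(\mathcal{H}_m, \mathcal{H}_n)$ and have finite norm
\[ \|\mathcal{V}\| = \sup_{\substack{\omega,\omega' \in \Omega \\ \omega \ne \omega'}}\ \sup_{n\in\mathbb{N}}\ \sum_{k\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \big( \|\mathcal{V}_{knm}(\omega)\| + \varphi\, \|\tilde\partial \mathcal{V}_{knm}(\omega, \omega')\| \big)\, e^{|k|/E} \tag{85} \]
where $\tilde\partial$ stands for the difference operator
\[ \tilde\partial \mathcal{V}(\omega, \omega') = \frac{\mathcal{V}(\omega) - \mathcal{V}(\omega')}{\omega - \omega'} . \]
Note that the difference operator obeys the rule
\[ \tilde\partial(\mathcal{U}\mathcal{V})(\omega, \omega') = \tilde\partial\mathcal{U}(\omega, \omega')\, \mathcal{V}(\omega') + \mathcal{U}(\omega)\, \tilde\partial\mathcal{V}(\omega, \omega') . \tag{86} \]
Proposition B.1. The norm in $\mathcal{A}$ is an algebra norm with respect to the multiplication
\[ (\mathcal{U}\mathcal{V})_{knm}(\omega) = \sum_{\ell\in\mathbb{Z}} \sum_{p\in\mathbb{N}} \mathcal{U}_{k-\ell,n,p}(\omega)\, \mathcal{V}_{\ell pm}(\omega) . \tag{87} \]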
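The mechanism behind Proposition B.1 is already visible in a scalar analogue, where the operator-valued entries, the indices $n, m, p$ and the dependence on $\omega$ are all dropped. What remains is a weighted $\ell^1$ norm $\sum_k |u_k| e^{|k|/E}$ on sequences indexed by $k \in \mathbb{Z}$, with multiplication given by convolution in $k$ as in (87). The sketch below checks submultiplicativity numerically; the sequences and the names `weighted_norm` and `convolve` are hypothetical and serve only to illustrate the estimate $e^{|k|/E} \le e^{|k-\ell|/E} e^{|\ell|/E}$.

```python
import math

def weighted_norm(u, E):
    """Scalar analogue of (85): sum_k |u_k| * exp(|k| / E)."""
    return sum(abs(c) * math.exp(abs(k) / E) for k, c in u.items())

def convolve(u, v):
    """Scalar analogue of (87): (uv)_k = sum_l u_{k-l} v_l."""
    out = {}
    for k1, c1 in u.items():
        for k2, c2 in v.items():
            out[k1 + k2] = out.get(k1 + k2, 0.0) + c1 * c2
    return out

# Hypothetical finitely supported sequences, stored as {k: coefficient}.
E = 2.0
u = {-2: 0.5, 0: 1.0, 3: -0.25}
v = {-1: 2.0, 1: -1.0, 4: 0.3}

lhs = weighted_norm(convolve(u, v), E)
rhs = weighted_norm(u, E) * weighted_norm(v, E)
```

In this simplified setting `lhs <= rhs` follows from the triangle inequality $|k| \le |k-\ell| + |\ell|$; the full proposition adds the Lipschitz part of the norm, which is handled with the product rule (86).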
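Proof. We have to show that
\[ \|\mathcal{U}\mathcal{V}\| \le \|\mathcal{U}\|\, \|\mathcal{V}\| . \tag{88} \]
For brevity let us denote (in this proof)
\[ X_p(\omega) = \sum_{\ell\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|\mathcal{V}_{\ell pm}(\omega)\|\, e^{|\ell|/E}, \qquad \tilde\partial X_p(\omega, \omega') = \sum_{\ell\in\mathbb{Z}} \sum_{m\in\mathbb{N}} \|\tilde\partial \mathcal{V}_{\ell pm}(\omega, \omega')\|\, e^{|\ell|/E} . \]
Here $\tilde\partial X$ is an "inseparable" symbol (which this time does not have the meaning $\tilde\partial$ of $X$). It holds
\begin{align*}
\sum_k \sum_m \|(\mathcal{U}\mathcal{V})_{knm}(\omega)\|\, e^{|k|/E}
&\le \sum_k \sum_m \sum_\ell \sum_p \|\mathcal{U}_{k-\ell,n,p}(\omega)\|\, e^{|k-\ell|/E}\, \|\mathcal{V}_{\ell pm}(\omega)\|\, e^{|\ell|/E} \\
&= \sum_k \sum_m \sum_\ell \sum_p \|\mathcal{U}_{knp}(\omega)\|\, e^{|k|/E}\, \|\mathcal{V}_{\ell pm}(\omega)\|\, e^{|\ell|/E} \\
&= \sum_k \sum_p \|\mathcal{U}_{knp}(\omega)\|\, e^{|k|/E}\, X_p(\omega) .
\end{align*}
Similarly, using (86),
\begin{align*}
\sum_k \sum_m \|\tilde\partial(\mathcal{U}\mathcal{V})_{knm}(\omega, \omega')\|\, e^{|k|/E}
&\le \sum_k \sum_m \sum_\ell \sum_p \big( \|\mathcal{U}_{knp}(\omega)\|\, e^{|k|/E}\, \|\tilde\partial\mathcal{V}_{\ell pm}(\omega, \omega')\|\, e^{|\ell|/E} \\
&\qquad\qquad + \|\tilde\partial\mathcal{U}_{knp}(\omega, \omega')\|\, e^{|k|/E}\, \|\mathcal{V}_{\ell pm}(\omega')\|\, e^{|\ell|/E} \big) \\
&= \sum_k \sum_p \big( \|\mathcal{U}_{knp}(\omega)\|\, \tilde\partial X_p(\omega, \omega') + \|\tilde\partial\mathcal{U}_{knp}(\omega, \omega')\|\, X_p(\omega') \big)\, e^{|k|/E} .
\end{align*}
A combination of these two inequalities gives
\begin{align*}
&\sum_k \sum_m \big( \|(\mathcal{U}\mathcal{V})_{knm}(\omega)\| + \varphi\, \|\tilde\partial(\mathcal{U}\mathcal{V})_{knm}(\omega, \omega')\| \big)\, e^{|k|/E} \\
&\quad \le \sum_k \sum_p \big( \|\mathcal{U}_{knp}(\omega)\| (X_p(\omega) + \varphi\, \tilde\partial X_p(\omega, \omega')) + \varphi\, \|\tilde\partial\mathcal{U}_{knp}(\omega, \omega')\|\, X_p(\omega') \big)\, e^{|k|/E} \\
&\quad \le \sup_{\omega,\omega'} \sup_p \big( X_p(\omega) + \varphi\, \tilde\partial X_p(\omega, \omega') \big) \sum_k \sum_p \big( \|\mathcal{U}_{knp}(\omega)\| + \varphi\, \|\tilde\partial\mathcal{U}_{knp}(\omega, \omega')\| \big)\, e^{|k|/E} \\
&\quad = \|\mathcal{V}\| \sum_k \sum_p \big( \|\mathcal{U}_{knp}(\omega)\| + \varphi\, \|\tilde\partial\mathcal{U}_{knp}(\omega, \omega')\| \big)\, e^{|k|/E} .
\end{align*}
To obtain (88) it suffices to apply $\sup_{\omega,\omega'} \sup_n$ to this inequality.
Suppose now that two couples of positive real numbers, $(\varphi_1, E_1)$ and $(\varphi_2, E_2)$, are given and that it holds
\[ \varrho = \frac{1}{E_1} - \frac{1}{E_2} \ge 0 \quad \text{and} \quad \varphi_2 \le \varphi_1 . \tag{89} \]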
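Consequently, we have two Banach spaces, $\mathcal{A}_1$ and $\mathcal{A}_2$. Furthermore, we suppose that there is given an element
\[ \Gamma \in \mathrm{Map}\Big( \Omega \times \mathbb{Z} \times \mathbb{N} \times \mathbb{N},\ \sum_{n\in\mathbb{N}}^{\oplus} \sum_{m\in\mathbb{N}}^{\oplus} \mathcal{B}(\mathcal{B}(\mathcal{H}_m, \mathcal{H}_n)) \Big) , \tag{90} \]
such that for each couple $(\omega, k) \in \Omega \times \mathbb{Z}$ and each double index $(n, m) \in \mathbb{N} \times \mathbb{N}$, $\Gamma_{knm}(\omega)$ belongs to $\mathcal{B}(\mathcal{B}(\mathcal{H}_m, \mathcal{H}_n))$. $\Gamma$ naturally determines a linear mapping, called for the sake of simplicity also $\Gamma$, from $\mathcal{A}_1$ to $\mathcal{A}_2$, according to the prescription
\[ \Gamma(\mathcal{V})_{knm}(\omega) = \Gamma_{knm}(\omega)(\mathcal{V}_{knm}(\omega)) . \tag{91} \]
Concerning the difference operator, in this case one can apply the rule
\[ \tilde\partial(\Gamma(\mathcal{V}))(\omega, \omega') = \tilde\partial\Gamma(\omega, \omega')(\mathcal{V}(\omega')) + \Gamma(\omega)(\tilde\partial\mathcal{V}(\omega, \omega')) . \]
Proposition B.2. The norm of $\Gamma : \mathcal{A}_1 \to \mathcal{A}_2$ can be estimated as follows,
\[ \|\Gamma\| \le \sup_{\substack{\omega,\omega' \in \Omega \\ \omega \ne \omega'}}\ \sup_{k\in\mathbb{Z}}\ \sup_{(n,m)\in\mathbb{N}\times\mathbb{N}} e^{-\varrho|k|} \big( \|\Gamma_{knm}(\omega)\| + \varphi_2\, \|\tilde\partial\Gamma_{knm}(\omega, \omega')\| \big) . \tag{92} \]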
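Proof. Notice that, if convenient, one can interchange $\omega$ and $\omega'$ in $\|\tilde\partial\mathcal{U}(\omega, \omega')\|$. It holds
\begin{align*}
&\sum_k \sum_m \big( \|\Gamma_{knm}(\omega)(\mathcal{V}_{knm}(\omega))\| + \varphi_2\, \|\tilde\partial(\Gamma_{knm}(\mathcal{V}_{knm}))(\omega, \omega')\| \big)\, e^{|k|/E_2} \\
&\quad \le \sum_k \sum_m \big( \|\mathcal{V}_{knm}(\omega)\| ( \|\Gamma_{knm}(\omega)\| + \varphi_2\, \|\tilde\partial\Gamma_{knm}(\omega, \omega')\| )\, e^{-\varrho|k|} \\
&\qquad\qquad + \varphi_2\, \|\tilde\partial\mathcal{V}_{knm}(\omega, \omega')\|\, \|\Gamma_{knm}(\omega')\|\, e^{-\varrho|k|} \big)\, e^{|k|/E_1} \\
&\quad \le \sup_{\omega,\omega'} \sup_k \sup_{(n,m)} e^{-\varrho|k|} \big( \|\Gamma_{knm}(\omega)\| + \varphi_2\, \|\tilde\partial\Gamma_{knm}(\omega, \omega')\| \big) \\
&\qquad\qquad \times \sum_k \sum_m \big( \|\mathcal{V}_{knm}(\omega)\| + \varphi_1\, \|\tilde\partial\mathcal{V}_{knm}(\omega, \omega')\| \big)\, e^{|k|/E_1} .
\end{align*}
To finish the proof it suffices to apply $\sup_{\omega,\omega'} \sup_n$ to this inequality.
Acknowledgments
P.Š. wishes to gratefully acknowledge the partial support from Grant No. 201/01/01308 of the Grant Agency of the Czech Republic. We thank G. Burdet and Ph. Combe for drawing our attention to the review article of Bhatia and Rosenthal.
References
[1] V. Enss and K. Veselić, Bound states and propagating states for time-dependent Hamiltonians, Ann. Inst. H. Poincaré 39 (1983) 159–191.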
[2] M. Reed and B. Simon, Methods of Modern Mathematical Physics, III, Academic Press, New York, 1979.
[3] J. S. Howland, Scattering theory for Hamiltonians periodic in time, Indiana J. Math. 28 (1979) 471–494.
[4] K. Yajima, Scattering theory for Schrödinger equations with potential periodic in time, J. Math. Soc. Japan 29 (1977) 729–743.
[5] J. Bellissard, Stability and instability in quantum mechanics, in Trends and Developments in the Eighties, eds. Albeverio and Blanchard, World Scientific, Singapore, 1985, pp. 1–106.
[6] M. Combescure, The quantum stability problem for time-periodic perturbations of the harmonic oscillator, Ann. Inst. H. Poincaré 47 (1987) 62–82; Erratum, Ann. Inst. H. Poincaré 47 (1987) 451–454.
[7] J. Nash, The embedding problems for Riemann manifolds, Ann. Math. 63 (1956) 20–63.
[8] J. Moser, On invariant curves of an area preserving mapping of an annulus, Nachr. Akad. Wiss. Göttingen Math. Phys. Kl. 11a(1) (1962) 1–20.
[9] P. Duclos and P. Šťovíček, Floquet Hamiltonian with pure point spectrum, Commun. Math. Phys. 177 (1996) 327–347.
[10] J. S. Howland, Floquet operators with singular spectrum I, II, Ann. Inst. H. Poincaré 49 (1989) 309–334.
[11] G. Nenciu, Adiabatic theory: Stability of systems with increasing gaps, Ann. Inst. H. Poincaré 67(4) (1997) 411–424.
[12] A. Joye, Absence of absolutely continuous spectrum of Floquet operators, J. Stat. Phys. 75 (1994) 929–952.
[13] P. Duclos, P. Šťovíček and M. Vittot, Perturbation of an eigenvalue from a dense point spectrum: A general Floquet Hamiltonian, Ann. Inst. H. Poincaré 71 (1999) 241–301.
[14] P. M. Bleher, H. R. Jauslin and J. L. Lebowitz, Floquet spectrum for two-level systems in quasi-periodic time dependent fields, J. Stat. Phys. 68 (1992) 271.
[15] D. Bambusi and S. Graffi, Time quasi-periodic unbounded perturbations of Schrödinger operators and KAM methods, mp_arc 00-399.
[16] W. Scherer, Superconvergent perturbation method in quantum mechanics, Phys. Rev. Lett. 74 (1995) 1495.
[17] H. R. Jauslin, S. Guerin and S. Thomas, Quantum averaging for driven systems with resonances, Physica A 279 (2000) 432–442.
[18] P. Šeba, Quantum chaos in Fermi-accelerator mode, Phys. Rev. A 41 (1990) 2306–2310.
[19] W. T. Palmer, Banach Algebras and The General Theory of *-Algebras, Vol. I Algebras and Banach Algebras, Encyclopedia of Mathematics and its Applications 49, Cambridge University Press, Cambridge and New York, 1994.
[20] S. Sakai, C*-Algebras and W*-Algebras, Springer-Verlag, New York, 1971.
[21] T. Kato, Perturbation Theory of Linear Operators, Springer-Verlag, New York, 1966.
[22] R. Bhatia and P. Rosenthal, How and why to solve the operator equation AX − XB = Y, Bull. London Math. Soc. 29 (1997) 1–21.
July 18, 2002 9:39 WSPC/148-RMP
00124
Reviews in Mathematical Physics, Vol. 14, No. 6 (2002) 569–584 © World Scientific Publishing Company
A STRONG OPERATOR TOPOLOGY ADIABATIC THEOREM
ALEXANDER ELGART∗ and JEFFREY H. SCHENKER†
∗Department of Physics, †Department of Mathematics, Princeton University, Princeton, NJ 08544, USA
Received 28 September 2001
Revised 25 February 2002
We prove an adiabatic theorem for the evolution of spectral data under a weak additive perturbation in the context of a system without an intrinsic time scale. For continuous functions of the unperturbed Hamiltonian the convergence is in norm while for a larger class of functions, including the spectral projections associated to embedded eigenvalues, the convergence is in the strong operator topology.
Keywords: Quantum theory; adiabatic evolution; spectral theory.
Mathematics Subject Classification 2000: 81Q15, 46N50, 81V70
1. Introduction
The aim of this paper is to give a slightly new perspective on adiabatic theorems related to systems without intrinsic time scales. We consider convergence in the strong operator topology and hope to convince the reader that this is a natural setting for adiabatic theorems when there is no intrinsic notion of "slowness."
The adiabatic theorem of quantum mechanics describes the behavior of a nonautonomous system driven by a slowly altered external field. An illustrative example is the case of a spin-1/2 particle (two level system) coupled to a rotating magnetic field of constant amplitude. The evolution of this system is generated by the time dependent Hamiltonian $H(t) := \vec\sigma \cdot \vec B(t)$, where $\{\sigma_i\}$ are the Pauli matrices and $\vec B$ is the rotating magnetic field. There are two time scales here: the inverse of the rate at which the magnetic field is changing, $t_1 := |\vec B| / |\dot{\vec B}|$, and the intrinsic time scale of the two level system, $t_2 := 1/|\vec B|$, which is linked to the gap between the energy levels of the instantaneous Hamiltonian. It is natural to say that the system changes "slowly" if the ratio $t_2/t_1$ is small. Adiabatic theory [1] implies in this context that if initially the system is in a stationary state (for instance, with the spin parallel to the magnetic field $\vec B(0)$) then it will stay close to an instantaneous stationary state, i.e., parallel to the direction of $\vec B(t)$. "Close" here means that the transition amplitude to the second stationary state, anti-parallel to $\vec B(t)$, is bounded from above by a function of the ratio $t_2/t_1$ which vanishes at zero.
In general, the subject of quantum adiabatic theory is the unitary evolution which solves an initial value problem (Schrödinger equation) of the form
\[ \begin{cases} i\dot U_\tau(t) = H(t/\tau)\, U_\tau(t) , & t \in [0, \tau] \\ U_\tau(0) = 1 \end{cases} \tag{1.1} \]
with a time dependent self adjoint operator $H(s)$ for $s \in [0, 1]$. The parameter $\tau$ is supposed to provide a scale to measure the "slowness" of the system, and adiabatic theory is concerned with the limit $\tau \to \infty$. Strictly speaking, to determine what is meant by "slow," we need a second time scale coming from the structure of the system, e.g., a spectral gap as in the above example. When applicable, the adiabatic theorem states that
\[ \lim_{\tau\to\infty} U_\tau(\tau s)\, f(H(0))\, U_\tau^\dagger(\tau s) = f(H(s)) , \qquad s \in [0, 1] . \tag{1.2} \]
However, to be precise we should indicate in what topology this limit is taken, and this issue is the heart of this work. As far as we know, to date (1.2) has always been understood in the norm sense. This choice has been well justified since for the systems in question the meaning of slowness was intrinsic. For instance, if the function $f$ in (1.2) is a projection to a spectral band separated by a finite gap from the rest of the spectrum, then the adiabatic theorem holds in the norm sense [2, 3]. In this case, the inverse of the spectral gap provides an intrinsic time scale.
There are also examples of systems without a spectral gap but nonetheless a clearly defined intrinsic time scale. The first example is the so called level crossing situation: Imagine that two non-degenerate eigenvalues of the instantaneous Hamiltonian cross each other at some time. Although the spectral gap vanishes at the crossing, there is an intrinsic time scale coming from the relative slope of the eigenvalues. There is an adiabatic theorem in this case which holds in the norm sense [4]. Recently this was extended to systems with an infinite number of crossings in finite time [5]. Another example is a system with dense point spectrum perturbed by a finite rank operator, considered in Ref. [6]. There, the time scale is related not to the gap between energies (which may be arbitrarily small) but the gap multiplied by the overlap between the corresponding eigenstates coupled through the perturbation. Our final example is a system with an eigenvalue of finite degeneracy at the threshold of, say, continuous spectrum as considered in [7, 8]. Here an intrinsic time scale can be extracted from the Hölder continuity of the continuous part of the spectral measure in the vicinity of the eigenvalue.
In all situations above, the adiabatic theorem holds in the norm topology, and, more or less, these examples exhaust the known results on the subject.(a)
The present paper is concerned with the adiabatic theorem for a system without an intrinsic time scale. We are motivated by problems encountered in the analysis of the Quantum Hall Effect (QHE) in which one considers a time dependent perturbation of a system with dense point spectrum. Unlike in Ref. [6], the perturbation is not finite rank which has the consequence, as was pointed out to us by Michael Aizenman, that one does not expect the adiabatic theorem to be true in the norm operator topology in that context. A somewhat simpler example of the phenomenon which occurs there is provided by a direct sum of infinitely many non-interacting systems, each with its own characteristic time scale. Once the adiabatic parameter $\tau$ is larger than the time scale of an individual system, that subsystem is close to the adiabatic limit. However, if the sequence of time scales is unbounded, there is no notion of slowness which holds for the whole system. We discuss this example in more detail in Sec. 5.
(a) There is an extensive literature on adiabatic theory, much of which is not cited here. Several extensive reviews have appeared recently (see [8, 9]).
We consider in this paper a family of Hamiltonians of the form
\[ H_\tau(t/\tau) = H_0 + \frac{1}{\tau} \Lambda(t/\tau) \tag{1.3} \]
where $H_0$ and $\Lambda(s)$, for $s \in [0, 1]$, are self adjoint operators. The particular form for the time dependence is formulated with the QHE in mind. The usual adiabatic framework involves a Hamiltonian which depends on $\tau$ only through the rescaling of time, see (1.1). The evolution considered here is equivalent, via a unitary transformation, to the solution of (1.1) with $H(s) = V(s) H_0 V^\dagger(s)$, where $\Lambda(s)$ is the generator of $V(s)$:
\[ i\dot V(s) = \Lambda(s) V(s) , \qquad V(0) = 1 . \tag{1.4} \]
In the physics literature, this description of the dynamics is referred to as the "interaction picture," and has proved useful in many situations.
We discuss here the limit $\tau \to \infty$ of a solution, $A_\tau(t) = U_\tau(t) A(0) U_\tau^\dagger(t)$, to the associated Heisenberg equation
\[ i\dot A_\tau(t) = [H_\tau(t/\tau) , A_\tau(t)] \tag{1.5} \]
when the initial observable is a function of $H_0$, i.e., $A(0) = f(H_0)$. Our main result, Theorem 2.1, states that
\[ U_\tau(\tau)\, f(H_0)\, U_\tau^\dagger(\tau) \to f(H_0) \tag{1.6} \]
for a wide class of functions $f$. The topology in which (1.6) holds depends on the continuity of $f$ relative to the spectral properties of $H_0$: for continuous functions we obtain norm convergence while for a class of discontinuous functions we obtain strong operator convergence.
Let us recall that a family $\tau \mapsto A_\tau$ of operators converges to $A$ in the strong operator topology (SOT) if
\[ \lim_{\tau\to\infty} A_\tau \psi = A \psi \tag{1.7} \]
for every $\psi \in \mathcal{H}$ and converges in norm if
\[ \lim_{\tau\to\infty} \|A_\tau - A\| = 0 . \tag{1.8} \]
We denote SOT convergence by "SOT-lim $A_\tau = A$".
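The difference between (1.7) and (1.8) can be made concrete with a small numerical sketch. The operator used here is an assumption made purely for illustration and is not taken from the paper: on $\ell^2(\{1, 2, \ldots\})$ let $A_\tau$ be diagonal with entries $k/(k+\tau)$. For a fixed vector $\psi$ one has $A_\tau \psi \to 0$, so $A_\tau \to 0$ in the SOT, but the supremum of the diagonal entries equals 1 for every $\tau$, so there is no norm convergence. The code works with finite truncations, and the function names are hypothetical.

```python
import math

def apply_norm(tau, psi):
    """|| A_tau psi || for the diagonal operator with entries k / (k + tau),
    where psi lists the components psi(1), psi(2), ..."""
    return math.sqrt(sum((k / (k + tau) * x) ** 2
                         for k, x in enumerate(psi, start=1)))

def operator_norm(tau, n_sites):
    """Operator norm of the truncation to the first n_sites sites,
    i.e. the largest diagonal entry."""
    return n_sites / (n_sites + tau)

# A fixed square-summable vector psi(k) = 1/k, truncated to 2000 components.
psi = [1.0 / k for k in range(1, 2001)]

# For fixed psi the norms decrease towards zero as tau grows (SOT behavior).
vector_norms = [apply_norm(tau, psi) for tau in (1.0, 10.0, 100.0, 1000.0)]

# For fixed tau the operator norm approaches 1 as more sites are included.
op_norms = [operator_norm(100.0, n) for n in (10**2, 10**4, 10**6)]
```

Each site $k$ has its own "time scale" $k$, and the site is close to its limit only once $\tau$ is much larger than $k$; since these scales are unbounded, no single $\tau$ is large for the whole system.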
The remainder of this paper is organized as follows. Sections 2, 3 and 4 are devoted to the statement and proof of our main result, Theorem 2.1. In Sec. 5 we describe an example which shows that the norm topology is inadequate when we consider discontinuous functions of $H_0$. Finally, in Sec. 6 we present a stronger result which holds when $H_0$ has pure point spectrum.
2. The Theorem and All We Can Show with the Resolvent
Before we state Theorem 2.1, let us recall the definition of certain classes of functions $f : \mathbb{R} \to \mathbb{C}$:
(1) Let $C_b$ denote the bounded continuous functions.
(2) Let $C_0$ denote those functions in $C_b$ which vanish at $\pm\infty$.
(3) Let $BV$ denote the functions of bounded variation, i.e., functions $f$ for which
\[ \mathrm{Var}(f) := \sup_{n\ge 1}\ \sup_{x_0 < \cdots < x_n \in \mathbb{R}}\ \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})| < \infty . \tag{2.1} \]
A function in $BV$ can have only countably many points of discontinuity. We direct the reader to [10, Chap. 3] for a detailed discussion of $BV$.
Theorem 2.1. Let $H_0$ be a self adjoint operator and suppose that the time evolution $U_\tau$ satisfies the initial value problem (1.1) with $H_\tau(t/\tau) = H_0 + (1/\tau)\Lambda(t/\tau)$ where $\Lambda(\cdot)$ is a self adjoint family which is $L^1$ in norm: $\int_0^1 ds\, \|\Lambda(s)\| < \infty$. Given a measurable function $f$, consider the statement
\[ \lim_{\tau\to\infty} W_\tau(s)\, f(H_0)\, W_\tau^\dagger(s) = f(H_0) , \quad \text{uniformly for } s \in [0, 1] , \tag{2.2} \]
where $W_\tau$ is the evolution at scaled time, $W_\tau(s) = U_\tau(\tau \cdot s)$.
(1) If $f \in C_0$ then (2.2) is true in the operator norm topology.
(2) If $f = g + h$ with $g \in C_b$ and $h \in BV$ then (2.2) is true in the strong operator topology.
Remark 2.2.
(1) Operators $A_\tau(s)$ are said to converge uniformly to $A$ in the strong operator topology if
\[ \lim_{\tau\to\infty} \sup_s \|A_\tau(s)\psi - A\psi\| = 0 \tag{2.3} \]
for every $\psi \in \mathcal{H}$. Uniform norm convergence is defined similarly.
(2) Among the usual operator topologies, i.e., the norm topology as well as the strong and weak operator topologies, the strong operator topology is the strongest in which we can expect an adiabatic limit for discontinuous functions of $H_0$. In Sec. 5 we describe an elementary example of a system for which $W_\tau(s) f(H_0) W_\tau^\dagger(s)$ fails to converge in the norm topology.
(3) If the operator $H_0$ is unbounded, the distinction between $C_0$ and $C_b$ is meaningful. Functions in $C_b$ may be "discontinuous at infinity" which explains the loss of norm convergence.
(4) Among the functions of bounded variation are the Kronecker delta functions: $\delta_E(x) = 1$ if $x = E$ and 0 otherwise. Thus we obtain an adiabatic evolution for the spectral projection associated to any eigenvalue, even if it has infinite degeneracy and is embedded in the essential spectrum!
(5) As described above, the standard adiabatic theorems describe the limiting behavior of the Schrödinger evolution for a system having a gap in its spectrum with initial data being a spectral projection onto an energy band. A projection onto a spectral band is a continuous function of $H_0$, thus the convergence occurs in the norm topology. In such a setting it is possible to find an explicit bound on the rate of convergence in (2.2) (see, for example, (2.12) and Lemma 3.3).
(6) Schrödinger equations with a Hamiltonian of the form considered here find direct application in the description of the motion of a quantum particle in a time dependent potential energy. In that case, $H_0$ describes the motion of the particle in the absence of time dependent terms and is generally the Laplacian or some perturbation thereof, possibly discretized, the underlying Hilbert space being $\ell^2(\mathbb{Z}^d)$ or $L^2(\mathbb{R}^d)$. The time dependent term $\Lambda(t)$ is the operator of multiplication by a bounded function $\Lambda(x, t)$. Theorem 2.1 is relevant to the adiabatic evolution of an ensemble of non-interacting particles with Fermi statistics. The observables, in this case, are the Fermi–Dirac distributions $F_{\mu,\beta}(H_0) = \frac{1}{1 + e^{\beta(H_0 - \mu)}}$ at positive temperatures and the spectral projections $\chi(H_0 \le \mu)$ and/or $\chi(H_0 < \mu)$ at zero temperature. We obtain an adiabatic evolution even if there is an eigenvalue at the chemical potential $\mu$!
The heart of the matter lies in the proof of Theorem 2.1 under the additional assumption that $\Lambda$ is boundedly differentiable in norm, i.e., that
\[ \dot\Lambda(s) := \lim_{h\to 0} \frac{\Lambda(s + h) - \Lambda(s)}{h} \tag{2.4} \]
exists in the norm topology for each $s \in (0, 1)$ and $\sup_s \|\dot\Lambda(s)\|$ is finite. The extension to general $\Lambda$ is accomplished by a standard mollifier argument. Specifically, we choose a positive smooth function $\phi(s)$ with compact support such that $\int \phi = 1$ and set $\phi_\epsilon(s) = \epsilon^{-1} \phi(s/\epsilon)$. Then
\[ \Lambda_\epsilon(s) = \int ds'\, \phi_\epsilon(s - s')\, \Lambda(s') , \tag{2.5} \]
is boundedly differentiable and
\[ \lim_{\epsilon\to 0} \int_0^1 ds\, \|\Lambda_\epsilon(s) - \Lambda(s)\| = 0 . \tag{2.6} \]
Let $U_{\tau,\epsilon}$ be the solution to the IVP (1.1) with $H_\tau(t/\tau) = H_0 + (1/\tau)\Lambda_\epsilon(t/\tau)$ and set $W_{\tau,\epsilon}(s) = U_{\tau,\epsilon}(\tau \cdot s)$. Then,
\[ \frac{d}{ds}\, W_{\tau,\epsilon}^\dagger(s) W_\tau(s) = W_{\tau,\epsilon}(s)^\dagger\, (\Lambda_\epsilon(s) - \Lambda(s))\, W_\tau(s) . \tag{2.7} \]
From this it follows that
\[ \|W_{\tau,\epsilon}(s) - W_\tau(s)\| \le \int_0^1 ds\, \|\Lambda_\epsilon(s) - \Lambda(s)\| \to 0 , \tag{2.8} \]
i.e., $W_{\tau,\epsilon}(s)$ converges to $W_\tau(s)$ uniformly in $s$ and $\tau$. By a standard "2-$\epsilon$" argument the theorem now follows for $\Lambda$ in $L^1$ once it is verified for $\Lambda_\epsilon$. Hence, it suffices to show Theorem 2.1 for differentiable $\Lambda$.
Throughout the rest of the paper, $\Lambda$ will denote a uniformly bounded self-adjoint family which is differentiable in the norm topology with a uniformly bounded derivative $\dot\Lambda$.
The remainder of this section is devoted to the proof of those parts of Theorem 2.1 which follow from norm resolvent convergence. This part of the proof is very elementary but is also unrelated to the arguments in the subsequent sections. In Sec. 3, we present Lemma 3.1, which states that the portion of Theorem 2.1 related to functions of bounded variation ($BV$) may be reduced to a statement about spectral projections. A proof of that statement about spectral projections, based on ideas that go back to Kato [11], is also given in Sec. 3. In Sec. 4 we prove Lemma 3.1.
For a great many functions $f$, the conclusion of Theorem 2.1, i.e., (2.2), follows from well known convergence theorems and a simple formula, (2.10), which shows that
\[ \sup_{s\in[0,1]} \|W_\tau(s) (H_0 - z)^{-1} W_\tau^\dagger(s) - (H_0 - z)^{-1}\| \to 0 \tag{2.9} \]
for every $z \notin \mathbb{R}$, which is to say that $W_\tau(s) H_0 W_\tau^\dagger(s) \to H_0$ uniformly in $s$ in the "norm resolvent sense". The implications of norm resolvent convergence for Theorem 2.1 are that
(1) Equation (2.2) holds in the norm topology for $f \in C_0$ [12, Theorem VIII.20].
(2) Equation (2.2) holds in the strong operator topology for $f \in C_b$ or when $f$ is the characteristic function of an open interval $(a, b)$ provided that $a$ and $b$ are not eigenvalues of $H_0$. This follows from [12, Theorems VIII.20 and VIII.24] since "strong resolvent convergence" is implied by "norm resolvent convergence."
What is remarkable is that with some additional work we can prove (2.2), for example, when $f$ is the characteristic function of an open interval $(a, b)$ and one or both of $a$, $b$ is an eigenvalue with arbitrary degeneracy.
To verify (2.9), we use the identity
\[ (H_\tau(s) - z)^{-1} - W_\tau(s) (H_\tau(0) - z)^{-1} W_\tau(s)^\dagger = W_\tau(s) \left[ \int_0^s W_\tau(t)^\dagger \left( \frac{d}{dt} (H_\tau(t) - z)^{-1} \right) W_\tau(t)\, dt \right] W_\tau(s)^\dagger , \tag{2.10} \]
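where $H_\tau(s) = H_0 + \frac{1}{\tau} \Lambda(s)$. Equation (2.10) follows from the fundamental theorem of calculus and the observation that
\[ \frac{d}{dt} \left( W_\tau(t)^\dagger (H_\tau(t) - z)^{-1} W_\tau(t) \right) = W_\tau(t)^\dagger \left( \frac{d}{dt} (H_\tau(t) - z)^{-1} \right) W_\tau(t) . \tag{2.11} \]
Now, (2.9) follows from (2.10) because the latter implies that
\[ \|W_\tau(s) (H_0 - z)^{-1} W_\tau(s)^\dagger - (H_0 - z)^{-1}\| \le \frac{C}{(\mathrm{Im}\, z)^2}\, \frac{1}{\tau} , \tag{2.12} \]
since
\[ \frac{d}{dt} (H_\tau(t) - z)^{-1} = -\frac{1}{\tau}\, (H_\tau(t) - z)^{-1}\, \dot\Lambda(t)\, (H_\tau(t) - z)^{-1} , \tag{2.13} \]
and
\[ (H_\tau(s) - z)^{-1} = (H_0 - z)^{-1} - \frac{1}{\tau}\, (H_\tau(s) - z)^{-1}\, \Lambda(s)\, (H_0 - z)^{-1} . \tag{2.14} \]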
Before we proceed, let us describe an example which demonstrates that we cannot hope to prove (2.2) for general $f$ in $BV$ using only the fact that $W_\tau H_0 W_\tau^\dagger$ converges to $H_0$ in the norm resolvent sense. For this purpose it is sufficient to produce a sequence of unitary operators $V_n$ such that $V_n H_0 V_n^\dagger$ converges to $H_0$ but nonetheless
\[ \text{SOT-}\lim_{n\to\infty} V_n f(H_0) V_n^\dagger \ne f(H_0) \tag{2.15} \]
for some function $f \in BV$.
To this end, consider the self adjoint operator on $\ell^2(\mathbb{Z})$ given in Dirac notation by $H_0 = \sum_{m \ne 0} \frac{1}{m} |m\rangle\langle m|$, and for each $n$ let $V_n$ be the unitary on $\ell^2(\mathbb{Z})$ which "swaps 0 and $n$", i.e.,
\[ (V_n \psi)(m) = \begin{cases} \psi(m) & \text{if } m \ne 0, n \\ \psi(0) & \text{if } m = n \\ \psi(n) & \text{if } m = 0 \end{cases} . \tag{2.16} \]
Then $V_n H_0 V_n^\dagger = H_0 + \frac{1}{n} (|0\rangle\langle 0| - |n\rangle\langle n|)$. So $V_n H_0 V_n^\dagger \to H_0$ in norm, and thus in norm resolvent sense. Yet,
\[ V_n \delta(H_0) V_n^\dagger = |n\rangle\langle n| \xrightarrow{\ \mathrm{SOT}\ } 0 \ne \delta(H_0) , \qquad n \to \infty , \tag{2.17} \]
where $\delta(H_0) = |0\rangle\langle 0|$ is the spectral projection of $H_0$ associated to eigenvalue 0.
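The swap example can be traced numerically on a finite window of sites. The sketch below is a minimal illustration under stated assumptions: the window size `M`, the test vector $\psi(m) = 2^{-|m|}$ and the function names are hypothetical choices, and all operators are represented only through their diagonal entries, which is enough because $V_n$ permutes the basis vectors.

```python
def h0_diag(m):
    """Diagonal entry of H_0 at site m: 1/m for m != 0, and 0 at m = 0."""
    return 0.0 if m == 0 else 1.0 / m

def swap(n, m):
    """The permutation of sites underlying V_n: exchanges 0 and n."""
    if m == 0:
        return n
    if m == n:
        return 0
    return m

def conjugated_diag(n, m):
    """Diagonal entry of V_n H_0 V_n^dagger at site m."""
    return h0_diag(swap(n, m))

M = 200                      # sites -M..M; must satisfy M >= n below
sites = range(-M, M + 1)

def norm_difference(n):
    """|| V_n H_0 V_n^dagger - H_0 || for these diagonal operators."""
    return max(abs(conjugated_diag(n, m) - h0_diag(m)) for m in sites)

def psi(m):
    """A fixed square-summable test vector."""
    return 2.0 ** (-abs(m))

def projected_norm(n):
    """|| V_n delta(H_0) V_n^dagger psi || = || |n><n| psi || = |psi(n)|."""
    return abs(psi(n))

norm_diffs = [norm_difference(n) for n in (5, 50, 150)]
proj_norms = [projected_norm(n) for n in (5, 20, 40)]
```

The first list reproduces the norm estimate $1/n$ behind the norm resolvent convergence, while the second shows $|n\rangle\langle n| \psi \to 0$ for the fixed vector $\psi$, even though $\delta(H_0)\psi = \psi(0)|0\rangle$ does not vanish.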
3. SOT Convergence for Spectral Projections
The claim that (2.2) holds whenever $f \in BV$ is, at heart, a statement about spectral projections as is indicated by the following lemma:
Lemma 3.1. Equation (2.2) holds in the SOT for every $f \in BV$ if and only if it holds for all $f$ of the form $f(x) = \chi(x \le E)$ or $f(x) = \chi(x \ge E)$ with any $E \in \mathbb{R}$.
We postpone the proof of Lemma 3.1 to Sec. 4 and focus here on proving (2.2) with $f(x) = \chi(x \ge E)$ and $f(x) = \chi(x \le E)$ for every $E$ in $\mathbb{R}$. In what follows we fix $E$ and take $P = \chi(H_0 \le E)$. The other case, $\chi(H_0 \ge E)$, is handled in exactly the same way by changing $\le$ to $\ge$ in the appropriate places. We must show that for any $\psi \in \mathcal{H}$
\[ \lim_{\tau\to\infty} \sup_{s\in[0,1]} \|(W_\tau(s) P W_\tau(s)^\dagger - P)\psi\| = 0 . \tag{3.1} \]
Our argument is stated most readily with the propagator $W_\tau(t, s) = W_\tau(t) W_\tau^\dagger(s)$; note that $W_\tau(s) P W_\tau^\dagger(s) = W_\tau(s, 0) P W_\tau(0, s)$ and $P = W_\tau(s, s) P W_\tau(s, s)$. We would like to compare $W_\tau(s, t)$ with the propagator associated to $H_0$, so we define
\[ \Omega_\tau(t, s) := e^{i\tau(t-s)H_0}\, W_\tau(t) W_\tau(s)^\dagger . \tag{3.2} \]
Since $\Omega_\tau(t, s)$ is unitary and the exponential of $H_0$ commutes with $P$,
\[ \|(W_\tau(s) P W_\tau(s)^\dagger - P)\psi\| = \|(\Omega_\tau(0, s)^\dagger P\, \Omega_\tau(0, s) - P)\psi\| = \|[P, \Omega_\tau(0, s)]\psi\| . \tag{3.3} \]
Finally, because $P$ is a projection
\[ [P, \Omega_\tau(t, s)] = P\, \Omega_\tau(t, s) \bar P - \bar P\, \Omega_\tau(t, s) P , \tag{3.4} \]
where $\bar P = 1 - P$. Therefore, (3.1) will follow if we can verify that both terms on the right side of (3.4) uniformly converge to zero in the SOT.
Consider the first term. Let $P_\Delta := \chi(E < H_0 < E + \Delta)$, then
\[ P\, \Omega_\tau(t, s) \bar P = P\, \Omega_\tau(t, s) (\bar P - P_\Delta) + P\, \Omega_\tau(t, s) P_\Delta . \tag{3.5} \]
We will see below (Lemma 3.3) that the operator norm of $P\, \Omega_\tau (\bar P - P_\Delta)$ is uniformly bounded by a constant times $1/\tau\Delta$. Thus given $\psi \in \mathcal{H}$
\[ \|P\, \Omega_\tau(t, s) \bar P \psi\| \le \frac{C}{\tau\Delta} \|\psi\| + \|P_\Delta \psi\| . \tag{3.6} \]
If, for instance, $\Delta = 1/\sqrt{\tau}$ then both terms converge to zero since SOT-lim $P_\Delta = 0$, whether or not there is an eigenvalue at $E$.
The second term of (3.4) requires a little more care. Because $E$ may be an eigenvalue, we need to isolate the contribution from the associated projection $P_E = \chi(H_0 = E)$. Let $P'_\Delta = \chi(E - \Delta < H_0 < E)$ and consider
\[ \bar P\, \Omega_\tau(t, s) P = \bar P\, \Omega_\tau(t, s) (P - P'_\Delta - P_E) + \bar P\, \Omega_\tau(t, s) P'_\Delta + \bar P\, \Omega_\tau(t, s) P_E . \tag{3.7} \]
As above, if we take $\Delta = 1/\sqrt{\tau}$ then the first and second terms tend uniformly to zero. That the third term also converges to zero is the content of the following lemma:
Lemma 3.2. Let $P_E := \chi(H_0 = E)$. Then $(1 - P_E)\, \Omega_\tau(t, s) P_E$ uniformly tends to zero in the strong operator topology.
Proof. The operator $\Omega_\tau(t, s)$ satisfies a Volterra equation
\[ \Omega_\tau(t, s) = 1 + \int_s^t dr\, K_\tau(r, s)\, \Omega_\tau(r, s) , \tag{3.8} \]
with
\[ K_\tau(r, s) = -i\, e^{i\tau(r-s)H_0}\, \Lambda(r)\, e^{i\tau(s-r)H_0} . \tag{3.9} \]
By iterating (3.8) we obtain a norm convergent series
\[ \Omega_\tau(t, s) = \sum_{n=0}^{\infty} A_\tau^n(t, s) \tag{3.10} \]
where
\[ A_\tau^n(t, s) = \int \cdots \int_{s \le r_n \le \cdots \le r_1 \le t} dr_1 \cdots dr_n\, K_\tau(r_1, s) \cdots K_\tau(r_n, s) . \tag{3.11} \]
Since $A_\tau^n(t, s)$ is obtained by integrating a product of $n$ factors of $K$ over a simplex of volume $(t - s)^n / n!$ we have the elementary norm bound
\[ \|A_\tau^n(t, s)\| \le \frac{1}{n!}\, \kappa^n (t - s)^n , \tag{3.12} \]
where $\kappa = \sup_r \|\Lambda(r)\|$. Using dominated convergence, we see from (3.10), (3.12), that it suffices to show for each $n$ that $\bar P_E A_\tau^n(t, s) P_E \to 0$ uniformly in the SOT. This may be proved as follows. First note that
\[ P_E K_\tau(r, s) P_E = -i P_E \Lambda(r) P_E , \tag{3.13} \]
\[ \bar P_E K_\tau(r, s) P_E = -i \bar P_E\, e^{i\tau(r-s)(H_0 - E)}\, \Lambda(r) P_E . \tag{3.14} \]
Next observe that
\[ \int_s^r dr'\, \bar P_E\, e^{i\tau(r'-s)(H_0 - E)} \to 0 \tag{3.15} \]
uniformly in the strong operator topology from which it follows via integration by parts that
\[ \int_s^r dr'\, \bar P_E\, e^{i\tau(r'-s)(H_0 - E)}\, B(r') \to 0 \tag{3.16} \]
for any differentiable family of operators $B(r)$ which does not depend on $\tau$.
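A scalar analogue may help to see what happens in (3.15) and (3.16). On a spectral component of $H_0$ at energy $E + x$ with $x \ne 0$, the operator exponential reduces to the phase $e^{i\tau(r'-s)x}$, so the integrals become oscillatory integrals with frequency $\omega = \tau x$. The sketch below is an illustration under assumptions of our own (the weight $b(r) = 1 + r$, the interval $[0, 1]$, the midpoint rule and the function names are not taken from the paper). It compares a numerically computed integral with the bound $(|b(0)| + |b(1)| + \int_0^1 |b'|)/\omega$ obtained from one integration by parts.

```python
import cmath

def oscillatory_integral(omega, weight, n_steps=20000):
    """Midpoint-rule approximation of int_0^1 exp(i*omega*r) * weight(r) dr."""
    h = 1.0 / n_steps
    total = 0j
    for j in range(n_steps):
        r = (j + 0.5) * h
        total += cmath.exp(1j * omega * r) * weight(r)
    return total * h

def weight(r):
    """Hypothetical smooth weight b(r) = 1 + r, playing the role of B(r)."""
    return 1.0 + r

def by_parts_bound(omega):
    """(|b(0)| + |b(1)| + int_0^1 |b'(r)| dr) / omega for b(r) = 1 + r."""
    return (1.0 + 2.0 + 1.0) / omega

# omega = tau * x: either tau grows at fixed x, or x is bounded below by a gap.
omegas = (10.0, 100.0, 1000.0)
values = [abs(oscillatory_integral(w, weight)) for w in omegas]
bounds = [by_parts_bound(w) for w in omegas]
```

The bound degenerates as $x \to 0$, which is why (3.15) holds only in the strong operator topology when $E$ is not isolated; if the relevant energies stay at distance at least $\Delta$ from each other, the same computation gives a uniform decay of order $1/(\tau\Delta)$.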
Now consider the expression for $A_\tau^n$ obtained by inserting $1 = \bar P_E + P_E$ between the two rightmost factors of $K_\tau$ in the integral which appears in (3.11). Proceed with the term obtained from $P_E$ by inserting $\bar P_E + P_E$ between the next two factors of $K$. Continue from right to left in this way, expanding only the terms obtained from $P_E$. We obtain an expression for $\bar P_E A_\tau^n(t, s) P_E$ as a sum of $n$ terms, the $j$th term being
\[ (-i)^j \int \cdots \int_{s \le r_n \le \cdots \le r_1 \le t} dr_1 \cdots dr_n\, \bar P_E K_\tau(r_1, s) \cdots K_\tau(r_{n-j}, s) \times \bar P_E\, e^{i\tau(r_{n-j+1}-s)(H_0 - E)}\, \Lambda(r_{n-j+1}) P_E \cdots \Lambda(r_n) P_E , \tag{3.17} \]
which uniformly converges to zero by virtue of (3.16). Since $\bar P_E A_\tau^n P_E$ is a finite linear combination of terms which uniformly tend to zero it does so as well.
It remains to show that $\|P\, \Omega_\tau(t, s) (\bar P - P_\Delta)\|$ is bounded by a constant times $1/\tau\Delta$.
Lemma 3.3. Let $P_1 := \chi(H_0 \le E_1)$ and $P_2 := \chi(H_0 \ge E_2)$ with $E_2 > E_1$. Then
\[ \|P_1\, \Omega_\tau(t, s) P_2\| \le \frac{C}{\Delta\tau} , \tag{3.18} \]
where $\Delta = E_2 - E_1$ and $C$ is a constant which does not depend on $E_1$ or $E_2$. The same inequality holds with $P_1$, $P_2$ interchanged.
Proof. As in the proof of Lemma 3.2 the idea is to prove a bound on each term $A_\tau^n$ in the expansion for $\Omega_\tau$. In this case, we will show that
\[ \|P_1 A_\tau^n(s, t) P_2\| \le \frac{\alpha^n\, n}{\tau\Delta\, (n-1)!} \tag{3.19} \]
where $\alpha$ is a constant independent of $s$, $t$. Summing these bounds clearly implies (3.18), see (3.10).
The main step is to show that
\[ \left\| P_1 \int_t^s dr\, K_\tau(r, s)\, P_2 \right\| \le \frac{C}{\Delta\tau} , \tag{3.20} \]
and the same with $P_1$ and $P_2$ interchanged. The idea is that, since $K_\tau(r, s) = -i\, e^{i\tau(r-s)H_0} \Lambda(r) e^{i\tau(s-r)H_0}$ and the spectral supports of $P_1$ and $P_2$ are distance $\Delta$ apart, the integral over $r$ has a highly oscillating phase of order $\tau\Delta$. For a rigorous argument, however, it is convenient to use a commutator equation and integration by parts to extract (3.20). This method goes back to Kato [11].
The commutator $[H_0, X]$ might be ill defined if $H_0$ is unbounded. Thus we introduce a cutoff and work instead with $[H_0, P_M X P_M]$ where $P_M = \chi(-M < H_0 < M)$ and $M \in (0, \infty)$. At the end of the argument we take $M \to \infty$.
The $X$ we have in mind is
\[ X(r) := \frac{1}{2\pi i} \int_\Gamma dz\, P_1 R(z) \Lambda(r) R(z) P_2 \tag{3.21} \]
where $R(z) := (H_0 - z)^{-1}$ and the contour $\Gamma$ is the line $\{E' + i\eta : \eta \in \mathbb{R}\}$ with $E' = (E_2 + E_1)/2$. A simple calculation yields
\[ [H_0, P_M X(r) P_M] = P_M P_1 \Lambda(r) P_2 P_M . \tag{3.22} \]
Therefore
\begin{align*}
P_M P_1 K_\tau(r, s) P_2 P_M &= [H_0, e^{i\tau(r-s)H_0} P_M X(r) P_M e^{i\tau(s-r)H_0}] \\
&= \frac{1}{i\tau} \left( \frac{d}{dr} \big( P_M e^{i\tau(r-s)H_0} X(r) e^{i\tau(s-r)H_0} P_M \big) - P_M e^{i\tau(r-s)H_0} \dot X(r) e^{i\tau(s-r)H_0} P_M \right) . \tag{3.23}
\end{align*}
However $X(r)$ and $\dot X(r)$ are uniformly bounded, $\|X(r)\|, \|\dot X(r)\| \le C/\Delta$, so integrating (3.23) yields
\[ \left\| P_M P_1 \int_t^s dr\, K_\tau(r, s)\, P_2 P_M \right\| \le \frac{C}{\Delta\tau} . \tag{3.24} \]
In the limit $M \to \infty$ this implies (3.20) by lower semi-continuity of the norm. The second case with $P_1$ and $P_2$ interchanged follows with an obvious modification of $X$.
The rest of the argument is similar to the proof of Lemma 3.2. We insert a decomposition of the identity $1 = Q + \bar Q$ between the factors of $K$ in the integral expression for $A_\tau^n$, (3.11). To apply (3.20), we should maintain a spectral gap between the projections which sit to the left and right of $K$. Therefore we define $Q_j := \chi(H_0 \le E_1 + \frac{j}{n}\Delta)$ for $j = 0, \ldots, n$ and insert $1 = Q_j + \bar Q_j$ between the $j$th and $(j+1)$th factors of $K$. With these insertions, $P_1 A_\tau^n P_2$ breaks into $2^n$ terms and each term includes at least one factor of the type $Q_j K_\tau(r_{j+1}, s) \bar Q_{j+1}$ or $\bar Q_j K_\tau(r_{j+1}, s) Q_{j+1}$ where there is a gap of size $\Delta/n$ between the spectral supports of the two projections. We apply integration by parts to the integral over $r_{j+1}$ to obtain a factor which may be bounded by (3.20):
\[ \int_0^{r_j} dr_{j+1}\, Q_j K_\tau(r_{j+1}, s) \bar Q_{j+1} B(r_{j+1}) = \int_0^{r_j} dr_{j+1}\, Q_j \int_{r_{j+1}}^{r_j} dr'\, K_\tau(r', s) \bar Q_{j+1} \dot B(r_{j+1}) . \tag{3.25} \]
Elementary norm estimates and (3.20) now show that each of the $2^n$ terms is bounded by $n\beta^n / (\Delta\tau\, (n-1)!)$ for some $\beta$ which implies (3.19) with $\alpha = 2\beta$.
4. Integration by Parts and the Proof of Lemma 3.1
Turning to the proof of Lemma 3.1, we note that the spectral theorem provides the representation
\[ f(H_0) = \int f(E)\, dP_0(E) \tag{4.1} \]
valid for bounded measurable $f$. The goal is to integrate this expression by parts, thereby obtaining an expression involving $df$ and $P_0(E) = \chi(H_0 \le E)$. This argument works precisely when $f \in BV$, as we shall now explain. The projection valued measure $dP_0(E)$ is the differential of $P_0(E) = \chi(H_0 \le E)$, which is of bounded variation in the strong operator topology. That is, for any $\psi \in \mathcal H$,
$$\sup_{n \ge 1}\ \sup_{E_0 < \cdots < E_n \in \mathbb R}\ \sum_{j=1}^{n} \| P_0(E_j)\psi - P_0(E_{j-1})\psi \| < \infty. \tag{4.2}$$
We could equally well work with $P_0(E) = \chi(H_0 < E)$ or a number of other choices; the distinction is meaningful only if $H_0$ has point spectrum. Since the function $P_0$ is SOT-continuous from the left at every $E$, i.e. $P_0(E - 0) = P_0(E)$, we may integrate (4.1) by parts whenever $f \in BV$ and $f$ is everywhere continuous from the right:ᵇ
$$f(H_0) = f(\infty) 1 - \int df(E)\, P_0(E), \qquad f \in BV \text{ and continuous from the right.} \tag{4.3}$$
For general $f \in BV$ this formula is replaced by
$$f(H_0) = f(\infty) 1 - \int df(E)\, \chi(H_0 \le E) + \sum_{E \in \mathbb R} (f(E) - f(E+0))\, \chi(H_0 = E). \tag{4.4}$$
Note that $\sum_{E \in \mathbb R} |f(E) - f(E+0)| \le \operatorname{Var}(f) < \infty$. In particular, there can be only countably many $E \in \mathbb R$ for which $f(E) \ne f(E+0)$.

Now suppose that
$$\operatorname{SOT-lim}\ W_\tau(s) A W_\tau(s)^\dagger = A \tag{4.5}$$
uniformly in $s$ whenever $A = \chi(H_0 \le E)$ or $A = \chi(H_0 \ge E)$ with $E \in \mathbb R$. Since
$$\chi(H_0 = E) = \chi(H_0 \le E) + \chi(H_0 \ge E) - 1, \tag{4.6}$$
(4.5) also holds with $A = \chi(H_0 = E)$. Now given $f \in BV$, use (4.4) to express $f(H_0)$ and find that
$$\operatorname{SOT-lim}\ W_\tau(s) f(H_0) W_\tau(s)^\dagger = f(H_0) \tag{4.7}$$
uniformly in $s$ by dominated convergence.

ᵇ The extension of integration by parts to functions in BV is a standard part of real analysis; we direct the reader to [10, Chap. 3] for details.

5. Why the Norm Topology is Inadequate: An Example

The following example, due to Michael Aizenman, is motivated by the consideration of systems with dense point spectrum. We begin with a comment which, although mathematically trivial, already contains a key observation. If a family of operators $A_\tau$ converges (as $\tau \to \infty$) in the strong operator topology, then for any vector $\psi$ the family of vectors $A_\tau \psi$ is convergent (this is the very definition of SOT convergence), but nothing can be said regarding the rate of convergence. In fact the essential difference with norm convergence is that in the norm case the vectors $A_\tau \psi$ all converge at the same rate.

For a specific example, consider a countable collection of non-interacting two-level systems, each perturbed by a weak perturbation (strength $\sim 1/\tau$):
$$H_\tau = H_0 + \frac{1}{\tau} \Lambda, \qquad H_0 = \sum_{k=1}^{\infty} \oplus\, m_k \sigma_z, \qquad \Lambda = \sum_{k=1}^{\infty} \oplus\, \sigma_x, \tag{5.1}$$
with $\sigma_z$, $\sigma_x$ the Pauli spin matrices and $m_k = 1/k$. The perturbation $\Lambda$ is time independent, but Theorem 2.1 still applies. Of course, the unitary evolution $U_\tau$ associated with the Hamiltonian $H_\tau$ decomposes into a direct sum of two by two matrices $U^k_\tau$, each generated by $H_{k,\tau} = (1/k)\sigma_z + (1/\tau)\sigma_x$. Let us choose $f(H_0) = P_0$ to be the spectral projection onto negative energies: $P_0 = \chi(H_0 < 0)$. We will show that
$$\limsup_{\tau \to \infty} \| U_\tau(\tau s) P_0 U_\tau^\dagger(\tau s) - P_0 \| \ge \alpha(s) \tag{5.2}$$
where $\alpha(s) > 0$ for every $s \in (0, 1)$, although, in accordance with Theorem 2.1,
$$\operatorname{SOT-lim}\ U_\tau(\tau s) P_0 U_\tau^\dagger(\tau s) = P_0. \tag{5.3}$$
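Indeed, consider the particular sequence $\tau_n := n$. For each $n$, the evolution of the two-level system with $m_n := 1/n$ obeys
$$i \dot U^n_{\tau_n}(t) = \frac{1}{n} (\sigma_z + \sigma_x)\, U^n_{\tau_n}(t). \tag{5.4}$$
Thus, the matrix $V(s) := U^n_{\tau_n}(\tau_n s)$ is independent of $n$ and may be obtained by integrating
$$i \dot V(s) = (\sigma_z + \sigma_x) V(s) \tag{5.5}$$
with initial condition $V(0) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. It is now a simple matter to check that
$$\left\| V(s) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} V(s)^\dagger - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\| > 0 \tag{5.6}$$
for all $s \in (0, 1)$. Since, as mentioned above, $U_\tau$ is the direct sum of the two by two matrices $U^k_\tau$, we have
$$\| U_{\tau_n}(\tau_n s) P_0 U^\dagger_{\tau_n}(\tau_n s) - P_0 \| \ge \left\| V(s) \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} V(s)^\dagger - \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\| > 0. \tag{5.7}$$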
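The lower bound in (5.6) and (5.7) can be checked numerically. The following is a minimal sketch and not part of the original argument. It assumes the closed form $e^{-itH} = \cos(\omega t) - i \sin(\omega t) H/\omega$ for $H = a\sigma_z + b\sigma_x$ with $\omega = \sqrt{a^2 + b^2}$, and it uses the fact that the norm of the difference of two rank-one projections onto unit vectors $u$, $v$ equals $\sqrt{1 - |\langle u, v\rangle|^2}$. The function names `evolve` and `deviation` are illustrative only.

```python
import math


def evolve(a, b, t):
    """Return exp(-i t (a*sigma_z + b*sigma_x)) as a 2x2 nested list.

    Uses (a*sigma_z + b*sigma_x)^2 = w^2 * I with w = sqrt(a^2 + b^2), so that
    exp(-i t H) = cos(w t) * I - i * sin(w t) * H / w.
    """
    w = math.hypot(a, b)
    c, s = math.cos(w * t), math.sin(w * t)
    return [[complex(c, -s * a / w), complex(0.0, -s * b / w)],
            [complex(0.0, -s * b / w), complex(c, s * a / w)]]


def deviation(k, tau, s):
    """Norm of U P U^dagger - P for the k-th block, with P = diag(0, 1).

    U P U^dagger projects onto the second column u of U, and the distance
    between the projections onto u and onto e2 is sqrt(1 - |u[1]|^2) = |u[0]|.
    """
    U = evolve(1.0 / k, 1.0 / tau, tau * s)
    return abs(U[0][1])


if __name__ == "__main__":
    s = 0.5
    for n in (10, 100, 1000):
        # First value: block k = tau = n. Second value: fixed block k = 1.
        print(n, deviation(n, n, s), deviation(1, n, s))
```

Under these assumptions the block with $k = \tau = n$ gives the value $|\sin(\sqrt 2 s)|/\sqrt 2$ for every $n$, which is one admissible choice of $\alpha(s)$ in (5.2), while for a fixed block $k$ the value is at most $k/\tau$, in line with (5.3).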

6. The Schrödinger Picture: A Theorem and a Counter-Example

Theorem 2.1 describes the adiabatic limit of the Heisenberg picture of quantum dynamics. As for the Schrödinger picture, there is no reason to expect $U_\tau(\tau s)$ to converge to anything at all, since even the unperturbed evolution, $e^{-i\tau s H_0}$, does not have a large $\tau$ limit. With this in mind, it is natural to ask whether, in some sense, $U_\tau$ is asymptotically equal to $e^{-i\tau s H_0}$. To test this idea we consider the evolution
$$\Omega_\tau(s) := e^{i\tau s H_0} U_\tau(\tau s) \tag{6.1}$$
which represents, physically, a process in which the system is evolved forward in time according to the perturbed dynamics and then backwards in time according to the unperturbed dynamics. If $H_0$ admits an eigenfunction decomposition, i.e., if the spectrum of $H_0$ is pure point, then a simple extension of the proof of Lemma 3.2 shows that $\Omega_\infty$ does exist and even allows us to calculate it.

Theorem 6.1. Let $H_0$ be a self-adjoint operator with only pure point spectrum. If $U_\tau$ satisfies the initial value problem (1.1) with $H_\tau(t/\tau) = H_0 + (1/\tau)\Lambda(t/\tau)$, where $\Lambda(\cdot)$ is a self-adjoint family which is $L^1$ in norm, then
$$\operatorname{SOT-lim}_{\tau \to \infty}\ e^{i\tau s H_0} U_\tau(\tau s) = \Omega_\infty(s) \tag{6.2}$$
where $\Omega_\infty(s)$ is the unitary operator which commutes with $H_0$ and satisfies the initial value problem
$$\begin{cases} i \dot\Omega_\infty(s) = \sum_{E \in \sigma(H_0)} P_E \Lambda(s) P_E\, \Omega_\infty(s) \\ \Omega_\infty(0) = 1, \end{cases} \tag{6.3}$$
where $P_E = \chi(H_0 = E)$ is the orthogonal projection onto the space of eigenvectors of $H_0$ with eigenvalue $E$.

Remark 6.2. (1) When there is a uniform lower bound on the spacing between neighboring eigenvalues, a classical result of Born and Fock [1] shows that the convergence occurs in the norm topology. (2) This theorem is of particular interest if $H_0$ has only dense point spectrum, as is true of discrete random Schrödinger operators in the large disorder regime; see [13] for one perspective on this subject.

Proof. In the notation of Sec. 3, the evolution considered here, $\Omega_\tau(s)$, is equal to the propagator $\Omega_\tau(s, 0)$. Thus, by Lemma 3.2, we see that $(1 - P_E)\Omega_\tau(s) P_E$ converges to zero uniformly in the strong operator topology for every $E$. To complete the proof of Theorem 6.1, we let $\Omega_\infty(s)$ denote the solution to the initial value problem (6.3) and show that
$$\operatorname{SOT-lim}\ P_E \Omega_\tau(s) P_E = P_E \Omega_\infty(s) P_E, \tag{6.4}$$
for each $E \in \sigma(H_0)$. As we saw in the proof of Lemma 3.2, $\Omega_\tau(s)$ satisfies a Volterra-type equation:
$$\Omega_\tau(s) = 1 - i \int_0^s dr\, K_\tau(r) \Omega_\tau(r), \tag{6.5}$$
with $K_\tau(r) = e^{i\tau r H_0} \Lambda(r) e^{-i\tau r H_0}$. Inserting $1 = P_E + (1 - P_E)$ into this expression between $K_\tau$ and $\Omega_\tau$ we obtain
$$P_E \Omega_\tau(s) P_E = P_E - i \int_0^s dr\, P_E \Lambda(r) P_E\, \Omega_\tau(r) P_E - i \int_0^s dr\, P_E K_\tau(r) (1 - P_E) \Omega_\tau(r) P_E, \tag{6.6}$$
since $P_E K_\tau(r) P_E = P_E \Lambda(r) P_E$. The last term on the right side converges to zero uniformly in the strong operator topology, again by Lemma 3.2. It is clear from (6.6) that if the limit of $P_E \Omega_\tau(s) P_E$ exists, then it obeys the evolution equation (6.3). However, (6.6) does not directly imply that the limit exists. On the other hand, $\Omega_\infty$ also satisfies a Volterra-type equation which, when subtracted from (6.6), yields, for the difference $E_\tau(s) = P_E \Omega_\tau(s) P_E - P_E \Omega_\infty(s) P_E$,
$$E_\tau(s) = -i \int_0^s dr\, P_E \Lambda(r) P_E\, E_\tau(r) + R_\tau(s), \tag{6.7}$$
where the remainder $R_\tau(s)$ is the last term of (6.6) and converges to zero uniformly in the strong operator topology. Using Gronwall's lemma [14], we conclude from (6.7) that $E_\tau(s)$ converges to zero uniformly in the strong operator topology. More concretely, let $\psi$ be any vector. Then (6.7) yields
$$\| E_\tau(s)\psi \| \le \int_0^s dr\, \| P_E \Lambda(r) P_E \|\, \| E_\tau(r)\psi \| + \| R_\tau(s)\psi \|. \tag{6.8}$$
From this together with the classical Gronwall lemma we learn that
$$\| E_\tau(s)\psi \| \le \left[ \sup_s \| R_\tau(s)\psi \| \right] e^{\int_0^1 \|\Lambda(r)\| dr}. \tag{6.9}$$
Since the factor in brackets converges to zero as $\tau \to \infty$, so does the right hand side of the above inequality.

We conclude with an example which shows that in general $\Omega_\tau(s)$ need not have a limit. Take $H_0$ to be differentiation, $i\, d/dx$, on $L^2(\mathbb R)$ and let $\Lambda(s)$ be the operator of multiplication by a function $\Lambda(x, s)$. Since $e^{-itH_0}$ is a shift by $t$, the generator of $K_\tau(s)$ is the operator of multiplication by $-i\Lambda(x - \tau s, s)$. Thus, since $K_\tau(s)$ is a commuting family,
$$\Omega_\tau(s) = e^{-i \int_0^s dr\, \Lambda(x - \tau r, r)}. \tag{6.10}$$
If, for instance, $\Lambda(x, r)$ is the indicator function of the set $\bigcup_n [2^{2n}, 2^{2n+1}]$, then the right hand side has no limit as $\tau \to \infty$. In light of this example, it is interesting to ask what conditions may be placed on $\Lambda(r)$ to ensure the convergence of $\Omega_\tau(s)$.

Acknowledgments

We are grateful to M. Aizenman for the example presented in Sec. 5 as well as the suggestion to consider the strong operator topology. We are also indebted to Y. Avron for the insight that the norm adiabatic theorem is linked with the presence of an intrinsic time scale. This work was partially supported by the NSF Grant PHY-9971149 (AE).

References

[1] M. Born and V. Fock, Beweis des Adiabatensatzes, Z. Phys. 51 (1928) 165–180.
[2] J. E. Avron, R. Seiler and L. G. Yaffe, Adiabatic theorems and applications to the quantum Hall effect, Comm. Math. Phys. 110 (1987) 33–49.
[3] G. Nenciu, On the adiabatic theorem of quantum mechanics, J. Phys. A 13 (1980) L15–L18.
[4] G. A. Hagedorn, Adiabatic expansions near eigenvalue crossings, Ann. Physics 196 (1989) 278–295.
[5] A. Joye, F. Monti, S. Guérin and H. R. Jauslin, Adiabatic evolution for systems with infinitely many eigenvalue crossings, J. Math. Phys. 40 (1999) 5456–5472.
[6] J. E. Avron, J. S. Howland and B. Simon, Adiabatic theorems for dense point spectra, Comm. Math. Phys. 128 (1990) 497–507.
[7] F. Bornemann, Homogenization in Time of Singularly Perturbed Mechanical Systems, Springer-Verlag, Berlin, 1998.
[8] J. E. Avron and A. Elgart, Adiabatic theorem without a gap condition, Comm. Math. Phys. 203 (1999) 445–463.
[9] A. Joye and Ch.-Ed. Pfister, Exponential estimates in adiabatic quantum evolution, in Proceedings of the XIIth International Congress of Mathematical Physics, Brisbane, 1997.
[10] G. B. Folland, Real Analysis: Modern Techniques and Their Applications, A Wiley-Interscience Publication, John Wiley & Sons Inc., New York, 1984.
[11] T. Kato, On the adiabatic theorem of quantum mechanics, Phys. Soc. Jap. 5 (1958) 435–9.
[12] M. Reed and B. Simon, Methods of Modern Mathematical Physics, I. Functional Analysis, second edition, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1980.
[13] M. Aizenman and S. Molchanov, Localization at large disorder and at extreme energies: an elementary derivation, Comm. Math. Phys. 157 (1993) 245–278.
[14] T. H. Gronwall, Note on the derivatives with respect to a parameter of the solutions of a system of differential equations, Ann. Math. 20 (1919) 292–296.
July 18, 2002 15:2 WSPC/148-RMP
00133
Reviews in Mathematical Physics, Vol. 14, No. 6 (2002) 585–600 c World Scientific Publishing Company
ERGODIC THEOREMS FOR 2D STATISTICAL HYDRODYNAMICS
SERGEI B. KUKSIN Department of Mathematics, Heriot-Watt University Edinburgh EH14 4AS, Scotland, UK Steklov Institute of Mathematics, 8 Gubkina St., 117966 Moscow, Russia
[email protected]
Received 9 April 2002

We consider the 2D Navier–Stokes system, perturbed by a random force, such that sufficiently many of its Fourier modes are excited (e.g. all of them are). We discuss the results on the existence and uniqueness of a stationary measure for this system obtained in recent years, the homogeneity of the measures, and some of their limiting properties. Next we use these results to prove that solutions of the equations obey the central limit theorem and the strong law of large numbers.

Keywords: Central limit theorem; homogeneous measure; Navier–Stokes system; random force; stationary measure; strong law of large numbers.
1. Introduction

In this work we interpret the 2D statistical hydrodynamics as a theory of the 2D Navier–Stokes (NS) system, perturbed by a random force:
$$\dot u - \nu \Delta u + (u \cdot \nabla) u + \nabla p = \eta(t, x), \qquad \operatorname{div} u = 0, \tag{1.1}$$
where $u = u(t, x) \in \mathbb R^2$ and $p = p(t, x)$. The space-variable $x$ belongs either to a smooth bounded two-dimensional domain, and then $u$ vanishes on its boundary, or to the torus $\mathbb T^2 = \mathbb R^2 / 2\pi \mathbb Z^2$, and then we assume that $\int u\, dx \equiv \int \eta\, dx \equiv 0$.ᵃ

ᵃ Physically the most important is the case when $x$ belongs to an unbounded domain (say, $x \in \mathbb R^2$) and the velocity-field $u(x)$ is bounded, but has infinite energy $\int |u|^2 dx$. This case leads to serious complications which we cannot handle.

In both cases any vector-field $v(x)$ admits a unique decomposition
$$v(x) = w(x) + \nabla p(x), \qquad \operatorname{div} w = 0,$$
and we denote by $\Pi$ the projector $v(x) \mapsto w(x)$ ($\Pi$ is called Leray's projector, see [4, 30]). Applying $\Pi$ to the NS system (1.1) we re-write it as
$$\dot u(t) + L u(t) + B(u(t), u(t)) = \eta(t). \tag{1.2}$$
Here we set $L = -\nu \Pi \Delta$, $B(u, u) = \Pi (u \cdot \nabla) u$ and re-denoted $\Pi \eta = \eta$. In (1.2) we view $u$ and $\eta$ as curves in the Hilbert space $H$, formed by square-integrable divergence-free vector-fields. By $V$ we denote the space
$$V = \left\{ u(x) \in H \;\middle|\; \frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2} \in L^2 \text{ and } u \text{ satisfies the boundary conditions} \right\},$$
given the norm $\|u\| = \left( \int |\nabla u|^2 dx \right)^{1/2}$, and denote by $|\cdot|$ the norm in the space $H$. It is well known that if $\eta(t)$ is a continuous curve in $H$, then for any $u_0 \in H$ Eq. (1.2) has a unique solution $u(t)$ (understood in the sense of generalised functions) such that $u(0) = u_0$ and $u$ defines a continuous curve in $H$, as well as a square-integrable curve in $V$, see e.g. [4, 30, 12].

Let $\{e_1, e_2, \ldots\}$ be a Hilbert basis of the space $H$, formed by eigenfunctions of the operator $L$. In the case of periodic boundary conditions it is formed by the trigonometric vector-fields
$$c_s s^\perp \cos s \cdot x, \qquad c_s s^\perp \sin s \cdot x, \qquad s \in S. \tag{1.3}$$
Here $S$ is a subset of $\mathbb Z^2_0 = \mathbb Z^2 \setminus \{0\}$ such that $S \cap -S = \emptyset$ and $S \cup -S = \mathbb Z^2_0$, $s^\perp = \begin{pmatrix} -s_2 \\ s_1 \end{pmatrix}$ for any vector $s = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \in \mathbb Z^2_0$, and $c_s = (\sqrt 2\, \pi |s|)^{-1}$ is the $L^2$-normalising factor. By $\{\alpha_j\}$ we denote the eigenvalues corresponding to the eigenfunctions $\{e_j\}$, and assume that $\alpha_1 \le \alpha_2 \le \cdots$. Note that $\alpha_1 > 0$ and that in the periodic case the eigenfunctions in (1.3) have the eigenvalues $\nu |s|^2$.

Concerning the random force $\eta$ we assume that either it is a kick-force
$$\eta(t, x) = \sum_{k=-\infty}^{\infty} \eta_k(x)\, \delta(t - Tk), \qquad \eta_k(x) = \sum_{j=1}^{\infty} b_j \xi_{jk} e_j(x), \tag{1.4}$$
where $\{b_j \ge 0\}$ are some constants such that $\sum b_j^2 < \infty$ and $\{\xi_{jk}\}$ are independent random variables with $k$-independent distributions. Or we assume that $\eta$ is a "white in time force":
$$\eta(t, x) = \frac{d}{dt} \zeta, \qquad \zeta = \sum_{j=1}^{\infty} b_j \beta_j(t) e_j(x), \tag{1.5}$$
where $\{\beta_j\}$ are independent standard Wiener processes, defined for $t \in \mathbb R$, and the constants $\{b_j \ge 0\}$ are such that
$$\sum \alpha_j b_j^2 < \infty. \tag{1.6}$$
In the kick-case (1.4) solutions for (1.2) are normalised to be continuous from the right and can be described as follows. For $t \in [kT, (k+1)T)$, where $k$ is any integer, $u$ is a solution of the free NS system (i.e. it satisfies (1.1) with $\eta = 0$), and at $t = (k+1)T$ it has a jump equal to $\eta_{k+1}$, see Fig. 1. Denoting by $S$ the operator of the time-$T$ shift along trajectories of the free NS system, we see that
$$u((k+1)T) = S(u(kT)) + \eta_{k+1}. \tag{1.7}$$
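The recursion (1.7) has the structure of a random dynamical system driven by independent kicks. As a purely illustrative toy, and not the NS system itself, the sketch below replaces the shift operator $S$ by a single linear mode decaying at rate $\nu$ and draws bounded kicks uniformly from $[-b, b]$. The names `free_flow` and `kicked_chain` and all numerical values are assumptions made for the example.

```python
import math
import random


def free_flow(x, nu, T):
    """Toy stand-in for the time-T shift S: one linear mode decaying at rate nu."""
    return math.exp(-nu * T) * x


def kicked_chain(x0, kicks, nu=1.0, T=1.0):
    """Iterate x_{k+1} = S(x_k) + kick_{k+1}, a scalar analogue of (1.7)."""
    xs = [x0]
    for eta in kicks:
        xs.append(free_flow(xs[-1], nu, T) + eta)
    return xs


rng = random.Random(0)          # fixed seed so the run is reproducible
b = 0.5                         # hypothetical kick amplitude
kicks = [b * rng.uniform(-1.0, 1.0) for _ in range(50)]

# Two trajectories driven by the same kicks from different initial states.
xa = kicked_chain(3.0, kicks)
xb = kicked_chain(-2.0, kicks)
gap = [abs(p - q) for p, q in zip(xa, xb)]
```

In this linear toy two trajectories driven by the same kicks differ by exactly $e^{-\nu T k} |x_0 - y_0|$ after $k$ steps, so the initial state is forgotten exponentially fast. For the NS system itself no such one-line argument is available.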
Fig. 1. Solutions of the kicked equation.
This random system defines a Markov chain in the space $H$. (Here and below all metric spaces are assumed to be provided with the Borel sigma-algebras.) Denoting by $u(t; u_0)$ a solution for (1.2) and (1.4), equal to $u_0$ at $t = 0$, we see that the corresponding Markov transition function $P(t, v, \Gamma)$, where $t \in T\mathbb Z$, $v \in H$ and $\Gamma$ is a Borel subset of $H$, is
$$P(t, v, \Gamma) = \mathbf P\{ u(t; v) \in \Gamma \}. \tag{1.8}$$
This Markov chain defines semigroups $\{S_t\}$ and $\{S_t^*\}$ ($t \in T\mathbb Z_+$) in the space $C_b$ of bounded continuous functions on $H$ and in the space $\mathcal P$ of probability (Borel) measures, respectively. Due to (1.8),
$$(S_t f)(v) = \mathbf E f(u(t; v)), \quad f \in C_b,$$
$$(S_t^* \mu)(\Gamma) = \int \mathbf P\{ u(t; v) \in \Gamma \}\, \mu(dv), \quad \mu \in \mathcal P.$$
A measure $\mu \in \mathcal P$ is called a stationary measure if $S_t^* \mu \equiv \mu$.

Concerning the independent variables $\xi_{jk}$ we assume the following:

(H) Their distributions $\mathcal D \xi_{jk}$ have densities against the Lebesgue measure, $\mathcal D \xi_{jk} = p_j(s)\, ds$, where the $p_j$ are functions of bounded total variation, supported by the segment $[-1, 1]$, and such that $\int_{-\varepsilon}^{\varepsilon} p_j(s)\, ds > 0$ for all integers $j$ and all $\varepsilon > 0$.

That is, we restrict ourselves to the case of bounded random kicks. Concerning equations with unbounded kicks see [19].

Now let us pass to the white-forced NS system (1.2) and (1.5). Integrating (1.2) from $0$ to $T > 0$ we get:
$$u(T) + \int_0^T (L u(t) + B(u(t), u(t)))\, dt = u(0) + \zeta(T) - \zeta(0). \tag{1.9}$$
A random vector-field $u(t, x)$, $t \ge 0$, which defines an a.s. continuous process $u(t) \in H$ such that its norm $\|u(t)\|$ is square-integrable on every finite time-interval, is called a solution for (1.2) and (1.5) if the relation (1.9) holds a.s., for each $T > 0$. The equality is understood in the sense of generalised functions of $x$.ᵇ It is known (e.g. see [31]) that for any $u_0 \in H$ Eqs. (1.2) and (1.5) have a unique solution, equal to $u_0$ for $t = 0$. Because of that, the white-forced NS system defines a Markov process in $H$. Its Markov transition function $P(t, v, \Gamma)$ is still defined by the relation (1.8), where now $t \in \mathbb R_+$. The corresponding semigroups $\{S_t\}$ and $\{S_t^*\}$ are defined for $t \ge 0$.

2. Stationary Measures

2.1. Kicked equations

Armen Shirikyan and the author studied stationary measures for the kicked equation (1.2) and (1.4) in [17, 21, 18, 24, 16] (the last paper is joint with A. Piatnitski). In these works we assume that the force $\eta$ satisfies the assumption (H) and that
$$b_j \ne 0, \quad \forall\, 1 \le j \le N_\nu, \tag{2.1}$$
where $N_\nu$ is a suitable constant, growing to infinity as $\nu \to 0$ (e.g. all $b_j$ are nonzero numbers). In [17] (see also [21]) it is proved that the equation has a unique stationary measure $\mu$ and solutions of (1.2) and (1.4) weakly converge to $\mu$ in distribution:
$$\mathcal D u(t; u_0) \rightharpoonup \mu \quad \text{as } t \to \infty, \tag{2.2}$$
for all $u_0$, where $t \in T\mathbb Z_+$.ᶜ To establish the result we used a Foias–Prodi type reduction of the NS system to an $N_\nu$-dimensional system with delay which is satisfied by the vector formed by the first $N_\nu$ Fourier components of any solution for the NS system (1.2) and (1.4). The new system turned out to be of the Gibbs type (similar systems are studied, say, in [3]). By the assumption (2.1), the noise which stirs it is non-degenerate. Therefore, due to a Ruelle-type theorem, the new system has a unique invariant ("Gibbs") measure, so the NS system has a unique stationary measure $\mu$.

The condition (2.1), which guarantees the non-degeneracy of the reduced system, is crucial for all works on stationary measures for randomly forced PDEs written since [17]. It is somewhat restrictive since $N_\nu$ grows when the viscosity $\nu$ decreases, and it is not clear if the uniqueness result remains true under the weaker condition $b_j \ne 0$ for $j \le N$, where $N$ is an absolute constant. Still, it is shown in [9] that if the system (1.2) is replaced by any of its finite-dimensional Galerkin approximations, then the stationary measure is unique provided that (2.1) holds with $N_\nu = 3$. This result follows from the Malliavin calculus since it can be checked that the system satisfies the Hörmander non-degeneracy condition.

ᵇ I.e. if we multiply both sides of (1.9) by any smooth divergence-free vector-field $w(x)$ and next integrate against $dx$, applying formally integration by parts to the term $(Lu) \cdot w$, we get equal numbers.

ᶜ The work was done during the year 1999, when its approach was discussed at a number of informal seminars. At the end of that year the first talk on the results obtained was given at a meeting of the Moscow Mathematical Society and a preprint of the paper [17] appeared.

In [21, 18, 24, 16] we developed a coupling-approach to study the NS system and related kick-forced dissipative nonlinear PDEs.ᵈ This approach uses not the Foias–Prodi reduction, but the main lemma the reduction is based upon. It gives a shorter proof of the uniqueness (under the assumption (2.1)) and implies that
$$\left| \mathbf E f(u(t; u_0)) - \int f(u)\, \mu(du) \right| \le C(|u_0|)\, e^{-\sigma t}, \quad \forall f \in \mathcal O, \tag{2.3}$$
where $t \in T\mathbb Z_+$, $\sigma > 0$ is $u_0$-independent and
$$\mathcal O = \{ f \in C_b \mid |f| \le 1 \text{ and } \operatorname{Lip} f \le 1 \}. \tag{2.4}$$
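That is, for any $u_0 \in H$ the distributions $\mathcal D u(t; u_0)$ converge exponentially fast to $\mu$ in the Lipschitz-dual norm:
$$\| \mathcal D u(t; u_0) - \mu \|_L^* \le C(|u_0|)\, e^{-\sigma t}, \tag{2.5}$$
where for any two measures $\nu_1, \nu_2 \in \mathcal P$,
$$\| \nu_1 - \nu_2 \|_L^* = \sup_{f \in \mathcal O} \left| \int f(u)\, \nu_1(du) - \int f(u)\, \nu_2(du) \right|.$$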
This convergence implies the weak convergence (2.2), see [5]. For the final version of the proof see [24], where some ideas of L. Kantorovich are invoked to simplify and clarify the arguments. Independently, a similar coupling-approach to study the problem (1.2), (1.4) and (2.1) was developed by N. Masmoudi and L.-S. Young in [27].

ᵈ The short paper [21] was written in December 2000 as the authors' response to some criticism of the results of [17], made during my lectures at CIMS (New York). In January 2001, when the work was presented at a workshop in Warwick, Roger Tribe pointed out that the main idea of the work is a form of coupling.

2.2. White-forced equations

The first theorem on the uniqueness of a stationary measure for a white-forced NS system is due to Flandoli and Maslovski [11], who considered Eq. (1.2) perturbed by a random force $\eta(t, x)$ which is non-smooth in the space-variable, i.e. by a force (1.5), where $b_j \sim j^{-a}$, $\frac{1}{2} \le a < \frac{3}{8}$. This result is not quite satisfactory since, firstly, the statistical hydrodynamics usually works with forces which are smooth in the space-variable, and, secondly, it is unnatural to impose a lower bound for the energy of each Fourier mode.

After the work [17] on the kick-forced equations, E, Mattingly, Sinai [10] and Bricmont, Kupiainen, Lefevere [2] applied the Foias–Prodi reduction to study the NS system (1.2) with a smooth in space white-force (1.5) which satisfies (2.1), reducing it to a finite-dimensional system with delay. Imposing the additional restriction that the sum (1.5) is finite,
$$\eta(t, x) = \frac{d}{dt} \sum_{j=1}^{N'} b_j \beta_j(t) e_j(x), \qquad N_\nu \le N' < \infty, \tag{2.6}$$
they proved that the system has a unique stationary measure $\mu$ and is ergodic. In [10] the convergence (2.2) is not established, but it is shown that for any continuous functional $f$ on $H$ and for $\mu$-a.a. initial data $u_0$, time-averages of $f(u(t; u_0))$ converge to the ensemble-average $\int f\, d\mu$. Some techniques which had been developed in [10] were next used in works of other researchers (including [2, 20]). In [2] it is proved that the convergence (2.2) holds for $\mu$-a.a. initial data and is exponentially fast. This is the first work on the randomly forced NS equations where the exponentially fast convergence to a stationary measure was established. Still, the stipulation "for $\mu$-a.a." restricts the applicability of the results of that work. In particular, they cannot be used to derive the central limit theorem for solutions of (1.2) and (1.5), which we discuss in Sec. 4. We also note that the assumption that the sum (2.6) is finite makes it impossible to use the results of [10] and [2] to study the turbulence-limit $\nu \to 0$ (see Sec. 2.4). Indeed, the restriction $N_\nu \le N'$, where $N'$ is fixed and $N_\nu$ grows to infinity as $\nu \to 0$, does not allow one to make $\nu$ very small.

In [26], J. Mattingly applied a coupling to Eqs. (1.2) and (1.5) which satisfy (2.1) and (2.6), and proved that the convergence (2.2) is exponential for all $u(0)$. Unfortunately, we found it very difficult to follow his arguments. We also mention the papers [7, 14], devoted to studying a class of randomly perturbed parabolic problems with strong nonlinear dissipation, including the Ginzburg–Landau equation.

In [20] Armen Shirikyan and I show that the ideas which we developed earlier to study the kicked equations apply as well in the white-forced case. Technically the main difference from the kick-case is that now, to study distributions of solutions, we can no longer use explicit formulas in terms of iterated integrals, but have to use instead Girsanov's formulas, related to those which were first exploited in [10].
In [20] we prove that Eqs. (1.2), (1.5) and (2.1) have a unique stationary measure $\mu$, and that the convergence (2.3) = (2.5) holds for all $u_0$, with $C = C_1 (1 + |u_0|^2)$. Moreover, the convergence holds true if $u(t)$, $t \ge 0$, is a solution such that $u(0)$ is a random variable with a bounded second moment:
$$\| \mathcal D u(t) - \mu \|_L^* \le C_1 (1 + \mathbf E |u(0)|^2)\, e^{-\sigma t}, \quad t \ge 0. \tag{2.7}$$
Applying the Itô formula to the functionals $|u|^{2m}$, $m = 1, 2, \ldots$, and arguing by induction we get that $\mathbf E |u(t; u_0)|^{2m} \le C(m, u_0)$, uniformly in $t \ge 0$ (see [10] and [22]). Integrating the Lipschitz functionals $f_{m,M}(u) = |u|^{2m} \wedge M$ against the signed measure $\mathcal D u(t; 0) - \mu$, using (2.7) and going to the limit in $t$, we find that $\int f_{m,M}\, d\mu \le C(m, 0)$ for each $M > 0$. Now application of the Beppo Levi theorem implies that all moments of the measure $\mu$ are finite:
$$\int |u|^{2m} \mu(du) < \infty, \quad \forall m \in \mathbb N. \tag{2.8}$$
If the boundary conditions are periodic and the noise $\eta$ as a function of $x$ is smoother than we have assumed originally and
$$\sum_{j=1}^{\infty} \alpha_j^l b_j^2 < \infty \quad \text{for some } l \ge 1, \tag{2.9}$$
then for $t > 0$ any solution $u(t; u_0)$, $u_0 \in H$, a.s. belongs to the space $H^l$, formed by vector-fields from $H$ which belong to the Sobolev space of order $l$, and
$$\mathbf E \| u(t; u_0) \|_l^r \le C_{rl}(|u_0|), \quad \forall t \ge 1, \quad \forall r \in \mathbb N. \tag{2.10}$$
Here $\| \cdot \|_l$ is the norm in $H^l$, $\|u\|_l = |(-\Delta)^{l/2} u|$ (in particular, $\| \cdot \|_1 = \| \cdot \|$ and $\| \cdot \|_0 = | \cdot |$). Now the convergence (2.3) holds for locally Lipschitz functionals $f$ on $H^{l-1}$ of polynomial growth:
$$\left| \mathbf E f(u(t; u_0)) - \int f(u)\, \mu(du) \right| \le C_f'(|u_0|)\, e^{-\sigma_f t}, \quad \forall t \ge 1, \quad f \in \mathcal O_{l-1}. \tag{2.11}$$
Here $\mathcal O_{l-1} = \bigcup_{p=1}^{\infty} \mathcal O_{l-1}^p$, where $\mathcal O_{l-1}^p$ denotes the set of continuous functionals $f$ on $H^{l-1}$ such that

(i) $|f(u)| \le 1 + \|u\|_{l-1}^p$,
(ii) $|f(u) - f(v)| \le \|u - v\|_{l-1} (1 + \|u\|_{l-1}^{p-1} + \|v\|_{l-1}^{p-1})$,

see [22]. In particular, the energy functional $f(u) = |u|^2$ belongs to $\mathcal O_0^2$, while the correlation tensor gives rise to the functionals $f(u) = u_i(x) u_j(y)$ ($x$ and $y$ are points in the space-domain, $i$ and $j$ equal 1 or 2), which belong to $\mathcal O_2^2$ (and do not belong to $\mathcal O_0^2$). A similar refinement applies to the convergence (2.3) in the kick-forced case.

Let $U(t)$, $t \ge 0$, be a solution for (1.2) and (1.5) such that $\mathcal D U(0) = \mu$. Then $\mathcal D U(t) \equiv \mu$ and the Itô formula applies to the energy functional $|U(t)|^2$. Taking the expectation and differentiating the result in $t$ we find that $2\nu\, \mathbf E \|U(t)\|^2 = \sum b_j^2 =: B_0$. That is,
$$\int \|u\|^2 \mu(du) = \frac{1}{2\nu} B_0 \tag{2.12}$$
(see [31, 10]). If (1.2) is the NS system with the periodic boundary conditions, then using (2.10) with $l = 1$ and applying the Itô formula to the enstrophy functional $|\operatorname{rot} u|^2 = \|u\|^2$ we get that
$$\int \|u\|_2^2\, \mu(du) = \int |\Delta u|^2 \mu(du) = \frac{1}{2\nu} B_1, \qquad B_1 = \sum \alpha_j b_j^2. \tag{2.13}$$
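The energy balance (2.12) can be traced through the Itô formula in a few lines. The sketch below is formal: it assumes the standard cancellation $\langle B(u,u), u \rangle = 0$ and the identity $\langle Lu, u \rangle = \nu \|u\|^2$, neither of which is spelled out in the text, and it takes for granted that the stochastic integral has zero expectation.

```latex
\begin{aligned}
d|U(t)|^2
  &= \Bigl( -2\langle LU, U\rangle - 2\langle B(U,U), U\rangle
       + \sum_j b_j^2 \Bigr)\,dt
     + 2\sum_j b_j \langle U, e_j\rangle\, d\beta_j(t) \\
  &= \bigl( -2\nu \|U\|^2 + B_0 \bigr)\,dt
     + 2\sum_j b_j \langle U, e_j\rangle\, d\beta_j(t), \\[4pt]
0 = \frac{d}{dt}\,\mathbf{E}\,|U(t)|^2
  &= -2\nu\,\mathbf{E}\,\|U(t)\|^2 + B_0
  \quad\Longrightarrow\quad
  \int \|u\|^2 \mu(du) = \mathbf{E}\,\|U(t)\|^2 = \frac{B_0}{2\nu}.
\end{aligned}
```

The left-hand side of the last line vanishes because $\mathcal D U(t) \equiv \mu$ makes $\mathbf E |U(t)|^2$ constant in $t$.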
Since the functionals $u \mapsto \|u\|^2$ and $u \mapsto \|u\|_2^2$ belong to $\mathcal O_1$ and $\mathcal O_2$, respectively, then for any solution $u(t) = u(t; u_0)$ ($u_0 \in H$) of the NS equation under the periodic boundary conditions we have:
$$\left| \mathbf E \|u(t)\|^2 - \frac{1}{2\nu} B_0 \right| \le C_0 e^{-\sigma_0 t}, \qquad \left| \mathbf E \|u(t)\|_2^2 - \frac{1}{2\nu} B_1 \right| \le C_1 e^{-\sigma_1 t}$$
if $t \ge 1$ and (2.9) holds with $l = 3$.

2.3. Stationary in space forces and solutions

In this section we restrict ourselves to the NS system under the periodic boundary conditions. Now the basis $\{e_j\}$ is formed by the vector-fields (1.3), and it is convenient to write it as $\{e_s(x), s \in \mathbb Z_0^2\}$, where for any $s \in S$, $e_s(x)$ is the first vector-field in (1.3) and $e_{-s}(x)$ is the second. Let us consider a white-force (1.5) such that $b_s \equiv b_{-s}$. Then
$$\zeta = \sum_{s \in S} b_s c_s s^\perp (\beta_s(t) \cos s \cdot x + \beta_{-s}(t) \sin s \cdot x) = \operatorname{Re} \sum_{s \in S} b_s' s^\perp (\beta_s - i\beta_{-s}) e^{i s \cdot x}, \tag{2.14}$$
where $b_s' = b_s c_s$. Now $\alpha_s = \nu |s|^2$ and the assumption (1.6) takes the form $\sum_{s \in S} b_s^2 |s|^2 < \infty$. Due to (2.14), for any $y \in \mathbb R^2$ we have
$$\zeta^y(t, x) := \zeta(t, x + y) = \operatorname{Re} \sum b_s' s^\perp (\beta_s - i\beta_{-s})(\cos s \cdot y + i \sin s \cdot y) e^{i s \cdot x} = \operatorname{Re} \sum b_s' s^\perp (\beta_s^y - i\beta_{-s}^y) e^{i s \cdot x},$$
where $\beta_s^y = \beta_s^y(t)$, $t \in \mathbb R$, and for $s \in S$ we have $\beta_s^y = \beta_s \cos s \cdot y + \beta_{-s} \sin s \cdot y$, $\beta_{-s}^y = \beta_{-s} \cos s \cdot y - \beta_s \sin s \cdot y$. Since $\beta_s$ and $\beta_{-s}$ are independent normal random variables and $\mathcal D \beta_{\pm s} = N(0, |t|)$, then $\beta_s^y$ and $\beta_{-s}^y$ also are independent and are distributed as $N(0, |t|)$. So the processes $\zeta^y(t)$, $y \in \mathbb R^2$, are distributed identically with $\zeta(t)$.

Let $\mu$ be the unique stationary measure for Eqs. (1.1), (2.14) and (2.1) and let $U(t) = U(t, x)$, $t \ge 0$, be a solution such that $\mathcal D U(t) \equiv \mu$. Then $U^y(t, x) = U(t, x + y)$ is a stationary process in $H$, satisfying the equation with $\zeta$ replaced by $\zeta^y$. Since $\mathcal D \zeta^y = \mathcal D \zeta$, then $\mathcal D U^y(t)$ is a stationary measure for the equation. Therefore, by the uniqueness, $\mathcal D U^y(t) \equiv \mu$. That is, the measure $\mathcal D U^y(t) \in \mathcal P$ is $y$-independent, and the process $U(t, x)$ is stationary both in time and space. Accordingly, the stationary measure $\mu$ is homogeneous, i.e. it is preserved by transformations of the space $H$ of the form $u(x) \mapsto u(x + y)$ ($y$ is a fixed vector from $\mathbb R^2$).

Since $\int U\, dx \equiv 0$, then
$$0 = \mathbf E \int U\, dx = \int (\mathbf E\, U(t, x))\, dx = (2\pi)^2\, \mathbf E\, U(t, x).$$
That is,
$$\mathbf E\, U(t, x) \equiv 0, \quad \forall t, x. \tag{2.15}$$
If (2.9) holds with $l = 3$, i.e. if $\sum b_s^2 |s|^6 < \infty$, then due to (2.11) with $f(u) = u(x)$, we have:
$$|\mathbf E u(t, x; u_0)| \le C(|u_0|)\, e^{-\sigma t}, \quad t \ge 1,$$
for any $u_0 \in H$ and any $x$.

2.4. The turbulence-limit

Due to (2.2) and (2.7), the unique stationary measure $\mu$ comprises asymptotic in time stochastic properties of solutions for the NS system, cf. [1, Chap. VI] and [12, Sec. 6.1]. If the force (1.5) is such that $b_j \ne 0$ for all $j$, then for each positive viscosity $\nu$ the equation has a unique stationary measure $\mu_\nu$. The limiting properties of this measure as $\nu \to 0$ describe the 2D turbulence. At this moment we do not know much about the turbulence-limit for the NS system, apart from the relations (2.12) and (2.13) (where $\mu = \mu_\nu$) and their immediate consequences.ᵉ Instead, in the rest of this paper we discuss limiting properties of the stationary measure for the kick-forced equation when the period $T$ between the kicks goes to zero, and asymptotic properties of time-averaged solutions, both in the white-forced and the kick-forced cases.

For some other PDEs, different from but related to the 2D NS system, some progress in the study of the turbulence-limit has been achieved in recent years. Namely, in [23] (see also [24]) a weak form of the Kolmogorov–Obukhov law from the theory of developed turbulence (see [25, Sec. 33]) is obtained for solutions of the randomly forced nonlinear Schrödinger equation with small viscosity. In [8] the small-viscosity 1D Burgers equation is considered. It is proved that when the viscosity goes to zero, the stationary measure weakly converges to a limit, and the limiting measure is studied. In [15] a similar analysis is done for the nD Burgers equation. We note that the weak convergence of the stationary measure is a nice specific feature of the Burgers equation. Most likely, for the NS system this convergence does not hold (so the limiting properties of the stationary measure $\mu_\nu$ correspond to some "very weak" convergence of measures).

ᵉ In particular, these relations imply that the energy-range for the periodic 2D turbulent flow (described by Eq. (1.1)) is bounded uniformly in $\nu \in (0, 1]$.

3. Kick-Forces with Short Periods Between the Kicks

Let us consider Eq. (1.2) with the kick-force $\sqrt{\varepsilon}\, \eta$, where $\eta$ has the form (1.4) and $T = \varepsilon$, $0 < \varepsilon \ll 1$:
$$\eta = \eta_\varepsilon(t, x) = \sqrt{\varepsilon} \sum_{k=-\infty}^{\infty} \eta_k(x)\, \delta(t - \varepsilon k). \tag{3.1}$$
In addition to (H), we assume that $\mathbf E\, \xi_{jk} \equiv 0$, $\mathbf E\, \xi_{jk}^2 \equiv 1$, (1.6) holds and
$$b_j \ne 0, \quad \forall j. \tag{3.2}$$
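Denoting by $u_\varepsilon(t; u_0)$ a solution of (1.2) and (3.1), equal to $u_0$ at t = 0, due to the results of Sec. 2.1 for any ε > 0 we have the convergence
\[ \mathcal{D}u_\varepsilon(t; u_0) \rightharpoonup \mu_\varepsilon \quad \text{as} \quad t \to \infty , \]
where $\mu_\varepsilon$ is the corresponding stationary measure. Let us set $X_\varepsilon(t)$ to be equal to $\int_{0+}^{t+0} \eta_\varepsilon(s)\,ds$ for $t \in \mathbb{R}_+ \cap \varepsilon\mathbb{Z}$ and use linear interpolation to extend $X_\varepsilon$ to $\mathbb{R}_+$. Due to the Donsker theorem (see e.g. [29]), we have $\mathcal{D}X_\varepsilon(\cdot) \rightharpoonup \mathcal{D}\zeta(\cdot)$ as ε → 0, where now $\rightharpoonup$ denotes the weak convergence of measures in the space C([0, L], H), the process ζ is defined in (1.5) and L is any positive number. This convergence underlies the “splitting up method for stochastic PDEs”, which in our situation states that
\[ \mathcal{D}u_\varepsilon(t; u_0) \rightharpoonup \mathcal{D}u(t; u_0) \quad \text{as} \quad \varepsilon \to 0 \]
for each t ≥ 0, where $u(t; u_0)$ is a solution of (1.2) and (1.5), see [22]. Using the results of Sec. 2.2 we have $\mathcal{D}u(t; u_0) \rightharpoonup \mu$ as t → ∞. So altogether we have the convergences
\[
\begin{array}{ccc}
\mathcal{D}u_\varepsilon(t; u_0) & \xrightarrow[t\to\infty]{} & \mu_\varepsilon \\
{\scriptstyle \varepsilon\to 0}\,\big\downarrow & & \\
\mathcal{D}u(t; u_0) & \xrightarrow[t\to\infty]{} & \mu
\end{array}
\tag{3.3}
\]
(the arrows signify weak convergence of measures). In [22] A. Shirikyan and the author of this paper show that $\mu_\varepsilon \rightharpoonup \mu$ as ε → 0. So (3.3) closes up to the commutative diagram:
\[
\begin{array}{ccc}
\mathcal{D}u_\varepsilon(t; u_0) & \xrightarrow[t\to\infty]{} & \mu_\varepsilon \\
{\scriptstyle \varepsilon\to 0}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \varepsilon\to 0} \\
\mathcal{D}u(t; u_0) & \xrightarrow[t\to\infty]{} & \mu
\end{array}
\]
That is, for distributions of solutions of the NS system forced by the short-kick force (3.1) and (3.2), the limits t → ∞ and ε → 0 commute.

4. Ergodic Properties of Solutions

In this section we use the convergences (2.5) and (2.7) and some versions of the classical limit theorems from the theory of probability to examine ergodic properties of solutions for the randomly forced NS systems.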
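The Donsker scaling used in Sec. 3 can be illustrated numerically on a single scalar mode. The sketch below is only a toy model and not the NS system: it assumes one Fourier mode with $b_j = 1$ and Rademacher kicks (a hypothetical choice of kick law with mean 0 and variance 1), and the function names are introduced here for illustration. It samples $X_\varepsilon(1) = \sqrt{\varepsilon}\sum_{k \le 1/\varepsilon} \xi_k$ and estimates its variance, which should stay close to the variance 1 of a standard Wiener process at time 1 for every small ε.

```python
import math
import random


def x_eps_at_one(eps, rng):
    """One sample of X_eps(1) = sqrt(eps) * (sum of 1/eps kicks) for a scalar mode."""
    n_kicks = round(1.0 / eps)
    # Rademacher kicks: mean 0, variance 1 (an assumed kick law, for illustration only).
    total = sum(rng.choice((-1.0, 1.0)) for _ in range(n_kicks))
    return math.sqrt(eps) * total


def sample_variance(eps, n_samples=2000, seed=0):
    """Unbiased sample variance of X_eps(1) over n_samples independent runs."""
    rng = random.Random(seed)
    xs = [x_eps_at_one(eps, rng) for _ in range(n_samples)]
    mean = sum(xs) / n_samples
    return sum((x - mean) ** 2 for x in xs) / (n_samples - 1)


if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print(eps, sample_variance(eps))
```

The factor $\sqrt{\varepsilon}$ in (3.1) is exactly what keeps this variance of order one as the kicks become more frequent.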
July 18, 2002 15:2 WSPC/148-RMP
00133
Ergodic Theorems for 2D Statistical Hydrodynamics
595
4.1. Ergodicity and the strong law of large numbers

Let us consider Eqs. (1.2) and (1.5), assuming that (2.1) holds. Let µ be the stationary measure and $U_\tau(t)$, t ≥ τ, be a solution such that $\mathcal{D}U_\tau(\tau) = \mu$. Then $\mathcal{D}U_\tau(t) = \mu$ for each t. Passing to the limit as τ → −∞, we obtain an a.s. continuous stationary process U(t) ∈ H, t ∈ R (defined on a new probability space), such that $\mathcal{D}U(t) \equiv \mu$ and U satisfies (1.2), where η(t) is replaced by a new process η′(t) with the same distribution (cf. [17, Proposition 1.5]). So U is a Markov process with the same transition function P. Since µ is the unique stationary measure for the equation, the process U is ergodic, see [28, Theorem 3.2.6]. Therefore the strong law of large numbers applies to the process h(U), where h is any function from $L_2(H, d\mu)$ (see [28, Theorem 3.3.1]):
\[ \langle h(U)\rangle_0^T := \frac{1}{T}\int_0^T h(U(t))\,dt \xrightarrow[T\to\infty]{} \int h(u)\,\mu(du) \quad \text{a.s.} \tag{4.1} \]
If h ∈ O (see (2.4)), then this convergence holds for any solution:

Theorem 4.1. If (2.1) holds and $u(t) = u(t; u_0)$ is a solution of (1.2) and (1.5), then
\[ \langle f(u)\rangle_0^T \xrightarrow[T\to\infty]{} \int f(u)\,\mu(du) \quad \text{a.s.} \tag{4.2} \]
for any f ∈ O and $u_0 \in H$.

Proof. Let P be the measure defined by the process u in the space of trajectories $\mathcal{H} = C([0, \infty), H)$. Then (4.2) states that the functionals on $\mathcal{H}$, defined for $u \in \mathcal{H}$ by the l.h.s. of (4.2), converge P-a.s. as T → ∞ to the constant $\int f\,d\mu$. So to check (4.2) we can replace u(t) by any weak solution $u_1(t)$ of (1.2) and (1.5), equal to $u_0$ at t = 0.^f Similarly, (4.1) remains true if we replace U by a process $u_2(t) \in H$, t ≥ 0, distributed as U. For any random variable T′ ≥ 0 which is a.s. finite, we set
\[ \langle f(u)\rangle_{T'}^{T} := \frac{1}{T}\, I_{T \ge T'} \int_{T'}^{T} f(u(t))\,dt . \]
∀t ≥ T0 ,
where C and σ are positive constants. Then Z T CIT 0
(4.3)
^f That is, by any process $u_1$ such that $u_1(0) = u_0$ and $u_1$ satisfies (1.2), where η is replaced by a process $\partial_t \zeta'$ and ζ′ has the same distribution as ζ in (1.5). This process defines in $\mathcal{H}$ the same measure P.
Since (4.1) with h = f implies that $\langle f(u_2)\rangle_{T'}^{T} \to \int f\,d\mu$ a.s. and since due to (4.3) $\langle f(u_1)\rangle_{T'}^{T} - \langle f(u_2)\rangle_{T'}^{T} \to 0$ a.s., (4.2) follows.

Remark 4.1. (1) If the boundary conditions are periodic and (2.9) holds with some l ≥ 1, then due to (2.10) and (2.11), the theorem’s assertion is valid for any $f \in O_{l-1}$. (2) An obvious version of the theorem holds for solutions of the kicked equation (1.2) and (1.4). (3) The arguments used in the proof can be applied to study other asymptotic properties of the solutions u. In particular, if g is a bounded Lipschitz functional on H and the Law of Iterated Logarithm holds for the stationary process g(U), then it also holds for the processes g(u).

4.2. The central limit theorem (CLT)

Let U(t) ∈ H, t ∈ R, be the stationary weak solution of (1.2), (1.5) and (2.1) constructed above, and let $\{\mathcal{F}_{\le t},\ t \in \mathbb{R}\}$ be the corresponding flow of σ-algebras. Let f ∈ O be a functional such that $\int f(u)\,\mu(du) = 0$. Then, using the Markov property and (2.7), we get:
\[ E\big(E(f(U_0)\,|\,\mathcal{F}_{\le -t})\big)^2 = \int \big(E f(u(t; u_0))\big)^2 \mu(du_0) \le C_1^2 e^{-2\sigma t} \int (1 + |u_0|^2)^2 \mu(du_0) . \]
Since the integral on the r.h.s. is bounded due to (2.8), we have $E\big(E(f(U_0)\,|\,\mathcal{F}_{\le -t})\big)^2 \le C_2 e^{-2\sigma t}$. Let $\mathcal{F}'_{\le t}$ be the σ-algebra generated by the random variables f(U(s)), s ≤ t. Then $\mathcal{F}'_{\le t} \subset \mathcal{F}_{\le t}$. Denoting $E(f(U_0)\,|\,\mathcal{F}_{\le -t}) = F$ and using the Jensen inequality we have:
\[ E\big(E(F\,|\,\mathcal{F}'_{\le -t})\big)^2 \le E\big(E(F^2\,|\,\mathcal{F}'_{\le -t})\big) = E F^2 . \]
So
\[ E\big(E(f(U_0)\,|\,\mathcal{F}'_{\le -t})\big)^2 \le C_2 e^{-2\sigma t} . \tag{4.4} \]
Let us consider the real-valued stationary process X(t) = f(U(t)), t ∈ R. Since the process U is ergodic, X is ergodic as well. Due to the mixing property (4.4), the CLT as in [6, Theorem 7.6]^g applies to X, and for any positive T we have
\[ \mathcal{D}\left( \frac{f(U(0)) + \cdots + f(U((n-1)T))}{\sqrt{n}} \right) \rightharpoonup \hat\sigma N(0, 1) \quad \text{as } n \to \infty , \tag{4.5} \]

^g This form of the CLT goes back to M. I. Gordin [13].
where
\[ \hat\sigma^2 = \hat\sigma^2(T) = E f(U(0))^2 + 2 \sum_{n=1}^{\infty} E\big( f(U(0))\, f(U(nT)) \big) \ge 0 . \tag{4.6} \]
Due to (4.4), $|E f(U(0)) f(U(nT))| \le C_2 e^{-\sigma n T}$. Therefore $\hat\sigma^2 > 0$, at least if T is sufficiently large.

For any $u_0 \in H$ let us denote $u(t) = u(t; u_0)$. Taking a function h : R → R such that |h| ≤ 1 and Lip h ≤ 1, we wish to estimate
\[ \left| E\, h\!\left( \frac{f(U(0)) + \cdots + f(U((n-1)T))}{\sqrt{n}} \right) - E\, h\!\left( \frac{f(u(0)) + \cdots + f(u((n-1)T))}{\sqrt{n}} \right) \right| . \tag{4.7} \]
Denoting $n^{-1/2} f(U(jT)) = f_U^j$ and $n^{-1/2} f(u(jT)) = f_u^j$, we see that (4.7) is majorized by
\[ \sum_{j=0}^{n-1} \Big| E\Big( h\big(f_U^0 + \cdots + f_U^j + f_u^{j+1} + \cdots + f_u^{n-1}\big) - h\big(f_U^0 + \cdots + f_u^j + f_u^{j+1} + \cdots + f_u^{n-1}\big) \Big) \Big| . \tag{4.8} \]
Due to [20, (3.39)], we can replace U and u by weak solutions having the same initial conditions, such that (keeping the same notation for the new solutions) we have
\[ P\{ |U(jT) - u(jT)| \ge C_1 e^{-\sigma j T} \} \le C_2 (1 + |u_0|^2) e^{-\sigma j T} . \tag{4.9} \]
Let us denote the event on the l.h.s. of (4.9) by $Q_j$, j = 0, . . . , n − 1. Since Lip f, Lip h ≤ 1, everywhere outside $Q_j$ we have
\[ \big| h(\cdots + f_U^j + f_u^{j+1} + \cdots) - h(\cdots + f_u^j + f_u^{j+1} + \cdots) \big| \le n^{-1/2} C_1 e^{-\sigma j T} . \]
As |h| ≤ 1, due to (4.9) the jth term in (4.8) is bounded by
\[ n^{-1/2} C_1 e^{-\sigma j T} + 2 C_2 (1 + |u_0|^2) e^{-\sigma j T} \le C_3 e^{-\sigma j T} , \]
where $C_3$ depends on $|u_0|$. On the other hand, since $|f_u^j|, |f_U^j| \le n^{-1/2}$, the jth term is also bounded by $2 n^{-1/2}$. Let us denote by $n_T$ the smallest integer $\ge (\log n)/\sigma T$. Then, majorizing the first $n_T$ terms in (4.8) using the second estimate, and majorizing the rest of them using the first, we get that
\[ (4.8) \le 2 n^{-1/2} n_T + C_3 \sum_{j=n_T}^{n-1} e^{-\sigma j T} \le 2 n^{-1/2} n_T + \frac{C_4}{n \sigma T} \le C_5 n^{-1/3} . \]
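The balance between the two estimates in the last display can be checked numerically. The following sketch is an illustration only: the values σ = T = C_3 = 1 are arbitrary assumptions (the text fixes none of them), and the function name is introduced here. It evaluates the "head" $2 n^{-1/2} n_T$ and the geometric "tail" $C_3 \sum_{j=n_T}^{n-1} e^{-\sigma j T}$ in closed form and compares their sum with $n^{-1/3}$.

```python
import math


def clt_coupling_bound(n, sigma=1.0, T=1.0, C3=1.0):
    """Return (n_T, head, tail) for the two-part bound on (4.8).

    head = 2 * n^{-1/2} * n_T            (first n_T terms, each <= 2 n^{-1/2})
    tail = C3 * sum_{j=n_T}^{n-1} e^{-sigma*j*T}   (remaining terms)
    """
    n_T = math.ceil(math.log(n) / (sigma * T))
    head = 2.0 * n_T / math.sqrt(n)
    # Closed form of the geometric sum; exp of a large negative number underflows to 0.0.
    q = math.exp(-sigma * T)
    tail = C3 * (math.exp(-sigma * T * n_T) - math.exp(-sigma * T * n)) / (1.0 - q)
    return n_T, head, tail


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        n_T, head, tail = clt_coupling_bound(n)
        print(n, n_T, head + tail, (head + tail) / n ** (-1.0 / 3.0))
```

With these sample parameters the ratio of the bound to $n^{-1/3}$ decreases as n grows, which is consistent with the head term being of order $n^{-1/2}\log n$ and the tail being of order $n^{-1}$.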
Thus, for each h as above, (4.7) is bounded by $C_5 n^{-1/3}$. Hence,
\[ \left\| \mathcal{D}\left( \frac{f(U(0)) + \cdots + f(U((n-1)T))}{\sqrt{n}} \right) - \mathcal{D}\left( \frac{f(u(0)) + \cdots + f(u((n-1)T))}{\sqrt{n}} \right) \right\|_L^* = O(n^{-1/3}) , \tag{4.10} \]
where now $\|\cdot\|_L^*$ stands for the Lipschitz-dual norm for measures on the real line. Since convergence in this norm is equivalent to the weak convergence of measures [5], (4.5) and (4.10) imply the CLT for the process f(u(t)):

Theorem 4.2. If (2.1) holds, µ is the unique stationary measure for (1.2) and (1.5) and $u = u(t; u_0)$ is any solution, then for any functional f ∈ O such that $\int f\,d\mu = 0$, we have the convergence
\[ \mathcal{D}\left( \frac{f(u(0)) + \cdots + f(u((n-1)T))}{\sqrt{n}} \right) \rightharpoonup \hat\sigma N(0, 1) \quad \text{as } n \to \infty , \tag{4.11} \]
where $\hat\sigma$ is defined in (4.6).

If the boundary conditions are periodic and (2.9) holds for some l ≥ 1, then f can be taken from the space $O_{l-1}$. In particular, if l = 3, then we can take $f(u) = u_i(x)$, or $f(u) = u_i(x) u_j(y)$, where x and y are fixed points of the space-domain and i, j ∈ {1, 2}. The same result is true for solutions of the kicked equation (1.2), (1.4) and (2.1) (in this case we choose T to be equal to the interval between the kicks). The proof remains the same.

As an example, let us consider the NS equation under periodic boundary conditions, perturbed by the space-stationary force $\partial_t \zeta$, where the process ζ is defined in (2.14). Assuming that (2.9) holds with l = 3 (i.e. that $\sum b_s^2 |s|^6 < \infty$) and using (2.15) we get that
\[ \mathcal{D}\left( \frac{u(0, x) + \cdots + u((n-1)T, x)}{\sqrt{n}} \right) \rightharpoonup \hat\sigma N(0, 1) \quad \text{as } n \to \infty . \tag{4.12} \]
The dispersion $\hat\sigma^2$ is x-independent since it is defined in terms of the process U and the latter is stationary in time and in space, see Sec. 2.3. We claim (giving no proof) that the integral version of (4.12) is also true:
\[ \mathcal{D}\left( T^{-1/2} \int_0^T u(s, x)\,ds \right) \rightharpoonup \sigma' N(0, 1) \quad \text{as } T \to \infty . \tag{4.13} \]
This convergence is related to the following physical effect. Let us imagine that we are trying to measure the velocity u of the fluid at a point x, using some device. Then what we really measure will be the averaged quantity $\mathrm{const} \cdot \int_0^T u(s, x)\,ds$. If our device is unsophisticated, then $T \gg 1$ and due to (4.13) we shall get the false impression that u(t, x) is a Gaussian random variable. The normal distribution of the velocity u was “observed” in a number of experiments in the first half of the last
century (some of them are discussed in [1, Sec. 8.1]). We believe that the arguments above explain this phenomenon.

Acknowledgments

Most of my results, reviewed in Sec. 2 of this work, were obtained in collaboration with Armen Shirikyan, and I thank Armen for related discussions. This research was supported by EPSRC, grant GR/N63055/01.

References

[1] G. K. Batchelor, The Theory of Homogeneous Turbulence, Cambridge University Press, Cambridge, 1982.
[2] J. Bricmont, A. Kupiainen and R. Lefevere, Exponential mixing for the 2D stochastic Navier–Stokes dynamics, preprint (2000).
[3] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Mathematics, Vol. 470, Springer-Verlag, Berlin, 1975.
[4] P. Constantin and C. Foias, Navier–Stokes Equations, University of Chicago Press, Chicago, 1988.
[5] R. M. Dudley, Real Analysis and Probability, Wadsworth & Brooks/Cole, Pacific Grove, California, 1989.
[6] R. Durrett, Probability: Theory and Examples, Wadsworth & Brooks/Cole, Pacific Grove, California, 1991.
[7] J.-P. Eckmann and M. Hairer, Uniqueness of the invariant measure for a stochastic PDE driven by degenerate noise, Comm. Math. Phys. 219 (2001) 523–565.
[8] W. E, K. Khanin, A. Mazel and Ya. G. Sinai, Invariant measures for Burgers equation with stochastic forcing, Ann. of Math. 151 (2000) 877–960.
[9] W. E and J. C. Mattingly, Ergodicity for the Navier–Stokes equation with degenerate random forcing: finite dimensional approximation, Comm. Pure Appl. Math. 54 (2001) 1386–1402.
[10] W. E, J. C. Mattingly and Ya. G. Sinai, Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation, Comm. Math. Phys. 224 (2001) 83–106.
[11] F. Flandoli and B. Maslowski, Ergodicity of the 2D Navier–Stokes equation under random perturbations, Comm. Math. Phys. 172 (1995) 119–141.
[12] G. Gallavotti, Foundations of Fluid Dynamics, Springer-Verlag, Berlin, 2001.
[13] M. I. Gordin, The central limit theorem for stationary processes, Soviet Math. Doklady 10 (1969) 1174–1176.
[14] M. Hairer, Exponential mixing properties of stochastic PDE’s through asymptotic coupling, preprint (2001).
[15] R. Iturriaga and K. Khanin, Burgers turbulence and random Lagrangian systems, preprint (2001).
[16] S. B. Kuksin, A. Piatnitski and A. Shirikyan, A coupling approach to randomly forced nonlinear PDE’s. II, to appear in Comm. Math. Phys.
[17] S. B. Kuksin and A. Shirikyan, Stochastic dissipative PDE’s and Gibbs measures, Comm. Math. Phys. 213 (2000) 291–330.
[18] S. B. Kuksin and A. Shirikyan, A coupling approach to randomly forced nonlinear PDE’s. I, Comm. Math. Phys. 221 (2001) 351–366.
[19] S. B. Kuksin and A. Shirikyan, Ergodicity for the randomly forced 2D Navier–Stokes equations, Math. Phys. Anal. Geom. 4 (2001) 147–195.
[20] S. B. Kuksin and A. Shirikyan, Coupling approach to white-forced nonlinear PDE’s, J. Math. Pures Appl. 81 (2002) 567–602.
[21] S. B. Kuksin and A. Shirikyan, On dissipative systems perturbed by bounded random kick-forces, to appear in Ergodic Theory Dynam. Systems.
[22] S. B. Kuksin and A. Shirikyan, Some limiting properties of randomly forced 2D Navier–Stokes equations, preprint (2002).
[23] S. B. Kuksin, Spectral properties of solutions for nonlinear PDEs in the turbulent regime, Geom. Funct. Anal. 9 (1999) 141–184.
[24] S. B. Kuksin, On exponential convergence to a stationary measure for nonlinear PDEs, perturbed by random kick-forces, and the turbulence-limit, The M. I. Vishik Moscow PDE Seminar, Amer. Math. Soc. Transl., Amer. Math. Soc., 2002.
[25] L. D. Landau and E. M. Lifshitz, Fluid Mechanics, second ed., Pergamon Press, Oxford, 1987.
[26] J. Mattingly, Exponential convergence for the stochastically forced Navier–Stokes equations and other partially dissipative dynamics, preprint (2001).
[27] N. Masmoudi and L.-S. Young, Ergodic theory of infinite dimensional systems with applications to dissipative parabolic PDE’s, preprint (2001).
[28] G. Da Prato and J. Zabczyk, Ergodicity for Infinite Dimensional Systems, Cambridge University Press, Cambridge, 1996.
[29] B. Simon, Functional Integration and Quantum Physics, Academic Press, New York, 1979.
[30] R. Temam, Navier–Stokes Equations, North-Holland, Amsterdam, 1979.
[31] M. I. Vishik and A. V. Fursikov, Mathematical Problems in Statistical Hydromechanics, Kluwer, Dordrecht, 1988.
July 18, 2002 10:21 WSPC/148-RMP
00131
Reviews in Mathematical Physics, Vol. 14, No. 6 (2002) 601–621 © World Scientific Publishing Company
QUANTUM MOMENT MAPS AND INVARIANTS FOR G-INVARIANT STAR PRODUCTS
KENTARO HAMACHI
Laboratoire Gevrey de Mathématique Physique, Université de Bourgogne, BP 47870, F-21078 Dijon Cedex, France
[email protected]

Received 11 November 1999
Revised 21 March 2002

We study a quantum moment map and propose an invariant for G-invariant star products on a G-transitive symplectic manifold. We start by describing a new method to construct a quantum moment map for G-invariant star products of Fedosov type. We use it to obtain an invariant that is invariant under G-equivalence. In the last section we give two simple examples of such invariants, which involve non-classical terms and provide new insights into the classification of G-invariant star products.

Keywords: Deformation quantization; invariant star product; quantum moment map.
1. Introduction

In classical mechanics, observables are smooth functions on a phase space, which constitute a Poisson algebra, while in quantum mechanics observables form a noncommutative associative algebra. Deformation quantization, introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer [4] in the 1970s, is one of the important attempts to establish a correspondence principle between these two mechanics. A classical phase space is usually a symplectic manifold M and the set of observables of classical mechanics is $N = C^\infty(M)$. A deformation quantization, or more precisely a quantization based on a star product, introduces a noncommutative associative multiplication ∗ on N[[λ]], the space of formal power series with coefficients in N.

In symplectic geometry, the notion of Hamiltonian G-spaces, and in particular of moment maps, plays an important role [13, 10]. There is a quantum analogue of a moment map [14]. Under some suitable conditions, a quantum moment map can be defined for a G-invariant star product as a homomorphism from Gutt’s star product [8] to a G-invariant star product on N[[λ]]. Fedosov showed that a special quantum moment map plays an important role in formulating the quantum reduction as an analogue of a symplectic reduction [7].

The classification of G-invariant star products, one of the important problems, is described in terms of G-invariant differential maps T. This problem is completely
represented by the G-invariant de Rham cohomology [1]. The set of equivalence classes of star products is parametrized by a sequence of elements in the G-invariant second de Rham cohomology of M.

A quantum moment map has a close relationship to a G-equivalence map. An equivalence map T mapping a quantum moment map associated with a G-invariant star product to another one associated with another G-invariant star product can be shown to be G-invariant, as shown later. If a star product enjoys the uniqueness property with respect to the quantum moment maps associated with it, any G-equivalence maps a quantum moment map to another quantum moment map.

An interesting problem about a star product is how to define a ‘quantum number’. When we try to do so, some difficulty arises. The most serious obstacle is that higher terms in λ of an element of N[[λ]] have no direct meaning, since they are easily changed by an equivalence map T as in Definition 1.2. So we should define a quantum number in such a way that it is independent of the choice of equivalent star products. In this paper, we define a quantity for G-invariant star products which is invariant under the G-equivalence relation. It is defined in a simple way by using quantum moment maps. We give two examples of this invariant, for the cases of $\mathbb{R}^2$ and of $S^2$, which involve non-classical terms. It is not clear whether the invariant defined in this paper fully characterizes a G-invariant star product. Nor do we know whether there is a relation between this constant and the G-invariant de Rham cohomology.

We arrange the paper as follows. First we recall the Fedosov quantization, which is one of the most important tools for computing examples. In Sec. 2, we give Gutt’s star product. This star product is a deformation of the Poisson algebra of the dual of a Lie algebra, and it plays the role of the ‘universal algebra’ of G-invariant star products. In Sec. 3, a quantum moment map is studied. This section contains the definition of a quantum moment map and describes its properties. An explicit form of a quantum moment map for a given G-invariant star product has not been given yet. We provide here an equation for a quantum moment map for a given star product of Fedosov type. Using a quantum moment map, we define a new invariant for a G-invariant star product. This invariant is the main object of this paper. In the last section, we carry out computations of the invariant for two cases. One is the simple symplectic manifold $\mathbb{R}^2$. The other is $S^2$, which is the SO(3) coadjoint orbit in $so(3)^*$. These examples exhibit non-classical terms, which are invariant under G-equivalences.

1.1. Deformation of symplectic manifolds and equivalences

Let (M, ω) be a symplectic manifold and $N = C^\infty(M)$ be the set of smooth functions on M. The Poisson bracket on N associated with ω is denoted by {·, ·}. Let N[[λ]] be the space of formal power series in a formal parameter λ with coefficients in N.
Definition 1.1. A star product is defined as an associative multiplication ∗ on N[[λ]] of the form
\[ f * g = \sum_{n=0}^{\infty} \left( \frac{\lambda}{2} \right)^n C_n(f, g) , \quad \text{for any } f, g \in N , \]
such that (1) $C_0(f, g) = fg$ and $C_1(f, g) - C_1(g, f) = 2\{f, g\}$; (2) $C_k(1, f) = C_k(f, 1) = 0$ for k ≥ 1; (3) each $C_k$ is a bidifferential operator.

In the situation that a Lie group G acts on M, a star product ∗ is said to be G-invariant if
\[ g(u * v) = gu * gv \tag{1} \]
holds for any u, v ∈ N[[λ]] and g ∈ G. For any symplectic manifold (M, ω) there exists a star product [3, 12, 5].

Definition 1.2. Two star products $*_1$ and $*_2$ defined on N[[λ]] are said to be formally equivalent if there is a formal series
\[ T = \mathrm{Id} + \sum_{n=1}^{\infty} \lambda^n T_n \tag{2} \]
of differential operators on $C^\infty(M)$ annihilating constants such that
\[ f *_2 g = T\big( T^{-1} f *_1 T^{-1} g \big) . \]
The formal operator T is called an equivalence between $*_1$ and $*_2$. In this situation $*_2$ is denoted by $*_1^T$. Two G-invariant star products $*_1$ and $*_2$ are G-equivalent if they are equivalent and the equivalence T between $*_1$ and $*_2$ is G-invariant. In this case T is called a G-equivalence.

The classification of star products on a symplectic manifold is expressed in terms of the de Rham cohomology as follows.

Theorem 1.1 ([11, 2]). The set of equivalence classes of star products on (M, ω) is canonically parametrized by sequences of elements belonging to the second de Rham cohomology of the de Rham complex on M.

In the case of G-invariant star products, the following theorem holds.

Theorem 1.2 ([1]). Assume that there is a G-invariant symplectic connection on M. The set of G-equivalence classes of G-invariant star products on (M, ω) is canonically parametrized by sequences of elements belonging to the second de Rham cohomology of the G-invariant de Rham complex on M.
1.2. Example of star product: Moyal–Weyl product

One of the most important star products is the Moyal product [4]. This is a star product on the symplectic vector space $\mathbb{R}^{2n}$ defined as follows:
\[ u *_\lambda v = \sum_{k=0}^{\infty} \left( \frac{\lambda}{2} \right)^k \frac{1}{k!}\, \omega^{i_1 j_1} \cdots \omega^{i_k j_k}\, \frac{\partial^k u}{\partial y^{i_1} \cdots \partial y^{i_k}}\, \frac{\partial^k v}{\partial y^{j_1} \cdots \partial y^{j_k}} , \quad \text{for any } u, v \in C^\infty(\mathbb{R}^{2n})[[\lambda]] , \tag{3} \]
where $y^1, \ldots, y^{2n}$ are linear coordinates on $\mathbb{R}^{2n}$, $\omega^{ij} = \{y^i, y^j\}$, and {·, ·} is the canonical Poisson bracket of $\mathbb{R}^{2n}$. It is simple to see that this definition is independent of the choice of linear coordinates.

1.3. Fedosov quantization

In the case of a general symplectic manifold, there is a simple construction of a star product, which is called Fedosov quantization. In this section, we recall some basic facts about the Fedosov quantization on a symplectic manifold, as well as some useful notation. For details, see [5].

Let (M, ω) be a symplectic manifold of dimension 2n. Then, for each point x ∈ M, $T_x M$ is equipped with a linear symplectic structure. Recall that the Moyal star product always exists on the symplectic vector space $T_x M$.

Definition 1.3. A formal Weyl algebra $W_x$ associated with $T_x M$ is an associative algebra with a unit over C defined as follows: each element of $W_x$ is a formal power series in λ with coefficients being formal polynomials on $T_x M$, that is, each element has the form
\[ a(y, \lambda) = \sum_{k, \alpha} \lambda^k a_{k,\alpha}\, y^\alpha , \]
where $y = (y^1, \ldots, y^{2n})$ are linear coordinates on $T_x M$, $\alpha = (\alpha_1, \ldots, \alpha_{2n})$ is a multi-index and $y^\alpha = (y^1)^{\alpha_1} \cdots (y^{2n})^{\alpha_{2n}}$. The product is defined by the Moyal–Weyl rule (3).

Let $W = \bigcup_{x \in M} W_x$. Then W is a bundle of algebras over M, called the Weyl bundle over M. Each section of W has the form
\[ a(x, y, \lambda) = \sum_{k, \alpha} \lambda^k a_{k,\alpha}(x)\, y^\alpha , \tag{4} \]
where x ∈ M. We call a(x, y, λ) smooth if each coefficient $a_{k,\alpha}(x)$ is smooth in x. We denote the set of smooth sections by ΓW. It constitutes an associative algebra with unit under the fibrewise multiplication. A differential q-form with values in W is a smooth section of the bundle $W \otimes \Lambda^q T^*M$. For short, we denote the space of smooth sections of this bundle by $\Gamma W \otimes \Lambda^q$. The space $\Gamma W \otimes \Lambda^q$ forms an associative algebra under the multiplication of the tensor product algebra.
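The Moyal–Weyl rule (3), which is also the fibrewise product of the Weyl algebra, can be tried out on polynomials. The following sketch is an illustration under explicit assumptions, not part of the construction above: it takes n = 1 with coordinates $(y^1, y^2) = (q, p)$ and $\omega^{12} = -\omega^{21} = 1$, represents a polynomial in q, p, λ as a dictionary of exact coefficients, and all function names are hypothetical. For polynomials the series (3) terminates, so the product is computed exactly.

```python
from fractions import Fraction
from math import comb, factorial

# A polynomial in (q, p, lam) is a dict {(a, b, k): coeff} meaning coeff * q^a * p^b * lam^k.


def _clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal as dicts."""
    return {key: c for key, c in poly.items() if c != 0}


def add(u, v, sign=1):
    """Return u + sign * v."""
    out = dict(u)
    for key, c in v.items():
        out[key] = out.get(key, 0) + sign * c
    return _clean(out)


def deriv(u, nq, n_p):
    """Apply (d/dq)^nq (d/dp)^n_p to u."""
    out = {}
    for (a, b, k), c in u.items():
        if a >= nq and b >= n_p:
            falling = (factorial(a) // factorial(a - nq)) * (factorial(b) // factorial(b - n_p))
            key = (a - nq, b - n_p, k)
            out[key] = out.get(key, 0) + c * falling
    return out


def times(u, v, scale=1, lam_power=0):
    """Return scale * lam^lam_power * u * v (ordinary commutative product)."""
    out = {}
    for (a1, b1, k1), c1 in u.items():
        for (a2, b2, k2), c2 in v.items():
            key = (a1 + a2, b1 + b2, k1 + k2 + lam_power)
            out[key] = out.get(key, 0) + scale * c1 * c2
    return _clean(out)


def moyal(u, v):
    """Moyal product (3) for n = 1; the k-th term uses the binomial expansion of
    (d_q x d_p - d_p x d_q)^k acting on (u, v)."""
    def degree(w):
        return max((a + b for (a, b, _) in w), default=0)

    result = {}
    for k in range(min(degree(u), degree(v)) + 1):
        for m in range(k + 1):
            coeff = Fraction((-1) ** m * comb(k, m), 2 ** k * factorial(k))
            term = times(deriv(u, k - m, m), deriv(v, m, k - m), coeff, k)
            result = add(result, term)
    return result


def commutator(u, v):
    """Star commutator u * v - v * u."""
    return add(moyal(u, v), moyal(v, u), sign=-1)


q = {(1, 0, 0): 1}
p = {(0, 1, 0): 1}
```

In this representation `commutator(q, p)` is the polynomial λ, in agreement with condition (1) of Definition 1.1 since $\{q, p\} = 1$ under the assumed sign convention, and associativity can be checked on sample polynomials.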
Let ∇ be a torsion-free symplectic connection on M and $\partial : \Gamma W \to \Gamma W \otimes \Lambda^1$ be its induced covariant derivative. Consider a connection on W of the form
\[ Da = -\delta a + \partial a - \frac{1}{\lambda} [\gamma, a] , \quad \text{for } a \in \Gamma W , \tag{5} \]
with $\gamma \in \Gamma W \otimes \Lambda^1$, where
\[ \delta a = dx^k \wedge \frac{\partial a}{\partial y^k} . \]
Clearly, D is a derivation with respect to the Moyal–Weyl product. A simple computation shows that
\[ D^2 a = \frac{1}{\lambda} [\Omega, a] , \quad \text{for any } a \in \Gamma W , \]
where
\[ \Omega = \omega - R + \delta\gamma - \partial\gamma + \frac{1}{\lambda} \gamma^2 . \]
Here $R = \frac{1}{4} R_{ijkl}\, y^i y^j\, dx^k \wedge dx^l$ and $R_{ijkl} = \omega_{im} R^m_{jkl}$ is the curvature tensor of the symplectic connection. A connection of the form (5) is called Abelian if Ω is a scalar 2-form, that is, $\Omega \in \Lambda^2[[\lambda]]$. We call D a Fedosov connection if it is Abelian and deg γ ≥ 3. For an Abelian connection, the Bianchi identity implies that dΩ = DΩ = 0, that is, Ω is closed. In this case, we call Ω the Weyl curvature.

Theorem 1.3 ([5]). Let ∇ be any torsion-free symplectic connection, and $\Omega = \omega + \lambda\omega_1 + \cdots \in Z^2(M)[[\lambda]]$ a perturbation of the symplectic form ω. There exists a unique $\gamma \in \Gamma W \otimes \Lambda^1$ such that D given by Eq. (5) is a Fedosov connection, which has Weyl curvature Ω and satisfies $\delta^{-1}\gamma = 0$.

The above theorem indicates that a Fedosov connection is uniquely determined by a torsion-free symplectic connection ∇ and a Weyl curvature $\Omega = \omega + \lambda\omega_1 + \cdots \in Z^2(M)[[\lambda]]$. For this reason, we will say that the connection D defined above is the Fedosov connection corresponding to the pair (∇, Ω). We denote by $W_D$ the set of smooth flat sections, that is, sections a ∈ ΓW with Da = 0. The space $W_D$ is a subalgebra of ΓW. Let σ denote the projection from $W_D$ to N[[λ]] defined by $\sigma(a) = a|_{y=0}$.

Theorem 1.4 ([5]). Let D be an Abelian connection. Then, for any $a_0(x, \lambda) \in N[[\lambda]]$ there exists a unique section $a \in W_D$ such that $\sigma(a) = a_0$. Therefore, σ establishes an isomorphism between $W_D$ and N[[λ]] as C[[λ]]-vector spaces.

We denote the inverse map of σ by $Q_D$ and call it a quantization procedure. The Weyl product ∗ on $W_D$ is translated to N[[λ]], yielding a star product $*_D$. Namely, we set for a, b ∈ N[[λ]]
\[ a *_D b = \sigma\big( Q_D(a) * Q_D(b) \big) . \]
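The explicit formula of the quantization procedure is given by
\[ Q_D(a_0) = a_0 + \partial_i a_0\, y^i + \frac{1}{2} \partial_i \partial_j a_0\, y^i y^j + \frac{1}{6} \partial_i \partial_j \partial_k a_0\, y^i y^j y^k - \frac{1}{24} R_{ijkl}\, \omega^{lm} \partial_m a_0\, y^i y^j y^k + \cdots . \tag{6} \]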
For G-invariant star products, there is a simple criterion as follows.

Proposition 1.1 ([6, 14]). Let ∇ be a G-invariant connection, Ω be a G-invariant Weyl curvature and D be the Fedosov connection corresponding to (∇, Ω). Then the star product corresponding to D is G-invariant.

In the previous proposition the G-invariant star product whose Weyl curvature is given by ω is called the canonical G-invariant star product.

1.4. Deformation of Lie algebras and Gutt’s star product

Let g be a finite dimensional Lie algebra and g∗ be its dual. The space g∗ has a Poisson structure, called the linear Poisson structure, that is defined for $u, v \in C^\infty(g^*)[[\lambda]]$ by
\[ \{u, v\} = C_{ij}^k\, x_k\, \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} , \]
where $\{x_i\}$ is a basis of g and $C_{ij}^k$ are the structure constants of g with respect to $\{x_i\}$. The Poisson algebra $(C^\infty(g^*)[[\lambda]], \{\cdot, \cdot\})$ has a canonical star product called Gutt’s star product [8]. This star product is defined as follows. Let g[[λ]] be the space of formal power series in λ with coefficients in g. We define a Lie algebra structure $[\ ,\ ]_\lambda$ on g[[λ]] by
\[ [\xi, \eta]_\lambda = \lambda [\xi, \eta] \quad \text{for any } \xi, \eta \in g \]
and extend it λ-linearly, where [ , ] means the Lie bracket of g. We denote this Lie algebra by $g_\lambda$. Let $U(g_\lambda)$ be the universal enveloping algebra of $g_\lambda$. As a vector space, $U(g_\lambda)$ is canonically isomorphic to pol(g∗)[[λ]], the space of formal power series in λ with coefficients being polynomials on g∗. The isomorphism is established by symmetrization. Therefore, the algebra structure on $U(g_\lambda)$ induces a star product on pol(g∗)[[λ]], which gives rise to a deformation quantization of the Lie–Poisson structure on g∗.

2. G-invariant Star Products and Quantum Moment Maps

Now we consider a quantum moment map for a G-invariant star product, one of the main subjects of this paper. A quantum moment map is a quantum analogue of a moment map. Fedosov defines a quantum moment map to prove the quantum reduction theorem [7]. But
we adopt here the definition of a quantum moment map from [14], since this definition contains that of Fedosov. While existence and uniqueness can be easily verified under suitable conditions, it is not easy to present a quantum moment map by an explicit formula. Hence we present a new method of computing quantum moment maps for any star product of Fedosov type and discuss the relation between G-equivalences and quantum moment maps in this section. On the basis of these considerations, we propose here a new invariant for G-invariant star products and show that it remains unchanged under G-equivalence, because of which this invariant should be expected to play an important role in the classification of G-invariant star products. We present a few examples, which show that this invariant provides non-trivial results arising from quantum effects.

2.1. The definition and basic properties of quantum moment maps

Let (M, ω) be a Hamiltonian G-space and Φ be a moment map [13]. From now on, we assume that any star product is G-invariant. Then, the corresponding infinitesimal action ξ defines a Lie algebra homomorphism from g to the Lie algebra of derivations Der(N[[λ]], ∗) with respect to ∗.

Definition 2.1. A quantum moment map is a homomorphism of associative algebras
\[ \Phi_* : U(g_\lambda) \to N[[\lambda]] \tag{7} \]
satisfying
\[ [\Phi_*(X), u]_* = \lambda X u , \tag{8} \]
where the right hand side of (8) means the infinitesimal action of X ∈ g on N[[λ]].

It is easy to see that the condition (7) is equivalent to
\[ \Phi_*([X, Y]_\lambda) = [\Phi_*(X), \Phi_*(Y)]_* \quad \text{for any } X, Y \in g . \tag{9} \]
Note that a quantum moment map usually depends on the choice of a star product. As mentioned above, the algebra $U(g_\lambda)$ can be identified with Gutt’s star product on pol(g∗)[[λ]].

Proposition 2.1 ([14]). Let $\Phi_* : \mathrm{pol}(g^*)[[\lambda]] \to N[[\lambda]]$ be a quantum moment map. Then M is a Hamiltonian G-space. Moreover $\Phi_*$ satisfies
\[ \Phi_*(f) = \Phi_0(f) + O(\lambda) , \quad \text{for any } f \in \mathrm{pol}(g^*) , \]
where $\Phi_0 : C^\infty(g^*) \to C^\infty(M)$ denotes the corresponding classical moment map.

On the existence and the uniqueness of quantum moment maps, some simple criteria are known as follows.

Theorem 2.1 ([14]). Let $H^*_{dR}(M)$ be the de Rham cohomology group and $H^*(g, \mathbb{R})$ be the Lie algebra cohomology group with coefficients in R. There exists a quantum moment map if $H^1_{dR}(M) = 0$ and $H^2(g, \mathbb{R}) = 0$.
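As an illustration of how the criterion of Theorem 2.1 can be checked, consider the sphere $S^2$ with the rotation action of SO(3), one of the examples announced in the Introduction. The following is a sketch resting on two standard facts that are not taken from the text above and are stated here as assumptions: $S^2$ is simply connected, and so(3) is semisimple, so that Whitehead's lemmas apply.

```latex
\[
  H^1_{dR}(S^2) = 0 , \qquad
  H^1(so(3), \mathbb{R}) = \big( so(3) / [so(3), so(3)] \big)^* = 0 , \qquad
  H^2(so(3), \mathbb{R}) = 0 .
\]
```

Under these assumptions the hypotheses of Theorem 2.1 are satisfied, so a quantum moment map exists for a G-invariant star product on $S^2$. The vanishing of $H^1(so(3), \mathbb{R})$ is the quantity that enters the uniqueness statement of Theorem 2.2.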
Theorem 2.2 ([14]). The set of quantum moment maps of a G-invariant star product is parametrized by $H^1(g, \mathbb{R})$.

2.2. A local formula of quantum moment maps

Let f be a diffeomorphism of M. Then f acts on a section $a \in C^\infty(W \otimes \Lambda)$ by pull back:
\[ (f^* a)(x, y, dx, \lambda) = a\Big( f(x), \frac{\partial f}{\partial x} y, df(x), \lambda \Big) . \]
If f is a symplectomorphism, $f^*$ is an automorphism of the algebra $C^\infty(W \otimes \Lambda)$. Thus, a Hamiltonian vector field X defines a derivation on $C^\infty(W \otimes \Lambda)$,
\[ L_X a = \frac{d}{dt} f_t^* a \Big|_{t=0} , \]
called the Lie derivative, where $f_t$ is the Hamiltonian flow generated by X. One can show that there is a section $A(X) \in C^\infty(W)$ such that
\[ L_X a = (d\,i(X) + i(X)\,d) a + \frac{1}{\lambda} [A(X), a] . \tag{10} \]
For instance, A(X) is given as follows:
\[ A(X) = \omega_{ik} \left( \frac{d}{dt} \Big( \frac{\partial f_t}{\partial x} \Big) \Big( \frac{\partial f_t}{\partial x} \Big)^{-1} \right)^k_{\ j} y^i y^j , \]
where $\omega_{ik}$ are the coefficients of the symplectic form ω and $f_t$ is the Hamiltonian flow generated by X.

The following two lemmas play important roles in determining the local form of any quantum moment map.

Lemma 2.1. Let D be a Fedosov connection whose Weyl curvature is Ω and Q be the quantization procedure corresponding to D. Assume that there exists H(X) ∈ N[[λ]] for any X ∈ g such that
\[ L_X a = (i(X) D + D\,i(X)) a + \frac{1}{\lambda} [Q(H(X)), a] \tag{11} \]
holds for any section $a \in C^\infty(M, W \otimes \Lambda)$. Then, for any Abelian connection of the form $D_1 = D + \frac{1}{\lambda}[\Delta\gamma, \cdot]$ with G-invariant $\Delta\gamma \in \Gamma W_3 \otimes \Lambda^1$ which has the same Weyl curvature as D, Eq. (11) holds with D and Q replaced by $D_1$ and $Q_1$, respectively, where $Q_1$ is the quantization procedure corresponding to $D_1$.

Proof. Since the addition of $[\Delta\gamma, \cdot]/\lambda$ to D on the right hand side of (11) is cancelled by that of $-i(X)\Delta\gamma$ to Q(H(X)), we have
\[ L_X a = (i(X) D_1 + D_1 i(X)) a + \frac{1}{\lambda} [Q(H(X)) - i(X)\Delta\gamma, a] . \]
It remains to show that Q(H(X)) − i(X)∆γ is equal to Q1 (H(X)). Since (Q(H(X)) − i(X)∆γ)|y=0 = H(X) , it is sufficient to show that Q(H(X)) − i(X)∆γ is flat with respect to D1 . D1 (Q(H(X)) − i(X)∆γ) = D(Q(H(X)) − i(X)∆γ) +
1 [∆γ, Q(H(X)) − i(X)∆γ] λ
= D(Q(H(X)) − i(X)∆γ) +
1 1 [∆γ, Q(H(X))] + [i(X)∆γ, ∆γ] λ λ
1 = i(X) D∆γ + ∆γ 2 . λ
(12)
Since the Weyl curvature of D1 equals to Ω, we obtain D∆γ +
1 ∆γ 2 = 0 . λ
Lemma 2.2. Under the conditions in Lemma 2.1 if [Q(H(X)), Q(H(Y ))] = λQ(H([X, Y ]))
(13)
holds for any X, Y ∈ g, then Eq. (13) holds with Q1 replaced by Q defined in Lemma 2.1. Proof. As we have seen in the proof of Lemma 2.1, the equation Q1 (H(X)) = Q(H(X)) − i(X)∆γ holds. By Lemma 2.1, we have [Q1 (H(X)), Q1 (H(Y ))] = [Q1 (H(X)), Q(H(Y )) + i(Y )∆γ] = λLX Q(H(Y )) + λLX i(Y )∆γ = [Q(H(X)), Q(H(Y ))] + λ(i(Y )LX + i([X, Y ]))∆γ = λ(Q(H([X, Y ])) + i([X, Y ])∆γ) = λQ1 (H([X, Y ])) .
The star products defined below play an important role in computing the local form of a quantum moment map.

Definition 2.2. Let U be a neighborhood in a symplectic manifold with Darboux coordinates, and Ω a perturbation of the symplectic form on U. A Fedosov connection
D corresponding to (∇, Ω) is called a semi-Moyal connection whose Weyl curvature is Ω if ∇ is the exterior differential on U, and the star product corresponding to D is called the semi-Moyal product on U. The following proposition is the key to computing the local form of a quantum moment map.

Proposition 2.2. Let D be the Fedosov connection corresponding to a symplectic connection ∇ and a Weyl curvature Ω, and let ∗ be the star product corresponding to D. Take a local chart U of M and let D1 be the semi-Moyal connection on U whose Weyl curvature is Ω. If Φ∗ is a quantum moment map of ∗, then Φ∗ satisfies
\[ L_X a = (i(X)D_1 + D_1 i(X))a + \frac{1}{\lambda}[Q_1(\Phi_*(X)), a] , \tag{14} \]
\[ \lambda Q_1(\Phi_*([X, Y])) = [Q_1(\Phi_*(X)), Q_1(\Phi_*(Y))] , \tag{15} \]
for any X, Y ∈ g and a ∈ C∞(W ⊗ Λ).

Proof. First note that Eq. (11) holds with Φ∗ replacing H. In fact, if we write Da = da + [γ, a]/λ and use Eq. (10), then (11) is equivalent to
\[ [\gamma(X) + Q(\Phi_*(X)) + A(X), a] = 0 \quad \text{for any } a \in C^\infty(W \otimes \Lambda) . \tag{16} \]
By the definition of Φ∗, Eq. (11) holds for any flat section a ∈ W_D. Hence (16) holds for any section a, since a section which commutes with every flat section is central (see Corollary 5.5.2 in [6]). Applying Lemmas 2.1 and 2.2 with H replaced by Φ∗, we obtain the proposition.

This proposition means that the computation of the local form of a quantum moment map for a Fedosov star product reduces to that of a quantum moment map for the semi-Moyal product whose Weyl curvature is the same as the Weyl curvature of the Fedosov star product under consideration. The following theorem, proved by Fedosov ([5]), is obtained from the previous proposition in the case of the canonical G-invariant star product, that is, when the Weyl curvature is Ω = ω.

Theorem 2.3. Assume ∗ is a canonical G-invariant star product. Then Eqs. (11) and (13) are valid if H is a classical moment map.

We now give a method for computing the local form of a quantum moment map for any G-invariant Fedosov star product. In the case of a canonical G-invariant star product, the above theorem provides a quantum moment map. In other cases, the computation is divided into two parts. First, we give an explicit formula for the semi-Moyal quantization. Second, we give a formula for a quantum moment map of a semi-Moyal product.
2.3. An explicit form of semi-Moyal products and their quantum moment maps

As we saw in the previous subsection, an explicit formula for semi-Moyal products is needed to compute a quantum moment map. Using the Fedosov quantization method, we have the following formula. Let U be a Darboux neighborhood, ΓW_U the Weyl algebra bundle on U, and Ω a perturbation of the symplectic form ω, that is, Ω = ω + λω1 + λ²ω2 + ···, where each ωi is a closed two-form on U. Using the Fedosov method (Theorem 1.3), we obtain the semi-Moyal connection whose Weyl curvature is Ω as follows:
\[ Da = -\delta a + da + \frac{1}{\lambda}[\gamma, a] \quad \text{for any } a \in \Gamma W_U , \tag{17} \]
where
\[ \gamma = \tilde\Omega_{ij} y^i dx^j + \frac{1}{3}\partial_i \tilde\Omega_{jk} y^i y^j dx^k + \omega^{ik}\tilde\Omega_{ij}\tilde\Omega_{kl} y^j dx^l + \cdots , \tag{18} \]
\[ \tilde\Omega = \Omega - \omega . \tag{19} \]
Then, by Theorem 1.4, the semi-Moyal quantization of u ∈ C∞(U) is given by
\begin{align*}
Q(u) = u &+ (\partial_i u + \omega^{kj}\tilde\Omega_{ki}\partial_j u + \omega^{mj}\omega^{kl}\tilde\Omega_{km}\tilde\Omega_{lj}\partial_j u + \cdots) y^i \\
&+ \omega^{ij}\tilde\Omega_{il}\partial_j\partial_m u \, y^l y^m + \frac{1}{6}\omega^{ij}\partial_k\tilde\Omega_{im}\partial_j u \, y^k y^m \\
&+ \omega^{ij}\omega^{kl}\tilde\Omega_{ki}\tilde\Omega_{ln}\partial_j\partial_o u \, y^n y^o + \cdots . \tag{20}
\end{align*}
For later use, we give all terms of the semi-Moyal quantization (20) that are linear in the y^i:
\[ (I + \mu + \mu^2 + \cdots + \mu^k + \cdots)^j_i \, \partial_j u \, y^i , \tag{21} \]
where
\[ \mu^i_j = -\omega^{ik}\gamma^{(1)}_{kj} , \tag{22} \]
and γ^(1) denotes the terms of γ that are linear in the y^i.

Next, we give a differential equation that determines a quantum moment map of a semi-Moyal product. In the special case of the Moyal product, the following fact holds.

Lemma 2.3 ([7]). Let X be a vector field on U and H a generating function of X, that is, Xf = {H, f}. For any section a ∈ ΓW_U,
\[ L_X a = (i(X)D_M + D_M i(X))a + \frac{1}{\lambda}[Q_M(H), a] \tag{23} \]
holds, where D_M and Q_M are the Moyal connection and the Moyal quantization on U, respectively.

Proof. This is a direct verification.

For general semi-Moyal products, the following lemma is important in determining a quantum moment map.

Lemma 2.4. Let D be the semi-Moyal connection whose Weyl curvature is Ω = ω + λω1 + ···, Q the quantization procedure with respect to D, and H a generating function of a vector field X on U. If H̄ ∈ C∞(U)[[λ]] satisfies
\[ L_X a = (i(X)D + D\,i(X))a + \frac{1}{\lambda}[Q(H + \bar H), a] \quad \text{for any } a \in \Gamma W_U , \tag{24} \]
then
\[ \partial_i \bar H = (-2\mu + \mu^2)^j_i \, \partial_j H \tag{25} \]
holds, where µ^j_i is given by (22).

Proof. Substituting Da = D_M a + [γ, a]/λ into (24), we have
\begin{align*}
L_X a = {} & (i(X)D_M + D_M i(X))a + \frac{1}{\lambda}[Q_M(H), a] \\
& + \frac{1}{\lambda}[i(X)\gamma + (Q - Q_M)(H) + Q(\bar H), a] .
\end{align*}
Lemma 2.3 reduces this equation to
\[ [i(X)\gamma + (Q - Q_M)(H) + Q(\bar H), a] = 0 . \]
Since this equation holds for any a ∈ ΓW_U, we have
\[ Q(\bar H) = -i(X)\gamma - (Q - Q_M)(H) \tag{26} \]
up to central elements, that is, functions in C∞(U)[[λ]]. Equating the terms of Eq. (26) that are linear in the y^i and using (20), we have
\[ (1 + \mu + \mu^2 + \cdots + \mu^k + \cdots)^j_i \, \partial_j \bar H \, y^i = -(2\mu + \mu^2 + \cdots + \mu^k + \cdots)^j_i \, \partial_j H \, y^i . \]
Multiplying the above equation by (µ − 1), we have (25).

Let ∗ be a G-invariant Fedosov star product whose Weyl curvature is Ω, and let Φ∗ be a quantum moment map of ∗. Then Proposition 2.2 and Lemma 2.4 imply
\[ \partial_i(\Phi_*(X) - \Phi(X)) = (-2\mu + \mu^2)^j_i \, \partial_j \Phi(X) , \]
where Φ is the classical moment map. The above equation determines Φ∗ up to constants, that is, up to elements of C[[λ]].
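The last step in the proof of Lemma 2.4 is a short manipulation of geometric series. The sketch below assumes that µ can be treated as a formal matrix series in λ with no λ⁰ term, so that (1 − µ) is invertible. It multiplies by (1 − µ) rather than (µ − 1), which only changes the sign of both sides.

```latex
% Sketch of the final step in the proof of Lemma 2.4.
% Assumption: \mu is a formal matrix series in \lambda without constant
% term, so the geometric series below make sense and (1 - \mu) commutes
% with every power of \mu.
\begin{align*}
1 + \mu + \mu^2 + \cdots &= (1 - \mu)^{-1} , \\
2\mu + \mu^2 + \mu^3 + \cdots &= \mu + \mu\,(1 - \mu)^{-1} .
\end{align*}
% Multiplying the identity for the linear terms by (1 - \mu) gives
\begin{align*}
\partial_i \bar H
  &= -\bigl[ (1 - \mu)\bigl( \mu + \mu\,(1 - \mu)^{-1} \bigr) \bigr]^j_i \, \partial_j H \\
  &= -\bigl[ (\mu - \mu^2) + \mu \bigr]^j_i \, \partial_j H
   = (-2\mu + \mu^2)^j_i \, \partial_j H ,
\end{align*}
% which is Eq. (25).
```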
To fix the constant terms of a quantum moment map, we use Eq. (15). We can fix the constants completely if H¹(g, R) = 0.

2.4. G-equivalences, quantum moment maps and invariants

In this subsection, we give a relation between G-equivalence and quantum moment maps.

Proposition 2.3. Let ∗ and ∗′ be two G-invariant star products on N[[λ]], and let Φ∗ and Φ∗′ be the corresponding quantum moment maps. Assume that there exists an equivalence map T between ∗ and ∗′ such that
T Φ∗(X) = Φ∗′(X) for any X ∈ g.
Then T is G-invariant, so ∗ and ∗′ are formally G-equivalent.

Proof. It is enough to show that TXf = XTf holds for any f ∈ N, which can be seen from
λTXf = T([Φ∗(X), f]∗) = [TΦ∗(X), Tf]∗′ = [Φ∗′(X), Tf]∗′ = λXTf.

Proposition 2.4. Assume ∗ is a G-invariant star product and Φ∗ is its quantum moment map. If a formal differential map T = Id + Σ λ^n T_n on N[[λ]] is G-invariant, then ∗T is also a G-invariant star product and TΦ∗ is a quantum moment map with respect to ∗T.

Proof. It is easy to see that ∗T is a G-invariant star product. Set Ψ = TΦ∗; then Ψ is an algebra homomorphism between Gutt’s star product and (N[[λ]], ∗T). So it is enough to check condition (9):
[Ψ(X), f]∗T = [TΦ∗(X), f]∗T = T([Φ∗(X), T⁻¹f]∗) = T(λXT⁻¹f) = λXf.

Corollary 2.1. Assume H¹(g, R) = 0, that is, there is a unique quantum moment map for each star product, if one exists. Let ∗ and ∗′ be G-invariant star products and Φ∗ and Φ∗′ the corresponding quantum moment maps. If T is a G-equivalence map between ∗ and ∗′, then Φ∗′ = TΦ∗.

So we have shown that a G-equivalence maps a quantum moment map of one G-invariant star product to a quantum moment map of the corresponding equivalent product, and vice versa.
The following proposition determines the commutant of quantum moment maps.

Proposition 2.5. Let Φ∗ be a quantum moment map with respect to a star product ∗. If f ∈ N[[λ]] satisfies
[Φ∗(X), f]∗ = 0 for any X ∈ g, (27)
then f is a G-invariant function.

Proof. The equation λXf = [Φ∗(X), f]∗ = 0 means that f is a G-invariant function on M.

Corollary 2.2. Assume M is a G-transitive space. Then condition (27) implies that f is constant.

Proposition 2.6. Assume M is a G-transitive space. Let Z be the center of Gutt’s star product, ∗ a G-invariant star product, and Φ∗ a quantum moment map of ∗. Then for any element l in Z, Φ∗(l) is constant, that is, there exists an element c∗(l) ∈ C[[λ]] such that Φ∗(l) = c∗(l).

Proof. The equality [Φ∗(X), Φ∗(l)]∗ = Φ∗([X, l]) = Φ∗(0) = 0 implies, by Proposition 2.5, that Φ∗(l) is constant.

Proposition 2.6 leads to the following definition.

Definition 2.3. Let M be a G-transitive symplectic manifold and ∗ a G-invariant star product which has a quantum moment map Φ∗. Define a map c∗ by
c∗ : Z → C[[λ]], c∗(l) := Φ∗(l) for any l ∈ Z.
Then c∗ is an algebra morphism, because Z is a subalgebra of Gutt’s star product and Φ∗ is an algebra morphism. The map c∗ has the following property.

Proposition 2.7. Let ∗ and ∗′ be G-invariant star products and Φ∗ and Φ∗′ the corresponding quantum moment maps. If ker Φ∗ = ker Φ∗′, then Φ∗(l) = Φ∗′(l) holds for any l ∈ Z.

Proof. Let c = Φ∗(l) ∈ C[[λ]]. Then l − c ∈ ker Φ∗. This means l − c ∈ ker Φ∗′. So Φ∗′(l − c) = 0, that is, Φ∗′(l) = c.
The following theorem says that the map c∗ depends only on the class of G-invariant star products.

Theorem 2.4. Let ∗ and ∗′ be G-invariant star products and Φ∗ and Φ∗′ the corresponding quantum moment maps. If ∗ is G-equivalent to ∗′, then c∗ is equal to c∗′.

Proof. Let T, a G-invariant differential map, be the G-equivalence between ∗ and ∗′. By Corollary 2.1 it satisfies Φ∗′ = TΦ∗, so ker Φ∗ = ker Φ∗′. This implies c∗ = c∗′ by Proposition 2.7.

3. Examples of c∗

In this section, we present two examples of c∗. The first is the Moyal product on R², on which SL(2) acts. The second is the G-invariant star product on S², the coadjoint orbit of G = SO(3).

3.1. Moyal product on R²

Let R² be the symplectic vector space with coordinates (x, p) and Poisson bracket given by {x, p} = 1. The group SL(2) acts on R² by linear symplectomorphisms. Let {E, F, H} be a basis of sl(2) with commutation relations
[E, F] = H, [H, E] = 2E, [H, F] = −2F.
The Casimir element is given by Z = EF + (1/2)H² + FE. The Moyal product is the canonical SL(2)-invariant star product, so the quantum moment map corresponding to the Moyal product is given by the classical moment map. The classical moment map Φ is given by
\[ \Phi(E) = \frac{1}{2}x^2 , \qquad \Phi(F) = -\frac{1}{2}p^2 , \qquad \Phi(H) = -xp . \]
So c∗(Z) is given by
\begin{align*}
\Phi_*(Z) &= \Phi_*(E) * \Phi_*(F) + \frac{1}{2}\Phi_*(H) * \Phi_*(H) + \Phi_*(F) * \Phi_*(E) \\
&= \Phi(E) * \Phi(F) + \frac{1}{2}\Phi(H) * \Phi(H) + \Phi(F) * \Phi(E) . \tag{28}
\end{align*}
All terms of (28) vanish except the λ² term. A simple computation gives
\[ \Phi_*(Z) = \frac{1}{2}\left(\frac{\lambda}{2}\right)^2 \left( -1 + \frac{1}{2}(-2) - 1 \right) = -\frac{3}{2}\left(\frac{\lambda}{2}\right)^2 . \]
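The "simple computation" for the Moyal Casimir can be checked term by term. The sketch below assumes the expansion f ∗ g = fg + (λ/2){f, g} + (1/2)(λ/2)² B₂(f, g) + ···, which has the same shape as the order-λ² expansion written out in Sec. 3.2.2. The label B₂ is introduced here only for illustration. For quadratic polynomials the series stops at order λ², so the result is exact under this assumption.

```latex
% Assumed expansion (B_2 is a label used only in this sketch):
%   f * g = fg + (\lambda/2)\{f,g\} + (1/2)(\lambda/2)^2 B_2(f,g) + \cdots ,
%   \{f,g\}   = f_x g_p - f_p g_x ,
%   B_2(f,g)  = f_{xx} g_{pp} - 2 f_{xp} g_{xp} + f_{pp} g_{xx} .
% Inputs: \Phi(E) = x^2/2, \Phi(F) = -p^2/2, \Phi(H) = -xp.
\begin{align*}
\Phi(E) * \Phi(F) &= -\tfrac14 x^2 p^2 - \tfrac{\lambda}{2}\,xp
                     + \tfrac12\bigl(\tfrac{\lambda}{2}\bigr)^2 (-1) , \\
\Phi(F) * \Phi(E) &= -\tfrac14 x^2 p^2 + \tfrac{\lambda}{2}\,xp
                     + \tfrac12\bigl(\tfrac{\lambda}{2}\bigr)^2 (-1) , \\
\tfrac12\,\Phi(H) * \Phi(H) &= \tfrac12 x^2 p^2
                     + \tfrac12 \cdot \tfrac12\bigl(\tfrac{\lambda}{2}\bigr)^2 (-2) .
\end{align*}
% The order-0 terms cancel (-1/4 - 1/4 + 1/2 = 0), the order-\lambda
% terms cancel pairwise, and the order-\lambda^2 terms add up to
\[
\Phi_*(Z) = \tfrac12\bigl(\tfrac{\lambda}{2}\bigr)^2 \bigl( -1 + \tfrac12(-2) - 1 \bigr)
          = -\tfrac32\bigl(\tfrac{\lambda}{2}\bigr)^2
          = -\tfrac38\,\lambda^2 .
\]
```

The result is a constant, as Proposition 2.6 requires for the image of a central element.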
3.2. SO(3)-invariant star product on S²

In this subsection, we give an example of c∗ for the SO(3)-invariant star product on S², the coadjoint orbit of SO(3), up to order λ². We should note that the G-invariant de Rham cohomology space of S² is R. So the SO(3)-equivalence classes of SO(3)-invariant star products on S² are parametrized by H²_dR(S², R)[[λ]]. We compute here c∗ for the canonical invariant star product and for the star product of Fedosov type whose Weyl curvature is Ω = ω + λω.

3.2.1. Canonical SO(3)-invariant star product on S²

To use formula (6), we need an SO(3)-invariant connection on S². To this end, the following results are fundamental (see [9]). Let M = K/H be a homogeneous space, where K is a connected Lie group and H is a closed subgroup of K. The coset H is called the origin of M and will be denoted by o. The group K acts transitively on M in a natural manner. The linear isotropy representation is by definition the homomorphism of H into the group of linear transformations of T_o(M) which assigns to each h ∈ H the differential of h at o. Let n be the dimension of M and G a Lie subgroup of GL(n; R). We recall that a G-structure on M is a principal subbundle P of the linear frame bundle L(M) with structure group G ⊂ GL(n; R). Unless otherwise stated, we assume throughout this section that P is a G-structure on M invariant under K, i.e. K acts on P as an automorphism group. We also fix a linear frame u_o ∈ P at o throughout. If we identify T_o(M) with Rⁿ by the linear isomorphism u_o : Rⁿ → T_o(M), then the linear isotropy representation of H may be identified with the homomorphism ρ : H → G defined by
\[ \rho(h) = u_o^{-1} \circ h_* \circ u_o \quad \text{for } h \in H , \tag{29} \]
where h∗ : T_o(M) → T_o(M) denotes the differential of h at o. We say that a homogeneous space K/H is reductive if the Lie algebra k of K may be decomposed into a vector space direct sum of the Lie algebra h of H and an Ad(H)-invariant subspace m, that is, if
(1) k = h + m, h ∩ m = 0;
(2) Ad(H)m ⊂ m.
Condition (2) implies
(3) ad(h)m ⊂ m,
and, conversely, if H is connected, then (3) implies (2).

Theorem 3.1. Let P be a K-invariant G-structure on a reductive homogeneous space M = K/H with decomposition k = h + m. Then there is a one-to-one
correspondence between the set of K-invariant connections in P and the set of linear mappings Λm : m → g such that
\[ \Lambda_m(\mathrm{Ad}(h)(X)) = \mathrm{Ad}(\rho(h))(\Lambda_m(X)) \quad \text{for } X \in m \text{ and } h \in H , \tag{30} \]
where ρ denotes both the linear isotropy representation H → G and the Lie algebra homomorphism h → g induced from it, Ad(h) denotes the adjoint representation of H in k and Ad(ρ(h)) denotes the adjoint representation of G in g. To a K-invariant connection in P with connection form ω there corresponds the linear mapping defined by
\[ \Lambda(X) = \omega_{u_o}(\hat X) \quad \text{for } X \in k , \]
where X̂ denotes the natural lift to P of a vector field X ∈ k of M and Λ is defined by
\[ \Lambda(X) = \begin{cases} \rho(X) & \text{if } X \in h , \\ \Lambda_m(X) & \text{if } X \in m . \end{cases} \]
We shall now express the one-to-one correspondence in Theorem 3.1 in terms of covariant differentiation. If ∇ is the covariant differentiation with respect to the affine connection on M and if X is a vector field on M, then the tensor field A_X of type (1, 1) on M is defined by
\[ A_X = L_X - \nabla_X . \tag{31} \]

Corollary 3.1. The one-to-one correspondence in Theorem 3.1 is also given by
\[ u_o \circ (\Lambda_m(X)) \circ u_o^{-1} = -(A_X)_o \quad \text{for } X \in m . \tag{32} \]

Next, we collect some useful facts about coadjoint orbits of SO(3). We identify so(3)∗ with R³ by choosing a basis of so(3). Let α be a point of so(3)∗. The coadjoint orbit through the point α is nothing but the sphere of radius r = ‖α‖, denoted by S_r². Let dA be the area element on the sphere S_r². Then the coadjoint symplectic structure is given by the 2-form
\[ \omega = \frac{1}{r}\, dA . \tag{33} \]
We give local canonical coordinates around o = (r, 0, 0) ∈ R³. The spherical coordinates given by
\[ x = r\cos\varphi\sin\theta , \qquad y = r\sin\varphi\sin\theta , \qquad z = r\cos\theta \tag{34} \]
constitute local coordinates in a neighborhood of o. In these coordinates the symplectic form ω can be written as
ω = r sin θ dθ ∧ dϕ.
If we define θ̃ = −r cos θ, we have
ω = dθ̃ ∧ dϕ,
and (θ̃, ϕ) are canonical coordinates.

Let σx, σy, σz ∈ so(3) be the generators of rotations around the x, y and z axes, respectively. Note that σx generates the isotropy group at o = (r, 0, 0). Let h be the Lie subalgebra of so(3) generated by σx and m the linear subspace of so(3) generated by σy and σz. It is easy to show that so(3) = h + m is the unique reductive decomposition. Let P be the principal Sp(2)-subbundle of L(S_r²), that is, the bundle of symplectic frames. Using the canonical coordinates (θ̃, ϕ), we set
\[ u_o = \left( o, \frac{\partial}{\partial\tilde\theta}, \frac{\partial}{\partial\varphi} \right) \in P . \]
Let Tx, Ty and Tz be the fundamental vector fields on S_r² corresponding, respectively, to σx, σy and σz. In the canonical coordinates (θ̃, ϕ) we have
\begin{align*}
T_x &= -r\sin\theta\sin\varphi \frac{\partial}{\partial\tilde\theta} - \cos\varphi \frac{\cos\theta}{\sin\theta} \frac{\partial}{\partial\varphi} , \\
T_y &= -r\sin\theta\cos\varphi \frac{\partial}{\partial\tilde\theta} - \sin\varphi \frac{\cos\theta}{\sin\theta} \frac{\partial}{\partial\varphi} , \tag{35} \\
T_z &= \frac{\partial}{\partial\varphi} .
\end{align*}
We give here an SO(3)-invariant connection on S². In the present case, the linear isotropy representation (29) is nothing but the Jacobi matrix of the differential of h ∈ G at o in the canonical coordinates (θ̃, ϕ), so we easily obtain
\[ \rho(\sigma_x) = \begin{pmatrix} 0 & -\frac{1}{r} \\ r & 0 \end{pmatrix} . \tag{36} \]
Note that this ρ means the induced Lie algebra homomorphism.

Lemma 3.1. There is a unique SO(3)-invariant symplectic connection, given by
\[ \Lambda_m(m) = 0 , \tag{37} \]
where m is the linear subspace generated by σy and σz.

Proof. It is easy to see that (37) defines an invariant connection. On the other hand, if Λm is a linear mapping which satisfies the conditions in Theorem 3.1, then
[ρ(σx), [ρ(σx), Λm(σy)]] = −Λm(σy),
which implies Λm(σy) = 0, and similarly for σz.

Lemma 3.2. Let Γ be the coefficients of the invariant connection corresponding to Λm. Then
\[ \Gamma^k_{ij}(o) = 0 \tag{38} \]
with respect to the coordinates (θ̃, ϕ).
Proof. Let T_X, T_Y, T_Z be the fundamental vector fields with respect to the action of SO(3). For X and Y in k,
\[ \nabla_{T_X} T_Y = -\Lambda(X)Y + [X, Y] \tag{39} \]
holds by (31) and (32). One obtains
∇_{T_Z} T_Y|_o = −Λ(Z)Y + [T_Z, T_Y]|_o = −T_X|_o = 0, ∇_{T_Z} T_Z|_o = 0, ∇_{T_Y} T_Y|_o = 0,
and the lemma follows by direct computation.

So we have a unique SO(3)-invariant connection on S_r², which is given by the usual partial differential with respect to the canonical coordinates (θ̃, ϕ) at o. One can show that if there is an invariant torsion-free connection, there exists an invariant torsion-free symplectic connection. So the connection we have constructed is symplectic.

Now we can compute c∗ for the canonical SO(3)-invariant star product. According to Theorem 2.3, the quantum moment map is given by the classical moment map. Let Z = σx² + σy² + σz² be the Casimir operator; the center of U(so(3)) is generated by Z. Our purpose is to compute Φ∗(Z) up to order λ². Since Φ∗ is a homomorphism, we have
\begin{align*}
\Phi_*(Z) &= \Phi_*(\sigma_x) * \Phi_*(\sigma_x) + \Phi_*(\sigma_y) * \Phi_*(\sigma_y) + \Phi_*(\sigma_z) * \Phi_*(\sigma_z) \\
&= \Phi(\sigma_x) * \Phi(\sigma_x) + \Phi(\sigma_y) * \Phi(\sigma_y) + \Phi(\sigma_z) * \Phi(\sigma_z) . \tag{40}
\end{align*}
Since Φ∗(Z) is a constant function and the star product is local, we may concentrate on a reference point o ∈ S². The computation requires the values of covariant derivatives of functions at o. The classical moment map is given by
Φ(σx) = r sin θ cos ϕ, Φ(σy) = r sin θ sin ϕ, Φ(σz) = r cos θ.
A simple computation gives
\[ \partial_{\tilde\theta}\partial_{\tilde\theta}\Phi(\sigma_x)|_o = -\frac{1}{r} , \tag{41} \]
\[ \partial_\varphi\partial_\varphi\Phi(\sigma_x)|_o = -r , \tag{42} \]
and other combinations are 0. Substituting these values into (40) and using formula (6), we obtain
\[ c_*(Z) = \Phi_*(Z) = r^2 + \frac{1}{2}\left(\frac{\lambda}{2}\right)^2 \cdot 2 + \cdots = r^2 + \frac{1}{4}\lambda^2 + \cdots . \tag{43} \]
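The values (41) and (42) and the coefficient in (43) can be reproduced by hand. The sketch below makes two assumptions that are not restated in this section: that the order-λ² term of formula (6) has the form (1/2)(λ/2)² B₂(u, v) with B₂ built from second covariant derivatives, and that, by Lemma 3.2, these reduce to partial derivatives in (θ̃, ϕ) at o. The label B₂ is used only here.

```latex
% Assumptions of this sketch:
%   \tilde\theta = -r\cos\theta, hence
%   \partial_{\tilde\theta} = (r\sin\theta)^{-1}\,\partial_\theta ;
%   the reference point o is \theta = \pi/2, \varphi = 0 ;
%   B_2(u,u) = 2\,( u_{\tilde\theta\tilde\theta}\, u_{\varphi\varphi}
%                   - u_{\tilde\theta\varphi}^2 )  at o.
\begin{align*}
\partial_{\tilde\theta}\Phi(\sigma_x) &= \cot\theta\,\cos\varphi , \\
\partial_{\tilde\theta}^2\Phi(\sigma_x) &= -\frac{\cos\varphi}{r\sin^3\theta}
   \quad\Longrightarrow\quad
   \partial_{\tilde\theta}^2\Phi(\sigma_x)|_o = -\frac{1}{r} , \\
\partial_\varphi^2\Phi(\sigma_x) &= -r\sin\theta\,\cos\varphi
   \quad\Longrightarrow\quad
   \partial_\varphi^2\Phi(\sigma_x)|_o = -r , \\
\partial_{\tilde\theta}\partial_\varphi\Phi(\sigma_x) &= -\cot\theta\,\sin\varphi
   \quad\Longrightarrow\quad 0 \text{ at } o .
\end{align*}
% Hence
\[
B_2(\Phi(\sigma_x), \Phi(\sigma_x))|_o
  = 2\Bigl( \bigl(-\tfrac{1}{r}\bigr)(-r) - 0 \Bigr) = 2 .
\]
% For \Phi(\sigma_y) = r\sin\theta\sin\varphi all second derivatives vanish
% at o, and \Phi(\sigma_z) = r\cos\theta = -\tilde\theta is linear in
% \tilde\theta. The order-\lambda terms vanish because \{u,u\} = 0, and the
% order-0 terms sum to r^2. Therefore
\[
c_*(Z) = r^2 + \tfrac12\bigl(\tfrac{\lambda}{2}\bigr)^2 \cdot 2 + \cdots
       = r^2 + \tfrac14\lambda^2 + \cdots .
\]
```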
The λ² term in (43) is non-classical, and any G-equivalence preserves this value.

3.2.2. SO(3)-invariant star product whose Weyl curvature is ω + λω

Let Ω = ω + λω. Since ω is a generator of H²_dR(S², R), the star product of Fedosov type whose Weyl curvature is Ω gives another SO(3)-invariant star product which is not SO(3)-equivalent to the canonical one; we denote this star product by ∗Ω. Let D0 be the semi-Moyal connection whose Weyl curvature is Ω. A simple computation gives
\[ D_0 a = -\delta a + da + \frac{1}{\lambda}[\gamma, a] \quad \text{for any } a \in \Gamma W \otimes \Lambda , \]
where γ = (λ − λ2 + · · ·)ωij y i dxj . Let Φ∗Ω be a quantum moment map corresponding to ∗Ω . If we denote Φ∗Ω = Φ + λΦ1 + λ2 Φ2 + · · · , due to Lemma 2.4 we obtain ∂i (λΦ1 + λ2 Φ2 + · · ·) = (2λ − λ2 + · · ·)∂i Φ . Then we have Φ∗Ω = Φ + 2λΦ − λ2 Φ + · · · ,
(44)
up to constants. Because of Eq. (15), a quantum moment map Φ∗Ω is exactly given by (44). We can easily show that up to λ2 terms, ∗Ω is given by 2 1 λ λ (ω i1 j1 ω i2 j2 ∂i1 ∂i2 u∂j1 ∂j2 v − 2{u, v}) + · · · u ∗Ω v = uv + {u, v} + 2 2 2 = u ∗ v + λ2 {u, v} + · · · . So we have c ∗Ω
= Φ∗Ω(Z) = r² + 4r²λ + (2r² + 1/4)λ² + ··· . (45)
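The coefficients in (45) follow from (43) and (44) by elementary series arithmetic. The sketch below takes both equations as given and assumes, as the expansion of ∗Ω in the text indicates, that ∗Ω and the canonical product ∗ differ at order λ² only by a multiple of the Poisson bracket, which vanishes on each pair (u, u).

```latex
% Sketch: arithmetic behind (45), taking (43) and (44) from the text.
% Assumption: u *_\Omega u = u * u up to order \lambda^2, because the
% difference is proportional to \{u,u\} = 0.
\begin{align*}
\Phi_{*_\Omega}(\sigma) &= (1 + 2\lambda - \lambda^2 + \cdots)\,\Phi(\sigma) , \\
\sum_{\sigma \in \{\sigma_x, \sigma_y, \sigma_z\}}
  \Phi(\sigma) *_\Omega \Phi(\sigma) &= r^2 + \tfrac14\lambda^2 + \cdots , \\
(1 + 2\lambda - \lambda^2)^2 &= 1 + 4\lambda + 2\lambda^2 + O(\lambda^3) , \\
c_{*_\Omega}(Z) &= (1 + 4\lambda + 2\lambda^2)\bigl( r^2 + \tfrac14\lambda^2 \bigr) + \cdots \\
  &= r^2 + 4r^2\lambda + \bigl( 2r^2 + \tfrac14 \bigr)\lambda^2 + \cdots .
\end{align*}
```

Comparing with (43), the two invariants already differ at order λ, through the term 4r²λ.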
This result gives c∗ ≠ c∗Ω. It seems to us that the classes of G-invariant star products are parametrized by c∗.
We end this section with the following problem:

Problem 3.1. What is the image of the mapping from G-invariant star products to c∗ maps, and is this mapping one-to-one?

Acknowledgments

I would like first of all to thank Izumi Ojima for much advice and for his insistence that I complete and write up this study, and Giuseppe Dito for many fruitful discussions on the subject. Thanks are also due to Daniel Sternheimer for helpful advice and to the referee for his patience and for important suggestions on the presentation of the results. This research was supported by the Research Fellowships of the Japan Society for the Promotion of Science for Young Scientists.

References

[1] M. Bertelson, P. Bieliavsky and S. Gutt, “Parametrizing equivalence classes of invariant star products”, Lett. Math. Phys. 46 (1998) 339–345.
[2] M. Bertelson, M. Cahen and S. Gutt, “Equivalence of star products”, Class. Quantum Gravity 14 (1997) A93–A107.
[3] M. De Wilde and P. Lecomte, “Existence of star-products and of formal deformations of Poisson Lie algebra of arbitrary symplectic manifolds”, Lett. Math. Phys. 7 (1983) 487–496.
[4] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, “Deformation theory and quantization, I and II”, Ann. Phys. 111 (1977) 61–151.
[5] B. Fedosov, “A simple geometrical construction of deformation quantization”, J. Differential Geom. 40 (1994) 213–238.
[6] B. Fedosov, “Deformation quantization and index theory”, in Mathematical Topics 9, Akademie Verlag, 1996.
[7] B. Fedosov, “Non-abelian reduction in deformation quantization”, Lett. Math. Phys. 43 (1998) 137–154.
[8] S. Gutt, “An explicit ∗-product on the cotangent bundle to a Lie group”, Lett. Math. Phys. 7 (1983) 249–258.
[9] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, John Wiley & Sons, 1969.
[10] J. Marsden and T. Ratiu, Introduction to Mechanics and Symmetry, Springer-Verlag, 1994.
[11] R. Nest and B. Tsygan, “Algebraic index theorem for families”, Adv. Math. 113 (1995) 151–205.
[12] H. Omori, Y. Maeda and A. Yoshioka, “Weyl manifolds and deformation quantization”, Adv. Math. 85 (1991) 224–255.
[13] R. Abraham and J. Marsden, Foundations of Mechanics, Addison-Wesley, 1985.
[14] P. Xu, “Fedosov ∗-products and quantum momentum maps”, Comm. Math. Phys. 197 (1998) 167–197.
August 21, 2002 15:9 WSPC/148-RMP
00147
EDITORIAL
This special issue of Reviews in Mathematical Physics is dedicated to Huzihiro Araki who, as founding editor, shaped this journal, thus establishing its place in our community. Colleagues, friends and former students, in contributing to this issue, express their appreciation, sympathy and gratitude to a man who, through his scientific work and organizational talents, was instrumental in ensuring that Mathematical Physics became an acknowledged part of modern science. Huzihiro Araki, born in 1932, has published more than 160 scientific papers, primarily on topics in quantum field theory, quantum statistical mechanics and the theory of operator algebras. This impressive oeuvre bears witness to his enormous technical skill and utmost mathematical precision. These enabled him to do truly pioneering work and clarify conceptual issues of fundamental importance. The impact of his scientific contributions to these fields is described by R. Haag and M. Takesaki in two subsequent essays. In addition to his outstanding scientific contributions, Araki demonstrated his talent as an efficient organizer of international conferences, as editor and coeditor of several journals and as head of numerous international committees and associations. In particular, he played a leading role in formulating the constitution of the International Association of Mathematical Physics and was three times its president or vice president. He chaired the Mathematical Physics Commission of the International Union of Pure and Applied Physics and held positions of responsibility at the International Mathematical Union. His scientific authority and continuous efforts were vital in bringing about the present general recognition of Mathematical Physics as a fertile interface between Mathematics and Physics. After retiring from RIMS at Kyoto University and the Science University of Tokyo, Araki has now focussed on his scientific work in quantum statistical mechanics.
We convey here our very best wishes for the successful conclusion of these intriguing projects and for many pleasant and productive years to come. Detlev Buchholz Masaki Izumi Taku Matsui
August 21, 2002 16:32 WSPC/148-RMP
00125
HUZIHIRO ARAKI ON THE OCCASION OF HIS 70TH BIRTHDAY
Some Reminiscences by Rudolf Haag Looking back over forty-five years I see a young student entering my office saying: “I want to talk to you”. This was at Princeton University where Huzihiro, as a graduate student with a scholarship from Japan, had already acquired a high reputation and I spent two years as a visiting professor. He was looking for a thesis problem. I felt honored but somewhat insecure because I had never acted as thesis advisor before and had no permanent position anywhere. Still, it worked out. Within two years Huzihiro produced two interesting papers about quite different topics of which one was accepted as a Ph.D. thesis by Princeton University after my departure. Arthur Wightman wrote me that I was responsible for the judgment of the scientific content whilst he would see his duty as translator from Japanese English to American style. This episode marked the beginning of a long period of collaboration, discussions and shared experiences in many locations of the world. Here is not the place to talk about the many tokens of friendship I received from Huzihiro, of his absolute reliability both in scientific and personal matters, of his carefulness in organizing things taking every detail into account or of the indulgence with which he accompanied many of my whims during my visits to Japan. Just one episode may illustrate the style of discussions. One day he came, rather disgusted, and told me: “If you do not pose a problem correctly I have no fun solving it”. Over the past forty years Huzihiro Araki produced an enormous body of published work in a variety of areas. Let me try to give a brief summary. 
Axiomatic Quantum Field Theory

In 9 papers between 1960 and 1962 Araki treats the following questions: analyticity of Green’s functions in momentum space, cluster properties of these functions, inequivalent representations of the commutation relations in field theory appropriate to specific dynamical laws, the spin-statistics relation and arbitrariness of spacelike commutation relations between different fields due to superselection rules. The papers culminate in his lecture notes at the ETH Zürich in 1962, where he gives a comprehensive survey of the whole field including many original techniques and results. These lecture notes constitute one of the principal authoritative presentations of axiomatic quantum field theory.
Local Quantum Physics (the Algebraic Approach)

In this approach the theory is described in terms of nets of algebras generated by local observables (instead of quantum fields). The contributions of Araki to this frame have been absolutely essential. In several papers 1963–64 he analyzed free field theories from this point of view; one important result here was the very difficult proof of the duality relation between the algebra of a space-time region and that of its causal complement. Next came his amazing discovery that the von Neumann algebras arising in this context could not be of type I. This result was so stunning at the time that for years eminent mathematicians did not believe it. Then he showed that the relativistic spectrum condition for energy-momentum implied that translation invariance could not be broken in the ground state, i.e. that the vacuum cannot be a crystal. Another seminal work (1967) was a joint paper by Araki and myself on the determination of collision cross sections from vacuum expectation values of observables. This showed that a theory is completely given by the net of algebras of local observables. No further information about the physical interpretation of the algebraic elements is needed. It also gave a new and very natural approach to the study of particle aspects in the theory, a method extended significantly in the recent work by Buchholz on theories with massless particles.

Quantum Statistical Mechanics

In 1963 Araki and Woods derived the representation of canonical commutation relations appropriate to the description of a thermal equilibrium state for a free, infinitely extended Bose gas. This paper had an enormous influence on subsequent work by many authors. From 1967 on Araki applied C∗-algebra techniques to many questions in statistical mechanics.
Among them: equilibrium states of lattice systems, different characterizations of equilibrium, autocorrelation inequalities, entropy inequalities, the concept of relative entropy, chemical potential, local thermodynamic stability. He worked on specific models as well as on general structural properties and quite a number of his results have made their way into text books of statistical mechanics. Functional Analysis (in Particular of von Neumann Algebras) Since this work in pure mathematics lies somewhat outside my range of competence I shall leave its discussion to Masamichi Takesaki. Foundations of Quantum Mechanics and Measurement Theory From his student days Araki has been interested in the conceptual problems posed by Quantum Theory and the origin of the specific mathematical structure, sometimes called Quantum Logic. Beginning in 1957 we find in intervals of 5 to 10 years contributions to these questions by Araki. Among them is one of the most convincing derivations of the structure of quantum mechanical state space from operational principles (1980).
The contribution of Huzihiro Araki to science is not limited to his published work. He always felt it his duty to offer his services to matters of organization. The high appreciation of his judgment and integrity is put in evidence by the many requests on him to act as organizer for international symposia, to serve on governing boards of various scientific organizations, as editor of journals etc. Tasks which he always accepted willingly and executed with great conscientiousness. The community of mathematical physicists owes him much. About the Mathematical Work of Huzihiro Araki by Masamichi Takesaki It was in the fall of 1963 when I first encountered Huzihiro’s work on the representations of the canonical commutation relations of infinitely many degrees of freedom which struck me strongly and influenced my later study. It was quite a surprise to find this outstanding deep work in the middle of the lowest slump in operator algebras, written by an outsider who was not known among Japanese operator algebraists at that time although I soon realized from a friend in theoretical physics that he was well known among Japanese particle physicists. Later, in the fall of 1964, I received a message from Huzihiro asking me how he should organize a meeting in the field of operator algebras at RIMS in Kyoto. I recommended that he contact Professor Takenouchi. Anyway, Huzihiro organized this first full scale workshop on operator algebras at RIMS in the spring of 1965 in a remarkably short preparation period. After this Huzihiro became very quickly a leader of Japanese operator algebraists and mathematical physicists. His foot steps were visible everywhere in these fields. He is one of the pioneers who established the field of mathematical physics as an active area of mathematics in the sixties and seventies. Mathematical Physics is a hybrid of theoretical physics and mathematics. 
Its development to a distinct area of research resulted from a peculiar situation after the end of World War II when the need for a unification of Quantum Physics and Special Relativity became imperative and mathematics was unable to provide adequate tools for this task. In fact, the separation between mathematics and physics had become wide. The success of mathematics resting on rigorous arguments necessitated to train students more and more independently from physics, its long time ally. This separation also hampered physicists. The wide-spread scorn for mathematical rigor among theoretical physicists led in the middle of the century to a situation in which they had to realize that their formal manipulations of symbols generated mathematical nonsense all over the place. Several theoretical physicists saw the need to bridge the gap and build physical theory on more solid mathematical ground. Arthur Wightman, in his program for an “Axiomatic Quantum Field Theory”, stressed the importance of the theory of distributions and of analytic functions of several complex variables. Rudolf Haag used work on topological algebras by von Neumann, Gelfand and others. But in all these areas sharper tools had to be developed.
Araki was a young energetic physicist at that time who had unbelievable mathematical talent and creativity. In the sixties he addressed and settled difficulties one after another and was thus instrumental in creating this new field of mathematical physics, bridging mathematics and theoretical physics. In reporting about his work I shall confine myself to that part which had a strong impact on pure mathematics, leaving it to Rudolf Haag to discuss his contribution to other areas. In the early sixties Araki studied a lattice of von Neumann algebras associated with the Fock representation of canonical commutation relations and proved a remarkable duality theorem. In a joint paper with E. J. Woods on the equilibrium state of a free Bose gas the authors exhibited a beautiful symmetry between a von Neumann algebra and its commutant, far ahead of the Tomita–Takesaki theory. A real milestone was Araki’s stunning discovery that the von Neumann algebras appearing in relativistic quantum physics were factors of type III. This was shocking to experts in operator algebras because at that time only two non-isomorphic type III factors were known and no tools for a finer analysis were in sight. Araki’s result stimulated a new attack on the type III mystery. In 1967 R. T. Powers showed the existence of uncountably many non-isomorphic type III factors. Araki realized that he could classify factors arising from infinite tensor products of type I factors. Together with his long time collaborator E. J. Woods he introduced two algebraic invariants, called the asymptotic ratio set and the ρ-set, which captured deep structures in the type III phenomenon. It was this work which motivated Alain Connes to introduce his famous invariants S(M), the modular spectrum of a factor M, and T(M), the modular period group. The invariants of Araki–Woods turned out to be special cases of these.
Araki continued this tour de force during his stay at Queen’s University in Kingston, Ontario, from April through July 1972. In those four months he completed 10 articles. Dick Kadison suggested to change the spelling of Araki’s first name to “Who’s the hero”. There was work on the general theory of noncommutative integration based on the newly established Tomita–Takesaki theory. Essential in this work was the study of positive cones in the Hilbert space of the standard representation of a von Neumann algebra which enabled one to think of trace free Lp spaces. I cannot resist the temptation to mention an episode in spring 1972. Araki and myself were both invited to give talks in a conference at Texas Christian University and it turned out that we both presented the same results without knowledge of each other’s work. These results were exciting at that time because they were the first case in which the complete classification of a type III factor could be reduced to the study of von Neumann algebras of type II₁. Jim Woods was quick to realize that this was a golden opportunity to organize an extended workshop for Araki and myself at Queen’s University. We gathered there for six weeks with other prominent people. During this summer and fall a factor of type III finally surrendered at the hands of Alain Connes and myself with big input by Araki. It was the most exciting summer I ever experienced.
By that time the cause of mathematical physics had attracted quite a number of people in many countries who met often informally in various subgroups. The advice and judgment of Araki was highly appreciated by all of them. It was in particular the amazing parallelism between basic notions in theoretical physics and deep mathematical theorems which fascinated Araki. One example was the role of the chemical potential in the description of equilibrium states. Araki recognized here a connection to the Tannaka duality theorem and this led ultimately to a joint paper by Araki, Haag, Kastler and myself with ramifications into several areas of pure mathematics. In later years, though he always kept his active interest in several areas of pure mathematics, the most significant contributions of Araki concerned the precise treatment of questions in quantum statistical mechanics where he often invented novel methods for solving concrete problems. It is rare to have an individual who combines distinct disciplines with such mastership, especially in Japan because Japanese culture prefers to have a group of people working together. But he succeeded also in training young talented students in Kyoto who form now a core of mathematical physicists and operator algebraists in Japan. This impact will reach far into the future.
Rudolf Haag Waldschmidtstraße 4b 83727 Schliersee–Neuhaus Germany
Masamichi Takesaki Department of Mathematics University of California Los Angeles, CA 90095-1555 USA
August 22, 2002 15:18 WSPC/148-RMP
00126
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 631–648 © E. H. Lieb & G. K. Pedersen
CONVEX MULTIVARIABLE TRACE FUNCTIONS∗
ELLIOTT H. LIEB Departments of Mathematics and Physics, Princeton University P.O. Box 708, Princeton NJ 08544-0708, USA
[email protected] GERT K. PEDERSEN Department of Mathematics, University of Copenhagen, Universitetsparken 5 DK-2100 Copenhagen Ø, Denmark
[email protected]
Received 9 July 2001
Revised 17 September 2001
Dedicated to Huzihiro Araki on the occasion of his 70th birthday
For any densely defined, lower semi-continuous trace $\tau$ on a $C^*$-algebra $A$ with mutually commuting $C^*$-subalgebras $A_1, A_2, \ldots, A_n$, and a convex function $f$ of $n$ variables, we give a short proof of the fact that the function $(x_1, x_2, \ldots, x_n) \to \tau(f(x_1, x_2, \ldots, x_n))$ is convex on the space $\bigoplus_{i=1}^n (A_i)_{sa}$. If furthermore the function $f$ is log-convex or root-convex, so is the corresponding trace function. We also introduce a generalization of log-convexity and root-convexity called $\ell$-convexity, show how it applies to traces, and give some examples. In particular we show that the Kadison–Fuglede determinant is concave and that the trace of an operator mean is always dominated by the corresponding mean of the trace values.
Keywords: Operator algebras; trace functions; trace inequalities.
Mathematics Subject Classification 2000: Primary 46L05; Secondary 46L10, 47A60, 46C15
∗ This paper may be reproduced, in its entirety, for non-commercial purposes.
1. Introduction. The fact that several important concepts in operator theory, in quantum statistical mechanics (the entropy, the relative entropy, Gibbs free energy), in engineering and in economics involve the trace of a function of a self-adjoint operator has motivated a considerable amount of abstract research about such functions in the last half century. An important subset of questions involves the convexity of trace functions with respect to their argument. The convexity of the function $x \to \mathrm{Tr}(f(x))$, when $f$ is a convex function of one variable and $x$ is a self-adjoint operator, was known to von Neumann,
cf. [12, V.3. p. 390]. An early proof for $f(x) = \exp(x)$ can be found, e.g. in [19, 2.5.2]. A proof given by the first author some time ago describes the trace $\mathrm{Tr}(f(x))$, where $f$ is convex, as a supremum over all possible choices of orthonormal bases of the Hilbert space of the sum of the values of $f$ at the diagonal elements of the matrix for $x$. This proof was communicated to B. Simon, who used the method to give an alternative proof of the second Berezin–Lieb inequality in [21, Theorem 2.4], see also [22, Lemma II.10.4]. Simon only considers the exponential function, but the argument is valid for any convex function, cf. [10, Proposition 3.1]. The general case for an arbitrary trace on a von Neumann algebra was established by D. Petz in [17, Theorem 4] using the theory of spectral dominance (spectral scale). The basic fact, for one variable $x$ and a positive convex function $f$, is that
\[ \sum_j f((\phi_j, x\phi_j)) \le \mathrm{Tr}(f(x)), \tag{1} \]
where the sum (finite or not) is over any orthonormal basis. Equality is obviously achieved if the basis is the set of eigenvectors of $x$. Thus,
\[ \mathrm{Tr}(f(x)) = \sup_{\{\phi\}} \sum_j f((\phi_j, x\phi_j)). \tag{2} \]
Essentially, the proof is the following: If $\{\phi_j\}$ is the eigenvector basis (with eigenvalues $\lambda_j$) and $\psi_j$ is some other basis, then $\psi_j = \sum_k C_{jk}\phi_k$, and the coefficients of the unitary matrix $C$ satisfy $\sum_j |C_{jk}|^2 = 1 = \sum_k |C_{jk}|^2$. Then $(\psi_j, x\psi_j) = \sum_k |C_{jk}|^2\lambda_k$ and, by Jensen’s inequality, $f(\sum_k |C_{jk}|^2\lambda_k) \le \sum_k |C_{jk}|^2 f(\lambda_k)$. Now, summing on $j$ we obtain (1).
Equation (2) implies the convexity of $x \to \mathrm{Tr}(f(x))$, because any supremum of convex functions is convex. Moreover, if $f(x) = \exp(x)$ then one sees that $\mathrm{Tr}(\exp(x))$ is log-convex (i.e. $\log(\mathrm{Tr}(\exp(x)))$ is convex), because an ordinary sum of the type $\sum \exp(a_j)$, with $a_j$ in $\mathbb{R}$, is log-convex. Similarly, if $f(x) = |x|^p$ (with $p \ge 1$) we see that $x \to (\mathrm{Tr}(|x|^p))^{1/p}$ is convex. In particular, the Schatten $p$-norms are subadditive. Even more is true. If $f(x) = \exp(g(x))$ and $g$ is convex, then $x \to \mathrm{Tr}(f(x))$ is log-convex. Similarly, if $f(x) = |g(x)|^p$ then $x \to (\mathrm{Tr}(f(x)))^{1/p}$ is convex.
A natural question that arises at this point is this: Are there other pairs of functions $e, \ell$ of one real variable, beside the pairs $\exp, \log$ and $|t|^p, |t|^{1/p}$, for which $x \to \ell(\mathrm{Tr}(f(x)))$ is convex whenever $f$ is $\ell$-convex, i.e. $f(x) = e(g(x))$ and $g$ is convex? In the second part of our paper we answer this question completely and give a few examples, which we believe to have some potential value.
But first we turn to the question of generalizing (2) to functions of several variables. We start with a function $f(\lambda)$ of $n$ real variables (with $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)$). Next we replace the real variables $\lambda_j$ by operators $x_j$, similar to the one-variable case. An immediate problem that arises is how to define $f(x)$ in this case. The spectral theorem, which was used in the one-variable case, fails here unless the $x_j$’s commute with each other. Therefore, we restrict the $x_1, x_2, \ldots, x_n$ to lie in commuting subalgebras $A_1, A_2, \ldots, A_n$, and then $f(x)$ and $\mathrm{Tr}(f(x))$ are well defined and it makes sense to discuss the joint convexity of this trace function under the
condition that $f$ is a jointly convex function of its arguments. (We do not investigate the question whether $f(x)$ is operator convex, only the convexity under the trace.) More generally, we replace the trace $\mathrm{Tr}$ in a Hilbert space setting by $\tau$, a densely defined, lower semi-continuous trace on a $C^*$-algebra $A$; i.e. a functional defined on the set $A_+$ of positive elements with values in $[0, \infty]$, such that $\tau(x^*x) = \tau(xx^*)$ for all $x$ in $A$. We further assume that $A$ comes equipped with mutually commuting $C^*$-subalgebras $A_1, A_2, \ldots, A_n$.
The convexity of the function $x \to \tau(f(x))$ on the space of $n$-tuples in $\bigoplus_{i=1}^n (A_i)_{sa}$ was proved by F. Hansen for matrix algebras in [5]. (Here $A_{sa}$ denotes the self-adjoint elements in $A$.) His result was extended to general operator algebras by the second author in [16]. Both these proofs rely on Fréchet differentiability and some rather intricate manipulations with first and second order differentials. We realized that the argument in (2) will work in this multi-variable case as well, thereby providing a quick proof of the convexity. The key observation is that the mutual commutativity of $x_1, x_2, \ldots, x_n$ implies that there is one orthonormal basis (in the case that $A$ is a matrix algebra) that simultaneously makes all the $x_j$ diagonal. For a general $C^*$-algebra (e.g. the algebra of continuous functions on an interval) we have to find something to take the place of an orthonormal basis. This something is just the commutative $C^*$-subalgebra generated by the $n$ commuting elements $x_1, x_2, \ldots, x_n$. It depends on $x$, of course, but that fact is immaterial for computing the trace for a given $x$. The main result in the first part of this paper, proved in Sec. 6, is
2. Theorem (Multivariable Convex Trace Functions). Let $f$ be a continuous convex function defined on a cube $I = I_1 \times \cdots \times I_n$ in $\mathbb{R}^n$. If $A_1, \ldots, A_n$ are mutually commuting $C^*$-subalgebras of a $C^*$-algebra $A$ and $\tau$ is a finite trace on $A$, then the function
\[ (x_1, \ldots, x_n) \to \tau(f(x_1, \ldots, x_n)), \tag{3} \]
defined on commuting $n$-tuples such that $x_i \in (A_i)_{sa}^{I_i}$ for each $i$, is convex on $\bigoplus (A_i)_{sa}^{I_i}$. (Here, $(A_i)_{sa}^{I_i}$ denotes the set of self-adjoint elements in $A_i$ whose spectra are contained in $I_i$.) If $\tau$ is only densely defined, but lower semi-continuous, the result still holds if $f \ge 0$, even though the function may now attain infinite values.
In the second part we explore the natural generalization of the concept of log-convexity mentioned before and explained in detail in Sec. 7. We find a necessary and sufficient condition on a concave function $\ell$ that ensures that $\ell$-convexity of a function $f = e \circ g$, with $e = \ell^{-1}$ and $g$ convex, implies $\ell$-convexity of the function $x \to \tau(f(x))$ for a tracial state $\tau$ on a $C^*$-algebra $A$. (A tracial state is a trace satisfying $\tau(1) = 1$.) The main result there, proved in Sec. 9, is
3. Theorem ($\ell$-Convex Trace Functions). Let $f$ be a continuous function defined on a cube $I = I_1 \times \cdots \times I_n$ in $\mathbb{R}^n$, and assume furthermore that $f$ is $\ell$-convex relative to a pair of functions $e, \ell$ as described in Sec. 7, where $\ell'/\ell''$ is convex. If $A_1, \ldots, A_n$ are mutually commuting $C^*$-subalgebras of a $C^*$-algebra $A$ and $\tau$ is a tracial state on $A$, then the function
\[ (x_1, \ldots, x_n) \to \tau(f(x_1, \ldots, x_n)), \tag{4} \]
defined on commuting $n$-tuples such that $x_i \in (A_i)_{sa}^{I_i}$ for each $i$, is also $\ell$-convex on $\bigoplus (A_i)_{sa}^{I_i}$. If moreover $\ell'/\ell''$ is homogeneous and $f \ge 0$ the result holds for any densely defined, lower semi-continuous trace $\tau$ on $A$.
In Secs. 4, 5 and 7 we set up some necessary machinery, whereas the key lemmas are in Secs. 6 and 8. The third part of the paper, Secs. 9–22, consists of examples where we apply the preceding results. In particular we investigate the $n$-fold harmonic mean of positive operators and show how its trace behaves under certain concave transformations. As a corollary we prove in Proposition 23 that the trace of any mean (in the sense of Kubo and Ando [8]) is dominated by the corresponding mean of the trace values.
Throughout the paper we have chosen a $C^*$-algebraic setting with densely defined, lower semi-continuous traces, this being the more general theory. We might as well have developed the theory for von Neumann algebras with normal, semi-finite traces; in fact we need this more special setting in the proof of Lemma 6. However, the Gelfand–Naimark–Segal construction effortlessly transforms the $C^*$-algebra version into the von Neumann algebra setting, so there is no real difference between the two approaches.
4. Spectral Theory. We consider a $C^*$-algebra $A$ of operators on some Hilbert space $H$ and mutually commuting $C^*$-subalgebras $A_1, \ldots, A_n$, i.e. $A_i \subset A_j'$ for all $i \ne j$. For each interval $I_i$ we let $(A_i)_{sa}^{I_i}$ denote the convex set of self-adjoint elements in $A_i$ with spectra contained in $I_i$. If $I = I_1 \times \cdots \times I_n \subset \mathbb{R}^n$ and $f$ is a continuous function on $I$ we can for each $x = (x_1, \ldots, x_n)$ in $\bigoplus (A_i)_{sa}^{I_i}$ define an element $f(x)$ in $A$. To see this, let $x_i = \int \lambda\, dE_i(\lambda)$ be the spectral resolution of $x_i$ for $1 \le i \le n$. Since the $x_i$’s commute, so do their spectral measures. We can therefore define the product spectral measure $E$ on $I$ by $E(S_1 \times \cdots \times S_n) = E_1(S_1) \cdots E_n(S_n)$ and then write
\[ f(x) = \int f(\lambda_1, \ldots, \lambda_n)\, dE(\lambda_1, \ldots, \lambda_n). \tag{5} \]
Of course, if $f$ is a polynomial in the variables $\lambda_1, \ldots, \lambda_n$ we simply find $f(x)$ by replacing each $\lambda_i$ with $x_i$. The map $f \to f(x)$ so obtained is a $*$-homomorphism of $C(I)$ into $A$ and generalizes the ordinary spectral mapping theory for a single (self-adjoint) operator. In particular, the support of the map (the smallest closed set $S$ such that $f(x) = 0$ for any function $f$ that vanishes off $S$) may be regarded as the “joint spectrum” of the elements $x_1, \ldots, x_n$.
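As a finite-dimensional illustration of (5), the following sketch treats two commuting real symmetric 2×2 matrices that share an eigenbasis, so that the product spectral measure reduces to two common spectral projections. The helper names (`from_spectrum`, `joint_calculus`) and the numerical values are assumptions made for this example only; they are not part of the paper. The sketch checks that, for a polynomial, the spectral definition agrees with direct substitution of the matrices.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def from_spectrum(theta, eigs):
    """Symmetric matrix R diag(eigs) R^T, where R is the rotation by theta.

    The columns of R are the common eigenvectors (hypothetical setup).
    """
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    d = [[eigs[0], 0.0], [0.0, eigs[1]]]
    return matmul(matmul(r, d), transpose(r))


def joint_calculus(f, theta, eigs1, eigs2):
    """f(x1, x2) = sum_k f(lam_1k, lam_2k) p_k for x_i = sum_k lam_ik p_k."""
    values = [f(eigs1[k], eigs2[k]) for k in range(2)]
    return from_spectrum(theta, values)


if __name__ == "__main__":
    theta, e1, e2 = 0.7, (1.0, 3.0), (-2.0, 0.5)
    x1, x2 = from_spectrum(theta, e1), from_spectrum(theta, e2)
    # Polynomial f(l1, l2) = l1*l2 + l1^2: spectral definition vs. substitution.
    spectral = joint_calculus(lambda a, b: a * b + a * a, theta, e1, e2)
    p, q = matmul(x1, x2), matmul(x1, x1)
    deviation = max(abs(spectral[i][j] - (p[i][j] + q[i][j]))
                    for i in range(2) for j in range(2))
    print("max deviation:", deviation)
```

The same `joint_calculus` helper accepts any continuous function of two variables, which is the point of defining $f(x)$ through the joint spectral measure rather than through substitution.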
This theory applies readily in the situation where $A = A_1 \otimes \cdots \otimes A_n$, but, curiously enough, the tensor product structure (used extensively in [5] and [16]) is not needed in our arguments.
5. Conditional Expectations. Let $\tau$ be a fixed, densely defined, lower semi-continuous trace on $A$, and let $C$ be a fixed commutative $C^*$-subalgebra of $A$. By Gelfand theory we know that each commutative $C^*$-subalgebra of $A$ has the form $C_0(T)$ for some locally compact Hausdorff space $T$. Note now that if $y \in C_+$, the positive part of $C$, and has compact support as a function on $T$, then $y = yz$ for some $z$ in $C_+$. Since the minimal dense ideal $K(A)$ of $A$ is generated (as a hereditary $*$-subalgebra) by elements $a$ in $A_+$ such that $a = ab$ for some $b$ in $A_+$, cf. [14, 5.6.1], and since $\tau$ is densely defined, hence finite on $K(A)$, it follows that $\tau(y) < \infty$. Restricting $\tau$ to $C$ we therefore obtain a unique Radon measure $\mu_C$ on $T$ such that
\[ \int y(t)\, d\mu_C(t) = \tau(y), \quad y \in C, \tag{6} \]
cf. [15, Chap. 6]. Furthermore, if $x \in A_+$ the positive functional $y \to \tau(yx)$ on $C$ determines a unique Radon measure on $T$ (by the Riesz representation theorem) which is absolutely continuous with respect to $\mu_C$, in fact dominated by a multiple of $\mu_C$ (by the Cauchy–Schwarz inequality). By the Radon–Nikodym theorem, there is a positive function $\Phi(x)$ in $L^\infty_{\mu_C}(T)$ such that
\[ \int y(t)\Phi(x)(t)\, d\mu_C(t) = \tau(yx), \quad y \in C. \tag{7} \]
Extending by linearity, this defines a map $\Phi$ from $A$ to $L^\infty_{\mu_C}(T)$ which is linear, positive, norm decreasing (and unital if both $A$ and $C$ have the same unit). Moreover, $\Phi(y) = y$ almost everywhere if $y \in C$ (with respect to the natural homomorphism of $C = C_0(T)$ into $L^\infty_{\mu_C}(T)$). When $\tau$ is faithful and $C$ and $A$ are von Neumann algebras, the map $\Phi$ is a classical example of a conditional expectation, cf. [7, Exercise 8.7.28].
6. Lemma. With notations as in Secs. 4 and 5, take $x = (x_1, \ldots, x_n)$ in $\bigoplus_{i=1}^n (A_i)_{sa}^{I_i}$ and let $f$ be a continuous, real and convex function on $I = I_1 \times \cdots \times I_n$. If $\tau$ is unbounded we assume, moreover, that $f$ is positive. For each commutative $C^*$-subalgebra $C$ of $A$ set
\[ \varphi_C(f, x) = \int f(\Phi(x_1)(t), \ldots, \Phi(x_n)(t))\, d\mu_C(t). \tag{8} \]
Then $\varphi_C(f, x) \le \tau(f(x))$, with equality whenever $x_i \in C$ for all $i$.
Proof. If $x_i \in C$ for all $i$ then also $f(x) \in C$, and since $\Phi(x_i) = x_i$ almost everywhere we get by (6) that
\[ \varphi_C(f, x) = \int f(x_1(t), \ldots, x_n(t))\, d\mu_C(t) = \int f(x)(t)\, d\mu_C(t) = \tau(f(x)). \tag{9} \]
To prove the inequality for a general $n$-tuple we first assume that $\tau$ is a normal, semi-finite trace on a von Neumann algebra $M$, [7, 8.5] or [20, 2.5.1], and put
\[ M_\tau = \{x \in M \mid \tau(|x|) < \infty\}. \tag{10} \]
We then assume that $A = M_\tau^{=}$, the norm closure of $M_\tau$, so that $A$ is a two-sided ideal in $M$. We further assume that each $A_i$ is relatively weakly closed in $A$, i.e. $A_i = A \cap M_i$ for some von Neumann subalgebra $M_i$ of $M$. This has the effect that every self-adjoint element $x_i$ in $A_i$ can be approximated in norm by an element $y_i$ in $A_i$ with finite spectrum and $\mathrm{sp}(y_i) \subset \mathrm{sp}(x_i)$.
Under these assumptions we notice that since both $f$ and $\Phi$ are norm continuous it suffices to establish the inequality for the norm dense set of $n$-tuples $x = (x_1, \ldots, x_n)$, such that each $x_i$ has finite spectrum. Since the $x_i$’s commute mutually there is in this case a finite family $\{p_k\}$ of pairwise orthogonal projections in $A$ with sum 1 such that
\[ x_i = \sum_k \lambda_{ik} p_k, \quad 1 \le i \le n, \tag{11} \]
where $\lambda_{ik} \in I_i$ (and repetitions may occur). If we set $\lambda_k = (\lambda_{1k}, \ldots, \lambda_{nk})$ this means that
\[ f(x) = \sum_k f(\lambda_k) p_k. \tag{12} \]
As $\sum p_k = 1$ in $A$ also $\sum \Phi(p_k) = 1$ in $L^\infty_{\mu_C}(T)$, so that $\sum \Phi(p_k)(t) = 1$ for (almost) every $t$ in $T$. Since $f$ is convex this implies that
\[ \begin{aligned} f(\Phi(x_1)(t), \ldots, \Phi(x_n)(t)) &= f\Big(\sum_k \lambda_{1k}\Phi(p_k)(t), \ldots, \sum_k \lambda_{nk}\Phi(p_k)(t)\Big) \\ &\le \sum_k f(\lambda_{1k}, \ldots, \lambda_{nk})\Phi(p_k)(t) = \sum_k f(\lambda_k)\Phi(p_k)(t). \end{aligned} \tag{13} \]
Consequently, by (8), (13) and (6),
\[ \begin{aligned} \varphi_C(f, x) &\le \int \sum_k f(\lambda_k)\Phi(p_k)(t)\, d\mu_C(t) = \sum_k f(\lambda_k) \int \Phi(p_k)(t)\, d\mu_C(t) \\ &= \sum_k f(\lambda_k)\tau(p_k) = \tau\Big(\sum_k f(\lambda_k) p_k\Big) = \tau(f(x)). \end{aligned} \tag{14} \]
To prove the inequality for a general $C^*$-algebra $A$ with commuting $C^*$-subalgebras $A_i$ consider the GNS representation $(\pi_\tau, H_\tau)$ associated with $\tau$. By construction, cf. [14, 5.1.5], we obtain a normal, semi-finite trace $\tilde\tau$ on the von Neumann algebra $M = \pi_\tau(A)''$ such that $\tilde\tau(\pi_\tau(x)) = \tau(x)$ for every $x$ in $A_+$, and we can define the two-sided ideal $M_\tau$ and its norm closure $M_\tau^{=}$ as in (10). If we now put $\tilde A = M_\tau^{=}$ and $\tilde A_i = \tilde A \cap (\pi_\tau(A_i))^{-w}$ (closure in the weak operator topology) for $1 \le i \le n$, we have exactly the setup above. Our argument, therefore, shows that the inequality holds in the setting of $\tilde A$, $\tilde A_i$ and $\tilde\tau$, and an arbitrary commutative $C^*$-subalgebra $\tilde C$ of $\tilde A$.
Note now that since $\tau$ is densely defined on $A$ we have $\pi_\tau(A) \subset \tilde A$. If $C$ is a commutative $C^*$-subalgebra of $A$ we put $\tilde C = \pi_\tau(C)$, and observe that $\tilde C$ has the
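form $\tilde C = C_0(\tilde T)$ for some closed subset $\tilde T$ of $T$ with $\mu_C(T \setminus \tilde T) = 0$. The map $\tilde\Phi$ from $\tilde A$ to $L^\infty_{\mu_C}(\tilde T)$ defined as in Sec. 5 will therefore satisfy the restriction formula
\[ \tilde\Phi(\pi_\tau(x)) = \Phi(x) \,|\, \tilde T \tag{15} \]
for every $x$ in $A$. But then, by our previous result in (14),
\[ \begin{aligned} \varphi_C(f, x) &= \int f(\Phi(x_1), \ldots, \Phi(x_n))\, d\mu_C = \int f(\tilde\Phi(\pi_\tau(x_1)), \ldots, \tilde\Phi(\pi_\tau(x_n)))\, d\mu_C \\ &= \varphi_{\tilde C}(f, \pi_\tau(x)) \le \tilde\tau(f(\pi_\tau(x))) = \tilde\tau(\pi_\tau(f(x))) = \tau(f(x)). \end{aligned} \tag{16} \]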
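The inequality of Lemma 6 can be checked numerically in the simplest matrix setting. The sketch below makes several assumptions that are not in the paper: $A$ is the algebra of real 2×2 matrices, $\tau = \mathrm{Tr}$, $C$ is the algebra of matrices that are diagonal in the standard basis (so that $\Phi$ is restriction to the diagonal and $\mu_C$ is counting measure on two points), and $x_1, x_2$ are commuting symmetric matrices whose common eigenbasis is the standard basis rotated by an angle `theta`. The function `f` and all names are illustrative choices.

```python
import math


def diagonal_entries(theta, eigs):
    """Diagonal of R diag(eigs) R^T in the standard basis (R = rotation by theta).

    This is Phi(x) for the diagonal subalgebra C in the assumed 2x2 setting.
    """
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return (c2 * eigs[0] + s2 * eigs[1], s2 * eigs[0] + c2 * eigs[1])


def phi_C(f, theta, eigs1, eigs2):
    """phi_C(f, x) = sum_j f(Phi(x1)_j, Phi(x2)_j), cf. (8) with counting measure."""
    d1 = diagonal_entries(theta, eigs1)
    d2 = diagonal_entries(theta, eigs2)
    return sum(f(d1[j], d2[j]) for j in range(2))


def trace_f(f, eigs1, eigs2):
    """Tr f(x1, x2) = sum_k f(lam_1k, lam_2k), computed in the joint eigenbasis."""
    return sum(f(eigs1[k], eigs2[k]) for k in range(2))


def f(a, b):
    """A jointly convex test function of two real variables (illustrative)."""
    return math.exp(a) + (a - b) ** 2


if __name__ == "__main__":
    e1, e2 = (1.0, 3.0), (-2.0, 0.5)
    for k in range(7):
        theta = 0.25 * k
        print(theta, phi_C(f, theta, e1, e2), "<=", trace_f(f, e1, e2))
```

At `theta = 0` the matrices lie in $C$ and the two quantities coincide, which mirrors the equality case of the lemma; for other angles the left-hand side is smaller, by the Jensen step used in (13).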
Proof of Theorem 2. It is evident that the function $x \to \varphi_C(f, x)$, defined in Lemma 6, is convex on $\bigoplus (A_i)_{sa}^{I_i}$ for each $C$, being composed of a linear operator $\Phi$ ($= \Phi_C$), a convex function $f$, and a positive linear functional (the integral). Moreover, if $\mathcal{C}$ denotes the set of commutative $C^*$-subalgebras of $A$, it follows from Lemma 6 that
\[ \tau(f(x)) = \sup_{C \in \mathcal{C}} \varphi_C(f, x), \tag{17} \]
the supremum being attained at every $C$ that contains the commutative $C^*$-subalgebra $C^*(x)$ generated by the (mutually commuting) elements $x_1, \ldots, x_n$. Thus $x \to \tau(f(x))$ is convex as a supremum of convex functions.
This concludes the first part of our paper and we now turn to Theorem 3.
7. $\ell$-Convexity. We consider a strictly increasing, convex and continuous function $e$ on some interval $I$, and denote by $\ell$ its inverse function (so that $\ell(e(s)) = s$ for every $s$ in $I$ and $e(\ell(t)) = t$ for every $t$ in $e(I)$). If $g$ is a convex function defined on a convex subset of a linear space then $f = e \circ g$ is convex as well, but in some sense $f$ is “much more” convex, since even $\ell \circ f$ is convex, whereas $\ell$ is concave. We say in this situation that the function $f$ is $\ell$-convex. This terminology is chosen to agree with the seminal example, where $e = \exp$ and $\ell = \log$. Our aim is to show, under mild restrictions on $\ell$, that $\ell$-convexity has some remarkable structural properties, being preserved under integrals and traces. By contrast, the concept of $\ell$-concavity (with the obvious definition) seems to be less interesting.
8. Lemma. If $I$ is an interval in $\mathbb{R}$ and $e$ is a strictly increasing, strictly convex function in $C^2(I)$ with inverse function $\ell$, then for each probability measure $\mu$ on a locally compact Hausdorff space $T$, and for each $\ell$-convex function $f$ defined on some cube $I$ in $\mathbb{R}^n$, the function
\[ (u_1, \ldots, u_n) \to \int f(u_1(t), \ldots, u_n(t))\, d\mu(t), \quad u_i \in L^\infty_\mu(T), \tag{18} \]
is also $\ell$-convex on the appropriate $n$-tuples in $L^\infty_\mu(T)$ if and only if the function $\varphi$, defined by $\varphi(e(s)) = (e''(s))^{-1}(e'(s))^2$, is concave.
Proof. Since $f = e \circ g$ for some convex function $g$ on $I$, we see that to prove the Lemma it suffices to show that the increasing function
\[ k(u) = \ell\Big(\int e(u(t))\, d\mu(t)\Big), \quad u \in L^\infty_\mu(T), \tag{19} \]
is convex on $(L^\infty_\mu(T))^I$. Clearly, this is also a necessary condition. Considering instead the scalar functions
\[ h(s) = \ell\Big(\int e(u(t) + sv(t))\, d\mu(t)\Big) \tag{20} \]
for arbitrary elements $u, v$ in $L^\infty_\mu(T)$, where the range of $u$ is contained in the interior of $I$, we notice that convexity of $k$ is equivalent to convexity at zero for all functions of the form $h$, and we therefore only have to show that $h''(0) \ge 0$. Setting $r(s) = \int e(u + sv)\, d\mu$ we compute
\[ \begin{aligned} h'(s) &= \ell'(r) \int e'(u + sv)v\, d\mu; \\ h''(s) &= \ell''(r)\Big(\int e'(u + sv)v\, d\mu\Big)^2 + \ell'(r) \int e''(u + sv)v^2\, d\mu. \end{aligned} \tag{21} \]
On the other hand, since $\ell(e(t)) = t$ we also have
\[ \ell'(e(t))e'(t) = 1; \qquad \ell''(e(t))(e'(t))^2 + \ell'(e(t))e''(t) = 0. \tag{22} \]
In our case we can let $e(t) = r(s)$, whence $t = \ell(r(s)) = h(s)$. We can therefore eliminate $\ell''(r)$ in (21) to get the expression
\[ \begin{aligned} h''(s) &= -\ell'(r)e''(h(s))(e'(h(s)))^{-2}\Big(\int e'(u + sv)v\, d\mu\Big)^2 + \ell'(r) \int e''(u + sv)v^2\, d\mu \\ &= \ell'(r)\bigg(\int e''(u + sv)v^2\, d\mu - e''(h(s))(e'(h(s)))^{-2}\Big(\int e'(u + sv)v\, d\mu\Big)^2\bigg). \end{aligned} \tag{23} \]
Since $e$ is strictly increasing, so is $\ell$, which implies that $\ell'(r) > 0$. It follows from (23) that $h''(0) \ge 0$ if and only if
\[ e''(h(0))e'(h(0))^{-2}\Big(\int e'(u)v\, d\mu\Big)^2 \le \int e''(u)v^2\, d\mu. \tag{24} \]
Now define the function $\varphi$ on $e(I)$ by $\varphi(e(s)) = (e'(s))^2(e''(s))^{-1}$. Since $e(h(0)) = r(0) = \int e(u)\, d\mu$ and $e''(h(0)) > 0$ because $e$ is strictly convex we see that (24) is equivalent to the inequality
\[ \Big(\int e'(u)v\, d\mu\Big)^2 \le \varphi\Big(\int e(u)\, d\mu\Big) \int e''(u)v^2\, d\mu. \tag{25} \]
For ease of notation put $\varphi(\int e(u)\, d\mu) = \tilde\varphi(e(u))$, and choose a function $w$ in $L^1_\mu(T)$ with $\int w\, d\mu = 1$. Then consider the quadratic form
\[ \lambda^2 \int e''(u)v^2\, d\mu - 2\lambda \int e'(u)v\, d\mu + \tilde\varphi(e(u)) \int w\, d\mu = \int \big(\lambda^2 v^2 e''(u) - 2\lambda v e'(u) + \tilde\varphi(e(u))w\big)\, d\mu. \tag{26} \]
By construction this form is positive if and only if (25) is satisfied. But (26) expresses the integral of a function which is itself a quadratic form. The minimum in (26) therefore occurs for $\lambda v e''(u) = e'(u)$ and equals
\[ \int \big(\tilde\varphi(e(u))w - (e''(u))^{-1}(e'(u))^2\big)\, d\mu = \int \big(\tilde\varphi(e(u))w - \varphi(e(u))\big)\, d\mu = \varphi\Big(\int e(u)\, d\mu\Big) - \int \varphi(e(u))\, d\mu. \tag{27} \]
Evidently this expression is non-negative if and only if $\varphi$ is concave.
9. Remark. Using the equation $e(\ell(t)) = t$ as in (22) we easily find that $\varphi(t) = -(\ell''(t))^{-1}\ell'(t)$, so that the condition in Lemma 8 translates to the demand:
\[ \text{The (negative) function } t \to \ell'(t)/\ell''(t) \text{ must be convex.} \tag{28} \]
Note also that the condition in (27), viz.
\[ \int \varphi(e(u))\, d\mu \le \varphi\Big(\int e(u)\, d\mu\Big) \tag{29} \]
makes sense for an arbitrary measure $\mu$ and provides a necessary and sufficient condition for the function defined in (18) to be $\ell$-convex. However, in order to satisfy (29) for an arbitrary (point) measure, the function $\varphi$ must have $s\varphi(t) \le \varphi(st)$ for all $s > 0$, which forces it to be homogeneous, i.e. $\varphi(st) = s\varphi(t)$ for $s > 0$. It follows that the function defined by (18) in Lemma 8 is $\ell$-convex for an arbitrary measure $\mu$ if and only if
\[ \ell'(t)/\ell''(t) = \gamma t \quad \text{for some non-zero number } \gamma. \tag{30} \]
Of course, this can only happen when the domain of $\ell$ is stable under multiplication with positive numbers, so it is either a half-axis or the full line. But since the expression in (30) must be negative, only half-axes can occur.
Proof of Theorem 3. The theorem follows by using Lemma 6, as in the proof of Theorem 2, combined with Lemma 8.
10. Examples. Evidently the condition that $\ell'/\ell''$ be convex is not very restrictive, and is satisfied by myriads of functions, of which we shall list a few, below. On the other hand, the demand that $\ell'/\ell''$ be homogeneous is quite severe, and only four (classes of) functions will meet this requirement:
(i) Let $\ell(t) = \log(t)$ for $t > 0$. We get $e(s) = \exp(s)$ for $s$ in $\mathbb{R}$ and $\ell'/\ell'' = -t$. This is the classical example, and by far the most important. Clearly $\ell(t) = c\log(t)$ for any $c > 0$ can also be used, but we omit this trivial parameter here and in the following examples.
(ii) Let $\ell(t) = t^{1/p}$ for $t \ge 0$ and some $p > 1$. We get $e(s) = s^p$ for $s \ge 0$ and $\ell'/\ell'' = -\gamma t$, where $\gamma = p/(p - 1) > 1$. The root examples are also fairly well known. Indeed, it is a very general fact that whenever $f$ is a convex (respectively concave) function that is homogeneous of some degree $p > 0$ then (i) we must have $p \ge 1$ (respectively $p \le 1$) and (ii) the function $f^{1/p}$ is automatically convex (respectively concave). This is discussed in detail in the proof of [9, Corollary 1.2].
(iii) Let $\ell(t) = -t^{-\alpha}$ for $t > 0$ and some $\alpha > 0$. We get $e(s) = (-s)^{-1/\alpha}$ for $s < 0$ and $\ell'/\ell'' = -\gamma t$, where $\gamma = (1 + \alpha)^{-1} < 1$.
(iv) Let $\ell(t) = -(-t)^p$ for $t \le 0$ and some $p > 1$. We get $e(s) = -(-s)^{1/p}$ for $s \le 0$ and $\ell'/\ell'' = \gamma t$, where $\gamma = (p - 1)^{-1} > 0$.
Non-homogeneous examples are not hard to come by. Without any apparent order we mention these:
(v) Let $\ell(t) = -\exp(-\alpha t)$ for $t$ in $\mathbb{R}$ and some $\alpha > 0$. We get $e(s) = -\alpha^{-1}\log(-s)$ for $s < 0$ and $\ell'/\ell'' = -1/\alpha$. In applications of Theorem 3 the parameter $\alpha$ disappears, since $\ell(\tau(e(a))) = -\exp(-\alpha\tau(-\alpha^{-1}\log(-a))) = -\exp(\tau(\log(-a)))$ for any operator $a < 0$; so we may as well assume that $\alpha = 1$.
(vi) Let $\ell(t) = \log(\log(t))$ for $t > 1$. We get $e(s) = \exp(\exp(s))$ for $s$ in $\mathbb{R}$ and $\ell'(t)/\ell''(t) = -t\log(t)(1 + \log(t))^{-1}$, which is only convex for $t \le e$. So on the intervals $1 < t \le e$ and $-\infty < s \le 0$ we can use the functions $\log\log$ and $\exp\exp$.
(vii) Let $\ell(t) = (\log(t))^{1/p}$ for $t \ge 1$ and some $p > 1$. Here $e(s) = \exp(s^p)$ for $s \ge 0$ and by computation we find that $\ell'/\ell''$ is convex for $t \le \exp(1 + 1/p)$. Since $t = \exp(s^p)$, the allowed interval for $e$ is $0 \le s \le (1 + 1/p)^{1/p}$.
(viii) Let $\ell(t) = (1 - (1 - t)^p)^{1/p}$ for $0 \le t \le 1$ and some $p > 1$. We get $e(s) = 1 - (1 - s^p)^{1/p}$ for $0 \le s \le 1$ and $\ell'(t)/\ell''(t) = (p - 1)^{-1}((1 - t)^{p+1} - (1 - t))$, which is a convex function on the unit interval.
(ix) Let $\ell(t) = t^{1/p}(1 + t^{1/p})^{-1}$ for $t \ge 0$ and some $p \ge 1$. Here $e(s) = s^p(1 - s)^{-p}$ for $0 \le s < 1$. For $p > 1$ we find after some computation that $\ell'(t)/\ell''(t) = 2pt^{1+1/p}((p - 1)(p - 1 + (p + 1)t^{1/p}))^{-1}$ plus a linear term, and this is a convex function. For $p = 1$ we simply get $\ell'(t)/\ell''(t) = -\frac{1}{2}(1 + t)$, and we note that this example is just a translation of (iii) (replacing $t$ by $1 + t$ and adding 1).
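The ratios $\ell'/\ell''$ quoted for the homogeneous examples (i) to (iv) can be sanity-checked numerically. The following is a minimal sketch using central finite differences; the helper `ratio`, the step size, the sample points and the parameter values `P = 3`, `ALPHA = 2` are assumptions chosen for illustration and do not come from the paper.

```python
import math

P, ALPHA = 3.0, 2.0  # illustrative parameter values (assumed)


def ratio(ell, t, h=1e-3):
    """Central-difference estimate of ell'(t) / ell''(t)."""
    d1 = (ell(t + h) - ell(t - h)) / (2.0 * h)
    d2 = (ell(t + h) - 2.0 * ell(t) + ell(t - h)) / (h * h)
    return d1 / d2


# label -> (ell, sample point t, expected coefficient c in ell'/ell'' = c * t)
HOMOGENEOUS = {
    "(i)": (math.log, 2.0, -1.0),
    "(ii)": (lambda t: t ** (1.0 / P), 2.0, -P / (P - 1.0)),
    "(iii)": (lambda t: -t ** (-ALPHA), 2.0, -1.0 / (1.0 + ALPHA)),
    "(iv)": (lambda t: -(-t) ** P, -2.0, 1.0 / (P - 1.0)),
}


def ell_v(t):
    """Example (v) with alpha = 1: the ratio is constant, hence not homogeneous."""
    return -math.exp(-t)


if __name__ == "__main__":
    for label, (ell, t, coeff) in HOMOGENEOUS.items():
        print(label, "numerical:", ratio(ell, t), "closed form:", coeff * t)
    print("(v)", ratio(ell_v, 1.0), ratio(ell_v, 3.0))
```

For (v) the estimated ratio stays near the same constant at both sample points, so it cannot scale linearly with $t$, in line with its placement among the non-homogeneous examples.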
11. Corollary. If $f$ is a positive, continuous, log-convex function on a cube $I$ in $\mathbb{R}^n$, and $A_1, \ldots, A_n$ are mutually commuting $C^*$-subalgebras of a $C^*$-algebra $A$, then for each densely defined, lower semi-continuous trace $\tau$ on $A$ the function
\[ (x_1, \ldots, x_n) \to \tau(f(x_1, \ldots, x_n)), \tag{31} \]
defined on commuting $n$-tuples such that $x_i \in (A_i)_{sa}^{I_i}$ for each $i$, is also log-convex on $\bigoplus (A_i)_{sa}^{I_i}$. If, instead, $f^{1/p}$ is convex for some $p > 1$, or if $-f^{-\alpha}$ is convex for some $\alpha > 0$, or if $f < 0$ and $-(-f)^p$ is convex, then we also have convexity of the respective functions
\[ \begin{aligned} (x_1, \ldots, x_n) &\to (\tau(f(x_1, \ldots, x_n)))^{1/p}, \\ (x_1, \ldots, x_n) &\to -(\tau(f(x_1, \ldots, x_n)))^{-\alpha}, \\ (x_1, \ldots, x_n) &\to -(-\tau(f(x_1, \ldots, x_n)))^{p}. \end{aligned} \tag{32} \]
12. Remarks. The corollary above applies to some unexpected situations. Thus we see that the function
\[ (x, y) \to \log(\tau(\exp((x + y)^2))), \tag{33} \]
where $x$ and $y$ are self-adjoint elements in a pair of commuting $C^*$-algebras $A$ and $B$, is (jointly) convex. The same can be said of the function
\[ (x, y) \to \log(\tau(\exp(-x^\alpha y^\beta))) \tag{34} \]
for $0 < \alpha, \beta$ and $\alpha + \beta \le 1$, defined on $A_+ \times B_+$. Applied to the root functions the Corollary shows that the function
\[ (x, y) \to (\tau((x + y)^q))^{1/p} \tag{35} \]
is convex for $1 \le p \le q$, and that also
\[ (x, y) \to (\tau((1 - x^\alpha y^\beta)^p))^{1/p} \tag{36} \]
is convex for $p \ge 1$ on the product of the positive unit balls of $A$ and $B$.
The last two cases in Corollary 11 are perhaps easier to apply in terms of concave functions. By elementary substitutions we find that if $f$ is a positive concave function on some cube $I$ in $\mathbb{R}^n$, then both functions
\[ x \to (\tau((f(x))^{-\alpha}))^{-1/\alpha}; \qquad x \to (\tau((f(x))^{1/p}))^{p}; \tag{37} \]
are concave on $\bigoplus (A_i)_{sa}^{I_i}$ for $\alpha > 0$ (and $f > 0$), and for $p \ge 1$. In particular we see that
\[ (\tau((x + y)^{1/p}))^p \ge (\tau(x^{1/p}))^p + (\tau(y^{1/p}))^p, \tag{38} \]
for all $x, y$ in $A_+$, so that the Schatten $p$-norms are super-additive for $p < 1$.
Recall from [4] the definition of the Kadison–Fuglede determinant $\Delta$ associated with a tracial state $\tau$ on a $C^*$-algebra $A$:
\[ \Delta(x) = \exp(\tau(\log|x|)) \quad \text{whenever } x \in A^{-1}. \]
This is a positive, homogeneous and multiplicative map on the set of invertible elements, cf. [4, Theorem 1], closely related to the ordinary determinant from matrix theory. Note, however, that when $A = M_n(\mathbb{C})$ then $\Delta(x) = (\det(|x|))^{1/n}$, because we have to use the normalized trace $\tau = \frac{1}{n}\mathrm{Tr}$. The following result is well known for matrices, at least for functions of one variable.

13. Proposition. For each strictly positive concave function $f$ on a cube $I$ in $\mathbb{R}^n$, and mutually commuting $C^*$-subalgebras $A_1, \dots, A_n$ of a $C^*$-algebra $A$ the operator function
\[ x \to \Delta(f(x)) \tag{39} \]
is concave on the appropriate $n$-tuples of commuting self-adjoint elements from $\bigoplus (A_i)^{I_i}_{sa}$. In particular, the Kadison–Fuglede determinant is a concave map on the set of positive invertible elements.

Proof. From example (v) in Sec. 10 we see that the function
\[ x \to -\exp(\tau(\log(-(-f(x))))) = -\Delta(f(x)) \tag{40} \]
is convex, as desired.

14. Remark. The concavity in (39) is closely related to one of the main results (Theorem 6) in [9], namely: $x \to \tau(\exp(z + \log(x)))$ is concave for any self-adjoint operator $z$. In one sense (39) is stronger because it allows a general concave $f$, but when $f$ is linear (39) is a corollary of Theorem 6 in [9], as we show now for one variable. We have to prove that
\[ \exp\Big(\tau\Big(\log\Big(\tfrac{1}{2}(x + y)\Big)\Big)\Big) \ge \tfrac{1}{2}\exp(\tau(\log(x))) + \tfrac{1}{2}\exp(\tau(\log(y))). \]
Put $z = -\log(\tfrac{1}{2}(x + y))$. Our condition is then that $1 \ge \tfrac{1}{2}\exp(\tau(z + \log(x))) + \tfrac{1}{2}\exp(\tau(z + \log(y)))$. However, $\tau$ is a state and therefore Jensen's inequality applies. Thus, $\exp(\tau(z + \log(x))) \le \tau(\exp(z + \log(x)))$, and similarly for $y$. By Theorem 6 of [9] we know that $x \to \tau(\exp(z + \log(x)))$ is concave, which implies that $\tfrac{1}{2}\tau(\exp(z + \log(x))) + \tfrac{1}{2}\tau(\exp(z + \log(y))) \le \tau(\exp(z + \log(\tfrac{1}{2}(x + y)))) = \tau(1) = 1$, as desired.

15. Operator Means. We recall from [8], see also [6, 4.1], that a Kubo–Ando mean on the set $B_+ = B(H)_+$ of positive operators is a function $\sigma : B_+ \times B_+ \to B_+$ such that
(i) $(1 \,\sigma\, 1) = 1$,
(ii) $(x_1 \,\sigma\, y_1) \le (x_2 \,\sigma\, y_2)$ if $x_1 \le x_2$ and $y_1 \le y_2$,
(iii) $\sigma$ is (jointly) concave on $B_+ \times B_+$,
(iv) $z^*(x \,\sigma\, y)z = ((z^* x z) \,\sigma\, (z^* y z))$ for every invertible $z$ in $B(H)$.
Of particular interest are the harmonic mean $!$ and the geometric mean $\#$ defined on positive, invertible operators by:
\[ x \,!\, y = 2(x^{-1} + y^{-1})^{-1} = 2x(x + y)^{-1}y; \qquad x \,\#\, y = x^{1/2}(x^{-1/2} y x^{-1/2})^{1/2} x^{1/2}. \tag{41} \]
($x \,\#\, y$ was introduced in [18] and $\tfrac{1}{2}(x \,!\, y)$, the parallel sum, was introduced in [1]; see also [2], [6, 4.1] and [8].) The domain of definition for these means can be extended to all positive operators by a simple limit argument (replacing $x$ and $y$ with $x + \varepsilon 1$ and $y + \varepsilon 1$, and taking the norm limit as $\varepsilon \to 0$), and we shall tacitly use this procedure in the following. Thus we shall state all results for positive operators, but in the proofs assume that they are invertible as well. Note that these two operator means are symmetric in the two variables, and that they reduce to the classical harmonic and geometric means when $x$ and $y$ are positive scalars, i.e.
\[ x \,!\, y = 2xy(x + y)^{-1} \quad \text{and} \quad x \,\#\, y = \sqrt{xy}. \tag{42} \]
A straightforward application of the Cauchy–Schwarz inequality shows that
\[ \tau(x \,\#\, y) \le \tau(x) \,\#\, \tau(y) \tag{43} \]
for any trace $\tau$. The corresponding result for the harmonic mean was proved for the ordinary trace $\mathrm{Tr}$ in [2]. The general version below (Proposition 21) is somewhat more involved, but also richer. For greater generality, but at little extra cost, we introduce the harmonic mean of an $n$-tuple of positive operators as
\[ (x_1 \,!\, x_2 \,!\, \cdots \,!\, x_n) = n(x_1^{-1} + x_2^{-1} + \cdots + x_n^{-1})^{-1}, \tag{44} \]
and we note that this mean is symmetric in all the variables and increasing in each variable. The fact that it is also jointly concave may not be widely known, so we present a short proof.

16. Proposition. The $n$-fold harmonic mean is a jointly concave function.

Proof. As the expression in (44) is homogeneous all we have to show is that
\[ \Big(\sum x_i^{-1}\Big)^{-1} + \Big(\sum y_i^{-1}\Big)^{-1} \le \Big(\sum (x_i + y_i)^{-1}\Big)^{-1} \tag{45} \]
for any pair of $n$-tuples of positive invertible operators. Multiplying left and right by $\sum (x_i + y_i)^{-1}$ we obtain the equivalent inequality
\[ \sum (x_i + y_i)^{-1} \Big( \Big(\sum x_i^{-1}\Big)^{-1} + \Big(\sum y_i^{-1}\Big)^{-1} \Big) \sum (x_i + y_i)^{-1} \le \sum (x_i + y_i)^{-1}. \tag{46} \]
We now appeal to the fact that the operator function $(x, y) \to y^* x^{-1} y$ is jointly convex, [11], hence also jointly subadditive on the space of operators $x, y$, where $x$ is positive and invertible. To see this, consider $n$-tuples $(x_i)$ and $(y_i)$ and define $z_j = x_j^{-1/2}(y_j - x_j a)$, with $a = (\sum x_i)^{-1} \sum y_i$. Then by computation we obtain the desired estimate
\[ 0 \le \sum z_j^* z_j = \sum y_j^* x_j^{-1} y_j - \Big(\sum y_j\Big)^{*} \Big(\sum x_j\Big)^{-1} \Big(\sum y_j\Big). \tag{47} \]
Breaking the left hand side of (46) into the sum of two terms and using (47) on each we obtain the larger operator
\[ \sum (x_i + y_i)^{-1} x_i (x_i + y_i)^{-1} + \sum (x_i + y_i)^{-1} y_i (x_i + y_i)^{-1} = \sum (x_i + y_i)^{-1}(x_i + y_i)(x_i + y_i)^{-1} = \sum (x_i + y_i)^{-1}, \tag{48} \]
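which is precisely the right hand side of (46), as claimed.

17. Remark. Note that (47) also shows that the $n$-fold harmonic mean is dominated by the arithmetic mean (the average). Indeed,
\[ x_1 \,!\, x_2 \,!\, \cdots \,!\, x_n = \Big( \frac{1}{n} \sum_{k=1}^{n} x_k^{-1} \Big)^{-1} = \Big( \sum_{k=1}^{n} \frac{1}{n} 1 \Big) \Big( \sum_{k=1}^{n} \frac{1}{n} x_k^{-1} \Big)^{-1} \Big( \sum_{k=1}^{n} \frac{1}{n} 1 \Big) \le \sum_{k=1}^{n} \frac{1}{n}\, 1\, (x_k^{-1})^{-1}\, 1 = \frac{1}{n} \sum_{k=1}^{n} x_k. \tag{49} \]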
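The domination of the harmonic mean by the arithmetic mean can be checked numerically in the simplest non-commutative setting. The following is a minimal sketch, assuming real symmetric positive definite 2×2 matrices as stand-ins for the operators; the helper names (`inv2`, `harmonic_mean`, `is_psd2`, ...) and the sample matrices are hypothetical and not taken from the paper.

```python
# Sketch: arithmetic mean minus harmonic mean is positive semidefinite
# for real symmetric positive definite 2x2 matrices (illustrative only).

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add2(m, k):
    """Entrywise sum of two 2x2 matrices."""
    return [[m[i][j] + k[i][j] for j in range(2)] for i in range(2)]

def scale2(s, m):
    """Scalar multiple of a 2x2 matrix."""
    return [[s * m[i][j] for j in range(2)] for i in range(2)]

def harmonic_mean(mats):
    """n-fold harmonic mean n (x_1^{-1} + ... + x_n^{-1})^{-1}, as in (44)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for m in mats:
        total = add2(total, inv2(m))
    return scale2(float(len(mats)), inv2(total))

def arithmetic_mean(mats):
    """Average (1/n)(x_1 + ... + x_n)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for m in mats:
        total = add2(total, m)
    return scale2(1.0 / len(mats), total)

def is_psd2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    (a, b), (_, d) = m
    return a >= -tol and d >= -tol and a * d - b * b >= -tol

# Three positive definite sample matrices (hypothetical data).
xs = [
    [[2.0, 1.0], [1.0, 3.0]],
    [[5.0, -1.0], [-1.0, 1.0]],
    [[1.0, 0.5], [0.5, 4.0]],
]
h = harmonic_mean(xs)
a = arithmetic_mean(xs)
gap = add2(a, scale2(-1.0, h))  # arithmetic mean minus harmonic mean
print(is_psd2(h), is_psd2(gap))
```

Both checks should report a positive semidefinite matrix; a single numerical instance is of course only an illustration of (49), not a substitute for the argument via (47).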
This result is not surprising, since the harmonic mean (on pairs of operators) is the smallest symmetric mean, whereas the arithmetic mean is the largest, cf. [6, 4.1].

18. Proposition. Given positive, invertible operators $x_1, x_2, \dots, x_n$ and $y$ in $B(H)$, let $d = \mathrm{diag}\{x_1, x_2, \dots, x_n\}$ and $e = \sum_{i,j=1}^{n} y \otimes e_{ij}$ in $M_n(B(H))$. Then the following conditions are equivalent:
(i) $y \le (\sum_{k=1}^{n} x_k^{-1})^{-1}$,
(ii) $e d^{-1} e \le e$,
(iii) $e \le d$.

Proof. Assume first that $y = 1$, and set $a = \sum_{k=1}^{n} x_k^{-1}$, so that condition (i) becomes equivalent with $a \le 1$.

(i)$\iff$(ii). By computation
\[ e d^{-1} e = \sum_{i,j=1}^{n} a \otimes e_{ij}, \tag{50} \]
from which it follows that $a \le 1$ if and only if $e d^{-1} e \le e$.

(i)$\implies$(iii). Let $p = \frac{1}{n} e$, and note that $p$ is a projection in $M_n(B(H))$. For every $\varepsilon > 0$ we have
\[ d^{-1} = (p + (1 - p)) d^{-1} (p + (1 - p)) \le (1 + \varepsilon) p d^{-1} p + (1 + \varepsilon^{-1})(1 - p) d^{-1} (1 - p) = (1 + \varepsilon) n^{-2} e d^{-1} e + (1 + \varepsilon^{-1})(1 - p) d^{-1} (1 - p). \tag{51} \]
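Now $a \le 1$ by (i) and a fortiori $d^{-1} \le 1$. Moreover, (i)$\implies$(ii) and thus we get
\[ d^{-1} \le (1 + \varepsilon) n^{-2} e + (1 + \varepsilon^{-1})(1 - p) = (1 + \varepsilon) n^{-1} p + (1 + \varepsilon^{-1})(1 - p). \tag{52} \]
Taking inverses this means that
\[ d \ge (1 + \varepsilon)^{-1} n p + \varepsilon (1 + \varepsilon)^{-1} (1 - p), \tag{53} \]
from which the desired inequality follows as $\varepsilon \to 0$.

(iii)$\implies$(ii). If $e \le d$, then $e + \varepsilon 1 \le d + \varepsilon 1$ for every $\varepsilon > 0$, whence
\[ (d + \varepsilon 1)^{-1} \le (e + \varepsilon 1)^{-1} = (n + \varepsilon)^{-1} p + \varepsilon^{-1} (1 - p). \tag{54} \]
But then, since $e(1 - p) = 0$, we have
\[ e (d + \varepsilon)^{-1} e \le (n + \varepsilon)^{-1} e p e = (n + \varepsilon)^{-1} n e, \tag{55} \]
and as $\varepsilon \to 0$ we obtain the desired estimate.

In the general case, where $y$ is arbitrary (positive and invertible), we define $\tilde{x}_k = y^{-1/2} x_k y^{-1/2}$ for $1 \le k \le n$. Then with $\tilde{e} = \sum_{i,j=1}^{n} 1 \otimes e_{ij}$ and $\tilde{d} = \mathrm{diag}\{\tilde{x}_1, \tilde{x}_2, \dots, \tilde{x}_n\}$ we observe that conditions (i), (ii) and (iii) become equivalent with the conditions
($\widetilde{\text{i}}$) $1 \le (\sum_{k=1}^{n} \tilde{x}_k^{-1})^{-1}$,
($\widetilde{\text{ii}}$) $\tilde{e} \tilde{d}^{-1} \tilde{e} \le \tilde{e}$,
($\widetilde{\text{iii}}$) $\tilde{e} \le \tilde{d}$.
But these conditions are exactly the ones we proved to be equivalent above, assuming that $y = 1$.

19. Corollary. For any $n$-tuple $(x_1, x_2, \dots, x_n)$ of positive operators in $B(H)$ the harmonic mean $(x_1 \,!\, x_2 \,!\, \cdots \,!\, x_n)$ is the largest positive operator of the form $nz$, such that
\[ \begin{pmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_n \end{pmatrix} \ge \begin{pmatrix} z & z & \cdots & z \\ z & z & \cdots & z \\ \vdots & \vdots & \ddots & \vdots \\ z & z & \cdots & z \end{pmatrix}. \tag{56} \]

20. Remark. Note that Proposition 18 actually says slightly more than Corollary 19, namely that the set of positive operators $nz$, such that $z$ satisfies the matrix inequality in (56), is exactly equal to
\[ ((x_1 \,!\, x_2 \,!\, \cdots \,!\, x_n) - B(H)_+) \cap B(H)_+. \tag{57} \]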
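The equivalence of conditions (i), (ii) and (iii) in Proposition 18, on which Corollary 19 and the description (57) rest, can be made concrete in the scalar case $n = 2$. The sketch below assumes positive real numbers $x_1, x_2, y$ in place of operators, so that $d$ and $e$ are ordinary 2×2 matrices; the function names and the sample values are hypothetical. Here $e - e d^{-1} e = (y - y^2 a) J$ with $a = x_1^{-1} + x_2^{-1}$ and $J$ the all-ones matrix.

```python
# Sketch: scalar (n = 2) instance of the three equivalent conditions in
# Proposition 18. Operators are replaced by positive real numbers.

def is_psd2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    (a, b), (_, d) = m
    return a >= -tol and d >= -tol and a * d - b * b >= -tol

def conditions(x1, x2, y):
    """Return the truth values of (i), (ii), (iii) for d = diag(x1, x2), e = y * J."""
    a = 1.0 / x1 + 1.0 / x2
    cond_i = y <= 1.0 / a                      # (i)   y <= (x1^{-1} + x2^{-1})^{-1}
    c = y - y * y * a                          # e - e d^{-1} e = c * J
    cond_ii = is_psd2([[c, c], [c, c]])        # (ii)  e d^{-1} e <= e
    cond_iii = is_psd2([[x1 - y, -y],          # (iii) e <= d, i.e. d - e is PSD
                        [-y, x2 - y]])
    return cond_i, cond_ii, cond_iii

# With x1 = 2 and x2 = 3 the threshold (x1^{-1} + x2^{-1})^{-1} equals 6/5.
results = {y: conditions(2.0, 3.0, y) for y in (0.5, 1.0, 1.19, 1.21, 2.0)}
for y, triple in results.items():
    print(y, triple)
```

For the sample values the three conditions hold together below the threshold $6/5$ and fail together above it; values exactly at the threshold are avoided because floating-point rounding could split the comparisons.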
This should be compared to the analogous result for the geometric mean of pairs of positive operators $x$ and $y$. Here $x \,\#\, y$ is the largest positive operator $z$ such that $\begin{pmatrix} x & z \\ z & y \end{pmatrix} \ge 0$ in $M_2(B(H))$. However, one can find positive operators $z$ such that $z \le x \,\#\, y$, without the matrix being positive. It suffices to take $x = 1$, so that $x \,\#\, y = y^{1/2}$. Evidently the matrix $\begin{pmatrix} 1 & z \\ z & y \end{pmatrix}$ is positive only if $z^2 \le y$, and it is easy to find examples where $z \le y^{1/2}$, but $z^2 \not\le y$.

21. Proposition. If $\tau$ is a densely defined, lower semi-continuous trace on a $C^*$-algebra $A$, and $f$ is a strictly positive function on $]0, \infty[$ such that $t \to (f(t^{-1}))^{-1}$ is concave, then for all positive operators $x_1, \dots, x_n \in A$ and $\alpha > 0$ we have
\[ (\tau((f(x_1 \,!\, x_2 \,!\, \cdots \,!\, x_n))^{\alpha}))^{1/\alpha} \le (\tau((f(x_1))^{\alpha}))^{1/\alpha} \,!\, (\tau((f(x_2))^{\alpha}))^{1/\alpha} \,!\, \cdots \,!\, (\tau((f(x_n))^{\alpha}))^{1/\alpha}. \tag{58} \]
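Proof. Define $g(t) = (f(t^{-1}))^{-1}$. Then from (37) we see that the function
\[ x \to (\tau((g(x))^{-\alpha}))^{-1/\alpha} \tag{59} \]
is concave on the set of positive invertible elements in $A$. In particular,
\[ \big(\tau\big(\big(g\big(\tfrac{1}{n}(x_1^{-1} + \cdots + x_n^{-1})\big)\big)^{-\alpha}\big)\big)^{-1/\alpha} \ge \tfrac{1}{n}(\tau((g(x_1^{-1}))^{-\alpha}))^{-1/\alpha} + \cdots + \tfrac{1}{n}(\tau((g(x_n^{-1}))^{-\alpha}))^{-1/\alpha}. \tag{60} \]
Taking the reciprocal values and noting that $\tfrac{1}{n}(x_1^{-1} + \cdots + x_n^{-1}) = (x_1 \,!\, \cdots \,!\, x_n)^{-1}$ this means that
\[ (\tau((g((x_1 \,!\, \cdots \,!\, x_n)^{-1}))^{-\alpha}))^{1/\alpha} \le n\big((\tau((g(x_1^{-1}))^{-\alpha}))^{-1/\alpha} + \cdots + (\tau((g(x_n^{-1}))^{-\alpha}))^{-1/\alpha}\big)^{-1} = (\tau((g(x_1^{-1}))^{-\alpha}))^{1/\alpha} \,!\, \cdots \,!\, (\tau((g(x_n^{-1}))^{-\alpha}))^{1/\alpha}. \tag{61} \]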
Since $g(t^{-1})^{-1} = f(t)$ this is the desired result.

22. Remark. The result above applies to the functions $t \to t^{1/p}$ for $p \ge 1$, in particular to the identity function (and with $\alpha = 1$); but it also applies to the functions $t \to \log(1 + t^{1/p})$ for $p \ge 1$. On the abstract level Proposition 21 applies to $C^2$-functions $f$, such that $f$ and $t \to t^2 f(t)^{-2} f'(t)$ are simultaneously increasing or decreasing (strictly) on $]0, \infty[$. Proposition 21 also applies to other (not necessarily symmetric) means of positive operators. Such means were introduced in [8] (see also [6, 4.1]).

23. Proposition. For every Kubo–Ando mean $\sigma$ and for every densely defined, lower semi-continuous trace $\tau$ on a $C^*$-algebra $A$ we have $\tau(x \,\sigma\, y) \le \tau(x) \,\sigma\, \tau(y)$ for all $x, y$ in $A_+$.

Proof. For each mean $\sigma$ there is a unique probability measure $\mu$ on $[0, \infty]$ such that
\[ x \,\sigma\, y = \frac{1}{2} \int_0^{\infty} (tx \,!\, y)(1 + 1/t)\, d\mu(t) = \frac{1}{2} \int_0^{\infty} ((1 + t)x \,!\, (1 + 1/t)y)\, d\mu(t), \tag{62} \]
cf. [6, 4.1]. Applying Proposition 21 with $f(t) = t$ and $\alpha = 1$ it follows that
\[ \tau(x \,\sigma\, y) = \frac{1}{2} \int_0^{\infty} \tau(tx \,!\, y)(1 + 1/t)\, d\mu(t) \le \frac{1}{2} \int_0^{\infty} (t\tau(x) \,!\, \tau(y))(1 + 1/t)\, d\mu(t) = \tau(x) \,\sigma\, \tau(y). \tag{63} \]
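The key step in (63) is the trace inequality for the harmonic mean, $\tau(x \,!\, y) \le \tau(x) \,!\, \tau(y)$. A minimal numerical sketch follows, assuming $\tau$ is the ordinary trace on real symmetric positive definite 2×2 matrices; the helper names and the two sample matrices are hypothetical choices made only for illustration.

```python
# Sketch: Tr(x ! y) <= Tr(x) ! Tr(y) for two positive definite 2x2 matrices,
# where x ! y = 2 (x^{-1} + y^{-1})^{-1} is the harmonic mean from (41).

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace2(m):
    """Ordinary (unnormalized) trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]

def harmonic2(x, y):
    """Operator harmonic mean 2 (x^{-1} + y^{-1})^{-1} of two 2x2 matrices."""
    xi, yi = inv2(x), inv2(y)
    s = [[xi[i][j] + yi[i][j] for j in range(2)] for i in range(2)]
    si = inv2(s)
    return [[2.0 * si[i][j] for j in range(2)] for i in range(2)]

def scalar_harmonic(s, t):
    """Classical harmonic mean 2st/(s + t) of two positive numbers, as in (42)."""
    return 2.0 * s * t / (s + t)

# Hypothetical sample data: both matrices are positive definite.
x = [[2.0, 1.0], [1.0, 3.0]]
y = [[5.0, -1.0], [-1.0, 1.0]]

lhs = trace2(harmonic2(x, y))                    # Tr(x ! y)
rhs = scalar_harmonic(trace2(x), trace2(y))      # Tr(x) ! Tr(y)
print(lhs, rhs)
```

For these matrices the left side works out to $25/7$ and the right side to $60/11$, so the inequality holds with room to spare.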
Acknowledgments

The work of E. H. Lieb was partly supported by U.S. National Science Foundation grant PHY98-20650-A02, whereas G. K. Pedersen was partly supported by the Danish Research Council (SNF).

References

[1] W. N. Andersen and R. J. Duffin, Series and parallel addition of matrices, J. Math. Anal. Appl. 26 (1969) 576–594.
[2] T. Ando, Concavity of certain maps of positive definite matrices and applications to Hadamard products, Linear Algebra Appl. 26 (1979) 203–241.
[3] H. Araki, On an inequality of Lieb and Thirring, Letters Math. Phys. 19 (1990) 167–170.
[4] B. Fuglede and R. V. Kadison, Determinant theory in finite factors, Ann. Math. 55 (1952) 520–530.
[5] F. Hansen, Convex trace functions of several variables, to appear in Linear Algebra Appl.
[6] F. Hiai, Log-majorizations and norm inequalities for exponential operators, Banach Center Publications 38, 1997, The Polish Academy of Sciences, Warszawa, pp. 119–181.
[7] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras, Vol. I–II, Academic Press, San Diego, 1986 (Reprinted by AMS in 1997).
[8] F. Kubo and T. Ando, Means of positive linear operators, Math. Ann. 246 (1980) 205–224.
[9] E. H. Lieb, Convex trace functions and the Wigner–Yanase–Dyson conjecture, Adv. Math. 11 (1973) 267–288.
[10] E. H. Lieb, The classical limit of quantum systems, Comm. Math. Phys. 31 (1973) 327–340.
[11] E. H. Lieb and M. B. Ruskai, Some operator inequalities of the Schwarz type, Adv. Math. 26 (1974) 269–273.
[12] J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton Press, Princeton NJ, 1955.
[13] M. Ohya and D. Petz, Quantum Entropy and Its Use, Texts and Monographs in Physics, Springer Verlag, Heidelberg, 1993.
[14] G. K. Pedersen, C*-Algebras and their Automorphism Groups, LMS Monographs 14, Academic Press, San Diego, 1979.
[15] G. K. Pedersen, Analysis Now, Graduate Texts in Mathematics 118, Springer Verlag, Heidelberg, 1989, reprinted 1995.
[16] G. K. Pedersen, Convex trace functions of several variables on C*-algebras, Preprint.
[17] D. Petz, Spectral scale of self-adjoint operators and trace inequalities, J. Math. Anal. Appl. 109 (1985) 74–82.
[18] W. Pusz and S. Lech Woronowicz, Functional calculus for sesquilinear forms and the purification map, Reports Math. Phys. 8 (1975) 159–170.
[19] D. Ruelle, Statistical Mechanics, The Mathematical Physics Monograph Series, Benjamin, New York, 1969.
[20] S. Sakai, C*-Algebras and W*-Algebras, Springer Verlag, Heidelberg, 1971, reprinted 1997.
[21] B. Simon, The classical limit of quantum partition functions, Comm. Math. Phys. 71 (1980) 247–276.
[22] B. Simon, The Statistical Mechanics of Lattice Gases, Vol. I, Princeton University Press, Princeton, 1993.
August 21, 2002 18:21 WSPC/148-RMP
00128
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 649–673 c World Scientific Publishing Company
APPROXIMATELY INNER FLOWS ON SEPARABLE $C^*$-ALGEBRAS

AKITAKA KISHIMOTO
Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan

Received 24 July 2001
Revised 10 November 2001

We present two types of result for approximately inner one-parameter automorphism groups (referred to as AI flows hereafter) of separable $C^*$-algebras. First, if there is an irreducible representation $\pi$ of a separable $C^*$-algebra $A$ such that $\pi(A)$ does not contain non-zero compact operators, then there is an AI flow $\alpha$ such that $\pi$ is $\alpha$-covariant and $\alpha$ is far from uniformly continuous in the sense that $\alpha$ induces a flow on $\pi(A)$ which has full Connes spectrum. Second, if $\alpha$ is an AI flow on a separable $C^*$-algebra $A$ and $\pi$ is an $\alpha$-covariant irreducible representation, then we can choose a sequence $(h_n)$ of self-adjoint elements in $A$ such that $\alpha_t$ is the limit of inner flows $\mathrm{Ad}\, e^{ith_n}$ and the sequence $\pi(e^{ith_n})$ of one-parameter unitary groups (referred to as unitary flows hereafter) converges to a unitary flow which implements $\alpha$ in $\pi$. This latter result will be extended to cover the case of weakly inner type I representations. In passing we shall also show that if two representations of a separable simple $C^*$-algebra on a separable Hilbert space generate the same von Neumann algebra of type I, then there is an approximately inner automorphism which sends one into the other up to equivalence.
1. Introduction and Results

A flow $\alpha$ (or a strongly continuous one-parameter automorphism group) on a $C^*$-algebra $A$ is uniformly continuous if there is a derivation $\delta$ on $A$ such that $\alpha_t = e^{t\delta}$, $t \in \mathbb{R}$. Here $\delta$ is a derivation if it is a linear mapping on $A$ and satisfies
\[ \delta(xy) = \delta(x)y + x\delta(y), \qquad \delta(x^*) = \delta(x)^* \]
for all $x, y \in A$. (Then $\delta$ is automatically bounded; see [24, 4.1.3].) For any $h = h^* \in A$, the map $\mathrm{ad}\, ih : x \in A \to i[h, x]$ is a derivation, called inner. Hence if $A$ is not commutative, then we know that there exist non-trivial uniformly continuous flows on $A$.

We recall the notion of almost uniform continuity, which was introduced in [11]. The flow $\alpha$ on a $C^*$-algebra $A$ is said to be almost uniformly continuous if, for any $\alpha$-invariant (closed two-sided) ideal $I$ of $A$, the flow on the quotient $A/I$ induced by $\alpha$ has a non-zero $\alpha$-invariant hereditary $C^*$-subalgebra on which the induced flow is uniformly continuous. (If $\alpha$ is almost uniformly continuous, then $\alpha$ leaves all the ideals invariant. If furthermore $A$ is a simple unital $C^*$-algebra, then $\alpha$ is uniformly continuous.) We will quote some results from [10, 11, 4]:
Proposition 1.1. Let $A$ be a $C^*$-algebra and $\alpha$ a flow on $A$. Then the following conditions are equivalent:
(1) Any irreducible representation $\pi$ of $A$ is $\alpha$-covariant, i.e. there is a unitary flow (or weakly continuous one-parameter unitary group) $U$ in the representation space of $\pi$ such that $\pi(\alpha_t(x)) = \mathrm{Ad}\, U_t\, \pi(x)$, $x \in A$.
(2) $\alpha^*$ is strongly continuous on the dual $A^*$.
(3) $\alpha$ is almost uniformly continuous.
(4) $\alpha$ is universally weakly inner, i.e. there is a weakly* continuous unitary flow $U$ in the second dual $A^{**}$ such that $\alpha_t(x) = \mathrm{Ad}\, U_t(x)$, $x \in A$.
(5) There is a net $(h_\nu)$ of self-adjoint elements in $A$ such that $\mathrm{Ad}\, e^{ith_\nu}(x)$ converges to $\alpha_t(x)$ uniformly in $t$ on every compact subset of $\mathbb{R}$, for all $x \in A$ and, simultaneously, $e^{ith_\nu}$ weakly* converges to $U_t$ uniformly in $t$ on every compact subset of $\mathbb{R}$, where $U$ is a unitary flow in the second dual $A^{**}$ as in (4).

The last condition above, in particular, says that such an $\alpha$ is approximately inner (or AI for brevity); if the $C^*$-algebra $A$ is separable, we can choose a subsequence $(h_n)$ from the net $(h_\nu)$ such that $\mathrm{Ad}\, e^{ith_n}(x)$ converges to $\alpha_t(x)$ uniformly in $t$ on every compact subset of $\mathbb{R}$ for any $x \in A$. But, since this fact also follows from condition (3) directly, the main point in (5) is to choose a net $(h_\nu)$ so that it satisfies the two requirements simultaneously. This is related to our second result below.

The following is of course well-known, but it exemplifies how a flow might be used.

Proposition 1.2. If $A$ is a separable $C^*$-algebra and $\alpha$ is a flow which is not almost uniformly continuous, then $A$ has uncountably many equivalence classes of irreducible representations.

Proof. By Proposition 1.1(1) there is an irreducible representation $\pi$ of $A$ such that $\pi$ is not $\alpha$-covariant. We shall show that $\{\pi\alpha_t \mid t \in \mathbb{R}\}$ already contains uncountably many equivalence classes of irreducible representations of $A$. Let $\omega$ be a pure state associated with $\pi$ and $\varphi$ a pure state of $A$ associated with another irreducible representation $\rho$. Then $S_\rho = \{t \in \mathbb{R} \mid \pi\alpha_t \sim \rho\}$ is given as
\[ \bigcup_k \bigcap_\ell \{ t \in \mathbb{R} \mid \|\omega\alpha_t(a_\ell) - \varphi\, \mathrm{Ad}\, u_k(a_\ell)\| \le 1 \}, \]
where $(u_k)$ is a dense sequence in the unitary group $U(A)$ of $A$ and $(a_\ell)$ is a dense sequence in the unit ball of $A$. Hence $S_\rho$ is a Borel set. If $S_\rho$ has positive measure, then for $s, t \in S_\rho$, $\pi\alpha_s \sim \pi\alpha_t$, i.e. $\pi\alpha_{s-t}$ is equivalent to $\pi$. Since $S_\rho - S_\rho$ contains a non-empty open neighborhood of 0, this implies that $\pi\alpha_t \sim \pi$ for all $t \in \mathbb{R}$. Thus $\pi$ is covariant (as shown by using that $A$ is separable), which is a contradiction. Hence $S_\rho$ is a null set for any irreducible representation $\rho$. This implies the assertion.
To measure how far the flow $\alpha$ is from the uniformly continuous ones, we could use what is called the Connes spectrum or the Borchers spectrum. We recall the definition of the Connes spectrum (resp. Borchers spectrum) [21]. The Connes spectrum $R_C(\alpha)$ (resp. the Borchers spectrum $R_B(\alpha)$) of a flow $\alpha$ on $A$ is
\[ \bigcap_B \mathrm{Spec}(\alpha|B), \]
where $B$ runs over all the non-zero $\alpha$-invariant hereditary $C^*$-subalgebras of $A$ (resp. those which generate essential ideals of $A$) and $\mathrm{Spec}(\alpha|B)$ denotes the (Arveson) spectrum of the restriction $\alpha|B$. It is known that $R_C(\alpha)$ is a closed subgroup of $\mathbb{R}$. It follows immediately that $R_C(\alpha) \subset R_B(\alpha)$ in general, $R_C(\alpha) = R_B(\alpha)$ if $A$ is prime, and that $R_B(\alpha) = \{0\}$ if $\alpha$ is almost uniformly continuous.

What we actually use is the following stronger property. For a non-empty open subset $O$ of $\mathbb{R}$ let $A^{\alpha}(O)$ denote the spectral subspace for $\alpha$ corresponding to $O$, i.e. the closure of the subset of elements of the form $\int f(t)\alpha_t(x)\, dt$, where $x \in A$ and $f$ is a continuous integrable function on $\mathbb{R}$ such that its Fourier transform $\hat{f}$ has support in $-O$. We will use the property that for each non-empty open subset $O$ of $\mathbb{R}$ there is a non-trivial (bounded) central sequence in $A^{\alpha}(O)$; a bounded sequence $(z_n)$ is a non-trivial central sequence if $\|[x, z_n]\| \to 0$ and that $\lim_n \|x z_n\| = 0$ entails $x = 0$ for any $x \in A$. We will call a flow having this property a profound flow below. Our first result is:

Theorem 1.3. Let $A$ be a separable $C^*$-algebra and let $\pi$ be an irreducible representation of $A$ such that $\pi(A) \cap \mathcal{K}(\mathcal{H}_\pi) = \{0\}$. Then there exists an AI flow $\alpha$ on $A$ such that $\pi$ is $\alpha$-covariant and the flow on $A/\ker \pi$ induced by $\alpha$ is profound; in particular this induced flow has full Connes (or Borchers) spectrum.

As a consequence of this theorem we have that any simple separable $C^*$-algebra which is not of type I has a profound AI flow $\alpha$. In this case, by [12, 13], there is an irreducible representation $\pi$ of $A$ such that $\int_{\mathbb{R}}^{\oplus} \pi\alpha_t\, dt$ is a type I representation of $A$ whose center is $L^{\infty}(\mathbb{R})$ naturally; in particular $\pi\alpha_t$ is disjoint from $\pi$ for all $t \ne 0$. In the proof of Theorem 1.3 we use the recent result on $C^*$-algebras from [20].

In the above theorem we could choose a sequence $(h_n)$ of self-adjoint elements of $A$ such that $\mathrm{Ad}\, e^{ith_n}(x)$ converges to $\alpha_t(x)$ for any $x \in A$ and $\pi(e^{ith_n})$ converges to a unitary flow $U_t$ implementing $\alpha_t$ in $\mathcal{H}_\pi$ (both uniformly in $t$ on every compact subset of $\mathbb{R}$). Actually we could choose a continuous function $h$ of $[0, \infty)$ into $A_{sa}$, the self-adjoint part of $A$, (instead of just a sequence) such that $\lim_{s \to \infty} \mathrm{Ad}\, e^{ith(s)}(x) = \alpha_t(x)$, $x \in A$ and $\lim_{s \to \infty} \pi(e^{ith(s)}) = U_t$ as above. If we call such an $\alpha$ asymptotically inner, just as for the case of single automorphisms, we do not know whether the asymptotically inner flows are more restrictive than AI flows or not.

Our second result shows that it is always possible to choose such a $(h_n)$.
Theorem 1.4. Let $A$ be a separable $C^*$-algebra and $\alpha$ an AI flow on $A$. Let $\pi$ be an $\alpha$-covariant irreducible representation with $U$ an implementing unitary group. Then there exists a sequence $(h_n)$ of self-adjoint elements of $A$ such that
\[ \lim_{n \to \infty} \mathrm{Ad}\, e^{ith_n}(x) = \alpha_t(x), \quad x \in A, \qquad \lim_{n \to \infty} \pi(e^{ith_n}) = U_t \quad \text{strongly}, \]
both uniformly in $t$ on every compact subset of $\mathbb{R}$.

If $\alpha$ is an AI flow, then, by definition, there is a sequence $(h_n)$ in $A_{sa}$ such that $\lim_n \mathrm{Ad}\, e^{ith_n}(x) = \alpha_t(x)$, $x \in A$, which is one of the conditions above. But the strong convergence of $(\pi(e^{ith_n}))$ is not automatic by any means. (Simply we could take $h_n - \lambda_n 1$ instead of $h_n$, where $(\lambda_n)$ is just an arbitrary sequence in $\mathbb{R}$.) We have to adjust $(h_n)$ to get the strong convergence.

It is likely that any AI flow has a covariant irreducible representation. But we could show only:

Proposition 1.5. Let $A$ be a separable $C^*$-algebra and $\alpha$ an AI flow. If $A$ has a non-zero projection, then there is an $\alpha$-covariant irreducible representation of $A$.

Proof. By the assumption there is a sequence $(h_n)$ in $A_{sa}$ such that $\lim_n \mathrm{Ad}\, e^{ith_n}(x) = \alpha_t(x)$, $x \in A$ uniformly in $t$ on every compact subset of $\mathbb{R}$. This condition is equivalent to saying that the sequence $(\mathrm{ad}\, ih_n)$ converges to the generator $\delta$ of $\alpha$ in the graph sense (see e.g. [3, 25]).

If $A$ has a unit, we may take the irreducible representation associated with an extreme ground state [23], which is covariant. (That is, take a state $\varphi_n$ such that $\varphi_n(h_n) = \min \mathrm{Spec}(h_n)$ and then any weak* limit point of $(\varphi_n)$ is a ground state for $\alpha$. Since the set of ground states forms a closed face of the state space and is non-empty as is just shown, we may take an arbitrary extreme state of this face.)

Suppose that $A$ has no unit. Let $e$ be a non-zero projection in $A$. We may assume that $e$ is in the domain of $\delta$ [3, 25]. Then, since $[\delta(e)e - e\delta(e), e] = \delta(e)$, we may assume that $\delta(e) = 0$ by taking $\delta - \mathrm{ad}(\delta(e)e - e\delta(e))$ instead of $\delta$. Then there is a sequence $(e_n)$ in $A$ such that $\|e - e_n\| \to 0$ and $\mathrm{ad}\, ih_n(e_n) \to 0$. We can then assume that the $e_n$'s are self-adjoint and, by functional calculus, that they are projections. Then, replacing $h_n$ by $e_n h_n e_n + (1 - e_n) h_n (1 - e_n)$, which is approximately equal to $h_n$, we may assume that $\mathrm{ad}\, ih_n(e_n) = 0$. There is a sequence $(u_n)$ of unitaries in $A + \mathbb{C}1$ such that $\mathrm{Ad}\, u_n(e_n) = e$ and $\|u_n - 1\| \to 0$. Then $\mathrm{Ad}\, u_n\, \mathrm{Ad}\, e^{ith_n}\, \mathrm{Ad}\, u_n^*$ leaves $e$ invariant and converges to $\alpha_t$. This implies that the restriction $\alpha|eAe$ is also an AI flow. Since the $C^*$-algebra $eAe$ has a unit, it has a covariant irreducible representation with respect to $\alpha|eAe$, as is shown above. We can extend it uniquely to an irreducible representation of $A$, which is covariant for $\alpha$.

As another condition which shows the existence of covariant irreducible representations, we quote from [12, 13].
Proposition 1.6. Let $A$ be a separable prime $C^*$-algebra and $\alpha$ a profound flow. Then there is a faithful $\alpha$-covariant irreducible representation of $A$.

In the final section we will extend the above theorem to cover the case where $\pi$ is allowed to be a type I representation.

Theorem 1.7. Let $A$ be a separable $C^*$-algebra and $\alpha$ an AI flow on $A$. Let $\pi$ be an $\alpha$-covariant type I representation on a separable Hilbert space $\mathcal{H}$ such that there is a unitary flow $U$ in $\pi(A)''$ which implements $\alpha$. Then there exists a sequence $(h_n)$ of self-adjoint elements of $A$ such that
\[ \lim_{n \to \infty} \mathrm{Ad}\, e^{ith_n}(x) = \alpha_t(x), \quad x \in A, \qquad \lim_{n \to \infty} \pi(e^{ith_n}) = U_t \quad \text{strongly}, \]
both uniformly in $t$ on every compact subset of $\mathbb{R}$.

By adopting the methods for proving the above theorem we also show:

Corollary 1.8. Let $A$ be a separable simple $C^*$-algebra and let $\pi_1$ and $\pi_2$ be non-degenerate representations of $A$ on a separable Hilbert space such that $\pi_1(A)''$ is isomorphic to $\pi_2(A)''$ and is a von Neumann algebra of type I. Then there exists an asymptotically inner automorphism $\alpha$ of $A$ such that $\pi_1\alpha$ is equivalent to $\pi_2$. Furthermore for any isomorphism $\varphi$ of $\pi_1(A)''$ onto $\pi_2(A)''$ there exists such an $\alpha$ as above such that $\varphi\pi_1\alpha(x) \mapsto \pi_2(x)$ extends to an inner automorphism of $\pi_2(A)''$.

Here an automorphism $\alpha$ is asymptotically inner if there is a continuous function $u$ of $[0, \infty)$ into $U(A)$ such that $\alpha(x) = \lim_{t \to \infty} \mathrm{Ad}\, u_t(x)$, $x \in A$. If $A$ is UHF or AF, even more is known (see [22, 1]). Without assuming that $A$ is simple, the statement of the above corollary does not follow in general even for faithful $\pi_1$ and $\pi_2$ (for example consider the case where $A$ is abelian).

Before concluding this introduction let us give a couple of remarks on non-AI flows. There are of course non-AI flows on many (separable) $C^*$-algebras, but to construct a non-AI flow on a $C^*$-algebra we need to know something on the global structure of the $C^*$-algebra. We can construct non-AI flows for the class of purely infinite simple $C^*$-algebras which is classified by Kirchberg and Phillips [7, 8], using non-AI flows on the Cuntz algebras which can be constructed explicitly (see, e.g. [9]), and for some simple AT $C^*$-algebras (see, e.g. [16, 17]), and for some AF $C^*$-algebras [19]. The properties which could be used to distinguish non-AI flows from AI flows include simplicity of crossed products, non-existence of KMS states and ground states [23], the Rohlin property [15], non-triviality of the induced map from $K_1$ into the affine space on the tracial state space [16], etc.

In the following sections we will present the proofs of the theorems stated above.

2. Proof of Theorem 1.3

We will use the following lemma to generate an AI flow:
Lemma 2.1. Let $A$ be a separable $C^*$-algebra and $(a_n)$ a dense sequence in $A$. Let $(h_n)$ be a sequence in $A_{sa}$ such that
\[ \begin{aligned} \|h_n\| &\le 1, \\ \|[h_n, a_m]\| &\le e^{-n}\|a_m\|, \quad m \le n, \\ \|[h_n, h_m]\| &\le e^{-n}, \quad m < n. \end{aligned} \]
Let $H_n = \sum_{k=1}^{n} h_k$. Then $\mathrm{Ad}\, e^{itH_n}(x)$ converges for any $x \in A$ as $n \to \infty$ and defines a flow on $A$.

Proof. This can be proven based on a theorem concerning a sequence of bounded dissipative operators which are almost commutative with each other [2]. Note that the derivation $\mathrm{ad}\, iH_n$ generates the flow $t \mapsto \mathrm{Ad}\, e^{itH_n}$. Let $\delta$ be the graph limit of the sequence $(\mathrm{ad}\, iH_n)$, i.e. $x$ belongs to the domain $D(\delta)$ if there is a sequence $(x_n)$ in $A$ such that $\lim_n x_n = x$ and $(\mathrm{ad}\, iH_n(x_n))$ converges, whose limit is defined as $\delta(x)$. Since $\pm\mathrm{ad}\, iH_n$ is dissipative, both $\delta$ and $-\delta$ are dissipative. We have that $a_m \in D(\delta)$ for any $m$; hence $D(\delta)$ is dense in $A$. Since $\|[h_n, H_m]\| \le m e^{-n}$ for $n > m$ and $h_m \in D(\delta)$, we have that $H_m \in D(\delta)$ and $\|\delta(H_m)\| \le m e^{-m-1}(1 - e^{-1})^{-1}$. Thus it follows that $\lim_n [\mathrm{ad}\, iH_n, \mathrm{ad}\, iH_m] = \lim_n \mathrm{ad}[iH_n, iH_m] = \mathrm{ad}\, i\delta(H_m)$, which converges to 0 as $m \to \infty$. Hence by [2], $\delta$ is a generator and $e^{t\delta}(x) = \lim \mathrm{Ad}\, e^{itH_n}(x)$, $x \in A$.

We will also give another proof which is more concrete and mimics the proof of the existence result of time-developments as in [3]. First we give:

Lemma 2.2. In the situation of the above lemma, for each $m$ it follows that
\[ \sum_{k_1=1}^{\infty} \sum_{k_2=1}^{\infty} \cdots \sum_{k_n=1}^{\infty} \|\mathrm{ad}\, h_{k_1}\, \mathrm{ad}\, h_{k_2} \cdots \mathrm{ad}\, h_{k_n}(a_m)\| \le n!\, n\, 2^{n+1} \|a_m\| \]
for all large $n$.

Proof. Let $x_i = \mathrm{ad}\, h_{k_i}\, \mathrm{ad}\, h_{k_{i+1}} \cdots \mathrm{ad}\, h_{k_n}(a_m)$. If $k_i \equiv N > m$ and $k_j < N$ for all $j > i$, then we have that if $i = n$ then $\|x_n\| \le e^{-N}\|a_m\|$ and if $i < n$,
\[ \begin{aligned} \|x_i\| &= \|\mathrm{ad}\, h_N\, \mathrm{ad}\, h_{k_{i+1}}(x_{i+2})\| \\ &\le \|\mathrm{ad}([h_N, h_{k_{i+1}}])(x_{i+2})\| + \|\mathrm{ad}\, h_{k_{i+1}}\, \mathrm{ad}\, h_N(x_{i+2})\| \\ &\le 2e^{-N}\|x_{i+2}\| + 2\|\mathrm{ad}\, h_N(x_{i+2})\| \\ &\le 2^{n-i} e^{-N}\|a_m\| + 2\|\mathrm{ad}\, h_N(x_{i+2})\|, \end{aligned} \]
where $x_{n+1} = a_m$ and we have used that $\|x_{i+2}\| \le 2^{n-i-1}\|a_m\|$. Hence by repeating this procedure we have that
\[ \|x_i\| \le 2^{n-i}(n - i + 1) e^{-N}\|a_m\|. \]
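Since $\|x_1\| \le 2^{i-1}\|x_i\|$, it follows that if $\max(k_1, k_2, \dots, k_n) \equiv N > m$, then
\[ \|\mathrm{ad}\, h_{k_1}\, \mathrm{ad}\, h_{k_2} \cdots \mathrm{ad}\, h_{k_n}(a_m)\| \le 2^{n-1} n e^{-N}\|a_m\|. \]
There are at most $n N^{n-1}$ terms with $\max(k_1, k_2, \dots, k_n) = N$. Hence we can estimate that
\[ \begin{aligned} \sum_{k_1=1}^{\infty} \sum_{k_2=1}^{\infty} \cdots \sum_{k_n=1}^{\infty} \|\mathrm{ad}\, h_{k_1}\, \mathrm{ad}\, h_{k_2} \cdots \mathrm{ad}\, h_{k_n}(a_m)\| &= \sum_{N=1}^{\infty} \sum_{\max(k_1, \dots, k_n) = N} \|\mathrm{ad}\, h_{k_1}\, \mathrm{ad}\, h_{k_2} \cdots \mathrm{ad}\, h_{k_n}(a_m)\| \\ &\le (2m)^n \|a_m\| + \sum_{N=m+1}^{\infty} 2^{n-1} n^2 e^{-N} N^{n-1} \|a_m\|. \end{aligned} \]
Since $\sum_{N=m+1}^{\infty} e^{-N} N^{n-1}$ is smaller than
\[ \int_m^{\infty} e^{-x} x^{n-1}\, dx + e^{-n+1}(n - 1)^{n-1} < 2(n - 1)! \]
(at least for large $n$), we obtain that the above sum is smaller than $n!\, n\, 2^{n+1}\|a_m\|$ for all large $n$.

Another Proof of Lemma 2.1. Recall that $H_n = \sum_{k=1}^{n} h_k$. If $|t| < 1/2$, then the limit of the sequence $(\mathrm{Ad}\, e^{itH_n}(a_m))_n$ exists and equals
\[ \sum_{n=0}^{\infty} \frac{t^n}{n!} \sum_{k_1, k_2, \dots, k_n} \mathrm{ad}\, ih_{k_1}\, \mathrm{ad}\, ih_{k_2} \cdots \mathrm{ad}\, ih_{k_n}(a_m). \]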
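The restriction $|t| < 1/2$ can be read off from Lemma 2.2: for large $n$ the $n$-th order term of the series is bounded in norm by $\frac{|t|^n}{n!} \cdot n!\, n\, 2^{n+1}\|a_m\| = n\, 2^{n+1}|t|^n\|a_m\|$, and $\sum_n n\, 2^{n+1}|t|^n$ converges exactly when $|t| < 1/2$, with value $4|t|/(1 - 2|t|)^2$. The sketch below is only a numerical illustration of this majorant, assuming $\|a_m\| = 1$ and ignoring the finitely many small $n$ for which the bound of Lemma 2.2 is not claimed; the function names are hypothetical.

```python
# Sketch: partial sums of the majorant sum_{n>=1} n * 2^(n+1) * |t|^n
# compared with its closed form 4|t| / (1 - 2|t|)^2, valid for |t| < 1/2.

def majorant_partial_sum(t, terms):
    """Sum of n * 2^(n+1) * |t|^n for n = 1, ..., terms."""
    return sum(n * 2.0 ** (n + 1) * abs(t) ** n for n in range(1, terms + 1))

def majorant_closed_form(t):
    """Closed form 2r / (1 - r)^2 with r = 2|t|, valid only for |t| < 1/2."""
    r = 2.0 * abs(t)
    return 2.0 * r / (1.0 - r) ** 2

t = 0.4                                   # any value with |t| < 1/2 would do
partial = majorant_partial_sum(t, 400)    # 400 terms: the tail is negligible
closed = majorant_closed_form(t)
print(partial, closed)

# At the boundary |t| = 1/2 the n-th term equals 2n, so the majorant diverges.
boundary_terms = [n * 2.0 ** (n + 1) * 0.5 ** n for n in range(1, 6)]
print(boundary_terms)
```

At $t = 0.4$ the partial sums settle at the closed-form value 40, while at $|t| = 1/2$ the terms grow linearly, which is consistent with the interval $(-1/2, 1/2)$ used in the proof.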
Let $\hat{\alpha}_t(a_m) = \lim_n \mathrm{Ad}\, e^{itH_n}(a_m)$ for $t \in (-1/2, 1/2)$. Then, since $\|\hat{\alpha}_t(a_m)\| = \|a_m\|$, $\hat{\alpha}_t$ extends to a homomorphism $\alpha_t$ of $A$ such that $\alpha_t(x) = \lim_n \mathrm{Ad}\, e^{itH_n}(x)$ for any $x \in A$. It then follows that $\lim_n \mathrm{Ad}\, e^{itH_n}(x)$ exists for any $x \in A$ and for any $t \in \mathbb{R}$. Since $\alpha_s \alpha_t = \alpha_{s+t}$, $(\alpha_t)$ defines a flow $\alpha$ on $A$.

We first quote a lemma from [20]:

Lemma 2.3. Let $A$ be a $C^*$-algebra. Let $F$ be a finite subset of $A$, $\pi$ an irreducible representation of $A$ on a Hilbert space $\mathcal{H}$, $E$ a finite-dimensional projection on $\mathcal{H}$, and $\varepsilon > 0$. Then there exists an $x \in M_{1n}(A)$ for some $n \in \mathbb{N}$ such that $\|x\| \le 1$, $\pi(xx^*)E = E$, and $\|\mathrm{ad}\, a\, \mathrm{Ad}\, x\| \le \varepsilon\|a\|$, $a \in F$.

Lemma 2.4. Let $\xi_0$ be a unit vector in $\mathcal{H}_\pi$, $(a_n)$ be a dense sequence in $A$, and $(\lambda_n)$ a dense sequence in $(0, 1]$. Then there exist a sequence $(h_n)$ in $A_{sa}$, a sequence $(v_n)$ in $A$, and a sequence $(\xi_n)$ of unit vectors in $\mathcal{H}_\pi$ such that
\[ \begin{aligned} & 0 \le h_n \le 1, \\ & \|v_n\| = 1, \\ & \|[h_n, a_m]\| \le e^{-n}\|a_m\|, \quad m \le n, \\ & \|[h_n, h_m]\| \le e^{-n}, \quad m < n, \\ & \|[v_n, a_m]\| \le e^{-n}\|a_m\|, \quad m \le n, \\ & \langle \xi_n, \xi_m \rangle = 0, \quad m < n, \\ & \pi(h_n)\xi_m = 0, \quad m < n, \\ & \|\pi(h_m)\xi_n\| \le e^{-n}, \quad m < n, \\ & \|\pi(h_n)\xi_n - \lambda_n \xi_n\| \le e^{-n}, \\ & \|\pi(v_n)\xi_0 - \xi_n\| \le e^{-n}. \end{aligned} \]

Proof. We will construct $(h_n, v_n, \xi_n)$ inductively. Suppose that we have $(h_m, v_m, \xi_m)$ for $m < n$; this will work for $n = 1$, too. Let $F = \{a_1, \dots, a_n, h_1, \dots, h_{n-1}\}$. Then we choose, by the previous lemma, an $x \in M_{1k}(A)$ for some $k$ such that $xx^* \le 1$, $\pi(xx^*)\xi_0 = \xi_0$, and $\|\mathrm{ad}\, b\, \mathrm{Ad}\, x\| \le e^{-n}\|b\|$, $b \in F$. We will define $h_n, v_n$ as $\mathrm{Ad}\, x(y)$ for some $y \in A$ with $\|y\| = 1$, which will take care of the commutativity of $h_n, v_n$ with $a_m$, $m \le n$ and $h_m$, $m < n$.

Let $G_1$ be the union of $\{xx^*\}$, $\{h_i^2 \mid 1 \le i < n\}$, and $\{x_i x_j^* \mid 1 \le i, j \le k\}$. Let $V$ be the (finite-dimensional) subspace of $\mathcal{H}_\pi$ generated by $\xi_m$ and $\pi(x_i x_j^*)\xi_m$, $1 \le i, j \le k$, $0 \le m < n$. We find a unit vector $\xi_n \in \mathcal{H}_\pi \ominus V$ such that
\[ |\langle \pi(z)\xi_0, \xi_0 \rangle - \langle \pi(z)\xi_n, \xi_n \rangle| < \min(e^{-2n}, \delta^2), \quad z \in G_1. \]
Here δ > 0 is a small constant, which will be chosen later. This is possible because of the non-type I condition that π(A)∩K(Hπ ) = {0}. Since π(hm )ξ0 = 0 for m < n, it follows that kπ(hm )ξn k < e−n for m < n. Since ξm ∈ V for m < n, it follows that ξn is orthogonal to ξm , m < n. The subspace V1 spanned by π(x∗i )ξm , 1 ≤ i ≤ k, 0 ≤ m < n is orthogonal to the subspace V2 spanned by π(x∗i )ξn , 1 ≤ i ≤ k. By Kadison’s transitivity we find an h in Asa such that 0 ≤ h ≤ 1, π(h) = 0 on V1 , and π(h) = λn 1 on V2 and define hn = xhx∗ . If m < n, then X π(xi hx∗i )ξm = 0 . π(hn )ξm = i ∗
Since a = π(xx ) is positive and has norm 1, aξ0 = ξ0 , and |haξn , ξn i − 1| < e−2n , it follows that kaξn − ξn k ≤ e−n . Hence, since π(hn )ξn = π(xhx∗ )ξn = λn π(xx∗ )ξn = λn aξn , we have that kπ(hn )ξn − λn ξn k ≤ e−n . Note that we have assumed that |hπ(x∗i )ξ0 , π(x∗j )ξ0 i − hπ(x∗i )ξn , π(x∗j )ξn i| ≤ δ 2 ,
1 ≤ i, j ≤ k
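for a small $\delta > 0$. By Lemma 3.3 of [5], for any $\varepsilon' > 0$ we have a $\delta > 0$ such that the above condition implies that there are $\eta_i$'s in $\mathcal{H}_\pi$ such that $\langle \eta_i, \eta_j \rangle = \langle \pi(x_i^*)\xi_0, \pi(x_j^*)\xi_0 \rangle$ and $\|\eta_i - \pi(x_i^*)\xi_n\| < \varepsilon'$. Then, by Kadison's transitivity again, there is a $v \in A$ such that $\|v\| = 1$ and
\[
\pi(v)\pi(x_i^*)\xi_0 = \eta_i , \qquad 1 \le i \le k .
\]
We define $v_n = xvx^*$. Then
\[
\pi(v_n)\xi_0 = \sum_i \pi(x_i)\eta_i
\]
is close to $\pi(xx^*)\xi_n = a\xi_n$ up to $k\varepsilon'$. Since the latter is also close to $\xi_n$ up to $\min(e^{-n}, \delta)$, we can conclude, for a small choice of $\delta$, that $\|\pi(v_n)\xi_0 - \xi_n\| \le e^{-n}$.

Lemma 2.5. For all $m \in \mathbb{N}$ it follows that
\[
\sum_{k_1=1}^{\infty} \sum_{k_2=1}^{\infty} \cdots \sum_{k_n=1}^{\infty} \|\pi(h_{k_1} h_{k_2} \cdots h_{k_n})\xi_m\| \le 2\,n!\,n
\]
for all large $n$.

Proof. Let $\Sigma_n$ denote the infinite sum in the statement of the lemma. Then we can estimate $\Sigma_n$ as
\[
\Sigma_n \le m^n + \sum_{N=m+1}^{\infty} \ \sum_{\max(k_1,\ldots,k_n)=N} \|h_{k_1} \cdots h_{k_n}\xi_m\| ,
\]
where we have omitted $\pi$. If $k_i = N > m$ and $k_j < N$ for $j > i$, then
\[
\begin{aligned}
\|h_{k_1} \cdots h_{k_n}\xi_m\| &\le \|h_N h_{k_{i+1}} \cdots h_{k_n}\xi_m\| \\
&\le \|[h_N, h_{k_{i+1}}] h_{k_{i+2}} \cdots h_{k_n}\xi_m\| + \|h_N h_{k_{i+2}} \cdots h_{k_n}\xi_m\| \\
&\le (n - i)e^{-N} .
\end{aligned}
\]
Hence we obtain that
\[
\Sigma_n \le m^n + \sum_{N=m+1}^{\infty} (n-1)\,n\,e^{-N} N^{n-1} .
\]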
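The estimate of this right-hand side is not written out in the text. One way to carry it out is sketched below; the comparison of the sum with the Gamma integral and the threshold on $n$ are choices made here for illustration and are not taken from the original argument.

```latex
% Sketch: bounding  m^n + \sum_{N>m} (n-1) n e^{-N} N^{n-1}  by  2 n! n  for large n.
% The function x -> x^{n-1} e^{-x} is unimodal on [0,\infty) with maximum at x = n-1,
% so its sum over the positive integers is at most its integral plus its maximum:
\[
\sum_{N=1}^{\infty} N^{n-1}e^{-N}
 \le \int_0^\infty x^{n-1}e^{-x}\,dx + \Bigl(\frac{n-1}{e}\Bigr)^{n-1}
 \le 2\,(n-1)! ,
\]
% using \Gamma(n) = (n-1)! and (k/e)^k \le k!.  Consequently
\[
\sum_{N=m+1}^{\infty}(n-1)\,n\,e^{-N}N^{n-1} \le 2(n-1)\,n\,(n-1)! = 2(n-1)\,n! .
\]
% For fixed m one has m^n \le 2\,n! for all large n, hence
\[
\Sigma_n \le m^n + 2(n-1)\,n! \le 2\,n! + 2(n-1)\,n! = 2\,n!\,n .
\]
```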
The right side can be estimated as desired.

Proof of Theorem 1.3. By Lemmas 2.2 and 2.4 it follows that $\lim_n \mathrm{Ad}\,e^{itH_n}$ defines a flow $\alpha$ on $A$, where $H_n = \sum_{k=1}^{n} h_k$. Since $\pi(H_n)\xi_0 = 0$, it follows that $\lim_n \pi(e^{itH_n})$ defines a unitary group $U$ on $\mathcal{H}_\pi$ such that $U_t\xi_0 = \xi_0$ and $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$, $t \in \mathbb{R}$.
By the power series of $U_t\xi_m$ for $|t| < 1$, obtained by the previous lemma, we have that
\[
\Bigl\| \frac{d}{dt}U_t\xi_m - i\lambda_m U_t\xi_m \Bigr\| \le me^{-m} .
\]
This implies, by integration, that $\|U_t\xi_m - e^{i\lambda_m t}\xi_m\| \le me^{-m}|t|$.

Let $f \in C^\infty(\mathbb{R})$ be a non-negative integrable function such that $\mathrm{supp}\,\hat f$ is a small compact neighborhood of $0$ and $\int f(t)\,dt = 1$. We recall that $(v_n)$ is a central sequence of norm one elements in $A$ satisfying $\|\pi(v_n)\xi_0 - \xi_n\| \le e^{-n}$. Let
\[
w_n = \int e^{-i\lambda_n t} f(t)\,\alpha_t(v_n)\,dt .
\]
Then $\|w_n\| \le 1$ and, by computation,
\[
\pi(w_n)\xi_0 - \xi_n = \int e^{-i\lambda_n t} f(t)\,U_t\pi(v_n)\xi_0\,dt - \xi_n \approx \int f(t)\bigl(e^{-i\lambda_n t}U_t\xi_n - \xi_n\bigr)\,dt
\]
up to $e^{-n}$. Hence we obtain that
\[
\|\pi(w_n)\xi_0 - \xi_n\| \le e^{-n} + ne^{-n}\int |t| f(t)\,dt ,
\]
which, in particular, implies that $\|\pi(w_n^* w_n)\xi_0 - \xi_0\| \to 0$. Since $\mathrm{Spec}_\alpha(w_n) \subset \lambda_n - \mathrm{supp}\,\hat f$, $(\pi(w_n))$ is a non-trivial central sequence in $\pi(A)$ (since $\pi(w_n^* w_n)$ converges to $1$ in the weak topology), and $(\lambda_n)$ is dense in $[0, 1]$, this implies that the flow on $A/\ker\pi \cong \pi(A)$ induced by $\alpha$ is profound. This follows because we can take products of elements from the sequences $(\pi(w_n))$ and $(\pi(w_n^*))$ to get a non-trivial central sequence in any spectral subspace. We can see that the Connes spectrum of the flow on $A/\ker\pi \cong \pi(A)$ induced by $\alpha$ contains $[0, 1]$: if $B$ is a non-zero invariant hereditary $C^*$-subalgebra of $A/\ker\pi$, then we take a positive $e \in B$ such that $\mathrm{Spec}_\alpha(e)$ is a small neighborhood of $0$, and then $\mathrm{Spec}_\alpha(e\pi(w_n)e) \subset \lambda_n - \mathrm{supp}\,\hat f + 2\,\mathrm{Spec}_\alpha(e)$ and $\|e\pi(w_n)e\| \to \|e\|^2$. Hence the Connes spectrum, being a closed subgroup, equals $\mathbb{R}$.

3. Proof of Theorem 1.4

When $A$ is a $C^*$-algebra, we denote by $\mathcal{U}(A)$ the unitary group of $A$ (or of $A + \mathbb{C}1$ if $A \not\ni 1$). If $\alpha$ is a (strongly continuous) flow on $A$, we call a continuous mapping $u : \mathbb{R} \to \mathcal{U}(A)$ an $\alpha$-cocycle if $u_s\alpha_s(u_t) = u_{s+t}$ for all $s, t \in \mathbb{R}$. In this case $t \mapsto \mathrm{Ad}\,u_t\,\alpha_t$ is a flow, which will be denoted by $\mathrm{Ad}\,u\,\alpha$. If $u$ is differentiable with $ik = du_t/dt|_{t=0}$, then $k \in A_{sa}$, i.e. it is a self-adjoint element of $A$. Any $k \in A_{sa}$ uniquely determines a differentiable $\alpha$-cocycle $u$ with $du_t/dt|_{t=0} = ik$. In this case we will also denote $\mathrm{Ad}\,u\,\alpha$ by $\alpha^{(k)}$. If $\delta_\alpha$ denotes the generator of $\alpha$, the generator of $\alpha^{(k)}$ is $\delta_\alpha + \mathrm{ad}\,ik$. See [3, 25, 18] for details.
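The statement about the generator of the perturbed flow can be checked formally by differentiating at $t = 0$. The following is a sketch for $x$ in the domain of $\delta_\alpha$, assuming enough regularity to apply the product rule; it is included for orientation only and is not part of the original text.

```latex
% Formal check that the generator of Ad u \alpha is \delta_\alpha + ad ik,
% for x in the domain of \delta_\alpha and du_t/dt|_{t=0} = ik with k = k^*.
\[
\frac{d}{dt}\Bigl(u_t\,\alpha_t(x)\,u_t^*\Bigr)\Big|_{t=0}
 = (ik)\,x + \delta_\alpha(x) + x\,(ik)^*
 = \delta_\alpha(x) + i(kx - xk)
 = \delta_\alpha(x) + \mathrm{ad}\,ik\,(x),
\]
% since u_0 = 1, \alpha_0 = id and (ik)^* = -ik.  Differentiating the cocycle
% identity u_{s+t} = u_s \alpha_s(u_t) in t at t = 0 gives the equation
\[
\frac{d}{ds}u_s = u_s\,\alpha_s(ik),
\]
% which is the sense in which k determines the cocycle u.
```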
Lemma 3.1. Let $A$ be a $C^*$-algebra and $\alpha$ an AI flow on $A$. Let $\pi$ be an $\alpha$-covariant representation of $A$ and let $U$ be a unitary flow on $\mathcal{H}_\pi$ such that $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$ for $t \in \mathbb{R}$, and let $H$ denote the generator of $U$: $U_t = e^{itH}$. Suppose that there is a sequence $(h_n)$ in $A_{sa}$ such that
\[
\begin{aligned}
\lim_n \mathrm{Ad}\,e^{ith_n}(x) &= \alpha_t(x) , & & x \in A , \\
\lim_n \pi(e^{ith_n}) &= U_t , & & \text{strongly} ,
\end{aligned}
\]
both uniformly in $t$ on every compact subset of $\mathbb{R}$. Then for any $k \in A_{sa}$ it follows that
\[
\begin{aligned}
\lim_n \mathrm{Ad}\,e^{it(h_n+k)}(x) &= \alpha_t^{(k)}(x) , & & x \in A , \\
\lim_n \pi(e^{it(h_n+k)}) &= e^{it(H+\pi(k))} , & & \text{strongly} ,
\end{aligned}
\]
both uniformly in $t$ on every compact subset of $\mathbb{R}$.

If $A$ is not separable, we should have taken a net instead of the sequence $(h_n)$ in general in the above statement. This lemma is standard, see, e.g. [3, 25]. The following is taken from [14] and will be generalized in the following section to allow $\pi$ to be of type I.

Lemma 3.2. Let $A$ be a separable $C^*$-algebra and $\alpha$ a flow on $A$. Let $\pi$ be an $\alpha$-covariant irreducible representation with $U_t = e^{itH}$, $t \in \mathbb{R}$, an implementing unitary flow. For any $\varepsilon > 0$ there is a $k \in A_{sa}$ such that $\|k\| < \varepsilon$ and $H + \pi(k)$ is diagonal.

We recall the situation of Theorem 1.4: $\pi$ is an irreducible representation and $U$ is a unitary flow on $\mathcal{H}_\pi$ such that $\mathrm{Ad}\,U_t\,\pi(x) = \pi\alpha_t(x)$, $x \in A$. To prove Theorem 1.4 we may assume, by the previous lemmas, that there is a unit vector $\Omega \in \mathcal{H}_\pi$ such that $U_t\Omega = e^{it\lambda}\Omega$ for some $\lambda \in \mathbb{R}$. It is easy to show that we may suppose that $\lambda = 0$ (this is trivial if $A \ni 1$; otherwise see the proof of 1.4 below). Let $\omega$ denote the pure state of $A$ defined through $(\pi, \Omega)$ and let $\mathcal{P} = \{x \in A \mid x \ge 0,\ \|x\| = 1\}$.

Lemma 3.3. Let $\omega$ denote the pure $\alpha$-invariant state as above. Then for any finite subset $F$ of $A$ and $\varepsilon > 0$, there is an $e \in \mathcal{P}$ such that $\|exe - \omega(x)e^2\| < \varepsilon$, $x \in F$, $\omega(e) = 1$, and $\mathrm{Spec}_\alpha(e) \subset (-\varepsilon, \varepsilon)$.

Proof. There is a decreasing sequence $(e_n)$ in $\mathcal{P}$ such that $e_n e_{n+1} = e_{n+1}$ for any $n$ and $p = \lim_n e_n$ in $A^{**}$ is the support projection of $\omega$. (Then, for any $x \in A$, the norm of $e_n x e_n - \omega(x)e_n^2$ converges to zero as $n \to \infty$.) Then we assert that, for any $t \in \mathbb{R}$ and $m \in \mathbb{N}$, $\lim_n \|e_m\alpha_t(e_n) - \alpha_t(e_n)\| = 0$. Note that $\|(e_m - 1)\alpha_t(e_n)\|^2 = \|(e_m - 1)\alpha_t(e_n^2)(e_m - 1)\|$ and that $((e_m - 1)\alpha_t(e_n^2)(e_m - 1))$ is decreasing in $n$. Suppose that for each $n$ there is a state $\varphi_n$ of $A$ such that $\varphi_n((e_m - 1)\alpha_t(e_n^2)(e_m - 1)) \ge \delta\ (> 0)$. Then, for a weak$^*$ limit point $\varphi$ of $(\varphi_n)$ we have that
\[
\varphi((e_m - 1)\alpha_t(e_n^2)(e_m - 1)) \ge \delta
\]
for any $n$. Since $\lim_n \alpha_t(e_n^2) = p$ in $A^{**}$, it follows that $\varphi((e_m - 1)p(e_m - 1)) = 0$, a contradiction. Hence $\lim_n \|e_m\alpha_t(e_n) - \alpha_t(e_n)\| = 0$ as asserted. Since $t \mapsto \|e_m\alpha_t(e_n) - \alpha_t(e_n)\| = \|\alpha_{-t}(e_m)e_n - e_n\|$ is equi-continuous in $n$, it follows that $\lim_n \|e_m\alpha_t(e_n) - \alpha_t(e_n)\| = 0$ uniformly in $t$ on every compact subset of $\mathbb{R}$.

Let $f$ be a non-negative $C^\infty$ function on $\mathbb{R}$ such that $\int f(t)\,dt = 1$ and $\mathrm{supp}\,\hat f$ is a small compact subset around $0$, and define $a_n = \int f(t)\alpha_t(e_n)\,dt$. Then $\mathrm{Spec}_\alpha(a_n) \subset \mathrm{supp}\,\hat f$, $a_n \ge 0$, $\omega(a_n) = 1$, $\|a_n\| = 1$, and $\lim_n \|e_m a_n - a_n\| = 0$. For any finite subset $F$ of $A$ and $\varepsilon > 0$, there is an $m \in \mathbb{N}$ such that $\|e_m x e_m - \omega(x)e_m^2\| < \varepsilon/2$, $x \in F$. Since there is an $n \in \mathbb{N}$ such that $\|e_m a_n - a_n\| < \varepsilon/4C$, where $C = \max\{\|x - \omega(x)1\| \mid x \in F\}$, we obtain that
\[
\|a_n x a_n - \omega(x)a_n^2\| < \varepsilon , \qquad x \in F .
\]
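The last inequality follows from the two preceding estimates by inserting $e_m$ on both sides of $x - \omega(x)1$. A sketch of the computation, writing $y = x - \omega(x)1$ (the symbol $y$ is introduced here only for illustration, and $C > 0$ is assumed):

```latex
% Notation (ours): y = x - \omega(x)1, so a_n x a_n - \omega(x)a_n^2 = a_n y a_n,
% e_m x e_m - \omega(x)e_m^2 = e_m y e_m, and \|y\| \le C.
\[
a_n y a_n - a_n e_m\,y\,e_m a_n
 = (a_n - a_n e_m)\,y\,a_n + a_n e_m\,y\,(a_n - e_m a_n).
\]
% Since \|a_n\| = \|e_m\| = 1 and \|a_n - a_n e_m\| = \|e_m a_n - a_n\| < \varepsilon/4C,
\[
\|a_n y a_n\|
 \le 2\cdot\frac{\varepsilon}{4C}\cdot C + \|a_n\|\,\|e_m y e_m\|\,\|a_n\|
 < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon .
\]
```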
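With $e = a_n$, this concludes the proof.

Since $\alpha$ is an AI flow, there is a sequence $(h_n)$ in $A_{sa}$ such that $\lim_n \alpha_t^{(n)}(x) = \alpha_t(x)$, $x \in A$, where $\alpha^{(n)}$ denotes the inner flow $t \mapsto \mathrm{Ad}\,e^{ith_n}$.

Lemma 3.4. For any finite subset $F$ of $A$ and $\varepsilon > 0$ there exist an $e \in \mathcal{P}$ and a sequence $(e_n)$ in $\mathcal{P}$ such that
\[
\|exe - \omega(x)e^2\| < \varepsilon ,\ x \in F , \qquad \omega(e) = 1 , \qquad \lim_n e_n = e , \qquad \mathrm{Spec}_{\alpha^{(n)}}(e_n) \subset (-\varepsilon, \varepsilon) .
\]

Proof. Let $f$ be a non-negative $C^\infty$ function on $\mathbb{R}$ such that $\int f(t)\,dt = 1$ and $\mathrm{supp}\,\hat f \subset (-\varepsilon, \varepsilon)$. Let $D = \int f(t)|t|\,dt < \infty$. For any $\varepsilon' > 0$, by using the previous lemma, we have an $e' \in \mathcal{P}$ such that $\|e'xe' - \omega(x)(e')^2\| < \varepsilon/2$, $x \in F$. Since $\omega(e') = 1$, $\mathrm{Spec}_\alpha(e') \subset (-D^{-1}\varepsilon', D^{-1}\varepsilon')$, and $\|\alpha_t(e') - e'\| < D^{-1}\varepsilon'|t|$, if we define $e = \int f(t)\alpha_t(e')\,dt$, we have that $\|e' - e\| < \varepsilon'$ and $\omega(e) = 1$. Let
\[
p_n = \int f(t)\,\alpha_t^{(n)}(e')\,dt ,
\]
where $\alpha_t^{(n)} = \mathrm{Ad}\,e^{ith_n}$. It follows that $\lim_n p_n = e$ and $\mathrm{Spec}_{\alpha^{(n)}}(p_n) \subset (-\varepsilon, \varepsilon)$. We set $e_n = \|p_n\|^{-1}p_n \in \mathcal{P}$. This concludes the proof.

Before going into the proof of Theorem 1.4 we quote the following result from [20] (see Lemma 2.1 there; see also [5]):

Lemma 3.5. Let $A$ be a $C^*$-algebra. Then for any finite subset $F$ of $A$, any pure state $\omega$ of $A$ with $\pi_\omega(A) \cap \mathcal{K}(\mathcal{H}_\omega) = (0)$, and $\varepsilon > 0$, there exist a finite subset $G$ of $A$ and $\delta > 0$ satisfying: If $\varphi$ is a pure state of $A$ such that $\pi_\varphi$ is equivalent to $\pi_\omega$,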
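and $|\varphi(x) - \omega(x)| < \delta$, $x \in G$, then there is a $u \in \mathcal{U}(A)$ such that $\omega = \varphi\,\mathrm{Ad}\,u$ and $\|\mathrm{Ad}\,u(x) - x\| < \varepsilon$, $x \in F$.

Proof of Theorem 1.4. Recall that we are given a unit vector $\Omega \in \mathcal{H}_\pi$ such that $U_t\Omega = \Omega$ and that the state $\omega$ of $A$ is defined by $\omega(x) = \langle \pi(x)\Omega, \Omega \rangle$, $x \in A$. Let $F$ be a finite subset of $A$ and $\varepsilon > 0$. By Lemma 3.3 we choose an $e \in \mathcal{P}$ such that $\|exe - \omega(x)e^2\| < \varepsilon$, $x \in F$, and $\omega(e) = 1$. By applying Lemma 3.4 to $\{e\}$ instead of $F$ and a small constant $\delta > 0$ (with $\delta < \varepsilon$) we obtain an $f \in \mathcal{P}$ and $n$ such that $\|ef - f\| < \delta$, $\omega(f) > 1 - \delta$, and $\mathrm{Spec}_{\alpha^{(n)}}(f) \subset (-\delta, \delta)$. (The first property follows from $\|fef - f^2\| < \delta^2$, which implies that $\|f(1 - e)\| < \delta$.)

If $\pi(A) \cap \mathcal{K}(\mathcal{H}_\pi) \ne \{0\}$, then $\pi(A) \supset \mathcal{K}(\mathcal{H}_\pi)$ and we can assume that the above $\pi(e)$ and $\pi(f)$ are one-dimensional projections. Since $\|e - f\| \lesssim 2\delta$, we have that $\|\Omega - \Omega_n\| \lesssim 2\delta$, where $\Omega_n = \|\pi(f)\Omega\|^{-1}\pi(f)\Omega$. Hence there is a $u_n \in \mathcal{U}(A)$ such that $\Omega_n = \pi(u_n)\Omega$ and $\|u_n - 1\| \lesssim 2\delta$. Define $\lambda_n \in \mathbb{R}$ by $\pi(f)\pi(h_n)\Omega_n = \lambda_n\Omega_n$. Since $\|[h_n, f]\| \le \delta$, it follows that $\|\pi(h_n)\Omega_n - \lambda_n\Omega_n\| \le \delta$. Then there is a $b_n \in A_{sa}$ such that $\pi(b_n)\Omega_n = (\pi(h_n) - \lambda_n)\Omega_n$ and $\|b_n\| \le 2\delta$. Thus we have the following situation: we have a sequence $(u_n)$ in $\mathcal{U}(A)$, a sequence $(b_n)$ in $A_{sa}$, and a sequence $(\lambda_n)$ in $\mathbb{R}$ such that $\lim_n \|b_n\| = 0$, $\lim_n \|u_n - 1\| = 0$, and
\[
\pi(u_n^*(h_n - b_n)u_n)\Omega = \lambda_n\Omega .
\]
Since $\lim_n \mathrm{Ad}\,u_n^*\,\mathrm{Ad}\,e^{it(h_n - b_n)}\,\mathrm{Ad}\,u_n(x) = \alpha_t(x)$ for any $x \in A$ and
\[
\lim_n \pi(u_n^* e^{it(h_n - b_n - \lambda_n)} u_n)\pi(x)\Omega = \pi(\alpha_t(x))\Omega = U_t\pi(x)\Omega ,
\]
the sequence $(\mathrm{Ad}\,u_n^*(h_n - b_n) - \lambda_n 1)$ is the desired one (hence so is $(h_n - \lambda_n 1)$) if $A$ is unital. If $A$ is non-unital, we replace $1$ of $\lambda_n 1$ by an element from a given approximate identity.

Now we assume that $\pi(A) \cap \mathcal{K}(\mathcal{H}_\pi) = \{0\}$. Let $E$ be the spectral measure of $\pi(h_n)$. Since $\mathrm{Spec}_{\alpha^{(n)}}(f) \subset (-\delta, \delta)$, it follows that $E[t + \delta, \infty)\pi(f)E(-\infty, t] = 0$ for any $t \in \mathbb{R}$. Since
\[
\begin{aligned}
\pi(f) &= \sum_n E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta) \\
&\quad + \sum_n E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n+1)\delta, (n+2)\delta) \\
&\quad + \sum_n E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n-1)\delta, n\delta) ,
\end{aligned}
\]
where the sums are only finite, and
\[
\Bigl\| \sum_n E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta) \Bigr\| = \sup_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta)\|
\]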
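etc., we have that
\[
\begin{aligned}
1 - \varepsilon \le \|\pi(f)\| &\le \sup_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta)\| \\
&\quad + \sup_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n+1)\delta, (n+2)\delta)\| \\
&\quad + \sup_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n-1)\delta, n\delta)\| .
\end{aligned}
\]
Since
\[
\|E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n+1)\delta, (n+2)\delta)\|^2 \le \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta)\| \cdot \|E[(n+1)\delta, (n+2)\delta)\,\pi(f)\,E[(n+1)\delta, (n+2)\delta)\| ,
\]
we have that
\[
\sup_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[(n+1)\delta, (n+2)\delta)\| \le d ,
\]
where
\[
d = \max_n \|E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta)\| .
\]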
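The bound on the off-diagonal blocks is an instance of a Cauchy-Schwarz-type inequality for a positive operator compressed by two projections. A short sketch, with $T = \pi(f) \ge 0$ and projections $P$, $Q$ (symbols chosen here for illustration, not used in the original):

```latex
% For T >= 0 and projections P, Q
% (here P = E[n\delta,(n+1)\delta), Q = E[(n+1)\delta,(n+2)\delta)):
\[
PTQ = (T^{1/2}P)^*(T^{1/2}Q),
\qquad
\|T^{1/2}P\|^2 = \|(T^{1/2}P)^*(T^{1/2}P)\| = \|PTP\| ,
\]
% and similarly for Q, so that
\[
\|PTQ\|^2 \le \|T^{1/2}P\|^2\,\|T^{1/2}Q\|^2 = \|PTP\|\,\|QTQ\| \le d^2 .
\]
% The same argument applies to the blocks below the diagonal, which gives
% 1 - \varepsilon \le \|\pi(f)\| \le d + d + d = 3d.
```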
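Thus we have that $3d \ge 1 - \varepsilon$; hence for some $n$,
\[
\|E[n\delta, (n+1)\delta)\,\pi(f)\,E[n\delta, (n+1)\delta)\| \ge (1 - \varepsilon)/3 .
\]
Assume that we have chosen a $C^\infty$ function $g$ on $\mathbb{R}$ such that $0 \le \hat g \le 1$, $\hat g = 1$ on $[-\delta/2, \delta/2)$, and $\mathrm{supp}\,\hat g \subset (-\varepsilon, \varepsilon)$, and that $\delta$ is sufficiently small. Since $\hat g(h_n - \lambda) = (2\pi)^{-1/2}\int g(t)e^{-it(h_n - \lambda)}\,dt$, we may suppose that $\|[\hat g(h_n - \lambda), f]\| < \varepsilon$ for any $\lambda \in \mathbb{R}$. Then for $\lambda = (n + 1/2)\delta$ we have that
\[
\|\hat g(h_n - \lambda)f\| \ge \|E[n\delta, (n+1)\delta)f\| \ge ((1 - \varepsilon)/3)^{1/2} > 1/2
\]
(assuming that $\varepsilon < 1/4$) and
\[
\|e\hat g(h_n - \lambda)f - \hat g(h_n - \lambda)f\| \le 2\|[\hat g(h_n - \lambda), f]\| + \|ef - f\| < 2\varepsilon + \delta < 3\varepsilon .
\]
Let $\Psi$ be a unit vector of $\mathcal{H}_\pi$ of the form $\Psi = \pi(\hat g(h_n - \lambda)f)\Psi'$, where $\|\Psi'\| \le 2$. Then it follows that $\|\pi(e)\Psi - \Psi\| < 3\varepsilon\|\Psi'\| \le 6\varepsilon$. Defining a pure state $\psi$ of $A$ through $(\pi, \Psi)$, we have that $|\psi(x) - \omega(x)| < 12\varepsilon C$, $x \in F$, where $C = \max\{\|x - \omega(x)1\| \mid x \in F\}$. Since $\Psi \in E(\lambda - \varepsilon, \lambda + \varepsilon)\mathcal{H}_\pi$, we obtain that $\|(\pi(h_n) - \lambda)\Psi\| < \varepsilon$. Hence there is a $b \in A_{sa}$ such that $\|b\| < 2\varepsilon$ and $(\pi(h_n) - \lambda)\Psi = \pi(b)\Psi$. Thus it follows that $\psi$ is invariant under the inner flow $\mathrm{Ad}\,e^{it(h_n - b)}$.

Thus we have the following situation: Given a sequence $(h_n)$ in $A_{sa}$ with $\lim_n \mathrm{Ad}\,e^{ith_n} = \alpha_t$, we can choose a subsequence $(n_k)$, a sequence $(\lambda_k)$ in $\mathbb{R}$, a sequence $(b_k)$ in $A_{sa}$ with $\|b_k\| \to 0$, and a sequence $(\Psi_k)$ of unit vectors in $\mathcal{H}_\pi$ such that $\pi(h_{n_k} - b_k)\Psi_k = \lambda_k\Psi_k$ and $\psi_k \to \omega$ as $k \to \infty$, where $\psi_k(x) = \langle \pi(x)\Psi_k, \Psi_k \rangle$. It also follows that $\lim_k \mathrm{Ad}\,e^{it(h_{n_k} - b_k)} = \alpha_t$.

Since $\psi_k \to \omega$ and since we have assumed that $A$ is a separable nuclear $C^*$-algebra and $\pi(A) \cap \mathcal{K}(\mathcal{H}_\pi) = \{0\}$, we have, by Lemma 3.5, a sequence $(u_k)$ in $\mathcal{U}(A)$ such that $\pi(u_k)\Omega = \Psi_k$ and $\lim_k \|[u_k, x]\| = 0$ for any $x \in A$.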
If $A$ has no unit, we choose an approximate identity $(g_k)$ such that $\pi(g_k)\Psi_k = \Psi_k$ and
\[
\lim_k \mathrm{Ad}\,e^{it(h_{n_k} - b_k - \lambda_k g_k)} = \alpha_t .
\]
(This can be done as follows: Let $(a_m)$ be a dense sequence in $A$ and let $\delta_k = \mathrm{ad}\,i(h_{n_k} - b_k)$. We should choose $(g_k)$ so that
\[
(1 + \delta_k - i\lambda_k\,\mathrm{ad}\,g_k)^{-1}(a_m) - (1 + \delta_k)^{-1}(a_m) = (1 + \delta_k - i\lambda_k\,\mathrm{ad}\,g_k)^{-1}(i\lambda_k\,\mathrm{ad}\,g_k)(1 + \delta_k)^{-1}(a_m)
\]
has norm less than $\|a_m\|/k$ for each $m \le k$. This is possible since there are only a finite number of conditions for the choice of $g_k$.) If $A$ has a unit, we let $g_k = 1$.

Let $h'_k = h_{n_k} - b_k - \lambda_k g_k \in A_{sa}$. Since $\pi(h'_k)\Psi_k = 0$, it follows that $\pi(u_k^* e^{ith'_k} u_k)\Omega = \Omega$. Since $\lim_k \mathrm{Ad}\,u_k(x) = x$ for any $x \in A$, we have that $\lim_k \mathrm{Ad}\,u_k^*\,\mathrm{Ad}\,e^{ith'_k}\,\mathrm{Ad}\,u_k = \alpha_t$ and that
\[
\pi(u_k^* e^{ith'_k} u_k)\pi(x)\Omega = \pi(\mathrm{Ad}\,u_k^*\,\mathrm{Ad}\,e^{ith'_k}\,\mathrm{Ad}\,u_k(x))\Omega \to \pi(\alpha_t(x))\Omega = U_t\pi(x)\Omega .
\]
This implies that $\pi(u_k^* e^{ith'_k} u_k)$ strongly converges to $U_t$ and thus $(u_k^* h'_k u_k)$ gives the desired sequence in $A_{sa}$.

4. Proof of Theorem 1.7

Before going to the proof of Theorem 1.7 we have to extend Lemma 3.2.

Lemma 4.1. Let $A$ be a separable $C^*$-algebra and $\alpha$ a flow on $A$. Let $\pi$ be a type I representation of $A$ on a separable Hilbert space $\mathcal{H}$. Suppose that there is a unitary flow $U$ on $\mathcal{H}$ such that $U_t \in \pi(A)''$ and $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$, and let $H$ be the generator of $U$: $U_t = e^{itH}$. For any $\varepsilon > 0$ there is an $h \in A_{sa}$ such that $\|h\| < \varepsilon$ and $H + \pi(h)$ is diagonal.

We could certainly impose an extra condition on the choice of $h$ as in [14, 2.2]; then this would be a generalization of it. In place of Kadison's transitivity we will need a more elaborate form, called the non-commutative Lusin theorem, due to M. Tomita and K. Saitô (see [21, 2.7.3] or [26, II.4.15]), which we adapt to the present situation as follows. When a projection is given in a von Neumann algebra $\mathcal{M}$ of type I, we call it of finite-rank if it is dominated by (or equivalently, is equal to) a finite sum of abelian projections in $\mathcal{M}$.

Lemma 4.2. Let $\pi$ be a representation of $A$ such that $\pi(A)'$ is abelian. Let $x$ be a self-adjoint element of $\pi(A)''$ and $e$ a finite-rank projection in $\pi(A)''$. Then there are a net $(z_\nu)$ of projections in $\pi(A)'$ and a net $(x_\nu)$ in $A_{sa}$ such that $xez_\nu = \pi(x_\nu)ez_\nu$, $\limsup_\nu \|x_\nu\| \le \|x\|$, and $\lim_\nu z_\nu = 1$ (in the strong operator topology).
Proof. By a direct application of the Lusin theorem, we obtain a net $(x_\nu)$ in $A_{sa}$ and a net $(e_\nu)$ of projections in $\pi(A)''$ such that $e_\nu \le e$, $xe_\nu = \pi(x_\nu)e_\nu$, $\limsup_\nu \|x_\nu\| \le \|x\|$, and $\lim_\nu e_\nu = e$. Since $e$ is a finite-rank projection in $\pi(A)''$, we have a net $(z_\nu)$ of projections in the center $\pi(A)'$ such that $z_\nu e \le e_\nu$ and $\lim_\nu z_\nu = 1$.

Proof of Lemma 4.1. We closely follow the proof of [14, 2.2] or, for that matter, the proof of Weyl's theorem as in [6, X.2]. Let $e_0$ be an abelian projection of $\pi(A)'$ with central support $1$. We only have to show the above assertion for the representation $x \mapsto \pi(x)e_0$ on $e_0\mathcal{H}$ and the unitary flow $t \mapsto U_t e_0$. Thus we may assume that $\pi(A)'$ is abelian.

Let $(\eta_n)$ be a dense sequence in $\mathcal{H}$ and $\varepsilon > 0$ with $\varepsilon_n = 2^{-n}\varepsilon$. Let $E_0$ denote the spectral measure of $H_0 = H$. Let $\xi_1 = \eta_1$ and let $\mathcal{F}_1$ be a finite family of mutually disjoint translates of $[0, \varepsilon_1)$ such that $\xi_1(I) = E_0(I)\xi_1 \ne 0$ for $I \in \mathcal{F}_1$ and
\[
\|\xi_1\|^2 - \sum_{I \in \mathcal{F}_1} \|\xi_1(I)\|^2 < 1 .
\]
Denote by $m(I)$ the middle point of $I \in \mathcal{F}_1$ and let
\[
K_1 = \sum_{I \in \mathcal{F}_1} (H_0 - m(I))E_0(I) .
\]
Then $K_1$ is a self-adjoint element of $\pi(A)''$ such that $\|K_1\| \le \varepsilon_1/2$. Denote by $e_1$ the projection onto the subspace spanned by $z\xi_1(I)$, $z \in \pi(A)'$, $I \in \mathcal{F}_1$. Note that $e_1 \in \pi(A)''$ is of finite rank, $e_1 \le \sum_{I \in \mathcal{F}_1} E_0(I)$, and
\[
e_1\xi_1 = \sum_{I \in \mathcal{F}_1} E_0(I)\xi_1 = \sum_{I \in \mathcal{F}_1} \xi_1(I) ,
\]
which implies that $\|e_1\xi_1\|^2 > \|\xi_1\|^2 - 1$. By applying Lemma 4.2 we have a projection $z_1 \in \pi(A)'$ and $h_1 \in A_{sa}$ such that $\pi(h_1)e_1z_1 = K_1e_1z_1$, $\|h_1\| < \varepsilon_1$, and $\|z_1e_1\xi_1\|^2 > \|\xi_1\|^2 - 1$. Since, for $z \in \pi(A)'$,
\[
(H_0 - \pi(h_1))e_1z_1z\xi_1(I) = z_1z(H_0 - K_1)E_0(I)\xi_1 = m(I)e_1z_1z\xi_1(I) ,
\]
it follows that $(H_0 - \pi(h_1))e_1z_1$ is a self-adjoint element with spectrum $\{m(I) \mid I \in \mathcal{F}_1\}$. Let $H_1 = H_0 - \pi(h_1)$ and $f_1 = e_1z_1$, which is a projection of finite rank in $\pi(A)''$. We have that $[H_1, f_1] = 0$, $H_1f_1$ has finite spectrum, and $\|(1 - f_1)\eta_1\| < 1$.

Now suppose that we have constructed sequences $(e_1, e_2, \ldots, e_n)$, $(f_1, \ldots, f_n)$ of finite-rank projections in $\pi(A)''$, a sequence $(z_1, z_2, \ldots, z_n)$ of projections in $\pi(A)'$ and a sequence $(h_1, h_2, \ldots, h_n)$ in $A_{sa}$ such that $f_{k+1} = f_kz_{k+1} + e_{k+1}z_{k+1}$, $f_ke_{k+1} = 0$, $\|h_k\| < \varepsilon_k$, $\pi(h_{k+1})f_kz_{k+1} = 0$, $H_k = H_{k-1} - \pi(h_k)$ commutes with $f_k$, $H_kf_k$ has finite spectrum, and
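\[
\|(1 - f_\ell z_{\ell+1,k})\eta_\ell\| < 1 , \qquad \ell \le k ,
\]
where $z_{\ell+1,k} = z_{\ell+1}z_{\ell+2} \cdots z_k$. If $\|(1 - f_n)\eta_{n+1}\| < 1$, we set $e_{n+1} = 0$, $z_{n+1} = 1$, $f_{n+1} = f_n$, and $h_{n+1} = 0$, and we are done. Otherwise let $\mathcal{F}_{n+1}$ be a finite family of mutually disjoint translates of $[0, \varepsilon_{n+1})$ such that for $\xi_{n+1} = (1 - f_n)\eta_{n+1}$ and $\xi_{n+1}(I) = E_n(I)\xi_{n+1} \ne 0$,
\[
\|\xi_{n+1}\|^2 - \sum_{I \in \mathcal{F}_{n+1}} \|\xi_{n+1}(I)\|^2 < 1 ,
\]
where $E_n$ is the spectral measure of $H_n$. Let
\[
K_{n+1} = \sum_{I \in \mathcal{F}_{n+1}} (H_n - m(I))E_n(I) ,
\]
which is a self-adjoint element of norm $\le \varepsilon_{n+1}/2$ in $\pi(A)''$. Let $e_{n+1}$ be the projection onto the subspace spanned by $z\xi_{n+1}(I)$, $z \in \pi(A)'$, $I \in \mathcal{F}_{n+1}$; $e_{n+1}$ is a finite-rank projection orthogonal to $f_n$. Then by Lemma 4.2 we have a projection $z_{n+1} \in \pi(A)'$ and $h_{n+1} \in A_{sa}$ such that
\[
\begin{aligned}
&\|h_{n+1}\| < \varepsilon_{n+1} , \\
&\pi(h_{n+1})e_{n+1}z_{n+1} = K_{n+1}e_{n+1}z_{n+1} , \\
&\pi(h_{n+1})f_nz_{n+1} = 0 , \\
&\|(1 - f_\ell z_{\ell+1,n+1})\eta_\ell\| < 1 , \qquad \ell \le n , \\
&\|(1 - f_nz_{n+1} - e_{n+1}z_{n+1})\eta_{n+1}\| < 1 ,
\end{aligned}
\]
where for the last condition we have used that $\|(1 - f_n - e_{n+1})\eta_{n+1}\| < 1$. Let $H_{n+1} = H_n - \pi(h_{n+1})$ and $f_{n+1} = f_nz_{n+1} + e_{n+1}z_{n+1}$. Since $H_{n+1}f_nz_{n+1} = H_nf_nz_{n+1}$ and
\[
H_{n+1}e_{n+1}z_{n+1} = (H_n - K_{n+1})e_{n+1}z_{n+1} = \sum_{I \in \mathcal{F}_{n+1}} m(I)E_n(I)\,e_{n+1}z_{n+1} ,
\]
the resulting sequences by adding $e_{n+1}, f_{n+1}, z_{n+1}, h_{n+1}$ satisfy the desired properties.

Define $h = \sum_{n=1}^{\infty} h_n \in A_{sa}$; then $\|h\| < \varepsilon$. Define $p_n = \lim_m z_{n,m}$. Then, since $f_np_{n+1} \le f_mp_{m+1}$ for $n < m$ and $\|(1 - f_np_{n+1})\eta_n\| \le 1$, we have that $\lim_n f_np_{n+1} = 1$ (the limit projection $P$ satisfies $\|(1 - P)\eta_n\| \le 1$ for all $n$, and since $(\eta_n)$ is dense in $\mathcal{H}$, a scaling argument forces $P = 1$). Since $(H - \pi(h))f_np_{n+1} = H_nf_np_{n+1}$ has finite spectrum, $H - \pi(h)$ is diagonal. This concludes the proof.

Proof of Theorem 1.7. Let $\pi$ be an $\alpha$-covariant type I representation of the separable $C^*$-algebra $A$ on a separable Hilbert space $\mathcal{H}$ such that there is a unitary flow $U$ in $\pi(A)''$ with $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$, $t \in \mathbb{R}$.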
Since $\pi(A)'$ is of type I, there is an abelian projection $p$ in $\pi(A)'$ with central support $1$ and it suffices to show that there is a sequence $(h_n)$ in $A_{sa}$ such that
\[
\begin{aligned}
\lim_n \mathrm{Ad}\,e^{ith_n}(x) &= \alpha_t(x) , & & x \in A , \\
\lim_n \pi(e^{ith_n})p &= U_tp , & & \text{strongly} ,
\end{aligned}
\]
both uniformly in $t$ on every compact subset of $\mathbb{R}$. Hence we may suppose that $\pi(A)'$ is abelian (and is the center of $\pi(A)''$).

By Lemmas 4.1 and 3.1 we may suppose that the generator $H$ of $U_t = e^{itH}$ is diagonal. Thus we have a unit vector $\Omega \in \mathcal{H}$ cyclic for $\pi(A)''$ such that there is an orthogonal family $\{z_i\}$ of projections in $\pi(A)'$ with $U_t\Omega = \sum_i e^{i\lambda_it}z_i\Omega$ for some $(\lambda_i)$ in $\mathbb{R}$ and $\sum_i z_i = 1$.

Let $V_t = U_t\sum_i e^{-it\lambda_i}z_i$ and suppose that we could construct a sequence $(h_n)$ in $A_{sa}$ satisfying the two conditions above for $V$ in place of $U$ (with $p = 1$). That is, it satisfies that
\[
\begin{aligned}
\lim_n \mathrm{Ad}\,e^{ith_n}(x) &= \alpha_t(x) , & & x \in A , \\
\lim_n \pi(e^{ith_n})z_i &= U_te^{-i\lambda_it}z_i , & & \text{strongly for all } i .
\end{aligned}
\]
In this case we choose a central sequence $(b_n)$ such that
\[
\begin{aligned}
&(1 \pm \mathrm{ad}\,i(h_n + b_n))^{-1} - (1 \pm \mathrm{ad}\,ih_n)^{-1} \to 0 , \\
&(1 \pm i\pi(h_n + b_n))^{-1} - \Bigl(1 \pm i\Bigl(\pi(h_n) + \sum_{i=1}^{n}\lambda_iz_i\Bigr)\Bigr)^{-1} \to 0 ,
\end{aligned}
\]
both strongly. Since
\[
\Bigl(1 \pm i\Bigl(\pi(h_n) + \sum_{i=1}^{n}\lambda_iz_i\Bigr)\Bigr)^{-1} \to (1 \pm iH)^{-1} ,
\]
where $H$ is the generator of $U$, we know that $(h_n + b_n)$ satisfies the desired conditions for $U$.

Thus we may suppose that there is a cyclic unit vector $\Omega$ in $\mathcal{H}$ for $\pi(A)''$ such that $U_t\Omega = \Omega$. Let $E$ denote the projection onto the subspace spanned by $\pi(A)'\Omega$. Then $E$ is a projection in $\pi(A)''$ and $E\pi(A)''E = \pi(A)'E \cong \pi(A)'$. Let $X$ be the character space of $\pi(A)'$ and identify $\pi(A)'$ with $C(X)$. For each $t \in X$ we denote by $\omega_t$ the state of $A$ defined by $x \mapsto (E\pi(x)E)(t)$; then $t \mapsto \omega_t(x)$ is continuous for all $x \in A$. Since $U_sE = EU_s$, it follows that $\omega_t$ is $\alpha$-invariant. Denoting by $\mu$ the probability measure on $X$ obtained by $C(X) \ni f \mapsto \langle f\Omega, \Omega \rangle$, we have that
\[
\omega(x) = \int_X \omega_t(x)\,d\mu(t) , \qquad x \in A ,
\]
where $\omega$ is the state of $A$ defined by $x \mapsto \langle \pi(x)\Omega, \Omega \rangle$.
Let $X_0 = \{t \in X \mid \omega_t \text{ is pure}\}$. Since $A$ is separable, $X_0$ is a Baire subset of $X$ such that $\mu(X_0) = 1$. Since $\mu$ is faithful, $X_0$ is dense in $X$. The representation $(\pi, \mathcal{H}, \Omega)$ is identified with
\[
\Bigl( \int_X^{\oplus} \pi_t\,d\mu(t) ,\ \int_X^{\oplus} \mathcal{H}_t\,d\mu(t) ,\ \int_X^{\oplus} \Omega_t\,d\mu(t) \Bigr) ,
\]
where $\pi_t = \pi_{\omega_t}$ is the GNS representation of $A$ associated with $\omega_t$ on the Hilbert space $\mathcal{H}_t$ with the cyclic vector $\Omega_t$ which defines $\omega_t$, and the center $\pi(A)'$ of $\pi(A)''$ is identified with $C(X) = L^\infty(X, \mu)$; the latter is, more precisely, the space of equivalence classes of almost bounded measurable functions. Note that the unitary flow $U$ is also given as $U_s = \int_X^{\oplus} U_s(t)\,d\mu(t)$. See [24, Chap. 3] for details.

Let $X_1 = \{t \in X \mid \pi_t(A) \cap \mathcal{K}(\mathcal{H}_t) \ne \{0\}\}$, which is a measurable subset of $X$. (With $e$ the projection onto the subspace spanned by $\Omega$ and $(x_n)$ a dense sequence in $A$, $X_1$ is given as $\bigcup_n \{t \in X \mid \|\pi_t(x_n) - e(t)\| \le \delta\}$ for a small $\delta > 0$, which shows that $X_1$ is measurable.) Let $E_1$ denote the central projection corresponding to $X_1$. We may handle the representations $E_1\pi$ and $(1 - E_1)\pi$ separately.

We can handle $E_1\pi$ in the same way as the case of irreducible representations (see the proof of Theorem 1.4). We will obtain a sequence $(\lambda_n)$ in $L^\infty(X_1)$ with $\lambda_n^* = \lambda_n$ such that $e^{is(E_1\pi(h_n) - \lambda_n)}$ converges to $E_1U_s$ strongly. We then select a central sequence $(c_n)$ in $A_{sa}$ such that
\[
\begin{aligned}
\lim_n \mathrm{Ad}\,e^{is(h_n - c_n)}(x) &= \alpha_s(x) , & & x \in A , \\
\lim_n E_1\pi(e^{is(h_n - c_n)}) &= E_1U_s , & & \text{strongly} .
\end{aligned}
\]
Now we take the sequence $(h_n - c_n)$ for $(h_n)$ and suppose that $E_1 = 0$, or $\pi_t(A) \cap \mathcal{K}(\mathcal{H}_t) = \{0\}$ for almost all $t$.

Since $\omega_t$ is pure for $t \in X_0$, we have by Lemma 3.4 that for any finite subset $F$ of $A$ and $\varepsilon > 0$ there exist an $e_t \in \mathcal{P}$ and a sequence $(e_{tn})$ in $\mathcal{P}$ such that
\[
\|e_txe_t - \omega_t(x)e_t^2\| < \varepsilon ,\ x \in F , \qquad \omega_t(e_t) = 1 , \qquad \lim_n e_{tn} = e_t , \qquad \mathrm{Spec}_{\alpha^{(n)}}(e_{tn}) \subset (-\varepsilon, \varepsilon) .
\]
We define an open neighborhood of $t\ (\in X_0)$ by
\[
X_t = \{s \in X \mid \omega_s(e_t) > 1 - \varepsilon ,\ |\omega_s(x) - \omega_t(x)| < \varepsilon ,\ x \in F\} .
\]
Then we have that $\|e_txe_t - \omega_s(x)e_t^2\| < 2\varepsilon$, $x \in F$. Since $\bigcup_{t \in X_0} X_t \supset X_0$ and $\mu(X_0) = 1$, we find a finite number of $X_t$: $X_{t_1}, \ldots, X_{t_m}$ such that $\mu(X \setminus Y) < \varepsilon$ with $Y = \bigcup_{i=1}^{m} X_{t_i}$. We may suppose, by shrinking these sets if necessary, that the $X_i = X_{t_i}$ are Baire sets and are mutually disjoint. We then choose an $n \in \mathbb{N}$ such that there is an $f_i \in \mathcal{P}$ for each $i$ such that $\|e_{t_i} - f_i\| < \varepsilon$ and $\mathrm{Spec}_{\alpha^{(n)}}(f_i) \subset (-\varepsilon, \varepsilon)$.
We construct for each $i$ a bounded operator
\[
Q_i = \int_{X_i}^{\oplus} Q_i(t)\,d\mu|X_i \in \int_{X_i}^{\oplus} \pi_t(A)''\,d\mu|X_i
\]
such that $\|Q_i(t)\| \le 4$, $\|Q_i(t)\Omega_t\| = 1$, $\pi_t(h_n)Q_i(t)\Omega_t \approx \lambda_iQ_i(t)\Omega_t$ for some $\lambda_i$, and the state $\psi_t$ of $A$ defined through $(\pi_t, Q_i(t)\Omega_t)$ is nearly equal to $\omega_t$ on $F$, for almost all $t \in X_i$.

For the pair $(e_i, f_i)$ with $e_i = e_{t_i}$ we apply the arguments in the proof of Theorem 1.4. We obtain a $C^\infty$ function $g$ on $\mathbb{R}$ and a $\lambda_i \in \mathbb{R}$ such that $0 \le \hat g \le 1$, $\hat g$ is supported on a small neighborhood of $0$, $\|\hat g(h_n - \lambda_i)f_i\| > 1/2$, and $\|e_i\hat g(h_n - \lambda_i)f_i - \hat g(h_n - \lambda_i)f_i\| \approx 0$. Let us denote $y = \hat g(h_n - \lambda_i)$. For each $t \in X_i \cap X_0$ we have an $x \in A$ such that $\|x\| < 2$ and $\|\pi_t(yf_ix)\Omega_t\| = 1$. Thus there is a neighborhood $V$ of $t$ in $X_i$ such that for $s \in V$, $\|\pi_s(yf_ix)\Omega_s\| = \omega_s(x^*f_iy^*yf_ix)^{1/2} > 1/2$. Then we may define $Q_i$ on $V$ by
\[
Q_i(s) = \omega_s(x^*f_iy^*yf_ix)^{-1/2}\,\pi_s(yf_ix) , \qquad s \in V .
\]
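The normalization and the norm bound required of $Q_i$ can be read off from this definition. A brief sketch of the check, using only $\|y\| \le 1$ (from $0 \le \hat g \le 1$), $\|f_i\| = 1$, $\|x\| < 2$ and the lower bound $1/2$ stated above:

```latex
% Normalization: for s in V,
\[
\|\pi_s(yf_ix)\Omega_s\|^2
 = \langle \pi_s\bigl((yf_ix)^*(yf_ix)\bigr)\Omega_s, \Omega_s\rangle
 = \omega_s(x^*f_iy^*yf_ix),
\qquad\text{hence}\qquad \|Q_i(s)\Omega_s\| = 1 .
\]
% Norm bound: since \omega_s(x^*f_iy^*yf_ix)^{1/2} > 1/2,
\[
\|Q_i(s)\| \le \omega_s(x^*f_iy^*yf_ix)^{-1/2}\,\|y\|\,\|f_i\|\,\|x\| < 2\cdot 1\cdot 1\cdot 2 = 4 .
\]
```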
We repeat this process and patch up the whole $X_i$ by such $V$ except for a null set (by using the fact that for any non-null set the intersection with $X_0$ is non-empty). In this way we can define $Q_i$ on $X_i$ in the above form. Since the $\alpha$-spectrum of $y$ is concentrated on a neighborhood of $\lambda_i$, it follows that $\pi_t(h_n)Q_i(t)\Omega_t \approx \lambda_iQ_i(t)\Omega_t$. Since $\|\pi_t(e_i)Q_i(t)\Omega_t - Q_i(t)\Omega_t\| \approx 0$, the state defined by $Q_i(t)\Omega_t$ is almost equal to $\omega_t$ on $F$ as required.

We let $Q_{m+1}(t) = 1$ for $t \in X_{m+1} = X \setminus Y$. Combining these $Q_i$ on $X_i$ for $i = 1, 2, \ldots, m + 1$, we have
\[
Q = \int_X^{\oplus} Q(t)\,d\mu(t) \in \int_X^{\oplus} \pi_t(A)''\,d\mu(t)
\]
with $\|Q\| \le 4$ such that the states defined by $\Psi_t = Q(t)\Omega_t$ and $\Omega_t$ are almost equal on $F$ for almost all $t$. Moreover we have a $\lambda \in L^\infty(Y)$ with $\lambda^* = \lambda$ and
\[
B = \int_Y^{\oplus} B(t)\,d\mu(t)
\]
such that $B(t)^* = B(t)$, $\|B(t)\|$ is small, and $(\pi_t(h_n) - \lambda(t))\Psi_t = B(t)\Psi_t$ for almost all $t \in Y$. To see that there is such a $B$, let $E(t)$ be the projection onto the subspace spanned by $\Psi_t$ and $(\pi_t(h_n) - \lambda(t))\Psi_t$ for $t \in Y$; then $t \in Y \mapsto E(t)$ is measurable and defines a finite-rank projection $E$ in $\pi(A)''$; $B(t)$ may be defined by
\[
B(t) = E(t)(\pi_t(h_n) - \lambda(t))E(t) , \qquad t \in Y .
\]
We then apply the Lusin theorem (Lemma 4.2) to approximate $\lambda = \lambda_n$ and $B = B_n$ on $E = E_n$ by elements in $A_{sa}$; since we may suppose that the central support of $E_n$ converges to $1$, we find $c_n, b_n \in A_{sa}$ and a projection $z_n \in \pi(A)'$ such that $\|(1 - z_n)\Omega\| \approx 0$ and
\[
\pi_t(c_n)\Psi_t = \lambda_n(t)\Psi_t , \qquad \pi_t(b_n)\Psi_t = B_n(t)\Psi_t
\]
for all $t$ with $z_n(t) = 1$. We also require that
\[
\lim_n \mathrm{Ad}\,e^{is(h_n - c_n - b_n)}(x) = \alpha_s(x) , \qquad x \in A ,
\]
by making $(c_n)$ sufficiently central and assuming that $\lim_n \|b_n\| = 0$. By the following lemma we find a central sequence $(u_n)$ in $\mathcal{U}(A)$ such that $\pi_t(u_n)\Omega_t = \Psi_t$ except for $t \in X_n$ with $\lim_n \mu(X_n) = 0$. We then obtain that
\[
\lim_n \pi(u_n^*e^{is(h_n - c_n - b_n)}u_n)\pi(x)\Omega = \lim_n \pi(\alpha_s(x))\Omega = U_s\pi(x)\Omega , \qquad x \in A .
\]
Thus $(u_n^*(h_n - c_n - b_n)u_n)$ satisfies the required conditions.

We note that we could impose an extra condition on $(u_n)$ saying that $\|\rho(u_n)\xi - \xi\| \to 0$ if $\rho$ is a representation disjoint from $\pi$ and $\xi$ is a unit vector in $\mathcal{H}_\rho$ (see the proof of 4.3 below). We could also impose the condition that $\|e^{is\rho(h_n - c_n - b_n)}\xi - e^{is\rho(h_n)}\xi\| \to 0$ for such $\rho$ and $\xi$. Hence we can make sure that this modification made on $(h_n)$ will not affect the condition (for $E_1\pi$) already satisfied by $(h_n)$. This concludes the proof.

Let $\omega$ be a state of $A$ such that $\pi_\omega(A)'$ is abelian. As in the proof above, let $X$ be the character space of $\pi_\omega(A)'$ and $E$ the projection onto the subspace spanned by $\pi_\omega(A)'\Omega_\omega$. Define a state $\mu = \mu_\omega$ on $\pi_\omega(A)'$ by $\mu(Q) = \langle Q\Omega_\omega, \Omega_\omega \rangle$, $Q \in \pi_\omega(A)'$, and define a state $\omega_t$ for each $t \in X$ by $\omega_t(x) = (E\pi_\omega(x)E)(t)$, $x \in A$. Then it follows that $t \mapsto \omega_t$ is continuous and
\[
\omega(x) = \int_X \omega_t(x)\,d\mu(t) , \qquad x \in A .
\]
Let $\psi$ be another state of $A$. If $\pi_\psi$ is unitarily equivalent to $\pi_\omega$ and $\mu_\psi = \mu_\omega = \mu$ on $\pi_\psi(A)'$ identified with $\pi_\omega(A)'$ by this unitary equivalence, we say that $\psi$ is centrally equal to $\omega$; in this case $\psi$ has the same type of decomposition as $\omega$:
\[
\psi(x) = \int_X \psi_t(x)\,d\mu(t) , \qquad x \in A .
\]
Note that $\pi_{\omega_t}$ is equivalent to $\pi_{\psi_t}$ and is irreducible for almost all $t \in X$.

Lemma 4.3. Let $A$ be a $C^*$-algebra. For any finite subset $F$ of $A$, any state $\omega$ of $A$ such that $\pi_\omega(A)'$ is abelian and $\pi_{\omega_t}(A) \cap \mathcal{K}(\mathcal{H}_{\omega_t}) = \{0\}$ for almost all $t \in X$ (in the notation given before this lemma), and any $\varepsilon > 0$, there exist a finite subset $G$ of $A$ and $\delta > 0$ such that if $\psi$ is a state of $A$ which is centrally equal to $\omega$ and
\[
|\psi_t(x) - \omega_t(x)| < \delta , \qquad x \in G , \quad t \in X ,
\]
then there is a continuous path $(u_s)$ in $\mathcal{U}(A)$ such that $u_0 = 1$, $\psi_t = \omega_t\,\mathrm{Ad}\,u_1$ for all $t \in X$ except for $t \in X_0$ with $\mu_\omega(X_0) < \varepsilon$, and
\[
\|\mathrm{Ad}\,u_s(x) - x\| < \varepsilon , \qquad x \in F ,\ s \in [0, 1] .
\]
Proof. The proof is similar to the one of [20, 2.1] (see also [5]). Let $F$ be a finite subset of $A$ and $\varepsilon > 0$, and let $\omega$ be a state of $A$ such that $\pi_\omega(A)'$ is abelian. Let $e$ be the projection onto the one-dimensional subspace spanned by $\Omega_\omega$ in $\mathcal{H}_\omega$. By [20] we get an $x \in M_{1n}(A)$ for some $n \in \mathbb{N}$ such that $xx^* \le 1$, $\|\pi_\omega(xx^*)e - e\| < \varepsilon$, and $\|\mathrm{ad}\,b\,\mathrm{Ad}\,x\| < \varepsilon$, $b \in F$. We require that $\psi$ satisfies
\[
|\psi_t(x_ix_j^*) - \omega_t(x_ix_j^*)| < \delta , \qquad i, j = 1, 2, \ldots, n ,
\]
for a sufficiently small $\delta > 0$. We identify the GNS representation $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega)$ with
\[
\Bigl( \int_X^{\oplus} \pi_t\,d\mu(t) ,\ \int_X^{\oplus} \mathcal{H}_t\,d\mu(t) ,\ \int_X^{\oplus} \Omega_t\,d\mu(t) \Bigr) ,
\]
as before, where $\pi_t = \pi_{\omega_t}$ etc. Since $\psi$ is centrally equal to $\omega$, there is a unit vector
\[
\Psi = \int_X^{\oplus} \Psi_t\,d\mu(t) \in \int_X^{\oplus} \mathcal{H}_t\,d\mu(t)
\]
such that $\|\Psi_t\| = 1$ for all $t$ and $\psi_t$ is the vector state defined through $\Psi_t$ for almost all $t$.

First suppose that the (finite-dimensional) subspace $L_{\Omega,t}$ spanned by $\pi_t(x_i^*)\Omega_t$, $i = 1, \ldots, n$, is orthogonal to the subspace $L_{\Psi,t}$ spanned by $\pi_t(x_i^*)\Psi_t$, $i = 1, \ldots, n$, for almost all $t$. Let $E_t$ be the projection onto $L_{\Omega,t} + L_{\Psi,t}$; then $t \mapsto E_t$ is measurable and defines a finite-rank projection $E$ in $\pi_\omega(A)''$. Let $F_t$ be a projection such that $F_t \le E_t$ and
\[
F_t\pi_t(x_i^*)(\Omega_t + \Psi_t) \approx 0 , \qquad F_t\pi_t(x_i^*)(\Omega_t - \Psi_t) \approx \pi_t(x_i^*)(\Omega_t - \Psi_t) .
\]
Since there is a canonical way to define $F_t$ (cf. [5, 3.3]), we may assume that $t \mapsto F_t$ is measurable and defines a projection $F$ in $\pi_\omega(A)''$. Then by the Lusin theorem (Lemma 4.2) we obtain an $h \in A_{sa}$ and a projection $z \in \pi(A)'$ such that $\|h\| < 1 + \varepsilon$, $\|(1 - z)\Omega\| < \varepsilon$, and $\pi(h)Ez = FEz$. We define $\bar h = xhx^* \in A_{sa}$. Since $\|\pi_\omega(xx^*)e - e\| < \varepsilon$, we have that $\pi_t(\bar h)(\Omega_t - \Psi_t) \approx \pi_t(xx^*)(\Omega_t - \Psi_t) \approx \Omega_t - \Psi_t$ (and $\pi_t(\bar h)(\Omega_t + \Psi_t) \approx 0$) except for $t$ in a subset of small measure, which implies that $e^{i\pi\pi_\omega(\bar h)}\Omega_t \approx \Psi_t$ for most of $t$. Thus we conclude that the path $(e^{i\pi s\pi_\omega(\bar h)})$ satisfies the required conditions. Note that when we choose $h$ above, we could impose an extra condition on the behavior of $\rho(h)$ if the representation $\rho$ is disjoint from $\pi_\omega$.

If $L_{\Omega,t}$ is not orthogonal to $L_{\Psi,t}$ for $t$ in a subset of non-zero measure, then we invoke the condition that $\pi_t(A) \cap \mathcal{K}(\mathcal{H}_t) = \{0\}$ for almost all $t$. As in the proof of Theorem 1.7 we find a state $\phi$ such that $\phi$ is centrally equal to $\omega$, $\phi_t$ is almost equal to $\omega_t$ on $x_ix_j^*$, $i, j = 1, \ldots, n$, and $L_{\Phi,t}$ is orthogonal to $L_{\Omega,t}$ and $L_{\Psi,t}$, where $L_{\Phi,t}$ is defined similarly as for $L_{\Omega,t}$ through the cyclic vector $\Phi$ which defines $\phi$. Then we apply the previous arguments to the pairs $\omega, \phi$ and $\phi, \psi$; we get the desired path by composition.
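The step in the proof above from the two approximate eigenvalue relations for $\bar h$ to the statement that the exponential at $s = 1$ carries $\Omega_t$ approximately to $\Psi_t$ can be seen as follows. This is a sketch in the idealized case where the relations hold exactly, followed by a standard perturbation bound; the self-adjoint operator $T$ and the error $\eta$ are notation introduced here for illustration.

```latex
% Idealized case: T = T^*, T(\Omega_t - \Psi_t) = \Omega_t - \Psi_t, T(\Omega_t + \Psi_t) = 0.
\[
e^{i\pi T}\Omega_t
 = \tfrac12 e^{i\pi T}(\Omega_t + \Psi_t) + \tfrac12 e^{i\pi T}(\Omega_t - \Psi_t)
 = \tfrac12(\Omega_t + \Psi_t) - \tfrac12(\Omega_t - \Psi_t)
 = \Psi_t .
\]
% Approximate case: if \|(T - \lambda)\xi\| \le \eta for a real number \lambda, then
\[
\|e^{i\pi T}\xi - e^{i\pi\lambda}\xi\|
 = \Bigl\| \int_0^1 i\pi\,e^{i\pi sT}e^{i\pi(1-s)\lambda}(T-\lambda)\xi\,ds \Bigr\|
 \le \pi\eta ,
\]
% so approximate relations for T = \pi_t(\bar h) yield e^{i\pi T}\Omega_t \approx \Psi_t.
```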
Proof of Corollary 1.8. We are given two type I representations πi of A on a separable Hilbert space such that π1 (A)00 and π2 (A)00 are isomorphic. Let pi be an abelian projection of πi (A)0 whose central support is 1; then πi (A)00 is isomorphic to πi (A)00 pi . Thus, by replacing πi by pi πi , we may assume that πi (A)0 is abelian for i = 1, 2. Since πi (A)0 is the center of πi (A)00 , π1 (A)0 is isomorphic to π2 (A)0 , too. Since π1 (A)00 is spacially isomorphic to π2 (A)00 , we may assume that both πi (A) acts on the same Hilbert space H and that π1 (A)00 = π2 (A)00 = M with M0 abelian. We suppose that M0 is not trivial; otherwise the result follows from [20]. In particular we have supposed that A is not of type I. What we shall show is that there is an asymptotically inner automorphism α of A such that π1 α(x) 7→ π2 (x) extends to an inner automorphism of M. Let X be the character space of M0 and we identify M0 with C(X). Let Ω be a cyclic unit vector for M, which exists as H is separable, and let E denote the projection onto the subspace spanned by M0 Ω. Then E belongs to M and has central support 1; the reduction of M0 by multiplication by E is an isomorphism. Let ωi denote the state of A defined by x 7→ hπi (x)Ω, Ωi. For each t ∈ X and i = 1, 2 we define a state ωit on A by x 7→ (Eπi (x)E)(t), where Eπi (x)E ∈ M0 E ∼ = C(X). Let µ denote the probability measure on X defined by M0 3 f 7→ hf Ω, Ωi. Then we have that Z ωit (x)dµ(t) , x ∈ A . ωi (x) = X
The representation $(\pi_i, H)$ is also expressed as the direct integral:
$$\pi_i = \int_X^{\oplus} \pi_{it}\,d\mu(t), \qquad H = \int_X^{\oplus} H_{it}\,d\mu(t),$$
where $(\pi_{it}, H_{it})$ is the GNS representation associated with $\omega_{it}$. In this situation we can show

Lemma 4.4. For any finite subset $F$ of $A$ and $\epsilon > 0$ there exists a unit vector
$$\Phi = \int_X^{\oplus} \Phi_t\,d\mu(t) \in H = \int_X^{\oplus} H_{2,t}\,d\mu(t)$$
such that $\|\Phi_t\| = 1$ for all $t$ and
$$|\omega_{1,t}(x) - \phi_t(x)| < \epsilon, \qquad x \in F, \quad \text{for almost all } t \in X,$$
where $\phi_t$ is the state defined by $x \mapsto \langle\pi_{2,t}(x)\Phi_t, \Phi_t\rangle$.

This can be proved as in the proof of Theorem 1.7, where we have imposed more conditions on $\Phi_t$; since we know that there is such a $\phi_t$ for each $t \in X$, the main problem was how to choose $\Phi_t$ so that $t \mapsto \Phi_t$ is measurable. Since $t \mapsto \phi_t(x)$ equals a continuous function except on a null set and $A$ is separable, it follows that there is a null subset $N$ of $X$ such that $X \setminus N \ni t \mapsto \phi_t(x)$ has a continuous extension for any $x \in A$. Thus we may as well assume that $t \mapsto \phi_t$ is continuous
and $\phi_t$ is the state defined through $\Phi_t$ for almost all $t$; in this case we have that $|\omega_{1,t}(x) - \phi_t(x)| < \epsilon$, $x \in F$, for all $t \in X$, which is the type of assumption made in Lemma 4.3.

By using Lemmas 4.3 and 4.4 and Lemma 4.2 in place of Kadison's transitivity, we can prove, in the same way as we prove [5, 7.5] or [20, 2.3], that there is an asymptotically inner automorphism $\alpha$ of $A$ such that $\pi_{1t}\alpha = \pi_{2t}$ for almost all $t \in X$, which implies that $\pi_1\alpha$ is unitarily equivalent to $\pi_2$ by a unitary which commutes with the center $M'$ (see also [5]).

To be more specific, we approximate $\omega_{1t} = \omega_{1t}^{(1)}$ by a vector state $\omega_{2t}^{(1)}$ in $\pi_{2t}$ in the sense of the above lemma. Then we mimic $\omega_{2t}^{(1)}$ by a vector state in $\pi_{1t}$ and apply Lemma 4.3. We obtain a continuous path $(u_s)$ in $U(A)$ such that $u_0 = 1$, the $u_s$ almost commute with elements from the prescribed finite set of $A$, and $\omega_{1t}^{(1)}\,\mathrm{Ad}\,u_1 \approx \omega_{2t}^{(1)}$ for all $t \in X$ except for $t \in X_0$ with $\mu(X_0)$ arbitrarily small. We then replace $\omega_{1t}^{(1)}$ by a vector state $\omega_{1t}^{(2)}$ for each $t \in X_0$ such that $\omega_{1t}^{(2)}\,\mathrm{Ad}\,u_1 \approx \omega_{2t}^{(1)}$ and set $\omega_{1t}^{(2)} = \omega_{1t}^{(1)}$ for $t \in X \setminus X_0$. Note that $\|\omega_1^{(1)} - \omega_1^{(2)}\|$ is very small and that $\omega_{2t}^{(1)} \approx \omega_{1t}^{(2)}\,\mathrm{Ad}\,u_1$ for all $t$. We mimic $\omega_{1t}^{(2)}\,\mathrm{Ad}\,u_1$ by a vector state in $\pi_{2t}$ and again apply Lemma 4.3 to get a continuous path $(v_s)$ in $U(A)$ as above. This time $(v_s)$ is even more central, with $v_0 = 1$ and $\omega_{2t}^{(1)}\,\mathrm{Ad}\,v_1 \approx \omega_{1t}^{(2)}\,\mathrm{Ad}\,u_1$ for most of (but not all of) $t \in X$. We then modify $\omega_2^{(1)}$ to $\omega_2^{(2)}$ so that $\|\omega_2^{(1)} - \omega_2^{(2)}\|$ is very small and $\omega_{2t}^{(2)}\,\mathrm{Ad}\,v_1 \approx \omega_{1t}^{(2)}\,\mathrm{Ad}\,u_1$ for all $t$. We repeat this process; we will obtain a norm-convergent sequence $(\omega_i^{(n)})$ of centrally equal vector states in $\pi_i$ and a sequence of continuous paths of type $(u_s)$ / $(v_s)$ in $U(A)$ as above, which should be sufficiently central; and the limit states $\omega_i^{(\infty)}$ are related as $\omega_{1t}^{(\infty)}\alpha = \omega_{2t}^{(\infty)}\beta$ for almost all $t$, where the asymptotically inner automorphisms $\alpha$ and $\beta$ are defined by using the above central sequences of continuous unitary paths.
This will complete the proof.
References
[1] O. Bratteli, “Inductive limits of finite-dimensional C*-algebras”, Trans. Amer. Math. Soc. 171 (1972) 195–234.
[2] O. Bratteli and A. Kishimoto, “Generation of semi-groups and two-dimensional quantum lattice systems”, J. Funct. Anal. 35 (1980) 344–368.
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, I, Springer, 1979.
[4] L. G. Brown and G. A. Elliott, “Universally weakly inner one-parameter automorphism groups of separable C*-algebras, II”, Math. Scand. 57 (1985) 281–288.
[5] H. Futamura, N. Kataoka and A. Kishimoto, “Homogeneity of the pure state space for separable C*-algebras”, to appear in Internat. J. Math.
[6] T. Kato, Perturbation Theory for Linear Operators, Springer, 1966.
[7] E. Kirchberg and N. C. Phillips, “Embedding of exact C*-algebras in the Cuntz algebra O2”, J. reine angew. Math. 525 (2000) 17–53.
[8] E. Kirchberg and N. C. Phillips, “Embedding of continuous fields of C*-algebras in the Cuntz algebra O2”, J. reine angew. Math. 525 (2000) 55–94.
[9] A. Kishimoto, “Simple crossed products of C*-algebras by locally compact abelian groups”, Yokohama Math. J. 28 (1980) 69–85.
[10] A. Kishimoto, “Universally weakly inner one-parameter automorphism groups of simple C*-algebras”, Yokohama Math. J. 29 (1981) 89–100.
[11] A. Kishimoto, “Universally weakly inner one-parameter automorphism groups of C*-algebras”, Yokohama Math. J. 30 (1982) 141–149.
[12] A. Kishimoto, “Type I orbits in the pure states of a C*-dynamical system”, Publ. RIMS., Kyoto Univ. 23 (1987) 321–336.
[13] A. Kishimoto, “Type I orbits in the pure states of a C*-dynamical system II”, Publ. RIMS., Kyoto Univ. 23 (1987) 517–526.
[14] A. Kishimoto, “Outer automorphism subgroups of a compact abelian ergodic action”, J. Operator Theory 20 (1988) 59–67.
[15] A. Kishimoto, “A Rohlin property for one-parameter automorphism groups”, Commun. Math. Phys. 179 (1996) 599–622.
[16] A. Kishimoto, “Unbounded derivations in AT algebras”, J. Funct. Anal. 160 (1998) 270–311.
[17] A. Kishimoto, “Pairs of simple dimension groups”, Internat. J. Math. 10 (1999) 739–761.
[18] A. Kishimoto, “Locally representable one-parameter automorphism groups of AF algebras and KMS states”, Rep. Math. Phys. 45 (2000) 333–356.
[19] A. Kishimoto, “Non-commutative shifts and crossed products”, preprint.
[20] A. Kishimoto, N. Ozawa and S. Sakai, “Homogeneity of the pure state space of a separable C*-algebra”, preprint.
[21] G. K. Pedersen, C*-Algebras and Their Automorphism Groups, Academic Press, 1979.
[22] R. T. Powers, “Representations of uniformly hyperfinite algebras and their associated von Neumann rings”, Ann. of Math. 86 (1967) 138–171.
[23] R. T. Powers and S. Sakai, “Existence of ground states and KMS states for approximately inner dynamics”, Comm. Math. Phys. 39 (1975) 273–288.
[24] S. Sakai, C*-Algebras and W*-Algebras, Classics in Math., Springer, 1998.
[25] S. Sakai, Operator Algebras in Dynamical Systems, Cambridge University Press, 1991.
[26] M. Takesaki, Theory of Operator Algebras, I, Springer, 1979.
August 21, 2002 18:52 WSPC/148-RMP
00127
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 675–700 © World Scientific Publishing Company
BOSONIC CENTRAL LIMIT THEOREM FOR THE ONE-DIMENSIONAL XY MODEL
TAKU MATSUI Graduate School of Mathematics, Kyushu University 1-10-6 Hakozaki, Fukuoka 812-8581, Japan
[email protected]
Received 7 September 2001
Revised 18 January 2002
We prove the central limit theorem for Gibbs states and ground states of quasifree Fermions (bilinear Hamiltonians) and those of the off-critical XY model on a one-dimensional integer lattice.
Keywords: The central limit theorem; the XY model; UHF algebra.
1. Introduction
The central limit theorem for mixing classical spin systems has been established in a number of situations (see [9, p. 459]). For mixing quantum spin models, a similar limit theorem was considered by D. Goderis, A. Verbeure and P. Vets in [10] and [11]. We refer to the limit theorem of D. Goderis, A. Verbeure and P. Vets as the bosonic central limit theorem. Historically the limit theorem has been expected since quantum mean field models were studied in [23] and [14]. For such exactly solvable models the formal Fourier transform of the local quantum observables obeys the canonical commutation relations with respect to a highly degenerate symplectic form, while the Fourier transform of the state gives rise to a quasifree state of Bosons. D. Goderis, A. Verbeure and P. Vets tried to justify the limit for asymptotically abelian quantum systems with states which are not necessarily of product type. We employ the word bosonic to make a clear distinction from other central limit theorems in non-commutative probability theories. In fact, the central limit theorem we consider here is the standard central limit theorem of mixing systems. The difference is that the family of measures is replaced by that of spectral measures of (mutually non-commuting) asymptotically commuting operators on a Hilbert space. Although more than 10 years have passed since the work of D. Goderis and P. Vets [11] appeared, no example of (non-product type) mixing states satisfying their conditions has been given. One difficulty in proving the limit theorem lies in the estimates of correlation functions (CLT5) in [11]. Given two quantum local observables $Q_1$
and $Q_2$ localized in space regions $\Lambda_1$ and $\Lambda_2$, we may prove the following estimate:
$$|\varphi(Q_1Q_2) - \varphi(Q_1)\varphi(Q_2)| \le C(Q_1, Q_2)\,d(\Lambda_1, \Lambda_2)^{-\eta}$$
where $d(\Lambda_1, \Lambda_2)$ is the distance between $\Lambda_1$ and $\Lambda_2$, and $\eta$ and $C(Q_1, Q_2)$ are positive constants. In proving a limit theorem for general mixing systems, D. Goderis, A. Verbeure and P. Vets needed information on the exponent $\eta$ and the behavior of the constant $C(Q_1, Q_2)$. Moreover, the supports $\Lambda_1$ and $\Lambda_2$ should be unions of cubes which are mutually entangled, so that we are forced to consider multipoint correlation functions. D. Goderis, A. Verbeure and P. Vets derived their limit theorem under assumptions on the dependence of $C(Q_1, Q_2)/(\|Q_1\|\,\|Q_2\|)$ on the size of $\Lambda_1$ and $\Lambda_2$ when the supports $\Lambda_1$ and $\Lambda_2$ are getting large. Here we do not state their assumptions explicitly, as the conditions in [10]–[12] are all slightly different and none of them is proved for any non-trivial example. (Compare (CLT5) of [11] with the mixing condition of classical cases, for example, [6].) We believe that the justification of the assumption of [10, 11] is far from a straightforward task and it has never been worked out. The only concrete states discussed in [11] are quasifree states of Bosons. However that case can be treated very easily without referring to [11, Theorem 4.1]. On the other hand, even for quasifree states of Fermions, we are not aware of a proof of (CLT5) of [11]. Moreover, D. Goderis, A. Verbeure and P. Vets failed to show the central limit theorem for non-local observables, which, in turn, gives rise to an additional complication in constructing the dynamics of the algebra of fluctuations in [12].

The object of this paper is to present the first example of the bosonic central limit theorem. We do not verify the assumptions of [10] and [11] but we show a limit theorem under a mixing condition that is easier to prove for one-dimensional systems. We consider the ground states of the exactly solvable XY model on a one-dimensional integer lattice $\mathbf{Z}$.
Our bosonic central limit theorem is valid for any local quantum observable (e.g. the energy density, local currents, etc.). We can prove the assumption of our limit theorem for several other states. The Ruelle transfer operator technique shows our mixing condition for Gibbs states of one-dimensional quantum spin systems with any finite range interactions at any temperature. The finitely correlated pure states and their infinite range analogue are other applicable cases. The proof of our limit theorem is given in Sec. 2. We also present a proof of the bosonic central limit theorem for (not strictly local) quasi-local observables. Our proof is based on an idea of E. Bolthausen in [6]. We mention only one-dimensional models here because we lack non-trivial examples in higher dimensional models at the present stage of our research. There is no essential difference in proving our bosonic central limit theorem in the multi-dimensional case. In practice, the crucial part of the proof is not the general limit theorem but showing the uniform mixing condition for concrete states. For multi-dimensional quantum systems, we still do not have many tools to prove such a uniform mixing condition. We are not sure whether the standard cluster expansion yields the bosonic central limit theorem for high temperature Gibbs states.
In what follows, we abbreviate bosonic central limit theorem to CLT. To prove CLT for the XY model, we first show CLT for general local observables in quasifree states of Fermions on lattices. When restricted to the abelian algebra of the number operator density, any (gauge invariant) quasifree state of Fermions gives rise to a Fermion point process. The Laplace transform of the stationary measure for this process is written in terms of certain infinite determinants. This enables us to show CLT for the number operator density in various situations (see [19]–[21]). In fact, via the Jordan–Wigner transform the XY model is formally equivalent to a quasifree (bilinear) Hamiltonian of Fermions on $\mathbf{Z}$. However, to prove CLT for the XY model, it is necessary to obtain CLT for general observables in quasifree states due to the non-local nature of the Jordan–Wigner transform. We use detailed knowledge of unitary implementors of Bogoliubov automorphisms developed by H. Araki in [2] and [3].

The one-dimensional XY model is an exactly solved model. The Hamiltonian is
$$H = -\sum_{j \in \mathbf{Z}} \{(1 + \gamma)\sigma_x^{(j)}\sigma_x^{(j+1)} + (1 - \gamma)\sigma_y^{(j)}\sigma_y^{(j+1)} + 2\lambda\sigma_z^{(j)}\}$$
where $\sigma_x^{(j)}$, $\sigma_y^{(j)}$, and $\sigma_z^{(j)}$ are Pauli spin matrices at the site $j$ and $\gamma$ and $\lambda$ are real parameters (anisotropy and magnetic field). We already claimed that the model is formally equivalent to a quasifree fermion Hamiltonian on a one-dimensional lattice. However, they are not physically equivalent in that the ergodic behavior of the time evolution is different. In [4], H. Araki introduced the crossed product formalism for the XY model to handle analytic aspects of the XY model, and in [5] we have shown lack of ergodicity for the time evolution in the ground state representations when $\gamma \ne 0$ and $|\lambda| < 1$. We use the idea of H. Araki to show the bosonic central limit theorem for the XY model.

To explain our results more precisely, we introduce some notation. By $A$ we denote the UHF $C^*$-algebra $d^\infty$ (the infinite tensor product of $d$ by $d$ matrix algebras):
$$A = \overline{\bigotimes_{\mathbf{Z}} M_d(\mathbf{C})}^{\,C^*}.$$
Each component of the tensor product above is specified with a lattice site $j \in \mathbf{Z}$. By $Q^{(j)}$ we denote the element of $A$ with $Q$ in the $j$th component of the tensor product and the identity in any other component. For a subset $\Lambda$ of $\mathbf{Z}$, $A_\Lambda$ is defined as the $C^*$-subalgebra of $A$ generated by elements supported in $\Lambda$. We set
$$A_{loc} = \bigcup_{\Lambda \subset \mathbf{Z}:\,|\Lambda| < \infty} A_\Lambda \tag{1.1}$$
where the cardinality of $\Lambda$ is denoted by $|\Lambda|$. We call an element of $A_\Lambda$ a local observable or a strictly local observable. When $\varphi$ is a state of $A$ the restriction to $A_\Lambda$ will be denoted by $\varphi_\Lambda$: $\varphi_\Lambda = \varphi|_{A_\Lambda}$.
For simplicity, we set
$$A_R = A_{[0,\infty)}, \qquad A_L = A_{(-\infty,-1]}.$$
Let $\tau_j$ be the lattice translation determined by $\tau_j(Q^{(k)}) = Q^{(j+k)}$ for any $j$ and $k$ in $\mathbf{Z}$. Note that $\tau_j$ leaves $A_R$ (respectively $A_L$) globally invariant if $j$ is positive (respectively negative).

Next we present our central limit theorem. Consider a translationally invariant state $\varphi$ of $A$ ($\varphi \circ \tau_j = \varphi$ for any $j$). Suppose there is a $\tau$-invariant dense *-subalgebra $B$ of $A$ for which $\varphi$ has summable two point correlations in the following sense:
$$\sum_{j \in \mathbf{Z}} |\varphi(Q_1\tau_j(Q_2)) - \varphi(Q_1)\varphi(Q_2)| < \infty \tag{1.2}$$
for any $Q_1$ and $Q_2$ in $B$. Let $N$ be a positive integer and let $Q$ be a selfadjoint element of $B$. We define $Q_{\langle N\rangle}$ via the following equation:
$$Q_{\langle N\rangle} = \frac{1}{\sqrt{2N + 1}} \sum_{|j| \le N} (\tau_j(Q) - \varphi(Q)).$$
Due to our assumption (1.2), the state $\varphi$ is a factor state. Consider the GNS representation $\{\pi(A), \Omega, H\}$ associated with $\varphi$, where $\Omega$ is the GNS cyclic vector and $\pi$ is the representation of $A$ on the GNS Hilbert space $H$. Then on the subspace $\pi(B)\Omega$ of $H$ the following weak limit exists:
$$\operatorname*{w-lim}_{N \to \infty}\,[Q_{\langle N\rangle}, R_{\langle N\rangle}] = s(Q, R)1,$$
where
$$s(Q, R) = \sum_{j \in \mathbf{Z}} \varphi([Q, \tau_j(R)]).$$
The convergence of $s(Q, R)$ is guaranteed by (1.2). Thus the formal limit $B(Q) = \lim_{N \to \infty} Q_{\langle N\rangle}$ gives rise to an algebra of canonical commutation relations with respect to the degenerate symplectic form $s(Q, R)$ on $B$. Note that the limit $B(Q)$ does not exist as an unbounded selfadjoint operator on $H$. However we may consider the limit of the spectral measure for $Q_{\langle N\rangle}$ evaluated on the state $\varphi$. In terms of the Fourier transform of the spectral measure we introduce the notion of the central limit. Set
$$t(Q, R) = \lim_{N \to \infty} \varphi(Q_{\langle N\rangle}R_{\langle N\rangle}).$$
Definition 1.1. Let $\varphi$ be a translationally invariant state of $A$. We say that the central limit theorem holds for $\varphi$ and $B$ if and only if
$$\lim_{N \to \infty} \varphi(e^{iTQ_{\langle N\rangle}}) = e^{-\frac{T^2}{2}t(Q,Q)} \tag{1.3}$$
for any selfadjoint $Q$ in $B$.
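As an illustration of the limit (1.3), consider the simplest commutative situation, which is an assumption made here and not an example from the paper: the translates $\tau_j(Q)$ behave like independent $\pm 1$ variables with mean zero, so that $t(Q,Q) = 1$. The characteristic function of the normalized sum is then $\cos(T/\sqrt{2N+1})^{2N+1}$, and the sketch below compares it with the Gaussian limit $e^{-T^2/2}$. The function names are hypothetical.

```python
import math


def char_fn_normalized_sum(T, N):
    """E[exp(i*T*S_N)] for S_N = (2N+1)^(-1/2) * (sum of 2N+1 independent +-1 spins).

    Each spin contributes the factor E[exp(i*T*X/sqrt(n))] = cos(T/sqrt(n)),
    and independence turns the expectation of the product into a product.
    """
    n = 2 * N + 1
    return math.cos(T / math.sqrt(n)) ** n


def gaussian_limit(T, variance=1.0):
    """Right-hand side of (1.3) with t(Q, Q) = variance."""
    return math.exp(-0.5 * T * T * variance)


if __name__ == "__main__":
    T = 1.0
    for N in (10, 100, 1000, 10000):
        print(N, char_fn_normalized_sum(T, N), gaussian_limit(T))
```

In the genuinely quantum setting the translates neither commute nor are independent, which is why the mixing condition and the commutator estimate of Sec. 2 are needed.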
Theorem 1.2. Let $\varphi$ be a translationally invariant state of $A$. Let $\alpha(j)$ be a positive function on the set of integers such that $\alpha(j) = \alpha(-j)$ and $\alpha(j) = O(j^{-2-\delta})$ as $j \to \infty$ with some positive $\delta$. The central limit theorem holds for $\varphi$ and $A_{loc}$ if
$$|\varphi(Q_L\tau_j(Q_R)) - \varphi(Q_L)\varphi(Q_R)| \le K\|Q_L\|\,\|Q_R\|\,\alpha(j) \tag{1.4}$$
for any $Q_L$ in $A_L$ and $Q_R$ in $A_R$.

In classical dynamical systems, one way to show CLT is to use the Ruelle transfer operator (cf. [17]). In [1] H. Araki developed a quantum analogue of the Ruelle transfer operator for quantum spin chains. In [13], V. Golodets and S. V. Neshveyev considered Gibbs states of AF algebras with one-dimensional locality. For finite range interactions they derived estimates of correlation functions which imply our CLT.

Corollary 1.3. The central limit theorem holds for a Gibbs state of any finite range translationally invariant interaction at arbitrary temperature. In particular CLT is valid for Gibbs states of the XY model.

Our main result of this paper is as follows. Now the algebra $A$ is the infinite tensor product of 2 by 2 matrices.

Theorem 1.4. Suppose $(\lambda, \gamma) \ne (0, \pm 1)$. Let $\psi$ be a pure ground state of the XY model. The central limit theorem is valid for any selfadjoint local observable $Q$ and any pure ground state $\varphi$ if $|\lambda| > 1$ or if $|\lambda| < 1$ and $\gamma \ne 0$.

To compare the above result with classical spin models, we make a brief remark on the two-dimensional Ising model (with zero magnetic field). The two-dimensional Ising model can be solved by the same Jordan–Wigner transformation. Roughly speaking, the region $|\lambda| > 1$ corresponds to the high temperature regime. When $|\lambda| < 1$ and $\gamma \ne 0$ there are precisely two translationally invariant pure ground states. These pure ground states correspond to extremal Gibbs measures (with a plus or minus boundary condition) at low temperature. The above result gives CLT for extremal Gibbs measures projected to a one-dimensional line. Thus we are dealing with non-Gibbsian measures when $|\lambda| < 1$ and $\gamma \ne 0$. Recall that the classical CLT for off-critical Ising models was obtained in a very skillful way by C. Newman in [16], but the FKG inequality is crucial in his argument and we do not expect the same property to hold for general observables.
Of course, our theorem is valid for non-abelian observables $Q$ in the sense that $[Q, \tau_j(Q)] \ne 0$ for small $j$, and such observables are of our prime interest. At the critical line $|\lambda| = 1$ or $|\lambda| < 1$, $\gamma = 0$, we believe that a scaling different from $\sqrt{n}$ is required to obtain a non-trivial limit. In the case of certain massless Fermions, H. Spohn obtained a limit theorem for the number operator density for a more singular case (see [22]).
We present our proof of Theorem 1.2 in Sec. 2. Section 3 is devoted to CLT for quasifree states of Fermions, and we show CLT for pure ground states of the XY model in Sec. 4.

2. Proof of Theorem 1.2
The proof presented here follows the argument of E. Bolthausen in [6]. His idea is based on the observation that a probability measure $\nu$ on the real line $\mathbf{R}$ is the standard normal distribution if the characteristic function of $\nu$ is of class $C^1$ and satisfies
$$\int_{\mathbf{R}} xe^{itx}\,d\nu(x) = it\int_{\mathbf{R}} e^{itx}\,d\nu(x).$$
The following is [6, Lemma 2].

Lemma 2.1. Let $\nu_n$ be a sequence of probability measures on the real line $\mathbf{R}$ and suppose that
$$\sup_n \int_{\mathbf{R}} |x|^2\,d\nu_n(x) < \infty \tag{2.1}$$
and
$$\lim_{n \to \infty} \int_{\mathbf{R}} (x - it)e^{itx}\,d\nu_n(x) = 0. \tag{2.2}$$
Z eitx dνn (x) = ϕ(eitQhN i ) R
and the condition (2.1) follows from summability of two point functions. To control non-commutativity we use the following inequality: Lemma 2.2. Suppose A = A∗ B = B ∗ are selfadjoint bounded operators on a Hilbert space. Then kei(A−B) − e−iB e−iA k ≤ k[A, B]k .
(2.3)
Proof. Consider eis(A−B) e−isA eisB . By differentiating this identity, We obtain the following equation: Z 1 eis(A−B) [e−isA , B]e−i(1−s)B eiA ds . ei(A−B) − e−iB eiA = i 0
Then combined with [e−isA , B] = −i
Z 0
we obtain (2.3).
1
e−itsA [A, B]e−i(1−t)sA dt
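The inequality (2.3) can be checked numerically on the smallest non-trivial example. The sketch below is an illustration only and not part of the paper's argument: it assumes $A = a\cdot\sigma$ and $B = b\cdot\sigma$ are traceless Hermitian 2 by 2 matrices, so that the exponentials are given in closed form by $e^{\pm i\,v\cdot\sigma} = \cos|v| \pm i\sin|v|\,(v\cdot\sigma)/|v|$. All helper names are hypothetical.

```python
import math


def pauli_combo(v):
    """2x2 Hermitian matrix v[0]*sigma_x + v[1]*sigma_y + v[2]*sigma_z."""
    x, y, z = v
    return [[complex(z), complex(x, -y)],
            [complex(x, y), complex(-z)]]


def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sub(m, n):
    return [[m[i][j] - n[i][j] for j in range(2)] for i in range(2)]


def exp_i_pauli(v, sign=1.0):
    """exp(sign*i*v.sigma) = cos|v| * 1 + sign*i*sin|v| * (v.sigma)/|v|."""
    r = math.sqrt(sum(c * c for c in v))
    h = pauli_combo(v)
    s = sign * (math.sin(r) / r if r > 0 else 1.0)
    c = math.cos(r)
    return [[(c if i == j else 0.0) + 1j * s * h[i][j] for j in range(2)]
            for i in range(2)]


def op_norm(m):
    """Operator norm of a 2x2 matrix, i.e. its largest singular value."""
    t = sum(abs(m[i][j]) ** 2 for i in range(2) for j in range(2))  # s1^2 + s2^2
    d = abs(m[0][0] * m[1][1] - m[0][1] * m[1][0]) ** 2             # s1^2 * s2^2
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)


def lemma_2_2_gap(a, b):
    """Return (||e^{i(A-B)} - e^{-iB} e^{iA}||, ||[A, B]||) for A = a.sigma, B = b.sigma."""
    A, B = pauli_combo(a), pauli_combo(b)
    a_minus_b = [a[k] - b[k] for k in range(3)]
    lhs = op_norm(sub(exp_i_pauli(a_minus_b),
                      matmul(exp_i_pauli(b, -1.0), exp_i_pauli(a))))
    rhs = op_norm(sub(matmul(A, B), matmul(B, A)))
    return lhs, rhs


if __name__ == "__main__":
    print(lemma_2_2_gap((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

For commuting $A$ and $B$ both sides vanish, which is the classical case where $e^{i(A-B)} = e^{-iB}e^{iA}$ holds exactly.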
Proof of Theorem 1.2. We now describe the modifications of the proof of E. Bolthausen. First of all, without loss of generality we may assume that $\varphi(Q) = 0$ and $t(Q, Q) = 1$. Let $\{\pi(A), \Omega, H\}$ be the GNS triple associated with the state $\varphi$, where $\pi(\cdot)$ is the representation of $A$ on the Hilbert space $H$ and $\Omega$ is the GNS cyclic vector. We denote the spectral decomposition of $Q_{\langle N\rangle}$ by
$$Q_{\langle N\rangle} = \int \lambda\,dE_N(\lambda).$$
Then we apply Lemma 2.1 to the sequence of probability measures $d\nu_N(\lambda) = d(\Omega, E_N((-\infty, \lambda])\Omega)$. Due to the assumption for $\alpha(j)$ in Theorem 1.2, we can find an increasing sequence $m(N)$ of positive integers such that
$$\lim_{N \to \infty} \alpha(m(N))\sqrt{2N + 1} = 0, \qquad \lim_{N \to \infty} m(N)^{-1}(2N + 1)^{1/4} = \infty. \tag{2.4}$$
Set
$$Q_{j,N} = \sum_{|k| \le N,\,|k-j| \le m(N)} \tau_k(Q), \qquad \beta_N = \sum_{|k| \le N} \tfrac{1}{2}\varphi(\{\tau_k(Q), Q_{k,N}\}),$$
$$\bar Q_{\langle N\rangle} = \beta_N^{-1/2}\sum_{|k| \le N} \tau_k(Q), \qquad \bar Q_{j,N} = \beta_N^{-1/2}Q_{j,N}.$$
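Then $\beta_N = (2N + 1)(1 + o(1))$ due to summability of two point correlation functions and $t(Q, Q) = 1$, and now we show
$$\lim_{N \to \infty} \varphi((iT - \bar Q_{\langle N\rangle})e^{iT\bar Q_{\langle N\rangle}}) = 0. \tag{2.5}$$
We consider the following expression:
$$\begin{aligned}
(iT - \bar Q_{\langle N\rangle})e^{iT\bar Q_{\langle N\rangle}}
&= iT\Big(1 - \beta_N^{-1}\sum_{|k| \le N} \tau_k(Q)Q_{k,N}\Big)e^{iT\bar Q_{\langle N\rangle}} \\
&\quad - \beta_N^{-1/2}\sum_{|k| \le N} \tau_k(Q)\big(1 - e^{-iT\bar Q_{k,N}} - iT\bar Q_{k,N}\big)e^{iT\bar Q_{\langle N\rangle}} \\
&\quad - \beta_N^{-1/2}\sum_{|k| \le N} \tau_k(Q)\big(e^{-iT\bar Q_{k,N}}e^{iT\bar Q_{\langle N\rangle}} - e^{-iT(\bar Q_{k,N} - \bar Q_{\langle N\rangle})}\big) \\
&\quad - \beta_N^{-1/2}\sum_{|k| \le N} \tau_k(Q)e^{iT(\bar Q_{\langle N\rangle} - \bar Q_{k,N})} \\
&= A_1e^{iT\bar Q_{\langle N\rangle}} - A_2e^{iT\bar Q_{\langle N\rangle}} + A_3 - A_4.
\end{aligned} \tag{2.6}$$
Now we consider $\varphi(A_1e^{iT\bar Q_{\langle N\rangle}})$. Note that
$$|\varphi(A_1e^{iT\bar Q_{\langle N\rangle}})| \le \varphi(A_1A_1^*)^{1/2}.$$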
By definition,
$$\begin{aligned}
\varphi(A_1A_1^*) &= T^2\Big(1 - \beta_N^{-1}\sum_{|k| \le N} \varphi(\{\tau_k(Q), Q_{k,N}\}) + \beta_N^{-2}\sum_{|k|,|l| \le N} \varphi(\tau_k(Q)Q_{k,N}Q_{l,N}\tau_l(Q))\Big) \\
&= \frac{T^2}{\beta_N^2}\sum \{\varphi(\tau_a(Q)\tau_b(Q)\tau_d(Q)\tau_c(Q)) - \varphi(\tau_a(Q)\tau_b(Q))\varphi(\tau_d(Q)\tau_c(Q))\} + o(1),
\end{aligned} \tag{2.7}$$
where the last summation is taken over $a, b, c, d$ satisfying
$$|a|, |b|, |c|, |d| \le N, \qquad |a - b| \le m(N), \quad |c - d| \le m(N).$$
The rest of the estimate is the same as in [6] and we have $\lim_N \varphi(A_1A_1^*) = 0$.

Next we look at the terms $A_2$ and $A_4$. The estimate of $\varphi(A_4)$ is quite similar to the commutative case, while the estimate of $\varphi(A_2)$ is slightly different, as we show here. First note that
$$|\varphi(A_2e^{iT\bar Q_{\langle N\rangle}})| \le \beta_N^{-1/2}\sum_{|k| \le N} \varphi\big(\tau_k(Q)\,|1 - e^{-iT\bar Q_{k,N}} - iT\bar Q_{k,N}|^2\,\tau_k(Q)\big)^{1/2}. \tag{2.8}$$
Recall that $\|\bar Q_{k,N}\| \le 2m(N)/\beta_N^{1/2} \to 0$ as $N \to \infty$. Hence there exists a constant $C$ such that
$$|1 - e^{-iT\bar Q_{k,N}} - iT\bar Q_{k,N}|^2 \le C\bar Q_{k,N}^4. \tag{2.9}$$
Thus
$$|\varphi(A_2e^{iT\bar Q_{\langle N\rangle}})| \le C\beta_N^{-1/2}(2N + 1)\|\bar Q_{k,N}^2\| \le C(\sqrt{2N + 1} + o(1))\Big(\frac{m(N)}{\sqrt{2N + 1}}\Big)^2 \le 2C\frac{m(N)^2}{\sqrt{2N + 1}}. \tag{2.10}$$
Due to (2.4) the right-hand side of (2.10) vanishes as $N$ tends to infinity.

Finally we consider $\varphi(A_3)$ in (2.6). Applying Lemma 2.2, we obtain
$$\|e^{-iT\bar Q_{k,N}}e^{iT\bar Q_{\langle N\rangle}} - e^{-iT(\bar Q_{k,N} - \bar Q_{\langle N\rangle})}\| \le T^2\|[\bar Q_{k,N}, \bar Q_{\langle N\rangle}]\| \le T^2\,2r^2\beta_N^{-1}\sum_{k \in \mathbf{Z}} \|[Q, \tau_k(Q)]\|,$$
where $r$ is the range of $Q$. As $Q$ is strictly local, $\sum_{k \in \mathbf{Z}} \|[Q, \tau_k(Q)]\|$ is finite. Thus
$$|\varphi(A_3)| \le KT^2\beta_N^{-3/2}(2N + 1)\sum_{k \in \mathbf{Z}} \|[Q, \tau_k(Q)]\|, \tag{2.11}$$
and we have $\lim_N \varphi(A_3) = 0$.

Next we consider CLT for non-local observables. Here for simplicity of presentation, we consider the case when the two point correlation decays exponentially fast.
Certainly this can be proved for many examples. The above proof shows that CLT is valid for a non-local observable $Q = Q^*$ if we obtain the following estimates.

(a) We have a subalgebra $F$ of $A$ with a norm $|||Q|||$ such that there exist positive constants $C$ and $M$ such that
$$|\varphi(Q\tau_k(R)) - \varphi(Q)\varphi(R)| \le Ce^{-kM}\,|||Q|||\;|||R||| \tag{2.12}$$
for $Q$ and $R$ in $F$ and any positive integer $k$.

(b) Any element $Q$ in $F$ is well localized so that the following sum is finite:
$$\sup_N \sum_{|k| \le N}\sum_{|l| \ge N} \|[\tau_l(Q), \tau_k(Q)]\| < \infty. \tag{2.13}$$

(c) $\lim_N \varphi(A_4) = 0$.

The algebra $A_{1,x}$ of [1] and $F_\theta$ of [15] satisfy the above condition (2.13). More generally, we introduce the notion of exponentially localized observables here.

Definition 2.3. An element $Q$ of $A$ is exponentially localized if there exist positive constants $|||Q|||$ and $0 < \theta < 1$ such that for each positive integer $n$ there exists $Q_n$ in $A_{[-n,n]}$ satisfying $\|Q_n\| \le \|Q\|$ and
$$\|Q - Q_n\| \le |||Q|||\,\theta^n. \tag{2.14}$$
$\theta$ is called the localization rate for $Q$.

Suppose $Q$ and $R$ are exponentially localized with a localization rate $\theta$. Then we have
$$\|[Q, \tau_k(R)]\| \le 2(\|Q\|\;|||R||| + \|R\|\;|||Q|||)\,\theta^{k/3}. \tag{2.15}$$
The estimate (2.15) implies (2.13).

Theorem 2.4. Let $\varphi$ be a translationally invariant state. Suppose there exist positive constants $M$ and $K$ such that
$$|\varphi(Q_L\tau_j(Q_R)) - \varphi(Q_L)\varphi(Q_R)| \le K\|Q_L\|\,\|Q_R\|\,e^{-Mj} \tag{2.16}$$
for any positive $j$, any $Q_L$ in $A_L$ and $Q_R$ in $A_R$. The central limit theorem holds for any selfadjoint exponentially localized observable $Q$.

When $Q$ is exponentially localized, the two point correlation decays exponentially fast for $Q$ due to the localization property (2.14):
$$|\varphi(Q\tau_k(Q)) - \varphi(Q)\varphi(Q)| \le K'e^{-M'k},$$
where $M' = \min\{M/2, |\ln\theta|/2\}$.

Proof of Theorem 2.4. Suppose that $Q$ has the localization rate $\theta$. Due to the exponential localization and (2.15), it is also straightforward to show (2.13).
Thus we concentrate on the estimate of $|\varphi(A_4)|$. We approximate $Q$ by $Q_{r(N)}$ with $\varphi(Q_{r(N)}) = 0$ for suitable $r(N)$. We set
$$\alpha(N) = Ke^{-MN}, \qquad m(N) = \frac{a}{M}\ln(2N + 1), \qquad r(N) = b\ln(2N + 1). \tag{2.17}$$
If $b = \frac{4}{|\ln\theta|}$ and $a > \max\{(2bM + 2), 1/2\}$, we have (2.4) and
$$\lim_{N \to \infty} (2N + 1)^2\theta^{r(N)} = 0. \tag{2.18}$$
Then
$$\lim_{N \to \infty} \sqrt{2N + 1}\,\|\tau_k(Q)e^{iT(\bar Q_{\langle N\rangle} - \bar Q_{k,N})} - \tau_k(Q_{r(N)})e^{iT(\bar Q_{\langle N\rangle} - \bar Q_{k,N})}\| = 0. \tag{2.19}$$
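The arithmetic behind the choice (2.17) can be checked directly: with $b = 4/|\ln\theta|$ one gets $\theta^{r(N)} = (2N+1)^{-4}$, and with $\alpha(N) = Ke^{-MN}$ one gets $\alpha(m(N))\sqrt{2N+1} = K(2N+1)^{1/2-a}$. The sketch below evaluates these quantities for sample parameters. The values of $\theta$, $M$, $K$ and $a$ are arbitrary choices made for illustration, $m(N)$ is treated as a real number rather than rounded to an integer, and the function name is hypothetical.

```python
import math


def scaling_quantities(N, theta=0.5, M=1.0, K=1.0, a=None):
    """Evaluate the quantities controlled by (2.17) for one value of N."""
    b = 4.0 / abs(math.log(theta))
    if a is None:
        # Any a > max(2bM + 2, 1/2) is admissible; take one such value.
        a = max(2.0 * b * M + 2.0, 0.5) + 1.0
    n = 2 * N + 1
    m = (a / M) * math.log(n)   # m(N), not rounded to an integer here
    r = b * math.log(n)         # r(N)
    return {
        "theta_pow_r": theta ** r,                               # equals n**(-4)
        "target": float(n) ** -4,
        "alpha_term": K * math.exp(-M * m) * math.sqrt(n),       # first limit in (2.4)
        "growth_term": n ** 0.25 / m,                            # second limit in (2.4)
        "localization_term": n ** 2 * theta ** r,                # limit in (2.18)
    }


if __name__ == "__main__":
    for N in (10 ** 3, 10 ** 6):
        print(N, scaling_quantities(N))
```

The quantity `growth_term` tends to infinity only logarithmically slowly compared with the power-law decay of the other terms, so large $N$ is needed before it becomes visibly large.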
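On the other hand, Lemma 2.2 tells us that
$$\|e^{iA} - e^{iB}\| \le \|e^{iA} - e^{i(A-B)}e^{iB}\| + \|(1 - e^{i(A-B)})e^{iB}\| \le \|[A, B]\| + \|A - B\|.$$
As a result we obtain
$$\begin{aligned}
\|e^{iT(\bar Q_{\langle N\rangle} - \bar Q_{k,N})} - e^{iT(Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})}\|
&\le T^2\|[(\bar Q_{\langle N\rangle} - \bar Q_{k,N}), (Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})]\| \\
&\quad + T\|(\bar Q_{\langle N\rangle} - \bar Q_{k,N}) - (Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})\|.
\end{aligned} \tag{2.20}$$
Now we look at the first term in the right-hand side of (2.20):
$$\|[\bar Q_{\langle N\rangle}, Q_{r(N)\langle N\rangle}]\| = \beta_N^{-1}\Big\|\Big[\sum_{|k| \le N} \tau_k(Q), \sum_{|l| \le N} \tau_l(Q - Q_{r(N)})\Big]\Big\| \le (2N + 1)^2\beta_N^{-1}\|Q\|\,\|Q - Q_{r(N)}\| \le C_1(2N + 1)^{3/2}\theta^{r(N)}. \tag{2.21}$$
So
$$T^2\|[(\bar Q_{\langle N\rangle} - \bar Q_{k,N}), (Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})]\| \le C_2(2N + 1)^{-5/2}.$$
The estimate of the second term of the right-hand side of (2.20) is easier:
$$T\|(\bar Q_{\langle N\rangle} - \bar Q_{k,N}) - (Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})\| \le C_3T\,\frac{2N + 1}{\beta_N^{1/2}}\,\theta^{r(N)}. \tag{2.22}$$
Thus
$$\sqrt{2N + 1}\,\|e^{iT(\bar Q_{\langle N\rangle} - \bar Q_{k,N})} - e^{iT(Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})}\| \le \sqrt{2N + 1}\,\big(C_1(2N + 1)^{3/2} + C_4\sqrt{2N + 1}\big)\theta^{r(N)} \le C_5(2N + 1)^2\theta^{r(N)}.$$
By definition we have $\theta^{r(N)} = (2N + 1)^{-4}$. As a consequence, we have
$$\lim_{N \to \infty} \varphi(A_4) = \lim_{N \to \infty} \beta_N^{-1/2}\sum_{|k| \le N} \varphi\big(\tau_k(Q_{r(N)})e^{iT(Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})}\big). \tag{2.23}$$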
However the uniform cluster property of $\varphi$ leads to
$$|\varphi(\tau_k(Q_{r(N)})e^{iT(Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})})| \le C_4(2N + 1)^{-(a-2bM)}. \tag{2.24}$$
Thus
$$\Big|\beta_N^{-1/2}\sum_{|k| \le N} \varphi\big(\tau_k(Q_{r(N)})e^{iT(Q_{r(N)\langle N\rangle} - Q_{r(N)\,k,N})}\big)\Big| \le C_4\beta_N^{-1/2}(2N + 1)(2N + 1)^{-(a-2bM)} \le C_5(2N + 1)^{-1}. \tag{2.25}$$
3. Fermions
In this section, we show uniform clustering for quasifree states of Fermions which, in turn, implies CLT. The results mentioned in this section are used in our proof of CLT for pure ground states of the XY model. We begin with an explanation of quasifree states of Fermions on $\mathbf{Z}$. Let $A^{CAR}$ be the CAR algebra generated by the Fermion creation and annihilation operators $c_j^*$, $c_j$. In this paper, by a CAR algebra we mean the unital $C^*$-algebra $A^{CAR}$. Thus, $c_j$ and $c_j^*$ satisfy the standard canonical anticommutation relations:
$$\{c_j, c_k\} = \{c_j^*, c_k^*\} = 0, \qquad \{c_j, c_k^*\} = \delta_{j,k} \tag{3.1}$$
for any integers $j$ and $k$. For $f = (f_j) \in l^2(\mathbf{Z})$ we set
$$c^*(f) = \sum_{j \in \mathbf{Z}} c_j^*f_j, \qquad c(f) = \sum_{j \in \mathbf{Z}} c_jf_j, \tag{3.2}$$
where the sum converges in norm topology. Furthermore, let
$$B(h) = c^*(f_1) + c(f_2), \tag{3.3}$$
where $h = (f_1 \oplus f_2)$ is a vector in the test function space $K = l^2(\mathbf{Z}) \oplus l^2(\mathbf{Z})$. By $\bar f$ we denote the complex conjugate $\bar f = (\bar f_j)$ of $f \in l^2(\mathbf{Z})$ and we introduce an antiunitary involution $J$ on the test function space $K = l^2(\mathbf{Z}) \oplus l^2(\mathbf{Z})$ determined by
$$J(f_1 \oplus f_2) = (\bar f_2 \oplus \bar f_1). \tag{3.4}$$
Then,
$$\{B(h_1)^*, B(h_2)\} = (h_1, h_2)_K1, \qquad B(Jh)^* = B(h). \tag{3.5}$$
For convenience, we introduce the automorphisms $\tau_k$, $\Theta$, $\Theta_-$ and $\Theta_k^{(+)}$ of $A^{CAR}$. $\tau_k$ is the shift on the lattice $\mathbf{Z}$,
$$\tau_k(c_j) = c_{j+k}, \qquad \tau_k(c_j^*) = c_{j+k}^*. \tag{3.6}$$
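The relations (3.1) can be made concrete on a finite chain. The sketch below is a two-site illustration under stated assumptions and not the construction used in the paper: it represents $c_1 = \sigma^- \otimes 1$ and $c_2 = \sigma_z \otimes \sigma^-$ (a Jordan–Wigner type representation on $\mathbf{C}^2 \otimes \mathbf{C}^2$, chosen here for illustration) and checks the anticommutators with exact integer arithmetic. All names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def dagger(a):
    """Adjoint; the matrices used here are real, so this is the transpose."""
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zero(n):
    return [[0] * n for _ in range(n)]


I2 = identity(2)
SZ = [[1, 0], [0, -1]]
LOWER = [[0, 1], [0, 0]]   # annihilates the occupied state of a single mode

# Two-site representation: the sigma_z string makes different sites anticommute.
C1 = kron(LOWER, I2)
C2 = kron(SZ, LOWER)

if __name__ == "__main__":
    print(anticommutator(C1, dagger(C1)) == identity(4))  # {c_1, c_1^*} = 1
    print(anticommutator(C1, C2) == zero(4))              # {c_1, c_2} = 0
```

Without the $\sigma_z$ factor in `C2` the two operators would commute instead of anticommute, which is the non-locality of the Jordan–Wigner transform mentioned in the Introduction.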
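$\Theta$, $\Theta_-$ and $\Theta_k^{(+)}$ are involutive *-automorphisms of $A^{CAR}$ determined by the following equations:
$$\Theta(c_j) = -c_j, \qquad \Theta(c_j^*) = -c_j^* \qquad \text{for any } j, \tag{3.7}$$
$$\Theta_-(c_j) = \begin{cases} c_j, & \text{if } j \ge 1, \\ -c_j, & \text{if } j \le 0, \end{cases} \qquad \Theta_-(c_j^*) = \begin{cases} c_j^*, & \text{if } j \ge 1, \\ -c_j^*, & \text{if } j \le 0, \end{cases} \tag{3.8}$$
$$\Theta_k^{(+)}(c_j) = \begin{cases} -c_j, & \text{if } j \ge k, \\ c_j, & \text{if } j \le k - 1, \end{cases} \qquad \Theta_k^{(+)}(c_j^*) = \begin{cases} -c_j^*, & \text{if } j \ge k, \\ c_j^*, & \text{if } j \le k - 1. \end{cases} \tag{3.9}$$
The automorphisms $\tau_k$, $\Theta$, $\Theta_-$ and $\Theta_k^{(+)}$ are induced by unitaries $t_k$, $\theta$, $\theta_-$ and $\theta_k^{(+)}$ on the test function space $K$:
$$\tau_k(B(h)) = B(t_kh), \quad \Theta(B(h)) = -B(h), \quad \Theta_-(B(h)) = B(\theta_-h), \quad \Theta_k^{(+)}(B(h)) = B(\theta_k^{(+)}h). \tag{3.10}$$
Generally any unitary $u$ on $K$ satisfying $JuJ = u$ gives rise to the automorphism $\beta_u$ of $A^{CAR}$ determined by
$$\beta_u(B(h)) = B(uh). \tag{3.11}$$
$\beta_u$ is called the Bogoliubov automorphism associated with $u$. As the automorphism $\Theta$ is involutive, $\Theta^2(Q) = Q$, we introduce the $\mathbf{Z}_2$ grading with respect to $\Theta$.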
ACAR = ACAR ∪ ACAR . + −
Definition 3.1. Let A be a positive operator on the test function space K satisfying 0 ≤ A ≤ 1,
JAJ = 1 − A .
(3.12)
The quasifree state ϕA is the state of ACAR determined by the following equations: ϕA (B(h1 )B(h2 ) · · · B(h2n+1 )) = 0 , ϕA (B(h1 )B(h2 ) · · · B(h2n )) =
X
sign(p)
n Y
(Jhp(2j−1) , Ahp(2j) )K ,
j=1
where the sum is over all permutations p satisfying p(1) < p(3) < · · · < p(2n − 1) ,
p(2j − 1) < p(2j)
and sign(p) is the signature of the permutation p.
(3.13) (3.14)
Due to (3.13), any quasifree state $\varphi_A$ is $\Theta$ invariant. A quasifree state $\varphi_A$ is pure if and only if the operator $A$ is a projection. We call a projection $E$ a basis projection if $JEJ = 1 - E$. The quasifree state $\varphi_E$ associated with a basis projection $E$ is referred to as the Fock state associated with $E$, and the associated GNS representation (respectively the GNS Hilbert space, the GNS cyclic vector) as the Fock representation (respectively Fock space, Fock vacuum vector). A quasifree state $\varphi_A$ is faithful if and only if the operator $A$ is positive, $0 < A < 1$. A quasifree state $\varphi_A$ is translationally invariant if and only if the operator $A$ commutes with the unitary $t_1$ introduced in (3.10), $t_1A = At_1$. It turns out that for a translationally invariant quasifree state $\varphi_A$, the operator $A$ is the Fourier transform of a (2 by 2 matrix valued) multiplication operator $\tilde A(x)$ on $\tilde K = L^2([0, 2\pi]) \oplus L^2([0, 2\pi])$. More precisely we use the following normalization for the Fourier transform:
$$F(f)(x) = \sum_{n \in \mathbf{Z}} e^{inx}f_n, \qquad f_n = (2\pi)^{-1}\int e^{-inx}F(f)(x)\,dx \tag{3.15}$$
for $f = (f_n) \in l^2(\mathbf{Z})$ and $F(f)(x) \in L^2([0, 2\pi])$. If $t_1A = At_1$, the operator $\tilde A = FAF^{-1}$ on $\tilde K$ commutes with the multiplication operators $e^{inx}$ (for any integer $n$) and $\tilde A$ is a matrix valued multiplication operator. The decay of two point correlation functions is determined by the regularity of the function $\tilde A(x)$. For example, if the Fourier coefficients of $\tilde A(x)$ vanish exponentially fast, two point correlation functions for local observables decay exponentially as well. However, to obtain uniformity of the decay of correlations we impose more conditions on the operator $A$.

Now we state our central limit theorem for quasifree states. Theorem 1.2 is valid for $A^{CAR}$ and we will see that the assumption of Theorem 1.2 for the even part $A_+^{CAR}$ holds in the following situations.

Theorem 3.2. Let $\varphi_A$ be a translationally invariant quasifree state of $A^{CAR}$. The bosonic central limit theorem is valid for $\varphi_A$ restricted to the even part $A_+^{CAR} \cap A_{loc}^{CAR}$ if one of the following conditions is valid:
(a) The operator $A$ is a basis projection, and all the matrix elements of $\tilde A(x)$ are $C^\infty$ functions.
(b) The operator $A$ is strictly positive, $0 < \epsilon < A < 1 - \epsilon < 1$, and all the matrix elements of $\tilde A(x)$ are of $C^\infty$ class.

To verify the assumption of Theorem 1.2, we use results on unitary implementors of Bogoliubov automorphisms and the purification technique. The proof of the following proposition can be found in [18, 2, 3].

Proposition 3.3. (i) Let $E_1$ and $E_2$ be basis projections. The Fock states $\varphi_{E_1}$ and $\varphi_{E_2}$ give rise to mutually unitarily equivalent (irreducible) representations of $A^{CAR}$ if and only if $E_1 - E_2$ is of Hilbert Schmidt class.
(ii) Suppose that $S_1$ and $S_2$ are bounded positive operators on $K$ satisfying (3.12). The quasifree states $\varphi_{S_1}$ and $\varphi_{S_2}$ are quasi-equivalent if and only if $\sqrt{S_1} - \sqrt{S_2}$ is of Hilbert-Schmidt class.

Let $\{\pi_E(\mathcal{A}^{\mathrm{CAR}}), \mathcal{H}_E, \Omega_E\}$ be the Fock representation associated with a basis projection $E$. Let $u$ be a unitary on $K$ satisfying $JuJ = u$. We say that the Bogoliubov automorphism $\beta_u$ is unitarily implementable in $\{\pi_E(\mathcal{A}^{\mathrm{CAR}}), \mathcal{H}_E, \Omega_E\}$ when there exists a unitary $\Gamma(u)$ on $\mathcal{H}_E$ such that
\[ \Gamma(u)\pi_E(Q)\Gamma(u)^{-1} = \pi_E(\beta_u(Q)) . \tag{3.16} \]
$\Gamma(u)$ is unique up to a phase factor. This is due to the irreducibility of the Fock representation. If the unitary $u$ commutes with a basis projection $E$, the Fock state $\varphi_E$ is invariant under $\beta_u$ and there exists $\Gamma(u)$ leaving $\Omega_E$ invariant: $\Gamma(u)\Omega_E = \Omega_E$.

Proposition 3.4. (i) Let $\{\pi_E(\mathcal{A}^{\mathrm{CAR}}), \mathcal{H}_E, \Omega_E\}$ be the Fock representation associated with a basis projection $E$. The Bogoliubov automorphism $\beta_u$ associated with $u$ is unitarily implementable in $\mathcal{H}_E$ if and only if $uEu^{-1} - E$ is of Hilbert-Schmidt class.
(ii) Let $E_1$ and $E_2$ be basis projections. Suppose that $E_1 - E_2$ is of Hilbert-Schmidt class and $\|E_1 - E_2\| < 1$. Let $\{\pi(\mathcal{A}^{\mathrm{CAR}}), \mathcal{H}\}$ be the irreducible representation equivalent to the Fock representation associated with $E_1$ (therefore with $E_2$) and let $\Omega_{E_a}$ ($a = 1, 2$) be the unit vector in $\mathcal{H}$ giving rise to the Fock state $\varphi_{E_a}$. Then
\[ |(\Omega_{E_1}, \Omega_{E_2})| = \det(1 - (E_1 - E_2)^2)^{1/2} . \tag{3.17} \]
Furthermore, if $E_1 - E_2$ is of trace class and $\|E_1 - E_2\| < 1$,
\[ |(\Omega_{E_1}, \Omega_{E_2})| = {\det}_{E_1}(E_1 E_2 E_1) , \tag{3.18} \]
where ${\det}_{E_1}(A)$ is the determinant of an operator $A$ acting on the range of $E_1$.

The following result is a consequence of (3.18).

Corollary 3.5. There exist positive constants $C_1$ and $C_2$ such that for basis projections $E_1$ and $E_2$ satisfying $\|E_1 - E_2\|_{H.S.} \le C_1$, we have
\[ \|\varphi_{E_1} - \varphi_{E_2}\| \le C_2 \|E_1 - E_2\|^2_{H.S.} . \tag{3.19} \]

We apply (3.19) to derive a uniform clustering property of quasifree states. For this purpose we need the following estimate of the trace norm.

Lemma 3.6. Let $A$ be a bounded operator on a Hilbert space $\mathcal{H}$. Suppose there exists a complete orthonormal basis $\{\xi_j \ (j \in \mathbb{Z})\}$ such that the following sum is finite:
\[ \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} |(\xi_k, A\xi_l)| < \infty . \tag{3.20} \]
Then the operator $A$ is of trace class and
\[ |\mathrm{tr}(A)| \le \mathrm{tr}(|A|) \le \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} |(\xi_k, A\xi_l)| . \tag{3.21} \]

Proof. Consider the polar decomposition of $A$: $A = w|A|$. Then,
\[ \mathrm{tr}(|A|) \le \sum_{k \in \mathbb{Z}} |(\xi_k, w^* A\xi_k)| \le \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} |(\xi_k, A\xi_l)|\, |(\xi_l, w^* \xi_k)| \le \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} |(\xi_k, A\xi_l)| . \tag{3.22} \]
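The inequality (3.21) can be checked numerically in finite dimension. The following is only a sketch under stated assumptions: the test matrix, its size and its decay profile are hypothetical choices made for illustration, and a finite matrix stands in for the operator on $l^2(\mathbb{Z})$.

```python
import numpy as np

def trace_norm(a: np.ndarray) -> float:
    """Trace norm tr(|A|), computed as the sum of the singular values of A."""
    return float(np.linalg.svd(a, compute_uv=False).sum())

def entrywise_l1(a: np.ndarray) -> float:
    """Sum of |(xi_k, A xi_l)| over all matrix elements in the standard basis."""
    return float(np.abs(a).sum())

rng = np.random.default_rng(0)
n = 40
idx = np.arange(n)
# Hypothetical test matrix: random complex entries with polynomial off-diagonal decay.
decay = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :])) ** 3
a = decay * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

abs_trace = abs(np.trace(a))      # |tr(A)|
nuclear = trace_norm(a)           # tr(|A|)
l1_sum = entrywise_l1(a)          # right-hand side of (3.21)
print(abs_trace, nuclear, l1_sum)
```

The three printed numbers should appear in non-decreasing order, which is the finite-dimensional content of (3.21).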
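Lemma 3.7. Let $A$ and $B$ be non-negative bounded operators on $\mathcal{H}$. For a real number $p$ with $0 < p < 1$ there exists a constant $C(p)$ such that
\[ \|A^p - B^p\| \le C(p)\, \|A - B\|^p . \tag{3.23} \]

Proof. Recall that
\[ A^p = \frac{\sin p\pi}{\pi} \int_0^\infty \left( \lambda^{p-1} - \lambda^p (\lambda + A)^{-1} \right) d\lambda . \tag{3.24} \]
We introduce a cutoff in the above integral. Set $\nu = \|A - B\|$. Then,
\[ A^p - B^p = \frac{\sin p\pi}{\pi} \int_\nu^\infty \lambda^p \left( (\lambda + B)^{-1} - (\lambda + A)^{-1} \right) d\lambda + \frac{\sin p\pi}{\pi} \int_0^\nu \left( \lambda^{p-1} - \lambda^p (\lambda + A)^{-1} \right) d\lambda - \frac{\sin p\pi}{\pi} \int_0^\nu \left( \lambda^{p-1} - \lambda^p (\lambda + B)^{-1} \right) d\lambda . \tag{3.25} \]
The first term of (3.25) can be estimated as follows:
\[ \left\| \int_\nu^\infty \lambda^p \left( (\lambda + B)^{-1} - (\lambda + A)^{-1} \right) d\lambda \right\| = \left\| \int_\nu^\infty \lambda^p (\lambda + B)^{-1} (A - B) (\lambda + A)^{-1} d\lambda \right\| \le \|B - A\| \int_\nu^\infty \lambda^p \lambda^{-2}\, d\lambda = \|B - A\| \frac{\nu^{p-1}}{1 - p} = \frac{\|B - A\|^p}{1 - p} . \tag{3.26} \]
Each of the other terms of (3.25) is bounded from above by a constant times $\|B - A\|^p$:
\[ 0 \le \int_0^\nu \left( \lambda^{p-1} - \lambda^p (\lambda + A)^{-1} \right) d\lambda \le \int_0^\nu \lambda^{p-1}\, d\lambda = \frac{\|B - A\|^p}{p} . \tag{3.27} \]
As a consequence,
\[ \|A^p - B^p\| \le \left( \frac{2}{\pi p} + \frac{1}{\pi (1 - p)} \right) \|B - A\|^p . \]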
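A small numerical experiment illustrates the Hölder-type estimate of Lemma 3.7. This is a sketch only: the random positive matrices, their size and the choice $p = 1/2$ are assumptions made for illustration, and the bound used is the one obtained by adding the estimates for the three terms of (3.25).

```python
import numpy as np

def psd_power(a: np.ndarray, p: float) -> np.ndarray:
    """Fractional power of a real symmetric non-negative matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # remove tiny negative eigenvalues caused by rounding
    return (v * w**p) @ v.conj().T

def op_norm(a: np.ndarray) -> float:
    """Operator norm (largest singular value)."""
    return float(np.linalg.norm(a, 2))

def random_psd(rng: np.random.Generator, n: int) -> np.ndarray:
    """Hypothetical test operator: a random non-negative n x n matrix."""
    x = rng.standard_normal((n, n))
    return x @ x.T / n

p = 0.5
# Constant obtained by summing the bounds on the three terms of (3.25).
bound = 2 / (np.pi * p) + 1 / (np.pi * (1 - p))

rng = np.random.default_rng(1)
ratios = []
for _ in range(200):
    a = random_psd(rng, 8)
    eps = 10.0 ** rng.uniform(-6, 0)   # perturbation sizes spread over several decades
    b = a + eps * random_psd(rng, 8)
    diff = op_norm(a - b)
    ratios.append(op_norm(psd_power(a, p) - psd_power(b, p)) / diff**p)

max_ratio = max(ratios)
print(max_ratio, bound)
```

The ratio $\|A^p - B^p\| / \|A - B\|^p$ should stay below the constant for every sample, independently of how small the perturbation is.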
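In our proof of CLT, we also use another CAR algebra $\tilde{\mathcal{A}}^{\mathrm{CAR}}$ with a larger test function space. Set $\tilde{K} = K \oplus K$ and $\tilde{J} = J \oplus -J$. $\tilde{\mathcal{A}}^{\mathrm{CAR}}$ is a unital $C^*$-algebra with generators $\tilde{B}(f)$ ($f \in \tilde{K}$) satisfying
\[ \{\tilde{B}(h_1)^*, \tilde{B}(h_2)\} = (h_1, h_2)_{\tilde{K}}\, 1 , \qquad \tilde{B}(\tilde{J}h)^* = \tilde{B}(h) \]
for $h_1, h_2, h \in \tilde{K}$. We identify $\mathcal{A}^{\mathrm{CAR}}$ with a subalgebra of $\tilde{\mathcal{A}}^{\mathrm{CAR}}$ via the following equation:
\[ B(h) = \tilde{B}(h \oplus 0) . \tag{3.28} \]
Suppose $S$ is a positive operator on $K$ satisfying $JSJ = 1 - S$. Let $P_S$ be the operator on $\tilde{K}$ determined by
\[ P_S = \begin{bmatrix} S & \sqrt{S(1 - S)} \\ \sqrt{S(1 - S)} & 1 - S \end{bmatrix} . \tag{3.29} \]
It is easy to see that $P_S$ is a basis projection on $\tilde{K}$. Let $\tilde{\varphi}_{P_S}$ be the Fock state of $\tilde{\mathcal{A}}^{\mathrm{CAR}}$ associated with $P_S$. The restriction of $\tilde{\varphi}_{P_S}$ to $\mathcal{A}^{\mathrm{CAR}}$ is $\varphi_S$:
\[ \tilde{\varphi}_{P_S}(Q) = \varphi_S(Q) \quad \text{for } Q \text{ in } \mathcal{A}^{\mathrm{CAR}} . \tag{3.30} \]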
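As a numerical illustration of (3.29), the sketch below builds $P_S$ for a finite matrix $S$ with $0 \le S \le 1$ and checks that it is an orthogonal projection whose upper left block is $S$. The matrix $S$ here is a hypothetical random example, and the symmetry condition involving $J$ is not modelled, so only the projection property is tested.

```python
import numpy as np

def purification_projection(s: np.ndarray) -> np.ndarray:
    """Build P_S = [[S, sqrt(S(1-S))], [sqrt(S(1-S)), 1-S]] for Hermitian 0 <= S <= 1."""
    w, v = np.linalg.eigh(s)
    w = np.clip(w, 0.0, 1.0)                      # guard against rounding outside [0, 1]
    root = (v * np.sqrt(w * (1.0 - w))) @ v.conj().T
    s_clean = (v * w) @ v.conj().T
    one = np.eye(s.shape[0])
    return np.block([[s_clean, root], [root, one - s_clean]])

rng = np.random.default_rng(2)
n = 6
# Hypothetical S: random Hermitian matrix with spectrum inside (0, 1).
x = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
q, _ = np.linalg.qr(x)
s = (q * rng.uniform(0.05, 0.95, size=n)) @ q.conj().T

p_s = purification_projection(s)
print(np.linalg.norm(p_s @ p_s - p_s))            # should be close to zero
```

The idempotency follows because $S$ commutes with $\sqrt{S(1-S)}$, so the off-diagonal blocks of $P_S^2$ collapse to $\sqrt{S(1-S)}$ again.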
This procedure of passing to pure states of a larger algebra $\tilde{\mathcal{A}}^{\mathrm{CAR}}$ is referred to as purification.

Proof of Theorem 3.2. We now return to our proof of CLT for quasifree states. Let $E$ be a basis projection on $K$; we show CLT for the Fock state $\varphi_E$. Let $\{e(j) \mid j = 0, \pm 1, \pm 2, \ldots\}$ be the standard basis of $l^2(\mathbb{Z})$, where
\[ e(j) = (e(j)_k) \in l^2(\mathbb{Z}) , \qquad e(j)_k = \delta_{j,k} . \]
Let $p$ be the projection to the subspace of $l^2(\mathbb{Z})$ spanned by $e(j)$ ($j \le 0$) and let $q_N$ be the projection to the subspace spanned by $e(j)$ ($N \le j$). Set $r_N = 1 - p - q_N$. We use the same symbols $p$, $q_N$, $r_N$ for the projections on $K$:
\[ p = \begin{bmatrix} p & 0 \\ 0 & p \end{bmatrix} \]
and so on. We show the following estimate, which implies CLT. For any integer $M$ there exists a constant $C_M$ such that
\[ \|(\varphi_E)_{(-\infty,0] \cup [N,\infty)} - (\varphi_E)_{(-\infty,0]} \otimes (\varphi_E)_{[N,\infty)}\| \le C_M \frac{1}{N^M} , \tag{3.31} \]
where $(\varphi_E)_\Lambda$ is the restriction of $\varphi_E$ to $\mathcal{A}^{\mathrm{CAR}}_\Lambda$.
The state $(\varphi_E)_{(-\infty,0] \cup [N,\infty)}$ should be understood as a quasifree state of the CAR algebra with test function space $(1 - r_N)K$. By abuse of notation we write
\[ (\varphi_E)_{(-\infty,0] \cup [N,\infty)} = \varphi_{(1 - r_N)E(1 - r_N)} . \]
In the same spirit,
\[ (\varphi_E)_{(-\infty,0]} \otimes (\varphi_E)_{[N,\infty)} = \varphi_{pEp + q_N E q_N} . \]
Then, (3.30) leads to
\[ \|(\varphi_E)_{(-\infty,0] \cup [N,\infty)} - (\varphi_E)_{(-\infty,0]} \otimes (\varphi_E)_{[N,\infty)}\| \le \|\varphi_{P_{(1 - r_N)E(1 - r_N)}} - \varphi_{P_{pEp + q_N E q_N}}\| , \tag{3.32} \]
where the right-hand side should be understood as the norm of the difference of pure states for the CAR algebra over $(1 - r_N)K \oplus (1 - r_N)K$. Due to (3.19), it suffices to establish the following estimate:
\[ \|P_{(1 - r_N)E(1 - r_N)} - P_{pEp + q_N E q_N}\|^2_{H.S.} \le C_M \frac{1}{N^M} . \]
Note that
\[ P_{(1 - r_N)E(1 - r_N)} - P_{pEp + q_N E q_N} = \begin{bmatrix} A_N & B_N \\ B_N & -A_N \end{bmatrix} , \]
where
\[ A_N = (1 - r_N)E(1 - r_N) - (pEp + q_N E q_N) \]
and
\[ B_N = \sqrt{(1 - r_N)E r_N E(1 - r_N)} - \sqrt{pE(1 - p)Ep + q_N E(1 - q_N)E q_N} . \]
The operators $\sqrt{(1 - r_N)E r_N E(1 - r_N)}$, $\sqrt{pE(1 - p)Ep + q_N E(1 - q_N)E q_N}$ and $A_N$ are of trace class. It is easy to see that
\[ \|P_{(1 - r_N)E(1 - r_N)} - P_{pEp + q_N E q_N}\|^2_{H.S.} = 2\, \mathrm{Tr}(A_N^2 + B_N^2) . \tag{3.33} \]
To obtain estimates of (3.33) we use Lemmas 3.6 and 3.7.

Recall that the Fourier transform of a multiplication operator $\tilde{D}(x)$ on $L^2([0, 2\pi])$ gives rise to a convolution operator $D$:
\[ (Df)_j = \sum_k D(j - k) f_k , \]
where $f = (f_j) \in l^2(\mathbb{Z})$ and $D(j)$ is the Fourier coefficient of $\tilde{D}(x)$. On the other hand, the integration by parts formula implies that
\[ |D(j)| \le (2\pi)^{-1} \int \left| \frac{d^M}{dx^M} \tilde{D}(x) \right| dx \; |j|^{-M} . \tag{3.34} \]
Combined with Lemma 3.6, the estimate (3.34) tells us that $|(1 - p)Ep|$ is of trace class:
\[ \mathrm{Tr}(|(1 - p)Ep|) = \mathrm{Tr}(|(1 - q_N)E q_N|) \le C \sum_{k \le 0,\ 1 \le l} |k - l|^{-M} < \infty . \tag{3.35} \]
Note that in the first identity of (3.35) we used the fact that both the trace and the operator $E$ are translationally invariant.

Let us consider $\mathrm{Tr}(A_N^2)$. As $1 - r_N = p + q_N$ and $A_N = pEq_N + q_N Ep$, for any integer $M$ we can find constants $C_1$, $C_2$ and $C_3$ such that
\[ \mathrm{Tr}(A_N^2) = \mathrm{Tr}(pEq_N Ep + q_N EpEq_N) \le C_1 \sum_{k \le 0} \sum_{N \le l} |k - l|^{-M} \le C_2 \sum_{N \le l} |l|^{-(M-1)} \le C_3 |N|^{-(M-2)} . \tag{3.36} \]

Next we consider $\mathrm{Tr}(B_N^2)$. We have seen that $\sqrt{pE(1 - p)Ep}$ and $\sqrt{q_N E(1 - q_N)E q_N}$ are of trace class, while $\sqrt{(1 - r_N)E r_N E(1 - r_N)}$ is of finite rank. Then
\[ \mathrm{Tr}(B_N^2) \le \left\| \sqrt{(1 - r_N)E r_N E(1 - r_N)} - \sqrt{pE(1 - p)Ep + q_N E(1 - q_N)E q_N} \right\| \times \mathrm{Tr}\left( \sqrt{q_N E(1 - q_N)E q_N} + \sqrt{pE(1 - p)Ep} + \sqrt{(1 - r_N)E r_N E(1 - r_N)} \right) . \tag{3.37} \]
Due to Lemma 3.7 we have
\[ \left\| \sqrt{(1 - r_N)E r_N E(1 - r_N)} - \sqrt{pE(1 - p)Ep + q_N E(1 - q_N)E q_N} \right\| \le C_2 \left\| (1 - r_N)E r_N E(1 - r_N) - pE(1 - p)Ep - q_N E(1 - q_N)E q_N \right\|^{1/2} , \tag{3.38} \]
\[ (1 - r_N)E r_N E(1 - r_N) - pE(1 - p)Ep - q_N E(1 - q_N)E q_N = -pEq_N Ep - q_N EpEq_N + pE(1 - p - q_N)Eq_N + q_N E(1 - p - q_N)Ep . \tag{3.39} \]
The norm of each operator in (3.39) decays faster than any polynomial order as $N$ tends to $\infty$. Thus, for any integer $M$,
\[ \left\| \sqrt{(1 - r_N)E r_N E(1 - r_N)} - \sqrt{pE(1 - p)Ep + q_N E(1 - q_N)E q_N} \right\| \le C_M N^{-M} . \tag{3.40} \]
We claim that there exists a constant $C$ such that
\[ \mathrm{Tr}\left( \sqrt{q_N E(1 - q_N)E q_N} \right) + \mathrm{Tr}\left( \sqrt{(1 - r_N)E r_N E(1 - r_N)} \right) \le C \tag{3.41} \]
for any $N$. In fact both terms of (3.41) are finite due to Lemma 3.6, and the first term does not depend on $N$ because of the translational invariance of the trace. We consider the last term in (3.41):
\[ \mathrm{Tr}\left( \sqrt{(1 - r_N)E r_N E(1 - r_N)} \right) \le \sum_{a,b} |(\eta_a, r_N E(1 - r_N)\eta_b)| \le \sum_{a,b} \left( |(\eta_a, (1 - p)Ep\,\eta_b)| + |(\eta_a, (1 - q_N)E q_N \eta_b)| + 2|(\eta_a, pEq_N \eta_b)| \right) , \tag{3.42} \]
where $\{\eta_a = e(i) \oplus e(j)\}$ is a basis of $K$ and we used Lemma 3.6 again. The last term on the right-hand side of (3.42) vanishes as $N$ tends to $\infty$ and the other terms are independent of $N$ (due to translational invariance again), which implies the inequality (3.41). By (3.40) and (3.41), we obtain
\[ \|P_{(1 - r_N)E(1 - r_N)} - P_{pEp + q_N E q_N}\|^2_{H.S.} = 2\, \mathrm{Tr}(A_N^2 + B_N^2) \le C_M N^{-M} . \]
This inequality shows
\[ |\varphi_E(Q\tau_N(R)) - \varphi_E(Q)\varphi_E(R)| \le \|Q\|\, \|R\|\, C_M N^{-M} \]
for $Q \in \mathcal{A}^{\mathrm{CAR}}_{(-\infty,0]}$ and $R \in \mathcal{A}^{\mathrm{CAR}}_{[1,\infty)}$. Thus CLT is valid for $\varphi_E$.

Now consider quasifree states $\varphi_S$ when $S$ is strictly positive:
\[ 0 < \epsilon < S < 1 - \epsilon < 1 . \]
We can show CLT in this case with our previous argument for pure Fock states. The only difference is the estimate of $\mathrm{Tr}(B_N^2)$ in (3.33). In the present situation,
\[ B_N = \sqrt{(1 - r_N)S(1 - r_N) - (1 - r_N)S(1 - r_N)S(1 - r_N)} - \sqrt{pSp - pSpSp + q_N S q_N - q_N S q_N S q_N} . \tag{3.43} \]
Both $(1 - r_N)S(1 - r_N)$ and $pSp + q_N S q_N$ are strictly positive on the range of $1 - r_N$. When two operators $A$ and $B$ are strictly positive, $0 < \epsilon < A, B$, we have
\[ \sqrt{A} - \sqrt{B} = \pi^{-1} \int_0^\infty \lambda^{1/2} \frac{1}{\lambda + A} (A - B) \frac{1}{\lambda + B}\, d\lambda . \]
Using this identity, we obtain
\[ \mathrm{Tr}(B_N^2) \le \pi^{-1} \int^{\infty} \lambda^{-3/2}\, d\lambda \; \|A - B\|_{H.S.} , \tag{3.44} \]
where
\[ A = (1 - r_N)S(1 - r_N) - (1 - r_N)S(1 - r_N)S(1 - r_N) , \qquad B = pSp - pSpSp + q_N S q_N - q_N S q_N S q_N . \]
The difference $A - B$ is a sum of terms each containing $pSq_N$ or $q_N Sp$ as a factor. As before, $pSq_N$ is of trace class and
\[ \mathrm{Tr}(|pSq_N|) \le C N^{-M} . \tag{3.45} \]
Thus
\[ \mathrm{Tr}(B_N^2) \le C' \,\mathrm{Tr}(|pSq_N|) \le C_M N^{-M} . \tag{3.46} \]
This estimate gives rise to CLT.

4. Ground States of the XY Model

We denote the finite volume Hamiltonian of the one-dimensional XY model by
\[ H_{[a,b]} = - \sum_{j=a}^{b-1} \left\{ (1 + \gamma)\sigma_x^{(j)} \sigma_x^{(j+1)} + (1 - \gamma)\sigma_y^{(j)} \sigma_y^{(j+1)} + 2\lambda \sigma_z^{(j)} \right\} . \tag{4.1} \]
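For a short chain the Hamiltonian (4.1) can be written out as an explicit matrix. The sketch below is illustrative only: the chain length, the parameter values and the helper names (`site_op`, `xy_hamiltonian`) are assumptions, sites are numbered from 0, and the product of all $\sigma_z$ is used as a finite-chain stand-in for the automorphism $\Theta$.

```python
import numpy as np
from functools import reduce

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def site_op(op: np.ndarray, j: int, n: int) -> np.ndarray:
    """Operator acting as `op` on site j (0-based) of an n-site chain, identity elsewhere."""
    factors = [I2] * n
    factors[j] = op
    return reduce(np.kron, factors)

def xy_hamiltonian(n: int, gamma: float, lam: float) -> np.ndarray:
    """Finite volume XY Hamiltonian (4.1) on sites 0..n-1; the sum runs over j = 0..n-2."""
    h = np.zeros((2**n, 2**n), dtype=complex)
    for j in range(n - 1):
        h -= (1 + gamma) * site_op(SX, j, n) @ site_op(SX, j + 1, n)
        h -= (1 - gamma) * site_op(SY, j, n) @ site_op(SY, j + 1, n)
        h -= 2 * lam * site_op(SZ, j, n)
    return h

n = 4
h = xy_hamiltonian(n, gamma=0.5, lam=0.3)        # hypothetical parameter values
parity = reduce(np.kron, [SZ] * n)               # product of sigma_z over all sites
print(np.linalg.eigvalsh(h)[:2])                 # two lowest eigenvalues of the finite chain
```

Conjugation by the product of all $\sigma_z$ flips the sign of each $\sigma_x$ and $\sigma_y$, so every term of (4.1) is left unchanged; this is the finite-chain version of the $\Theta$ invariance of the Hamiltonian.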
The following limit exists in the norm topology of $\mathcal{A}$ and defines the time evolution $\alpha_t$ of the infinite system in the Heisenberg picture:
\[ \alpha_t(Q) = \lim_{N \to \infty} e^{itH_{[-N,N]}} Q e^{-itH_{[-N,N]}} \]
for $Q$ in $\mathcal{A}$. The generator of the one-parameter group $\alpha_t$ of automorphisms is denoted by $\delta$:
\[ \delta(Q) = \frac{d}{dt} \alpha_t(Q) \Big|_{t=0} = i[H_{[-\infty,\infty]}, Q] \]
for $Q$ in $\mathcal{A}_{loc}$. A state $\varphi$ is a ground state for $\alpha_t$ if and only if
\[ -i\varphi(Q^* \delta(Q)) \ge 0 \]
for $Q$ in $\mathcal{A}_{loc}$. (See 5.3.18 and subsequent sections in [8].) The complete set of the ground states of the XY model is known.

Theorem 4.1. (i) The ground state of the XY model is unique if $|\lambda| \ge 1$ or $\gamma = 0$.
(ii) If $|\lambda| < 1$, $\gamma \ne 0$ and $(\lambda, \gamma) \ne (0, \pm 1)$, there exist precisely two pure ground states of the XY model.
(iii) The spectral gap of the Hamiltonian closes if $|\lambda| = 1$ or if $|\lambda| < 1$ and $\gamma = 0$.

See [5, Theorem 1]. Note that in the Ising model, $(\lambda, \gamma) = (0, \pm 1)$, any pure ground state is a product state and CLT is trivial.

At a formal level, the Jordan–Wigner transformation maps the XY model to a quasifree Hamiltonian of fermions, and the ground states of the XY model are mapped into pure Fock states. However, this formal equivalence is not correct physically. For example, the ground state is always unique for the fermionic counterpart, while it is not unique for the XY model, as we mentioned above. Another physical difference is the ergodic behavior of the time evolution. In [5] we have shown that the time evolution is not asymptotically abelian for the XY model. For fermions the asymptotic abelian property is obvious.
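To obtain CLT we use the method of [4]. We enlarge the algebra $\mathcal{A}$ to another algebra $\tilde{\mathcal{A}}$ by adding a new selfadjoint unitary element $T$ having the following property:
\[ T^2 = 1 , \qquad T^* = T , \qquad TQT = \Theta_-(Q) \quad \text{for } Q \text{ in } \mathcal{A} , \tag{4.2} \]
where
\[ \Theta_-(Q) = \lim_{N \to \infty} \left( \prod_{j=-N}^{0} \sigma_z^{(j)} \right) Q \left( \prod_{j=-N}^{0} \sigma_z^{(j)} \right) . \tag{4.3} \]
$\tilde{\mathcal{A}}$ is the crossed product by the $\mathbb{Z}_2$ action via $\Theta_-$. Obviously $\tilde{\mathcal{A}} = \mathcal{A} \cup \mathcal{A}T$. We also introduce the automorphism $\Theta$ via the formula
\[ \Theta(Q) = \lim_{N \to \infty} \left( \prod_{j=-N}^{N} \sigma_z^{(j)} \right) Q \left( \prod_{j=-N}^{N} \sigma_z^{(j)} \right) . \tag{4.4} \]
Set
\[ \mathcal{A}_\pm = \{ Q \in \mathcal{A} \mid \Theta(Q) = \pm Q \} . \]
Then we can realize the creation and annihilation operators of fermions inside $\tilde{\mathcal{A}}$ as follows:
\[ c_j^* = T S_j (\sigma_x^{(j)} + i\sigma_y^{(j)})/2 , \qquad c_j = T S_j (\sigma_x^{(j)} - i\sigma_y^{(j)})/2 , \tag{4.5} \]
where
\[ S_j = \begin{cases} \sigma_z^{(1)} \cdots \sigma_z^{(j-1)} & \text{for } j \ge 2 , \\ 1 & \text{for } j = 1 , \\ \sigma_z^{(-j)} \cdots \sigma_z^{(0)} & \text{for } j \le -1 . \end{cases} \tag{4.6} \]
This gives rise to the canonical anticommutation relations (3.1), and the $C^*$-subalgebra generated by $c_j^*$ and $c_j$ is denoted by $\mathcal{A}^{\mathrm{CAR}}$ again. We extend $\Theta_-$ and $\Theta$ to $\tilde{\mathcal{A}}$ via the following formulae:
\[ \Theta_-(T) = T , \qquad \Theta(T) = T . \]
Then, it is easy to see that
\[ \mathcal{A}_+ = \mathcal{A}^{\mathrm{CAR}}_+ , \qquad \mathcal{A}_- = \mathcal{A}^{\mathrm{CAR}}_- T . \]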
Proof of Theorem 1.4. In what follows, we assume $(\lambda, \gamma) \ne (0, \pm 1)$. As the Hamiltonian of the XY model is $\Theta$ invariant, the ground state is $\Theta$ invariant when it is unique. In fact, the $\Theta$ invariant ground state of the XY model is unique for any $\lambda$ and $\gamma$, and we denote it by $\psi$. When restricted to $\mathcal{A}_+$, $\psi$ is a Fock state $\varphi_{E(\lambda,\gamma)}$ introduced in the previous section:
\[ \psi|_{\mathcal{A}_+} = \varphi_{E(\lambda,\gamma)}|_{\mathcal{A}^{\mathrm{CAR}}_+} . \tag{4.7} \]
The Fourier transform $\tilde{E}(\lambda, \gamma)$ of the operator $E(\lambda, \gamma)$ is a multiplication operator on $\tilde{K}$, and the explicit form of $\tilde{E}(\lambda, \gamma)$ can be found in [5]. We do not need the precise form of $\tilde{E}(\lambda, \gamma)$, as we only use the fact that the matrix elements of $\tilde{E}(\lambda, \gamma)$ are $C^\infty$ functions.

Consider the case when $|\lambda| > 1$. By uniqueness of the ground state, $\psi$ is $\Theta$ invariant. Due to (4.7) and the results of the previous section, we obtain CLT for any local observable $Q$ in $\mathcal{A}_+$. Thus we concentrate on $\mathcal{A}_-$. For simplicity we set $E(\lambda, \gamma) = E$ from now on. Let $\{\pi_E(\mathcal{A}^{\mathrm{CAR}}), \mathcal{H}_E, \Omega_E\}$ be the GNS representation associated with $\varphi_E$. If $Q_1 T$ is an element of $\mathcal{A}_-$ localized in $(-\infty, 0]$ and $Q_2 T$ is in $\mathcal{A}_-$ localized in $[1, \infty)$ ($Q_1, Q_2 \in \mathcal{A}^{\mathrm{CAR}}_-$), the expectation value $\psi(Q_1 T \tau_k(Q_2 T))$ is written in terms of $\mathcal{A}^{\mathrm{CAR}}$ as follows:
\[ \psi(Q_1 T \tau_k(Q_2 T)) = (\Omega_E, \pi_E(Q_1 \tau_k(Q_2) S_k)\Omega_E) . \tag{4.8} \]
By use of Lemma 3.6, we see that both $\theta_- E\theta_- - E$ and $\theta_k^{(+)} E\theta_k^{(+)} - E$ are of trace class. Consider the unitary implementors $\Gamma(\Theta_-)$ and $\Gamma(\Theta_k^{(+)})$ for $\Theta_-$ and $\Theta_k^{(+)}$. As $\mathrm{Ad}(S_k)$ and $\Theta_- \circ \Theta_k^{(+)} \circ \Theta$ give rise to the same involutive automorphism of $\mathcal{A}^{\mathrm{CAR}}$, we conclude that
\[ \pi_E(S_k) = \pm\Gamma(\Theta_-)\Gamma(\Theta_k^{(+)})\Gamma(\Theta) \tag{4.9} \]
due to the irreducibility of the Fock representation. Consider the unitaries $\hat{R}(\theta_- E\theta_-/E)$ and $\hat{R}(\theta_k^{(+)} E\theta_k^{(+)}/E)$, where $\hat{R}(E'/E)$ is the unitary defined in (9.3) on p. 424 of [2]. As $\theta_- E\theta_- - E$ and $\theta_k^{(+)} E\theta_k^{(+)} - E$ are of trace class, both $\hat{R}(\theta_- E\theta_-/E)$ and $\hat{R}(\theta_k^{(+)} E\theta_k^{(+)}/E)$ belong to $\mathcal{A}^{\mathrm{CAR}}$. Furthermore, they satisfy
\[ \hat{R}(\theta_- E\theta_-/E)\Omega_E = \Gamma(\Theta_-)\Omega_E , \qquad \hat{R}(\theta_k^{(+)} E\theta_k^{(+)}/E)\Omega_E = \Gamma(\Theta_k^{(+)})\Omega_E , \qquad \Gamma(\Theta)\Omega_E = \Omega_E . \tag{4.10} \]
On the other hand, as we know the rapid uniform decay of two point correlation functions of observables localized in $(-\infty, 0]$ and $[1, \infty)$ for Fock states, the following localization property of $\hat{R}(\theta_- E\theta_-/E)$ and $\hat{R}(\theta_k^{(+)} E\theta_k^{(+)}/E)$ suffices to show CLT. We set
\[ R_1 = \hat{R}(\theta_- E\theta_-/E) , \qquad R_2 = \hat{R}(\theta_k^{(+)} E\theta_k^{(+)}/E) . \tag{4.11} \]

Lemma 4.2. For any positive integer $M$, there exist unitaries $R_1^{(n)}$ in $\mathcal{A}^{\mathrm{CAR}}_{(-\infty,n]}$, $R_2^{(n)}$ in $\mathcal{A}^{\mathrm{CAR}}_{[k-n,\infty)}$ and a constant $C_M$ such that
\[ \|R_a - R_a^{(n)}\| \le C_M n^{-M} \quad \text{for } a = 1, 2 . \tag{4.12} \]

We postpone the proof of Lemma 4.2 till the end of this section.

We sketch our proof of CLT of the XY model for the case $|\lambda| < 1$, $\gamma \ne 0$ and $(\lambda, \gamma) \ne (0, \pm 1)$. In this case, there exist two translationally invariant pure ground states $\psi_\pm$. The $\Theta$ invariant ground state $\psi$ is unique and $\psi = \frac{1}{2}(\psi_+ + \psi_-)$. In terms of Fock states of $\mathcal{A}^{\mathrm{CAR}}$, $\psi_\pm$ are described as follows:
\[ \psi_\pm(Q) = (\Omega_E, \pi_E(Q) \tfrac{1}{2}(1 \pm \Gamma(\Theta_-)T)\Omega_E) \quad \text{for } Q \text{ in } \mathcal{A} . \tag{4.13} \]
This time, $\Gamma(\Theta_-)$ belongs to $\mathcal{A}^{\mathrm{CAR}}_-$. Lemma 4.2 is still valid. As a consequence, we obtain CLT as before.

Before we start the proof of Lemma 4.2, we prepare a few results on operators on $l^2(\mathbb{Z})$. By $X$ we denote the position operator on $l^2(\mathbb{Z})$ determined by
\[ X(f_j) = (j f_j) \quad \text{for } f = (f_j) \text{ in } l^2(\mathbb{Z}) . \]
We say that a vector $f = (f_j)$ is rapidly decreasing if $(|X| + 1)^n f$ belongs to $l^2(\mathbb{Z})$ for any positive integer $n$.

Lemma 4.3. Let $A$ be a bounded operator on $l^2(\mathbb{Z})$ such that the matrix elements $A_{ij} = (e(i), Ae(j))$ satisfy
\[ \sum_{i \in \mathbb{Z}} \sum_{j \in \mathbb{Z}} |A_{ij}|\, |j|^m < \infty \tag{4.14} \]
for any positive integer $m$.
(i) For any positive integer $m$, $(|X| + 1)^m |A| (|X| + 1)^m$ is of trace class.
(ii) Any eigenvector of $|A|$ is rapidly decreasing.
(iii) For any positive integer $m$, there exists a positive finite range operator $A_N(m)$ (localized in the interval $[-N, N]$) such that
\[ \|\, |A| - A_N(m) \|_{tr} \le C N^{-m} . \tag{4.15} \]

Proof of Lemma 4.3. The proof of (i) is similar to that of Lemma 3.6, so we do not give it here. Let $\xi$ be an eigenvector of $|A|$ with an eigenvalue $\epsilon$, $|A|\xi = \epsilon\xi$. Then
\[ (|X| + 1)^m |A|\xi = \epsilon (|X| + 1)^m \xi . \]
The left-hand side is in $l^2(\mathbb{Z})$ for any $m$. Thus $\xi$ is rapidly decreasing.

To prove (iii), we first recall that for any bounded operator $Q$, $(|X| + 1)^{-2} Q (|X| + 1)^{-2}$ is of trace class. Let $q_N$ be the projection defined in the previous section. We set
\[ B(m) = (|X| + 1)^m |A| (|X| + 1)^m \quad \text{and} \quad A_N(m) = (1 - q_N)|A|(1 - q_N) . \]
As $B(m)$ is of trace class by our assumption, we have
\[ \|A_N(m) - |A|\, \|_{\mathrm{Tr}} = \|(|X| + 1)^{-m} (q_N B(m) q_N - q_N B(m) - B(m) q_N)(|X| + 1)^{-m}\|_{\mathrm{Tr}} \le 3 \|(|X| + 1)^{-m} q_N\|\, \|B(m)\|_{\mathrm{Tr}} . \tag{4.16} \]
By definition, if $N$ is positive, $\|(|X| + 1)^{-m} q_N\| \le (N + 1)^{-m}$. This concludes the proof of (4.15).
Finally we consider the localization property of $R_1$ and $R_2$ in (4.12). We examine $R_1$ only; $R_2$ can be treated in the same way. Recall that $R_1 = \hat{R}(\theta_- E\theta_-/E)$ has the following form. Set $\sin\phi = |\theta_- E\theta_- - E|$. The operators $\sin\phi$ and $\phi$ ($0 \le \phi \le \pi/2$) are of trace class due to Lemma 3.6. Note that $\sin^2\phi$ commutes with $E$ and $\theta_- E\theta_-$. Consider the spectral resolution of $\phi$:
\[ \phi = \int_0^{\pi/2} x\, dF(x) . \]
As $\sin\phi$ is compact, the spectrum of $\phi$ is discrete. In particular the multiplicity of the eigenvalue $\pi/2$ is finite. Moreover the eigenspace for $\pi/2$ is characterized by one of the following conditions:
\[ E\xi = \xi , \quad \theta_- E\theta_- \xi = 0 \]
or
\[ E\xi = 0 , \quad \theta_- E\theta_- \xi = \xi . \]
Due to this fact, we conclude that
\[ \phi(\cos\phi)^{-1} [E, \theta_- E\theta_-] = \phi(\cos\phi)^{-1} (1 - F(\{\pi/2\})) [E, \theta_- E\theta_-] . \]
Thus we may find a constant $C$ such that
\[ \phi(\cos\phi)^{-1} (1 - F(\{\pi/2\})) \le C \sin\phi . \]
This inequality tells us that the operator $\phi(\cos\phi)^{-1}(1 - F(\{\pi/2\}))$ satisfies the assumption of Lemma 4.3, as $\sin\phi$ satisfies the same condition. Taking these facts into account, we define the trace class operator $H(\theta_- E\theta_-/E)$ via the following equation:
\[ H(\theta_- E\theta_-/E) = 2i\phi(\cos\phi)^{-1} \left[ E, \frac{E - \theta_- E\theta_-}{|E - \theta_- E\theta_-|} \right] . \tag{4.17} \]
With this notation, $R_1$ is written as
\[ R_1 = e^{\frac{i}{2}(B, H(\theta_- E\theta_-/E)B)} \prod_a B(\xi_a) , \tag{4.18} \]
where $\xi_a$ are eigenvectors of $\sin\phi$ with the eigenvalue 1 and $(B, H(\theta_- E\theta_-/E)B)$ is defined in [2]. Due to Lemma 4.3(ii), the eigenvector $\xi$ of $\sin\phi$ is rapidly decreasing, so the fermion operator $B(\xi)$ is well localized. To complete our proof of Lemma 4.2, it suffices to prove the following inequality:
\[ \|(B, H(\theta_- E\theta_-/E)B) - (B, (1 - q_N)H(\theta_- E\theta_-/E)(1 - q_N)B)\| \le C_M N^{-M} . \tag{4.19} \]
Note that $(B, (1 - q_N)H(\theta_- E\theta_-/E)(1 - q_N)B)$ is in $\mathcal{A}^{\mathrm{CAR}}_{(-\infty,N]}$. On the other hand,
\[ \|(B, H(\theta_- E\theta_-/E)B) - (B, (1 - q_N)H(\theta_- E\theta_-/E)(1 - q_N)B)\| \le \|H(\theta_- E\theta_-/E) - (1 - q_N)H(\theta_- E\theta_-/E)(1 - q_N)\|_{\mathrm{Tr}} . \tag{4.20} \]
(See [2, (7.11)].) Now we claim that
\[ \|H(\theta_- E\theta_-/E) - (1 - q_N)H(\theta_- E\theta_-/E)(1 - q_N)\|_{\mathrm{Tr}} \le C_M N^{-M} . \tag{4.21} \]
By Lemma 4.3, $(|X| + 1)^m \sin\phi\, (|X| + 1)^m$ and the following operator are of trace class for any positive $m$:
\[ (|X| + 1)^m H(\theta_- E\theta_-/E)(|X| + 1)^m . \]
We can carry out a computation similar to (4.16), and the left-hand side of (4.21) is bounded from above by
\[ 3 \|q_N (|X| + 1)^{-m}\|\, \|(|X| + 1)^m H(\theta_- E\theta_-/E)(|X| + 1)^m\|_{\mathrm{Tr}} , \]
which shows the inequality (4.21).

References
[1] H. Araki, Gibbs states of the one-dimensional quantum spin chain, Commun. Math. Phys. 115 (1968) 477–528.
[2] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. Res. Inst. Math. Sci. 6 (1970/71) 385–442.
[3] H. Araki, Bogoliubov automorphisms and Fock representations of canonical anticommutation relations, in Operator Algebras and Mathematical Physics (Iowa City, Iowa, 1985), pp. 23–141, Contemp. Math. 62, Amer. Math. Soc., Providence, RI, 1987.
[4] H. Araki, On the XY-model on two-sided infinite chain, Publ. Res. Inst. Math. Sci. 20(2) (1984) 277–296.
[5] H. Araki and T. Matsui, Ground states of the XY-model, Comm. Math. Phys. 101(2) (1985) 213–245.
[6] E. Bolthausen, On the central limit theorem for stationary mixing random fields, Ann. Prob. 4 (1982) 1047–1050.
[7] O. Bratteli and D. Robinson, Operator Algebras and Quantum Statistical Mechanics I, 2nd edition, Springer, 1987.
[8] O. Bratteli and D. Robinson, Operator Algebras and Quantum Statistical Mechanics II, 2nd edition, Springer, 1997.
[9] H. O. Georgii, Gibbs Measures and Phase Transitions, de Gruyter Studies in Mathematics, 9, Walter de Gruyter, Berlin, 1988.
[10] D. Goderis, A. Verbeure and P. Vets, Noncommutative central limits, Prob. Th. Related Fields 82 (1989) 527–544.
[11] D. Goderis and P. Vets, Central limit theorem for mixing quantum systems and the CCR-algebra of fluctuations, Commun. Math. Phys. 122 (1989) 249–265.
[12] D. Goderis, A. Verbeure and P. Vets, Dynamics of fluctuations for quantum lattice systems, Commun. Math. Phys. 128 (1990) 533–549.
[13] V. Golodets and S. V. Neshveyev, Gibbs states for AF-algebras, J. Math. Phys. 39 (1998) 6329–6344.
[14] K. Hepp and E. Lieb, On the superradiant phase transition for molecules in a quantized radiation field: the Dicke Maser model, Ann. Physics 76 (1973) 360–404.
[15] T. Matsui, On non-commutative Ruelle transfer operator, Rev. Math. Phys. 13 (2001) 1183–1201.
[16] C. Newman, A general limit theorem for FKG systems, Commun. Math. Phys. 91 (1983) 75–80.
[17] D. Ruelle, Statistical mechanics of a one-dimensional lattice gas, Commun. Math. Phys. 6 (1968) 267–278.
[18] R. T. Powers and E. Størmer, Free states of the canonical anticommutation relations, Commun. Math. Phys. 16 (1970) 1–33.
[19] A. Soshnikov, Determinantal random point fields, Russian Math. Surveys 55(5) (2000) 923–975.
[20] T. Shirai and Y. Takahashi, Random point fields associated with certain Fredholm determinants I: fermion, Poisson and boson point processes, preprint, Tokyo Institute of Technology.
[21] T. Shirai and Y. Takahashi, Random point fields associated with certain Fredholm determinants II: fermion shifts and their ergodic and Gibbs properties, preprint, Tokyo Institute of Technology.
[22] H. Spohn, Interacting Brownian particles: a study of Dyson's model, in Hydrodynamic Behavior and Interacting Particle Systems, IMA Vol. Math. Appl., 9, Springer, New York, 1987, pp. 151–179.
[23] W. F. Wreszinski, Fluctuations in some mean-field models in quantum statistics, Helv. Phys. Acta 46 (1973/74) 844–868.
August 22, 2002 15:51 WSPC/148-RMP
00129
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 701–707 © World Scientific Publishing Company
HOW SHOULD ONE DEFINE ENTROPY PRODUCTION FOR NONEQUILIBRIUM QUANTUM SPIN SYSTEMS?
DAVID RUELLE Institut Des Hautes Etudes Scientifiques, 91440 Bures sur Yvette, France
[email protected] Received 5 July 2001
This paper discusses entropy production in nonequilibrium steady states for infinite quantum spin systems. Rigorous results have been obtained recently in this area, but a physical discussion shows that some questions of principle remain to be clarified. Keywords: Statistical mechanics; nonequilibrium; entropy production; quantum spin systems; reservoirs.
1. Introduction

Recent papers by Ruelle [4, 5], and Jakšić and Pillet [3] have discussed the nonequilibrium statistical mechanics of infinite quantum spin systems, and in particular the positivity of entropy production. Mathematically, these papers are based on the treasure of results accumulated in the two volumes of Operator algebras and quantum statistical mechanics by Bratteli and Robinson [2]. In particular the concepts of relative modular operator and relative entropy, developed by Huzihiro Araki [1], turn out to play an essential role (see [3]). Clearly, the current surge of activity in nonequilibrium statistical mechanics is going to make more demands on operator algebras, and on the basic structural facts discovered about these algebras by Tomita, Takesaki, and Araki. The present paper is however less concerned with mathematics than with the physical question of how to define entropy production. We shall try to find out what is likely to be true or not true in this area, and therefore which theorems one should attempt to prove.
2. A Formula for the Entropy Production in a Finite System

In this section we consider a finite system described by a density matrix $\psi$ on a finite-dimensional Hilbert space $\mathcal{H}$. The (von Neumann) entropy associated with $\psi$ is
\[ S(\psi) = -\mathrm{Tr}\, \psi \log \psi . \]
In the presence of a time evolution defined by unitary operators $U(t)$ on $\mathcal{H}$ we may define
\[ \psi(t) = U(t)\psi U(-t) . \]
It is clear and well known that the entropy $S(\psi(t))$ is independent of $t$: there is no entropy production in this setup. To understand entropy production we have to think of a large system (the universe) of which we observe a small part. By virtue of the time evolution, there are correlations between the state of the small system, and parts of the large system that are more and more remote. After a while these correlations are forgotten, or equivalently entropy is created. This is basically the way entropy production was understood by Boltzmann.

We shall follow this way of thinking, and consider that our system is composed of several subsystems labelled by an index $a = 0, 1, \ldots$ Correspondingly $\mathcal{H} = \bigotimes_{a \ge 0} \mathcal{H}_a$, and we assume that the $U(t)$ form a one-parameter group of unitary transformations of $\mathcal{H}$, with $U(t) = e^{-iHt}$. We can now define a density matrix $\psi_a(t)$ on $\mathcal{H}_a$ as a partial trace:
\[ \psi_a(t) = \mathrm{Tr}_{\mathcal{H}_{\setminus a}} \psi(t) , \]
where $\mathcal{H}_{\setminus a} = \bigotimes_{b \ne a} \mathcal{H}_b$. The entropies
\[ S_a(t) = -\mathrm{Tr}_{\mathcal{H}_a} \psi_a(t) \log \psi_a(t) \]
may depend on $t$, and we define the (rate of) entropy production $e = e(t)$ as
\[ e = \frac{d}{dt} \sum_{a \ge 0} S_a(t) = \frac{d}{dt} \left( \sum_{a \ge 0} S_a(t) - S(\psi(t)) \right) . \]
This is the rate of change of entropy associated with the decomposition of the system described by $\psi(t)$ into the subsystems described by $\psi_a(t)$, $a \ge 0$. Note that by the subadditivity of the entropy
\[ \sum_{a \ge 0} S_a(t) - S(\psi(t)) \ge 0 . \]
This positive quantity may be viewed as the information lost about the state $\psi(t)$ of the system when we cut it into the subsystems labelled by $a = 0, 1, \ldots$ In the large system limit where correlations move away and disappear at infinity, we expect $\sum_{a \ge 0} S_a(t) - S(\psi(t))$ to be an increasing function of $t$, so that $e \ge 0$ (at least in the average). But for the moment we consider finite systems, i.e. we keep $\mathcal{H}$ finite dimensional, and we look more carefully at the expression for the entropy production. Let us write
\[ H = \sum_{a \ge 0} H_a + h , \]
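where
\[ H_a = \mathbf{1}_{\setminus a} \otimes \hat{H}_a \]
and $\mathbf{1}_{\setminus a}$ is the unit operator on $\mathcal{H}_{\setminus a}$. (Note that the choice of $h$, $\hat{H}_a$ is not unique; one may in particular take $h = H$ and all $H_a = 0$.) Then, to first order in $dt$,
\[ \psi(t + dt) = e^{-iH dt} \psi(t) e^{iH dt} = \psi(t) - i\, dt\, [H, \psi(t)] = \psi(t) - i\, dt \Big[ \sum_{a \ge 0} H_a + h, \psi(t) \Big] , \]
hence
\[ \psi_a(t + dt) = \psi_a(t) - i\, dt\, [\hat{H}_a, \psi_a(t)] - i\, dt\, \mathrm{Tr}_{\mathcal{H}_{\setminus a}} [h, \psi(t)] . \]
Therefore, assuming that $\psi(t)$ is invertible so that the log is well defined,
\[ S_a(t + dt) = -\mathrm{Tr}_{\mathcal{H}_a} \psi_a(t + dt) \log \psi_a(t + dt) = \sigma_1 + \sigma_2 , \]
where
\[ \sigma_1 = -\mathrm{Tr}_{\mathcal{H}_a} \psi_a(t) \log\big( \psi_a(t) - i\, dt\, [\hat{H}_a, \psi_a(t)] - i\, dt\, \mathrm{Tr}_{\mathcal{H}_{\setminus a}} [h, \psi(t)] \big) \]
\[ \quad = -\mathrm{Tr}_{\mathcal{H}_a} \psi_a(t) \log \psi_a(t) + i\, dt\, \mathrm{Tr}_{\mathcal{H}_a} [\hat{H}_a, \psi_a(t)] + i\, dt\, \mathrm{Tr}_{\mathcal{H}_a} \mathrm{Tr}_{\mathcal{H}_{\setminus a}} [h, \psi(t)] \]
\[ \quad = -\mathrm{Tr}_{\mathcal{H}_a} \psi_a(t) \log \psi_a(t) , \]
\[ \sigma_2 = i\, dt\, \mathrm{Tr}_{\mathcal{H}_a} \big( [\hat{H}_a, \psi_a(t)] + \mathrm{Tr}_{\mathcal{H}_{\setminus a}} [h, \psi(t)] \big) \log \psi_a(t) \]
\[ \quad = i\, dt\, \mathrm{Tr}_{\mathcal{H}_a} \big( \mathrm{Tr}_{\mathcal{H}_{\setminus a}} [h, \psi(t)] \big) \log \psi_a(t) \]
\[ \quad = i\, dt\, \mathrm{Tr}_{\mathcal{H}} \big( [h, \psi(t)] (\mathbf{1}_{\setminus a} \otimes \log \psi_a(t)) \big) \]
\[ \quad = -i\, dt\, \mathrm{Tr}_{\mathcal{H}} \big( \psi(t) [h, \mathbf{1}_{\setminus a} \otimes \log \psi_a(t)] \big) . \]
Therefore
\[ S_a(t + dt) - S_a(t) = -i\, dt\, \mathrm{Tr}_{\mathcal{H}} \big( \psi(t) [h, \mathbf{1}_{\setminus a} \otimes \log \psi_a(t)] \big) , \]
\[ e = -i \sum_{a \ge 0} \mathrm{Tr}_{\mathcal{H}} \big( \psi(t) [h, \mathbf{1}_{\setminus a} \otimes \log \psi_a(t)] \big) , \tag{2.1} \]
and finally
\[ e = -i\, \mathrm{Tr}_{\mathcal{H}} \big( \psi(t) [h, \log \otimes_{a \ge 0} \psi_a(t)] \big) . \tag{2.2} \]
Note that we may in (2.1) and (2.2) replace $h$ by the total Hamiltonian $H$ (take all $\hat{H}_a = 0$):
\[ e = -i\, \mathrm{Tr}_{\mathcal{H}} \big( \psi(t) [H, \log \otimes_{a \ge 0} \psi_a(t)] \big) . \]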
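The formula for $e$ with $h$ replaced by the total Hamiltonian can be compared with a direct finite-difference derivative of $\sum_a S_a(t)$. The following sketch does this for two qubits ($a = 0, 1$); the random density matrix, the random Hamiltonian, the step size and all function names are assumptions chosen for illustration, not part of the original argument.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix through its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def entropy(rho):
    """Von Neumann entropy S(rho) = -Tr rho log rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-15]
    return float(-(w * np.log(w)).sum())

def marginals(rho):
    """Partial traces psi_0 and psi_1 of a two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)
    return np.einsum('ijkj->ik', r), np.einsum('ijik->jk', r)

def sum_of_entropies(rho):
    return sum(entropy(m) for m in marginals(rho))

def evolve(rho, h, t):
    """psi(t) = U(t) psi U(-t) with U(t) = exp(-i H t)."""
    u = herm_fun(h, lambda w: np.exp(-1j * w * t))
    return u @ rho @ u.conj().T

def entropy_production(rho, h):
    """e = -i Tr(psi [H, log(psi_0 x psi_1)]), using log(A x B) = log A x 1 + 1 x log B."""
    rho0, rho1 = marginals(rho)
    eye = np.eye(2)
    log_prod = np.kron(herm_fun(rho0, np.log), eye) + np.kron(eye, herm_fun(rho1, np.log))
    comm = h @ log_prod - log_prod @ h
    return float((-1j * np.trace(rho @ comm)).real)

rng = np.random.default_rng(3)
x = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
rho = x @ x.conj().T + 0.1 * np.eye(4)      # invertible, so the logarithms are defined
rho /= np.trace(rho).real
y = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
h = (y + y.conj().T) / 2                    # hypothetical self-adjoint Hamiltonian

dt = 1e-4
e_formula = entropy_production(rho, h)
e_numeric = (sum_of_entropies(evolve(rho, h, dt))
             - sum_of_entropies(evolve(rho, h, -dt))) / (2 * dt)
print(e_formula, e_numeric)
```

In the same setting one can also observe that the total entropy $S(\psi(t))$ does not change under the evolution, while the sum of the subsystem entropies does.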
3. The Large System Limit We shall be interested in the limit of a large system. More precisely, the subsystem R0 = Σ corresponding to a = 0 will remain small, but it will interact with reservoirs R1 , R2 , . . ., which will become large (there is no direct interaction between the reservoirs). We shall be interested in the case where there are at least two large reservoirs (in the case of only one large reservoir R1 , we expect that the small system Σ will get in equilibrium with R1 if we wait long enough — this is the situation of approach to equilibrium). We think of the reservoirs R1 , R2 , . . . as having different inverse temperatures β1 , β2 , . . . Of course, putting the reservoirs in contact with the small system Σ will produce a flow of heat, so that the temperature in the reservoirs will not remain uniform, in particular the entropy production e defined by the large system limit of (2.2) might depend on where the boundary between the small system and the reservoirs is put. [It is also possible that it does not since, physically, entropy production depends on information disappearing at infinity on different sides of some separating surfaces, and the exact position of these surfaces may not be important]. In any case we are interested in a double limit where first the reservoirs are allowed to become infinite and then, perhaps, the boundaries between the small system and the reservoirs are allowed to move to infinity. This double limit is more or less imposed by physics, but seems hard to analyze mathematically. Note for example that we expect the entropy −Tr(ψ log ⊗a≥0 ψa ) + constant to diverge in a large system limit where it becomes time independent while its time derivative tends to a nonzero constant e. We shall try to argue that in the double limit discussed above, (2.2) becomes the standard thermodynamic relation between the heat fluxes and the temperatures of the reservoirs, but we shall not be able to give a proof of this fact. 
Basically, our difficulty is to make sense of the limit of $\log \otimes_{a \ge 0} \psi_a$ or $[h, \log \otimes_{a \ge 0} \psi_a]$.

4. Infinite Systems

In order to be able to discuss a small system $\Sigma$ coupled with actually infinite reservoirs $R_a$ with $a > 0$, we shall now introduce more structure into the problem. Let $L$ be countably infinite, and $\mathcal{H}_x$ be a finite dimensional Hilbert space for each $x \in L$. We let $L$ be the disjoint union $L = \bigcup_{a \ge 0} R_a$, where $R_0 = \Sigma$ is finite and the $R_a$ with $a > 0$ are infinite. Choosing $\Lambda$ finite such that $\Sigma \subset \Lambda \subset L$, we may define $\mathcal{H}_a = \mathcal{H}_{\Lambda a} = \bigotimes_{x \in \Lambda \cap R_a} \mathcal{H}_x$, $\mathcal{H} = \mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{H}_x$, and study the finite system defined by a density matrix $\psi(t) = \psi_\Lambda(t)$ on $\mathcal{H}_\Lambda$ and a (self-adjoint) Hamiltonian $H = H_\Lambda$ on $\mathcal{H}_\Lambda$.

For finite $X \subset L$ let $\mathcal{A}_X$ be the $C^*$-algebra of operators on $\mathcal{H}_X = \bigotimes_{x \in X} \mathcal{H}_x$. If $Y \subset X$ we may identify $\mathcal{A}_Y$ with a subalgebra of $\mathcal{A}_X$ by $B \mapsto B \otimes \mathbf{1}_{X \setminus Y}$, and define the quasilocal $C^*$-algebra $\mathcal{A}$ corresponding to $L$ as the norm closure of $\bigcup_X \mathcal{A}_X$. We can then introduce a Hamiltonian for the infinite system $L$ as the formal expression
\[ H_L^\Phi = \sum_{X \subset L} \Phi(X) , \]
where the sum is over finite subsets $X$ of $L$, and $\Phi(X)$ is a self-adjoint element of $\mathcal{A}_X$. The finite system Hamiltonian is then defined by$^{a}$
\[ H = H_\Lambda^\Phi = \sum_{X \subset \Lambda} \Phi(X) . \]
The infinite system limit consists now in letting $\Lambda$ tend to infinity in a suitable way, which we shall not discuss (but $\Lambda$ should eventually contain any given finite set). We may assume that the density matrices $\psi_\Lambda(t)$ tend to a time independent state $\rho$ on $\mathcal{A}$ when $\Lambda \to L$ in the sense that
\[ \mathrm{Tr}_{\mathcal{H}_\Lambda} \psi_\Lambda(t) A \to \rho(A) \quad \text{if } A \in \mathcal{A}_X \]
for finite $X \subset L$. We want to take for $\rho$ not just a time invariant state, but one which qualifies as a nonequilibrium steady state (so that in particular, if the entropy production can be defined, it is not negative). We shall discuss nonequilibrium steady states below.

Of the quantities occurring in (2.1) and (2.2) we see that we can now replace $\mathrm{Tr}_{\mathcal{H}}(\psi(t) \cdots)$ by $\rho(\cdots)$. For finite range interactions, $h$ is a well defined element of $\mathcal{A}$ and independent of $\Lambda$ for sufficiently large $\Lambda$. The operator $H$ is, in the limit of infinite $\Lambda$, given formally by $H_L^\Phi$ as defined above. It is however not clear what to do with the limit of $\log \otimes_{a \ge 0} \psi_a(t)$. One idea would be to assume that
\[ \log \psi_\Lambda(t) + c_\Lambda \mathbf{1}_\Lambda \to - \sum_{X \subset L} \Psi(X) , \]
where the $c_\Lambda$ are constants and the right hand side is (up to sign) a formal sum of self-adjoint elements $\Psi(X) \in \mathcal A_X$ for finite $X \subset L$. But such an Ansatz conflicts with the notion that $\log\psi_a(t)$ for a reservoir has long distance correlations, i.e. very large or infinite sets $X$ should be important in the formula displayed above. In conclusion, we believe, for physical reasons, that the infinite system limit of the entropy production makes sense, but we cannot prove this fact.

^a This may be modified (for instance by boundary terms) provided formally $H_\Lambda \to H_L$ when $\Lambda$ tends to $L$.

5. The Thermodynamic Formula for the Entropy Production

At this point we have come to an expression of $e$ as the limit when $\Lambda$ tends to infinity (or $\Lambda \to L$) of
$$ -i\,\mathrm{Tr}\Big(\rho_\Lambda \Big[H_\Lambda^\Phi, \log\bigotimes_{a\ge 0}\psi_a\Big]\Big) = i\,\mathrm{Tr}\Big([H_\Lambda^\Phi, \rho_\Lambda]\,\log\bigotimes_{a\ge 0}\psi_a\Big) . $$
Since $\rho$ is invariant under the time evolution defined by $H_L^\Phi$, we have $\rho([H_\Lambda^\Phi, A]) = 0$ if $A$ belongs to a local algebra and $\Lambda$ is sufficiently large. Therefore, $\mathrm{Tr}([H_\Lambda^\Phi, \rho_\Lambda]A)$ vanishes if $A$ is localized well inside $\Lambda$, and is nonzero only for $A$ localized near the boundary of $\Lambda$. In other words, in computing $e$ we may ignore local terms from $\log\bigotimes_{a\ge 0}\psi_a$ and pay attention only to contributions from far away, at the boundary of $\Lambda$. An obvious guess is then to replace $\psi_a$ by the equilibrium state at inverse temperature $\beta_a$ in $R_a$, obtaining now $e$ as the limit when $\Lambda \to L$ of
$$ i\,\mathrm{Tr}\Big(\rho_\Lambda\Big[H_\Lambda^\Phi, \sum_a \beta_a H^\Phi_{\Lambda\cap R_a}\Big]\Big) = i\rho\Big(\sum_a \beta_a\,[H_\Lambda^\Phi, H^\Phi_{\Lambda\cap R_a}]\Big) . $$
Note that $\lim_{\Lambda\to L} i[H_\Lambda^\Phi, H^\Phi_{\Lambda\cap R_a}]$ is a well defined operator localized near the surface of the small system $\Sigma$; it represents the rate of transfer of energy to the reservoir $R_a$, and therefore for large enough $\Lambda$
$$ e = \rho\Big(\sum_a \beta_a\, i[H_\Lambda, H_{\Lambda\cap R_a}]\Big) . \tag{5.1} $$
In fact (5.1) is the usual thermodynamic expression of the entropy production in terms of heat fluxes. Note that we may ignore the term with $a = 0$ since the fluxes to the small system add up to 0 in a stationary state. We shall from now on proceed with the formula (5.1) for the entropy production, but remember that its relation with (2.2) has not been satisfactorily established.

6. Nonequilibrium Steady States (NESS)

The definition of nonequilibrium steady states (NESS) should choose a direction of time, i.e. distinguish between the past and the future. Otherwise one cannot hope to prove that the entropy production $e$ has a definite sign. One must also impose the asymptotic temperatures $\beta_1^{-1}, \beta_2^{-1}, \dots$ in the reservoirs $R_1, R_2, \dots$ We assume that the interaction $\Phi$ determines a one-parameter group $(\alpha^t)$ of automorphisms of $\mathcal A$, defining the time evolution of our system (see [2]). Let $\sigma_1, \sigma_2, \dots$ be equilibrium states at inverse temperatures $\beta_1, \beta_2, \dots$ for the reservoirs $R_1, R_2, \dots$ and $\sigma_0$ any state for the small system $\Sigma = R_0$. Assuming the existence for each $A \in \mathcal A$ of a limit
$$ \lim_{t\to\infty}\Big(\bigotimes_{a\ge 0}\sigma_a\Big)(\alpha^t A) = \rho(A) \tag{6.1} $$
defines a state $\rho$ which one can certainly call a NESS. One can prove the existence of the limits (6.1) under strong conditions of asymptotic abelianness in time of the evolution $(\alpha^t)$ and also of the "uncoupled" evolutions $(\breve\alpha^t_a)$. This was the point of view adopted in [4]. It has the advantage of leading to strong results like linear response formulae, but the disadvantage that the assumed asymptotic abelianness can practically never be verified. Progress in understanding nonequilibrium quantum spin systems will probably depend on a better understanding of the asymptotic abelianness conditions in question. It is however possible to obtain some results without unverifiable assumptions by considering limit points for $T \to \infty$ of
$$ \frac{1}{T}\int_0^T dt\,(\alpha^t)^*\Big(\bigotimes_{a\ge 0}\sigma_a\Big) \tag{6.2} $$
in the weak dual of $\mathcal A$. Such limit points $\rho$ always exist (by w*-compactness of the set of states) and they are invariant under time evolution. The limit points $\rho$ are good candidates to represent nonequilibrium steady states. In fact, it has been proved in [5], and more generally in [3], that the entropy production $e$ defined by (5.1) is $\ge 0$ for such states. (The reason is basically that the commutator in (5.1) is a derivative, which combines with the integral in (6.2) to produce a manageable expression.) If one assumes that the $\sigma_a$ ($a > 0$) are extremal KMS states and that $(\alpha^t)$ is asymptotically abelian one can show (see [5]) that the definition of the NESS as limit points of (6.2) does not depend on where the boundaries between the small system $\Sigma$ and the reservoirs $R_a$ ($a > 0$) are placed. This is of course quite desirable. Asymptotic abelianness of $(\alpha^t)$ also ensures that the NESS have a unique ergodic decomposition. So, even with the definition of NESS based on (6.2), questions of asymptotic abelianness seem to appear unavoidably. This is natural because if our system is composed of subsystems that do not interact (say the small system does not interact with the reservoirs), we violate asymptotic abelianness, and have an uninteresting theory.

We have said nothing of the geometry of the reservoirs $R_a$ ($a > 0$), but consideration of the macroscopic limit shows that (if they are pieces of regular lattices $\mathbf Z^d$) their dimension $d$ must be $\ge 3$. Indeed in the macroscopic limit, a NESS corresponds to a temperature field $T$ satisfying $\Delta T = 0$ (say), and tending to limits $\beta_a^{-1}$ in the various reservoirs. In view of properties of harmonic functions this cannot happen for $d < 3$. What will happen for $d < 3$ is that the temperature will tend, as time goes to infinity, to a constant in any bounded region, the temperature gradient and heat flux will tend to zero, and the NESS will reduce to a thermodynamic equilibrium state with $e = 0$.
In conclusion we hope to have shown in this note that, on our way to understanding quantum nonequilibrium statistical mechanics, there remain not only problems of mathematics to solve but also questions of physics to clarify.

References
[1] H. Araki, Relative entropy of states of von Neumann algebras, Publ. R.I.M.S., Kyoto Univ. 11 (1976) 809–833.
[2] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Springer-Verlag, Berlin, 2nd ed. 1996.
[3] V. Jakšić and C.-A. Pillet, On entropy production in quantum statistical mechanics, preprint.
[4] D. Ruelle, Natural nonequilibrium states in quantum statistical mechanics, J. Statist. Phys. 98 (2000) 57–75.
[5] D. Ruelle, Entropy production in quantum spin systems, Commun. Math. Phys. 224 (2001) 3–16.
August 21, 2002 19:22 WSPC/148-RMP
00135
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 709–731 © World Scientific Publishing Company
FUSION RULES OF MODULAR INVARIANTS
DAVID E. EVANS School of Mathematics, University of Wales Cardiff PO Box 926, Senghennydd Road, Cardiff CF24 4YH, Wales, UK Received 28 March 2002
This contribution is dedicated to Huzihiro Araki on the occasion of his seventieth birthday

Modular invariants satisfy remarkable fusion rules. Let $Z$ be a modular invariant associated to a braided subfactor $N \subset M$. The decomposition of the non-normalized modular invariants $ZZ^*$ and $Z^*Z$ into sums of normalized modular invariants is related to the decomposition of the full induced $M$-$M$ system of sectors.

Keywords: Modular invariants, subfactors.
Contents
1. Introduction
2. Preliminaries
3. A closer look at the M-N system
3.1 Varying the ι-vertex on the M-N graphs
3.2 A curious identity
3.3 Towards a general formula for [θ]
4. A closer look at the M-M system
4.1 Some remarks on products of modular invariants
4.2 On the geometry of the M-M system
5. Examples
5.1 SU(2)-invariants
5.2 SU(3)-invariants
5.3 Towards a pattern
5.4 Interesting invariants of SU(n)_n
Acknowledgements
References
1. Introduction

Suppose $N \subset M$ is a braided type III subfactor; i.e. say the type III factor $N$ possesses a non-degenerate system ${}_N\mathcal X_N$ of braided endomorphisms with the inclusion generated by certain sectors of the system. Then we know by [42] that the system ${}_N\mathcal X_N$ generates a representation of the modular group $SL(2;\mathbf Z)$, with generators $S = \{S_{\lambda,\mu};\ \lambda,\mu \in {}_N\mathcal X_N\}$, $T = \{T_{\lambda,\mu};\ \lambda,\mu \in {}_N\mathcal X_N\}$. Moreover [7, 11, 18] the inclusion generates a modular invariant $Z$ through the process of α-induction from sectors of $N$ to sectors of $M$:
$$ Z_{\lambda,\mu} = \langle \alpha^+_\lambda, \alpha^-_\mu\rangle , \qquad \lambda,\mu \in {}_N\mathcal X_N . \tag{1} $$
The right hand side is interpreted as multiplicities of common sectors in the two inductions, which is clearly thus a matrix with positive integer entries. It commutes with both the $S$ and $T$ matrices, i.e. with the representation of $SL(2;\mathbf Z)$. In particular this covers the case of all A-D-E $SU(2)$ modular invariants, and much more besides. We say that a modular invariant is sufferable if it can be realised from a subfactor in this way from α-induction on a braided system of endomorphisms.

Moreover, we can generate from α-induction the sectors ${}_M\mathcal X_M^\pm$ which in turn generate the full system ${}_M\mathcal X_M$ of $M$. The full system ${}_M\mathcal X_M$ has $\sum_{\lambda,\mu} Z_{\lambda,\mu}^2$ irreducible sectors and clearly α-induction gives representations of the original $N$-$N$ fusion rules on ${}_M\mathcal X_M$. However it is the natural action of the $N$-$N$ sectors on the corresponding $N$-$M$ sectors ${}_N\mathcal X_M$ which gives the A-D-E classification and its generalizations. In particular the trace of $Z$, $\mathrm{tr}\,Z = \sum_\lambda Z_{\lambda,\lambda}$, gives the number of $N$-$M$ sectors in ${}_N\mathcal X_M$. Now $\sum_{\lambda,\mu} Z_{\lambda,\mu}^2 = \mathrm{tr}\,ZZ^*$. The matrix $ZZ^*$ is a modular invariant, in that it has positive integral entries and commutes with the representation of the modular group $SL(2;\mathbf Z)$, but in general it will not be physical, in that its vacuum entry is not normalized to be one. It is therefore tempting to ask whether we can understand the full $M$-$M$ system in terms of an analysis of the modular invariant $ZZ^*$, and an inclusion $N \subset M_1$ with the full $M$-$M$ system being related to the chiral $N$-$M_1$ system, just as we understand the $N$-$M$ system from the modular invariant $Z$. This was the original motivation in [9, 10] to write the numerical count $\sum_{\lambda,\mu} Z_{\lambda,\mu}^2$ as $\mathrm{tr}\,ZZ^*$, the trace of a modular invariant. For example for $SU(2)$, where moreover always $Z = Z^*$, we have the following simple commutative fusion rules for the three modular invariants at level 16 labelled by the three Dynkin diagrams with Coxeter number 18:
$$ Z_{D_{10}}^2 = 2Z_{D_{10}} , \qquad Z_{D_{10}}Z_{E_7} = Z_{E_7}Z_{D_{10}} = 2Z_{E_7} , \qquad Z_{E_7}^2 = Z_{D_{10}} + Z_{E_7} . $$
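These level 16 fusion rules can be checked by direct matrix multiplication. The sketch below is only an illustration and rests on assumptions not spelled out in the text: it takes the $D_{10}$ and $E_7$ invariants in their standard Cappelli-Itzykson-Zuber character form, with labels $0,\dots,16$ equal to twice the spin, and uses the Kac-Peterson $S$-matrix of $SU(2)_{16}$. The helper names (`invariant`, `matmul`, and so on) are hypothetical.

```python
import math

K = 16                      # SU(2) level; labels 0..K are twice the spin
LABELS = range(K + 1)

def invariant(blocks, cross=()):
    """Build Z from a character expression.

    blocks: pairs (m, labels) contributing m * |sum of chi_l over labels|^2.
    cross:  pairs (A, B) contributing (sum_A chi)(sum_B chi)^* + c.c.
    """
    Z = [[0] * (K + 1) for _ in LABELS]
    for m, labels in blocks:
        for a in labels:
            for b in labels:
                Z[a][b] += m
    for A, B in cross:
        for a in A:
            for b in B:
                Z[a][b] += 1
                Z[b][a] += 1
    return Z

def matmul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in LABELS) for j in LABELS] for i in LABELS]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in LABELS] for i in LABELS]

def scale(c, X):
    return [[c * X[i][j] for j in LABELS] for i in LABELS]

def trace(X):
    return sum(X[i][i] for i in LABELS)

# Assumed standard forms of the two non-trivial level 16 invariants.
Z_D10 = invariant([(1, (0, 16)), (1, (2, 14)), (1, (4, 12)), (1, (6, 10)), (2, (8,))])
Z_E7 = invariant([(1, (0, 16)), (1, (4, 12)), (1, (6, 10)), (1, (8,))],
                 cross=[((2, 14), (8,))])

# Kac-Peterson S-matrix of SU(2)_K (real and symmetric).
S = [[math.sqrt(2 / (K + 2)) * math.sin(math.pi * (a + 1) * (b + 1) / (K + 2))
      for b in LABELS] for a in LABELS]

def commutes_with_S(Z, tol=1e-9):
    SZ, ZS = matmul(S, Z), matmul(Z, S)
    return all(abs(SZ[i][j] - ZS[i][j]) < tol for i in LABELS for j in LABELS)

def commutes_with_T(Z):
    # T is diagonal with phases exp(2 pi i h_a) up to a common factor,
    # h_a = a(a+2) / (4(K+2)), so [Z, T] = 0 exactly when h_a - h_b is an
    # integer for every non-zero entry Z[a][b].
    return all((a * (a + 2) - b * (b + 2)) % (4 * (K + 2)) == 0
               for a in LABELS for b in LABELS if Z[a][b])

print("Z_D10^2 = 2 Z_D10:        ", matmul(Z_D10, Z_D10) == scale(2, Z_D10))
print("Z_D10 Z_E7 = 2 Z_E7:      ", matmul(Z_D10, Z_E7) == scale(2, Z_E7))
print("Z_E7^2 = Z_D10 + Z_E7:    ", matmul(Z_E7, Z_E7) == add(Z_D10, Z_E7))
print("traces (D10, E7, E7^2):   ", trace(Z_D10), trace(Z_E7), trace(matmul(Z_E7, Z_E7)))
```

The traces printed in the last line are the vertex counts of the corresponding graphs, so the relation $\mathrm{tr}\,Z_{E_7}^2 = \mathrm{tr}\,Z_{D_{10}} + \mathrm{tr}\,Z_{E_7}$ can be read off directly.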
Consider the subfactor $N \subset M$ describing the $E_7$ modular invariant. Indeed the fusion graph of $\alpha_1^+$ for the $E_7$ example has two connected components, $D_{10}$ and $E_7$, with the decomposition $\mathrm{tr}\,Z_{E_7}^2 = \mathrm{tr}\,Z_{D_{10}} + \mathrm{tr}\,Z_{E_7}$ reflecting this decomposition of the $M$-$M$ graph. The aim of this paper is to begin to understand better the decomposition of the full $M$-$M$ system into its components via the decomposition of the matrix $ZZ^*$ into normalized modular invariants. If a modular invariant $Z$ is associated to an inclusion $N \subset M$, then where would we try to understand the doubled modular invariant $ZZ^*$? This will be our driving principle: For a subfactor $N \subset M$, there is a natural squaring or iteration procedure $N \subset M \subset M_1$ of the basic construction. Indeed, if the decomposition formula Eq. (1) for a type I invariant $Z = B^*B$ in matrix form is related to an inclusion $N \subset M$, with dual canonical endomorphism $\bar\iota\iota$, then it is natural to try to understand the iteration $ZZ^* = B^*BB^*B$ with the basic construction $N \subset M \subset M_1$, which has dual canonical endomorphism $\bar\iota\iota\bar\iota\iota$.

In the next section we outline our framework of preliminaries in more detail. In Sec. 3 we complete some analysis begun in [8] regarding changing the ι-vertex on the $M$-$N$ graph, which will be used for example in [18] to understand the Kostant polynomials of [31] from a subfactor point of view. It is not necessary that a given modular invariant can be realised from a subfactor. However, even if a modular invariant can be realised from a subfactor it is not clear what the possible dual canonical endomorphisms are. Nevertheless there is a simple expression for the sum of all possible dual canonical endomorphisms in Sec. 3.2. This will be used for example in [20] for answering the question of which modular invariants are realisable in concrete situations with given modular data. Many subfactors can give rise to the same modular invariant. However, in Sec. 3.3 we consider whether sufferable modular invariants can be realised in canonical ways with natural dual canonical endomorphisms. We then in Sec. 4 look at the structure of the products of modular invariants, in particular $ZZ^*$ and $Z^*Z$, and how their decomposition into normalised modular invariants is related to the geometry of the related $M$-$M$ system, its decomposition into ${}_M\mathcal X_M^\pm$ orbits, and the decomposition of ${}_M\mathcal X_M^\pm$ into $(ZZ^*)_{0,0}$ and $(Z^*Z)_{0,0}$ ${}_M\mathcal X_M^0$ orbits respectively. Sections 5.1 and 5.2 contain a discussion of concrete examples from $SU(2)$ and $SU(3)$ respectively. In particular we discuss the curious example of the full $M$-$M$ system for the conformal embedding modular invariant $SU(3)_9 \subset (E_6)_1$, where there are six ${}_M\mathcal X_M^+$ orbits in ${}_M\mathcal X_M$ yet the full system contains, besides three copies of $\mathcal E_1^{(12)}$, also three copies of the isospectral graph $\mathcal E_2^{(12)}$.
We can write a sufferable modular invariant $Z$ in terms of rectangular branching matrices as $Z = B_+^*B_-$, so that $ZZ^* = B_+^*B_-B_-^*B_+$ and $Z^*Z = B_-^*B_+B_+^*B_-$. We look in Sec. 5.3 at the sandwiched $B_\pm B_\pm^*$. This is a modular invariant for the extended system which is in general not normalized, but its decomposition into normalized modular invariants (usually permutations) and its relationship to the decomposition of the full system ${}_M\mathcal X_M$ into ${}_M\mathcal X_M^\pm$ orbits is discussed. Finally in Sec. 5.4 we discuss some interesting invariants of $SU(n)_n$. In the conclusions of [8] we speculated about modular invariants which look like type I or type II but really come from heterotic extensions, i.e. for which we have different intermediate local subfactors. We provide examples, actually making use of the heterotic $SO(16\ell)_1$ modular invariants ($\ell = 1, 2, 3, \dots$) treated in [8], and conformal inclusions $SU(n)_n \subset SO(n^2-1)_1$. The simplest case is $SU(7)_7 \subset SO(48)_1$, and by pulling back the heterotic situation on $SO(48)_1$ we obtain our strange heterotic modular invariant on $SU(7)_7$, which of course must be symmetric.

2. Preliminaries

We cite [19] as a general reference for operator algebras and subfactors, and recall the sector setting of [34]. Let $A$ and $B$ be type III von Neumann factors. A unital ∗-homomorphism $\rho : A \to B$ is called a $B$-$A$ morphism. The positive number $d_\rho = [B : \rho(A)]^{1/2}$ is called the statistical dimension of $\rho$; here $[B : \rho(A)]$ is the minimal Jones index [30] of the subfactor $\rho(A) \subset B$. If $\rho$ and $\sigma$ are $B$-$A$ morphisms with finite statistical dimensions, then the vector space of intertwiners
$$ \mathrm{Hom}(\rho,\sigma) = \{t \in B : t\rho(a) = \sigma(a)t ,\ a \in A\} $$
is finite-dimensional, and we denote its dimension by $\langle\rho,\sigma\rangle$. Indeed we will only consider morphisms of finite statistical dimension. To any $B$-$A$ morphism $\rho$ is assigned a conjugate $A$-$B$ morphism $\bar\rho$ so that the map $[\rho] \to [\bar\rho]$ is additive, antimultiplicative and idempotent, generalizing the notion of inversion and conjugate representation in a group or group dual respectively. We work with the setting of [11], i.e. we are working with a type III subfactor and finite system ${}_N\mathcal X_N \subset \mathrm{End}(N)$ of (possibly degenerately) braided morphisms which is compatible with the inclusion $N \subset M$. Then the inclusion is in particular forced to have finite Jones index and also finite depth (see e.g. [19]). More precisely, we make the following

Assumption 2.1. We assume that we have a type III subfactor $N \subset M$ together with a finite system of endomorphisms ${}_N\mathcal X_N \subset \mathrm{End}(N)$ in the sense of [11, Definition 2.1] which is braided in the sense of [11, Definition 2.2] and such that $\theta = \bar\iota\iota \in \Sigma({}_N\mathcal X_N)$ for the injection $M$-$N$ morphism $\iota : N \hookrightarrow M$ and a conjugate $N$-$M$ morphism $\bar\iota$.

With the braiding $\varepsilon$ on ${}_N\mathcal X_N$ and its extension to $\Sigma({}_N\mathcal X_N)$ (the set of finite sums of morphisms in ${}_N\mathcal X_N$) as in [11], one can define the α-induced morphisms $\alpha^\pm_\lambda \in \mathrm{End}(M)$ for $\lambda \in \Sigma({}_N\mathcal X_N)$ by the Longo–Rehren formula [37], namely by putting
$$ \alpha^\pm_\lambda = \bar\iota^{\,-1} \circ \mathrm{Ad}(\varepsilon^\pm(\lambda,\theta)) \circ \lambda \circ \bar\iota , $$
where $\bar\iota$ denotes a conjugate morphism of the injection map $\iota : N \hookrightarrow M$. Then $\alpha^+_\lambda$ and $\alpha^-_\lambda$ extend $\lambda$, i.e. $\alpha^\pm_\lambda \circ \iota = \iota \circ \lambda$, which in turn implies $d_{\alpha^\pm_\lambda} = d_\lambda$ by the multiplicativity of the minimal index [35]. Moreover, we have $\alpha^\pm_{\lambda\mu} = \alpha^\pm_\lambda\alpha^\pm_\mu$ if also $\mu \in \Sigma({}_N\mathcal X_N)$, and clearly $\alpha^\pm_{\mathrm{id}_N} = \mathrm{id}_M$. The morphism $\alpha^\pm_{\bar\lambda}$ is a conjugate for $\alpha^\pm_\lambda$. Let $\gamma = \iota\bar\iota$ denote Longo's canonical endomorphism from $M$ into $N$.

We will assume that the braiding on the system ${}_N\mathcal X_N$ is non-degenerate. In this case there is a natural representation of the modular group $SL(2;\mathbf Z)$ where the $S$ and $T$ matrices are basically given by the Hopf link and twist respectively. More precisely, recall that the statistics phase $\omega_\lambda$ for $\lambda \in {}_N\mathcal X_N$ is given as
$$ d_\lambda\,\phi_\lambda(\varepsilon^+(\lambda,\lambda)) = \omega_\lambda \mathbf 1 , $$
where the state $\phi_\lambda$ is the left inverse of $\lambda$. We set $z = \sum_{\lambda\in{}_N\mathcal X_N} d_\lambda^2\,\omega_\lambda$. If $z \ne 0$ we put $c = 4\arg(z)/\pi$, which is the central charge defined modulo 8. The $S$-matrix is defined by
$$ S_{\lambda,\mu} = \frac{1}{|z|}\sum_{\rho\in{}_N\mathcal X_N}\frac{\omega_\lambda\omega_\mu}{\omega_\rho}\,N^\rho_{\lambda,\mu}\,d_\rho , \qquad \lambda,\mu \in {}_N\mathcal X_N , $$
with $N^\rho_{\lambda,\mu} = \langle\rho,\lambda\mu\rangle$ denoting the fusion coefficients [42, 22, 21]. (As usual, the label 0 refers to the identity morphism $\mathrm{id} \in {}_N\mathcal X_N$.) Let $T$ be the diagonal matrix with entries $T_{\lambda,\mu} = e^{-i\pi c/12}\,\omega_\lambda\,\delta_{\lambda\mu}$. Then this pair of $S$ and $T$ matrices satisfies $TSTST = S$ and gives a unitary representation of the modular group $SL(2;\mathbf Z)$ [42, 45]. Putting $Z_{\lambda,\mu} = \langle\alpha^+_\lambda,\alpha^-_\mu\rangle$ defines a matrix with positive integral entries normalized at the vacuum, $Z_{0,0} = 1$, commuting with $S$ and $T$. Consequently, $Z$ gives a modular invariant [11, 18].

Let ${}_M\mathcal X_M \subset \mathrm{End}(M)$ denote a system of endomorphisms consisting of a choice of representative endomorphisms of each irreducible subsector of sectors of the form $[\iota\lambda\bar\iota]$, $\lambda \in {}_N\mathcal X_N$. We choose $\mathrm{id} \in \mathrm{End}(M)$ representing the trivial sector in ${}_M\mathcal X_M$. Then we define similarly the chiral systems ${}_M\mathcal X_M^\pm$ and the α-system ${}_M\mathcal X_M^\alpha$ to be the subsystems of endomorphisms $\beta \in {}_M\mathcal X_M$ such that $[\beta]$ is a subsector of $[\alpha^\pm_\lambda]$ and of $[\alpha^+_\lambda\alpha^-_\mu]$, respectively, for some $\lambda,\mu \in {}_N\mathcal X_N$. The neutral system is defined as the intersection ${}_M\mathcal X_M^0 = {}_M\mathcal X_M^+ \cap {}_M\mathcal X_M^-$, so that ${}_M\mathcal X_M^0 \subset {}_M\mathcal X_M^\pm \subset {}_M\mathcal X_M^\alpha \subset {}_M\mathcal X_M$.

Suppose that we have two subfactors, $N \subset M_a$ and $N \subset M_b$, where the irreducible components of both dual canonical endomorphisms lie in the braided non-degenerate system ${}_N\mathcal X_N$, with corresponding modular invariants $Z^a$ and $Z^b$ respectively. Let ${}_{M_a}\mathcal X_{M_b}$ denote the irreducible subsectors of $\iota_a\lambda\bar\iota_b$ where $\iota_a, \iota_b$ are the corresponding embeddings of $N$ in $M_a$ and $M_b$ respectively. We can then by an extension of the ideas of [11] show that the complexification of the bimodule ${}_{M_a}\mathcal X_{M_b}$ under the left action of ${}_{M_a}\mathcal X_{M_a}$ and the right action of ${}_{M_b}\mathcal X_{M_b}$ is isomorphic to
$$ \bigoplus_{\lambda,\mu\in{}_N\mathcal X_N} H^a_{\lambda,\mu}\otimes\bar H^b_{\lambda,\mu} , \tag{2} $$
where
$$ H^c_{\lambda,\mu} = \bigoplus_{x\in{}_N\mathcal X_{M_c}} \mathrm{Hom}(\lambda\bar\mu, x\bar x) , \qquad \lambda,\mu \in {}_N\mathcal X_N , \tag{3} $$
is the Hilbert space of intertwiners of dimension $Z^c_{\lambda,\mu}$, $c = a, b$. In particular the decomposition in Eq. (2) is compatible in the natural way as a bimodule with the complexification of the fusion rule algebra of ${}_{M_c}\mathcal X_{M_c}$ as
$$ \bigoplus_{\lambda,\mu\in{}_N\mathcal X_N} B(H^c_{\lambda,\mu}) . \tag{4} $$
A dimension count shows that the number of irreducible $M_a$-$M_b$ sectors of ${}_{M_a}\mathcal X_{M_b}$ is $\mathrm{tr}(Z^{a*}Z^b)$. If $M_a = M_b = M$, then $\#{}_M\mathcal X_M = \mathrm{tr}\,Z^*Z$, and if $M_a = N$ and $M_b = M$, then $\#{}_N\mathcal X_M = \mathrm{tr}\,Z$. The action of ${}_N\mathcal X_N \times {}_N\mathcal X_N$ on ${}_{M_a}\mathcal X_{M_b}$ via α-induction, namely $(\nu,\rho) \to \alpha^+_\nu\alpha^-_\rho$, on either the left via the induction $N \subset M_a$ or on the right via $N \subset M_b$, gives a doubled nimrep $(\nu,\rho) \to \Gamma_{\nu,\rho}$ whose spectrum is $S_{\lambda,\nu}S_{\mu,\rho}/(S_{\lambda,0}S_{\mu,0})$ with multiplicity $Z^a_{\lambda,\mu}Z^b_{\lambda,\mu}$. This reduces to parts 1 and 2 respectively of [12, Theorem 4.16] when $M_a = M_b$, $M_a = N$ respectively. Applications of the existence of such $Z^a$-$Z^b$ nimreps for sufferable invariants and the question of the decomposition of the products $Z^{a*}Z^b$ into normalised modular invariants will appear elsewhere.
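As a concrete instance of the modular data described above, the following sketch checks the relation $TSTST = S$ and the integrality of the fusion coefficients for $SU(2)$ at level $k$. It is only an illustration under stated assumptions: the closed formulas for $S$, for the conformal weights $h_a = a(a+2)/(4(k+2))$ (so that $\omega_a = e^{2\pi i h_a}$) and for $c = 3k/(k+2)$ are the standard WZW values rather than anything derived in the text, the fusion coefficients are obtained from the Verlinde formula, and all function names are hypothetical.

```python
import cmath
import math

def su2_modular_data(k):
    """Return the real symmetric S-matrix and the diagonal of T for SU(2) at level k."""
    n = k + 2
    S = [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(k + 1)] for a in range(k + 1)]
    c = 3 * k / n                                            # central charge (assumed value)
    T = [cmath.exp(-1j * math.pi * c / 12)                   # e^{-i pi c / 12}
         * cmath.exp(2j * math.pi * a * (a + 2) / (4 * n))   # statistics phase omega_a
         for a in range(k + 1)]
    return S, T

def tstst(S, T):
    """Entries of the matrix product T S T S T, with T given by its diagonal."""
    r = range(len(T))
    return [[T[a] * sum(S[a][m] * T[m] * S[m][b] for m in r) * T[b] for b in r]
            for a in r]

def verlinde(S):
    """N[a][b][c] = sum_m S[a][m] S[b][m] conj(S[c][m]) / S[0][m]; S is real here."""
    r = range(len(S))
    return [[[sum(S[a][m] * S[b][m] * S[c][m] / S[0][m] for m in r) for c in r]
             for b in r] for a in r]

def truncated_clebsch_gordan(k):
    """Expected SU(2)_k fusion: c runs from |a-b| to min(a+b, 2k-a-b) in steps of 2."""
    r = range(k + 1)
    return [[[1 if c in range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2) else 0
              for c in r] for b in r] for a in r]

def rounded(N):
    return [[[round(x) for x in row] for row in plane] for plane in N]

def max_abs_difference(X, Y):
    return max(abs(X[a][b] - Y[a][b]) for a in range(len(X)) for b in range(len(X)))

if __name__ == "__main__":
    for level in (1, 2, 3, 16):
        S, T = su2_modular_data(level)
        print(level,
              max_abs_difference(tstst(S, T), S) < 1e-9,
              rounded(verlinde(S)) == truncated_clebsch_gordan(level))
```

The same `su2_modular_data` output can be reused to test whether a candidate integer matrix $Z$ commutes with $S$ and $T$.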
We are particularly concerned here with modular invariants arising in WZW or loop group settings. The modular data ($S$ and $T$ matrices etc.) can be constructed from the representation theory of unitary integrable highest weight modules over affine Lie algebras, or in exponentiated form from the positive energy representations of loop groups. The subfactor machinery is invoked as follows. Let $LG$ be a loop group (associated to a simple, simply connected Lie group $G$). Let $L_IG$ denote the subgroup of loops which are trivial off some proper interval $I \subset S^1$. Then in each level $k$ vacuum representation $\pi_0$ of $LG$, we naturally obtain a net of type III factors $\{N(I)\}$ indexed by proper intervals $I \subset S^1$ by taking $N(I) = \pi_0(L_IG)''$ (see [46, 23, 1]). Since the Doplicher–Haag–Roberts (DHR) selection criterion (cf. [27]) is met in the (level $k$) positive energy representations $\pi_\lambda$, there are DHR endomorphisms $\lambda$ naturally associated with them. (By some abuse of notation we use the same symbols for labels of positive energy representations and endomorphisms.) The rational conformal field theory (RCFT) modular data matches that in the subfactor setting; in particular the RCFT Verlinde fusion coincides with the (DHR superselection) sector fusion, i.e. $N^\nu_{\lambda,\mu} = \langle\lambda\mu,\nu\rangle$. The statistics $S$- and $T$-matrices are identical with the Kac-Peterson $S$- and $T$-modular matrices which perform the conformal character transformations. Antony Wassermann has informed us that he has extended his results for $SU(n)_k$ fusion [46] to all simple, simply connected loop groups; and with Toledano–Laredo all but $E_8$ using a variant of the Dotsenko–Fateev differential equation considered in his thesis [32], see also [46, 33, 32, 3, 4]. Two subfactor cases are of particular interest in this context, that of conformal embeddings [47, 6, 7] and simple current or orbifold constructions [6].
For a conformal embedding $G_k \subset H_1$ we have subfactors $N = \pi^0(L_IG)'' \subset \pi^0(L_IH)'' = M$, with $\pi^0$ denoting the level 1 vacuum representation of $LH$. Here, the subfactor comes equipped with non-degenerately braided systems of endomorphisms on $N$ and $M$ isomorphic to the level $k$ representations of $G$ and level 1 representations of $H$ respectively, and is relevant for studying conformal embedding modular invariants. The centre $\mathbf Z_n$ of $SU(n)$ acts on the algebra $N = \pi_0(L_ISU(n))''$, for say the vacuum level $k$ representation. We can form the crossed product subfactor $N(I) \subset N(I) \rtimes \mathbf Z_n$, which will recover the orbifold modular invariants, but this extended system is local if and only if $k \in 2n\mathbf N$ if $n$ is even and $k \in n\mathbf N$ if $n$ is odd [6].

3. A Closer Look at the M-N System

For a (non-degenerately) braided subfactor it is the $M$-$N$ (or $N$-$M$) system which is relevant for the diagonal part of the modular invariant. Therefore it is in particular the key to understanding the role of (Coxeter) exponents.

3.1. Varying the ι-vertex on the M-N graphs

We here assume that we are dealing with a braided (type III) subfactor $N \subset M$. For $a \in {}_N\mathcal X_M$ consider the (irreducible) subfactor $a(M) \subset N$ and let $a(M) \subset N \subset L$ be its basic extension. Note that then $\theta_a = a\bar a$ has a Q-system [36] for $a(M) \subset N$ so that it is a canonical endomorphism, i.e. $\theta_a$ is the dual canonical endomorphism of $N \subset L$. Thus $\theta_a = a\bar a = \bar\iota_L\iota_L$ for $\iota_L : N \hookrightarrow L$ the injection homomorphism and $\bar\iota_L \in \mathrm{Mor}(L,N)$ a conjugate so that $\bar\iota_L(L) = a(M)$. We conclude that $\bar\iota_L^{\,-1}\circ a$ is an isomorphism in $\mathrm{Mor}(M,L)$ with conjugate (i.e. inverse) $a^{-1}\circ\bar\iota_L \in \mathrm{Mor}(L,M)$. For any $b \in \mathrm{Mor}(M,N)$ we now associate $x_b \in \mathrm{Mor}(L,N)$ by putting $x_b = b\circ a^{-1}\circ\bar\iota_L$. Note that $x_b$ is irreducible if and only if $b$ is, and that $x_a = \bar\iota_L$.

Lemma 3.1. Varying $b \in {}_N\mathcal X_M$, the $x_b$'s yield all the $N$-$L$ sectors, and this provides a canonical bijection between ${}_N\mathcal X_M$ and ${}_N\mathcal X_L$.

Proof. Note that for any $b \in {}_N\mathcal X_M$ there is some $\lambda \in {}_N\mathcal X_N$ such that $\langle b\bar a,\lambda\rangle \ne 0$ as $b\bar a \in \Sigma({}_N\mathcal X_N)$. Thus
$$ \langle x_b, \lambda\bar\iota_L\rangle = \langle ba^{-1}\bar\iota_L\iota_L, \lambda\rangle = \langle ba^{-1}a\bar a, \lambda\rangle \ne 0 , $$
implying that $[x_b]$ is one of the $N$-$L$ sectors. Conversely, assume that there is some $x \in {}_N\mathcal X_L$ such that $\langle x, x_b\rangle = 0$, i.e. $\langle x, ba^{-1}\bar\iota_L\rangle = 0$ for all $b \in {}_N\mathcal X_M$. Since each $\lambda a$ decomposes into sectors $b \in {}_N\mathcal X_M$, this implies
$$ \langle x, \lambda aa^{-1}\bar\iota_L\rangle = \langle x, \lambda\bar\iota_L\rangle = 0 $$
for all $\lambda \in {}_N\mathcal X_N$, in contradiction to $x \in {}_N\mathcal X_L$.

Lemma 3.2. For $b, c \in {}_N\mathcal X_M$ and $\nu \in {}_N\mathcal X_N$ we have
$$ \langle x_b, \nu x_c\rangle = \langle b, \nu c\rangle , \tag{5} $$
i.e. the (graphs describing the) multiplication rules of ${}_N\mathcal X_N$ on ${}_N\mathcal X_M$ and ${}_N\mathcal X_L$ are the same.

Proof. This is just
$$ \langle x_b, \nu x_c\rangle = \langle ba^{-1}\bar\iota_L, \nu ca^{-1}\bar\iota_L\rangle = \langle b, \nu ca^{-1}\bar\iota_L\bar\iota_L^{\,-1}a\rangle = \langle b, \nu c\rangle , $$
using that $\bar\iota_L^{\,-1}a$ is a conjugate morphism of $a^{-1}\bar\iota_L$.

Note that the lemma implies in particular that at least the diagonal part of the coupling matrices produced from $N \subset M$ and $N \subset L$ are the same. That in fact the full coupling matrix (and not only the diagonal part) remains invariant under this change of the ι-vertex has been shown in [10].

3.2. A curious identity

We here assume that we are dealing with a non-degenerately braided (type III) subfactor $N \subset M$. We have seen that for a given braided subfactor $N \subset M$, realizing a coupling matrix $Z$ and a category of morphisms, we obtain irreducible subfactors with dual canonical endomorphisms $\theta_a = a\bar a$, $a \in {}_N\mathcal X_M$, realizing the same $Z$ [10]. It seems likely to be true that this way we in fact exhaust all irreducible subfactors producing equivalent categories.
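Given a modular invariant matrix $Z$, it is usually not easy to decide whether it can be realized from a subfactor or not, and, if yes, what the possible dual canonical endomorphisms might look like. In the latter case, i.e. if there is some $N \subset M$ realizing $Z$, at least a statement on the sum of all these endomorphisms can be made in the following

Proposition 3.3. If the braiding on ${}_N\mathcal X_N$ is non-degenerate we have the identity
$$ \bigoplus_{a\in{}_N\mathcal X_M} [a\bar a] = \bigoplus_{\lambda,\mu\in{}_N\mathcal X_N} Z_{\lambda,\mu}\,[\lambda\bar\mu] . \tag{6} $$

Proof. The multiplicity of $[\nu]$ on the left-hand side is for all $\nu \in {}_N\mathcal X_N$
$$ \sum_a \langle a\bar a, \nu\rangle = \sum_a \langle a, \nu a\rangle = \mathrm{tr}(G_\nu) = \sum_\rho \frac{S_{\rho,\nu}}{S_{\rho,0}}\,Z_{\rho,\rho} , $$
where we used [12, Theorem 4.16]. The multiplicity of $[\nu]$ on the right-hand side is for all $\nu \in {}_N\mathcal X_N$
$$ \sum_{\lambda,\mu} Z_{\lambda,\mu}\langle\lambda\bar\mu, \nu\rangle = \sum_{\lambda,\mu} Z_{\lambda,\mu}N^\lambda_{\nu,\mu} = \sum_{\lambda,\mu,\rho} \frac{S_{\rho,\nu}}{S_{\rho,0}}\,S_{\rho,\mu}\,Z_{\lambda,\mu}\,S^*_{\rho,\lambda} = \sum_\rho \frac{S_{\rho,\nu}}{S_{\rho,0}}\,Z_{\rho,\rho} , $$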
where we used the Verlinde formula and modular invariance.

3.3. Towards a general formula for [θ]

Looking at a couple of examples, it seems that a "physical invariant" $Z$, which can be realized from some subfactor, can in fact be realized with a dual canonical endomorphism given by something like
$$ [\theta] = \bigoplus_\lambda Z_{\lambda,\bar\lambda}\,[\lambda] . \tag{7} $$
In general the sum runs over a subset of ${}_N\mathcal X_N$ related to Frobenius-Schur indicators and conformal dimensions. Let us consider some examples. For $\mathbf Z_n$ conformal field theories with $n$ odd, this works perfectly. In this situation, there are $n$ sectors, labelled by $\lambda = 0, 1, 2, \dots, n-1$ (mod $n$), obeying $\mathbf Z_n$ fusion rules, and conformal dimensions of the form $h_\lambda = a\lambda^2/2n$ (mod 1), where $a$ is an integer mod $2n$, $a$ and $n$ coprime and $a$ is even whenever $n$ is odd. The modular invariants of such models have been classified [15]. They are labelled (with notation as in [9, 10]) by the divisors $\delta$ of $\tilde n$, where $\tilde n = n$ if $n$ is odd and $\tilde n = n/2$ if $n$ is even. Let us take $n$ odd. Then it is not hard to show that $Z^{(\delta)}_{\lambda,\bar\lambda} = Z^{(n/\delta)}_{\lambda,\lambda}$. Thus by [9, Eq. (8.1)] we find $Z^{(\delta)}_{\lambda,\bar\lambda} = 1$ for $\lambda = 0$ mod $\delta$ and $Z^{(\delta)}_{\lambda,\bar\lambda} = 0$ otherwise, and in fact $[\theta] = \bigoplus_{j=0}^{n/\delta-1}[\rho_{j\delta}]$ realizes $Z^{(\delta)}$, see [9]. (By the way: Since $\langle\alpha^+_j\alpha^-_{-j},\theta\rangle = \langle\alpha^+_j\alpha^+_{-j},\theta\rangle = 1$ it is easy to see that $[\theta] = \sum_{j=0}^{\tilde n/\delta-1}[\alpha^+_j\alpha^-_{-j}]$ for all $\mathbf Z_n$ theories, no matter whether $n$ is even or odd.) Note that for the conjugation invariant $C$ we would usually insert all morphisms in the $[\theta]$. This does not work for the $\mathbf Z_n$ CFT's with $n$ even because we must use
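the even labels only [9]. So for some reason the odd labels have to be ruled out. (Moreover, if $n$ is a multiple of 4 we do not want to see the self-conjugate even label $n/2$ in the dual canonical endomorphism realizing the trivial invariant.) A similar thing happens for $SU(2)_k$. Here $Z_{\lambda,\lambda} = Z_{\lambda,\bar\lambda}$, but if we restrict the sum to even spins then we can in fact realize each A-D-E invariant by Eq. (7). Let us start with the subfactors used to produce the A-D-E modular invariants in [6, 11, 12], i.e. the corresponding dual canonical endomorphisms $[\theta]$ are given by:

$A_\ell$, $\ell = k+1$: $[\lambda_0]$
$D_\ell$, $k = 2\ell-4$: $[\lambda_0] \oplus [\lambda_k]$
$E_6$, $k = 10$: $[\lambda_0] \oplus [\lambda_6]$
$E_7$, $k = 16$: $[\lambda_0] \oplus [\lambda_8] \oplus [\lambda_{16}]$
$E_8$, $k = 28$: $[\lambda_0] \oplus [\lambda_{10}] \oplus [\lambda_{18}] \oplus [\lambda_{28}]$.

Now we choose the following $M$-$N$ morphisms $[\bar a]$: For $A_\ell$ (where $\iota$ is trivial) we choose $\iota\lambda_{[k/2]} \equiv \lambda_{[k/2]}$. Here $[x]$ denotes the greatest integer less than or equal to $x$. For $D_\ell$ we choose $\iota\lambda_{[\ell/2]-1}$. For $E_6$ we choose $\sigma\iota$ with $\sigma$ the marked vertex with statistical dimension $\sqrt 2$. For $E_7$ we choose the morphism denoted by $\bar b'$ in [12, Fig. 41]. For $E_8$ we choose $\alpha_6^{(1)}\iota$ with $\alpha_6^{(1)}$ the neutral or marked vertex as in [6, Fig. 8]. It is now straightforward to compute the sectors $[a\bar a]$ which will be our new $[\theta]$'s. For example, for $D_\ell$ we compute $[\lambda_{[\ell/2]-1}\bar\iota\iota\lambda_{[\ell/2]-1}] = [\lambda_{[\ell/2]-1}]^2([\lambda_0] \oplus [\lambda_k])$. For $E_6$ we compute $[\bar\iota\sigma\sigma\iota] = [\bar\iota]([\alpha_0] \oplus [\alpha_{10}])[\iota] = ([\lambda_0] \oplus [\lambda_{10}])([\lambda_0] \oplus [\lambda_6])$. Only for $E_7$ do we need to sit down a bit, using $[\bar b'] = [\iota\lambda_2] \oplus [\iota\lambda_4] \ominus [\iota\lambda_6]$. This gives:

$A_\ell$, $\ell = k+1$: $[\lambda_0] \oplus [\lambda_2] \oplus [\lambda_4] \oplus \cdots \oplus [\lambda_{2[k/2]}]$
$D_{2\varrho}$, $k = 4\varrho-4$: $[\lambda_0] \oplus [\lambda_2] \oplus \cdots \oplus [\lambda_{2\varrho-4}] \oplus 2[\lambda_{2\varrho-2}] \oplus [\lambda_{2\varrho}] \oplus \cdots \oplus [\lambda_k]$
$D_{2\varrho+1}$, $k = 4\varrho-2$: $[\lambda_0] \oplus [\lambda_2] \oplus [\lambda_4] \oplus \cdots \oplus [\lambda_k]$
$E_6$, $k = 10$: $[\lambda_0] \oplus [\lambda_4] \oplus [\lambda_6] \oplus [\lambda_{10}]$
$E_7$, $k = 16$: $[\lambda_0] \oplus [\lambda_4] \oplus [\lambda_6] \oplus [\lambda_8] \oplus [\lambda_{10}] \oplus [\lambda_{12}] \oplus [\lambda_{16}]$
$E_8$, $k = 28$: $[\lambda_0] \oplus [\lambda_6] \oplus [\lambda_{10}] \oplus [\lambda_{12}] \oplus [\lambda_{16}] \oplus [\lambda_{18}] \oplus [\lambda_{22}] \oplus [\lambda_{28}]$.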
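Several entries of this list can be reproduced with integer arithmetic in the $SU(2)_k$ fusion ring. The sketch below is an illustration only: it assumes the usual truncated Clebsch-Gordan rule for $SU(2)_k$ (labels are twice the spin), represents a ring element as a list of integer coefficients so that the formal difference in $[\bar b']$ can be handled, and uses hypothetical helper names.

```python
def fuse(x, y, k):
    """Product of two elements of the SU(2)_k fusion ring.

    Elements are lists of integer coefficients indexed by the label 0..k.
    lambda_a * lambda_b is the sum of lambda_c for c = |a-b|, |a-b|+2, ...,
    up to min(a+b, 2k-a-b).
    """
    out = [0] * (k + 1)
    for a, ca in enumerate(x):
        for b, cb in enumerate(y):
            if ca and cb:
                for c in range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2):
                    out[c] += ca * cb
    return out

def element(k, coefficients):
    """Ring element from a {label: coefficient} dictionary."""
    v = [0] * (k + 1)
    for label, coefficient in coefficients.items():
        v[label] = coefficient
    return v

def support(v):
    """Non-zero coefficients of a ring element as a {label: coefficient} dictionary."""
    return {label: c for label, c in enumerate(v) if c}

# A_l at k = 5: [lambda_2]^2, with 2 = [k/2].
theta_A = fuse(element(5, {2: 1}), element(5, {2: 1}), 5)

# D_10 at k = 16: [lambda_4]^2 ([lambda_0] + [lambda_16]), with 4 = [l/2] - 1.
lam4 = element(16, {4: 1})
theta_D10 = fuse(fuse(lam4, lam4, 16), element(16, {0: 1, 16: 1}), 16)

# E_6 at k = 10: ([lambda_0] + [lambda_10]) ([lambda_0] + [lambda_6]).
theta_E6 = fuse(element(10, {0: 1, 10: 1}), element(10, {0: 1, 6: 1}), 10)

# E_7 at k = 16: ([lambda_2] + [lambda_4] - [lambda_6])^2 ([lambda_0] + [lambda_8] + [lambda_16]).
x = element(16, {2: 1, 4: 1, 6: -1})
theta_E7 = fuse(fuse(x, x, 16), element(16, {0: 1, 8: 1, 16: 1}), 16)

for name, theta in [("A (k=5)", theta_A), ("D10", theta_D10),
                    ("E6", theta_E6), ("E7", theta_E7)]:
    print(name, support(theta))
```

Although the intermediate element for $E_7$ has negative coefficients, they cancel in the final product, which is why the formal difference in $[\bar b']$ still produces a genuine sector.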
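So here we indeed find exactly the even spins of the diagonal. (Note that the $[\theta]$'s for $A$ and $D_{\mathrm{odd}}$ are the same (at levels $k = 6, 10, 14, \dots$). Thus these are examples for subfactors producing different $Z$'s but having the same dual canonical endomorphism sector.) It is interesting to note what the canonical endomorphism looks like in these possibly natural subfactors:

$A_\ell$, $\ell = k+1$: $[\alpha_0] \oplus [\alpha_2] \oplus [\alpha_4] \oplus \cdots \oplus [\alpha_{2[k/2]}]$
$D_{2\varrho}$, $k = 4\varrho-4$: $[\alpha_0] \oplus [\alpha_2] \oplus \cdots \oplus [\alpha_{2\varrho-4}] \oplus [\alpha^{(1)}_{2\varrho-2}] \oplus [\alpha^{(2)}_{2\varrho-2}] \oplus [\epsilon] \oplus [\beta_2] \oplus [\beta_4] \oplus \cdots \oplus [\beta_{2\varrho-4}] \oplus [\eta] \oplus [\eta']$
$D_{2\varrho+1}$, $k = 4\varrho-2$: $[\alpha_0] \oplus [\alpha_2] \oplus [\alpha_4] \oplus \cdots \oplus [\alpha_k]$
$E_6$, $k = 10$: $[\alpha_0] \oplus [\alpha_{10}] \oplus [\delta] \oplus [\delta']$
$E_7$, $k = 16$: $[\alpha_0] \oplus [\eta] \oplus [\delta] \oplus [\alpha^+_3\alpha^-_1] \oplus [\alpha^+_6] \oplus [\alpha^+_4] \oplus ([\alpha^+_5\alpha^-_1] \ominus [\alpha^+_3\alpha^-_1])$
$E_8$, $k = 28$: $[\alpha_0] \oplus [\alpha^{(1)}_6] \oplus [\delta] \oplus [X] \oplus [\omega] \oplus [\varpi] \oplus [\eta] \oplus [\eta']$.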
For $D_{2\varrho}$, $D_{2\varrho+1}$, $E_6$, $E_7$ and $E_8$ we have used the notation of [12, Fig. 9], [12, Fig. 40], [7, Fig. 2], [12, Fig. 42] and [7, Fig. 5] respectively. At least in this $SU(2)$ setting, there is a fusion rule symmetry on ${}_M\mathcal X_M$ obtained by interchanging $\alpha^+_\lambda$ with $\alpha^-_\lambda$, taking ${}_M\mathcal X_M^+$ to ${}_M\mathcal X_M^-$. In terms of the above figures for $D_{2\varrho}$, $D_{2\varrho+1}$, $E_6$, $E_7$, $E_8$, this is the flip around the vertical through the vacuum. (When we change in the above examples the subfactor $N \subset M$ but retain the same modular invariant, the systems of sectors ${}_M\mathcal X_M^\pm$, ${}_M\mathcal X_M^0$, ${}_M\mathcal X_M$, ${}_M\mathcal X_N$ remain isomorphic to the old ones, so we retain the same figures.) Again for $E_7$ we need to do some work, e.g.
$$ \langle\bar b'b', \alpha^+_i\alpha^-_j\rangle = \langle\iota([\lambda_2] \oplus [\lambda_4] \ominus [\lambda_6])^2\,\bar\iota, \alpha^+_i\alpha^-_j\rangle = \langle([\lambda_2] \oplus [\lambda_4] \ominus [\lambda_6])^2, \bar\iota\alpha^+_i\alpha^-_j\iota\rangle = \langle([\lambda_2] \oplus [\lambda_4] \ominus [\lambda_6])^2, [\lambda_i][\lambda_j]([\lambda_0] \oplus [\lambda_8] \oplus [\lambda_{16}])\rangle . $$
Then the "real" part of the full system consists of the sectors fixed under the flip, i.e. the sectors lying on the vertical through the vacuum. Then the canonical endomorphism is the even part of the "real" part of the full system; presumably these are the ones of Frobenius-Schur indicator one.

For $SU(3)_k$ we have checked for $k = 1$ and $k = 2$ that $Z = C$ is indeed realized by Eq. (7), the sum taken over all $SU(3)_k$ weights. Assuming that the subfactor exists for $k = 3$ it is easy to check it for this case as well. Similarly it is easy to check for $k \le 3$ that $[\theta]$ given as the sum over all self-conjugate sectors indeed produces $Z = 1$, since the $[\theta]$ is just the square of the only non-trivial self-conjugate sector $[\lambda_{(2,1)}]$. Squaring larger $[\lambda]$'s instead, this procedure should also work at any higher level $k$.

4. A Closer Look at the M-M System

Here we discuss the structure of the entire $M$-$M$ system. We will only consider proper modular invariants here, i.e. we assume the $N$-$N$ system is non-degenerate. First some observations.

4.1. Some remarks on products of modular invariants

A modular invariant from a subfactor is of the form
$$ Z_{\lambda,\mu} = \sum_{\tau\in{}_M\mathcal X_M^0} b^+_{\tau,\lambda}\,b^-_{\tau,\mu} . $$
Now let us consider the fusion graph of $\alpha^+_\lambda$ in the entire system ${}_M\mathcal X_M$. (We will here consider the non-degenerate case only.) We know since [12] that the multiplicity of
the eigenvalue $S_{\lambda,\rho}/S_{0,\rho}$ is given by $\sum_\mu Z_{\rho,\mu}^2 = (ZZ^*)_{\rho,\rho}$, and that this exhausts the spectrum. Since this graph contains the chiral graph as a subgraph, we must have $(ZZ^*)_{\rho,\rho} \ge Z^+_{\rho,\rho}$, where $Z^+$ denotes the type I parent, $Z^+_{\lambda,\mu} = \sum_\tau b^+_{\tau,\lambda} b^+_{\tau,\mu}$. And indeed, we can compute quite generally
$$(ZZ^*)_{\lambda,\mu} = \sum_\nu Z_{\lambda,\nu} Z_{\mu,\nu} = \sum_\nu \sum_{\tau,\tau'} b^+_{\tau,\lambda} b^-_{\tau,\nu} b^+_{\tau',\mu} b^-_{\tau',\nu} \ge \sum_\nu \sum_\tau b^+_{\tau,\lambda} b^-_{\tau,\nu} b^+_{\tau,\mu} b^-_{\tau,\nu} \ge \sum_\tau b^+_{\tau,\lambda} b^+_{\tau,\mu} = Z^+_{\lambda,\mu} ,$$
where we used that for each $\tau \in {}_M\mathcal{X}_M^0$ there is a $\nu$ such that $b^-_{\tau,\nu} \ge 1$. (And of course we obtain similarly $(Z^*Z)_{\lambda,\mu} \ge Z^-_{\lambda,\mu}$, etc.) Note that $ZZ^* - Z^+$ must be modular invariant, and looking at the above calculation we see that it is even non-negative. So what about normalization? We distinguish two cases:

(i) $Z$ is a pure permutation. Then in fact $ZZ^* = Z^+ = 1$.

(ii) Otherwise there is a $\lambda \neq 0$ with $Z_{0,\lambda} \neq 0$, and consequently $(ZZ^*)_{0,0} = \sum_\lambda Z_{0,\lambda}^2 > 1$. If this gives exactly 2 then we know that $ZZ^* - Z^+$ is another normalized integral modular invariant, but if it is larger then it is not clear whether $ZZ^* - Z^+$ can always be written as a positive integer linear combination of normalized integral modular invariants.

It is clear that if we always obtain a positive integer linear combination of normalized integral modular invariants, then the number of such invariants (counting multiplicities) will be $(ZZ^*)_{0,0} = \sum_\lambda Z_{0,\lambda}^2 = \langle \theta^+, \theta^+ \rangle$. Each invariant is expected to correspond to a component of the full fusion graph, so we expect $\sum_\lambda Z_{0,\lambda}^2$ components. (By components we mean here a connected component of the fusion graph of a generator $\alpha^+_f$. Equivalently one can decompose the sum of the full fusion matrices of all chiral sectors into irreducible components.) That at least this numbering for the connected components is indeed correct is shown in the following subsection.
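Case (ii) can be made concrete with the $E_7$ invariant of $SU(2)_{16}$. The following is only an illustrative sketch: it assumes the standard character expressions for $Z_{E_7}$ and $Z_{D_{10}}$, takes $Z^+ = Z_{D_{10}}$ as the type I parent, and uses helper names (`add_block`, `matmul`, and so on) that are ours rather than notation from the text. It checks that $(ZZ^*)_{0,0} = 2$ and that $ZZ^* - Z^+$ is entrywise non-negative and equal to $Z_{E_7}$ itself.

```python
# Coupling matrices for SU(2) at level 16, indexed by the weights 0..16.
K = 16
N = K + 1

def zero():
    return [[0] * N for _ in range(N)]

def add_block(Z, left, right, mult=1):
    """Add mult * (sum_{a in left} chi_a) (sum_{b in right} chi_b)^* to Z."""
    for a in left:
        for b in right:
            Z[a][b] += mult

def z_d10():
    # Assumed form: sum over j = 0, 2, 4, 6 of |chi_j + chi_{16-j}|^2, plus 2 |chi_8|^2
    Z = zero()
    for j in (0, 2, 4, 6):
        add_block(Z, (j, K - j), (j, K - j))
    add_block(Z, (8,), (8,), 2)
    return Z

def z_e7():
    # Assumed form: |chi_0+chi_16|^2 + |chi_4+chi_12|^2 + |chi_6+chi_10|^2
    #               + |chi_8|^2 + (chi_2+chi_14) chi_8^* + chi_8 (chi_2+chi_14)^*
    Z = zero()
    for j in (0, 4, 6):
        add_block(Z, (j, K - j), (j, K - j))
    add_block(Z, (8,), (8,))
    add_block(Z, (2, 14), (8,))
    add_block(Z, (8,), (2, 14))
    return Z

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(A):
    return [list(col) for col in zip(*A)]

Z = z_e7()
Z_plus = z_d10()                      # assumed type I parent of E7
ZZt = matmul(Z, transpose(Z))         # Z Z^*
diff = [[ZZt[a][b] - Z_plus[a][b] for b in range(N)] for a in range(N)]

vacuum_norm = sum(Z[0][lam] ** 2 for lam in range(N))
nonnegative = all(x >= 0 for row in diff for x in row)

print("(ZZ*)_00 =", ZZt[0][0], " sum_lambda Z_{0,lambda}^2 =", vacuum_norm)
print("ZZ* - Z+ non-negative:", nonnegative, " equals Z_E7:", diff == Z)
```

Here the vacuum entry of $ZZ^*$ is exactly 2, so the difference $ZZ^* - Z^+$ is a single normalized invariant, in line with the expectation of two components.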
4.2. On the geometry of the M-M system

Let us recall that the M-M system has subsystems ${}_M\mathcal{X}_M \supset {}_M\mathcal{X}_M^\pm \supset {}_M\mathcal{X}_M^0$. Under the action (fusion) of a chiral system, say ${}_M\mathcal{X}_M^+$, the ${}_M\mathcal{X}_M$ system decomposes into ${}_M\mathcal{X}_M^+$ orbits. These correspond to the connected components of the fusion graph of a generator of ${}_M\mathcal{X}_M^+$ in ${}_M\mathcal{X}_M$. We may draw such a graph using straight lines, and using dotted lines as the graph arising from the corresponding generator of ${}_M\mathcal{X}_M^-$, as in [7, Figs. 2, 5, 8, 9] or [12, Figs. 40, 42, 43]. For the $E_8$ example we find 4 ${}_M\mathcal{X}_M^+$ orbits which are precisely the 4 straight-lined $E_8$ "layers" in [7, Fig. 5]. How many such layers do we usually have? A first answer is this:

Lemma 4.1. The number of ${}_M\mathcal{X}_M^\pm$ orbits in ${}_M\mathcal{X}_M$ is equal to the number of ${}_M\mathcal{X}_M^0$ orbits in ${}_M\mathcal{X}_M^\mp$. In fact, all ${}_M\mathcal{X}_M^\pm$ orbits in ${}_M\mathcal{X}_M$ intersect with the subset ${}_M\mathcal{X}_M^\mp \subset {}_M\mathcal{X}_M$, and the intersections are precisely the ${}_M\mathcal{X}_M^0$ orbits in ${}_M\mathcal{X}_M^\mp$.

Proof. Consider the identity component $\Gamma^+_{(0)}$ of the fusion graph of ${}_M\mathcal{X}_M^+$ in ${}_M\mathcal{X}_M$ (which is essentially ${}_M\mathcal{X}_M^+$ itself). Since ${}_M\mathcal{X}_M^+$ and ${}_M\mathcal{X}_M^-$ generate ${}_M\mathcal{X}_M$, each
connected component $\Gamma^-_{(j)}$ of the fusion graph of ${}_M\mathcal{X}_M^-$ in ${}_M\mathcal{X}_M$ must touch $\Gamma^+_{(0)}$ somewhere. (E.g. the identity component $\Gamma^-_{(0)}$ meets $\Gamma^+_{(0)}$ exactly on the ambichiral vertices.) Hence the number of ${}_M\mathcal{X}_M^-$ orbits in ${}_M\mathcal{X}_M$ is equal to the number of groups of vertices on $\Gamma^+_{(0)}$ lying on the same component $\Gamma^-_{(j)}$. Two vertices on $\Gamma^+_{(0)}$ corresponding to sectors $\beta_1, \beta_2 \in {}_M\mathcal{X}_M^+$ lie on the same component $\Gamma^-_{(j)}$ if and only if there is a $\beta \in {}_M\mathcal{X}_M^-$ such that $\langle \beta_1 \beta, \beta_2 \rangle \neq 0$. But $\langle \beta, \bar\beta_1 \beta_2 \rangle \neq 0$ means that $\beta$ is ambichiral. Hence two vertices on $\Gamma^+_{(0)}$ corresponding to sectors $\beta_1, \beta_2 \in {}_M\mathcal{X}_M^+$ lie on the same component $\Gamma^-_{(j)}$ if and only if they are in the same ambichiral orbit. The proof is completed by exchanging $+$ and $-$ signs.
A more concrete answer is now obtained in the following

Lemma 4.2. The number of ${}_M\mathcal{X}_M^0$ orbits in ${}_M\mathcal{X}_M^\pm$ is given by $\sum_\lambda (b^\pm_{0,\lambda})^2$.
Proof. Let $\Gamma^\pm_{\tau,0}$ be the fusion matrix of $\tau \in {}_M\mathcal{X}_M^0$ in ${}_M\mathcal{X}_M^\pm$, as in [12, Sec. 4]. The sum matrix $Q = \sum_\tau \Gamma^\pm_{\tau,0}$ will not be irreducible as long as we have more than one ${}_M\mathcal{X}_M^0$ fusion orbit (i.e. as long as ${}_M\mathcal{X}_M^\pm \neq {}_M\mathcal{X}_M^0$). In fact $Q$ must decompose into a number of irreducible blocks which is exactly the number of fusion orbits. Nevertheless the vector $\vec d$ with entries $d_\beta$, $\beta \in {}_M\mathcal{X}_M^\pm$, is an eigenvector of $Q$ with eigenvalue $\sum_\tau d_\tau$. Since all the entries are strictly positive, it must be the direct sum of the Perron–Frobenius eigenvectors of each irreducible block (up to a scaling by a positive factor for each block). Thanks to the Perron–Frobenius theorem, the number $\sum_\tau d_\tau$ is thus the (non-degenerate) Perron–Frobenius eigenvalue of each irreducible block. It follows that the number of irreducible components is given by the multiplicity of the eigenvalue $\sum_\tau d_\tau$, i.e. by the multiplicity of $X_0^{\mathrm{ext}}(\tau)$ in $\Gamma^\pm_{\tau,0}$, $\tau \in {}_M\mathcal{X}_M^0$. By the diagonalization of the $\Gamma^\pm_{\tau,0}$'s derived in [12, Theorem 4.16], we know that this multiplicity is exactly $\sum_\lambda (b^\pm_{0,\lambda})^2$.
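The counting argument in this proof can be illustrated on a toy model. The sketch below is not taken from the paper: it assumes a hypothetical ambichiral system with $\mathbb{Z}_3$ fusion rules (three sectors of dimension 1) acting on four sectors, three of which are permuted cyclically while the fourth is fixed. It builds $Q$ as the sum of the three permutation matrices and compares the number of irreducible blocks with the multiplicity of the eigenvalue $\sum_\tau d_\tau = 3$, computed exactly with rational arithmetic.

```python
from fractions import Fraction

def perm_matrix(images):
    """Permutation matrix sending basis vector j to basis vector images[j]."""
    n = len(images)
    return [[1 if images[j] == i else 0 for j in range(n)] for i in range(n)]

def mat_sum(mats):
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]

def count_blocks(Q):
    """Number of irreducible blocks = connected components of the graph of Q."""
    n = len(Q)
    seen = [False] * n
    blocks = 0
    for start in range(n):
        if seen[start]:
            continue
        blocks += 1
        stack = [start]
        seen[start] = True
        while stack:
            i = stack.pop()
            for j in range(n):
                if not seen[j] and (Q[i][j] != 0 or Q[j][i] != 0):
                    seen[j] = True
                    stack.append(j)
    return blocks

def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

# Hypothetical fusion matrices of the three Z_3 sectors on four sectors:
# sectors 0, 1, 2 form one orbit, sector 3 is a fixed point.
gammas = [perm_matrix([0, 1, 2, 3]),
          perm_matrix([1, 2, 0, 3]),
          perm_matrix([2, 0, 1, 3])]
Q = mat_sum(gammas)

pf_eigenvalue = 3                       # sum of the dimensions d_tau = 1 + 1 + 1
d = [1, 1, 1, 2]                        # a strictly positive test vector
Qd = [sum(Q[i][j] * d[j] for j in range(4)) for i in range(4)]

shifted = [[Q[i][j] - (pf_eigenvalue if i == j else 0) for j in range(4)]
           for i in range(4)]
multiplicity = 4 - rank(shifted)        # dimension of the eigenspace

print("blocks:", count_blocks(Q), " multiplicity of eigenvalue 3:", multiplicity)
```

Because $Q$ is symmetric in this toy model, the dimension of the eigenspace agrees with the algebraic multiplicity, and both equal the number of orbits.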
Note that $\sum_\lambda (b^+_{0,\lambda})^2 = \sum_\lambda Z_{\lambda,0}^2$ and $\sum_\lambda (b^-_{0,\lambda})^2 = \sum_\lambda Z_{0,\lambda}^2$. In fact, the consideration of the ${}_M\mathcal{X}_M^0$ orbits in ${}_M\mathcal{X}_M^\pm$ was instructive, but not really necessary to get the number of ${}_M\mathcal{X}_M^\pm$ orbits in ${}_M\mathcal{X}_M$. Thanks to the generating property, we could also have determined the number of ${}_N\mathcal{X}_N$ orbits in ${}_M\mathcal{X}_M$ via the induced $[\alpha^+_\lambda]$ and $[\alpha^-_\lambda]$. Then the statement of [12, Theorem 4.14] would similarly determine the multiplicity of the Perron–Frobenius eigenvalues $X_0(\lambda)$ as $\sum_\mu Z_{0,\mu}^2$ and $\sum_\mu Z_{\mu,0}^2$, respectively.

5. Examples

Suppose $N \subset M$ is a braided subfactor with $\iota : N \hookrightarrow M$ being the injection map and basic construction $N \subset M \subset M_1$. Thus if $\iota_1 : M \hookrightarrow M_1$ is the corresponding inclusion, then by naturality the sector $[\bar\iota_1 \iota_1]$ of the dual canonical endomorphism for $M \subset M_1$ is identified with the sector of the canonical endomorphism for $N \subset M$, i.e. $[\iota\bar\iota]$. Hence the sector of the dual canonical endomorphism $\theta_1$ for $N \subset M_1$ is $[\bar\iota \bar\iota_1 \iota_1 \iota] = [\bar\iota \iota \bar\iota \iota] = [\theta^2]$, which lies in $\Sigma({}_N\mathcal{X}_N)$ as $\theta$ does. In particular, if ${}_N\mathcal{X}_N$ is
braided, we can certainly apply $\alpha$-induction to the inclusion $N \subset M_1$. Note that in this context the inclusion $N \subset M_1$ rarely satisfies chiral locality, by [5, Corollary 3.6]. We have the naturality equations for $\alpha$-induced morphisms
$$x\, \varepsilon^\pm(\rho, \lambda) = \varepsilon^\pm(\rho, \mu)\, \alpha^\pm_\rho(x)$$
whenever $x \in \mathrm{Hom}(\iota\lambda, \iota\mu)$ and $\rho \in \Sigma({}_N\mathcal{X}_N)$, see e.g. [8, Eq. (9)]. In particular, inducing from $N$ to $M_1$ and taking $\lambda = \mu = \mathrm{id}$ and $x \in \mathrm{Hom}(\iota_1\iota, \iota_1\iota)$, we have $\alpha_\rho(x) = x$ on $N' \cap M_1$ for all $\rho$. We will look again at the $SU(3)$ and $SU(2)$ situations in detail in this basic construction.

5.1. SU(2)-invariants

By the A-D-E classification [14], we know that there are at most three invariants for each level, labelled by Dynkin diagrams. They satisfy the following fusion rules:
$$Z_{D_{2\varrho}}^2 = 2 Z_{D_{2\varrho}}, \quad Z_{D_{2\varrho+1}}^2 = Z_{A_{4\varrho-1}}, \quad Z_{E_6}^2 = 2 Z_{E_6}, \quad Z_{E_7}^2 = Z_{D_{10}} + Z_{E_7}, \quad Z_{E_8}^2 = 4 Z_{E_8} .$$
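These products can be checked by direct matrix multiplication. The following sketch assumes the standard character expressions for the A-D-E invariants of $SU(2)_k$ (with $D_{2\varrho}$ at level $k = 4\varrho - 4$ and $D_{2\varrho+1}$ at level $k = 4\varrho - 2$); the function names are ours and are not notation from the text.

```python
def zeros(n):
    return [[0] * n for _ in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def from_blocks(k, blocks):
    """Coupling matrix of sum_i mult_i * |sum_{a in block_i} chi_a|^2 at level k."""
    Z = zeros(k + 1)
    for mult, block in blocks:
        for a in block:
            for b in block:
                Z[a][b] += mult
    return Z

def z_a(k):
    # Diagonal invariant A_{k+1}
    return from_blocks(k, [(1, (j,)) for j in range(k + 1)])

def z_d_even(rho):
    # D_{2 rho} at level k = 4 rho - 4 (assumed standard form)
    k = 4 * rho - 4
    blocks = [(1, (j, k - j)) for j in range(0, k // 2, 2)]
    blocks.append((2, (k // 2,)))
    return from_blocks(k, blocks)

def z_d_odd(rho):
    # D_{2 rho + 1} at level k = 4 rho - 2: even weights are fixed,
    # odd weights j are paired with k - j (assumed standard form)
    k = 4 * rho - 2
    Z = zeros(k + 1)
    for j in range(k + 1):
        Z[j][j if j % 2 == 0 else k - j] = 1
    return Z

def z_e6():
    return from_blocks(10, [(1, (0, 6)), (1, (4, 10)), (1, (3, 7))])

def z_e7():
    Z = from_blocks(16, [(1, (0, 16)), (1, (4, 12)), (1, (6, 10)), (1, (8,))])
    for a in (2, 14):
        Z[a][8] += 1
        Z[8][a] += 1
    return Z

def z_e8():
    return from_blocks(28, [(1, (0, 10, 18, 28)), (1, (6, 12, 16, 22))])

checks = {}
for rho in (2, 3, 4, 5):
    D = z_d_even(rho)
    checks["D_%d" % (2 * rho)] = matmul(D, D) == scale(2, D)
    Dodd = z_d_odd(rho)
    checks["D_%d" % (2 * rho + 1)] = matmul(Dodd, Dodd) == z_a(4 * rho - 2)
checks["E_6"] = matmul(z_e6(), z_e6()) == scale(2, z_e6())
checks["E_7"] = matmul(z_e7(), z_e7()) == add(z_d_even(5), z_e7())
checks["E_8"] = matmul(z_e8(), z_e8()) == scale(4, z_e8())

for name, ok in checks.items():
    print(name, ok)
```

The $D$ series is only sampled for a few values of $\varrho$ here; the block structure of the matrices makes the general case transparent.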
(i) Example: $D_j$

We start with $SU(2)$ at even level $k$ and the simple current or orbifold invariants. Here there is a $\mathbb{Z}_2$ extension $N \subset N \rtimes \mathbb{Z}_2$, with $N$ as $\pi^0(L_I SU(2))''$ in the vacuum representation at level $k$, and dual canonical endomorphism $[\lambda_0] \oplus [\lambda_k]$. If $k = 4\ell - 4$ then [6] the extension is local, the corresponding modular invariant is $D_{2\ell}$, and the canonical endomorphism is $\gamma = [\mathrm{id}] \oplus [\alpha^\pm_k]$. If $k = 4\ell - 2$ then [6] the extension is not local, the corresponding modular invariant is $D_{2\ell+1}$, and the canonical endomorphism is $\gamma = [\mathrm{id}] \oplus [\ ]$, where the second summand is an irreducible subsector of $[\alpha^+_1 \alpha^-_1]$. In either case, the basic construction is by Takesaki duality:
$$N \subset N \rtimes \mathbb{Z}_2 \subset N \rtimes \mathbb{Z}_2 \rtimes \hat{\mathbb{Z}}_2 = N \otimes \mathrm{Mat}_2 .$$
Thus by the above naturality, $\alpha_\lambda = \lambda \otimes \mathrm{id}$, as here $N' \cap M_1 = \mathrm{Mat}_2$, the $2 \times 2$ complex matrices. Thus ${}_N\mathcal{X}_N$ is identified with ${}_{M_1}\mathcal{X}_{M_1}$, and we do not appear to have anything interesting. To see the finer structure, we need to look closer at the dual canonical endomorphism $[\theta_1]$, which decomposes in the local case $k = 4\ell - 4$ into $[\bar\iota\iota]$ and $[\bar\iota \alpha^\pm_k \iota]$. Both are dual canonical endomorphisms in their own right. The first can be thought of as giving the sheet of $D_j$ in the full ${}_M\mathcal{X}_M$ system starting at $[\mathrm{id}_M]$, and the second as giving the other sheet in the full ${}_M\mathcal{X}_M$ system. All this becomes clearer in the type I conformal embedding modular invariants.

(ii) Example: $E_6$, $SU(2)_{10} \subset SO(5)_1$

We now consider the $E_6$ modular invariant for $SU(2)$:
$$Z_{E_6} = |\chi_0 + \chi_6|^2 + |\chi_4 + \chi_{10}|^2 + |\chi_3 + \chi_7|^2 .$$
This is exhibited by the conformal embedding $SU(2)_{10} \subset SO(5)_1$. Here the dual canonical endomorphism $\theta$ is given by the vacuum sector $[\theta] = [\lambda_0] \oplus [\lambda_6]$, and the corresponding canonical endomorphism was computed in [7] as $[\gamma] = [\mathrm{id}] \oplus [\alpha^+_1 \alpha^-_1]$. Then for the corresponding basic construction $N \subset M \subset M_1$ we have
$$[\theta_1] = [\bar\iota \bar\iota_1 \iota_1 \iota] = [\bar\iota\iota] \oplus [\bar\iota \alpha^-_1 \alpha^-_1 \iota] .$$
This time the dual canonical endomorphism $[\bar\iota\iota]$ gives the first sheet of the full ${}_M\mathcal{X}_M$ system, whilst the second term gives the second sheet of the full system, where the sector $[\alpha^+_1 \alpha^-_1]$ in the full system is identified with the N-M sector $[\alpha^-_1 \iota]$ using the argument of changing the $\iota$ vertex.

(iii) Example: $E_8$, $SU(2)_{28} \subset (G_2)_1$

Next let us revisit the $E_8$ modular invariant at level $k = 28$:
$$Z_{E_8} = |\chi_0 + \chi_{10} + \chi_{18} + \chi_{28}|^2 + |\chi_6 + \chi_{12} + \chi_{16} + \chi_{22}|^2 .$$
This is exhibited by the conformal embedding $SU(2)_{28} \subset (G_2)_1$. The dual canonical endomorphism is again given by the vacuum sector
$$[\theta] = [\lambda_0] \oplus [\lambda_{10}] \oplus [\lambda_{18}] \oplus [\lambda_{28}] .$$
The canonical endomorphism was computed in [7] as
$$[\gamma] = [\mathrm{id}_M] \oplus [\alpha^+_1 \alpha^-_1] \oplus [\alpha^+_2 \alpha^-_2] \oplus [\eta] ,$$
where $[\eta]$ was described as an irreducible subsector of $[\alpha^+_3 \alpha^-_3]$. However, by comparing [6, Fig. 8] with [7, Fig. 5], and with the above experience for $E_6$, we suspect that $[\eta]$ can be identified with $[\alpha_5^{+(2)} \alpha_5^{-(2)}]$, where $[\alpha^\pm_5] = [\alpha_5^{\pm(1)}] \oplus [\alpha_5^{\pm(2)}]$ and $[\alpha^\pm_7] = [\alpha^\pm_3] \oplus [\alpha_5^{\pm(1)}]$. We can compute
$$\langle \gamma, \alpha_5^{+(2)} \alpha_5^{-(2)} \rangle = \langle \iota\bar\iota, \alpha_5^{+(2)} \alpha_5^{-(2)} \rangle = \langle \mathrm{id}, \bar\iota \alpha_5^{+(2)} \alpha_5^{-(2)} \iota \rangle = \langle \mathrm{id}, \bar\iota [\alpha^+_5 \oplus \alpha^+_3 \ominus \alpha^+_7][\alpha^-_5 \oplus \alpha^-_3 \ominus \alpha^-_7] \iota \rangle = \langle \mathrm{id}, \bar\iota\iota ([\lambda_5] \oplus [\lambda_3] \ominus [\lambda_7])^2 \rangle = 1$$
using the Verlinde fusion rules for $SU(2)$ at level 28. We can similarly show that $[\alpha_5^{+(2)} \alpha_5^{-(2)}]$ is irreducible and disjoint from $[\mathrm{id}_M]$, $[\alpha^+_1 \alpha^-_1]$ and $[\alpha^+_2 \alpha^-_2]$. Hence $[\eta] = [\alpha_5^{+(2)} \alpha_5^{-(2)}]$.

There are four sheets in the full system ${}_M\mathcal{X}_M$, all copies of $E_8$. The four terms in $\gamma$ give rise to the four sheets in the full system with vertices $[\mathrm{id}_M]$, $[\alpha^+_1 \alpha^-_1]$, $[\alpha^+_2 \alpha^-_2]$, $[\alpha_5^{+(2)} \alpha_5^{-(2)}]$, identified with base points $\iota$, $\alpha^-_1 \iota$, $\alpha^-_2 \iota$ and $\alpha_5^{-(2)} \iota$ on the N-M graph $E_8$, using again the argument of changing the $\iota$ vertex.
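The last step of the computation above only needs the $SU(2)_{28}$ fusion rules, so it can be reproduced in a few lines. This is a minimal sketch: virtual sectors are represented as dictionaries from weights to integer coefficients, $\bar\iota\iota$ is replaced by $[\theta] = [\lambda_0] \oplus [\lambda_{10}] \oplus [\lambda_{18}] \oplus [\lambda_{28}]$, and the helper names are ours.

```python
K = 28  # level of SU(2)

def fuse(a, b):
    """Weights c appearing in [lambda_a][lambda_b] at level K, each with multiplicity 1."""
    return range(abs(a - b), min(a + b, 2 * K - a - b) + 1, 2)

def multiply(x, y):
    """Product of two virtual sectors given as {weight: integer coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            for c in fuse(a, b):
                out[c] = out.get(c, 0) + ca * cb
    return out

def pairing(x, y):
    """Bilinear pairing <x, y>; all SU(2) sectors are self-conjugate."""
    return sum(coeff * y.get(weight, 0) for weight, coeff in x.items())

theta = {0: 1, 10: 1, 18: 1, 28: 1}     # [theta] for SU(2)_28 in (G_2)_1
v = {5: 1, 3: 1, 7: -1}                 # [lambda_5] + [lambda_3] - [lambda_7]

square = multiply(v, v)
result = pairing(theta, square)
print("coefficients of lambda_0 and lambda_10:", square[0], square[10])
print("<theta, v^2> =", result)
```

The vacuum appears three times in the square and $[\lambda_{10}]$ appears with coefficient $-2$, while $[\lambda_{18}]$ and $[\lambda_{28}]$ do not occur, which gives the value 1 quoted above.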
5.2. SU(3)-invariants

We now move on to the case of $SU(3)$ and its modular invariants.

(i) Example: $\mathcal{E}^{(8)}$, $SU(3)_5 \subset SU(6)_1$

The first conformal embedding invariant is at level 8:
$$Z_{\mathcal{E}^{(8)}} = |\chi_{0,0} + \chi_{4,2}|^2 + |\chi_{2,0} + \chi_{5,3}|^2 + |\chi_{2,2} + \chi_{5,2}|^2 + |\chi_{3,0} + \chi_{3,3}|^2 + |\chi_{3,1} + \chi_{5,5}|^2 + |\chi_{3,2} + \chi_{5,0}|^2 .$$
This can be obtained from the conformal inclusion $SU(3)_5 \subset SU(6)_1$, whose dual canonical endomorphism is given by the vacuum sector $[\theta] = [\lambda_{0,0}] \oplus [\lambda_{4,2}]$, with the canonical endomorphism computed in [7] as $[\gamma] = [\mathrm{id}] \oplus [\alpha^+_{1,0} \alpha^-_{1,1}]$.

(ii) Example: $\mathcal{E}^{(12)}$, $SU(3)_9 \subset (E_6)_1$

This modular invariant is at level 12:
$$Z_{\mathcal{E}^{(12)}} = |\chi_{0,0} + \chi_{9,0} + \chi_{0,9} + \chi_{4,1} + \chi_{1,4} + \chi_{4,4}|^2 + 2 |\chi_{2,2} + \chi_{5,2} + \chi_{2,5}|^2 .$$
It is obtained from the conformal embedding $SU(3)_9 \subset (E_6)_1$, with dual canonical endomorphism given by the vacuum sector:
$$[\theta] = [\lambda_{0,0}] + [\lambda_{9,0}] + [\lambda_{0,9}] + [\lambda_{4,1}] + [\lambda_{1,4}] + [\lambda_{4,4}] . \qquad (8)$$
This modular invariant can also be realized from the dual canonical endomorphism
$$\bigoplus_\lambda Z_{\lambda,\bar\lambda} [\lambda] = [\lambda_{0,0}] + [\lambda_{9,0}] + [\lambda_{0,9}] + [\lambda_{4,1}] + [\lambda_{1,4}] + [\lambda_{4,4}] + 2[\lambda_{2,2}] + 2[\lambda_{5,2}] + 2[\lambda_{2,5}] ,$$
where the sum is over all sectors in ${}_N\mathcal{X}_N$, using [10] the extension $N \subset M \rtimes \mathbb{Z}_3$, as $E_6$ at level 1 has $\mathbb{Z}_3$ fusion rules. Now the canonical endomorphism corresponding to Eq. (8) was computed in [7] as
$$[\gamma] = [\mathrm{id}] \oplus [\alpha^+_{1,0} \alpha^-_{1,1}] \oplus [\alpha^+_{1,1} \alpha^-_{1,0}] \oplus [\alpha^+_{2,0} \alpha^-_{2,2}] \oplus [\alpha^+_{2,2} \alpha^-_{2,0}] \oplus [\alpha^+_{2,1} \alpha^-_{2,1}] .$$
So we expect six sheets in the full M-M system, but this is where a surprise appears. We do not get six copies of the N-M graph $\mathcal{E}_1^{(12)}$. We only get three copies of $\mathcal{E}_1^{(12)}$, located at the three sectors $[\mathrm{id}]$, $[\alpha^+_{1,0} \alpha^-_{1,1}]$, $[\alpha^+_{1,1} \alpha^-_{1,0}]$ in the ${}_M\mathcal{X}_M$ graph, and three copies of the isospectral graph $\mathcal{E}_2^{(12)}$, located at the three sectors $[\alpha^+_{2,0} \alpha^-_{2,2}]$, $[\alpha^+_{2,2} \alpha^-_{2,0}]$, $[\alpha^+_{2,1} \alpha^-_{2,1}]$ in ${}_M\mathcal{X}_M$.

Here we show that for the conformal inclusion $SU(3)_9 \subset (E_6)_1$, for which we have six ${}_M\mathcal{X}_M^+$ orbits in ${}_M\mathcal{X}_M$, we find three copies of the graph $\mathcal{E}_1^{(12)}$ and three copies of $\mathcal{E}_2^{(12)}$. Let us draw the fusion graph of the generator $[\alpha^+_{(1,0)}]$ in ${}_M\mathcal{X}_M$ in blue. (We use the labelling as in [7, Fig. 12].) The vacuum column forces its identity component, i.e. the chiral fusion graph of $[\alpha^+_{(1,0)}]$, to be $\mathcal{E}_1^{(12)}$, see [7]. Now let us think of the fusion graph of $[\alpha^-_{(1,0)}]$ in ${}_M\mathcal{X}_M$ as being red. We now use the fact that $\mathcal{E}_j^{(12)}$, $j = 1, 2, 3$, exhaust the list of isospectral graphs. The connected components of the red graph will correspond to nimreps and hence must be $\mathcal{E}_j^{(12)}$, $j = 1, 2, 3$. (Note that
the modular invariant obeys $Z^*Z = 6Z$, hence we must have six layers.) Which one of the three graphs can touch the vertices of the blue $\mathcal{E}_1^{(12)}$? At the identity vertex this is clearly the other chiral graph, determined by the vacuum row to be (a red) $\mathcal{E}_1^{(12)}$. These two (blue and red) $\mathcal{E}_1^{(12)}$'s intersect exactly on the marked (ambichiral) vertices. The other red "coset" graphs will connect the other ${}_M\mathcal{X}_M^0$ fusion orbits in ${}_M\mathcal{X}_M^+$. Now the ${}_M\mathcal{X}_M^0$ fusion orbits are just the $\mathbb{Z}_3$ symmetry orbits of $\mathcal{E}_1^{(12)}$. Thus we will have six red layers: The first is the already determined $\mathcal{E}_1^{(12)}$ corresponding to the ${}_M\mathcal{X}_M^0$ orbit of $\mathrm{id}$. Then there will be one layer connecting $[\alpha^+_{(1,0)}]$, $[\alpha^{+(1)}_{(3,1)}]$ and $[\alpha^{+(2)}_{(3,1)}]$, similarly one layer connecting $[\alpha^+_{(1,1)}]$, $[\alpha^{+(1)}_{(3,2)}]$ and $[\alpha^{+(2)}_{(3,2)}]$, and finally each of the ${}_M\mathcal{X}_M^0$ fixed points $[\alpha^+_{(2,0)}]$, $[\alpha^+_{(2,1)}]$ and $[\alpha^+_{(2,2)}]$ is connected to one red layer. To determine the red layer which touches $[\alpha^+_{(1,0)}]$, we compute
$$\langle \alpha^+_{(1,0)} \alpha^-_{(1,0)}, \alpha^+_{(1,0)} \alpha^-_{(1,0)} \rangle = \langle \alpha^+_{(1,0)} \alpha^+_{(1,1)}, \alpha^-_{(1,0)} \alpha^-_{(1,1)} \rangle = 1 .$$
Thus $[\alpha^+_{(1,0)}]$ has only one target vertex on the red graph. Hence we must have here either one of the three extremal vertices of $\mathcal{E}_1^{(12)}$ or the unique isolated extremal vertex of $\mathcal{E}_2^{(12)}$. Since $\mathcal{E}_3^{(12)}$ does not have such a vertex, this one is ruled out here. Now note that the target vertices of these extremal vertices themselves have two and four target vertices for $\mathcal{E}_1^{(12)}$ and $\mathcal{E}_2^{(12)}$, respectively. But since
$$\langle \alpha^+_{(1,0)} \alpha^-_{(1,0)} \alpha^-_{(1,0)}, \alpha^+_{(1,0)} \alpha^-_{(1,0)} \alpha^-_{(1,0)} \rangle = 2$$
we conclude that a red $\mathcal{E}_1^{(12)}$ touches $[\alpha^+_{(1,0)}]$. The same is checked for $[\alpha^+_{(1,1)}]$, and it cannot lie on the same red $\mathcal{E}_1^{(12)}$ as $[\alpha^+_{(1,0)}]$ since this would mean that one is the fusion product of the other by an ambichiral sector. Next we check what red graph touches $[\alpha^+_{(2,0)}]$. Since
$$\langle \alpha^+_{(2,0)} \alpha^-_{(1,0)}, \alpha^+_{(2,0)} \alpha^-_{(1,0)} \rangle = 1$$
we must again locate an extremal vertex of $\mathcal{E}_1^{(12)}$ or $\mathcal{E}_2^{(12)}$ here. But now
$$\langle \alpha^+_{(2,0)} \alpha^-_{(1,0)} \alpha^-_{(1,0)}, \alpha^+_{(2,0)} \alpha^-_{(1,0)} \alpha^-_{(1,0)} \rangle = 4$$
forces us to select $\mathcal{E}_2^{(12)}$. (We used $[\alpha^+_{(2,0)}][\alpha^+_{(2,2)}] = [\mathrm{id}] \oplus [\alpha^+_{(2,1)}] \oplus [\alpha^+_{(4,2)}]$.) A similar argument applies to $[\alpha^+_{(2,1)}]$ and $[\alpha^+_{(2,2)}]$. Thus we have indeed found three layers of $\mathcal{E}_1^{(12)}$ and three layers of $\mathcal{E}_2^{(12)}$.

Di Francesco and Zuber actually produced three isospectral graphs $\mathcal{E}_i^{(12)}$, $i = 1, 2, 3$, whose spectrum reproduced the diagonal part of the modular invariant $\mathcal{E}^{(12)}$ ($SU(3)_9 \subset (E_6)_1$), and we realized two of those graphs, $\mathcal{E}_1^{(12)}$ and $\mathcal{E}_2^{(12)}$, in [9]. The third was apparently eliminated by [39]. We certainly know that $\mathcal{E}_3^{(12)}$ does not appear in a "natural" way in the sense that we have some subfactor $N \subset M$ producing $\mathcal{E}_3^{(12)}$ as M-N graph in the following sense: We know that such a subfactor would have intermediate subfactors $N \subset M_+ = M_-$ producing the same invariant $Z_{\mathcal{E}^{(12)}}$ and with $M_+$-N graph $\mathcal{E}_1^{(12)}$. This subfactor could not have the "natural"
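property that the dual canonical endomorphism of $M_+ \subset M$ decomposes exclusively into ambichiral sectors. This is because we know that the only irreducible braided extensions (relative to the ambichiral system) are the trivial one $M_+ \subset M = M_+$ and $M_+ \subset M = M_+ \rtimes \mathbb{Z}_3$, where in turn $N \subset M$ produces $\mathcal{E}_1^{(12)}$ and $\mathcal{E}_2^{(12)}$, respectively [9, 10].

(iii) Example: $\mathcal{E}^{(24)}$, $SU(3)_{21} \subset (E_7)_1$

The corresponding modular invariant reads
$$Z_{\mathcal{E}^{(24)}} = |\chi_{0,0} + \chi_{21,0} + \chi_{21,21} + \chi_{8,4} + \chi_{17,4} + \chi_{17,13} + \chi_{11,1} + \chi_{11,10} + \chi_{20,10} + \chi_{12,6} + \chi_{15,6} + \chi_{15,9}|^2 + |\chi_{6,0} + \chi_{21,6} + \chi_{15,15} + \chi_{15,0} + \chi_{21,15} + \chi_{6,6} + \chi_{11,4} + \chi_{17,7} + \chi_{14,10} + \chi_{11,7} + \chi_{14,4} + \chi_{17,10}|^2 ,$$
therefore
$$[\theta] = [\lambda_{0,0}] \oplus [\lambda_{21,0}] \oplus [\lambda_{21,21}] \oplus [\lambda_{8,4}] \oplus [\lambda_{17,4}] \oplus [\lambda_{17,13}] \oplus [\lambda_{11,1}] \oplus [\lambda_{11,10}] \oplus [\lambda_{20,10}] \oplus [\lambda_{12,6}] \oplus [\lambda_{15,6}] \oplus [\lambda_{15,9}] .$$
Taking the extension $N \subset M \rtimes \mathbb{Z}_2$ [10], as the extended system $E_7$ at level 1 has $\mathbb{Z}_2$ fusion rules, the modular invariant can also be realised from $\bigoplus_\lambda Z_{\lambda,\bar\lambda} [\lambda]$, where the sum is over all sectors in ${}_N\mathcal{X}_N$.

5.3. Towards a pattern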
We have seen that we have exactly $\sum_\lambda Z_{0,\lambda}^2$ (respectively $\sum_\lambda Z_{\lambda,0}^2$) ${}_M\mathcal{X}_M^+$ (respectively ${}_M\mathcal{X}_M^-$) orbits in ${}_M\mathcal{X}_M$. These intersect with ${}_M\mathcal{X}_M^-$ (respectively ${}_M\mathcal{X}_M^+$), i.e. with the ${}_M\mathcal{X}_M^-$ (respectively ${}_M\mathcal{X}_M^+$) orbit containing $[\mathrm{id}]$, precisely on its ${}_M\mathcal{X}_M^0$ orbits. We are interested in the particular shape of the ${}_M\mathcal{X}_M^+$ or ${}_M\mathcal{X}_M^-$ orbits in the full system ${}_M\mathcal{X}_M$. For all examples we know, the products $ZZ^*$ and $Z^*Z$ are integral linear combinations of physical invariants, and the linear combination corresponds precisely to the decomposition of the full system in chiral orbits. Note that each ${}_M\mathcal{X}_M^\pm$ orbit must be a nimrep. As long as we have a one-to-one correspondence between irreducible^a nimreps and diagonals of modular invariants, we find that at least the diagonal part of $ZZ^*$ and $Z^*Z$ can be written as a positive integral linear combination of diagonal parts of modular invariants. Since there are no distinct^b modular invariants known sharing the same diagonal part, this is a strong indication that there is indeed a general rule.

^a We do not mean irreducibility in the usual sense for representations here; this would mean "one-dimensional" since our braided systems ${}_N\mathcal{X}_N$ are commutative. Here we rather mean irreducibility in the sense that the sum of the representation matrices is irreducible (in the sense of [26]).

^b Here we do not worry about the distinction between a sufferable modular invariant and its transpose, which can be obtained from the same subfactor by reversing the braiding.
We can write $Z$ in terms of rectangular branching matrices as $Z = B_+^* B_-$, so that $ZZ^* = B_+^* B_- B_-^* B_+$ and $Z^*Z = B_-^* B_+ B_+^* B_-$. Let us look at the sandwiched $B_\pm B_\pm^*$, which must be invariant under the extended S- and T-matrices thanks to the intertwining rules of [8, Theorem 6.5]. The extended S- and T-matrices have at most permutation invariants. If these invariants in fact span the entire commutant of $S$ and $T$ (which may well be the case in general), then $B_\pm B_\pm^*$ must be a linear combination of these permutations. Unfortunately, it is not clear whether this is always an integral linear combination. It is very instructive to look at some examples. Even type I invariants are interesting here, i.e. when we have $B_+ = B_-$. For instance, for the $D_{10}$ invariant of $SU(2)_{16}$ we have
$$B_+ B_+^* = \begin{pmatrix} 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix} = 2 \cdot \mathbf{1}_4 \oplus \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = \mathbf{1}_6 + t_0 ,$$
where $t_0$ is the transposition matrix which exchanges the two marked vertices $[\alpha_8^{(j)}]$, $j = 1, 2$, on the short legs of $D_{10}$. For the $E_7$ invariant we have $B_- = \Pi B_+$, where the permutation $\Pi$ is either $t_j$, $j = 1, 2$, the transpositions exchanging $[\alpha_8^{(j)}]$ with the marked vertex $[\alpha_2]$, or one of the two cyclic permutations $c_1$, $c_2$. For example, if $\Pi = t_1$, then $B_- B_-^* = \mathbf{1} + t_2$. Next let us consider the $\mathcal{D}^{(12)}$ invariant of $SU(3)_9$. Here we find
$$B_+ B_+^* = 3 \cdot \mathbf{1}_6 \oplus \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} = \mathbf{1}_9 + c_1 + c_2 ,$$
where here $c_1$, $c_2$ denote the two non-trivial cyclic permutations of the three fixed points $[\alpha_{(6,3)}^{(j)}]$, $j = 1, 2, 3$. For simple current invariants with a single full fixed point we probably have a sum over all cyclic permutations of the fixed point constituents in general. For the conformal inclusion invariant $\mathcal{E}^{(12)}$ we find
$$B_+ B_+^* = \begin{pmatrix} 6 & 0 & 0 \\ 0 & 3 & 3 \\ 0 & 3 & 3 \end{pmatrix} = 3 \cdot \mathbf{1}_3 + 3 \cdot C ,$$
with the $\mathbb{Z}_3$ charge conjugation $C$ exchanging the two non-trivial marked vertices $[\eta_1]$ and $[\eta_2]$. These numbers reflect exactly the appearance of three times $\mathcal{E}_1^{(12)}$, which corresponds to $\mathbf{1}$, and three times $\mathcal{E}_2^{(12)}$, which corresponds to $C$. The very special property of this example is that the orbifold corresponding to charge conjugation changes the graph non-trivially, $\mathcal{E}_2^{(12)}$ is the $\mathbb{Z}_3$ orbifold of $\mathcal{E}_1^{(12)}$, whereas the
modular invariant is self-conjugate, $Z = CZ$. We do not know any other example where this happens. Other examples for self-conjugate modular invariants which are non-self-conjugate on the extended level are $D_{4\varrho}$ for $SU(2)$. But for $D_4$, the conjugation of the extended conjugation is obtained by a $\mathbb{Z}_3$ orbifold and $D_4$ is its own $\mathbb{Z}_3$ orbifold. For $\varrho > 1$, the conjugation will no longer be obtained as an orbifold since we do not have a simple current group as extended theory, but apparently the $D_{4\varrho}$ graphs are in general identical with their non-group-like orbifolds. Another example is the conformal inclusion $SU(4)_2 \subset SU(6)_1$, for which the extended conjugation is also obtained by a $\mathbb{Z}_3$ orbifold; however, the graphs are their own $\mathbb{Z}_3$ orbifolds. Another strange but different case is the conformal inclusion $SU(4)_6 \subset SU(10)_1$ invariant, for which
$$B_+ B_+^* = \begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix} \oplus \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \otimes \mathbf{1}_4 = 3 \cdot \mathbf{1}_{10} + C ,$$
with the $\mathbb{Z}_{10}$ charge conjugation $C$. And in fact, here we expect three layers of the chiral graph and one layer of the conjugation graph in the entire M-M system. (See subsection below.)

5.4. Interesting invariants of $SU(n)_n$

In the conclusions of [8] we speculated about modular invariants which look like type I or type II but really come from heterotic extensions, i.e. for which we have different intermediate local subfactors $M_+ \neq M_-$. By the results of [8, Sec. 4] this means that at least for one $\lambda$ we have $\mathrm{Hom}(\mathrm{id}, \alpha^+_\lambda) \neq \mathrm{Hom}(\mathrm{id}, \alpha^-_\lambda)$ in spite of $Z_{\lambda,0} = Z_{0,\lambda}$. Since $\mathrm{Hom}(\mathrm{id}, \alpha^\pm_\lambda) \subset \mathrm{Hom}(\iota, \iota\lambda)$, this will necessarily require $\langle \theta, \lambda \rangle \ge 2$ for such $\lambda$. In [8], we pointed out that such a case may be possible but also that we did not know of an example. Here are examples, actually making use of the heterotic $SO(16\ell)_1$ modular invariants ($\ell = 1, 2, 3, \ldots$) treated in [8]. For this we consider once more the series of conformal inclusions $SU(n)_n \subset SO(n^2 - 1)_1$. Note that for $n = 7, 9, 15, 17, 23, \ldots$, i.e. for $n = 8r \pm 1$, $r = 1, 2, 3, \ldots$, the number $n^2 - 1$ is a multiple of 16, so that the ambient algebra has a heterotic extension. (The simplest case is therefore $SU(7)_7 \subset SO(48)_1$.) Using the standard labelling for the sectors of $SO(16\ell)_1$, the two heterotic invariants can be written as
$$Z = \chi_0 (\chi_0)^* + \chi_s (\chi_0)^* + \chi_0 (\chi_c)^* + \chi_s (\chi_c)^*$$
and $Z^*$ (their coupling matrices are denoted by $Q$ and ${}^tQ$ in [8]). Now let $N \subset \tilde M$ denote the conformal inclusion subfactor of $SU(n)_n \subset SO(n^2 - 1)_1$ for $n = 8r \pm 1$, $r = 1, 2, 3, \ldots$. As we know from [8, Sec. 7], there is a crossed product extension by all $SO(64r^2 \pm 16r)_1$ sectors $v, s, c$ (and 0), $\tilde M \subset M = \tilde M \rtimes (\mathbb{Z}_2 \times \mathbb{Z}_2)$, producing $Z$ and $Z^*$ (using braiding and its opposite). The local intermediate extensions are different, namely the $\mathbb{Z}_2$ extensions corresponding to $s$ and $c$ separately. So let us consider the subfactor $N \subset M$. Then its maximal local intermediate extensions will therefore
be $M_+ = \tilde M \rtimes_s \mathbb{Z}_2$ and $M_- = \tilde M \rtimes_c \mathbb{Z}_2$. Nevertheless the $SU(8r \pm 1)_{8r \pm 1}$ invariant arising from $N \subset M$ does not seem to have non-symmetric vacuum coupling; all known $SU(n)$ invariants are symmetric. Therefore we expect that the sectors $s$ and $c$ of $SO(64r^2 \pm 16r)_1$ will have the same branching rules, i.e. have the same restriction to $SU(8r \pm 1)_{8r \pm 1}$. (This is quite natural due to the similarity of the sectors $s$ and $c$ of $SO(n)_1$. For $n$ an odd multiple of 8 this similarity even covers the sector $v$, and e.g. for the conformal inclusion $SU(3)_3 \subset SO(8)_1$ all three sectors $v, s, c$ have the same $SU(3)$ restriction.) Anyway, then here we have a heterotic extension, but the identical branching rules of $s$ and $c$ would imply that $Z$, when written in $SU(8r \pm 1)_{8r \pm 1}$ characters, has symmetric coupling matrix and in particular does not look heterotic anymore. In fact, upon restriction to $SU(8r \pm 1)_{8r \pm 1}$, both $Z$ and $Z^*$ will be identical with the invariants $|\chi_0 + \chi_s|^2$ and $|\chi_0 + \chi_c|^2$. Hence there will be a 4-fold degeneracy. Due to the permutation $s \leftrightarrow c$, the original conformal inclusion invariant $|\chi_0|^2 + |\chi_v|^2 + |\chi_s|^2 + |\chi_c|^2$ will be two-fold degenerate. Now let $\lambda$ be an $SU(8r \pm 1)_{8r \pm 1}$ sector appearing in the restriction of $s$ and hence of $c$. Since $Z$ contains $\chi_s (\chi_0)^*$ and $\chi_0 (\chi_c)^*$, it follows that $Z_{\lambda,0} = Z_{0,\lambda}$ is non-zero. On the other hand, since the dual canonical endomorphism sector $[\theta]$ of the full subfactor $N \subset M$ is the $\sigma$-restriction of $[\mathrm{id}] \oplus [v] \oplus [s] \oplus [c]$, it follows that $\langle \theta, \lambda \rangle \ge 2$, as it must be.

Now let us concentrate on the simplest example of this series, the conformal embedding $SU(7)_7 \subset SO(48)_1$, which already seems to produce quite interesting $SU(7)_7$ modular invariants. The center of the Weyl alcove, the simple current fixed point $(1, 1, 1, 1, 1, 1)$ (or $[6, 5, 4, 3, 2, 1]$ as a Young frame), appears in the restriction of $s$ and $c$ with multiplicity 4. Indeed the branching rules are [24, Eq. (5.30)]:
$$0 \longrightarrow \big( (0,0,0,0,0,0) \oplus (1,0,0,2,1,0) \oplus (0,1,0,0,0,2) \oplus (0,2,0,0,2,0) \oplus (2,1,0,1,0,1) \big) \times \mathbb{Z}_7 \oplus (1,1,1,1,1,1)$$
$$v \longrightarrow \big( (1,0,0,0,0,1) \oplus (3,0,0,1,0,0) \oplus (1,2,0,1,1,0) \oplus (1,1,0,0,1,1) \big) \times \mathbb{Z}_7$$
$$s, c \longrightarrow 4 \cdot (1,1,1,1,1,1) ,$$
where "$\times \mathbb{Z}_7$" means that the entire $\mathbb{Z}_7$ simple current orbit has to be taken. Note that then indeed the modular invariants $\mathbf{1}$ or $W$ of $SO(48)_1$ (in the notation of [8, Sec. 7]) restrict to the same $SU(7)_7$ invariant, let us call it $Z_1$, and similarly a different $SU(7)_7$ invariant, let us call it $Z_s$, is obtained from either $X_s$, $X_c$, $Q$ or ${}^tQ$ of $SO(48)_1$ (i.e. the latter is the specialization of the above $Z$ or $Z^*$). Also note that $Z_1$, as it arises from the diagonal invariant $|\chi_0|^2 + |\chi_v|^2 + |\chi_s|^2 + |\chi_c|^2$, has only two (large) non-zero matrix blocks, because the identical $s$ and $c$ blocks intersect with the vacuum block; this is actually the first modular invariant with this property we have encountered so far. The first block, including the vacuum, is a $36 \times 36$ block, containing 1's everywhere except a single $33 = 1 + 2 \cdot 4^2$ on the corner corresponding to the label $(1, 1, 1, 1, 1, 1)$. Then there is a $28 \times 28$ block
of 1's coming from $v$. The other invariant $Z_s$, arising from $|\chi_0 + \chi_s|^2$ (either from $X_s$, $X_c$, $Q$ or ${}^tQ$ in the notation of [8, Sec. 7]), has only a single $36 \times 36$ block, containing a $35 \times 35$ block of 1's being cornered by a row and a column of 35 entries 5, and they meet with a 25 on the corner corresponding to the label $(1, 1, 1, 1, 1, 1)$. Anyway, these seem to be interesting modular invariants. First note that we have the curious multiplication rules $Z_1 \cdot Z_1 = 28 Z_1 + 8 Z_s$ and $Z_s \cdot Z_s = 60 Z_s$. (Clearly both $Z$'s are self-conjugate in both senses, i.e. $Z = CZ$ and $Z = Z^*$.) Since $\mathrm{tr} Z_1 = 96$ and $\mathrm{tr} Z_s = 60$, we will have $\#{}_M\mathcal{X}_N = 96 = \#{}_M\mathcal{X}_M^\pm$, $\#{}_M\mathcal{X}_M = 3168$ for (the two-fold degenerate) $Z_1$, and $\#{}_M\mathcal{X}_N = 60 = \#{}_M\mathcal{X}_M^\pm$, $\#{}_M\mathcal{X}_M = 3600$ for (the 4-fold degenerate) $Z_s$, and that for $Z_1$ the full system will decompose into 28 copies of its chiral graph plus 8 copies of the chiral graph for $Z_s$, whereas we expect for $Z_s$ itself that the full system will decompose into 60 layers of its own chiral graph. (Note that $SU(7)_7$ has 1716 primaries.)

It is tempting to conjecture that for type I invariants, this fusion graph will always consist exclusively of copies of the chiral graph. This is however not the case, as for instance for the $\mathcal{E}^{(12)}$ modular invariant of $SU(3)_9$ the full system contains, besides 3 copies of $\mathcal{E}_1^{(12)}$, also 3 copies of the isospectral graph $\mathcal{E}_2^{(12)}$, see Sec. 5.2. Moreover, even for type I invariants the product $ZZ^*$ is not necessarily a multiple of $Z$. For instance the modular invariant arising from the conformal inclusion $SU(4)_6 \subset SU(10)_1$ fulfills $ZZ^* = 3Z + CZ$, see [10].

Acknowledgment

This work was completed during visits to the MSRI programme on Operator Algebras in 2000–2001. I am grateful to the organizers and MSRI for their invitation and financial support. I would like to thank Jens Böckenhauer for discussions on this work as well as for our collaboration in 1996–2000, and Karl-Henning Rehren for comments on a preliminary version of this manuscript.
References

[1] H. Baumgärtel, Operatoralgebraic Methods in Quantum Field Theory, Akademie Verlag, 1995.
[2] R. E. Behrend, P. A. Pearce, V. B. Petkova and J.-B. Zuber, Boundary conditions in rational conformal field theories, Nucl. Phys. B570 (2000) 525–589.
[3] J. Böckenhauer, Localized endomorphisms of the chiral Ising model, Commun. Math. Phys. 177 (1996) 265–304.
[4] J. Böckenhauer, An algebraic formulation of level one Wess-Zumino-Witten models, Rev. Math. Phys. 8 (1996) 925–947.
[5] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. I, Commun. Math. Phys. 197 (1998) 361–386.
[6] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. II, Commun. Math. Phys. 200 (1999) 57–103.
[7] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors. III, Commun. Math. Phys. 205 (1999) 183–228.
[8] J. Böckenhauer and D. E. Evans, Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors, Commun. Math. Phys. 213 (2000) 267–289.
[9] J. Böckenhauer and D. E. Evans, Modular invariants from subfactors, in Quantum Symmetries in Theoretical Physics and Mathematics, eds. R. Coquereaux et al., Contemporary Mathematics, American Mathematical Society.
[10] J. Böckenhauer and D. E. Evans, Modular invariants and subfactors, in Mathematical Physics in Mathematics and Physics, ed. R. Longo, Fields Institute Communications 30, 2001, pp. 11–37.
[11] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, On α-induction, chiral generators and modular invariants for subfactors, Commun. Math. Phys. 208 (1999) 429–487.
[12] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, Chiral structure of modular invariants for subfactors, Commun. Math. Phys. 210 (2000) 733–784.
[13] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, Longo-Rehren subfactors arising from α-induction, Publ. RIMS, Kyoto Univ. 37 (2001) 1–35.
[14] A. Cappelli, C. Itzykson and J.-B. Zuber, The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories, Commun. Math. Phys. 113 (1987) 1–26.
[15] P. Degiovanni, Z/NZ Conformal field theories, Commun. Math. Phys. 127 (1990) 71–99.
[16] P. Di Francesco, Integrable lattice models, graphs and modular invariant conformal field theories, Int. J. Mod. Phys. A7 (1992) 407–500.
[17] P. Di Francesco and J.-B. Zuber, SU(N) lattice integrable models associated with graphs, Nucl. Phys. B338 (1990) 602–646.
[18] D. E. Evans, Critical phenomena, modular invariants and operator algebras, math.OA/0204281.
[19] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Oxford University Press, 1998.
[20] D. E. Evans and P. R. Pinto, Subfactor realisation of modular invariants, in preparation.
[21] K. Fredenhagen, K.-H. Rehren and B. Schroer, Superselection sectors with braid group statistics and exchange algebras. II, Rev. Math. Phys. Special issue (1992) 113–157.
[22] J. Fröhlich and F. Gabbiani, Braid statistics in local quantum theory, Rev. Math. Phys. 2 (1990) 251–353.
[23] J. Fröhlich and F. Gabbiani, Operator algebras and conformal field theory, Commun. Math. Phys. 155 (1993) 569–640.
[24] J. Fuchs, B. Schellekens and C. Schweigert, Quasi Galois-symmetries of the modular S-matrix, Commun. Math. Phys. 176 (1996) 447–465.
[25] T. Gannon, The classification of affine SU(3) modular invariants, Commun. Math. Phys. 161 (1994) 233–264.
[26] F. Goodman, P. de la Harpe and V. F. R. Jones, Coxeter graphs and towers of algebras, Springer, 1989.
[27] R. Haag, Local Quantum Physics, Springer-Verlag, 1992.
[28] C. Itzykson, From the harmonic oscillator to the A-D-E classification of conformal models, Adv. Stud. Pure Math. 19 (1989) 287–346.
[29] M. Izumi, R. Longo and S. Popa, A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras, J. Funct. Anal. 155 (1998) 25–63.
[30] V. F. R. Jones, Index for subfactors, Invent. Math. 72 (1983) 1–25.
[31] B. Kostant, On finite subgroups of SU(2), simple Lie algebras, and the McKay correspondence, Proc. Natl. Acad. Sci. USA 81 (1984) 5275–5277.
[32] V. Toledano Laredo, Fusion of positive energy representations of LSpin_{2n}, PhD Thesis, Cambridge, 1997.
[33] T. Loke, Operator algebras and conformal field theory of the discrete series representations of Diff(S¹), Dissertation, Cambridge, 1994.
[34] R. Longo, Index of Subfactors and Statistics of Quantum Fields II, J. Funct. Anal. 130 (1990) 285–309.
[35] R. Longo, Minimal index of braided subfactors, J. Funct. Anal. 109 (1991) 98–112.
[36] R. Longo, Duality for Hopf algebras and for subfactors. I, Commun. Math. Phys. 159 (1994) 133–150.
[37] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[38] A. Ocneanu, Paths on Coxeter diagrams: From Platonic solids and singularities to minimal models and subfactors (Notes recorded by S. Goto), in Lectures on Operator Theory, eds. B. V. Rajarama Bhat et al., The Fields Institute Monographs, Providence, Rhode Island, A.M.S. 2000, pp. 243–323.
[39] A. Ocneanu, The classification of subgroups of quantum SU(N), in Quantum Symmetries in Theoretical Physics and Mathematics, eds. R. Coquereaux et al., Contemporary Mathematics, American Mathematical Society.
[40] V. B. Petkova and J.-B. Zuber, From CFT to graphs, Nucl. Phys. B463 (1996) 161–193.
[41] V. B. Petkova and J.-B. Zuber, Conformal field theory and graphs, in Proceedings Goslar 1996.
[42] K.-H. Rehren, Braid group statistics and their superselection rules, in The Algebraic Theory of Superselection Sectors, ed. D. Kastler, World Scientific, 1990, pp. 333–355.
[43] K.-H. Rehren, Chiral observables and modular invariants, Commun. Math. Phys. 208 (2000) 689–712.
[44] K.-H. Rehren, Canonical tensor product subfactors, Commun. Math. Phys. 211 (2000) 395–406.
[45] V. G. Turaev, Quantum Invariants of Knots and Three Manifolds, Walter de Gruyter, 1994.
[46] A. Wassermann, Operator algebras and conformal field theory III: Fusion of positive energy representations of LSU(N) using bounded operators, Invent. Math. 133 (1998) 467–538.
[47] F. Xu, New braided endomorphisms from conformal inclusions, Commun. Math. Phys. 192 (1998) 347–403.
August 21, 2002 19:41 WSPC/148-RMP
00137
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 733–757
© World Scientific Publishing Company
ON A SUBFACTOR ANALOGUE OF THE SECOND COHOMOLOGY
MASAKI IZUMI
Department of Mathematics, Graduate School of Science, Kyoto University, Sakyo-ku, Kyoto 606-8502, Japan
[email protected]
HIDEKI KOSAKI
Faculty of Mathematics, Kyushu University, Higashi-ku, Fukuoka 812-8581, Japan
[email protected]
Received 25 January 2002
Revised 8 March 2002
Dedicated to Professor Huzihiro Araki on his seventieth birthday
The set of equivalence classes of Longo’s Q-systems is shown to serve as a right subfactor analogue of the second cohomology. This “cohomology” is computed for several classes of subfactors, which tells us if a subfactor is uniquely determined up to inner conjugacy by the associated canonical endomorphism. As a byproduct of our analysis, two different groups of order 64 are shown to possess the same representation category as an abstract tensor category. Thus, this category has two permutation symmetries with non-isomorphic groups via the Doplicher–Roberts duality.
1. Introduction
The starting point of the present work is the simple question for an inclusion M ⊃ N of factors: Does Longo’s canonical endomorphism γN (as a sector) uniquely determine the subfactor N up to inner conjugacy? In terms of bimodules, this can be rephrased as follows: Does the basic extension M1 (as an M-M bimodule) uniquely determine the subfactor N up to inner conjugacy? The answer is negative in general, and the obstruction for uniqueness is analyzed in the present article.
The notion of a Q-system due to R. Longo [24] was initially introduced to present an abstract characterization of the canonical endomorphism of a subfactor of finite index. In [25] it was employed as a tool for dealing with nets of observable algebras, and R. Longo and K.-H. Rehren introduced a new construction for a subfactor (now referred to as the Longo–Rehren inclusion) from a given finite system of endomorphisms of a type III factor. This notion was further translated into type II1 language by T. Masuda to show that A. Ocneanu’s asymptotic inclusion, or
S. Popa’s symmetric enveloping algebra [29], is essentially the same object as the Longo–Rehren inclusion (see [27]). A Q-system also played a crucial role in the first-named author’s construction [18] of the Haagerup subfactor [1] from a Cuntz algebra endomorphism. For fixed-point algebra subfactors the obstruction for the above-mentioned uniqueness question is closely related to the second cohomology obstruction [20, 31] for cocycle conjugacy of group actions. Therefore, one expects that systematic study for uniqueness for more general subfactors might reveal a new interplay between operator algebras and cohomological objects. We will see that this is indeed the case. In Sec. 2 the “second cohomology” H 2 (M ⊃ N ) is defined as the set of equivalence classes of the Q-systems with a common endomorphism γN . This cohomology parameterizes the inner conjugacy classes of subfactors with the canonical endomorphism γN , and it will be determined for depth 2 subfactors (in Sec. 3), group-subgroup subfactors (in Sec. 5), and subfactors of index at most 4 (in Sec. 6). Depth 2 subfactors give us Kac algebras, and for such a subfactor H 2 (M ⊃ N ) is identified as the second cohomology of the corresponding Kac algebra. Hence, (as pointed out in [34, 37]) Hopf algebras cannot be recovered from the abstract tensor categories of representations without specifying Hilbert space realization. This phenomenon occurs because cocycle perturbation of the coproduct might change the Hopf algebra structure [19, 37]. As a byproduct of our analysis, in Sec. 4 we show that two different groups of order 64 possess the same representation category as an abstract tensor category based on A. Wassermann’s observation on the correspondence between the second cohomology of a group dual and ergodic actions of a compact group [35]. Consequently, via the Doplicher–Roberts duality [7] this category has at least two permutation symmetries providing non-isomorphic groups. 
After the completion of the present work the authors have learned that such examples (of larger order) are also constructed in the recent works [6, 9], which indeed have some overlap with our Sec. 4. Our basic reference for subfactors is [10] (see [21] for the type III case), and we follow the notations in [17, Sec. 2] for sectors. The authors would like to thank M. Müger for informing them of the above-mentioned references [6, 9], and the first-named author is also grateful to S. Yamagami for stimulating discussions on the topics covered in Sec. 4.
2. Q-Systems and Second Cohomology
Let M ⊃ N be an inclusion of infinite factors of finite index with the canonical endomorphism γN = Ad(JN JM) : M → N, where JM, JN are modular conjugations of M, N respectively. Although γN depends on the choice of natural cones for M and N, the sector [γN] ∈ Sect(N, M) is uniquely determined by N. Any representative of [γN] is also called the canonical endomorphism in the sequel. It is known that one can find isometries S ∈ (Id, γN) and T ∈ (Id|N, γN|N) satisfying
\[ T^* S = \gamma(S)^* T = [M : N]_0^{-1/2} . \]
In order to get an abstract characterization of the canonical endomorphisms, R. Longo introduced the notion of Q-systems in [24].
Definition 2.1. A triple (γ, S, T) consisting of an endomorphism γ of M and isometries S ∈ (Id, γ), T ∈ (γ, γ²) is called a Q-system if the following relations hold:
\[ T^* S = \gamma(S^*) T = \frac{1}{d} \in \mathbb{R}_+ , \tag{2.1} \]
\[ T T^* = \gamma(T^*) T , \tag{2.2} \]
\[ T^2 = \gamma(T) T . \tag{2.3} \]
Note that (2.2) and (2.3) are equivalent as long as (2.1) is assumed. Indeed, the implication (2.3) ⇒ (2.2) was shown in [26]. On the other hand, (2.2) shows T∗γ(T)γ(T∗) = T∗γ²(T∗)γ(T). Another use of (2.2) shows that the left side here is equal to TT∗γ(T∗). But, the right side is T∗γ²(T∗)γ(T) = γ(T∗)T∗γ(T) = γ(T∗)TT∗ = TT∗T∗ due to T ∈ (γ, γ²) and (2.2). Therefore, we have TT∗γ(T∗) = TT∗T∗, showing (2.3).
Longo proved that every Q-system comes from an inclusion of factors.
Theorem 2.2 (Longo [24]). For a given Q-system Ξ = (γ, S, T) one can construct a subfactor N in such a way that γ is the canonical endomorphism for the inclusion M ⊃ N and T ∈ (Id|N, γ|N). Moreover, N is given by
\[ N = \{ x \in M ;\ T x = \gamma(x) T ,\ T x^* = \gamma(x^*) T \} , \]
and EΞ(·) = T∗γ(·)T gives a conditional expectation onto N.
We say that two Q-systems Ξ1 = (γ1, S1, T1) and Ξ2 = (γ2, S2, T2) are equivalent if there exists a unitary u ∈ (γ1, γ2) satisfying
\[ S_2 = u S_1 , \qquad T_2 = u \gamma_1(u) T_1 u^* . \]
This is clearly an equivalence relation (note γ2(·) = uγ1(·)u∗). The two relevant conditional expectations are related by
\[ E_{\Xi_2}(x) = T_2^* \gamma_2(x) T_2 = u T_1^* \gamma_1(u^*) \gamma_1(x) \gamma_1(u) T_1 u^* = u E_{\Xi_1}(u^* x u) u^* \]
so that two equivalent systems give inner conjugate subfactors. On the other hand, it is easy to show that two inner conjugate subfactors give rise to equivalent Q-systems.
Definition 2.3. For an endomorphism γ, we denote by Q(γ) the set of Q-systems of the form (γ, S, T) with S∗T = 1/√(d(γ)). For an inclusion M ⊃ N, we define H²(M ⊃ N) to be the set of equivalence classes of the elements in Q(γN).
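As a consistency check on the equivalence relation above, the normalization in Definition 2.3 is unchanged when (γ1, S1, T1) is replaced by an equivalent triple (γ2, S2, T2). The following is only a sketch; it assumes nothing beyond the intertwining property S1x = γ1(x)S1 (x ∈ M) and the fact that S1∗T1 is a scalar.

```latex
\begin{aligned}
S_2^* T_2 &= (u S_1)^* \, u \gamma_1(u) T_1 u^* = S_1^* \gamma_1(u) T_1 u^* \\
          &= u \, S_1^* T_1 \, u^*
             \qquad \text{(since } S_1^* \gamma_1(u) = u S_1^* \text{, the adjoint of } S_1 u^* = \gamma_1(u^*) S_1 \text{)} \\
          &= S_1^* T_1 .
\end{aligned}
```

The last equality uses that the scalar S1∗T1 commutes with the unitary u.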
When a subfactor P (⊂ M) satisfies [γP] = [γN], i.e., γP(·) = uγN(·)u∗ for some unitary u ∈ M, we have J_{u∗Pu} = u∗JPu and the commutativity [u, JM x JM] = 0 (x ∈ M) shows
\[ \gamma_{u^* P u}(x) = u^* J_P u J_M x J_M u^* J_P u = u^* J_P J_M x J_M J_P u = u^* \gamma_P(x) u = \gamma_N(x) . \]
Consequently, there is a natural one-to-one correspondence between H²(M ⊃ N) and the inner conjugacy classes of subfactors P satisfying [γN] = [γP] in Sect(M).
The following elementary, but instructive, example justifies our notation H²(M ⊃ N): Let α be an outer action of a finite group G on M with the fixed-point algebra N = M^α. Then, we have
\[ [\gamma_N] = \bigoplus_{g \in G} [\alpha_g] , \]
and hence [γN] = [γP] holds if and only if P is the fixed-point algebra under inner perturbation of α (see [15]). Recall [3] that every α-cocycle is a coboundary since G is a finite group. Therefore, there is a one-to-one correspondence between H²(M ⊃ N) and the second cohomology group H²(G, T) [20, 31]. This means that [γN] fails to determine the inner conjugacy class of N when H²(G, T) is non-trivial.
The next theorem generalizes the well-known fact #H²(G, T) < ∞ for a finite group G [32].
Theorem 2.4. We have #H²(M ⊃ N) < ∞ for an inclusion M ⊃ N of factors of finite index.
Proof. Let us regard Q(γN) as a compact subset in the finite-dimensional space (Id, γ) × (γ, γ²), and it suffices to show the existence of a positive constant ε with the following property: two Q-systems Ξ1, Ξ2 as above are equivalent as long as ‖S1 − S2‖ < ε and ‖T1 − T2‖ < ε. Note that ‖T1 − T2‖ < ε implies ‖EΞ1 − EΞ2‖ < 2ε (the last part in Theorem 2.2). Thus, in the type II1 setting ε small can be chosen to guarantee inner conjugacy, i.e., equivalence (see proposition after Theorem 2 in [28]). Note H²(M ⊃ N) depends only on the paragroup of M ⊃ N, hence, to show #H²(M ⊃ N) < ∞ in the current case, we may replace M ⊃ N by an inclusion of II1 factors with the same paragroup due to T. Masuda’s formulation [27] of Q-systems, and we are done.
Remark 2.5. A “II1 factor-free” proof is possible for irreducible inclusions M ⊃ N. Here is such a proof inspired by A. Ocneanu’s argument for finiteness of the equivalence classes of 6j-symbols for a given finite-dimensional fusion algebra. Since we have dim(Id, γN) = 1, for given two Q-systems as above we may assume S1 = S2 = S replacing Ξ2 with an equivalent Q-system if necessary. We set δT = T1 − T2, and ≈ will mean an equality up to the order of ‖δT‖² in the sequel. The defining conditions for Q-systems imply
\[ \delta T^* T_2 + T_2^* \delta T \approx 0 , \tag{2.4} \]
\[ S^* \delta T = \gamma_N(S^*) \delta T = 0 , \tag{2.5} \]
\[ \delta T \, T_2^* + T_2 \, \delta T^* \approx \gamma_N(\delta T^*) T_2 + \gamma_N(T_2^*) \delta T . \tag{2.6} \]
Let a = dS∗EΞ2(δT) ∈ (γN, γN) with d = [M : N]_0^{1/2}. Then, we get
\[ \begin{aligned} a &= d S^* T_2^* \gamma_N(\delta T) T_2 \approx d S^* \bigl( \delta T \, T_2^* + T_2 \, \delta T^* - \delta T^* \gamma_N(T_2) \bigr) T_2 \\ &= \delta T^* T_2 - d S^* \delta T^* T_2 T_2 = \delta T^* T_2 - \langle \delta T^* T_2 S , S \rangle \end{aligned} \]
thanks to (2.6), (2.5) together with the fact that δT∗T2 ∈ (γN, γN) and S∗δT∗T2 is proportional to S∗. Thus, we have
\[ a S \approx \delta T^* T_2 S - \langle \delta T^* T_2 S , S \rangle S = 0 . \]
Thanks to (2.4), we have a ≈ −a∗. With u = e^{(a−a∗)/2}, a unitary in (γN, γN), we have uS ≈ S + aS ≈ S and
\[ \begin{aligned} u \gamma_N(u) T_2 u^* &\approx T_2 + a T_2 - T_2 a + \gamma_N(a) T_2 \\ &\approx T_2 + \delta T^* T_2^2 - T_2 \delta T^* T_2 + d \gamma_N\bigl( S^* E_{\Xi_2}(\delta T) \bigr) T_2 \\ &= T_2 + \delta T^* T_2^2 - T_2 \delta T^* T_2 + T_2^* \gamma_N(\delta T) T_2 \\ &\approx T_2 + \delta T^* T_2^2 + \bigl( \delta T \, T_2^* - \delta T^* \gamma_N(T_2) \bigr) T_2 \\ &= T_2 + \delta T = T_1 , \end{aligned} \]
where (2.6) was used. Let c ∈ T satisfy uS = cS. Then, by setting v = c̄u, we get vS = S and vγN(v)T2v∗ ≈ T1. A little more careful examination of the above computation actually shows ‖1 − v‖ < C1‖T1 − T2‖ and ‖T1 − vγN(v)T2v∗‖ < C2‖T1 − T2‖² with constants C1, C2 (depending only on d). Therefore, by starting from T2 satisfying C2‖T1 − T2‖ < 1, we can choose converging unitaries {vn}, n = 1, 2, ..., in (γN, γN) satisfying
\[ v_n S = S \quad \text{and} \quad \| v_n \gamma_N(v_n) T_2 v_n^* - T_1 \| < C_2^{\,2^n - 1} \, \| T_1 - T_2 \|^{2^n} , \]
showing the equivalence between Ξ1 and Ξ2.
3. The Depth 2 Case
When an irreducible inclusion M ⊃ N of infinite factors (of index n) is of depth 2, N is realized as the fixed-point algebra under a Kac algebra action on M [5, 33]. In this section, H²(M ⊃ N) is identified with the second cohomology of the corresponding Kac algebra (in the depth 2 case).
Definition 3.1. Let A be a Kac algebra with a coproduct δ. A unitary ω ∈ A ⊗ A is called a 2-cocycle (or simply a cocycle) if it satisfies the cocycle relation
\[ (\delta \otimes \mathrm{Id})(\omega)(\omega \otimes 1) = (\mathrm{Id} \otimes \delta)(\omega)(1 \otimes \omega) . \]
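For orientation, consider the commutative case where A is the algebra of functions on a finite group K with δ(f)(g, h) = f(gh). A unitary ω ∈ A ⊗ A is then a T-valued function on K × K, and the cocycle relation above reads ω(gh, k)ω(g, h) = ω(g, hk)ω(h, k). The Python sketch below checks this for K = Z2 × Z2. The choice of group, the sample functions omega and xi, and all helper names are illustrative assumptions and do not appear in the text.

```python
from itertools import product

# Toy setting (an assumption for illustration): K = Z_2 x Z_2 and A = functions on K,
# so a unitary in A (x) A is a function K x K -> unit circle. Only values +1/-1 are used.
K = list(product(range(2), repeat=2))


def mul(g, h):
    """Group law of Z_2 x Z_2 (componentwise addition mod 2)."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)


def omega(g, h):
    """Sample 2-cocycle (hypothetical choice): omega(g, h) = (-1)^(g_1 * h_2)."""
    return (-1) ** (g[0] * h[1])


def xi(g):
    """Sample unitary in A with real values, so that xi* = xi."""
    return (-1) ** (g[0] * g[1])


def is_two_cocycle(w):
    """Check w(gh, k) w(g, h) == w(g, hk) w(h, k) for all g, h, k in K."""
    return all(
        w(mul(g, h), k) * w(g, h) == w(g, mul(h, k)) * w(h, k)
        for g in K for h in K for k in K
    )


def twist(w, x):
    """Twisted function delta(x*) w (x (x) x); correct as written only for real-valued x."""
    return lambda g, h: x(mul(g, h)) * w(g, h) * x(g) * x(h)


print(is_two_cocycle(omega))             # cocycle relation for the sample omega
print(is_two_cocycle(twist(omega, xi)))  # twisting by xi gives a cocycle again
```

In this toy case ω is not symmetric (ω((1,0),(0,1)) = −1 while ω((0,1),(1,0)) = 1), and twisting by ξ multiplies ω by a symmetric function, so this ω cannot be twisted into the trivial cocycle.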
We say that ω1 , ω2 ∈ Z 2 (A), the set of 2-cocycles for A, are equivalent if ω2 = δ(ξ ∗ )ω1 (ξ ⊗ ξ) for some unitary ξ ∈ A . The set of equivalence classes of 2-cocycles for A is denote by H 2 (A). Let ρ be the inclusion map from N into M and ρ¯ be the conjugate morphism ρ), T0 ∈ (Id, ρ¯ρ) forming a from M into N . We fix two isometries S0 ∈ (Id, ρ¯ ∗ ρρ), an nQ-system (ρ¯ ρ, S0 , T0 ). We identify the C -algebra generated by H = (ρ, ρ¯ dimensional Hilbert space, with the Cuntz algebra On . A unitary u ∈ On determines the endomorphism λu of On by λu (S) = uS, S ∈ H. A Hilbert space K ⊂ On with left support 1 (with an orthonormal basis {Ti }i ) gives us the endomorphism P Ti · Ti∗ of On . We identify Hm H∗n with B(H⊗n , H⊗m ) and often use σK (·) = the tensor notation inside On . For example, x ⊗ y means xσH (y) for x, y ∈ B(H). With this notation, we have λF = σH with the flip F (of the tensor components of H ⊗ H). We employ the Kac algebra action on On investigated in [4, 17, 24] as our model ρ|On is given by λV F , action. As pointed out in [24], ρ¯ ρ globally preserves On and ρ¯ where V is an irreducible multiplicative unitary [2]. For the objects associated with V , we use the same notation as in [17, Sec. 4] in what follows. Here, the most n ρ)n ) = (λm important observation for us is ((ρ¯ ρ)m , (ρ¯ V F , λV F ) for each n, m, which enables us to work with On for determination of H 2 (M ⊃ N ). ¯ R instead in [17, p. 439], and so we have Note that S0 , T0 were denoted by R, √ √ e) , T0 = nΛϕ (e) = Λϕˆ (1) . S0 = Λϕ (1) = nΛϕˆ (ˆ ρ¯V (S0 ) = S0 ,
ρV (T0 ) = T0 .
We introduce Kac algebra structure on Aˆ0 (the commutant here is taken in B(H)) ˆ For a ∗-subalgebra B of from Aˆ through the conjugate linear isomorphism Ad J. B(H) we denote by EB the trace preserving conditional expectation from B(H) onto ˆ is the restriction of the B. Note that the Haar state ϕ of A (respectively ϕˆ of A) normalized trace of B(H). We denote by EρV and Eρ¯V the conditional expectations from On onto ρV (On ) and ρ¯V (On ) respectively given by EρV (x) = T0∗ λV F (x)T0
and Eρ¯V (x) = S0∗ λVˆ F (x)S0 .
Lemma 3.2. With the notations so far the following assertions hold : ˆ ˆ for each x ∈ A, (1) T0 V xV ∗ T0 = ϕ(x)1 (2) EAˆ0 is given by X d(π) X d(π) uˆ(π)ab xˆ u(π)∗ab . EAˆ0 (x) = n π∈Π a,b=1
ˆ A) ˆ 0 ) = Aˆ0 ⊗ Aˆ0 . (3) (EAˆ0 ⊗ Id)(δ( ∗ (4) V T0 = U V T0 .
(5) The coproduct δˆ0 of Aˆ0 satisfies V (1 ⊗ x)V ∗ = (U ⊗ 1)δˆ0 (x)(U ⊗ 1) ∈ Aˆ ⊗ Aˆ0 , ˆ ˆ Vˆ ∗ (1 ⊗ x)Vˆ = F δˆ0 (x)F ,
x ∈ Aˆ0 ,
x ∈ Aˆ0 .
Proof. (1) This follows from [17, (4.1.6)] and the fact that ϕ ˆ is a trace. (2) Thanks to [17, Corollary 4.4], we have B(H) ∩ ρ¯(On ) = Aˆ0 , and so EAˆ0 is given by the restriction of Eρ¯V . Thus, using [17, Lemma 4.1, (4.1.6)], we get EAˆ0 (x) = S0∗ λVˆ F (x)S0 = σH (S0∗ )V xV ∗ σH (S0 ) X
=
u ˆ(π)ac xˆ u(π)∗bc σH (S0∗ e(π)ab S0 ) =
π,a,b,c
X d(π) uˆ(π)ac xˆ u(π)∗ac , n
π,a,b,c
where the last equality follows from [17, (4.1.11)]. (3) It suffices to check the commuting square condition for Aˆ0 ⊗ B(H)
⊂
B(H) ⊗ B(H)
⊂
∪ ˆ A) ˆ 0, δ(
∪ ˆ0
ˆ0
A ⊗A
∗ ˆ ˆ ˆ0 ˆ0 ˆ0 ˆ and hence Eδ( ˆ A) ˆ 0 (A ⊗ B(H)) = A ⊗ A has to be seen. Since δ(A) = V (C ⊗ A)V , we have E ˆ ˆ 0 = Ad(V ∗ ) · (Id ⊗ E ˆ0 ) · Ad(V ), and for x ∈ Aˆ0 ⊗ B(H) we compute δ(A)
Eδ( ˆ A) ˆ 0 (x) =
A
X π∈Π,a,b
=
X π∈Π,a,b
=
d(π) ∗ V (1 ⊗ u ˆ(π)ab )V xV ∗ (1 ⊗ u ˆ(π)∗ab )V n d(π) ˆ ˆ u(π)∗ ) δ(ˆ u(π)ab )xδ(ˆ ab n
X π∈Π,a,b,c,d
=
X π∈Π,a,b,c
d(π) (ˆ u(π)ac u ˆ(π)∗ad ⊗ u ˆ(π)cb )x(1 ⊗ u ˆ(π)∗db ) n
d(π) (1 ⊗ uˆ(π)cb )x(1 ⊗ u ˆ(π)∗cb ) = (Id ⊗ EAˆ0 )(x) ∈ Aˆ0 ⊗ Aˆ0 . n
ˆ κ(x)), x ∈ A. (4) This follows from [17, (4.1.6)] and U Λϕˆ (x) = Λϕˆ (ˆ ∗ ˆ (J ⊗ J) ˆ [11], we get the following for x ∈ Aˆ0 : (5) Using V = (J ⊗ J)V ˆ Jx ˆ ∗ (1 ⊗ Jx ˆ Jˆ)V (J ⊗ J) ˆ = (J ⊗ J) ˆ δ( ˆ Jˆ)(J ⊗ J) ˆ V (1 ⊗ x)V ∗ = (J ⊗ J)V = (U ⊗ 1)δˆ0 (x)(U ⊗ 1) . ˆ we have Since JxJ = κ ˆ (x∗ ) for x ∈ A, ˆ ˆ ˆ κ(Jˆx∗ J))(U ˆ (U ⊗ U ) = (U ⊗ U )δ(ˆ ˆ ˆ ∗ J))V ⊗ U) ˆ (Jx Vˆ ∗ (1 ⊗ x)Vˆ = (U ⊗ U )V ∗ (1 ⊗ κ ˆ κ(JˆxJ))( ˆ κ⊗κ ˆ Jˆ ⊗ J) ˆ = F δˆ0 (x)F . = (Jˆ ⊗ J)(ˆ ˆ ) · δ(ˆ
We are now ready to prove the main result in the section. Theorem 3.3. For a given Q-system (λV F , S, T ), there exists a unique 2-cocycle ∗ ω 0 ∈ Z 2 (Aˆ0 ) satisfying S = ω 0 S0 , T = V ω 0∗ V ∗ T0 . On the other hand, every 2cocycle arises in this way, and furthermore two Q-systems are equivalent if and only if the corresponding 2-cocycles are equivalent. In consequence, we can identify ˆ H 2 (M ⊃ N ) with H 2 (Aˆ0 ) (and also with H 2 (A)). Proof. For a Q-system (λV F , S, T ) we set T˜0 = ρ¯V (T0 ) = V ∗ T0 and T˜ = ρ¯V (T ). Due to V ∈ Aˆ ⊗ A and T ∈ (λV F , λ2V F ) = (λV F , σH λV F ) = HAˆ0 , we have T˜ = ˆ A) ˆ 0 because λV ∗ (T ) = V ∗ T ∈ H2 H∗ . Note that T0 T ∗ ∈ B(H)⊗ Aˆ0 implies T˜0 T˜ ∗ ∈ δ( ˆ = V ∗ T0 T ∗ (1 ⊗ x)V = V ∗ (1 ⊗ x)T0 T ∗ V , V ∗ T0 T ∗ V δ(x)
x ∈ Aˆ .
We set ω 0 = n(EAˆ0 ⊗ Id)(T˜0 T˜ ∗ ), which belongs to Aˆ0 ⊗ Aˆ0 thanks to Lemma 3.2(3). Lemma 3.2(1) and (2) imply X X d(π)V uˆ(π)ab T˜T0∗ V uˆ(π)∗ab V ∗ T0 = d(π)ϕ(ˆ ˆ u(π)∗ab )V u ˆ(π)ab T˜ V ω 0∗ V ∗ T0 = π,a,b
π,a,b
= V T˜ = T . We show that ω 0 is a unitary. Since (ρV ρ¯V , S, T ) is a Q-system, we have ρV (T ) = σH ρ¯V (T ∗ )¯ ρV (T ) = σH (T˜ ∗ )T˜ . T˜ T˜ ∗ = ρ¯V (ρV ρ¯V (T ∗ )T ) = ρ¯V ρV ρ¯V (T ∗ )¯ Using [17, (4.1.10), (4.1.12)], Lemma 3.2(1) and (2), and T ∈ HAˆ0 , we get X d(π)d(σ)ˆ u(π)ab T˜T0∗ V u ˆ(π)∗ab uˆ(σ)cd V ∗ T0 T˜ ∗ uˆ(σ)∗cd ω 0∗ ω 0 = π,σ,a,b,c,d
=
X π,a,b
=
X
d(π)ˆ u(π)ab T˜ T˜ ∗ uˆ(π)∗ab =
X
d(π)σH (T˜ ∗ )ˆ u(π)ab V ∗ T u ˆ(π)∗ab
π,a,b
d(π)σH (T˜ ∗ )ˆ u(π)ab V ∗ σH (ˆ u(π)∗ab )T
π,a,b
=
X π,a,b
X
ˆ u(π)∗ )T˜ = d(π)σH (T˜ ∗ )ˆ u(π)ab δ(ˆ ab
d(π)σH (T˜∗ )σH (ˆ u(π)∗bb )T˜
π,a,b
e)T˜ = nσH (T˜ ∗ )σH (¯ ρV (ˆ e))T˜ = n¯ ρ(ρ¯ ρ(T ∗ SS ∗ )T ) = 1 = nσH (T˜∗ )σH (ˆ as desired. Note that the above argument also shows uniqueness of ω 0 . Now we show that ω 0 is a 2-cocycle. T02 = λV F (T0 )T0 and T 2 = λV F (T )T imply σH (T˜0 )T˜0 = T˜02 , σH (T˜ )T˜ = T˜ 2 . Thanks to Lemma 3.2(4) and (5), the left side of the second equation is equal to 0∗ 0∗ (1 ⊗ U ⊗ 1)V23 ω13 σH (T0 )T˜0 σH (ω 0 U V T0 )ω 0∗ T˜0 = ω23 0∗ 0∗ ∗ = ω23 (1 ⊗ U ⊗ 1)V23 ω13 V23 (1 ⊗ U ⊗ 1)σH (T˜0 )T˜0 0 0∗ = ω23 (Id ⊗ δˆ0 )(ω ∗ )T˜02 .
On the other hand, the right side is 0∗ 0∗ ∗ 0∗ ˆ0 (U ⊗ 1 ⊗ 1)V12 ω23 V12 (U ⊗ 1 ⊗ 1)T˜02 = ω12 (δ ⊗ Id)(ω 0∗ )T˜02 . ω 0∗ U V T0 ω 0∗ T˜0 = ω12
We claim n2 (Id ⊗ EAˆ0 ⊗ EAˆ0 )(T˜02 T˜0∗2 ) = 1, which will show the cocycle relation of ω 0 . Indeed, since we have ∗ ∗ V23 (e ⊗ e ⊗ 1)V23 V12 , T˜02 T˜0∗2 = V ∗ T0 V ∗ T0 T0∗ V T0 V = V12
it suffices to show n(Id ⊗ EAˆ0 )(V ∗ (e ⊗ 1)V ) = 1. Thanks to the commuting square condition shown in [17, p. 442], the restriction of EAˆ0 to A is the Haar state ϕ. Thus, Lemma 3.2 (2) implies X uˆ(π)∗ca eˆ u(π)cb ⊗ EAˆ0 (e(π)ab ) (Id ⊗ EAˆ0 )(V ∗ (e ⊗ 1)V ) = π,a,b,c
=
X d(π) 1 u ˆ(π)∗ca eˆ u(π)ca ⊗ 1 = EAˆ0 (e) ⊗ 1 = , n n π,a,c
where we use the fact that {ˆ u(π)∗ab }ab is also an irreducible unitary corepresentation as Aˆ is a finite-dimensional Kac algebra. Therefore, ω 0 is a 2-cocycle. Let ˆ0 be the counit of Aˆ0 . Then, the cocycle relation implies (ˆ 0 ⊗ Id)(ω 0 ) = (Id ⊗ ˆ0 )(ω 0 ) ∈ T . ρ) = CS0 , we have Note that xS0 = ˆ0 (x)S0 holds for any x ∈ Aˆ0 . Since (Id, ρ¯ S = cS0 for some c ∈ T and we compute 1 1 √ = S ∗ T = c¯S0∗ V ω 0 V ∗ T0 = c¯(ˆ ⊗ Id)(ω 0∗ )S0∗ T0 = c¯(ˆ ⊗ Id)(ω 0∗ ) √ , n n showing S = (ˆ 0 ⊗ Id)(ω 0∗ )S0 = ω 0∗ S0 . Let ξ ∈ (λV F , λV F ) = Aˆ0 be a unitary. We claim that the cocycle corresponding to the Q-system (λV F , ξS, ξλV F (ξ)T ξ ∗ ) is δˆ0 (ξ)ω 0 (ξ ∗ ⊗ ξ ∗ ). Direct computation shows V ∗ ξλV F (ξ)T ξ ∗ = V ∗ ξV σH (ξ)V ∗ T ξ = ξσH (ξ)ω 0 T˜0 ξ ∗ . Therefore, to prove the claim it suffices to show n(EAˆ0 ⊗ Id)(T˜0 ξ T˜0∗ ) = δˆ0 (ξ). However, by Lemma 3.2(4) and (5) we compute (EAˆ0 ⊗ Id)(T˜0 ξ T˜0∗ ) = (EAˆ0 ⊗ Id)((U ⊗ 1)V (U ⊗ 1)(e ⊗ ξ)(U ⊗ 1)V ∗ (U ⊗ 1)) = (U ⊗ 1)V (U ⊗ 1)(EAˆ0 ⊗ Id)((e ⊗ ξ))(U ⊗ 1)V ∗ (U ⊗ 1) =
1 1 (U ⊗ 1)V (1 ⊗ ξ)V ∗ (U ⊗ 1) = δˆ0 (ξ) . n n
Finally, we show that every cocycle arises in this way. We set S = ω 0∗ S0 ,
T = V ω 0∗ V ∗ T0
for a given cocycle ω 0 , and show that (λV F , S, T ) is a Q-system. It is easy to show √ S ∗ T = λV F (S ∗ )T = 1/ n. We claim ρ¯V (T ) = ω 0∗ T˜0 . Indeed, direct computation based on the pentagon equation yields ∗ ∗ 0∗ ∗ V23 V12 ω12 V12 V23 T0 ρ¯V (T ) = V ∗ σH (V ∗ )V ω 0∗ V ∗ T0 V = V12 ∗ ∗ 0∗ ∗ 0∗ ∗ ∗ ∗ = V23 V13 ω12 V12 V23 T0 = ω12 V23 V13 V12 V23 T0 0∗ ∗ ∗ ∗ = ω12 V12 V23 V12 V12 V23 T0 = ω 0∗ T˜0 ,
which shows the claim. It is easy to show T ∈ (λV F , λ2V F ) from the claim, as ω 0∗ T˜0 2 belongs to (σH ρ¯, σH ρ¯V ) = (¯ ρV λV F , ρ¯V λ2V F ). To prove T 2 = λV F (T )T , it suffices ρV (T ), which follows from direct computation using to show ρ¯V (T )2 = σH ρ¯V (T )¯ Lemma 3.2(4), (5) and the claim. In particular, when M = N o G, a crossed product by G, H 2 (M ⊃ N ) can be ˆ of the group dual G ˆ (see the first identified with the second cohomology H 2 (G) part in Sec. 4). Also, from Theorem 2.4 and Theorem 3.3 we conclude Corollary 3.4. The second cohomology H 2 (A) is a finite set for a finitedimensional Kac algebra A. ˆ Jˆ⊗J), ˆ a 2-cocycle of Aˆ0 . We keep Let ω be a 2-cocycle of Aˆ and set ω 0 = (Jˆ⊗J)ω( ˜ ˜ the notations S, T , T0 , and T in the above proof. We would like to determine the Kac algebra corresponding to the subfactor associated with the Q-system (ρ¯ ρ, S, T ). ρ(x)T . Let We denote by Eω the conditional expectation defined by Eω (x) = T ∗ ρ¯ P be the image of Eω , and we set K = {X ∈ M ; Xy = ρ¯ ρ(y)X for each y ∈ P } . Then, the multiplicative unitary VP ∈ K2 K2∗ describing the new inclusion M ⊃ P is characterized by λV F (X) = VP FK X, X ∈ K with the flip FK of K ⊗ K [24]. Lemma 3.5. With the above notations we have (1) λV F (T ) = U ω 0 σH (T )ω 0∗ U, (2) K = U ω 0 H. Proof. (1) We note V σH (T˜0 )V ∗ = σH (T˜0 ) because the pentagon equation shows ∗ ∗ ∗ V13 σH (T0 ) = V23 V12 σH (T0 ) V σH (T˜0 )V ∗ = V σH (V ∗ T0 )V ∗ = V12 V23
= σH (V ∗ T0 ) = σH (T˜0 ) . Thus, by the pentagon equation and the above equality we compute 0∗ ∗ ∗ V12 F23 T0 V12 λV F (T ) = V F σH (V F )V ω 0∗ V ∗ T0 F V ∗ = V12 F12 V23 F23 V12 ω12 0∗ ∗ 0∗ ∗ = V12 V13 V23 ω23 σH (T˜0 )V12 = V23 V12 ω23 V12 σH (T˜0 )
= V23 (U ⊗ 1 ⊗ 1)(δˆ0 ⊗ Id)(ω 0∗ )(U ⊗ 1 ⊗ 1)σH (T˜0 )
0 0∗ = (U ⊗ 1 ⊗ 1)ω12 V23 ω23 (Id ⊗ δˆ0 )(ω 0∗ )(1 ⊗ U ⊗ 1)V23 σH (T0 )(U ⊗ 1 ⊗ 1) 0 0∗ 0∗ V23 ω23 (1 ⊗ U ⊗ 1)V23 ω13 σH (T0 )(U ⊗ 1 ⊗ 1) = (U ⊗ 1 ⊗ 1)ω12 0 0∗ σH (T )ω12 (U ⊗ 1 ⊗ 1) , = (U ⊗ 1 ⊗ 1)ω12
showing (1). (2) Using (1), we compute ρ)2 (x)λV F (T ) ρ¯ ρ(Eω (x)) = λV F (T ∗ )(ρ¯ ρ(x)U ω 0 σH (T )ω 0∗ U = U ω 0 σH (T ∗ )ω 0∗ U σH ρ¯ ρV F (x)T )ω 0∗ U = U ω 0 σH (Eω (x))ω 0∗ U , = U ω 0 σH (T ∗ ρ¯ where we used the fact (ρ¯ ρ, ρ¯ ρ) = (λV F , λV F ) = Aˆ0 . This shows K = U ω 0 H. We set W = (U ⊗ 1)ω 0 (U ⊗ 1) ∈ Aˆ ⊗ Aˆ0 , which satisfies K = W H. Then, we get VP FK =
n X
λV F (W Si )Si∗ W ∗ = λV F (W )V F W ∗
i=1
= V F σH (V F )W σH (F V ∗ )F V ∗ V F W ∗ = V F σH (V F )W σH (F V ∗ )W ∗ , where {Si }ni=1 is an orthonormal basis of H. Let Vω ∈ H2 H∗2 be the multiplicative unitary that is unitary equivalent to VP through the unitary transformation W : H → K. Then, using the cocycle relation and Lemma 3.2(5), we compute ∗ ∗ ∗ W12 V12 F12 V23 F23 W12 F23 V23 W23 F12 Vω = σH (W ∗ )W ∗ VP FK W σH (W )F = W23 ∗ ∗ 0 ∗ = W12 W23 V12 (1 ⊗ U ⊗ 1)F12 V23 ω13 V23 (U ⊗ 1 ⊗ 1)W23 F12 ∗ ∗ 0 = W12 W23 V12 (U ⊗ U ⊗ 1)F12 (Id ⊗ δˆ0 )(ω 0 )ω23 (U ⊗ U ⊗ 1)F12 ∗ ∗ 0 = W12 W23 V12 (U ⊗ U ⊗ 1)F12 (δˆ0 ⊗ Id)(ω 0 )ω12 (U ⊗ U ⊗ 1)F12
ˆ ∗ ∗ 0 ˆ Vˆ 12 ((F ω 0 (U ⊗ U )F ) ⊗ 1) = W12 W23 V12 (U ⊗ U ⊗ 1)Vˆ ∗12 ω23 ∗ ∗ W23 (1 ⊗ ((U ⊗ 1)ω 0 (U ⊗ 1))((V F (U ⊗ U )ω 0 (U ⊗ U )F ) ⊗ 1) = W12
= (((U ⊗ 1)ω 0∗ (U ⊗ 1)V F (U ⊗ U )ω 0 (U ⊗ U )F ) ⊗ 1) . Moreover, when ω is a normalized cocycle (such a choice is always possible up to equivalence as shown in [19, Chap. 8]), we have κ⊗κ ˆ)(ω ∗ )F = ω , F (U ⊗ U )ω 0 (U ⊗ U )F = F (J ⊗ J)ω(J ⊗ J)F = F (ˆ and so Vω = (1 ⊗ U )F ω ∗ F (1 ⊗ U )V ω. Let Aˆω be the Kac algebra whose underlying algebra is Aˆ with the coproduct δˆω coming from Vω , i.e., δˆω (x) = Vω∗ (1 ⊗ x)Vω (and the other Kac algebra operations
unchanged). Let Γ : M → M ⊗ Â, Γ^ω : M → M ⊗ Â^ω be actions of Â, Â^ω respectively defined by
\[ \Gamma(x) = \sum_{i,j=1}^{n} S_i^* \rho\bar\rho(x) S_j \otimes S_i S_j^* , \qquad \Gamma^\omega(x) = \sum_{i,j=1}^{n} S_i^* W^* \rho\bar\rho(x) W S_j \otimes S_i S_j^* . \]
Then, N and P are the fixed-point algebras of the actions Γ and Γ^ω respectively. From the arguments so far we have
Corollary 3.6. With the above notations we have
(1) λ_{VF} λ_W = λ_W λ_{VωF}.
(2) δ̂^ω(x) = ω∗δ̂(x)ω, x ∈ Â.
(3) Γ^ω = Ad(Z) · Γ with
\[ Z = \sum_{i,j=1}^{n} S_i^* W^* S_j \otimes S_i S_j^* , \]
and Z satisfies (Id ⊗ δ̂)(Z) = (1 ⊗ ω)(Z ⊗ 1)(Γ ⊗ Id)(Z).
(4) The representation categories of Â and Â^ω are isomorphic as an abstract tensor category.
Proof. The first three facts follow from easy computations, and (4) follows from the fact that the two categories have the same realization in End(M) generated by ρρ̄.
In fact, the above argument also shows that if two finite-dimensional Kac algebras A and B have isomorphic representation categories, then A and B^ω are isomorphic Kac algebras for some cocycle ω ∈ Z²(B) (as the corresponding depth 2 inclusions have isomorphic canonical endomorphisms).
Corollary 3.7. Let the notations be as above.
(1) Let P be a subfactor of M satisfying [γP] = [γN] in Sect(M). Then, there exist a normalized cocycle ωP ∈ Z²(Â) and a unitary ZP ∈ M ⊗ Â satisfying (Id ⊗ δ̂)(ZP) = (1 ⊗ ωP)(ZP ⊗ 1)(Γ ⊗ Id)(ZP) such that ΓP = Ad(ZP) · Γ is an action of Â^{ωP} on M whose fixed-point algebra is P.
(2) Assume that P and Q are subfactors satisfying the above condition. Then, P and Q are inner conjugate if and only if [ωP] = [ωQ] in H²(Â).
Proof. We must see inner conjugacy of P and Q for [ωP] = [ωQ] ∈ H²(Â). We may and do assume ωP = ωQ as we may replace ZP with (1 ⊗ v∗)ZP if v ∈ Â is a unitary satisfying ωP = δG(v)ωQ(v∗ ⊗ v∗). Then, direct computation shows
that X = ZQ ZP∗ is a ΓP-cocycle, i.e., (Id ⊗ δ_G^{ωP})(X) = (X ⊗ 1)(ΓP ⊗ Id)(X). With ΓQ = Ad(X) · ΓP we can show, as in the case of group actions, that the identity of M extends to an isomorphism from the crossed product by ΓP onto that by ΓQ. Since the crossed product by ΓP is the basic extension of P = M^{ΓP} ⊂ M, the desired inner conjugacy comes from uniqueness (up to inner perturbation) of downward basic construction.
4. Non-Uniqueness of Permutation Symmetries
Let G be a finite group, and CG be its group algebra with the usual coproduct δG. We identify G with its image in the regular representation and keep the notations in the previous section. We freely use the terminology in [35] for cocycles of the dual of G, and the readers are recommended to consult [35] or references there for the facts stated below without proofs. We use the notations Z²(Ĝ) and H²(Ĝ) for Z²(CG) and H²(CG) respectively. Note that when G is abelian Z²(CG) (resp. H²(CG)) is the usual 2-cocycle group Z²(Ĝ, T) (respectively the second cohomology H²(Ĝ, T)) of the dual group Ĝ via the duality CG ≅ ℓ^∞(Ĝ).
As was seen in the previous section, for a normalized cocycle ω ∈ Z²(Ĝ) the Kac algebra (CG)^ω has the same representation category as G. However, in general δ_G^ω is not cocommutative. Even if δ_G^ω is cocommutative, i.e., (CG)^ω is isomorphic to the group algebra CGω with some group Gω thanks to the structure theorem of cocommutative Kac algebras, there is no reason to expect that the new group Gω is isomorphic to G. Indeed, the purpose of this section is to construct an example of a group G and a cocycle ω ∈ Z²(Ĝ) such that Gω is not isomorphic to G. Such an example shows that it is inevitable to fix a permutation symmetry in the Doplicher–Roberts duality theorem in order to uniquely recover the group from its representation category (see [7]). We would like to point out that contents in this section have some overlap with the recent works [6, 9], and in fact Lemmas 4.1 and 4.3 below are also obtained there.
For a cocycle ω ∈ Z²(K, T) of a finite abelian group K, βω(k, l) = ω(k, l) \overline{ω(l, k)} (k, l ∈ K) gives us a skew-symmetric bicharacter. It is well-known that the map ω ↦ βω induces an isomorphism from H²(K, T) into the set of skew-symmetric bicharacters.
Lemma 4.1. Let G be a finite group, and we assume that ω ∈ Z²(CG) is a normalized cocycle. Then the following two conditions are equivalent:
(1) δ_G^ω is cocommutative.
(2) There exist an abelian normal subgroup N ⊂ G and a non-degenerate cocycle ω1 ∈ Z²(N̂) satisfying [ω] = [ω1] in H²(Ĝ) and [^gω1] = [ω1] in H²(N̂) for each g ∈ G, where ^gω1 = (g ⊗ g)ω1(g⁻¹ ⊗ g⁻¹). (Note that Z²(N̂) is regarded as a subset of Z²(Ĝ) since CN is a Hopf subalgebra of CG.)
Proof. Thanks to [35, Theorem 2], the class of ω uniquely corresponds to a conjugacy class of a full multiplicity ergodic action θ of G on a finite-dimensional von
Neumann algebra $B$. Then, $\theta$ is induced by a full multiplicity ergodic action $\theta^1$ of a subgroup $N$ on a full matrix algebra $B_1$. Let $\omega_1 \in Z^2(\hat{N})$ be the cocycle whose class corresponds to the class of $\theta^1$. Then, since $B_1$ is a matrix algebra, $\omega_1$ is non-degenerate [35, Theorem 12]. Thus, the naturality of the one-to-one correspondence between $H^2(\hat{G})$ and the conjugacy classes of full multiplicity ergodic actions of $G$ shows $[\omega] = [\omega_1]$ in $H^2(\hat{G})$.

Now, we assume that (1) holds. Then, since $\omega_1$ is non-degenerate, $(\mathbb{C}N)^{\omega_1}$ being cocommutative implies that $N$ is commutative [35, Theorem 12]. Since $\delta_G^\omega$ is cocommutative, we have $F\omega_1^*\delta_G(g)\omega_1 F = \omega_1^*\delta_G(g)\omega_1$ ($g \in G$), which is, by simple computation, equivalent to
\[
(g^{-1} \otimes g^{-1})\beta_{\omega_1}(g \otimes g) = \beta_{\omega_1} \quad \text{for each } g \in G. \tag{4.1}
\]
Set $A = \bigcap_{g \in G} \mathbb{C}\,gNg^{-1}$. Then, (4.1) implies $\beta_{\omega_1} \in A \otimes A$, and so $A = \mathbb{C}N$ because $\omega_1$ is non-degenerate (see the condition (C1) in [35, Theorem 12]). Thus, $N$ is normal. As $N$ is abelian, it is well known that the cohomology class $[\omega_1] \in H^2(\hat{N}, \mathbb{T})$ is determined by $\beta_{\omega_1}$, and so $[\omega_1] = [{}^g\omega_1]$ in $H^2(\hat{N})$. The converse implication follows from the same computation.

Remark 4.2. (1) The reasoning in Lemma 4.1 works for compact groups as well with simple modification because it relies only on A. Wassermann's work [35], where compact groups are dealt with. Let us assume, in the above, that $G$ is a compact connected group with a closed abelian normal subgroup $N$. Under such circumstances $N$ must be a central subgroup as the action of $G$ on the (discrete) dual group of $N$ is trivial, and consequently we have $\delta_G^\omega = \delta_G$. This observation actually enables us to obtain a new proof for D. Handelman's result that the representation category uniquely recovers group structure for a compact connected group (though his statement in [14, Theorem 2.14] is much stronger). The authors are grateful to R. Longo for fruitful discussions on this subject.

(2) Let $G$ be a finite group with a cocycle $\omega \in Z^2(\hat{G})$ satisfying the condition of Lemma 4.1. Then, there exists a permutation symmetry $E^\omega$ of the tensor category $\mathrm{Rep}(G)$ of the unitary representations of $G$ coming from the canonical permutation symmetry of $\mathrm{Rep}(G_\omega)$ through the isomorphism between $\mathrm{Rep}(G)$ and $\mathrm{Rep}(G_\omega)$ (see Corollary 3.6(4)). For $(\pi_i, H_i) \in \mathrm{Rep}(G_\omega)$, $i = 1, 2$, $E^\omega : H_1 \otimes H_2 \to H_2 \otimes H_1$ is given by
\[
E^\omega = (\pi_2 \otimes \pi_1)(\omega)\,E\,(\pi_1 \otimes \pi_2)(\omega^*) = (\pi_2 \otimes \pi_1)(\beta_{\omega_1})\,E,
\]
where $E$ is the canonical permutation symmetry of $\mathrm{Rep}(G)$ and $\omega_1 \in Z^2(\hat{N})$ is a cocycle equivalent to $\omega$ in $H^2(\hat{G})$. Although this follows from a purely category theoretical argument, we present a proof based on our model for the convenience of those not familiar with category theory (including the authors). First, we assume $\omega = \omega_1$.
Let $(L, \ell^2(G))$ be the left regular representation of $G$, and $V$ be the multiplicative unitary on $\ell^2(G) \otimes \ell^2(G)$ defined by $Vf(g, h) = f(hg, h)$,
$g, h \in G$. Then, the image of $\bar\rho_V$ is the fixed-point algebra $O_n^\alpha$ under the $G$-action $\alpha$ defined by $\alpha_g = \lambda_{L_g}$, $g \in G$. We denote $\bar\rho_V$ by $\theta$ when regarded as an isomorphism from $O_n$ onto $O_n^\alpha$. Let $\sigma_G$ be the restriction of $\sigma_H$ to $O_n^\alpha$. Then, $\theta\lambda_{VF} = \sigma_G\theta$ holds, showing that $\lambda_{VF}$ is conjugate to $\sigma_G$. By simple computation based on the pentagon equation and $V_{13}V_{23} = V_{23}V_{13}$, we get $\theta(VFV^*) = F$, which shows that $VFV^*$ is the canonical permutation symmetry for $\lambda_{VF}$ coming from $\sigma_G$ through $\theta$. Thus, $E^\omega$ acting on $\ell^2(G) \otimes \ell^2(G)$ is given by $\theta(V_P F_K V_P^*)$, where $V_P$ and $F_K$ are as in Sec. 3. By straightforward computation using the cocycle relation of $\omega'$ and the fact that $N$ is abelian, we get $V_P F_K V_P^* = V\beta_\omega F V^*$. Since $\beta_\omega$ commutes with $L_g \otimes L_g$, we can conclude $\theta(V_P F_K V_P^*) = \beta_\omega F$. Assume now that $\omega = \delta_G(v)\omega_1(v^* \otimes v^*)$ with a unitary $v \in \mathbb{C}G$. From the above computation we get $E^\omega = (\pi_2 \otimes \pi_1)(\omega)E(\pi_1 \otimes \pi_2)(\omega^*)$, which is equal to $(\pi_2 \otimes \pi_1)(\beta_{\omega_1})E$ as $\beta_{\omega_1}$ commutes with $\delta_G(\mathbb{C}G)$.

We assume that $N$ and $\omega = \omega_1$ satisfy the condition (2) in Lemma 4.1. Since $N$ is abelian, by the duality we can regard $\omega$ as a group 2-cocycle in $Z^2(\hat{N}, \mathbb{T})$. In what follows, we use the following notation: For $n \in N$, $g \in G$, and $\tau \in \hat{N}$, we set $n^g = g^{-1}ng$ and $\tau^g(n) = \tau(n^g)$. Also, for $\xi : \hat{N} \to \mathbb{T}$, we set $\partial\xi(\sigma, \tau) = \xi(\sigma)\xi(\tau)\xi(\sigma\tau)^{-1}$.

Lemma 4.3. With these notations we have
(1) There exist $\xi_g : \hat{N} \to \mathbb{T}$ for each $g \in G$ and $\eta \in Z^2(G, N)$ such that for any pair $\sigma, \tau \in \hat{N}$ we have
\[
\omega(\sigma^g, \tau^g) = \partial\xi_g(\sigma, \tau)\,\omega(\sigma, \tau), \qquad
\xi_{g_1}(\sigma)\,\xi_{g_2}(\sigma^{g_1}) = \sigma(\eta(g_1, g_2))\,\xi_{g_1 g_2}(\sigma).
\]
(2) We may take $\xi_g$ so that $\xi_{gn} = \xi_g$, $\xi_e = 1$ hold for all $n \in N$, $g \in G$, and $\eta$ is regarded as an element in $Z^2(G/N, N)$. Then $\{\xi_g g\}_{g \in G}$ is the set of group-like elements of $(\mathbb{C}G)^\omega$, and $G_\omega$ is given by the extension
\[
1 \to N \to G_\omega \to G/N \to 1
\]
whose difference from the extension $G$ is given by $\eta$.

Proof. (1) Since the cohomology class of $\omega$ is invariant under the $G$-action, $\xi_g$ satisfying the first equality exists. The associativity of $G$ implies
\[
\partial\xi_{g_1}(\sigma, \tau)\,\partial\xi_{g_2}(\sigma^{g_1}, \tau^{g_1}) = \partial\xi_{g_1 g_2}(\sigma, \tau),
\]
showing the existence of $\eta(g_1, g_2) \in N$ as above.
(2) Since $N$ is abelian, $\omega(\sigma^n, \tau^n) = \omega(\sigma, \tau)$ holds for each $n \in N$ and we may assume $\xi_{gn} = \xi_g$, $\xi_e = 1$. Note that through the duality we have $\ell^\infty(\hat{N}) = \mathbb{C}N \subset \mathbb{C}G$. Let $\tilde{g} = \xi_g g \in \mathbb{C}G$ for $g \in G$. Then,
\[
\delta_G^\omega(\tilde{g}) = \omega^*\delta_G(\xi_g)(g \otimes g)\omega = \omega^*\delta_G(\xi_g)\,\partial\xi_g\,\omega\,(g \otimes g) = \tilde{g} \otimes \tilde{g},
\]
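which shows that $\{\tilde{g}\}_{g \in G}$ is the set of group-like elements of $(\mathbb{C}G)^\omega$, and so $G_\omega = \{\tilde{g}\}_{g \in G}$ by definition. For $g, h \in G$, we have
\[
\tilde{g}\tilde{h} = \xi_g(g\xi_h g^{-1})\,gh = \xi_g(g\xi_h g^{-1})\,\xi_{gh}^*\,\widetilde{gh} = \eta(g, h)\,\widetilde{gh},
\]
which proves the lemma.

Remark 4.4. Since $\hat{N}$ is a finite abelian group, it has a unique direct product decomposition $\hat{N} = \hat{N}_e \times \hat{N}_o$, where $\hat{N}_e$ is a 2-group and $\hat{N}_o$ is an odd group. Due to $H^2(\hat{N}, \mathbb{T}) = H^2(\hat{N}_e, \mathbb{T}) \times H^2(\hat{N}_o, \mathbb{T})$ (see [32]) we may assume $\omega = \omega_e\omega_o$ with $\omega_e \in Z^2(\hat{N}_e, \mathbb{T})$, $\omega_o \in Z^2(\hat{N}_o, \mathbb{T})$. Since $\beta_{\omega_o}$ has odd order, the class $[\omega_o]$ has a representative of the form $\beta_{\omega_o}^m$ for some $m \in \mathbb{N}$. Therefore, if $[\omega_o]$ is fixed by the $G$-action, we can choose a representative $\omega_o$ invariant under the $G$-action. Thus, $\omega_o$ has no contribution to $\eta$.

From now on, we consider the case $G = N \rtimes H$ with $N = \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$ and $H \subset SL(2, \mathbb{Z}/m\mathbb{Z})$. As shown in Remark 4.4, the case where $m$ is a power of 2 is essential for our purpose and we assume so. We identify $G/N$ with $H$ and regard the $\eta$ above as an element of $Z^2(H, N)$. We identify $\hat{N}$ with $\mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$ via the pairing $\langle (n_1, n_2), (\tau_1, \tau_2) \rangle = \zeta_m^{n_1\tau_1 + n_2\tau_2}$, where $\zeta_m = \exp(2\pi i/m)$. For $n \in N$, $\tau \in \hat{N}$, and
\[
h = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}/m\mathbb{Z}),
\]
we use the convention ${}^t(n^h) = h^{-1} \cdot {}^t n$ so that $\tau^h = \tau \cdot h = (a\tau_1 + c\tau_2,\ b\tau_1 + d\tau_2)$. We define a cocycle $\omega \in Z^2(\hat{N}, \mathbb{T})$ by $\omega((x_1, x_2), (y_1, y_2)) = \zeta_m^{x_1 y_2}$. Then, since $\beta_\omega$ is invariant under the $SL(2, \mathbb{Z}/m\mathbb{Z})$-action, $G$, $N$, and $\omega$ satisfy the assumption of Lemma 4.3. It is routine to show the following:

Lemma 4.5. For $(s, t) \in \hat{N}$ and
\[
h = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}/m\mathbb{Z}),
\]
a general solution of the equation ${}^h\omega = \partial\xi_h\,\omega$ in Lemma 4.3(1) is given by
\[
\xi_h(s, t) = \xi_h(1, 0)^s\,\xi_h(0, 1)^t\,\zeta_m^{-\left[\frac{ab\,s(s-1)}{2} + \frac{cd\,t(t-1)}{2} + bcst\right]},
\]
where $\xi_h(1, 0)$ and $\xi_h(0, 1)$ satisfy $\xi_h(1, 0)^m = (-1)^{ab}$, $\xi_h(0, 1)^m = (-1)^{cd}$. In particular, the following is a solution:
\[
\xi_h(s, t) = \zeta_{2m}^{-(abs^2 + cdt^2 + 2bcst)} \quad \text{with } 0 \le a, b, c, d < m.
\]

We fix $\xi_h$ as in the last formula in the above lemma. We consider the case $m = 4$ with $H = \{1, h_1, h_2, h_3\} \subset SL(2, \mathbb{Z}/4\mathbb{Z})$, where
\[
h_1 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \quad
h_2 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}, \quad
h_3 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.
\]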
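Then, direct computation shows
\[
\xi_{h_1}(s, t) = \zeta_4^{-s^2}, \quad \xi_{h_2}(s, t) = \zeta_4^{-t^2}, \quad \xi_{h_3}(s, t) = \zeta_4^{-(s^2 + t^2)}.
\]
Let $\{n_1, n_2\}$ be the natural basis of $N$, and we set $n_3 = n_1 n_2$. We have
\[
\eta(h_i, h_i) = n_i^2 \ (i = 1, 2, 3), \quad
\eta(h_1, h_2) = \eta(h_2, h_1) = 1, \quad
\eta(h_1, h_3) = \eta(h_3, h_1) = n_1^2, \quad
\eta(h_2, h_3) = \eta(h_3, h_2) = n_2^2.
\]
We set $k_i = \tilde{h}_i$, $i = 1, 2, 3$. Then, $G_\omega$ is generated by $\{n_i\} \cup \{k_i\}$, and they satisfy the following relations:
\[
k_1 n_1 = n_1 k_1, \quad k_1 n_2 = n_1^2 n_2 k_1, \quad k_2 n_2 = n_2 k_2, \quad k_2 n_1 = n_1 n_2^2 k_2,
\]
\[
k_i^2 = n_i^2 \ (i = 1, 2, 3), \quad
k_1 k_2 = k_2 k_1 = k_3, \quad
k_1 k_3 = k_3 k_1 = n_1^2 k_2, \quad
k_2 k_3 = k_3 k_2 = n_2^2 k_1.
\]
The last two relations follow from $\tilde{g}\tilde{h} = \eta(g, h)\widetilde{gh}$ together with $h_1 h_3 = h_2$ and $h_2 h_3 = h_1$.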
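The values of $\xi_{h_i}$ and $\eta$ for $m = 4$ can be double-checked by brute force. The following Python sketch is not part of the original argument. It stores every root of unity as an exponent of $\zeta_8$ and assumes the conventions stated in the text: $\tau^h = \tau \cdot h$, the pairing $\langle n, \tau \rangle = \zeta_4^{n_1\tau_1 + n_2\tau_2}$, and $\partial\xi(\sigma, \tau) = \xi(\sigma)\xi(\tau)\xi(\sigma\tau)^{-1}$. All function names are illustrative.

```python
from itertools import product

M = 4            # N = Z/4Z x Z/4Z
R = 2 * M        # roots of unity are stored as exponents of zeta_8

# The subgroup H = {1, h1, h2, h3} of SL(2, Z/4Z) used in the text.
H = {
    "1":  ((1, 0), (0, 1)),
    "h1": ((1, 2), (0, 1)),
    "h2": ((1, 0), (2, 1)),
    "h3": ((1, 2), (2, 1)),
}
PAIRS = list(product(range(M), repeat=2))   # elements of N and of its dual


def act(tau, h):
    """tau^h = tau . h = (a*t1 + c*t2, b*t1 + d*t2)."""
    (a, b), (c, d) = h
    return ((a * tau[0] + c * tau[1]) % M, (b * tau[0] + d * tau[1]) % M)


def mat_mul(g, h):
    """Product of two 2x2 matrices over Z/4Z."""
    return tuple(
        tuple(sum(g[i][k] * h[k][j] for k in range(2)) % M for j in range(2))
        for i in range(2)
    )


def omega(x, y):
    """omega((x1,x2),(y1,y2)) = zeta_4^(x1*y2), as an exponent of zeta_8."""
    return (2 * x[0] * y[1]) % R


def xi(h, sigma):
    """Particular solution of Lemma 4.5: zeta_8^-(ab s^2 + cd t^2 + 2bc st)."""
    (a, b), (c, d) = h
    s, t = sigma
    return (-(a * b * s * s + c * d * t * t + 2 * b * c * s * t)) % R


def d_xi(h, sigma, tau):
    """Coboundary of xi_h: xi(sigma) * xi(tau) / xi(sigma + tau)."""
    total = ((sigma[0] + tau[0]) % M, (sigma[1] + tau[1]) % M)
    return (xi(h, sigma) + xi(h, tau) - xi(h, total)) % R


def lemma_4_5_holds():
    """Check omega(s^h, t^h) = d_xi_h(s, t) * omega(s, t) for all h, s, t."""
    return all(
        omega(act(s, h), act(t, h)) == (d_xi(h, s, t) + omega(s, t)) % R
        for h in H.values() for s in PAIRS for t in PAIRS
    )


def eta(g1, g2):
    """Find n in N with xi_g1(s) * xi_g2(s^g1) = s(n) * xi_{g1 g2}(s) for all s."""
    g12 = mat_mul(g1, g2)
    for n in PAIRS:
        if all(
            (xi(g1, s) + xi(g2, act(s, g1)) - xi(g12, s)) % R
            == (2 * (n[0] * s[0] + n[1] * s[1])) % R
            for s in PAIRS
        ):
            return n
    return None
```

Here an element $n_1^x n_2^y$ of $N$ is written as the pair `(x, y)`, so that for instance `eta(H["h1"], H["h3"])` returning `(2, 0)` corresponds to $\eta(h_1, h_3) = n_1^2$.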
Theorem 4.6. With the 2-cocycle $\omega$ constructed so far the two groups $G$ and $G_\omega$ are non-isomorphic.

Proof. Let $Z(G)$, $Z(G_\omega)$ be the centers of $G$ and $G_\omega$ respectively. Then, we have $Z(G) = Z(G_\omega) = \{1, n_1^2, n_2^2, n_3^2\}$. We show that there exists no subgroup $K \subset G_\omega$ isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ with trivial intersection with $Z(G_\omega)$. Note that this will prove the theorem as $G$ does have such a subgroup. Let $G_\omega^{(2)}$ be the set of order 2 elements of $G_\omega$. Then, direct computation shows
\[
G_\omega^{(2)} = (Z(G_\omega) \setminus \{e\}) \cup c_1 Z(G_\omega) \cup c_2 Z(G_\omega) \cup c_3 Z(G_\omega) \cup c_4 Z(G_\omega),
\]
where $c_1 = n_1 k_1$, $c_2 = n_2 k_2$, $c_3 = n_1 k_3$, $c_4 = n_2 k_3$. Suppose that there exists a subgroup $K \subset G_\omega$ isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ with trivial intersection with $Z(G_\omega)$. Note that $K$ cannot include two elements of the forms $z_3 c_3$ and $z_4 c_4$ with $z_3, z_4 \in Z(G_\omega)$ because $c_3 c_4 = n_1 n_2^3$. Thus, $K$ should include two elements $z_1 c_1$ and $z_2 c_2$ for some $z_1, z_2 \in Z(G_\omega)$. On the other hand, we have $c_1 c_2 = n_1^3 n_2 k_3$, $c_2 c_1 = n_1 n_2^3 k_3$, which shows that this is impossible. Therefore, there is no such $K$.

Remark 4.7. A similar strategy works for $H = SL(2, \mathbb{Z}/4\mathbb{Z})$. Let $\Gamma$ and $\Gamma_\omega$ be the subgroups generated by the order 2 elements of $G$ and $G_\omega$ respectively. Then, we have $Z(\Gamma) = Z(\Gamma_\omega) = \{1, n_1^2, n_2^2, n_3^2\}$. It is possible to show that there is no subgroup of $\Gamma_\omega$ isomorphic to $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ with trivial intersection with $Z(\Gamma_\omega)$, and hence we can conclude that $G_\omega$ is not isomorphic to $G$.
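The criterion used in the proof of Theorem 4.6 is finite and can be tested by exhaustive search over the 64 elements of each group. The Python sketch below is an illustration, not the authors' computation. It assumes that an element $n_1^x n_2^y k_h$ is encoded as `((x, y), h)`, that $h$ acts on $N$ by matrix multiplication on column vectors (which reproduces $k_1 n_2 = n_1^2 n_2 k_1$), and that $\eta$ takes the values listed in the text; the encoding of $H$ by the integers 0 to 3 with XOR as product is a convenience of this sketch.

```python
from itertools import product

M = 4
# Index 0, 1, 2, 3 stands for 1, h1, h2, h3; H is a Klein four-group,
# so the product of indices is bitwise XOR (h1*h2 = h3, h_i^2 = 1).
MATS = {0: ((1, 0), (0, 1)), 1: ((1, 2), (0, 1)),
        2: ((1, 0), (2, 1)), 3: ((1, 2), (2, 1))}
IDENTITY = ((0, 0), 0)
ELEMENTS = [((x, y), h) for x, y in product(range(M), repeat=2) for h in range(4)]


def eta(g, h):
    """The cocycle eta on H with values in N, copied from the table in the text."""
    if g == 0 or h == 0:
        return (0, 0)
    if g == h:
        return {1: (2, 0), 2: (0, 2), 3: (2, 2)}[g]
    if {g, h} == {1, 2}:
        return (0, 0)
    return (2, 0) if {g, h} == {1, 3} else (0, 2)


def make_mul(twisted):
    """Multiplication of G = N x| H (twisted=False) or of G_omega (twisted=True)."""
    def mul(x, y):
        (n, g), (p, h) = x, y
        (a, b), (c, d) = MATS[g]
        e = eta(g, h) if twisted else (0, 0)
        return (((n[0] + a * p[0] + b * p[1] + e[0]) % M,
                 (n[1] + c * p[0] + d * p[1] + e[1]) % M), g ^ h)
    return mul


def center(mul):
    return [z for z in ELEMENTS if all(mul(z, x) == mul(x, z) for x in ELEMENTS)]


def involutions(mul):
    return [x for x in ELEMENTS if x != IDENTITY and mul(x, x) == IDENTITY]


def has_klein_four_avoiding_center(mul):
    """Is there a subgroup {e, a, b, ab} of type Z/2 x Z/2 meeting the center trivially?"""
    z = set(center(mul))
    outside = [x for x in involutions(mul) if x not in z]
    return any(
        mul(a, b) == mul(b, a) and mul(a, b) not in z
        for a in outside for b in outside if a != b
    )


mul_G = make_mul(twisted=False)
mul_G_omega = make_mul(twisted=True)
```

With these definitions, `has_klein_four_avoiding_center(mul_G)` and `has_klein_four_avoiding_center(mul_G_omega)` give different answers, which is the distinction the proof exploits. Cruder invariants such as the number of involutions do not separate the two groups, which is why the finer criterion is needed.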
5. Group-Subgroup Subfactors

Let $R$ be an infinite factor equipped with an outer action $\alpha$ of a finite group $G$. We assume that a subgroup $H$ of $G$ contains no non-trivial normal subgroup of $G$ (see [22, Proposition 3.1] for its meaning), and let $R^H$, $R^G$ be the fixed-point algebras under the $H$- and $G$-actions respectively. In this section, we determine the structure of $H^2(R^H \supset R^G)$, which will be parameterized by relative cohomology-like objects.

Definition 5.1. Let $G \supset H$ be a pair of finite groups. We say that a 2-cocycle $\omega \in Z^2(G, \mathbb{T})$ is $H$-invariant if it satisfies
\[
\omega(hg_1, g_2) = \omega(g_1, g_2 h) = \omega(g_1 h, h^{-1}g_2) = \omega(g_1, g_2) \quad \text{for } g_1, g_2 \in G \text{ and } h \in H.
\]
We say that two $H$-invariant 2-cocycles $\omega_1, \omega_2$ are equivalent if there exists a two-sided $H$-invariant function $\eta : G \to \mathbb{T}$ satisfying
\[
\omega_2(g_1, g_2) = \omega_1(g_1, g_2)\,\overline{\eta(g_1 g_2)}\,\eta(g_1)\,\eta(g_2).
\]
We denote by $Z_H^2(G, \mathbb{T})$ and $B_H^2(G, \mathbb{T})$ the set of $H$-invariant 2-cocycles and the set of $H$-invariant 2-cocycles equivalent to 1 respectively. The $H$-invariant cohomology group $H_H^2(G, \mathbb{T})$ is the quotient group of $Z_H^2(G, \mathbb{T})$ by $B_H^2(G, \mathbb{T})$.

Remark 5.2. When $G$ is a semi-direct product $G = N \rtimes H$, an $H$-invariant cocycle $\omega \in Z_H^2(G, \mathbb{T})$ is determined by its restriction $\omega|_N \in Z^2(N, \mathbb{T})$ to $N$, and $Z_H^2(G, \mathbb{T})$ can be identified with the subset of $Z^2(N, \mathbb{T})$ consisting of those elements $\omega$ satisfying
\[
\omega(n_1^h, n_2^h) = \omega(n_1, n_2) \quad \text{for } n_1, n_2 \in N \text{ and } h \in H.
\]
The corresponding coboundaries are given by functions $\eta : N \to \mathbb{T}$ satisfying
\[
\eta(n^h) = \eta(n) \quad (n \in N,\ h \in H).
\]
Therefore, $H_H^2(G, \mathbb{T})$ is isomorphic to a subgroup of the relative cohomology group $H^2(G/N, G; \mathbb{T}, \mathbb{T})$ (see [12, Introduction] for instance for the definition).
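To make the invariance condition of Remark 5.2 concrete, here is a small Python sketch for $N = \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ and $H = \mathbb{Z}/3\mathbb{Z}$ acting by cyclic permutation of the three non-trivial elements, so that $N \rtimes H$ is the alternating group $A_4$. The particular cocycle is an assumption of this sketch: it is read off from the multiplication table of the quaternion units $i, j, k$ (with $ij = k$, $jk = i$, $ki = j$ and $i^2 = j^2 = k^2 = -1$), which is one convenient way to obtain an invariant representative. The names `omega`, `SHIFT`, and the helper functions are illustrative.

```python
from itertools import product

# N = Z/2Z x Z/2Z written additively; A, B, C are its non-trivial elements.
E, A, B, C = (0, 0), (1, 0), (0, 1), (1, 1)
N = [E, A, B, C]

# Generator of H = Z/3Z: the cyclic permutation A -> B -> C -> A.
SHIFT = {E: E, A: B, B: C, C: A}


def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


def omega(x, y):
    """Cocycle from quaternion units: u_x u_y = omega(x, y) u_{x+y}."""
    if E in (x, y):
        return 1
    if x == y:
        return -1                       # i^2 = j^2 = k^2 = -1
    return 1 if SHIFT[x] == y else -1   # ij = k, jk = i, ki = j; reversed order gives -1


def is_cocycle():
    """omega(x, y) omega(x+y, z) = omega(y, z) omega(x, y+z) for all x, y, z."""
    return all(
        omega(x, y) * omega(add(x, y), z) == omega(y, z) * omega(x, add(y, z))
        for x, y, z in product(N, repeat=3)
    )


def is_shift_invariant():
    """omega(n1^h, n2^h) = omega(n1, n2) for the generator h of Z/3Z."""
    return all(omega(SHIFT[x], SHIFT[y]) == omega(x, y) for x, y in product(N, repeat=2))


def beta(x, y):
    """Skew-symmetrization omega(x, y) / omega(y, x); values are +-1 here."""
    return omega(x, y) * omega(y, x)
```

Since coboundaries on an abelian group are symmetric, `beta(A, B) == -1` shows that this invariant cocycle represents the non-trivial class of $H^2(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \mathbb{T})$.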
We will use the model inclusion for $R^H \supset R^G$ constructed in [16]. We denote by $\dot{g}$ ($g \in G$) the left coset $gH$. We fix a left transversal $\Omega$ of $G/H$ with $e \in \Omega$, and let $O_n$ (with $n = [G : H]$) be the Cuntz algebra with generators $\{S_{\dot{g}}\}_{g \in \Omega}$. We denote by $\mathcal{H}$ the linear span of $\{S_{\dot{g}}\}_{g \in \Omega}$ as before. Let $\theta$ be an action of $G$ on $O_n$ defined through the left translation of $G$ on $G/H$, and $O_H$, $O_G$ be the fixed-point algebras of $O_n$ under the $H$- and $G$-actions. We define an endomorphism $\gamma_{G/H}$ of $O_H$ by
\[
\gamma_{G/H}(x) = \sum_{g \in \Omega} S_{\dot{g}}\,\theta_g(x)\,S_{\dot{g}}^*. \tag{5.1}
\]
Note that this expression makes sense only for $x \in O_H$ and does not depend on the choice of $\Omega$. Let $\sigma_H$ be the canonical shift of $O_n$ and $\sigma_G$ be its restriction to $O_G$. Note that the restriction of $\gamma_{G/H}$ to $O_G$ coincides with $\sigma_G$ and so $\gamma_{G/H}^2 = \sigma_G \cdot \gamma_{G/H}$. The following statements are either well-known or easily deduced from well-known facts (see [8, 16] for example):
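Lemma 5.3. (1) The relative commutant $O_n \cap O_G'$ is trivial.
(2) $(\sigma_G^m, \sigma_G^n) = (\mathcal{H}^n\mathcal{H}^{*m})_G$, where $(\mathcal{H}^n\mathcal{H}^{*m})_G$ is the $G$-invariant part of $\mathcal{H}^n\mathcal{H}^{*m}$.
(3) $(\gamma_{G/H}^m, \gamma_{G/H}^n) \subset \mathcal{H}^n\mathcal{H}^{*m}$.
(4) The subspace $\{X \in O_n;\ X\theta_{g_1}(x) = \theta_{g_2}(x)X \text{ for each } x \in O_H\}$ (with $g_1, g_2 \in G$) is non-zero if and only if $g_1 H = g_2 H$.

We set
\[
S_0 = S_{\dot{e}} \quad \text{and} \quad T_0 = \frac{1}{\sqrt{n}}\sum_{g \in \Omega} S_{\dot{g}}.
\]
As in Sec. 3, we can show that $(\gamma_{G/H}, S_0, T_0)$ is a canonical model of the Q-system for the inclusion $R^H \supset R^G$ (see [16]).

Theorem 5.4. For a Q-system $(\gamma_{G/H}, S, T) \in Q(\gamma_{G/H})$, there exists a unique $H$-invariant 2-cocycle $\omega \in Z_H^2(G, \mathbb{T})$ satisfying
\[
S = \omega(e, e)S_{\dot{e}} \quad \text{and} \quad T = \frac{1}{\sqrt{n}}\sum_{g, k \in \Omega} \omega(g, g^{-1}k)\,S_{\dot{g}}S_{\dot{k}}S_{\dot{k}}^*,
\]
and every $\omega \in Z_H^2(G, \mathbb{T})$ is obtained in this way. Furthermore, two Q-systems are equivalent if and only if the corresponding $H$-invariant 2-cocycles are equivalent. Therefore, we can identify $H^2(R^H \supset R^G)$ with $H_H^2(G, \mathbb{T})$.

Proof. Let $(\gamma_{G/H}, S, T)$ be a Q-system. The intertwining property $\gamma_{G/H}(x)S_{\dot{g}} = S_{\dot{g}}\theta_g(x)$ for $x \in O_H$ (see (5.1)) and $T \in (\gamma_{G/H}, \gamma_{G/H}^2)$ yield
\[
S_{\dot{g}_2}^* S_{\dot{g}_1}^* T S_{\dot{g}_3}\,\theta_{g_3}(x) = \theta_{g_2}(x)\,S_{\dot{g}_2}^* S_{\dot{g}_1}^* T S_{\dot{g}_3} \quad (x \in O_H)
\]
for $g_1, g_2, g_3 \in \Omega$. Thus, $S_{\dot{g}_2}^* S_{\dot{g}_1}^* T S_{\dot{g}_3}$ is either 0 or a scalar (when $\dot{g}_2 = \dot{g}_3$) by Lemma 5.3, and one gets scalars $\omega_0(\dot{g}, \dot{k})$ ($g, k \in \Omega$) satisfying
\[
T = \frac{1}{\sqrt{n}}\sum_{g, k \in \Omega} \omega_0(\dot{g}, \dot{k})\,S_{\dot{g}}S_{\dot{k}}S_{\dot{k}}^*.
\]
Since $T$ is an isometry, we must have
\[
\sum_{g \in \Omega} |\omega_0(\dot{g}, \dot{k})|^2 = n \quad (k \in G). \tag{5.2}
\]
On the other hand, since $T \in O_H$, we must have the invariance $\omega_0(\dot{(hg)}, \dot{(hk)}) = \omega_0(\dot{g}, \dot{k})$ for $h \in H$. Hence, the function $\omega(g, k) = \omega_0(\dot{g}, \dot{(gk)})$ on $G \times G$ satisfies
\[
\omega(hg, k) = \omega(g, kh) = \omega(gh, h^{-1}k) = \omega(g, k) \quad \text{for each } g, k \in G \text{ and } h \in H,
\]
and $T$ is expressed as
\[
T = \frac{1}{\#H\sqrt{n}}\sum_{g \in \Omega,\, k \in G} \omega(g, g^{-1}k)\,S_{\dot{g}}S_{\dot{k}}S_{\dot{k}}^*
= \frac{1}{\#H\sqrt{n}}\sum_{g \in \Omega,\, k \in G} \omega(g, k)\,S_{\dot{g}}S_{\dot{(gk)}}S_{\dot{(gk)}}^*
= \frac{1}{\sqrt{n}}\sum_{g, k \in \Omega} \omega(g, k)\,S_{\dot{g}}S_{\dot{(gk)}}S_{\dot{(gk)}}^*.
\]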
Since $(\mathrm{Id}, \gamma_{G/H}) = \mathbb{C}S_0$, we have $S = cS_0$ for some $c \in \mathbb{T}$. Thus, the requirement (2.1) in Definition 2.1 is equivalent to $\omega(e, g) = \omega(g, e) = c$. We next show that (2.2) and (2.3) are equivalent to the 2-cocycle relation. Indeed, we compute
\[
\begin{aligned}
\gamma_{G/H}(T)T
&= \frac{1}{\sqrt{n}}\sum_{g, k \in \Omega} \omega(g, k)\,S_{\dot{g}}\,\theta_g(T)\,S_{\dot{(gk)}}S_{\dot{(gk)}}^*
= \frac{1}{\sqrt{n}}\sum_{g, k \in \Omega} \omega(g, k)\,\theta_g(S_{\dot{e}}\,T\,S_{\dot{k}}S_{\dot{k}}^*) \\
&= \frac{1}{n}\sum_{g, k, l, m \in \Omega} \omega(g, k)\,\omega(l, m)\,\theta_g(S_{\dot{e}}S_{\dot{l}}S_{\dot{(lm)}}S_{\dot{(lm)}}^*S_{\dot{k}}S_{\dot{k}}^*) \\
&= \frac{1}{n}\sum_{g, l, m \in \Omega} \omega(g, lm)\,\omega(l, m)\,\theta_g(S_{\dot{e}}S_{\dot{l}}S_{\dot{(lm)}}S_{\dot{(lm)}}^*).
\end{aligned}
\]
On the other hand, we have
\[
T^2 = \frac{1}{\sqrt{n}}\sum_{g, l \in \Omega} \omega(g, l)\,\theta_g(S_{\dot{e}}S_{\dot{l}}S_{\dot{l}}^*)\,T
= \frac{1}{n}\sum_{g, l, m \in \Omega} \omega(g, l)\,\omega(gl, m)\,\theta_g(S_{\dot{e}}S_{\dot{l}}S_{\dot{(lm)}}S_{\dot{(lm)}}^*)
\]
thanks to $\omega(nh, m) = \omega(n, hm)$ ($m, n \in G$, $h \in H$). Hence, the requirement (2.3) means the cocycle relation
\[
\omega(g, lm)\,\omega(l, m) = \omega(g, l)\,\omega(gl, m).
\]
Note that we must show $|\omega(g, h)| = 1$ ($g, h \in G$). But, in a similar way as above, $TT^* = \gamma_{G/H}(T)^*T$ (i.e., (2.2)) implies
\[
\overline{\omega(l, m)}\,\omega(g, l) = \overline{\omega(gl, m)}\,\omega(g, lm).
\]
By setting $l = e$ here, we conclude $|\omega(g, m)| = 1$ thanks to (5.2).

The above computation also shows that every $H$-invariant cocycle gives a Q-system and this correspondence is one-to-one. Also, close examination of the above computations shows that the correspondence preserves the equivalence relations.

Remark 5.5. A few remarks are in order.
(1) For $\omega \in Z_H^2(G, \mathbb{T})$ we may always assume $\omega(g, e) = \omega(e, g) = 1$ for each $g \in G$ up to equivalence. Then, the $H$-invariance of $\omega$ shows
\[
\omega(h, g) = \omega(g, h) = 1 \quad \text{for each } g \in G,\ h \in H,
\]
which is, in fact, equivalent to the $H$-invariance for $\omega \in Z^2(G, \mathbb{T})$ satisfying the above requirement. For such a cocycle, we can find unitaries $\{u_g\}_{g \in G} \subset R$ satisfying
\[
u_{gk} = \omega(g, k)\,u_g\,\alpha_g(u_k) \ (g, k \in G) \quad \text{and} \quad u_h = 1 \text{ for } h \in H.
\]
Let $P$ be the fixed-point algebra under the perturbed action $\beta = \mathrm{Ad}(u_g) \cdot \alpha_g$. The class of the Q-system for the inclusion $R^H\ (= R^{(\alpha, H)} = R^{(\beta, H)}) \supset P$ probably corresponds to the class of $\omega$ in the theorem.
(2) The kernel of the natural homomorphism from $H_H^2(G, \mathbb{T})$ to $H^2(G, \mathbb{T})$ is not trivial in general. Indeed, let us assume $\omega(g, k) = \xi(g)\,\xi(k)\,\overline{\xi(gk)}$ with a function $\xi : G \to \mathbb{T}$. Then, $\omega$ falls into $Z_H^2(G, \mathbb{T})$ if and only if $\xi$ is a character on $H$ and
\[
\xi(hg) = \xi(gh) = \xi(h)\xi(g), \quad g \in G,\ h \in H.
\]
On the other hand, $\omega \in B_H^2(G, \mathbb{T})$ if and only if $\xi$ extends to a character of $G$. Let $G$ be a finite group with $[G, G] = G$ and $H$ be a non-normal subgroup with $\#H = 2$. The non-trivial character on $H$ cannot be extended to the whole group $G$ (since $G$ does not admit any), and hence there is a non-trivial $H$-invariant cocycle that is a coboundary in $B^2(G, \mathbb{T})$.
(3) When $G$ is a semi-direct product $G = N \rtimes H$, there is a natural homomorphism from $H_H^2(G, \mathbb{T})$ to $H^2(N, \mathbb{T})$ (recall Remark 5.2). We now describe the kernel. Assume that $\omega \in Z_H^2(G, \mathbb{T})$ and that there is $\eta : N \to \mathbb{T}$ satisfying
\[
\omega(n_1, n_2) = \eta(n_1)\,\eta(n_2)\,\overline{\eta(n_1 n_2)}, \quad n_1, n_2 \in N.
\]
By the invariance $\omega(n_1^h, n_2^h) = \omega(n_1, n_2)$, $h \in H$ (Remark 5.2), one can choose a character $\chi_h$ (for each $h \in H$) of $N$ satisfying
\[
\eta(n^h) = \chi_h(n)\,\eta(n), \quad n \in N,\ h \in H.
\]
This $\chi = \{\chi_h\}_{h \in H}$ gives an element of $Z^1(H, \mathrm{Hom}(N, \mathbb{T}))$. If there exists an $H$-invariant function $\zeta : N \to \mathbb{T}$ satisfying
\[
\omega(n_1, n_2) = \zeta(n_1)\,\zeta(n_2)\,\overline{\zeta(n_1 n_2)}, \quad n_1, n_2 \in N,
\]
then there exists a character $\tau$ of $N$ satisfying $\zeta(n) = \tau(n)\eta(n)$, $n \in N$. This happens if and only if $\chi \in B^1(H, \mathrm{Hom}(N, \mathbb{T}))$. Therefore, there exists an injection from the kernel into $H^1(H, \mathrm{Hom}(N, \mathbb{T}))$. This actually comes from the exact sequence
\[
0 \to H^1(G/N, \mathrm{Hom}(N, \mathbb{T})) \to H^2(G/N, G; \mathbb{T}, \mathbb{T}) \to H^2(N, \mathbb{T}).
\]

Corollary 5.6. For an inclusion $M \supset N$ with the principal graph $E_6^{(1)}$ we have $\#H^2(M \supset N) = 2$.
Proof. The alternating group $A_4$ of degree 4 is a semi-direct product $(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}) \rtimes \mathbb{Z}/3\mathbb{Z}$, where the generator of $\mathbb{Z}/3\mathbb{Z}$ acts as the cyclic permutation of the non-trivial elements in $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. It is known [13, 16] that $M \supset N$ is always of the form $R^{\mathbb{Z}/3\mathbb{Z}} \supset R^{A_4}$. Since we can choose a $\mathbb{Z}/3\mathbb{Z}$-invariant representative of the unique non-trivial element in $H^2(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \mathbb{T})$, we get a non-trivial element in $H_{\mathbb{Z}/3\mathbb{Z}}^2(A_4, \mathbb{T})$ thanks to Remark 5.2. The result now follows from Remark 5.2 and Remark 5.5(3) as $H^1(\mathbb{Z}/3\mathbb{Z}, \mathrm{Hom}(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \mathbb{T}))$ is easily shown to be trivial.

Remark 5.7. For the dual inclusion of $R^H \supset R^G$ or a little broader class of subfactors the following conjecture is plausible: Let $\theta$ be an ergodic action of $G$ on a finite-dimensional von Neumann algebra $A$, and we set $M = (A \otimes R)^G \supset N = (\mathbb{C} \otimes R^G)$. The basic extension $M_1$ for $M \supset N$ is identified with $(B(L^2(A)) \otimes R)^G$ with the GNS Hilbert space $L^2(A)$ of the unique $G$-invariant trace on $A$. In view of [30], we conjecture that $H^2(M_1 \supset M)$ is in one-to-one correspondence with the set of conjugacy classes of ergodic $G$-actions on $B$ such that the natural representation of $G$ on $L^2(B)$ is equivalent to that on $L^2(A)$. Note that for $A = \ell^2(G/H)$ (with
the natural $G$-action), $M \supset N$ is isomorphic to $R^H \supset R^G$, and so it is well-known that $M_1 \supset M$ is isomorphic to $P \rtimes G \supset P \rtimes H$ for some factor $P$. For the special case $H = \{e\}$, the conjecture is certainly true thanks to what was mentioned in the paragraph before Corollary 3.4 and the one-to-one correspondence (in [35]) between the full multiplicity ergodic actions and $H^2(\hat{G})$.

6. Small Index Cases

In this section, we determine the structure of $H^2(M \supset N)$ for subfactors of index less than or equal to 4.

Lemma 6.1. Let $M \supset N$ be an irreducible inclusion of factors of finite index, and $N \subset M \subset M_1 \subset M_2 \subset \cdots$ be the corresponding Jones tower. If
\[
M_1 \cap N' \cong \mathbb{C} \oplus \mathbb{C} \quad \text{and} \quad M_2 \cap N' \cong M_2(\mathbb{C}) \oplus \mathbb{C},
\]
then $H^2(M \supset N)$ is trivial.

Proof. The assumption $M_2 \cap N' \cong M_2(\mathbb{C}) \oplus \mathbb{C}$ means that the inclusion map $\rho$ of $N$ into $M$ satisfies $[\rho\bar\rho\rho] = 2[\rho] \oplus [\sigma]$ with an irreducible sector $\sigma$. Let $P$ be a subfactor of $M$ whose canonical endomorphism is equivalent to that of $N$, and $\rho'$ be its inclusion map into $M$. The assumption $M_1 \cap N' \cong \mathbb{C} \oplus \mathbb{C}$ and the Frobenius reciprocity guarantee
\[
\dim(\bar\rho'\rho, \bar\rho'\rho) = \dim(\rho\bar\rho, \rho'\bar\rho') = \dim(\rho\bar\rho, \rho\bar\rho) = \dim(\bar\rho\rho, \bar\rho\rho) = 2, \tag{6.1}
\]
\[
[\bar\rho'\rho] = [\mu_1] \oplus [\mu_2] \tag{6.2}
\]
with irreducible sectors $\mu_1, \mu_2$. Another use of the Frobenius reciprocity shows that both of $\rho'\mu_1$ and $\rho'\mu_2$ contain $\rho$ with multiplicity one. On the other hand, we have
\[
[\rho'\mu_1] \oplus [\rho'\mu_2] = [\rho'\bar\rho'\rho] = [\rho\bar\rho\rho] = 2[\rho] \oplus [\sigma]
\]
so that one of the $\mu_i$'s, say $\mu_1$, must satisfy $[\rho'\mu_1] = [\rho]$. This means $d(\mu_1) = 1$ and $[\rho'] = [\rho\mu_1]$ ($\in \mathrm{Sect}(M, P)$), showing that $P$ is inner conjugate to $N$.

Among irreducible subfactors with indices less than or equal to 4 (the reducible cases can be easily treated), those not covered by the above lemma are the ones with the principal graphs $D_4$, $E_6$, $D_n^{(1)}$, and $E_6^{(1)}$. For $D_4$, $D_4^{(1)}$, and $E_6^{(1)}$, we already know the answers: the first two cases are given by fixed-point algebras under finite group actions and the last was treated in Corollary 5.6. For $E_6$ and $D_n^{(1)}$ ($n > 4$) we have

Proposition 6.2. Let $M \supset N$ be an inclusion of factors with the principal graph either $E_6$ or $D_n^{(1)}$, $n > 4$. Then, $H^2(M \supset N)$ is trivial.
Proof. We begin with the principal graph $E_6$. The same notations as in the proof of the previous lemma are used ((6.1), (6.2) are still valid), but we have the irreducible decomposition $[\rho\bar\rho\rho] = 2[\rho] \oplus [\sigma_1] \oplus [\sigma_2]$ with dimensions
\[
d(\rho) = d(\sigma_1) = \frac{\sqrt{3} + 1}{\sqrt{2}}, \quad d(\sigma_2) = \sqrt{2}.
\]
As before, $\rho'\mu_1$ and $\rho'\mu_2$ contain $\rho$ with multiplicity one. We claim that either $\rho'\mu_1$ or $\rho'\mu_2$ must be $\rho$ (which shows inner conjugacy as before). Suppose we had $[\rho'\mu_1] = [\rho] \oplus [\sigma_1]$ and $[\rho'\mu_2] = [\rho] \oplus [\sigma_2]$. Then, $\bar\rho'\sigma_2$ would contain $\mu_2$ with multiplicity one by the Frobenius reciprocity, and $d(\mu_2) = \sqrt{3}$. Comparing dimensions, we conclude $[\bar\rho'\sigma_2] = [\mu_2] \oplus [\theta]$ with $d(\theta) = 1$. But, the Frobenius reciprocity again shows $[\sigma_2] = [\rho'\theta]$, contradicting $d(\sigma_2) \ne d(\rho')$. Therefore, $H^2(M \supset N)$ is trivial for the $E_6$ case.

Now we assume that the principal graph is $D_n^{(1)}$, $n > 4$, and let $\rho$, $\rho'$ be as above. Note that (6.1) is replaced by
\[
\dim(\bar\rho'\rho, \bar\rho'\rho) = \dim(\rho\bar\rho, \rho'\bar\rho') = \dim(\rho\bar\rho, \rho\bar\rho) = 3
\]
so that $\bar\rho'\rho$ is decomposed into 3 irreducible sectors. Since $d(\bar\rho'\rho) = 4\ (< 3\sqrt{2})$, $\bar\rho'\rho$ must contain a sector of dimension 1, showing that $N$ and $P$ are inner conjugate.

Summing up the arguments so far, we have shown

Theorem 6.3. Let $M \supset N$ be an inclusion of factors of index less than or equal to 4. Except for the following two cases, $H^2(M \supset N)$ is trivial:
(1) If the principal graph is $D_4^{(1)}$ and $N$ is the fixed-point algebra under a $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$-action, then $H^2(M \supset N)$ has exactly two elements.
(2) If the principal graph is $E_6^{(1)}$, then $H^2(M \supset N)$ has exactly two elements.

For (1) we note $H^2(\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \mathbb{T}) = \mathbb{Z}/2\mathbb{Z}$ while $H^2(\mathbb{Z}/4\mathbb{Z}, \mathbb{T})$ is trivial (see the paragraph after Definition 2.3).

Before ending the final section, we present an easy but handy criterion, showing non-triviality of $H^2(M \supset N)$ for generic $SU(N)_k$-Hecke algebra subfactors (see [36]) for $N \ge 3$ (apply the criterion below to the fundamental representation).

Proposition 6.4. Let $\rho$ be an irreducible endomorphism of a factor $M$ satisfying $d(\rho) < \infty$. When $[\rho\bar\rho] = [\bar\rho\rho]$ and $\rho^2$ contains no sector of dimension 1, we have $\#H^2(M \supset \rho(M)) \ge 2$.

Proof. It suffices to show that $\rho(M)$ and $\bar\rho(M)$ are not inner conjugate. Indeed, if they were, i.e., $\bar\rho \cdot \alpha = \mathrm{Ad}(u) \cdot \rho$ with some unitary $u \in M$ and $\alpha \in \mathrm{Aut}(M)$, then $\rho^2$ would contain $\alpha$ thanks to the Frobenius reciprocity, a contradiction.
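The statistical-dimension bookkeeping in the proof of Proposition 6.2 is easy to confirm numerically. The short Python sketch below is only a sanity check of the arithmetic, not a proof. It assumes $d(\rho') = d(\rho)$ (both inclusions have the same index) and, for the $D_n^{(1)}$ case, the standard fact that a sector of dimension greater than 1 has dimension at least $\sqrt{2}$. The function names are illustrative.

```python
import math

# Dimensions quoted in the proof for the E_6 case.
D_RHO = (math.sqrt(3) + 1) / math.sqrt(2)
D_SIGMA1 = D_RHO
D_SIGMA2 = math.sqrt(2)


def e6_checks():
    """Return the dimension identities used in the E_6 argument as booleans."""
    d_mu2 = (D_RHO + D_SIGMA2) / D_RHO        # from [rho' mu_2] = [rho] + [sigma_2]
    d_theta = D_RHO * D_SIGMA2 - d_mu2        # from [rho'-bar sigma_2] = [mu_2] + [theta]
    return {
        "index_is_2_plus_sqrt3": math.isclose(D_RHO ** 2, 2 + math.sqrt(3)),
        "fusion_rule": math.isclose(D_RHO ** 3, 2 * D_RHO + D_SIGMA1 + D_SIGMA2),
        "d_mu2_is_sqrt3": math.isclose(d_mu2, math.sqrt(3)),
        "d_theta_is_1": math.isclose(d_theta, 1.0),
        "d_sigma2_differs_from_d_rho": not math.isclose(D_SIGMA2, D_RHO),
    }


def dn_forces_dimension_one():
    """Three sectors of total dimension 4 cannot all have dimension >= sqrt(2)."""
    return 3 * math.sqrt(2) > 4
```

Every entry of `e6_checks()` should evaluate to `True`; the last entry is the mismatch $d(\sigma_2) \ne d(\rho')$ that produces the contradiction in the proof.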
Acknowledgment

The research of the paper is supported in part by the Grant-in-Aid for Scientific Research, JSPS. The following two recent preprints contain some results related to the contents of this paper:
V. Ostrik, Module categories, weak Hopf algebras and modular invariants, math.QA/0111139.
P. Etingof, D. Nikshych, V. Ostrik, On fusion categories, math.QA/0203060.
References

[1] M. Asaeda and U. Haagerup, Exotic subfactors of finite depth with Jones index $(5 + \sqrt{13})/2$ and $(5 + \sqrt{17})/2$, Comm. Math. Phys. 202 (1999) 1–63.
[2] S. Baaj and G. Skandalis, Unitaires multiplicatifs et dualité pour les produits croisés de C∗-algèbres, Ann. Sci. École Norm. Sup. 26(4) (1993) 425–488.
[3] A. Connes, Periodic automorphisms of the hyperfinite factor of type II1, Acta Sci. Math. (Szeged) 39 (1977) 39–66.
[4] J. Cuntz, Regular actions of Hopf algebras on the C∗-algebra generated by a Hilbert space, in Operator Algebras, Mathematical Physics, and Low-dimensional Topology (Istanbul, 1991), Res. Notes Math. 5, A K Peters, Wellesley MA, 1993, pp. 87–100.
[5] M.-C. David, Paragroupe d'Adrian Ocneanu et algèbre de Kac, Pacific J. Math. 172 (1996) 331–363.
[6] A. A. Davydov, Galois algebras and monoidal functors between categories of representations of finite groups, J. Algebra 244 (2001) 273–301.
[7] S. Doplicher and J. E. Roberts, A new duality theory for compact groups, Invent. Math. 98 (1989) 157–218.
[8] S. Doplicher and J. E. Roberts, Endomorphisms of C∗-algebras, cross products and duality for compact groups, Ann. of Math. 130 (1989) 75–119.
[9] P. Etingof and S. Gelaki, Isocategorical groups, Internat. Math. Res. Notices (2001), (2), 59–76.
[10] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Oxford Math. Monographs, Oxford Science Publications, The Clarendon Press, Oxford University Press, New York, 1998.
[11] M. Enock and J.-M. Schwartz, Kac Algebras and Duality of Locally Compact Groups, Springer-Verlag, Berlin, 1992.
[12] N. Habegger, V. F. R. Jones, O. Pino Oritz and J. Ratcliffe, Relative cohomology of groups, Comment. Math. Helvetici 59 (1984) 149–164.
[13] Jeong Hee Hong, Subfactors with principal graph $E_6^{(1)}$, Acta Appl. Math. 40 (1995) 255–264.
[14] D. Handelman, Representation rings as invariants for compact groups and limit ratio theorems for them, Internat. J. Math. 4 (1993) 59–88.
[15] M. Izumi, Application of fusion rules to classification of subfactors, Publ. Res. Inst. Math. Sci. 27 (1991) 953–994.
[16] M. Izumi, Goldman's type theorems in index theory, in Operator Algebras and Quantum Field Theory (Rome, 1996), International Press, Cambridge MA, 1997, pp. 249–269.
[17] M. Izumi, Subalgebras of infinite C∗-algebras with finite Watatani indices II. Cuntz–Krieger algebras, Duke Math. J. 91 (1998) 409–461.
[18] M. Izumi, The structure of sectors associated with Longo–Rehren inclusions II. Examples, Rev. Math. Phys. 13 (2001) 603–674.
[19] M. Izumi and H. Kosaki, Kac algebras arising from composition of subfactors: General theory and classification, Memoirs Amer. Math. Soc. 158 (750) (2002).
[20] V. F. R. Jones, Actions of finite groups on the hyperfinite type II1 factor, Memoirs Amer. Math. Soc. 28 (237) (1980).
[21] H. Kosaki, Extension of Jones' theory on index to arbitrary factors, J. Funct. Anal. 66 (1986) 123–140.
[22] H. Kosaki and S. Yamagami, Irreducible bimodules associated with crossed product algebras, Internat. J. Math. 3 (1992) 661–676.
[23] R. Longo, Index of subfactors and statistics of quantum fields I and II, Comm. Math. Phys. 126 (1989) 217–247 and 130 (1990) 285–309.
[24] R. Longo, A duality for Hopf algebras and for subfactors I, Comm. Math. Phys. 159 (1994) 133–150.
[25] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[26] R. Longo and J. E. Roberts, A theory of dimension, K-Theory 11 (1997) 103–159.
[27] Toshihiko Masuda, An analogue of Longo's canonical endomorphism for bimodule theory and its application to asymptotic inclusions, Internat. J. Math. 8 (1997) 249–265.
[28] M. Pimsner and S. Popa, Sur les sous-facteurs d'indice fini d'un facteur de type II1 ayant la propriété T, C. R. Acad. Sci. Paris Sér. I Math. 303 (1986) 359–361.
[29] S. Popa, Symmetric enveloping algebras, amenability and AFD properties for subfactors, Math. Res. Lett. 1 (1994) 409–425.
[30] R. Schaflitzel, II1-subfactors associated with the C∗-tensor category of a finite group, Pacific J. Math. 184 (1998) 333–348.
[31] C. E. Sutherland, Cohomology and extensions of von Neumann algebras II, Publ. Res. Inst. Math. Sci. 16 (1980) 135–174.
[32] M. Suzuki, Group Theory I, Grundlehren der Mathematischen Wissenschaften 247, Springer-Verlag, Berlin-New York, 1982.
[33] W. Szymański, Finite index subfactors and Hopf algebra crossed products, Proc. Amer. Math. Soc. 120 (1994) 519–528.
[34] D. Tambara and S. Yamagami, Tensor categories with fusion rules of self-duality for finite abelian groups, J. Algebra 209 (1998) 692–707.
[35] A. Wassermann, Ergodic actions of compact groups on operator algebras II. Classification of full multiplicity ergodic actions, Canad. J. Math. 40 (1988) 1482–1527.
[36] H. Wenzl, Hecke algebras of type An and subfactors, Invent. Math. 92 (1988) 349–383.
[37] S. Yamagami, Group symmetry in tensor categories and duality for orbifold, J. Pure Appl. Algebra 167 (2002) 83–128.
August 22, 2002 16:25 WSPC/148-RMP
00138
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 759–785 c World Scientific Publishing Company
MODULAR LOCALIZATION AND WIGNER PARTICLES
R. BRUNETTI
Dipartimento di Scienze Fisiche, Università di Napoli “Federico II”, Complesso Univ. Monte S. Angelo, I-80126 Napoli, Italy
[email protected]

D. GUIDO∗ and R. LONGO†
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica 1, I-00133 Roma, Italy
∗[email protected]
†[email protected]

Received 13 March 2002
Revised 12 June 2002

Dedicated to Huzihiro Araki on the occasion of his seventieth birthday

We propose a framework for the free field construction of algebras of local observables which uses as an input the Bisognano–Wichmann relations and a representation of the Poincaré group on the one-particle Hilbert space. The abstract real Hilbert subspace version of the Tomita–Takesaki theory enables us to bypass some limitations of the Wigner formalism by introducing an intrinsic spacetime localization. Our approach works also for continuous spin representations to which we associate a net of von Neumann algebras on spacelike cones with the Reeh–Schlieder property. The positivity of the energy in the representation turns out to be equivalent to the isotony of the net, in the spirit of Borchers theorem. Our procedure extends to other spacetimes homogeneous under a group of geometric transformations as in the case of conformal symmetries and of de Sitter spacetime.

Keywords: Local quantum physics, free fields, continuous spin, modular theory, induced representations.
1. Introduction

Although quantum physics represents one of the most innovative and drastic conceptual changes of view in modern science, the construction of quantum mechanics and quantum field theory has been fruitfully realized with the guidelines of the “classical analogue”. This is unsatisfactory, beyond the well known difficulties to construct a quantum field theory with interaction, if one takes the attitude that quantum field theory should stand on its own legs [28]. One point where the structure is selfconsistently dictated by quantum principles is the construction of local observable algebras associated with free fields. We may
August 22, 2002 16:25 WSPC/148-RMP
760
00138
R. Brunetti, D. Guido & R. Longo
summarize the construction in the following building blocks:

1. The one-particle Hilbert space.
2. Second quantization.
3. Localization.

Point 1 is E. Wigner’s cornerstone analysis of the irreducible unitary representations of (the cover of) the Poincaré group. As is well known, the positive energy representations are classified by the mass $m$ and the spin $s$ if $m > 0$. When $m = 0$ the stabilizer of a non-zero point is isomorphic to the Euclidean group $E(2)$, which is not compact. Irreducible representations of the Poincaré group induced by finite-dimensional representations of $E(2)$, namely by representations which are trivial on the translational part, are labelled by the helicity (a character on the one-dimensional torus). Irreducible representations of the Poincaré group induced by infinite-dimensional representations of $E(2)$ are historically called continuous spin representations (although properly speaking one should talk of helicity rather than spin). Usually one discards such representations because the corresponding particles have not been experimentally observed so far, but there is no conceptual a priori reason not to consider them. As we will explain below, the analysis in this paper naturally gets into the consideration of the case of continuous spin too.

Point 2 is well described by E. Nelson’s expression: “First quantization is a mystery, but second quantization is a functor”. Segal’s quantization is indeed an automatic procedure to get Weyl operators on the Fock space associated with vectors in the one-particle Hilbert space. In particular one gets a von Neumann algebra out of a real Hilbert subspace of the one-particle space: this is Araki’s lattice of von Neumann algebras [1, 2]. In this sense free field analysis is basically reduced to one-particle analysis.

In point 3 the basic principle of locality enters. The definition of local real Hilbert subspaces, hence of local von Neumann algebras, requires however one more step.
One possibility is to take the functions localized in a region of the configuration spacetime and then get the real Hilbert space in the momentum space. That this procedure is not entirely intrinsic may be seen from the fact that it is not possible to extend it to the case of continuous spin [45].

The purpose of this note is to show how a net of local algebras may be canonically associated with any positive energy (anti)-unitary representation of the proper Poincaré group. This construction relies on the idea of modular covariance, namely the identification of some one-parameter subgroups of the Poincaré group with some modular groups constructed via the Tomita–Takesaki theory. In this way a net of standard subspaces of the representation space may be canonically defined directly in the Wigner one-particle space. Then the second quantization functor produces the net of local von Neumann algebras. Such a net coincides with the one generated by the free Bose field of mass $m$ and spin $s$ when the corresponding irreducible representation of the proper Poincaré group is considered. This construction reveals the deep connection between the positivity of the energy and the isotony property
Modular Localization and Wigner Particles
of the net, and reflects the relation between the cyclicity of the vacuum for the intersection of two wedges and the existence of a PCT operator in terms of the Tomita modular conjugations, cf. [22]. Our analysis is related to [5, 35].

In other words, the Bisognano–Wichmann theorem tells us what the Tomita operator associated with a wedge region $W$ should be. Since it is a second quantization operator [15], it is determined by the operator $S_W$ on the one-particle Hilbert space $\mathcal{H}$. According to Bisognano–Wichmann,

$$S_W = J_W \Delta_W^{1/2} \tag{1.1}$$

is made up by the boost unitaries $\Delta_W^{it}$ and the PCT anti-unitary that are canonically associated with the given (anti)-unitary irreducible representation of the proper Poincaré group. We may then reverse the point of view and define $S_W$ by formula (1.1) in terms of the Poincaré group representation, hence define the real subspace

$$K_W \equiv \{\xi : S_W \xi = \xi\} .$$

This procedure is, of course, general and can be performed for any unitary representation of the Poincaré group, including those with continuous spin, where the construction of the corresponding Wightman fields is not possible [45]. The von Neumann algebra $R(W)$ is then defined by

$$R(W) = \{V(\xi) : \xi \in K_W\}'' ,$$

where $V$ is the representation of the Weyl commutation relations on the Fock space over $\mathcal{H}$. If $O$ is a region of the spacetime obtained as intersection of wedges, we may then define

$$R(O) \equiv \bigcap_{W \supset O} R(W)$$
(intersection over all wedges containing $O$). By a classical result the vacuum vector $\Omega$ is cyclic for $R(O)$ if $O$ is a double cone, for any irreducible representation of finite helicity. By an intrinsic analysis in terms of Poincaré group representations, we shall show that, in the case of continuous spin, $\Omega$ is cyclic for $R(O)$ if $O$ is a spacelike cone. But the Reeh–Schlieder property for double cones is not to be expected in this case [30].

Our analysis extends to spacetimes with a group of symmetries, where a suitable notion of “wedge region” can be defined; in particular, to any such wedge one would associate a one-parameter group of symmetries and a time-reversing reflection, both giving rise to modular objects in the unitary representations. The precise context is explained in Sec. 5, cf. also [12, 25] for related notions of wedge. Relevant situations are those given by the Minkowski spacetime (or the covering of its Dirac–Weyl compactification) with conformal symmetries, by the circle with Möbius transformations, and by the ($d$-dimensional) de Sitter spacetime with the isometry group $SO(d, 1)$.
Preliminary versions of this article have been circulating for a few years. The concept of modular localization has then found different applications in papers by B. Schroer and collaborators, see [16] and references therein.

2. Basic Preliminaries

Let us recall some basic geometrical and analytical facts. The most important geometrical setting we consider is Minkowski spacetime, but we shall abstract our procedure to extend it to more general spaces and to discuss some other examples. The Minkowski spacetime is the real manifold $\mathbb{R}^d \equiv \mathbb{R} \times \mathbb{R}^{d-1}$ of dimension $d \geq 2$, equipped with the metric

$$\langle x, y \rangle = x_0 y_0 - \sum_{i=1}^{d-1} x_i y_i , \qquad \forall\, x, y \in \mathbb{R}^d .$$

This makes Minkowski space a Lorentzian manifold and we consider the time orientation fixed once and for all. As a result the Minkowski spacetime is divided into subregions called spacelike, timelike and lightlike corresponding respectively to $\langle x, x \rangle < 0$, $\langle x, x \rangle > 0$, and $\langle x, x \rangle = 0$. By theorems of Zeeman, the group of diffeomorphisms of the Minkowski space preserving the causal structure is the semidirect product of $\mathbb{R} \times \mathcal{L}$ with the translations, where $\mathbb{R}$ acts as the group of dilations and $\mathcal{L}$ is the full homogeneous Lorentz group. On the other hand the group of isometries of the Minkowski space is the Poincaré group $\mathcal{P}$, the semidirect product $\mathcal{L} \ltimes \mathbb{R}^d$, where $\mathbb{R}^d$ corresponds to the spacetime translations:

$$(\Lambda, a) \circ (\Lambda', b) = (\Lambda \cdot \Lambda',\, a + \Lambda \cdot b) , \qquad \text{with } \Lambda, \Lambda' \in \mathcal{L},\ a, b \in \mathbb{R}^d .$$

The full Poincaré group $\mathcal{P}$ is simply connected, non connected, non compact, and perfect. It admits a splitting into connected components

$$\mathcal{P} = \mathcal{P}_+^\uparrow \cup \mathcal{P}_+^\downarrow \cup \mathcal{P}_-^\uparrow \cup \mathcal{P}_-^\downarrow ,$$

where the $\pm$ corresponds to $\det(g) = \pm 1$, namely selects those transformations which preserve or change the orientation, and the up/down arrow corresponds to $\langle x, gx \rangle \gtrless 0$, namely selects those transformations which preserve or change the time orientation. We shall be mainly concerned with the proper part of the Poincaré group, i.e. $\mathcal{P}_+ = \mathcal{P}_+^\uparrow \cup \mathcal{P}_+^\downarrow$.

Let then $\mathcal{P}_+ \ni g \to U(g)$ be a strongly continuous (anti-)unitary representation on the Hilbert space $\mathcal{H}$, i.e.,

$$U(g) \text{ is } \begin{cases} \text{unitary} & \text{if } g \in \mathcal{P}_+^\uparrow , \\ \text{antiunitary} & \text{if } g \in \mathcal{P}_+^\downarrow . \end{cases}$$
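The semidirect product law above is the composition rule for the affine maps $x \mapsto \Lambda x + a$. The following Python sketch checks this numerically in the simplest case $d = 2$, and also checks that a boost preserves the Minkowski form. It is only an illustration under stated assumptions: the helper names (`boost`, `compose`, `act`, `minkowski`) and the sample rapidities and vectors are hypothetical choices, not taken from the paper.

```python
import math

def boost(s):
    """2x2 Lorentz boost of rapidity s acting on (x0, x1) (illustrative helper)."""
    c, sh = math.cosh(s), math.sinh(s)
    return [[c, -sh], [-sh, c]]

def mat_vec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def compose(g, h):
    """Group law (Lambda, a) o (Lambda', b) = (Lambda Lambda', a + Lambda b)."""
    (lam, a), (lam_p, b) = g, h
    lam_b = mat_vec(lam, b)
    return (mat_mul(lam, lam_p), [a[0] + lam_b[0], a[1] + lam_b[1]])

def act(g, x):
    """Affine action x -> Lambda x + a of a Poincare element on a point."""
    lam, a = g
    lam_x = mat_vec(lam, x)
    return [lam_x[0] + a[0], lam_x[1] + a[1]]

def minkowski(x, y):
    """Minkowski form <x, y> = x0 y0 - x1 y1 in d = 2."""
    return x[0] * y[0] - x[1] * y[1]

def close(p, q):
    return math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-9)

# Arbitrary sample elements and points (assumptions of this sketch).
g = (boost(0.7), [1.0, -2.0])
h = (boost(-1.3), [0.5, 3.0])
x, y = [2.0, 5.0], [-1.0, 0.25]

# Acting with the product equals acting with h first and then with g.
composition_ok = all(close(p, q)
                     for p, q in zip(act(compose(g, h), x), act(g, act(h, x))))
# The homogeneous part preserves the Minkowski form.
form_ok = close(minkowski(mat_vec(g[0], x), mat_vec(g[0], y)), minkowski(x, y))
print(composition_ok, form_ok)
```

The check only probes the identity component in two dimensions; reflections and higher dimensions would need larger matrices but the same composition rule.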
We select now a particular class of causally complete subregions in Minkowski spacetime which are left globally invariant by suitable one-parameter velocity transformations. It is traditional to call them wedge regions and we denote the set of wedges by $\mathcal{W}$. As usual, $W'$ denotes the causal complement of $W$. Each wedge is a Poincaré transform of the wedge $W_1 = \{x \in \mathbb{R}^d : x_1 > |x_0|\}$. It is possible to assign to each wedge a one-parameter group of transformations $\Lambda_W$ and a time-reversing reflection $R_W$ satisfying

(a) Reflection covariance. For any $W \in \mathcal{W}$, $R_W$ maps $W$ onto $W'$, $R_W(W) = W'$ and $R_{gW} = g R_W g^{-1}$, $g \in \mathcal{P}_+$.

(b) $\Lambda$-covariance. For any $W \in \mathcal{W}$, $\Lambda_W(t)$ maps $W$ onto $W$, $\Lambda_W(t)(W) = W$ and

$$\Lambda_{gW}(t) = g \Lambda_W(t) g^{-1} , \quad t \in \mathbb{R},\ g \in \mathcal{P}_+^\uparrow , \qquad \Lambda_{gW}(t) = g \Lambda_W(-t) g^{-1} , \quad t \in \mathbb{R},\ g \in \mathcal{P}_+^\downarrow .$$

Indeed, since the action of $\mathcal{P}_+$ is transitive on the family $\mathcal{W}$, it is enough to choose $\Lambda_{W_1}$ and $R_{W_1}$ to determine the whole assignment. Moreover, setting $\mathcal{P}_+^\uparrow(W) := \{g \in \mathcal{P}_+^\uparrow : gW = W\}$, properties (a) and (b) imply that $\Lambda_W$ is in the center of $\mathcal{P}_+^\uparrow(W)$, while $R_W$ commutes with $\mathcal{P}_+^\uparrow(W)$.

$\Lambda_{W_1}$ is chosen as the (rescaled) boosts preserving $W_1$, namely

$$\Lambda_{W_1} : \mathbb{R} \ni t \to \Lambda_{W_1}(t) = \begin{pmatrix} \cosh(2\pi t) & -\sinh(2\pi t) & 0 & \cdots & 0 \\ -\sinh(2\pi t) & \cosh(2\pi t) & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \in \mathcal{L}_+^\uparrow .$$

The element $R_{W_1}$ in $\mathcal{P}_+$ is the reflection w.r.t. the edge of the wedge $W_1$, and is given by

$$R_{W_1}(x_0, x_1, \ldots, x_{d-1}) = (-x_0, -x_1, x_2, \ldots, x_{d-1}) .$$

Let us fix a unitary representation $U$ of $\mathcal{P}_+$ on a Hilbert space $\mathcal{H}$. With $W \in \mathcal{W}$ a wedge, let $H_W$ be the self-adjoint generator of $U(\Lambda_W(t))$ and define

$$\Delta_W := \exp(H_W) , \qquad J_W := U(R_W) .$$

Proposition 2.1. The following facts hold true:
(i) $\Delta_W$ is a densely defined, closed, positive non-singular linear operator on $\mathcal{H}$;
(ii) $J_W$ is an antiunitary operator on $\mathcal{H}$ and $J_W^2 = 1$;
(iii) $J_W \Delta_W J_W^{-1} = \Delta_W^{-1}$.

Proof. (i) and (ii) are obvious. Concerning (iii), let us observe that $R_W$ commutes with $\Lambda_W(t)$, which implies that $J_W \Delta_W^{it} J_W^{-1} = \Delta_W^{it}$, but from the anti-unitarity of $J_W$ we have that $J_W H_W J_W^{-1} = -H_W$, hence the thesis.
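The geometric inputs of this construction can be checked numerically in the $(x_0, x_1)$ plane, where the boost and the reflection act non-trivially. The Python sketch below is a hedged illustration with hypothetical helper names (`boost_w1`, `reflect`, `in_w1`, `in_w1_complement`) and arbitrarily chosen sample points; it verifies on these samples that $\Lambda_{W_1}(t)$ maps $W_1$ into itself, that $R_{W_1}$ maps $W_1$ into $W_1'$, that the reflection commutes with the boosts, and that $t \mapsto \Lambda_{W_1}(t)$ is a one-parameter group.

```python
import math

def boost_w1(t):
    """Rescaled boost Lambda_{W1}(t), restricted to the (x0, x1) plane."""
    c, s = math.cosh(2 * math.pi * t), math.sinh(2 * math.pi * t)
    return [[c, -s], [-s, c]]

def apply(m, x):
    return (m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1])

def reflect(x):
    """R_{W1}: (x0, x1) -> (-x0, -x1); transverse coordinates are untouched."""
    return (-x[0], -x[1])

def in_w1(x):
    """Membership in the wedge W1 = {x1 > |x0|}."""
    return x[1] > abs(x[0])

def in_w1_complement(x):
    """Membership in the causal complement W1' = {x1 < -|x0|}."""
    return x[1] < -abs(x[0])

def close(u, v):
    return all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-9)
               for p, q in zip(u, v))

# Sample points of W1 and boost parameters (assumptions of this sketch).
points = [(0.0, 1.0), (0.9, 1.0), (-2.5, 3.0), (0.3, 0.31)]
times = [-0.2, -0.05, 0.0, 0.1, 0.25]

wedge_invariant = all(in_w1(apply(boost_w1(t), x)) for t in times for x in points)
reflection_ok = all(in_w1_complement(reflect(x)) for x in points)
commute_ok = all(close(reflect(apply(boost_w1(t), x)), apply(boost_w1(t), reflect(x)))
                 for t in times for x in points)
group_ok = all(close(apply(boost_w1(s), apply(boost_w1(t), x)),
                     apply(boost_w1(s + t), x))
               for s in times for t in times for x in points)
print(wedge_invariant, reflection_ok, commute_ok, group_ok)
```

A finite sample cannot prove invariance of the wedge, of course; the exact statement follows from $x_1' \pm x_0' = e^{\mp 2\pi t}(x_1 \pm x_0)$.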
These properties allow us to introduce and discuss the properties of the following operator

$$S_W := J_W \Delta_W^{1/2} : \mathcal{H} \to \mathcal{H} ;$$

indeed, denoting by $R$ and $D$ the range and the domain, we have:

Proposition 2.2. $S_W$ is a densely defined, antilinear, closed operator on $\mathcal{H}$ with $R(S_W) = D(S_W)$ and $S_W^2 \subset 1$.

Proof. Density and closedness follow from the corresponding property of $\Delta_W$ in Proposition 2.1(i), antilinearity from the antilinearity of $J_W$. Now, $R(S_W) \subset D(S_W) \equiv D(\Delta_W^{1/2})$; indeed by Proposition 2.1(iii) we have that $J_W \Delta_W^{1/2} x = \Delta_W^{-1/2} J_W x \in D(\Delta_W^{1/2})$. But we get immediately that $S_W^2 = J_W \Delta_W^{1/2} J_W \Delta_W^{1/2} = \Delta_W^{-1/2} \Delta_W^{1/2} \subset 1$ and therefore if $x \in D(S_W)$ then $x = S_W(S_W x) \in R(S_W)$, so we can conclude.

Let us now define real subspaces of $\mathcal{H}$ associated with any $W \in \mathcal{W}$,

$$K_W = \{h \in D(S_W) : S_W h = h\} .$$

Recall that an $\mathbb{R}$-linear subspace $G$ in $\mathcal{H}$ is said to be standard whenever the following holds:

$$G \cap iG = \{0\} , \tag{2.1}$$

$$\overline{G + iG} = \mathcal{H} . \tag{2.2}$$

Proposition 2.3. Each $K_W$ is an $\mathbb{R}$-linear closed and standard subspace in $\mathcal{H}$, $S_W$ is the Tomita operator of $K_W$, namely $D(S_W) = K_W + iK_W$ and $S_W(h + ik) = h - ik$, $h, k \in K_W$. In particular we have:

$$\Delta_W^{it} K_W = K_W , \qquad J_W K_W = K_W' ,$$

where $K_W' := \{h \in \mathcal{H} : \operatorname{Im}(h, k) = 0 \ \forall\, k \in K_W\}$ is the symplectic complement of $K_W$.

Proof. The $\mathbb{R}$-linearity and subspace property of any $K_W$ is obvious. Note first that any $x \in D(S_W)$ can be written as $x = h + ik$ where $h$ respectively $k$ have the form

$$h = \frac{x + S_W x}{2} , \qquad k = \frac{-i(x - S_W x)}{2} .$$

By the preceding Proposition both terms belong to $K_W$. Hence $K_W + iK_W = D(S_W)$ which is dense, so (2.2) is fulfilled, and if $x \in K_W \cap iK_W$ then $x = S_W x$ and $ix = S_W ix = -iS_W x = -ix$, therefore $x \equiv 0$, and (2.1) holds too.
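The algebra behind Propositions 2.1 to 2.3 can be seen at work in a two-dimensional toy model. The Python sketch below is an illustration only and is not taken from the paper: it assumes $\mathcal{H} = \mathbb{C}^2$, a hypothetical modular operator $\Delta = \mathrm{diag}(e^{\lambda}, e^{-\lambda})$ and the antiunitary $J(z_1, z_2) = (\bar z_2, \bar z_1)$, chosen so that $J \Delta J = \Delta^{-1}$. It then checks $S^2 = 1$ for $S = J \Delta^{1/2}$, the decomposition $x = h + ik$ with $h, k$ fixed by $S$, and the identity $\|x\|^2 + \|Sx\|^2 = 2(\|h\|^2 + \|k\|^2)$.

```python
import math

LAM = 0.8  # hypothetical parameter: Delta = diag(e^LAM, e^-LAM)

def J(z):
    """Antiunitary involution of the toy model: swap components and conjugate."""
    return (z[1].conjugate(), z[0].conjugate())

def delta_pow(z, p):
    """Apply Delta^p, with Delta = diag(e^LAM, e^-LAM)."""
    return (math.exp(p * LAM) * z[0], math.exp(-p * LAM) * z[1])

def S(z):
    """Toy Tomita operator S = J Delta^{1/2} (antilinear)."""
    return J(delta_pow(z, 0.5))

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

def norm2(u):
    return abs(u[0]) ** 2 + abs(u[1]) ** 2

def close(u, v, tol=1e-12):
    return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol

x = (1.5 - 0.5j, -0.25 + 2.0j)      # arbitrary test vector
Sx = S(x)
h = scale(0.5, add(x, Sx))          # h = (x + S x) / 2
k = scale(-0.5j, add(x, scale(-1, Sx)))  # k = -i (x - S x) / 2

modular_relation_ok = close(J(delta_pow(J(x), 1.0)), delta_pow(x, -1.0))  # J Delta J = Delta^{-1}
involution_ok = close(S(Sx), x)                                           # S^2 = 1
fixed_points_ok = close(S(h), h) and close(S(k), k)                       # h, k lie in K
decomposition_ok = close(add(h, scale(1j, k)), x)                         # x = h + i k
norm_identity_ok = abs(norm2(x) + norm2(Sx) - 2 * (norm2(h) + norm2(k))) < 1e-12
print(modular_relation_ok, involution_ok, fixed_points_ok,
      decomposition_ok, norm_identity_ok)
```

In finite dimensions all operators are bounded, so the domain questions that make the statements above non-trivial do not arise; the sketch only illustrates the algebraic relations.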
The graph norm on $D(S_W)$ is, for $x = h + ik$ where $h, k \in K_W$,

$$\|h + ik\|_{S_W}^2 = \|h + ik\|^2 + \|S_W(h + ik)\|^2 = \|h + ik\|^2 + \|h - ik\|^2 = 2(\|h\|^2 + \|k\|^2) .$$

Therefore $D(S_W)$ with the graph norm is $K_W \oplus iK_W$, hence the closedness of $K_W$ follows from that of $S_W$.

Proposition 2.4. The representation $U$ acts covariantly on the family $\{K_W : W \in \mathcal{W}\}$, namely,

$$U(g) K_W = K_{gW} , \qquad g \in \mathcal{P}_+^\uparrow . \tag{2.3}$$

Proof. From properties (a) and (b) it follows that

$$U(g) \Delta_W^{it} U(g)^* = \Delta_{gW}^{it} \qquad \text{and} \qquad U(g) J_W U(g)^* = J_{gW} ,$$

which imply that $U(g) S_W U(g)^* = S_{gW}$, hence the thesis.

Note that Eq. (2.3) holds true also for $g \in \mathcal{P}_+^\downarrow$ due to Proposition 2.3 and the following theorem.

Theorem 2.5. Let $U$ be a (anti-)unitary representation of $\mathcal{P}_+$ and $W \mapsto K_W$ the above defined map. Then wedge duality holds, namely

$$K_{W'} = K_W' .$$

Moreover, the following are equivalent:
(i) The spaces $K_W$ are factors, namely $K_W \cap K_W' = \{0\}$.
(ii) The representation $U$ does not contain the trivial representation.
(iii) The net is irreducible, namely

$$\bigcap_{W \in \mathcal{W}} K_W = \{0\} .$$

Proof. Observe that $S_{K'} = S_K^* = J_K \Delta_K^{-1/2}$. Since $R_{W'} = R_W$ and $\Lambda_{W'}(t) = \Lambda_W(-t)$, we get the first statement. Let us prove the equivalences.

(ii) ⇒ (i). We have $K_W \cap K_W' = K_W \cap K_{W'} = \{x : U(\Lambda_W(t))x = x = J_W x,\ \forall\, t \in \mathbb{R}\}$. If such a space contains a non-zero $x$ then the matrix coefficient $(x, U(g)x)$ does not vanish at infinity. By the vanishing of the matrix coefficient theorem for semisimple Lie groups (cf. e.g. [46]) the representation must admit an invariant vector.
(i) ⇒ (iii). This follows directly by the first statement.

(iii) ⇒ (ii). Decompose $U|_{\mathcal{P}_+^\uparrow}$ as $U^0 \oplus I$ where $I$ is the trivial representation (with some multiplicity) and $U^0$ does not contain the trivial representation. The commutation relations between $\Delta^{it}$ and $J$ imply that any $J$ decomposes accordingly, namely has no anti-diagonal terms. Hence any space $K_W$ decomposes as $K_W = K_W^0 \oplus K_W^I$. We have $U^I(\Lambda_W(t)) = I$, and, given two wedges $W_1$, $W_2$, $J_{W_1}^I J_{W_2}^I = U^I(R_{W_1} R_{W_2}) = I$, namely $K_W^I$ is independent of $W$. Therefore

$$\bigcap_{W \in \mathcal{W}} K_W \supset 0 \oplus K^I .$$

Irreducibility implies $K^I = 0$, namely $U = U^0$.

Remark 2.6. Let us note that the construction of the net $K_W$ requires a representation of $\mathcal{P}_+$, or, equivalently, a representation of $\mathcal{P}_+^\uparrow$ and a PCT operator. More precisely we need an anti-unitary involution $J$ satisfying $J U(g) J = U(RgR)$, for some space-time reflection $R$. Such an involution does not necessarily exist in every representation. However, given a representation $U$ of $\mathcal{P}_+^\uparrow$ on $\mathcal{H}$, a reflection $R$ and an anti-unitary involution $C$ on $\mathcal{H}$, we may set

$$\tilde U(g) = \begin{pmatrix} U(g) & 0 \\ 0 & C U(RgR) C \end{pmatrix} , \quad g \in \mathcal{P}_+^\uparrow , \qquad \tilde U(R) = \begin{pmatrix} 0 & C \\ C & 0 \end{pmatrix} .$$

Clearly $\tilde U$ gives rise to a (anti)-unitary representation of $\mathcal{P}_+$ on $\mathcal{H} \oplus \mathcal{H}$.

Moreover, if $U|_{\mathcal{P}_+^\uparrow}$ is irreducible, then the anti-unitary involution $U(R_W)$ is unique up to a phase, that does not depend on $W$ by covariance. Hence the family $\{K_W\}$ depends only on $U|_{\mathcal{P}_+^\uparrow}$ up to unitary equivalence.

It is known (see e.g. [41]) that a PCT operator exists for an irreducible representation of $\mathcal{P}_+^\uparrow$ (on $\mathbb{R}^4$) if and only if the representation is induced by a self-conjugate representation of the stabilizer of a point, which is always the case, except for the finite non-zero helicity representations.

3. Inclusions of Real Subspaces and Wedges

Proposition 3.1. Let $K_1$, $K_2$ be standard subspaces of the Hilbert space $\mathcal{H}$, and assume that $U K_1 = K_2$, with $U$ unitary on $\mathcal{H}$. Then $K_2 \subset K_1$ iff $\Delta_1^{1/2} U^* \subset J_1 U^* J_1 \Delta_1^{1/2}$.

Proof. The following equivalences hold:

$$K_2 \subset K_1 \ \Leftrightarrow\ S_2 \subset S_1 \ \Leftrightarrow\ U J_1 \Delta_1^{1/2} U^* \subset J_1 \Delta_1^{1/2} \ \Leftrightarrow\ \Delta_1^{1/2} U^* \subset J_1 U^* J_1 \Delta_1^{1/2} .$$
The following theorem is a one-particle analogue of results in [8, 43]. It is related to the positive energy criterion in [4].

Theorem 3.2. Let $K$ be a standard space in the Hilbert space $\mathcal{H}$ and $U(a) = e^{iaH}$ a one-parameter group of unitaries on $\mathcal{H}$ satisfying

$$\Delta^{it} U(a) \Delta^{-it} = U(e^{\mp 2\pi t} a) , \tag{3.1}$$

$$J U(a) J = U(-a) , \tag{3.2}$$

where $J$ and $\Delta$ are the modular conjugation and operator associated with $K$. The following are equivalent:
(i) $U(a)K \subset K$ for $a \geq 0$;
(ii) $\pm H$ is positive.

Proof. By replacing $K$ with $K'$ it suffices to prove the case $H$ positive. The implication (i) ⇒ (ii) was proved in [43].

(ii) ⇒ (i). Let us observe that the spectrum of $H$ is acted upon by the group $\Delta^{it}$ and by $J$, with $\{0\}$ and $(0, \infty)$ being the invariant subsets. The corresponding eigenspaces are therefore invariant under the action of $\Delta^{it}$ and $J$; as a consequence $K$ is decomposed into a direct sum of the $H = 0$ and the $H > 0$ parts, respectively. Hence the thesis may be proven in the two cases separately. When $H = 0$ isotony trivially holds. In the following we assume that $H > 0$. By Proposition 3.1, together with Eq. (3.2), we get

$$U(a)K \subset K \ \Leftrightarrow\ \Delta^{1/2} U(a)^* \subset U(a) \Delta^{1/2} . \tag{3.3}$$

Let $K = \log H$ (it exists since $H > 0$), and $M$ the generator of $\Delta^{it/2\pi}$. It is easy to see that $e^{i\mu K}$ and $e^{i\lambda M}$ satisfy Weyl’s commutation relations, i.e.,

$$e^{i\lambda M} e^{i\mu K} = e^{i\lambda\mu} e^{i\mu K} e^{i\lambda M} .$$

According to von Neumann’s theorem every representation of Weyl’s commutation relations is equivalent to a multiple of the Heisenberg representation. Then the relation on the right hand side of (3.3) can be checked in just one representation. Because of the equivalence (3.3), it is enough to verify the inclusion $U(a)K \subset K$, $a > 0$, in one non-trivial representation. An example is provided by the one-particle space of the conformal field theory on the line corresponding to lowest weight representations of $PSL(2, \mathbb{R})$. Taking $K$ as the standard space associated with the right half-line $(0, \infty)$, and $U(t)$ as the translations, the relations in the hypothesis are verified [10], and the mentioned inclusion of subspaces holds by isotony.

Remark 3.3. Condition (3.2) is not needed for the implication (i) ⇒ (ii) in the above theorem, see [13]. However the condition is necessary for the converse implication. Indeed, given $J$, $\Delta$ and $U$ as in the theorem, and assuming positivity of the
generator of $U(a)$, one may choose a unitary $V$ which commutes with $\Delta$, anticommutes with $J$ and does not commute with $U(a)$, e.g. $V = (\Delta + i)(\Delta - i)^{-1}$, and then replace $J$ with $VJ$, the space $K$ being redefined accordingly. Now property (i) in the theorem above cannot hold, since, by the result of Borchers [8], it would imply condition (3.2) for the new $J$, against the hypothesis.

Let us denote by $H$ the cone in the Lie algebra of $\mathcal{P}_+^\uparrow$ consisting of the generators of future-pointing light-like or time-like translations. As is known, a unitary representation of $\mathcal{P}_+^\uparrow$ has positive energy if the corresponding self-adjoint generators are positive. Given two wedges $W_0 \subset W$, we shall say that $W_0$ is positively included in $W$ whenever $W_0$ can be obtained from $W$ via a suitable translation $\exp(a_0 h)$, $a_0 \geq 0$, such that $\pm h \in H$, where we denoted by $\exp$ the exponential map from the Lie algebra to the Lie group, and

$$\Lambda_W(t) \exp(ah) \Lambda_W(-t) = \exp(e^{\mp 2\pi t} ah) , \qquad R_W \exp(ah) R_W = exp(-ah) , \qquad a, t \in \mathbb{R} .$$
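For the reference wedge $W_1$ these relations can be made concrete in the $(x_0, x_1)$ plane: conjugating a translation by a vector $v$ with $\Lambda_{W_1}(t)$ gives the translation by $\Lambda_{W_1}(t)v$, and the light-like vectors $(1, 1)$ and $(1, -1)$ are eigenvectors of the boost. The Python sketch below is a hedged illustration with hypothetical names (`h_plus`, `h_minus`, `shift`) and arbitrary sample points. It checks the scaling factors $e^{\mp 2\pi t}$ and verifies on samples that translating $W_1$ by $a_0 h$ with $a_0 \geq 0$ stays inside $W_1$ for $h = (1, 1)$ and for $h = -(1, -1)$, while translating along $+(1, -1)$ can leave the wedge.

```python
import math

def boost_w1(t):
    """Rescaled boost Lambda_{W1}(t), restricted to the (x0, x1) plane."""
    c, s = math.cosh(2 * math.pi * t), math.sinh(2 * math.pi * t)
    return [[c, -s], [-s, c]]

def apply(m, x):
    return (m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1])

def in_w1(x):
    """Membership in the wedge W1 = {x1 > |x0|}."""
    return x[1] > abs(x[0])

def shift(x, a0, h):
    """Translate the point x by a0 * h."""
    return (x[0] + a0 * h[0], x[1] + a0 * h[1])

def close(u, v):
    return all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-9)
               for p, q in zip(u, v))

h_plus = (1.0, 1.0)    # future light-like, scaled by e^{-2 pi t}
h_minus = (1.0, -1.0)  # future light-like, scaled by e^{+2 pi t}
times = [-0.2, 0.0, 0.1, 0.25]

scaling_ok = all(
    close(apply(boost_w1(t), h_plus),
          tuple(math.exp(-2 * math.pi * t) * c for c in h_plus))
    and close(apply(boost_w1(t), h_minus),
              tuple(math.exp(2 * math.pi * t) * c for c in h_minus))
    for t in times)

# Sample points of W1 and non-negative translation parameters (assumptions).
points = [(0.0, 1.0), (0.9, 1.0), (-2.5, 3.0)]
shifts = [0.0, 0.5, 3.0]
minus_h_minus = (-h_minus[0], -h_minus[1])

inclusion_ok = all(in_w1(shift(x, a0, h_plus)) and in_w1(shift(x, a0, minus_h_minus))
                   for x in points for a0 in shifts)
# Translating along +h_minus moves some points of W1 out of W1.
counterexample = not in_w1(shift((0.0, 0.1), 1.0, h_minus))
print(scaling_ok, inclusion_ok, counterexample)
```

The two admissible directions correspond to the two signs in $\pm h \in H$ and to the two scaling behaviours $e^{\mp 2\pi t}$.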
The following is a well known geometric fact:

(c) Positive inclusion. Any inclusion of wedges is the composition of finitely many positive inclusions.

Theorem 3.4. Let $U$ be a (anti-)unitary representation of $\mathcal{P}_+$, $W_1 \subset W_2$ wedges. Then $K_{W_1} \subset K_{W_2}$ iff $U$ is a positive energy representation.

Proof. Follows immediately from (c) and Theorem 3.2.

Since causally complete convex regions are intersections of wedges, the map $W \to K_W$ extends to causally complete, convex regions $C$ via

$$K_C = \bigcap_{W \supset C} K_W , \tag{3.4}$$

and to general causally complete regions via

$$K_O = \bigvee_{C \subset O} K_C , \tag{3.5}$$

where $C$ are convex and causally complete. Let us observe that isotony for wedges implies that (3.4) is consistent with the original definition of $K_W$. Denote by $\mathcal{K}$ the family of all convex causally complete regions. Let us point out the following fact (see e.g. [40]):

(d) Wedge separation. For any space-like separated $O_1, O_2 \in \mathcal{K}$ there exists a wedge $W$ such that $O_1 \subset W$ and $O_2 \subset W'$.

Corollary 3.5. Let $U$ be a positive energy representation of $\mathcal{P}_+$. Then the map $O \to K_O$ is a local Poincaré covariant net of real vector spaces, i.e., isotony holds
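and if $O_1 \subset O_2'$ then $K_{O_1} \subset K_{O_2}'$. If $O$ is a convex causally complete region then Haag duality holds, namely $K_{O'} = K_O'$.

Proof. The first part of the statement holds by definition. Let us fix $O_0 \in \mathcal{K}$. If $O_0$ is a wedge, then duality has been proved in Theorem 2.5. If $O_0$ is not a wedge, then its space-like complement is not convex, hence, by (d), we have the following chain of identities:

$$K_{O_0'}' = \Bigg( \bigvee_{\substack{O \subset O_0' \\ O \in \mathcal{K}}} K_O \Bigg)' = \Bigg( \bigvee_{\substack{W' \subset O_0' \\ W \in \mathcal{W}}} K_{W'} \Bigg)' = \bigcap_{\substack{W' \subset O_0' \\ W \in \mathcal{W}}} K_{W'}' = \bigcap_{\substack{W \supset O_0 \\ W \in \mathcal{W}}} K_W = K_{O_0} .$$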
Therefore Haag duality holds.

Remark 3.6. (1) A net of von Neumann algebras may be obtained via second quantization:

$$R(O) = \{V(h) : h \in K_O\}'' ,$$

where $V(h)$ are the Weyl unitaries on the Bosonic Fock space $e^{\mathcal{H}}$. Weyl unitaries may be defined via

$$V(h) e^0 = e^{-\frac{1}{4}\|h\|^2} e^{\frac{i}{\sqrt{2}} h} , \qquad h \in \mathcal{H} ,$$

$$V(h) V(k) = e^{-\frac{i}{2} \operatorname{Im}(h, k)} V(h + k) , \qquad h, k \in \mathcal{H} ,$$

where the coherent vectors $e^h$ are defined by $e^h = \bigoplus_{n=0}^{\infty} \frac{h^{\otimes n}}{\sqrt{n!}}$. Coherent vectors turn out to form a total set in $e^{\mathcal{H}}$ (see e.g. [19, p. 32]), hence the $V(h)$’s are well defined unitaries. The standard property of $K_O$ is equivalent to the Reeh–Schlieder property for $R(O)$ (cf. [1, 15, 33]).

(2) If $U$ is the irreducible representation of mass $m$ and spin $s$ the map $O \to R(O)$ gives the net of local observable algebras for the free field of mass $m$ and spin $s$. In fact, for these nets the one-particle version of the Bisognano–Wichmann theorem holds, i.e.,

$$J_W = U(R_W) , \qquad \Delta_W^{it} = U(\Lambda_W(t)) ,$$

where $J_W$ and $\Delta_W$ are the Tomita operators of the real space $K_W$ of vectors localized in $W$. This means that $K_W$ is effectively reconstructed in terms of the representation $U$. Moreover, it was shown by Araki [2] that the map $O \to K_O$ is an isomorphism of complemented lattices

$$(\cap, \cup, \text{space-like complement}) \to (\cap, \vee, \text{symplectic complement})$$
if $O$ is connected, causally complete, with piecewise $C^1$ boundary. This shows that $K_O$ is also reconstructed in terms of the representation $U$.

Three questions arise for the subspaces of the described net $O \to K_O$: the standard property, the III$_1$ factor property (see [3]), namely the fact that the corresponding second quantization algebra is a type III$_1$ factor, and the intersection property (for convex causally complete regions), namely

$$C = \bigcap_{i \in I} W_i \ \Rightarrow\ K_C = \bigcap_{i \in I} K_{W_i} . \tag{3.6}$$

When wedge regions are concerned, we proved the standard property, the intersection property and the factor property for irreducible nets. The III$_1$ factor property (for irreducible nets) and the other properties are proved for space-like cones in Sec. 4.

4. Intersections and Cyclicity

Proposition 4.1. Let $K_j$, $j \in \mathcal{J}$, be a family of standard subspaces of a Hilbert space $\mathcal{H}$, $o$ a distinguished element of $\mathcal{J}$. Then $\bigcap_{j \in \mathcal{J}} K_j$ is standard if and only if the space

$$\{x \in \mathcal{H} : x \in D(S_j S_o)\ \&\ S_j S_o x = x,\ \forall\, j \in \mathcal{J}\} \tag{4.1}$$

is dense.

Proof. Since $K_o$ is standard, $\bigcap_{j \in \mathcal{J}} K_j$ is standard if and only if $\bigcap_{j \in \mathcal{J}} K_j + i \bigcap_{j \in \mathcal{J}} K_j$ is dense. We contend that the last subspace can be equivalently written as the expression in (4.1). Indeed, if $x \in K_j$ for all $j \in \mathcal{J}$, then $S_j x = S_o x = x$ for all $j \in \mathcal{J}$. Since range and domain of the $S$ operators coincide, $S_o x$ belongs to the domain of $S_j$ and $S_j S_o x = x$, $j \in \mathcal{J}$. Hence $x$ belongs to the space in (4.1). Such a space being complex linear, it contains also $i \bigcap_{j \in \mathcal{J}} K_j$. Conversely, if $x \in D(S_j S_o)$ and $S_j S_o x = x$ $\forall\, j \in \mathcal{J}$, then, $\forall\, j \in \mathcal{J}$, $x \in D(S_j)$, hence it can be written as $x = h_j + ik_j$ with $h_j, k_j \in K_j$, and $S_o x = S_j x$. Therefore we get $S_o x = S_j x = S_j(h_j + ik_j) = h_j - ik_j$, hence $\frac{1}{2}(x + S_o x) = h_j$ and $\frac{1}{2i}(x - S_o x) = k_j$, namely $h_j$ and $k_j$ are independent of $j$ and belong to $\bigcap_{j \in \mathcal{J}} K_j$.

Recalling the definition in Eq. (3.4), we get the following.

Proposition 4.2. Let $U$ be a (anti-)unitary representation of $\mathcal{P}_+$ on the Hilbert space $\mathcal{H}$, $C$ a convex, causally complete region, $W$ a wedge containing $C$, and $G(C) = \{g \in \mathcal{P}_+^\uparrow : gW \supset C\}$. Then $K_C$ is standard iff, denoting by $T(g)$ the operator $\Delta^{-1/2} U(R g^{-1} R) \Delta^{1/2}$, $g \in \mathcal{P}_+^\uparrow$, the space

$$\{x \in \mathcal{H} : x \in D(T(g))\ \&\ T(g)x = U(g^{-1})x,\ \forall\, g \in G(C)\} \tag{4.2}$$
is dense, where $\Delta$ and $R$ refer to the wedge $W$.

Also, given two subsets $G_1$, $G_2$ of $\mathcal{P}_+^\uparrow$, $\bigcap_{g \in G_1} K_{gW} = \bigcap_{g \in G_2} K_{gW}$ iff

$$\{x \in \mathcal{H} : x \in D(T(g))\ \&\ T(g)x = U(g^{-1})x,\ \forall\, g \in G_1\} = \{x \in \mathcal{H} : x \in D(T(g))\ \&\ T(g)x = U(g^{-1})x,\ \forall\, g \in G_2\} . \tag{4.3}$$

Proof. Since the action of $\mathcal{P}_+^\uparrow$ is transitive on the wedges, the first statement immediately follows by the previous proposition. The second statement follows by the proof of the previous proposition.

Now we may tackle the main questions concerning convex, causally complete regions in the Minkowski space in this approach, namely the standard property (2.1), (2.2), the III$_1$ factor property and the intersection property (3.6). Since local algebras (and local subspaces) are not defined in terms of local fields, the classical Reeh–Schlieder argument does not apply. However, Proposition 4.2 shows that the standardness for a given region (or family of regions) is a property of the representation $U$, hence group theoretic techniques may be applied. The intersection property instead has to do with the definition in Eq. (3.4). Though the local space of a given convex causally complete region $C$ is defined as the intersection of the spaces of all wedges containing it, just a few of them may be enough to determine $C$. Would the corresponding intersection of local spaces give rise to the same space? Again, because of the absence of local fields, the answer is not trivial, and the group theoretic approach may do the job.

Lemma 4.3. Let $U$ be a (anti-)unitary positive energy representation of $\mathcal{P}_+$ on the Hilbert space $\mathcal{H}$, $C$ a convex, causally complete region. Assume that the representation $U$ (restricted to $\mathcal{P}_+^\uparrow$) decomposes as $\int^{\oplus} U_\lambda \, d\mu(\lambda)$. Then $K_C^U$ is standard if and only if $K_C^{U_\lambda}$ is standard for $\mu$-almost all $\lambda$.

Given $W_j$, $j \in \mathcal{J}$, such that $C = \bigcap_{j \in \mathcal{J}} W_j$, then $K_C^U = \bigcap_{j \in \mathcal{J}} K_{W_j}^U$ if and only if $K_C^{U_\lambda} = \bigcap_{j \in \mathcal{J}} K_{W_j}^{U_\lambda}$ for $\mu$-almost all $\lambda$.

Proof. By Proposition 4.2, both properties depend only on $U|_{\mathcal{P}_+^\uparrow}$. The thesis follows by (4.2) and (4.3).

Theorem 4.4. Let $U$ be a (anti-)unitary positive energy representation of $\mathcal{P}_+$ on the Hilbert space $\mathcal{H}$, $C$ a spacelike cone. Then the standard property and the intersection property hold. If $U$ does not contain the trivial representation, then the type III$_1$ factor property holds too.

Proof. Let us prove the standard property. Clearly we may assume that the vertex of the space-like cone lies at the origin of the coordinates. Lemma 4.3 shows that it is enough to check the density of the space in (4.2) for all the irreducible positive energy representations. Since this property is known for the
positive mass representations and for the zero mass, finite helicity representations, we only have to verify it for the so called continuous spin representations.

Let us now denote by $F(C)$ the set of wedges containing $C$. Given a wedge in $F(C)$, we may consider the family of wedges parallel to the given one and still belonging to $F(C)$. The intersection of all such wedges is clearly a wedge in $F(C)$ whose edge contains the vertex of $C$, namely the origin. Because of isotony (Theorem 3.4),

$$\bigcap_{W \in F(C)} K_W = \bigcap_{W \in F_0(C)} K_W ,$$

where $F_0(C)$ denotes the subset consisting of wedges whose edge contains the origin. Then, fixing a wedge $W$ in $F_0(C)$ and setting $G_0(C) = \{g \in \mathcal{P}_+ : gW \in F_0(C)\}$, the complex span of the space $\bigcap_{W \in F_0(C)} K_W$ is given by

$$\{x \in \mathcal{H} : x \in D(T(g))\ \&\ T(g)x = U(g^{-1})x,\ \forall\, g \in G_0(C)\} , \tag{4.4}$$

namely only the Lorentz subgroup is involved. Therefore the standard property has only to be checked on the restriction to the Lorentz group of the given continuous spin representation $U$. Theorem A.1 concludes the proof.

Let us now prove the intersection property. Again by isotony, we may restrict to the intersection of wedges whose edge contains the origin. If $G_1$, $G_2$ are two subsets of $\mathcal{L}_+^\uparrow$ such that $\bigcap_{g \in G_1} gW = \bigcap_{g \in G_2} gW = C$, then the equality $\bigcap_{g \in G_1} K_{gW} = \bigcap_{g \in G_2} K_{gW}$ is equivalent to relation (4.3). Then the proof goes on as for the previous case.

We finally prove the III$_1$ factor property. It has been proved in [18] that if 1 is in the spectrum of $\Delta$, but not in the point spectrum, then the second quantization algebra is a type III$_1$ factor. Clearly the property $1 \in \sigma(\Delta) \setminus \sigma_p(\Delta)$ is stable under direct sums and quasi-equivalence. Then, by the proof of Theorem A.1, it is enough to show this property for the finite spin representations. Indeed this shows the property for the regular representation of $\mathcal{L}_+^\uparrow$, hence for the restriction to $\mathcal{L}_+^\uparrow$ of the continuous spin representations of $\mathcal{P}_+^\uparrow$, since they are quasi-equivalent to the regular representation. Now we follow [17], where it is shown (Theorem 3.6) that $\Delta$ can be written as a functional calculus of a selfadjoint operator $B$ via the function $\frac{t+1}{t-1}$, showing in particular that $1 \notin \sigma_p(\Delta)$. Moreover, using the explicit formula for $B$, one concludes that $B$ is unbounded, hence 1 is in the spectrum of $\Delta$.

Now we prove the standard property for light-like strips, namely for regions given by $W \cap (W' + a)$, where $a$ is a lightlike vector parallel to $W$, namely such that $W + a \subset W$. Such property is motivated by the proof of the spin and statistics property for spacetimes with bifurcated Killing horizon given in [25, Sec. 4.2].

Theorem 4.5. Let $W$ and $a$ be as above and assume the spacetime dimension is $d \neq 2$. For any positive energy (anti-)unitary representation of $\mathcal{P}_+$, $K_{W \cap (W' + a)}$ is standard.
Proof. Clearly any wedge containing $L = W \cap (W' + a)$ either contains $W$ or contains $W' + a$. Then, by isotony (Theorem 3.4),

$$\bigcap_{\tilde W \supset L} K_{\tilde W} = K_W \cap K_{W' + a} .$$

Let us assume for the moment that $U$ is the trivial representation. Then translations and boosts act trivially, namely $K_{W' + a} = K_{W'} = K_W$, since $S_{W'} = S_W^* = S_W$. Therefore we may assume that $U$ does not contain the trivial representation, namely $U$ does not have invariant vectors. Since $d \neq 2$, the vanishing of the matrix coefficient theorem applies (cf. e.g. [46, Proposition 2.3.5]), hence the spectrum of the generator of any light-like translation is strictly positive, i.e. zero is not an eigenvalue. As explained before, the standard property is equivalent to the density of the space

$$\{x \in D(\Delta^{1/2} U(\tau(a)) \Delta^{1/2}) : U(\tau(a)) \Delta^{1/2} U(\tau(a)) \Delta^{1/2} x = x\} , \tag{4.5}$$

where $\tau(a)$ denotes the translation by $a$. This property clearly depends only on the restriction of the representation of the Poincaré group to the subgroup $\mathcal{P}_1$ generated by boosts and light-like translations with strictly positive generator (relative to the wedge $W$). As the logarithm of the generator of translations and the generator of the boosts give rise to (and are determined by) a representation of the CCR in one dimension, the strictly positive energy representations of $\mathcal{P}_1$ have a simple structure: they are always a multiple of the unique irreducible representation. Therefore the density of the space in (4.5) holds either always or never, and hence can be checked in the irreducible case. But this is the case of the current algebra on the circle, where cyclicity holds by conformal covariance.

Now we show that some form of the intersection property holds for double cones too. Let $C$ be a diamond generated by a relatively open convex subregion $\Omega$ of some space-like hyperplane $G$. For any $\xi \in \partial\Omega$, let us consider the family $F(\xi)$ of the half-spaces in $G$ tangent to $\Omega$ at $\xi$, namely the half-spaces containing $\Omega$ and whose boundary contains $\xi$. Being parametrized by the normal vectors at $\xi$, they have a linear structure, and clearly form a closed convex set. Let us denote by $F_*(\xi)$ its extreme points, and by $F_*(\Omega)$ the union $\bigcup_{\xi \in \partial\Omega} F_*(\xi)$. Clearly $\Omega = \bigcap_{h \in F_*(\Omega)} h$. We shall call $F_*(\Omega)$ the minimal family for $\Omega$. Analogously, denoting with $W_h$ the wedge generated by the space-like half-space $h$, we shall call $F_*(C) = \{W_h : h \in F_*(\Omega)\}$ the minimal family for $C$. Clearly when $C$ is the intersection of a finite number of wedges $W_i$, the minimal family $F_*(C)$ consists only of (some) $W_i$.

Theorem 4.6. Let $C$ be a diamond generated by a relatively open convex subregion $\Omega$ of some space-like hyperplane $G$, $F_*(C)$ its minimal family. Then

$$K_C = \bigcap_{W \in F_*(C)} K_W .$$
Proof. Let $W$ be a wedge containing $C$. Then $W \cap G \supset \Omega$. Since $W \cap G$ is a cone given by the intersection of (at most) two half-spaces $h_1, h_2$ of $G$, then, by the intersection property for space-like cones, one gets $K_W \supset K_{W_{h_1} \cap W_{h_2}} \supset K_C$ and $K_{W_{h_1}} \cap K_{W_{h_2}} = K_{W_{h_1} \cap W_{h_2}}$. Therefore

\[ K_C = \bigcap_{h \in F(\Omega)} K_{W_h} . \]

Then, again by the intersection property for space-like cones, for any point $\xi \in \partial\Omega$, we may replace $\bigcap_{h \in F(\xi)} K_{W_h}$ with $\bigcap_{h \in F_*(\xi)} K_{W_h}$, since $\bigcap_{h \in F(\xi)} W_h$ is a spacelike cone, and the proof is completed.

Theorem 4.7. The following pair of classes can be put in one-to-one correspondence:

(i) Positive energy representations of $P_+$.
(ii) Local nets of closed real vector spaces on $\mathcal{K}$ satisfying modular covariance, namely $\Delta_W^{it} K_O = K_{\Lambda_W(t)O}$, and standard property for the space-like cones.

Proof. The map from (i) to (ii) has been illustrated above. The inverse map has been constructed in [11], getting a representation of the universal covering of $P_+^\uparrow$. It has been shown in [22] that such representation is indeed a representation of $P_+^\uparrow$, and extends to a representation of $P_+$.

Remark 4.8. Let $U$ be a unitary representation of $P_+^\uparrow$ on a Hilbert space $H$ which is a finite direct sum of irreducible representations each with strictly positive mass. As recently shown in [36], if $F : W \in \mathcal{W} \to F_W$ is a net of standard real subspaces of $H$ and $U$ acts covariantly on $F$, namely $U(g)F_W = F_{gW}$, then $F_W$ is the standard subspace associated with $W$ and $U$.
5. Free Nets on Different Spacetimes

In this section we discuss various extensions of the previous construction to different spacetimes. We begin with a general setting. Let $M$ be a globally hyperbolic spacetime, $G$ a (Lie) group of transformations acting on it (e.g. isometries, or conformal transformations), $G_+$ the subgroup of orientation preserving transformations, $G^\uparrow$ the subgroup of time-preserving transformations, $G_+^\uparrow$ their intersection. Assume it is possible to choose a triple $(\mathcal{W}, R, \Lambda)$ where $\mathcal{W}$ is a family of open, causally complete subregions, called wedges, stable under the action of $G_+$, $R : W \to R_W$ is a map from $\mathcal{W}$ to time-reversing reflections in $G_+$, $\Lambda : W \to \Lambda_W$ is a map from $\mathcal{W}$ to one-parameter subgroups of $G_+^\uparrow$ satisfying the following properties:

(a) Reflection covariance. For any $W \in \mathcal{W}$, $R_W$ maps $\mathcal{W}$ onto $\mathcal{W}$, $R_W(W) = W'$ and $R_{gW} = g R_W g^{-1}$, $g \in G_+$.
(b) Λ-covariance. For any $W \in \mathcal{W}$, $\Lambda_W(t)$ maps $\mathcal{W}$ onto $\mathcal{W}$, $\Lambda_W(t)(W) = W$ and

\[ \Lambda_{gW}(t) = g \Lambda_W(t) g^{-1} , \quad t \in \mathbb{R},\ g \in G_+^\uparrow , \]
\[ \Lambda_{gW}(t) = g \Lambda_W(-t) g^{-1} , \quad t \in \mathbb{R},\ g \in G_+^\downarrow . \]

Remark 5.1. Properties (a) and (b) imply that $R_{W'} = R_W$ and $\Lambda_{W'}(t) = \Lambda_W(-t)$. Moreover, if $gW = W$, then $g$ commutes with $\Lambda_W$ and $R_W$, namely $\Lambda_W$ belongs to the center of the stabilizer $G_+^\uparrow(W) = \{g \in G_+^\uparrow : gW = W\}$ and $R_W$ commutes with $G_+^\uparrow(W)$. If $G_+^\uparrow$ acts transitively on $\mathcal{W}$, then the assignments $W \to \Lambda_W$, $W \to R_W$ are determined by the choice of a one parameter subgroup in the center of the stabilizer of one wedge $W_0$, and by the choice of a reflection commuting with $G_+^\uparrow(W_0)$. In many cases, e.g. Minkowski spacetime with Poincaré symmetry in dimension $d \neq 3$, or Minkowski with conformal symmetry in any dimension, or de Sitter spacetime in dimension $d \neq 3$, the center of $G_+^\uparrow(W)$ is one-dimensional, hence $\Lambda_W$ is fixed up to rescaling.

Given a (anti)-unitary representation $U$ of $G_+$, we can reproduce the analysis in Sec. 2: Set $\Delta_W = U(\Lambda_W(-i))$, $J_W = U(R_W)$ (the above normalization at $t = -i$ is conventional, as we could arbitrarily rescale $\Lambda_W$. The positive energy condition, see below, will fix the normalization). Clearly $J_W$ is a self-adjoint antiunitary, and $\Delta_W$ is strictly positive. By (a) and (b), $R_W = R_{\Lambda_W(t)W} = \Lambda_W(t) R_W \Lambda_W(-t)$, namely $R_W$ and $\Lambda_W$ commute. Therefore $J_W \Delta_W J_W = \Delta_W^{-1}$, and, setting $S_W = J_W \Delta_W^{1/2}$, we easily obtain that $S_W$ is closed, densely defined and satisfies $S_W^2 \subset I$. Set $K_W = \{\xi \in D(S_W) : S_W \xi = \xi\}$. It turns out that $K_W$ is a standard space, and that the representation $U$ acts geometrically on the family: $U(g)K_W = K_{gW}$. Moreover, essential duality holds: $K_W' = K_{W'}$.

Let $\mathcal{B}$ be the family of regions that are intersections of wedges, and set $K_B = \bigcap_{W \supset B} K_W$, $B \in \mathcal{B}$. If we assume $\mathcal{W}$ to be a subbase for the topology of $M$, then $\mathcal{B}$ forms a base, hence any open set $O$ is a union of elements in $\mathcal{B}$. Then we may define $K_O = \bigvee_{B \subset O} K_B$. $G$-covariance follows as in Sec. 2.

Proposition 5.2. The following properties hold:

(i) $\{K_W , W \in \mathcal{W}\}$ is a covariant family of real subspaces, namely $K_{gW} = U(g)K_W$, $g \in G_+$, moreover $K_W$ is standard and $K_{W'} = K_W'$.
(ii) $\{K_B , B \in \mathcal{B}\}$ is a covariant net of real subspaces, namely $B_1 \subset B_2$ implies $K_{B_1} \subset K_{B_2}$, and $K_{gB} = U(g)K_B$, $g \in G_+$.

Remark 5.3. As in Remark 2.6, giving a representation of $G_+$ is equivalent to giving a representation of $G_+^\uparrow$ together with some sort of PCT, namely an antiunitary involution $J$ satisfying $JU(g)J = U(RgR)$, for some reflection $R$.

Notice that the net $B \to K_B$, $B \in \mathcal{B}$, is not necessarily local. Also, it is not necessarily true that $K_B = K_W$ if $B = W$, namely it may happen that $\bigcap_{W \supset W_0} K_W$ is strictly smaller than $K_{W_0}$, since we did not prove wedge isotony. We need further assumptions to solve these two problems.
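The construction $S_W = J_W \Delta_W^{1/2}$, $K_W = \{\xi : S_W \xi = \xi\}$ used above can be tried out in a finite-dimensional toy model. The following is only a sketch under assumptions that are not in the text: the Hilbert space is $\mathbb{C}^2$, $\Delta = \mathrm{diag}(\lambda, 1/\lambda)$ with the hypothetical value $\lambda = 4$, and $J$ swaps the two coordinates and conjugates them. All function names are illustrative.

```python
import math

LAM = 4.0  # hypothetical eigenvalue of Delta; any positive number would do


def delta_pow(v, s):
    """Apply Delta^s, with Delta = diag(LAM, 1/LAM) acting on C^2."""
    return (LAM ** s * v[0], LAM ** (-s) * v[1])


def J(v):
    """Antiunitary involution: swap the two coordinates and conjugate."""
    return (v[1].conjugate(), v[0].conjugate())


def S(v):
    """Tomita-type operator S = J Delta^{1/2}."""
    return J(delta_pow(v, 0.5))


def k_vector(z):
    """Element of K = {v : S v = v} labelled by a complex number z."""
    return (z, math.sqrt(LAM) * z.conjugate())


def decompose(v):
    """Write v = k_vector(z) + i * k_vector(w); returns the pair (z, w)."""
    a, c = v[0], v[1] / math.sqrt(LAM)
    return ((a + c.conjugate()) / 2, (a - c.conjugate()) / 2j)


def close(u, v, tol=1e-12):
    """Componentwise comparison of two vectors of C^2."""
    return all(abs(x - y) < tol for x, y in zip(u, v))
```

In this model $J \Delta J = \Delta^{-1}$ and $S^2 = I$ can be checked directly, and `decompose` exhibits every vector of $\mathbb{C}^2$ as an element of $K + iK$. Since the formula for $(z, w)$ is forced, the decomposition is unique, so $K \cap iK = \{0\}$ and $K$ is standard in this toy case.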
Let $H$ be a convex cone in the Lie algebra of $G$, and let us denote by $\exp$ the exponential map from the Lie algebra to $G_+^\uparrow$. If $W_0 \subset W$ are wedges, we shall say that $W_0$ is positively included in $W$ w.r.t. the cone $H$ if there is a one parameter subgroup $\exp(ah)$ of $G_+^\uparrow$, depending on $W_0$ and $W$, with $\exp(a_0 h)W = W_0$ for some $a_0 \geq 0$, such that $\pm h \in H$, and

\[ \Lambda_W(t) \exp(ah) \Lambda_W(-t) = \exp(e^{\mp 2\pi t} ah) , \qquad R_W \exp(ah) R_W = \exp(-ah) , \qquad a, t \in \mathbb{R} . \]
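The two commutation relations above are those of the "ax + b" group together with a reflection. As a minimal sketch, assuming the upper sign (the case $h \in H$) and modelling the one-parameter groups by affine maps of the real line, one can check them numerically. The pair encoding `(alpha, beta)` for $x \mapsto \alpha x + \beta$ and all function names are hypothetical choices made for this illustration.

```python
import math


def compose(f, g):
    """Composition f∘g of affine maps x -> alpha*x + beta, stored as (alpha, beta)."""
    return (f[0] * g[0], f[0] * g[1] + f[1])


def dilation(t):
    """Model of Lambda_W(t): x -> exp(-2*pi*t) * x."""
    return (math.exp(-2 * math.pi * t), 0.0)


def translation(a):
    """Model of exp(a h): x -> x + a."""
    return (1.0, a)


# Model of the reflection R_W: x -> -x.
REFLECTION = (-1.0, 0.0)


def conj_by_dilation(t, a):
    """Lambda_W(t) exp(a h) Lambda_W(-t) in the affine model."""
    return compose(dilation(t), compose(translation(a), dilation(-t)))
```

With these conventions `conj_by_dilation(t, a)` is the translation by $e^{-2\pi t} a$, and conjugating `translation(a)` by `REFLECTION` gives the translation by $-a$, in agreement with the displayed relations.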
Let us assume the following:

(c) Positive inclusion. Any inclusion of wedges is the composition of finitely many positive inclusions.
(d) Wedge separation. For any space-like separated $O_1, O_2 \in \mathcal{B}$ there exists a wedge $W$ such that $O_1 \subset W$ and $O_2 \subset W'$.

We shall say that a (anti)-unitary representation of $G_+$ is positive if, whenever $h \in H$, the self-adjoint generators in the representation space of the one-parameter groups $U(\exp(ah))$ are positive.

Theorem 5.4. Assume the triple $(\mathcal{W}, R, \Lambda)$ satisfies assumptions (a), (b), (c), (d), and let $U$ be a (anti)-unitary positive representation of $G_+$. Then wedge isotony holds, namely $W_1 \subset W_2$ implies $K_{W_1} \subset K_{W_2}$, the net $B \to K_B$, $B \in \mathcal{B}$, is local and extends the net $W \to K_W$, $W \in \mathcal{W}$. Moreover, for any $B \in \mathcal{B}$ such that $B' \notin \mathcal{B}$, Haag duality holds:

\[ K_B' = K_{B'} . \]

If $G_+^\uparrow$ is a simple Lie group with finite center and $U$ does not contain the trivial representation, the net is irreducible. If moreover the closure of $\{\Lambda_W(t) : t \in \mathbb{R}\}$ in $G_+^\uparrow$ is not compact, the local space $K_W$ is a factor.

Proof. Wedge isotony follows by property (c), locality of $B \mapsto K_B$ follows by property (d). If $B_0' \notin \mathcal{B}$ then $K_{B_0'} = \bigcup_{B \subset B_0'} K_B$, hence Haag duality follows as in Corollary 3.5. The assumption of non-compactness for the closure of $\Lambda(t)$ allows us to use the vanishing of the matrix coefficients theorem as in Theorem 2.5 to prove the factoriality.

Let us observe that, if the positivity in the previous statement is a non-trivial requirement, namely if there are wedges included one in another, then $\Lambda$ and $\exp(ah)$ give rise to a representation of the $ax + b$ group, namely the requirement that the closure of $\{\Lambda_W(t) : t \in \mathbb{R}\}$ in $G_+^\uparrow$ is not compact is automatically satisfied. For the same reason, also the assumption on the finiteness of the center is unnecessary (cf. [23]).

Let us discuss a toy example satisfying the general scheme presented above, where the last statement of the previous theorem does not apply. Let $M = S^2 \times \mathbb{R}$, where $S^2$ is the unit sphere in $\mathbb{R}^3$, with the induced Lorentzian metric, and
$G_+ = SO(3) \times \mathbb{R} \rtimes \mathbb{Z}_2$, where $SO(3)$ acts on the sphere, $\mathbb{R}$ gives time-translations, and the $\mathbb{Z}_2$ element implements the orientation preserving space-time reflection (PT transformation). We also set $\mathcal{W}$ to be the family of diamonds with base a hemisphere (at time $t$). Clearly the stabilizer $G_+^\uparrow(W)$ is one-dimensional, and, since two hemispheres included one in the other coincide, no positivity is needed. Therefore the parametrization of the groups $\Lambda_W$ may be fixed arbitrarily. Also, the action of $G_+$ is transitive, hence we may fix a wedge $W_0$ as the causal completion of $\{(t, x, y, z) : x^2 + y^2 + z^2 = 1,\ t = 0,\ z > 0\}$ and assign

\[ \Lambda_{W_0}(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \qquad R_{W_0}(t, x, y, z) = (-t, x, y, -z) . \]

In any faithful irreducible representation of $G_+^\uparrow$, the generator of $\Lambda_{W_0}$ has a one-dimensional kernel, therefore the corresponding space $K_{W_0}$ is not a factor. More precisely it is a tensor product of a continuous abelian von Neumann algebra and of a type $I_\infty$ factor (cf. [18]). However the net is irreducible.

In such a generality, it is not possible to prove important properties, such as the standard property, the intersection property, or the factor property, for elements of $\mathcal{B}$. We now discuss this structure in specific spacetimes.

5.1. Conformal group

In the following the conformal group on the Minkowski spacetime $M$ of dimension $d \geq 1$ (with $M = \mathbb{R}$ if $d = 1$) is the group generated by the Poincaré group ("ax+b" group if $d = 1$) and the relativistic ray inversion map. The conformal group is isomorphic to $PSO(d, 2)$. If $d > 2$, this is the group of local diffeomorphisms (defined out of meager sets) which preserve the metric tensor up to non-vanishing functions; its universal covering acts globally and transitively on the universal covering of the Dirac–Weyl compactification of $M$. If $d = 2$, the Dirac–Weyl compactification is a two-torus, and only the time-covering is considered, namely the conformal group acts on the cylinder spacetime with non compact time curves. In the $d = 1$ case the identity component of $PSO(d, 2)$ is isomorphic to $PSL(2, \mathbb{R})$ and we consider its action on $S^1$. For details see [10].

If $d \geq 2$ a wedge is any conformal transform of (the lift of) a wedge in the Minkowski space, in particular Poincaré-wedges, double cones, future cones and past cones give rise to conformal wedges. The maps $R$ and $\Lambda$ are here the lifts of those defined on Minkowski spacetime. If $d = 1$ wedges are proper intervals, the reflection associated with the upper semi-circle maps $z \in S^1$ to its complex conjugate $\bar{z}$, and $\Lambda$ is the (lift of the) one parameter subgroup of $PSL(2, \mathbb{R})$ of (Cayley transformed) dilations.

Then, we may consider (anti-)unitary representations of these groups, and check that properties (a), (b) and (c) hold true, the cone $H$ being generated by the Lie
algebra generators of lightlike translations and their conjugates under the action of the conformal group. In this case wedges form already a base for the topology, so it is enough to consider the net on wedges. Hence assumption (d) is not needed. An analogue of Theorem 5.4 holds true here.

Theorem 5.5. Let the spacetime be the universal covering of the compactified Minkowski space $S^{d-1} \times \mathbb{R}$ for $d \geq 2$ and $S^1$ if $d = 1$. Let $U$ be a (anti-)unitary positive representation of the universal cover of $PSO(d, 2)$ which does not contain the trivial representation. Then, $\mathcal{W} \ni W \to K_W$ is a local conformal net for which Haag duality holds. The standard and the III$_1$ factor properties are satisfied. Moreover, the family is irreducible.

Proof. Haag duality, conformal covariance and standard property follow by Proposition 5.2(i), wedge-isotony, factor property and irreducibility follow by Theorem 5.4, and locality follows by wedge isotony and wedge-duality. III$_1$ factor property follows as in [23, Proposition 1.2].

The one-dimensional conformal case is extensively studied in [26] and we refer to that paper for further details. We recall that all the nets corresponding to irreducible representations of $PSL(2, \mathbb{R})$ are subsystems ($n$th derivatives) of the same net on $\mathbb{R}$ (the $U(1)$ current algebra) which is their common dual net.

5.2. de Sitter spacetime

Since the $d$-dimensional de Sitter spacetime $dS^d$ may be defined as the hyperboloid $x_0^2 + 1 = \sum_{i=1}^{d} x_i^2$ in $M^{d+1}$, the wedges can be defined as the intersection of this hyperboloid with the wedges in $M^{d+1}$ whose edges contain the origin. The natural symmetry group of $dS^d$ is the Lorentz group $L_+ = SO(d, 1)$, and the maps $R$, $\Lambda$ are assigned here as in the Minkowski spacetime. Then properties (a), (b), and (d) immediately follow from the corresponding properties for the Minkowski spacetime. Property (c) instead is trivially satisfied, since two wedges $W_1 \subset W_2$, whose edges contain the origin, coincide.
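The geometric picture just described can be illustrated numerically. The sketch below assumes the wedge $\{x_1 > |x_0|\}$ of $M^{d+1}$, whose edge $\{x_0 = x_1 = 0\}$ contains the origin, and uses the boosts in the $(x_0, x_1)$ plane as the associated one-parameter group; the rapidity parametrization and the sample point are hypothetical choices, not taken from the text.

```python
import math


def boost(x, s):
    """Boost with rapidity s in the (x0, x1) plane of M^{d+1}."""
    c, sh = math.cosh(s), math.sinh(s)
    return [c * x[0] + sh * x[1], sh * x[0] + c * x[1]] + list(x[2:])


def on_de_sitter(x, tol=1e-9):
    """Check the hyperboloid equation x0^2 + 1 = sum_{i>=1} xi^2."""
    return abs(x[0] ** 2 + 1 - sum(xi ** 2 for xi in x[1:])) < tol


def in_wedge(x):
    """Wedge x1 > |x0|, whose edge {x0 = x1 = 0} contains the origin."""
    return x[1] > abs(x[0])


def sample_point(u, phi):
    """A point of dS^2 inside M^3 (illustrative parametrization)."""
    return [math.sinh(u),
            math.cosh(u) * math.cos(phi),
            math.cosh(u) * math.sin(phi)]
```

For a sample point such as `sample_point(0.3, 0.2)`, which lies in the wedge, every boosted point `boost(p, s)` stays both on the hyperboloid and in the wedge, reflecting the fact that the boosts preserve the de Sitter wedge.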
Intersections of wedges, namely elements of B, correspond to spacelike cones in the Minkowski space, therefore the standard property and the intersection property on the d-dimensional de Sitter spacetime can be studied applying the techniques of the preceding section. But we can also rely on the direct analysis by Bros and Moschella [9]. Let us recall that the irreducible representations of the group SO(d, 1), d ≥ 2, belong to three classes, usually called principal series representations, complementary series representations, and discrete series representations (cf. e.g. [42, 38]). The first class corresponds to representations appearing in the direct integral decomposition of the regular representation, the second one to representations not appearing in the direct integral decomposition of the regular representation. Concerning the third class however, the name “discrete series” is not always appropriate, namely
it is not always true that they are irreducible direct summands of the regular representation. Indeed, applying a result of Harish–Chandra, it turns out that this fact is possible if and only if $d$ is even (see [38]). The exact determination of the direct-summand representations, namely the recognition of the discrete series as opposed to the "mock discrete series", is well known for the two-covering $SL(2, \mathbb{R})$ of $SO(2, 1)$ [32], implying that there are no mock discrete series representations for $SO(2, 1)$. This problem has been solved in [14] for $d = 4$ and in [39] for a general $d = 2m$, $m \geq 2$.

Theorem 5.6. Let $\{K(B) : B \in \mathcal{B}\}$ be the net of local real vector subspaces associated to a representation $U$ of the Lorentz group $SO(d, 1)$. If $U$ is a subrepresentation of the regular representation, then the standard property, the intersection property and the factor property hold. If $U$ is a representation in the principal or complementary series of the Lorentz group $SO(d, 1)$, then the mentioned properties hold.

Proof. The restriction of a representation from the Poincaré group to the Lorentz group gives a map from nets $K_M$ on the Minkowski space $M^d$ to nets $K_S$ on the de Sitter space defined as $K_S(B) = K_M(C(B))$, where $C(B)$ is the spacelike cone in $M^d$ generated by the region $B$ in $S^{d-1}$ and the origin. We may rephrase results in the previous section saying that the standard property, the intersection property and the factor property hold for regions $B$ given as intersections of wedges in the regular representation of the Lorentz group. By Theorem A.1, cf. also Remark A.8, the properties hold for all subrepresentations.

Since the regular representation decomposes as direct integral of the principal series representations, the standard and the intersection properties hold for almost all values of the parameter labeling the principal series. However, we may use the analysis in [9], where it is shown that a class of free fields may be constructed, corresponding to the principal, respectively complementary series of the representations of $L_+^\uparrow$. In [9] the authors prove the Reeh–Schlieder and Bisognano–Wichmann properties for the free fields corresponding to the principal series, and state that these results extend to the complementary series. By the Bisognano–Wichmann property, such free fields necessarily give rise to the nets constructed as above for the corresponding representations. Therefore the standard property follows by the Reeh–Schlieder property and the intersection property is trivially satisfied since the local algebras are generated by local fields.

Thus, concerning the principal and complementary series, the standard and intersection properties are a consequence of the Reeh–Schlieder property in [9]. Yet the above proof goes beyond that, by showing the same properties to hold in the $dS^d$ models associated with the discrete series, $d$ even. Discrete series representations have been explicitly excluded in the analysis in [9], and the result that such representations give rise to (free) nets of local algebras on the de Sitter space-time seems not to have been known before our analysis. This will be discussed in detail in [27].
Appendix A. Restricting the Poincaré Group Representations to the Lorentz Subgroup

We give here an analysis of the representations of the Lorentz and Poincaré groups needed in the paper. We treat explicitly the (3 + 1)-dimensional case, however the analysis extends to any dimension, as explained in Remark A.8.

If $G$ is a locally compact group, we shall denote by $\lambda_G$ its left regular representation. If $H \subset G$ is a closed subgroup and $\pi$ is a unitary representation of $H$, we shall denote by $\mathrm{Ind}_{H \uparrow G}(\pi)$ the representation of $G$ induced by $\pi$ in the sense of Frobenius, Wigner and Mackey; we shall refer to the books [46, 31, 34] for the theory of induced representations.

Let us recall that the irreducible representations of $P_+^\uparrow$ are induced representations $\mathrm{Ind}_{F \uparrow P_+^\uparrow}(\eta)$, where $F = F(p)$ is the stabilizer of some point $p \in \mathbb{R}^4$, $P_+^\uparrow$ acting on the subgroup $\mathbb{R}^4$ by conjugation, and $\eta$ is an irreducible representation of $F$. We often identify $\mathbb{R}^4$ with its dual $\hat{\mathbb{R}}^4$. When $p$ varies in a given $P_+^\uparrow$-orbit the corresponding induced representations are equivalent, therefore they are labelled by $m = p_\mu p^\mu$. When $m > 0$ the stabilizer is isomorphic to $SO(3) \ltimes \mathbb{R}^4$, therefore positive mass $m$ representations are completely described by the spin $s$. When $m = 0$ and we choose $p_0 > 0$ (to have positive energy), the stabilizer is isomorphic to the Euclidean group $E(2)$. The representations which are trivial on the $E(2)$-translations are the so-called finite-helicity representations, and are completely labelled by $\mathbb{Z}$. The others are called continuous-spin representations. The other cases, namely $p = 0$ and $m < 0$, correspond respectively to null energy (trivial translations) and non positive energy.

In the following we shall say that a property $P$ for representations of a group $G$ is stable if "$P$ is true for $\pi$" implies "$P$ is true for all representations unitarily equivalent to $\pi$" and

\[ P \text{ is true for } \pi \equiv \pi_1 \oplus \pi_2 \iff P \text{ is true for } \pi_1 \text{ and } \pi_2 . \tag{A.1} \]

Theorem A.1. Assume that $P$ is a stable property for the representations of $L_+^\uparrow$. The following are equivalent:

(i) $P$ is true for the restriction to $L_+^\uparrow$ of the positive mass representations.
(ii) $P$ is true for the restriction to $L_+^\uparrow$ of the continuous spin representations.
(iii) $P$ is true for the restriction to $L_+^\uparrow$ of the massless finite helicity representations.
(iv) $P$ is true for $\lambda_{L_+^\uparrow}$.

The proof of this theorem requires some steps.

Lemma A.2. Let $\pi = \mathrm{Ind}_{F \uparrow P_+^\uparrow}(\eta)$ be an irreducible representation of $P_+^\uparrow$ as above, $F = F(p_0)$. Then

\[ \pi|_{L_+^\uparrow} = \mathrm{Ind}_{E \uparrow L_+^\uparrow}(\eta|_E) , \]

where $E$ is the stabilizer, in $L_+^\uparrow$, of $p_0$.
Proof. Let us denote by $X$ the orbit of $p_0$ under the adjoint action of $P_+^\uparrow$ on the subgroup $\mathbb{R}^4$, or equivalently the homogeneous space $P_+^\uparrow / F(p_0)$, and let $\nu$ be the $P_+^\uparrow$-invariant measure on $X$. If $\eta$ is a representation of $F$ acting on the Hilbert space $H_\eta$ of $\eta$, $\pi$ can be defined as

\[ (\pi(g)\xi)(p) = \eta(\alpha(g, p))\,\xi(g^{-1}p) , \]

where $g \in P_+^\uparrow$, $p \in X$, $\xi \in L^2(X, H_\eta, d\nu)$, and $\alpha$ is an $F$-valued cocycle of the form $\alpha(g, p) = s(p)^{-1} g\, s(g^{-1}p)$, where $s$ is a Borel section, namely a Borel map $s : X \to P_+^\uparrow$ satisfying $s(p)p_0 = p$.

By definition, $\mathbb{R}^4$ acts trivially on itself, hence $F = E \ltimes \mathbb{R}^4$ with $\mathbb{R}^4$ acting trivially on $X$, therefore we may choose $s$ to be a $L_+^\uparrow$-valued section. As a consequence, the restriction of $\alpha$ to $L_+^\uparrow \times X$ is $E$-valued, namely the restriction to $L_+^\uparrow$ of $\pi$ is by definition the representation induced by $\eta|_E$.
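The cocycle $\alpha(g, p) = s(p)^{-1} g\, s(g^{-1}p)$ used in the proof can be made concrete in a finite setting. The following sketch replaces the Poincaré group by the hypothetical stand-in $G = S_3$ acting on $X = \{0, 1, 2\}$, with base point $p_0 = 0$ and a section built from transpositions; none of these choices come from the text. It checks that $\alpha$ takes values in the stabilizer of $p_0$ and satisfies the cocycle identity $\alpha(g_1 g_2, p) = \alpha(g_1, p)\,\alpha(g_2, g_1^{-1}p)$.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3, acting on X = {0, 1, 2}
P0 = 0                            # base point p0 of the homogeneous space


def mul(g, h):
    """Composition g∘h: first apply h, then g."""
    return tuple(g[h[i]] for i in range(3))


def inv(g):
    """Inverse permutation."""
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)


def act(g, p):
    """Action of g on the point p of X."""
    return g[p]


def section(p):
    """Section s with s(p) p0 = p: the transposition swapping P0 and p."""
    g = list(range(3))
    g[P0], g[p] = p, P0
    return tuple(g)


def cocycle(g, p):
    """alpha(g, p) = s(p)^{-1} g s(g^{-1} p)."""
    return mul(inv(section(p)), mul(g, section(act(inv(g), p))))


# Stabilizer of P0, playing the role of the subgroup F (or E) in the text.
H = [g for g in G if act(g, P0) == P0]
```

The same bookkeeping underlies the induced representation formula $(\pi(g)\xi)(p) = \eta(\alpha(g, p))\xi(g^{-1}p)$: the cocycle identity is exactly what makes $\pi$ multiplicative.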
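If $\rho$ and $\sigma$ are representations, we shall write $\rho = \sigma$ if $\rho$ is unitarily equivalent to $\sigma$ and $\rho \approx \sigma$ if $\rho$ is quasi equivalent to $\sigma$, namely $\rho \otimes \iota = \sigma \otimes \iota$, where $\iota$ is the identity representation on $\ell^2(\mathbb{N})$.

Lemma A.3. Let $H$ be a locally compact group isomorphic to the Euclidean group $E(2)$. If $\pi$ is an irreducible unitary representation of $H$ and $\pi$ has non-trivial restriction to the subgroup $\mathbb{R}^2$, then $\pi = \pi_q \equiv \mathrm{Ind}_{\mathbb{R}^2 \uparrow H}(q)$ where $q \neq 0$ is a character $q \in \hat{\mathbb{R}}^2$. We have $\lambda_H = \int^\oplus_{\hat{\mathbb{R}}^2} \pi_q \, dq$.

Proof. $E(2)$ is the semidirect product $E(2) = \mathbb{R}^2 \rtimes \mathbb{T}$, where $\mathbb{T}$ acts on the plane $\mathbb{R}^2$ by rotations. The action of $E(2)$ on $\hat{\mathbb{R}}^2$ by dual conjugation factors through the action of $\mathbb{T}$ and is smooth. The stabilizer $H_q$ of a point $q \in \hat{\mathbb{R}}^2$ is $E(2)$ (iff $q = 0$) or $H_q = \mathbb{R}^2$. By Mackey's theorem every irreducible representation $\pi$ of $H$ is induced from an irreducible representation $\rho$ of $H_q$ with $\rho|_{\mathbb{R}^2} = \dim(\rho)\, q$. Thus either $q = 0$ and $\pi$ acts trivially on $\mathbb{R}^2$, or $q \neq 0$ and $\pi = \pi_q$. The rest is now clear by induction in stages because

\[ \lambda_H = \mathrm{Ind}_{\mathbb{R}^2 \uparrow H}(\lambda_{\mathbb{R}^2}) = \mathrm{Ind}_{\mathbb{R}^2 \uparrow H} \int^\oplus_{\hat{\mathbb{R}}^2} q \, dq = \int^\oplus_{\hat{\mathbb{R}}^2} \mathrm{Ind}_{\mathbb{R}^2 \uparrow H}(q) \, dq = \int^\oplus_{\hat{\mathbb{R}}^2} \pi_q \, dq . \]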
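The first step of the proof, that the dual conjugation action of $E(2)$ on $\hat{\mathbb{R}}^2$ factors through the rotations, can be checked by hand on explicit group elements. The sketch below encodes an element of $E(2)$ as a pair `(theta, a)` standing for $x \mapsto R_\theta x + a$ and a character $q$ as the vector with $q(x) = e^{i q \cdot x}$; this encoding and the function names are assumptions made for the illustration.

```python
import math


def rot(theta, x):
    """Rotate the plane vector x by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def mul(g, h):
    """(theta1, a1)(theta2, a2) = (theta1 + theta2, R(theta1) a2 + a1)."""
    (t1, a1), (t2, a2) = g, h
    r = rot(t1, a2)
    return (t1 + t2, (r[0] + a1[0], r[1] + a1[1]))


def inv(g):
    """Inverse of (theta, a), namely (-theta, -R(-theta) a)."""
    t, a = g
    r = rot(-t, a)
    return (-t, (-r[0], -r[1]))


def conjugate_translation(g, x):
    """Return the vector y with g (0, x) g^{-1} = (0, y)."""
    t, y = mul(mul(g, (0.0, x)), inv(g))
    assert abs(t) < 1e-12  # the conjugate is again a pure translation
    return y


def dual_action(g, q):
    """Character g.q with (g.q)(x) = q(g^{-1} x g); only the angle of g enters."""
    return rot(g[0], q)


def pairing(q, x):
    """Phase q.x of the character q at the translation x."""
    return q[0] * x[0] + q[1] * x[1]
```

Since `dual_action` ignores the translation part of `g` and preserves the length of `q`, the orbits in the dual are the circles $|q| = \text{const}$ together with the point $q = 0$, and for $q \neq 0$ no nontrivial rotation fixes $q$, which matches the statement $H_q = \mathbb{R}^2$.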
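Proposition A.4. Let $G$ be a locally compact group and $H \subset G$ a closed subgroup isomorphic to the Euclidean group $E(2)$. Then

\[ \lambda_G = \int^\oplus_{\hat{\mathbb{R}}^2} \mathrm{Ind}_{H \uparrow G}(\pi_q) \, dq . \]

Proof. Immediate by Lemma A.3 because

\[ \lambda_G = \mathrm{Ind}_{H \uparrow G}(\lambda_H) = \mathrm{Ind}_{H \uparrow G} \int^\oplus_{\hat{\mathbb{R}}^2} \pi_q \, dq = \int^\oplus_{\hat{\mathbb{R}}^2} \mathrm{Ind}_{H \uparrow G}(\pi_q) \, dq . \]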
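We shall denote by $\pi_{m,s}$ the irreducible representation of mass $m > 0$ and spin $s \in \mathbb{N}$ of the Poincaré group $P_+^\uparrow$ and by $\pi_{m,s}^{L_+^\uparrow}$ its restriction to the Lorentz subgroup $L_+^\uparrow$. By definition, the continuous spin representations $\sigma_q$ of $P_+^\uparrow$ are the ones induced by the representations $\pi_q$ of $H = E(2) \ltimes \mathbb{R}^4$ in Lemma A.3, where $H$ is the stabilizer in $P_+^\uparrow$ of a point $p$ with $\langle p, p \rangle = 0$, $p_0 > 0$. We shall denote by $\sigma_q^{L_+^\uparrow}$ the restriction of $\sigma_q$ to $L_+^\uparrow$. By Lemma A.2 we have $\sigma_q^{L_+^\uparrow} = \mathrm{Ind}_{\mathbb{R}^2 \uparrow L_+^\uparrow}(q)$.

Lemma A.5. $\lambda_{L_+^\uparrow} = \int^\oplus_{\hat{\mathbb{R}}^2} \sigma_q^{L_+^\uparrow} \, dq$.

Proof. Immediate by Lemmas A.2, A.3 and Proposition A.4.

Lemma A.6. For any given $m > 0$ we have

\[ \lambda_{L_+^\uparrow} \approx \bigoplus_{s \in \mathbb{N}} \pi_{m,s}^{L_+^\uparrow} . \]

Proof. Denote by $\rho_s$ the representation of $SO(3)$ of spin $s$. By Lemma A.2, $\pi_{m,s}^{L_+^\uparrow} \approx \mathrm{Ind}_{SO(3) \uparrow L_+^\uparrow}(\rho_s)$, in particular $\pi_{m,s}^{L_+^\uparrow}$ is independent of $m > 0$. We have

\[ \bigoplus_{s \in \mathbb{N}} \pi_{m,s}^{L_+^\uparrow} = \bigoplus_{s \in \mathbb{N}} \mathrm{Ind}_{SO(3) \uparrow L_+^\uparrow}(\rho_s) = \mathrm{Ind}_{SO(3) \uparrow L_+^\uparrow}\Big( \bigoplus_{s \in \mathbb{N}} \rho_s \Big) \approx \mathrm{Ind}_{SO(3) \uparrow L_+^\uparrow}(\lambda_{SO(3)}) \approx \lambda_{L_+^\uparrow} . \]

The following lemma is a particular case of the subgroup theorem of Mackey (cf. e.g. [34, Chap. II, Theorem 1]) when the subgroup $G_2$ coincides with the group $G$. We give a proof here for the convenience of the reader.

Lemma A.7. Let $H$ be a closed subgroup of $G$, $\eta$ a representation of $H$ and $g_0$ an element of $G$ normalizing $H$. Then

\[ \mathrm{Ind}_{H \uparrow G}(\eta) = \mathrm{Ind}_{H \uparrow G}(\eta^{g_0}) , \]

where $\eta^{g_0}(h) \equiv \eta(g_0^{-1} h g_0)$, $h \in H$.

Proof. Let us denote by $X$ the homogeneous space $G/H$, and let $\nu$ be a $G$-quasi-invariant measure on $X$, that for simplicity we assume to be invariant. Setting $\pi = \mathrm{Ind}_{H \uparrow G}(\eta)$, namely

\[ (\pi(g)\xi)(p) = \eta(\alpha(g, p))\,\xi(g^{-1}p) , \]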
where $g \in G$, $p \in X$, $\xi \in L^2(X, H_\eta, d\nu)$, and $\alpha$ is an $H$-valued cocycle $\alpha(g, p) = s(p)^{-1} g\, s(g^{-1}p)$, with $s : X \to G$ a Borel section satisfying $s(p)p_0 = p$. Hence $\pi^{g_0}$ is given by

\[ (\pi^{g_0}(g)\xi)(p) = \eta(g_0^{-1} \alpha(g, p) g_0)\,\xi(g^{-1}p) = \eta(g_0^{-1} s(p)^{-1} g\, s(g^{-1}p) g_0)\,\xi(g^{-1}p) = \eta(\alpha^{g_0}(g, p))\,\xi(g^{-1}p) , \tag{A.2} \]

where the cocycle $\alpha^{g_0}(g, p) = g_0^{-1} s(p)^{-1} g\, s(g^{-1}p) g_0 = s^{g_0}(p)^{-1} g\, s^{g_0}(g^{-1}p)$ is associated with the map $s^{g_0}(p) = s(p) g_0$. Clearly $s^{g_0} : X \to G$ is a Borel section for the different quotient map $g \to g g_0^{-1} p_0$. As the stabilizer of $p_0$ coincides with the stabilizer of $g_0^{-1} p_0$, the statement follows by the uniqueness of the induced representation.

Proof (of Theorem A.1). (i) ⇔ (iv): By Lemma A.6, property $P$ holds for the representations $\pi_{m,s}^{L_+^\uparrow}$ iff it holds for the regular representation, by stability.

(ii) ⇔ (iv): If $p \neq 0$ has zero mass, the stabilizer $E(p)$ in $L_+^\uparrow$ does not change replacing $p$ with $\lambda p$, $\lambda > 0$. Therefore all elements in $L_+^\uparrow$ moving $p$ to some of its multiples normalize the stabilizer of $p$. For such a $g$,

\[ \pi_q(g^{-1} h g) = \pi_{gq}(h) , \quad h \in E(p) . \]

Note that every $p$-orbit in $\hat{\mathbb{R}}^2$ (except $\{0\}$) can be reached by some $g$ with the property $g : p \mapsto \lambda p$. Therefore the $\sigma_q^{L_+^\uparrow}$'s are all equivalent, by Lemma A.7. Then, by Lemma A.5, $\sigma_q^{L_+^\uparrow}$ is a subrepresentation of the regular representation, hence property $P$ holds by stability. The converse is also true by stability.

(iii) ⇔ (iv): The argument is again similar to the above ones. Let $\chi_n \in \hat{\mathbb{T}}$, $n \in \mathbb{Z}$, be the characters of $\mathbb{T}$. The finite helicity representations are the representations of the Poincaré group induced by the representations $\alpha_n \equiv \mathrm{Ind}_{\mathbb{T} \uparrow E(2)}(\chi_n)$ of $E(2)$. Their restrictions to $L_+^\uparrow$ are $\mathrm{Ind}_{E(2) \uparrow L_+^\uparrow}(\alpha_n)$. Then, by induction in stages,

\[ \bigoplus_n \mathrm{Ind}_{E(2) \uparrow L_+^\uparrow}(\alpha_n) = \bigoplus_n \mathrm{Ind}_{\mathbb{T} \uparrow L_+^\uparrow}(\chi_n) = \mathrm{Ind}_{\mathbb{T} \uparrow L_+^\uparrow}\Big( \bigoplus_n \chi_n \Big) = \mathrm{Ind}_{\mathbb{T} \uparrow L_+^\uparrow}(\lambda_{\mathbb{T}}) = \lambda_{L_+^\uparrow} , \]

and the statement follows by stability.

Remark A.8. Although the proof of Theorem A.1 has been written for the 4-dimensional case, it extends to the case of the Poincaré group $P_+^\uparrow(d)$ acting on the $d$-dimensional Minkowski space, $d \geq 2$. Indeed the continuous spin representations are present only when $d \geq 4$, therefore the property (ii) is void for dimension $\leq 3$. When $d \geq 4$, the stabilizer of a light-like point is the Euclidean group $E(d - 2)$, whose irreducible representations are parametrized by vectors in $\mathbb{R}^{d-2}$ (and vectors with the same length give equivalent representations). This can be found e.g. in
[42], or proved by induction where the first step is given by Lemma A.3 and the induction step follows by the Mackey theorem [46, Theorem 7.3.1]. Therefore all the above analysis applies.

Acknowledgment

This work is supported in part by MIUR and GNAMPA-INDAM.

References

[1] H. Araki, A lattice of von Neumann algebras associated with the quantum field theory of a free Bose field, J. Math. Phys. 4 (1963) 1343.
[2] H. Araki, von Neumann algebras of local observables for free scalar field, J. Math. Phys. 5 (1964) 1–13.
[3] H. Araki, Type of von Neumann algebra associated with free field, Progr. Theoret. Phys. 32 (1964) 956–965.
[4] P. Bertozzini, R. Conti and R. Longo, Covariant sectors and positivity of the energy, Commun. Math. Phys. 141 (1998) 471–492.
[5] H. Baumgärtel, M. Jurke and F. Lledó, On free nets over the Minkowski spacetime, Rep. Math. Phys. 35 (1995) 101–127.
[6] J. Bisognano and E. Wichmann, On the duality condition for a Hermitean scalar field, J. Math. Phys. 16 (1975) 985.
[7] J. Bisognano and E. Wichmann, On the duality condition for quantum fields, J. Math. Phys. 17 (1976) 303–321.
[8] H. J. Borchers, The CPT theorem in two-dimensional theories of local observables, Commun. Math. Phys. 143 (1992) 315.
[9] J. Bros and U. Moschella, Two-point functions and quantum fields in de Sitter universe, Rev. Math. Phys. 8 (1996) 327–391.
[10] R. Brunetti, D. Guido and R. Longo, Modular structure and duality in conformal quantum field theory, Commun. Math. Phys. 156 (1993) 201–219.
[11] R. Brunetti, D. Guido and R. Longo, Group cohomology, modular theory and spacetime symmetries, Rev. Math. Phys. 7 (1994) 57–71.
[12] D. Buchholz, O. Dreyer, M. Florig and S. J. Summers, Geometric modular action and spacetime symmetry groups, Rev. Math. Phys. 12 (2000) 475–560.
[13] D. Buchholz and S. J. Summers, An algebraic characterization of vacuum states in Minkowski space, Commun. Math. Phys. 155 (1993) 449–458.
[14] J. Dixmier, Représentations intégrables du groupe de De Sitter, Bull. Soc. Math. France 89 (1961) 9–41.
[15] J. P. Eckmann and K. Osterwalder, An application of Tomita's theory of modular Hilbert algebras: Duality for free Bose fields, J. Funct. Anal. 13 (1973) 1–22.
[16] L. Fassarella and B. Schroer, Wigner particle theory and local quantum physics, hep-th/0112168.
[17] F. Figliolini and D. Guido, The Tomita operator for the free scalar field, Ann. Inst. H. Poincaré Phys. Théor. 51 (1989) 419–435.
[18] F. Figliolini and D. Guido, On the type of second quantization factors, J. Operator Theory 31 (1994) 229–252.
[19] A. Guichardet, Tensor Products of C*-Algebras (Infinite), Aarhus University, Lecture notes series 13, 1969.
[20] D. Guido and R. Longo, Relativistic invariance and charge conjugation in quantum field theory, Commun. Math. Phys. 148 (1992) 521.
[21] D. Guido, Modular covariance, PCT, spin and statistics, Ann. Inst. H. Poincaré 63 (1995) 383.
[22] D. Guido and R. Longo, An algebraic spin and statistics theorem, Commun. Math. Phys. 172 (1995) 517–533.
[23] D. Guido and R. Longo, The conformal spin and statistics theorem, Commun. Math. Phys. 181 (1996) 11.
[24] D. Guido and R. Longo, Natural energy bounds in quantum thermodynamics, Commun. Math. Phys. 218 (2001) 513–536.
[25] D. Guido, R. Longo, J. E. Roberts and R. Verch, Charged sectors, spin and statistics in quantum field theory on curved space-times, Rev. Math. Phys. 13 (2001) 125–198.
[26] D. Guido, R. Longo and H.-W. Wiesbrock, Extensions of conformal nets and superselection structures, Commun. Math. Phys. 192 (1998) 217.
[27] D. Guido and R. Longo, Dethermalization and de Sitter/CFT correspondence, under preparation.
[28] R. Haag, Local Quantum Physics, Springer-Verlag, New York-Berlin-Heidelberg, 1996.
[29] P. Hislop and R. Longo, Modular structure of the local observables associated with the free massless scalar field theory, Commun. Math. Phys. 84 (1982) 71–85.
[30] G. J. Iverson and G. Mack, Quantum fields and interaction of massless particles: The continuous spin case, Ann. Phys. 64 (1971) 211–253.
[31] A. A. Kirillov, Elements of the Theory of Representations, Springer-Verlag, Berlin-Heidelberg, 1976.
[32] S. Lang, SL2(R), Springer-Verlag, New York-Berlin-Heidelberg, 1985.
[33] P. Leyland, J. E. Roberts and D. Testard, Duality for the free electromagnetic field, Marseille preprint, 1976, unpublished.
[34] R. L. Lipsman, Group Representations, Lecture Notes in Mathematics 338, Springer-Verlag, Berlin-Heidelberg-New York, 1974.
[35] F. Lledó, Conformal covariance of massless free nets, Rev. Math. Phys. 13 (2001) 1135–1161.
[36] J. Mund, The Bisognano–Wichmann theorem for massive theories, Ann. Henri Poincaré 2 (2001) 907–926.
[37] S. Stratila and L. Zsido, Lectures on von Neumann Algebras, Abacus Press, 1979.
[38] E. A. Thieleker, The unitary representations of the generalized Lorentz groups, Trans. Amer. Math. Soc. 199 (1974) 327–367.
[39] E. A. Thieleker, On the integrable and square-integrable representations of Spin(1, 2m), Trans. Amer. Math. Soc. 230 (1977) 1–40.
[40] L. J. Thomas and E. H. Wichmann, On the causal structure of Minkowski spacetime, J. Math. Phys. 38 (1997) 5044–5086.
[41] V. S. Varadarajan, Geometry of Quantum Theory, Springer-Verlag, New York, 1985.
[42] N. Ja. Vilenkin and A. U. Klimyk, Representation of Lie groups and Special Functions. Vol. 2. Class I Representations, Special Functions, and Integral Transforms, Mathematics and its Applications (Soviet Series), 74. Kluwer Academic Publishers Group, Dordrecht, 1993.
[43] H.-W. Wiesbrock, A comment on a recent work of Borchers, Lett. Math. Phys. 25 (1992) 157–159.
[44] H.-W. Wiesbrock, Half-sided modular inclusions of von Neumann algebras, Commun. Math. Phys. 157 (1993) 83.
[45] J. Yngvason, Zero-mass infinite spin representations of the Poincaré group and quantum field theory, Commun. Math. Phys. 18 (1970) 195–203.
[46] R. J. Zimmer, Ergodic Theory of Semisimple Lie Groups, Birkhäuser, Boston-Basel-Stuttgart, 1984.
August 28, 2002 10:28 WSPC/148-RMP
00139
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 787–796 © World Scientific Publishing Company
A REMARK ON QUANTUM GROUP ACTIONS AND NUCLEARITY
S. DOPLICHER
Dipartimento di Matematica, Università di Roma “La Sapienza”, Piazzale A. Moro 5, I-00185 Roma, Italy
[email protected]

R. LONGO∗, J. E. ROBERTS† and L. ZSIDÓ‡
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica 1, I-00133 Roma, Italy
∗[email protected]
†[email protected]
‡[email protected]

Received 28 March 2002
Revised 6 July 2002

Dedicated to Huzihiro Araki on the occasion of his seventieth birthday

Let $H$ be a compact quantum group with faithful Haar measure and bounded counit. If $\alpha$ is an action of $H$ on a $C^*$-algebra $A$, we show that $A$ is nuclear if and only if the fixed-point subalgebra $A^\alpha$ is nuclear. As a consequence $H$ is a nuclear $C^*$-algebra.

Keywords: $C^*$-algebra; coaction; nuclearity; compact quantum group.
1. Introduction

The notion of compact quantum group has been axiomatized by Woronowicz [16] and some variations appeared in subsequent papers, e.g. [1, 12, 13, 9]. Here we shall consider compact quantum groups allowing a faithful Haar state and bounded counit, conditions that cover a rather wide range of applicability, cf. [15].

In this note we will show that a $C^*$-algebra acted upon by a compact quantum group with faithful Haar state and bounded counit is nuclear if and only if the fixed point algebra is nuclear. In particular, a compact quantum group with faithful Haar state and bounded counit is a nuclear $C^*$-algebra.

Notice however that our assumptions about the faithfulness of the Haar state and the boundedness of the counit are essential: there exist non-nuclear compact quantum groups, as we can infer from [15], second remark after the statement of Proposition 1.8, and [13].

As a special case, a $C^*$-algebra acted upon by a compact group of automorphisms is nuclear if and only if the fixed point algebra is nuclear. Indeed this was
our initial motivation, during a common discussion long ago at the Rome seminar on Operator Algebras, inspired by a result of Høegh-Krohn, Landstad and Størmer [8] showing that there is a compact ergodic action on a $C^*$-algebra only if the algebra is nuclear, and by a lemma in [6]. This particular case is discussed separately as it is a simple and clarifying illustration for the general case.

We notice that another extension of the result of Høegh-Krohn, Landstad and Størmer to compact matrix pseudogroups was already proved by Boca [2, Corollary 23]: any $C^*$-algebra acted upon ergodically by a nuclear compact matrix pseudogroup is again nuclear. Actually Boca’s proof works in the case of an ergodic action of an arbitrary nuclear compact quantum group.

We also analyze compact actions on von Neumann algebras where the analogous result holds for injective von Neumann algebras. This $W^*$-version can however also be inferred from a similar result for crossed products due to Connes [5] and indeed can be extended in part to the case of integrable actions of locally compact groups or Kac algebras, but we do not treat this case here.

Our discussion is elementary, but in the last section ($W^*$-case) we make use of the relation between injectivity and semi-discreteness due to Effros–Lance, Choi–Effros and Connes [7, 3, 4].

Notation. If $A$ is a $*$-algebra we shall denote the identity map on $A$ by $\iota_A$, or simply by $\iota$. When $A$ is unital, we shall denote its unit by $1_A$. The maximal tensor product for $C^*$-algebras is denoted by $\otimes_{\max}$, the minimal one by $\otimes_{\min}$ or simply by $\otimes$, the algebraic tensor product by $\odot$. For a $C^*$-algebra $A$, we shall usually identify $\mathbb{C} \otimes A$ and $A \otimes \mathbb{C}$ with $A$. By a homomorphism we shall always mean a $*$-homomorphism.
2. Basic Result

Let us call a conditional expectation $E$ on a $C^*$-algebra $A$ GNS-faithful if
\[ x \in A \ \ \&\ \ E(y^* x^* x y) = 0 \ \ \forall y \in A \ \Rightarrow\ x = 0 . \]
This means that the direct sum of the GNS representations associated with all $E$-invariant states is injective. Of course, $E$ is GNS-faithful if $E$ is faithful.

Let $A$ be a $C^*$-algebra, $A_0 \subset A$ a $C^*$-subalgebra and $E : A \to A_0$ a conditional expectation. If $B$ is any $C^*$-algebra, it is easy to see that $\iota_B \odot E : B \odot A \to B \odot A_0$ extends to a conditional expectation from $B \otimes_{\min} A$ to $B \otimes_{\min} A_0$, which is (GNS-)faithful if $E$ is (GNS-)faithful and which we denote by $\iota_B \otimes E$.

We shall say that $E$ is stably (GNS-)faithful if, for every $C^*$-algebra $B$, $\iota_B \odot E$ extends to a bounded (GNS-)faithful map $\tilde E : B \otimes_{\max} A \to B \otimes_{\max} A$. In this case $\tilde E$ is a (GNS-)faithful conditional expectation from $B \otimes_{\max} A$ onto the closure of $B \odot A_0$ in $B \otimes_{\max} A$.
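The remark that a faithful conditional expectation is GNS-faithful can be spelled out in a few lines. The following is a sketch of the standard argument, not quoted from the text; it assumes only that faithfulness of $E$ means that $E(z^* z) = 0$ forces $z = 0$.

```latex
% Assumption: E is faithful, i.e. E(z^* z) = 0 implies z = 0.
% Hypothesis: E(y^* x^* x y) = 0 for all y in A.
\begin{align*}
E\bigl((xy)^*(xy)\bigr) = 0 \ \ \forall y \in A
  &\;\Longrightarrow\; xy = 0 \ \ \forall y \in A
     && \text{(faithfulness, with } z = xy\text{)} \\
  &\;\Longrightarrow\; x x^* = 0
     && \text{(choose } y = x^*\text{)} \\
  &\;\Longrightarrow\; \|x\|^2 = \|x x^*\| = 0
     && \text{($C^*$-identity)} \\
  &\;\Longrightarrow\; x = 0 .
\end{align*}
```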
Proposition 2.1. Let $A$ be a $C^*$-algebra, $A_0 \subset A$ a $C^*$-subalgebra and $E : A \to A_0$ a stably GNS-faithful conditional expectation. Then $A$ is nuclear if and only if $A_0$ is nuclear.

Proof. The implication $A$ nuclear $\Rightarrow$ $A_0$ nuclear is known (indeed one just needs $A_0$ to be the range of a conditional expectation). One way to see this is to recall that $A$ is nuclear iff the enveloping von Neumann algebra $A^{**}$ is injective [7, 3], and to consider the double transposed conditional expectation $E^{**} : A^{**} \to A_0^{**}$. If $A$ is nuclear, $A^{**}$ is injective, and then $A_0^{**}$ is injective too, hence $A_0$ is nuclear.

Conversely, assume that $A_0$ is nuclear. Given a $C^*$-algebra $B$, the identity map on the algebraic tensor product $B \odot A$ extends to a homomorphism
\[ \pi : B \otimes_{\max} A \to B \otimes_{\min} A . \]
To show that $A$ is nuclear, we have to prove $\pi$ to be one-to-one. By assumption, there is a GNS-faithful conditional expectation $\tilde E : B \otimes_{\max} A \to B \otimes_{\max} A$ extending $\iota_B \odot E$. Denoting the conditional expectation $\iota_B \otimes E$ on $B \otimes_{\min} A$ by $\tilde E_0$, we clearly have a commutative diagram
\[
\begin{array}{ccc}
B \otimes_{\max} A & \xrightarrow{\ \ \pi\ \ } & B \otimes_{\min} A \\
{\scriptstyle \tilde E}\downarrow & & \downarrow{\scriptstyle \tilde E_0} \\
B \otimes_{\max} A_0 & \xrightarrow{\ \ \pi\ \ } & B \otimes_{\min} A .
\end{array}
\tag{1}
\]
Since $\tilde E$ maps $B \odot A$ onto $B \odot A_0$, by continuity $\tilde E$ maps $B \otimes_{\max} A$ onto the closure of $B \odot A_0$ in $B \otimes_{\max} A$, which we may denote by $B \otimes A_0$ since $A_0$ is nuclear. Then (1) yields the commutative diagram
\[
\begin{array}{ccc}
B \otimes_{\max} A & \xrightarrow{\ \ \pi\ \ } & B \otimes_{\min} A \\
{\scriptstyle \tilde E}\downarrow & & \downarrow{\scriptstyle \tilde E_0} \\
B \otimes A_0 & \xrightarrow{\ \iota_{B \otimes A_0}\ } & B \otimes A_0 .
\end{array}
\tag{2}
\]
Now if $x \in B \otimes_{\max} A$ belongs to the kernel of $\pi$, then $\tilde E_0(\pi(y^* x^* x y)) = 0$ for all $y \in B \otimes_{\max} A$. By the commutativity of diagram (2) we have $\tilde E(y^* x^* x y) = 0$ for all $y \in B \otimes_{\max} A$, so $x = 0$ because $\tilde E$ is GNS-faithful. We conclude that $\pi$ is injective and $A$ nuclear.

3. Compact Group Actions and Nuclearity

In the sequel, group actions on $C^*$-algebras and von Neumann algebras will be assumed continuous in the usual sense, namely pointwise norm continuity in the $C^*$-case and pointwise weak$^*$-continuity in the $W^*$-case.

Proposition 3.1. Let $\alpha : G \to \mathrm{Aut}(A)$ be an action of a compact group $G$ on a $C^*$-algebra $A$. Then $A$ is nuclear iff the fixed point $C^*$-subalgebra $A^\alpha$ is nuclear.
Proof. Let $E_\alpha : A \to A^\alpha$ be the conditional expectation defined by
\[ E_\alpha(a) = \int \alpha_g(a)\,dg , \qquad a \in A . \]
According to Proposition 2.1, it suffices to show that $E_\alpha$ is stably faithful. The action $\beta = \iota_B \otimes \alpha$ on $B \odot A$ preserves the maximal cross norm $\|\cdot\|_{\max}$. Furthermore, $G \ni g \mapsto \beta_g(x)$ is continuous for every $x \in B \odot A$ with respect to $\|\cdot\|_{\max}$. Thus $\beta$ extends to an action $\beta^{\max}$ of $G$ on $B \otimes_{\max} A$. Set
\[ E_\beta(x) = \int \beta^{\max}_g(x)\,dg , \qquad x \in B \otimes_{\max} A . \]
Then $E_\beta$ is a faithful conditional expectation because we have
\[ E_\beta(x) = \int \beta^{\max}_g(x)\,dg = 0 \ \Rightarrow\ \beta^{\max}_g(x) = 0 \quad \forall g \in G \]
for every positive $x \in B \otimes_{\max} A$. Since $E_\beta$ maps $B \odot A$ onto $B \odot A^\alpha$ and $E_\beta|_{B \odot A} = \iota \odot E_\alpha$, we see that $E_\alpha$ is stably faithful.

4. Quantum Group Actions and Nuclearity

Following Woronowicz and Van Daele [16, 12], by a compact quantum group we shall mean a unital $C^*$-algebra $H$ equipped with a comultiplication $\delta$, that is a unital homomorphism $\delta : H \to H \otimes H$ satisfying
\[ (\delta \otimes \iota_H) \circ \delta = (\iota_H \otimes \delta) \circ \delta , \]
such that $\delta(H)(H \otimes 1_H)$ and $\delta(H)(1_H \otimes H)$ are linearly dense in $H \otimes H$.

According to [16, 12], there then exists a unique Haar state on $H$, that is a state $\varphi$ which satisfies the invariance condition
\[ ((\varphi \otimes \iota_H) \circ \delta)(x) = ((\iota_H \otimes \varphi) \circ \delta)(x) = \varphi(x) 1_H , \qquad x \in H . \tag{3} \]
We notice that by condition (3) the fixed point algebra
\[ H^\delta = \{ x \in H ;\ \delta(x) = x \otimes 1_H \} \ \text{ is equal to } \ \mathbb{C} 1_H . \tag{4} \]
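The step from (3) to (4) is short. The following sketch spells it out, using only the invariance condition (3), the fact that $\varphi$ is a state, and the identification of $H \otimes \mathbb{C}$ with $H$ from the Notation paragraph.

```latex
% Let x \in H satisfy \delta(x) = x \otimes 1_H and apply \iota_H \otimes \varphi to both sides.
\begin{align*}
(\iota_H \otimes \varphi)(\delta(x)) &= \varphi(x)\,1_H
   && \text{by (3),} \\
(\iota_H \otimes \varphi)(x \otimes 1_H) &= \varphi(1_H)\,x = x
   && \text{since } \varphi \text{ is a state,} \\
\Longrightarrow\quad x &= \varphi(x)\,1_H \in \mathbb{C}1_H .
\end{align*}
% Conversely, \delta(1_H) = 1_H \otimes 1_H because \delta is unital,
% so \mathbb{C}1_H \subset H^\delta and hence H^\delta = \mathbb{C}1_H.
```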
The unique Haar state $\varphi$ is not necessarily faithful. However, the cyclic vector $\xi_\varphi$ of the GNS-representation $\pi_\varphi$ is also separating for $\pi_\varphi(H)''$. Indeed, $\delta$ lifts to a comultiplication $\delta_\varphi$ on $M_\varphi = \pi_\varphi(H)''$ (see e.g. [14, Theorem 2.4]) and the vector state $\omega_{\xi_\varphi}$ on $M_\varphi$ satisfies the invariance conditions corresponding to (3), in particular $M_\varphi^{\delta_\varphi} = \mathbb{C} 1_{M_\varphi}$. Now [11, Lemma 0.2.4] implies that the support of $\omega_{\xi_\varphi}|_{M_\varphi}$ belongs to $M_\varphi^{\delta_\varphi} = \mathbb{C} 1_{M_\varphi}$, hence it is equal to $1_{M_\varphi}$. Consequently, replacing $H$ with $H/\ker(\pi_\varphi)$ if necessary, one can consider the case where $\varphi$ is faithful, cf. e.g. [15, remarks after Theorem 5.6], [10, 14].
According to [16, Theorem 2.2], the linear span $\mathcal{A}$ of all matrix elements of all finite-dimensional unitary representations of $H$ is a dense $*$-subalgebra of $H$ with $1_H \in \mathcal{A}$, $\delta(\mathcal{A}) \subset \mathcal{A} \odot \mathcal{A}$ and there are unique linear maps $\varepsilon : \mathcal{A} \to \mathbb{C}$ and $\kappa : \mathcal{A} \to \mathcal{A}$, called respectively counit and coinverse or antipode, such that
\[ (\varepsilon \odot \iota_{\mathcal{A}})(\delta(a)) = (\iota_{\mathcal{A}} \odot \varepsilon)(\delta(a)) = a , \qquad a \in \mathcal{A} , \]
\[ (m \circ (\kappa \odot \iota_{\mathcal{A}}))(\delta(a)) = (m \circ (\iota_{\mathcal{A}} \odot \kappa))(\delta(a)) = \varepsilon(a) 1_H , \qquad a \in \mathcal{A} , \]
where the linear map $m : \mathcal{A} \odot \mathcal{A} \to \mathcal{A}$ is defined by $m(a \odot b) = ab$, $a, b \in \mathcal{A}$.

The counit $\varepsilon$ is a multiplicative positive linear functional on $\mathcal{A}$ with $\varepsilon(1_H) = 1$, but it is in general not bounded. However, it is often bounded and then it extends to a multiplicative state on $H$, still denoted by $\varepsilon$, which satisfies
\[ (\varepsilon \otimes \iota_H) \circ \delta = (\iota_H \otimes \varepsilon) \circ \delta = \iota_H . \tag{5} \]
We notice that (5) implies the injectivity of $\delta$.

In the rest of this section $H$ will denote a compact quantum group with faithful Haar state and bounded counit. This is the case for compact groups and for the quantum $U(N)$-group, but not for the dual of a non-amenable discrete group, see the second remark after the statement of Proposition 1.8 in [15].

By a coaction $\alpha$ of $H$ on a $C^*$-algebra $A$ we mean a homomorphism $\alpha : A \to A \otimes H$ such that
\[ (\alpha \otimes \iota_H) \circ \alpha = (\iota_A \otimes \delta) \circ \alpha , \]
\[ (\iota_A \otimes \varepsilon) \circ \alpha = \iota_A . \]
The last equation implies that $\alpha$ is injective. The fixed-point subalgebra is then defined by
\[ A^\alpha = \{ a \in A ;\ \alpha(a) = a \otimes 1_H \} . \]
Denoting
\[ E = (\iota_A \otimes \varphi) \circ \alpha : A \to A , \]
we have the known fact:

Lemma 4.1. $E$ is a faithful conditional expectation from $A$ to $A^\alpha$.

Proof. $E$ is a faithful map, being the composition of faithful maps. Clearly $A^\alpha$ is contained in the range of $E$.
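We now apply standard calculations and check that $E$ is idempotent. Indeed, identifying $A$ with $A \otimes \mathbb{C}$, we get
\begin{align*}
E^2 &= ((\iota_A \otimes \varphi) \circ \alpha) \circ (\iota_A \otimes \varphi) \circ \alpha \\
    &= (\iota_A \otimes \varphi) \circ (\alpha \otimes \varphi) \circ \alpha \\
    &= (\iota_A \otimes \varphi \otimes \varphi) \circ (\alpha \otimes \iota_H) \circ \alpha \\
    &= (\iota_A \otimes \varphi \otimes \varphi) \circ (\iota_A \otimes \delta) \circ \alpha \\
    &= (\iota_A \otimes \underbrace{((\varphi \otimes \varphi) \circ \delta)}_{=\,\varphi(\cdot)\cdot 1_{\mathbb{C}} \otimes 1_{\mathbb{C}}}) \circ \alpha \\
    &= (\iota_A \otimes \varphi) \circ \alpha = E .
\end{align*}
To see that $E(A) \subset A^\alpha$ we compute:
\begin{align*}
\alpha \circ E &= \alpha \circ (\iota_A \otimes \varphi) \circ \alpha \\
    &= (\iota_A \otimes \iota_H \otimes \varphi) \circ (\alpha \otimes \iota_H) \circ \alpha \\
    &= (\iota_A \otimes \iota_H \otimes \varphi) \circ (\iota_A \otimes \delta) \circ \alpha \\
    &= (\iota_A \otimes \underbrace{((\iota_H \otimes \varphi) \circ \delta)}_{=\,\varphi(\cdot)\cdot 1_H \otimes 1_{\mathbb{C}}}) \circ \alpha \\
    &= (\iota_A \otimes (\varphi(\cdot) \cdot 1_H)) \circ \alpha = E \otimes 1_H .
\end{align*}
The rest is now clear.

Lemma 4.2. Let $B$, $A$ and $H$ be $C^*$-algebras. The identity map on $B \odot A \odot H$ extends to a homomorphism
\[ \rho : B \otimes_{\max} (A \otimes_{\min} H) \to (B \otimes_{\max} A) \otimes_{\min} H . \]

Proof. Let $B$ and $A$ act faithfully on a Hilbert space $\mathcal{H}$ in such a way that the $C^*$-algebra generated by $B$ and $A$ is $B \otimes_{\max} A$ and let $H$ act faithfully on a Hilbert space $\mathcal{K}$. The $C^*$-algebra generated by $B$, $A$ and $H$ on $\mathcal{H} \otimes \mathcal{K}$ is clearly $(B \otimes_{\max} A) \otimes_{\min} H$ and the $C^*$-algebra generated by $A$ and $H$ is $A \otimes_{\min} H$. Thus $(B \otimes_{\max} A) \otimes_{\min} H$ contains commuting copies of $B$ and $A \otimes_{\min} H$. By the universal property of $\otimes_{\max}$, we have a natural homomorphism $\rho : B \otimes_{\max} (A \otimes_{\min} H) \to (B \otimes_{\max} A) \otimes_{\min} H$.

Let $\alpha : A \to A \otimes H$ be a coaction as above and $B$ a $C^*$-algebra. By the universality property of $\otimes_{\max}$, the map $\iota_B \odot \alpha : B \odot A \to B \odot (A \otimes H)$ extends to a homomorphism $\iota_B \otimes \alpha : B \otimes_{\max} A \to B \otimes_{\max} (A \otimes H)$. By composing it with the map $\rho$ in Lemma 4.2, we get a homomorphism
\[ \tilde\alpha \equiv \rho \circ (\iota_B \otimes \alpha) : B \otimes_{\max} A \to (B \otimes_{\max} A) \otimes H . \]

Lemma 4.3. $\tilde\alpha$ is a coaction of $H$ on $B \otimes_{\max} A$.

Proof. We first check the “counit” condition
\[ (\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \tilde\alpha = \iota_{B \otimes_{\max} A} . \]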
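Clearly we have
\[ ((\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \tilde\alpha)(b \otimes 1_A) = (\iota_{B \otimes_{\max} A} \otimes \varepsilon)(b \otimes 1_A \otimes 1_H) = b \otimes 1_A , \qquad b \in B . \]
Note that $(\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \rho$ is the bounded map from $B \otimes_{\max} (A \otimes H)$ to $B \otimes_{\max} A$, $b \otimes a \otimes h \mapsto b \otimes (\varepsilon(h) \cdot a)$, $a \in A$, $b \in B$, $h \in H$, thus
\[ (\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \rho : b \otimes x \mapsto b \otimes ((\iota_A \otimes \varepsilon)(x)) , \qquad b \in B , \ x \in A \otimes H . \]
By the counit property of $\varepsilon$, we then have
\begin{align*}
((\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \tilde\alpha)(1_B \otimes a)
  &= ((\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \rho)(1_B \otimes \alpha(a)) \\
  &= 1_B \otimes ((\iota_A \otimes \varepsilon)\alpha(a)) = 1_B \otimes a , \qquad a \in A .
\end{align*}
As $\varepsilon$ is multiplicative, $(\iota_{B \otimes_{\max} A} \otimes \varepsilon) \circ \tilde\alpha$ acts identically on $B \odot A$, hence on $B \otimes_{\max} A$ by continuity. As a consequence, $\tilde\alpha$ is injective.

Since $\tilde\alpha$ is injective, the restriction of $\rho$ to $(\iota_B \otimes \alpha)(B \otimes_{\max} A)$ is injective too and we can define the following commutative diagrams
\[
\begin{array}{ccccc}
B \otimes_{\max} A & \xrightarrow{\ \iota_B \otimes \alpha\ } & B \otimes_{\max} (A \otimes H) &
  \underset{\iota_B \otimes (\iota_A \otimes \delta)}{\overset{\iota_B \otimes (\alpha \otimes \iota_H)}{\longrightarrow}} &
  B \otimes_{\max} ((A \otimes H) \otimes H) \\
{\scriptstyle \iota_{B \otimes_{\max} A}}\downarrow & & \downarrow{\scriptstyle \rho} & & \downarrow{\scriptstyle \rho'} \\
B \otimes_{\max} A & \xrightarrow{\ \ \tilde\alpha\ \ } & (B \otimes_{\max} A) \otimes H &
  \underset{\iota_{B \otimes_{\max} A} \otimes \delta}{\overset{\tilde\alpha \otimes \iota_H}{\longrightarrow}} &
  (B \otimes_{\max} A) \otimes H \otimes H
\end{array}
\]
where $\rho'$ is the natural map constructed by Lemma 4.2, showing that $\tilde\alpha$ is a coaction.

Proposition 4.4. $E$ is stably faithful.

Proof. Let $B$ be a $C^*$-algebra. The conditional expectation $\tilde E$ on $B \otimes_{\max} A$ extending $\iota_B \otimes E$ on $B \odot A$ is the one associated with the coaction $\tilde\alpha$, hence it is faithful by Lemma 4.1.

Corollary 4.5. Let $\alpha$ be a coaction of a compact quantum group $H$ with faithful Haar state and bounded counit on a $C^*$-algebra $A$. Then $A$ is nuclear iff $A^\alpha$ is nuclear.

Proof. This is a consequence of Propositions 2.1 and 4.4.

Corollary 4.6. A compact quantum group $H$ with faithful Haar state and bounded counit is a nuclear $C^*$-algebra.

Proof. The comultiplication $\delta$ is a coaction of $H$ on itself and the statement follows by the above corollary, taking (4) into account.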
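As a consistency check, Corollary 4.5 can be specialized to a classical compact group $G$. The sketch below relies on the usual identifications $H = C(G)$ and $A \otimes C(G) \cong C(G, A)$; these identifications and the formulas for $\delta$, $\varepsilon$ and $\varphi$ are the standard classical ones and are stated here as assumptions rather than quoted from the text.

```latex
% Classical compact group G with normalized Haar measure dg and unit element e.
\begin{align*}
H &= C(G), &
\delta(f)(g,h) &= f(gh), &
\varepsilon(f) &= f(e), &
\varphi(f) &= \int_G f(g)\,dg .
\end{align*}
% A continuous action g \mapsto \alpha_g of G on A defines a coaction
% \alpha : A \to A \otimes C(G) \cong C(G, A):
\begin{align*}
\alpha(a)(g) &= \alpha_g(a), \\
E(a) = (\iota_A \otimes \varphi)(\alpha(a)) &= \int_G \alpha_g(a)\,dg = E_\alpha(a), \\
\alpha(a) = a \otimes 1_H &\iff \alpha_g(a) = a \ \ \forall g \in G .
\end{align*}
```

Under these assumptions the fixed-point subalgebra of the coaction is the fixed point algebra of the group action, and Corollary 4.5 reduces to Proposition 3.1.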
5. Compact Group Actions: the $W^*$-case

Before giving a $W^*$-version of Proposition 3.1, we need some preliminaries.

Let $X$ be a Banach space, and $Y$ a weak$^*$ dense Banach subspace of $X^*$. By a $\sigma(X, Y)$-continuous group of isometries $V : G \to B(X)$ we mean a $\sigma(X, Y)$-continuous homomorphism of the group $G$ into the group of all $\sigma(X, Y)$-continuous linear isometries of $X$. We note that to check the $\sigma(X, Y)$-continuity of the map $g \in G \to V_g \in B(X)$ we may verify that the maps $g \in G \to y(V_g x)$ are continuous when $x$ varies in a norm dense subset of $X$ and $y$ varies in a norm dense subset of $Y$.

Let $M$ and $N$ be von Neumann algebras. The binormal tensor product $N \otimes_{\mathrm{bin}} M$ is the norm completion of $N \odot M$ with respect to the norm
\[ \|x\| = \sup_\pi \|\pi(x)\| , \qquad x \in N \odot M , \]
where the supremum ranges over all representations of $N \odot M$ with normal restrictions to $N \otimes 1_M$ and $1_N \otimes M$, called binormal representations [7]. Let $F \subset (N \otimes_{\mathrm{bin}} M)^*$ be the Banach space of normal linear functionals associated with all such binormal representations.

Lemma 5.1. The kernel $J$ of the natural homomorphism $\pi : N \otimes_{\mathrm{bin}} M \to N \otimes_{\min} M$ is $\sigma(N \otimes_{\mathrm{bin}} M, F)$-closed.

Proof. When $M$ and $N$ act on $L^2(M)$ and $L^2(N)$ (respectively), the $C^*$-subalgebra of $B(L^2(N) \,\bar\otimes\, L^2(M))$ generated by $N \otimes 1_M$ and $1_N \otimes M$ carries the minimal tensor product norm $\|\cdot\|_{\min}$. Thus we may identify it with $N \otimes_{\min} M$. With this identification, $\pi$ is a binormal representation of $N \otimes_{\mathrm{bin}} M$ on $L^2(N) \,\bar\otimes\, L^2(M)$, whose kernel is $J$. Note now that $\pi$ is continuous from the $\sigma(N \otimes_{\mathrm{bin}} M, F)$-topology to the $\sigma$-weak topology of $B(L^2(N) \otimes L^2(M))$, therefore $J$ is $\sigma(N \otimes_{\mathrm{bin}} M, F)$-closed.

Proposition 5.2. Let $\alpha : G \to \mathrm{Aut}(M)$ be an action of a compact group $G$ on a von Neumann algebra $M$. Then $M$ is injective if and only if the fixed point subalgebra $M^\alpha$ is injective.

Proof. If $M$ is injective, so is $M^\alpha$ because there is a conditional expectation from $M$ to $M^\alpha$.

Now assume that $M^\alpha$ is injective. In analogy with the proof of Proposition 2.1, we shall prove that, given any von Neumann algebra $N$, the ideal $J$ in Lemma 5.1 is $\{0\}$, which means that $M$ is semidiscrete, that is injective [3, 5].

The action $\beta = \iota \otimes \alpha$ on $N \odot M$ preserves the norm $\|\cdot\|_{\mathrm{bin}}$, so each $\beta_g$ extends to a $*$-automorphism of $N \otimes_{\mathrm{bin}} M$, still denoted by $\beta_g$. Furthermore, the map
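$G \ni g \mapsto \beta_g(x)$ is $\sigma(N \odot M, F)$-continuous for every $x \in N \odot M$. Therefore the action $\beta$ on $N \otimes_{\mathrm{bin}} M$ is $\sigma(N \otimes_{\mathrm{bin}} M, F)$-continuous.

Let us consider the conditional expectation $E_\alpha : M \to M^\alpha$ defined by
\[ E_\alpha(x) = \int \alpha_g(x)\,dg , \qquad x \in M . \]
The completely positive map $\iota \otimes E_\alpha : N \otimes_{\min} M \to N \otimes_{\min} M$ has norm 1, so
\[ \|(\iota \otimes E_\alpha)(x)\|_{\min} \le \|x\|_{\min} \le \|x\|_{\mathrm{bin}} , \qquad x \in N \odot M . \]
Now let $x \in N \odot M$. Then $(\iota \otimes E_\alpha)(x) \in N \odot M^\alpha$. As $M^\alpha$ is semidiscrete, $\|\cdot\|_{\min}$ and $\|\cdot\|_{\mathrm{bin}}$ coincide on $N \odot M^\alpha$. Therefore
\[ \|(\iota \otimes E_\alpha)(x)\|_{\mathrm{bin}} \le \|x\|_{\mathrm{bin}} , \qquad x \in N \odot M , \]
so $(\iota \otimes E_\alpha)|_{N \odot M}$ extends to a linear map of norm 1 from $N \otimes_{\mathrm{bin}} M$ to itself, still denoted by $\iota \otimes E_\alpha$, whose range is contained in the closure of $N \odot M^\alpha$ in $N \otimes_{\mathrm{bin}} M$.

Let $x \in N \otimes_{\mathrm{bin}} M$ and choose $x_n \in N \odot M$, $n \ge 1$, with $\|x - x_n\|_{\mathrm{bin}} \to 0$. Then, for all $\varphi \in F$,
\begin{align*}
\varphi((\iota_N \otimes E_\alpha)(x)) &= \lim_n \varphi((\iota_N \otimes E_\alpha)(x_n)) \\
  &= \lim_n \int \varphi(\beta_g(x_n))\,dg = \int \varphi(\beta_g(x))\,dg ,
\end{align*}
so
\[ E_\beta(x) = \int \beta_g(x)\,dg \]
exists in the $\sigma(N \otimes_{\mathrm{bin}} M, F)$-weak sense and $E_\beta(x) = (\iota_N \otimes E_\alpha)(x)$ for all $x \in N \odot M$. In particular, the range of $E_\beta$ is contained in the closure of $N \odot M^\alpha$. Moreover $E_\beta$ is faithful.

Now let $x \in J$ be a positive element. Since each $\beta_g$ leaves $J$ globally invariant, $\beta_g(x) \in J$ for all $g \in G$. By the Hahn–Banach theorem $E_\beta(x)$ belongs to the $\sigma(N \otimes_{\mathrm{bin}} M, F)$-closed linear span of $\{\beta_g(x) ;\ g \in G\}$, thus $E_\beta(x) \in J$ by Lemma 5.1. But $E_\beta(x)$ is also in the closure of $N \odot M^\alpha$ which, by the semidiscreteness of $M^\alpha$, intersects $J$ only in 0. Consequently $E_\beta(x) = 0$ and the faithfulness of $E_\beta$ yields $x = 0$. We conclude that $J$ is trivial and $M$ is semidiscrete, thus injective.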
Acknowledgments The research of this paper is supported in part by MIUR and GNAMPA-INDAM. We thank S. Woronowicz for a clarifying answer to a question and F. P. Boca for calling our attention to reference [2].
References

[1] S. Baaj and G. Skandalis, “Unitaires multiplicatifs et dualité pour les produits croisés des $C^*$-algèbres”, Ann. Scient. Éc. Norm. Sup. 4e série 26 (1993) 425–488.
[2] F. P. Boca, “Ergodic actions of compact matrix pseudogroups on $C^*$-algebras”, in Recent Advances in Operator Algebras, Astérisque 232 (1995) 93–109.
[3] M. D. Choi and E. Effros, “Nuclear $C^*$-algebras and injectivity; the general case”, Indiana Univ. Math. J. 26 (1977) 443–446.
[4] A. Connes, “Classification of injective factors”, Ann. Math. 104 (1976) 73–115.
[5] A. Connes, “On the equivalence between injectivity and semidiscreteness for operator algebras”, in Algèbres d’Opérateurs et leurs Applications en Physique Mathématique, ed. D. Kastler, Colloques Internationaux du CNRS 274 (1979) 107–112.
[6] S. Doplicher and J. E. Roberts, “Duals of compact Lie groups realized in the Cuntz algebras and their actions on $C^*$-algebras”, J. Funct. Anal. 74 (1987) 96–120.
[7] E. Effros and C. Lance, “Tensor product of operator algebras”, Adv. Math. 25 (1977) 1–34.
[8] R. Høegh-Krohn, M. B. Landstad and E. Størmer, “Compact ergodic groups of automorphisms”, Ann. Math. 114 (1981) 75–86.
[9] J. Kustermans and S. Vaes, “The operator algebra approach to quantum groups”, Proc. Natl. Acad. Sci. 97(2) (2000) 547–552.
[10] T. Masuda and Y. Nakagami, “A von Neumann algebra framework for the duality of the quantum groups”, Publ. Res. Inst. Math. Sci. 30(5) (1994) 799–850.
[11] Ş. Strătilă, D. V. Voiculescu and L. Zsidó, “On crossed products I”, Revue Roum. Math. Pures Appl. 21 (1976) 1411–1449.
[12] A. Van Daele, “The Haar measure on a compact quantum pseudogroup”, Proc. Amer. Math. Soc. 123 (1995) 3125–3128.
[13] A. Van Daele and S. Wang, “Universal quantum groups”, Int. J. Math. 7(2) (1996) 255–264.
[14] S. Wang, “Ergodic actions of universal quantum groups on operator algebras”, Commun. Math. Phys. 203 (1999) 481–498.
[15] S. Woronowicz, “Compact matrix pseudogroups”, Commun. Math. Phys. 111 (1987) 613–665.
[16] S. Woronowicz, “Compact quantum groups”, in Symétries Quantiques, eds. A. Connes, K. Gawędzki and J. Zinn-Justin, Les Houches, Session LXIV (1 Août – 8 Sept. 1995), Elsevier Science, 1998.
August 22, 2002 9:43 WSPC/148-RMP
00140
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 797–828 © World Scientific Publishing Company
QUANTUM ‘ax + b’ GROUP
S. L. WORONOWICZ∗ and S. ZAKRZEWSKI
Department of Mathematical Methods in Physics, Faculty of Physics, University of Warsaw, Hoża 74, 00-682 Warszawa, Poland
∗[email protected]

Received 10 June 2002

Dedicated to Professor Huzihiro Araki on his 70th birthday

‘ax + b’ is the group of affine transformations of the real line $\mathbb{R}$. In quantum version $ab = q^2 ba$, where $q^2 = e^{-i\hbar}$ is a number of modulus 1. The main problem of constructing quantum deformation of this group on the $C^*$-level consists in non-selfadjointness of $\Delta(b) = a \otimes b + b \otimes I$. This problem is overcome by introducing (in addition to $a$ and $b$) a new generator $\beta$ commuting with $a$ and anticommuting with $b$. $\beta$ (or more precisely $\beta \otimes \beta$) is used to select a suitable selfadjoint extension of $a \otimes b + b \otimes I$. Furthermore we have to assume that $\hbar = \pm\frac{\pi}{2k+3}$, where $k = 0, 1, 2, \ldots$. In this case, $q$ is a root of 1. To construct the group, we write an explicit formula for the Kac–Takesaki operator $W$. It is shown that $W$ is a manageable multiplicative unitary in the sense of [3, 19]. Then using the general theory we construct a $C^*$-algebra $A$ and a comultiplication $\Delta \in \mathrm{Mor}(A, A \otimes A)$. $A$ should be interpreted as the algebra of all continuous functions vanishing at infinity on quantum ‘ax + b’-group. The group structure is encoded by $\Delta$. The existence of coinverse also follows from the general theory [19].

Keywords: Locally compact quantum groups; manageable multiplicative unitary; quantum groups at roots of unity.
0. Introduction (written by Woronowicz)

The research of this paper was proposed and originated by S. Zakrzewski at the end of 1997. Working within the semiclassical framework (Poisson–Lie groups, symplectic leaves, Manin pairs, symplectic groupoids), he gained a deep understanding of how certain incompletenesses on the semiclassical level are reflected in an attempt to construct the corresponding quantum group on the $C^*$-level. This paper was supposed to contain a number of sections devoted to this framework. We planned to explain in detail how the semiclassical considerations lead in a natural way to a concept of reflection operator used on the $C^*$-level. It was Stanisław Zakrzewski who was supposed to write these sections. Unfortunately after Zakrzewski’s sudden death in April 1998, the first author was unable to reconstruct this part of the paper.

In the construction of the quantum deformation of the ‘ax + b’ group on the Hilbert space level one meets the following two problems. First one has to give
meaning to the relation ‘$ab = q^2 ba$’, where $a$, $b$ are selfadjoint operators acting on a Hilbert space and $q^2$ is a number of modulus 1. This problem was considered by many authors. Assume for the moment that $a$ and $b$ are strictly positive. In [10, 11], K. Schmüdgen proposed to rewrite ‘$ab = q^2 ba$’ in the Weyl form: $a^{it} b^{i\tau} = e^{i\hbar t\tau} b^{i\tau} a^{it}$, where $\hbar$ is a real number such that $q^2 = e^{-i\hbar}$ and $t$, $\tau$ are variables running over $\mathbb{R}$. We shall use this formula in the form: $a^{it} b a^{-it} = e^{\hbar t} b$, which is meaningful for any selfadjoint $b$. The original relation ‘$ab = q^2 ba$’ is recovered by analytic continuation up to the point $t = -i$.

The second problem is related to the formula $\Delta(b) = a \otimes b + b \otimes I$. Since the comultiplication $\Delta$ is a $C^*$-algebra morphism, we expect that $\Delta(b)$ has the same analytical properties as $b$. In particular $a \otimes b + b \otimes I$ should be selfadjoint. However this is not guaranteed and we have to use the theory of selfadjoint extension developed in [20].

We would like to make a short comment on the quantization of $\hbar$. It comes from the formula (6.5) of [20], where the constant $\alpha = i e^{\frac{i\pi^2}{2\hbar}}$ enters in an implicit way. The point is that this formula essentially simplifies when $\alpha = \bar\alpha$. Solving this condition we obtain $\hbar = \frac{\pi}{2k+3}$, where $k$ is an integer. The theory presented in this paper works only for these values of $\hbar$. It follows that $q^2$ is a root of unity: $q^{2(2k+3)} = -1$. There is another version of quantum ‘ax + b’ group for which $\hbar = \frac{\pi}{2k}$ ($k$-integer). It will be described in a separate paper [8].

The quantization of $\hbar$ seems to be of analytical nature. In particular the semiclassical theory developed by the second author does not imply any limitation of this sort.

A few words about the content of the paper. In Sec. 1 we present the quantum ‘ax + b’ group on the Hopf $*$-algebra level. Next we outline the passage to the Hilbert space and $C^*$-levels. To solve the selfadjointness problem arising on the way we have to extend our group by adding a new generator $\beta$ called the reflection operator. The three operators $a$, $b$, $\beta$ are subject to suitable commutation relations. This section ends with a short description of the quantum ‘ax + b’ group on the Hilbert space and $C^*$-levels.

To construct ‘ax + b’ we shall use the theory of multiplicative unitaries of Baaj and Skandalis [3, 19]. In Sec. 2 we consider a unitary operator $W$ acting on the tensor square of a Hilbert space $H$: $W \in B(H \otimes H)$. It is introduced by an explicit formula containing four selfadjoint operators: $a$, $b$, $\beta$, $s$ acting on $H$. The first three operators are subject to the commutation relations introduced in Sec. 1. The main result of Sec. 2 is Theorem 2.1, which states that $W$ is a manageable multiplicative unitary. The proof of Theorem 2.1 is based on the Fourier transform formula (1.41) of [20].

Once we have a manageable multiplicative unitary $W$, we apply the theory developed in [3, 19] to construct a quantum group. This is done in Secs. 3 and 4. In Sec. 3 we introduce the $C^*$-algebra $A_{cp}$ generated by three elements $a$, $b$, $\beta$ subject to the commutation relations considered in Sec. 1. By definition, $A_{cp}$ is the crossed product: $A_{cp} = B_0 \times_\sigma \mathbb{R}$, where $B_0$ is an algebra of continuous $M_{2\times 2}(\mathbb{C})$-valued
functions on $\mathbb{R}_+$ and $\sigma$ is a natural action of $\mathbb{R}$ on $B_0$. We investigate in detail properties of $A_{cp}$. In particular an interesting action $\phi$ of $\mathbb{Z}_4$ on $A_{cp}$ is described at the end of this section.

In Sec. 4 we show that the crossed product algebra $A_{cp}$ coincides with the Baaj–Skandalis left-slice algebra related to the multiplicative unitary $W$ considered in Sec. 2. We compute that the comultiplication acts on generators $a$, $b$, $\beta$ in the way described in Sec. 1. In this way the construction of the quantum ‘ax + b’ group on the $C^*$-algebra level is completed. At the end of Sec. 4 we show that the action $\phi$ preserves the group structure of quantum ‘ax + b’.

Section 5 is devoted to the dual of the quantum ‘ax + b’ group. By the definition the regular dual is the quantum group related to the multiplicative unitary $\hat W = \Sigma W^* \Sigma$. We show that the regular dual of the quantum ‘ax + b’ group is isomorphic to the same group, provided that we reverse the order of the group rule. The same result holds for the universal (Pontryagin) dual. It is shown in [9] that in this case the regular dual and universal dual coincide.

This paper heavily depends on the results of [20]. In particular we shall use the quantum exponential function
\[
F_\hbar(r, \varrho) =
\begin{cases}
V_\theta(\log r) & \text{for } r > 0 \text{ and } \varrho = 0 , \\
\left[ 1 + i\varrho |r|^{\frac{\pi}{\hbar}} \right] V_\theta(\log |r| - \pi i) & \text{for } r < 0 \text{ and } \varrho = \pm 1 ,
\end{cases}
\tag{0.1}
\]
where $\theta = \frac{2\pi}{\hbar}$ and $V_\theta$ is a meromorphic function on $\mathbb{C}$ such that
\[
V_\theta(x) = \exp\left\{ \frac{1}{2\pi i} \int_0^\infty \log(1 + a^{-\theta}) \frac{da}{a + e^{-x}} \right\}
\tag{0.2}
\]
for all $x \in \mathbb{C}$ such that $|\Im x| < \pi$.

We shall also use the theory of selfadjoint operators on Hilbert spaces [1, 6], in particular the theory of selfadjoint extensions and functional calculus of many strongly commuting selfadjoint operators. Throughout the paper the symbol $\chi(R)$ denotes the logical evaluation of sentence $R$: $\chi(R) = 1$ for true $R$ and $\chi(R) = 0$ for false $R$. The sentence $R$ may depend on a selfadjoint operator (or a pair of strongly commuting selfadjoint operators). Then $\chi(R)$ is the corresponding spectral projection. The range of $\chi(R)$ will be denoted by $H(R)$, where $H$ stands for the Hilbert space on which the operator acts (for details, see the last part of Sec. 0 of [20]).

We refer to [2, 7] for the theory of $C^*$-algebras. We shall freely use such notions as: multiplier algebra $M(A)$ of a $C^*$-algebra $A$, unbounded elements affiliated with a $C^*$-algebra $A$, the set $\mathrm{Mor}(A, B)$ of all morphisms from $A$ into $B$, a $C^*$-algebra generated by a set of affiliated elements and so on. All these notions are presented in [16–18].

In this paper we use the physicists’ conventions concerning Hilbert spaces. In particular the scalar product $(x|y)$ is by definition linear with respect to $y$. We shall also use the triple product $(x|a|y)$ to denote $(x|ay)$. When vectors $x$, $y$ and
operator $a$ are themselves complicated expressions, then $(x|a|y)$ is more readable than $(x|ay)$. Formula (2.25) is a good example of this situation.

We would like to point out the further development of the subject. In what follows, $G$ denotes the quantum ‘ax + b’-group constructed in this paper. A. Van Daele [15] has found left and right invariant Haar weights on $G$. He has shown that $G$ is a locally compact quantum group in the sense of Kustermans and Vaes [4]. It turned out that the Haar weights are scaled by the scaling group in a nontrivial way. This is one of the first examples of this phenomenon. It was foreseen by the theory of Kustermans and Vaes, however some of the experts believed that in the proper theory the Haar weights should be invariant with respect to the scaling group.

Using the nontrivial scaling of the Haar weights, S. Vaes and L. Vainerman have shown [14] that $G$ is essentially different from the quantum deformation of classical ‘ax + b’ proposed by Baaj and Skandalis [13].

In [9], M. Rowicka has shown that all unitary representations of $G$ acting on a Hilbert space $K$ are described by the formula (2.6).

The quantum group ‘ax + b’ described in the present paper will be used as a building block in future constructions of higher-dimensional quantum groups. We refer to [22], where quantum deformations of $SL(2, \mathbb{R})$ are presented.

For a long time quantum groups at roots of unity seemed to be inaccessible for the $C^*$-approach. The present paper is one of the first successful attempts to include these groups into the theory of locally compact quantum groups. Another example of this kind is given in [21].

1. First Encounter with ‘ax + b’-Group

The group ‘ax + b’ considered in this paper is the group of affine transformations of real line $\mathbb{R}$ preserving the orientation (in the transformation formula $x' = ax + b$ the coefficient $a$ is strictly positive). The group will be denoted by $G$.

The $*$-algebra $\mathcal{A}$ of polynomial functions on $G$ is generated by three hermitian commuting elements $a$, $a^{-1}$, $b$ subject to the relation: $a^{-1} a = I$. The comultiplication $\Delta$ encoding the group structure is the $*$-algebra homomorphism from $\mathcal{A}$ into $\mathcal{A} \otimes \mathcal{A}$ such that
\[ \Delta(a) = a \otimes a , \qquad \Delta(b) = a \otimes b + b \otimes I . \tag{1.1} \]
One can easily verify that $(\mathcal{A}, \Delta)$ is a Hopf $*$-algebra. In particular counit $e$ and coinverse $\kappa$ are given by the formulae:
\[
\begin{array}{ll}
e(a) = 1 , & \kappa(a) = a^{-1} , \\
e(b) = 0 , & \kappa(b) = -a^{-1} b .
\end{array}
\tag{1.2}
\]
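The claim that (1.1) and (1.2) define a Hopf $*$-algebra can be checked directly on the generator $b$. The following worked computation is a sketch for the classical (commutative) case; $m$ denotes the multiplication map and $\mathrm{id}$ the identity map, notation introduced here for illustration only.

```latex
% Coassociativity on b, using \Delta(a) = a\otimes a and \Delta(b) = a\otimes b + b\otimes I:
\begin{align*}
(\Delta\otimes\mathrm{id})\Delta(b)
  &= \Delta(a)\otimes b + \Delta(b)\otimes I
   = a\otimes a\otimes b + a\otimes b\otimes I + b\otimes I\otimes I, \\
(\mathrm{id}\otimes\Delta)\Delta(b)
  &= a\otimes\Delta(b) + b\otimes\Delta(I)
   = a\otimes a\otimes b + a\otimes b\otimes I + b\otimes I\otimes I .
\end{align*}
% Counit and coinverse axioms on b:
\begin{align*}
(e\otimes\mathrm{id})\Delta(b) &= e(a)\,b + e(b)\,I = b, \\
m(\kappa\otimes\mathrm{id})\Delta(b) &= \kappa(a)\,b + \kappa(b)\,I
   = a^{-1}b - a^{-1}b = 0 = e(b)\,I .
\end{align*}
```

The corresponding checks on $a$ are immediate, since $\Delta(a) = a \otimes a$ is group-like.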
Now we perform quantum deformation of $G$. The quantum ‘ax + b’-group on the level of Hopf $*$-algebra is an object with no problems. The deformation parameter $q$ is a complex number of modulus 1. We shall assume that $q^2 \neq -1$. Then
\[ q^2 = e^{-i\hbar} , \tag{1.3} \]
where $\hbar$ is a real number such that $|\hbar| < \pi$. The change of sign of $\hbar$ is equivalent to the passage to the opposite algebra. Therefore we shall assume that $\hbar > 0$.

The Hopf $*$-algebra $\mathcal{A}$ of polynomial functions on quantum ‘ax + b’ is generated by three hermitian elements $a$, $a^{-1}$, $b$ subject to the following relations:
\[ a^{-1} a = a a^{-1} = I , \qquad ab = q^2 ba . \tag{1.4} \]
The comultiplication $\Delta : A \to A \otimes A$ is the $*$-algebra homomorphism acting on generators in the way described in (1.1). One can easily verify that the object $(A, \Delta)$ described above is a Hopf $*$-algebra. The counit $e$ and coinverse $\kappa$ are given by the same formulae (1.2) as in the classical case. Moreover the matrix
$$u = \begin{pmatrix} a & b \\ 0 & I \end{pmatrix}$$
is a corepresentation of $(A, \Delta)$. In other words, $u$ is a two dimensional representation of the quantum ‘ax + b’-group.

On the Hilbert space level, the generators $a$, $a^{-1}$ and $b$ should be treated as unbounded$^{a}$ selfadjoint operators acting on a Hilbert space. Since for unbounded operators the algebraic operations are often ill defined, one has to give a more precise meaning to the formulae (1.4). In the operator setting the equation $a^{-1}a = aa^{-1} = I$ simply means that $a^{-1}$ is the inverse of $a$. Furthermore we shall assume that $a$ is positive. This condition is obviously related to the fact that the corresponding classical group consists of transformations preserving the orientation of $\mathbb{R}$. Let $\hbar$ be the number related to the deformation parameter $q$ via formula (1.3). To give a precise meaning to the second relation of (1.4), we shall use the following definition:

Definition 1.1. Let $a$ and $b$ be selfadjoint operators acting on a Hilbert space $H$. Assume that $a$ is strictly positive. We write $a \multimap b$ if
$$a^{it}\, b\, a^{-it} = e^{\hbar t}\, b \tag{1.5}$$
for any $t \in \mathbb{R}$. A more general definition of the relation $a \multimap b$ is given in [20]. It does not require any additional assumption on $a$.

Inserting $t = -i$ in (1.5) and using (1.3) we obtain the second relation of (1.4). The reader should notice that the condition (1.4) is much weaker than the relation $a \multimap b$. For example, (1.4) remains unchanged when $\hbar$ is replaced by $\hbar + 2\pi$, whereas (1.5) is very sensitive to the choice of $\hbar$ solving Eq. (1.3). Recall that we chose $\hbar$ such that $|\hbar| < \pi$.

Let $H$ be a Hilbert space and $a$, $b$ be operators acting on $H$. We say that $(a, b)$ is a $G$-pair if
$$\left.\begin{array}{l} a,\ b \text{ are selfadjoint operators on } H, \\ a \text{ is strictly positive},\quad a \multimap b. \end{array}\right\} \tag{1.6}$$

$^{a}$ One can easily check that relations (1.4) cannot be satisfied by bounded operators $a$ and $b \neq 0$.
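A concrete $G$-pair can be sketched numerically. The model below is an assumption chosen for illustration and is not spelled out in the text: on functions of a real variable $x$, take $b$ to be multiplication by $e^{x}$ and $a = e^{\hbar p}$ with $p = -i\,d/dx$, so that $a^{it}$ acts as the shift $f(x) \mapsto f(x + \hbar t)$. The code checks relation (1.5) pointwise on a sample function; the value of `HBAR` and all function names are hypothetical.

```python
import math

HBAR = math.pi / 3  # sample value with 0 < hbar < pi (assumption)

def a_power_it(t, f):
    """(a^{it} f)(x) = f(x + hbar * t) in the assumed shift model."""
    return lambda x: f(x + HBAR * t)

def b(f):
    """b acts as multiplication by e^x (a positive operator in this model)."""
    return lambda x: math.exp(x) * f(x)

def f(x):
    """A rapidly decaying sample function."""
    return math.exp(-x * x)

def lhs(t, x):
    """(a^{it} b a^{-it} f)(x)."""
    return a_power_it(t, b(a_power_it(-t, f)))(x)

def rhs(t, x):
    """(e^{hbar t} b f)(x)."""
    return math.exp(HBAR * t) * b(f)(x)

samples = [(t, x) for t in (-1.5, 0.0, 0.7, 2.0) for x in (-1.0, 0.0, 0.4, 1.3)]
relation_holds = all(
    math.isclose(lhs(t, x), rhs(t, x), rel_tol=1e-9) for t, x in samples
)
```

In this model $b$ is positive; a general $G$-pair also allows $b$ to have a negative part, which is where the difficulties discussed in the sequel originate.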
August 22, 2002 9:43 WSPC/148-RMP
802
00140
S. L. Woronowicz & S. Zakrzewski
We shall use the terminology introduced in [18]. By the procedure described in [18, Sec. 7], relations (1.6) give rise to a $C^*$-algebra $A$. This $C^*$-algebra is generated by two unbounded elements $\log a$, $b$ affiliated with it, and
$$\pi \leftrightarrow (\pi(a), \pi(b)) \tag{1.7}$$
defines a continuous one to one correspondence between the set $\mathrm{Rep}(A, H)$ of all representations of $A$ acting on a Hilbert space $H$ and the set of all $G$-pairs acting on $H$.

Assume now that $A$ is equipped with a comultiplication $\Delta \in \mathrm{Mor}(A, A \otimes A)$ such that (1.1) holds. Then for any $\pi_1 \in \mathrm{Rep}(A, H_1)$ and $\pi_2 \in \mathrm{Rep}(A, H_2)$, one may consider the tensor product:
$$\pi_1 \,Ⓣ\, \pi_2 = (\pi_1 \otimes \pi_2) \circ \Delta.$$
Clearly $\pi_1 \,Ⓣ\, \pi_2 \in \mathrm{Rep}(A, H_1 \otimes H_2)$. Using the one to one correspondence (1.7), we may define the tensor product for $G$-pairs. If $(a_1, b_1)$ is a $G$-pair acting on a Hilbert space $H_1$ and $(a_2, b_2)$ is a $G$-pair acting on a Hilbert space $H_2$, then by virtue of (1.1):
$$(a_1, b_1) \,Ⓣ\, (a_2, b_2) = (\tilde a, \tilde b),$$
where
$$\tilde a = a_1 \otimes a_2, \qquad \tilde b = a_1 \otimes b_2 + b_1 \otimes I. \tag{1.8}$$
One expects that $(\tilde a, \tilde b)$ is a $G$-pair acting on $H_1 \otimes H_2$. Unfortunately this is not always the case. It turns out that the operator $\tilde b$ is symmetric but not selfadjoint in general (cf. [20, Theorem 5.4]). This is a serious obstacle in constructing the quantum ‘ax + b’-group on the $C^*$-level.

One may try to overcome this problem by extending $\tilde b$ to a larger domain. Let $R = a_1 \otimes b_2$ and $S = b_1 \otimes I$. Then $R \multimap S$ and $\tilde b = R + S$. By the theory developed in [20], selfadjoint extensions of $R + S$ are determined by reflection operators $\tau$ such that $\tau^* = \tau$, $\tau$ anticommutes with $R$ and $S$ and $\tau^2 = \chi(e^{i\hbar/2} RS < 0)$. The selfadjoint extension of $R + S$ corresponding to a reflection operator $\tau$ will be denoted by $[R + S]_\tau$. By definition, $[R + S]_\tau$ is the restriction of $(R + S)^*$ to the domain $D(R + S) + \{x \in D((R + S)^*) : \tau x = x\}$.

For given $R$ and $S$, the existence of a reflection operator is not guaranteed ($R + S$ may have no selfadjoint extensions). To assure the existence of $\tau$ in our setting we have to extend our scheme. Instead of $G$-pairs, we have to consider $G$-triples. Let $a$, $b$, $\beta$ be operators acting on a Hilbert space $H$. We say that $(a, b, \beta)$ is a $G$-triple if
$$\left.\begin{array}{l} a,\ b,\ \beta \text{ are selfadjoint operators on } H, \\ a \text{ is strictly positive},\quad a \multimap b, \\ \beta^2 = \chi(b \neq 0),\quad \beta a = a\beta \ \text{ and } \ \beta b = -b\beta. \end{array}\right\} \tag{1.9}$$
The set of all $G$-triples acting on a Hilbert space $H$ will be denoted by $G_H$. Passing from $G$-pairs to $G$-triples means that we extended our group by adding a new element $\beta$ to the set of generators of the algebra of functions on $G$. The extended group is in a sense two times bigger than the original one. In what follows, the term “quantum ‘ax + b’-group” will refer to the extended group (we shall omit the word ‘extended’ as we did in the title of this paper).

Let $H_1$ and $H_2$ be Hilbert spaces. We would like to introduce the ‘Ⓣ’ product of $G$-triples: for any $(a_1, b_1, \beta_1) \in G_{H_1}$ and $(a_2, b_2, \beta_2) \in G_{H_2}$,
$$(a_1, b_1, \beta_1) \,Ⓣ\, (a_2, b_2, \beta_2) = (\tilde a, \tilde b, \tilde\beta) \in G_{H_1 \otimes H_2}. \tag{1.10}$$
The first formula of (1.8) may be kept unchanged. To modify the second formula we choose $\alpha = \pm 1$. Then the operator $\tau = \alpha(\beta_1 \otimes \beta_2)\chi(b_1 \otimes b_2 < 0)$ is a reflection operator defining a selfadjoint extension of $a_1 \otimes b_2 + b_1 \otimes I$. Let
$$\tilde b = [a_1 \otimes b_2 + b_1 \otimes I]_\tau. \tag{1.11}$$
To end the definition of the ‘Ⓣ’ product (1.10), we have to write a formula for $\tilde\beta$. The simplest proposal is:
$$\tilde\beta = \beta_1 \otimes \beta_2. \tag{1.12}$$
However this formula is not correct. It leads to a tensor product (1.10) which is not associative, which contradicts the coassociativity of the comultiplication $\Delta$. Also the computations performed by the second author within the theory of Poisson–Lie groupoids indicated that the correct formula for $\tilde\beta$ should be linear rather than quadratic with respect to $\beta$.

To find the correct replacement for (1.12) we shall use the theory of the quantum exponential function developed in [20]. In particular the exponential equality (cf. [20, formula (6.5)]) will play an essential role. It implicitly contains a phase factor $\alpha$ related to the deformation parameter $\hbar$ by the formula:
$$\alpha = i\, e^{\frac{i\pi^2}{2\hbar}}. \tag{1.13}$$
The theory presented in this paper works only if this number coincides with the one used in the definition of the reflection operator appearing in (1.11). Now the condition $\alpha = \pm 1$ selects a discrete set of admissible values of the deformation parameter:
$$\hbar = \frac{\pi}{2k + 3}, \quad \text{where } k = 0, 1, 2, \ldots.$$
Clearly $\alpha = (-1)^k$.

The correct formula replacing (1.12) is rather complicated:
$$\tilde\beta = w(e^{i\hbar/2}\, b_1^{-1} a_1 \otimes b_2)^{-1} (\beta_1 \otimes I) + (I \otimes \beta_2)\, w(e^{i\hbar/2}\, b_1 a_1^{-1} \otimes b_2^{-1})^{-1}, \tag{1.14}$$
where $w$ is the polynomial of order $(2k + 3)$ introduced by the formula:
$$w(t) = \prod_{\ell=1}^{2k+3} \left(1 + e^{i(\frac{1}{2} - \ell)\hbar}\, t\right). \tag{1.15}$$
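The arithmetic behind the admissible values of $\hbar$ is easy to check numerically. The sketch below evaluates (1.13) at $\hbar = \pi/(2k+3)$ and compares it with $(-1)^k$, confirms that $q^2 = e^{-i\hbar}$ is then a root of unity, and evaluates the polynomial (1.15); the function names are illustrative only.

```python
import cmath
import math

def hbar(k):
    """Admissible deformation parameter hbar = pi / (2k + 3)."""
    return math.pi / (2 * k + 3)

def alpha(h):
    """Phase factor (1.13): alpha = i * exp(i * pi^2 / (2 * hbar))."""
    return 1j * cmath.exp(1j * math.pi ** 2 / (2 * h))

def w(t, k):
    """Polynomial (1.15) of order 2k + 3, evaluated at a complex number t."""
    h = hbar(k)
    result = 1
    for ell in range(1, 2 * k + 4):
        result *= 1 + cmath.exp(1j * (0.5 - ell) * h) * t
    return result

# alpha = (-1)^k on the admissible set.
alpha_matches = all(abs(alpha(hbar(k)) - (-1) ** k) < 1e-9 for k in range(8))

# q^2 = exp(-i hbar) satisfies (q^2)^(2(2k+3)) = 1, so q is a root of unity.
root_of_unity = all(
    abs(cmath.exp(-1j * hbar(k)) ** (2 * (2 * k + 3)) - 1) < 1e-9 for k in range(8)
)

# For a generic hbar the phase alpha is not real.
generic_alpha = alpha(1.0)
```

The underlying identity is $\pi^2/(2\hbar) = (2k+3)\pi/2$, so that $\alpha = i \cdot i^{2k+3} = (-1)^k$.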
This completes the definition of the tensor product (1.10). It will be shown that the triple on the right hand side of (1.10) really belongs to $G_{H_1 \otimes H_2}$ and that the ‘Ⓣ’ product is associative.
We end this section with a short description of the quantum ‘ax + b’ group on the $C^*$-level. The $C^*$-algebra $A$ of all ‘continuous functions vanishing at infinity on $G$’ is generated (in the sense explained in [18]) by three selfadjoint affiliated elements: $\log a$, $b$ and $i\beta b$. The element $\beta$ is not affiliated with $A$. It corresponds to a ‘non-continuous function’ on the group. It becomes continuous when we remove the manifold $b = 0$ from $G$. More precisely $\beta \in M(A_{b=0})$, where $A_{b=0}$ is the ideal of $A$ generated by $b$. The comultiplication $\Delta \in \mathrm{Mor}(A, A \otimes A)$ is associative. On generators it acts in the following way:
$$\Delta(a) = a \otimes a,$$
$$\Delta(b) = [a \otimes b + b \otimes I]_{\alpha(\beta \otimes \beta)\chi(b \otimes b < 0)},$$
$$\Delta(i\beta b) = i\left\{ w(e^{i\hbar/2}\, b^{-1}a \otimes b)^{-1}(\beta \otimes I) + (I \otimes \beta)\, w(e^{i\hbar/2}\, ba^{-1} \otimes b^{-1})^{-1} \right\} \Delta(b).$$
It would be interesting to investigate how the above objects and formulae behave when $\hbar \to 0$. This subject is not discussed in the present paper.

2. The Kac–Takesaki Operator

The theory of Baaj and Skandalis provides us with a powerful tool for constructing quantum groups on the $C^*$-level. Let $H$ be a Hilbert space and $W \in B(H \otimes H)$ a unitary operator. We shall use the leg numbering notation: $W_{kl}$ is a copy of $W$ acting on $H \otimes H \otimes H$, affecting only the $k$th and $l$th copy of $H$ in $H \otimes H \otimes H$. According to [3], $W$ is called a multiplicative unitary if it satisfies the pentagon equation:
$$W_{23} W_{12} = W_{12} W_{13} W_{23}. \tag{2.1}$$
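The pentagon equation (2.1) can be illustrated on a finite toy example that is not taken from the paper: for a finite group $\Gamma$, the operator on $\ell^2(\Gamma) \otimes \ell^2(\Gamma)$ permuting basis vectors by $e_g \otimes e_h \mapsto e_g \otimes e_{gh}$ is unitary and satisfies (2.1). The sketch below, with $\Gamma = S_3$ as an assumed example, tracks only the labels of basis vectors, which suffices because the operator permutes the basis.

```python
from itertools import permutations, product

# The symmetric group S_3, each element stored as a tuple (g(0), g(1), g(2)).
GROUP = list(permutations(range(3)))

def mul(g, h):
    """Composition of permutations: (g * h)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(3))

def w(g, h):
    """Labels of W(e_g (x) e_h) = e_g (x) e_{gh}."""
    return g, mul(g, h)

def w12(v):
    g, h, k = v
    g, h = w(g, h)
    return (g, h, k)

def w23(v):
    g, h, k = v
    h, k = w(h, k)
    return (g, h, k)

def w13(v):
    g, h, k = v
    g, k = w(g, k)
    return (g, h, k)

def pentagon_holds():
    """Check W23 W12 = W12 W13 W23 on every basis vector (rightmost factor acts first)."""
    return all(
        w23(w12(v)) == w12(w13(w23(v))) for v in product(GROUP, repeat=3)
    )

def is_permutation_of_basis():
    """W maps the basis of the two-fold tensor product bijectively onto itself."""
    images = {w(g, h) for g, h in product(GROUP, repeat=2)}
    return len(images) == len(GROUP) ** 2
```

Both sides send $e_g \otimes e_h \otimes e_k$ to $e_g \otimes e_{gh} \otimes e_{ghk}$, so in this toy case the pentagon equation is exactly associativity of the group law.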
Let $\bar H$ be the complex conjugate of $H$. For any $x \in H$, the corresponding element of $\bar H$ will be denoted by $\bar x$. Then $H \ni x \to \bar x \in \bar H$ is an antiunitary map. We say that a multiplicative unitary $W$ is manageable [19] if there exist a positive selfadjoint operator $Q$ acting on $H$ and a unitary operator $\tilde W$ acting on $\bar H \otimes H$ such that $\ker(Q) = \{0\}$,
$$W^*(Q \otimes Q)W = Q \otimes Q \tag{2.2}$$
and
$$(x \otimes u|W|z \otimes y) = (\bar z \otimes Qu|\tilde W|\bar x \otimes Q^{-1}y) \tag{2.3}$$
for any $x, z \in H$, $y \in D(Q^{-1})$ and $u \in D(Q)$. As is shown in [19], any manageable multiplicative unitary gives rise to a quantum group on the $C^*$-level.

Throughout this section we assume that the deformation parameter $q^2 = e^{-i\hbar}$, where $\hbar = \pm\frac{\pi}{2k+3}$, $k = 0, 1, 2, \ldots$. Then the constant
$$\alpha = i\, e^{\frac{i\pi^2}{2\hbar}} = (-1)^k = \pm 1. \tag{2.4}$$
The main result of this section is

Theorem 2.1. Let $H$ be a Hilbert space, $(a, b, \beta) \in G_H$ and $r$, $s$ be strictly positive selfadjoint operators acting on $H$. Assume that $\ker b = \{0\}$, $r$ and $s$ strongly commute with $a$, $b$ and $\beta$ and $r \multimap s$. Then the operator
$$W = F_\hbar\!\left(e^{i\hbar/2}\, b^{-1}a \otimes b,\ \alpha(\beta \otimes \beta)\chi(b \otimes b < 0)\right)^{*} e^{\frac{i}{\hbar} \log(s|b|^{-1}) \otimes \log a} \tag{2.5}$$
is a manageable multiplicative unitary.

The pentagon equation for (2.5) will follow from

Proposition 2.2. Let $H$ and $K$ be Hilbert spaces, $(a, b, \beta) \in G_H$, $(\hat a, \hat b, \hat\beta) \in G_K$ and $s$ be a strictly positive selfadjoint operator acting on $H$. Assume that $\ker b = \{0\}$ and $s$ strongly commutes with $a$, $b$ and $\beta$. Then the operators (2.5) and
$$V = F_\hbar\!\left(\hat b \otimes b,\ \alpha(\hat\beta \otimes \beta)\chi(\hat b \otimes b < 0)\right)^{*} e^{\frac{i}{\hbar} \log \hat a \otimes \log a} \tag{2.6}$$
satisfy the pentagon equation:
$$W_{23} V_{12} = V_{12} V_{13} W_{23}. \tag{2.7}$$
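Proof. We shall consider the following selfadjoint operators acting on $K \otimes H \otimes H$:
$$R = \hat b \otimes a \otimes b, \qquad \rho = \alpha(\hat\beta \otimes I \otimes \beta)\chi(\hat b \otimes I \otimes b < 0),$$
$$S = \hat b \otimes b \otimes I, \qquad \sigma = \alpha(\hat\beta \otimes \beta \otimes I)\chi(\hat b \otimes b \otimes I < 0),$$
$$T = I \otimes e^{i\hbar/2}\, b^{-1}a \otimes b, \qquad \tau = \alpha(I \otimes \beta \otimes \beta)\chi(I \otimes b \otimes b < 0).$$
One can easily verify that these operators satisfy the assumptions of [20, Theorem 6.1]. Therefore
$$F_\hbar(R, \rho)F_\hbar(S, \sigma) = F_\hbar(T, \tau)^* F_\hbar(S, \sigma)F_\hbar(T, \tau).$$
Rearranging this formula we obtain:
$$F_\hbar(T, \tau)^* F_\hbar(S, \sigma)^* = F_\hbar(S, \sigma)^* F_\hbar(R, \rho)^* F_\hbar(T, \tau)^*. \tag{2.8}$$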
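Using the leg numbering notation we get:
$$X_{23} Y_{12} = Y_{12} \tilde Y X_{23}, \tag{2.9}$$
where
$$X = F_\hbar\!\left(e^{i\hbar/2}\, b^{-1}a \otimes b,\ \alpha(\beta \otimes \beta)\chi(b \otimes b < 0)\right)^{*},$$
$$Y = F_\hbar\!\left(\hat b \otimes b,\ \alpha(\hat\beta \otimes \beta)\chi(\hat b \otimes b < 0)\right)^{*},$$
$$\tilde Y = F_\hbar\!\left(\hat b \otimes a \otimes b,\ \alpha(\hat\beta \otimes I \otimes \beta)\chi(\hat b \otimes I \otimes b < 0)\right)^{*}.$$
We notice that replacing $a$ by $I$ in the right hand side of the third formula we obtain $Y_{13}$. With this small modification, (2.9) coincides with the pentagon equation of Baaj and Skandalis.

The operators $X$, $Y$ are the first factors appearing on the right hand side of definitions (2.5) and (2.6). Now we shall investigate the second factors:
$$U = e^{\frac{i}{\hbar} \log \hat a \otimes \log a}, \qquad Z = e^{\frac{i}{\hbar} \log(s|b|^{-1}) \otimes \log a}.$$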
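Using the relations $s|b|^{-1} \multimap a$ (which follows immediately from $a \multimap b$) and $\hat a \multimap \hat b$, one can easily verify that
$$Z(a \otimes I)Z^* = a \otimes a, \qquad U(\hat b \otimes I)U^* = \hat b \otimes a. \tag{2.10}$$
The first relation implies that
$$Z_{23} U_{12} Z_{23}^* = e^{\frac{i}{\hbar} \log \hat a \otimes \log(a \otimes a)} = e^{\frac{i}{\hbar} \log \hat a \otimes (\log a \otimes I + I \otimes \log a)} = U_{12} U_{13}. \tag{2.11}$$
The reader should notice that $U_{12}$ commutes with $(\hat\beta \otimes I \otimes \beta)\chi(\hat b \otimes I \otimes b < 0)$. Therefore the second relation of (2.10) implies that
$$U_{12} Y_{13} U_{12}^* = \tilde Y. \tag{2.12}$$
One can easily verify that $b$ and $\beta$ commute with $s|b|^{-1}$. Therefore $Y_{12}$ commutes with $Z_{23}$:
$$Y_{12} Z_{23} = Z_{23} Y_{12}. \tag{2.13}$$
Our assumptions imply that $e^{i\hbar/2}\, b^{-1}a \otimes b$, $\beta \otimes \beta$ and $\chi(b \otimes b < 0)$ commute with $a \otimes a$. Therefore $X$ commutes with $a \otimes a$. Taking into account (2.11) we obtain:
$$X_{23} U_{12} U_{13} = U_{12} U_{13} X_{23}. \tag{2.14}$$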
Now the proof of (2.7) is a matter of elementary computations. Remembering that $W = XZ$ and $V = YU$ and using (2.13), (2.9), (2.11), (2.14) and (2.12), we obtain:
$$\begin{aligned} W_{23} V_{12} &= X_{23} Z_{23} Y_{12} U_{12} = X_{23} Y_{12} Z_{23} U_{12} \\ &= Y_{12} \tilde Y X_{23} U_{12} U_{13} Z_{23} = Y_{12} \tilde Y U_{12} U_{13} X_{23} Z_{23} \\ &= Y_{12} U_{12} Y_{13} U_{13} X_{23} Z_{23} = V_{12} V_{13} W_{23}. \end{aligned}$$

Let $H$ be a Hilbert space, $(a, b, \beta) \in G_H$ and $s$ be a strictly positive selfadjoint operator acting on $H$. Assume that $\ker b = \{0\}$ and $s$ strongly commutes with $a$, $b$ and $\beta$. Then one can easily verify that $(s|b|^{-1}, e^{i\hbar/2}\, b^{-1}a, \beta) \in G_H$. For $K = H$, $\hat a = s|b|^{-1}$, $\hat b = e^{i\hbar/2}\, b^{-1}a$ and $\hat\beta = \beta$, the operator (2.6) coincides with (2.5) and using (2.7) we obtain (2.1). This shows that the operator $W$ introduced by (2.5) is a multiplicative unitary.

For any Hilbert space $K$ we denote by $\bar K$ the complex conjugate Hilbert space. Then we have the canonical antiunitary bijection:
$$K \ni x \to \bar x \in \bar K. \tag{2.15}$$
If $m$ is a closed operator acting on $K$, then its transpose $m^\top$ is introduced by the formula
$$m^\top \bar x = \overline{m^* x}$$
for any $x \in D(m^*)$. Clearly $m^\top$ is a closed operator acting on $\bar K$ with the domain $D(m^\top) = \{\bar x : x \in D(m^*)\}$. If $x \in D(m^*)$ and $z \in D(m)$, then
$$(\bar z|m^\top|\bar x) = (x|m|z). \tag{2.16}$$
Indeed: $(\bar z|m^\top \bar x) = (\bar z|\overline{m^* x}) = (m^* x|z) = (x|mz)$.

One can easily verify that the transposition commutes with the adjoint operation: $(m^*)^\top = (m^\top)^*$. So $m^\top$ is selfadjoint for selfadjoint $m$. Moreover the transposition reverses the order of multiplication: $(ab)^\top = b^\top a^\top$. Therefore $a \multimap b$ implies $b^\top \multimap a^\top$. If $\hat a$ is a selfadjoint operator on $K$ and $f$ is a bounded measurable function on $\mathrm{Sp}\, \hat a$, then $\mathrm{Sp}\, \hat a = \mathrm{Sp}\, \hat a^\top$ and $f(\hat a)^\top = f(\hat a^\top)$.

Let $\hat a$ and $a$ be selfadjoint operators acting on $K$ and $H$ respectively. Then $\hat a \otimes I$ and $I \otimes a$ are strongly commuting selfadjoint operators acting on $K \otimes H$. Their joint spectrum coincides with $\mathrm{Sp}\, \hat a \times \mathrm{Sp}\, a$. We have the following ‘partial transposition’ formula
$$(\bar z \otimes u|f(\hat a^\top \otimes I, I \otimes a)|\bar x \otimes y) = (x \otimes u|f(\hat a \otimes I, I \otimes a)|z \otimes y).$$
In this formula $x, z \in K$, $u, y \in H$ and $f(\cdot, \cdot)$ is a bounded measurable function on $\mathrm{Sp}\, \hat a \times \mathrm{Sp}\, a$. By linearity and continuity it is sufficient to prove this formula for functions of the form $f = f_1 \otimes f_2$, where $f_1$ and $f_2$ are functions of one variable. In this case the formula follows immediately from (2.16). We shall use the following particular case of the partial transposition formula:
$$(\bar z \otimes u|e^{\frac{i}{\hbar} \hat a^\top \otimes a}|\bar x \otimes y) = (x \otimes u|e^{\frac{i}{\hbar} \hat a \otimes a}|z \otimes y). \tag{2.17}$$
To prove the manageability of the multiplicative unitary (2.5), we shall use the following

Proposition 2.3. Let $H$ and $K$ be Hilbert spaces, $(a, b, \beta) \in G_H$ and $(\hat a, \hat b, \hat\beta) \in G_K$ and let $V$ be the unitary operator introduced by (2.6). Moreover let $Q$ be a strictly positive selfadjoint operator acting on $H$ such that $Q$ strongly commutes with $a$ and $\beta$ and $Q^2 \multimap b$. We set:
$$\tilde V = F_\hbar\!\left(-\hat b^\top \otimes e^{i\hbar/2}\, ba^{-1},\ -(\hat\beta^\top \otimes \beta)\chi(\hat b^\top \otimes b > 0)\right) e^{\frac{i}{\hbar} \log \hat a^\top \otimes \log a}. \tag{2.18}$$
Then $\tilde V$ is unitary and for any $x, z \in K$, $y \in D(Q^{-1})$, $u \in D(Q)$, we have:
$$(x \otimes u|V|z \otimes y) = (\bar z \otimes Qu|\tilde V|\bar x \otimes Q^{-1}y). \tag{2.19}$$

We remark that formula (2.7) shows that (2.6) is an adapted operator in the sense of [19, Definition 1.3]. Comparing (2.18) with Statement 5 of Theorem 1.6 of [19], one can easily find the unitary antipode $R$ of our quantum group. It acts on $a$, $b$, $\beta$ as follows:
$$a^R = a^{-1}, \qquad b^R = -e^{i\hbar/2}\, ba^{-1}, \qquad \beta^R = -\alpha\beta.$$
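Proof. To make our formulae shorter, we set:
$$U = e^{\frac{i}{\hbar} \log \hat a \otimes \log a}, \quad \tilde U = e^{\frac{i}{\hbar} \log \hat a^\top \otimes \log a}, \quad B = |\hat b \otimes b|, \quad \tilde B = |\hat b^\top \otimes e^{i\hbar/2}\, ba^{-1}|. \tag{2.20}$$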
If either $\hat b = 0$ or $b = 0$, then $V = U$, $\tilde V = \tilde U$ and (2.19) follows immediately from (2.17) (recall that $Q$ commutes with $a$). Therefore we may assume that $\ker \hat b = \{0\}$ and $\ker b = \{0\}$. In this case, by the spectral theorem $K = K_+ \oplus K_-$, $H = H_+ \oplus H_-$, where
$$K_+ = K(\hat b > 0), \quad K_- = K(\hat b < 0), \quad H_+ = H(b > 0), \quad H_- = H(b < 0).$$
For tensor products we have the decompositions:
$$K \otimes H = K_+ \otimes H_+ \oplus K_+ \otimes H_- \oplus K_- \otimes H_+ \oplus K_- \otimes H_-, \tag{2.21}$$
$$\bar K \otimes H = \bar K_+ \otimes H_+ \oplus \bar K_+ \otimes H_- \oplus \bar K_- \otimes H_+ \oplus \bar K_- \otimes H_-. \tag{2.22}$$
Operators $B$ and $U$ respect the decomposition (2.21), whereas $(\hat\beta \otimes \beta)\chi(\hat b \otimes b < 0)$ interchanges $K_+ \otimes H_-$ with $K_- \otimes H_+$ and kills $K_+ \otimes H_+$ and $K_- \otimes H_-$. For the same reason, $\tilde B$ and $\tilde U$ respect the decomposition (2.22), whereas $(\hat\beta^\top \otimes \beta)\chi(\hat b^\top \otimes b > 0)$ interchanges $\bar K_+ \otimes H_+$ with $\bar K_- \otimes H_-$ and kills $\bar K_+ \otimes H_-$ and $\bar K_- \otimes H_+$.
We may assume that $x \in K_{s_x}$, $u \in H_{s_u}$, $z \in K_{s_z}$, $y \in H_{s_y}$, where $s_x, s_u, s_z, s_y = +, -$. There are $2^4 = 16$ possible combinations of the signs. However a moment of reflection shows that for 10 combinations both sides of (2.19) vanish. The remaining combinations are:
$$(s_x, s_u, s_z, s_y) = \begin{cases} (+, +, +, +),\ (-, -, -, -) & \text{case 1}, \\ (+, -, +, -),\ (-, +, -, +) & \text{case 2}, \\ (+, -, -, +),\ (-, +, +, -) & \text{case 3}. \end{cases}$$
We have divided the six possibilities into three cases. Using formula (0.1) it is not difficult to show that Eq. (2.19) reduces to
$$(x \otimes u|V_\theta(\log B)^* U|z \otimes y) = (\bar z \otimes Qu|V_\theta(\log \tilde B - \pi i)\tilde U|\bar x \otimes Q^{-1}y) \tag{2.23}$$
and
$$(x \otimes u|V_\theta(\log B - \pi i)^* U|z \otimes y) = (\bar z \otimes Qu|V_\theta(\log \tilde B)\tilde U|\bar x \otimes Q^{-1}y) \tag{2.24}$$
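in cases 1 and 2 respectively. In case 3 we obtain a more complicated formula:
$$(x \otimes u|[i\alpha(\hat\beta \otimes \beta)B^{\frac{\pi}{\hbar}} V_\theta(\log B - \pi i)]^* U|z \otimes y) = (\bar z \otimes Qu|[-i(\hat\beta^\top \otimes \beta)\tilde B^{\frac{\pi}{\hbar}} V_\theta(\log \tilde B - \pi i)]\tilde U|\bar x \otimes Q^{-1}y).$$
Remembering that $\hat\beta$, $\beta$ are selfadjoint, $I \otimes \beta$ commutes with $B$ and using $(\hat\beta^\top)^* \bar z = \overline{\hat\beta z}$, we may rewrite the above equation in the following equivalent form:
$$(x \otimes u'|\alpha B^{\frac{\pi}{\hbar}} V_\theta(\log B - \pi i)^* U|z' \otimes y) = (\bar z' \otimes Qu'|\tilde B^{\frac{\pi}{\hbar}} V_\theta(\log \tilde B - \pi i)\tilde U|\bar x \otimes Q^{-1}y), \tag{2.25}$$
where $u' = \beta u$ and $z' = \hat\beta z$. To prove formulae (2.23), (2.24) and (2.25) we shall use the following

Proposition 2.4. Let $a$, $b$, $Q$ be selfadjoint operators acting on a Hilbert space $H$ and $\hat a$, $\hat b$ be selfadjoint operators acting on a Hilbert space $K$. Assume that $a$ and $Q$ are strictly positive, $\ker b = \{0\}$, $a \multimap b$, $Q$ strongly commutes with $a$ and $Q^2 \multimap b$. Assume also that $\hat a$ is strictly positive, $\ker \hat b = \{0\}$ and $\hat a \multimap \hat b$. Moreover, let $x, z \in H$, $y \in D(Q^{-1})$, $u \in D(Q)$ and for any $k \in \mathbb{R}$,
$$\varphi(k) = (x \otimes u|B^{ik} U|z \otimes y), \qquad \psi(k) = (\bar z \otimes Qu|\tilde B^{ik} \tilde U|\bar x \otimes Q^{-1}y), \tag{2.26}$$
where $B$, $U$, $\tilde B$ and $\tilde U$ are the operators introduced by (2.20). Then
$$\psi(k) = e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, \varphi(k) \tag{2.27}$$
for any $k \in \mathbb{R}$.

Proof. Relation $a \multimap b$ implies that $e^{i\hbar/2}|b|a^{-1}$ is selfadjoint and that
$$(e^{i\hbar/2}|b|a^{-1})^{ik} = e^{-\frac{i\hbar}{2}k^2}\, |b|^{ik} a^{-ik} \tag{2.28}$$
for any $k \in \mathbb{R}$ (cf. [20, formula (3.8)]).

Remembering that $Q$ strongly commutes with $a$ and $Q^2 \multimap b$, one can easily show that $I \otimes Q^2$ commutes with $\tilde U$ and $I \otimes Q^2 \multimap \tilde B$. Therefore $(I \otimes Q)\tilde B^{ik}\tilde U = e^{\hbar k/2}\, \tilde B^{ik}\tilde U(I \otimes Q)$ and
$$\psi(k) = e^{\hbar k/2}\, (\bar z \otimes u|\tilde B^{ik}\tilde U|\bar x \otimes y).$$
Taking into account (2.28) and using (2.17) in the third step we obtain:
$$\begin{aligned} \psi(k) &= e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, (\bar z \otimes u|(|\hat b^\top|^{ik} \otimes |b|^{ik} a^{-ik})\, e^{\frac{i}{\hbar} \log \hat a^\top \otimes \log a}|\bar x \otimes y) \\ &= e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, (\overline{|\hat b|^{ik} z} \otimes a^{ik}|b|^{-ik}u|\, e^{\frac{i}{\hbar} \log \hat a^\top \otimes \log a}|\bar x \otimes y) \\ &= e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, (x \otimes a^{ik}|b|^{-ik}u|\, e^{\frac{i}{\hbar} \log \hat a \otimes \log a}\,|\,|\hat b|^{ik} z \otimes y) \\ &= e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, (x \otimes u|(I \otimes |b|^{ik} a^{-ik})\, e^{\frac{i}{\hbar} \log \hat a \otimes \log a}\, (|\hat b|^{ik} \otimes I)|z \otimes y). \end{aligned}$$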
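Now, to prove (2.27) it is sufficient to show that
$$e^{\frac{i}{\hbar} \log \hat a \otimes \log a}\, (|\hat b|^{ik} \otimes I) = (|\hat b|^{ik} \otimes a^{ik})\, e^{\frac{i}{\hbar} \log \hat a \otimes \log a}. \tag{2.29}$$
If $a$ is a multiple of $I$: $a = e^{\hbar l} I$, then $e^{\frac{i}{\hbar} \log \hat a \otimes \log a} = \hat a^{il}$ and the above formula reduces to the equality
$$\hat a^{il}\, |\hat b|^{ik} = e^{i\hbar kl}\, |\hat b|^{ik}\, \hat a^{il},$$
which is equivalent to the assumed relation $\hat a \multimap |\hat b|$. By spectral decomposition, (2.29) holds for any strictly positive operator $a$.

We have to investigate the regularity properties of the functions $\varphi$ and $\psi$ introduced by (2.26). If
$$x \in D(\hat b^{\pm 1}), \qquad u \in D(b^{\pm 1} Q^{\pm 2}), \qquad y \in D(Q^{\pm 2}) \tag{2.30}$$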
for all possible combinations of signs, then the functions $\varphi$ and $\psi$ belong to the Schwartz space $S(\mathbb{R})$. Indeed using the relation $Q^2 \multimap b$ one can easily show that $(I \otimes Q^2) \multimap B$ and
$$e^{\pm\hbar k}\varphi(k) = (x \otimes Q^{\pm 2}u|B^{ik}U|z \otimes Q^{\mp 2}y).$$
By (2.30), $x \otimes Q^{\pm 2}u \in D(B^{\pm 1})$. Therefore the functions $e^{\pm\hbar k}\varphi(k)$ admit holomorphic continuation to functions bounded on the strip $\{k \in \mathbb{C} : -1 < \Im k < 1\}$. This implies that $\varphi \in S(\mathbb{R})$. Moreover using (2.27) we see that the functions $e^{\pm\hbar k/4}\psi(k)$ admit holomorphic continuation to functions bounded on the strip $\{k \in \mathbb{C} : -1/4 < \Im k < 1/4\}$. This shows that $\psi \in S(\mathbb{R})$.

In the following we shall use the language of distribution theory. Let $f$ and $g$ be measurable bounded functions on $\mathbb{R}_+$. Then the functions $\mathbb{R} \ni t \to f(e^t) \in \mathbb{C}$ and $\mathbb{R} \ni t \to g(e^t) \in \mathbb{C}$ are bounded and may be considered as tempered distributions on $\mathbb{R}$. We denote by $\hat f$ and $\hat g$ the inverse Fourier transforms of these distributions. Then
$$f(t) = \int_{\mathbb{R}} \hat f(k)\, t^{ik}\, dk, \qquad g(t) = \int_{\mathbb{R}} \hat g(k)\, t^{ik}\, dk \tag{2.31}$$
for almost all $t \in \mathbb{R}_+$.

Proposition 2.5. Let $f$, $g$ be bounded measurable functions on $\mathbb{R}_+$ and $\hat f$ and $\hat g$ be tempered distributions related to $f$ and $g$ via (2.31). Assume that
$$\hat f(k) = e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, \hat g(k). \tag{2.32}$$
Then, using the notation and assumptions of Proposition 2.4, we have:
$$(x \otimes u|f(B)U|z \otimes y) = (\bar z \otimes Qu|g(\tilde B)\tilde U|\bar x \otimes Q^{-1}y). \tag{2.33}$$
Proof. Assume for the moment that the vectors $x$, $y$, $z$, $u$ satisfy conditions (2.30). Then the functions (2.26) belong to $S(\mathbb{R})$. Comparing (2.31) with (2.26) we obtain:
$$(x \otimes u|f(B)U|z \otimes y) = \int_{\mathbb{R}} \hat f(k)\varphi(k)\, dk,$$
$$(\bar z \otimes Qu|g(\tilde B)\tilde U|\bar x \otimes Q^{-1}y) = \int_{\mathbb{R}} \hat g(k)\psi(k)\, dk.$$
Using now (2.32) and (2.27) we see that the right hand sides of the above formulae coincide and (2.33) follows. To end the proof we notice that the conditions (2.30) select sufficiently large sets of vectors: $D(\hat b) \cap D(\hat b^{-1})$ is dense in $H$, $D(bQ^2) \cap D(bQ^{-2}) \cap D(b^{-1}Q^2) \cap D(b^{-1}Q^{-2})$ is a core for $Q$ and $D(Q^2) \cap D(Q^{-2})$ is a core for $Q^{-1}$.

We continue the proof of Proposition 2.3. For any $t \in \mathbb{R}_+$ we set
$$\begin{aligned} f_1(t) &= \overline{V_\theta(\log t)}, & g_1(t) &= V_\theta(\log t - \pi i), \\ f_2(t) &= \overline{V_\theta(\log t - \pi i)}, & g_2(t) &= V_\theta(\log t), \\ f_3(t) &= \alpha\, t^{\frac{\pi}{\hbar}}\, \overline{V_\theta(\log t - \pi i)}, & g_3(t) &= t^{\frac{\pi}{\hbar}}\, V_\theta(\log t - \pi i). \end{aligned} \tag{2.34}$$
Let $\hat f_i$ and $\hat g_i$ be the tempered distributions related to the above functions via (2.31). We already know that (2.19) resolves into (2.23), (2.24) and (2.25). By virtue of Proposition 2.5, in order to prove this relation it is sufficient to verify that
$$\hat f_i(k) = e^{\hbar k/2}\, e^{-\frac{i\hbar}{2}k^2}\, \hat g_i(k) \tag{2.35}$$
for $i = 1, 2, 3$. Let us notice that $f_1(t) = \overline{g_2(t)}$, $f_2(t) = \overline{g_1(t)}$ and $f_3(t) = \alpha\, \overline{g_3(t)}$. Therefore
$$\hat f_1(k) = \overline{\hat g_2(-k)}, \qquad \hat f_2(k) = \overline{\hat g_1(-k)}, \qquad \hat f_3(k) = \alpha\, \overline{\hat g_3(-k)}. \tag{2.36}$$
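To verify relations (2.35) we shall use the formulae (cf. formulae (1.36) and (1.41) of [20]):
$$C\, \overline{V_\theta(x)} = \exp\left\{\frac{i\bar x^2}{2\hbar}\right\} V_\theta(-\bar x), \tag{2.37}$$
$$\frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} V_\theta\!\left(y + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\frac{iy^2}{2\hbar}}\, e^{\frac{ixy}{\hbar}}\, dy = C'\, V_\theta(x), \tag{2.38}$$
where $C = \exp\left\{\left(\frac{\hbar}{2\pi} + \frac{2\pi}{\hbar}\right)\frac{\pi}{12 i}\right\}$ and $C' = \exp\left\{i\left(\frac{\pi}{4} + \frac{\hbar}{24} + \frac{\pi^2}{6\hbar}\right)\right\}$ are phase factors and $\varepsilon$ is a small positive number indicating that the integration path is rounding the pole of the integrand at the point $y = 0$ from above. Inserting in (2.38) $x = \log t$ and $y = \hbar k$ we obtain:
$$V_\theta(\log t) = \frac{\hbar}{C'\sqrt{2\pi\hbar}} \int_{\mathbb{R}} V_\theta\!\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\frac{i\hbar k^2}{2}}\, t^{ik}\, dk.$$
The left hand side coincides with $g_2(t)$. Therefore
$$\hat g_2(k) = \frac{\hbar}{C'\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\frac{i\hbar k^2}{2}}. \tag{2.39}$$
Now, using (2.36) and (2.37) we obtain:
$$\begin{aligned} \hat f_1(k) &= \frac{\hbar C'}{\sqrt{2\pi\hbar}}\, \overline{V_\theta\!\left(-\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right)}\, e^{-\frac{i\hbar k^2}{2}} \\ &= \frac{\hbar C'}{C\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\frac{i}{2\hbar}\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right)^2 - \frac{i\hbar k^2}{2}} \\ &= \frac{\hbar C''}{\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\left(\frac{\hbar}{2} + \pi\right)k}, \end{aligned} \tag{2.40}$$
where $C'' = \frac{C'}{C}\, e^{-\frac{i}{2\hbar}\left(\frac{\hbar}{2} + \pi\right)^2} = \frac{1}{C'}$. The function $g_1$ is related to $g_2$ by an imaginary shift: replacing ‘$\log t$’ by ‘$\log t - i\pi$’ in the formula for $g_2(t)$ we obtain $g_1(t)$. Using this fact one can easily show that $\hat g_1(k) = \hat g_2(k)e^{\pi k}$. Taking into account (2.39), we obtain:
$$\hat g_1(k) = \frac{\hbar}{C'\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k + i\varepsilon - \frac{i\hbar}{2} - i\pi\right) e^{\pi k}\, e^{\frac{i\hbar k^2}{2}}. \tag{2.41}$$
Comparing now (2.40) with (2.41) one can easily verify (2.35) for $i = 1$. Using (2.36) one can easily show that (2.35) for $i = 1$ and $i = 2$ are equivalent. To end the proof we have to verify (2.35) for $i = 3$. According to (2.34), $g_3(t) = g_1(t)\, t^{\frac{\pi}{\hbar}}$. Therefore $\hat g_3$ is related to $\hat g_1$ by an imaginary shift: $\hat g_3(k) = \hat g_1(k + \frac{i\pi}{\hbar})$. Taking into account (2.41) we get:
$$\hat g_3(k) = \frac{\hbar}{C'\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k - \frac{i\hbar}{2}\right) e^{\frac{i\pi^2}{2\hbar}}\, e^{\frac{i\hbar k^2}{2}}. \tag{2.42}$$
Now, using (2.36) and (2.37) we obtain:
$$\begin{aligned} \hat f_3(k) &= \frac{\alpha\hbar C'}{\sqrt{2\pi\hbar}}\, \overline{V_\theta\!\left(-\hbar k - \frac{i\hbar}{2}\right)}\, e^{-\frac{i\pi^2}{2\hbar}}\, e^{-\frac{i\hbar k^2}{2}} \\ &= \frac{\alpha\hbar C'}{C\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k - \frac{i\hbar}{2}\right) e^{\frac{i}{2\hbar}\left(\hbar k - \frac{i\hbar}{2}\right)^2 - \frac{i\pi^2}{2\hbar} - \frac{i\hbar k^2}{2}} \\ &= \frac{\hbar C'''}{\sqrt{2\pi\hbar}}\, V_\theta\!\left(\hbar k - \frac{i\hbar}{2}\right) e^{\hbar k/2}, \end{aligned} \tag{2.43}$$
where $C''' = \alpha\frac{C'}{C}\, e^{-\frac{i\hbar}{8} - \frac{i\pi^2}{2\hbar}}$. Comparing now (2.43) with (2.42) one can easily verify (2.35) for $i = 3$. This ends the proof of (2.19) and then also of Proposition 2.3.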
Now we are able to prove Theorem 2.1. Let $a$, $b$, $r$, $s$, $\beta$ be selfadjoint operators acting on a Hilbert space $H$, satisfying the assumptions of Theorem 2.1. Setting $K = H$, $\hat a = s|b|^{-1}$, $\hat b = e^{i\hbar/2}\, b^{-1}a$, $\hat\beta = \beta$ and $Q = \sqrt{ra}$ we satisfy all the assumptions of Propositions 2.2 and 2.3. Introducing the above data into (2.6) and (2.18), we obtain unitary operators $W \in B(H \otimes H)$ and $\tilde W \in B(\bar H \otimes H)$. Clearly $W$ is given by (2.5). Proposition 2.2 shows that the operator $W$ satisfies the pentagon equation (2.7). In the present setting, (2.19) coincides with (2.3). To finish the proof of manageability of $W$ we have to show that $W$ commutes with $Q \otimes Q$. To this end we notice that $ra \multimap b$ and $e^{i\hbar/2}\, b^{-1}a \multimap ra$. Therefore $Q^2 \otimes Q^2 = ra \otimes ra$ strongly commutes with $e^{i\hbar/2}\, b^{-1}a \otimes b$. Moreover, recalling that $a \multimap b$ and $r \multimap s$, it is easy to show that $Q^2 = ra$ strongly commutes with $s|b|^{-1}$. Clearly $Q$ commutes with $a$. Therefore $Q^2 \otimes Q^2$ strongly commutes with $\log(s|b|^{-1}) \otimes \log a$. Using this information we see that $Q \otimes Q$ commutes with (2.5) and manageability of $W$ follows. This ends the proof of Theorem 2.1.

Remark 2.6. Operators $r$, $s$ appearing in this section play an auxiliary role and may be removed from the considerations. Let $H$ be a Hilbert space and $(a, b, \beta) \in G_H$. Assume that $\ker b = \{0\}$. Setting $K = H$, $\hat a = |b|^{-1}$, $\hat b = e^{i\hbar/2}\, b^{-1}a$, $\hat\beta = \beta$, $s = I$ and $Q = \sqrt{a}$, we see that all the assumptions of Propositions 2.2 and 2.3 are satisfied. Substituting the above data into (2.6) and (2.18) we obtain unitary operators $W \in B(H \otimes H)$ and $\tilde W \in B(\bar H \otimes H)$. Now $W$ is given by the simpler formula
$$W = F_\hbar\!\left(e^{i\hbar/2}\, b^{-1}a \otimes b,\ \alpha(\beta \otimes \beta)\chi(b \otimes b < 0)\right)^{*} e^{\frac{i}{\hbar} \log(|b|^{-1}) \otimes \log a}. \tag{2.44}$$
By Proposition 2.2, the above operator satisfies the pentagon equation (2.1). As before, (2.19) coincides with (2.3). However now $W$ does not commute with $Q \otimes Q$. It means that the operator (2.44) is not manageable in the sense of [19]. Instead of (2.2), we have:
$$W(\hat Q \otimes Q)W^* = \hat Q \otimes Q, \tag{2.45}$$
where $\hat Q = \sqrt{\hat a} = |b|^{-\frac{1}{2}}$ is a strictly positive selfadjoint operator. It turns out that all the results of [19] remain valid when (2.2) is replaced by (2.45) (cf. [12]).

3. Crossed Product Algebra

In this section we construct the $C^*$-algebra related to the commutation relations (1.9). Let $C_\infty(\mathbb{R}_+)$ be the $C^*$-algebra of all continuous functions vanishing at infinity on the closed halfline $\mathbb{R}_+ = [0, +\infty[$, $M_2$ be the algebra of all $2 \times 2$ matrices with complex entries and $B = C_\infty(\mathbb{R}_+) \otimes M_2$. Elements of $B$ are continuous mappings $f : \mathbb{R}_+ \to M_2$ such that $\lim_{\tau \to \infty} f(\tau) = 0$. Imposing an additional condition saying that $f(0)$ is a multiple of $I$ we select a non-degenerate $C^*$-subalgebra $B_0 \subset B$. The matrix elements of any $f \in B$ will be denoted by $f_{kl} \in C_\infty(\mathbb{R}_+)$ ($k, l = 1, 2$):
$$f = \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix}. \tag{3.1}$$
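Then
$$B_0 = \left\{ f \in B : f_{11}(0) = f_{22}(0),\ f_{12}(0) = f_{21}(0) = 0 \right\}.$$
For any $\tau \in \mathbb{R}_+$ we set:
$$b(\tau) = \begin{pmatrix} \tau & 0 \\ 0 & -\tau \end{pmatrix}, \qquad \beta(\tau) = \begin{pmatrix} 0 & \chi(\tau \neq 0) \\ \chi(\tau \neq 0) & 0 \end{pmatrix}. \tag{3.2}$$
Then $b\beta = -\beta b$ and
$$(ib\beta)(\tau) = \begin{pmatrix} 0 & i\tau \\ -i\tau & 0 \end{pmatrix}. \tag{3.3}$$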
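The matrix identities around (3.2) and (3.3) can be verified fibrewise. The following minimal sketch represents the $2 \times 2$ matrices $b(\tau)$ and $\beta(\tau)$ as nested lists and checks $b\beta = -\beta b$, formula (3.3) and $\beta(\tau)^2 = \chi(\tau \neq 0) I$ at a few sample values of $\tau$; the helper names are illustrative.

```python
def matmul(m, n):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, m):
    """Scalar multiple c * m."""
    return [[c * entry for entry in row] for row in m]

def b(tau):
    """b(tau) from (3.2)."""
    return [[tau, 0], [0, -tau]]

def beta(tau):
    """beta(tau) from (3.2); chi(tau != 0) is 1 for tau != 0 and 0 at tau = 0."""
    chi = 1 if tau != 0 else 0
    return [[0, chi], [chi, 0]]

def i_b_beta(tau):
    """The element i b beta evaluated at tau."""
    return scale(1j, matmul(b(tau), beta(tau)))

def checks(tau):
    chi = 1 if tau != 0 else 0
    anticommute = matmul(b(tau), beta(tau)) == scale(-1, matmul(beta(tau), b(tau)))
    formula_3_3 = i_b_beta(tau) == [[0, 1j * tau], [-1j * tau, 0]]
    beta_squared = matmul(beta(tau), beta(tau)) == [[chi, 0], [0, chi]]
    return anticommute and formula_3_3 and beta_squared

all_ok = all(checks(tau) for tau in (0, 0.5, 2, 10))
```

At $\tau = 0$ both $b$ and $ib\beta$ vanish, hence are multiples of $I$, while $\beta(\tau)$ jumps from $0$ to a nonzero matrix, which is the discontinuity preventing $\beta$ from being affiliated with $B_0$.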
We note that $b(\tau)$ and $(ib\beta)(\tau)$ depend continuously on $\tau$ and that $b(0)$ and $(ib\beta)(0)$ are multiples of $I \in M_2$. According to [18, (2.6)], $b$ and $ib\beta$ are elements affiliated with $B_0$. Clearly these elements are selfadjoint. On the other hand, $\beta(\tau)$ is not continuous with respect to $\tau$. Therefore $\beta$ is not affiliated with $B_0$. Instead, it belongs to the $W^*$-envelope of $B_0$.

Let $t \in \mathbb{R}$ and $f \in B_0$. For any $\tau \in \mathbb{R}_+$ we set:
$$(\sigma_t f)(\tau) = f(e^{\hbar t}\tau).$$
Then $\sigma_t f \in B_0$, $\sigma_t \in \mathrm{Aut}(B_0)$ and $(\sigma_t)_{t \in \mathbb{R}}$ is a pointwise continuous one parameter group of automorphisms of $B_0$. In other words, $(B_0, (\sigma_t)_{t \in \mathbb{R}})$ is a $C^*$-dynamical system. Let
$$A_{cp} = B_0 \times_\sigma \mathbb{R} \tag{3.4}$$
be the corresponding $C^*$-crossed product algebra [5]. The canonical embedding $B_0 \hookrightarrow M(A_{cp})$ is a morphism from $B_0$ into $A_{cp}$. Therefore the elements affiliated with $B_0$ are affiliated with $A_{cp}$. In particular $b,\ ib\beta\ \eta\ A_{cp}$. A similar conclusion holds for $\beta$: it belongs to the $W^*$-envelope of $A_{cp}$.

By the definition of the crossed product, $M(A_{cp})$ contains a strictly continuous one parameter group of unitaries implementing the action $\sigma$ of $\mathbb{R}$ on $B_0$. The infinitesimal generator of this group will be denoted by $\log a$. Then $a$ is a strictly positive selfadjoint element affiliated with $A_{cp}$. For any $f \in B_0$ we have: $a^{it} f a^{-it} = \sigma_t f$. One can easily verify that $\sigma_t b = e^{\hbar t} b$ and $\sigma_t(ib\beta) = e^{\hbar t}\, ib\beta$. Therefore $a^{it} b = e^{\hbar t}\, b a^{it}$ and $a^{it}\, ib\beta = e^{\hbar t}\, ib\beta\, a^{it}$ for any $t \in \mathbb{R}$. It means that
$$a \multimap b \quad \text{and} \quad a\beta = \beta a.$$
By construction, the set
$$\{ f g(\log a) : f \in B_0,\ g \in C_\infty(\mathbb{R}) \}^{\text{linear envelope}} \tag{3.5}$$
is a dense subset of the $C^*$-crossed product $A_{cp} = B_0 \times_\sigma \mathbb{R}$.
Proposition 3.1. The $C^*$-algebra $A_{cp}$ is generated (in the sense explained in [18]) by the three affiliated elements $\log a,\ b,\ ib\beta\ \eta\ A_{cp}$.

Proof. We shall use Theorem 3.3 of [18]. One can easily verify that
$$(g_1(b) + g_2(b)\, ib\beta)(\tau) = \begin{pmatrix} g_1(\tau) & i\tau g_2(\tau) \\ -i\tau g_2(-\tau) & g_1(-\tau) \end{pmatrix}$$
for any $g_1, g_2 \in C_{\mathrm{compact}}(\mathbb{R})$ and that the set of elements of the above form is dense in $B_0$. Therefore the elements $b$ and $ib\beta$ separate representations of $B_0$. Recall that (3.5) is dense in $A_{cp}$. We see that the elements $\log a$, $b$ and $ib\beta$ separate representations of $A_{cp}$. This way we verified Assumption 1 of Theorem 3.3 of [18].

Let $r_1 = (I + b^* b)^{-1}$ and $r_2 = (I + (\log a)^*(\log a))^{-1}$. To end the proof, it is sufficient to notice that $r_1 r_2 = f g(\log a)$, where $f = (I + b^2)^{-1}$ and $g(\lambda) = (1 + \lambda^2)^{-1}$. Clearly $f \in B_0$ and $g \in C_\infty(\mathbb{R})$. Therefore $r_1 r_2$ belongs to (3.5) and consequently $r_1 r_2 \in A_{cp}$. This shows that Assumption 2 of Theorem 3.3 of [18] holds. Now this theorem says that $A_{cp}$ is generated by $\log a$, $b$ and $ib\beta$.

Let $H$ be a Hilbert space and $\pi$ be a non-degenerate representation of $A_{cp}$ acting on $H$: $\pi \in \mathrm{Rep}(A_{cp}, H)$. According to the general theory, $\pi$ admits a natural extension to the set of affiliated elements $A_{cp}^\eta$ and to the $W^*$-envelope of $A_{cp}$. Clearly $\pi(a)$, $\pi(b)$ and $\pi(\beta)$ are selfadjoint operators. Moreover $\pi(a)$ is strictly positive, $\pi(a) \multimap \pi(b)$, $\pi(\beta)^2 = \chi(\pi(b) \neq 0)$, $\pi(\beta)$ commutes with $\pi(a)$ and anticommutes with $\pi(b)$. It means that $(\pi(a), \pi(b), \pi(\beta))$ is a $G$-triple. It turns out that any $G$-triple is of this form.

Proposition 3.2. Let $H$ be a Hilbert space and $(a_o, b_o, \beta_o) \in G_H$. Then there exists a unique representation $\pi \in \mathrm{Rep}(A_{cp}, H)$ such that $a_o = \pi(a)$, $b_o = \pi(b)$ and $\beta_o = \pi(\beta)$. If $A \in C^*(H)$ and $\log a_o,\ b_o,\ ib_o\beta_o\ \eta\ A$, then $\pi \in \mathrm{Mor}(A_{cp}, A)$.

Proof. For any $f \in B_0$ of the form (3.1) we set:
$$\pi_o(f) = f_{11}(b_o)\chi(b_o \geq 0) + f_{12}(b_o)\chi(b_o > 0)\beta_o + \beta_o f_{21}(b_o)\chi(b_o > 0) + \beta_o f_{22}(b_o)\chi(b_o \geq 0)\beta_o. \tag{3.6}$$
Elementary computations show that $\pi_o$ is a non-degenerate representation of $B_0$. The action of $\pi_o$ on elements affiliated with $B_0$ is described by the same formula (3.6). In particular for the elements (3.2) we have: $\pi_o(b) = b_o$ and $\pi_o(\beta) = \beta_o$. Indeed
$$\begin{aligned} \pi_o(b) &= b_o\chi(b_o \geq 0) - \beta_o b_o\chi(b_o \geq 0)\beta_o = b_o\chi(b_o \geq 0) + b_o\chi(-b_o \geq 0)\beta_o^2 \\ &= b_o\chi(b_o \geq 0) + b_o\chi(b_o < 0) = b_o \end{aligned}$$
and similarly
$$\pi_o(\beta) = \chi(b_o > 0)\beta_o + \beta_o\chi(b_o > 0) = \chi(b_o > 0)\beta_o + \chi(b_o < 0)\beta_o = \beta_o.$$
Now we shall use the relation $a_o \multimap b_o$. It means that $a_o^{it}\, b_o\, a_o^{-it} = e^{\hbar t} b_o$. We also know that $a_o$ commutes with $\beta_o$. Therefore
$$\begin{aligned} a_o^{it}\, \pi_o(f)\, a_o^{-it} &= f_{11}(e^{\hbar t} b_o)\chi(b_o \geq 0) + f_{12}(e^{\hbar t} b_o)\chi(b_o > 0)\beta_o \\ &\quad + \beta_o f_{21}(e^{\hbar t} b_o)\chi(b_o > 0) + \beta_o f_{22}(e^{\hbar t} b_o)\chi(b_o \geq 0)\beta_o \\ &= \pi_o(\sigma_t f). \end{aligned} \tag{3.7}$$
This shows that the pair $(\pi_o, (a_o^{it})_{t \in \mathbb{R}})$ is a covariant representation of the $C^*$-dynamical system $(B_0, (\sigma_t)_{t \in \mathbb{R}})$. Let $\pi$ be the corresponding representation of the crossed product algebra $A_{cp}$. Then $\pi(a^{it}) = a_o^{it}$ and $\pi(a) = a_o$. Moreover $\pi$ restricted to $B_0 \subset M(A_{cp})$ coincides with $\pi_o$. In particular $\pi(b) = b_o$ and $\pi(\beta) = \beta_o$. This way we constructed a representation $\pi \in \mathrm{Rep}(A_{cp}, H)$ having the desired properties. The uniqueness of $\pi$ and the last statement of the proposition follow immediately from Proposition 3.1 (cf. Definition 3.1 and Theorem 6.2 of [18]).
We shall use the above results to show the following

Proposition 3.3. There exists unique automorphism $\phi$ of the $C^*$-algebra $A_{cp}$ such that
$$\phi(a)=a,\qquad \phi(b)=b,\qquad \phi(i\beta b)=\beta|b| . \tag{3.8}$$
This automorphism is of order 4: $\phi^4=\mathrm{id}$.

Proof. We may assume that $A_{cp}$ is a non-degenerate $C^*$-algebra of operators acting on a Hilbert space $H$. Then $(a,b,\beta)\in G_H$. Let $a_o=a$, $b_o=b$ and $\beta_o=-i\beta\,\mathrm{sign}\,b$. One can easily verify that $(a_o,b_o,\beta_o)\in G_H$ and that $a_o$, $b_o$ and $i\beta_ob_o=\beta|b|$ are affiliated with $A_{cp}$. By Proposition 3.2, there exists unique $\phi\in\mathrm{Mor}(A_{cp},A_{cp})$ satisfying relations (3.8). Let $f$ be a continuous function on $\mathbb{R}$ vanishing at 0 and at infinity. Then $\beta f(b)\in A_{cp}$. The second and third formulae of (3.8) show that $\phi(\beta f(b))=-i\beta\,\mathrm{sign}(b)f(b)$. Iterating this formula we obtain: $\phi^2(\beta f(b))=-\beta f(b)$ and $\phi^4(\beta f(b))=\beta f(b)$. It shows that $\phi^4=\mathrm{id}$.

4. From Multiplicative Unitary to Quantum Group

Let $G$ be the quantum space corresponding to the $C^*$-algebra $A_{cp}$. In other words, elements of $A_{cp}$ are interpreted as continuous functions vanishing at infinity on $G$. In this section we endow $G$ with a group structure introducing a comultiplication
Quantum ‘ax + b’ Group
$\Delta\in\mathrm{Mor}(A_{cp},A_{cp}\otimes A_{cp})$. It will be shown that the quantum group $G$ coincides with the (extended) ‘ax + b’-group introduced in Sec. 1. From now until the end of the paper we assume that the deformation parameter
$$\hbar=\frac{\pi}{2k+3},$$
where $k=0,1,2,\dots$. Then $\alpha=i\exp\frac{i\pi^2}{2\hbar}=(-1)^k$. Any $C^*$-algebra may be embedded in a non-degenerate way into $B(H)$, where $H$ is a Hilbert space. Then affiliated elements become closed operators acting on $H$. Let
$$\iota: A_{cp}\hookrightarrow B(H) \tag{4.1}$$
be a non-degenerate embedding. Then $\iota\in\mathrm{Rep}(A_{cp},H)$ and $\iota(a)$, $\iota(b)$ and $\iota(\beta)$ are selfadjoint operators acting on $H$. To simplify the notation we will drop the embedding symbol ‘$\iota$’ writing $a,b,\beta$ instead of $\iota(a),\iota(b),\iota(\beta)$. With this notation $A_{cp}\subset B(H)$. One can check that the subspace $\ker b$ is $A_{cp}$-invariant. Replacing if necessary $H$ by $(\ker b)^{\perp}$ we may assume that $\ker b=\{0\}$. We may also assume that the commutant $A_{cp}'=\{a'\in B(H): a'c=ca' \text{ for any } c\in A_{cp}\}$ contains a $W^*$-algebra isomorphic to $B(K)$, where $K$ is an infinite-dimensional Hilbert space. If this is not the case, then we replace (4.1) by $\iota': A_{cp}\hookrightarrow B(K\otimes H)$ introduced by the formula $\iota'(c)=I_{B(K)}\otimes\iota(c)$ for any $c\in A_{cp}$. Since the commutant $A_{cp}'$ is large enough, there exist strictly positive selfadjoint operators $r,s$ acting on $H$ such that $r,s$ strongly commute with $a,b,\beta$ and $r\multimap s$.

Let $\hat b=e^{i\hbar/2}b^{-1}a$, $\hat\beta=\beta$ and $\hat a=s|b|^{-1}$. Then $\mathrm{sign}\,b=\mathrm{sign}\,\hat b$. Therefore $\chi(b\otimes b<0)=\chi(\hat b\otimes b<0)$ and the operator (2.5) equals
$$W=F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)^{*}\,e^{\frac{i}{\hbar}\log\hat a\otimes\log a}. \tag{4.2}$$
This operator acts on $H\otimes H$. By Theorem 2.1, $W$ is a manageable multiplicative unitary. Corresponding operators $Q$ and $\widetilde W$ are given by: $Q=\sqrt{ra}$ and
$$\widetilde W=F_\hbar\bigl(-\hat b^{\top}\otimes e^{i\hbar/2}ba^{-1},\ -(\hat\beta^{\top}\otimes\beta)\chi(\hat b^{\top}\otimes b>0)\bigr)\,e^{\frac{i}{\hbar}\log\hat a^{\top}\otimes\log a}. \tag{4.3}$$
We shall use the theory developed in [3, 19]. Let $B(H)_*$ be the set of all normal linear functionals defined on $B(H)$ and
$$A=\{(\omega\otimes\mathrm{id})W:\ \omega\in B(H)_*\}^{\text{norm closure}}. \tag{4.4}$$
According to the general theory [3, 19], $A$ is a $C^*$-algebra and $W\in M(CB(H)\otimes A)$, where $CB(H)$ is the $C^*$-algebra of all compact operators acting on $H$. The algebra $A$ is interpreted as the algebra of all ‘continuous functions vanishing at infinity on the quantum group’. The corresponding comultiplication $\Delta$ is introduced by the formula:
$$\Delta(c)=W(c\otimes I)W^{*}. \tag{4.5}$$
It is known that $\Delta(c)\in M(A\otimes A)$ for any $c\in A$ and that $\Delta\in\mathrm{Mor}(A,A\otimes A)$. By the pentagon equation we have $(\mathrm{id}\otimes\Delta)W=W_{12}W_{13}$. Using this formula one can easily show that $\Delta$ is coassociative. The main result of this section is the following

Theorem 4.1. (1) The Baaj–Skandalis algebra (4.4) coincides with the crossed product algebra $A_{cp}$:
$$A=A_{cp}. \tag{4.6}$$
(2) The comultiplication $\Delta$ acts on distinguished elements affiliated with $A_{cp}$ in the following way:
• $\Delta(a)=a\otimes a$,
• $\Delta(b)$ is the selfadjoint extension of $a\otimes b+b\otimes I$ corresponding to the reflection operator $\tau=\alpha(\beta\otimes\beta)\chi(b\otimes b<0)$. In short: $\Delta(b)=[a\otimes b+b\otimes I]_\tau$.
• $\Delta(i\beta b)=i\{w(e^{i\hbar/2}b^{-1}a\otimes b)^{-1}(\beta\otimes I)+(I\otimes\beta)w(e^{i\hbar/2}ba^{-1}\otimes b^{-1})^{-1}\}\Delta(b)$, where $w$ is the polynomial introduced by (1.15).

Proof. (1) Any closed operator acting on $H$ is affiliated with $CB(H)$. In particular $\hat b,\ i\hat b\hat\beta,\ \log\hat a\in CB(H)^{\eta}$. Notice that $b,\ ib\beta,\ \log a\in A_{cp}^{\eta}$. Then we obtain: $\hat b\otimes b,\ \hat b\hat\beta\otimes b\beta,\ \log\hat a\otimes\log a\in(CB(H)\otimes A_{cp})^{\eta}$. Therefore
$$e^{\frac{i}{\hbar}\log\hat a\otimes\log a}\in M(CB(H)\otimes A_{cp}),$$
$$F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)\in M(CB(H)\otimes A_{cp}).$$
To obtain the second relation we used [20, Theorem 8.1]. Consequently $W\in M(CB(H)\otimes A_{cp})$. Now using (4.4) we obtain $A\subset M(A_{cp})$ and $AA_{cp}\subset A_{cp}$. $W$ is a unitary element of the multiplier algebra. Therefore $W(CB(H)\otimes A_{cp})=CB(H)\otimes A_{cp}$ and the set
$$\{W(m\otimes c):\ m\in CB(H),\ c\in A_{cp}\} \tag{4.7}$$
is linearly dense in $CB(H)\otimes A_{cp}$. For any $\omega\in B(H)_*$, $m\in CB(H)$ and $c\in A_{cp}$ we have: $(\omega\otimes\mathrm{id})(W(m\otimes c))=((m\omega\otimes\mathrm{id})W)c\in AA_{cp}$. Applying $\omega\otimes\mathrm{id}$ to all elements of (4.7) we see that
$$AA_{cp}\ \text{is a linearly dense subset of}\ A_{cp}. \tag{4.8}$$
We shall prove that
$$\log a,\ b,\ ib\beta\ \eta\ A. \tag{4.9}$$
For all $t\in\mathbb{R}$ we set
$$V(t)=F_\hbar\bigl(t\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(t\hat b\otimes b<0)\bigr)^{*}\,e^{\frac{i}{\hbar}\log\hat a\otimes\log a}. \tag{4.10}$$
Then $V(t)\in B(H\otimes H)=M(CB(H)\otimes CB(H))$. In what follows we endow multiplier algebras with the strict topology. Using Theorem 8.1 of [20] one can easily show that $(V(t))_{t\in\mathbb{R}}$ is a continuous family of elements of $M(CB(H)\otimes CB(H))$. Tensoring by $I\in M(A)$ and using the leg numbering notation $V_{12}(t)=V(t)\otimes I$ we obtain a continuous family $(V_{12}(t))_{t\in\mathbb{R}}$ of elements of $M(CB(H)\otimes CB(H)\otimes A)$. By Proposition 2.2, operators (4.10) satisfy the pentagon equation (2.7). Therefore
$$V_{13}(t)=V_{12}(t)^{*}W_{23}V_{12}(t)W_{23}^{*}. \tag{4.11}$$
Using this formula and remembering that $W\in M(CB(H)\otimes A)$ we see that $(V_{13}(t))_{t\in\mathbb{R}}$ is a continuous family of elements of $M(CB(H)\otimes CB(H)\otimes A)$. It implies that $(V(t))_{t\in\mathbb{R}}$ is a continuous family of elements of $M(CB(H)\otimes A)$. Therefore
$$F_\hbar\bigl(t\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(t\hat b\otimes b<0)\bigr)=V(0)V(t)^{*}\in M(CB(H)\otimes A)$$
depends continuously on $t\in\mathbb{R}$. Now, Theorem 8.1 of [20] shows that $\hat b\otimes b$ and $\hat b\hat\beta\otimes b\beta$ are affiliated with $CB(H)\otimes A$. Taking into account (A.1) we get $b,\ ib\beta\ \eta\ A$.

Let $t\in\mathbb{R}$. Inserting $\hat b=0$ and $\hat a=e^{\hbar t}I$ in Proposition 2.2 we see that the operator
$$V(t)=I\otimes e^{it\log a}=I\otimes a^{it} \tag{4.12}$$
satisfies the pentagon equation (2.7). In the present case equation (4.11) takes the form
$$I\otimes a^{it}=(a^{-it}\otimes I)W(a^{it}\otimes I)W^{*}. \tag{4.13}$$
It shows that $(I\otimes a^{it})_{t\in\mathbb{R}}$ is a continuous one parameter group of unitary elements of the multiplier algebra $M(CB(H)\otimes A)$. Consequently $(a^{it})_{t\in\mathbb{R}}$ is a continuous one parameter group of unitary elements of $M(A)$. Therefore the infinitesimal generator $\log a$ is affiliated with $A$. This way (4.9) is shown.

Now we combine Proposition 3.1 with (4.9). By Definition 3.1 of [18], the embedding (4.1) belongs to $\mathrm{Mor}(A_{cp},A)$. It means that $A_{cp}A$ is a linearly dense subset of $A$. Comparing this result with (4.8) we obtain (4.6). This way we revealed the structure of the algebra of ‘continuous functions vanishing at infinity on G’.

(2) Let $\Delta$ be the comultiplication introduced by (4.5). Clearly the action of $\Delta$ on elements affiliated with $A$ is described by the same formula. We have to compute the action of $\Delta$ on generators $a$, $b$, $ib\beta$ of $A$. Formula (4.13) shows that $\Delta(a^{it})=a^{it}\otimes a^{it}$ for any $t\in\mathbb{R}$. Therefore
$$\Delta(a)=a\otimes a .$$
One can easily verify that $b$ strongly commutes with $\hat a=s|b|^{-1}$. Therefore $b\otimes I$ commutes with $e^{\frac{i}{\hbar}\log\hat a\otimes\log a}$ and
$$\Delta(b)=F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)^{*}(b\otimes I)\,F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr).$$
We know that $a\multimap b$. Therefore $b^{-1}\multimap a$, $\hat b=e^{i\hbar/2}b^{-1}a\multimap a$ and $\hat b\otimes b\multimap b\otimes I$. The reader should notice that $b\otimes I$ anticommutes with the operator $\tau=\alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)=\alpha(\beta\otimes\beta)\chi(b\otimes b<0)$. Using Theorem 5.3 of [20] we see that $\Delta(b)$ is the selfadjoint extension of $e^{i\hbar/2}\hat bb\otimes b+b\otimes I=a\otimes b+b\otimes I$ corresponding to the reflection operator $\tau$:
$$\Delta(b)=[a\otimes b+b\otimes I]_\tau .$$
More explicitly $\Delta(b)$ is the restriction of $(a\otimes b+b\otimes I)^{*}$ to the domain
$$D(\Delta(b))=D(a\otimes b+b\otimes I)+D((a\otimes b+b\otimes I)^{*})\cap(H\otimes H)(\tau=1),$$
where $(H\otimes H)(\tau=1)$ is the eigenspace of $\tau$ corresponding to the eigenvalue 1.

The action of $\Delta$ on the third generator is given by the formula
$$\Delta(ib\beta)=i\tilde\beta\,\Delta(b),\qquad\text{where }\tilde\beta=W(\beta\otimes I)W^{*}.$$
We have to find a formula for $\tilde\beta$. Remembering that $\beta$ commutes with $\hat a$ and anticommutes with $\hat b$ we see that $\beta\otimes I$ commutes with $e^{\frac{i}{\hbar}\log\hat a\otimes\log a}$ and anticommutes with $\hat b\otimes b$. Therefore
$$\tilde\beta=F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)^{*}(\beta\otimes I)\,F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)$$
$$\phantom{\tilde\beta}=F_\hbar\bigl(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0)\bigr)^{*}F_\hbar\bigl(-\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b>0)\bigr)(\beta\otimes I).$$
Taking into account formula (B.2), we obtain:
$$\tilde\beta=\{w(\hat b\otimes b)^{-1}+(\hat\beta\otimes\beta)\,w(-(\hat b\otimes b)^{-1})^{-1}\}(\beta\otimes I).$$
Remembering that $\hat\beta=\beta$ anticommutes with $\hat b=e^{i\hbar/2}b^{-1}a$, we finally obtain:
$$\tilde\beta=w(e^{i\hbar/2}b^{-1}a\otimes b)^{-1}(\beta\otimes I)+(I\otimes\beta)\,w(e^{i\hbar/2}ba^{-1}\otimes b^{-1})^{-1}. \tag{4.14}$$
This formula proves the last point of Statement 2 of our theorem.

Remark 4.2. Using (1.4), (1.3) and (1.15) one can verify that on the Hopf $*$-algebra level the product
$$(b^{2k+3}\otimes I)\,w(e^{i\hbar/2}b^{-1}a\otimes b)=(\Delta b)^{2k+3}=(a^{2k+3}\otimes b^{2k+3})\,w(-e^{i\hbar/2}ba^{-1}\otimes b^{-1}).$$
Combining this formula with (4.14) we get
$$\Delta(ib^{2k+3}\beta)=ib^{2k+3}\beta\otimes I+a^{2k+3}\otimes ib^{2k+3}\beta . \tag{4.15}$$
On the Hilbert space and $C^*$-levels, instead of equality we have inclusion: the operator on the left hand side of (4.15) is a selfadjoint extension of the symmetric operator appearing on the right hand side. This extension is determined by the reflection operator $-\mathrm{sign}(b\otimes b)$:
$$\Delta(ib^{2k+3}\beta)=[ib^{2k+3}\beta\otimes I+a^{2k+3}\otimes ib^{2k+3}\beta]_{-\mathrm{sign}(b\otimes b)} . \tag{4.16}$$
See [8] for details. The formula (4.16) seems to be very interesting. It encodes in a simple form the complicated formula (4.14) describing the action of $\Delta$ on $\beta$.
Moreover it shows that in a certain sense
$$u=\begin{pmatrix} a^{2k+3} & ib^{2k+3}\beta\\ 0 & I\end{pmatrix}$$
is a two-dimensional representation of quantum ‘ax + b’ group.

Now we shall discuss the coinverse map $\kappa$. According to [19] we have the polar decomposition
$$\kappa(c)=(\tau_{i/2}(c))^{R}, \tag{4.17}$$
where $\tau_{i/2}$ is the analytic generator of the scaling group and the map $A\ni c\mapsto c^{R}$ is the unitary antipode. The action of the scaling group is described by the formula: $\tau_t(c)=Q^{2it}cQ^{-2it}$. Remembering that $Q^2=ra$ commutes with $a$ and $\beta$ and that $Q^2\multimap b$ we obtain:
$$\tau_t(a)=a,\qquad \tau_t(b)=e^{\hbar t}b\qquad\text{and}\qquad \tau_t(\beta)=\beta .$$
Consequently: $\tau_{i/2}(a)=a$, $\tau_{i/2}(b)=e^{i\hbar/2}b$ and $\tau_{i/2}(\beta)=\beta$. The unitary antipode is defined by the relation $W^{\top\otimes R}=\widetilde W^{*}$ (cf. [19, Formula (1.14)]). Comparing (4.2) with (4.3) and remembering that $\top\otimes R$ is antimultiplicative we obtain:
$$a^{R}=a^{-1},\qquad b^{R}=-e^{i\hbar/2}ba^{-1}\qquad\text{and}\qquad \beta^{R}=-\alpha\beta .$$
Now, formula (4.17) shows that:
$$\kappa(a)=a^{-1},\qquad \kappa(b)=-a^{-1}b\qquad\text{and}\qquad \kappa(\beta)=-\alpha\beta .$$
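The value of $\kappa(b)$ can be checked by a short formal computation. The sketch below works on the Hopf $*$-algebra level only. It assumes that the relation $a\multimap b$ may be read formally as $ab=e^{-i\hbar}ba$, which is what $a^{it}ba^{-it}=e^{\hbar t}b$ gives when continued to $t=-i$. This reading, and the neglect of all domain questions, are assumptions of the illustration rather than statements taken from the text.

```latex
\begin{align*}
\kappa(b) &= \bigl(\tau_{i/2}(b)\bigr)^{R}
           = \bigl(e^{i\hbar/2}\,b\bigr)^{R}
           = e^{i\hbar/2}\,b^{R} \\
          &= e^{i\hbar/2}\bigl(-e^{i\hbar/2}\,b\,a^{-1}\bigr)
           = -e^{i\hbar}\,b\,a^{-1}, \\
a\,b = e^{-i\hbar}\,b\,a
  \;&\Longrightarrow\; a^{-1}b = e^{i\hbar}\,b\,a^{-1}
  \;\Longrightarrow\; \kappa(b) = -a^{-1}b .
\end{align*}
```

The formulas for $\kappa(a)$ and $\kappa(\beta)$ need no such step, since $\tau_{i/2}$ leaves $a$ and $\beta$ unchanged.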
It turns out that the automorphism $\phi$ introduced in Proposition 3.3 preserves the group structure of our quantum group. We have:

Proposition 4.3. For any $c\in A$:
$$\Delta(\phi(c))=(\phi\otimes\phi)\Delta(c). \tag{4.18}$$
Proof. We recall that we use the embedding $A\hookrightarrow B(H)$ such that $b$ is represented by an operator with trivial kernel. Therefore $\beta^2=\chi(b\neq 0)=I$ and the operator $w=\chi(b>0)+i\chi(b<0)$ is unitary. Clearly $w$ commutes with $a$ and $b$. We compute:
$$w^{*}\beta w=(\chi(b>0)-i\chi(b<0))\,\beta\,(\chi(b>0)+i\chi(b<0))=\beta(\chi(b<0)-i\chi(b>0))(\chi(b>0)+i\chi(b<0))=-i\beta(\chi(b>0)-\chi(b<0))=-i\beta\,\mathrm{sign}\,b .$$
Therefore $w^{*}(i\beta b)w=\beta|b|$. It shows (cf. (3.8)) that $w$ implements the action of $\phi$:
$$\phi(c)=w^{*}cw$$
for any $c\in A$. We claim that $w\otimes w$ commutes with $\tau$. Indeed
$$(w\otimes w)\tau(w\otimes w)^{*}=\alpha(w\beta w^{*}\otimes w\beta w^{*})\chi(wbw^{*}\otimes wbw^{*}<0)=\alpha((-i\beta\,\mathrm{sign}\,b)\otimes(-i\beta\,\mathrm{sign}\,b))\chi(b\otimes b<0)$$
$$=-\alpha(\beta\otimes\beta)(\mathrm{sign}\,b\otimes\mathrm{sign}\,b)\chi(b\otimes b<0)=\alpha(\beta\otimes\beta)\chi(b\otimes b<0)=\tau .$$
Using this formula one can easily show that $w\otimes w$ commutes with $W$ (cf. (4.2)). Now, for any $c\in A$ we have
$$(\phi\otimes\phi)\Delta(c)=(w\otimes w)^{*}W(c\otimes I)W^{*}(w\otimes w)=W(w\otimes w)^{*}(c\otimes I)(w\otimes w)W^{*}=W(\phi(c)\otimes I)W^{*}=\Delta(\phi(c)).$$

We end this section with a short discussion showing that manageability is the condition distinguishing groups from semigroups. We recall that classical ‘ax + b’ group $G_{\mathrm{classical}}$ consists of all affine transformations $\mathbb{R}\ni x\mapsto ax+b\in\mathbb{R}$ with $a>0$. Assuming in addition that $b>0$ we define a subsemigroup $G^{+}_{\mathrm{classical}}\subset G_{\mathrm{classical}}$. In the quantum setting, the condition $b>0$ selects a subspace of $H$. Let $H_+=H(b>0)$ and $x\in H_+\otimes H_+$. On this subspace operator $\hat b\otimes b$ is strictly positive and computing $F_\hbar(\hat b\otimes b,\ \alpha(\hat\beta\otimes\beta)\chi(\hat b\otimes b<0))^{*}$ we have to use the first version of formula (0.1). Therefore
$$Wx=V_\theta\bigl(\log(e^{i\hbar/2}b^{-1}a\otimes b)\bigr)^{*}\,e^{\frac{i}{\hbar}\log(sb^{-1})\otimes\log a}\,x .$$
All operators appearing in this formula leave $H_+$ invariant. Therefore $H_+\otimes H_+$ is $W$-invariant. The restriction of $W$ to this invariant subspace will be denoted by $W_+$:
$$W_+=V_\theta\bigl(\log(e^{i\hbar/2}b_+^{-1}a_+\otimes b_+)\bigr)^{*}\,e^{\frac{i}{\hbar}\log(s_+b_+^{-1})\otimes\log a_+},$$
where $a_+,b_+,s_+$ are restrictions of $a,b,s$ to $H_+$. Restricting both sides of (2.1) to the subspace $H_+\otimes H_+\otimes H_+$ we see that $W_+$ is a multiplicative unitary. This multiplicative unitary is not manageable. Indeed $H_+\otimes H_+$ is not $\widetilde W$-invariant and the operator $\widetilde W_+=\chi(b^{\top}\otimes b>0)\,\widetilde W\,\chi(b^{\top}\otimes b>0)$ is not unitary. To obtain a $C^*$-algebra we have to replace (4.4) by the formula
$$A_+=\{(\omega\otimes\mathrm{id})W_+ +(\omega'\otimes\mathrm{id})W_+^{*}:\ \omega,\omega'\in B(H_+)_*\}^{\text{norm closure}}. \tag{4.19}$$
One can show that $\log a_+,\ b_+\ \eta\ A_+$ and that $A_+$ is generated by these two elements. Let $G_+$ be the quantum space corresponding to the $C^*$-algebra $A_+$. The formula
$$\Delta_+(c)=W_+(c\otimes I)W_+^{*}$$
defines a coassociative comultiplication $\Delta_+\in\mathrm{Mor}(A_+,A_+\otimes A_+)$. One can verify that
$$\Delta_+(a_+)=a_+\otimes a_+,\qquad \Delta_+(b_+)=a_+\otimes b_+ + b_+\otimes I .$$
In the second formula $a_+\otimes b_+ + b_+\otimes I$ is essentially selfadjoint and has a unique selfadjoint extension. $\Delta_+$ introduces a semigroup structure on $G_+$. Clearly $G_+$ is a quantum deformation of $G^{+}_{\mathrm{classical}}$. In this way we constructed an example of a quantum semigroup coming from a non-manageable multiplicative unitary. This is rather surprising: Kac–Takesaki operators corresponding to semi-subgroups of locally compact groups are non-unitary coisometries. It shows that manageability (rather than unitarity) is the condition distinguishing groups from semigroups.

5. The Dual of ‘ax + b’ Quantum Group

Let $W$ be the multiplicative unitary introduced by (4.2). The theory of multiplicative unitaries provides a simple method of constructing group duals. Following Baaj and Skandalis we denote by $\Sigma: H\otimes H\to H\otimes H$ the flip operator: $\Sigma(x\otimes y)=y\otimes x$ for any $x,y\in H$. The corresponding flip acting on operators will be denoted by $\sigma$: $\sigma(c\otimes c')=\Sigma(c\otimes c')\Sigma=c'\otimes c$ for any $c,c'\in B(H)$. It is well known that for any manageable multiplicative unitary $W$, the operator $\widehat W=\Sigma W^{*}\Sigma$ is also a manageable multiplicative unitary. By definition the regular dual of the quantum group related to a multiplicative unitary $W$ is the quantum group related to $\widehat W$. The algebra of ‘continuous functions vanishing at infinity’ on the dual of the group is introduced by the formula:
$$\hat A=\{(\mathrm{id}\otimes\omega)W^{*}:\ \omega\in B(H)_*\}^{\text{norm closure}}.$$
The dual group structure is given by the comultiplication $\hat\Delta\in\mathrm{Mor}(\hat A,\hat A\otimes\hat A)$ such that
$$(\hat\Delta\otimes\mathrm{id})W=W_{23}W_{13}. \tag{5.1}$$
The following theorem reduces the description of the dual of ‘ax + b’ group to the original group.

Theorem 5.1. Operators $\hat a,\ \hat b,\ i\hat b\hat\beta$ are affiliated with $\hat A$. There exists a $C^*$-isomorphism $\psi: A\to\hat A$ such that $\psi(a)=\hat a$, $\psi(b)=\hat b$ and $\psi(ib\beta)=i\hat b\hat\beta$. This isomorphism reverses order of the group operation:
$$\hat\Delta(\psi(c))=\sigma(\psi\otimes\psi)\Delta(c) \tag{5.2}$$
for any $c\in A$.

We shall use the following

Proposition 5.2. Let $H$ be a Hilbert space and $(a,b,\beta)\in G_H$. Assume that $\ker b=\{0\}$. Then the triples $(a,b,\beta)$, $(a,e^{i\hbar/2}ab,\beta)$ and $(e^{i\hbar/2}|b|^{-1}a,b,\beta)$ are unitarily
equivalent. In particular the triple $(a,e^{i\hbar/2}ab,\beta)\in G_H$ and $(e^{i\hbar/2}|b|^{-1}a,b,\beta)\in G_H$. Moreover if $s$ is a strictly positive selfadjoint operator acting on $H$ such that $s$ commutes with $a,b,\beta$, then the triple $(sa,b,\beta)$ is unitarily equivalent to $(a,b,\beta)\in G_H$.

Proof. Let
$$U_1=e^{\frac{i}{2\hbar}(\log a)^2},\qquad U_2=e^{\frac{i}{2\hbar}(\log|b|)^2}\qquad\text{and}\qquad U_3=|b|^{-\frac{i}{\hbar}\log s}.$$
Clearly $\beta$ commutes with $U_1,U_2,U_3$, $a$ commutes with $U_1$ and $b$ commutes with $U_2,U_3$. Using [20, Statement 3 of Theorem 3.3] we check that $U_1bU_1^{*}=e^{i\hbar/2}ab$, $U_2aU_2^{*}=e^{i\hbar/2}|b|^{-1}a$ and $U_3aU_3^{*}=sa$. Therefore
$$U_1(a,b,\beta)U_1^{*}=(a,e^{i\hbar/2}ab,\beta),$$
$$U_2(a,b,\beta)U_2^{*}=(e^{i\hbar/2}|b|^{-1}a,b,\beta),$$
$$U_3(a,b,\beta)U_3^{*}=(sa,b,\beta).$$

Proof of Theorem 5.1. We set $(a_1,b_1)=(e^{i\hbar}b^{-2}a,\ b)$, $(a_2,b_2)=(a_1,\ e^{i\hbar/2}a_1b_1)$, $(a_3,b_3)=(e^{i\hbar/2}|b_2|^{-1}a_2,\ b_2)$ and $(a_4,b_4)=(sa_3,\ b_3)$. One can easily verify that $a_4=s|b|^{-1}=\hat a$ and $b_4=e^{i\hbar/2}b^{-1}a=\hat b$. By Proposition 5.2 the triples $(a,b,\beta)$, $(a_1,b_1,\beta)$, $(a_2,b_2,\beta)$, $(a_3,b_3,\beta)$ and $(a_4,b_4,\beta)=(\hat a,\hat b,\hat\beta)$ are unitarily equivalent. Let $Z\in B(H)$ be a unitary operator such that $\hat a=Z^{*}aZ$, $\hat b=Z^{*}bZ$, $\hat\beta=Z^{*}\beta Z$ and $\psi$ be the automorphism of $B(H)$ implemented by $Z$:
$$\psi(c)=Z^{*}cZ .$$
Then $\psi(a)=\hat a$, $\psi(b)=\hat b$, $\psi(\beta)=\hat\beta$ and taking into account definition (4.2) we see that operator $(\mathrm{id}\otimes\psi)W$ is invariant with respect to the flip: $\sigma(\mathrm{id}\otimes\psi)W=(\mathrm{id}\otimes\psi)W$. Therefore, for any $\omega\in B(H)_*$ we have:
$$\psi((\omega\otimes\mathrm{id})W)=(\omega\otimes\mathrm{id})(\mathrm{id}\otimes\psi)W=(\mathrm{id}\otimes\omega)(\mathrm{id}\otimes\psi)W=(\mathrm{id}\otimes\omega_Z)W, \tag{5.3}$$
where $\omega_Z=\omega\circ\psi\in B(H)_*$. Formula (5.3) shows that $\psi(A)=\hat A$. We shall verify (5.2). Let $c=(\omega\otimes\mathrm{id})W\in A$. Then by the above formula $\psi(c)=(\mathrm{id}\otimes\omega_Z)W$ and using (5.1) we obtain
$$\hat\Delta(\psi(c))=(\mathrm{id}\otimes\mathrm{id}\otimes\omega_Z)(\hat\Delta\otimes\mathrm{id})W=(\mathrm{id}\otimes\mathrm{id}\otimes\omega_Z)W_{23}W_{13}.$$
Therefore
$$\sigma\hat\Delta(\psi(c))=(\mathrm{id}\otimes\mathrm{id}\otimes\omega_Z)W_{13}W_{23}. \tag{5.4}$$
On the other hand
$$(\psi\otimes\psi)\Delta(c)=(\psi\otimes\psi)\Delta((\omega\otimes\mathrm{id})W)=(\omega\otimes\psi\otimes\psi)W_{12}W_{13}=(\omega\otimes\mathrm{id}\otimes\mathrm{id})[(\mathrm{id}\otimes\psi)W]_{12}[(\mathrm{id}\otimes\psi)W]_{13}.$$
Remembering that $(\mathrm{id}\otimes\psi)W$ is flip-invariant and that $\psi$ is multiplicative we obtain:
$$(\psi\otimes\psi)\Delta(c)=(\mathrm{id}\otimes\mathrm{id}\otimes\omega)[(\mathrm{id}\otimes\psi)W]_{13}[(\mathrm{id}\otimes\psi)W]_{23}=(\mathrm{id}\otimes\mathrm{id}\otimes\omega)(\mathrm{id}\otimes\mathrm{id}\otimes\psi)W_{13}W_{23}=(\mathrm{id}\otimes\mathrm{id}\otimes\omega_Z)W_{13}W_{23}.$$
Comparing this formula with (5.4) we obtain (5.2).

Appendix A. Affiliation Relation and Tensor Product

For any Hilbert space $H$ we denote by $C^*(H)$ the set of all non-degenerate separable $C^*$-algebras of operators acting on $H$.

Proposition A.1. Let $T_1,T_2$ be nonzero normal operators acting on Hilbert spaces $H_1,H_2$ respectively and let $A_1\in C^*(H_1)$ and $A_2\in C^*(H_2)$. Then
$$(T_1\otimes T_2\ \eta\ A_1\otimes A_2)\iff(T_1\ \eta\ A_1\ \text{and}\ T_2\ \eta\ A_2). \tag{A.1}$$
Proof. The implication ‘$\Leftarrow$’ follows from [17, Theorem 6.1]. We shall prove the converse. Multiplying if necessary $T_2$ by a complex number, we may assume that $1\in\mathrm{Sp}\,T_2$. Then for any $r>0$ the spectral subspace $H_2(|T_2-1|<r)\neq\{0\}$. Let $\Omega_r$ be a norm 1 vector belonging to this subspace and $\omega_r$ be the state of $A_2$ corresponding to this vector: $\omega_r(c)=(\Omega_r|c|\Omega_r)$ for any $c\in A_2$. For any $f\in C_\infty(\mathbb{C})$ and any $t\in\mathbb{C}$ we set
$$f_r(t)=(\Omega_r|f(tT_2)|\Omega_r).$$
Clearly
$$f_r(t)=\int_{\mathbb{R}}f(t\tau)\,d\mu_r(\tau),$$
where $\mu_r$ is a probability measure on $\mathbb{C}$ such that $\mu_r(\Lambda)=(\Omega_r|\chi(T_2\in\Lambda)|\Omega_r)$ for any measurable subset $\Lambda\subset\mathbb{R}$. Condition $\Omega_r\in H_2(|T_2-1|<r)$ implies that the support of $\mu_r$ is contained in the ball $\{t\in\mathbb{C}:|t-1|<r\}$. Using this result one can easily show that $f_r\in C_\infty(\mathbb{C})$ and that $f_r$ converges uniformly to $f$, when $r\to 0$:
$$\lim_{r\to 0}f_r=f. \tag{A.2}$$
A moment of reflection shows that
$$(\mathrm{id}\otimes\omega_r)f(T_1\otimes T_2)=f_r(T_1). \tag{A.3}$$
If $T_1\otimes T_2\ \eta\ A_1\otimes A_2$, then $f(T_1\otimes T_2)\in M(A_1\otimes A_2)$ and the above formula shows that $f_r(T_1)\in M(A_1)$. Taking into account (A.2) we obtain: $f(T_1)\in M(A_1)$. Clearly the mapping
$$C_\infty(\mathbb{C})\ni f\to f(T_1)\in M(A_1) \tag{A.4}$$
is a $*$-algebra homomorphism. Assume for the moment that $f(t)>0$ for all $t\in\mathbb{C}$. Then (cf. [18, formula 1.8]) $f(T_1\otimes T_2)>0$ on $\mathrm{Sp}(A_1\otimes A_2)$ and by (A.3), $f_r(T_1)>0$ on $\mathrm{Sp}\,A_1$. It means that $f_r(T_1)A_1$ is dense in $A_1$. Therefore (A.4) is a morphism from $C_\infty(\mathbb{C})$ into $A_1$. The function $f(t)=t$ is an element affiliated with $C_\infty(\mathbb{C})$. Applying the morphism (A.4) to this element we obtain $f(T_1)=T_1$. Therefore $T_1\ \eta\ A_1$. In the same way one can show that $T_2\ \eta\ A_2$.

Remark A.2. We are strongly convinced that the equivalence (A.1) holds for any nonzero closed operators $T_1$ and $T_2$. However we were unable to find a proof working for operators that are not normal.

Appendix B. A QEF Equality

This appendix may be treated as a supplement to [20]. We shall prove an equality satisfied by the quantum exponential function $F_\hbar$ with $\hbar=\frac{\pi}{2k+3}$, where $k=0,1,2,\dots$. We start with some simple properties of the polynomial $w(t)$ of the order $2k+3$ introduced by (1.15). Let
$$\Phi=\{-e^{-i(\frac12-\ell)\hbar}:\ \ell=1,2,\dots,2k+3\}$$
be the set of all zeroes of $w$. The reader should notice that $\Phi$ is contained in the upper half plane. One can easily verify that the set $\Phi\cup(-\Phi)=\{t: 1+t^{2(2k+3)}=0\}$. Therefore $w(t)w(-t)=1+t^{2(2k+3)}$. Moreover $\Phi^{-1}=-\Phi$ and the product of all elements of $\Phi$ equals $-i(-1)^k=-i\alpha$. Therefore
$$w(t)=i\alpha t^{2k+3}w(-t^{-1}). \tag{B.1}$$
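The two polynomial identities above can be checked numerically. The sketch below is a minimal illustration, not part of the original argument. It assumes the product form $w(t)=\prod_{\ell=1}^{2k+3}(1+e^{i(\frac12-\ell)\hbar}t)$, read off from the product that appears later in this appendix, because formula (1.15) itself is not reproduced here. The function names `w` and `check_identities` are hypothetical.

```python
import cmath
import math


def w(t: complex, k: int) -> complex:
    """Assumed product form of w: prod_{l=1}^{2k+3} (1 + exp(i(1/2 - l)hbar) t)."""
    n = 2 * k + 3
    hbar = math.pi / n
    result = 1 + 0j
    for l in range(1, n + 1):
        result *= 1 + cmath.exp(1j * (0.5 - l) * hbar) * t
    return result


def check_identities(t: complex, k: int, tol: float = 1e-9) -> bool:
    """Check w(t)w(-t) = 1 + t^(2(2k+3)) and (B.1) at a single nonzero point t."""
    n = 2 * k + 3
    alpha = (-1) ** k
    # First identity: w(t) w(-t) = 1 + t^{2(2k+3)}
    lhs1 = w(t, k) * w(-t, k)
    rhs1 = 1 + t ** (2 * n)
    # Identity (B.1): w(t) = i * alpha * t^{2k+3} * w(-1/t)
    lhs2 = w(t, k)
    rhs2 = 1j * alpha * t ** n * w(-1 / t, k)
    # Compare with a tolerance relative to the size of the numbers involved
    scale = max(1.0, abs(rhs1), abs(rhs2))
    return abs(lhs1 - rhs1) <= tol * scale and abs(lhs2 - rhs2) <= tol * scale
```

A call such as `check_identities(0.7 + 0.2j, k=1)` evaluates both identities at one sample point; agreement at a handful of points is only a sanity check, not a proof.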
Finally $\Phi$ is symmetric with respect to the imaginary axis. Therefore $\overline{w(t)}=w(-\bar t)$.

Let $r\in\mathbb{R}$ and $\varrho=\pm 1$. We claim that
$$\overline{F_\hbar(r,\varrho\chi(r<0))}\,F_\hbar(-r,\varrho\chi(r>0))=w(r)^{-1}[1+i\varrho r^{2k+3}]=w(r)^{-1}+\alpha\varrho\,w(-r^{-1})^{-1}. \tag{B.2}$$
The last equality follows immediately from (B.1). Applying complex conjugation to all parts of (B.2) we obtain the same formula with $r$ replaced by $-r$. Therefore it is sufficient to prove (B.2) for $r>0$. In this case computing $F_\hbar(r,\varrho\chi(r<0))$
($F_\hbar(-r,\varrho\chi(r>0))$ respectively) we have to use the first (the second respectively) version of formula (0.1). We obtain:
$$\mathrm{LHS}=\overline{V_\theta(\log r)}\,V_\theta(\log r-\pi i)\,[1+i\varrho r^{\frac{\pi}{\hbar}}]. \tag{B.3}$$
We recall that $\hbar=\frac{\pi}{2k+3}$, where $k=0,1,2,\dots$. Therefore $\pi=(2k+3)\hbar$. We know (cf. [20, Formula (1.31)]) that
$$V_\theta(x+i\hbar)=(1+e^{i\hbar/2}e^{x})V_\theta(x)$$
for any $x\in\mathbb{C}$. Using this formula $(2k+3)$-times with $x=\log r-i\ell\hbar$ ($\ell=1,2,\dots,2k+3$) we obtain
$$V_\theta(\log r)=\prod_{\ell=1}^{2k+3}\bigl(1+e^{i(\frac12-\ell)\hbar}r\bigr)\,V_\theta(\log r-\pi i).$$
For real $r$, $|V_\theta(\log r)|=1$. Therefore
$$\overline{V_\theta(\log r)}\,V_\theta(\log r-\pi i)=\prod_{\ell=1}^{2k+3}\bigl(1+e^{i(\frac12-\ell)\hbar}r\bigr)^{-1}=w(r)^{-1}.$$
Formula (B.3) shows now, that
$$\mathrm{LHS}=w(r)^{-1}[1+i\varrho r^{\frac{\pi}{\hbar}}]$$
and (B.2) follows.

Acknowledgement

A large part of the paper was written during the stays of the first author at the Institute of Mathematics of Trondheim University. Numerous discussions with members and visitors of the Institute helped me fix many points of this work. Among persons who contributed in this way were: Fons Van Daele, Magnus Landstad, Johan Kustermans, Stefaan Vaes, Yoshiomi Nakagami and many others. Special thanks are due to Magnus Landstad, Christian Skau and other members of the Institute for their exceptional hospitality and creation of an excellent atmosphere for the work. Finally the first author would like to thank his close collaborators in Warsaw: Wieslaw Pusz and Piotr Soltan. The paper owes a lot to them. They read the entire manuscript making many important remarks and pointing out numerous errors and misprints. The authors are grateful to Komitet Badań Naukowych (grant No 2 P0A3 030 14) and to The Foundation for Polish Science for the financial support.

References

[1] N. I. Achiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space, Pitman Publishing, Boston, London, Melbourne, 1981.
[2] W. Arveson, An Invitation to C∗-Algebra, Springer-Verlag New York, Heidelberg, Berlin, 1976.
[3] S. Baaj and G. Skandalis, Unitaires multiplicatifs et dualité pour les produits croisés de C∗-algèbres, Ann. Scient. Éc. Norm. Sup., 4e série, t. 26 (1993) 425–488.
[4] J. Kustermans and S. Vaes, Locally compact quantum groups, to appear in Ann. Scient. Éc. Norm. Sup., see also A simple definition for locally compact quantum groups, C. R. Acad. Sci., Paris, sér. I 328 (10) (1999) 871–876.
[5] M. B. Landstad, Duality theory of covariant systems, Trans. Amer. Math. Soc. 248 (2) (1979) 223–267.
[6] K. Maurin, Methods of Hilbert Spaces, Warszawa, 1967.
[7] G. K. Pedersen, C∗-Algebras and Their Automorphism Groups, Academic Press, London, New York, San Francisco, 1979.
[8] W. Pusz and S. L. Woronowicz, A new quantum deformation of ‘ax + b’ group, under preparation.
[9] M. Rowicka-Kudlicka, Unitary representations of the quantum “ax + b” group, math.QA/0102151.
[10] K. Schmüdgen, Operator representations of $\mathbb{R}^2_q$, Publications of RIMS Kyoto University 29 (1993) 1030–1061.
[11] K. Schmüdgen, Integral operator representations of $\mathbb{R}^2_q$, $X_{q,\gamma}$ and $SL_q(2,\mathbb{R})$, Commun. Math. Phys. 159 (1994) 217–237.
[12] P. M. Soltan and S. L. Woronowicz, A remark on manageable multiplicative unitaries, under preparation.
[13] G. Skandalis, Duality for locally compact ‘quantum groups’ (joint work with S. Baaj), Mathematisches Forschungsinstitut Oberwolfach, Tagungsbericht 46, 1991, C∗-algebren, 20.10–26.10.1991, p. 20.
[14] S. Vaes and L. Vainerman, Extensions of locally compact quantum groups and bicrossed product construction, Preprint, 2001.
[15] A. Van Daele, The Haar measure on some locally compact groups, Preprint, 2001.
[16] S. L. Woronowicz, Unbounded elements affiliated with C∗-algebras and non-compact quantum groups, Commun. Math. Phys. 136 (1991) 399–432.
[17] S. L. Woronowicz and K. Napiórkowski, Operator theory in the C∗-algebra framework, Reports on Math. Phys. 31 (1992) 353–371.
[18] S. L. Woronowicz, C∗-algebras generated by unbounded elements, Reviews Math. Phys. 7(3) (1995) 481–521.
[19] S. L. Woronowicz, From multiplicative unitaries to quantum groups, Int. J. Math. 7(1) (1996) 127–149.
[20] S. L. Woronowicz, Quantum exponential function, Reviews Math. Phys. 12(6) (2000) 873–920.
[21] S. L. Woronowicz, Quantum ‘az + b’ group on complex plane, to appear in Int. J. Math.
[22] S. L. Woronowicz, Quantum SL(2, R) group on the C∗-algebra level, under preparation.
August 22, 2002 10:28 WSPC/148-RMP
00144
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 829–871 © World Scientific Publishing Company
KMS, ETC.
LOTHAR BIRKE∗ and JÜRG FRÖHLICH†
Theoretical Physics, ETH-Hönggerberg, CH-8093 Zürich
∗[email protected]
†[email protected]

Received 28 March 2002
Revised 2 July 2002

Dedicated to Huzihiro Araki on the occasion of his seventieth birthday, with admiration, affection and best wishes.

A general form of the “Wick rotation”, starting from imaginary-time Green functions of quantum-mechanical systems in thermal equilibrium at positive temperature, is established. Extending work of H. Araki, the rôle of the KMS condition and of an associated anti-unitary symmetry operation, the “modular conjugation”, in constructing analytic continuations of Green functions from real- to imaginary times, and back, is clarified. The relationship between the KMS condition for the vacuum with respect to Lorentz boosts, on one hand, and the spin-statistics connection and the PCT theorem, on the other hand, in local, relativistic quantum field theory is recalled. General results on the reconstruction of local quantum theories in various non-trivial gravitational backgrounds from “Euclidian amplitudes” are presented. In particular, a general form of the KMS condition is proposed and applied, e.g., to the Unruh- and the Hawking effects.

Keywords: KMS condition; quantum statistical mechanics; quantum field theory on curved space-times; PCT theorem.
1. Introduction and Summary of Results

The purpose of this paper is to review the general theory of quantum-mechanical matter in thermal equilibrium and to describe some applications of this theory to quantum field theory, in particular to theories on some curved space-times. We shall emphasize the rôle played by imaginary-time (“temperature-ordered”) Green functions (TOGF’s) in the analysis of quantum-mechanical systems in thermal equilibrium, because the TOGF’s are the objects that are most accessible to analytical studies of such systems based on functional integration; see [1, 2] and references given there, and [3]. Many features of quantum systems with infinitely many degrees of freedom, such as phase transitions and long-range order, critical behavior, strong correlations, etc. are encoded into the TOGF’s. Nevertheless, it is the real-time Green functions (RTGF’s) of systems in thermal equilibrium which are the
physical objects. In order to calculate e.g. the response of such systems to small changes in the external control parameters, we need to know their RTGF’s. The connection between the RTGF’s and the TOGF’s of a quantum system in thermal equilibrium is analogous to the one between Wightman distributions and Schwinger functions of a local relativistic quantum field theory (QFT) at zero temperature, which has been unraveled in the work of Osterwalder and Schrader [4], see also [5], building on a lot of previous, deep work in axiomatic quantum field theory, see [6, 7] and references given there: One passes from TOGF’s to RTGF’s, and back, by analytic continuation in the time variables (“Wick rotation”). However, in contrast to the situation in local, relativistic QFT at zero temperature, one cannot make use of an (energy-) spectrum condition, in order to accomplish the analytic continuation at positive temperatures. While at zero temperature, the Hamiltonian of any reasonable quantum system is bounded from below, the spectrum of the thermal Hamiltonian, or “Liouvillian”, of a system with infinitely many degrees of freedom at positive temperature usually covers the entire real axis. At zero temperature, the analytic continuation from real to imaginary time, and back, is based on the fact that if the Hamiltonian H is a non-negative operator, then exp(izH) is bounded in operator norm by 1, provided Im(z) > 0. At positive temperature, the analytic continuation of RTGF’s in the time variables to the TOGF’s, and back, is based on the Kubo–Martin–Schwinger (KMS ) condition [8, 9] known to characterize thermal equilibrium states of quantum systems. The very formulation of the KMS condition for RTGF’s involves an analytic continuation of RTGF’s in one time-difference variable. An application of the generalized tube theorem then implies joint analyticity of RTGF’s in all time variables in a tubular domain containing, as a subset, cyclically ordered n-tuples of imaginary times. 
The TOGF’s are the restrictions of the analytically continued RTGF’s to the subset of cyclically ordered imaginary time arguments. The main problem studied in this paper is to start from Green functions (calculated e.g. with the help of functional integrals) which have all the properties of TOGF’s — including an invariance under cyclic rearrangements of their arguments, which is the imaginary-time version of the KMS condition — and prove that they can be analytically continued in their (imaginary-)time arguments back to real times to yield RTGF’s with all the right properties. Thus, we present a variant of the Osterwalder–Schrader reconstruction theorem at positive temperature. The reader is right in assuming that this cannot be a new result. However, while all the elements of our constructions have appeared in the literature, a complete synthesis does not appear to have been presented anywhere. It therefore seems worthwhile to attempt such a synthesis. The interest of the senior (second) author in these problems goes back to the first half of the 70’s. It was triggered by the work of Osterwalder and Schrader [4] mentioned above, Ruelle’s continuation [10] of Ginibre’s work on reduced density matrices [1], the classic work of Haag, Hugenholtz and Winnink [11] on KMS states in quantum statistical mechanics, Araki’s analytic continuation of RTGF’s [12], and
August 22, 2002 10:28 WSPC/148-RMP
00144
KMS, etc.
831
some work of Hoegh–Krohn on thermal field theory [13]. First results appeared in [14]. In preparing a course on statistical mechanics at Princeton University [15], he also became familiar with important work of Araki [16] on “relative Hamiltonians”. This led to a translation of the results of [11] to imaginary time [15]. A basic step towards a general (Osterwalder–Schrader type) reconstruction theorem at positive temperature was undertaken in [17], with crucial help by E. Nelson. Subsequently, there was important parallel work by A. Klein and L. Landau [18, 19]. However, in their work, use is made of mathematical structure, in particular of notions from the theory of random fields, which is not intrinsic to the general theory of KMS states. It may thus appear to be of interest to present details of some general results on the connection between RTGF’s and TOGF’s, even though, informally, they have been known since the late 1970’s.

Ever since the work of Bisognano and Wichmann [20], it has been known that the KMS condition also plays a fundamental rôle in relativistic QFT at zero temperature. The vacuum is a KMS state for every one-parameter subgroup of Lorentz boosts. This observation is intimately related to (and based on) the connection between spin and statistics [6, 7] and Jost’s general form of the PCT theorem [21]. This will be briefly recalled towards the end of the paper from the point of view of an imaginary-time formulation of QFT. In particular, the KMS condition at imaginary time will be seen to be a consequence of locality and of the connection between spin and statistics (and conversely!) and to give rise to a direct definition of the anti-unitary PCT symmetry operation.

The paper is concluded with lengthy comments on the imaginary-time formulation of QFT on some curved space-times, in particular the space-time of a Schwarzschild black hole and de Sitter space.
Recalling some general results on “virtual representations of symmetric spaces” proven in [22] (see [23] for a general survey of results of this type and additional references), it is shown how to reconstruct unitary representations of the Killing symmetries of space-time. The KMS condition then yields obvious variants of the spin-statistics connection and of the PCT theorem and provides general interpretations of the Unruh and Hawking effects.

2. KMS States According to Haag, Hugenholtz, Winnink, and Araki

2.1. Finite systems in thermal equilibrium

Consider a quantum-mechanical physical system confined to a compact subset of space. Its time-evolution is generated by a self-adjoint Hamiltonian, $H$, on the Hilbert space, $\mathcal{H}$, of pure physical state vectors. The energy spectrum of $H$ is discrete and bounded from below. Let $Q_1, \dots, Q_N$ be self-adjoint operators on $\mathcal{H}$ representing conserved quantities (i.e. “$[H, Q_i] = 0$”, $i = 1, \dots, N$) and commuting with all “observables”, which are identified with the self-adjoint operators in a subalgebra, $\mathcal{A}$, of the algebra of all bounded operators on $\mathcal{H}$. Let $\mu_1, \dots, \mu_N$ denote
L. Birke & J. Fröhlich
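the chemical potentials conjugate to the conserved quantities $Q_1, \dots, Q_N$. As recognized by Landau and von Neumann, the state, $\langle\,\cdot\,\rangle_{\beta,\mu}$, of the system describing thermal equilibrium at inverse temperature $\beta$ and chemical potentials $\mu_1, \dots, \mu_N$ is given by the density matrix

$$\rho_{\beta,\mu} := \Xi_{\beta,\mu}^{-1} \exp[-\beta H_\mu] , \tag{2.1}$$

where

$$H_\mu := H - \sum_{i=1}^{N} \mu_i Q_i , \qquad \Xi_{\beta,\mu} = \mathrm{tr}_{\mathcal{H}}\bigl[e^{-\beta H_\mu}\bigr] ;$$

namely

$$\langle a \rangle_{\beta,\mu} := \mathrm{tr}_{\mathcal{H}}[\rho_{\beta,\mu}\, a] , \qquad a \in \mathcal{A} . \tag{2.2}$$

The time-evolution of operators in $\mathcal{A}$ in the Heisenberg picture is given by

$$\alpha_t(a) := e^{itH} a\, e^{-itH} = e^{itH_\mu} a\, e^{-itH_\mu} , \qquad a \in \mathcal{A} , \tag{2.3}$$

where the second equation follows from the fact that elements of $\mathcal{A}$ commute with $Q_1, \dots, Q_N$. From (2.1)–(2.3) and the cyclicity of the trace we conclude that

$$\langle \alpha_t(a)\, b \rangle_{\beta,\mu} = \langle b\, \alpha_{t+i\beta}(a) \rangle_{\beta,\mu} , \tag{2.4}$$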
for arbitrary $a$, $b$ in $\mathcal{A}$. This is the famous KMS condition characterizing equilibrium states.

2.2. Systems with infinitely many degrees of freedom (thermodynamic limit)

Systems in non-compact subsets of physical space (e.g. the thermodynamic limit of physical systems) with infinitely many degrees of freedom are conveniently described as $C^*$-dynamical systems: The algebra of “observables” of such a system is thought to be a $C^*$-algebra $\mathcal{A}$ (with $\|a\|$ the $C^*$-norm of an element $a \in \mathcal{A}$); its states are described as normalized, positive linear functionals, $\omega$, on $\mathcal{A}$ (we may assume that $\mathcal{A}$ contains an identity element, $1$, and that states are normalized such that $\omega(1) = 1$). Symmetries of such a system are described by a group of $*$-automorphisms of $\mathcal{A}$. In particular, the time-translations are described by a one-parameter group,

$$\{\alpha_t(\cdot) \mid t \in \mathbb{R}\} , \tag{2.5}$$

of $*$-automorphisms of $\mathcal{A}$, weakly measurable in $t$.

It is convenient to introduce the following subalgebra, $\mathring{\mathcal{A}}$, of $\mathcal{A}$:

$$\mathring{\mathcal{A}} := \Bigl\{ a_f \equiv \int dt\, f(t)\, \alpha_t(a) \Bigm| a \in \mathcal{A},\ \hat{f} \in C_0^\infty(\mathbb{R}) \Bigr\} , \tag{2.6}$$
where $\hat{f}$ denotes the Fourier transform of $f$. Since $\hat{f}$ is assumed to have compact support, $f(t)$ is the restriction of an entire function to the real axis. If the $*$-automorphisms $\alpha_t$ are norm-continuous in $t$, then $\mathring{\mathcal{A}}$ is dense in $\mathcal{A}$ in norm. (For definition (2.6) to make sense, $\alpha_t(a)$ need only be weakly measurable in $t$. A reasonable hypothesis is to assume that $\omega(\alpha_t(a))$ is continuous in $t$, for a weak$^*$-dense set of states, $\omega$, of $\mathcal{A}$; norm continuity of $\alpha_t$ in $t$ does not usually hold.) For an element $a = b_f \in \mathring{\mathcal{A}}$,

$$\alpha_z(a) := \int dt\, f(t - z)\, \alpha_t(b) \tag{2.7}$$

is entire in $z$. As suggested by Eq. (2.4) and argued in [11], a state, $\omega_\beta$, of such a system describing thermal equilibrium at inverse temperature $\beta$ should satisfy the KMS condition

$$\omega_\beta(\alpha_t(a)\, b) = \omega_\beta(b\, \alpha_{t+i\beta}(a)) , \tag{2.8}$$

for all $a \in \mathring{\mathcal{A}}$, $b \in \mathcal{A}$, $t \in \mathbb{R}$. A state $\omega_\beta$ satisfying Eq. (2.8) is said to be a ($\beta$-)KMS state for $\alpha_t$. Note that the KMS condition (2.8) implies that $\omega_\beta$ is $\alpha_t$-invariant:

$$\omega_\beta(\alpha_t(a)) = \omega_\beta(a) , \qquad a \in \mathring{\mathcal{A}} . \tag{2.9}$$
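The KMS condition and the invariance property can be checked numerically in the finite-dimensional setting of Sec. 2.1, where the Gibbs state is given by a density matrix. The following is a minimal sketch, not part of the original text: it assumes a randomly generated Hermitian matrix as Hamiltonian, vanishing chemical potentials, and arbitrary values of the inverse temperature and of the time; the function name `finite_kms_check` and all parameters are illustrative.

```python
import numpy as np


def finite_kms_check(dim=4, beta=0.7, t=0.3, seed=0):
    """Return (KMS defect, invariance defect) for a random finite system.

    Assumptions (illustrative only): random Hermitian H, no conserved
    charges, Gibbs state rho = exp(-beta H) / tr exp(-beta H).
    """
    rng = np.random.default_rng(seed)

    def random_operator():
        return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

    m = random_operator()
    hamiltonian = (m + m.conj().T) / 2
    energies, u = np.linalg.eigh(hamiltonian)

    def exp_h(z):
        # exp(z H) from the spectral decomposition; z may be complex.
        return (u * np.exp(z * energies)) @ u.conj().T

    def alpha(z, a):
        # Heisenberg evolution alpha_z(a) = exp(izH) a exp(-izH), z complex.
        return exp_h(1j * z) @ a @ exp_h(-1j * z)

    rho = exp_h(-beta)
    rho = rho / np.trace(rho)

    def state(x):
        return np.trace(rho @ x)

    a, b = random_operator(), random_operator()
    # KMS condition: <alpha_t(a) b> = <b alpha_{t + i beta}(a)>
    kms_defect = abs(state(alpha(t, a) @ b) - state(b @ alpha(t + 1j * beta, a)))
    # Invariance: <alpha_t(a)> = <a>
    invariance_defect = abs(state(alpha(t, a)) - state(a))
    return kms_defect, invariance_defect
```

Both returned defects should vanish up to floating-point rounding. In finite dimensions every operator is entire for the time evolution, so no restriction to a dense subalgebra is needed in this toy setting.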
(To show (2.9), one sets $b = 1$ in (2.8)!) In order to characterize equilibrium states of infinite systems, the KMS condition (2.8) must be supplemented by an appropriate “separability-continuity condition”. For the purposes of this paper, the following notion appears to be adequate: A state $\omega_\beta$ of $(\mathcal{A}, \alpha_t)$ is said to be an equilibrium state at inverse temperature $\beta$ if and only if

(1) $\omega_\beta$ is a $\beta$-KMS state for $\alpha_t$;

(2) for arbitrary elements $a$ and $b$ of $\mathcal{A}$, $\omega_\beta(a\, \alpha_t(b))$ is a continuous function of $t$;

(3) the algebra $\mathcal{A}$ can be given a topology, $\tau$, which makes $\mathcal{A}$ a separable topological space, and such that $\omega_\beta(a \cdot b)$ is jointly continuous in $a$ and $b$ in the product topology on $\mathcal{A} \times \mathcal{A}$.

2.3. The GNS construction

A pair $(\mathcal{A}, \omega)$ of a $C^*$-algebra $\mathcal{A}$ and a state $\omega$ of $\mathcal{A}$ gives rise to a Hilbert space $\mathcal{H}_\omega$, a $*$-representation $\lambda_\omega$ of $\mathcal{A}$ on $\mathcal{H}_\omega$, and a cyclic vector $\Omega_\omega \in \mathcal{H}_\omega$ such that

$$\omega(a) = \langle \lambda_\omega(a)\Omega_\omega , \Omega_\omega \rangle , \qquad a \in \mathcal{A} , \tag{2.10}$$

where $\langle\cdot,\cdot\rangle$ is the scalar product on $\mathcal{H}_\omega$; see e.g. [24]. If Property (3), above, holds for the state $\omega$, then $\mathcal{H}_\omega$ is separable. If $\omega$ is invariant under a one-parameter
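$*$-automorphism group $\alpha_t$ and $(\mathcal{A}, \alpha_t, \omega)$ satisfy Properties (2) and (3), above, then there is a strongly continuous one-parameter group

$$\{e^{itL} \mid t \in \mathbb{R}\} \tag{2.11}$$

of unitary operators, with a self-adjoint generator

$$L = L^* \tag{2.12}$$

such that

$$\lambda_\omega(\alpha_t(a)) = e^{itL} \lambda_\omega(a)\, e^{-itL} , \tag{2.13}$$

and

$$e^{itL}\, \Omega_\omega = \Omega_\omega , \tag{2.14}$$

for all $t \in \mathbb{R}$. We define the kernel, $\mathcal{N}_\omega$, of $\omega$ to be the left-ideal in $\mathcal{A}$ given by

$$\mathcal{N}_\omega := \{a \in \mathcal{A} \mid \omega(a^* a) = 0\} . \tag{2.15}$$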
Let us now assume that $\omega = \omega_\beta$ is an equilibrium state for $(\mathcal{A}, \alpha_t)$ at inverse temperature $\beta$, in the sense that Properties (1)–(3) in Sec. 2.2 hold. Then $\mathcal{H}_\beta := \mathcal{H}_{\omega_\beta}$ is separable (Property (3)), the vector $\Omega_\beta := \Omega_{\omega_\beta}$ is not only cyclic for $\lambda(\mathcal{A})$, $\lambda \equiv \lambda_{\omega_\beta}$, but separating (i.e. $\lambda(a)\Omega_\beta = 0$ implies $\lambda(a) = 0$ on $\mathcal{H}_\beta$, a consequence of the KMS condition (2.8)), and $\mathcal{N}_{\omega_\beta}$ is a two-sided $*$-ideal in $\mathcal{A}$ (Property (1), i.e. KMS condition); hence $\mathcal{N}_{\omega_\beta} = \{0\}$ if $\mathcal{A}$ is simple and $\dim \mathcal{H}_\beta > 1$. The generator $L$ is then called thermal Hamiltonian or Liouvillian (see [24, 25, 11] for further details).

2.4. Bi-module structure of $\mathcal{H}_\beta$ and modular conjugation $J$

The KMS condition gives rise to the following remarkable objects identified by Haag, Hugenholtz and Winnink in their fundamental paper [11]: It is assumed that $\omega_\beta$ is an equilibrium state for $(\mathcal{A}, \alpha_t)$ at inverse temperature $\beta$, in the sense of Properties (1) through (3) of Sec. 2.2. As noted in Sec. 2.3, the vector $\Omega_\beta$ is then cyclic and separating for the algebra $\lambda(\mathcal{A})$. Thus, one can introduce a densely defined, anti-linear operator $S$ by

$$S \lambda(a) \Omega_\beta := \lambda(a)^* \Omega_\beta , \qquad a \in \mathcal{A} . \tag{2.16}$$

The KMS condition can be used to show that $S$ can be extended to a closed operator and to construct the polar decomposition of $S$. For this purpose, we define an anti-linear operator $J$ by setting

$$J \lambda(a) \Omega_\beta := S \lambda(\alpha_{-i\beta/2}(a)) \Omega_\beta = \lambda(\alpha_{i\beta/2}(a^*)) \Omega_\beta , \tag{2.17}$$

for arbitrary $a \in \mathring{\mathcal{A}}$. By Property (2), $\Omega_\beta$ is cyclic and separating for $\lambda(\mathring{\mathcal{A}})$; hence $J$ is a densely defined, anti-linear operator. Using the KMS condition and the invariance of $\omega_\beta$ under $\alpha_t$, one easily verifies that

$$J \text{ is an anti-unitary involution.} \tag{2.18}$$
Using (2.16)–(2.18) and (2.13), (2.14), we see that

$$S = J e^{-\beta L/2} = e^{\beta L/2} J \tag{2.19}$$

is the polar decomposition of $S$. From (2.17), (2.13) and (2.14),

$$J e^{itL} = e^{itL} J , \qquad \forall\, t \in \mathbb{R} , \tag{2.20}$$

or, equivalently (recalling that $J$ is anti-linear),

$$J L = -L J , \tag{2.21}$$

on the domain of definition of $L$. One defines

$$\rho(a) := J \lambda(a) J , \qquad a \in \mathcal{A} . \tag{2.22}$$

Since $J$ is an anti-unitary involution and $\lambda$ is a $*$-representation of $\mathcal{A}$, $\rho$ is an anti-(linear) $*$-representation of $\mathcal{A}$. By purely algebraic calculations, one finds that

$$[\rho(a), \lambda(b)] = 0 , \tag{2.23}$$

for arbitrary $a$, $b$ in $\mathcal{A}$. In fact [11],

$$\rho(\mathcal{A})'' = \lambda(\mathcal{A})' , \tag{2.24}$$

where $\mathcal{B}'$ denotes the commutant (commuting algebra) of an algebra $\mathcal{B} \subseteq L(\mathcal{H}_\beta)$, and $\mathcal{B}''$ denotes the double commutant.

These results of Haag, Hugenholtz and Winnink contributed to the development of Tomita–Takesaki theory, see [26, 25], which is among the deepest results in the theory of von Neumann algebras. The starting point is a von Neumann algebra $\mathcal{M}$ acting on a Hilbert space $\mathcal{H}$, with a cyclic and separating vector $\Omega \in \mathcal{H}$. One defines

$$S M \Omega = M^* \Omega , \qquad M \in \mathcal{M} .$$

It is difficult, but possible, to prove that $S$ can be closed. This implies that $\bar{S}$ has a polar decomposition

$$\bar{S} = J \exp(-\pi L) ,$$

where $J$ is an anti-unitary involution, and $L = L^*$ is self-adjoint. One then proves that

$$\alpha_t(M) := e^{itL} M e^{-itL}$$
is a $*$-automorphism group of $\mathcal{M}$, and that $\omega(M) := \langle M\Omega, \Omega\rangle$ is a $2\pi$-KMS state for $(\mathcal{M}, \alpha_t)$. As in (2.23), (2.24), it then follows that $\mathcal{M}' := J \mathcal{M} J$ is the commutant of $\mathcal{M}$. The anti-unitarity of $J$ and (2.23) may remind one of the PCT theorem in relativistic QFT and its proof [21]. The similarities are not accidental; see Sec. 4.1.

Let $(\mathcal{A}, \alpha_t)$ be a $C^*$-dynamical system, and $\omega_\beta$ an equilibrium state at inverse temperature $\beta$ for $(\mathcal{A}, \alpha_t)$, in the sense of Properties (1)–(3) of Sec. 2.2. Let us assume that $0$ is a simple eigenvalue of the Liouvillian $L$ corresponding to the unique eigenvector $\Omega_\beta$. Let $\omega$ be an arbitrary state which is normal with respect to $\omega_\beta$. Then the KMS condition for $\omega_\beta$ can be used to prove the property of “return to equilibrium”; namely

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, \omega(\alpha_t(a)) = \omega_\beta(a) , \qquad a \in \mathcal{A} , \tag{2.25}$$

which is a remarkable dynamical stability property of KMS states under local perturbations; see [27–29].

2.5. Thermal Green functions and their analytic continuation

Most properties of a physical system in thermal equilibrium are encoded in its (real-time) thermal Green functions (RTGF), which we define below. Let $(\mathcal{A}, \alpha_t)$ be a $C^*$-dynamical system, and let $\omega_\beta$ be an equilibrium state for $(\mathcal{A}, \alpha_t)$ at inverse temperature $\beta$, with Properties (1)–(3) of Sec. 2.2. For arbitrary $a_1, \dots, a_n$ in $\mathcal{A}$, $t_1, \dots, t_n$ in $\mathbb{R}$, we define

$$F_\beta(a_1, t_1, \dots, a_n, t_n) := \omega_\beta\Bigl( \prod_{j=1}^{n} \alpha_{t_j}(a_j) \Bigr) . \tag{2.26}$$
The functions $F_\beta$ are the real-time thermal Green functions. Because the state $\omega_\beta$ is $\alpha_t$-invariant, they only depend on the variables $s_1, s_2, \dots, s_{n-1}$ defined by

$$t_j = t_1 + \sum_{i=1}^{j-1} s_i , \qquad j = 2, \dots, n . \tag{2.27}$$

If $a_1, \dots, a_n$ are elements of the algebra $\mathring{\mathcal{A}}$ defined in (2.6), then

$$H_\beta(s_1, \dots, s_{n-1}) := F_\beta(a_1, t_1, \dots, a_n, t_n) \tag{2.28}$$

is the restriction of an analytic function $H_\beta(\zeta_1, \dots, \zeta_{n-1})$, $(\zeta_1, \dots, \zeta_{n-1}) \in \mathbb{C}^{n-1}$, to the real slice $\mathbb{R}^{n-1} \subset \mathbb{C}^{n-1}$. On the real slice this function is bounded by

$$|H_\beta(s_1, \dots, s_{n-1})| = |F_\beta(a_1, t_1, \dots, a_n, t_n)| = \Bigl| \omega_\beta\Bigl( \prod_{j=1}^{n} \alpha_{t_j}(a_j) \Bigr) \Bigr| \le \prod_{i=1}^{n} \|a_i\| , \tag{2.29}$$
because $\|\alpha_t(a)\| = \|a\|$, for $a \in \mathcal{A}$, $t \in \mathbb{R}$, and because $\omega_\beta$ is a state on $\mathcal{A}$. The KMS condition (2.8) implies that, for every $j = 1, \dots, n - 1$,

$$H_\beta(s_1, \dots, s_j + i\beta, \dots, s_{n-1}) = F_\beta(a_1, t_1, \dots, a_j, t_j, a_{j+1}, t_{j+1} + i\beta, \dots, a_n, t_n + i\beta) = F_\beta(a_{j+1}, t_{j+1}, \dots, a_n, t_n, a_1, t_1, \dots, a_j, t_j) , \tag{2.30}$$

and thus, as in (2.29),

$$|H_\beta(s_1, \dots, s_j + i\beta, \dots, s_{n-1})| \le \prod_{i=1}^{n} \|a_i\| , \tag{2.31}$$

for $j = 1, \dots, n - 1$. By the generalized tube theorem, due to Kunze, Stein, Malgrange and Zerner (see e.g. [30, 31]),

$$|H_\beta(\zeta_1, \dots, \zeta_{n-1})| \le \prod_{i=1}^{n} \|a_i\| , \tag{2.32}$$

for $a_1, \dots, a_n$ in $\mathring{\mathcal{A}}$ and $(\zeta_1, \dots, \zeta_{n-1})$ in the tube

$$T_{n-1} := \Bigl\{ (\zeta_1, \dots, \zeta_{n-1}) \Bigm| \operatorname{Im} \zeta_i > 0,\ \sum_{i=1}^{n-1} \operatorname{Im} \zeta_i < \beta \Bigr\} . \tag{2.33}$$

It follows that, for $a_1, \dots, a_n$ in $\mathring{\mathcal{A}}$, $F_\beta(a_1, t_1, \dots, a_n, t_n)$ is the boundary value of a function $F_\beta(a_1, z_1, \dots, a_n, z_n)$ analytic in $(z_1, \dots, z_n)$ on

$$T_n := \{ (z_1, \dots, z_n) \mid \operatorname{Im} z_1 < \operatorname{Im} z_2 < \cdots < \operatorname{Im} z_n < \operatorname{Im} z_1 + \beta \} \tag{2.34}$$

and bounded on the closure, $\bar{T}_n$, of $T_n$ by

$$|F_\beta(a_1, z_1, \dots, a_n, z_n)| \le \prod_{i=1}^{n} \|a_i\| . \tag{2.35}$$
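For orientation, the case $n = 2$ of the argument leading to (2.35) can be written out explicitly. The lines below are only a sketch in the notation of this section; the single variable $s := t_2 - t_1$ plays the role of $s_1$, and for $n = 2$ the tube $T_1$ is simply the strip $0 < \operatorname{Im}\zeta < \beta$.

```latex
% Invariance of \omega_\beta under \alpha_t reduces the two-point function to one variable:
H_\beta(s) = F_\beta(a_1, t_1, a_2, t_2)
           = \omega_\beta\bigl(a_1\, \alpha_{s}(a_2)\bigr) , \qquad s = t_2 - t_1 .
% KMS condition (2.8), applied with a = a_2, b = \alpha_{t_1}(a_1), t = t_2:
H_\beta(s + i\beta) = \omega_\beta\bigl(\alpha_{t_1}(a_1)\, \alpha_{t_2 + i\beta}(a_2)\bigr)
                    = \omega_\beta\bigl(\alpha_{t_2}(a_2)\, \alpha_{t_1}(a_1)\bigr)
                    = F_\beta(a_2, t_2, a_1, t_1) .
% Both boundary values are bounded by \|a_1\| \|a_2\|, and the bound extends to the strip:
|H_\beta(\zeta)| \le \|a_1\|\, \|a_2\| , \qquad 0 \le \operatorname{Im} \zeta \le \beta .
```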
By Property (2), Sec. 2.2, and definition (2.6), it follows that Properties (2.34) and (2.35) hold for arbitrary $a_1, \dots, a_n$ in $\mathcal{A}$. These results were first noticed by Araki [12].

The functions $F_\beta$ have an important positivity property. To start with, we note that, for $a \in \mathring{\mathcal{A}}$, $z \in \mathbb{C}$,

$$(\alpha_z(a))^* = \alpha_{\bar{z}}(a^*) . \tag{2.36}$$

Let $a^i_1, \dots, a^i_{n_i}$, $n_i = 1, 2, \dots$, be elements of $\mathring{\mathcal{A}}$, and $z^i_1, \dots, z^i_{n_i}$ be complex numbers with

$$0 < \operatorname{Im} z^i_1 < \cdots < \operatorname{Im} z^i_{n_i} < \beta/2 , \tag{2.37}$$

for $i = 1, \dots, N < \infty$. Let $Z_1, \dots, Z_N$ be arbitrary complex numbers and set

$$a = \sum_{j=1}^{N} Z_j \prod_{k=1}^{n_j} \alpha_{z^j_k - i\beta/2}(a^j_k) \in \mathcal{A} .$$
Since $\omega_\beta$ is a state, and by (2.36),

$$0 \le \omega_\beta(a a^*) = \sum_{i,j=1}^{N} Z_i \bar{Z}_j\, F_\beta\bigl(a^i_1, z^i_1 - i\beta/2, \dots, a^i_{n_i}, z^i_{n_i} - i\beta/2, (a^j_{n_j})^*, \bar{z}^j_{n_j} + i\beta/2, \dots, (a^j_1)^*, \bar{z}^j_1 + i\beta/2\bigr) . \tag{2.38}$$

By invariance, i.e.

$$F_\beta(a_1, z_1 + z, \dots, a_n, z_n + z) = F_\beta(a_1, z_1, \dots, a_n, z_n) , \tag{2.39}$$

the positivity (2.38) is seen to imply that the complex numbers

$$\Pi_{ij} := F_\beta\bigl(a^i_1, z^i_1, \dots, a^i_{n_i}, z^i_{n_i}, (a^j_{n_j})^*, \bar{z}^j_{n_j} + i\beta, \dots, (a^j_1)^*, \bar{z}^j_1 + i\beta\bigr) , \tag{2.40}$$
$i, j = 1, \dots, N$, are the matrix elements of a positive semi-definite matrix $\Pi$. Note that, by (2.37), $(z^i_1, \dots, z^i_{n_i}, \bar{z}^j_{n_j} + i\beta, \dots, \bar{z}^j_1 + i\beta) \in T_{n_i + n_j}$, for all $i$, $j$. Thus, by (2.35) and Property (2), Sec. 2.2, the positivity Property (2.38) holds for arbitrary $a^i_k \in \mathcal{A}$, $z^i_k$ as in (2.37), $k = 1, \dots, n_i$, $i = 1, \dots, N < \infty$.

We observe that the KMS condition (2.8), see also (2.30), implies that, for arbitrary $a_1, \dots, a_n$ in $\mathcal{A}$, $(z_1, \dots, z_n) \in T_n$,

$$F_\beta(a_1, z_1, \dots, a_n, z_n) = F_\beta(a_{j+1}, z_{j+1}, \dots, a_n, z_n, a_1, z_1 + i\beta, \dots, a_j, z_j + i\beta) . \tag{2.41}$$

Finally, Property (3), Sec. 2.2, and the KMS condition imply that all RTGF’s and all functions $F_\beta(a_1, z_1, \dots, a_n, z_n)$ can be obtained as limits of such functions evaluated on a countable set of $n$-tuples $(a_1, \dots, a_n)$.

Our main results in this section are stated in (2.34), (2.35) and in (2.39)–(2.41). In particular, (2.34) shows that we can define imaginary-time (“temperature-ordered”) Green functions (TOGF’s), $\varphi_\beta$, by setting

$$\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n) := F_\beta(a_1, i\tau_1, \dots, a_n, i\tau_n) , \tag{2.42}$$

for $a_1, \dots, a_n$ in $\mathcal{A}$, and

$$\tau_1 < \tau_2 < \cdots < \tau_n < \tau_1 + \beta . \tag{2.43}$$
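The definition of the TOGF’s, their translation invariance and the imaginary-time form of the KMS condition can be illustrated in finite dimensions, where the evolution at imaginary time $i\tau$ is $a \mapsto e^{-\tau H} a\, e^{\tau H}$. The following is a rough sketch, not taken from the original: the Hamiltonian and the operators are random matrices, and the helper names `make_system` and `togf`, as well as all numerical values, are hypothetical.

```python
import numpy as np


def make_system(dim=3, n_ops=3, seed=1):
    """Random Hermitian Hamiltonian (as eigen-decomposition) and random operators."""
    rng = np.random.default_rng(seed)

    def random_operator():
        return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

    m = random_operator()
    energies, u = np.linalg.eigh((m + m.conj().T) / 2)
    return energies, u, [random_operator() for _ in range(n_ops)]


def togf(energies, u, beta, ops, taus):
    """Imaginary-time Green function tr(rho * prod_j e^{-tau_j H} a_j e^{tau_j H})."""

    def exp_h(s):
        return (u * np.exp(s * energies)) @ u.conj().T

    product = exp_h(-beta) / np.exp(-beta * energies).sum()  # Gibbs state rho
    for a, tau in zip(ops, taus):
        product = product @ exp_h(-tau) @ a @ exp_h(tau)
    return np.trace(product)


beta = 1.0
energies, u, (a1, a2, a3) = make_system()
taus = (0.1, 0.35, 0.8)  # ordered: tau_1 < tau_2 < tau_3 < tau_1 + beta

phi = togf(energies, u, beta, [a1, a2, a3], taus)
# Translation invariance: shift all imaginary times by the same amount.
shifted = togf(energies, u, beta, [a1, a2, a3], [tau + 0.15 for tau in taus])
# Cyclic rearrangement (KMS at imaginary time): move (a1, tau_1) to the end, shifted by beta.
cyclic = togf(energies, u, beta, [a2, a3, a1], [taus[1], taus[2], taus[0] + beta])
```

In this toy model `phi`, `shifted` and `cyclic` should agree up to rounding, which is the finite-dimensional counterpart of (2.39) and (2.41) restricted to imaginary times.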
It is convenient to think of $\tau_1, \dots, \tau_n$ as angles on a circle of circumference $\beta$, ordered in accordance with the orientation chosen on the circle; see (2.39) and (2.41). The main properties of TOGF’s can immediately be inferred from (2.35) and (2.39)–(2.41). In the next section, we show that functions with all the general properties of TOGF’s are, in fact, the TOGF’s corresponding to an equilibrium state, $\omega_\beta$, of a $C^*$-dynamical system.

3. A Reconstruction Theorem at Positive Temperature

In this section, we show how to reconstruct the RTGF’s of a $C^*$-dynamical system in an equilibrium state from functions with all the properties of TOGF’s. Our result is an analogue of the Osterwalder–Schrader reconstruction theorem [4, 5], which solved a similar problem at zero temperature. A result of the kind we shall prove in this section, but with additional assumptions that make it inapplicable to systems of fermions, such as non-relativistic electron liquids (see [2]), has been proven in [19]; see [10, 13–15, 17] for earlier, partial results.

3.1. Green functions on an (imaginary-time) circle

Our starting point, in this section, is a set of Green functions depending on $n$-tuples $[a_1, \tau_1, \dots, a_n, \tau_n]$, where $a_i$ is an element of a separable topological space $\mathcal{S}$, $\tau_i$ is a point on a circle of circumference $\beta$, for all $i = 1, \dots, n$, and $(\tau_1, \dots, \tau_n) \in T_n^<$, where

$$T_n^< := \{ (\sigma_1, \dots, \sigma_n) \mid \sigma_1 < \sigma_2 < \cdots < \sigma_n < \sigma_1 + \beta \} . \tag{3.1}$$
These Green functions are denoted

$$\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n) , \tag{3.2}$$

$n = 0, 1, 2, \dots$, with $\varphi_\beta(\emptyset) = 1$. They are assumed to have the following properties. For arbitrary $a_1, \dots, a_n$ in $\mathcal{S}$, and $n = 0, 1, 2, \dots$:

(P1) Continuity: $\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$ is defined for arbitrary $(a_1, \dots, a_n) \in \mathcal{S}^{\times n}$ and $(\tau_1, \dots, \tau_n) \in T_n^<$; it is jointly continuous in $(a_1, \dots, a_n)$ in the product topology of $\mathcal{S}^{\times n}$, and it is a continuous function of $(\tau_1, \dots, \tau_n)$ on $T_n^<$.

(P2) Translation invariance:

$$\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n) = \varphi_\beta(a_1, \tau_1 + \tau, \dots, a_n, \tau_n + \tau) ,$$

for arbitrary $\tau \in \mathbb{R}$.

(P3) KMS condition:

$$\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n) = \varphi_\beta(a_{j+1}, \tau_{j+1}, \dots, a_n, \tau_n, a_1, \tau_1 + \beta, \dots, a_j, \tau_j + \beta) ,$$

for arbitrary $j = 1, \dots, n - 1$.

(P4) Reflection positivity: There is a continuous involution $*$ on $\mathcal{S}$,

$$\mathcal{S} \ni a \mapsto a^* \in \mathcal{S} , \qquad (a^*)^* = a , \quad \forall\, a ,$$

with the property that, for all $N = 1, 2, 3, \dots$, arbitrary $a^i_1, \dots, a^i_{n_i}$ in $\mathcal{S}$, $n_i = 0, 1, 2, \dots$, $i = 1, \dots, N$, the matrix $\Pi = (\Pi_{ij})_{i,j=1,\dots,N}$, defined by

$$\Pi_{ij} := \varphi_\beta\bigl(a^i_1, \tau^i_1, \dots, a^i_{n_i}, \tau^i_{n_i}, (a^j_{n_j})^*, \beta - \tau^j_{n_j}, \dots, (a^j_1)^*, \beta - \tau^j_1\bigr) , \tag{3.3}$$

with

$$0 < \tau^i_1 < \cdots < \tau^i_{n_i} < \beta/2 , \qquad \forall\, i , \tag{3.4}$$
is positive semi-definite.

In much of this section, we shall require a much stronger version of Property (P1), namely:

(P$^*$) TOGF’s on a $C^*$-algebra: The space $\mathcal{S}$ is a $C^*$-algebra with identity, $1$, and the involution $*$ in (P4) is the usual $*$-operation on $\mathcal{S}$. It is then assumed that

(P$^*$i) $\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$ is linear in each argument $a_i$, $i = 1, \dots, n$, jointly continuous in $(a_1, \dots, a_n)$ in the product topology on $\mathcal{S}^{\times n}$ of a topology on $\mathcal{S}$ in which $\mathcal{S}$ is separable, and continuous in $\tau_1, \dots, \tau_n$ on the closure, $\overline{T_n^<}$, of $T_n^<$;

(P$^*$ii)

$$\varphi_\beta(a_1, \tau_1, \dots, a_j, \tau_j, a_{j+1}, \tau_j, \dots, a_n, \tau_n) = \varphi_\beta(a_1, \tau_1, \dots, a_j \cdot a_{j+1}, \tau_j, \dots, a_n, \tau_n) ,$$

for arbitrary $j = 1, \dots, n - 1$, $n = 2, 3, \dots$;

(P$^*$iii)

$$\varphi_\beta(a_1, \tau_1, \dots, a_{j-1}, \tau_{j-1}, 1, \tau_j, \dots, a_n, \tau_n) = \varphi_\beta(a_1, \tau_1, \dots, a_{j-1}, \tau_{j-1}, a_{j+1}, \tau_{j+1}, \dots, a_n, \tau_n) ,$$

for arbitrary $j = 1, \dots, n$; and

(P$^*$iv)

$$|\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)| \le \prod_{j=1}^{n} \|a_j\| ,$$

where $\|(\cdot)\|$ is the $C^*$-norm on $\mathcal{S}$.

Remark 3.1. In the last section, we have seen that Properties (P1)–(P4) and (P$^*$) hold for the TOGF’s associated with an equilibrium state, $\omega_\beta$, of a $C^*$-dynamical system $(\mathcal{A}, \alpha_t)$, with $\mathcal{S} = \mathcal{A}$.

It may be appropriate to mention some examples of physical systems with TOGF’s satisfying Properties (P1)–(P4) and (P$^*$):

(1) Let $\mathcal{S}$ be the CAR algebra of a system of non-relativistic fermions of the kind considered by Ginibre in [1], and let $\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$ be the TOGF’s of such a system as constructed in [1, 10], for sufficiently small $\beta$. The functional-integral definition of $\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$ makes it clear that these functions can be defined
for arbitrary $n$-tuples $(\tau_1, \dots, \tau_n)$, and if $a_1, \dots, a_n$ are creation or annihilation operators then $\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$ is totally antisymmetric in its $n$ arguments $(a_i, \tau_i)$, $i = 1, \dots, n$. If $\Psi^*$ and $\Psi$ denote a creation operator and the corresponding annihilation operator in $\mathcal{S}$, then

$$\varphi_\beta(\Psi^*, \tau_1, \Psi, \tau_2) \overset{\text{KMS}}{=} \varphi_\beta(\Psi, \tau_2, \Psi^*, \tau_1 + \beta) = -\varphi_\beta(\Psi^*, \tau_1 + \beta, \Psi, \tau_2) . \tag{3.5}$$

Thus, $\varphi_\beta(\Psi^*, \tau_1, \Psi, \tau_2)$ is an anti-periodic function of $\tau_1 - \tau_2 \in [0, \beta]$.

(2) For systems of non-relativistic bosons or of Bose quantum fields, as considered in [13, 14, 19], one may choose $\mathcal{S}$ to be a $C^*$-algebra generated by Weyl operators constructed from bosonic creation and annihilation operators. For bosons, the creation and annihilation operators, $\Phi^*$, $\Phi$, are unbounded operators (in contrast to the bounded creation and annihilation operators for fermions). Yet, it may happen that, for arbitrary $n$, the TOGF’s

$$\varphi_\beta(\Phi^\#_1, \tau_1, \dots, \Phi^\#_n, \tau_n)$$

are well defined; here $\Phi^\#_j = \Phi_j$ or $\Phi^*_j$, for all $j$. The TOGF’s turn out to be totally symmetric under permutations of their arguments. Hence, the KMS condition implies that

$$\varphi_\beta(\Phi^*, \tau_1, \Phi, \tau_2) \overset{\text{KMS}}{=} \varphi_\beta(\Phi, \tau_2, \Phi^*, \tau_1 + \beta) = \varphi_\beta(\Phi^*, \tau_1 + \beta, \Phi, \tau_2) , \tag{3.6}$$

i.e. $\varphi_\beta(\Phi^*, \tau_1, \Phi, \tau_2)$ is a periodic function of $\tau_1 - \tau_2 \in [0, \beta]$.

3.2. The main theorem

In this section, we describe our main result concerning the reconstruction of a thermal equilibrium state and of real-time Green functions from a set of TOGF’s with the properties of Sec. 3.1. Let $\mathcal{S}$, $T_n^<$, etc. be as in Sec. 3.1.

Theorem 3.1 (Main Theorem). (1) Assume that the TOGF’s

$$\{\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)\}_{n=0}^{\infty}$$

have Properties (P1)–(P4) of Sec. 3.1. Then they uniquely determine a separable Hilbert space $\mathcal{H}_\beta$, a continuous, unitary one-parameter group $\{e^{itL}\}_{t \in \mathbb{R}}$ on $\mathcal{H}_\beta$, a vector $\Omega_\beta \in \mathcal{H}_\beta$ invariant under $\{e^{itL}\}_{t \in \mathbb{R}}$, and an anti-unitary operator $J$ on $\mathcal{H}_\beta$ such that

$$J \Omega_\beta = \Omega_\beta , \qquad e^{itL} J = J e^{itL} . \tag{3.7}$$

(2) If $\mathcal{S}$ is a $C^*$-algebra, and, in addition to (P1)–(P4), Property (P$^*$) of Sec. 3.1 holds, then the TOGF’s $\{\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)\}_{n=0}^{\infty}$
determine a $*$-representation, $\lambda$, of $\mathcal{S}$ on $\mathcal{H}_\beta$ and an anti-representation, $\rho$, of $\mathcal{S}$ on $\mathcal{H}_\beta$ given by

$$\rho(a) = J \lambda(a) J , \qquad a \in \mathcal{S} , \tag{3.8}$$

such that

$$[e^{itL} \lambda(a) e^{-itL}, e^{isL} \rho(b) e^{-isL}] = 0 , \tag{3.9}$$

for all $a$, $b$ in $\mathcal{S}$ and $t$, $s$ real. The state

$$\omega_\beta(\cdot) := \langle (\cdot)\, \Omega_\beta, \Omega_\beta \rangle \tag{3.10}$$

is a KMS state for $\lambda(\mathcal{S})$ and the time evolution

$$\lambda(a) \mapsto e^{itL} \lambda(a) e^{-itL} , \qquad a \in \mathcal{S} .$$

The functions $\{\varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)\}_{n=0}^{\infty}$ are the TOGF’s obtained from the real-time Green functions

$$\Bigl\langle \prod_{j=1}^{n} e^{it_j L} \lambda(a_j) e^{-it_j L}\, \Omega_\beta, \Omega_\beta \Bigr\rangle \tag{3.11}$$

by analytic continuation in the time variables $t_1, \dots, t_n$ to the tube $T_n$ defined in (2.34) and restriction to

$$\{ t_j = i\tau_j \mid j = 1, \dots, n,\ \tau_1 < \tau_2 < \cdots < \tau_n < \tau_1 + \beta \} .$$

Remark 3.2. A similar result, but in a more special situation, has been established by Klein and Landau in [19] (the results in [19] do not apply to systems of fermions, for example). With the exception of the very last part, this theorem was proven in [15]; see also [22, 23, 32] for further results. Our result is an analogue, at positive temperature, of the Osterwalder–Schrader reconstruction theorem [4, 5]. The proof of the main theorem forms the core of our paper.

3.3. Proof of Part (1) of the main theorem

The proof of the main theorem consists of a highly non-trivial extension of the GNS construction. The first step is to construct the Hilbert space $\mathcal{H}_\beta$.

(i) Construction of Hilbert space

We consider the linear space

$$\mathcal{V}_\beta := \bigoplus_{n=0}^{\infty} \mathcal{V}_\beta^{(n)} , \tag{3.12}$$

of formal expressions, where

$$\mathcal{V}_\beta^{(n)} := \Bigl\{ \sum_i Z_i\, [a^i_1, \tau^i_1, \dots, a^i_n, \tau^i_n] \Bigr\} , \tag{3.13}$$
with $Z_i \in \mathbb{C}$, $a^i_1, \dots, a^i_n$ in $\mathcal{S}$,

$$0 < \tau^i_1 < \cdots < \tau^i_n < \beta/2 , \tag{3.14}$$

for all $i$, and

$$\mathcal{V}_\beta^{(0)} := \mathbb{C} . \tag{3.15}$$

The space $\mathcal{V}_\beta$ can be equipped with a positive semi-definite inner product determined from

$$\langle [a_1, \tau_1, \dots, a_n, \tau_n], [b_1, \sigma_1, \dots, b_m, \sigma_m] \rangle := \varphi_\beta(a_1, \tau_1, \dots, a_n, \tau_n, b_m^*, \beta - \sigma_m, \dots, b_1^*, \beta - \sigma_1) , \tag{3.16}$$

by linearity in the first and anti-linearity in the second argument; $0 < \tau_1 < \cdots < \tau_n < \beta/2$, $0 < \sigma_1 < \cdots < \sigma_m < \beta/2$. The reflection positivity property, (P4), implies that, indeed, (3.16) determines a positive (semi-)definite inner product on $\mathcal{V}_\beta$. We define the kernel of $\langle\cdot,\cdot\rangle$ by

$$\mathcal{N}_\beta := \{ v \in \mathcal{V}_\beta \mid \langle v, v \rangle = 0 \} . \tag{3.17}$$
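The positivity of the inner product (3.16) can be made concrete in a finite-dimensional toy model: for TOGF’s coming from a Gibbs state, the Gram matrix of a few one-entry vectors $[a_i, \tau_i]$ should be Hermitian and positive semi-definite. The sketch below is only an illustration under these assumptions (random Hamiltonian and operators, hypothetical helper names `gram_matrix` and `phi2`); it does not reproduce any construction of the paper beyond formula (3.16) for $n = m = 1$.

```python
import numpy as np


def gram_matrix(dim=3, n_vectors=5, beta=1.0, seed=2):
    """Gram matrix G_ij = <[a_i, tau_i], [a_j, tau_j]> for a random finite system."""
    rng = np.random.default_rng(seed)

    def random_operator():
        return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

    m = random_operator()
    energies, u = np.linalg.eigh((m + m.conj().T) / 2)
    partition = np.exp(-beta * energies).sum()

    def exp_h(s):
        return (u * np.exp(s * energies)) @ u.conj().T

    def phi2(a, tau, b, sigma):
        # Two-point TOGF: tr(rho e^{-tau H} a e^{tau H} e^{-sigma H} b e^{sigma H}).
        x = exp_h(-beta - tau) @ a @ exp_h(tau - sigma) @ b @ exp_h(sigma)
        return np.trace(x) / partition

    ops = [random_operator() for _ in range(n_vectors)]
    taus = rng.uniform(0.05, beta / 2 - 0.05, size=n_vectors)  # 0 < tau < beta/2
    # Inner product (3.16) with n = m = 1: phi(a_i, tau_i, a_j^*, beta - tau_j).
    return np.array([
        [phi2(ops[i], taus[i], ops[j].conj().T, beta - taus[j])
         for j in range(n_vectors)]
        for i in range(n_vectors)
    ])


gram = gram_matrix()
smallest_eigenvalue = np.linalg.eigvalsh(gram).min()
```

In this model the inner product is a Hilbert–Schmidt inner product in disguise, which is why `smallest_eigenvalue` is expected to be non-negative up to rounding.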
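The equivalence class modulo $\mathcal{N}_\beta$ of an element $v \in \mathcal{V}_\beta$ is denoted by

$$\Phi(v) := v \bmod \mathcal{N}_\beta . \tag{3.18}$$

Clearly,

$$\mathcal{H}_\beta := \overline{\mathcal{V}_\beta / \mathcal{N}_\beta} , \tag{3.19}$$

where the closure is taken in the norm determined by the scalar product $\langle\cdot,\cdot\rangle$ on $\mathcal{V}_\beta / \mathcal{N}_\beta$, is a Hilbert space. By Property (P1) and the separability of $\mathcal{S}$, $\mathcal{H}_\beta$ is a separable Hilbert space. By construction, the linear space

$$\mathcal{D}_\beta := \Phi(\mathcal{V}_\beta) \tag{3.20}$$

is dense in $\mathcal{H}_\beta$. We define the vector $\Omega_\beta$ by

$$\Omega_\beta = \Phi([\emptyset]) , \tag{3.21}$$

with $\langle \Omega_\beta, \Omega_\beta \rangle = \varphi_\beta(\emptyset) := 1$.

(ii) Construction of a unitary one-parameter group of time translations

By linearity, the equation

$$[a_1, \tau_1, \dots, a_n, \tau_n]_\tau := [a_1, \tau_1 + \tau, \dots, a_n, \tau_n + \tau] , \tag{3.22}$$

for $-\tau_1 < \tau < \beta/2 - \tau_n$ ($0 < \tau_1 < \cdots < \tau_n < \beta/2$, $a_1, \dots, a_n$ in $\mathcal{S}$), defines a shift operator

$$\mathcal{V}_\beta \ni v \mapsto v_\tau \in \mathcal{V}_\beta , \tag{3.23}$$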
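for all $\tau \in (-\epsilon_-(v), \epsilon_+(v))$, for some positive numbers $\epsilon_-(v)$ and $\epsilon_+(v)$ (with $\epsilon_-(v) = \tau_1$, $\epsilon_+(v) = \beta/2 - \tau_n$, for $v$ as in (3.22)). It is clear from (3.22) that

$$(v_\tau)_\sigma = v_{\tau+\sigma} , \tag{3.24}$$

if $\tau$, $\sigma$ and $\tau + \sigma$ all belong to the open interval $(-\epsilon_-(v), \epsilon_+(v))$. Let $v$ and $w$ be two vectors in $\mathcal{V}_\beta$. Then the definition (3.16) of the inner product and Property (P2) (translation invariance) readily imply that

$$\langle v_\tau, w \rangle = \langle v, w_\tau \rangle \tag{3.25}$$

if $-\min(\epsilon_-(v), \epsilon_-(w)) < \tau < \min(\epsilon_+(v), \epsilon_+(w))$. We claim that

$$v \in \mathcal{N}_\beta \ \Rightarrow\ v_\tau \in \mathcal{N}_\beta , \quad \text{for } -\epsilon_-(v) < \tau < \epsilon_+(v) . \tag{3.26}$$

To prove (3.26), we notice that, for $-\epsilon_-(v)/2 < \tau < \epsilon_+(v)/2$,

$$0 \le \langle v_\tau, v_\tau \rangle = \langle v, v_{2\tau} \rangle \le \langle v, v \rangle^{1/2} \langle v_{2\tau}, v_{2\tau} \rangle^{1/2} = 0 , \tag{3.27}$$

for $v \in \mathcal{N}_\beta$, by the Cauchy–Schwarz inequality; hence $v_\tau \in \mathcal{N}_\beta$. For $\tau \in (-\epsilon_-(v)/2, \epsilon_+(v)/2)$ and $\tau_1 \in (-\epsilon_-(v)/4, \epsilon_+(v)/4)$, we have that $v_{\tau+\tau_1}$ and $v_{\tau+2\tau_1}$ are in $\mathcal{V}_\beta$, and

$$0 \le \langle v_{\tau+\tau_1}, v_{\tau+\tau_1} \rangle = \langle v_\tau, v_{\tau+2\tau_1} \rangle \le \langle v_\tau, v_\tau \rangle^{1/2} \langle v_{\tau+2\tau_1}, v_{\tau+2\tau_1} \rangle^{1/2} = 0 ,$$

because $v_\tau \in \mathcal{N}_\beta$, by (3.27). This makes it clear that the proof of (3.26) can be completed inductively.

Observation (3.26) permits us to define operators, $\Gamma_\tau$, on the dense domain $\mathcal{D}_\beta \subset \mathcal{H}_\beta$ as follows: Each $\Psi \in \mathcal{D}_\beta$ is of the form $\Psi = \Phi(v)$, for some $v \in \mathcal{V}_\beta$. For $\tau \in (-\epsilon_-(v), \epsilon_+(v))$, we set

$$\Gamma_\tau \Psi := \Phi(v_\tau) . \tag{3.28}$$

Defining

$$\epsilon_\pm(\Psi) := \sup_{v \in \mathcal{V}_\beta} \{ \epsilon_\pm(v) \mid \Phi(v) = \Psi \} , \tag{3.29}$$

we see that (3.26) implies that the left hand side of (3.28) is well defined, for $\tau \in (-\epsilon_-(\Psi), \epsilon_+(\Psi))$. Property (P1) (continuity) then implies that

$$\operatorname*{s-lim}_{\tau \to 0} \Gamma_\tau \Psi = \Psi , \qquad \forall\, \Psi \in \mathcal{D}_\beta . \tag{3.30}$$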
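Next, for $\Psi = \Phi(v)$ and $\tilde{\Psi} = \Phi(\tilde{v})$ in $\mathcal{D}_\beta$, and $-\min(\epsilon_-(\Psi), \epsilon_-(\tilde{\Psi})) < \tau < \min(\epsilon_+(\Psi), \epsilon_+(\tilde{\Psi}))$,

$$\begin{aligned}
\langle \Gamma_\tau \Psi, \tilde{\Psi} \rangle &= \langle \Gamma_\tau \Phi(v), \Phi(\tilde{v}) \rangle \\
&= \langle v_\tau, \tilde{v} \rangle , && \text{by (3.28), (3.18)} \\
&= \langle v, \tilde{v}_\tau \rangle , && \text{by (3.25)} \\
&= \langle \Phi(v), \Gamma_\tau \Phi(\tilde{v}) \rangle \\
&= \langle \Psi, \Gamma_\tau \tilde{\Psi} \rangle .
\end{aligned} \tag{3.31}$$

Finally, for $\Psi = \Phi(v) \in \mathcal{D}_\beta$, and $\tau$, $\sigma$, $\tau + \sigma$ all in the interval $(-\epsilon_-(\Psi), \epsilon_+(\Psi))$,

$$\Gamma_\tau(\Gamma_\sigma \Psi) = \Gamma_\tau(\Gamma_\sigma \Phi(v)) = \Gamma_\tau \Phi(v_\sigma) = \Phi(v_{\tau+\sigma}) = \Gamma_{\tau+\sigma} \Phi(v) = \Gamma_{\tau+\sigma} \Psi . \tag{3.32}$$

A somewhat remarkable theorem on the essential self-adjointness of local, Hermitian semigroups proven in [17, 18] says that from (3.28) through (3.32) it follows that

$$\Gamma_\tau \Psi = e^{\tau L} \Psi , \quad \text{for } -\epsilon_-(\Psi) < \tau < \epsilon_+(\Psi) , \tag{3.33}$$

for every $\Psi \in \mathcal{D}_\beta$, where $L$, the “Liouvillian”, is essentially self-adjoint on a domain $\mathring{\mathcal{D}}_\beta \subset \mathcal{D}_\beta$ which is dense in $\mathcal{H}_\beta$. (In [17], there is an explicit construction of $\mathring{\mathcal{D}}_\beta$.) Clearly, $\Gamma_\tau \Omega_\beta = \Gamma_\tau \Phi([\emptyset]) = \Phi([\emptyset]) = \Omega_\beta$, for arbitrary $\tau$, i.e.

$$L \Omega_\beta = 0 . \tag{3.34}$$

By Stone’s theorem, $\{\exp(itL)\}_{t \in \mathbb{R}}$ defines a strongly continuous one-parameter group of unitary operators on $\mathcal{H}_\beta$ leaving $\Omega_\beta$ invariant.

(iii) Construction of an anti-unitary operator $J$

For

$$v := Z [a_1, \tau_1, \dots, a_n, \tau_n] \in \mathcal{V}_\beta , \qquad Z \in \mathbb{C} , \tag{3.35}$$

we define

$$j v := \bar{Z} [a_n^*, \beta/2 - \tau_n, \dots, a_1^*, \beta/2 - \tau_1] . \tag{3.36}$$

Equation (3.36) is required for arbitrary $n$ and hence, by anti-linearity, defines an anti-linear operator $j$ on all of $\mathcal{V}_\beta$. Choosing

$$w := \zeta [b_1, \sigma_1, \dots, b_m, \sigma_m] \in \mathcal{V}_\beta , \qquad \zeta \in \mathbb{C} ,$$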
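we observe that, by (3.36) and (3.16),

$$\begin{aligned}
\langle jv, jw \rangle &= \bar{Z} \zeta\, \varphi_\beta(a_n^*, \beta/2 - \tau_n, \dots, a_1^*, \beta/2 - \tau_1, b_1, \beta/2 + \sigma_1, \dots, b_m, \beta/2 + \sigma_m) \\
&\overset{\text{(P2)}}{=} \bar{Z} \zeta\, \varphi_\beta(a_n^*, -\tau_n, \dots, a_1^*, -\tau_1, b_1, \sigma_1, \dots, b_m, \sigma_m) \\
&\overset{\text{(P3)}}{=} \zeta \bar{Z}\, \varphi_\beta(b_1, \sigma_1, \dots, b_m, \sigma_m, a_n^*, \beta - \tau_n, \dots, a_1^*, \beta - \tau_1) \\
&\overset{(3.16)}{=} \langle w, v \rangle .
\end{aligned} \tag{3.37}$$

Thus, if $v \in \mathcal{N}_\beta$,

$$\langle jv, jv \rangle = \langle v, v \rangle = 0 , \quad \text{i.e. } jv \in \mathcal{N}_\beta . \tag{3.38}$$

This observation enables us to define an anti-linear operator $J$ on $\mathcal{D}_\beta$ by setting

$$J \Phi(v) := \Phi(jv) . \tag{3.39}$$

Then,

$$\langle J\Phi(v), J\Phi(w) \rangle = \langle \Phi(jv), \Phi(jw) \rangle = \langle jv, jw \rangle \overset{(3.37)}{=} \langle w, v \rangle = \langle \Phi(w), \Phi(v) \rangle , \tag{3.40}$$

i.e. $J$ is anti-unitary. Next, we note that, for $v$ as in (3.35),

$$j(v_\tau) = \bar{Z} [a_n^*, \beta/2 - \tau_n - \tau, \dots, a_1^*, \beta/2 - \tau_1 - \tau] = (jv)_{-\tau} , \tag{3.41}$$

for $\tau \in (-\epsilon_-(v), \epsilon_+(v))$. It then follows from (3.28) and (3.39) that, for $\Psi \in \mathcal{D}_\beta$ and $\tau \in (-\epsilon_-(\Psi), \epsilon_+(\Psi))$,

$$J \Gamma_\tau \Psi = \Gamma_{-\tau} J \Psi . \tag{3.42}$$

Since $J$ is anti-unitary, and by (3.33),

$$J e^{itL} = e^{itL} J , \quad \text{or} \quad J L = -L J . \tag{3.43}$$

Finally,

$$J \Omega_\beta = J \Phi([\emptyset]) = \Phi([\emptyset]) = \Omega_\beta . \tag{3.44}$$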
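Two identities used in this proof, the symmetry (3.25) of the shift and the relation (3.37) for the reflection $j$, can be checked numerically on one-entry vectors $v = [a, \tau]$ in a finite-dimensional Gibbs state. The sketch below is an illustration under these assumptions only; the tuple representation of vectors and the names `inner`, `shift` and `j` are hypothetical conveniences, and the coefficients $Z$, $\zeta$ are set to 1.

```python
import numpy as np

rng = np.random.default_rng(3)
dim, beta = 3, 1.0


def random_operator():
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))


m = random_operator()
energies, u = np.linalg.eigh((m + m.conj().T) / 2)
partition = np.exp(-beta * energies).sum()


def exp_h(s):
    return (u * np.exp(s * energies)) @ u.conj().T


def phi2(a, tau, b, sigma):
    # Two-point TOGF of the Gibbs state at imaginary times tau, sigma.
    x = exp_h(-beta - tau) @ a @ exp_h(tau - sigma) @ b @ exp_h(sigma)
    return np.trace(x) / partition


def inner(v, w):
    # Inner product (3.16) for v = [a, tau], w = [b, sigma].
    (a, tau), (b, sigma) = v, w
    return phi2(a, tau, b.conj().T, beta - sigma)


def shift(v, s):
    # Shift (3.22): [a, tau] -> [a, tau + s].
    return (v[0], v[1] + s)


def j(v):
    # Reflection (3.36) with Z = 1: [a, tau] -> [a^*, beta/2 - tau].
    return (v[0].conj().T, beta / 2 - v[1])


v = (random_operator(), 0.15)
w = (random_operator(), 0.30)
s = 0.1  # small enough that all shifted times stay inside (0, beta/2)

shift_symmetry_defect = abs(inner(shift(v, s), w) - inner(v, shift(w, s)))
reflection_defect = abs(inner(j(v), j(w)) - inner(w, v))
```

Both defects should vanish up to rounding; the second one is the finite-dimensional shadow of the anti-unitarity of $J$.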
3.4. Proof of Part (2) of the main theorem

To prove Part (2) of our main theorem, we must assume that the imaginary-time Green functions (TOGF’s), $\varphi_\beta$, not only obey Properties (P1)–(P4) of Sec. 3.1, but, in addition, Property (P$^*$). In particular, we shall henceforth assume that $\mathcal{S}$ is a $C^*$-algebra.

(i) Construction of a $*$-representation $\lambda$ and an anti-representation $\rho$ of $\mathcal{S}$ on $\mathcal{H}_\beta$

Thanks to Property (P$^*$), in particular (P$^*$i), we may define the linear spaces

$$\tilde{\mathcal{V}}_\beta := \bigoplus_{n=0}^{\infty} \tilde{\mathcal{V}}_\beta^{(n)} ,$$

where

$$\tilde{\mathcal{V}}_\beta^{(n)} := \Bigl\{ \sum_i Z_i\, [a^i_1, \tau^i_1, \dots, a^i_n, \tau^i_n] \Bigr\} ,$$

with $Z_i \in \mathbb{C}$, $a^i_1, \dots, a^i_n \in \mathcal{S}$, and

$$0 \le \tau^i_1 \le \tau^i_2 \le \cdots \le \tau^i_n \le \beta/2 , \tag{3.45}$$

for all $i$; $\tilde{\mathcal{V}}_\beta^{(0)} = \mathcal{V}_\beta^{(0)} = \mathbb{C}$. Note that, thanks to Property (P$^*$ii), the expressions

$$[a_1, \tau_1, \dots, a_j, \tau_j, a_{j+1}, \tau_j, \dots, a_n, \tau_n] \equiv [a_1, \tau_1, \dots, a_j \cdot a_{j+1}, \tau_j, \dots, a_n, \tau_n] \tag{3.46}$$

must be identified, for $\tau_{j+1} = \tau_j$, for arbitrary $j$. Obviously, the space $\tilde{\mathcal{V}}_\beta$ contains the space $\mathcal{V}_\beta$ defined in (3.12). For $a \in \mathcal{S}$ and $v := [a_1, \tau_1, \dots, a_n, \tau_n] \in \tilde{\mathcal{V}}_\beta$, we define

$$a v := [a, 0, a_1, \tau_1, \dots, a_n, \tau_n] \in \tilde{\mathcal{V}}_\beta , \tag{3.47}$$

and

$$v a^* := [a_1, \tau_1, \dots, a_n, \tau_n, a^*, \beta/2] \in \tilde{\mathcal{V}}_\beta . \tag{3.48}$$

These definitions can be extended to all of $\tilde{\mathcal{V}}_\beta$ by linearity. Let $\tilde{\mathcal{N}}_\beta$ denote the kernel of the inner product $\langle\cdot,\cdot\rangle$ on $\tilde{\mathcal{V}}_\beta$, defined as in (3.16), (3.17). (An example of a vector in $\tilde{\mathcal{N}}_\beta$ is the difference of the two vectors in (3.46).) By (3.16),

$$\langle v a^*, w \rangle = \langle v, w a \rangle , \tag{3.49}$$
(3.50)
for arbitrary v and w in V˜β . These equations and the Cauchy–Schwarz inequality ˜β is a two-sided ideal under left- and right multiplication by elements show that N
August 22, 2002 10:28 WSPC/148-RMP
848
00144
L. Birke & J. Fr¨ ohlich
˜ β , of Hβ and, for a ∈ S, of S. This permits us to define a dense, linear subspace, D ˜ linear operators λ(a) and ρ(a) on Dβ by setting ˜β , ˜ β := Φ(V˜β ) = V˜β mod N D
(3.51)
and λ(a)Φ(v) := Φ(av) ,
ρ(a)Φ(v) := Φ(va∗ ) ,
(3.52)
for arbitrary v ∈ V˜β . We note that λ(·) is linear, while ρ(·) is anti-linear on S. Property (P∗ ii) shows that, for arbitrary a and b in S, λ(a) · λ(b) = λ(a · b) ,
ρ(a) · ρ(b) = ρ(a · b) ,
(3.53)
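on the domain $\tilde D_\beta$. Further important properties of $\lambda$ and $\rho$ are described in the following lemma.

Lemma 3.1. (1) For arbitrary $a \in \mathcal{S}$,
\[ \rho(a)\Psi = J\lambda(a)J\Psi , \qquad \Psi \in \tilde D_\beta , \tag{3.54} \]
where $J$ is the anti-unitary operator defined in (3.36), (3.39);
(2) $\langle\lambda(a)\Psi, \tilde\Psi\rangle = \langle\Psi, \lambda(a^*)\tilde\Psi\rangle$, for arbitrary $\Psi$ and $\tilde\Psi$ in $\tilde D_\beta$, i.e.
\[ \lambda(a)^* \supseteq \lambda(a^*) ; \tag{3.55} \]
(3) $\lambda(a)$ extends to a bounded operator on $\mathcal{H}_\beta$ with
\[ \|\lambda(a)\| \le \|a\| . \tag{3.56} \]

Remark 3.3. By (1), Parts (2) and (3) also hold for $\rho(a)$, instead of $\lambda(a)$.

Proof. (1) For $v = [a_1, \tau_1, \dots, a_n, \tau_n] \in \tilde V_\beta$,
\[ \begin{aligned}
J\lambda(a)J\Phi(v) &= J\lambda(a)\Phi(jv) \\
&= J\lambda(a)\Phi[a_n^*, \beta/2 - \tau_n, \dots, a_1^*, \beta/2 - \tau_1] \\
&= J\Phi[a, 0, a_n^*, \beta/2 - \tau_n, \dots, a_1^*, \beta/2 - \tau_1] \\
&= \Phi[a_1, \tau_1, \dots, a_n, \tau_n, a^*, \beta/2] \\
&= \Phi(va^*) = \rho(a)\Phi(v) ,
\end{aligned} \tag{3.57} \]
by (3.39), (3.36) and (3.52). Part (1) then follows by linearity.

Part (2) is an immediate consequence of (3.50). Here are some details: Let $v$ be as above and $\tilde v := [b_1, \sigma_1, \dots, b_m, \sigma_m] \in \tilde V_\beta$. We set $\Psi := \Phi(v)$, $\tilde\Psi := \Phi(\tilde v)$. Then, using (3.52), (3.16) and the KMS condition (P3),
\[ \begin{aligned}
\langle\lambda(a)\Psi, \tilde\Psi\rangle &= \phi_\beta(a, 0, a_1, \tau_1, \dots, a_n, \tau_n, b_m^*, \beta - \sigma_m, \dots, b_1^*, \beta - \sigma_1) \\
&= \phi_\beta(a_1, \tau_1, \dots, a_n, \tau_n, b_m^*, \beta - \sigma_m, \dots, b_1^*, \beta - \sigma_1, a, \beta) \\
&= \langle\Psi, \lambda(a^*)\tilde\Psi\rangle .
\end{aligned} \]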
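It remains to prove Part (3). By the Cauchy–Schwarz inequality, Part (2) and (3.53),
\[ \begin{aligned}
|\langle\lambda(a)\Psi, \tilde\Psi\rangle|^2 &\le \langle\lambda(a)\Psi, \lambda(a)\Psi\rangle \langle\tilde\Psi, \tilde\Psi\rangle = \langle\lambda(a^*a)\Psi, \Psi\rangle \langle\tilde\Psi, \tilde\Psi\rangle \\
&\le \langle\lambda(a^*a)\Psi, \lambda(a^*a)\Psi\rangle^{1/2} \langle\Psi, \Psi\rangle^{1/2} \langle\tilde\Psi, \tilde\Psi\rangle \\
&= \langle\lambda((a^*a)^2)\Psi, \Psi\rangle^{1/2} \langle\Psi, \Psi\rangle^{1/2} \langle\tilde\Psi, \tilde\Psi\rangle \\
&\le \langle\lambda((a^*a)^2)\Psi, \lambda((a^*a)^2)\Psi\rangle^{1/4} \langle\Psi, \Psi\rangle^{3/4} \langle\tilde\Psi, \tilde\Psi\rangle \\
&\le \dots \le \langle\lambda((a^*a)^{2^N})\Psi, \Psi\rangle^{2^{-N}} \langle\Psi, \Psi\rangle^{1 - 2^{-N}} \langle\tilde\Psi, \tilde\Psi\rangle ,
\end{aligned} \]
for all $N = 1, 2, 3, \dots$. Next, we note that
\[ \begin{aligned}
|\langle\lambda((a^*a)^{2^N})\Psi, \Psi\rangle| &= |\phi_\beta((a^*a)^{2^N}, 0, a_1, \tau_1, \dots, a_n, \tau_n, a_n^*, \beta - \tau_n, \dots, a_1^*, \beta - \tau_1)| \\
&\le \|(a^*a)^{2^N}\| \prod_{j=1}^{n} (\|a_j\| \cdot \|a_j^*\|) \le \|a\|^{2^{N+1}} \prod_{j=1}^{n} \|a_j\|^2 ,
\end{aligned} \]
by Property (P$^*$iv). We have used that $\|(\cdot)\|$ is a $C^*$-norm. By letting $N$ tend to $\infty$, we find that
\[ |\langle\lambda(a)\Psi, \tilde\Psi\rangle| \le \|a\| \langle\Psi, \Psi\rangle^{1/2} \langle\tilde\Psi, \tilde\Psi\rangle^{1/2} , \]
from which Part (3) follows by (anti-)linearity in $\Psi$, $\tilde\Psi$, resp.

We define a linear subspace $D_\beta^+ \subset \tilde D_\beta$ of $\mathcal{H}_\beta$ by
\[ D_\beta^+ := \{\Phi([a, \beta/2]) \mid a \in \mathcal{S}\} . \tag{3.58} \]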
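Returning briefly to the proof of Part (3) of Lemma 3.1: the repeated Cauchy–Schwarz step can be watched at work in a finite-dimensional toy model. The sketch below is an illustration only, under the assumption that $\lambda(a^*a)$ is replaced by a positive $2\times 2$ matrix `P` and $\Psi$ by a vector `psi`; the names `iterated_bounds`, `P` and `psi` are hypothetical and do not appear in the paper. It computes the quantities $\langle P^{2^N}\psi, \psi\rangle^{2^{-N}} \langle\psi, \psi\rangle^{1-2^{-N}}$ and shows that they increase with $N$ while staying below $\|a\|^2 \langle\psi, \psi\rangle$.

```python
# Toy model (assumption): P stands in for lambda(a* a), psi for the vector Psi.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(P, psi):
    # <psi, P psi> for a real symmetric matrix P and a real vector psi
    n = len(psi)
    return sum(psi[i] * P[i][j] * psi[j] for i in range(n) for j in range(n))

def iterated_bounds(P, psi, steps):
    """Return <psi, P^(2^N) psi>^(2^-N) * <psi, psi>^(1 - 2^-N) for N = 0..steps."""
    norm_sq = sum(x * x for x in psi)
    bounds, power = [], P
    for N in range(steps + 1):
        value = expectation(power, psi)
        bounds.append(value ** (2.0 ** -N) * norm_sq ** (1.0 - 2.0 ** -N))
        power = matmul(power, power)  # P^(2^N) -> P^(2^(N+1))
    return bounds

# a = [[1, 2], [0, 1]] gives a* a = [[1, 2], [2, 5]], whose largest
# eigenvalue is ||a||^2 = 3 + 2*sqrt(2).
P = [[1.0, 2.0], [2.0, 5.0]]
psi = [1.0, 0.0]
bounds = iterated_bounds(P, psi, 6)
op_norm_sq = 3.0 + 2.0 * 2.0 ** 0.5
```

In this toy case the sequence `bounds` is non-decreasing and approaches `op_norm_sq` from below, which is the finite-dimensional shadow of the limit $N \to \infty$ taken in the proof.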
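To each vector $\Psi \in \mathcal{H}_\beta$, we associate an operator $\hat\Psi : D_\beta^+ \to \mathcal{H}_\beta$, by setting
\[ \hat\Psi\,\Phi([a, \beta/2]) := \rho(a^*)\Psi . \tag{3.59} \]
Clearly,
\[ \hat\Psi\,\Omega_\beta = \Psi , \tag{3.60} \]
and (3.59), (3.60) show that if
\[ \tilde\Psi = \hat{\tilde\Psi}\,\Omega_\beta = \hat\Psi\,\Omega_\beta = \Psi , \tag{3.61} \]
then
\[ \hat{\tilde\Psi} = \hat\Psi , \tag{3.62} \]
as operators on $D_\beta^+$.

Lemma 3.2. For arbitrary $a$, $b$ in $\mathcal{S}$ and real numbers $t$, $s$,
\[ [e^{itL}\lambda(a)e^{-itL}, e^{isL}\rho(b)e^{-isL}] = 0 , \tag{3.63} \]
where $L$ is the Liouvillian constructed in Sec. 3.3; see Eq. (3.33).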
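A finite-dimensional sanity check of (3.54) and (3.63) may be helpful. The sketch below assumes the familiar matrix model in which the Hilbert space consists of $2\times 2$ complex matrices $X$, $\lambda(a)X = aX$, $\rho(b)X = Xb^*$, $JX = X^*$, and $e^{itL}X = e^{itH}Xe^{-itH}$ for a diagonal $H$. This model and all names in the code (`lam`, `rho`, `evolve`, `energies`) are illustrative assumptions, not constructions taken from the paper.

```python
import cmath

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def evolve(X, energies, t):
    """e^{itL} X := e^{itH} X e^{-itH} for H = diag(energies)."""
    n = len(X)
    return [[cmath.exp(1j * t * (energies[i] - energies[j])) * X[i][j]
             for j in range(n)] for i in range(n)]

def lam(a):
    return lambda X: matmul(a, X)          # left multiplication

def rho(b):
    return lambda X: matmul(X, dagger(b))  # right multiplication by b*

def conjugate_by_dynamics(op, energies, t):
    """X -> e^{itL} op(e^{-itL} X)."""
    return lambda X: evolve(op(evolve(X, energies, -t)), energies, t)

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(len(A)) for j in range(len(A)))

energies = [0.0, 1.3]
a = [[1 + 2j, 0.5], [-1j, 2.0]]
b = [[0.3, 1j], [2.0, -1.0 + 1j]]
X = [[1.0, 2.0 - 1j], [0.5j, -3.0]]

lam_t = conjugate_by_dynamics(lam(a), energies, 0.7)
rho_s = conjugate_by_dynamics(rho(b), energies, -1.1)

# Analogue of (3.63): the two time-evolved operators commute.
commutator_residual = max_abs_diff(lam_t(rho_s(X)), rho_s(lam_t(X)))
# Analogue of (3.54): J lam(a) J X = (a X*)* = X a* = rho(a) X.
modular_residual = max_abs_diff(dagger(lam(a)(dagger(X))), rho(a)(X))
```

In the matrix model both residuals vanish up to rounding, simply because left and right multiplication commute by associativity; the lemma asserts the same structure in the reconstructed Hilbert space.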
Remark 3.4. Lemmas 3.1 and 3.2 show that $\mathcal{H}_\beta$ is a bi-module for the $C^*$-algebra, $\mathcal{A}$, generated by $\{e^{itL}\lambda(a)e^{-itL} \mid a \in \mathcal{S}, t \in \mathbb{R}\}$.

Proof. Since $\{\exp(itL)\}_{t\in\mathbb{R}}$ is a one-parameter unitary group, it is enough to prove (3.63) for $s = 0$. Let $\tilde\Psi \in \mathcal{H}_\beta$. By unitarity of $\exp(itL)$ and Lemma 3.1(3),
\[ \Psi := e^{itL}\lambda(a)e^{-itL}\tilde\Psi \in \mathcal{H}_\beta . \]
Using (3.61) and (3.62), it is not hard to show that
\[ \hat\Psi = e^{itL}\lambda(a)e^{-itL}\hat{\tilde\Psi} . \]
This equality and (3.59) then yield
\[ \begin{aligned}
\rho(b)\Psi &= \hat\Psi\,\Phi([b^*, \beta/2]) \\
&= e^{itL}\lambda(a)e^{-itL}\hat{\tilde\Psi}\,\Phi([b^*, \beta/2]) \\
&= e^{itL}\lambda(a)e^{-itL}\rho(b)\tilde\Psi ,
\end{aligned} \]
which proves (3.63) for $s = 0$.

We conclude this section with a comment on the KMS condition at real time. For $a$ and $b$ in $\mathcal{S}$, $t \in \mathbb{R}$, we have that
\[ \begin{aligned}
\langle e^{itL}\lambda(a)e^{-itL}\lambda(b)\Omega_\beta, \Omega_\beta\rangle &= \langle\lambda(b)\Omega_\beta, e^{itL}\lambda(a^*)\Omega_\beta\rangle \\
&= \langle Je^{itL}\lambda(a^*)\Omega_\beta, J\lambda(b)\Omega_\beta\rangle \\
&= \langle e^{itL}e^{\beta L/2}\lambda(a)\Omega_\beta, e^{\beta L/2}\lambda(b^*)\Omega_\beta\rangle ,
\end{aligned} \]
by (3.36), (3.39) and (3.28). This implies that
\[ F_{ab}(t) := \langle e^{itL}\lambda(a)e^{-itL}\lambda(b)\Omega_\beta, \Omega_\beta\rangle \]
is the boundary value of a function $F_{ab}(z)$ analytic in $z$ on the strip $\{z \mid -\beta < \operatorname{Im} z < 0\}$, which is the KMS condition! In the next subsection, we use somewhat more sophisticated arguments of this type to reconstruct all real-time Green functions from TOGF's, $\phi_\beta$, by analytic continuation in the time variables.

(ii) Back to real times

In this subsection, we show that if a set of TOGF's, $\phi_\beta(a_1, \tau_1, \dots, a_n, \tau_n)$, $a_i \in \mathcal{S}$, for all $i$, $(\tau_1, \dots, \tau_n) \in T_n^<$ (see (3.1)), have Properties (P1)–(P4) and (P$^*$) of Sec. 3.1, then they are the restrictions of functions $F_\beta(a_1, z_1, \dots, a_n, z_n)$, analytic in $(z_1, \dots, z_n)$ on the tubular domain $T_n$ defined in Eq. (2.34), to the region $(z_1, \dots, z_n) = (i\tau_1, \dots, i\tau_n)$, $(\tau_1, \dots, \tau_n) \in T_n^<$.
Real-time Green functions are then obtained as the boundary values of the functions $F_\beta(a_1, z_1, \dots, a_n, z_n)$ when $z_i$ tends to the real axis, for all $i = 1, \dots, n$. Our results in this subsection will complete the proof of our main theorem, stated in Sec. 3.2. To start with, we note that
\[ \lambda(a_1)\Omega_\beta = \lambda(a_1)\Phi([\emptyset]) = \Phi([a_1, 0]) \in \mathcal{H}_\beta , \tag{3.64} \]
for all $a_1 \in \mathcal{S}$. Furthermore, by (3.33) and (3.28),
\[ e^{\tau_1 L}\lambda(a_1)\Omega_\beta = e^{\tau_1 L}\Phi([a_1, 0]) = \Phi([a_1, \tau_1]) \in \mathcal{H}_\beta , \tag{3.65} \]
for $0 \le \tau_1 \le \beta/2$. Since $\{\exp(itL)\}_{t\in\mathbb{R}}$ is a one-parameter unitary group on $\mathcal{H}_\beta$,
\[ e^{z_1 L}\lambda(a_1)\Omega_\beta = e^{it_1 L}\Phi([a_1, \tau_1]) \in \mathcal{H}_\beta , \tag{3.66} \]
for $z_1 = \tau_1 + it_1$, $0 \le \tau_1 \le \beta/2$, $t_1 \in \mathbb{R}$, and the left hand side of (3.66) is holomorphic in $z_1$, for $0 < \operatorname{Re} z_1 \equiv \tau_1 < \beta/2$. Furthermore,
\[ \|e^{z_1 L}\lambda(a_1)\Omega_\beta\|^2 = \langle\Phi([a_1, \tau_1]), \Phi([a_1, \tau_1])\rangle = \phi_\beta(a_1, \tau_1, a_1^*, \beta - \tau_1) \le \|a_1\|^2 , \tag{3.67} \]
by Property (P$^*$iv). Part (3) of Lemma 3.1 then shows that
\[ \Psi_{a_2 a_1}(z_1) := \lambda(a_2)e^{z_1 L}\lambda(a_1)\Omega_\beta \tag{3.68} \]
is a holomorphic $\mathcal{H}_\beta$-valued function of $z_1$, for $0 < \operatorname{Re} z_1 < \beta/2$, with
\[ \|\Psi_{a_2 a_1}(z_1)\| \le \|a_2\|\,\|a_1\| , \tag{3.69} \]
for $0 \le \operatorname{Re} z_1 \le \beta/2$. The idea is now to proceed inductively, showing that $\Psi_{a_2 a_1}(z_1)$ is in the domain of definition of the unbounded operator $\lambda(a_3)\exp(z_2 L)$, as long as $0 \le \operatorname{Re} z_2 \le \beta/2 - \operatorname{Re} z_1$, etc. The induction hypothesis is

[$A_{n-1}$] For arbitrary $a_1, \dots, a_n$ in $\mathcal{S}$,
\[ \Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) := \lambda(a_n)e^{z_{n-1}L}\lambda(a_{n-1}) \cdots \lambda(a_2)e^{z_1 L}\lambda(a_1)\Omega_\beta \tag{3.70} \]
is a vector in $\mathcal{H}_\beta$, for all $(z_1, \dots, z_{n-1}) \in \bar T_{n-1}^{(\beta)}$, where
\[ T_{n-1}^{(\beta)} := \Big\{ (z_1, \dots, z_{n-1}) \;\Big|\; \operatorname{Re} z_i > 0, \ \forall i, \ \sum_{i=1}^{n-1} \operatorname{Re} z_i < \beta/2 \Big\} ; \tag{3.71} \]
it is holomorphic in $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$ and, on $\bar T_{n-1}^{(\beta)}$, is bounded in norm by
\[ \|\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1)\| \le \prod_{j=1}^{n} \|a_j\| . \tag{3.72} \]
In (3.68), (3.69), [$A_1$] has been established. We shall now carry out the
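Induction Step: [$A_{n-1}$] $\Rightarrow$ [$A_n$], $\forall n$. Let $\chi_N$ be the characteristic function of the interval $[-N, N]$. Then, $\chi_N(L)e^{zL} = e^{zL}\chi_N(L)$ is an entire operator-valued function of $z$, bounded in norm by $\exp(N|\operatorname{Re} z|)$. Thus, [$A_{n-1}$] implies that the vectors
\[ \Pi^{(N)}_{a_n \dots a_1}(z_{n-1}, \dots, z_1) := \chi_N(L)\,e^{(\beta/2 - \sum_{i=1}^{n-1} z_i)L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \tag{3.73} \]
are well defined, for all $(z_1, \dots, z_{n-1}) \in \bar T_{n-1}^{(\beta)}$, and depend holomorphically on $(z_1, \dots, z_{n-1})$, for $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$, for all $N < \infty$. For $z_i = \tau_i$ non-negative, for $i = 1, \dots, n-1$, with $\sum_{i=1}^{n-1} \tau_i \le \beta/2$,
\[ \begin{aligned}
\Pi^{(N)}_{a_n \dots a_1}(\tau_{n-1}, \dots, \tau_1) &= \chi_N(L)\,e^{(\beta/2 - \sum_{i=1}^{n-1} \tau_i)L}\,\lambda(a_n) \Big( \prod_{i=n-1}^{1} e^{\tau_i L}\lambda(a_i) \Big) \Omega_\beta \\
&= \chi_N(L)\,\Phi\Big( \Big[ a_n, \beta/2 - \sum_{i=1}^{n-1} \tau_i, a_{n-1}, \beta/2 - \sum_{i=1}^{n-2} \tau_i, \dots, a_1, \beta/2 \Big] \Big) \\
&= \chi_N(L)\,J\Phi\Big( \Big[ a_1^*, 0, a_2^*, \tau_1, \dots, a_n^*, \sum_{i=1}^{n-1} \tau_i \Big] \Big) \\
&= \chi_N(L)\,J \Big( \prod_{i=1}^{n-1} \lambda(a_i^*)e^{\tau_i L} \Big) \lambda(a_n^*)\Omega_\beta \\
&= \chi_N(L)\,J\Psi_{a_1^* \cdots a_n^*}(\tau_1, \dots, \tau_{n-1}) ,
\end{aligned} \tag{3.74} \]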
by (3.36), (3.39) and (3.70). The induction hypothesis [$A_{n-1}$] tells us that $\Psi_{a_1^* \dots a_n^*}(\bar z_1, \dots, \bar z_{n-1})$ is holomorphic in $(\bar z_{n-1}, \dots, \bar z_1) \in T_{n-1}^{(\beta)}$ and bounded in norm by $\prod_{i=1}^{n} \|a_i\|$, for $(\bar z_{n-1}, \dots, \bar z_1) \in \bar T_{n-1}^{(\beta)}$. Since $J$ is an anti-unitary operator,
\[ J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1}) \tag{3.75} \]
is holomorphic in $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$, and
\[ \|J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1})\| \le \prod_{i=1}^{n} \|a_i\| , \tag{3.76} \]
for $(z_1, \dots, z_{n-1}) \in \bar T_{n-1}^{(\beta)}$, by (3.72). If $z_i$ is non-negative for $i = 1, \dots, n-1$, and $\sum_{i=1}^{n-1} z_i \le \beta/2$, then (3.74) shows that
\[ \Pi^{(N)}_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) = \chi_N(L)\,J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1}) . \tag{3.77} \]
Since the left hand side and the right hand side of (3.77) are holomorphic $\mathcal{H}_\beta$-valued functions of $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$, equation (3.77) holds for all $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$, for all $N < \infty$, and, with (3.76) and (3.74), and using that $\|\chi_N(L)\| = 1$, we find that
\[ \|\Pi^{(N)}_{a_n \cdots a_1}(z_{n-1}, \dots, z_1)\| \le \prod_{i=1}^{n} \|a_i\| , \tag{3.78} \]
uniformly in $N < \infty$. Since $\exp[(\beta/2 - \sum_{i=1}^{n-1} z_i)L]$ is a closed operator, and
\[ \begin{aligned}
\operatorname*{s-lim}_{N\to\infty} \chi_N(L)\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) &= \Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) , \\
\operatorname*{s-lim}_{N\to\infty} \chi_N(L)J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1}) &= J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1}) ,
\end{aligned} \]
for $(z_1, \dots, z_{n-1}) \in \bar T_{n-1}^{(\beta)}$, by [$A_{n-1}$], it follows that
\[ \begin{aligned}
\operatorname*{s-lim}_{N\to\infty} \Pi^{(N)}_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) &= e^{(\beta/2 - \sum_{i=1}^{n-1} z_i)L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \\
&= J\Psi_{a_1^* \cdots a_n^*}(\bar z_1, \dots, \bar z_{n-1}) ,
\end{aligned} \tag{3.79} \]
for $(z_1, \dots, z_{n-1}) \in \bar T_{n-1}^{(\beta)}$, and the bound (3.78) remains true in the limit $N \to \infty$. Next, we define functions $M^{(N)}(\sigma)$ by
\[ M^{(N)}(\sigma) := \Big\langle \chi_N(L)\,e^{\sigma(\beta/2 - \sum_{i=1}^{n-1} \operatorname{Re} z_i)L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) , \ \chi_N(L)\,e^{\sigma(\beta/2 - \sum_{i=1}^{n-1} \operatorname{Re} z_i)L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \Big\rangle . \tag{3.80} \]
Since $\exp(itL)$ is unitary, the right hand side of (3.80) does not change if $\exp[\sigma(\beta/2 - \sum_{i=1}^{n-1} \operatorname{Re} z_i)L]$ is replaced by $\exp[\sigma(\beta/2 - \sum_{i=1}^{n-1} z_i)L]$ in both arguments of the scalar product. Thus, using [$A_{n-1}$], (3.79), (3.78) and that $\|\chi_N(L)\| = 1$, we find that
\[ 0 \le M^{(N)}(0) \le \prod_{i=1}^{n} \|a_i\|^2 , \qquad \text{and} \qquad 0 \le M^{(N)}(1) \le \prod_{i=1}^{n} \|a_i\|^2 . \tag{3.81} \]
For $N < \infty$, $M^{(N)}(\sigma)$ is smooth in $\sigma \in \mathbb{R}$. Differentiating $M^{(N)}(\sigma)$ twice in $\sigma$ and using that $L = L^*$, hence $L^2 \ge 0$, we conclude that $M^{(N)}(\sigma)$ is a convex function of $\sigma$. Thus,
\[ 0 \le M^{(N)}(\sigma) \le \max\big(M^{(N)}(0), M^{(N)}(1)\big) \le \prod_{i=1}^{n} \|a_i\|^2 , \tag{3.82} \]
for all $\sigma \in [0, 1]$, uniformly in $N$. Inequality (3.82) and the induction hypothesis [$A_{n-1}$] show that
\[ \chi_N(L)\,e^{\tau L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \tag{3.83} \]
is holomorphic in $(z_1, \dots, z_{n-1}) \in T_{n-1}^{(\beta)}$ and bounded in norm by $\prod_{i=1}^{n} \|a_i\|$ on $\bar T_{n-1}^{(\beta)}$, as long as
\[ 0 \le \tau \le \beta/2 - \sum_{i=1}^{n-1} \operatorname{Re} z_i , \tag{3.84} \]
uniformly in $N < \infty$. Using the spectral theorem for $L$ and, in particular, that $\exp(\tau L)$ is a closed operator, we conclude, similarly to (3.79), that
\[ \operatorname*{s-lim}_{N\to\infty} \chi_N(L)\,e^{\tau L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) = e^{\tau L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \tag{3.85} \]
exists and has the same analyticity- and boundedness properties, provided (3.84) holds. Since $\exp(itL)$ is unitary, for $t \in \mathbb{R}$, we conclude that, for $z_n = \tau + it$, with
\[ 0 < \operatorname{Re} z_n = \tau < \beta/2 - \sum_{i=1}^{n-1} \operatorname{Re} z_i , \tag{3.86} \]
\[ e^{z_n L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \tag{3.87} \]
is an $\mathcal{H}_\beta$-valued function of $(z_1, \dots, z_n)$. It is holomorphic in $(z_1, \dots, z_n) \in T_n^{(\beta)}$ and bounded in norm by $\prod_{i=1}^{n} \|a_i\|$, for all $(z_1, \dots, z_n) \in \bar T_n^{(\beta)}$. By Lemma 3.1 (3)
\[ \Psi_{a_{n+1} a_n \cdots a_1}(z_n, \dots, z_1) := \lambda(a_{n+1})\,e^{z_n L}\,\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1) \tag{3.88} \]
is holomorphic in $(z_1, \dots, z_n) \in T_n^{(\beta)}$ and
\[ \|\Psi_{a_{n+1} a_n \cdots a_1}(z_n, \dots, z_1)\| \le \prod_{i=1}^{n+1} \|a_i\| , \tag{3.89} \]
for all $(z_1, \dots, z_n) \in \bar T_n^{(\beta)}$, for arbitrary $a_{n+1} \in \mathcal{S}$. Equation (3.88) and inequality (3.89) establish [$A_n$], hence the induction step is complete.

Next, we set
\[ \begin{aligned}
b_j &:= a_{n-j+1} , & j &= 1, \dots, n , \\
z'_{j+1} &:= z'_j - i z_{n-j} , & j &= 1, \dots, n-1 .
\end{aligned} \]
Then, (3.70) implies that
\[ \langle\Psi_{a_n \cdots a_1}(z_{n-1}, \dots, z_1), \Omega_\beta\rangle = \Big\langle \prod_{j=1}^{n} e^{i z'_j L}\lambda(b_j)e^{-i z'_j L}\,\Omega_\beta , \Omega_\beta \Big\rangle . \tag{3.90} \]
Real-time Green functions, as in (3.11), are obtained from (3.90) by taking the boundary values of these functions when $z'_i$ tends to the real axis, for all $i = 1, \dots, n$. When
\[ i z'_j = \tau_j \in \mathbb{R} , \qquad j = 1, \dots, n , \]
with $0 < \tau_1 < \dots < \tau_n < \beta/2$, then (3.90) is clearly given by $\phi_\beta(b_1, \tau_1, \dots, b_n, \tau_n)$;
see (3.16), (3.28), (3.33), etc. In order to obtain the Green functions on their maximal domain of analyticity, $T_n$, see (2.34), one must consider scalar products
\[ \langle\Psi_{a_k \cdots a_1}(z_{k-1}, \dots, z_1), e^{\bar z_k L}\,\Psi_{a_{k+1}^* \cdots a_n^*}(\bar z_{k+1}, \dots, \bar z_{n-1})\rangle \]
and use that $e^{zL}\Omega_\beta = \Omega_\beta$. As a consequence of Properties (P3), (P$^*$iv) and analyticity on the tubular domain $T_n$, they satisfy the KMS condition (2.41).

A $C^*$-dynamical system can be constructed by choosing $\mathcal{A}$ to be the smallest $C^*$-algebra generated, for example, by all the operators
\[ \Big\{ \int f(t)\,e^{itL}\lambda(a)e^{-itL}\,dt \;\Big|\; f \in C_0(\mathbb{R}), \ a \in \mathcal{S} \Big\} , \]
and noticing that
\[ \alpha_t(A) := e^{itL}Ae^{-itL} , \qquad A \in \mathcal{A} , \]
defines a $*$-automorphism group of $\mathcal{A}$.

4. KMS ↔ SSC, PCT, (A)dS, etc.

In this final section, we first recall the relationship between the KMS condition for Lorentz boosts in local, relativistic quantum field theory (QFT) on Minkowski space, at zero temperature, on the one hand, and the usual connection between spin and statistics (SSC) and the PCT theorem, on the other hand. Of course our discussion, which is an adaptation of one in [33], is based on the deep results in [21, 20]. We then recall a generalization of Part (1) of our main theorem and of the results in Sec. 3.3 useful for an "imaginary-time analysis" of quantum field theory on some non-trivial gravitational backgrounds, in particular on Schwarzschild space-time [34], de Sitter space [35] and on anti-de Sitter space (AdS), [40, 41, 42]. Our discussion is based on results in [22, 23, 32, 36] and is meant merely to recall and illustrate the usefulness of the general results in these papers. It goes beyond these papers only in so far as it includes a general form of the KMS condition. It does not, however, include analytic continuations of general Green functions from imaginary to real times (which represents a much harder problem than the one solved in our paper).

4.1. SSC and PCT for local, relativistic QFT's on Minkowski space

We consider a local, relativistic QFT on Minkowski space $M^d$, at zero temperature. We suppose that this theory satisfies the Wightman axioms; see [6, 7]. Let $\mathcal{H}$ denote the Hilbert space of pure state vectors of the theory, and $\Omega \in \mathcal{H}$ the vacuum vector. We consider a two-dimensional plane, $\pi$, in $M^d$ containing a time-like direction.
We may choose coordinates, $x^0, x^1, \vec x$, on $M^d$ such that $\pi$ is the 01-coordinate plane. Let $M$ denote the self-adjoint operator on $\mathcal{H}$ representing the generator of Lorentz boosts in $\pi$. Let $\Psi(x^0, x^1, \vec x)$ denote a local field of the theory. Lorentz covariance implies that
\[ e^{i\alpha M}\Psi(x^0, x^1, \vec x)e^{-i\alpha M} = (S(\alpha)\Psi)(x^0_\alpha, x^1_\alpha, \vec x) , \tag{4.1} \]
where
\[ x^0_\alpha = \cosh(\alpha)x^0 + \sinh(\alpha)x^1 , \qquad x^1_\alpha = \sinh(\alpha)x^0 + \cosh(\alpha)x^1 , \tag{4.2} \]
and $S$ is a finite-dimensional, projective representation of the Lorentz group, $L_+^\uparrow$, of $M^d$. It is well known that, for a QFT satisfying the Wightman axioms, the passage from real to purely imaginary times (Wick rotation) is possible. Let $\Psi^\sharp$ denote either $\Psi$ or $\Psi^*$, and let
\[ S^{(n)}(\sharp_1, t_1, x^1_1, \vec x_1, \dots, \sharp_n, t_n, x^1_n, \vec x_n) \tag{4.3} \]
denote the imaginary-time Green- or Schwinger functions of the fields $\Psi$, $\Psi^*$, where the arguments $(t_j, x^1_j, \vec x_j)$ are points in Euclidean space, $t_j$ being the imaginary time of the $j$th point, and $\sharp_j = \emptyset, *$, if $\Psi$, $\Psi^*$, respectively, is inserted in the $j$th argument, for $j = 1, \dots, n$. We introduce polar coordinates, $(\tau, r)$, in the $(t, x^1)$-coordinate plane of $E^d$, where $\tau$ is the polar angle, and $r \ge 0$ the radial variable. Let $\mathcal{S}$ denote the linear space of column vectors
\[ a = \begin{pmatrix} f_1 \\ \vdots \\ f_k \end{pmatrix} \tag{4.4} \]
of Schwartz-space test functions, $f_\alpha(r, \vec x)$, on $\mathbb{R}^{d-1}$ with support contained in $\{(r, \vec x) \mid r \ge 0\}$, denote by $\mathcal{S}^*$ the space of row vectors, $(f_1, \dots, f_k)$, of test functions, and let $*$ be the map from $\mathcal{S}$ to $\mathcal{S}^*$ given by
\[ a^* := (\bar f_1, \dots, \bar f_k) , \tag{4.5} \]
for $a$ as in (4.4). In (4.4), (4.5), $k$ is the dimension of the (projective) representation $S$ of $L_+^\uparrow$ under which $\Psi$ transforms. For $a_1, \dots, a_n$ in $\mathcal{S}$, we define
\[ \phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n) := \int S^{(n)}\big(\sharp_1, t(\tau_1, r_1), x^1(\tau_1, r_1), \vec x_1, \dots, \sharp_n, t(\tau_n, r_n), x^1(\tau_n, r_n), \vec x_n\big) \times \prod_{j=1}^{n} a_j^{\sharp_j}(r_j, \vec x_j)\,dr_j\,d\vec x_j , \]
with
\[ t(\tau, r) := r \sin\tau , \qquad x^1(\tau, r) := r \cos\tau . \tag{4.6} \]
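The link between the boosts (4.2) and the polar angle $\tau$ of (4.6) can be checked numerically. The following sketch assumes the continuation $\alpha = i\theta$ of the rapidity together with $x^0 = it$; the function names `boost`, `euclidean_point` and `rotate_via_boost` are hypothetical helpers introduced for this illustration. Under these assumptions a boost at imaginary rapidity should act on $(t, x^1)$ as a rotation that shifts the polar angle by $\theta$.

```python
import cmath
import math

def boost(alpha, x0, x1):
    """Eq. (4.2), allowing a complex rapidity alpha and complex coordinates."""
    return (cmath.cosh(alpha) * x0 + cmath.sinh(alpha) * x1,
            cmath.sinh(alpha) * x0 + cmath.cosh(alpha) * x1)

def euclidean_point(tau, r):
    """Eq. (4.6): t = r sin(tau), x1 = r cos(tau)."""
    return r * math.sin(tau), r * math.cos(tau)

def rotate_via_boost(theta, tau, r):
    """Apply the boost with alpha = i*theta to the point x0 = i*t, x1."""
    t, x1 = euclidean_point(tau, r)
    x0_new, x1_new = boost(1j * theta, 1j * t, x1)
    t_new = x0_new / 1j  # read off the new imaginary time from x0 = i t
    residue = abs(t_new.imag) + abs(x1_new.imag)
    return t_new.real, x1_new.real, residue

theta, tau, r = 0.4, 1.1, 2.5
t_new, x1_new, imaginary_residue = rotate_via_boost(theta, tau, r)
t_expected, x1_expected = euclidean_point(tau + theta, r)
```

With these conventions the transformed point coincides with the point at polar angle $\tau + \theta$ and the same radius, which is why rotation invariance in $\tau$ takes over the role played by time-translation invariance in Sec. 3.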
It follows from the results in [4] that the Green functions $\phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n)$ satisfy Properties (P1), continuity, and (P4), reflection positivity, of Sec. 3.1; see Eqs. (3.3), (3.4) with $\beta = 2\pi$! Property (P2) (translation invariance) must be replaced by

(P2') Rotation Invariance
\[ \phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n) = \phi_{2\pi}(a_1^{\sharp_1}(\tau), \tau_1 + \tau, \dots, a_n^{\sharp_n}(\tau), \tau_n + \tau) , \tag{4.7} \]
where
\[ a(\tau) = R(\tau)a , \qquad a^*(\tau) = (R(-\tau)a)^* = (a(-\tau))^* , \tag{4.8} \]
and $R$ is the $k$-dimensional, projective representation of the group of rotations of $E^d$ obtained from the representation $S$ by analytic continuation in the rapidity, $\alpha$. If $S$ is irreducible, then
\[ R(\tau = 2\pi) = e^{i2\pi s_\Psi} , \tag{4.9} \]
where $s_\Psi$ is the "spin" of the field $\Psi$. It is well known that
\[ s_\Psi = 0, \tfrac{1}{2} \bmod \mathbb{Z} , \quad \text{for } d \ge 3 , \tag{4.10} \]
while
\[ s_\Psi \in [0, 1) \bmod \mathbb{Z} , \quad \text{for } d = 2 . \tag{4.11} \]
We are interested in understanding whether a property similar to Property (P3), Sec. 3.1, i.e. the KMS condition, holds, too. To this end, we first recall that, in QFT, the Green functions $\phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n)$ are defined for arbitrary, not necessarily ordered, $n$-tuples $(\tau_1, \dots, \tau_n) \in T_n$ (with $\beta = 2\pi$), and
\[ \phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n) = e^{i\pi\epsilon_\Psi j(n-j)}\,\phi_{2\pi}(a_{j+1}^{\sharp_{j+1}}, \tau_{j+1}, \dots, a_n^{\sharp_n}, \tau_n, a_1^{\sharp_1}, \tau_1, \dots, a_j^{\sharp_j}, \tau_j) , \tag{4.12} \]
where $\epsilon_\Psi$ is the statistics parameter of $\Psi$, and
\[ \epsilon_\Psi = 0, 1 , \quad \text{for } d \ge 3 , \tag{4.13} \]
with $\epsilon_\Psi = 0$ corresponding to Bose- and $\epsilon_\Psi = 1$ corresponding to Fermi statistics, while
\[ \epsilon_\Psi \in [0, 2) , \quad \text{for } d = 2 , \tag{4.14} \]
(fractional-, or braid statistics [37]).

Theorem 4.1. If Properties (P2') (rotation invariance) and (P4) (reflection positivity) hold, then
\[ \epsilon_\Psi = 2 s_\Psi \bmod 2\mathbb{Z} , \tag{4.15} \]
i.e. the usual connection between spin and statistics (SSC) holds.
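Proof. Let $a(\tau)$, $a^*(\tau)$ be as in (4.8). Reflection positivity, (P4) (for $\beta = 2\pi$), says that
\[ \phi_{2\pi}(a(\tau), \tau, a^*(-\tau), 2\pi - \tau) = \phi_{2\pi}(a(\tau), \tau, (a(\tau))^*, 2\pi - \tau) \ge 0 . \]
Hence,
\[ \begin{aligned}
e^{-i2\pi s_\Psi}\,\phi_{2\pi}(a(\tau), \tau, a^*(2\pi - \tau), 2\pi - \tau) &= \phi_{2\pi}(a(\tau), \tau, R(2\pi)^{-1}a^*(2\pi - \tau), 2\pi - \tau) \\
&= \phi_{2\pi}(a(\tau), \tau, a^*(-\tau), 2\pi - \tau) \ge 0 .
\end{aligned} \tag{4.16} \]
Rotation invariance, (P2'), implies that
\[ \phi_{2\pi}(a(\tau), \tau, a^*(2\pi - \tau), 2\pi - \tau) = \phi_{2\pi}(a(\tau - \pi), \tau - \pi, a^*(\pi - \tau), \pi - \tau) . \tag{4.17} \]
By (4.12), we have that
\[ \begin{aligned}
\phi_{2\pi}(a(\tau - \pi), \tau - \pi, a^*(\pi - \tau), \pi - \tau) &= e^{i\pi\epsilon_\Psi}\,\phi_{2\pi}(a^*(\pi - \tau), \pi - \tau, a(\tau - \pi), \tau - \pi) \\
&= e^{i\pi\epsilon_\Psi}\,\phi_{2\pi}(a^*(\pi - \tau), \pi - \tau, R(-2\pi)a(\pi + \tau), \pi + \tau) \\
&= e^{i\pi\epsilon_\Psi}\,e^{-i2\pi s_\Psi}\,\phi_{2\pi}(a^*(\pi - \tau), \pi - \tau, a(\pi + \tau), \pi + \tau) .
\end{aligned} \tag{4.18} \]
By (P4) and (4.16), the product of the second and the third factors on the right hand side is positive. Thus, comparing (4.18) with (4.17) and (4.16), we readily find that
\[ e^{i\pi\epsilon_\Psi} = e^{i2\pi s_\Psi} , \tag{4.19} \]
or $\epsilon_\Psi = 2 s_\Psi \bmod 2\mathbb{Z}$, which completes our proof.

The heuristic idea behind our proof is captured in the following formal calculation: For $\tau \in (0, \pi)$,
\[ \begin{aligned}
0 &\le \langle e^{\tau M}\Psi(0, a)\Omega, e^{\tau M}\Psi(0, a)\Omega\rangle = \langle\Psi(0, a)\Omega, e^{2\tau M}\Psi(0, a)\Omega\rangle , \quad (M = M^*) \\
&= \phi_{2\pi}(a, 0, a^*(-2\tau), 2\tau) \\
&= e^{i\pi\epsilon_\Psi}\,\phi_{2\pi}(a^*(-2\tau), 2\tau, a, 0) \\
&= e^{i\pi\epsilon_\Psi}\,\langle\Psi^*(2\tau, (R(2\tau)a)^*)\Omega, \Psi^*(0, a^*)\Omega\rangle \\
&\overset{\tau\to\pi}{=} e^{i\pi\epsilon_\Psi}\,e^{-i2\pi s_\Psi}\,\langle\Psi^*(0, a^*)\Omega, \Psi^*(0, a^*)\Omega\rangle ,
\end{aligned} \]
and the last factor on the right hand side is positive, which yields (4.19).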
Remark 4.1. The general SSC for relativistic QFT's with arbitrarily many local Bose- and Fermi fields in dimension $d \ge 3$ has been established by Araki in [38]. SSC for two-dimensional theories or three-dimensional gauge theories with braid statistics has been established in [39] (see also refs. given there).

Next, we apply Theorem 4.1 to establish the $2\pi$-KMS condition for the Lorentz boosts, $\Psi^\sharp \mapsto e^{i\alpha M}\Psi^\sharp e^{-i\alpha M}$, at imaginary rapidities ("times").

Corollary 4.1 (KMS). In $d \ge 3$ dimensions, the Green functions $\phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n)$ satisfy the KMS condition
\[ \phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n) = \phi_{2\pi}(a_{j+1}^{\sharp_{j+1}}, \tau_{j+1}, \dots, a_n^{\sharp_n}, \tau_n, a_1^{\sharp_1}(2\pi), \tau_1 + 2\pi, \dots, a_j^{\sharp_j}(2\pi), \tau_j + 2\pi) . \tag{4.20} \]

Proof. Equation (4.12) tells us that
\[ \begin{aligned}
\phi_{2\pi}(a_1^{\sharp_1}, \tau_1, \dots, a_n^{\sharp_n}, \tau_n) &= e^{i\pi\epsilon_\Psi j(n-j)}\,\phi_{2\pi}(a_{j+1}^{\sharp_{j+1}}, \tau_{j+1}, \dots, a_n^{\sharp_n}, \tau_n, a_1^{\sharp_1}, \tau_1, \dots, a_j^{\sharp_j}, \tau_j) \\
&= e^{i\pi\epsilon_\Psi j(n-j)}\,\phi_{2\pi}(a_{j+1}^{\sharp_{j+1}}, \tau_{j+1}, \dots, a_n^{\sharp_n}, \tau_n, a_1^{\sharp_1}, \tau_1 + 2\pi, \dots, a_j^{\sharp_j}, \tau_j + 2\pi) .
\end{aligned} \tag{4.21} \]
If $s_\Psi \ne 0$ ($\epsilon_\Psi \ne 0$), then $n$ must necessarily be even for the Green functions in (4.20), (4.21) to be different from zero. Then,
\[ j(n - j) \cong j^2 \cong j \pmod{2\mathbb{Z}} . \]
Hence, by Theorem 4.1,
\[ e^{i\pi\epsilon_\Psi j(n-j)} = e^{i2\pi s_\Psi j} . \]
The proof is completed by using (4.9).

Thanks to Corollary 4.1, we may now define an anti-linear involution, $J$, as follows: On a vector
\[ \Psi = z\,\Phi([a_1^{\sharp_1}(\tau_1), \tau_1, \dots, a_n^{\sharp_n}(\tau_n), \tau_n]) \tag{4.22} \]
in the Hilbert space reconstructed from the Schwinger functions (4.3) of a QFT, as in [4], we set
\[ J\Psi := \bar z\,\Phi([(a_n^{\sharp_n}(\tau_n - \pi))^*, \pi - \tau_n, \dots, (a_1^{\sharp_1}(\tau_1 - \pi))^*, \pi - \tau_1]) . \tag{4.23} \]
A variant of the Reeh–Schlieder theorem [6, 7] shows that vectors of the form (4.22) span a dense set in the Hilbert space of the theory, and a calculation essentially identical to (3.37), based on using (4.8) and the KMS condition (4.20), proves that $J$ is an anti-unitary involution (see also (4.61), (4.63), below). By (3.43),
\[ Je^{i\tau M} = e^{i\tau M}J , \qquad \tau \in \mathbb{R} . \tag{4.24} \]
Equations (4.23), (4.24) and (4.6) show that $J$ has the interpretation
\[ J = P_1 C T , \tag{4.25} \]
where $P_1$ represents the spatial reflection
\[ (x^0, x^1, \vec x) \mapsto (x^0, -x^1, \vec x) , \tag{4.26} \]
$C$ is charge conjugation, i.e. $\Psi \mapsto \Psi^*$, and $T$ represents time reversal,
\[ (x^0, x^1, \vec x) \mapsto (-x^0, x^1, \vec x) . \tag{4.27} \]
If the dimension $d$ is even, the product of the reflection (4.26) with space reflection,
\[ P : (x^0, x^1, \vec x) \mapsto (x^0, -x^1, -\vec x) , \]
has determinant $+1$, hence belongs to $L_+^\uparrow$. Thus, $P P_1$ is always a symmetry of the theory, and hence
\[ \Theta := PCT = P P_1 J \tag{4.28} \]
is always an anti-unitary symmetry of the theory. This is Jost's PCT theorem [21].

Remark 4.2. (1) The KMS condition (4.20) and Eq. (4.12) (for $n = 2$) also imply the SSC, $\epsilon_\Psi = 2 s_\Psi \bmod 2\mathbb{Z}$, without assuming reflection positivity: By (4.20), (4.8) and (4.9),
\[ \phi_{2\pi}(a_1^\sharp, \tau_1, a_2^\sharp, \tau_2) = \phi_{2\pi}(a_2^\sharp, \tau_2, a_1^\sharp(2\pi), \tau_1 + 2\pi) = e^{i2\pi s_\Psi}\,\phi_{2\pi}(a_2^\sharp, \tau_2, a_1^\sharp, \tau_1) , \tag{4.29} \]
which when compared with (4.12) proves the SSC, Eq. (4.15).
(2) Quite clearly, QFT's with braid statistics in two or three space-time dimensions require a more elaborate analysis, which we will not present here; but see [39], and refs. given there. Suffice it to say that Theorem 4.1, suitably interpreted, remains valid. Our analysis shows that, in three dimensions, braid statistics and fractional spin do not arise in theories with only point-like localized fields.

4.2. QFT in some non-trivial gravitational backgrounds

Let $X^d$ be a $d$-(complex-)dimensional complex manifold equipped with a (symmetric) quadratic form, $\eta$, on the tangent bundle $TX^d$. We assume that $X^d$ contains two $d$-real-dimensional submanifolds, $N_L^d$ and $N_E^d$, such that
\[ \eta_L := \eta|_{TN_L^d} \ \text{is a Lorentz metric on } N_L^d , \tag{4.30} \]
and
\[ \eta_E := \eta|_{TN_E^d} \ \text{is a Riemannian metric on } N_E^d . \tag{4.31} \]
We shall interpret $N_L^d$ as the space-time of a physical system and will be interested in studying local QFT's on $N_L^d$. Our strategy will be to attempt to construct "imaginary-time" Green functions over the Riemannian slice, $N_E^d$, of $X^d$
and reconstruct from them data of a local quantum theory on $N_L^d$ (not, however, including general real-time Green functions of local field operators!). Our results can be viewed as a general (group-theoretical) version of the "Wick rotation". Our analysis is based on results in [22, 23, 36]. It actually does not make use of the complex manifold $X^d$; requiring $N_E^d$ to have appropriate symmetry properties will suffice! Our techniques are purely group-theoretical. The rôle of the KMS condition will be elucidated. Here are examples of space-times which fit into the context described above.

(i) Complexified Minkowski space: $X^d = \mathbb{C}^d$

Points in $X^d$ are denoted by $z = (z^0, \vec z)$; and
\[ \eta(z) = -(dz^0)^2 + (d\vec z)^2 . \tag{4.32} \]
Then
\[ \begin{aligned}
N_L^d &= M^d = \{ z = (x^0, \vec x) \mid x^0 \in \mathbb{R}, \ \vec x \in \mathbb{R}^{d-1} \} , \\
N_E^d &= E^d = \{ z = (x^0 = it, \vec x) \mid t \in \mathbb{R}, \ \vec x \in \mathbb{R}^{d-1} \} .
\end{aligned} \tag{4.33} \]

(ii) de Sitter and AdS: We choose
\[ X^d := \Big\{ z = (z^0, z^1, \dots, z^d) \in \mathbb{C}^{d+1} \;\Big|\; \sum_{j=0}^{d} (z^j)^2 = R^2 \Big\} , \tag{4.34} \]
for some $R > 0$, and $\eta$ to be the restriction of $\sum_{j=0}^{d} (dz^j)^2$ to $X^d$. Then
\[ \begin{aligned}
N_L^d := dS_R^d &= \Big\{ z = (ix^0, x^1, \dots, x^d) \;\Big|\; x^j \in \mathbb{R}, \ j = 0, \dots, d, \ -(x^0)^2 + \sum_{j=1}^{d} (x^j)^2 = R^2 \Big\} , \\
N_E^d := S_R^d &= \Big\{ z = (x^0, x^1, \dots, x^d) \;\Big|\; x^j \in \mathbb{R}, \ j = 0, \dots, d, \ \sum_{j=0}^{d} (x^j)^2 = R^2 \Big\}
\end{aligned} \tag{4.35} \]
are $d$-dimensional de Sitter space and the $d$-sphere, respectively. Furthermore,
\[ \begin{aligned}
\tilde N_L^d := AdS^d &= \Big\{ z = (x^0, ix^1, \dots, ix^{d-1}, x^d) \;\Big|\; x^j \in \mathbb{R}, \ j = 0, \dots, d, \ (x^0)^2 + (x^d)^2 - \sum_{j=1}^{d-1} (x^j)^2 = R^2 \Big\} , \\
\tilde N_E^d := H^d &= \Big\{ z = (ix^0, ix^1, \dots, ix^{d-1}, x^d) \;\Big|\; x^j \in \mathbb{R}, \ j = 0, \dots, d, \ x^d > 0, \ (x^d)^2 - \sum_{j=0}^{d-1} (x^j)^2 = R^2 \Big\}
\end{aligned} \tag{4.36} \]
are $d$-dimensional anti-de Sitter space and hyperbolic space, respectively.
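The embeddings in example (ii) are easy to test numerically for $d = 2$. The sketch below is an illustration under stated assumptions: the parametrizations by `s`, `polar` and `angle`, the radius `R = 2.0` and the finite-difference helper `eta_along` are hypothetical choices made for this check and do not appear in the paper. It verifies that sample points of $dS^2$, $S^2$, $AdS^2$ and $H^2$ satisfy the defining equation of (4.34), and evaluates $\sum_j (dz^j)^2$ along coordinate curves of $dS^2$ and $S^2$ to exhibit the Lorentzian and Riemannian signatures.

```python
import math

R = 2.0  # illustrative radius (assumption)

def quadric(z):
    """Sum_j (z^j)^2, the defining function of X^d in Eq. (4.34)."""
    return sum(zj * zj for zj in z)

def de_sitter_point(s, angle):        # z = (i x0, x1, x2)
    return (1j * R * math.sinh(s),
            R * math.cosh(s) * math.cos(angle),
            R * math.cosh(s) * math.sin(angle))

def sphere_point(polar, angle):       # z = (x0, x1, x2)
    return (R * math.cos(polar),
            R * math.sin(polar) * math.cos(angle),
            R * math.sin(polar) * math.sin(angle))

def anti_de_sitter_point(s, angle):   # z = (x0, i x1, x2)
    return (R * math.cosh(s) * math.cos(angle),
            1j * R * math.sinh(s),
            R * math.cosh(s) * math.sin(angle))

def hyperbolic_point(s, angle):       # z = (i x0, i x1, x2), x2 > 0
    return (1j * R * math.sinh(s) * math.cos(angle),
            1j * R * math.sinh(s) * math.sin(angle),
            R * math.cosh(s))

def eta_along(curve, u, h=1e-5):
    """Sum_j (dz^j/du)^2 along a curve u -> z(u), by central differences."""
    ahead, behind = curve(u + h), curve(u - h)
    return sum(((a - b) / (2 * h)) ** 2 for a, b in zip(ahead, behind)).real

points = [de_sitter_point(0.7, 1.2), sphere_point(0.9, 1.2),
          anti_de_sitter_point(0.7, 1.2), hyperbolic_point(0.7, 1.2)]
residuals = [abs(quadric(z) - R ** 2) for z in points]

ds_time = eta_along(lambda s: de_sitter_point(s, 1.2), 0.7)    # negative
ds_space = eta_along(lambda a: de_sitter_point(0.7, a), 1.2)   # positive
sphere_polar = eta_along(lambda p: sphere_point(p, 1.2), 0.9)  # positive
sphere_angle = eta_along(lambda a: sphere_point(0.9, a), 1.2)  # positive
```

On $dS^2$ the two coordinate directions give values of opposite sign, while on $S^2$ both are positive, in line with (4.30) and (4.31).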
Obviously, $AdS^d$ is not simply connected. It admits time-like closed curves. In general, there will be a connection between imaginary-time Green functions on $H^d$ and a quantum theory on the (universal) covering space, $\widetilde{AdS}^d$, [40]. Recently, the so-called AdS-CFT correspondence has been discovered [43] and widely studied. The simplest example of this correspondence is one between QFT's on $AdS^2$ and chiral conformal field theories on a light ray. Here, we just wish to note that results concerning the passage from imaginary-time Green functions on $H^d$ to quantum theory on $\widetilde{AdS}^d$ can be translated into statements concerning conformal field theory; see [41], and refs. given there.

Next, we consider examples with the following product structure:
\[ X^d = U^k \times Y^{d-k} , \qquad k = 1, 2, \dots , \]
where $U^k$ is a subset of a $k$-dimensional complex manifold, while $Y^{d-k}$ is a $(d-k)$-dimensional, real manifold, and
\[ N_L^d = U_L^k \times Y^{d-k} , \qquad N_E^d = U_E^k \times Y^{d-k} . \tag{4.37} \]
Here is a concrete example.

(iii) Schwarzschild black hole: $X^d = U^2 \times S^{d-2}$,
\[ \eta(z) = h(\xi^2)(d\xi^2 + \xi^2 d\tau^2) + k(\xi^2)\,ds^2 , \tag{4.38} \]
where $(\xi, \tau)$ are suitable coordinates on $U^2 \subset \mathbb{C}^2$, and $h$ and $k$ are analytic functions of $\xi^2$, positive on the real axis. Then $N_L^d = U_L^2 \times S^{d-2}$, with
\[ U_L^2 = \{ (\xi, \tau = it) \mid \xi, t \in \mathbb{R} \} . \tag{4.39} \]
Note that, for $d = 4$, the space-time outside the horizon of a Schwarzschild black hole (together with its isometric twin) is of the form (4.38), (4.39). Furthermore, $N_E^d = U_E^2 \times S^{d-2}$, with
\[ U_E^2 = \{ (\xi, \tau) \mid \xi \ge 0, \ \tau \in \mathbb{R}/2\pi\mathbb{Z} \} . \tag{4.40} \]
See e.g. [44] for background material.

From now on, only the Riemannian manifold $N_E^d$ will be featured, as promised. We must specify the properties of $N_E^d$ needed in our analysis and then check that they are valid in the examples just considered. We simplify our notation by setting $N := N_E^d$, $\eta := \eta_E = \eta|_{TN_E^d}$.

Properties of $(N, \eta)$
(I) Reflection symmetry: $N$ admits an isometric involution (reflection), $r$. The fixed-point set of $r$ is a submanifold, $M$, of $N$, of co-dimension 1; $M$ is called the "equator" of $N$. It is equipped with the induced metric.

(II) Killing symmetries: There is a real symmetric space $(G, K, \sigma)$, where $G$ is a real, simply connected Lie group, $\sigma$ is an involutive homomorphism of $G$, and $K \subset G$ is the fixed-point set of $\sigma$, with the properties that there is an action, $\pi$, of $G$ on $N$ generated by Killing vector fields of the metric $\eta$, that the equator $M$ is invariant under the action of $K$, and
\[ r\pi(g)r = \pi(\sigma(g)) , \quad \text{for all } g \in G . \tag{4.41} \]
Of course, $K$ is a subgroup of $G$.

It is an easy exercise to check that in all our examples, (i)–(iii), $N := N_E^d$ ($\tilde N_E^d$), $\eta := \eta_E$ have Properties (I) and (II). The simplest examples have the following structure:
\[ G = U(1) \times K , \quad \text{or} \quad G = \widetilde{U(1)} \times K = \mathbb{R} \times K ; \]
for $g = (e^{i\alpha}, k) \in U(1) \times K$,
\[ \sigma(e^{i\alpha}, k) = (e^{-i\alpha}, k) . \]
In these examples,
\[ X^d = \mathbb{C} \times M , \qquad N = N_E^d = \mathbb{R} \times M , \ \text{or} \ N = S^1 \times M , \qquad N_L^d = i\mathbb{R} \times M . \]
Examples, where $G = L \times K$, with $L$ some non-abelian Lie group, are incompatible with the condition that, at zero temperature, the energy spectrum be contained in $\mathbb{R}_+$; see [22].

Following [22], we next describe the general mathematical structure underlying a formulation of local, relativistic quantum theory at imaginary time. It consists of the following objects.

(a) A Riemannian manifold $(N, \eta)$ with Properties (I) and (II), above. We let $(G, K, \sigma)$ denote the symmetric space appearing in Property (II). (For simplicity, we may assume that $K$ is a compact subgroup of $G$. This would exclude examples (i) and $AdS^d \leftrightarrow H^d$, Eq. (4.36), above. But these examples are covered by the results of [22, 36, 23].)

(b) A separable topological vector space, $V$, containing two isomorphic subspaces, $V_+$ and $V_-$ (usually with $V_+ \cap V_- = \{0\}$).

(c) A continuous representation, $\rho$, of the Lie group $G$ on $V$ with the property that, for every $v \in V_\pm$, there exists an open neighborhood, $U_v$, of the identity element $e \in G$ such that
\[ \rho(g)v \in V_\pm , \quad \text{for all } g \in U_v , \tag{4.42} \]
and
\[ \rho(k)v \in V_\pm , \quad \text{for all } k \in K . \tag{4.43} \]
(d) An anti-linear involution, $\theta_r$, on $V$ representing the reflection $r$ in Property (I) such that
\[ \theta_r V_\pm = V_\mp , \tag{4.44} \]
and
\[ \theta_r\rho(g)\theta_r = \rho(\sigma(g)) , \tag{4.45} \]
for all $g \in G$.

(e) A bilinear functional, $\phi$, on $V_+ \times V_-$ with the following properties.

($\tilde{\mathrm{P}}$1) Continuity: $\phi$ is continuous on $V_+ \times V_-$ in the product topology of $V \times V$.

($\tilde{\mathrm{P}}$2) Invariance:
\[ \phi(\rho(g)v, \rho(g)w) = \phi(v, w) , \tag{4.46} \]
for all $g \in U_v \cap U_w$.

($\tilde{\mathrm{P}}$3) KMS condition: Let $h$ be any element of $G$ such that
\[ \sigma(h) = h^{-1} , \qquad \rho(h)V_\pm \subseteq V_\mp . \]
Then, for arbitrary $v \in V_+$, $w \in V_-$,
\[ \phi(v, w) = \phi(\rho(h^{-1})w, \rho(h)v) . \tag{4.47} \]

($\tilde{\mathrm{P}}$4) Reflection positivity: For arbitrary $v \in V_+$,
\[ \phi(v, \theta_r v) \ge 0 . \tag{4.48} \]

The point is that the structure described here enables us to formulate and prove a generalization of Part (1) of the main theorem proven in Sec. 3. Our result involves the Lie group, $G^*$, dual to the Lie group $G$ of Killing symmetries of the manifold $(N, \eta)$. The group $G^*$ is defined as follows. Let $\mathfrak{g}$ denote the Lie algebra of $G$ and $\mathfrak{k}$ the Lie algebra of $K$, the symmetry group of the "equator" of $N$. Clearly,
\[ [\mathfrak{k}, \mathfrak{k}] \subseteq \mathfrak{k} , \tag{4.49} \]
where $[\cdot, \cdot]$ is the Lie bracket on $\mathfrak{g}$, and $\mathfrak{g}$ has a decomposition into linear subspaces,
\[ \mathfrak{g} = \mathfrak{k} \oplus \mathfrak{m} , \tag{4.50} \]
with the property that
\[ [\mathfrak{k}, \mathfrak{m}] \subseteq \mathfrak{m} , \qquad [\mathfrak{m}, \mathfrak{m}] \subseteq \mathfrak{k} , \qquad \sigma|_{\mathfrak{k}} = \mathrm{id} , \qquad \sigma|_{\mathfrak{m}} = -\mathrm{id} . \tag{4.51} \]
The dual symmetric Lie algebra, $\mathfrak{g}^*$, is defined by
\[ \mathfrak{g}^* := \mathfrak{k} \oplus i\mathfrak{m} . \tag{4.52} \]
By (4.51), $\mathfrak{g}^*$ is again a real Lie algebra. Let $G^*$ be the simply connected, real Lie group with Lie algebra $\mathfrak{g}^*$. We say that $G^*$ is dual to $G$, and that $(G^*, K, \sigma)$ is the symmetric space dual to $(G, K, \sigma)$.
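As a concrete instance of the duality (4.52), one may take the 2-sphere of example (ii) with the reflection $r: x^0 \mapsto -x^0$. The sketch below assumes $\mathfrak{g} = so(3)$, realized by real antisymmetric $3\times 3$ matrices, with $\mathfrak{k}$ spanned by the rotation generator of the $(x^1, x^2)$-plane and $\mathfrak{m}$ by the two generators that mix $x^0$ with $x^1$, $x^2$; this identification and the names `elementary`, `K`, `M1`, `M2`, `B1`, `B2` are illustrative assumptions. The code checks the relations (4.51) and shows that passing to $\mathfrak{k} \oplus i\mathfrak{m}$ keeps the structure constants real while flipping the sign of the trace form on the $\mathfrak{m}$-directions, as expected for the Lorentz algebra $so(1,2)$ of $dS^2$.

```python
def elementary(a, b, n=3):
    """Antisymmetric generator E_ab of so(n): entry +1 at (a, b), -1 at (b, a)."""
    return [[(1 if (i, j) == (a, b) else 0) - (1 if (i, j) == (b, a) else 0)
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def trace_of_square(A):
    return sum(A[i][k] * A[k][i] for i in range(len(A)) for k in range(len(A)))

def equal(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A)))

K = elementary(1, 2)    # spans k: rotations preserving the equator x0 = 0
M1 = elementary(0, 1)   # spans m together with M2
M2 = elementary(0, 2)
r = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]  # reflection x0 -> -x0

def sigma(A):
    """The involution induced by conjugation with the reflection r."""
    return matmul(r, matmul(A, r))

# Generators of the dual algebra g* = k + i m, cf. Eq. (4.52)
B1, B2 = scale(1j, M1), scale(1j, M2)

m_bracket = commutator(M1, M2)      # lies in k
dual_bracket = commutator(B1, B2)   # lies in k, with the opposite sign
```

Here `commutator(M1, M2)` equals `-K` whereas `commutator(B1, B2)` equals `+K`, and `trace_of_square` is negative on `K` but positive on `B1`, so the compact directions in $\mathfrak{m}$ have become non-compact in $\mathfrak{g}^*$.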
The idea is that $G^*$ is the group of Killing symmetries of "physical space-time", $(N_L, \eta_L)$, associated with $(N = N_E, \eta = \eta_E)$. One may expect that, usually, $(N_L, \eta_L)$ can be reconstructed unambiguously from $(N, \eta)$ if $(N, \eta)$ has Properties (I) and (II), with $\dim(\mathfrak{m}) \ge 1$ (with $N_L$ assumed to be simply connected).

Let $h$ be an element of $G$ as described in Property ($\tilde{\mathrm{P}}$3). It is easy to see that
\[ h = \exp M , \quad \text{for some } M \in \mathfrak{m} , \tag{4.53} \]
and, assuming that Property ($\tilde{\mathrm{P}}$2) holds, too, that
\[ k^{-1}hk \ \text{satisfies Eq. (4.47), for all } k \in K . \tag{4.54} \]
Let $\mathfrak{g}_h$ be the Lie subalgebra of $\mathfrak{g}$ on which the adjoint action of $h$ is trivial, let $\mathfrak{g}_h^*$ be the corresponding Lie subalgebra of $\mathfrak{g}^*$; and let $G_h$, $G_h^*$ be the subgroups of $G$ and $G^*$ generated by $\mathfrak{g}_h$ and $\mathfrak{g}_h^*$, respectively. Clearly $G_h$ is the subgroup of $G$ commuting with $h$. We note that $G_h$ contains the one-parameter subgroup $\{\exp \tau M \mid \tau \in \mathbb{R}\}$.

Henceforth, any element $h$ of $G$ as in Property ($\tilde{\mathrm{P}}$3) is called a KMS-element of $G$. We are now prepared to state the main result of this section.

Theorem 4.2. Let $(N, \eta)$ have Properties (I) and (II). Furthermore, let $(N, \eta)$, the associated symmetric space $(G, K, \sigma)$, $V$, $\rho$, $\theta_r$ and $\phi$ be as described in points (a) through (e), above, with Properties ($\tilde{\mathrm{P}}$1) through ($\tilde{\mathrm{P}}$4). These data uniquely determine a separable Hilbert space, $\mathcal{H}$, a continuous, unitary representation, $\pi$, of the group $G^*$ on $\mathcal{H}$, and, for any KMS-element $h$ of $G$, an anti-unitary involution, $J_h$, such that
\[ \pi(g)J_h = J_h\pi(g) , \quad \text{for all } g \in G_h^* , \tag{4.55} \]
and
\[ \pi(k)J_h = J_{khk^{-1}}\pi(k) , \tag{4.56} \]
for all $k \in K$.

Remark 4.3. (1) With the exception of the statements concerning the anti-unitary operators, $J_h$, associated with KMS-elements $h \in G$, this theorem has been proven, under different hypotheses on $(G, K, \sigma)$, in [36, 22, 23].
(2) The proof of the theorem follows steps (i), (ii) and (iii) of the proof of Part (1) of the main theorem in Sec. 3 (see Sec. 3.3).

(i) Construction of Hilbert Space

An inner product, $\langle\cdot,\cdot\rangle$, on the subspace $V_+ \subset V$ is defined by
\[ \langle v, w\rangle := \phi(v, \theta_r w) ; \tag{4.57} \]
see ($\tilde{\mathrm{P}}$4). Let $\mathcal{N}$ denote the kernel of $\langle\cdot,\cdot\rangle$ in $V_+$, and
\[ \Phi(v) := v \bmod \mathcal{N} , \qquad v \in V_+ . \tag{4.58} \]
One defines H to be the closure of Φ(V) ≡ V/N in the norm determined by $\langle\cdot,\cdot\rangle$, and $\langle\Phi(v), \Phi(w)\rangle := \langle v, w\rangle$ defines the scalar product on H.
(ii) Construction of a Unitary Representation, π, of G∗ on H
The representation π is defined as follows: For k ∈ K,
$\pi(k)\Phi(v) := \Phi(\rho(k)v)$ . (4.59)
It follows directly from Property $(\widetilde{\mathrm{P2}})$ and (4.45) that π(k) is a unitary operator. Furthermore, with every M ∈ m we associate an operator M on H by setting
$e^{tM}\Phi(v) := \Phi(\rho(e^{tM})v)$ ,
for t so small that exp(tM ) ∈ Uv . Note that by (4.45), (4.51), Property $(\widetilde{\mathrm{P2}})$ and results in [22], M is self-adjoint. Thus, $\pi(e^{itM}) := e^{itM}$ defines a one-parameter unitary group. As shown in [22, 36, 32, 23], under somewhat different hypotheses, π defines a unitary representation of G∗ on H.
(iii) Construction of the anti-unitary involution $J_h$, h a KMS-element of G
If h is a KMS-element of G and v ∈ V+ , we set
$j_h v := \theta_r \rho(h) v = \rho(h^{-1}) \theta_r v$ . (4.60)
Then
$\langle j_h v, j_h w\rangle = \phi(\theta_r\rho(h)v, \rho(h)w) = \phi(\rho(h^{-1})\theta_r v, \rho(h)w) = \phi(w, \theta_r v) = \langle w, v\rangle$ , (4.61)
by (4.45) and Property $(\widetilde{\mathrm{P3}})$, (4.47). It follows that N is invariant under $j_h$, and we may thus set
$J_h \Phi(v) := \Phi(j_h v)$ , $v \in V_+$ . (4.62)
Equation (4.61) then implies that $J_h$ is anti-unitary. Furthermore, by (4.45) and Property $(\widetilde{\mathrm{P3}})$,
$J_h^2 \Phi(v) = \Phi(j_h^2 v) = \Phi(\theta_r\rho(h)\theta_r\rho(h)v) = \Phi(\rho(h^{-1})\rho(h)v) = \Phi(v)$,
for arbitrary v ∈ V+ .
Hence, $J_h^2 = 1$ .
(4.63)
Using (4.60), (4.62), (4.54) and (4.59), we find that $\pi(k) J_h \Phi(v) = \Phi(\rho(k) j_h v) = \Phi(j_{khk^{-1}} \rho(k) v) = J_{khk^{-1}} \pi(k) \Phi(v)$ ,
(4.64)
for all k ∈ K and all v ∈ V+ . Equation (4.55) easily follows from the definition of Gh and G∗h by using [22, Theorems 1 and 3]. (3) To come up with an analogue of Part (2) of the main theorem stated in Sec. 3.2 and of the results in Sec. 4.1 would require introducing more structure. As an example, let us imagine that V+ contains a linear subspace, V0 , with θr V0 = V0
(4.65)
(hence V0 ⊆ V+ ∩ V− ). Then, the following variant of the real-time KMS condition holds: Let h be a KMS-element of G, with h = exp M ,
M ∈ m;
see (4.53). Let M := dπ(M )
(4.66)
be the self-adjoint operator representing M on the Hilbert space H. Then Property $(\widetilde{\mathrm{P3}})$, (4.47), (4.65) and the theorem stated above imply that, for arbitrary v and w in V0 ,
$\langle e^{itM}\Phi(v), \Phi(w)\rangle = \langle J_h\Phi(w), J_h e^{itM}\Phi(v)\rangle$
$= \langle J_h\Phi(w), e^{itM} J_h\Phi(v)\rangle$, cf. (4.24)
$= \langle \Phi(\rho(h^{-1})\theta_r w), e^{itM}\Phi(\rho(h^{-1})\theta_r v)\rangle$, cf. (4.60)
$= \langle e^{-M}\Phi(\theta_r w), e^{itM} e^{-M}\Phi(\theta_r v)\rangle$, cf. (3.28), (3.33)
$= \langle e^{-i(t-i)M}\Phi(\theta_r w), e^{-M}\Phi(\theta_r v)\rangle$ ,
where we have used that M is self-adjoint. The usual arguments show that
$F_{vw}(t) := \langle e^{itM}\Phi(v), \Phi(w)\rangle$ (4.67)
is the boundary value of a function, $F_{vw}(z)$, analytic in z on the strip
$\{z \mid 0 < \mathrm{Im}\, z < 2\}$ , (4.68)
with $F_{vw}(t + 2i) = \langle \Phi(\theta_r w), e^{itM}\Phi(\theta_r v)\rangle = F_{\theta_r w\, \theta_r v}(-t)$ .
(4.69)
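For orientation, Eqs. (4.67) to (4.69) have the same shape as the familiar KMS boundary condition of a Gibbs state. The sketch below is not part of the paper's construction: it takes a hypothetical finite-dimensional system with a random Hermitian matrix H, the Gibbs state ω(X) = Tr(e^{−βH}X)/Tr(e^{−βH}) and two arbitrary matrices A, B, and checks numerically that F(z) = ω(A e^{izH} B e^{−izH}) satisfies F(t + iβ) = ω(α_t(B) A), where α_t(B) = e^{itH} B e^{−itH}. The dimension, the value of β and all names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 4, 2.0  # illustrative dimension and inverse temperature


def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


X = random_matrix(n)
H = (X + X.conj().T) / 2          # Hermitian generator of the dynamics
A, B = random_matrix(n), random_matrix(n)

evals, U = np.linalg.eigh(H)


def exp_zH(z):
    """Return exp(z H) for complex z via the spectral decomposition of H."""
    return (U * np.exp(z * evals)) @ U.conj().T


rho = exp_zH(-beta)
rho = rho / np.trace(rho)          # Gibbs density matrix


def omega(X):
    return np.trace(rho @ X)


def F(z):
    """F(z) = omega(A e^{izH} B e^{-izH}); entire in z because n is finite."""
    return omega(A @ exp_zH(1j * z) @ B @ exp_zH(-1j * z))


t = 0.7
alpha_t_B = exp_zH(1j * t) @ B @ exp_zH(-1j * t)

lhs = F(t + 1j * beta)             # continuation to the upper edge of the strip
rhs = omega(alpha_t_B @ A)         # operators in the opposite order

print("F(t + i beta)      =", lhs)
print("omega(alpha_t(B) A) =", rhs)
```

In finite dimensions the function F is entire, so the analyticity statement is trivial and only the boundary relation carries content; in the setting of the text the strip of analyticity is what has to be established.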
We conclude that if $L := 2\beta^{-1} M$ can be interpreted as the generator of time evolution (the Liouvillian) in a suitably chosen frame of reference, then, apparently, the quantum theory reconstructed in Theorem 4.2 describes a system in thermal equilibrium at inverse temperature β. This yields a (rather standard) imaginary-time interpretation of the Unruh- and the Hawking effects. Comparing Eqs. (4.67)–(4.69) with (4.29), (4.20), we easily arrive at a formulation of the connection between spin and statistics (SSC) in the present context.
Theorem 4.2 and the considerations above apply to all the examples described at the beginning of this section.
(i) Minkowski space: $N \equiv N_E^d = E^d$, $N_L^d = M^d$, $G = \widetilde{SO}(d) \ltimes R^d$, $G^* = \widetilde{SO}(d-1, 1) \ltimes R^d$, iM a boost generator (Unruh effect).
(ii) de Sitter and AdS:
(1) $N = S_R^d$, $N_L^d = dS_R^d$, $G = \widetilde{SO}(d+1)$, $G^* = \widetilde{SO}(d, 1)$, iM a boost generator (“cosmic Unruh effect”).
(2) $N = H^d$, $N_L^d = \widetilde{AdS}^d$, $G = \widetilde{SO}(d, 1)$, $G^* = \widetilde{SO}(d-1, 2)$ (“AdS-Unruh-effect” [40]).
(iii) Schwarzschild black hole: $N = R^2 \times S^{d-2}$, $N_L^d$ = Schwarzschild space-time, $G = \widetilde{SO}(2) \times \widetilde{SO}(d-1)$, $G^* = R \times \widetilde{SO}(d-1)$, M the generator of rotations of the plane $R^2 = U_E^2$ (i.e. iM ∝ generator of time translations, $t \mapsto t + \tau$): Hawking effect!
We hope to present an extension of the analysis in this section to quantum theories on more general spaces, including non-commutative ones, elsewhere. (In
this connection, note that it is really only the symmetries (G, K, σ) and the antilinear involution θr which are important in the proof of Theorem 4.2 and in the remarks (1) through (3), above, but not the manifolds (NE , ηE ) and (NL , ηL )!) A particularly interesting case concerns two-dimensional conformal field theories, where G is infinite-dimensional. This case is not covered by the results in [22, 23, 36]. It would be desirable to understand it more fully.
Another important problem concerns the construction of analytic continuations of “imaginary-time” Green functions (defined over NE ) to real-time Green functions (defined over NL ) in non-trivial gravitational backgrounds, which we have not touched upon in this section. In this connection, it should be noted that n-tuples of real points of NL which could be reached by “KMS analytic continuation” starting from general n-tuples of points in NE and belonging to the complex orbits associated with any given KMS-element h of the symmetry group G do not cover all of $N_L^{\times n}$, but are always confined to a region $W_h^{\times n}$, where $W_h$ consists of a “wedge-shaped” subset of NL and its causal complement.
Acknowledgments
The senior author is deeply grateful to his scientific grandfathers, fathers and uncles for having created the atmosphere and the facts which made considerations like the ones presented in this paper appear worthwhile and possible. He thanks his collaborators in work on which this review is based, and, in particular, E. Nelson and E. Seiler, for all they have taught him. He is grateful to H. Epstein for very helpful discussions, and to H. Araki for support and encouragement. We thank the referees for their comments and for drawing our attention to some relevant references. The research of the first author is supported in part by the Swiss National Foundation.
References
[1] J. Ginibre, Reduced density matrices of quantum gases, J. Math. Phys. 6, 238–251, 252–262, 1432–1446; and in Statistical Mechanics, ed. T. Bak, Benjamin, New York, 1967, p. 148.
[2] J. Feldman and E. Trubowitz, Perturbation theory for many fermion systems, Helv. Phys. Acta 63 (1990) 156–260; J. Feldman, H. Knörrer and E. Trubowitz, A two-dimensional Fermi liquid, single scale analysis of many-fermion systems, Convergence of perturbation expansions in fermionic models, see http://www.math.ubc.ca/people/faculty/feldman/fl.html
[3] T. Chen, J. Fröhlich and M. Seifert, Renormalization group methods: Landau-Fermi liquid and BCS superconductor, in Fluctuating Geometries in Statistical Mechanics and Field Theory, eds. F. David, P. Ginsparg and J. Zinn-Justin, Les Houches Summer School 1994, Elsevier Science, Amsterdam, 1995.
[4] K. Osterwalder and R. Schrader, Axioms for Euclidean Green’s functions, Comm. Math. Phys. 42 (1975) 281–305.
[5] V. Glaser, On the equivalence of the Euclidean and Wightman formulation of field theory, Comm. Math. Phys. 37 (1974) 257–272.
[6] R. Jost, The General Theory of Quantized Fields, American Mathematical Society, Providence, Rhode Island, 1965.
[7] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That, Benjamin, New York, 1964.
[8] R. Kubo, J. Phys. Soc. Japan 12 (1957) 570.
[9] P. C. Martin and J. Schwinger, Phys. Rev. 115 (1959) 1342.
[10] D. Ruelle, Analyticity of Green’s functions of dilute quantum gases, J. Math. Phys. 12 (1971) 901–903, and Definition of Green’s functions for dilute Fermi gases, Helv. Phys. Acta 45 (1972) 215–219.
[11] R. Haag, N. Hugenholtz and M. Winnink, On the equilibrium states in quantum statistical mechanics, Comm. Math. Phys. 5 (1967) 215–236.
[12] H. Araki, Multiple time analyticity of a quantum statistical state satisfying the KMS boundary condition, Publ. RIMS, Kyoto Univ. Ser. A 4 (1968) 361–371.
[13] R. Høgh-Krohn, Relativistic quantum statistical mechanics in two-dimensional spacetime, Comm. Math. Phys. 38 (1974) 195–224; R. Figari, R. Høgh-Krohn and C. R. Nappi, Interacting relativistic boson fields in the De Sitter universe with two spacetime dimensions, Comm. Math. Phys. 44 (1975) 265–278.
[14] J. Fröhlich, The reconstruction of quantum fields from Euclidean Green’s functions at arbitrary temperatures, Helv. Phys. Acta 48 (1975) 355–369.
[15] J. Fröhlich, Lectures at Princeton University 1976/77, unpublished.
[16] H. Araki, Relative Hamiltonian for faithful normal states of a von Neumann algebra, Publ. RIMS Kyoto Univ. 9 (1973) 165–209.
[17] J. Fröhlich, Unbounded symmetric semigroups on a separable Hilbert space are essentially selfadjoint, Adv. Appl. Math. 1 (1980) 237–256.
[18] A. Klein and L. J. Landau, Construction of a unique self-adjoint generator for a symmetric local semigroup, J. Funct. Anal. 44 (1981) 121–137.
[19] A. Klein and L. J. Landau, Stochastic processes associated with KMS states, J. Funct. Anal. 42 (1981) 368–428.
[20] J. J. Bisognano and E. H. Wichmann, On the duality condition for a Hermitian scalar field, J. Math. Phys. 16 (1975) 985–1007.
[21] R. Jost, Eine Bemerkung zum CTP Theorem, Helv. Phys. Acta 30 (1957) 409–416.
[22] J. Fröhlich, K. Osterwalder and E. Seiler, On virtual representations of symmetric spaces and their analytic continuation, Ann. Math. 118 (1983) 461–489.
[23] P. E. T. Jorgensen and G. Ólafsson, Proc. Symp. Pure Math. 68, Providence R.I.: AMS Publ. 2000, pp. 331–401.
[24] O. E. Lanford, in Mécanique Statistique et Théorie Quantique des Champs, eds. C. De Witt, R. Stora, Gordon and Breach, New York, 1971 (Les Houches Summer School 1970), pp. 109–214.
[25] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Springer Verlag, New York, Volume I: 1979, Volume II: 1981.
[26] M. Takesaki, Tomita’s Theory of Modular Hilbert Algebras and its Applications, LNM 128, Springer Verlag, Berlin, Heidelberg, 1970.
[27] R. Haag, D. Kastler and E. B. Trych-Pohlmeyer, Stability and equilibrium states, Comm. Math. Phys. 38 (1974) 173–193.
[28] V. Jakšić and C. A. Pillet, On a model for quantum friction III. Ergodic properties of the Spin-Boson system, Comm. Math. Phys. 178 (1996) 627–651.
[29] V. Bach, J. Fröhlich and I. M. Sigal, Return to equilibrium, J. Math. Phys. 41 (2000) 3985.
[30] H. Epstein, in Axiomatic Field Theory, eds. M. Chretien and S. Deser, Gordon and Breach, New York, 1966 (Brandeis Summer School 1965).
[31] H. Kunze and E. Stein, Uniformly bounded representations II, Amer. J. Math. 83 (1960) 723–786.
[32] A. Klein and L. J. Landau, From the Euclidean group to the Poincaré group via Osterwalder-Schrader positivity, Comm. Math. Phys. 87 (1983) 469–484.
[33] J. Fröhlich and P. A. Marchetti, Spin-statistics theorem and scattering in two-dimensional condensed matter physics, Nucl. Phys. B356 (1991) 533–573.
[34] G. L. Sewell, Relativity of temperature and the Hawking effect, Phys. Lett. 79A (1980) 23/24.
[35] J. Bros, H. Epstein and U. Moschella, Analyticity properties and thermal effects for general quantum field theory on De Sitter space-time, Comm. Math. Phys. 196 (1998) 535–570.
[36] M. Lüscher and G. Mack, Global conformal invariance in quantum field theory, Comm. Math. Phys. 41 (1975) 203–234.
[37] J. Fröhlich, Statistics of fields, the Yang–Baxter equation and the theory of knots and links, in Non-Perturbative Quantum Field Theory, eds. G. ’t Hooft et al., Cargèse 1987, NATO Advanced Science Institutes Series B: Physics 185, Plenum, New York, pp. 71–100.
[38] H. Araki, On the connection of spin and commutation relations between different fields, J. Math. Phys. 2 (1961) 267–270.
[39] J. Fröhlich and F. Gabbiani, Braid statistics in local quantum theory, Rev. Math. Phys. 2 (1990) 251–353.
[40] J. Bros, H. Epstein and U. Moschella, Towards a general theory of quantized fields on the anti-de Sitter space-time, preprint, 2001.
[41] M. Bertola, J. Bros, U. Moschella and R. Schaeffer, A general construction of conformal field theories from scalar anti-de Sitter quantum field theories, Nucl. Phys. B 587 (2000) 619.
[42] D. Buchholz, M. Florig and S. Summers, Hawking-Unruh temperature and Einstein causality in anti-de Sitter space time, Class. Quant. Gravity 17 (2000) L31–L37.
[43] J. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2 (1998) 231–252.
[44] N. Straumann, General Relativity and Relativistic Astrophysics, Springer Verlag, Berlin, Heidelberg, New York, 1984.
August 22, 2002 10:44 WSPC/148-RMP
00143
Reviews in Mathematical Physics, Vol. 14, No. 7 & 8 (2002) 873–885 © World Scientific Publishing Company
ANY COMPACT GROUP IS A GAUGE GROUP
SERGIO DOPLICHER∗ and GHERARDO PIACITELLI†
Dipartimento di Matematica, Università degli studi di Roma “La Sapienza”, P.le A. Moro, 2, 00185 Roma, Italy
∗ [email protected]
† [email protected]
Received 26 April 2002
Revised 20 June 2002
Dedicated to Huzihiro Araki on the occasion of his 70th birthday.
The assignment of the local observables in the vacuum sector, fulfilling the standard axioms of local quantum theory, is known to determine uniquely a compact group G of gauge transformations of the first kind together with a central involutive element k of G, and a complete normal algebra of fields carrying the localizable charges, on which k defines the Bose/Fermi grading. We show here that any such pair {G, k}, where G is compact metrizable, does actually appear. The corresponding model can be chosen to fulfill also the split property. This is not a dynamical phenomenon: a given {G, k} arises as the gauge group of a model where the local algebras of observables are a suitable subnet of local algebras of a possibly infinite product of free field theories.
Keywords: Gauge groups; local algebras; superselection theory.
Mathematics Subject Classification 2000: 81T05, 43A95, 22D25
1. Introduction
When a few standard assumptions of Local Quantum Physics on the physical, four-dimensional Minkowski spacetime are fulfilled, the assignment of the local observables is sufficient to construct an algebra F of fields carrying localizable charges and a compact group G of gauge transformations of the first kind, where F is complete in the sense that all superselection sectors with finite statistics (intrinsically determined by the local observables) can be reached acting with the fields on the vacuum [1]. Moreover, the construction is unique if in addition one requires normal Bose/Fermi commutation relations at spacelike separations; the corresponding Z2 grading is defined by a central involution k ∈ G. The local observables are then identified with the gauge invariant part of F, and the superselection sectors are in one to one correspondence with the classes of irreducible continuous unitary representations of the gauge group G. More precisely, F is constructed from a cross product of A by the superselection category, and G appears as the group of all automorphisms
of F leaving A pointwise fixed. Representations of A obeying the selection criterion of localizability and with finite statistics are described by localized morphisms of A induced by finite dimensional G invariant Hilbert spaces in F with support I; the correspondence between a localized morphism and the representation of G obtained as the restriction of gauge automorphisms to the corresponding Hilbert space in F provides a natural isomorphism between the superselection category and a category of representations of G, equivalent to the category of all finite dimensional unitary continuous representations. For a survey see e.g. [2, 3]. The proof of this reconstruction theorem was closely related to a new duality theory for compact groups [4].
The gauge group G will be metrizable^a precisely in theories with at most countably many superselection sectors, hence, in particular, in theories where the split property is fulfilled by the field net [5]. So it is quite natural to ask which compact groups may arise as gauge groups. Indeed, here we provide a natural, functorial construction which maps any given system {G, k, µ} (with G a metrizable compact group, k ∈ Z(G), $k^2 = e$, and µ a suitable mass function over a generating subset of the spectrum of G, cf. below) to an observable net with a distinguished vacuum sector, admitting precisely G as its gauge group, and k as the grading element. Moreover, the field net will fulfill the split property.
^a Equivalently, the compact group G has a countable basis of open sets, or has a countable set of equivalence classes of irreducible continuous unitary representations.
Let Irr(G) be the set of all irreducible subrepresentations of the regular representation. We say that a subset ∆ ⊂ Irr(G) is symmetric if it is stable under conjugation, generating if every element of Irr(G) appears as a subrepresentation of tensor products of elements of ∆.^b In what follows, by a mass function we shall always understand a map $\mu : \Delta^{(\mu)} \to (0, \infty)$ with $\Delta^{(\mu)} \subset \mathrm{Irr}(G)$ a symmetric, generating subset associated to mutually orthogonal subspaces, and such that $\inf\{\mu(\Delta^{(\mu)})\} > 0$; moreover, µ will satisfy the growth condition
$\sum_{\sigma \in \Delta^{(\mu)}} d(\sigma)\, \mu(\sigma)^4 e^{-\mu(\sigma)\delta} < \infty , \quad \forall \delta > 0 ,$ (1.1)
($d(\sigma) = \dim(K_\sigma)$ is the dimension of the representation $\sigma \in \Delta^{(\mu)}$ acting on the Hilbert space $K_\sigma \subset L^2(G)$), which will imply the split property for the field net of the associated model.
^b By the Stone–Weierstrass theorem, a symmetric ∆ is generating if and only if it is faithful, i.e. it separates points of G.
A gauge triple is a system T = {G, k, µ} where G is a compact, metrizable group, k ∈ G an involutive central element and µ a mass function. A continuous, surjective group homomorphism $\eta : G \to G_1$ defines an arrow $\eta : \{G, k, \mu\} \to \{G_1, k_1, \mu_1\}$ in the category Gauge of gauge triples if and only if $\eta(k) = k_1$
and the transposed map $\eta^* : \mathrm{Irr}(G_1) \to \mathrm{Irr}(G)$, $\eta^*\sigma = \sigma \circ \eta$, fulfills the conditions
$\eta^* \Delta^{(\mu_1)} \subset \Delta^{(\mu)}$ , $\mu \circ \eta^* = \mu_1$ .
We shall construct a functor F from Gauge to the category Obs of covariant observable (i.e. local) nets in a vacuum sector. The objects of Obs are the systems {H, A, U }, where A is an irreducible, additive local net of von Neumann algebras on the separable Hilbert space H, defined on the set K of double cones in Minkowski space, and fulfilling duality;^c U is a strongly continuous unitary representation of the Poincaré group P, covariantly acting on A, satisfying the spectrum condition and having an invariant vector (the vacuum vector), unique up to a phase. An arrow {H1 , A1 , U1 } → {H, A, U } is given by an isometry $V : H_1 \to H$ and an inclusion of nets $\Phi : A_1 \hookrightarrow A$ such that
$V A = \Phi(A) V$ , $A \in A_1$ ,
$V U_1(L) = U(L) V$ , $L \in P$ .
Hence, A1 ⊂ A in the sense of [6]. The results of [1] provide us with a reconstruction functor r which associates to any system {H, A, U } in Obs the corresponding couple {G, k} where G is the gauge group and k is the element grading the commutation rules. We are ready to state our main result.
Theorem 1.1. There exists a faithful, contravariant functor
F : Gauge → Obs
such that the following diagram is commutative
{G, k, µ} ──F──→ {H, A, U }
       f ↘       ↙ r
          {G, k}
where f forgets the mass function. Moreover, for any gauge triple T, the canonical field net associated with F (T ) fulfills the split property.
We recall here that the split property selects the models which are more relevant for physics among theories fulfilling the general axioms of locality, covariance, and spectrum condition; in particular it allows us to derive rigorously a weak form of the Noether theorem and an exact variant of the current algebra [7, 8, 9]; this provides a natural approach towards a full Quantum Noether Theorem (see [10] for partial results in conformal models).
^c We recall that the net A is said to fulfill duality if, for any double cone O, $A(O) = A(O')'$ .
Remark 1.1. Let T = {G, k, µ} be a gauge triple, and F (T ) = {H, A, U }; A will be obtained as the G-invariant part of a field net F associated to T . F will be generated, as σ runs through $\Delta^{(\mu)}$, by d(σ) independent free fields of mass µ(σ), acting on a Hilbert space $H_\sigma$, of scalar or Dirac type according to whether σ(k) = +1 or −1 respectively, and fulfilling normal commutation relations; thus fields associated to $\sigma \neq \sigma'$ anticommute if $\sigma(k) = \sigma'(k) = -1$, and commute otherwise. The net F will act irreducibly over the (infinite, if $\Delta^{(\mu)}$ is infinite) tensor product (relative to the sequence of vacuum state vectors) over $\sigma \in \Delta^{(\mu)}$ of the Hilbert spaces $H_\sigma$. To prove that G is indeed the full gauge group associated to A, it suffices to prove that the field net F fulfills the split property, together with a further cohomological condition [11, Theorems 4 and 2, Sec. 3.4.5.]. Actually, one might conjecture that the infinite tensor product of full free theories has no nontrivial sectors.^d Were this conjecture true, the extension of the functor F to a larger category of gauge triples, defined by dismissing the growth condition on the mass functions, still would make the above diagram commutative, without any need to invoke and establish the split property of the infinite tensor product model. Furthermore, this might allow us to extend the present result to theories with uncountably many superselection sectors, where the field algebra cannot fulfill the split property, but which however might hardly have any physical interest.
^d A related conjecture has been formulated by Roberto Longo [12]; we defer this problem to future investigation.
Remark 1.2. Let T , T1 be two gauge triples, and F, F1 the canonical field nets associated with F (T ), F (T1 ), respectively. If there is an arrow $\eta : T \to T_1$, then F (T1 ) is a subsystem of F (T ) and one has the following commuting square of inclusions of nets (in the sense of [6])
A1 ⊂ F1
∩      ∩
A  ⊂ F
Then
$F_1 \vee A = F^{\ker \eta}$ , $F_1 \wedge A = F_1^{G_1}$ ,
a statement about the superselection theory for subsystems, which holds true in general. See [6].
The construction of our models is outlined in Sec. 2, and the proof of the main theorem is completed in Sec. 3.
2. Construction of the Models
A mass function µ naturally defines a faithful representation D of G as the direct sum of the representations σ contained in its domain $\Delta^{(\mu)}$, with representation
space $K \subset L^2(G)$ stable under the canonical conjugation on $L^2(G)$, which induces a conjugation J of K. The role of the grading element k is to select Fermi and Bose sectors: since k is central and involutive, we can decompose $\Delta^{(\mu)} = \Delta_+^{(\mu)} \cup \Delta_-^{(\mu)}$, where $\sigma(k) = \pm I_{K_\sigma}$, $\sigma \in \Delta_\pm^{(\mu)}$. Correspondingly, the space K has the natural decomposition K = K+ ⊕ K− as the direct sum of the two eigenspaces of the selfadjoint, unitary operator D(k).
We consider two generalized free fields, a scalar field φ defined on the K+ -valued test functions, and a Dirac field ψ defined on the $K_- \otimes C^4$-valued test functions (a brief reminder is included in the appendix, for convenience of the reader). Define F as the tensor product of the nets F+ and F− associated to φ and ψ respectively; the action of G on the fields is induced by the natural pointwise actions of G on the test functions, and the mass function determines the energy–momentum spectrum. The observable net $A = F^G$ is the net of gauge invariant elements. The vacuum representation of A is generated by the restriction to A of the vacuum state of F.
The field algebra F is thus a product of massive free field models, on which the elements of the desired group G act as gauge transformations on multiplets of fields and k determines the Bose/Fermi grading. For each $\sigma \in \Delta^{(\mu)}$, there is a multiplet of d(σ) fields on which G acts and defines a unitary representation equivalent to σ. Since the vacuum is cyclic and G-invariant, the action of G on the fields is implemented by a faithful, strongly continuous unitary representation V of G. The commutation relations are graded by V(k).
The basic properties of locality, covariance, and spectrum condition [13, 14] are evidently fulfilled; only the proof of the duality property of the net A in its vacuum sector has to be sketched. We first prove twisted duality for the field net F. The net of fixed points under the action of a gauge group $G_{\max}$, a priori larger than G, will be the tensor product of nets, each fulfilling duality, hence it will fulfill duality as well (cf. e.g. [5, Lemma 10.1]). This implies ([1, Theorem 3.6]) that F fulfills twisted duality, which in turn implies that for any closed subgroup $G \subset G_{\max} = U(H) \cap F'$, the fixed point net $A = F^G$ fulfills duality in the vacuum [15].
To be more precise, the Tychonov product $G_{\max}$ of the full unitary groups $U(K_\sigma)$ (or of the full orthogonal group $O(K_\sigma)$ if $\sigma = \bar\sigma$) is represented on K by the diagonal action in the decomposition of K as the direct sum of the subspaces $K_\sigma$; composing this representation with the second quantization functor Γ (which intertwines direct sums and tensor products, cf. Remark 1.1) provides a unitary representation of $G_{\max}$ which induces gauge automorphisms on F, generated by the products of the actions of the full gauge groups $U(K_\sigma)$ (or $O(K_\sigma)$). The subnet $A_{\min} = F^{G_{\max}}$ of fixed points under $G_{\max}$ is now the tensor product of the nets $A_\sigma := F_\sigma^{U(K_\sigma)}$ (or $F_\sigma^{O(K_\sigma)}$) of fixed points in the field algebra generated by the free fields $\phi_\sigma$ or $\psi_\sigma$. Since each $A_\sigma$ fulfills duality in its vacuum sector, so does the tensor product net $A_{\min}$, hence A too.
We thus defined an object of Obs associated by F to an object of Gauge; to an arrow $\eta : \{G, k, \mu\} \to \{G_1, k_1, \mu_1\}$ in Gauge, we associate an arrow F (η) = (V, Φ) in Obs as follows. Let $v_\eta$ be the restriction to K1 of the isometry from $L^2(G_1)$ to $L^2(G)$
given by the transposition of η, and define the isometry $\tilde V$ and the *-homomorphism $\tilde\Phi : F_1 \to F$ by
$\tilde V := \Gamma(v_\eta)$ , $\tilde\Phi(e^{i\phi(f)}) := e^{i\phi(v_\eta f)}$ , $\tilde\Phi(\psi(f)) := \psi(v_\eta f)$ ,
where φ, ψ are the generating Bose, resp. Fermi, generalized free fields, for all the appropriate test functions f . The desired arrow F (η) is obtained as the restriction of $\tilde V$, $\tilde\Phi$ respectively to the gauge invariant subspace and subalgebra. A routine check shows that F (η1 ◦ η2 ) = F (η2 ) ◦ F (η1 ) whenever η1 ◦ η2 is defined, and F is indeed a contravariant functor. By [6, Theorem 2.2] and the comments appended there, F (η) in turn determines η, and F is faithful.
3. Split Property and Completeness of the Superselection Theory
In this section we show that the net F constructed as in the appendix from the gauge triple {G, k, µ} is the canonical complete field net associated to $F^G$. To this end, we have to show that it fulfills the split property and the cohomological condition (3.4) below [11]. The arguments for the split property are adaptations from the discussion of the Bose case [16–19]. In particular we prove that the split property holds if the mass function fulfills the growth condition (1.1).
We recall that a net R is said to fulfill the split property if, for any inclusion of double cones O1 ⊂ O2 such that the closure of O1 is contained in the open set O2 (in symbols: O1 ⊂⊂ O2 ), the inclusion R(O1 ) ⊂ R(O2 ) is split, that is, there exists a type I factor N such that R(O1 ) ⊂ N ⊂ R(O2 ). If the canonical field net associated with an observable net fulfills the split property, then the observable net also fulfills the split property, but sufficiently general conditions that imply the converse are not known [7].
The field net F is the tensor product of the nets $F^{(\pm)}$; since finite tensor products preserve the split property, it is enough to show that $F^{(\pm)}$ both satisfy the split property. In the case of $F^{(+)}$, condition (1.1) (restricted to $\Delta_+^{(\mu)}$) is sufficient by the results of [5, 19]. Here we show that the same condition (restricted to $\Delta_-^{(\mu)}$) is sufficient also in order to ensure the split property for $F^{(-)}$.
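To get a feeling for what the growth condition (1.1) demands, the following sketch evaluates partial sums of the series for two hypothetical mass functions. It is only an illustration under stated assumptions: the labels n = 1, 2, 3, ... stand for the irreducible representations of SU(2) of dimension d = n, and the two mass functions (linear and logarithmic in n) are invented here and do not come from the paper. With the linear choice the partial sums stabilise, while with the logarithmic choice and δ = 1 they keep growing, so (1.1) fails for that δ.

```python
import math


def growth_partial_sum(dim, mass, delta, n_max):
    """Partial sum over labels 1..n_max of d * mu^4 * exp(-mu * delta), cf. (1.1)."""
    return math.fsum(
        dim(n) * mass(n) ** 4 * math.exp(-mass(n) * delta)
        for n in range(1, n_max + 1)
    )


def dim(n):
    # Dimension of the n-th irreducible representation of SU(2).
    return n


def linear_mass(n):
    # Hypothetical mass function growing linearly with the label.
    return 1.0 * n


def log_mass(n):
    # Hypothetical mass function growing only logarithmically.
    return math.log(1.0 + n)


lin_2000 = growth_partial_sum(dim, linear_mass, delta=0.1, n_max=2000)
lin_4000 = growth_partial_sum(dim, linear_mass, delta=0.1, n_max=4000)
log_2000 = growth_partial_sum(dim, log_mass, delta=1.0, n_max=2000)
log_4000 = growth_partial_sum(dim, log_mass, delta=1.0, n_max=4000)

print("linear mass, delta = 0.1:", lin_2000, lin_4000)
print("log mass,    delta = 1.0:", log_2000, log_4000)
```

The comparison suggests the role of the condition: the masses must grow fast enough, relative to the multiplicities d(σ), for the exponential factor to win for every δ > 0, however small.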
As a first step, we give an alternative proof of the split property for the field net generated by a massive Dirac free field, which was first proved in [20]. The argument is an adaptation of the one given in [18] for the Bose case, and provides an estimate analogous to the one given in [19]; for the reader’s convenience, we give some details and point out the main differences. Let us consider the doubled theory of the massive Dirac free field (followed by a Klein transformation); this theory is induced by the fields
$\psi_1 = \psi_m \otimes I$ , $\psi_2 = V \otimes \psi_m$ ,
where Q is the usual charge and $V = e^{i\pi Q}$. We now choose as gauge transformations
$\alpha_\theta : \psi_1 \mapsto \cos(\theta)\psi_1 + \sin(\theta)\psi_2$
$\alpha_\theta : \psi_2 \mapsto -\sin(\theta)\psi_1 + \cos(\theta)\psi_2$ ,
for which the conventional Noether theorem gives the conserved current
$j^\mu = \frac{i}{2}\{\psi_1 \gamma^\mu \psi_2 - \psi_2 \gamma^\mu \psi_1\}$ .
Equivalently, with η = (η1 , η2 ), we can write $\psi(f, \eta) = \eta_1\psi_1(f) + \eta_2\psi_2(f)$, so that $\alpha_\theta(\psi(f, \eta)) = \psi(f, R(\theta)\eta)$, with R(θ) the orthogonal rotation by θ. Note that $\alpha_{\pi/2}(F \otimes I) = (I \otimes F)_{\mathrm{Klein}} = I \otimes F_+ + V \otimes F_-$ where $F = F_+ + F_-$ is the decomposition of F into the sum of a bosonic and a fermionic operator, $F_\pm = (F \pm VFV)/2$.
Let $D_r = \{(0, x) \in R^4 : |x| < r\}$ and $O_r = D_r''$ be the double cone of radius r, centred at the origin; then $O_r \subset\subset O_{r+\delta}$, δ > 0. Thanks to covariance, it suffices to show that the inclusion $F(O_r) \subset F(O_{r+\delta})$ is split, r, δ > 0.
Standard techniques (see [18] for the Bose case) allow one to give sense to $j^\mu(f) = \int d^4x\, j^\mu(x) f(x)$ as a selfadjoint operator for real test functions f , and show that, for a suitable f , $e^{i\theta j^0(f)}$ is in $F(O_{r+\delta})$ and implements $\alpha_\theta$ on $F(O_r)$. The $C^\infty$ functions f will have the form $f(x^0, x) = h(x^0) g(x)$, with g supported in $\{x : |x| < r + \frac{2}{3}\delta\}$, g(x) = 1 in $D_{r+\delta/3}$, and h with support in [−1, 1] and such that $\int h(s)\,ds = 1$; note that the desired action of $j^0(f)$ does not depend on the smearing in time, i.e. on the choice of h [21, Lemma I]. In what follows, we shall write
$J_m = \frac{\pi}{2}\, j^0(f)$ .
Here the argument differs from [18], due to the presence of the Klein transformation ($\alpha_{\pi/2}$ is not the flip any more). Consider the algebraic *-isomorphism
$\sigma\Big(\sum_{i=1}^n F_i F_i'\Big) = \sum_{i=1}^n F_i \otimes F_i'$
between the *-algebra generated by $F_m(O_r) F_m(O_{r+\delta})'$ and the algebraic tensor product $F_m(O_r) \odot F_m(O_{r+\delta})'$. We wish to find two normal, faithful states λ and ℓ, such that
$\lambda\Big(\sum_{i=1}^n F_i F_i'\Big) = \ell\Big(\sum_{i=1}^n F_i \otimes F_i'\Big)$ , (3.1)
where the $F_i$’s (resp. $F_i'$’s) are arbitrarily chosen in $F_m(O_r)$ (resp. $F_m(O_{r+\delta})'$). This would imply that σ could be extended to a unique normal *-isomorphism $\bar\sigma : F_m(O_r) \vee F_m(O_{r+\delta})' \to F_m(O_r) \,\bar\otimes\, F_m(O_{r+\delta})'$, which implies the split property.
Taking the vector $\Phi = e^{-iJ_m}\Omega \otimes \Omega$, we define on $F_m(O_r) \vee F_m(O_{r+\delta})'$ the normal state $\lambda(F) = (\Phi, F \otimes I\, \Phi)$. Then, we define the normal positive map X of $F_m(O) \,\bar\otimes\, F_m(\hat O)'$ into itself, extending $X(T \otimes S) = T_- \otimes S_-$ by linearity and continuity. Finally we define $\Psi = e^{-2iJ_m}\Omega \otimes \Omega$, and the linear functional
$\ell(\cdot) = (\Omega \otimes \Omega, \cdot\, \Omega \otimes \Omega) + (\Psi, X(\cdot)\, \Omega \otimes \Omega)$ ,
¯ m (Or+δ )0 . The functionals λ and ` are ultraweakly continuous by on Fm (Or )⊗F construction; λ is evidently positive, normalized, and faithful. Also ` will be normalized, positive and faithful if (3.1) holds. If F ∈ Fm (Or ) and F 0 ∈ Fm (Or+δ )0 , we denote F = F+ + F− , F 0 = F+0 + F−0 where F± = (F ± V F V )/2 and analogously for F 0 ; then λ(F F 0 ) = λ(F+ F+0 ) + λ(F+ F−0 ) + λ(F− F+0 ) + λ(F− F−0 ). Since F 0 ⊗ I commutes with eiJm and Fermi fields have zero vacuum expectation values, we have λ(F+ F+0 ) = ω0 (F )ω0 (F 0 ) and λ(F− F+0 ) = 0; since eiJm commutes with Uπ (the unitary operator implementing απ ), we have λ(F+ F−0 ) = (Uπ Ω ⊗ Ω, eiJm (F+ F−0 ⊗ I)e−iJm Uπ Ω ⊗ Ω) = −λ(F+ F−0 ) = 0. Thus, λ(F F 0 ) = ω0 (F )ω0 (F 0 ) + λ(F− F−0 ) = ω0 (F )ω0 (F 0 ) + (Ω ⊗ Ω, eiJm (F−0 ⊗ I)e−iJm (I ⊗ F− )Ω ⊗ Ω) , where we used that eiJm (F− F−0 ⊗I)e−iJm = eiJm (F− ⊗I)e−iJm (V ⊗F−0 ), and V Ω = Ω. Since (F−0 ⊗ I)eiJm = e−iJm (F−0 ⊗ I), we have λ(F− F−0 ) = (Ψ, (F−0 ⊗ F− )Ω ⊗ Ω), hence Eq. (3.1). The former was an alternative proof of the split property for the Fermi–Dirac model; we are now interested in getting the estimate 4 6 3 m mδ δ 2 + 2 e−mδ , (3.2) kJm (f )Ω ⊗ Ωk 6 const r + 2 δ 2 where we recall the rˆ ole of r and δ, Or ⊂⊂ Or+δ . To this purpose we can slightly modify the estimate of [19, Lemma C.1] to obtain, for all positive a’s: 2 a 1 a ˆ + 2 e− 2 . sup |ω 2 h(ω)| (3.3) 6 4a 2 inf h∈D(−1,1) 2 R ω>a h=1 Then standard calculations give 1 Z p 2 p 4m2 2 dκ 1 − 2 d3 p |p|2 + κ2 fˆ |p|2 + κ2 , p , kJm Ω⊗ Ωk 6 const κ 4m2 R 4 µ 1 d xf (x)eixµ p . Now, proceeding as in [19], we obtain where, as usual, fˆ(p) = (2π) 2 Z
2
∞
2
6 4 2 δ 2 ˆ sup |ω 2 h(ω)| , kJm Ω ⊗ Ωk 6 const m 2 r + 2 δ ω>mδ 2
2
and using (3.3) and the fact that $J_m$ does not depend on the choice of $h$, we obtain the desired estimate (3.2).

We are now ready to prove the split property for $\mathcal{F}^{(-)}$. The doubled theory $(\mathcal{F}^{(-)}\otimes\mathcal{F}^{(-)})^{\mathrm{Klein}}$ is generated by the Dirac fields $\psi_{\sigma,i}(f,\eta)$ of mass $\mu(\sigma)$ (where $\sigma\in\Delta^{(\mu)}_-$ and $i = 1, 2, \dots, \dim(K_\sigma)$), with the addition of the internal degrees of freedom $\eta\in\mathbb{C}^2$. Defining the gauge transformation
$$\alpha_\theta(\psi_{\sigma,i}(f,\eta)) = \psi_{\sigma,i}(f, R(\theta)\eta),$$
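where $R(\theta)$ is the rotation by $\theta$, we find as generator of such rotations the operator
$$J = \sum_{\sigma\in\Delta^{(\mu)}_-}\ \sum_{i=1}^{\dim K_\sigma} J_{\sigma,i},$$
where the $J_{\sigma,i}$'s strongly commute with each other. The only remaining question is whether $J$ exists in the incomplete infinite tensor product. Writing $J_m$ for the local generator of the rotation of a doubled Dirac field of mass $m$, this is the case if $\sum_{\sigma\in\Delta^{(\mu)}_-} d(\sigma)\,\|J_{\mu(\sigma)}\,\Omega\otimes\Omega\| < \infty$, as can be seen applying the following lemma. Using estimate (3.2) we get the desired condition (1.1), restricted to $\Delta^{(\mu)}_-$.

Lemma 3.1. Let $\{J_n,\ n\in\mathbb{N}\}$ be a family of strongly commuting selfadjoint operators affiliated to the von Neumann algebra $\mathcal{R}$ on the Hilbert space $\mathcal{H}$; if $\Omega\in\bigcap_n D(J_n)$ is a cyclic vector for $\mathcal{R}'$ and if $\sum_n\|J_n\Omega\| < \infty$, then $U(\lambda) := \prod_{n=1}^\infty e^{iJ_n\lambda}$ defines a strongly continuous one parameter group with generator $J = \sum_n J_n$.

Proof. For any real $\lambda$ and $T\in\mathcal{R}'$, the sequence
$$x_n(\lambda) = \Big(\prod_{k=1}^n e^{i\lambda J_k}\Big)T\Omega = T\Big(\prod_{k=1}^n e^{i\lambda J_k}\Big)\Omega$$
fulfills $\|x_n(\lambda) - x_m(\lambda)\| \le \|T\|\,|\lambda|\sum_{k=m+1}^n\|J_k\Omega\|$, where we used repeatedly the unitarity and the triangle inequality, and finally that $|e^{it}-1|\le|t|$ together with the functional calculus. Hence, there exists $x(\lambda) = \lim_{n\to\infty}x_n(\lambda)$, and $\|x(\lambda) - x_n(\lambda)\| \le \|T\|\,|\lambda|\sum_{k=n+1}^\infty\|J_k\Omega\|$; a "$3\varepsilon$ argument" gives continuity in norm of $\lambda\mapsto x(\lambda)$. Since $\mathcal{R}'\Omega$ is dense in $\mathcal{H}$, $U(\lambda) = \text{s-}\lim_{n\to\infty}\prod_{k=1}^n e^{i\lambda J_k}$ exists for any $\lambda$ and $\lambda\mapsto U(\lambda)$ is strongly continuous.

By the results of [11], a field net $\mathcal{F}$ fulfilling the split property is a complete field net for the fixpoint net $\mathcal{F}^G$ if it also fulfills the following condition. Let $p$ be a path connecting the double cones $\mathcal{O}_0$ and $\mathcal{O}$, i.e. a collection of $2n+1$ double cones $\mathcal{O}_0, \mathcal{O}_1, \dots, \mathcal{O}_n = \mathcal{O}$, $\hat{\mathcal{O}}_1, \dots, \hat{\mathcal{O}}_n$ such that $(\mathcal{O}_{i-1}\cup\mathcal{O}_i)\subset\hat{\mathcal{O}}_i$; we denote $|p| = \bigcup_i\hat{\mathcal{O}}_i$. Then the condition is
$$\bigwedge_p\mathcal{F}(|p|) = \mathcal{F}(\mathcal{O}_0)\vee\mathcal{F}(\mathcal{O}), \tag{3.4}$$
where the intersection is taken over all paths $p$ connecting $\mathcal{O}_0$ and $\mathcal{O}$. In order to apply these results to our case, we need the following

Lemma 3.2. If $\mathcal{F}_1$, $\mathcal{F}_2$ are two field nets with normal commutation relations, fulfilling (3.4), then also $\mathcal{F}(\mathcal{O}) = (\mathcal{F}_1(\mathcal{O})\,\bar\otimes\,\mathcal{F}_2(\mathcal{O}))^{\mathrm{Klein}}$ has normal commutation relations and satisfies (3.4).

Proof. Since the Klein transformation is normal,
$$\begin{aligned}
(\mathcal{F}_1(\mathcal{O}_0)\,\bar\otimes\,\mathcal{F}_2(\mathcal{O}_0))^{\mathrm{Klein}} \vee (\mathcal{F}_1(\mathcal{O})\,\bar\otimes\,\mathcal{F}_2(\mathcal{O}))^{\mathrm{Klein}}
&= \Big[\big(\mathcal{F}_1(\mathcal{O}_0)\vee\mathcal{F}_1(\mathcal{O})\big)\,\bar\otimes\,\big(\mathcal{F}_2(\mathcal{O}_0)\vee\mathcal{F}_2(\mathcal{O})\big)\Big]^{\mathrm{Klein}} \\
&= \Big[\Big(\bigwedge_p\mathcal{F}_1(|p|)\Big)\,\bar\otimes\,\Big(\bigwedge_q\mathcal{F}_2(|q|)\Big)\Big]^{\mathrm{Klein}} \\
&= \Big(\bigwedge_{p,q}\mathcal{F}_1(|p|)\,\bar\otimes\,\mathcal{F}_2(|q|)\Big)^{\mathrm{Klein}} \\
&= \bigwedge_p\big(\mathcal{F}_1(|p|)\,\bar\otimes\,\mathcal{F}_2(|p|)\big)^{\mathrm{Klein}}.
\end{aligned}$$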
This lemma applies equally well to any finite tensor product $\mathcal{F}^{(n)}(\mathcal{O}) = (\mathcal{F}(\mathcal{O})\,\bar\otimes\cdots\bar\otimes\,\mathcal{F}(\mathcal{O}))^{\mathrm{Klein}}$. It can be easily extended to the infinite tensor product $\mathcal{F} = \bigvee_n\tilde{\mathcal{F}}^{(n)}$, where $\tilde{\mathcal{F}}^{(n)} = \mathcal{F}^{(n)}\otimes I\otimes I\otimes\cdots$ acts on $\bigotimes_j^{\Omega_j}\mathcal{H}_j$, since the maps
$$P^{(n)}(F_1\otimes F_2\otimes\cdots) = F_1\otimes\cdots\otimes F_n\otimes\omega_0(F_{n+1})I\otimes\omega_0(F_{n+2})I\otimes\cdots$$
converge pointwise strongly to the identity map as $n\to\infty$, commute with the Klein transformations, and $\tilde{\mathcal{F}}^{(n)}(\mathcal{O}) = P^{(n)}\mathcal{F}(\mathcal{O})$. This is precisely the case of the models constructed in Sec. 2.

Hence, the field nets enjoy both the split property and the cohomological condition (3.4). Therefore the superselection structure of the net of local observables $\mathcal{A}$ associated to the gauge triple $T = \{G, k, \mu\}$ is described precisely by the given group $G$.

Appendix A

We first recall the definition of the generalized free scalar field $\phi$ (case $D(k) = I$). Let $J$ be a complex conjugation on $K$ commuting with $D$, and $H$ the subspace of real (i.e. $J$-invariant) vectors; our test functions $f$ on $\mathbb{R}^4$ take values in $H$, have range each in a finite dimensional $D$-invariant subspace, and $(\xi, f(\cdot))$ is $C^\infty$ with compact support for each $\xi\in K$; they carry actions of $G$ and $\mathcal{P}$ given by
$${}^{g}f(x) = D(g)f(x), \qquad g\in G,\ x\in\mathbb{R}^4, \tag{A.1}$$
$${}^{(a,\Lambda)}f(x) = f(\Lambda^{-1}(x-a)), \qquad (a,\Lambda)\in\mathcal{P},\ x\in\mathbb{R}^4. \tag{A.2}$$
On these test functions define the scalar product
$$(f,g) = \sum_\sigma\int\big(\hat f_\sigma(p), \hat g_\sigma(p)\big)\,d\Omega^+_{\mu(\sigma)}(p),$$
where $\hat f$ is the Fourier transform of $f$, and $d\Omega^+_m$ is the Lorentz invariant measure on the positive energy mass hyperboloid such that for each continuous function $g$ with compact support
$$\int g(p)\,d\Omega^+_m(p) = \int g\big(\sqrt{\mathbf{p}^2+m^2},\ \mathbf{p}\big)\,\frac{d^3p}{2\sqrt{\mathbf{p}^2+m^2}}.$$
The generalized free scalar field $\phi$ is the map from our real test functions $f$ to selfadjoint operators $\phi(f)$ defined by the vacuum functional $\omega_0$ on the Weyl relations
$$e^{i\phi(f)}e^{i\phi(g)} = e^{-\frac{i}{2}\operatorname{Im}(f,g)}\,e^{i\phi(f+g)}, \qquad \omega_0(e^{i\phi(f)}) = e^{-\frac14(f,f)};$$
one may check
$$\operatorname{Im}(f,g) = \sum_\sigma\int\big(f_\sigma(x), (\Delta_{\mu(\sigma)}\star g_\sigma)(x)\big)\,d^4x,$$
where, in the convolution product, $\Delta_m$, the Fourier transform of $\frac{1}{2i}\big(d\Omega^+_m(p) - d\Omega^+_m(-p)\big)$, is the usual Lorentz invariant commutator kernel with support excluding all spacelike points.

The generalized free Dirac field $\psi$ (case $D(k) = -I$) will be defined on test functions $f$ on $\mathbb{R}^4$ with values in $K\otimes\mathbb{C}^4$, each with range in a finite dimensional $D$-invariant subspace, and such that $(\xi, f(\cdot))$ is $C^\infty$ with compact support for each $\xi\in K\otimes\mathbb{C}^4$; they carry actions of $G$ and of the covering group $\tilde{\mathcal{P}}$ of the Poincaré group^e given by
$${}^{g}f(x) = D(g)\otimes I\,f(x), \qquad g\in G,\ x\in\mathbb{R}^4, \tag{A.3}$$
$${}^{(a,A)}f(x) = I\otimes S(A)\,f(\Lambda(A)^{-1}(x-a)), \qquad (a,A)\in\tilde{\mathcal{P}},\ x\in\mathbb{R}^4, \tag{A.4}$$
where $S$ is the $(\tfrac12,\tfrac12)$ linear representation of $SL(2,\mathbb{C})$ on $\mathbb{C}^4$ acting on the associated Dirac $\gamma$-matrices^f by
$$S(A)\,\not{p}\,S(A)^{-1} = \not{p}', \qquad p' = \Lambda(A)p; \qquad \not{p} := \gamma_\mu p^\mu.$$
On these test functions define the positive semidefinite scalar product
$$(f,g) = \sum_\sigma\int\big(f_\sigma(x),\ i\gamma_0\,(-i\not{\partial} + \mu(\sigma))\,(\Delta_{\mu(\sigma)}\star g_\sigma)(x)\big)\,d^4x, \tag{A.5}$$
and let $(f,g)_+$ be its positive frequency part, obtained by replacing in (A.5) $\Delta_m$ by $\Delta^+_m$, the Fourier transform of $\frac{1}{2i}\,d\Omega^+_m$. The Dirac generalized free field $\psi$ is defined by the GNS representation of the pure gauge invariant quasi free state $\omega_0$ on the CAR algebra associated to (A.5):
$$\{\psi(f)^*, \psi(g)\} = (f,g)I, \qquad \omega_0(\psi(f)^*\psi(g)) = (f,g)_+.$$
In this way one may obtain functors $\Gamma_\pm$ whose tensor product combines, in the general case where $D(k)$ splits $K$ into $K_+\oplus K_-$, to give the desired functor $\Gamma$. In particular, since the actions (A.1, A.2, A.3, A.4) of $G$ and $\tilde{\mathcal{P}}$ on test functions induce automorphisms of the tensor product of the Weyl and CAR algebras on $K_+$ and $K_-$, leaving the (tensor product) vacuum state invariant, these actions are implemented by unitary strongly continuous representations $\mathcal{U}$, $\mathcal{V}$. The von Neumann algebra $\mathcal{F}(\mathcal{O})$, $\mathcal{O}\in\mathcal{K}$, is generated by $\exp i\phi(f)\otimes I$ and $I\otimes\psi(g)$, $f, g$ varying over the test functions associated respectively to $K_+$ and $K_-$ as above, with support in $\mathcal{O}$.

The case of generic spins and possibly continuous mass spectrum, not used in this paper, might be handled in a similar way.

^e i.e. the semi direct product of $SL(2,\mathbb{C})$ and the translations by the action $x\mapsto\Lambda(A)x$.

^f An explicit representation is e.g. given by
$$S(A) = \begin{pmatrix}A & 0\\ 0 & A^{*-1}\end{pmatrix}, \qquad \gamma_0 = \begin{pmatrix}0 & I\\ I & 0\end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix}0 & \boldsymbol{\sigma}\\ -\boldsymbol{\sigma} & 0\end{pmatrix},$$
where
$$\boldsymbol{\sigma} = \left(\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},\ \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},\ \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}\right)$$
are the Pauli matrices.
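The explicit representation quoted in footnote f can be sanity-checked numerically. The following minimal sketch builds $\gamma_0$ and $\boldsymbol{\gamma}$ from the Pauli matrices and measures how far they are from satisfying the Clifford relations $\{\gamma_\mu,\gamma_\nu\} = 2g_{\mu\nu}I$. The metric signature $(+,-,-,-)$ is an assumption made here, since the text does not state it, and the helper names are purely illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Pauli matrices, as listed in footnote f.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# gamma_0 = [[0, I], [I, 0]] and gamma_j = [[0, sigma_j], [-sigma_j, 0]].
gamma = [np.block([[Z2, I2], [I2, Z2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Assumed metric signature (+, -, -, -); not stated in the text.
metric = np.diag([1.0, -1.0, -1.0, -1.0])


def clifford_defect():
    """Largest deviation of {gamma_mu, gamma_nu} from 2 g_{mu nu} I."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            anti = gamma[m] @ gamma[n] + gamma[n] @ gamma[m]
            worst = max(worst, np.abs(anti - 2.0 * metric[m, n] * np.eye(4)).max())
    return worst


print(clifford_defect())
```

A vanishing defect only confirms the algebraic relations among the matrices; it does not test the covariance property of $S(A)$, which would require fixing a convention for $\Lambda(A)$.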
Acknowledgement The research is supported by MIUR and GNAMPA–INDAM.
References [1] Sergio Doplicher and John E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Commun. Math. Phys. 131 (1990) 51–107. [2] Sergio Doplicher, Abstract compact group duals, operator algebras and quantum field theory, in Proceedings of the International Congress of Mathematicians, Kyoto 1990, Springer 1991, pp. 1319–1333. [3] Sergio Doplicher, Quantum field theory, categories and duality, in Advances in Dynamical Systems and Quantum Physics, Proceedings of the Capri Conference 1993, eds. S. Albeverio et al., World Scientific 1995, pp. 106–110. [4] Sergio Doplicher and John E. Roberts, A new duality theory for compact groups, Invent. Math. 98 (1989) 157–218. [5] Sergio Doplicher and Roberto Longo, Standard and split inclusions of von Neumann algebras, Invent. Math. 75 (1984) 495–536. [6] Roberto Conti, Sergio Doplicher and John E. Roberts, Superselection theory for subsystems, Commun. Math. Phys. 218 (2000) 263–281. [7] Sergio Doplicher, Local aspects of superselection rules, Commun. Math. Phys. 85 (1982) 73–86. [8] Sergio Doplicher and Roberto Longo, Local aspects of superselection rules. II, Commun. Math. Phys. 88 (1983) 399–409. [9] Detlev Buchholz, Sergio Doplicher and Roberto Longo, On Noether’s theorem in quantum field theory, Ann. Phys. 170 (1986) 1–17. [10] Sebastiano Carpi, Quantum Noether theorem and conformal field theory: study of some models, Rev. Math. Phys. 11 (1999) 519–532. [11] John E. Roberts, Lectures on algebraic quantum field theory, in The Algebraic Theory of Superselection Sectors. Introduction and Recent Results, ed. Daniel Kastler, World Scientific Publ., Singapore, 1990, pp. 1–112, [12] Roberto Longo, private communication. [13] Rudolph Haag, Local Quantum Physics. Fields, Particles, Algebras, Springer–Verlag, 1996. [14] Huzihiro Araki, Mathematical Theory of Quantum Fields, Oxford University Press, 1999. [15] Sergio Doplicher, Rudolf Haag and John E. 
Roberts, Fields, observable and gauge transformations I, Commun. Math. Phys. 13 (1969) 1–23.
[16] Wulf Driessler, Duality and the absence of locally generated superselection sectors for CCR-type algebras, Commun. Math. Phys. 70 (1979) 213–220.
[17] Jürg Fröhlich, New superselection sectors ("soliton states") in two-dimensional Bose quantum field models, Commun. Math. Phys. 47 (1976) 269–310.
[18] Claudio D'Antoni and Roberto Longo, Interpolation by type I factors and the flip automorphism, J. Funct. Anal. 51 (1983) 361–371.
[19] Claudio D'Antoni, Sergio Doplicher, Klaus Fredenhagen and Roberto Longo, Convergence of local charges and continuity properties of W*-inclusions, Commun. Math. Phys. 110 (1987) 325–348.
[20] Stephen J. Summers, Normal product states for fermions and twisted duality for CCR- and CAR-type algebras with application to the Yukawa$_2$ quantum field model, Commun. Math. Phys. 86 (1982) 111–141.
[21] Daniel Kastler, Derek W. Robinson and J. A. Swieca, Conserved currents and associated symmetries; Goldstone's theorem, Commun. Math. Phys. 3 (1966) 108–120.
August 22, 2002 11:6 WSPC/148-RMP
00142
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 887–895
© World Scientific Publishing Company
DERIVATIVES WITH TWISTS
ARTHUR JAFFE∗ Boston University, 111 Cummington Street, Boston, MA 02115, USA
Received 22 May 2002
Revised 10 July 2002

We study derivatives on an interval of length $\ell$ (or the associated circle of the same length), and certain pseudo-differential operators that arise as their fractional powers. We compare different translations across the interval (around the circle) that are characterized by a twisting angle. These results have application in the study of twist quantum field theory.

Keywords: Twisting; twists; relative bound.
Consider the Hilbert space $\mathcal{K} = L^2([0,\ell]; dx)$ over the interval of length $\ell$. It is well known that the skew-symmetric operator of differentiation $D = \frac{d}{dx}$ with the domain of smooth, compactly-supported functions yields a one-parameter family of skew-adjoint extensions, parameterized by an angle $\chi$. Each extension has an orthonormal basis of eigenfunctions for $D$ given by
$$f_k(x) = \frac{1}{\sqrt{\ell}}\,e^{ikx}, \qquad \text{for } k\in K = \frac{2\pi}{\ell}\mathbb{Z} - \frac{\chi}{\ell}. \tag{1}$$
The angle $\chi$ specifies a twist, and (1) extends each $f_k$ to a smooth function on $\mathbb{R}$ satisfying
$$f_k(x+\ell) = e^{-i\chi}f_k(x). \tag{2}$$
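As a quick numerical illustration of (1) and (2), the sketch below samples a few of the eigenfunctions $f_k$, checks the twist condition at one point, and evaluates their Gram matrix on $[0,\ell]$. The values of $\ell$ and $\chi$, the truncation `n_max`, and all function names are arbitrary choices made for this example only.

```python
import numpy as np


def momenta(ell, chi, n_max):
    """Momentum set K = (2*pi/ell) Z - chi/ell, truncated to |n| <= n_max."""
    n = np.arange(-n_max, n_max + 1)
    return (2.0 * np.pi * n - chi) / ell


def f(k, x, ell):
    """Eigenfunction f_k(x) = exp(i k x) / sqrt(ell) of equation (1)."""
    return np.exp(1j * k * x) / np.sqrt(ell)


def twist_defect(ell, chi, n_max=3, x0=0.3):
    """Largest violation of f_k(x0 + ell) = exp(-i chi) f_k(x0) over the truncated K."""
    ks = momenta(ell, chi, n_max)
    lhs = f(ks, x0 + ell, ell)
    rhs = np.exp(-1j * chi) * f(ks, x0, ell)
    return np.max(np.abs(lhs - rhs))


def gram_matrix(ell, chi, n_max=3, n_grid=4096):
    """Gram matrix <f_k, f_k'> on [0, ell], computed with a uniform Riemann sum.

    The integrands exp(i (k' - k) x) are ell-periodic, so the uniform sum
    reproduces the integrals up to rounding error.
    """
    ks = momenta(ell, chi, n_max)
    x = np.linspace(0.0, ell, n_grid, endpoint=False)
    vals = f(ks[:, None], x[None, :], ell)          # shape (len(ks), n_grid)
    return (vals.conj() @ vals.T) * (ell / n_grid)


print(twist_defect(2.0, 0.7))
print(np.allclose(gram_matrix(2.0, 0.7), np.eye(7)))
```

Within a single twist the momentum differences are multiples of $2\pi/\ell$, which is why the Gram matrix is the identity; for two different twists the overlaps no longer vanish, and this is the situation studied in Sec. 3.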
1. Motivation

We described the twisted interval above in terms of pure mathematics; yet twisting plays several roles in physics. First, one often encounters parallel transport about a closed trajectory. The physical role of twisting includes the fact that the condition (2) ensures that angular momentum zero is not allowed, $0\notin K$. Hence twisting provides an infra-red regularization, which can be useful in the study of massless fields. In fact, this author has taken advantage of these properties in recent works, see [1, 2, 3] and other works cited there.

∗ On leave from Harvard University.
These investigations led to the genesis of the current paper, for in the detailed estimates one must compare different twists. This comparison can be carried out using the bounds that we establish here.

In a different direction, one is fascinated by the possibility to measure twisting in the laboratory. A team of British physicists has accomplished this recently, according to a report in Science [4]. One can complement photon helicity (which takes two values) with a measurement of angular momentum (which takes values in correspondence with the magnitude of the twist). In this way, one has the potential to revolutionize communications through the increase in the density of data transmission, for the possibility to measure twisting allows an individual photon to carry more information than the single bit associated with helicity.

2. Fractional Derivatives

Let $T = \{\ell,\chi\}$ denote the interval and the twist, and let $\mathcal{D}_T$ denote the finite linear span of the set of basis vectors $\{f_k\}$. Define $D_T$ as the closure of the derivative operator $\frac{d}{dx}$ defined on the domain $\mathcal{D}_T$. The vectors $f_k$ form an eigenbasis for $D_T$, so $-iD_T$ is essentially self adjoint, and the spectrum of its closure is $K$. Define the energy operator for a unit mass as
$$\mu_T = (I - D_T^2)^{1/2}, \tag{3}$$
which for large frequency Fourier modes is asymptotically equal to $|k|$. The large-frequency behavior of $\mu_T^\delta$ for $\delta > 0$ is similar to the absolute value of a fractional derivative of order $\delta$. We also use the function $\mu(k) = (1+k^2)^{1/2}$ defined in Fourier space.

3. Comparison of Derivatives

Consider a pair of twists $T$ and $T' = \{\ell,\chi'\}$ on a fixed interval, and the corresponding orthonormal bases $\{f_k = \frac{1}{\sqrt\ell}e^{ikx}\}$ and $\{g_{k'} = \frac{1}{\sqrt\ell}e^{ik'x}\}$, for $k\in K$ and for $k'\in K'$ respectively. Assume that
$$\chi\ne\chi' \pmod{2\pi}, \tag{4}$$
so the momentum sets are disjoint, $K\cap K' = \emptyset$.

Proposition 3.1 (Domains). The self-adjoint operator $\mu_T^\delta$ has the following properties:

(i) If $0\le\delta<\tfrac12$, then the domain of the operator $\mu_T^\delta$ contains $\mathcal{D}_{T'}$.
(ii) If $0\le\delta<1$, then the domain of the sesqui-linear form $\mu_T^\delta$ contains $\mathcal{D}_{T'}\times\mathcal{D}_{T'}$.
(iii) For $0\le\delta<1$, the matrix elements of $\mu_T^\delta$ in the basis $\{g_{k'}\}$ are
$$\langle g_{k_1'}, \mu_T^\delta\,g_{k_2'}\rangle = \frac{4}{\ell^2}\sin^2\!\Big(\frac{\chi-\chi'}{2}\Big)\sum_{k\in K}\frac{\mu(k)^\delta}{(k_1'-k)(k_2'-k)}, \qquad\text{where } k_1', k_2'\in K'. \tag{5}$$
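Proof. (i) It is sufficient to show that the domain of $\mu_T^\delta$ contains each $g_{k'}$, which we now demonstrate. There exists a unitary operator $U$ that relates the bases $f_k$ and $g_{k'}$. The matrix elements $U_{kk'} = \langle f_k, g_{k'}\rangle$ of $U$ satisfy
$$g_{k'} = \sum_{k\in K}U_{kk'}f_k \tag{6}$$
and direct computation yields
$$U_{kk'} = \frac{2e^{i(\chi-\chi')/2}}{\ell(k'-k)}\sin\!\Big(\frac{\chi-\chi'}{2}\Big) = \langle f_k, g_{k'}\rangle. \tag{7}$$
By definition each vector $f_k$ lies in the domain of $\mu_T^\delta$, and hence so does any finite linear combination of these basis vectors. Let $\Lambda<\infty$ denote a parameter and define an approximating sequence $g_{k',\Lambda}$ to $g_{k'}$ by
$$g_{k',\Lambda} = \sum_{\substack{k\in K\\ |k|\le\Lambda}}U_{kk'}f_k\in\mathcal{D}_T. \tag{8}$$
Clearly $\|g_{k'} - g_{k',\Lambda}\|\to0$ as $\Lambda\to\infty$. If in addition it is the case that $\mu_T^\delta g_{k',\Lambda}$ converges as $\Lambda\to\infty$, then $g_{k'}$ lies in the domain of the (self-adjoint) closure of $\mu_T^\delta$. Since $f_k$ is an eigenvector of $\mu_T^\delta$, we infer
$$\mu_T^\delta g_{k',\Lambda} = \mu_T^\delta\sum_{\substack{k\in K\\ |k|\le\Lambda}}U_{kk'}f_k = \sum_{\substack{k\in K\\ |k|\le\Lambda}}U_{kk'}\mu(k)^\delta f_k, \tag{9}$$
so that
$$\|\mu_T^\delta g_{k',\Lambda}\|^2 = \sum_{\substack{k\in K\\ |k|\le\Lambda}}|\mu(k)^\delta U_{kk'}|^2 = \sum_{\substack{k\in K\\ |k|\le\Lambda}}\frac{4\mu(k)^{2\delta}}{\ell^2(k-k')^2}\sin^2\!\Big(\frac{\chi-\chi'}{2}\Big). \tag{10}$$
This sum over $k$ is finite, and for $\delta<\tfrac12$ the bound on the sum is uniform in $\Lambda$. Furthermore, for $\Lambda<\Lambda'$,
$$\mu_T^\delta(g_{k',\Lambda'} - g_{k',\Lambda}) = \sum_{\substack{k\in K\\ \Lambda<|k|\le\Lambda'}}U_{kk'}\mu(k)^\delta f_k, \tag{11}$$
from which one infers
$$\|\mu_T^\delta(g_{k',\Lambda'} - g_{k',\Lambda})\|^2 = \sum_{\substack{k\in K\\ \Lambda<|k|\le\Lambda'}}|\mu(k)^\delta U_{kk'}|^2 = \sum_{\substack{k\in K\\ \Lambda<|k|\le\Lambda'}}\frac{4\mu(k)^{2\delta}}{\ell^2(k-k')^2}\sin^2\!\Big(\frac{\chi-\chi'}{2}\Big). \tag{12}$$
Since $2\delta<1$, the terms decay like $|k|^{2\delta-2}$ with $2\delta-2<-1$, so the sum on the right of (12) converges and
$$\|\mu_T^\delta(g_{k',\Lambda'} - g_{k',\Lambda})\|\le o(1), \tag{13}$$
as $\Lambda\to\infty$. Thus $\mu_T^\delta g_{k',\Lambda}$ converges to a limit as $\Lambda,\Lambda'\to\infty$, completing the proof of (i).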
(ii) This follows immediately from (i), since the form domain of $\mu_T^\delta$ is the operator domain of $\mu_T^{\delta/2}$ and $\delta/2<\tfrac12$.

(iii) Take the inner product of $g_{k_1'}$ with the representation (6) and use (7) to obtain
$$\langle g_{k_1'}, \mu_T^\delta g_{k_2'}\rangle = \sum_{k\in K}U_{kk_2'}\,\mu(k)^\delta\,\langle g_{k_1'}, f_k\rangle = \sum_{k\in K}\langle g_{k_1'}, f_k\rangle\,\mu(k)^\delta\,\langle f_k, g_{k_2'}\rangle. \tag{14}$$
Substituting the values in (7) for the matrix elements of $U$ yields (5) and completes the proof.

Remark. We give here a second derivation of the identity (5) in the case that $\delta = 0$. Let $a, b$ be non-integers and consider the convergent sum, which for $a\ne b$ equals
$$F(a,b) = \sum_{n\in\mathbb{Z}}\frac{1}{(n+a)(n+b)} = \frac{\pi\sin(\pi(b-a))}{(b-a)\sin(\pi a)\sin(\pi b)}. \tag{15}$$
One can obtain the value above by considering the contour integral of a meromorphic function,
$$\int_{C_n}\pi\cot(\pi z)\,\frac{1}{(z+a)(z+b)}\,dz, \tag{16}$$
taken on a sequence of circular contours $C_n$, centered at the origin and of radius $n+\tfrac12$, where $n\in\mathbb{Z}_+$. For $a\ne b$ the singularities of the integrand are simple poles. Assuming also that $n>|a|,|b|$, the contour $C_n$ encloses $2n+3$ poles of the integrand, as follows. The function $\pi\cot(\pi z)$ has a pole at each integer, with residue 1, and $C_n$ encloses $2n+1$ of these poles. The other two poles occur at $z = -a, -b$. On the contour $C_n$ of length $O(n)$ the function $\pi\cot(\pi z)$ is bounded uniformly in $n$, and the function $1/|(z+a)(z+b)|$ tends to zero as $O(1/n^2)$ as $n\to\infty$. Therefore the magnitude of the integrals (16) converges to zero as $O(1/n)$. Using the Cauchy integral theorem we infer that the sum of the residues of the integrand vanishes in the limit. Using the addition law for sines, this yields in the $n\to\infty$ limit,
$$F(a,b) = \frac{\pi\cot(\pi a)}{b-a} - \frac{\pi\cot(\pi b)}{b-a} = \frac{\pi\sin(\pi(b-a))}{(b-a)\sin(\pi a)\sin(\pi b)}. \tag{17}$$
Letting $b\to a$ gives $F(a,a) = \sum_{n\in\mathbb{Z}}(n+a)^{-2} = \pi^2/\sin^2(\pi a)$. Thus in case $b-a$ is integer,
$$\pi^{-2}\sin^2(\pi a)\,F(a,b) = \begin{cases}1 & \text{if } a = b,\\ 0 & \text{if } 0\ne b-a\in\mathbb{Z}.\end{cases} \tag{18}$$
Parameterize the sum (5) in the case $\delta = 0$ as follows: take $\ell k_j' = 2\pi n_j' - \chi'$ and $\ell k = 2\pi n - \chi$ with $n, n_j'\in\mathbb{Z}$; take $a = \frac{\chi'-\chi}{2\pi}$ and $b = \frac{\chi'-\chi}{2\pi} + n_1' - n_2'$. Then $a, b$ are non-integer, while $b-a$ is integer. In terms of these variables, (5) has the form
$$\langle g_{k_1'}, g_{k_2'}\rangle = \pi^{-2}\sin^2(\pi a)\,F(a,b), \tag{19}$$
which by (18) equals $\delta_{k_1'k_2'}$.
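The identities of the Remark lend themselves to a direct numerical check. The sketch below evaluates truncated versions of the sum (5) at $\delta = 0$, using the closed form (7) for the overlaps, and compares a partial sum of (15) with its closed form. The truncation `n_max`, the sample values of $\ell$, $\chi$, $\chi'$, $a$, $b$, and the function names are assumptions of this illustration; the symmetric truncation leaves an error of order `1/n_max`.

```python
import numpy as np


def overlap(k, kp, ell, chi, chi_p):
    """Closed form (7) for U_{kk'} = <f_k, g_{k'}>."""
    half = 0.5 * (chi - chi_p)
    return 2.0 * np.exp(1j * half) * np.sin(half) / (ell * (kp - k))


def gram_of_twisted_basis(ell, chi, chi_p, n1, n2, n_max=200_000):
    """Truncated sum (5) at delta = 0: sum over k of conj(U_{k k1'}) U_{k k2'}."""
    n = np.arange(-n_max, n_max + 1)
    k = (2.0 * np.pi * n - chi) / ell
    k1 = (2.0 * np.pi * n1 - chi_p) / ell
    k2 = (2.0 * np.pi * n2 - chi_p) / ell
    u1 = overlap(k, k1, ell, chi, chi_p)
    u2 = overlap(k, k2, ell, chi, chi_p)
    return np.sum(np.conj(u1) * u2)


def F_partial(a, b, n_max=200_000):
    """Symmetric partial sum of the series in (15)."""
    n = np.arange(-n_max, n_max + 1)
    return np.sum(1.0 / ((n + a) * (n + b)))


def F_closed(a, b):
    """Closed form on the right-hand side of (15), valid for a != b."""
    return np.pi * np.sin(np.pi * (b - a)) / ((b - a) * np.sin(np.pi * a) * np.sin(np.pi * b))


print(abs(gram_of_twisted_basis(1.0, 0.4, 1.9, 0, 0)))   # close to 1
print(abs(gram_of_twisted_basis(1.0, 0.4, 1.9, 0, 3)))   # close to 0
print(F_partial(0.3, 0.75), F_closed(0.3, 0.75))
```

The first two printed numbers correspond to the two cases of (18); the last line compares both sides of (15) for a non-integer difference $b-a$.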
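Proposition 3.2 (Relative Bound). Let $0\le\delta<\tfrac12$ and $\delta<\tfrac12\delta'$. Then $\mu_T^\delta\mu_{T'}^{-\delta'}$ is a bounded transformation with norm $M$,
$$\|\mu_T^\delta\mu_{T'}^{-\delta'}\|\le M, \tag{20}$$
where $M = M(\delta,\delta',\ell)$ can be chosen independently of $\chi,\chi'$.

Definition 3.3 ($\ell_{1,\infty}$ Norm). Consider an orthonormal basis $\mathcal{B} = \{e_i\}$ for the Hilbert space $\mathcal{H}$ and a closed linear transformation $X$ with domain containing $\mathcal{B}$ as a core. Let $X_{ij} = \langle e_i, Xe_j\rangle$ denote the matrix elements of $X$ in the basis. Define the $\ell_{1,\infty}$ norm of $X$ with respect to the basis $\mathcal{B}$ as
$$\|X\|_{\mathcal{B}_{1,\infty}} = \Big(\sup_i\sum_j|X_{ij}|\ \sup_{j'}\sum_{i'}|X_{i'j'}|\Big)^{1/2}. \tag{21}$$
In case $X$ is self-adjoint or skew-adjoint, the $\ell_{1,\infty}$ norm reduces to
$$\|X\|_{\mathcal{B}_{1,\infty}} = \sup_i\sum_j|X_{ij}|. \tag{22}$$

Lemma 3.4 ($\ell_{1,\infty}$ Estimate). The $\ell_{1,\infty}$ norm given in Definition 3.3 dominates the operator norm $\|X\|$ of $X$,
$$\|X\|\le\|X\|_{\mathcal{B}_{1,\infty}}. \tag{23}$$

Proof. Let $f = \sum_if_ie_i$ and $g = \sum_ig_ie_i$ be unit vectors. Consider $\langle f, Xg\rangle = \sum_{ij}\bar f_iX_{ij}g_j$. Thus the Schwarz inequality yields
$$\begin{aligned}
|\langle f, Xg\rangle| &\le \sum_{ij}|f_iX_{ij}g_j| \\
&\le \Big(\sum_{ij}|f_i|^2|X_{ij}|\Big)^{1/2}\Big(\sum_{ij}|X_{ij}||g_j|^2\Big)^{1/2} \\
&\le \Big(\sum_i|f_i|^2\Big)^{1/2}\Big(\sup_i\sum_j|X_{ij}|\Big)^{1/2}\Big(\sup_j\sum_i|X_{ij}|\Big)^{1/2}\Big(\sum_j|g_j|^2\Big)^{1/2} \\
&= \|X\|_{\mathcal{B}_{1,\infty}},
\end{aligned} \tag{24}$$
from which the claim follows.

Lemma 3.5. Let $\chi,\chi'\in(0,\pi)$. Then there exists a constant $J = J(\ell)<\infty$ such that
$$\sup_{\chi,\chi'}\ \sup_{\substack{k\in K\\ k'\in K'}}\ \frac{\mu(k'-k)}{\ell|k'-k|}\,\sin\frac{|\chi-\chi'|}{2}\le J. \tag{25}$$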
Proof. The momentum difference in the denominator has the value $\ell(k'-k) = 2\pi n+\chi-\chi'$ for $n\in\mathbb{Z}$. If $n\ne0$, then $\ell|k'-k|\ge\pi$. In this case, $\mu(k'-k)/\ell|k'-k|$ is uniformly bounded, as long as $\ell$ is bounded away from zero. On the other hand, if $n = 0$, then
$$\frac{\mu(k'-k)}{\ell|k'-k|}\,\sin\frac{|\chi-\chi'|}{2} = \frac{\mu((\chi-\chi')/\ell)}{|\chi'-\chi|}\,\sin\frac{|\chi-\chi'|}{2}. \tag{26}$$
Since $|\sin x/x|$ is bounded, it follows that (26) is bounded uniformly in $\chi,\chi'$ for fixed $\ell$.

Lemma 3.6. Let $\alpha,\beta\in\mathbb{R}$ with $\alpha+\beta>1$, and define $\gamma = \min\{\alpha,\beta,\alpha+\beta-1\}$. Then there are constants $0<M_\pm = M_\pm(\alpha,\beta)<\infty$ such that
$$F(k) = \frac{1}{\ell}\sum_{p\in K}\mu(k-p)^{-\alpha}\mu(p)^{-\beta} \tag{27}$$
satisfies the upper bound
$$F(k)\le\begin{cases}M_+\,\mu(k)^{-\gamma}, & \text{if } \alpha,\beta\ne1,\\ M_+\,\mu(k)^{-\gamma}\ln(1+\mu(k)), & \text{if } \alpha = 1 \text{ or } \beta = 1,\end{cases} \tag{28}$$
and the lower bound
$$F(k)\ge\begin{cases}M_-\,\mu(k)^{-\gamma}\ln(1+\mu(k)), & \text{if } \alpha = 1,\ \beta\le1, \text{ or if } \beta = 1,\ \alpha\le1,\\ M_-\,\mu(k)^{-\gamma}, & \text{otherwise}.\end{cases} \tag{29}$$
Furthermore the same bounds hold if the sum ranges over $p\in K'$ in place of $p\in K$.

Proof. The proofs for the lattices $K$ and $K'$ are the same. Furthermore $\mu(p)$ is even, so it is no loss of generality to assume that $\beta\ge0$. Divide the $p$ sum into three disjoint regions:
$$I = \Big\{p : |p|\le\tfrac12|k|\Big\}, \qquad II = \Big\{p : \tfrac12|k|<|p|<2|k|\Big\}, \qquad\text{and}\qquad III = \{p : 2|k|\le|p|\}, \tag{30}$$
and denote the corresponding sums $F_I$, etc.

First we prove the upper bounds. In region I, it is the case that $\mu(k-p)\le\mu(3k/2)\le\mathrm{const.}\,\mu(k)$, so $\mu(k-p)^{|\alpha|}\le\mathrm{const.}\,\mu(k)^{|\alpha|}$. Also $\mu(k-p)\ge\mu(k/2)\ge\mathrm{const.}\,\mu(k)$, so $\mu(k-p)^{-|\alpha|}\le\mathrm{const.}\,\mu(k)^{-|\alpha|}$. Therefore for either sign of $\alpha$, there is a constant $\tilde M_1$ such that $\mu(k-p)^{-\alpha}\le\tilde M_1\mu(k)^{-\alpha}$, and
$$F_I(k)\le\tilde M_1\mu(k)^{-\alpha}\,\frac1\ell\sum_{p\in I}\mu(p)^{-\beta}\le M_1\mu(k)^{-\alpha}\begin{cases}1, & \text{if } \beta>1\\ \ln(1+\mu(k)), & \text{if } \beta = 1\\ \mu(k)^{-\beta+1}, & \text{if } \beta<1\end{cases}\ \le M_1\mu(k)^{-\gamma}\begin{cases}1, & \text{if } \beta\ne1\\ \ln(1+\mu(k)), & \text{if } \beta = 1.\end{cases} \tag{31}$$
Similarly, in region II use the bound $\mu(p)\ge\mathrm{const.}\,\mu(k)$, as well as the bound $\mu(p)\le\mathrm{const.}\,\mu(k)$, to obtain $\mu(p)^{-\beta}\le\tilde M_2\mu(k)^{-\beta}$. Therefore
$$F_{II}(k)\le\tilde M_2\mu(k)^{-\beta}\,\frac1\ell\sum_{p\in II}\mu(k-p)^{-\alpha}\le M_2\mu(k)^{-\beta}\begin{cases}1, & \text{if } \alpha>1\\ \ln(1+\mu(k)), & \text{if } \alpha = 1\\ \mu(k)^{-\alpha+1}, & \text{if } \alpha<1\end{cases}\ \le M_2\mu(k)^{-\gamma}\begin{cases}1, & \text{if } \alpha\ne1\\ \ln(1+\mu(k)), & \text{if } \alpha = 1.\end{cases} \tag{32}$$
Finally, in region III, it is the case that $\mu(k-p)\ge\mathrm{const.}\,\mu(p)$ and also $\mu(k-p)\le\mu(3p/2)\le\mathrm{const.}\,\mu(p)$. Thus
$$F_{III}(k)\le\tilde M_3\,\frac1\ell\sum_{p\in III}\mu(p)^{-\alpha-\beta}\le M_3\mu(k)^{-\alpha-\beta+1}\le M_3\mu(k)^{-\gamma}. \tag{33}$$
In all three cases, and hence for the union of the regions, there is a constant $M_+$ such that $F(k)$ satisfies the upper bound
$$F(k)\le M_+\mu(k)^{-\gamma}\begin{cases}1, & \text{if } \alpha,\beta\ne1\\ \ln(1+\mu(k)), & \text{if } \alpha = 1 \text{ or } \beta = 1.\end{cases} \tag{34}$$

In order to obtain a lower bound, use the above inequalities in the opposite direction. It is convenient to assume that $|k|$ is sufficiently large so that the sets I and II are both non-empty. On the set of $|k|$ too small to achieve this, direct inspection shows that any contribution to $F(k)$ from region III is bounded below by a strictly positive constant. Then observe that in region I, it is the case that $\mu(k-p)^{-|\alpha|}\ge\mathrm{const.}\,\mu(k)^{-|\alpha|}$. Also in region I one has $\mu(k-p)^{|\alpha|}\ge\mathrm{const.}\,\mu(k)^{|\alpha|}$. Thus regardless of the sign of $\alpha$ there is a new constant $\tilde M_1$ such that $\mu(k-p)^{-\alpha}\ge\tilde M_1\mu(k)^{-\alpha}$. We infer that
$$F_I(k)\ge\tilde M_1\mu(k)^{-\alpha}\,\frac1\ell\sum_{p\in I}\mu(p)^{-\beta}\ge M_1\mu(k)^{-\alpha}\begin{cases}1, & \text{if } \beta>1\\ \ln(1+\mu(k)), & \text{if } \beta = 1\\ \mu(k)^{-\beta+1}, & \text{if } \beta<1.\end{cases} \tag{35}$$
Similarly in region II we use the bound $\mu(p)\le\mathrm{const.}\,\mu(k)$ and the assumption $\beta\ge0$ to obtain $\mu(p)^{-\beta}\ge\tilde M_2\mu(k)^{-\beta}$. Therefore
$$F_{II}(k)\ge\tilde M_2\mu(k)^{-\beta}\,\frac1\ell\sum_{p\in II}\mu(k-p)^{-\alpha}\ge M_2\mu(k)^{-\beta}\begin{cases}1, & \text{if } \alpha>1\\ \ln(1+\mu(k)), & \text{if } \alpha = 1\\ \mu(k)^{-\alpha+1}, & \text{if } \alpha<1.\end{cases} \tag{36}$$
Taking the greater lower bound from (35) with (36) yields the lower bound on $F(k)$ in (29) and completes the proof.
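The two-sided bound of Lemma 3.6 can be observed numerically in a simple case. The sketch below evaluates a truncated version of the sum (27) for $\alpha = \beta = 2$, where $\gamma = 2$ and no logarithm appears, and prints the range of $F(k)\mu(k)^\gamma$ over a grid of $k$. The parameter values, the truncation `n_max`, and the helper names are choices made for this illustration and are not taken from the text.

```python
import numpy as np


def mu(k):
    """mu(k) = (1 + k^2)^(1/2)."""
    return np.sqrt(1.0 + k ** 2)


def F(k, alpha, beta, ell=1.0, chi=1.0, n_max=2000):
    """Truncated sum (27) over p in K = (2*pi*n - chi)/ell, for an array of k values."""
    n = np.arange(-n_max, n_max + 1)
    p = (2.0 * np.pi * n - chi) / ell
    k = np.atleast_1d(np.asarray(k, dtype=float))
    terms = mu(k[:, None] - p[None, :]) ** (-alpha) * mu(p[None, :]) ** (-beta)
    return terms.sum(axis=1) / ell


alpha, beta = 2.0, 2.0
gamma = min(alpha, beta, alpha + beta - 1.0)

ks = np.linspace(0.0, 300.0, 601)
ratio = F(ks, alpha, beta) * mu(ks) ** gamma

# If (28) and (29) hold, the ratio stays between two positive constants.
print(ratio.min(), ratio.max())
```

A bounded ratio on a finite grid is of course only consistent with the lemma, not a proof of it; the borderline cases $\alpha = 1$ or $\beta = 1$ would additionally require dividing by $\ln(1+\mu(k))$.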
Proof of Proposition 3.2. Let $T = \mu_T^\delta\mu_{T'}^{-\delta'}$. The matrix elements for $T^*T$ in the $\{g_{k'}\}$ basis are
$$\langle g_{k_1'}, T^*T\,g_{k_2'}\rangle = \langle g_{k_1'}, \mu_T^{2\delta}g_{k_2'}\rangle\,\mu(k_1')^{-\delta'}\mu(k_2')^{-\delta'} = \frac{4}{\ell^2}\sin^2\!\Big(\frac{\chi-\chi'}{2}\Big)\sum_{k\in K}\frac{\mu(k_1')^{-\delta'}\mu(k)^{2\delta}\mu(k_2')^{-\delta'}}{(k_1'-k)(k_2'-k)}. \tag{37}$$
The $\ell_{1,\infty}$ estimate of the operator norm of this self-adjoint matrix yields
$$\|\mu_T^\delta\mu_{T'}^{-\delta'}\|^2 = \|T\|^2 = \|T^*T\|\le\|T^*T\|_{\mathcal{B}_{1,\infty}}.$$
Therefore
$$\|\mu_T^\delta\mu_{T'}^{-\delta'}\|^2\le\sup_{k_1'\in K'}\sum_{k_2'\in K'}|\langle g_{k_1'}, T^*T\,g_{k_2'}\rangle| \tag{38}$$
$$\le\frac{4}{\ell^2}\sin^2\!\Big(\frac{\chi-\chi'}{2}\Big)\sup_{k_1'\in K'}\sum_{\substack{k\in K\\ k_2'\in K'}}\frac{\mu(k_1')^{-\delta'}\mu(k)^{2\delta}\mu(k_2')^{-\delta'}}{|(k_1'-k)(k-k_2')|}. \tag{39}$$
Apply Lemma 3.5 to obtain the twist-independent bound
$$\|\mu_T^\delta\mu_{T'}^{-\delta'}\|^2\le4J^2\sup_{k_1'\in K'}\mu(k_1')^{-\delta'}\sum_{\substack{k\in K\\ k_2'\in K'}}\frac{\mu(k)^{2\delta}\mu(k_2')^{-\delta'}}{\mu(k_1'-k)\,\mu(k-k_2')}. \tag{40}$$
This bound is a sum of positive terms, so if it is convergent, it must be summable in any order. Apply Lemma 3.6 to the $k_2'$ sum to obtain
$$\|\mu_T^\delta\mu_{T'}^{-\delta'}\|^2\le4J^2\ell M(1,\delta')\sup_{k_1'\in K'}\mu(k_1')^{-\delta'}\Big(\sum_{k\in K}\frac{\mu(k)^{2\delta-\delta'}\ln(1+\mu(k))}{\mu(k_1'-k)}\Big). \tag{41}$$
Since $2\delta-\delta'<0$, we infer that $\mu(k)^{2\delta-\delta'}\ln(1+\mu(k))\le M_1\mu(k)^{-\epsilon}$ for a constant $M_1<\infty$ and any $\epsilon\in(0,\delta'-2\delta)$. Thus apply Lemma 3.6 to the remaining $k$ sum to obtain
$$\begin{aligned}
\|\mu_T^\delta\mu_{T'}^{-\delta'}\|^2 &\le 4J^2\ell M(1,\delta')\sup_{k_1'\in K'}\mu(k_1')^{-\delta'}\Big(\sum_{k\in K}\frac{\mu(k)^{-\epsilon}}{\mu(k_1'-k)}\Big) \\
&\le 4J^2\ell^2M(1,\delta')M(1,\epsilon)\sup_{k_1'\in K'}\mu(k_1')^{-\delta'-\epsilon}\ln(1+\mu(k_1'))<\infty.
\end{aligned} \tag{42}$$
This bound does not involve the angles $\chi,\chi'$, so the estimate is uniform in these parameters. Renaming the final bound to be the constant $M^2 = M(\delta,\delta',\ell)^2$ completes the proof.
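The mechanism of the proof can be illustrated on finite truncations. The sketch below assembles the matrix $A_{kk'} = \mu(k)^\delta U_{kk'}\mu(k')^{-\delta'}$ of $\mu_T^\delta\mu_{T'}^{-\delta'}$ between the bases $\{g_{k'}\}$ and $\{f_k\}$, using the closed form (7), and compares its largest singular value with the $\ell_{1,\infty}$ bound (21) of Lemma 3.4. Note that the proof applies the estimate to $T^*T$, whereas this example applies (21) to the truncated matrix itself; the exponents, twists, truncation `n_max`, and function names are all assumptions of the illustration.

```python
import numpy as np


def mu(k):
    """mu(k) = (1 + k^2)^(1/2)."""
    return np.sqrt(1.0 + k ** 2)


def truncated_matrix(delta, delta_p, ell, chi, chi_p, n_max=150):
    """A[k, k'] = mu(k)^delta * U_{kk'} * mu(k')^(-delta') for |n|, |n'| <= n_max."""
    n = np.arange(-n_max, n_max + 1)
    k = (2.0 * np.pi * n - chi) / ell
    kp = (2.0 * np.pi * n - chi_p) / ell
    half = 0.5 * (chi - chi_p)
    U = 2.0 * np.exp(1j * half) * np.sin(half) / (ell * (kp[None, :] - k[:, None]))
    return mu(k)[:, None] ** delta * U * mu(kp)[None, :] ** (-delta_p)


def operator_norm(A):
    """Largest singular value of the finite matrix A."""
    return np.linalg.norm(A, 2)


def schur_bound(A):
    """The l_{1,infinity} norm (21): sqrt(max row sum * max column sum) of |A|."""
    absA = np.abs(A)
    return np.sqrt(absA.sum(axis=1).max() * absA.sum(axis=0).max())


# delta < 1/2 and delta < delta'/2, as required in Proposition 3.2.
for chi, chi_p in [(0.5, 2.5), (1.0, 1.2), (0.1, 3.0)]:
    A = truncated_matrix(0.3, 0.8, 1.0, chi, chi_p)
    print(chi, chi_p, operator_norm(A), schur_bound(A))
```

The norm of a truncation only bounds the full operator norm from below, so the printed values indicate, rather than establish, that the norm stays of comparable size as the twists vary.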
Acknowledgment

I presented this material in preliminary form as part of a course of lectures. I am grateful to Jon Tyson for suggesting the use of Lemma 3.5 to simplify my proof of Proposition 3.2.

References

[1] Arthur Jaffe, Twist fields, the elliptic genus, and hidden symmetry, Proc. Nat. Acad. Sci. 97 (2000) 1418–1422.
[2] Arthur Jaffe, The elliptic genus and hidden symmetry, Commun. Math. Phys. 219 (2001) 89–124.
[3] Arthur Jaffe, Twists, supersymmetry, and field theory, in preparation.
[4] Andrew Watson, New twist could pack photons with data, Science 296 (2002) 2316–2317.
August 22, 2002 11:14 WSPC/148-RMP
00141
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 897–903
© World Scientific Publishing Company
REMARKS ON TIME-ENERGY UNCERTAINTY RELATIONS
ROMEO BRUNETTI and KLAUS FREDENHAGEN
II Inst. f. Theoretische Physik, Universität Hamburg, 149 Luruper Chaussee, D-22761 Hamburg, Germany

Received 13 May 2002
Revised 8 July 2002

Dedicated to Huzihiro Araki on the occasion of his seventieth birthday

Using a recent construction of observables characterizing the time of occurrence of an effect in quantum theory, we present a rigorous derivation of the standard time-energy uncertainty relation. In addition, we prove an uncertainty relation for time measurements alone.

Keywords: Quantum mechanics; uncertainty relations; positive operator valued measures.

1. Introduction

Time-energy uncertainty relations played an important rôle in the early discussions on the physical interpretation of quantum theory [5]. But contrary to the position-momentum uncertainty relation, their derivation and even precise formulation suffer from the difficulty of assigning an observable (in the sense of selfadjoint operators) of quantum theory with the measurement of time [10, 8]. Meanwhile, as advocated long ago by Ludwig [7], it is widely accepted that the concept of observables should be generalized, by allowing not only selfadjoint operators (corresponding to projection valued measures) but also positive operator valued measures [9], and it was quickly realized that one can find such measures which transform covariantly under time translations and fulfil therefore all formal requirements for a time observable [3, 6, 13, 15].

In a recent paper [2] we gave the first (to our knowledge) general construction of such measures starting from an arbitrary positive operator which may be interpreted as the effect whose occurrence time is described by the measure; this notion of time is closely related to the concept of time of arrival, but in contrast to the latter our construction always leads to positive operator valued measures (cf. the book in [3]). All that is done in a strict quantum language, no classical ideas or generalizations of quantum mechanics are involved (we stress that the construction is valid also in quantum field theory, a subject which we plan to deal with in the future). In the present paper we show that for these time observables the usual
time-energy uncertainty relation holds, and that in addition, provided the Hamiltonian is positive, one finds an uncertainty relation for time measurements alone which takes the form
$$\Delta T\ge\frac{\mathrm{const}}{\langle H\rangle}$$
with a universal constant. Arguments for these relations have been given by several authors [1, 3, 11, 13, 16], mainly in special situations and often only on a heuristic level. It is the aim of the present note to show that these relations can be rigorously derived for an arbitrary time translation covariant positive operator valued measure.

2. Covariant Naimark–Stinespring Dilation

Let $\mathcal{H}$ be a Hilbert space and $(U(t))$ a strongly continuous unitary group on $\mathcal{H}$ describing the time evolution. The occurrence time of an effect is, following [2], described by a covariant positive operator valued measure $F$, i.e. for any Borel subset $B$ of the real line $\mathbb{R}$ we have a positive contraction $F(B)$ on $\mathcal{H}$, and the operators $F(B)$ satisfy the conditions
$$F\Big(\bigcup_nB_n\Big) = \sum_nF(B_n)$$
if the sets $B_n$ are pairwise disjoint and where the sum on the r.h.s. converges strongly,
$$F(\mathbb{R}) = 1, \qquad F(\emptyset) = 0,$$
$$U(t)F(B)U(-t) = F(B+t).$$
We want to construct from these data an extended Hilbert space $\mathcal{K}$ with a projection $P$ onto the subspace $\mathcal{H}$, a unitary group $(V(t))$ on $\mathcal{K}$ which reduces on $\mathcal{H}$ to the original time translation $(U(t))$ and a covariant projection valued measure $E$ on $\mathcal{K}$ such that
$$PE(B)P = F(B)P.$$
Let $\mathcal{K}_0$ be the space of bounded piecewise continuous functions on $\mathbb{R}$ with values in $\mathcal{H}$. On this space we introduce the positive semidefinite scalar product
$$\langle\Phi,\Psi\rangle = \int(\Phi(t), F(dt)\Psi(t)).$$
$\mathcal{H}$ can be isometrically embedded into $\mathcal{K}_0$ by identifying the elements of $\mathcal{H}$ with constant functions. The enlarged Hilbert space $\mathcal{K}$ is then defined as the completion of the quotient of $\mathcal{K}_0$ by the null space of the scalar product. On $\mathcal{K}_0$ we define
$$(E(B)\Phi)(t) = \begin{cases}\Phi(t), & t\in B\\ 0, & t\notin B,\end{cases}$$
\[
(V(t)\Phi)(s) = U(t)\Phi(s - t), \qquad P\Phi = \int F(dt)\,\Phi(t).
\]
$E(B)$ and $V(t)$ map the null space into itself, and $P$ annihilates it. Therefore they are well defined operators on $\mathcal{K}$ and form the desired covariant dilation. In particular, $(E, V)$ is a system of imprimitivity over $\mathbb{R}$ [14] and therefore unitarily equivalent to a multiple of the Schrödinger representation.

3. The General Time-Energy Uncertainty Relation

Using the covariant dilation described in the previous section we can use the standard uncertainty relation in the Schrödinger representation on $L^2(\mathbb{R})$,
\[
\Delta_\Phi(x)\,\Delta_\Phi\Bigl(\frac{1}{i}\frac{d}{dx}\Bigr) \ge \frac{1}{2},
\]
which holds for all wave functions $\Phi$ in the intersection of the domains of $x$ and $\frac{1}{i}\frac{d}{dx}$. This follows from the validity of the canonical commutation relations in the sense of quadratic forms,
\[
\Bigl(x\Phi, \frac{1}{i}\frac{d}{dx}\Psi\Bigr) - \Bigl(\frac{1}{i}\frac{d}{dx}\Phi, x\Psi\Bigr) = i(\Phi, \Psi),
\]
which may be derived from the fact that $\frac{1}{i}\frac{d}{dx}$ is the generator of translations $U(a)$, and that, by Stone's Theorem [12], $a \mapsto U(a)\Phi$ is strongly differentiable for $\Phi$ in the domain of $\frac{1}{i}\frac{d}{dx}$. Namely, we have
\begin{align*}
\Bigl(x\Phi, \frac{1}{i}\frac{d}{dx}\Psi\Bigr) - \Bigl(\frac{1}{i}\frac{d}{dx}\Phi, x\Psi\Bigr)
&= \frac{1}{i}\frac{d}{da}\bigl((x\Phi, U(a)\Psi) - (U(-a)\Phi, x\Psi)\bigr)\Big|_{a=0} \\
&= \frac{1}{i}\frac{d}{da}\bigl((x\Phi, U(a)\Psi) - (\Phi, (x + a)U(a)\Psi)\bigr)\Big|_{a=0} \\
&= \frac{1}{i}\frac{d}{da}\,(-a)(\Phi, U(a)\Psi)\Big|_{a=0} = i(\Phi, \Psi).
\end{align*}
We can now state the general time-energy uncertainty relation:

Theorem 3.1. Let $F$ be a time translation covariant positive operator valued measure, and let $H$ denote the Hamiltonian. Let $\Phi$ be a unit vector in the domain of the Hamiltonian for which the second moment of the probability measure $d\mu(t) = (\Phi, F(dt)\Phi)$ is finite. Then we have the uncertainty relation
\[
\Delta_\Phi(T_F)\,\Delta_\Phi(H) \ge \frac{1}{2}
\]
where $\Delta_\Phi(T_F)$ is the square root of the variance of $\mu$ and $\Delta_\Phi(H) = (\|H\Phi\|^2 - (\Phi, H\Phi)^2)^{1/2}$ is the usual energy uncertainty.

Proof. We use the covariant dilation described in Sec. 2. For $\Phi \in \mathcal{H}$ we have
\[
(\Phi, E(B)\Phi) = (\Phi, F(B)\Phi),
\]
hence $\Phi$ is in the domain of definition of the selfadjoint operator $T_E$ defined by the projection valued measure $E$. Moreover, since the dilated time translations $V(t)$ restrict on $\mathcal{H}$ to the original time translations, $\Phi$ is also in the domain of the generator $K$ of $V$. But $K$ and $T_E$ satisfy the canonical commutation relation in the sense of quadratic forms, thus $\Phi$ fulfils the uncertainty relations with respect to $T_E$ and $K$. The desired time-energy uncertainty relation now simply follows from the equalities
\[
\Delta_\Phi(T_E) = \Delta_\Phi(T_F), \qquad \Delta_\Phi(K) = \Delta_\Phi(H).
\]
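The proof reduces everything to the standard relation $\Delta_\Phi(x)\,\Delta_\Phi(\frac{1}{i}\frac{d}{dx}) \ge \frac{1}{2}$ on $L^2(\mathbb{R})$. As a purely illustrative numerical sketch (not part of the argument), the following Python snippet evaluates both spreads on a grid for two real wave functions. The function name `spreads`, the grid sizes and the two test functions are choices made here for illustration; a Gaussian is expected to come out close to the bound $1/2$, and $1/\cosh x$ close to $\pi/6$, slightly above it.

```python
import math

def spreads(psi, half_width, n=8001):
    """Return (delta_x, delta_p) for a real wave function psi.

    psi is sampled on a uniform grid over [-half_width, half_width].
    For real psi the expectation of (1/i) d/dx vanishes, so
    delta_p^2 = integral of psi'(x)^2 dx / integral of psi(x)^2 dx.
    """
    h = 2.0 * half_width / (n - 1)
    xs = [-half_width + k * h for k in range(n)]
    vals = [psi(x) for x in xs]
    norm2 = sum(v * v for v in vals) * h
    mean_x = sum(x * v * v for x, v in zip(xs, vals)) * h / norm2
    var_x = sum((x - mean_x) ** 2 * v * v for x, v in zip(xs, vals)) * h / norm2
    # Central differences approximate psi' at the interior grid points.
    dvals = [(vals[k + 1] - vals[k - 1]) / (2.0 * h) for k in range(1, n - 1)]
    var_p = sum(d * d for d in dvals) * h / norm2
    return math.sqrt(var_x), math.sqrt(var_p)

if __name__ == "__main__":
    dx, dp = spreads(lambda x: math.exp(-x * x / 4.0), 12.0)
    print("Gaussian product:", dx * dp)
    dx, dp = spreads(lambda x: 1.0 / math.cosh(x), 25.0)
    print("sech product:    ", dx * dp)
```

The finite-difference derivative slightly underestimates the momentum spread, so the Gaussian value should be read as "equal to 1/2 up to discretization error", not as a violation of the bound.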
It is clear that the tricky point was to find a useful representation of the Hilbert space $\mathcal{K}$ with which we could reduce the computation to the standard position-momentum uncertainty relation. However, we stress that this representation only plays an auxiliary rôle; no physical interpretation has to be associated with it. (An essentially equivalent derivation may already be found in [15].)

4. Uncertainty of Time

The replacement of projections by positive operators in the description of time observables leads to an intrinsic uncertainty. We will assume in this section that the Hamiltonian is positive. Under this condition, we will show that the minimal time uncertainty is inversely proportional to the expectation value of the energy.

Let $\Phi$ be a unit vector for which the time uncertainty $\Delta_\Phi(T_F)$ and the expectation value of the Hamiltonian are finite. We use the same covariant dilation as before. Because of the positivity of $H$, $\mathcal{H}$ must be contained in the spectral subspace of $K$ corresponding to the positive real axis. We may realize $\mathcal{K}$ as the space of square integrable functions $L^2(\mathbb{R}, \mathcal{L})$, where $\mathcal{L}$ is the Hilbert space which describes the multiplicity of the Schrödinger representation. $K$ acts by multiplication and $T_E$ as generator of translations. Since $\Phi$ is in the domain of $T_E$, it is absolutely continuous, and since $\mathcal{H} \subset \mathcal{K}_+ = L^2(\mathbb{R}_+, \mathcal{L})$, $\Phi$ has to vanish at $x = 0$. Hence $\Phi$ is in the quadratic form domain $q$ of the operator $-\frac{d^2}{dx^2}$ on $\mathcal{K}_+$ with Dirichlet boundary condition at $x = 0$ (symbolically $-\frac{d^2}{dx^2}\big|_D$). Since the problem is invariant under time shifts we may assume that the expectation value of $T_F$ vanishes, and to determine the infimum (over $\Phi$) of the quantity
\[
\Bigl(\Phi, -\frac{d^2}{dx^2}\Big|_D \Phi\Bigr)\,(\Phi, x\Phi)^2,
\]
it would be sufficient to take it over the set
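\[
S = \Bigl\{\Phi \in \mathcal{K}_+,\ \|\Phi\| = 1,\ \Phi \in q\Bigl(-\frac{d^2}{dx^2}\Big|_D\Bigr) \cap q(x)\Bigr\}.
\]
We use the following relation, which is valid for $a, b > 0$:
\[
ab^2 = \inf_{\lambda > 0} \frac{4}{27\lambda^2}(a + \lambda b)^3.
\]
The relation may be verified by noting that the argument of the infimum assumes the value of the left side for $\lambda = \frac{2a}{b}$, hence it suffices to check the inequality
\[
ab^2 \le \frac{4}{27\lambda^2}(a + \lambda b)^3, \qquad a, b, \lambda > 0.
\]
Setting $c = \lambda b$, we obtain the equivalent inequality
\[
a\Bigl(\frac{c}{2}\Bigr)^2 \le \Bigl(\frac{a + c}{3}\Bigr)^3.
\]
Taking now the logarithm on both sides we find again an equivalent inequality which is a direct consequence of the concavity of the logarithm. We therefore obtain the following relation:
\[
\inf_{\Phi \in S} \Bigl(\Phi, -\frac{d^2}{dx^2}\Big|_D \Phi\Bigr)(\Phi, x\Phi)^2
= \inf_{\Phi \in S}\,\inf_{\lambda > 0} \frac{4}{27\lambda^2}\Bigl(\Phi, \Bigl(-\frac{d^2}{dx^2}\Big|_D + \lambda x\Bigr)\Phi\Bigr)^3.
\]
We may perform on the right hand side first the infimum over $\Phi$. We can then exploit the behaviour of the operator $-\frac{d^2}{dx^2}\big|_D + x$ under scale transformations. Namely, let
\[
(D(\mu)\Phi)(x) = \mu^{1/2}\,\Phi(\mu x)
\]
be the unitary scale transformations on $\mathcal{K}_+$. Then we have
\[
D(\lambda^{1/3})^{-1}\Bigl(-\frac{d^2}{dx^2}\Big|_D + \lambda x\Bigr)D(\lambda^{1/3}) = \lambda^{2/3}\Bigl(-\frac{d^2}{dx^2}\Big|_D + x\Bigr).
\]
Since the set $S$ is scale invariant, the infimum over $\Phi$ of the full expression, including the prefactor $4/(27\lambda^2)$, is independent of $\lambda$. We thus obtain
\[
\inf_{\Phi \in S} \Bigl(\Phi, -\frac{d^2}{dx^2}\Big|_D \Phi\Bigr)(\Phi, x\Phi)^2 = \frac{4}{27}c^3,
\]
where $c$ is the infimum of the spectrum of $-\frac{d^2}{dx^2}\big|_D + x$. The spectrum of this operator is a pure point spectrum [12, 4]. Its eigenfunctions are
\[
\Phi_n(x) = \mathrm{Ai}(x - \lambda_n),
\]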
with eigenvalues $\lambda_n$, where $\mathrm{Ai}$ is the Airy function and $-\lambda_n$ are its zeros. The smallest eigenvalue is $\lambda_1 = 2.338$. So we finally arrive at the uncertainty relation
\[
\Delta_\Phi(T_F) \ge \frac{d}{\langle H \rangle_\Phi}
\]
with $d = (4\lambda_1^3/27)^{1/2} = 1.376$.

Some comments are in order now:

(1) The new relation gives a rather large bound if compared to the original time-energy uncertainty; indeed we have
\[
\Delta_\Phi(T_F)^2 \langle H^2 \rangle_\Phi = \Delta_\Phi(T_F)^2\bigl(\langle H \rangle_\Phi^2 + \Delta_\Phi(H)^2\bigr) \ge d^2 + \frac{1}{4},
\]
the exact largest lower bound of the left hand side being 9/4. Let us also notice that the bound $d$ is universal, i.e. does not depend on the details of the Hamiltonian $H$.

(2) The stated relation is covariant, i.e. energy shifts do not change it. In case the infimum of the Hamiltonian is not zero we may replace $H$ by $H - \inf(\sigma(H)) \cdot 1$, where $\sigma(A)$ is the spectrum of the operator $A$ and $1$ is the unit operator on the Hilbert space.

(3) We have an explicit formula for the state with minimum uncertainty, namely the state $\Phi_1(x) = \mathrm{Ai}(x - \lambda_1)$. Its shape shows how the energy spectrum has to be distributed in order to have minimal dispersion in time. (Recall that the variable $x$ labels the energy of the system.)

(4) In the light of the last remark one wonders whether it would be possible to prepare such a kind of state in a laboratory and check the relation explicitly.

(5) The same relation holds for the radial momentum of the system in place of $T_F$ and by replacing the Hamiltonian by the radius.
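The two numerical ingredients of Sec. 4 can be checked independently of the operator-theoretic argument. The sketch below is an illustration only: it assumes the standard Maclaurin series of the Airy function, and the helper names `bracket_value`, `airy_ai`, `first_airy_zero` and `time_bound_constant` are introduced here and do not appear in the paper. It verifies that $\frac{4}{27\lambda^2}(a + \lambda b)^3$ takes the value $ab^2$ at $\lambda = 2a/b$, locates the first zero of $\mathrm{Ai}$ by bisection, and evaluates $d = (4\lambda_1^3/27)^{1/2}$.

```python
import math

def bracket_value(a, b, lam):
    """Argument of the infimum: 4 (a + lam*b)^3 / (27 lam^2)."""
    return 4.0 * (a + lam * b) ** 3 / (27.0 * lam ** 2)

def airy_ai(x, terms=60):
    """Airy function Ai(x) from its Maclaurin series.

    Ai(x) = c1 * f(x) - c2 * g(x), with c1 = Ai(0) and c2 = -Ai'(0).
    The series is only meant for moderate |x| (a few units).
    """
    c1 = 3.0 ** (-2.0 / 3.0) / math.gamma(2.0 / 3.0)
    c2 = 3.0 ** (-1.0 / 3.0) / math.gamma(1.0 / 3.0)
    x3 = x ** 3
    f_term, g_term = 1.0, x
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        # Ratios of consecutive series coefficients of f and g.
        f_term *= x3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= x3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return c1 * f_sum - c2 * g_sum

def first_airy_zero(lo=-3.0, hi=-2.0, tol=1e-12):
    """Bisect for the zero of Ai in [lo, hi] and return lambda_1 = -zero."""
    f_lo = airy_ai(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = airy_ai(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return -0.5 * (lo + hi)

def time_bound_constant():
    """d = sqrt(4 * lambda_1^3 / 27), the constant in Delta(T_F) >= d / <H>."""
    lam1 = first_airy_zero()
    return math.sqrt(4.0 * lam1 ** 3 / 27.0)

if __name__ == "__main__":
    a, b = 1.7, 0.6
    print("a*b^2          :", a * b * b)
    print("value at 2a/b  :", bracket_value(a, b, 2.0 * a / b))
    print("lambda_1       :", first_airy_zero())
    print("d              :", time_bound_constant())
```

The bracket $[-3, -2]$ is chosen because $\mathrm{Ai}$ changes sign exactly once there; a different bracket would pick out a different eigenvalue $\lambda_n$.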
References

[1] Y. Aharonov, J. Oppenheim, S. Popescu, B. Reznik and W. G. Unruh, Phys. Rev. A57 (1998) 4130.
[2] R. Brunetti and K. Fredenhagen, Time of occurrence observable in quantum mechanics, to appear in Phys. Rev. A.
[3] P. Busch, The time energy uncertainty relation, in Time in Quantum Mechanics, eds. J. G. Muga, R. Sala Mayato and I. L. Egusquiza, Lecture Notes in Physics vol. 72, Springer-Verlag, Berlin-Heidelberg-New York, 2002.
[4] S. Flügge, Practical Quantum Mechanics, Springer-Verlag, Berlin, 1999.
[5] W. Heisenberg, Z. Phys. 69 (1927) 56.
[6] J. Kijowski, Rep. Math. Phys. 6 (1974) 362, and Phys. Rev. A59 (1999) 897.
[7] G. Ludwig, Foundations of Quantum Mechanics, Springer-Verlag, New York, 1983.
[8] J. G. Muga and C. R. Leavens, Phys. Rep. 338 (2000) 353.
[9] M. A. Naimark, Izv. Acad. Nauk. SSSR Ser. Mat. 4 (1940) 277.
[10] W. Pauli, General Principles of Quantum Mechanics, Springer-Verlag, New York, 1980.
[11] P. Pfeifer and J. Fröhlich, Rev. Mod. Phys. 67 (1995) 759.
[12] M. Reed and B. Simon, Functional Analysis, second edition, Academic Press, New York, 1980.
[13] M. D. Srinivas and R. Vijayalakshmi, Pramana 16 (1981) 173.
[14] V. S. Varadarajan, Geometry of Quantum Theory, vol. II, D. van Nostrand Co., Princeton, New Jersey, 1968.
[15] R. Werner, J. Math. Phys. 27 (1986) 793.
[16] E. P. Wigner, On the time-energy uncertainty relation, in Aspects of Quantum Theory, eds. A. Salam and E. P. Wigner, Cambridge University Press, 1972.
August 22, 2002 11:24 WSPC/148-RMP
00130
Reviews in Mathematical Physics, Vol. 14, Nos. 7 & 8 (2002) 905–911 c World Scientific Publishing Company
LIST OF PUBLICATIONS OF HUZIHIRO ARAKI
[1] Interaction between electrons in one-dimensional free-electron model with application to absorption spectra of cyanine dyes (with G. Araki), Progr. Theor. Phys. 11(1) (1954) 20–24.
[2] Quantum field theory of unstable particles (with Y. Munakata, M. Kawaguchi and T. Goto), Progr. Theor. Phys. 17(3) (1957) 419–442.
[3] Quantum-electrodynamical corrections to energy-levels of Helium, Progr. Theor. Phys. 17(5) (1957) 619–642.
[4] On weak time-symmetric gravitational waves, Ann. Physics 7(4) (1959) 456–465.
[5] Measurement of quantum mechanical operators (with M. Yanase), Phys. Rev. 120(2) (1960) 622–626.
[6] On asymptotic behavior of vacuum expectation values at large space-like separation, Ann. Physics 11(2) (1960) 260–274.
[7] Properties of the momentum space analytic function (with N. Burgoyne), Nuovo Cimento, Ser. X 18 (1960) 342–346.
[8] Hamiltonian formalism and the canonical commutation relations in quantum field theory, J. Math. Phys. 1(6) (1960) 492–504.
[9] Generalized retarded functions and analytic function in momentum space in quantum field theory, J. Math. Phys. 2(2) (1961) 163–177.
[10] The determination of a local or almost local field from a given current (with R. Haag and B. Schroer), Nuovo Cimento, Ser. X 19 (1961) 90–102.
[11] On the connection of spin and commutation relations between different fields, J. Math. Phys. 2(3) (1961) 267–270.
[12] Wightman functions, retarded functions and their analytic continuations, Progr. Theor. Phys. Suppl. (18) (1961) 83–125.
[13] On the asymptotic behaviour of Wightman functions in space-like directions (with K. Hepp and D. Ruelle), Helv. Phys. Acta 35(3) (1962) 164–174.
[14] A generalization of Borchers theorem, Helv. Phys. Acta 36(2) (1963) 132–139.
[15] Representations of the canonical commutation relations describing a nonrelativistic infinite free Bose gas (with E. J. Woods), J. Math. Phys. 4(5) (1963) 637–662.
[16] A lattice of von Neumann algebras associated with the quantum theory of a free Bose field, J. Math. Phys. 4(11) (1963) 1343–1362.
[17] Von Neumann algebras of local observables of free scalar field, J. Math. Phys. 5(1) (1964) 1–13.
[18] Representations of canonical anticommutation relations (with W. Wyss), Helv. Phys. Acta 37(2) (1964) 136–159.
[19] On the algebras of all local observables, [RIMS-5, July 1964] Progr. Theor. Phys. 32(5) (1964) 844–854.
[20] Type of von Neumann algebra associated with free field, [RIMS-4, June 1964] Progr. Theor. Phys. 32(6) (1964) 956–965.
[21] Hamiltonian formalism and the canonical commutation relations in quantum field theory, Ph. D. Thesis at Princeton University, 1960.
[22] Einführung in die axiomatische quantenfeldtheorie, I and II, Lecture note at Eidgenössische Technische Hochschule, Zürich, 1961–62. Note taken by K. Hepp, F. Riahi and W. Wyss.
[23] Axiomatic method in quantum field theory, Bull. Phys. Soc. Japan 16(7) (1961) 463–465.
[24] Equivalence of locality and paralocality in free parafield theory (with O. W. Greenberg and J. S. Toll), Phys. Rev. 142(4) (1966) 1017–1018.
[25] Complete boolean algebras of type I factors (with E. J. Woods), Publ. RIMS, Kyoto Univ. Ser. A 2(2) (1966) 157–242.
[26] Addenda: Complete boolean algebras of type I factors (with E. J. Woods), Publ. RIMS, Kyoto Univ. Ser. A 2(3) (1967) 451–452.
[27] A remark on Piron’s paper (with I. Amemiya), [RIMS-12, August 1966], Publ. RIMS, Kyoto Univ. Ser. A 2(3) (1967) 423–427.
[28] A report on the axiomatic approach to quantum physics, a talk given by M. H. Stone, note taken by H. Araki, Sugaku 17(B) (1965) 140–149.
[29] Collision cross sections in terms of local observables (with R. Haag), Comm. Math. Phys. 4(2) (1967) 77–91.
[30] C*-Algebra Approach in Quantum Statistical Mechanics, (Lecture Note) Technical Report No. 728, Univ. of Maryland, 1967.
[31] A classification of factors (with E. J. Woods), Publ. RIMS, Kyoto Univ. Ser. A 4(1) (1968) 51–130.
[32] Multiple time analyticity of a quantum statistical state satisfying the KMS boundary conditions, [RIMS-38, July 1968], Publ. RIMS, Kyoto Univ. Ser. A 4(2) (1968) 361–371.
[33] On KMS boundary conditions (with H. Miyata), [RIMS-37, July 1968], Publ. RIMS, Kyoto Univ. Ser. A 4(2) (1968) 373–385.
[34] On the diagonalization of a bilinear hamiltonian by a Bogoliubov transformation, [RIMS-39, July 1968], Publ. RIMS, Kyoto Univ. Ser. A 4(2) (1968) 387–412.
[35] A classification of factors, II, Publ. RIMS, Kyoto Univ. Ser. A 4(3) (1969) 585–593.
[36] Physics and operator algebras, Sugaku 20(3) (1968) 142–153.
[37] Gibbs states of a one dimensional quantum lattice, Comm. Math. Phys. 14(2) (1969) 120–157.
[38] Factorizable representation of current algebra — non commutative extension of the Lévy–Kinchin formula and cohomology of a solvable groups with values in a Hilbert space, [RIMS-41, June 1969], Publ. RIMS, Kyoto Univ. Ser. A, 5(3) (1970) 361–422.
[39] Local quantum theory, I, in Proceedings of the International school of physics “Enrico Fermi”, Course 45, ed. R. Jost, Academic Press, New York and London, 1969, pp. 65–96.
[40] Functional analytical method in foundation of statistical mechanics, Bull. Phys. Soc. Japan 25(1) (1970) 51–55.
[41] Entropy inequalities (with Elliott H. Lieb), [RIMS-61, March 1970], Comm. Math. Phys. 18(2) (1970) 160–170.
[42] On quasifree states of CAR and Bogoliubov automorphisms, [RIMS-62, April 1970], Publ. RIMS, Kyoto Univ. 6(3) (1971) 385–442.
[43] One dimensional quantum lattice system, in Systèmes à un Nombre Infini de Degrés de Liberté, Colloques Internationaux du CNRS No. 181, CNRS, Paris, 1970, pp. 75–86.
[44] On representations of canonical commutation relations, [Queen’s 1970-27], Comm. Math. Phys. 20(1) (1971) 9–25.
[45] Asymptotic ratio set and property $L'_\lambda$, [Queen’s 1970–28], Publ. RIMS, Kyoto Univ. 6(3) (1971) 443–460.
[46] A remark on Bures distance function for normal states, Publ. RIMS, Kyoto Univ. 6(3) (1971) 477–482.
[47] Product states, in Cargèse Lectures in Physics, vol. 4, ed. D. Kastler, Gordon and Breach Sci. Publ., New York, 1970, pp. 1–30.
[48] On quasifree states of the canonical commutation relations (I) (with Masafumi Shiraishi), [RIMS-76, December 1970], Publ. RIMS, Kyoto Univ. 7(1) (1971) 105–120.
[49] On quasifree states of the canonical commutation relations (II), [RIMS-77, December 1970], Publ. RIMS, Kyoto Univ. 7(1) (1971) 121–152.
[50] On the homotopical significance of the type of von Neumann algebra factors (with Mi-Soo Bae Smith and Larry Smith), [RIMS-81, February 1971], Comm. Math. Phys. 22(1) (1971) 71–88.
[51] Some topics in the theory of operator algebras, in Actes du Congrès International des Mathématiciens 1970, vol. 2, Gauthier-Villars, Paris, 1971, pp. 379–382.
[52] Reports on the section of quantum field theory and operator algebras in ICM 1970, Sugaku 23(2) (1971) 125–130.
[53] Theory of Wyler, Bull. Phys. Soc. Japan 27(5) (1972) 387–390.
[54] A remark on an infinite tensor product of von Neumann algebras (with Yoshiomi Nakagami), [RIMS-110, June 1972], Publ. RIMS, Kyoto Univ. 8(2) (1972) 363–374.
[55] Bures distance function and a generalization of Sakai’s non-commutative Radon–Nikodym theorem, [Queen’s 1972–11], Publ. RIMS, Kyoto Univ. 8(2) (1972) 335–362.
[56] Normal positive linear mappings of norm 1 from a von Neumann algebra into its commutant and its application, [Queen’s 1972–12], Publ. RIMS, Kyoto Univ. 8(3) (1973) 439–469.
[57] Remarks on spectra of modular operators of von Neumann algebras, [Queen’s 1972–15], Comm. Math. Phys. 28(4) (1972) 267–277.
[58] Some properties of modular conjugation operator of von Neumann algebra and a non-commutative Radon–Nikodym theorem with a chain rule, [RIMS-120, August 1972], Pacific J. Math. 50(2) (1974) 309–354.
[59] Expansional in Banach algebras, [RIMS-121, September 1972], Ann. Sci. École Norm. Sup. 6(1) (1973) 67–84.
[60] Structure of some von Neumann algebras with isolated discrete modular spectrum, [Queen’s 1972–24], Publ. RIMS, Kyoto Univ. 9(1) (1973) 1–44.
[61] On the definition of C*-algebras (with George A. Elliott), Publ. RIMS, Kyoto Univ. 9(1) (1973) 93–112.
[62] Topologies induced by representations of the canonical commutation relations (with E. J. Woods), [Queen’s 1972–32], Rep. Math. Phys. 4(3) (1973) 227–254.
[63] Statistical mechanics and probability, Suurikagaku 107 (1972) 17–21.
[64] Parallel session on axiomatic field theory, in Proceedings of the XVI International Conference on High Energy Physics, vol. 2, National Accelerator Laboratory, Batavia, 1972, pp. 1–6.
[65] Introduction to operator algebras, in Statistical Mechanics and Field Theory, eds. R. N. Sen and C. Weil, Halsted Press, Jerusalem-London, 1972, pp. 1–26.
[66] Relative Hamiltonian for faithful normal states of a von Neumann algebra, [RIMS-126, December 1972], Publ. RIMS, Kyoto Univ. 9(1) (1973) 165–209.
[67] Relative Hamiltonian for faithful normal states of a von Neumann algebra, Trudy Mat. Inst. Akad. Nauk USSR 135 (1975) 18–25.
[68] A classification of factors, in Statistical Mechanics and Field Theory, eds. R. N. Sen and C. Weil, Halsted Press, Jerusalem-London, 1972, pp. 27–30.
[69] Mathematical problems in quantum field theory and quantum statistical mechanics, Bull. Phys. Soc. Japan 28(6) (1973) 514–516.
[70] One-parameter family of Radon–Nikodym theorems for states of a von Neumann algebra, [RIMS-139, July 1973], Publ. RIMS, Kyoto Univ. 10(1) (1974) 1–10.
[71] Golden–Thompson and Peierls–Bogoliubov inequalities for a general von Neumann algebra, [RIMS-140, July 1973], Comm. Math. Phys. 34 (1973) 167–178.
[72] On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems (with P. D. F. Ion), [RIMS-143, August 1973], Comm. Math. Phys. 35(1) (1974) 1–12.
[73] Positive cone, Radon–Nikodym theorems, relative Hamiltonian and the Gibbs condition in statistical mechanics; an application of the Tomita–Takesaki theory, [RIMS-151, October 1973], in C*-Algebras and Their Applications to Statistical Mechanics and Quantum Field Theory, ed. D. Kastler, North-Holland Pub. Co., Amsterdam, 1975, pp. 64–100.
[74] On the equivalence of the KMS condition and the variational principle for quantum lattice systems, [RIMS-157, April 1974], Comm. Math. Phys. 38(1) (1974) 1–10.
[75] Inequalities of Lieb, Bull. Phys. Soc. Japan 29(4) (1974) 346–348.
[76] Recent development in the theory of operator algebras (with Y. Nakagami), Sugaku 26(4) (1974) 330–345. Addendum, Sugaku 27(2) (1975) 158–159.
[77] Recent development in the theory of operator algebras and their significance in the theoretical physics, [RIMS-188, July 1975], Symposia Math. 20 (1976) 395–423.
[78] On uniqueness of KMS states of one-dimensional quantum lattice systems, Comm. Math. Phys. 44(1) (1975) 1–7.
[79] Relative entropy and its applications, [RIMS-190, August 1975], in Les Méthodes Mathématiques de la Théorie Quantique des Champs, Colloques Internationaux du C.N.R.S. No. 248, CNRS, Paris, 1976, pp. 61–79.
[80] Relative entropy of states of von Neumann algebras, [RIMS-191, September 1975], Publ. RIMS, Kyoto Univ. 11(3) (1976) 809–833.
[81] Inequalities in von Neumann algebras, [RIMS-192, September 1975], in R.C.P.25 22, IRMA, Univ. Louis Pasteur, Strasbourg, 1975, pp. 1–25.
[82] Quantum mechanics of infinite systems, Kagaku 146(1) (1976) 56–61.
[83] Introduction to relative Hamiltonian and relative entropy, [CPT preprint 75/p. 782], Lecture at Marseille, Nov., 1975.
[84] On clustering property (with A. Kishimoto), [RIMS-198, March 1976], Rep. Math. Phys. 10(2) (1976) 119–125.
[85] Symmetry and equilibrium states I (with A. Kishimoto), [RIMS-208, July 1976], Comm. Math. Phys. 52(3) (1977) 211–232.
[86] KMS conditions and local thermodynamical stability of quantum lattice systems (with Geoffrey L. Sewell), Comm. Math. Phys. 52(2) (1977) 103–109.
[87] Extension of KMS states and chemical potential (with Daniel Kastler, Masamichi Takesaki and Rudolf Haag), Comm. Math. Phys. 53(2) (1977) 97–134.
[88] Constructive field theory, Bull. Phys. Soc. Japan 31(8) (1976) 623–631.
[89] Relative entropy of states of von Neumann algebras II, Publ. RIMS, Kyoto Univ. 13(1) (1977) 173–192.
[90] Some contact points of mathematics and physics (The First Pan-African Congress of Mathematicians, Rabat, 1976), Acta Appl. Math. 1 (1983) 5–15.
[91] On KMS states of a C* dynamical system, in Lecture Notes in Math. 650, 1978, pp. 66–84.
[92] C*-algebras and applications to physics, in International Colloquium on the Role of Mathematical Physics in the Development of Science, College de France, 1977.
[93] Operator algebras and statistical mechanics, in Mathematical Problems in Theoretical Physics, Proceedings, Rome 1977, eds. G. Dell’Antonio, S. Doplicher and G. Jona-Lasinio, Lecture Notes in Phys. 80 (1978), pp. 94–105.
[94] Some topics in quantum statistical mechanics, in Proceedings of International Congress of Mathematicians, Helsinki 1978, vol. 2, ed. O. Lehto, Academia Scientiarum Fennica, 1980, pp. 873–880.
[95] On the Kubo–Martin–Schwinger boundary condition, Progr. Theor. Phys. Suppl. 64 (1978) 12–20.
[96] On chemical potential, Random Fields, eds. J. Fritz, J. L. Lebowitz and D. Szász, North-Holland, 1981, pp. 65–78.
[97] On a characterization of the state space of quantum mechanics, Comm. Math. Phys. 75 (1980) 1–24.
[98] Progress of the theory of renormalization, Kagaku 49(12) (1979).
[99] A characterization of the state space of quantum mechanics, R.C.P. 25, 28 (1980) 59–67.
[100] A remark on Machida–Namiki theory of measurement, Progr. Theor. Phys. 64 (1980) 719–730.
[101] C*-algebra approach in quantum field theory (Contribution to the Symposium “Perspective in Modern Field Theories” in Stockholm, 23–26 September 1980), Phys. Scripta 24(5) (1981) 881–885.
[102] Positive cones for von Neumann algebras, in Operator Algebras and Applications, ed. R. V. Kadison, AMS, 1982, Part 2, Proceedings of Symposia in Pure Mathematics, vol. 38 (1982), pp. 5–15.
[103] The role of C*-algebras and its future, Kagaku 150(10) (1980) 630–635.
[104] Recent trends in mathematical studies of quantum field theory (Contribution to the Japan–Italy Symposium on Fundamental Physics, held on January 27–30, 1981 at Istituto Italiano di Cultura, Tokyo), Proceedings of the Japan–Italy Symposium on Fundamental Physics, eds. S. Fukui and T. Toyoda, Nagoya Univ., pp. 35–45.
[105] An inequality for Hilbert–Schmidt norm (with Shigeru Yamagami), [RIMS-355], Comm. Math. Phys. 81(1) (1981) 89–96.
[106] On quasi-equivalence of quasifree states of the canonical commutation relations (with Shigeru Yamagami), Publ. RIMS, Kyoto Univ. 18(2) (1982) 703–758.
[107] Positive cones and $L_p$-spaces for von Neumann algebras (with Tetsuya Masuda), Publ. RIMS, Kyoto Univ. 18(2) (1982) 759–831.
[108] Some aspects of mathematical methods in quantum statistical mechanics (Lecture at Mathematiker-Kongress der DDR 1981), Mitteilungen der Mathematischen Gesellschaft der Deutschen Demokratischen Republik, (1–2) (1982) 5–13.
[109] On a certain class of *-algebras of unbounded operators (with Jean Paul Jurzak), Publ. RIMS, Kyoto Univ. 18(3) (1982) 1013–1044.
[110] Theory of von Neumann algebras and quantum statistical mechanics, Mitteilungen der Mathematischen Gesellschaft der Deutschen Demokratischen Republik, (1–2) (1982) 5–13.
[111] On a pathology in indefinite metric inner product space, Comm. Math. Phys. 85 (1982) 121–128.
[112] A remark on transition probability (with Guido A. Raggio), [RIMS-385], Lett. Math. Phys. 6(3) (1982) 237–240.
[113] On the dynamics and ergodic properties of the XY-model (with Eytan Barouch), [RIMS-416], J. Stat. Phys. 31(2) (1983) 327–345.
[114] Property T and actions on the hyperfinite II-factor (with M. Choda), Math. Japonica 28(2) (1983) 205–209.
[115] On the XY-model on two-sided infinite chain, [RIMS-435], Publ. RIMS, Kyoto Univ. 20(2) (1984) 277–296.
[116] On a C*-algebra approach to phase transition in the two-dimensional Ising model (with David E. Evans), [RIMS-438], Comm. Math. Phys. 91 (1983) 489–503.
[117] Profiles of Field medalists 1982, Sugaku 35(1) (1983) 70–77.
[118] Some contact points of mathematics and physics, Acta Appl. Math. 1 (1983) 5–15.
[119] The work of Alain Connes, Proceedings of the International Congress of Mathematicians, August 16–24, 1983, Warszawa, vol. 1, pp. 3–10.
[120] On C*-algebra methods for the XY and Ising models, Physica 124A (1984) 47–50; Mathematical Physics VII, eds. W. E. Brittain, K. E. Gustafson and W. Wyss, North-Holland, 1984, pp. 47–50.
[121] Ergodic properties of some C*-dynamical system, in Operator Algebras and their Connections with Topology and Ergodic Theory, Proceedings, Busteni, Romania 1983, Lecture Notes in Mathematics 1132 (1985), pp. 1–11.
[122] Dynamical and ergodic properties of the XY-models, in Critical Phenomena, 1983, Brasov School Conference, Progress in Physics, vol. 11 (1985), pp. 287–314.
[123] On $O_{n+1}$ (with A. L. Carey and D. E. Evans), J. Operator Theory 12(2) (1984).
[124] C*-algebra approach to ground states of the XY-model (with Taku Matsui), in Statistical Physics and Dynamical Systems — Rigorous Results, eds. J. Fritz, A. Jaffe and D. Szász, Birkhäuser, Progress in Physics, vol. 10 (1985), pp. 17–39.
[125] Indecomposable representations with invariant inner product — a theory of the Gupta–Bleuler triplets, Comm. Math. Phys. 97 (1985) 149–159.
[126] Ground states of the XY-model (with Taku Matsui), Comm. Math. Phys. 101 (1985) 213–245.
[127] On international congress of mathematicians in Warsaw, Sugaku 36(1) (1984) 6–10.
[128] Mathematical science in the last two decades, Sugaku 36(1) (1984) 51–67.
[129] Taming infinity, Suurikagaku (1971) 31–36.
[130] Around the fuzziness, Kagaku-Kisoron Kenkyu 13(4) (1978) 140–144.
[131] On International Association of Mathematical Physics, Bull. Phys. Soc. Japan 11(11) (1978) 888–889.
[132] The core spirit of non-commutativity analysis, in Proceedings of the Conference at the 60th Anniversary of Prof. N. Tomita, 1984, pp. 1–18.
[133] A continuous superselection rule as a model of classical measuring apparatus in quantum mechanics, in Fundamental Aspects of Quantum Theory, eds. V. Gorini and A. Frigerio, Plenum, 1986, pp. 23–33.
[134] Analyticity of ground states of the XY-model (with Taku Matsui), [RIMS-522], Lett. Math. Phys. 11 (1986) 87–94.
[135] Analyticity of correlation functions for the two-dimensional Ising models, [RIMS-525], Comm. Math. Phys. 106 (1986) 241–266.
[136] Bogoliubov automorphisms and Fock representations of canonical anticommutation relations, Amer. Math. Soc. Contemporary Mathematics 62 (1987) 23–141.
[137] Analyticity in some models of quantum statistical mechanics, in Statistical Mechanics and Field Theory: Mathematical Aspects, eds. T. C. Dorlas, N. M. Hugenholtz and M. Winnink, Lecture Notes in Phys. 257, 1986, pp. 89–98.
[138] Quantum mechanics of systems of infinite degrees of freedom — An introduction to the C*-algebra approach, in Proceedings of the International Conference on Mathematical Physics, ed. J. N. Islam, Univ. of Chittagong, 1987, pp. 171–206.
[139] On pseudo-unity representations, Czechoslovak J. Phys. B 37 (1987) 278–284.
[140] Recent progress on entropy and relative entropy, in VIIIth International Congress on Mathematical Physics, eds. M. Mebkhout and R. Seneor, World Scientific, 1987, pp. 354–365.
[141] On superselection rules, in Proceedings 2nd International Symposium Foundations of Quantum Mechanics, eds. M. Namiki et al., Phys. Soc. Japan, 1987, pp. 348–354.
[142] On subadditivity of relative entropy, Wandering in the Fields, eds. K. Kawarabayashi and A. Ukawa, World Scientific, 1987, pp. 215–226.
[143] Schwinger terms as a cyclic cocycle, in Quantum Theories and Geometry, eds. M. Cahen and M. Flato, Math. Phys. Studies, 10, Kluwer Academic Publ., 1988, pp. 1–22.
[144] A note on an exact solution for the optical absorptance by thin film (with E. Barouch), Lett. Math. Phys. 14 (1987) 227–234.
[145] Recent Trends in Mathematical Physics.
[146] Some Italian and Japanese contributions to Modern Mathematical Physics.
[147] An application of Dye’s theorem of projection lattice to orthogonally decomposable isomorphisms, Pacific J. Math. 137 (1989) 1–13.
[148] Some of the legacy of John von Neumann in physics — theory of measurement, quantum logic and von Neumann algebras in physics, in Proceedings of Symposia in Pure Math. 50 (1990), pp. 119–136.
[149] On an inequality of Lieb and Thirring, Lett. Math. Phys. 19 (1990) 167–170.
[150] Selfadjointness and positivity of a certain partial differential operator associated with the Kerr metric in general relativity, Lett. Math. Phys. 18 (1989) 355–363.
[151] Physical image of the δ function, Suurikagaku 310 (1989) 19–23.
[152] Master symmetries of the XY-model, Comm. Math. Phys. 132 (1990) 155–176.
[153] Symmetries in theory of local observables and the choice of the net of local algebras, Rev. Math. Phys. Special issue (1992) 1–14.
[154] Operator algebra approach to soluble models of quantum spin lattice systems, in Quantum and Non-Commutative Analysis, eds. H. Araki, K. R. Ito, A. Kishimoto and I. Ojima, Reidel Publ. Co., 1993, pp. 183–195.
[155] A remark on a connection of return to equilibrium and multiple ground states in some perturbed XY-model, in On Klauder’s Path: A Field Trip, eds. G. G. Emch, G. C. Hegerfeldt and L. Streit, World Scientific, 1994, pp. 11–14.
[156] C*-algebras and statistical mechanics — Guadeloop winter school lecture, in Infinite Dimensional Geometry, Non-Commutative Geometry, Operator Algebras, Fundamental Interactions, eds. R. Coquereaux, M. Dubois-Violette and P. Flad, World Scientific, 1995, pp. 2–33.
[157] On boundedness and $\|\cdot\|_p$ continuity of second quantization (with K. B. Sinha and V. Sunder), Publ. RIMS, Kyoto Univ. 31(5) (1995) 941–952.
[158] Soliton sectors of the XY-model, Inter. J. Modern Phys. B 10(13–14) (1996) 1685–1693.
[159] Finite energy sectors of the XY-model, Lett. Math. Phys. 38 (1996) 399–410.
[160] On commuting transfer matrices (with Takaaki Tabuchi), Helvetica Phys. Acta 69 (1996) 717–751.
[161] Generalization of Krinsky’s commutativity proof of transfer matrices with Hamiltonians (with Takaaki Tabuchi), Foundation of Phys. 27 (1997) 1485–1494.
[162] Some infinite-dimensional algebras arising in spin systems and in particle physics and their grand algebra (with M. Flato, S. Michea and D. Sternheimer), Lett. Math. Phys. 43 (1998) 155–171.
[163] Operator algebra methods in quantum physics, in Mathematical Methods of Quantum Physics, eds. C. Bernido, M. V. Carpio-Bernido, K. Nakamura and K. Watanabe, Gordon and Breach Science Publ., 1999, pp. 29–41.
[164] Asymptotic time evolution of a partitioned infinite two-sided isotropic XY-chain (with T. G. Ho), in Proceedings of the Steklov Institute of Mathematics 228 (2000) 191–204.
[165] Jensen’s operator inequality for functions of several variables (with F. Hansen), Proc. Amer. Math. Soc. 128 (2000) 2075–2084.
September 21, 2002 14:4 WSPC/148-RMP
00146
Reviews in Mathematical Physics, Vol. 14, No. 9 (2002) 913–975 © World Scientific Publishing Company
EXCEPTIONAL STRING: INSTANTON EXPANSIONS AND SEIBERG–WITTEN CURVE
KENJI MOHRI Institute of Physics, University of Tsukuba Ibaraki 305-8571, Japan
[email protected]
Received 19 December 2001
We investigate instanton expansions of partition functions of several toric E-string models using local mirror symmetry and elliptic modular forms. We also develop a method to determine the Seiberg–Witten curve of the E-string with the help of elliptic functions.
Keywords: E-string; local mirror symmetry; Seiberg–Witten curve.
Contents
1. Introduction
2. E-String
2.1. Homology lattice and affine root lattice
2.2. Partition functions
3. Four Torus Models
3.1. Periods of elliptic curves
3.2. Modular identities
4. Model Building of E-Strings
4.1. Principal series
4.2. $E_0$ and $E_{\tilde 1}$ models
4.3. Kähler classes
4.4. Periods
4.5. Relation to Seiberg–Witten periods
5. Genus Zero Partition Functions
5.1. Recursion relations
5.2. Instanton expansion
5.3. Partition functions and modular forms
5.4. Modular anomaly equation
5.5. Rational instanton numbers
6. Genus One Partition Functions
6.1. Genus one potentials
6.2. Modular anomaly equation
6.3. Elliptic instanton numbers
7. Higher Genus Partition Functions
7.1. Partition functions as modular forms
7.2. Gopakumar–Vafa invariants
7.3. Partition functions as Jacobi forms
8. Seiberg–Witten Curve
8.1. Periods of rational elliptic surfaces
8.2. Wilson lines
9. Inverse Problem
9.1. General strategy
9.2. Several models with a few Wilson lines
Acknowledgments
Appendix A
References
1. Introduction
The exceptional string (E-string for short) was originally discovered as the effective six-dimensional (6D) theory associated with a small $E_8$ instanton in heterotic string [1]. It has since become clear that the physical content of the toroidal compactification of the E-string down to 4D is extremely rich [2–9].
The 4D E-string theory can be realized as type IIA string "compactified" on the canonical line bundle of a rational elliptic surface $B_9$. However, we must rely on mirror symmetry for any quantitative analysis. In fact, we have two versions of mirror symmetry. One is local mirror symmetry [10] applied to $B_9$, which must be realized torically [11]. At the expense of a restriction on the Kähler moduli, this method enables us to investigate the BPS spectrum systematically, even at higher genera [9]. The other describes the E-string by the Seiberg–Witten curve [2, 3], based on the fact that $B_9$ is self-mirror; the E-string is mapped to type IIB string on a non-compact Calabi–Yau manifold containing a rational elliptic surface $S_9$ [7], the complex moduli of which replace the Kähler moduli of $B_9$. The purpose of this paper is to explore these two descriptions of the E-string further.
This paper is organized as follows. In Sec. 2, we collect general results on the Kähler moduli parameters and the partition functions of the E-string. The remaining sections are divided into two parts. The first part consists of Secs. 3–7, where we investigate the six toric E-string models by means of local mirror symmetry: in Sec. 3, we study the four torus models associated with the E-string models, where the relation between the periods and elliptic modular forms is the central problem; in Sec. 4, we analyze the Picard–Fuchs system of the E-string models; we then investigate the partition functions of the E-string models of genus zero and one in Secs. 5 and 6 respectively; finally, in Sec. 7, partition functions of higher genera are considered in connection with elliptic modular forms, Gopakumar–Vafa invariants and $E_8$ Jacobi forms. The second part deals with the Seiberg–Witten curve of the E-string: after a review of the period map of rational elliptic surfaces in Sec. 8, we obtain in Sec. 9 a procedure to determine the Seiberg–Witten curve for given Wilson lines using elliptic functions.
2. E-String
2.1. Homology lattice and affine root lattice
Let us first consider type IIA string compactified on a Calabi–Yau threefold which contains a rational elliptic surface $B_9$ with a section [12] as a divisor, where we use the symbol $B_N$ for the $E_N$ del Pezzo surface; a rational elliptic surface is alternatively called an almost $E_9$ del Pezzo surface. The resulting 4D theory is an $N = 2$ supergravity. The large radius limit of the normal direction to $B_9$ then kills almost all the degrees of freedom of the original Calabi–Yau threefold; effectively we are left with $K_{B_9}$, the canonical line bundle of $B_9$, as a compactification manifold, and the 4D theory reduces to an E-string theory which does not contain gravity.
We are interested in the physical quantities of the E-string that depend only on the complexified Kähler class $J \in H_2(B_9; \mathbb{C})$ of $B_9$ inherited from the Calabi–Yau threefold. It should be noted that we can take the complex structure of $B_9$ to be generic, that is, we can assume that the elliptic fibration $\pi: B_9 \to \mathbb{P}^1$ has twelve singular fibers of the $I_1$ type.
We recall here some properties of the second homology classes $H_2(B_9)$. The lattice structure in $H_2(B_9)$ induced by the intersection pairings is given by
$$H_2(B_9) = \mathbb{Z}l \oplus \mathbb{Z}E_1 \oplus \cdots \oplus \mathbb{Z}E_9, \tag{2.1}$$
$$l \cdot l = 1, \qquad l \cdot E_i = 0, \qquad E_i \cdot E_j = -\delta_{ij}. \tag{2.2}$$
$B_9$ is realized as a blow-up of $\mathbb{P}^2$ at the nine points which are the base points of a cubic pencil. The class $l$ is given by the total transform of a line in $\mathbb{P}^2$, while $E_i$ is the exceptional divisor associated with the $i$th base point.
The first Chern class $c_1(B_9)$ is represented by the fiber class $[\delta]$ of the elliptic fibration $\pi: B_9 \to \mathbb{P}^1$, which can be written in terms of the generators (2.2) as $[\delta] = 3l - E_1 - \cdots - E_9$. It is readily verified that $[\delta] \cdot l = 3$, $[\delta] \cdot E_i = 1$, and $[\delta] \cdot [\delta] = 0$.
The sublattice $[\delta]^\perp$ of $H_2(B_9)$, which is the orthogonal complement of $[\delta]$ in $H_2(B_9)$, is naturally identified with the root lattice of the affine Lie algebra $E_8^{(1)}$: $L(E_8^{(1)}) = \bigoplus_{i=0}^{8} \mathbb{Z}\alpha_i$, where $\{\alpha_i\}_{i=0}^{8}$ are the simple roots of $E_8^{(1)}$. This can be seen as follows. First we find the generators of $[\delta]^\perp$ as a free $\mathbb{Z}$-module:
$$[\delta]^\perp = \bigoplus_{i=0}^{8} \mathbb{Z}[\alpha_i], \qquad H_2(B_9) \cong [\delta]^\perp \oplus \mathbb{Z}E_9,$$
$$[\alpha_0] = E_8 - E_9, \qquad [\alpha_i] = E_i - E_{i+1}, \quad i = 1, \dots, 7, \qquad [\alpha_8] = l - E_1 - E_2 - E_3. \tag{2.3}$$
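The lattice data above can be checked mechanically. The following is a minimal sketch, assuming the basis ordering $(l, E_1, \dots, E_9)$ and the helper names `dot`, `E`, `alpha`, `gram`, which are introduced here purely for illustration. It computes the pairings of the generators (2.3), confirms that each is orthogonal to $[\delta]$, and confirms that the integer combination with coefficients $(1, 2, 4, 6, 5, 4, 3, 2, 3)$ reproduces $[\delta]$.

```python
# Basis order for H_2(B_9): (l, E_1, ..., E_9); intersection form diag(+1, -1, ..., -1).
def dot(u, v):
    return u[0] * v[0] - sum(a * b for a, b in zip(u[1:], v[1:]))

def E(i):
    """Exceptional class E_i as a coordinate vector (i = 1..9)."""
    v = [0] * 10
    v[i] = 1
    return v

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

l = [1] + [0] * 9
delta = [3] + [-1] * 9                      # [delta] = 3l - E_1 - ... - E_9

# Generators (2.3): index 0 is [alpha_0], index 8 is [alpha_8].
alpha = [sub(E(8), E(9))]
alpha += [sub(E(i), E(i + 1)) for i in range(1, 8)]
alpha += [sub(sub(sub(l, E(1)), E(2)), E(3))]

# Pairings ([alpha_i] . [alpha_j]); expected to be minus an affine Cartan matrix.
gram = [[dot(a, b) for b in alpha] for a in alpha]

# Integer combination of the generators that should reproduce [delta].
marks = [1, 2, 4, 6, 5, 4, 3, 2, 3]
combo = [sum(m * a[k] for m, a in zip(marks, alpha)) for k in range(10)]
```

The diagonal of `gram` is $-2$ and the off-diagonal entries are $0$ or $1$, with eight linked pairs, which is the pattern of the affine $E_8$ Dynkin diagram up to the overall sign.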
Then we see that the intersection pairings $([\alpha_i] \cdot [\alpha_j])$ coincide with minus the Cartan matrix of $E_8^{(1)}$. Note that the element of $L(E_8^{(1)})$ corresponding to $[\delta] \in [\delta]^\perp$ is
$$[\delta] = [\alpha_0] + 2[\alpha_1] + 4[\alpha_2] + 6[\alpha_3] + 5[\alpha_4] + 4[\alpha_5] + 3[\alpha_6] + 2[\alpha_7] + 3[\alpha_8],$$
which is in accord with the standard notation of the affine Lie algebra: $\alpha_0 = \delta - \theta$, with $\theta$ the highest root of $E_8$.
Let $\{\omega_i\}_{i=1}^{8}$ be the fundamental weights of $E_8$. Then $(\omega_i|\alpha_j) = \delta_{i,j}$ and $\theta = \omega_7$. We denote the corresponding elements of $[\delta]^\perp$ by $\{[\omega_i]\}_{i=1}^{8}$, which satisfy $[\omega_i] \cdot [\alpha_j] = -\delta_{i,j}$.
The dual of the Cartan subalgebra of $E_8^{(1)}$, which we denote by $\mathfrak{h}^*$, contains the zeroth fundamental weight $\Lambda_0$ as a generator in addition to the simple roots $\{\alpha_i\}$, that is, $\mathfrak{h}^* = \mathbb{C}\Lambda_0 \oplus \mathbb{C}\delta \oplus \mathbb{C}\alpha_1 \oplus \cdots \oplus \mathbb{C}\alpha_8$. $\Lambda_0$ satisfies $(\Lambda_0|\alpha_i) = 0$ for $i \neq 0$ and $(\Lambda_0|\delta) = 1$, which leads to the final identification $E_9 = -[\Lambda_0] - \tfrac{1}{2}[\delta]$. Thus we have another basis of the vector space
$$H_2(B_9; \mathbb{Q}) = \bigoplus_{i=1}^{8} \mathbb{Q}[\alpha_i] \oplus \mathbb{Q}[\delta] \oplus \mathbb{Q}[\Lambda_0], \tag{2.4}$$
from which it follows that $H_2(B_9; \mathbb{C})$ can be identified with $\mathfrak{h}^*$, the dual of the Cartan subalgebra of $E_8^{(1)}$. To summarize, we find the isomorphism $\mathfrak{h}^* \ni x \mapsto [x] \in H_2(B_9; \mathbb{C})$ of $\mathbb{C}$-vector spaces such that $(x|y) = -[x] \cdot [y]$ for any $x, y \in \mathfrak{h}^*$.
An important consequence of the above observation is that the Weyl group $W(E_8^{(1)})$ of $E_8^{(1)}$ acts on $H_2(B_9; \mathbb{C})$. To see the effect of the Weyl action on the Kähler moduli parameters of the E-string, which should be a physical symmetry, we put the Kähler class $J \in H_2(B_9; \mathbb{C}) \cong \mathfrak{h}^*$ in the canonical form
$$J = \left(\tfrac{1}{2}\tau + \sigma\right)[\delta] - \tau[\Lambda_0] - \sum_{i=1}^{8} \mu_i[\omega_i], \tag{2.5}$$
where $\tau$ is the complex modulus of the torus $T^2$ on which the 6D E-string theory is compactified to 4D, $\tau + \sigma$ is the Kähler modulus of $T^2$, or equivalently the E-string tension and the self-dual B-flux on $T^2$, and $(\mu_i)$ are the $E_8$ Wilson lines, that is, the moduli of the flat $E_8$ bundles on $T^2$ [2, 3].
There exists a semi-direct product structure $W(E_8^{(1)}) = W(E_8) \ltimes T$, where $W(E_8)$ is the finite Weyl group and $T := \{t_\beta \mid \beta \in L(E_8)\}$ is the group of translations by the root lattice $L(E_8)$.
$W(E_8)$ affects only the Wilson lines $\mu := \sum_{i=1}^{8} \mu_i\omega_i$; it is clear that $\mu_i$ transforms in the same way as the simple root $\alpha_i$ of $E_8$. The embedding $\iota$ of the finite root lattice $L(E_8)$ in the Euclidean space $\mathbb{R}^8$ defined by
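$$\iota: \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \\ \alpha_6 \\ \alpha_7 \\ \alpha_8 \end{pmatrix} \to \begin{pmatrix} \tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 & \tfrac12 \\ -1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \end{pmatrix}, \tag{2.6}$$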
where $\{e_i\}$ is an orthonormal basis of $\mathbb{R}^8$, greatly simplifies the description of $W(E_8)$; it is generated by (i) the permutations of $\{e_i\}$, (ii) the sign flips of an even number of the $e_i$, and (iii) the involution $e_i \mapsto e_i - \tfrac14\sum_{j=1}^{8} e_j$, which is the Weyl reflection with respect to the root $\theta - \sum_{i=1}^{7} \alpha_i$ [2, 3]. If $\iota(\mu) = \sum_{i=1}^{8} m_i(\mu)e_i$, we call $(m_i(\mu))$ the Euclidean coordinates of $\mu$.
On the other hand, the translation $t_\beta$ by $\beta \in L(E_8)$ is given by
$$t_\beta(x) = x + (x|\delta)\beta - \left[\frac{(\beta|\beta)}{2}(x|\delta) + (x|\beta)\right]\delta. \tag{2.7}$$
Substituting $x = J$ (2.5), we see the $t_\beta$ action on the Kähler moduli parameters:
$$\sigma \mapsto \sigma + \tfrac12(\beta|\beta)\tau + (\mu|\beta), \qquad \tau \mapsto \tau, \qquad \mu \mapsto \mu + \tau\beta, \tag{2.8}$$
which is familiar as a symmetry of classical theta functions. Note that to realize another symmetry translation, $\sigma \to \sigma$, $\tau \to \tau$, $\mu \to \mu + \alpha$, as a Weyl group action, we need to consider the doubly-affinized $E_8$ algebra, $E_9^{(1)}$ [13–15], which might be possible only if we extend $H_2(B_9)$ to the full homology lattice $H_0(B_9) \oplus H_2(B_9) \oplus H_4(B_9)$.
2.2. Partition functions
The Gromov–Witten partition function $Z_{g;n}(\tau|\mu)$ of genus $g$ and winding number $n$ of the local $B_9$ model with the Kähler moduli $J$ given in (2.5) is defined as the expansion coefficient of the genus $g$ potential $F_g$ [16]:
$$F_g(\sigma, \tau, \mu) = \sum_{n=1}^{\infty} p^n Z_{g;n}(\tau|\mu), \qquad p := e^{2\pi i\sigma}. \tag{2.9}$$
We remark here that the genus $g$ refers to that of the type IIA string, while the winding number $n$ refers to that of the E-string.
Using $\varphi(\tau) := \prod_{n=1}^{\infty}(1 - q^n)$, with $q = e^{2\pi i\tau}$, we can write $Z_{g;n}$ as
$$Z_{g;n}(\tau|\mu) = \frac{T_{g;n}(\tau|\mu)}{\varphi(\tau)^{12n}}. \tag{2.10}$$
The numerator $T_{g;n}$ is a so-called Weyl-invariant $E_8$ quasi-Jacobi form [17] of index $n$ and weight $2g - 2 + 6n$, which means that $T_{g;n}(\tau|\mu)$ is invariant under the $W(E_8)$ action on $\mu$ and has the following transformation properties:
$$T_{g;n}(\tau|\mu + \alpha + \beta\tau) = e^{-\pi i n[(\beta|\beta)\tau + 2(\beta|\mu)]}\, T_{g;n}(\tau|\mu), \qquad \alpha, \beta \in L(E_8), \tag{2.11}$$
$$\hat{T}_{g;n}\!\left(\frac{a\tau + b}{c\tau + d}\,\Big|\,\frac{\mu}{c\tau + d}\right) = (c\tau + d)^{2g-2+6n}\, e^{\frac{n\pi i c}{c\tau + d}(\mu|\mu)}\, \hat{T}_{g;n}(\tau|\mu), \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2; \mathbb{Z}), \tag{2.12}$$
where $\hat{T}_{g;n}$ is obtained from $T_{g;n}$ by replacing each $E_2(\tau)$ in it with $\hat{E}_2(\tau) := E_2(\tau) - 3/(\pi\,\mathrm{Im}\,\tau)$, which gets rid of the modular anomaly of $Z_{g;n}$ that comes from
$$E_2\!\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^2\left[E_2(\tau) + \frac{12}{2\pi i}\frac{c}{c\tau + d}\right], \tag{2.13}$$
at the sacrifice of holomorphy [18]. (2.11) shows that $e^{2\pi i n\sigma}T_{g;n}(\tau|\mu)$ is invariant under the translation $t_\beta$ (2.8), as expected.
In particular, the genus zero, singly winding partition function can be given by the classical level one $E_8$ theta function [4, 19, 7]:
$$T_{0;1}(\tau|\mu) = \Theta_{E_8}(\tau|\mu) := \sum_{\alpha \in L(E_8)} e^{\pi i(\alpha|\alpha)\tau + 2\pi i(\alpha|\mu)}. \tag{2.14}$$
$\Theta_{E_8}$ can be written in terms of the Euclidean coordinates for the Wilson lines as
$$\Theta_{E_8}(\tau|\mu) = \frac12\sum_{a=1}^{4}\prod_{i=1}^{8}\vartheta_a(\tau|m_i), \qquad \iota(\mu) = \sum_{i=1}^{8} m_i e_i. \tag{2.15}$$
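As an illustration of (2.10) with (2.14), the $q$-expansion of $Z_{0;1}$ at vanishing Wilson lines can be computed directly. The sketch below assumes the standard facts $\Theta_{E_8}(\tau|0) = E_4(\tau)$ and $E_4 = 1 + 240\sum_{n\ge1}\sigma_3(n)q^n$; the function names are illustrative only.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def z01_coefficients(order):
    """Coefficients of q^0..q^order of E_4(tau) / phi(tau)^12, phi = prod_n (1 - q^n)."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, order + 1)]
    # Build 1/phi^12 by multiplying twelve times with each geometric series 1/(1 - q^n).
    inv_phi12 = [1] + [0] * order
    for n in range(1, order + 1):
        for _ in range(12):
            for k in range(n, order + 1):
                inv_phi12[k] += inv_phi12[k - n]   # in-place product with 1/(1 - q^n)
    # Cauchy product of the two truncated series.
    return [sum(e4[i] * inv_phi12[k - i] for i in range(k + 1)) for k in range(order + 1)]

print(z01_coefficients(3))  # [1, 252, 5130, 54760]
```

All arithmetic is over the integers, so the coefficients are exact.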
The relation to the curve counting problem is as follows [9]: in $Z_{0;1}(\tau|\mu)$, the $E_8$ theta function part is regarded as the contribution to $Z_{0;1}$ from the Mordell–Weil lattice [20], while the denominator part $\varphi^{12}$ [4, 19] comes from the twelve degenerate elliptic fibres of the fibration $\pi: B_9 \to \mathbb{P}^1$.
Furthermore, it is clear from the analysis of the BPS states in [9, 21] that the higher genus partition functions of the singly winding sector can be obtained by
$$\Phi_1 := \sum_{g=0}^{\infty} Z_{g;1}(\tau|\mu)\,x^{2g-2} = Z_{0;1}(\tau|\mu)\left(\frac{\eta(\tau)^3}{\vartheta_1(\tau|\frac{x}{2\pi})}\right)^{2}. \tag{2.16}$$
The genus expansion of the right hand side reads [18]
$$\left(\frac{\eta(\tau)^3}{\vartheta_1(\tau|\frac{x}{2\pi})}\right)^{2} = \frac{1}{x^2}\exp\left[\sum_{k=1}^{\infty}\frac{(-1)^{k+1}b_{2k}}{k\,(2k)!}E_{2k}(\tau)\,x^{2k}\right] = \sum_{g=0}^{\infty} x^{2g-2}A_g(\tau),$$
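where $b_n$ and the $(2k)$th Eisenstein series $E_{2k}(\tau)$ are defined by
$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} b_n\frac{x^n}{n!}, \qquad E_{2k}(\tau) = 1 - \frac{4k}{b_{2k}}\sum_{n=1}^{\infty}\sigma_{2k-1}(n)\,q^n, \qquad \sigma_k(n) := \sum_{m|n} m^k.$$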
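The exponential form of the genus expansion lends itself to a mechanical expansion. The following is a minimal sketch that treats $E_2, E_4, E_6$ as formal variables and expands $\exp\big[\sum_k s_k E_{2k} x^{2k}\big]$ with $s_k = (-1)^{k+1} b_{2k} / (k\,(2k)!)$ up to $g = 3$, where no Eisenstein series beyond $E_6$ enters. The polynomial representation and all function names are choices made for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(n_max):
    """b_0..b_{n_max} with x/(e^x - 1) = sum_n b_n x^n / n!  (so b_1 = -1/2)."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(comb(n + 1, k) * b[k] for k in range(n)) / (n + 1))
    return b

# A polynomial in (E2, E4, E6) is a dict {(i, j, k): coefficient of E2^i E4^j E6^k}.
def poly_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            out[m] = out.get(m, 0) + c1 * c2
    return out

def poly_add(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return out

def genus_coefficients(g_max=3):
    """A_0..A_{g_max}, using g*A_g = sum_k k*s_k*E_{2k}*A_{g-k} for the exponential."""
    assert g_max <= 3, "beyond g = 3 one needs E_8, E_10, ... reduced to E_4, E_6"
    b = bernoulli(2 * g_max)
    gens = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1)}
    s = {k: {gens[k]: (-1) ** (k + 1) * b[2 * k] / (k * factorial(2 * k))}
         for k in range(1, g_max + 1)}
    A = [{(0, 0, 0): Fraction(1)}]
    for g in range(1, g_max + 1):
        total = {}
        for k in range(1, g + 1):
            term = poly_mul(s[k], A[g - k])
            total = poly_add(total, {m: k * c for m, c in term.items()})
        A.append({m: c / g for m, c in total.items()})
    return A
```

For instance, `genus_coefficients()[2]` is `{(2, 0, 0): 1/288, (0, 1, 0): 1/1440}`, that is, $(E_4 + 5E_2^2)/1440$.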
As $E_{2k} \in \mathbb{Q}[E_4, E_6]$ for $k \geq 2$, that is, $E_8 = E_4^2$, $E_{10} = E_4E_6$, $E_{12} = \frac{1}{691}(250E_6^2 + 441E_4^3)$, $E_{14} = E_4^2E_6$, for example, the coefficients $A_g$ above can be reduced to
$$A_1 = \frac{1}{12}E_2, \qquad A_2 = \frac{1}{1440}(E_4 + 5E_2^2), \qquad A_3 = \frac{1}{362880}(4E_6 + 21E_2E_4 + 35E_2^3),$$
$$A_4 = \frac{1}{87091200}(39E_4^2 + 80E_2E_6 + 210E_2^2E_4 + 175E_2^4),$$
$$A_5 = \frac{1}{11496038400}(136E_4E_6 + 429E_2E_4^2 + 440E_2^2E_6 + 770E_2^3E_4 + 385E_2^5), \quad \dots.$$
The partition functions $Z_{g;n}$ obey the modular anomaly equation
$$\frac{\partial Z_{g;n}(\tau|\mu)}{\partial E_2(\tau)} = \frac{1}{24}\sum_{h+h'=g}\ \sum_{k+k'=n} kk'\,Z_{h;k}(\tau|\mu)Z_{h';k'}(\tau|\mu) + \frac{1}{24}n(n+1)Z_{g-1;n}(\tau|\mu), \tag{2.17}$$
which was first found in genus zero partition functions [6], and then generalized to higher genus partition functions in [9]. This can be rewritten using the genus $g$ potential (2.9) with each $E_2(\tau)$ in it replaced by $\hat{E}_2(\tau)$, $\hat{F}_g = \sum_{n=1}^{\infty} p^n\hat{Z}_{g;n}$, as
$$-16\pi i\,(\mathrm{Im}\,\tau)^2\,\frac{\partial\hat{F}_g}{\partial\bar\tau} = \sum_{h=0}^{g}\Theta_p\hat{F}_h\,\Theta_p\hat{F}_{g-h} + (\Theta_p + 1)\Theta_p\hat{F}_{g-1},$$
where $\Theta_p = p\,\partial/\partial p$ is the Euler derivative. Compare this with the holomorphic anomaly equation for the topological string amplitude $\hat{F}_g$ for $g > 1$ on a Calabi–Yau threefold $X$ expressed in a general coordinate system [16]:
$$\partial_{\bar i}\hat{F}_g = \frac12\,\bar{C}_{\bar i\bar j\bar k}\,e^{2K}G^{j\bar j}G^{k\bar k}\left(\sum_{h=1}^{g-1}D_j\hat{F}_h\,D_k\hat{F}_{g-h} + D_jD_k\hat{F}_{g-1}\right),$$
where $G_{i\bar j} = \partial_i\partial_{\bar j}K$ is the Weil–Petersson–Zamolodchikov metric on the complex moduli space $\mathcal{M}(X^*)$ of the mirror $X^*$, $e^{-K}$ is a natural Hermitian metric on a positive line bundle $L \to \mathcal{M}(X^*)$, with the holomorphic three-form of $X^*$ being a local section, and $C_{ijk}$ is the $27^3$ Yukawa coupling. The covariant derivatives above have contributions not only from the Levi–Civita connection $\Gamma^k_{ij} = G^{k\bar l}\partial_jG_{i\bar l}$ but also from the Hermitian connection $-\partial_iK$ on $L$, because $\hat{F}_g$ is a section of $L^{2g-2}$:
$$D_i\hat{F}_g = [\partial_i + (2 - 2g)\partial_iK]\hat{F}_g, \qquad D_iD_j\hat{F}_g = [\partial_i + (2 - 2g)\partial_iK]D_j\hat{F}_g - \Gamma^k_{ij}D_k\hat{F}_g.$$
Even more important is the potential $\Phi_n$ of fixed winding number $n$:
$$\Phi_n(\lambda; \tau, \mu) = \sum_{g=0}^{\infty} x^{2g-2}Z_{g;n}(\tau|\mu), \qquad \lambda := e^{\frac{1}{12}E_2(\tau)x^2}. \tag{2.18}$$
Introduction of $\lambda$ and the full potential $A := \sum_{g=0}^{\infty} x^{2g-2}F_g = \sum_{n=1}^{\infty} p^n\Phi_n$ greatly simplifies the modular anomaly Eq. (2.17):
$$\Theta_\lambda\exp(A) = \frac12\Theta_p(\Theta_p + 1)\exp(A). \tag{2.19}$$
If we substitute the $p$-expansion $\exp(A) = \sum_{n=0}^{\infty} p^n\psi_n(\Phi_1, \dots, \Phi_n)$, where $\psi_n$ is the $n$th Schur polynomial, then we obtain $\Theta_\lambda\psi_n = \frac12 n(n+1)\psi_n$, hence
$$\psi_n(\Phi_1, \dots, \Phi_n) = \lambda^{\frac12 n(n+1)}\,\psi_n(\Phi_1^0, \dots, \Phi_n^0), \tag{2.20}$$
where $\Phi_n^0 = \Phi_n|_{E_2=0}$ is the anomaly-free part; in particular, $\Phi_1 = \lambda\Phi_1^0$ from (2.16). Finally, (2.20) enables us to express the solution of the modular anomaly equation (2.17) concisely by $\Phi_n$ with the anomaly-free part $\Phi_n^0$ as its integration constant:
$$\Phi_2 = \frac12(\lambda - 1)(\Phi_1)^2 + \lambda^3\Phi_2^0,$$
$$\Phi_3 = -\frac16(\lambda - 1)^2(2\lambda + 1)(\Phi_1)^3 + (\lambda^2 - 1)\Phi_1\Phi_2 + \lambda^6\Phi_3^0,$$
$$\Phi_4 = \frac{1}{24}(\lambda - 1)^3(6\lambda^3 + 6\lambda^2 + 3\lambda + 1)(\Phi_1)^4 + \frac12(\lambda^4 - 1)(\Phi_2)^2 + (\lambda^3 - 1)\Phi_1\Phi_3 - \frac12(\lambda - 1)^2(\lambda + 1)(2\lambda^2 + \lambda + 1)(\Phi_1)^2\Phi_2 + \lambda^{10}\Phi_4^0. \tag{2.21}$$
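The closed forms (2.21) can be tested against the scaling law (2.20) with exact rational arithmetic. The sketch below picks arbitrary rational values for $\lambda$ and the anomaly-free parts, builds $\Phi_1, \dots, \Phi_4$ from (2.21), and compares the Schur polynomials on both sides of (2.20). The function names and the sample values are assumptions of this illustration only.

```python
from fractions import Fraction as Fr

def schur(phi):
    """psi_0..psi_N = coefficients of exp(sum_{n>=1} phi[n] p^n); phi[0] is unused."""
    N = len(phi) - 1
    psi = [Fr(1)] + [Fr(0)] * N
    for n in range(1, N + 1):
        # Differentiating exp(S) gives n*psi_n = sum_k k*phi_k*psi_{n-k}.
        psi[n] = sum(k * phi[k] * psi[n - k] for k in range(1, n + 1)) / n
    return psi

def dressed(lam, phi0):
    """Phi_1..Phi_4 reconstructed from the anomaly-free parts phi0[1..4] via (2.21)."""
    _, a1, a2, a3, a4 = phi0
    p1 = lam * a1
    p2 = (lam - 1) * p1 ** 2 / 2 + lam ** 3 * a2
    p3 = (-(lam - 1) ** 2 * (2 * lam + 1) * p1 ** 3 / 6
          + (lam ** 2 - 1) * p1 * p2 + lam ** 6 * a3)
    p4 = ((lam - 1) ** 3 * (6 * lam ** 3 + 6 * lam ** 2 + 3 * lam + 1) * p1 ** 4 / 24
          + (lam ** 4 - 1) * p2 ** 2 / 2
          + (lam ** 3 - 1) * p1 * p3
          - (lam - 1) ** 2 * (lam + 1) * (2 * lam ** 2 + lam + 1) * p1 ** 2 * p2 / 2
          + lam ** 10 * a4)
    return [Fr(0), p1, p2, p3, p4]

lam = Fr(3, 2)                                   # arbitrary sample value of lambda
phi0 = [Fr(0), Fr(2), Fr(-1, 3), Fr(5, 7), Fr(1, 4)]  # arbitrary anomaly-free parts
psi = schur(dressed(lam, phi0))
psi0 = schur(phi0)
ok = all(psi[n] == lam ** (n * (n + 1) // 2) * psi0[n] for n in range(1, 5))
```

Here `ok` is `True` exactly when the four relations (2.20) for $n = 1, \dots, 4$ hold for the sample values.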
We are interested in $\Phi_n$ also because it encapsulates the interaction of $n$ E-strings, which can be quite different from that of fundamental strings [7].
We also give the prediction of the leading term of the partition function:
$$Z_{g;n}(\tau|\mu) = \beta_{g,0}\,n^{2g-3} + O(q^n), \qquad \beta_{g,0} := \frac{|b_{2g}(2g - 1)|}{(2g)!}, \tag{2.22}$$
which generalizes the genus zero result in [6]. This can be used to partially fix the integration constants of the partition function.
3. Four Torus Models
In this section, we investigate the instanton expansion by means of elliptic modular forms [22] of the periods of four one-parameter families of elliptic curves, $E_N$, $N = 5, 6, 7, 8$, each of which is defined as a complete intersection in a toric variety as shown in Table 1. The results of this section will play an essential role in the instanton expansion of the E-string models in the next section.
The $E_N$ torus models have been studied in connection with the one-parameter families of the local $E_N$ del Pezzo models [4, 5, 10, 23, 24], and Calabi–Yau threefolds with these elliptic fibers have been used to describe 4D string models in which the $E_8$ gauge symmetry is broken to $E_N$ by Wilson lines [25, 4].
Table 1. Toric models.

Elliptic curve                        $E_9$ del Pezzo
$E_5$ : $\mathbb{P}^3[2, 2]$          $E_5$, $E_{\tilde 1}$
$E_6$ : $\mathbb{P}^2[3]$             $E_6$, $E_0$
$E_7$ : $\mathbb{P}_{1,1,2}[4]$       $E_7$
$E_8$ : $\mathbb{P}_{1,2,3}[6]$       $E_8$

3.1. Periods of elliptic curves
The Picard–Fuchs operator for the periods of the $E_N$ family of elliptic curves is given by
$$D_{\rm ell}^{(N)} = \Theta^2 - z(\Theta + \alpha^{(N)})(\Theta + \beta^{(N)}), \tag{3.1}$$
where $\Theta = z\,d/dz$ is the Euler derivative with respect to the bare modulus $z$, and $\alpha$, $\beta$ are defined by $\alpha + \beta = 1$ and $(\alpha^{(5)}, \alpha^{(6)}, \alpha^{(7)}, \alpha^{(8)}) = (1/2, 1/3, 1/4, 1/6)$.
The Gauss hypergeometric equation defined by (3.1) has regular singular points at $z = 0, 1, \infty$. The solutions around $z = 0$, which corresponds to the large radius limit point of the sigma model with the torus as target, are given by
$$\varpi(z) = \sum_{n=0}^{\infty} a(n)z^n = {}_2F_1(\alpha, \beta; 1; z), \tag{3.2}$$
$$\varpi_D(z) = \varpi(z)\log\left(\frac{z}{\kappa}\right) + \sum_{n=1}^{\infty} a(n)b(n)z^n, \tag{3.3}$$
where $(\kappa^{(5)}, \kappa^{(6)}, \kappa^{(7)}, \kappa^{(8)}) = (16, 27, 64, 432)$, and
$$a(n) = \frac{(\alpha)_n(\beta)_n}{(n!)^2}, \qquad b(n) = \sum_{k=0}^{n-1}\left(\frac{1}{k + \alpha} + \frac{1}{k + \beta} - \frac{2}{k + 1}\right),$$
with $(\alpha)_n := \Gamma(n + \alpha)/\Gamma(\alpha)$ the Pochhammer symbol.
The mirror map of the torus is defined by
$$2\pi i\tau = \frac{\varpi_D(z)}{\varpi(z)}, \tag{3.4}$$
where $\tau$ is the Kähler parameter of the torus.
In terms of the local coordinate $u = 1 - z$ at another regular singular point $z = 1$, the Picard–Fuchs operator takes the same form as the one at $z = 0$ (3.1), and the continuation of the solutions above becomes [24]
$$\begin{pmatrix}\varpi(z) \\ \varpi_D(z)\end{pmatrix} \to -\begin{pmatrix} 0 & x \\ x^{-1} & 0\end{pmatrix}\begin{pmatrix}\varpi(u) \\ \varpi_D(u)\end{pmatrix}, \qquad x = \frac{\sin(\pi\alpha)}{\pi}. \tag{3.5}$$
Let $M_{0,1,\infty}$ be the monodromy matrices around the regular singular points $z = 0, 1, \infty$ with respect to the basis $\{\varpi_D(z)/(2\pi i), \varpi(z)\}$. We can compute $M_0$ and $M_1$ from (3.2), (3.3) and (3.5) respectively:
$$M_0 = T = \begin{pmatrix} 1 & 1 \\ 0 & 1\end{pmatrix}, \qquad M_1 = -ST^{(9-N)}S = \begin{pmatrix} 1 & 0 \\ -(9 - N) & 1\end{pmatrix}, \tag{3.6}$$
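where $S$ and $T$ are the standard generators of $SL(2; \mathbb{Z})$. The remaining one, $M_\infty$, can be obtained from $M_\infty = M_1M_0$. The monodromy group $\Gamma_{\rm ell}^{(N)}$ is generated by $M_0$ and $M_1$; in particular, for $N \neq 8$, $\Gamma_{\rm ell}^{(N)} \cong \Gamma_0(9 - N)$, which is the Hecke subgroup of $SL(2; \mathbb{Z})$ [22]
$$\Gamma_0(h) := \left\{\begin{pmatrix} a & b \\ c & d\end{pmatrix} \in SL(2; \mathbb{Z})\ \Big|\ c \equiv 0 \bmod h\right\}.$$
The structure of $M_*(\Gamma_0(9 - N))$, the graded ring of the modular forms of even degree of $\Gamma_0(9 - N)$, should be clear from
$$E_5:\quad M_*(\Gamma_0(4)) = \mathbb{C}[\vartheta_3(2\tau)^4, \vartheta_4(2\tau)^4], \tag{3.7}$$
$$E_6:\quad M_*(\Gamma_0(3)) = \mathbb{C}[\varpi^{(6)}, H]^{\rm even}, \qquad H := \frac{\eta(\tau)^9}{\eta(3\tau)^3}, \tag{3.8}$$
$$E_7:\quad M_*(\Gamma_0(2)) = \mathbb{C}[A, B], \qquad A := \frac12(\vartheta_3^4 + \vartheta_4^4)(\tau), \quad B := \vartheta_3^4\vartheta_4^4(\tau). \tag{3.9}$$
The fundamental solution $\varpi(z)$ admits the following expressions:
$$\varpi^{(5)}(z) = \Theta_{A_1 \oplus A_1}(\tau) = \vartheta_3(2\tau)^2, \tag{3.10}$$
$$\varpi^{(6)}(z) = \Theta_{A_2}(\tau) = \vartheta_3(2\tau)\vartheta_3(6\tau) + \vartheta_2(2\tau)\vartheta_2(6\tau), \tag{3.11}$$
$$\varpi^{(7)}(z)^2 = \Theta_{D_4}(\tau) = A(\tau), \tag{3.12}$$
$$\varpi^{(8)}(z)^4 = \Theta_{E_8}(\tau) = E_4(\tau), \tag{3.13}$$
where $\Theta_K(\tau)$ is the theta function associated with the root lattice of $K$.
Despite the modular anomaly (2.13), $hE_2(h\tau) - E_2(\tau)$ is a sound modular form of $\Gamma_0(h)$ for each $h \in \mathbb{N}$. The following identity then shows that $\varpi(z)^2$ is an element of $M_2(\Gamma_{\rm ell}^{(N)})$ for $N = 5, 6, 7$:
$$(8 - N)\,\varpi^{(N)}(z)^2 = (9 - N)E_2((9 - N)\tau) - E_2(\tau). \tag{3.14}$$
Let us define for later use the modular function $e_{2k}$ by $e_{2k}(\tau) = E_{2k}(\tau)\varpi^{-2k}$ for each model, which turn out to be written in terms of $y := 1/(1 - z)$:
$$E_5:\quad e_4(\tau) = 16 - \frac{16}{y} + \frac{1}{y^2}, \qquad e_6(\tau) = -64 + \frac{96}{y} - \frac{30}{y^2} - \frac{1}{y^3},$$
$$E_6:\quad e_4(\tau) = 9 - \frac{8}{y}, \qquad e_6(\tau) = -27 + \frac{36}{y} - \frac{8}{y^2},$$
$$E_7:\quad e_4(\tau) = 4 - \frac{3}{y}, \qquad e_6(\tau) = -8 + \frac{9}{y},$$
$$E_8:\quad e_4(\tau) = 1, \qquad e_6(\tau) = -1 + \frac{2}{y}.$$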
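We see that $e_4$ and $e_6$ are subject to the algebraic relations:
$$E_5:\quad 0 = e_4^3 - e_6^2 - 108e_4e_6 - 8640e_6 - 1620e_4^2 - 17280e_4 + 27648,$$
$$E_6:\quad 0 = 8e_6 + 18e_4 + e_4^2 - 27,$$
$$E_7:\quad 0 = 3e_4 + e_6 - 4.$$
Let $T^{(N)}(\tau)$ be the modular function of $\Gamma_{\rm ell}^{(N)}$ defined by
$$T^{(5)}(\tau) = \left(\frac{\eta(\tau)}{\eta(4\tau)}\right)^{8}, \quad T^{(6)}(\tau) = \left(\frac{\eta(\tau)}{\eta(3\tau)}\right)^{12}, \quad T^{(7)}(\tau) = \left(\frac{\eta(\tau)}{\eta(2\tau)}\right)^{24}, \quad T^{(8)}(\tau) = \left(\frac{E_4(\tau)^{\frac32} + E_6(\tau)}{2\eta(\tau)^{12}}\right)^{2}, \tag{3.15}$$
where $\eta(\tau) := q^{1/24}\prod_{n=1}^{\infty}(1 - q^n)$ is the Dedekind eta function with $q = e^{2\pi i\tau}$. Then the inversion of the mirror map (3.4) can be given by
$$\frac{1}{z(\tau)} = 1 + \frac{1}{\kappa}T(\tau). \tag{3.16}$$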
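The inversion formula (3.16) can be tested order by order for $N = 5, 6, 7$. The sketch below uses exact truncated power series: it builds $z(q) = \kappa q / (qT(q) + \kappa q)$ from the eta quotients in (3.15), inserts it into $q = (z/\kappa)\exp\big(\sum_n a(n)b(n)z^n / \varpi(z)\big)$, which follows from (3.3) and (3.4), and returns the resulting series, which should be $q$ itself. The truncation order `M` and all helper names are choices of this illustration.

```python
from fractions import Fraction as Fr

M = 6  # truncation order: series are coefficient lists for powers 0..M

def mul(a, b):
    c = [Fr(0)] * (M + 1)
    for i in range(M + 1):
        for j in range(M + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def inv(a):
    """Reciprocal of a series with a[0] != 0."""
    r = [Fr(1) / a[0]] + [Fr(0)] * M
    for n in range(1, M + 1):
        r[n] = -sum(a[k] * r[n - k] for k in range(1, n + 1)) / a[0]
    return r

def exp_series(a):
    """exp of a series with a[0] == 0."""
    e = [Fr(1)] + [Fr(0)] * M
    for n in range(1, M + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def compose(f, g):
    """f(g(q)) for g[0] == 0, by Horner's scheme."""
    r = [Fr(0)] * (M + 1)
    for c in reversed(f):
        r = mul(r, g)
        r[0] += c
    return r

def one_minus(m):
    s = [Fr(1)] + [Fr(0)] * M
    s[m] -= 1
    return s

def q_roundtrip(N):
    """Series of (z/kappa) * exp(t(z)) evaluated at z = z(q) from (3.16)."""
    alpha = {5: Fr(1, 2), 6: Fr(1, 3), 7: Fr(1, 4)}[N]
    kappa = {5: 16, 6: 27, 7: 64}[N]
    h, beta = 9 - N, 1 - alpha
    a, b = [Fr(1)], [Fr(0)]
    for n in range(M):
        a.append(a[n] * (n + alpha) * (n + beta) / (n + 1) ** 2)
        b.append(b[n] + 1 / (n + alpha) + 1 / (n + beta) - Fr(2, n + 1))
    t = mul([x * y for x, y in zip(a, b)], inv(a))      # log(kappa*q/z) as a series in z
    P = [Fr(1)] + [Fr(0)] * M                           # P(q) = q * T(q)
    for n in range(1, M + 1):
        f = one_minus(n)
        if h * n <= M:
            f = mul(f, inv(one_minus(h * n)))
        for _ in range(24 // (h - 1)):
            P = mul(P, f)
    den = list(P)
    den[1] += kappa
    z = mul([Fr(0), Fr(kappa)] + [Fr(0)] * (M - 1), inv(den))
    return mul([c / kappa for c in z], exp_series(compose(t, z)))
```

For example, `q_roundtrip(6)` should return the coefficient list of the series $q$, i.e. a single unit entry at position 1.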
We note here that when $N = 5, 6, 7$, $\Gamma_{\rm ell}^{(N)}$ is a genus zero modular group, that is, the upper half plane divided by the action of $\Gamma_{\rm ell}^{(N)}$ is completed to $\mathbb{P}^1$, and the $q$-expansion of $T^{(N)}(\tau)$ reproduces the Thompson series associated with $\Gamma_{\rm ell}^{(N)}$ [26].
Other useful expressions of $1 - z$ by modular forms are
$$E_5:\ 1 - z(\tau) = \left(\frac{\vartheta_4(2\tau)}{\vartheta_3(2\tau)}\right)^{4}, \qquad E_6:\ 1 - z(\tau) = \frac{H}{\varpi^3}, \qquad E_7:\ 1 - z(\tau) = \frac{B}{A^2}, \qquad E_8:\ 1 - z(\tau) = \frac12\left(1 + E_6E_4^{-\frac32}(\tau)\right). \tag{3.17}$$
For each model, $\varpi(z)$ and $z$ satisfy the following equations:
$$\frac{1}{2\pi i}\frac{dz}{d\tau} = z(1 - z)\varpi(z)^2, \tag{3.18}$$
$$\kappa\,\eta(\tau)^{24} = z(1 - z)^{9-N}\varpi(z)^{12}. \tag{3.19}$$
(3.19)
3.2. Modular identities The following power series will play an important role in the instanton expansion of the E-string models: ξ (0) (z) =
∞ X
a(6) (n)h(3n)z n ,
(3.20)
a(5) (n)h(2n)z n ,
(3.21)
n=1 ˜
ξ (1) (z) =
∞ X n=1
ξ (N ) (z) =
∞ X n=1
a(N ) (n)h(n)z n ,
for N = 5, 6, 7, 8 .
(3.22)
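where $h(n) = \sum_{k=1}^{n} k^{-1}$ is the $n$th harmonic number. A computer experiment gives the identities:
$$E_6:\quad \psi^{(0)}(\tau) = \exp\left(-\frac{\xi^{(0)}(z)}{\varpi(z)}\right) = \left(\frac{27q}{z}\right)^{\frac16}(1 - z)^{\frac12}, \tag{3.23}$$
$$E_5:\quad \psi^{(\tilde 1)}(\tau) = \exp\left(-\frac{\xi^{(\tilde 1)}(z)}{\varpi(z)}\right) = \left(\frac{16q}{z}\right)^{\frac14}(1 - z)^{\frac12}, \tag{3.24}$$
$$E_N:\quad \psi^{(N)}(\tau) = \exp\left(-\frac{\xi^{(N)}(z)}{\varpi(z)}\right) = \left(\frac{\kappa^{(N)}q}{z}\right)^{\frac12}(1 - z)^{\frac12} = \left(qT^{(N)}(\tau)\right)^{\frac12}. \tag{3.25}$$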
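Identities of this kind can be tested order by order in $z$ without inverting the mirror map, because (3.3) and (3.4) give $\kappa q/z = \exp\big(\sum_n a(n)b(n)z^n/\varpi(z)\big)$. The sketch below is one possible form of such an experiment with exact rationals. To avoid fractional powers it compares $\psi^{6}$, $\psi^{4}$ and $\psi^{2}$ with the corresponding powers of the right hand sides. The truncation order and helper names are assumptions of this illustration.

```python
from fractions import Fraction as Fr

M = 6  # truncation order in z

def mul(a, b):
    c = [Fr(0)] * (M + 1)
    for i in range(M + 1):
        for j in range(M + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def inv(a):
    r = [Fr(1) / a[0]] + [Fr(0)] * M
    for n in range(1, M + 1):
        r[n] = -sum(a[k] * r[n - k] for k in range(1, n + 1)) / a[0]
    return r

def exp_series(a):
    e = [Fr(1)] + [Fr(0)] * M
    for n in range(1, M + 1):
        e[n] = sum(k * a[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def harmonic(n):
    return sum(Fr(1, k) for k in range(1, n + 1))

def both_sides(alpha, step, power):
    """Return (psi^power, (kappa q / z) * (1 - z)^(power/2)) as series in z,
    where psi = exp(-xi/varpi) and xi = sum_n a(n) h(step*n) z^n."""
    beta = 1 - alpha
    a, b = [Fr(1)], [Fr(0)]
    for n in range(M):
        a.append(a[n] * (n + alpha) * (n + beta) / (n + 1) ** 2)
        b.append(b[n] + 1 / (n + alpha) + 1 / (n + beta) - Fr(2, n + 1))
    inv_a = inv(a)
    xi = [a[n] * harmonic(step * n) for n in range(M + 1)]
    lhs = exp_series([-power * c for c in mul(xi, inv_a)])
    rhs = exp_series(mul([x * y for x, y in zip(a, b)], inv_a))   # kappa*q/z
    one_minus_z = [Fr(1), Fr(-1)] + [Fr(0)] * (M - 1)
    for _ in range(power // 2):
        rhs = mul(rhs, one_minus_z)
    return lhs, rhs

lhs0, rhs0 = both_sides(Fr(1, 3), 3, 6)   # sixth power of (3.23)
lhs1, rhs1 = both_sides(Fr(1, 2), 2, 4)   # fourth power of (3.24)
```

The pairs `(lhs0, rhs0)` and `(lhs1, rhs1)` agree through order $z^6$, and the same holds for `both_sides(alpha, 1, 2)` with the four values of $\alpha$ of the principal series.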
4. Model Building of E-Strings
In this section, we give six models, $E_N$, $N = 0, \tilde 1, 5, 6, 7, 8$, of the $E_9$ almost del Pezzo surface listed in Table 1, each of which is realized as a complete intersection in a toric variety, and analyze the Picard–Fuchs systems defined by them at the large radius point.
Among the six models, we call the four $E_{5,6,7,8}$ the principal series, the reason for which will become clear in the investigation of their Picard–Fuchs systems.
4.1. Principal series
The $E_{6,7,8}$ models are obtained as hypersurfaces in ambient toric threefolds with their Kähler classes inherited from those of the ambient spaces. To describe these, let us first define the action of $(\mathbb{C}^*)^2$ on $\mathbb{C}^5$ for each model by
$$E_6:\quad (x_1, x_2, x_3, x_4, x_5) \to (\lambda x_1, \lambda x_2, \lambda^{-1}\mu x_3, \mu x_4, \mu x_5), \tag{4.1}$$
$$E_7:\quad (x_1, x_2, x_3, x_4, x_5) \to (\lambda x_1, \lambda x_2, \lambda^{-1}\mu x_3, \mu x_4, \mu^2 x_5), \tag{4.2}$$
$$E_8:\quad (x_1, x_2, x_3, x_4, x_5) \to (\lambda x_1, \lambda x_2, \lambda^{-1}\mu x_3, \mu^2 x_4, \mu^3 x_5), \tag{4.3}$$
where $\lambda, \mu \in \mathbb{C}^*$. Then the ambient toric variety $A$ can be realized by the quotient $A := (\mathbb{C}^5 - F)/(\mathbb{C}^*)^2$, where $F = \{x_1 = x_2 = 0\} \cup \{x_3 = x_4 = x_5 = 0\}$ is the bad point set of the $(\mathbb{C}^*)^2$ action. It is easy to see that the ambient space $A$ for the $E_{6,7,8}$ model has the structure of a weighted projective surface bundle over $\mathbb{P}^1$ with fiber $\mathbb{P}^2$, $\mathbb{P}_{1,1,2}$, $\mathbb{P}_{1,2,3}$ respectively. We can now define the $E_9$ almost del Pezzo surface $B_9$ for the $E_{6,7,8}$ model as a hypersurface in $A$ of bidegree $(0, 3)$, $(0, 4)$ and $(0, 6)$ respectively, where the bidegree refers to the $(\lambda, \mu)$ charge of the defining polynomial.
The Kähler classes for $B_9$ induced from those of the ambient space are given by
$$J^{(6)} = \sigma[\delta] + \tau([\delta] + E_7 + E_8 + E_9), \tag{4.4}$$
$$J^{(7)} = \sigma[\delta] + \tau([\delta] + E_8 + E_9), \tag{4.5}$$
$$J^{(8)} = \sigma[\delta] + \tau([\delta] + E_9), \tag{4.6}$$
where we recall that $[\delta]$ is the first Chern class of $B_9$.
The local model of $B_9$ embedded in a Calabi–Yau threefold is described by the total space of the canonical line bundle $K_{B_9}$, which is again realized as a hypersurface in the non-compact toric variety $\mathcal{O}_A(-1, 0)$. From this fact, we can identify the Mori vectors of the local Calabi–Yau models, which play an essential role in the formulation of mirror symmetry [11]:
$$E_6:\quad l_1 = (0; 1, 1, -1, 0, 0, -1), \qquad l_2 = (-3; 0, 0, 1, 1, 1, 0), \tag{4.7}$$
$$E_7:\quad l_1 = (0; 1, 1, -1, 0, 0, -1), \qquad l_2 = (-4; 0, 0, 1, 1, 2, 0), \tag{4.8}$$
$$E_8:\quad l_1 = (0; 1, 1, -1, 0, 0, -1), \qquad l_2 = (-6; 0, 0, 1, 2, 3, 0). \tag{4.9}$$
It will suffice here to point out that the first components of the vectors $l_1$, $l_2$ are (minus) the degrees of the hypersurface, while the rest are the $\mathbb{C}^*$ charges of the homogeneous coordinates of $K_A$, where the last one corresponds to the non-compact direction.
Next we consider the $E_5$ model. The ambient toric variety in this case is a $\mathbb{P}^3$ bundle over $\mathbb{P}^1$. This space admits the quotient realization $(\mathbb{C}^6 - F)/(\mathbb{C}^*)^2$, where the $(\mathbb{C}^*)^2$ action on $\mathbb{C}^6$ is defined by $(x_1, x_2, x_3, x_4, x_5, x_6) \to (\lambda x_1, \lambda x_2, \lambda^{-1}\mu x_3, \mu x_4, \mu x_5, \mu x_6)$, and the bad point set is $F = \{x_1 = x_2 = 0\} \cup \{x_3 = x_4 = x_5 = x_6 = 0\}$. The surface $B_9$ is defined by a complete intersection of two hypersurfaces of bidegree $(0, 2)$. The Kähler class of the $E_5$ model induced from the ambient space turns out to be
$$J^{(5)} = \sigma[\delta] + \tau([\delta] + E_6 + E_7 + E_8 + E_9). \tag{4.10}$$
The Mori vectors in this case can be seen as in the case of the $E_{6,7,8}$ models above:
$$E_5:\quad l_1 = (0, 0; 1, 1, -1, 0, 0, 0, -1), \qquad l_2 = (-2, -2; 0, 0, 1, 1, 1, 1, 0). \tag{4.11}$$
4.2. $E_0$ and $E_{\tilde 1}$ models
We present here the remaining two models, $E_0$ and $E_{\tilde 1}$. Because the ambient spaces for these models are products of projective spaces, rather than a twisted fiber bundle as in the case of the principal series, the toric construction of them may be omitted.
The $E_0$ model is realized as a hypersurface of bidegree $(1, 3)$ in $\mathbb{P}^1 \times \mathbb{P}^2$, while the $E_{\tilde 1}$ model is realized as one of tridegree $(1, 2, 2)$ in $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$. The Kähler classes induced from the ambient spaces are given by
$$J^{(0)} = \sigma[\delta] + \tau l, \tag{4.12}$$
$$J^{(\tilde 1)} = \sigma[\delta] + \tau_1(l - E_1) + \tau_2(l - E_2). \tag{4.13}$$
Note that for the $E_{\tilde 1}$ model, we must put the restriction $\tau_1 = \tau_2 = \tau$ to obtain a two-parameter model.
The Mori vectors of these two models are
$$E_0:\quad l_1 = (-1; 1, 1, 0, 0, 0, -1), \qquad l_2 = (-3; 0, 0, 1, 1, 1, 0), \tag{4.14}$$
$$E_{\tilde 1}:\quad l_1 = (-1; 1, 1, 0, 0, 0, 0, -1), \qquad l_2 = (-2; 0, 0, 1, 1, 0, 0, 0), \qquad l_3 = (-2; 0, 0, 0, 0, 1, 1, 0). \tag{4.15}$$
4.3. Kähler classes
Each of the Kähler classes of the six models obtained above admits two important representations, one of which reveals the blow-up/down scheme of the two-parameter family of $B_9$s, and the other the canonical form (2.5) suggested by the investigation of the 6D E-string compactified on $T^2$ [2, 3]:
$$J^{(0)} = \sigma c_1(B_9) + \frac{\tau}{3}c_1(B_0) = \left(\sigma + \frac32\tau\right)[\delta] - 3\tau[\Lambda_0] - \tau[\omega_8], \tag{4.16}$$
$$J^{(\tilde 1)} = \sigma c_1(B_9) + \frac{\tau}{2}c_1(B_{\tilde 1}) = (\sigma + 2\tau)[\delta] - 4\tau[\Lambda_0] - \tau[\omega_2], \tag{4.17}$$
$$J^{(5)} = \sigma c_1(B_9) + \tau c_1(B_5) = (\sigma + 2\tau)[\delta] - 4\tau[\Lambda_0] - \tau[\omega_5], \tag{4.18}$$
$$J^{(6)} = \sigma c_1(B_9) + \tau c_1(B_6) = \left(\sigma + \frac32\tau\right)[\delta] - 3\tau[\Lambda_0] - \tau[\omega_6], \tag{4.19}$$
$$J^{(7)} = \sigma c_1(B_9) + \tau c_1(B_7) = (\sigma + \tau)[\delta] - 2\tau[\Lambda_0] - \tau[\omega_7], \tag{4.20}$$
$$J^{(8)} = \sigma c_1(B_9) + \tau c_1(B_8) = \left(\sigma + \frac12\tau\right)[\delta] - \tau[\Lambda_0], \tag{4.21}$$
where we recall that $[\delta] = c_1(B_9) = 3l - \sum_{i=1}^{9} E_i$, $E_9 = -[\Lambda_0] - \tfrac12[\delta]$, $B_N$ is the $E_N$ del Pezzo surface with $c_1(B_N) = 3l - \sum_{i=1}^{N} E_i$, $B_0 = \mathbb{P}^2$ with $c_1(B_0) = 3l$, and $B_{\tilde 1} = \mathbb{P}^1 \times \mathbb{P}^1$ with $c_1(B_{\tilde 1}) = 2(2l - E_1 - E_2)$.
The existence of the blow-down of each model to the one-parameter family of the corresponding del Pezzo surface, from which the name of the model comes, leads to the relation between Gromov–Witten invariants of these models, as we shall see below. The appearance of a single fundamental weight in the Wilson line term in each model is also suggestive.
There are a few words to be said on what should be the $E_N$ models for $N = 1, 2, 3, 4$. Their Kähler moduli are expected to be written as
$$J^{(1)} = \sigma c_1(B_9) + \tau c_1(B_1) = (\sigma + 8\tau)[\delta] - 8\tau[\Lambda_0] - \tau([\omega_1] + 2[\omega_8]), \tag{4.22}$$
$$J^{(2)} = \sigma c_1(B_9) + \tau c_1(B_2) = (\sigma + 7\tau)[\delta] - 7\tau[\Lambda_0] - \tau([\omega_2] + [\omega_8]), \tag{4.23}$$
$$J^{(3)} = \sigma c_1(B_9) + \tau c_1(B_3) = (\sigma + 6\tau)[\delta] - 6\tau[\Lambda_0] - \tau[\omega_3], \tag{4.24}$$
$$J^{(4)} = \sigma c_1(B_9) + \tau c_1(B_4) = (\sigma + 5\tau)[\delta] - 5\tau[\Lambda_0] - \tau[\omega_4]. \tag{4.25}$$
In view of the fact that $B_{1,2,3}$ are themselves toric surfaces, we might expect that at least for $N = 1, 2, 3$, the $E_N$ models can be realized as a hypersurface in the toric threefold $B_N \times \mathbb{P}^1$, which we will not pursue further.
4.4. Periods
The solutions to the Picard–Fuchs differential equations around the large radius limit point $(z_1, z_2) = (0, 0)$ can be obtained by the Frobenius method [11]. We first define the formal power series $\Omega_\rho^{(N)}(z_1, z_2)$ from the Mori vectors $l_1$, $l_2$:
$$\Omega_\rho(z_1, z_2) = \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty} A(n_1 + \rho_1, n_2 + \rho_2)\,z_1^{n_1+\rho_1}z_2^{n_2+\rho_2}, \tag{4.26}$$
where $A(n_1, n_2)$ for the $E_0$ and $E_{\tilde 1}$ models are
$$A^{(0)}(n_1, n_2) = \left(\frac{1}{27}\right)^{n_2}\frac{\Gamma(1 + n_1 + 3n_2)}{\Gamma(1 - n_1)\Gamma(1 + n_1)^2\Gamma(1 + n_2)^3}, \tag{4.27}$$
$$A^{(\tilde 1)}(n_1, n_2) = \left(\frac{1}{16}\right)^{n_2}\frac{\Gamma(1 + n_1 + 2n_2)\Gamma(1 + 2n_2)}{\Gamma(1 - n_1)\Gamma(1 + n_1)^2\Gamma(1 + n_2)^4}, \tag{4.28}$$
while for the principal series $E_N$, $N = 5, 6, 7, 8$,
$$A^{(N)}(n_1, n_2) := \frac{\Gamma(\alpha + n_2)\Gamma(\beta + n_2)}{\Gamma(\alpha)\Gamma(\beta)\Gamma(1 + n_2)\Gamma(1 + n_1)^2\Gamma(1 - n_1)\Gamma(1 + n_2 - n_1)}. \tag{4.29}$$
We remark that the $E_{\tilde 1}$ model has been defined as the three-parameter model. The three Mori vectors (4.15) produce the formal power series
$$\Omega_\rho(z_1, w_1, w_2) = \sum_{n_1=0}^{\infty}\sum_{k_1=0}^{\infty}\sum_{k_2=0}^{\infty} A(n_1 + \rho_1, k_1 + \rho_{21}, k_2 + \rho_{22})\,z_1^{n_1+\rho_1}w_1^{k_1+\rho_{21}}w_2^{k_2+\rho_{22}},$$
$$A(n_1, k_1, k_2) = \frac{\Gamma(1 + n_1 + 2k_1 + 2k_2)}{\Gamma(1 - n_1)\Gamma(1 + n_1)^2\Gamma(1 + k_1)^2\Gamma(1 + k_2)^2}.$$
Then the reduction to the two-parameter model is achieved by setting $w_1 = w_2 = z_2/16$; correspondingly, the coefficient of the power series reduces to the one described in (4.28):
$$16^{n_2}A(n_1, n_2) = \sum_{k_1+k_2=n_2} A(n_1, k_1, k_2) = \frac{\Gamma(1 + n_1 + 2n_2)}{\Gamma(1 - n_1)\Gamma(1 + n_1)^2\Gamma(1 + n_2)^2}\sum_{k_1+k_2=n_2}\left(\frac{(n_2)!}{(k_1)!(k_2)!}\right)^{2} = \frac{\Gamma(1 + n_1 + 2n_2)\Gamma(1 + 2n_2)}{\Gamma(1 - n_1)\Gamma(1 + n_1)^2\Gamma(1 + n_2)^4}.$$
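The last step of this reduction rests on the identity $\sum_{k_1+k_2=n}\big(n!/(k_1!\,k_2!)\big)^2 = (2n)!/(n!)^2$. A short check at $n_1 = 0$, where $\Gamma(1 - n_1) = 1$ and every Gamma function is a factorial, can be sketched as follows; the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def coeff_three_parameter(k1, k2):
    """A(0, k1, k2) = (2*k1 + 2*k2)! / (k1!^2 * k2!^2), the n1 = 0 slice."""
    return Fraction(factorial(2 * (k1 + k2)), factorial(k1) ** 2 * factorial(k2) ** 2)

def coeff_two_parameter(n2):
    """A^(1~)(0, n2) = 16^(-n2) * (2*n2)!^2 / (n2!)^4, i.e. (4.28) at n1 = 0."""
    return Fraction(factorial(2 * n2) ** 2, 16 ** n2 * factorial(n2) ** 4)

def reduction_holds(n2):
    """Compare 16^n2 * A(0, n2) with the sum of A(0, k1, k2) over k1 + k2 = n2."""
    total = sum(coeff_three_parameter(k, n2 - k) for k in range(n2 + 1))
    return 16 ** n2 * coeff_two_parameter(n2) == total

print(all(reduction_holds(n) for n in range(12)))  # True
```

For positive integer $n_1$ both sides vanish at $\rho = 0$ through the common factor $1/\Gamma(1 - n_1)$, so the $n_1 = 0$ slice already captures the combinatorial content.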
The computation of the Picard–Fuchs operators can be done in a standard manner once we know the form of $A(n_1, n_2)$ [11]. For the $E_0$ and $E_{\tilde 1}$ models we have
$$D_1^{(0)} = \Theta_1^2 + z_1\Theta_1(\Theta_1 + 3\Theta_2 + 1),$$
$$D_2^{(0)} = 9\Theta_2^2 - 3\Theta_1\Theta_2 - z_2(3\Theta_2 + \Theta_1 + 1)(3\Theta_2 + \Theta_1 + 2) - 3z_1\Theta_1\Theta_2, \tag{4.30}$$
$$D_1^{(\tilde 1)} = \Theta_1^2 + z_1\Theta_1(\Theta_1 + 2\Theta_2 + 1),$$
$$D_2^{(\tilde 1)} = 4\Theta_2^2 - 2\Theta_1\Theta_2 - z_2(2\Theta_2 + 1)(2\Theta_2 + \Theta_1 + 1) - 2z_1\Theta_1\Theta_2, \tag{4.31}$$
and for the principal series $E_N$ with $N = 5, 6, 7, 8$,
$$D_1^{(N)} = \Theta_1^2 - z_1\Theta_1(\Theta_1 - \Theta_2),$$
$$D_2^{(N)} = \Theta_2(\Theta_2 - \Theta_1) - z_2(\Theta_2 + \alpha^{(N)})(\Theta_2 + \beta^{(N)}), \tag{4.32}$$
where $\Theta_1 = z_1\partial_{z_1}$ and $\Theta_2 = z_2\partial_{z_2}$. In passing, we remark here that the Picard–Fuchs system for the principal series gives an Appell–Horn hypergeometric system in four ways according to the choice of the base point $(z_1^{\pm1} = 0, z_2^{\pm1} = 0)$. It would be quite interesting to analyze their monodromies.
The Frobenius method then gives the four solutions of the Picard–Fuchs system (4.30), (4.31), (4.32):
$$\varpi(z_2) = \Omega_\rho(z_1, z_2)|_{\rho=0}, \tag{4.33}$$
$$\varpi_D(z_2) = \partial_{\rho_2}\Omega_\rho(z_1, z_2)|_{\rho=0}, \tag{4.34}$$
$$\phi(z_1, z_2) = \partial_{\rho_1}\Omega_\rho(z_1, z_2)|_{\rho=0}, \tag{4.35}$$
$$\phi_D^{(0)}(z_1, z_2) = \left(\partial_{\rho_1}\partial_{\rho_2} + \frac16\partial_{\rho_2}^2\right)\Omega_\rho^{(0)}(z_1, z_2)\Big|_{\rho=0}, \tag{4.36}$$
$$\phi_D^{(\tilde 1)}(z_1, z_2) = \left(\partial_{\rho_1}\partial_{\rho_2} + \frac14\partial_{\rho_2}^2\right)\Omega_\rho^{(\tilde 1)}(z_1, z_2)\Big|_{\rho=0}, \tag{4.37}$$
$$\phi_D^{(N)}(z_1, z_2) = \left(\partial_{\rho_1}\partial_{\rho_2} + \frac12\partial_{\rho_2}^2\right)\Omega_\rho^{(N)}(z_1, z_2)\Big|_{\rho=0}, \qquad N = 5, 6, 7, 8. \tag{4.38}$$
The first two solutions $\varpi$ and $\varpi_D$ are the same as those of the corresponding torus (3.2), (3.3), which is evident from the structure of the Picard–Fuchs operators above. The third one is given by
$$\phi(z_1, z_2) = \varpi(z_2)\log(z_1) + \xi^{(N)}(z_2) + \sum_{n_1=1}^{\infty} L_{n_1}^{(N)}\varpi(z_2)\,z_1^{n_1}, \tag{4.39}$$
September 21, 2002 14:4 WSPC/148-RMP
00146
Exceptional String: Instanton Expansions and Seiberg–Witten Curve (N )
where ξ (N ) (z2 ) has been defined in (3.20), (3.21) and (3.22) and Ln ential operator defined by L(0) n =
929
is the differ-
n (−1)n Y (3Θ2 + k) , n · n!
(4.40)
n (−1)n Y (2Θ2 + k) , n · n!
(4.41)
k=1
˜
L(n1) =
k=1
) L(N n
n−1 (−1)n Y = (Θ2 − k) , n · n!
N = 5, 6, 7, 8 .
(4.42)
k=0
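The flat coordinates τ, σ are obtained by the mirror map
\[ 2\pi i\tau = \frac{\varpi_D(z_2)}{\varpi(z_2)}\,, \qquad 2\pi i\sigma = \frac{\phi(z_1,z_2)}{\varpi(z_2)}\,. \tag{4.43} \]
Let $c_n := L_n\varpi/\varpi$; then (4.43) with (3.23), (3.24), (3.25) yields
\[ e^{2\pi i\sigma}\,\psi^{(N)}(\tau) = z_1\exp\left(\sum_{n_1=1}^{\infty} c_{n_1}(\tau)\,z_1^{n_1}\right), \tag{4.44} \]
from which we know that the inversion of the mirror map for $z_1$ takes the form
\[ z_1 = \tilde p\left(1 + \sum_{n=1}^{\infty} d_n(\tau)\,\tilde p^{\,n}\right), \tag{4.45} \]
where $\tilde p := e^{2\pi i\sigma}\psi(\tau)$ and $d_n \in \mathbb{Q}[c_1,\ldots,c_n]$; explicitly,
\[
\begin{aligned}
d_1 &= -c_1\,, \qquad d_2 = \frac{3}{2}c_1^2 - c_2\,, \qquad d_3 = -\frac{8}{3}c_1^3 + 4c_1c_2 - c_3\,,\\
d_4 &= \frac{125}{24}c_1^4 - \frac{25}{2}c_1^2c_2 + 5c_1c_3 + \frac{5}{2}c_2^2 - c_4\,, \ \ldots\,.
\end{aligned}
\tag{4.46}
\]
After some tedious calculations, we find that the last solution $\phi_D$ is given by
\[ \phi_D^{(0)}(z_1,z_2) = (2\pi i)^2\,\varpi(z_2)\left(\sigma\tau + \frac{1}{6}\tau^2 - \frac{1}{6}\right) - \sum_{n=1}^{\infty}\frac{f_n(z_2)}{\varpi(z_2)}\,z_1^n\,, \tag{4.47} \]
\[ \phi_D^{(\tilde 1)}(z_1,z_2) = (2\pi i)^2\,\varpi(z_2)\left(\sigma\tau + \frac{1}{4}\tau^2 - \frac{1}{8}\right) - \sum_{n=1}^{\infty}\frac{f_n(z_2)}{\varpi(z_2)}\,z_1^n\,, \tag{4.48} \]
\[ \phi_D^{(N)}(z_1,z_2) = (2\pi i)^2\,\varpi(z_2)\left(\sigma\tau + \frac{1}{2}\tau^2 - \frac{1}{2(9-N)}\right) - \sum_{n=1}^{\infty}\frac{f_n(z_2)}{\varpi(z_2)}\,z_1^n\,, \tag{4.49} \]
where $f_n(z_2)$ is the "higher Wronskian" defined by
\[ f_n(z_2) = -(\varpi L_n\varpi_D - \varpi_D L_n\varpi)(z_2)\,. \tag{4.50} \]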
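The closed forms (4.46) can be cross-checked numerically. The following is only an illustrative sketch, not part of the derivation above: it assumes the Lagrange inversion theorem applied to $\tilde p = z_1\exp(\sum_n c_n z_1^n)$ from (4.44), which gives $d_n = \frac{1}{n+1}[z^n]\exp(-(n+1)\sum_k c_k z^k)$. The helper names `exp_series` and `d_coefficients` and the rational sample values standing in for $c_n(\tau)$ are hypothetical.

```python
from fractions import Fraction


def exp_series(a, order):
    """Coefficients e[0..order] of exp(sum_{k>=1} a[k] z^k); a[0] is ignored."""
    e = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        # From E' = A' E:  n e_n = sum_{k=1}^{n} k a_k e_{n-k}
        e[n] = sum((k * a[k] * e[n - k] for k in range(1, n + 1)), Fraction(0)) / n
    return e


def d_coefficients(c):
    """Given [c_1, ..., c_N], return [d_1, ..., d_N] of z1 = p (1 + sum d_n p^n)."""
    d = []
    for n in range(1, len(c) + 1):
        # Lagrange inversion: d_n = [z^n] exp(-(n+1) S(z)) / (n+1), S = sum c_k z^k
        a = [Fraction(0)] + [-(n + 1) * Fraction(ck) for ck in c]
        d.append(exp_series(a, n)[n] / (n + 1))
    return d


# Arbitrary rational test values standing in for c_1(tau), ..., c_4(tau).
c1, c2, c3, c4 = Fraction(1, 2), Fraction(-1, 3), Fraction(2, 5), Fraction(3, 7)
computed = d_coefficients([c1, c2, c3, c4])

# Closed forms quoted in (4.46).
closed_forms = [
    -c1,
    Fraction(3, 2) * c1**2 - c2,
    -Fraction(8, 3) * c1**3 + 4 * c1 * c2 - c3,
    Fraction(125, 24) * c1**4 - Fraction(25, 2) * c1**2 * c2
    + 5 * c1 * c3 + Fraction(5, 2) * c2**2 - c4,
]
print(computed == closed_forms)
```

Exact rational arithmetic is used so that agreement with (4.46) is an identity check rather than a floating-point comparison.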
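It must be noted here that a crucial ingredient in obtaining the concise expression for $\phi_D$ above is the following formula of combinatorial nature:
\[ \left(\sum_{n=1}^{\infty} a(n)k(n)z_2^n\right)\left(\sum_{n=0}^{\infty} a(n)z_2^n\right) = \left(\sum_{n=1}^{\infty} a(n)b(n)z_2^n\right)\left(\sum_{n=1}^{\infty} a(n)s(n)z_2^n\right), \tag{4.51} \]
where s(n) and k(n) are defined by
\[ s(n) = \sum_{l=0}^{n-1}\left(\frac{1}{l+\alpha} + \frac{1}{l+\beta}\right), \qquad k(n) = s(n)\left(s(n) - 2\sum_{l=0}^{n-1}\frac{1}{l+1}\right) - \sum_{l=0}^{n-1}\left(\frac{1}{(l+\alpha)^2} + \frac{1}{(l+\beta)^2}\right). \]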
The instanton part of the genus zero prepotential $F_0$ of the model is defined by
\[ \frac{1}{2\pi i}\frac{\partial F_0}{\partial\sigma} = h\sum_{n=1}^{\infty}\frac{f_n(z_2)}{\varpi(z_2)^2}\,z_1^n\,, \tag{4.52} \]
where $(h^{(0)}, h^{(\tilde 1)}, h^{(5)}, h^{(6)}, h^{(7)}, h^{(8)}) = (3, 4, 4, 3, 2, 1)$ is the normalization factor, which may be found, for example, by computation of the classical central charge of a D-brane system corresponding to a coherent sheaf F with the Kähler class $J^{(N)}$ [23]:
\[ Z^{\rm class}(F) = -\exp(-J^{(N)})\cdot{\rm ch}(F)\cdot\left([B_9] + \frac{1}{2}[\delta] + \frac{1}{2}[{\rm pt}]\right). \]
4.5. Relation to Seiberg–Witten periods
In this subsection we show how to realize the two periods $\phi$ and $\phi_D$, which have been obtained as the solutions of the Picard–Fuchs differential equations, as the Seiberg–Witten integrals of the 6D non-critical string theory [1] compactified on $T^2$ to 4D [2, 3, 5, 6].
Let $\phi_0(z_1,z_2)$ be a solution of the Picard–Fuchs system (4.30), (4.31), or (4.32). It is easy to see that if $\Theta_1\phi_0 = 0$, then $D_1\phi_0 = 0$ is automatic and $D_2\phi_0 = 0$ reduces to the Picard–Fuchs equation for the corresponding elliptic curve, $D_{\rm ell}\phi_0 = 0$; hence $\phi_0$ is a linear combination of $\varpi$ and $\varpi_D$. Consider the case $\Theta_1\phi_0 \neq 0$. The first equation $D_1\phi_0 = 0$ gives a constraint on the functional form of $\Theta_1\phi_0$:
\[ E_0:\quad \Theta_1\phi_0(z_1,z_2) = \frac{1}{1+z_1}\,\omega(\tilde z_2)\,, \qquad \tilde z_2 := \frac{z_2}{(1+z_1)^3}\,, \tag{4.53} \]
\[ E_{\tilde 1}:\quad \Theta_1\phi_0(z_1,z_2) = \frac{1}{1+z_1}\,\omega(\tilde z_2)\,, \qquad \tilde z_2 := \frac{z_2}{(1+z_1)^2}\,, \tag{4.54} \]
\[ E_{5,6,7,8}:\quad \Theta_1\phi_0(z_1,z_2) = \omega(\tilde z_2)\,, \qquad \tilde z_2 := z_2(1-z_1)\,. \tag{4.55} \]
Upon the action of $\Theta_1$, the second equation $D_2\phi_0 = 0$ reduces to the Picard–Fuchs equation of the corresponding torus in terms of the variable $\tilde z_2$:
\[ E_0:\quad \Theta_1 D_2\phi_0(z_1,z_2) = 3D_{\rm ell}^{(6)}\,\omega(\tilde z_2)\,, \tag{4.56} \]
\[ E_{\tilde 1}:\quad \Theta_1 D_2\phi_0(z_1,z_2) = 4D_{\rm ell}^{(5)}\,\omega(\tilde z_2)\,, \tag{4.57} \]
\[ E_{5,6,7,8}:\quad \Theta_1 D_2\phi_0(z_1,z_2) = \frac{1}{1-z_1}\,D_{\rm ell}^{(N)}\,\omega(\tilde z_2)\,, \tag{4.58} \]
that is, $\omega(\tilde z_2)$ is a period of the elliptic curve. We have thus arrived at the representation of the general solution $\phi_0$ of the Picard–Fuchs system in terms of a period $\omega$ of the fiber torus, which is closely related to the Seiberg–Witten periods:
\[ E_0:\quad \phi_0(z_1,z_2) = \int^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,\omega\!\left(\frac{z_2}{(1+z_1)^3}\right) + c(z_2)\,, \tag{4.59} \]
\[ E_{\tilde 1}:\quad \phi_0(z_1,z_2) = \int^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,\omega\!\left(\frac{z_2}{(1+z_1)^2}\right) + c(z_2)\,, \tag{4.60} \]
\[ E_{5,6,7,8}:\quad \phi_0(z_1,z_2) = \int^{z_1}\frac{dz_1}{z_1}\,\omega(z_2 - z_1z_2) + c(z_2)\,, \tag{4.61} \]
where $c(z_2)$ is a counterterm to ensure $D_2\phi_0 = 0$. Curiously, the deformation of the single-variable function $\omega(z_2)$ to the ones appearing on the right-hand sides of (4.53), (4.54) and (4.55) admits the following description in terms of the differential operators $L_n^{(N)}$, respectively:
\[ \frac{1}{1+z_1}\,\omega\!\left(\frac{z_2}{(1+z_1)^3}\right) = \left(1 + \sum_{n=1}^{\infty} n z_1^n L_n^{(0)}\right)\omega(z_2)\,, \tag{4.62} \]
\[ \frac{1}{1+z_1}\,\omega\!\left(\frac{z_2}{(1+z_1)^2}\right) = \left(1 + \sum_{n=1}^{\infty} n z_1^n L_n^{(\tilde 1)}\right)\omega(z_2)\,, \tag{4.63} \]
\[ \omega(z_2 - z_1z_2) = \left(1 + \sum_{n=1}^{\infty} n z_1^n L_n^{(N)}\right)\omega(z_2)\,, \qquad N = 5, 6, 7, 8\,. \tag{4.64} \]
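On a monomial $\omega(z_2) = z_2^s$ the operator $\Theta_2$ acts as multiplication by $s$, so (4.62), (4.63) and (4.64) reduce to binomial expansions of $(1+z_1)^{-3s-1}$, $(1+z_1)^{-2s-1}$ and $(1-z_1)^s$. The short sketch below checks this coefficient by coefficient using the definitions (4.40), (4.41), (4.42); the function names, the model labels and the sample exponent $s$ are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction as F
from math import factorial


def n_times_L(n, s, model):
    """Eigenvalue of n * L_n on z2**s, from (4.40)-(4.42) with Theta_2 -> s."""
    prod = F(1)
    if model == "E0":
        for k in range(1, n + 1):
            prod *= 3 * s + k
    elif model == "E1tilde":
        for k in range(1, n + 1):
            prod *= 2 * s + k
    else:  # principal series E_N, N = 5, ..., 8
        for k in range(n):
            prod *= s - k
    return (-1) ** n * prod / factorial(n)


def binom(x, n):
    """Generalised binomial coefficient C(x, n) for rational x."""
    out = F(1)
    for j in range(n):
        out *= x - j
    return out / factorial(n)


s = F(5, 7)  # arbitrary sample exponent
# (4.62): coefficient of z1^n in (1 + z1)^(-3s-1)
ok_E0 = all(n_times_L(n, s, "E0") == binom(-3 * s - 1, n) for n in range(1, 8))
# (4.63): coefficient of z1^n in (1 + z1)^(-2s-1)
ok_E1 = all(n_times_L(n, s, "E1tilde") == binom(-2 * s - 1, n) for n in range(1, 8))
# (4.64): coefficient of z1^n in (1 - z1)^s
ok_EN = all(n_times_L(n, s, "EN") == (-1) ** n * binom(s, n) for n in range(1, 8))
print(ok_E0, ok_E1, ok_EN)
```

Since any power series in $z_2$ is a combination of monomials, agreement on $z_2^s$ for generic $s$ is the content of the three identities.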
For the period $\phi$, we obtain the following integral formulas: for the $E_0$ and $E_{\tilde 1}$ models,
\[ E_0:\quad \phi^{(0)}(z_1,z_2) = \int_{\epsilon}^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,\varpi^{(6)}\!\left(\frac{z_2}{(1+z_1)^3}\right) + \xi^{(0)}(z_2)\,, \tag{4.65} \]
\[ E_{\tilde 1}:\quad \phi^{(\tilde 1)}(z_1,z_2) = \int_{\epsilon}^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,\varpi^{(5)}\!\left(\frac{z_2}{(1+z_1)^2}\right) + \xi^{(\tilde 1)}(z_2)\,, \tag{4.66} \]
and for the principal series
\[ E_{5,6,7,8}:\quad \phi^{(N)}(z_1,z_2) = \int_{\epsilon}^{z_1}\frac{dz_1}{z_1}\,\varpi^{(N)}(z_2 - z_1z_2) + \xi^{(N)}(z_2) = \int_{1}^{z_1}\frac{dz_1}{z_1}\,\varpi^{(N)}(z_2 - z_1z_2)\,, \tag{4.67} \]
where we must discard a $\log(\epsilon)$ term before taking the limit $\epsilon \to 0$. The last equation implies that for the principal series, $\phi$ is a vanishing period at $z_1 = 1$.
Let $\tilde\tau$ be the coupling constant of the Seiberg–Witten theory with the bare parameters $(z_1, z_2)$. It is given by the deformed mirror map
\[ 2\pi i\,\tilde\tau = \frac{\varpi_D(\tilde z_2)}{\varpi(\tilde z_2)}\,, \tag{4.68} \]
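where $\tilde z_2 = \tilde z_2(z_1,z_2)$ is given in (4.53), (4.54), and (4.55) respectively. We can also show, using (4.62), (4.63), (4.64), that the instanton part of the period $\phi_D$ can be written as a Seiberg–Witten period, where we have no need to introduce the cut-off parameter $\epsilon$, in contrast to the case of $\phi$:
\[ E_0:\quad -\frac{1}{2\pi i}\sum_{n=1}^{\infty}\left(\frac{f_n^{(0)}}{\varpi^{(6)}}\right)(z_2)\,z_1^n = \int_0^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,(\tilde\tau - \tau)\,\varpi^{(6)}\!\left(\frac{z_2}{(1+z_1)^3}\right), \tag{4.69} \]
\[ E_{\tilde 1}:\quad -\frac{1}{2\pi i}\sum_{n=1}^{\infty}\left(\frac{f_n^{(\tilde 1)}}{\varpi^{(5)}}\right)(z_2)\,z_1^n = \int_0^{z_1}\frac{dz_1}{z_1}\,\frac{1}{1+z_1}\,(\tilde\tau - \tau)\,\varpi^{(5)}\!\left(\frac{z_2}{(1+z_1)^2}\right), \tag{4.70} \]
\[ E_{5,6,7,8}:\quad -\frac{1}{2\pi i}\sum_{n=1}^{\infty}\left(\frac{f_n^{(N)}}{\varpi^{(N)}}\right)(z_2)\,z_1^n = \int_0^{z_1}\frac{dz_1}{z_1}\,(\tilde\tau - \tau)\,\varpi^{(N)}(z_2 - z_1z_2)\,. \tag{4.71} \]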
Note that in particular (4.71) for the $E_8$ model corresponds to the formula [6, (3.5)] obtained by direct evaluation of the Seiberg–Witten periods from the curve (9.6). The integration variable v there satisfies $E_4(\tilde\tau)^3/E_6(\tilde\tau)^2 = E_4(\tau)^3/(E_6(\tau) - v)^2$, from which we see the correspondence of the bare variables
\[ v := \frac{27}{\pi^6}\,\frac{1}{u} = -2E_4(\tau)^{3/2}z_2z_1 = \left[E_6(\tau) - E_4(\tau)^{3/2}\right]z_1\,. \]
5. Genus Zero Partition Functions
5.1. Recursion relations
In order to obtain the instanton expansion of the genus zero potential $F_0$, we have to convert the two sequences of functions of $z_2$,
\[ c_n = \frac{L_n\varpi}{\varpi}\,, \qquad f_n = -(\varpi L_n\varpi_D - \varpi_D L_n\varpi)\,, \]
into modular functions of $\Gamma_{\rm ell}$, which is achieved by finding the recursion relations for $\{c_n\}$ and $\{f_n\}$. To this end, let us make the ansatz $c_n(\tau) = B_n e_2(\tau) + D_n$, with $e_{2k} := E_{2k}\varpi^{-2k}$. It will become clear from the recursion relations that $B_n$, $D_n$, $f_n$ are degree n polynomials in y, where $y = (1 - z_2)^{-1}$; in particular they are all anomaly-free modular functions of $\Gamma_{\rm ell}$. Equation (3.19) enables us to evaluate the following logarithmic derivatives, which are indispensable for establishing the recursion relations:
\[ \Theta_2\varpi = \frac{\varpi}{12}\left\{y e_2 - [(10-N) - (9-N)y]\right\}, \tag{5.1} \]
\[ \Theta_2(e_2\varpi) = \frac{\varpi}{12}\left\{-y e_4 + e_2[(10-N) - (9-N)y]\right\}, \tag{5.2} \]
where $\varpi$ is the fundamental period of the $E_N$ elliptic curves with N = 5, 6, 7, 8.
Principal series. For them, from the relation $(n+1)^2L_{n+1} = -n(\Theta_2 - n)L_n$, we get
\[ \varpi c_{n+1} = -\frac{n}{(n+1)^2}(\Theta_2 - n)(\varpi c_n)\,. \]
Then Eqs. (5.1) and (5.2) yield the recursion relation for $\{c_n\}$:
\[ \begin{pmatrix} B_{n+1}\\ D_{n+1}\end{pmatrix} = -\frac{n}{(n+1)^2}\left[\,y(y-1)\frac{d}{dy} + \begin{pmatrix} D_1 - n & -B_1\\ e_4B_1 & -D_1 - n\end{pmatrix}\right]\begin{pmatrix} B_n\\ D_n\end{pmatrix}, \]
where $e_4 = (16 - 16y^{-1} + y^{-2})$, $(9 - 8y^{-1})$, $(4 - 3y^{-1})$, $1$ for N = 5, 6, 7, 8 respectively. Note that we have replaced $\Theta_2$ by $y(y-1)\,d/dy$ in the recursion relation above because $B_n$ and $D_n$ depend on $z_2$ only through y. Since $c_1 = -\Theta_2\varpi/\varpi$, the first term $(B_1, D_1)$ can be read off immediately from (5.1):
\[ B_1 = -\frac{y}{12}\,, \qquad D_1 = \frac{1}{12}[(10-N) - (9-N)y]\,. \]
The recursion relation for $\{f_n\}$ can be obtained in a similar manner:
\[ f_{n+1} = -\frac{n}{(n+1)^2}\left[\left(y(y-1)\frac{d}{dy} - n + c_1\right)f_n - f_1c_n\right], \qquad f_1 = y\,. \]
Furthermore, $\{f_n\}$ and $\{c_n = B_ne_2 + D_n\}$ are related to each other by
\[ B_n = -\frac{1}{12}f_n\,, \qquad D_n = \frac{1}{y}\left[\frac{(n+1)^2}{n}f_{n+1} + \left(y(y-1)\frac{d}{dy} - n + D_1\right)f_n\right]. \]
Here we list the first few elements of $\{(B_n, D_n)\}$ only for the $E_8$ model:
\[
\begin{aligned}
B_1 &= -\frac{1}{12}y\,, & D_1 &= \frac{1}{12}(2 - y)\,,\\
B_2 &= \frac{1}{48}y(y-2)\,, & D_2 &= \frac{1}{144}(7 - 7y + 3y^2)\,,\\
B_3 &= -\frac{1}{7776}y(211 - 211y + 72y^2)\,, & D_3 &= -\frac{1}{7776}(y-2)(72y^2 - 91y + 91)\,,\\
B_4 &= \frac{1}{10368}y(y-2)(54y^2 - 103y + 103)\,, & D_4 &= \frac{1}{124416}(1729 - 3458y + 4477y^2 - 2748y^3 + 648y^4)\,.
\end{aligned}
\]
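The principal-series recursion for $(B_n, D_n)$ is easy to iterate by machine. The sketch below is an illustration for the $E_8$ case only, under the assumptions $e_4 = 1$, $B_1 = -y/12$, $D_1 = (2-y)/12$ and $\Theta_2 = y(y-1)\,d/dy$ taken from the text; polynomials in y are stored as coefficient lists in ascending powers, and all helper names are hypothetical.

```python
from fractions import Fraction as F


def trim(p):
    """Drop trailing zero coefficients (list index = power of y)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(*polys):
    n = max(len(p) for p in polys)
    return trim([sum((p[k] for p in polys if k < len(p)), F(0)) for k in range(n)])


def mul(p, q):
    if not p or not q:
        return []
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def scale(s, p):
    return trim([F(s) * a for a in p])


def theta2(p):
    """Theta_2 = y(y-1) d/dy acting on a polynomial in y."""
    dp = [k * p[k] for k in range(1, len(p))]
    return mul([F(0), F(-1), F(1)], dp)


# E_8 input data: B_1 = -y/12, D_1 = (2 - y)/12, e_4 = 1.
B1 = [F(0), F(-1, 12)]
D1 = [F(1, 6), F(-1, 12)]


def step(n, B, D):
    """(B_n, D_n) -> (B_{n+1}, D_{n+1}) with matrix ((D1-n, -B1), (e4*B1, -D1-n))."""
    pref = F(-n, (n + 1) ** 2)
    new_B = add(theta2(B), mul(add(D1, [F(-n)]), B), scale(-1, mul(B1, D)))
    new_D = add(theta2(D), mul(B1, B), mul(add(scale(-1, D1), [F(-n)]), D))
    return scale(pref, new_B), scale(pref, new_D)


B, D = B1, D1
table = {1: (B, D)}
for n in range(1, 4):
    B, D = step(n, B, D)
    table[n + 1] = (B, D)

print(table[2])  # coefficient lists of B_2 and D_2 in ascending powers of y
```

For the other members of the principal series one would additionally multiply $B_1B_n$ by the appropriate $e_4(y)$, which contains inverse powers of y and therefore needs a Laurent-polynomial representation.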
$E_{\tilde 1}$ model. The procedure to get the recursion relations for $\{c_n\}$ and $\{f_n\}$ is similar to the case of the principal series. In this case the $E_5$ torus is relevant. Using $(n+1)^2L_{n+1} = -n(2\Theta_2 + n + 1)L_n$ we get the recursion relation for $\{c_n\}$,
\[ \begin{pmatrix} B_{n+1}\\ D_{n+1}\end{pmatrix} = -\frac{n}{(n+1)^2}\left[\,2y(y-1)\frac{d}{dy} + \begin{pmatrix} D_1 + n + 2 & -B_1\\ e_4B_1 & -D_1 + n\end{pmatrix}\right]\begin{pmatrix} B_n\\ D_n\end{pmatrix}, \]
with $(B_1, D_1) = (-y/6, -(4y+1)/6)$ and $e_4 = (16 - 16y^{-1} + y^{-2})$. The recursion relation for $\{f_n\}$ reads
\[ f_{n+1} = -\frac{n}{(n+1)^2}\left[\left(2y(y-1)\frac{d}{dy} + 2 + n + c_1\right)f_n - f_1c_n\right], \qquad f_1 = 2y\,. \]
The relation between $\{f_n\}$ and $\{c_n\}$ is
\[ B_n = -\frac{1}{12}f_n\,, \qquad D_n = \frac{1}{2y}\left[\frac{(n+1)^2}{n}f_{n+1} + \left(2y(y-1)\frac{d}{dy} + 2 + n + D_1\right)f_n\right]. \]
We list the first few members:
\[
\begin{aligned}
B_1 &= -\frac{1}{6}y\,, & D_1 &= -\frac{1}{6}(1 + 4y)\,,\\
B_2 &= \frac{1}{24}y(2y+1)\,, & D_2 &= \frac{1}{24}(8y^2 + 1)\,,\\
B_3 &= -\frac{1}{108}y(8y^2 + y + 2)\,, & D_3 &= -\frac{1}{108}(1 + 4y)(8y^2 - 5y + 2)\,,\\
B_4 &= \frac{1}{288}y(24y^3 - 4y^2 + 2y + 3)\,, & D_4 &= \frac{1}{288}(96y^4 - 64y^3 + 7y^2 + 5y + 3)\,.
\end{aligned}
\]
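$E_0$ model. This case has been analyzed in [9], which we briefly repeat here for convenience. Recall that the underlying torus model is $E_6$. First, the relation among the operators $(n+1)^2L_{n+1} = -n(3\Theta_2 + n + 1)L_n$ yields
\[ \begin{pmatrix} B_{n+1}\\ D_{n+1}\end{pmatrix} = -\frac{n}{(n+1)^2}\left[\,3y(y-1)\frac{d}{dy} + \begin{pmatrix} D_1 + n + 2 & -B_1\\ e_4B_1 & -D_1 + n\end{pmatrix}\right]\begin{pmatrix} B_n\\ D_n\end{pmatrix}, \]
with $(B_1, D_1) = (-y/4, -3y/4)$ and $e_4 = (9 - 8y^{-1})$. The recursion relation for $\{f_n\}$ becomes
\[ f_{n+1} = -\frac{n}{(n+1)^2}\left[\left(3y(y-1)\frac{d}{dy} + n + 2 + c_1\right)f_n - c_nf_1\right], \qquad f_1 = 3y\,, \]
with the relation between $\{f_n\}$ and $\{c_n\}$
\[ B_n = -\frac{1}{12}f_n\,, \qquad D_n = \frac{1}{3y}\left[\frac{(n+1)^2}{n}f_{n+1} + \left(3y(y-1)\frac{d}{dy} + n + 2 + D_1\right)f_n\right]. \]
The first few members of $\{(B_n, D_n)\}$ are
\[
\begin{aligned}
B_1 &= -\frac{1}{4}y\,, & D_1 &= -\frac{3}{4}y\,,\\
B_2 &= \frac{3}{16}y^2\,, & D_2 &= \frac{1}{16}y(9y - 4)\,,\\
B_3 &= -\frac{1}{72}y^2(18y - 7)\,, & D_3 &= -\frac{1}{72}y(54y^2 - 45y + 4)\,,\\
B_4 &= \frac{1}{192}y^2(27y - 2)(3y - 2)\,, & D_4 &= \frac{1}{192}y^2(243y^2 - 288y + 68)\,.
\end{aligned}
\]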
Finally, we have observed for any of the six models that
\[ \frac{\partial c_n(\tau)}{\partial E_2(\tau)} = -\frac{1}{12}\,\frac{f_n(y)}{\varpi(z_2)^2}\,, \tag{5.3} \]
which turns out to be of fundamental importance both for the instanton expansion and for the investigation of the modular anomaly equations of the genus zero and one Gromov–Witten potentials below.
5.2. Instanton expansion
The instanton expansion of the genus zero potential $F_0$ is obtained by converting $z_1$ and $z_2$ in $F_0$ to functions of $q := e^{2\pi i\tau}$ and $p := e^{2\pi i\sigma}$. We define the genus zero Gromov–Witten invariant $N_{0;n,m} \in \mathbb{Q}$ of bidegree (n, m) by the q-expansion of $Z_{0;n}(\tau)$:
\[ Z_{0;n}(\tau) = \sum_{m=0}^{\infty} N_{0;n,m}\,q^m\,. \tag{5.4} \]
We find a useful expression for the genus zero potential:
\[ \frac{1}{2\pi i}\frac{\partial F_0}{\partial\sigma} = \Theta_pF_0 = 12h\,\frac{\partial z_1}{\partial E_2}\,\frac{1}{\tilde p}\,\frac{\partial\tilde p}{\partial z_1}\,, \tag{5.5} \]
which follows from the fact that $\tilde p$ does not depend on $E_2$,
\[ \frac{d\tilde p}{dE_2} = \frac{\partial z_1}{\partial E_2}\frac{\partial\tilde p}{\partial z_1} + \sum_{n=1}^{\infty}\frac{\partial c_n}{\partial E_2}\frac{\partial\tilde p}{\partial c_n} = 0\,, \]
because $\tilde p = p\,\psi(\tau)$ with $\psi(\tau)$ anomaly-free, and from the substitution of the identity (5.3). Then we see from the expansion of the right hand side of (5.5) that the general form of the partition function $Z_{0;n}(\tau)$ reads
\[ Z_{0;1} = (9 - N)\,\frac{y\psi}{\varpi^2}\,, \qquad N = 0, \tilde 1, 5, 6, 7, 8\,, \tag{5.6} \]
\[ Z_{0;n} = (Z_{0;1})^n\,\varpi^{2(n-1)}\,P_{0;n-1}(e_2, y^{-1})\,, \tag{5.7} \]
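where $P_{0;n-1}$ is a degree n − 1 polynomial in $e_2 = E_2(\tau)/\varpi(z_2)^2$ and $y^{-1} = 1 - z_2$. The relation between $Z_{0;1}$ and the special value of the $E_8$ theta function prescribed by the Kähler class $J^{(N)}$ defined in (4.17)–(4.21) will be discussed later. As space is limited, we give only the first four terms of the p-expansions of (5.5):
\[
\begin{aligned}
\Theta_pF_0^{(0)} ={}& 9\,\frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{9}{4}\,e_2 + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{32}\left(27e_2^2 + 45 - \frac{40}{y}\right)\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{32}\left[12e_2^3 + e_2\left(45 - \frac{40}{y}\right) - 27 + \frac{36}{y} - \frac{8}{y^2}\right],
\end{aligned}
\]
\[
\begin{aligned}
\Theta_pF_0^{(\tilde 1)} ={}& 8\,\frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{2}{3}\left(2e_2 + 2 - \frac{1}{y}\right) + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{9}\left[3e_2^2 + e_2\left(6 - \frac{3}{y}\right) + 8 - \frac{8}{y} + \frac{2}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{162}\left[16e_2^3 + e_2^2\left(48 - \frac{24}{y}\right) + e_2\left(108 - \frac{108}{y} + \frac{27}{y^2}\right) + 40 - \frac{60}{y} + \frac{48}{y^2} - \frac{14}{y^3}\right],
\end{aligned}
\]
\[
\begin{aligned}
\Theta_pF_0^{(5)} ={}& 4\,\frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{1}{3}\left(e_2 + 1 + \frac{1}{y}\right) + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{72}\left[3e_2^2 + e_2\left(6 + \frac{6}{y}\right) + 8 + \frac{4}{y} + \frac{5}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{648}\left[4e_2^3 + e_2^2\left(12 + \frac{12}{y}\right) + e_2\left(27 + \frac{18}{y} + \frac{18}{y^2}\right) + 10 + \frac{39}{y} + \frac{12}{y^2} + \frac{10}{y^3}\right],
\end{aligned}
\]
\[
\begin{aligned}
\Theta_pF_0^{(6)} ={}& 3\,\frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{1}{4}\left(e_2 + \frac{2}{y}\right) + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{864}\left[27e_2^2 + \frac{108}{y}\,e_2 + 45 - \frac{4}{y} + \frac{112}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{2592}\left[12e_2^3 + \frac{72}{y}\,e_2^2 + e_2\left(45 - \frac{4}{y} + \frac{148}{y^2}\right) - 27 + \frac{144}{y} - \frac{8}{y^2} + \frac{104}{y^3}\right],
\end{aligned}
\]
\[
\begin{aligned}
\Theta_pF_0^{(7)} ={}& 2\,\frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{1}{6}\left(e_2 - 1 + \frac{3}{y}\right) + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{288}\left[6e_2^2 - e_2\left(12 - \frac{36}{y}\right) + 16 - \frac{33}{y} + \frac{51}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{2592}\left[8e_2^3 - e_2^2\left(24 - \frac{72}{y}\right) + e_2\left(54 - \frac{135}{y} + \frac{207}{y^2}\right) - 56 + \frac{189}{y} - \frac{180}{y^2} + \frac{189}{y^3}\right],
\end{aligned}
\]
\[
\begin{aligned}
\Theta_pF_0^{(8)} ={}& \frac{y\tilde p}{\varpi^2} + \frac{(y\tilde p)^2}{\varpi^2}\,\frac{1}{12}\left(e_2 - 2 + \frac{4}{y}\right) + \frac{(y\tilde p)^3}{\varpi^2}\,\frac{1}{2592}\left[27e_2^2 - e_2\left(108 - \frac{216}{y}\right) + 153 - \frac{394}{y} + \frac{394}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{\varpi^2}\,\frac{1}{7776}\left[12e_2^3 - e_2^2\left(72 - \frac{144}{y}\right) + e_2\left(189 - \frac{538}{y} + \frac{538}{y^2}\right) - 213 + \frac{734}{y} - \frac{924}{y^2} + \frac{616}{y^3}\right].
\end{aligned}
\]
At this point it would be helpful to mention in advance the general form of the higher genus Gromov–Witten partition functions $Z_{g;n}(\tau)$, which is indeed predicted from the modular anomaly Eq. (2.17),
\[ Z_{g;n} = (Z_{0;1})^n\,\varpi^{2(g+n-1)}\,P_{g;g+n-1}(e_2, y^{-1})\,, \tag{5.8} \]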
where Pg;g+n−1 is a degree g + n − 1 polynomial in e2 and y −1 . 5.3. Partition functions and modular forms It is of primary importance to ensure the relation between the singly wound partition function Z0;1 (τ ) and the E 8 theta function (2.14) as a consistency check of our formalism. The specializations of the E 8 theta function to the K¨ahler moduli of the E ˜1 and E 5 models, for example, are calculated as follows: ! 8 X τ 8 τ 1 −3 X ϑa 4τ − ϑb 4τ E ˜1 : ΘE 8 (4τ |ω2 τ ) = q 2 2 2 2 a=2,3 b=1,4
1 = q −1 2
E5 :
1 ΘE 8 (4τ |ω5 τ ) = q −1 2
X
! ϑa (4τ ) ϑa (4τ |τ ) − ϑ4 (4τ ) ϑ4 (4τ |τ ) 2
6
2
6
,
a=2,3
X
! ϑa (4τ ) ϑa (4τ |τ ) − ϑ4 (4τ ) ϑ4 (4τ |τ ) 4
4
4
4
.
a=2,3
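We find that $Z_{0;1}$ can indeed be written in terms of the $E_8$ theta function with the prescribed Kähler class $J^{(N)}$ as predicted in (2.14), in addition to the fact that it admits a concise description by the Dedekind eta function:
\[ Z_{0;1}^{(0)}(\tau) = 9\left(\frac{y\psi}{\varpi^2}\right)^{(0)}(z_2) = 9q^{\frac{1}{6}}\,\frac{1}{\eta(\tau)^4} = Z_{0;1}(3\tau|\omega_8\tau)\,, \tag{5.9} \]
\[ Z_{0;1}^{(\tilde 1)}(\tau) = 8\left(\frac{y\psi}{\varpi^2}\right)^{(\tilde 1)}(z_2) = 8q^{\frac{1}{4}}\,\frac{1}{\eta(\tau)^2\eta(2\tau)^2} = Z_{0;1}(4\tau|\omega_2\tau)\,, \tag{5.10} \]
\[ Z_{0;1}^{(5)}(\tau) = 4\left(\frac{y\psi}{\varpi^2}\right)^{(5)}(z_2) = 4q^{\frac{1}{2}}\,\frac{\varpi^{(5)}(z_2)}{\eta(2\tau)^6} = Z_{0;1}(4\tau|\omega_5\tau)\,, \tag{5.11} \]
\[ Z_{0;1}^{(6)}(\tau) = 3\left(\frac{y\psi}{\varpi^2}\right)^{(6)}(z_2) = 3q^{\frac{1}{2}}\,\frac{\varpi^{(6)}(z_2)}{\eta(\tau)^3\eta(3\tau)^3} = Z_{0;1}(3\tau|\omega_6\tau)\,, \tag{5.12} \]
\[ Z_{0;1}^{(7)}(\tau) = 2\left(\frac{y\psi}{\varpi^2}\right)^{(7)}(z_2) = 2q^{\frac{1}{2}}\,\frac{\varpi^{(7)}(z_2)^2}{\eta(\tau)^4\eta(2\tau)^4} = Z_{0;1}(2\tau|\omega_7\tau)\,, \tag{5.13} \]
\[ Z_{0;1}^{(8)}(\tau) = \left(\frac{y\psi}{\varpi^2}\right)^{(8)}(z_2) = q^{\frac{1}{2}}\,\frac{\varpi^{(8)}(z_2)^4}{\eta(\tau)^{12}} = Z_{0;1}(\tau|0)\,. \tag{5.14} \]
$Z_{0;1}(\tau)$ of the $E_3$ and the $E_4$ models, calculated from the $E_8$ theta function using (2.14), can also be expressed by the eta functions:
\[ E_3:\quad \frac{\Theta_{E_8}(6\tau|\omega_3\tau)}{\varphi(6\tau)^{12}} = \frac{6q^{\frac{1}{2}}}{\eta(\tau)\eta(2\tau)\eta(3\tau)\eta(6\tau)}\,, \tag{5.15} \]
\[ E_4:\quad \frac{\Theta_{E_8}(5\tau|\omega_4\tau)}{\varphi(5\tau)^{12}} = \frac{5q^{\frac{1}{2}}}{\eta(\tau)^2\eta(5\tau)^2}\,. \tag{5.16} \]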
As for partition functions of multiple winding number, we give their expressions in terms of modular forms only for the $E_7$ and $E_8$ models; the latter were originally obtained in [6].
$E_7$ model. Let $\chi := Z_{0;1}^{(7)}/A = 2q^{1/2}/(\eta(\tau)\eta(2\tau))^4$, see (7.1).
\[
\begin{aligned}
Z_{0;1}^{(7)} ={}& \chi A\,,\\
Z_{0;2}^{(7)} ={}& \frac{\chi^2}{48}\,A(AE_2 - A^2 + 3B)\,,\\
Z_{0;3}^{(7)} ={}& \frac{\chi^3}{6912}\,A(6E_2^2A^2 - 12E_2A^3 + 36E_2AB + 16A^4 - 33A^2B + 51B^2)\,,\\
Z_{0;4}^{(7)} ={}& \frac{\chi^4}{165888}\,A(-56A^6 + 189A^4B - 180B^2A^2 + 189B^3 + 8E_2^3A^3 - 24E_2^2A^4\\
&+ 72E_2^2A^2B + 54E_2A^5 - 135E_2A^3B + 207E_2AB^2)\,,\\
Z_{0;5}^{(7)} ={}& \frac{\chi^5}{1990656000}\,A(123328A^8 - 514272A^6B + 858987A^4B^2 + 6250A^4E_2^4 - 25000A^5E_2^3\\
&+ 406215B^4 - 585000A^3E_2B^2 + 499500A^5E_2B + 607500AE_2B^3 - 213750A^4E_2^2B\\
&+ 326250A^2E_2^2B^2 - 508680A^2B^3 - 136000A^7E_2 + 75000A^3E_2^3B + 75000A^6E_2^2)\,.
\end{aligned}
\]
$E_8$ model.
\[
\begin{aligned}
Z_{0;1}^{(8)} ={}& \frac{1}{\varphi^{12}}\,E_4\,,\\
Z_{0;2}^{(8)} ={}& \frac{1}{24\varphi^{24}}\,E_4(2E_6 + E_2E_4)\,,\\
Z_{0;3}^{(8)} ={}& \frac{1}{15552\varphi^{36}}\,E_4(109E_4^3 + 197E_6^2 + 216E_2E_4E_6 + 54E_2^2E_4^2)\,,\\
Z_{0;4}^{(8)} ={}& \frac{1}{62208\varphi^{48}}\,E_4(272E_4^3E_6 + 154E_6^3 + 109E_2E_4^4 + 269E_2E_4E_6^2 + 144E_2^2E_4^2E_6 + 24E_2^3E_4^3)\,,\\
Z_{0;5}^{(8)} ={}& \frac{1}{373248000\varphi^{60}}\,E_4(426250E_2^2E_4^2E_6^2 + 150000E_2^3E_4^3E_6 + 207505E_6^4 + 136250E_2^2E_4^5\\
&+ 772460E_4^3E_6^2 + 116769E_4^6 + 18750E_2^4E_4^4 + 653000E_2E_4^4E_6 + 505000E_2E_4E_6^3)\,.
\end{aligned}
\]
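As a consistency check on the $E_2$-dependence of the $E_8$ expressions listed above, one can verify that they satisfy $h\,\partial Z_{0;n}/\partial E_2 = \frac{1}{24}\sum_{k=1}^{n-1}k(n-k)Z_{0;k}Z_{0;n-k}$ with $h = 1$, which is the genus zero modular anomaly equation derived in Sec. 5.4. The sketch below does this for n = 2, 3, 4 by treating $E_2$, $E_4$, $E_6$ as independent variables and dropping the common factor $\varphi^{-12n}$; the dictionary representation of polynomials and the helper names are illustrative assumptions.

```python
from fractions import Fraction as F

# A polynomial in (E2, E4, E6) is a dict {(a, b, c): coeff} for E2^a E4^b E6^c.


def poly(terms, denom):
    return {mon: F(coef, denom) for mon, coef in terms.items()}


def padd(p, q):
    out = dict(p)
    for mon, coef in q.items():
        out[mon] = out.get(mon, F(0)) + coef
    return {m: c for m, c in out.items() if c != 0}


def pmul(p, q):
    out = {}
    for (a1, b1, c1), x in p.items():
        for (a2, b2, c2), w in q.items():
            mon = (a1 + a2, b1 + b2, c1 + c2)
            out[mon] = out.get(mon, F(0)) + x * w
    return {m: c for m, c in out.items() if c != 0}


def pscale(s, p):
    return {m: F(s) * c for m, c in p.items() if c != 0}


def d_E2(p):
    """Partial derivative with respect to E2."""
    return {(a - 1, b, c): a * coef for (a, b, c), coef in p.items() if a > 0}


# Z_{0;n}^{(8)} for n = 1..4 with the overall factor 1/phi^{12 n} stripped off.
Z = {
    1: poly({(0, 1, 0): 1}, 1),
    2: poly({(0, 1, 1): 2, (1, 2, 0): 1}, 24),
    3: poly({(0, 4, 0): 109, (0, 1, 2): 197, (1, 2, 1): 216, (2, 3, 0): 54}, 15552),
    4: poly({(0, 4, 1): 272, (0, 1, 3): 154, (1, 5, 0): 109,
             (1, 2, 2): 269, (2, 3, 1): 144, (3, 4, 0): 24}, 62208),
}


def anomaly_holds(n, h=1):
    """Check h dZ_n/dE2 == (1/24) sum_k k (n-k) Z_k Z_{n-k}."""
    lhs = pscale(h, d_E2(Z[n]))
    rhs = {}
    for k in range(1, n):
        rhs = padd(rhs, pscale(F(k * (n - k), 24), pmul(Z[k], Z[n - k])))
    return lhs == rhs


print([anomaly_holds(n) for n in (2, 3, 4)])
```

Stripping $\varphi^{-12n}$ is harmless here because both sides of the equation at order n carry the same power of $\varphi$.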
5.4. Modular anomaly equation
We are now ready to give the modular anomaly equation, which determines the $E_2$ dependence of the genus zero partition function $Z_{0;n}(\tau)$, closely following [6, 9]. First, by differentiation of (4.52) with respect to $E_2(\tau)$, we obtain
\[ \frac{\partial}{\partial E_2}(\Theta_pF_0) = h\sum_{n=1}^{\infty}\frac{f_n}{\varpi^2}\,n z_1^{n-1}\,\frac{\partial z_1}{\partial E_2}\,. \]
Then the substitution of (5.5) into the above equation brings about
\[ \Theta_p\left[24h\,\frac{\partial F_0}{\partial E_2} - (\Theta_pF_0)^2\right] = 0\,. \]
Eventually we arrive at the modular anomaly equation for genus zero:
\[ h\,\frac{\partial F_0}{\partial E_2} = \frac{1}{24}(\Theta_pF_0)^2\,. \]
Using the definition of the potential $F_0 = \sum_{n=1}^{\infty}Z_{0;n}(\tau)\,p^n$, the anomaly equation can be rewritten as
\[ h\,\frac{\partial Z_{0;n}(\tau)}{\partial E_2(\tau)} = \frac{1}{24}\sum_{k=1}^{n-1}k(n-k)\,Z_{0;k}(\tau)\,Z_{0;n-k}(\tau)\,, \]
which is consistent with the general form of the anomaly Eq. (2.17). The appearance of the normalization factor h (4.52) on the left hand side is explained by the fact that $hE_2(h\tau) - E_2(\tau)$ is an anomaly-free modular form of $\Gamma_0(h)$.
5.5. Rational instanton numbers
We denote the genus zero instanton number of bidegree (n, m) by $N^{\rm inst}_{0;n,m}$, which counts the 'number' of the rational curves of a given degree in the almost del Pezzo surface $B_9$, and we define its generating function with fixed n by
\[ Z^{\rm inst}_{0;n}(\tau) = \sum_{m=0}^{\infty}N^{\rm inst}_{0;n,m}\,q^m\,. \tag{5.17} \]
It is well-known that the genus zero multiple covering formula found in [27] leads to the following decomposition of the genus zero Gromov–Witten partition function:
\[ Z_{0;n}(\tau) = \sum_{k|n}k^{-3}\,Z^{\rm inst}_{0;\frac{n}{k}}(k\tau)\,. \tag{5.18} \]
We can invert this equation using the Möbius function $\mu : \mathbb{N} \to \{0, \pm 1\}$ as
\[ Z^{\rm inst}_{0;n}(\tau) = \sum_{k|n}\mu(k)\,k^{-3}\,Z_{0;\frac{n}{k}}(k\tau)\,. \tag{5.19} \]
Recall that the Möbius function µ is defined as follows: µ(1) = 1, µ(n) = $(-1)^l$ if n is factorized into l distinct primes, and µ(n) = 0 if n is not square-free.
We give the first few terms of the expansions of $Z^{\rm inst}_{0;n}$ for each model below.
$E_5$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} &= 4 + 16q + 40q^2 + 96q^3 + 220q^4 + 464q^5 + 920q^6 + 1760q^7 + 3276q^8 + 5920q^9 + 10408q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} &= -20q^2 - 128q^3 - 608q^4 - 2304q^5 - 7672q^6 - 23040q^7 - 64256q^8 - 168448q^9 - 419908q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} &= 48q^3 + 588q^4 + 4224q^5 + 23112q^6 + 105888q^7 + 426624q^8 + 1557216q^9 + 5250816q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} &= -192q^4 - 3328q^5 - 32224q^6 - 230400q^7 - 1346944q^8 - 6802432q^9 - 30669248q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} &= 960q^5 + 21320q^6 + 260320q^7 + 2298680q^8 + 16354800q^9 + 99283840q^{10} + \cdots.
\end{aligned}
\]
$E_6$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} &= 3 + 27q + 81q^2 + 255q^3 + 702q^4 + 1701q^5 + 3930q^6 + 8721q^7 + 18225q^8 + 37056q^9 + 73116q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} &= -54q^2 - 492q^3 - 3078q^4 - 14904q^5 - 61320q^6 - 224532q^7 - 751788q^8 - 2337264q^9 - 6844338q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} &= 243q^3 + 4131q^4 + 40095q^5 + 287307q^6 + 1683018q^7 + 8515449q^8 + 38457585q^9 + 158463702q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} &= -1728q^4 - 42120q^5 - 559920q^6 - 5344920q^7 - 40835664q^8 - 264772872q^9 - 1510286688q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} &= 15255q^5 + 483585q^6 + 8191530q^7 + 97962210q^8 + 925275420q^9 + 7332946200q^{10} + \cdots.
\end{aligned}
\]
$E_7$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} &= 2 + 56q + 276q^2 + 1360q^3 + 4718q^4 + 15960q^5 + 46284q^6 + 130064q^7 + 334950q^8 + 837872q^9 + 1980756q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} &= -272q^2 - 4544q^3 - 46416q^4 - 335744q^5 - 2008480q^6 - 10255104q^7 - 46868416q^8 - 194576128q^9 - 749189328q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} &= 3240q^3 + 100134q^4 + 1649088q^5 + 18786852q^6 + 168160176q^7 + 1255563072q^8 + 8154689040q^9 + 47265867648q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} &= -58432q^4 - 2633088q^5 - 60949696q^6 - 960253440q^7 - 11638833216q^8 - 115871533568q^9 - 988372855168q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} &= 1303840q^5 + 77380260q^6 + 2323737360q^7 + 47046026140q^8 + 724935311560q^9 + 9088122264000q^{10} + \cdots.
\end{aligned}
\]
$E_8$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} ={}& 1 + 252q + 5130q^2 + 54760q^3 + 419895q^4 + 2587788q^5 + 13630694q^6 + 63618120q^7\\
&+ 269531955q^8 + 1054198840q^9 + 3854102058q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} ={}& -9252q^2 - 673760q^3 - 20534040q^4 - 389320128q^5 - 5398936120q^6 - 59651033472q^7\\
&- 553157438400q^8 - 4456706505600q^9 - 31967377104276q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} ={}& 848628q^3 + 115243155q^4 + 6499779552q^5 + 219488049810q^6 + 5218126709400q^7\\
&+ 95602979109024q^8 + 1428776049708360q^9 + 18102884896663488q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} ={}& -114265008q^4 - 23064530112q^5 - 1972983690880q^6 - 100502238355200q^7\\
&- 3554323792345440q^8 - 95341997143018752q^9 - 2053905830285978880q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} ={}& 18958064400q^5 + 5105167984850q^6 + 594537323257800q^7 + 41416214037843150q^8\\
&+ 1996136210493389700q^9 + 72464241398191308000q^{10} + \cdots.
\end{aligned}
\]
Note that for these principal series, $N^{\rm inst}_{0;n,n}$ coincides with the genus zero, degree n instanton number of the local $E_N$ del Pezzo model computed in [4, 5].
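The Möbius inversion (5.19) can be tried out on truncated q-series. The sketch below is illustrative only: it builds $Z^{(8)}_{0;1}$ and $Z^{(8)}_{0;2}$ from the modular-form expressions of Sec. 5.3, assuming $\varphi(\tau) = \prod_{n\ge 1}(1-q^n)$ (an assumption that reproduces the leading coefficients 1, 252, 5130 of $Z^{\rm inst}_{0;1}$ for the $E_8$ model) and the standard normalizations of $E_2$, $E_4$, $E_6$ with constant term 1. All function names are hypothetical.

```python
from fractions import Fraction as F

ORDER = 4  # work modulo q^ORDER


def sigma(k, n):
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def eisenstein(weight):
    """q-expansion of E_2, E_4 or E_6, normalised to constant term 1."""
    factor = {2: -24, 4: 240, 6: -504}[weight]
    return [F(1)] + [F(factor * sigma(weight - 1, n)) for n in range(1, ORDER)]


def mul(a, b):
    out = [F(0)] * ORDER
    for i, x in enumerate(a):
        for j, w in enumerate(b):
            if i + j < ORDER:
                out[i + j] += x * w
    return out


def inverse(a):
    """Power-series inverse 1/a, assuming a[0] != 0."""
    out = [F(0)] * ORDER
    out[0] = 1 / a[0]
    for n in range(1, ORDER):
        out[n] = -sum((a[k] * out[n - k] for k in range(1, n + 1)), F(0)) / a[0]
    return out


def power(a, e):
    out = [F(1)] + [F(0)] * (ORDER - 1)
    for _ in range(e):
        out = mul(out, a)
    return out


def euler_product():
    """prod_{n>=1} (1 - q^n): the assumed meaning of phi(tau)."""
    out = [F(1)] + [F(0)] * (ORDER - 1)
    for n in range(1, ORDER):
        factor = [F(0)] * ORDER
        factor[0], factor[n] = F(1), F(-1)
        out = mul(out, factor)
    return out


def rescale(a, k):
    """Substitute tau -> k tau, i.e. q -> q^k."""
    out = [F(0)] * ORDER
    for m, x in enumerate(a):
        if k * m < ORDER:
            out[k * m] = x
    return out


def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # square factor
            result = -result
        p += 1
    return -result if n > 1 else result


def instanton_part(Z, n):
    """Eq. (5.19): sum over k | n of mu(k) k^-3 Z_{0;n/k}(k tau)."""
    total = [F(0)] * ORDER
    for k in range(1, n + 1):
        if n % k == 0:
            term = rescale(Z[n // k], k)
            total = [t + F(mobius(k), k**3) * x for t, x in zip(total, term)]
    return total


E2, E4, E6 = eisenstein(2), eisenstein(4), eisenstein(6)
phi_inv = inverse(euler_product())
bracket = [2 * u + v for u, v in zip(E6, mul(E2, E4))]  # 2 E6 + E2 E4
Z = {
    1: mul(E4, power(phi_inv, 12)),
    2: [x / 24 for x in mul(mul(E4, bracket), power(phi_inv, 24))],
}
print(Z[1], instanton_part(Z, 2))
```

Through order $q^3$ this gives the coefficients 1, 252, 5130, 54760 for $Z_{0;1}$ and 0, 0, −9252, −673760 for $Z^{\rm inst}_{0;2}$, in agreement with the $E_8$ expansions listed above. Raising `ORDER` extends the comparison at the cost of slower exact arithmetic.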
$E_0$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} &= 9 + 36q + 126q^2 + 360q^3 + 945q^4 + 2268q^5 + 5166q^6 + 11160q^7 + 23220q^8 + 46620q^9 + 90972q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} &= -18q - 252q^2 - 1728q^3 - 9000q^4 - 38808q^5 - 147384q^6 - 506880q^7 - 1613088q^8 - 4813380q^9 - 13609476q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} &= 3q + 252q^2 + 4158q^3 + 40173q^4 + 287415q^5 + 1683450q^6 + 8516418q^7 + 38458233q^8 + 158467806q^9 + 605183100q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} &= -144q^2 - 6048q^3 - 107280q^4 - 1235520q^5 - 10796544q^6 - 77538240q^7 - 479682720q^8 - 2635776000q^9 - 13140695232q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} &= 45q^2 + 5670q^3 + 189990q^4 + 3508920q^5 + 45151335q^6 + 452510730q^7 + 3763732545q^8 + 27047637540q^9 + 172619569800q^{10} + \cdots.
\end{aligned}
\]
We have checked that $\{N^{\rm inst}_{0;3n,n}\} = \{3, -6, 27, -192, 1695, \ldots\}$ coincides with the genus zero, degree n instanton number of the local $\mathbb{P}^2$ model [4, 5].
$E_{\tilde 1}$ model
\[
\begin{aligned}
Z^{\rm inst}_{0;1} &= 8 + 16q + 56q^2 + 112q^3 + 280q^4 + 528q^5 + 1120q^6 + 2016q^7 + 3880q^8 + 6720q^9 + 12096q^{10} + \cdots,\\
Z^{\rm inst}_{0;2} &= -4q - 56q^2 - 280q^3 - 1232q^4 - 4212q^5 - 13544q^6 - 38584q^7 - 105200q^8 - 266696q^9 - 653400q^{10} - \cdots,\\
Z^{\rm inst}_{0;3} &= 24q^2 + 336q^3 + 2688q^4 + 15360q^5 + 73584q^6 + 303744q^7 + 1137192q^8 + 3897648q^9 + 12515112q^{10} + \cdots,\\
Z^{\rm inst}_{0;4} &= -4q^2 - 224q^3 - 3472q^4 - 32704q^5 - 232280q^6 - 1351040q^7 - 6818336q^8 - 30695296q^9 - 126302196q^{10} - \cdots,\\
Z^{\rm inst}_{0;5} &= 80q^3 + 2800q^4 + 44800q^5 + 477160q^6 + 3892240q^7 + 26296560q^8 + 153653920q^9 + 800623600q^{10} + \cdots.
\end{aligned}
\]
We have checked that $\{N^{\rm inst}_{0;2n,n}\} = \{-4, -4, -12, -48, -240, -1356, -8428, \ldots\}$ coincides with the genus zero, degree n instanton number of the local $\mathbb{P}^1 \times \mathbb{P}^1$ [10].
6. Genus One Partition Functions
6.1. Genus one potentials
In general, the determination of the genus one potential $F_1$ of a Calabi–Yau threefold X requires the knowledge of the discriminant loci of the Picard–Fuchs system, which represent the singularities of the mirror complex moduli space $M(X^*)$, and the identification of the power indices associated with each of the irreducible components of the discriminant loci [28, 11]. For our case of the local $E_9$ almost del Pezzo models, we find the following answers for the instanton parts:
\[ E_0:\quad F_1 = \frac{1}{2}\log\left[\left(\frac{(1+z_1)^3 - z_2}{1 - z_2}\right)^{-\frac{1}{6}}\left(\frac{\partial z_1}{\partial\tilde p}\right)\right], \tag{6.1} \]
\[ E_{\tilde 1}:\quad F_1 = \frac{1}{2}\log\left[\left(\frac{(1+z_1)^2 - z_2}{1 - z_2}\right)^{-\frac{1}{6}}(1+z_1)^{-\frac{1}{3}}\left(\frac{\partial z_1}{\partial\tilde p}\right)\right], \tag{6.2} \]
\[ E_{5,6,7,8}:\quad F_1 = \frac{1}{2}\log\left[\left(\frac{1 - z_2 + z_1z_2}{1 - z_2}\right)^{-\frac{1}{6}}(1-z_1)^{-\frac{9-N}{6}}\left(\frac{\partial z_1}{\partial\tilde p}\right)\right], \tag{6.3} \]
where $\tilde p := p\,\psi(\tau) = e^{2\pi i\sigma}\psi(\tau)$. The following identity will be used later:
\[ \left(1 + \sum_{n=1}^{\infty}n c_n(\tau)\,z_1^n\right)\left(\frac{\partial z_1}{\partial\tilde p}\right) = \exp\left(-\sum_{n=1}^{\infty}c_n(\tau)\,z_1^n\right). \tag{6.4} \]
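It would suffice here to list the first four terms of the p-expansions of $F_1$:
\[
\begin{aligned}
F_1^{(0)} ={}& \frac{y\tilde p}{4}(e_2 + 2) + \frac{(y\tilde p)^2}{64}\left(5e_2^2 + 8e_2 + 3 + \frac{8}{y}\right) + \frac{(y\tilde p)^3}{1152}\left[39e_2^3 + 54e_2^2 + e_2\left(117 - \frac{8}{y}\right) + 54 + \frac{32}{y^2}\right]\\
&+ \frac{(y\tilde p)^4}{18432}\left[309e_2^4 + 384e_2^3 + e_2^2\left(1746 - \frac{784}{y}\right) - e_2\left(72 - \frac{1120}{y} - \frac{64}{y^2}\right) + 513 - \frac{144}{y} + \frac{320}{y^2}\right],
\end{aligned}
\]
\[
\begin{aligned}
F_1^{(\tilde 1)} ={}& \frac{y\tilde p}{6}(e_2 + 3) + \frac{(y\tilde p)^2}{144}\left[5e_2^2 + e_2\left(18 - \frac{3}{y}\right) + 16 + \frac{8}{y} + \frac{4}{y^2}\right]\\
&+ \frac{(y\tilde p)^3}{1296}\left[13e_2^3 + e_2^2\left(57 - \frac{15}{y}\right) + e_2\left(114 - \frac{33}{y} + \frac{24}{y^2}\right) + 88 - \frac{24}{y} + \frac{84}{y^2} - \frac{20}{y^3}\right]\\
&+ \frac{(y\tilde p)^4}{31104}\left[103e_2^4 + e_2^3\left(540 - \frac{174}{y}\right) + e_2^2\left(1584 - \frac{864}{y} + \frac{324}{y^2}\right) + e_2\left(1936 - \frac{1176}{y} + \frac{1560}{y^2} - \frac{404}{y^3}\right)\right.\\
&\left.\qquad + 960 - \frac{384}{y} + \frac{1854}{y^2} - \frac{966}{y^3} + \frac{291}{y^4}\right],
\end{aligned}
\]
\[
\begin{aligned}
F_1^{(5)} ={}& \frac{y\tilde p}{12}(e_2 + 3) + \frac{(y\tilde p)^2}{576}\left[5e_2^2 + e_2\left(18 + \frac{6}{y}\right) + 16 + \frac{32}{y} + \frac{19}{y^2}\right]\\
&+ \frac{(y\tilde p)^3}{10368}\left[13e_2^3 + e_2^2\left(57 + \frac{30}{y}\right) + e_2\left(114 + \frac{138}{y} + \frac{87}{y^2}\right) + 88 + \frac{156}{y} + \frac{417}{y^2} + \frac{52}{y^3}\right]\\
&+ \frac{(y\tilde p)^4}{497664}\left[103e_2^4 + e_2^3\left(540 + \frac{348}{y}\right) + e_2^2\left(1584 + \frac{1656}{y} + \frac{1062}{y^2}\right) + e_2\left(1936 + \frac{4152}{y} + \frac{6276}{y^2} + \frac{1252}{y^3}\right)\right.\\
&\left.\qquad + 960 + \frac{3648}{y} + \frac{9198}{y^2} + \frac{7836}{y^3} + \frac{921}{y^4}\right],
\end{aligned}
\]
\[
\begin{aligned}
F_1^{(6)} ={}& \frac{y\tilde p}{12}(e_2 + 2) + \frac{(y\tilde p)^2}{576}\left[5e_2^2 + e_2\left(8 + \frac{12}{y}\right) + 3 + \frac{28}{y} + \frac{16}{y^2}\right]\\
&+ \frac{(y\tilde p)^3}{31104}\left[39e_2^3 + e_2^2\left(54 + \frac{180}{y}\right) + e_2\left(117 + \frac{316}{y} + \frac{368}{y^2}\right) + 54 + \frac{216}{y} + \frac{992}{y^2} + \frac{256}{y^3}\right]\\
&+ \frac{(y\tilde p)^4}{1492992}\left[309e_2^4 + e_2^3\left(384 + \frac{2088}{y}\right) + e_2^2\left(1746 + \frac{3032}{y} + \frac{6112}{y^2}\right) + e_2\left(-72 + \frac{8536}{y} + \frac{14496}{y^2} + \frac{8384}{y^3}\right)\right.\\
&\left.\qquad + 513 + \frac{2664}{y} + \frac{14720}{y^2} + \frac{25792}{y^3} + \frac{4608}{y^4}\right],
\end{aligned}
\]
\[
\begin{aligned}
F_1^{(7)} ={}& \frac{y\tilde p}{12}(e_2 + 1) + \frac{(y\tilde p)^2}{1152}\left[10e_2^2 - e_2\left(4 - \frac{36}{y}\right) + \frac{27}{y} + \frac{27}{y^2}\right]\\
&+ \frac{(y\tilde p)^3}{20736}\left[26e_2^3 - e_2^2\left(42 - \frac{180}{y}\right) + e_2\left(84 - \frac{81}{y} + \frac{387}{y^2}\right) - 32 + \frac{99}{y} + \frac{315}{y^2} + \frac{216}{y^3}\right]\\
&+ \frac{(y\tilde p)^4}{3981312}\left[824e_2^4 - e_2^3\left(2272 - \frac{8352}{y}\right) + e_2^2\left(6528 - \frac{13176}{y} + \frac{30312}{y^2}\right) - e_2\left(9728 - \frac{32112}{y} + \frac{8640}{y^2} - \frac{44496}{y^3}\right)\right.\\
&\left.\qquad + 6016 - \frac{18720}{y} + \frac{30627}{y^2} + \frac{39474}{y^3} + \frac{19683}{y^4}\right],
\end{aligned}
\]
\[
\begin{aligned}
F_1^{(8)} ={}& \frac{y\tilde p}{12}\,e_2 + \frac{(y\tilde p)^2}{576}\left[5e_2^2 - e_2\left(12 - \frac{24}{y}\right) + 7 - \frac{10}{y} + \frac{10}{y^2}\right]\\
&+ \frac{(y\tilde p)^3}{31104}\left[39e_2^3 - e_2^2\left(180 - \frac{360}{y}\right) + e_2\left(369 - \frac{878}{y} + \frac{878}{y^2}\right) - 276 + \frac{712}{y} - \frac{480}{y^2} + \frac{320}{y^3}\right]\\
&+ \frac{(y\tilde p)^4}{1492992}\left[309e_2^4 - e_2^3\left(2088 - \frac{4176}{y}\right) + e_2^2\left(6858 - \frac{18724}{y} + \frac{18724}{y^2}\right) - e_2\left(12336 - \frac{39632}{y} + \frac{44880}{y^2} - \frac{29920}{y^3}\right)\right.\\
&\left.\qquad + 9513 - \frac{33420}{y} + \frac{42510}{y^2} - \frac{18180}{y^3} + \frac{9090}{y^4}\right],
\end{aligned}
\]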
where we recall $\tilde p = p\,\psi(\tau)$. We see that the genus one Gromov–Witten partition function $Z_{1;n}$ takes the form predicted in (5.8), that is,
\[ Z_{1;n} = (Z_{0;1})^n\,\varpi^{2n}\,P_{1;n}(e_2, y^{-1})\,, \]
where $P_{1;n}$ is a degree n polynomial in $e_2$ and $y^{-1}$. We give the partition functions only for the $E_7$ and $E_8$ models below.
$E_7$ model
\[
\begin{aligned}
Z_{1;1}^{(7)} ={}& \frac{\chi}{24}\,A(E_2 + A)\,,\\
Z_{1;2}^{(7)} ={}& \frac{\chi^2}{4608}\,(10E_2^2A^2 + 36E_2AB - 4E_2A^3 + 27B^2 + 27A^2B)\,,\\
Z_{1;3}^{(7)} ={}& \frac{\chi^3}{165888}\,(26E_2^3A^3 - 42E_2^2A^4 + 180E_2^2A^2B - 81E_2A^3B + 387E_2AB^2 + 84E_2A^5 + 99A^4B\\
&+ 216B^3 + 315B^2A^2 - 32A^6)\,,\\
Z_{1;4}^{(7)} ={}& \frac{\chi^4}{63700992}\,(824A^4E_2^4 + 8352A^3E_2^3B - 2272A^5E_2^3 + 30312A^2E_2^2B^2 - 13176A^4E_2^2B + 6528A^6E_2^2\\
&- 9728A^7E_2 + 44496AE_2B^3 - 8640A^3E_2B^2 + 32112A^5E_2B + 6016A^8 + 19683B^4\\
&+ 30627A^4B^2 - 18720A^6B + 39474A^2B^3)\,,\\
Z_{1;5}^{(7)} ={}& \frac{\chi^5}{9555148800}\,(10970A^5E_2^5 + 145800A^4E_2^4B - 42350A^6E_2^4 - 1055250A^4E_2^2B^2 + 748350A^3E_2^3B^2\\
&+ 151200A^7E_2^3 - 345800A^8E_2^2 + 1240650A^6E_2^2B - 389250A^5E_2^3B + 3230811A^5E_2B^2\\
&- 1961316A^7BE_2 + 507584A^9E_2 + 2008395AE_2B^4 - 340032A^{10} + 1930635A^2B^4 + 729000B^5\\
&+ 2538540A^4B^3 - 2195397A^6B^2 + 1432080A^8B + 1817100A^2E_2^2B^3 - 208440A^3E_2B^3)\,.
\end{aligned}
\]
$E_8$ model
\[
\begin{aligned}
Z_{1;1}^{(8)} ={}& \frac{1}{12\varphi^{12}}\,E_2E_4\,,\\
Z_{1;2}^{(8)} ={}& \frac{1}{1152\varphi^{24}}\,(9E_4^3 + 24E_2E_4E_6 + 10E_2^2E_4^2 + 5E_6^2)\,,\\
Z_{1;3}^{(8)} ={}& \frac{1}{62208\varphi^{36}}\,(472E_4^3E_6 + 80E_6^3 + 299E_2E_4^4 + 439E_2E_4E_6^2 + 360E_2^2E_4^2E_6 + 78E_2^3E_4^3)\,,\\
Z_{1;4}^{(8)} ={}& \frac{1}{11943936\varphi^{48}}\,(37448E_2^2E_4^2E_6^2 + 68768E_2E_4^4E_6 + 29920E_2E_4E_6^3 + 13809E_4^6 + 57750E_4^3E_6^2\\
&+ 17416E_2^2E_4^5 + 4545E_6^4 + 16704E_2^3E_4^3E_6 + 2472E_2^4E_4^4)\,,\\
Z_{1;5}^{(8)} ={}& \frac{1}{895795200\varphi^{60}}\,(4102280E_2E_4^4E_6^2 + 808765E_2E_4E_6^4 + 1378600E_2^2E_4^2E_6^3 + 103760E_6^5\\
&+ 2111000E_2^2E_4^5E_6 + 951950E_2^3E_4^3E_6^2 + 720057E_2E_4^7 + 338950E_2^3E_4^6 + 1749528E_4^6E_6\\
&+ 32910E_2^5E_4^5 + 2340520E_4^3E_6^3 + 291600E_2^4E_4^4E_6)\,.
\end{aligned}
\]
6.2. Modular anomaly equation
To get the modular anomaly equation of genus one [9], we have only to notice that the genus one potential $F_1$ has $E_2(\tau)$-dependence both through $z_1$ and through $c_n(\tau)$, where we are considering $(\partial z_1/\partial\tilde p)$ as a function of $z_1$ and $c_n$ by (6.4). The contribution of the former to the derivative $(\partial F_1/\partial E_2)$ is
\[ \frac{\partial z_1}{\partial E_2}\frac{\partial F_1}{\partial z_1} = \frac{1}{12h}(\Theta_pF_0)(\Theta_pF_1)\,, \]
where we have used (5.5) in the form $\partial z_1/\partial E_2 = (12h)^{-1}(\Theta_pF_0)(\Theta_pz_1)$, while the latter is
\[ -\frac{1}{2}\sum_{m=1}^{\infty}\frac{\partial c_m}{\partial E_2}\frac{\partial}{\partial c_m}\left[\sum_{n=1}^{\infty}c_nz_1^n + \log\left(1 + \sum_{n=1}^{\infty}nc_nz_1^n\right)\right] = \frac{1}{24h}\,\Theta_p(\Theta_p + 1)F_0\,. \]
Then we see that the anomaly equation for genus one takes the following form:
\[ h\,\frac{\partial F_1}{\partial E_2} = \frac{1}{12}(\Theta_pF_0)(\Theta_pF_1) + \frac{1}{24}\,\Theta_p(\Theta_p + 1)F_0\,, \tag{6.5} \]
which can be rewritten in terms of the Gromov–Witten partition functions as
\[ h\,\frac{\partial Z_{1;n}(\tau)}{\partial E_2(\tau)} = \frac{1}{24}\sum_{g=0}^{1}\sum_{k=1}^{n-1}k(n-k)\,Z_{g;k}(\tau)\,Z_{1-g;n-k}(\tau) + \frac{1}{24}\,n(n+1)\,Z_{0;n}(\tau)\,, \]
which takes the form just predicted in (2.17).
6.3. Elliptic instanton numbers
Let $N^{\rm inst}_{1;n,m} \in \mathbb{Z}$ be the genus one instanton number of bidegree (n, m), and
\[ Z^{\rm inst}_{1;n}(\tau) = \sum_{m=0}^{\infty}N^{\rm inst}_{1;n,m}\,q^m \tag{6.6} \]
be its generating function. According to [28], we have the following decomposition of the genus one Gromov–Witten partition function:
\[ Z_{1;n}(\tau) = \sum_{k|n}\left[\sigma_{-1}(k)\,Z^{\rm inst}_{1;\frac{n}{k}}(k\tau) + \frac{1}{12}\,k^{-1}\,Z^{\rm inst}_{0;\frac{n}{k}}(k\tau)\right]. \tag{6.7} \]
The inversion of this equation is given by
\[ Z^{\rm inst}_{1;n}(\tau) = \sum_{k|n}\left[a_{-1}(k)\,Z_{1;\frac{n}{k}}(k\tau) - \frac{1}{12}\,a_{-3}(k)\,Z_{0;\frac{n}{k}}(k\tau)\right], \tag{6.8} \]
where we have introduced the arithmetic functions $a_l(n) := \sum_{m|n}\mu(m)\,\mu(n/m)\,m^l$. We give the generating functions of the genus one instanton numbers for each model.
$E_5$ model
\[
\begin{aligned}
Z^{\rm inst}_{1;1} &= -8q^4 - 32q^5 - 80q^6 - 192q^7 - 464q^8 - 1024q^9 - 2080q^{10} - \cdots,\\
Z^{\rm inst}_{1;2} &= 18q^4 + 192q^5 + 1040q^6 + 4352q^7 + 15752q^8 + 51328q^9 + 153448q^{10} + \cdots,\\
Z^{\rm inst}_{1;3} &= -16q^4 - 384q^5 - 3920q^6 - 26848q^7 - 145440q^8 - 671936q^9 - 2754816q^{10} - \cdots,\\
Z^{\rm inst}_{1;4} &= 5q^4 + 320q^5 + 6320q^6 + 71168q^7 + 577264q^8 + 3758848q^9 + 20853184q^{10} + \cdots,\\
Z^{\rm inst}_{1;5} &= -96q^5 - 4640q^6 - 93056q^7 - 1170496q^8 - 10922336q^9 - 82513280q^{10} - \cdots.
\end{aligned}
\]
$E_6$ model
\[
\begin{aligned}
Z^{\rm inst}_{1;1} &= -6q^3 - 54q^4 - 162q^5 - 528q^6 - 1566q^7 - 3888q^8 - 9414q^9 - 21870q^{10} - \cdots,\\
Z^{\rm inst}_{1;2} &= 9q^3 + 243q^4 + 2322q^5 + 13824q^6 + 68283q^7 + 290466q^8 + 1094580q^9 + 3785940q^{10} + \cdots,\\
Z^{\rm inst}_{1;3} &= -4q^3 - 324q^4 - 7290q^5 - 85458q^6 - 700164q^7 - 4599990q^8 - 25682910q^9 - 126394182q^{10} - \cdots,\\
Z^{\rm inst}_{1;4} &= 135q^4 + 8262q^5 + 194532q^6 + 2729754q^7 + 27756027q^8 + 226001070q^9 + 1557055332q^{10} + \cdots,\\
Z^{\rm inst}_{1;5} &= -3132q^5 - 185346q^6 - 4812210q^7 - 78689502q^8 - 948813714q^9 - 9183023298q^{10} - \cdots.
\end{aligned}
\]
$E_7$ model
\[
\begin{aligned}
Z^{\rm inst}_{1;1} &= -4q^2 - 112q^3 - 564q^4 - 3056q^5 - 11108q^6 - 40528q^7 - 123112q^8 - 367552q^9 - 989236q^{10} - \cdots,\\
Z^{\rm inst}_{1;2} &= 3q^2 + 336q^3 + 9018q^4 + 101088q^5 + 862098q^6 + 5657664q^7 + 32067860q^8 + 158512832q^9 + 712084479q^{10} + \cdots,\\
Z^{\rm inst}_{1;3} &= -224q^3 - 20496q^4 - 640032q^5 - 10716104q^6 - 128761968q^7 - 1208615256q^8 - 9504050688q^9 - 64763400720q^{10} - \cdots,\\
Z^{\rm inst}_{1;4} &= 12042q^4 + 1116896q^5 + 41444664q^6 + 903550592q^7 + 14095889180q^8 + 172098048640q^9 + 1743551210128q^{10} + \cdots,\\
Z^{\rm inst}_{1;5} &= -574896q^5 - 57707124q^6 - 2511634800q^7 - 66979775872q^8 - 1286028782768q^9 - 19346827285068q^{10} - \cdots.
\end{aligned}
\]
$E_8$ model
\[
\begin{aligned}
Z^{\rm inst}_{1;1} ={}& -2q - 510q^2 - 11780q^3 - 142330q^4 - 1212930q^5 - 8207894q^6 - 46981540q^7\\
&- 236385540q^8 - 1072489860q^9 - 4467531670q^{10} - \cdots,\\
Z^{\rm inst}_{1;2} ={}& 762q^2 + 205320q^3 + 11361870q^4 + 317469648q^5 + 5863932540q^6 + 81295293600q^7\\
&+ 909465990330q^8 + 8597134346400q^9 + 70867771453026q^{10} + \cdots,\\
Z^{\rm inst}_{1;3} ={}& -246788q^3 - 76854240q^4 - 6912918432q^5 - 323516238180q^6 - 9882453271500q^7\\
&- 221876231766660q^8 - 3933705832711600q^9 - 57747806496416088q^{10} - \cdots,\\
Z^{\rm inst}_{1;4} ={}& 76413073q^4 + 27863327760q^5 + 3478600115600q^6 + 234196316814400q^7\\
&+ 10330930335961770q^8 + 332747064864457152q^9 + 8378290954495817152q^{10} + \cdots,\\
Z^{\rm inst}_{1;5} ={}& -23436186174q^5 - 9930641443350q^6 - 1585090167772500q^7 - 140688512133882000q^8\\
&- 8255877490179586950q^9 - 353737948953627859770q^{10} - \cdots.
\end{aligned}
\]
We see that for the principal series, $N^{\rm inst}_{1;n,n}$ coincides with the genus one, degree n instanton number of the $E_N$ del Pezzo model first obtained in [5]. Genus one instanton numbers for the $E_8$ model have been computed in [4].
E 0 model inst = −18q 3 − 72q 4 − 252q 5 − 774q 6 − 2106 q 7 − 5292q 8 − 12564q 9 Z1;1
− 28278q 10 − · · · , inst = 108q 3 + 1152q 4 + 7812q 5 + 41022q 6 + 181656q 7 + 710856q 8 + 2526516q 9 Z1;2
+ 8310492q 10 + · · · , inst = −336q 3 − 7368q 4 − 85284q 5 − 700896q 6 − 4602090q 7 − 25679052q 8 Z1;3
− 126406392q 9 − 562694940q 10 − · · · , inst = 630q 3 + 26343q 4 + 496404q 5 + 6119388 q 6 + 57190644q 7 + 437749110q 8 Z1;4
+ 2875241088q 9 + 16711846956q 10 + · · · , inst = −756q 3 − 59976q 4 − 1817298q 5 − 33012216q 6 − 430550244q 7 Z1;5
− 4429221912q 8 − 38028172446q 9 − 282776491026q 10 − · · · . inst } = {0, 0, −10, 231, −4452, . . .} coincides with the We have checked that {N1;3n,n genus one, degree n instanton number of the P2 model first obtained in [5].
E ˜1 model inst = −16q 4 − 32q 5 − 112q 6 − 224 q 7 − 608q 8 − 1152q 9 − 2576q 10 − · · · , Z1;1
September 21, 2002 14:4 WSPC/148-RMP
950
00146
K. Mohri
inst Z1;2 = 84q 4 + 424q 5 + 2264q 6 + 8176q 7 + 29364q 8 + 88416 q 9 + 260360q 10 + · · · , inst = −224q 4 − 2208q 5 − 17392q 6 − 95872q 7 − 467376q 8 − 1947008q 9 Z1;3
− 7471488q 10 − · · · , inst = 350q 4 + 6272q 5 + 72512q 6 + 576704q 7 + 3778068q 8 + 20848384q 9 Z1;4
+ 102392928q 10 + · · · , inst = −336q 4 − 10976q 5 − 188880q 6 − 2130016 q 7 − 18652816q 8 − 134027488q 9 Z1;5
− 833043952q 10 − · · · . inst } = {0, 0, 0, 9, 136, 1616, 17560, . . .} coincides with We have checked that {N1;2n,n the genus one, degree n instanton number of the P1 × P1 model listed in [10].
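The arithmetic functions $a_l(n)$ introduced before the genus one lists are Dirichlet convolutions of the Möbius function with itself, weighted by $m^l$. The following is a minimal sketch of how they could be evaluated; the helper names `mobius` and `a` are ours and are not taken from the text.

```python
def mobius(n: int) -> int:
    """Moebius function mu(n), computed by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def a(l: int, n: int) -> int:
    """a_l(n) = sum over divisors m of n of mu(m) * mu(n/m) * m**l."""
    return sum(mobius(m) * mobius(n // m) * m**l
               for m in range(1, n + 1) if n % m == 0)
```

Since both factors are multiplicative, $a_l$ is multiplicative as well, with $a_l(p) = -1 - p^l$ and $a_l(p^2) = p^l$ for a prime $p$.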
7. Higher Genus Partition Functions
In contrast to the genus zero or genus one case, we cannot evaluate directly the higher genus Gromov–Witten partition functions [16]. However, the modular anomaly equation (2.17) invented in [9] is so powerful that it determines the partition function $Z^{(N)}_{g;n}(\tau)$ up to finite constants.
7.1. Partition functions as modular forms
In this subsection, we propose a conjecture on the form of the partition functions $Z_{g;n}(\tau)$ of the six models in terms of the modular forms. First we define $\chi^{(N)}$ for each of the six models by
\begin{align*}
\chi^{(0)}(\tau) &= \frac{9q^{1/6}}{\eta(\tau)^4}, &
\chi^{(\tilde 1)}(\tau) &= \frac{8q^{1/4}}{\eta(\tau)^2\eta(2\tau)^2}, &
\chi^{(5)}(\tau) &= \frac{4q^{1/2}\eta(2\tau)^4}{\eta(\tau)^4\eta(4\tau)^4}, \\
\chi^{(6)}(\tau) &= \frac{3q^{1/2}}{\eta(\tau)^3\eta(3\tau)^3}, &
\chi^{(7)}(\tau) &= \frac{2q^{1/2}}{\eta(\tau)^4\eta(2\tau)^4}, &
\chi^{(8)}(\tau) &= \frac{q^{1/2}}{\eta(\tau)^{12}}. \tag{7.1}
\end{align*}
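Then we propose the conjectured forms of the partition functions of them:
\begin{align*}
Z^{(0)}_{g;n}(\tau) &= (\chi^{(0)}(\tau))^n\, P^{(0)}_{2g-2+2n}(E_2(\tau), \varpi^{(6)}(\tau), H(\tau)), \tag{7.2} \\
Z^{(\tilde 1)}_{g;n}(\tau) &= (\chi^{(\tilde 1)}(\tau))^n\, P^{(\tilde 1)}_{2g-2+2n}(E_2(\tau), \vartheta_3(2\tau)^4, \vartheta_4(2\tau)^4), \tag{7.3} \\
Z^{(5)}_{g;n}(\tau) &= (\chi^{(5)}(\tau))^n\, P^{(5)}_{2g-2+2n}(E_2(\tau), \vartheta_3(2\tau)^4, \vartheta_4(2\tau)^4), \tag{7.4} \\
Z^{(6)}_{g;n}(\tau) &= (\chi^{(6)}(\tau))^n\, P^{(6)}_{2g-2+3n}(E_2(\tau), \varpi^{(6)}(\tau), H(\tau)), \tag{7.5} \\
Z^{(7)}_{g;n}(\tau) &= (\chi^{(7)}(\tau))^n\, P^{(7)}_{2g-2+4n}(E_2(\tau), A(\tau), B(\tau)), \tag{7.6} \\
Z^{(8)}_{g;n}(\tau) &= (\chi^{(8)}(\tau))^n\, P^{(8)}_{2g-2+6n}(E_2(\tau), E_4(\tau), E_6(\tau)), \tag{7.7}
\end{align*}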
where each $P^{(N)}$ is a polynomial over $\mathbb{Q}$ in three variables, the subscript of which shows its weight as a quasi-modular form.
$E_7$ model
We list the genus two partition functions of the $E_7$ model.
\begin{align*}
Z^{(7)}_{2;1} &= \frac{\chi}{5760}\, A(6A^2 + 3B + 5E_2^2 + 10AE_2), \\
Z^{(7)}_{2;2} &= \frac{\chi^2}{1658880}\, (-64A^5 + 999A^3B + 2349AB^2 + 190E_2^3A^2 + 30E_2^2A^3 + 810E_2^2AB + 132A^4E_2 + 1251A^2BE_2 + 1215E_2B^2), \\
Z^{(7)}_{2;3} &= \frac{\chi^3}{318504960}\, (10561A^7 - 27183A^5B + 222723A^3B^2 + 273699AB^3 + 4600E_2^4A^3 - 5920E_2^3A^4 + 36480E_2^3A^2B - 288E_2^2A^3B \\
&\qquad + 105840E_2^2AB^2 + 20544E_2^2A^5 + 200988E_2B^2A^2 - 12100E_2A^6 + 61704E_2A^4B + 103680E_2B^3), \\
Z^{(7)}_{2;4} &= \frac{\chi^4}{61152952320}\, (-1622467A^9 + 6147828A^7B - 5446746A^5B^2 + 33187428A^3B^3 + 26235333AB^4 + 108800A^4E_2^5 + 1236480E_2^4A^3B \\
&\qquad - 280320E_2^4A^5 + 7149024E_2^2A^5B + 5454720E_2^3B^2A^2 + 1067008E_2^3A^6 - 1639296E_2^3A^4B + 1937232E_2^2A^3B^2 - 2044976E_2^2A^7 \\
&\qquad + 10817280E_2^2AB^3 + 18498600E_2A^4B^2 - 8131560E_2A^6B + 24054408E_2A^2B^3 + 2625048E_2A^8 + 8048160E_2B^4), \\
Z^{(7)}_{2;5} &= \frac{\chi^5}{17612050268160}\, (-142044480E_2^4A^5B + 621872640E_2^3A^6B - 342771840E_2^3A^4B^2 + 355628304E_2^2A^9 + 55837440E_2^5A^4B \\
&\qquad + 332170560E_2^4A^3B^2 + 991837440E_2^3A^2B^3 + 873400320E_2B^5 - 1325802096E_2^2A^7B + 727775280E_2^2A^3B^3 + 1478062080E_2^2AB^4 \\
&\qquad - 462511536E_2A^{10} + 2359788336E_2^2A^5B^2 - 2810004192E_2A^6B^2 + 292244077A^{11} + 3887980560E_2A^2B^4 + 4762800000E_2A^4B^3 \\
&\qquad + 1930802688E_2A^8B + 3804160A^5E_2^6 - 1325620701A^9B + 2471168610A^7B^2 - 1132668090A^5B^3 + 6150153825A^3B^4 + 3302730855AB^5 \\
&\qquad - 14400000E_2^5A^6 + 62142720E_2^4A^7 - 172019840E_2^3A^8).
\end{align*}
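The conjecture (7.6) requires every monomial of $P^{(7)}_{2g-2+4n}$ to have the same quasi-modular weight, which gives a quick sanity test for lists such as the one above. The sketch below checks the numerator of $Z^{(7)}_{2;2}$, whose weight should be $2\cdot 2-2+4\cdot 2=10$. It assumes that $E_2$, $A$ and $B$ carry weights 2, 2 and 4; these weights are defined elsewhere in the paper and are only inferred here. The data layout is our own choice.

```python
# Assumed weights of the generators (E2: quasi-modular, A and B: Gamma_0(2) forms).
WEIGHTS = {"E2": 2, "A": 2, "B": 4}


def weight(monomial: dict) -> int:
    """Total weight of a monomial given as {generator: exponent}."""
    return sum(WEIGHTS[name] * exp for name, exp in monomial.items())


def is_homogeneous(terms, expected_weight: int) -> bool:
    """True if every (coefficient, monomial) pair has the expected weight."""
    return all(weight(mono) == expected_weight for _, mono in terms)


# Numerator of Z^{(7)}_{2;2} as listed in the text.
Z7_2_2 = [
    (-64, {"A": 5}),
    (999, {"A": 3, "B": 1}),
    (2349, {"A": 1, "B": 2}),
    (190, {"E2": 3, "A": 2}),
    (30, {"E2": 2, "A": 3}),
    (810, {"E2": 2, "A": 1, "B": 1}),
    (132, {"A": 4, "E2": 1}),
    (1251, {"A": 2, "B": 1, "E2": 1}),
    (1215, {"E2": 1, "B": 2}),
]

print(is_homogeneous(Z7_2_2, 2 * 2 - 2 + 4 * 2))
```

The same check with weights 2, 4, 6 for $E_2$, $E_4$, $E_6$ and target weight $2g-2+6n$ applies to the $E_8$ lists.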
$E_8$ model
We give the partition functions of genus up to five.
\begin{align*}
Z^{(8)}_{2;1} &= \frac{1}{1440\phi^{12}}\, E_4(E_4 + 5E_2^2), \\
Z^{(8)}_{2;2} &= \frac{1}{207360\phi^{24}}\, (417E_2E_4^3 + 190E_4^2E_2^3 + 540E_2^2E_4E_6 + 225E_2E_6^2 + 356E_4^2E_6), \\
Z^{(8)}_{2;3} &= \frac{1}{2488320\phi^{36}}\, (575E_2^4E_4^3 + 3040E_2^3E_4^2E_6 + 4690E_2^2E_4E_6^2 + 3548E_2^2E_4^4 + 1600E_6^3E_2 + 10176E_6E_4^3E_2 + 2231E_4^5 + 5244E_4^2E_6^2), \\
Z^{(8)}_{2;4} &= \frac{1}{179159040\phi^{48}}\, (77280E_2^4E_6E_4^3 + 209200E_2^2E_6^3E_4 + 547760E_2^2E_6E_4^4 + 214811E_4^6E_2 + 203900E_2^3E_6^2E_4^2 + 103252E_4^5E_2^3 \\
&\qquad + 827230E_6^2E_4^3E_2 + 10200E_2^5E_4^4 + 57375E_6^4E_2 + 420616E_4^5E_6 + 314360E_4^2E_6^3), \\
Z^{(8)}_{2;5} &= \frac{1}{12899450880\phi^{60}}\, (15422230E_6^4E_4^2 + 43101209E_6^2E_4^5 + 5522085E_4^8 + 1903680E_6^5E_2 + 18947800E_2^3E_4^5E_6 + 1744920E_2^5E_4^4E_6 \\
&\qquad + 50040570E_2^2E_6^2E_4^4 + 6480025E_2^4E_4^3E_6^2 + 11149400E_2^3E_4^2E_6^3 + 8437860E_2^2E_4E_6^4 + 51231560E_6^3E_4^3E_2 \\
&\qquad + 42541168E_4^6E_6E_2 + 2482715E_4^6E_2^4 + 9555018E_4^7E_2^2 + 178320E_2^6E_4^5), \\
Z^{(8)}_{3;1} &= \frac{1}{362880\phi^{12}}\, E_4(4E_6 + 21E_2E_4 + 35E_2^3), \\
Z^{(8)}_{3;2} &= \frac{1}{34836480\phi^{24}}\, (14984E_4^2E_6E_2 + 8925E_2^2E_4^3 + 2275E_4^2E_2^4 + 7560E_2^3E_4E_6 + 4725E_2^2E_6^2 + 3540E_4^4 + 4071E_4E_6^2), \\
Z^{(8)}_{3;3} &= \frac{1}{209018880\phi^{36}}\, (138104E_4^4E_6 + 224024E_6E_4^3E_2^2 + 36400E_2^4E_4^2E_6 + 224456E_4^2E_6^2E_2 + 49584E_4E_6^3 + 68460E_2^3E_4E_6^2 \\
&\qquad + 55006E_2^3E_4^4 + 6055E_2^5E_4^3 + 97431E_4^5E_2 + 33600E_6^3E_2^2), \\
Z^{(8)}_{3;4} &= \frac{1}{90296156160\phi^{48}}\, (28134630E_4^7 + 151049093E_4^4E_6^2 + 25488295E_4E_6^4 + 966630E_2^6E_4^4 + 189296376E_6^2E_4^3E_2^2 + 8172360E_2^5E_6E_4^3 \\
&\qquad + 31388000E_2^3E_6^3E_4 + 88718416E_2^3E_6E_4^4 + 24977155E_2^4E_6^2E_4^2 + 13366787E_4^5E_2^4 + 12119625E_6^4E_2^2 \\
&\qquad + 137926976E_4^2E_6^3E_2 + 51557313E_4^6E_2^2 + 192353224E_4^5E_6E_2),
\end{align*}
\begin{align*}
Z^{(8)}_{3;5} &= \frac{1}{1083553873920\phi^{60}}\, (274848600E_6^5E_4 + 2868277704E_6^3E_4^4 + 1662616800E_6E_4^7 + 635585864E_2^4E_4^5E_6 + 323470350E_2^3E_4E_6^4 \\
&\qquad + 349176520E_2^4E_4^2E_6^3 + 41643000E_2^6E_4^4E_6 + 2109910578E_2^3E_6^2E_4^4 + 174368705E_2^5E_4^3E_6^2 + 3866100E_2^7E_4^5 \\
&\qquad + 101077200E_6^5E_2^2 + 424873884E_4^7E_2^3 + 1739056502E_6^4E_4^2E_2 + 5180110741E_6^2E_4^5E_2 + 70310947E_4^6E_2^5 \\
&\qquad + 3045375184E_6^3E_4^3E_2^2 + 2693483096E_4^6E_6E_2^2 + 696828225E_4^8E_2), \\
Z^{(8)}_{4;1} &= \frac{1}{87091200\phi^{12}}\, E_4(39E_4^2 + 80E_2E_6 + 210E_2^2E_4 + 175E_2^4), \\
Z^{(8)}_{4;2} &= \frac{1}{2090188800\phi^{24}}\, (53220E_2E_4^4 + 112540E_2^2E_4^2E_6 + 45185E_2^3E_4^3 + 7385E_4^2E_2^5 + 28350E_2^4E_4E_6 + 23625E_2^3E_6^2 \\
&\qquad + 61065E_2E_4E_6^2 + 6300E_6^3 + 49402E_6E_4^3), \\
Z^{(8)}_{4;3} &= \frac{1}{75246796800\phi^{36}}\, (3164700E_2^4E_4E_6^2 + 8993259E_4^5E_2^2 + 14111840E_6^2E_4^3 + 806400E_6^4 + 25171632E_2E_6E_4^4 + 13855280E_2^3E_6E_4^3 \\
&\qquad + 8963520E_2E_6^3E_4 + 20453520E_2^2E_6^2E_4^2 + 4014627E_4^6 + 208985E_2^6E_4^3 + 2016000E_6^3E_2^3 + 1417920E_2^5E_4^2E_6 + 2638125E_2^4E_4^4), \\
Z^{(8)}_{4;4} &= \frac{1}{5417769369600\phi^{48}}\, (3336940980E_2^3E_4^3E_6^2 + 7817234620E_2E_6^2E_4^4 + 3248768730E_6^3E_4^3 + 5085796952E_2^2E_4^5E_6 + 101280375E_6^5 \\
&\qquad + 3550525000E_2^2E_4^2E_6^3 + 1290318725E_2E_4E_6^4 + 936363912E_4^6E_2^3 + 1481276055E_4^7E_2 + 2912603799E_4^6E_6 \\
&\qquad + 1216807640E_2^4E_4^4E_6 + 152620090E_2^5E_4^5 + 78676080E_2^6E_6E_4^3 + 410158000E_2^4E_6^3E_4 + 274844990E_2^5E_6^2E_4^2 \\
&\qquad + 8381520E_2^7E_4^4 + 202702500E_6^4E_2^3),
\end{align*}
\begin{align*}
Z^{(8)}_{4;5} &= \frac{1}{52010585948160\phi^{60}}\, (16869986640E_6^5E_4E_2 + 8944068536E_2^5E_4^5E_6 + 1167070464E_6^6 + 2035152000E_6^5E_2^3 + 436442160E_2^7E_4^4E_6 \\
&\qquad + 854577430E_4^6E_2^6 + 114133172104E_6^2E_4^6 + 183172864792E_6^3E_4^4E_2 + 36942885E_2^8E_4^5 + 5146355025E_2^4E_4E_6^4 \\
&\qquad + 11890359900E_4^9 + 7455500881E_4^7E_2^4 + 23616142080E_4^8E_2^2 + 60902666801E_6^4E_4^3 + 2043907670E_2^6E_4^3E_6^2 \\
&\qquad + 4688369560E_2^5E_4^2E_6^3 + 54769592870E_6^4E_4^2E_2^2 + 66152468720E_6^3E_4^3E_2^3 + 60955175392E_4^6E_6E_2^3 \\
&\qquad + 109420106696E_6E_4^7E_2 + 170157797734E_6^2E_4^5E_2^2 + 35736239660E_2^4E_6^2E_4^4), \\
Z^{(8)}_{5;1} &= \frac{1}{11496038400\phi^{12}}\, E_4(136E_4E_6 + 429E_4^2E_2 + 440E_2^2E_6 + 770E_2^3E_4 + 385E_2^5), \\
Z^{(8)}_{5;2} &= \frac{1}{3310859059200\phi^{24}}\, (4510275E_2^4E_4^3 + 10553400E_2^2E_4^4 + 2494800E_6^3E_2 + 3358995E_4^5 + 14869360E_2^3E_4^2E_6 + 12090870E_2^2E_4E_6^2 \\
&\qquad + 19568568E_6E_4^3E_2 + 2245320E_2^5E_4E_6 + 7083727E_4^2E_6^2 + 512050E_4^2E_2^6 + 2338875E_2^4E_6^2), \\
Z^{(8)}_{5;3} &= \frac{1}{9932577177600\phi^{36}}\, (935093824E_6^2E_4^3E_2 + 233170300E_2^4E_6E_4^3 + 296640960E_2^2E_6^3E_4 + 837550728E_2^2E_6E_4^4 + 453680480E_2^3E_6^2E_4^2 \\
&\qquad + 16385600E_2^6E_4^2E_6 + 42513240E_2^5E_4E_6^2 + 201151929E_4^5E_2^3 + 36275085E_2^5E_4^4 + 53222400E_6^4E_2 + 266767491E_4^6E_2 \\
&\qquad + 405268284E_4^5E_6 + 268326944E_4^2E_6^3 + 33264000E_6^3E_2^4 + 2155615E_2^7E_4^3), \\
Z^{(8)}_{5;4} &= \frac{1}{2860582227148800\phi^{48}}\, (12207942670E_2^6E_4^5 + 523849095E_2^8E_4^4 + 156150752805E_4^8 + 113811930320E_2^5E_4^4E_6 + 1311485716360E_4^6E_6E_2 \\
&\qquad + 1760563778482E_2^2E_6^2E_4^4 + 286289201000E_2^2E_4E_6^4 + 381058740370E_2^4E_4^3E_6^2 + 1449394307792E_6^3E_4^3E_2 \\
&\qquad + 1106487740990E_6^2E_4^5 + 44575839000E_6^5E_2 + 109025587484E_4^6E_2^4 + 774483173328E_2^3E_4^5E_6 + 531170439360E_2^3E_4^2E_6^3 \\
&\qquad + 5431290480E_2^7E_6E_4^3 + 37160939200E_2^5E_6^3E_4 + 337421738130E_4^7E_2^2 + 21439577390E_2^6E_6^2E_4^2 + 22344052500E_6^4E_2^4 \\
&\qquad + 344998537324E_6^4E_4^2),
\end{align*}
\begin{align*}
Z^{(8)}_{5;5} &= \frac{1}{102980960177356800\phi^{60}}\, (31511006810584E_6^5E_4^2 + 177751951656248E_6^3E_4^5 + 78175349827680E_6E_4^8 + 344664297670E_4^6E_2^7 \\
&\qquad + 21179704043952E_4^8E_2^3 + 4104656416113E_4^7E_2^5 + 41030103891064E_4^6E_6E_2^4 + 1264302270000E_6^5E_2^4 + 2891093990400E_6^6E_2 \\
&\qquad + 31336414684620E_4^9E_2 + 155081412353885E_6^4E_4^3E_2 + 296146234031236E_6^2E_4^6E_2 + 43310617469240E_6^3E_4^3E_2^4 \\
&\qquad + 19202491494120E_2^5E_6^2E_4^4 + 46735606475470E_6^4E_4^2E_2^3 + 2649529315125E_2^5E_4E_6^4 + 154506124080E_2^8E_4^4E_6 \\
&\qquad + 803244450470E_2^7E_4^3E_6^2 + 4107192009800E_2^6E_4^5E_6 + 2083500320440E_2^6E_4^2E_6^3 + 149437965048686E_6^2E_4^5E_2^3 \\
&\qquad + 21211745049000E_6^5E_4E_2^2 + 144355295784864E_6E_4^7E_2^2 + 11970104685E_2^9E_4^5 + 236773842080568E_6^3E_4^4E_2^2).
\end{align*}
7.2. Gopakumar–Vafa invariants
We can extract from the higher genus Gromov–Witten partition function $Z_{g;n}$ important integer invariants, the Gopakumar–Vafa invariants [29], which we will explain very briefly. For more details on this subject, see [9, 30, 21, 31].
Let us first consider a BPS state in M theory compactified on a Calabi–Yau threefold $X$ which is realized by the M2-brane wrapped around a holomorphic curve $C$. In addition to the Abelian gauge charge corresponding to the homology class $[C] \in H_2(X)$, such a state also carries a quantum number of the 5D little group $SO(4) = SU(2)_L \times SU(2)_R$.
To clarify the origin of the Lorentz quantum number, let $M_\beta$ be the moduli space of curves in $X$ with a fixed homology class $\beta \in H_2(X)$, and $\pi\colon \hat M_\beta \to M_\beta$ the extended moduli space with its Jacobian fibration, which means that $\hat M_\beta$ parametrizes all the pairs of a curve of the fixed homology class $\beta$ and a flat line bundle on it. Namely, $\hat M_\beta$ is the appropriate moduli space for a BPS M2 brane with its Abelian charge fixed.
The BPS states for this degree of freedom arise from quantization of the moduli space, that is, the cohomology group $H^*(\hat M_\beta; \mathbb{C})$ represented by the harmonic differential forms. The $SU(2)_L$ and $SU(2)_R$ Lorentz quantum numbers come from the Lefschetz $SU(2)$ actions for the fibre and base direction of the Jacobian fibration $\pi\colon \hat M_\beta \to M_\beta$ respectively.
We take the following representation of the Lorentz spin content of the BPS states with fixed $\beta$:
\[
H^*(\hat M_\beta; \mathbb{C}) = \sum_{h=0}^{g} I_h \otimes U_{h;\beta}, \qquad I_h := [V^L_{1/2} \oplus 2V^L_0]^{\otimes h},
\]
where $g$ is the maximum value of the genus that a curve of the fixed homology class $\beta$ can have, and $V_j$ the irreducible $SU(2)$ module of spin $j$. Let $U_{h;\beta} = \bigoplus_j N_{h,j;\beta} V^R_j$ be the irreducible decomposition of the $SU(2)_R$ module above. Then $N_{h,j;\beta} \in \mathbb{Z}_{\geq 0}$ is the multiplicity of the BPS states with the $SO(4)$ Lorentz quantum number $I_{h+1} \otimes V^R_j$ and the Abelian gauge charge $\beta \in H_2(X)$. The Gopakumar–Vafa invariant $N^{GV}_{h;\beta}$ with fixed $h \in \mathbb{Z}$ and $\beta \in H_2(X)$ is then given by the index with respect to the $SU(2)_R$ on $U_{h;\beta}$, that is, $N^{GV}_{h;\beta} := \sum_j e^{2\pi i j}(2j+1) N_{h,j;\beta}$.
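Since $e^{2\pi i j} = (-1)^{2j}$, the index defining $N^{GV}_{h;\beta}$ is a signed count in which integer spins contribute $+(2j+1)$ and half-integer spins contribute $-(2j+1)$. A minimal sketch follows; the function name and the input format (a dictionary from spin $j$ to multiplicity) are our own, and the multiplicities used in any call are purely illustrative.

```python
from fractions import Fraction


def gv_index(multiplicities: dict) -> int:
    """Index sum_j (-1)^(2j) (2j+1) N_j over the SU(2)_R spin content.

    `multiplicities` maps a spin j (integer or half-integer) to its
    non-negative multiplicity N_j.
    """
    total = 0
    for j, mult in multiplicities.items():
        twice_j = Fraction(j) * 2
        if twice_j.denominator != 1 or twice_j < 0:
            raise ValueError(f"spin {j} is not a non-negative half-integer")
        dim = int(twice_j) + 1                       # dimension 2j + 1
        sign = -1 if int(twice_j) % 2 else 1         # e^{2 pi i j} = (-1)^{2j}
        total += sign * dim * mult
    return total
```

For example, a hypothetical content with three singlets, one doublet and two triplets gives `gv_index({0: 3, Fraction(1, 2): 1, 1: 2})`, that is $3 - 2 + 6$.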
It has been found in [29] that the instanton part of the full partition function of IIA topological string on $X$ [16] can be obtained by
\[
\sum_{g=0}^{\infty} x^{2g-2} F_g = \sum_{h=0}^{\infty} \sideset{}{'}\sum_{\beta \in H_2(X)} \sum_{m=1}^{\infty} N^{GV}_{h;\beta}\, \frac{1}{m} \left(2\sin\frac{mx}{2}\right)^{2h-2} e^{2\pi i m \langle J, \beta\rangle}, \tag{7.8}
\]
where $J \in H^2(X; \mathbb{C})$ is the complexified Kähler class of $X$.
From now on we turn to the investigation of the Gopakumar–Vafa invariants of one of the six local two-parameter models of the $E_9$ almost del Pezzo surface. First let $N^{GV}_{g;n,m}$ be the Gopakumar–Vafa invariant of genus $g$ and bidegree $(n, m)$, and define its generating function by
\[
Z^{GV}_{g;n}(\tau) = \sum_{m=0}^{\infty} N^{GV}_{g;n,m}\, q^m. \tag{7.9}
\]
We can show from (7.8) that the partition function of the genus $g$ Gromov–Witten invariants $Z_{g;n}(\tau)$ admits the following decomposition into the generating functions of Gopakumar–Vafa invariants of genus $h \leq g$:
\[
Z_{g;n}(\tau) = \sum_{h=0}^{g} \beta_{g,h} \sum_{k|n} k^{2g-3} Z^{GV}_{h;\frac{n}{k}}(k\tau), \tag{7.10}
\]
where $\beta_{g,h}$ is the rational number defined by the following expansion:
\[
\left(\frac{\sin(x/2)}{(x/2)}\right)^{2h-2} = \sum_{g=h}^{\infty} \beta_{g,h}\, x^{2(g-h)}. \tag{7.11}
\]
Note that $\beta_{g,0}$ coincides with the one given earlier in (2.22). Now we give the Möbius inversion formula of (7.10) following [30, Prop. 2.1]. To this end, let us first define the rational number $\alpha_{g,h}$ by
\[
\left(\frac{\arcsin(x/2)}{(x/2)}\right)^{2h-2} = \sum_{g=h}^{\infty} \alpha_{g,h}\, x^{2(g-h)}. \tag{7.12}
\]
The Möbius inversion for the Gopakumar–Vafa invariants can then be written as
\[
Z^{GV}_{g;n}(\tau) = \sum_{h=0}^{g} \alpha_{g,h} \sum_{k|n} \mu(k)\, k^{2h-3} Z_{h;\frac{n}{k}}(k\tau). \tag{7.13}
\]
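The rational numbers $\beta_{g,h}$ and $\alpha_{g,h}$ of (7.11) and (7.12) can be generated by exact power series arithmetic in $u = x^2$. The following is a minimal sketch using only the Taylor series of $\sin$ and $\arcsin$; the truncation order `ORDER` and all function names are our own choices.

```python
from fractions import Fraction
from math import comb, factorial

ORDER = 6  # keep the coefficients of u^0 .. u^(ORDER-1), where u = x^2


def mul(a, b):
    """Product of two truncated power series in u."""
    out = [Fraction(0)] * ORDER
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out


def inverse(a):
    """Multiplicative inverse of a truncated series with a[0] != 0."""
    out = [Fraction(0)] * ORDER
    out[0] = 1 / a[0]
    for n in range(1, ORDER):
        out[n] = -sum(a[k] * out[n - k] for k in range(1, n + 1)) / a[0]
    return out


def power(a, e):
    """Integer power (possibly negative) of a truncated series."""
    base = a if e >= 0 else inverse(a)
    out = [Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    for _ in range(abs(e)):
        out = mul(out, base)
    return out


# sin(x/2)/(x/2) = sum_k (-1)^k u^k / (4^k (2k+1)!)
SINC = [Fraction((-1) ** k, 4**k * factorial(2 * k + 1)) for k in range(ORDER)]
# arcsin(x/2)/(x/2) = sum_k C(2k,k) u^k / (16^k (2k+1))
ARCSINC = [Fraction(comb(2 * k, k), 16**k * (2 * k + 1)) for k in range(ORDER)]


def beta(g, h):
    """Coefficient beta_{g,h} of (7.11); requires 0 <= g - h < ORDER."""
    return power(SINC, 2 * h - 2)[g - h]


def alpha(g, h):
    """Coefficient alpha_{g,h} of (7.12); requires 0 <= g - h < ORDER."""
    return power(ARCSINC, 2 * h - 2)[g - h]
```

In this sketch `beta(1, 0)`, `beta(2, 0)` and `beta(3, 0)` evaluate to $1/12$, $1/240$ and $1/6048$, while `alpha(3, 0)` evaluates to $-31/60480$; these are the constants that appear in the explicit low genus formulas of this subsection.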
Let us take the $E_8$ model and substitute the leading term (2.22) of the partition function in (7.13). Then we see for each $(g, n) \neq (0, 1)$,
\[
Z^{GV(8)}_{g;n}(\tau) = \sum_{h=0}^{g} \alpha_{g,h}\, \beta_{h,0}\, n^{2h-3} \sum_{k|n} \mu(k) + O(q^n) = O(q^n). \tag{7.14}
\]
To be more explicit, we describe below the decompositions of the Gromov–Witten partition functions $Z_{g;n}(\tau)$ into the generating functions of Gopakumar–Vafa invariants $Z^{GV}_{g;n}(\tau)$ (7.10) and their Möbius inversions (7.13) for lower genera.
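At the level of $q$-expansion coefficients, the genus zero case of (7.10) reads $N_{0;n,m} = \sum_{k|\gcd(n,m)} k^{-3} N^{GV}_{0;n/k,m/k}$, and (7.13) inverts it with an extra factor $\mu(k)$. The sketch below illustrates this round trip on a small table. The table entries are made-up integers chosen only to exercise the formulas, not invariants of any model, and the function names are ours.

```python
from fractions import Fraction
from math import gcd


def mobius(n: int) -> int:
    """Moebius function mu(n) by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def divisors(n: int):
    return [k for k in range(1, n + 1) if n % k == 0]


def gw_from_gv(gv: dict, n: int, m: int) -> Fraction:
    """Genus zero case of (7.10) for the coefficient of q^m."""
    return sum(Fraction(gv.get((n // k, m // k), 0), k**3)
               for k in divisors(gcd(n, m)))


def gv_from_gw(gw: dict, n: int, m: int) -> Fraction:
    """Genus zero case of (7.13) for the coefficient of q^m."""
    return sum(Fraction(mobius(k), k**3) * gw.get((n // k, m // k), 0)
               for k in divisors(gcd(n, m)))


# Hypothetical integer table, indexed by bidegree (n, m) with 1 <= n, m <= 4.
gv_table = {(n, m): n * n + 3 * m for n in range(1, 5) for m in range(1, 5)}
gw_table = {key: gw_from_gv(gv_table, *key) for key in gv_table}
recovered = {key: gv_from_gw(gw_table, *key) for key in gv_table}
print(recovered == gv_table)
```

The multiple cover sum produces rational numbers on the Gromov–Witten side, and the inversion restores the integers exactly.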
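Genus zero
For the genus zero case, (7.10) and (7.13) read
\[
Z_{0;n}(\tau) = \sum_{k|n} k^{-3} Z^{GV}_{0;\frac{n}{k}}(k\tau), \qquad
Z^{GV}_{0;n}(\tau) = \sum_{k|n} \mu(k)\, k^{-3} Z_{0;\frac{n}{k}}(k\tau),
\]
which shows that $Z^{GV}_{0;n}(\tau) = Z^{\rm inst}_{0;n}(\tau)$, that is, the genus zero Gopakumar–Vafa invariants are nothing but the numbers of rational instantons.
Genus one
For the genus one cases, we have
\begin{align*}
Z_{1;n}(\tau) &= \sum_{k|n} k^{-1} \left( Z^{GV}_{1;\frac{n}{k}}(k\tau) + \frac{1}{12} Z^{GV}_{0;\frac{n}{k}}(k\tau) \right), \\
Z^{GV}_{1;n}(\tau) &= \sum_{k|n} \mu(k) \left( k^{-1} Z_{1;\frac{n}{k}}(k\tau) - \frac{1}{12} k^{-3} Z_{0;\frac{n}{k}}(k\tau) \right).
\end{align*}
The transformation formulae between $Z^{GV}_{1;n}$ and $Z^{\rm inst}_{1;n}$ are
\[
Z^{GV}_{1;n}(\tau) = \sum_{k|n} Z^{\rm inst}_{1;\frac{n}{k}}(k\tau), \qquad
Z^{\rm inst}_{1;n}(\tau) = \sum_{k|n} \mu(k)\, Z^{GV}_{1;\frac{n}{k}}(k\tau). \tag{7.15}
\]
Genus two
For the genus two case,
\begin{align*}
Z_{2;n}(\tau) &= \sum_{k|n} k \left( Z^{GV}_{2;\frac{n}{k}}(k\tau) + \frac{1}{240} Z^{GV}_{0;\frac{n}{k}}(k\tau) \right), \\
Z^{GV}_{2;n}(\tau) &= \sum_{k|n} \mu(k) \left( k Z_{2;\frac{n}{k}}(k\tau) - \frac{1}{240} k^{-3} Z_{0;\frac{n}{k}}(k\tau) \right).
\end{align*}
On the other hand, we have also the genus two instanton numbers $N^{\rm inst}_{2;n,m}$ [16, (7.7)], the generating function of which $Z^{\rm inst}_{2;n}(\tau) = \sum_{m=0}^{\infty} N^{\rm inst}_{2;n,m} q^m$ is defined through
\[
Z_{2;n}(\tau) = Z^{\rm inst}_{2;n}(\tau) + \frac{1}{240} \sum_{k|n} k Z^{\rm inst}_{0;\frac{n}{k}}(k\tau).
\]
We see that the two partition functions $Z^{GV}_{2;n}(\tau)$ and $Z^{\rm inst}_{2;n}(\tau)$ are related to each other by
\[
Z^{\rm inst}_{2;n}(\tau) = \sum_{k|n} k Z^{GV}_{2;\frac{n}{k}}(k\tau), \qquad
Z^{GV}_{2;n}(\tau) = \sum_{k|n} \mu(k)\, k Z^{\rm inst}_{2;\frac{n}{k}}(k\tau). \tag{7.16}
\]
Genus three
Finally for the genus three case,
\begin{align*}
Z_{3;n}(\tau) &= \sum_{k|n} k^{3} \left( Z^{GV}_{3;\frac{n}{k}}(k\tau) - \frac{1}{12} Z^{GV}_{2;\frac{n}{k}}(k\tau) + \frac{1}{6048} Z^{GV}_{0;\frac{n}{k}}(k\tau) \right), \\
Z^{GV}_{3;n}(\tau) &= \sum_{k|n} \mu(k) \left( k^{3} Z_{3;\frac{n}{k}}(k\tau) + \frac{1}{12} k Z_{2;\frac{n}{k}}(k\tau) - \frac{31}{60480} k^{-3} Z_{0;\frac{n}{k}}(k\tau) \right).
\end{align*}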
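For $n = 1$ only $k = 1$ contributes, so the genus two inversion reduces to $Z^{GV}_{2;1} = Z_{2;1} - \frac{1}{240} Z_{0;1}$. The sketch below evaluates this for the $E_8$ model from the closed form $Z^{(8)}_{2;1} = E_4(E_4 + 5E_2^2)/(1440\phi^{12})$ of Sec. 7.1. It assumes $\phi(q) = \prod_{n\geq 1}(1-q^n)$ and the singly winding genus zero function $Z^{(8)}_{0;1} = E_4/\phi^{12}$; neither is restated in this section, so both should be read as assumptions of the example.

```python
from fractions import Fraction

N = 6  # number of q-powers kept: q^0 .. q^(N-1)


def sigma(k: int, n: int) -> int:
    """Divisor power sum sigma_k(n)."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def mul(a, b):
    """Product of two truncated q-series."""
    out = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out


# Eisenstein series E2 = 1 - 24 sum sigma_1(n) q^n, E4 = 1 + 240 sum sigma_3(n) q^n
E2 = [Fraction(1)] + [Fraction(-24 * sigma(1, n)) for n in range(1, N)]
E4 = [Fraction(1)] + [Fraction(240 * sigma(3, n)) for n in range(1, N)]

# 1/phi^12: multiply twelve copies of the geometric series 1/(1 - q^n) for each n
inv_phi12 = [Fraction(1)] + [Fraction(0)] * (N - 1)
for n in range(1, N):
    geom = [Fraction(1) if i % n == 0 else Fraction(0) for i in range(N)]
    for _ in range(12):
        inv_phi12 = mul(inv_phi12, geom)

E2_sq = mul(E2, E2)
bracket = [e4 + 5 * e2 for e4, e2 in zip(E4, E2_sq)]            # E4 + 5 E2^2
Z_2_1 = [c / 1440 for c in mul(mul(E4, bracket), inv_phi12)]    # closed form of Z_{2;1}
Z_0_1 = mul(E4, inv_phi12)                                      # assumed Z_{0;1}
Z_GV_2_1 = [z2 - z0 / 240 for z2, z0 in zip(Z_2_1, Z_0_1)]

print(Z_GV_2_1[:5])
```

Under these assumptions the first coefficients come out as $0, 0, 3, 772, 19467$, in agreement with the expansion of $Z^{GV}_{2;1}$ quoted for the $E_8$ model, and the constant term of `Z_2_1` equals $\beta_{2,0} = 1/240$.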
The formula (7.13) enables us to convert the Gromov–Witten partition function $Z_{g;n}(\tau)$ to the generating function of the Gopakumar–Vafa invariants $Z^{GV}_{g;n}(\tau)$. Let $N^{GV}_{g;n}(B_N)$ be the Gopakumar–Vafa invariant of the local $E_N$ del Pezzo model of genus $g$ and degree $n$. Based on the calculation of several $Z^{GV}_{g;n}(\tau)$ using (7.13), we propose the following conjecture for the Gopakumar–Vafa invariants of the local $E_9$ del Pezzo models:
\begin{align*}
E_0 &: & N^{GV}_{g;3n,n} &= N^{GV}_{g;n}(\mathbb{P}^2), \tag{7.17} \\
E_{\tilde 1} &: & N^{GV}_{g;2n,n} &= N^{GV}_{g;n}(\mathbb{P}^1 \times \mathbb{P}^1), \tag{7.18} \\
E_N &: & N^{GV}_{g;n,n} &= N^{GV}_{g;n}(B_N), \qquad N = 5, 6, 7, 8. \tag{7.19}
\end{align*}
It should be noted that the evaluation of the left hand side is much easier than that of the right hand side [31]. We will show some examples of $Z^{GV}_{g;n}(\tau)$ below to see the integrality of their $q$-expansions.
$E_7$ model
We give the genus two Gopakumar–Vafa generating functions.
\begin{align*}
Z^{GV}_{2;1} &= 6q^{4} + 168q^{5} + 860q^{6} + 4976q^{7} + 18660q^{8} + 72160q^{9} + 226952q^{10} + 712128q^{11} + \cdots, \\
Z^{GV}_{2;2} &= -580q^{4} - 12224q^{5} - 171192q^{6} - 1520960q^{7} - 11191692q^{8} - 67475456q^{9} - 361410816q^{10} - \cdots, \\
Z^{GV}_{2;3} &= 986q^{4} + 90952q^{5} + 2505136q^{6} + 43815752q^{7} + 539969082q^{8} + 5314601592q^{9} + 43546643132q^{10} + \cdots, \\
Z^{GV}_{2;4} &= -844q^{4} - 219392q^{5} - 14554008q^{6} - 456217600q^{7} - 9386376248q^{8} - 142590577280q^{9} - 1733995192624q^{10} - \cdots, \\
Z^{GV}_{2;5} &= 116880q^{5} + 22288580q^{6} + 1484462912q^{7} + 53446857696q^{8} + 1298602990944q^{9} + 23677762683308q^{10} + \cdots.
\end{align*}
$E_8$ model
We give only the genus two and three cases.
\begin{align*}
Z^{GV}_{2;1} &= 3q^{2} + 772q^{3} + 19467q^{4} + 257796q^{5} + 2391067q^{6} + 17484012q^{7} + 107445366q^{8} + 577157904q^{9} + 2782194327q^{10} + \cdots, \\
Z^{GV}_{2;2} &= -4q^{2} - 25604q^{3} - 3075138q^{4} - 135430120q^{5} - 3449998524q^{6} - 61300761264q^{7} - 839145842528q^{8} - 9401698267600q^{9} \\
&\qquad - 89741934231984q^{10} - \cdots, \\
Z^{GV}_{2;3} &= 30464q^{3} + 26356767q^{4} + 4012587684q^{5} + 267561063651q^{6} + 10669237946340q^{7} + 296540296415919q^{8} \\
&\qquad + 6281046300189120q^{9} + 107386914608369634q^{10} + \cdots, \\
Z^{GV}_{2;4} &= -26631112q^{4} - 18669096840q^{5} - 3493725635712q^{6} - 315335792669280q^{7} - 17502072462748056q^{8} \\
&\qquad - 680822976267281568q^{9} - 20119222969453708672q^{10} - 476723960943969692160q^{11} - \cdots, \\
Z^{GV}_{2;5} &= 16150498760q^{5} + 11074858711765q^{6} + 2457788116576020q^{7} + 280285943363605460q^{8} \\
&\qquad + 20134110289153178480q^{9} + 1021994028815246670450q^{10} + \cdots, \\
Z^{GV}_{3;1} &= -4q^{3} - 1038q^{4} - 28200q^{5} - 403530q^{6} - 4027020q^{7} - 31528152q^{8} - 206468416q^{9} - 1176822312q^{10} - \cdots, \\
Z^{GV}_{3;2} &= 1296q^{3} + 494144q^{4} + 38004700q^{5} + 1400424188q^{6} + 32782202520q^{7} + 559061195716q^{8} + 7518370093000q^{9} \\
&\qquad + 83886353406048q^{10} + 804126968489640q^{11} + \cdots, \\
Z^{GV}_{3;3} &= -1548q^{3} - 5707354q^{4} - 1607880090q^{5} - 158684891624q^{6} - 8435743979080q^{7} - 294159368706504q^{8} \\
&\qquad - 7512935612951670q^{9} - 150615781749573158q^{10} - 2483798853495519960q^{11} - \cdots, \\
Z^{GV}_{3;4} &= 5889840q^{4} + 8744913564q^{5} + 2548788575530q^{6} + 314635716180400q^{7} + 22243167756986804q^{8} \\
&\qquad + 1053665475134158016q^{9} + 36762786441521664780q^{10} + 1005501515252382449280q^{11} - \cdots, \\
Z^{GV}_{3;5} &= 7785768630q^{5} + 8996745286730q^{6} + 2835031032258700q^{7} + 420624614518458350q^{8} + 37292995978411176810q^{9} \\
&\qquad + 2255647477866896285790q^{10} + 101168121676653460498460q^{11} - \cdots.
\end{align*}
7.3. Partition functions as Jacobi forms
The solution (2.20) of the modular anomaly equation (2.17) implies that the partition function $Z_{g;n}(\tau|\mu)$ is completely determined only if we could fix the anomaly-free part of its numerator (2.10), $T^0_{g;n}(\tau|\mu)$, which is an $E_8$ Weyl-invariant Jacobi form of weight $2g-2+6n$ and index $n$. We introduce here some notation: let $J^{\Gamma}_{k,n}(E_8)$ be the space of the $E_8$ Weyl-invariant Jacobi forms of weight $k$ and index $n$ for a modular group $\Gamma \subset SL(2;\mathbb{Z})$; $J^{\Gamma}_{*,*}(E_8) := \bigoplus_{k,n} J^{\Gamma}_{k,n}(E_8)$ the total space of such forms, which has a structure of a graded $M_*(\Gamma)$-algebra. Unfortunately, we do not have the generators of $J^{SL(2;\mathbb{Z})}_{*,*}(E_8)$ [32] to fix $T^0_{g;n}(\tau|\mu)$ up to finite unknown coefficients.
However a powerful method to generate certain elements of $J^{SL(2;\mathbb{Z})}_{*,*}(E_8)$ from the theta function $\Theta_{E_8}$ has been used to obtain $T^0_{0;n}(\tau|\mu)$ for $n = 2, 3, 4$ in [7].
We will now explain the method. First, we note that $\Theta^{(n)}_{E_8}(\tau|\mu) := \Theta_{E_8}(n\tau|n\mu)$ is an element of $J^{\Gamma_0(n)}_{4,n}(E_8)$. Secondly, the slash action of $\gamma \in SL(2;\mathbb{Z})$ on $G(\tau|\mu) \in J^{\Gamma_0(n)}_{k,m}(E_8)$ is defined by
\[
(G|\gamma)(\tau|\mu) := \frac{1}{(c\tau+d)^k} \exp\left(-\frac{\pi i m c}{c\tau+d}(\mu|\mu)\right) G\left(\frac{a\tau+b}{c\tau+d} \,\Big|\, \frac{\mu}{c\tau+d}\right), \qquad \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\]
which satisfies $(G|\gamma_1)|\gamma_2 = G|(\gamma_1\gamma_2)$. Note that $G|\gamma = G$ for any $\gamma \in \Gamma_0(n)$ by definition, and $G \in J^{SL(2;\mathbb{Z})}_{k,m}(E_8)$ if and only if $G|\gamma = G$ for any $\gamma \in SL(2;\mathbb{Z})$. Thirdly, consider the coset space $\Gamma_0(n)\backslash SL(2;\mathbb{Z})$, on which $SL(2;\mathbb{Z})$ acts as permutation from the right, and the cardinality of which is $c(n) := n\prod_{p|n}(1+p^{-1})$; in particular we can take $\{I, T, TS, \ldots, TS^{p-1}\}$ as its representatives if $n = p$ is prime [22]. Then for $f(\tau) \in M_k(\Gamma_0(n))$, we define $\sigma^{(n)}_a(f)(\tau|\mu)$ by
\[
\sum_{a=0}^{c(n)} t^{c(n)-a}\, \sigma^{(n)}_a(f) = \prod_{\gamma \in \Gamma_0(n)\backslash SL(2;\mathbb{Z})} [t + (f\,\Theta^{(n)}_{E_8}|\gamma)], \tag{7.20}
\]
that is, $\sigma^{(n)}_a(f)$ is the $a$th basic symmetric polynomial in $\{(f\,\Theta^{(n)}_{E_8}|\gamma)\}$. From the argument above, we can see that $\sigma^{(n)}_a(f) \in J^{SL(2;\mathbb{Z})}_{a(k+4),an}(E_8)$. Note in particular that $\sigma^{(n)}_1(1)$ is the $n$th Hecke transform of $\Theta_{E_8}$. More generally, we can see that any permutation invariant combination gives an element of $J^{SL(2;\mathbb{Z})}_{*,*}(E_8)$; for example, if $f_i \in M_{k_i}(\Gamma_0(n))$ for $i = 1, 2$, then $\sum_{\gamma} (f_1\Theta^{(n)}_{E_8}|\gamma)\cdot(f_2\Theta^{(n)}_{E_8}|\gamma)$ is an element of $J^{SL(2;\mathbb{Z})}_{8+k_1+k_2,2n}(E_8)$, and so on.
We will now determine doubly winding partition functions $T^0_{g;2}(\tau|\mu) \in J^{SL(2;\mathbb{Z})}_{2g+10,2}(E_8)$ for lower $g$'s based on the assumption that they are obtained by the procedure that we have just explained above. We need to consider only $\sigma^{(2)}_1\colon M_*(\Gamma_0(2)) \to J^{SL(2;\mathbb{Z})}_{*+4,2}(E_8)$, which is a homomorphism of $M_*(SL(2;\mathbb{Z}))$-modules. It turns out that $T^0_{g;2}$ should be found in the free $M_*(SL(2;\mathbb{Z}))$-module generated by the three elements $\sigma^{(2)}_1(AB)$, $\sigma^{(2)}_1(B^2)$ and $\sigma^{(2)}_1(B) = (\Theta_{E_8})^2$; we do not use $\sigma^{(2)}_1(1)$ and $\sigma^{(2)}_1(A)$ as generators because the $q$-expansion of them specialized to the four $E_{N\neq 7,8}$ models has a pole; for $E_5$ model, for example,
\begin{align*}
\sigma^{(2)}_1(1)(4\tau|\tau\omega_5) &= 2\left(\frac{\Theta_{E_8}(4\tau|\tau\omega_5)}{\vartheta_2(2\tau)^4\vartheta_3(2\tau)^2}\right)^{2} (9\vartheta_3(2\tau)^4 + \vartheta_4(2\tau)^4), \\
\sigma^{(2)}_1(A)(4\tau|\tau\omega_5) &= -\frac{1}{4}\left(\frac{\Theta_{E_8}(4\tau|\tau\omega_5)}{\vartheta_2(2\tau)^4\vartheta_3(2\tau)^2}\right)^{2} (15\vartheta_3(2\tau)^8 + 26\vartheta_4(2\tau)^4\vartheta_3(2\tau)^4 - \vartheta_4(2\tau)^8).
\end{align*}
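Two ingredients of the construction (7.20) are purely combinatorial: the number $c(n) = n\prod_{p|n}(1+p^{-1})$ of cosets in $\Gamma_0(n)\backslash SL(2;\mathbb{Z})$, and the expansion of $\prod_\gamma (t + x_\gamma)$ into basic symmetric polynomials. The following sketch illustrates both on plain numbers; in the text the $x_\gamma$ are the Jacobi forms $(f\,\Theta^{(n)}_{E_8}|\gamma)$, which are not modelled here, and the function names are ours.

```python
from fractions import Fraction


def prime_factors(n: int):
    """Distinct prime factors of n."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def coset_count(n: int) -> int:
    """c(n) = n * prod_{p | n} (1 + 1/p)."""
    c = Fraction(n)
    for p in prime_factors(n):
        c *= Fraction(p + 1, p)
    return int(c)


def symmetric_coefficients(xs):
    """Return [sigma_0, ..., sigma_c] with prod (t + x) = sum_a t^(c-a) sigma_a."""
    coeffs = [1]
    for x in xs:
        nxt = coeffs + [0]
        for a in range(1, len(nxt)):
            nxt[a] += x * coeffs[a - 1]   # sigma'_a = sigma_a + x * sigma_{a-1}
        coeffs = nxt
    return coeffs
```

For a prime $n = p$ this gives $c(p) = p + 1$, so the doubly and triply winding cases involve three and four cosets respectively, and $\sigma^{(n)}_1$ is simply the sum over cosets.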
Incidentally, $AB$ and $B^2$ can be expanded as
\[
AB = \frac{1}{3}(E_4A + E_6), \qquad B^2 = \frac{1}{9}(4E_6A + 3E_4B + 2E_4^2).
\]
We list the result of our fitting including the fundamental result of $T^0_{0;2}$ in [7]:
\begin{align*}
T^0_{0;2} &= \frac{1}{12}\, \sigma^{(2)}_1(AB), \\
T^0_{1;2} &= \frac{1}{576}\, [E_4(\Theta_{E_8})^2 + 6\sigma^{(2)}_1(B^2)], \\
T^0_{2;2} &= \frac{1}{51840}\, [26E_6(\Theta_{E_8})^2 + 63E_4\sigma^{(2)}_1(AB)], \\
T^0_{3;2} &= \frac{1}{11612160}\, [445E_4^2(\Theta_{E_8})^2 + 832E_6\sigma^{(2)}_1(AB) + 1260E_4\sigma^{(2)}_1(B^2)], \tag{7.21} \\
T^0_{4;2} &= \frac{1}{1045094400}\, [6692E_4E_6(\Theta_{E_8})^2 + 13599E_4^2\sigma^{(2)}_1(AB) + 7560E_6\sigma^{(2)}_1(B^2)], \\
T^0_{5;2} &= \frac{1}{1655429529600}\, [(603615E_4^3 + 523520E_6^2)(\Theta_{E_8})^2 + 2249856E_4E_6\sigma^{(2)}_1(AB) + 1844370E_4^2\sigma^{(2)}_1(B^2)], \\
T^0_{6;2} &= \frac{1}{1506440871936000}\, [123290398E_4^2E_6(\Theta_{E_8})^2 + (185440941E_4^3 + 61328640E_6^2)\sigma^{(2)}_1(AB) + 180540360E_4E_6\sigma^{(2)}_1(B^2)], \\
T^0_{7;2} &= \frac{1}{867709942235136000}\, [(2926360905E_4^4 + 5095068160E_4E_6^2)(\Theta_{E_8})^2 + 16042720896E_4^2E_6\sigma^{(2)}_1(AB) \\
&\qquad + (8745249240E_4^3 + 3358817280E_6^2)\sigma^{(2)}_1(B^2)].
\end{align*}
It is then easy to recover $Z_{g;2}(\tau|\mu)$ by the solution (2.21) of the modular anomaly equation. As a consistency check of our procedure, we can look into the integrality of the Gopakumar–Vafa invariants; indeed for the $E_7$ model, we have
\begin{align*}
Z^{GV(7)}_{3;2}(\tau) &= 5q^{4} + 560q^{5} + 18350q^{6} + 240736q^{7} + 2479193q^{8} + \cdots, \\
Z^{GV(7)}_{4;2}(\tau) &= -896q^{6} - 21248q^{7} - 354032q^{8} - 3578624q^{9} - 30445968q^{10} + \cdots, \\
Z^{GV(7)}_{5;2}(\tau) &= 7q^{6} + 784q^{7} + 30124q^{8} + 443392q^{9} + 5276873q^{10} + \cdots, \\
Z^{GV(7)}_{6;2}(\tau) &= -1228q^{8} - 32064q^{9} - 617904q^{10} - 6946048q^{11} - 66942248q^{12} - \cdots, \\
Z^{GV(7)}_{7;2}(\tau) &= 9q^{8} + 1008q^{9} + 44450q^{10} + 720928q^{11} + 9741094q^{12} + \cdots.
\end{align*}
We also find the triply winding partition function in the same manner:
\begin{align*}
T^0_{0;3} &= \frac{1}{864}\, [20\sigma^{(3)}_1(H^4) + 972\eta^{24}\sigma^{(3)}_1(1) - 3E_4(\Theta_{E_8})^3], \\
T^0_{1;3} &= \frac{1}{2592}\, [24\sigma^{(3)}_1(H^4(\varpi^{(6)})^2) - E_6(\Theta_{E_8})^3],
\end{align*}
where $T^0_{0;3}$ is again the result of [7].
8. Seiberg–Witten Curve
8.1. Periods of rational elliptic surfaces
Local mirror of the IIA string on $K_{B_9}$ with the Kähler moduli (2.5) is the IIB string on the degenerate Calabi–Yau threefold given by the two equations in $(x, y, \tilde x, \tilde y, u)$ [7]:
\[
y^2 = 4x^3 - f(u; \tau, \mu)\,x - g(u; \tau, \mu), \qquad \tilde x \tilde y = u - u^{*}, \tag{8.1}
\]
where the first equation (8.1) itself describes the family of rational elliptic surfaces $S_9$ in Weierstrass form, the nine moduli $(\tau, \mu)$ of which should be encoded as
\begin{align*}
f(u; \tau, \mu) &= \sum_{i=0}^{4} f_{4-i}(\tau, \mu)\, u^i, & f_0(\tau) &= \frac{4}{3}\pi^4 E_4(\tau), \tag{8.2} \\
g(u; \tau, \mu) &= \sum_{i=0}^{6} g_{6-i}(\tau, \mu)\, u^i, & g_0(\tau) &= \frac{8}{27}\pi^6 E_6(\tau). \tag{8.3}
\end{align*}
The determination of the precise forms of $f$ and $g$ for given moduli $(\tau, \mu)$ will be discussed in Sec. 9. The base of the elliptic fibration $\pi\colon S_9 \to \mathbb{P}^1$ is the $u$-plane and we have a rational two-form $\Omega = dx/y \wedge du$ on it inherited from the holomorphic three-form $\Omega \wedge d\tilde x/\tilde x$ on the Calabi–Yau threefold through the Poincaré residue. Note that the pair $S_9$ given by (8.1) and $\Omega$ is nothing but the ingredients of the Seiberg–Witten curve [33] that describes 4D E-string [2, 3]. There exists a $\mathbb{C}^*$-action on $S_9$ which preserves the two-form $\Omega$ [2, 3]:
\[
(x, y, u) \mapsto (\lambda^2 x, \lambda^3 y, \lambda u), \qquad \lambda \in \mathbb{C}^*. \tag{8.4}
\]
Let $E_u := \pi^{-1}(u)$ be the fiber at $u$; the leading terms $f_0$ and $g_0$ are fixed by the physical requirement that $E_\infty$, the fiber at infinity, has the modulus $\tau$. Let us introduce the coordinates at $u = \infty$ by $(x, y, t) = (xu^2, yu^3, 1/u)$, in terms of which the defining equation of $E_\infty$ can be written in a canonical form:
\[
E_\infty\colon\ y^2 = 4x^3 - f_0(\tau)\,x - g_0(\tau). \tag{8.5}
\]
There exists a pair of one-cycles $(\alpha, \beta)$ on $E_\infty$ such that
\[
\int_\alpha \frac{dx}{y} = 1, \qquad \int_\beta \frac{dx}{y} = \tau. \tag{8.6}
\]
The rational two-form $\Omega$ now takes the form $\Omega = dt/t \wedge dx/y$, from which we see that the Poincaré residue of $\Omega$ along $E_\infty$ is nothing but the canonical one-form on $E_\infty$:
\[
\mathrm{Res}_{E_\infty}(\Omega) = \frac{dx}{y}. \tag{8.7}
\]
We can identify $E_\infty$ with the complex torus $\mathbb{C}/(\mathbb{Z}\tau+\mathbb{Z})$ through the uniformization $(x, y) = (\wp(\tau|\nu), \wp'(\tau|\nu))$, where $\nu$ is the coordinate of the covering space $\mathbb{C}$ of the torus and the Weierstrass $\wp$ function is defined by
\[
\wp(\tau|\nu) = \frac{1}{\nu^2} + \sideset{}{'}\sum_{\omega \in \mathbb{Z}\tau+\mathbb{Z}} \left[ \frac{1}{(\nu-\omega)^2} - \frac{1}{\omega^2} \right].
\]
Note that $dx/y$ is pulled back by this isomorphism to $d\nu$, which makes (8.6) rather trivial. The point $\nu \bmod \mathbb{Z}\tau+\mathbb{Z}$ of the torus will be frequently used to refer to the point $(\wp(\tau|\nu), \wp'(\tau|\nu))$ of $E_\infty$ below. We also introduce the homogeneous coordinates $(x_0, x_1, x_2)$ of $\mathbb{P}^2$ with $x = x_1/x_0$, $y = x_2/x_0$, so that we can realize $E_\infty$ as a plane curve defined by the ternary cubic $P$:
\[
P(x_0, x_1, x_2) := x_0x_2^2 - 4x_1^3 + f_0(\tau)\,x_0^2x_1 + g_0(\tau)\,x_0^3. \tag{8.8}
\]
The twelve normalized periods $(1, \tau, \sigma, \mu, \partial_\sigma F_0)$ of the E-string can be obtained as the periods of the Seiberg–Witten curve (8.1) [2, 3, 5–7, 34]. To see this, let $(\alpha(u), \beta(u))$ be the standard symplectic basis of the fiber $H_1(E_u)$ which extends the one $(\alpha, \beta)$ at $u = \infty$. It is clear from (8.6) and (8.7) that the two periods $1$, $\tau$ can be recovered by the integrals [34]:
\[
1 = -\frac{1}{2\pi i} \oint_{u=\infty} du \oint_{\alpha(u)} \frac{dx}{y}, \qquad
\tau = -\frac{1}{2\pi i} \oint_{u=\infty} du \oint_{\beta(u)} \frac{dx}{y}. \tag{8.9}
\]
Evaluation of the Wilson lines $\mu = \sum_{i=1}^{8} \mu_i\omega_i$ needs a more careful study of periods, since it deals with a seemingly divergent integral due to the pole of $\Omega$. Notice first that $\Omega$ is a holomorphic two-form on $S_9 - E_\infty$. Then we are naturally led to define the period map $\hat\varrho$ [35] of $\Omega$ by
\[
\hat\varrho\colon H_2(S_9 - E_\infty) \to \mathbb{C}, \qquad \hat\varrho(D) := \frac{1}{2\pi i} \int_D \Omega.
\]
The non-trivial part of the homology exact sequence of the pair $(S_9, S_9 - E_\infty)$:
\[
\cdots \to H_i(S_9 - E_\infty) \to H_i(S_9) \to H_i(S_9, S_9 - E_\infty) \to H_{i-1}(S_9 - E_\infty) \to \cdots,
\]
can be written as
\[
0 \to H_1(E_\infty) \xrightarrow{\ \partial_*\ } H_2(S_9 - E_\infty) \xrightarrow{\ i_*\ } H_2(S_9) \xrightarrow{\ j_*\ } H_0(E_\infty) \to 0, \tag{8.10}
\]
where we used the Poincaré duality $H_i(S_9, S_9 - E_\infty) \cong H^{4-i}(E_\infty)$ and $H_1(S_9 - E_\infty) \cong 0$.
There are two points: first, $\partial_*\alpha$ and $\partial_*\beta$ are just the elements of $H_2(S_9 - E_\infty)$ that appeared in (8.9), from which it follows immediately that $\hat\varrho(\partial_*\alpha) = 1$ and $\hat\varrho(\partial_*\beta) = \tau$, that is, $\hat\varrho \cdot \partial_*$ is nothing but the period map of $E_\infty$ by $dx/y$; second, $j_*\colon H_2(S_9) \to H_0(E_\infty) \cong \mathbb{Z}$ simply counts the intersection number of a divisor with $E_\infty$. As the homology class of $E_\infty$ is $[\delta]$, we see that $\mathrm{Ker}\, j_* = \mathrm{Im}\, i_*$ can be identified with $L(E_8^{(1)})$. We conclude that $\hat\varrho$ induces the homomorphism of additive groups:
\[
\varrho\colon L(E_8^{(1)}) \to \mathbb{C} \bmod \mathbb{Z}\tau+\mathbb{Z}. \tag{8.11}
\]
In other words, we can regularize systematically the integral of the rational form $\Omega$ over $L(E_8^{(1)})$ at the expense of the additive ambiguity $\mathbb{Z}\tau+\mathbb{Z}$. It is possible to describe $\varrho$ quite explicitly in terms of the moduli of $S_9$. Let $p_i := (\wp(\tau|\nu_i), \wp'(\tau|\nu_i))$ be the intersection point in $S_9$ of $E_\infty$ with the $i$th exceptional divisor $E_i$. Then the rational elliptic surface $S_9$ is obtained by blow-up of $\mathbb{P}^2$ at these nine points, which also satisfy
\[
\sum_{i=1}^{9} \nu_i = 0 \bmod \mathbb{Z}\tau+\mathbb{Z} \tag{8.12}
\]
because they are the intersection points of two cubics in $\mathbb{P}^2$, that is, there exists another cubic $Q$ such that $\{\nu_i\} = \{P = 0\} \cap \{Q = 0\}$. The rational elliptic surface $S_9$ in question is then expressed as the hypersurface $P + tQ = 0$ in $\mathbb{P}^2 \times \mathbb{P}^1$; this is the same as the realization of the $E_0$ model, but it is the complex structure that matters here. We can show that $\varrho$ is given by [35, 36]
\begin{align*}
\varrho(E_i - E_j) &= \nu_i - \nu_j, \tag{8.13} \\
\varrho(l - E_i - E_j - E_k) &= -\nu_i - \nu_j - \nu_k, \tag{8.14}
\end{align*}
where the $\nu_i$'s are defined only up to addition of $\mathbb{Z}\tau+\mathbb{Z}$. Let us see how (8.13) is obtained. Choose a path $\gamma_{i,j}$ which connects $p_j$ and $p_i$ on $E_\infty$ and let $T_{i,j}$ be a closed tubular neighbourhood of $\gamma_{i,j}$ in $S_9$, such that each of $E_i \cap T_{i,j} = D_i$ and $E_j \cap T_{i,j} = D_j$ is its fiber. Then $(E_i - D_i) \cup \partial T_{i,j} \cup (E_j - D_j)$ becomes an oriented topological manifold homologous to $E_i - E_j$ and disjoint from $E_\infty$. In physical terms, $\partial T_{i,j}$ is a “wormhole” connecting the two “universes” $E_i$ and $E_j$, with the singular points $p_i$, $p_j$ replaced by a “black and white hole”. Now the evaluation of the left hand side of (8.13) proceeds as follows:
\[
\frac{1}{2\pi i} \int_{E_i - E_j} \Omega = \frac{1}{2\pi i} \int_{\partial T_{i,j}} \Omega = \int_{\gamma_{i,j}} \mathrm{Res}_{E_\infty} \Omega = \int_{p_j}^{p_i} \frac{dx}{y},
\]
which yields precisely the right hand side of (8.13), the ambiguity of which comes from the choice of the path $\gamma_{i,j}$. As for (8.14), we first combine the divisor in question as $(l - E_i - E_j) - E_k$. We can then take as $l$ the transform of the line on $\mathbb{P}^2$ that passes through the two
points {pi , pj }. Now we see that (l − Ei − Ej ) is an effective curve intersecting with E∞ at −pi − pj , the coordinates of which are (℘(τ |ν), ℘0 (τ |ν)), with ν := −νi − νj . At this point, the problem is reduced to (8.13), so that (8.14) follows. We note that as a consequence of (8.14), %([δ]) = %(l − E1 − E2 − E3 ) + %(l − E4 − E5 − E6 ) + %(l − E7 − E8 − E9 ) =−
9 X
νi = 0 mod Zτ + Z ,
i=1
which is consistent with the fact that we can take a generic fiber Eu as a representative of [δ], thus Ω|Eu vanishes identically, since Eu is a holomorphic curve disjoint from E∞.

We do not have much to say about the Seiberg–Witten periods [33], σ and ∂σ F0. These are not periods of S9 in the sense above, and are roughly given by
\[
\sigma = -\int du \oint_{\alpha(u)}\frac{dx}{y}, \qquad
\frac{1}{(2\pi i)^2}\frac{\partial F_0}{\partial\sigma} = \int du \left( \oint_{\beta(u)}\frac{dx}{y} - \tau\oint_{\alpha(u)}\frac{dx}{y} \right),
\]
a detailed account of which we have already given for the six two-parameter models in (4.65), (4.66) and (4.67) for σ and in (4.69), (4.70), and (4.71) for ∂σ F0, with the correspondence of the bare parameters z1 ∝ u−1 in mind. Explicit evaluation and instanton expansion of these Seiberg–Witten integrals in terms of modular forms has been done for the E 8 model in [6].

8.2. Wilson lines

Based on the results in the last subsection, we find the Wilson lines [33, 2, 3, 7] to be
\[
\mu_i = \frac{1}{2\pi i}\varrho([\alpha_i]) =
\begin{cases}
\nu_i - \nu_{i+1}, & i = 1, \dots, 7,\\
-\nu_1 - \nu_2 - \nu_3, & i = 8,
\end{cases}
\tag{8.15}
\]
where [αi] ∈ H2(S9) is defined in (2.3). The Euclidean coordinates (2.6) of the Wilson lines (mi) and the base points of the cubic pencil νi = E∞ ∩ Ei are related to each other by
\[
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4\\ m_5\\ m_6\\ m_7\\ m_8 \end{pmatrix}
= -\frac{1}{2}
\begin{pmatrix}
1 & 2 & 0 & 0 & 0 & 0 & 0 & 0\\
1 & 0 & 2 & 0 & 0 & 0 & 0 & 0\\
1 & 0 & 0 & 2 & 0 & 0 & 0 & 0\\
1 & 0 & 0 & 0 & 2 & 0 & 0 & 0\\
1 & 0 & 0 & 0 & 0 & 2 & 0 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 2 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 2\\
1 & 2 & 2 & 2 & 2 & 2 & 2 & 2
\end{pmatrix}
\begin{pmatrix} \nu_1\\ \nu_2\\ \nu_3\\ \nu_4\\ \nu_5\\ \nu_6\\ \nu_7\\ \nu_8 \end{pmatrix},
\tag{8.16}
\]
\[
\begin{pmatrix} \nu_1\\ \nu_2\\ \nu_3\\ \nu_4\\ \nu_5\\ \nu_6\\ \nu_7\\ \nu_8 \end{pmatrix}
= \frac{1}{6}
\begin{pmatrix}
-2 & -2 & -2 & -2 & -2 & -2 & -2 & 2\\
-5 & 1 & 1 & 1 & 1 & 1 & 1 & -1\\
1 & -5 & 1 & 1 & 1 & 1 & 1 & -1\\
1 & 1 & -5 & 1 & 1 & 1 & 1 & -1\\
1 & 1 & 1 & -5 & 1 & 1 & 1 & -1\\
1 & 1 & 1 & 1 & -5 & 1 & 1 & -1\\
1 & 1 & 1 & 1 & 1 & -5 & 1 & -1\\
1 & 1 & 1 & 1 & 1 & 1 & -5 & -1
\end{pmatrix}
\begin{pmatrix} m_1\\ m_2\\ m_3\\ m_4\\ m_5\\ m_6\\ m_7\\ m_8 \end{pmatrix}.
\tag{8.17}
\]
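As a quick consistency check, the two linear maps (8.16) and (8.17) should be inverse to each other. The following minimal sketch verifies this in exact rational arithmetic; the names `M_816`, `M_817` and `matmul` are illustrative choices of ours, and the matrix entries are simply those displayed in (8.16) and (8.17).

```python
from fractions import Fraction

# Integer matrix of (8.16): m = -(1/2) * M_816 * nu
M_816 = [[1] + [2 if j == k else 0 for j in range(7)] for k in range(7)]
M_816.append([1] + [2] * 7)

# Integer matrix of (8.17): nu = (1/6) * M_817 * m
M_817 = [[-2] * 7 + [2]]
for k in range(7):
    M_817.append([-5 if j == k else 1 for j in range(7)] + [-1])

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Include the scalar prefactors -1/2 and 1/6.
A = [[Fraction(-1, 2) * x for x in row] for row in M_816]
B = [[Fraction(1, 6) * x for x in row] for row in M_817]

identity = [[Fraction(int(i == j)) for j in range(8)] for i in range(8)]
product_is_identity = matmul(A, B) == identity and matmul(B, A) == identity
```

With these entries `product_is_identity` evaluates to true, so a set of base points (νi) and the Euclidean coordinates (mi) determine each other uniquely.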
9. Inverse Problem

9.1. General strategy

We are interested in the following problem: given the eight Wilson lines µ = Σ_{i=1}^{8} µi ωi, or equivalently the nine points (νi), i = 1, …, 9, on the torus C/(Zτ + Z), which satisfy (8.12), find the corresponding Seiberg–Witten curve (8.1). In other words, we want to know explicitly the two functions f(u; τ, µ) in (8.2) and g(u; τ, µ) in (8.3) as functions of the moduli. We will solve this problem in two steps [3]: in the first step, we obtain the cubic pencil P + tQ in P2 for the given base points (νi), which is achieved by consideration based on the elliptic function theory; and in the second step, we transform the cubic pencil into the Weierstrass form (8.1), with the help of the classical theory of algebraic invariants.

First step We claim that the curve defined by the cubic Q(x0, x1, x2) shown below passes through the nine points {(1, ℘(τ|νi), ℘′(τ|νi))}, where (νi) are taken to be generic except (8.12):
\[
Q =
\begin{vmatrix}
x_0^3 & x_0^2x_1 & x_0^2x_2 & x_0x_1^2 & x_0x_1x_2 & x_0x_2^2 & x_1^2x_2 & x_1x_2^2 & x_2^3\\
1 & \wp_1 & \wp_1' & \wp_1^2 & \wp_1\wp_1' & (\wp_1')^2 & \wp_1^2\wp_1' & \wp_1(\wp_1')^2 & (\wp_1')^3\\
1 & \wp_2 & \wp_2' & \wp_2^2 & \wp_2\wp_2' & (\wp_2')^2 & \wp_2^2\wp_2' & \wp_2(\wp_2')^2 & (\wp_2')^3\\
\vdots & & & & \vdots & & & & \vdots\\
1 & \wp_8 & \wp_8' & \wp_8^2 & \wp_8\wp_8' & (\wp_8')^2 & \wp_8^2\wp_8' & \wp_8(\wp_8')^2 & (\wp_8')^3
\end{vmatrix},
\tag{9.1}
\]
with one row of the same pattern for each i = 1, …, 8, where we have used the abbreviated notation ℘i = ℘(τ|νi), ℘′i = ℘′(τ|νi). The proof is quite simple if we recall some fundamental theorems of the elliptic function theory [37]; first h(τ|ν) := Q(1, ℘(τ|ν), ℘′(τ|ν)) is an elliptic function with
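the only pole of ninth order at ν = 0; then as h(τ|ν) has eight simple zeros at ν = ν1, …, ν8 by construction, the ninth zero should be ν = −Σ_{i=1}^{8} νi = ν9. In fact, h(τ|ν) admits the following concise expression [38, III, p. 98–99]:
\[
h(\tau|\nu_0) = 2^{10}\,
\frac{\sigma\!\left(\tau\,\middle|\,\sum_{i=0}^{8}\nu_i\right)\prod_{i<j}\sigma(\tau|\nu_j - \nu_i)}{\prod_{i=0}^{8}\sigma(\tau|\nu_i)^{9}},
\tag{9.2}
\]
where σ(τ|ν) is the Weierstrass sigma function
\[
\sigma(\tau|\nu) := \nu \prod_{\omega\in\mathbb{Z}\tau+\mathbb{Z}}{}^{\!\!\prime}
\left(1 - \frac{\nu}{\omega}\right) e^{\frac{\nu}{\omega} + \frac{1}{2}\left(\frac{\nu}{\omega}\right)^2}
= \frac{1}{2\pi}\, e^{\frac{1}{6}(\pi\nu)^2 E_2(\tau)}\, \frac{\vartheta_1(\tau|\nu)}{\eta(\tau)^3}.
\]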
The E2(τ) factors in the sigma functions cancel out in (9.2). Curiously, we encounter the same function ϑ1(τ|ν)/η(τ)3 as in (2.16). The cubic curve Q = 0 intersects with E∞ at the nine points {(1, ℘i, ℘′i)}. Note that Q never coincides with the original cubic P since Q lacks the x1^3 term [3]. Therefore we have found for the given moduli parameters (τ, µ) the rational elliptic surface S9 in the form of a cubic pencil P + tQ, where P is given in (8.8) and Q in (9.1). If (νi) are not generic, e.g., if νi = νj, then the right hand side of (9.1) vanishes identically. However an appropriate limiting procedure such as νj → νi should still produce a non-trivial solution. In fact, in the following subsection we treat several models with degenerate Wilson line parameters, where Q factorizes into lower degree polynomials.

Second step What we really need is the Seiberg–Witten form (8.1) of S9, which at this point is given by the cubic pencil. In principle, we can find a coordinate transformation of (x0, x1, x2) by GL(3; C) which takes the cubic pencil P + tQ above into the Weierstrass form [3], but it seems very difficult to perform this task in general. We can in fact bypass this difficulty and reach the Weierstrass form (8.1) directly. To see this, first recall that there exists a natural action of GL(3; C) on the space of the ternary cubic forms, a general member R of which we write as
\[
R := \sum_{p+q+r=3} \frac{3!}{p!\,q!\,r!}\, a_{pqr}\, x_0^p x_1^q x_2^r.
\tag{9.3}
\]
It is well-known that the ring of the projective invariants is the polynomial ring over C generated by the two basic invariants S and T, that is, (C[a_{pqr}])^{PGL(3;C)} = C[S, T], which can be obtained by the formula [39, II.7]:
\[
\mathrm{Hess}\!\left(\alpha R + \frac{\beta}{36}\,\mathrm{Hess}(R)\right)
= (432\alpha^2\beta S + 1728\beta^3 S^2 - 216\alpha\beta^2 T)\,R
+ (\alpha^3 + 2\beta^3 T - 12\alpha\beta^2 S)\,\mathrm{Hess}(R),
\tag{9.4}
\]
where Hess(R) := |(∂i ∂j R)| is the Hessian of R, which is another cubic. We give in (A.1) and (A.2) the explicit forms of these two invariants, which indeed coincide with those in [40, Prop. 4.4.7] and [40, Exm. 4.5.3]. Any generic cubic R can then be transformed by GL(3; C) to the Weierstrass form:
\[
R \mapsto x_2^2x_0 - 4x_1^3 + f\,x_1x_0^2 + g\,x_0^3, \qquad f = \frac{27}{4}S, \quad g = \frac{27}{64}T.
\tag{9.5}
\]
This technique enables us to find the Seiberg–Witten form (8.1) of the rational elliptic surface S9 given by a cubic pencil P + tQ.

Fixed Wilson lines In the remainder of this subsection, we take two models with fixed Wilson lines as a warm-up exercise. Let us first take the E 8 model, the Wilson lines of which are νi = 0 for all i. Thus we need a cubic Q(x0, x1, x2) which intersects with E∞ nine times at the point (x0, x1, x2) = (0, 0, 1) corresponding to ν = 0. The triple line Q = x0^3 does the job, simply because ν = 0 is one of the inflection points of E∞. The resulting cubic pencil P + tQ can easily be converted to the Weierstrass form in the original variables (x, y):
\[
y^2 = 4x^3 - f_0(\tau)u^4x - (g_0(\tau)u^6 + u^5),
\tag{9.6}
\]
which is the Seiberg–Witten curve of the E 8 model found in [3].

Next consider the model with the nine inflection points {(aτ + b)/3 | a, b = 0, 1, 2} as the base points of the cubic pencil. Note that these nine points sum up to zero. Also in this case we can easily find the cubic Q; it is simply the Hessian of P:
\[
\mathrm{Hess}(P) = -8\left(f_0(\tau)^2x_0^3 + 36g_0(\tau)x_0^2x_1 + 12f_0(\tau)x_0x_1^2 - 12x_1x_2^2\right).
\tag{9.7}
\]
This follows from the fact that the four x coordinates {℘(τ|(aτ + b)/3) | (a, b) ≠ (0, 0)} are the roots of the equation 48x^4 − 24f0(τ)x^2 − 48g0(τ)x − f0(τ)^2 = 0. The calculation of the basic algebraic invariants S, T of the cubic P + t Hess(P) yields the Weierstrass form (8.1) of the cubic pencil with
\[
\begin{aligned}
f(u;\tau) &= \frac{4}{3}\pi^4\left(E_4u^4 - 4E_6u^3 + 6E_4^2u^2 - 4E_4E_6u + 4E_6^2 - 3E_4^3\right),\\
g(u;\tau) &= \frac{8}{27}\pi^6\left(E_6u^6 - 6E_4^2u^5 + 15E_4E_6u^4 - 20E_6^2u^3 + 15E_4^2E_6u^2 + 6E_4(2E_6^2 - 3E_4^3)u + E_6(9E_4^3 - 8E_6^2)\right),
\end{aligned}
\tag{9.8}
\]
where E4 = E4(τ) and E6 = E6(τ). Recall here that f0(τ) = (4/3)π^4 E4(τ), g0(τ) = (8/27)π^6 E6(τ). We can see that the partition function of the singly-winding sector Zg,1(τ) vanishes identically for this model due to the theta function formula [38, II, p. 148], the physical meaning of which is yet to be clarified.
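The Hessian (9.7) and the quartic for the inflection points can be checked mechanically. The sketch below assumes that the cubic P of (8.8), which is not reproduced in this section, has the Weierstrass shape P = x2^2 x0 − 4x1^3 + f0 x1 x0^2 + g0 x0^3 suggested by (9.5); this form, the tiny dictionary-based polynomial helpers, and the sample values of f0 and g0 are our assumptions for illustration only.

```python
from fractions import Fraction

# A polynomial in (x0, x1, x2) is stored as {(p, q, r): coefficient}.

def add(a, b, sign=1):
    """Return a + sign*b, dropping zero coefficients."""
    out = dict(a)
    for mono, c in b.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(a, b):
    """Product of two polynomials."""
    out = {}
    for m1, c1 in a.items():
        for m2, c2 in b.items():
            m = tuple(i + j for i, j in zip(m1, m2))
            out[m] = out.get(m, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def diff(a, var):
    """Partial derivative with respect to x_var."""
    out = {}
    for mono, c in a.items():
        if mono[var] > 0:
            m = list(mono)
            m[var] -= 1
            out[tuple(m)] = out.get(tuple(m), 0) + c * mono[var]
    return out

def hessian_det(poly):
    """Determinant of the 3x3 matrix of second partial derivatives."""
    h = [[diff(diff(poly, i), j) for j in range(3)] for i in range(3)]

    def det2(r1, c1, r2, c2):
        # h[r1][c1]*h[r2][c2] - h[r1][c2]*h[r2][c1]
        return add(mul(h[r1][c1], h[r2][c2]), mul(h[r1][c2], h[r2][c1]), -1)

    t0 = mul(h[0][0], det2(1, 1, 2, 2))
    t1 = mul(h[0][1], det2(1, 0, 2, 2))
    t2 = mul(h[0][2], det2(1, 0, 2, 1))
    return add(add(t0, t1, -1), t2)

# Arbitrary sample values standing in for f0(tau), g0(tau).
f0, g0 = Fraction(7, 3), Fraction(-5, 2)

# Assumed form of P (see the lead-in): x2^2 x0 - 4 x1^3 + f0 x1 x0^2 + g0 x0^3.
P = {(1, 0, 2): Fraction(1), (0, 3, 0): Fraction(-4), (2, 1, 0): f0, (3, 0, 0): g0}
hess_P = hessian_det(P)

# Right hand side of (9.7).
expected = {(3, 0, 0): -8 * f0**2, (2, 1, 0): -288 * g0,
            (1, 2, 0): -96 * f0, (0, 1, 2): Fraction(96)}

def on_curve(poly, x):
    """Evaluate at (1, x, y) with y^2 replaced by 4x^3 - f0 x - g0."""
    y2 = 4 * x**3 - f0 * x - g0
    total = Fraction(0)
    for (p, q, r), c in poly.items():
        assert r % 2 == 0  # only even powers of y can be eliminated this way
        total += c * x**q * y2**(r // 2)
    return total

def quartic(x):
    return 48 * x**4 - 24 * f0 * x**2 - 48 * g0 * x - f0**2

samples = [Fraction(k, 3) for k in range(-5, 6)]
restriction_matches = all(on_curve(hess_P, x) == 8 * quartic(x) for x in samples)
```

Under this assumption `hess_P` equals `expected`, and restricting Hess(P) to E∞ gives 8 times the quartic quoted above, so the finite intersection points of Hess(P) = 0 with E∞ have exactly the four x coordinates stated in the text.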
9.2. Several models with a few Wilson lines

Two Wilson lines The model with two Wilson lines, that is, µ = m1 e1 + m2 e2, has been investigated in [7], where the Seiberg–Witten curve was found and the partition function Z0,1(τ|mi) was derived directly from it. We find the nine base points of the cubic pencil representing the rational elliptic surface to be {νi} = {0, 0, 0, 0, 0, m+, −m+, m−, −m−}, where m± = (m1 ± m2)/2. The cubic Q which passes through these nine points can easily be identified. In fact, the nine points are decomposed into the three triples {0, 0, 0}, {0, m+, −m+}, {0, m−, −m−}, each of which defines a line in P2:
\[
L_0 = x_0, \qquad L_+ = x_1 - \wp(\tau|m_+)x_0, \qquad L_- = x_1 - \wp(\tau|m_-)x_0.
\tag{9.9}
\]
Thus the cubic Q is found to be Q = x0(x1 − ℘(τ|m+)x0)(x1 − ℘(τ|m−)x0). Now calculation of the basic algebraic invariants of the ternary cubic P + tQ tells us its Weierstrass form (8.1):
\[
f(u;\tau,m_\pm) = f_0u^4 - (\wp_+ + \wp_-)u^3 + \frac{1}{12}u^2,
\tag{9.10}
\]
\[
g(u;\tau,m_\pm) = g_0u^6 + \left(\wp_+\wp_- + \frac{f_0}{12}\right)u^5 - \frac{1}{12}(\wp_+ + \wp_-)u^4 + \frac{1}{216}u^3,
\tag{9.11}
\]
where ℘± = ℘(τ|m±). Indeed, using the formula
\[
\frac{1}{\pi^2}\wp(\tau|m) = -\frac{1}{3}\left(\vartheta_3(\tau)^4 + \vartheta_2(\tau)^4\right) + \vartheta_3(\tau)^2\vartheta_2(\tau)^2\left(\frac{\vartheta_4(\tau|m)}{\vartheta_1(\tau|m)}\right)^2,
\]
and a rescaling (8.4), it can be shown that the above form coincides with the one in [7, (A.1)], which was obtained through a reasoning quite different from the one here, modulo some misprints.

Three Wilson lines Consider the model with the Wilson lines µ = Σ_{i=1}^{3} mi ei, where mi are chosen to be generic. A W(E 8) action simplifies the nine base points of the cubic pencil to be {νi} = {0, 0, 0, 0, 0, ζ0, ζ1, ζ2, ζ3}, where
\[
(\zeta_0, \zeta_1, \zeta_2, \zeta_3) = \frac{1}{2}\,(m_1, m_2, m_3)
\begin{pmatrix}
-1 & -1 & 1 & 1\\
-1 & 1 & -1 & 1\\
-1 & 1 & 1 & -1
\end{pmatrix}.
\tag{9.12}
\]
We see immediately that {0, 0, 0} determines the line x0 = 0 and the other six points {0, 0, ζ0, ζ1, ζ2, ζ3} the conic of the form C = a0 x0^2 + a1 x0 x1 + a2 x0 x2 + a3 x1^2, the precise coefficients of which are determined by the formula
\[
C(x_0, x_1, x_2) =
\begin{vmatrix}
x_0^2 & x_0x_1 & x_0x_2 & x_1^2\\
1 & \wp_1 & \wp_1' & \wp_1^2\\
1 & \wp_2 & \wp_2' & \wp_2^2\\
1 & \wp_3 & \wp_3' & \wp_3^2
\end{vmatrix},
\tag{9.13}
\]
\[
a_0 = \begin{vmatrix} \wp_1 & \wp_1' & \wp_1^2\\ \wp_2 & \wp_2' & \wp_2^2\\ \wp_3 & \wp_3' & \wp_3^2 \end{vmatrix},\quad
a_1 = -\begin{vmatrix} 1 & \wp_1' & \wp_1^2\\ 1 & \wp_2' & \wp_2^2\\ 1 & \wp_3' & \wp_3^2 \end{vmatrix},\quad
a_2 = \begin{vmatrix} 1 & \wp_1 & \wp_1^2\\ 1 & \wp_2 & \wp_2^2\\ 1 & \wp_3 & \wp_3^2 \end{vmatrix},\quad
a_3 = -\begin{vmatrix} 1 & \wp_1 & \wp_1'\\ 1 & \wp_2 & \wp_2'\\ 1 & \wp_3 & \wp_3' \end{vmatrix},
\]
where ℘i = ℘(τ|ζi), ℘′i = ℘′(τ|ζi). The cubic passing through the nine points is determined to be Q = x0(a0 x0^2 + a1 x0 x1 + a2 x0 x2 + a3 x1^2) and the Seiberg–Witten form (8.1) of it is written in terms of (a0, a1, a2, a3) as
\[
f(u;\tau,m_i) = f_0u^4 + a_1u^3 + \frac{1}{12}a_3^2u^2,
\tag{9.14}
\]
\[
g(u;\tau,m_i) = g_0u^6 + \left(a_0 + \frac{f_0}{12}a_3\right)u^5 + \frac{1}{12}(a_1a_3 - 3a_2^2)u^4 + \frac{1}{216}a_3^3u^3.
\tag{9.15}
\]
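A simple cross-check of (9.14) and (9.15) is that they should reproduce the two-Wilson-line result (9.10), (9.11) when the conic degenerates into the pair of lines L+ L− of (9.9), that is, for a0 = ℘+℘−, a1 = −(℘+ + ℘−), a2 = 0, a3 = 1. The sketch below tests this in exact arithmetic; the function names and the sample numbers standing in for f0, g0, ℘± are arbitrary choices of ours, not values taken from the text.

```python
from fractions import Fraction

def f_conic(u, f0, a0, a1, a2, a3):
    """f of (9.14) for Q = x0*(a0 x0^2 + a1 x0 x1 + a2 x0 x2 + a3 x1^2)."""
    return f0 * u**4 + a1 * u**3 + Fraction(1, 12) * a3**2 * u**2

def g_conic(u, f0, g0, a0, a1, a2, a3):
    """g of (9.15)."""
    return (g0 * u**6
            + (a0 + Fraction(1, 12) * f0 * a3) * u**5
            + Fraction(1, 12) * (a1 * a3 - 3 * a2**2) * u**4
            + Fraction(1, 216) * a3**3 * u**3)

def f_two_lines(u, f0, wp_p, wp_m):
    """f of (9.10)."""
    return f0 * u**4 - (wp_p + wp_m) * u**3 + Fraction(1, 12) * u**2

def g_two_lines(u, f0, g0, wp_p, wp_m):
    """g of (9.11)."""
    return (g0 * u**6
            + (wp_p * wp_m + Fraction(1, 12) * f0) * u**5
            - Fraction(1, 12) * (wp_p + wp_m) * u**4
            + Fraction(1, 216) * u**3)

# Arbitrary sample values (hypothetical, for the check only).
f0, g0 = Fraction(11, 3), Fraction(13, 9)
wp_p, wp_m = Fraction(3, 7), Fraction(-2, 5)

# Coefficients of the degenerate conic (x1 - wp_p x0)(x1 - wp_m x0).
conic = (wp_p * wp_m, -(wp_p + wp_m), Fraction(0), Fraction(1))

samples = [Fraction(k, 4) for k in range(-6, 7)]
formulas_agree = all(
    f_conic(u, f0, *conic) == f_two_lines(u, f0, wp_p, wp_m)
    and g_conic(u, f0, g0, *conic) == g_two_lines(u, f0, g0, wp_p, wp_m)
    for u in samples
)
```

Agreement at thirteen sample points is more than enough to identify two polynomials of degree six in u, so the two sets of formulas are mutually consistent in this limit.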
It is an amusing exercise to see that under the limit Im τ → +∞:
\[
\wp(\tau|\zeta) \to -\frac{\pi^2}{3} + \frac{\pi^2}{\sin^2(\pi\zeta)}, \qquad
\wp'(\tau|\zeta) \to -\frac{2\pi^3\cos(\pi\zeta)}{\sin^3(\pi\zeta)},
\]
the curve above reduces to the trigonometric one with three Wilson lines [41, (2.8)].

Three Wilson lines II If we consider the model with µ = m1(e1 − e2) + m2(e3 − e4) + m3(e5 − e6), the nine base points of the cubic pencil are given by {0, 0, 0, ±m1, ±m2, ±m3}. As the line intersecting E∞ at the three points {0, ±mi} is given by x1 − ℘i x0 = 0, where we set ℘i := ℘(τ|mi), the cubic Q we need is found to be
\[
Q = (x_1 - \wp_1x_0)(x_1 - \wp_2x_0)(x_1 - \wp_3x_0).
\tag{9.16}
\]
The computation of the algebraic invariants of the cubic pencil P + tQ yields the Weierstrass form (8.1) with
\[
f(u;\tau,m_i) = f_0u^4 + \frac{1}{4}(4\sigma_2 - f_0)u^3 + \frac{1}{12}(\sigma_1^2 - 3\sigma_2)u^2,
\tag{9.17}
\]
\[
g(u;\tau,m_i) = g_0u^6 - \frac{1}{12}(f_0\sigma_1 + \sigma_3)u^5 + \frac{1}{48}(3g_0 + f_0\sigma_1 - 4\sigma_1\sigma_2)u^4 - \frac{1}{432}(2\sigma_1^3 - 9\sigma_1\sigma_2 + 27\sigma_3)u^3,
\tag{9.18}
\]
where σ1 = ℘1 + ℘2 + ℘3, σ2 = ℘1℘2 + ℘2℘3 + ℘3℘1, and σ3 = ℘1℘2℘3.
Four Wilson lines Let us consider the model with the Wilson lines given in the Euclidean coordinates µ = Σ_{i=1}^{4} mi ei, where any of mi or mi − mj is non-zero. After a suitable W(E 8) action, the nine base points of the cubic pencil become {νi} = {−2ζ0, ζ0, ζ0, ζ0, ζ0, ζ1, ζ2, ζ3, ζ4}, where
\[
(\zeta_0, \zeta_1, \zeta_2, \zeta_3, \zeta_4) = \frac{1}{6}\,(m_1, m_2, m_3, m_4)
\begin{pmatrix}
1 & -5 & 1 & 1 & 1\\
1 & 1 & -5 & 1 & 1\\
1 & 1 & 1 & -5 & 1\\
1 & 1 & 1 & 1 & -5
\end{pmatrix}.
\tag{9.19}
\]
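The parametrization (9.19) can be sanity-checked against the constraint that the nine base points sum to zero modulo Zτ + Z, which is how we read (8.12) here (an assumption, since (8.12) itself is stated in an earlier section). The sketch below also checks that the three points assigned to a line and the six points assigned to a conic each sum to zero, as required for intersections with E∞ when ν = 0 is an inflection point. The sample values of mi and all names are hypothetical.

```python
from fractions import Fraction

# Integer matrix of (9.19); row i belongs to m_{i+1}, column j to zeta_j.
MATRIX_919 = [
    (1, -5, 1, 1, 1),
    (1, 1, -5, 1, 1),
    (1, 1, 1, -5, 1),
    (1, 1, 1, 1, -5),
]

def zetas(m):
    """(zeta_0, ..., zeta_4) from (m_1, ..., m_4) as in (9.19)."""
    return [Fraction(1, 6) * sum(m[i] * MATRIX_919[i][j] for i in range(4))
            for j in range(5)]

def base_points(m):
    """The nine base points {-2 zeta_0, zeta_0 (x4), zeta_1, ..., zeta_4}."""
    z = zetas(m)
    return [-2 * z[0]] + [z[0]] * 4 + z[1:]

# Arbitrary rational sample values for the Wilson line parameters.
m = [Fraction(1, 3), Fraction(-2, 7), Fraction(5, 11), Fraction(1, 13)]
z = zetas(m)
points = base_points(m)

line_points = [-2 * z[0], z[0], z[0]]    # tangent line through -2*zeta_0
conic_points = [z[0], z[0]] + z[1:]      # remaining six points
```

Here the sums vanish exactly, not merely modulo the lattice, because (9.19) is linear in the mi.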
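The first three points {−2ζ0, ζ0, ζ0} define a line L tangent to the elliptic curve E∞, while the remaining six define a conic C. If we set ℘i = ℘(τ|ζi), ℘′i = ℘′(τ|ζi), ℘00 = ℘(τ|2ζ0), ℘′00 = −℘′(τ|2ζ0), they are given by
\[
L(x_0, x_1, x_2) =
\begin{vmatrix}
x_0 & x_1 & x_2\\
1 & \wp_0 & \wp_0'\\
1 & \wp_{00} & \wp_{00}'
\end{vmatrix},
\tag{9.20}
\]
\[
C(x_0, x_1, x_2) =
\begin{vmatrix}
x_0^2 & x_0x_1 & x_0x_2 & x_1^2 & x_1x_2 & x_2^2\\
1 & \wp_0 & \wp_0' & \wp_0^2 & \wp_0\wp_0' & (\wp_0')^2\\
1 & \wp_1 & \wp_1' & \wp_1^2 & \wp_1\wp_1' & (\wp_1')^2\\
1 & \wp_2 & \wp_2' & \wp_2^2 & \wp_2\wp_2' & (\wp_2')^2\\
1 & \wp_3 & \wp_3' & \wp_3^2 & \wp_3\wp_3' & (\wp_3')^2\\
1 & \wp_4 & \wp_4' & \wp_4^2 & \wp_4\wp_4' & (\wp_4')^2
\end{vmatrix}.
\tag{9.21}
\]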
The cubic in question is found to be Q = L · C. However, the Weierstrass form of the cubic pencil P + tQ is so complicated that we do not attempt to describe it here.

Four Wilson lines II Finally, let us take the model with {νi} = {0, ±ζ1, ±ζ2, ±ζ3, ±ζ4}. We thus need to find a cubic Q which passes through the eight points (1, ℘i, ±℘′i), where ℘i = ℘(τ|ζi), ℘′i = ℘′(τ|ζi), for i = 1, …, 4, as well as (0, 0, 1). We find that such a cubic Q is given by
\[
Q = (g_0 - f_0\sigma_1 - 4\sigma_3)x_0^2x_1 + (f_0 + 4\sigma_2)x_0x_1^2 - \sigma_1x_0x_2^2 + x_1x_2^2 + (4\sigma_4 - g_0\sigma_1)x_0^3,
\]
where σi is the ith basic symmetric polynomial of the ℘i's. In fact we can see that
\[
Q(1, x, y) = 4(x - \wp_1)(x - \wp_2)(x - \wp_3)(x - \wp_4),
\]
if y^2 = 4x^3 − f0 x − g0, that is, if we restrict Q to E∞. Nevertheless, Q itself is an irreducible cubic in this case.

Acknowledgments

I benefitted from attending the workshop “Periods Associated to Rational Elliptic Surfaces and Elliptic Lie Algebras” held at International Christian University in
June 2001. I wish to express my gratitude to many participants both of the workshop and of my informal talks at Nagoya University for useful comments. Among them are Profs. H. Kanno, A. Kato, S. Kondo, H. Ohta, Y. Shimizu, A. Tsuchiya, and Y. Yamada. I would also like to thank Dr. Y. Ohtake for helpful discussions. My special thanks are due to the late Prof. S.-K. Yang for valuable advice. This work was supported in part by Grant-in-Aid for Scientific Research on Priority Area 707 “Supersymmetry and Unified Theory of Elementary Particles”, Japan Ministry of Education, Science, Sports, Culture and Technology.

Appendix A

In this appendix we present for completeness the explicit form of the generators S and T of the algebraic invariants of the ternary cubic form R (9.3).
\[
\begin{aligned}
S ={}& -a_{300}a_{120}a_{012}^2 - 2a_{210}a_{012}a_{111}^2 + a_{300}a_{030}a_{102}a_{012} + a_{300}a_{003}a_{120}a_{021}\\
& + a_{030}a_{003}a_{210}a_{201} - a_{210}a_{120}a_{102}a_{012} - a_{210}a_{201}a_{012}a_{021} - a_{120}a_{201}a_{102}a_{021}\\
& - a_{030}a_{201}^2a_{012} - a_{003}a_{120}^2a_{201} - a_{300}a_{102}a_{021}^2 + a_{210}^2a_{012}^2 - a_{030}a_{210}a_{102}^2\\
& - a_{003}a_{210}^2a_{021} + a_{120}^2a_{102}^2 + a_{201}^2a_{021}^2 - 2a_{120}a_{102}a_{111}^2 + a_{111}^4\\
& - a_{300}a_{030}a_{003}a_{111} - 2a_{201}a_{021}a_{111}^2 + a_{300}a_{012}a_{021}a_{111} + a_{030}a_{201}a_{102}a_{111}\\
& + a_{003}a_{210}a_{120}a_{111} + 3a_{210}a_{102}a_{021}a_{111} + 3a_{120}a_{201}a_{012}a_{111},
\end{aligned}
\tag{A.1}
\]
\[
\begin{aligned}
T ={}& -3a_{012}^2a_{300}^2a_{021}^2 - 24a_{111}^2a_{201}^2a_{021}^2 + 24a_{120}a_{111}^4a_{102} + 4a_{120}^3a_{300}a_{003}^2\\
& - 3a_{120}^2a_{210}^2a_{003}^2 - 27a_{210}^2a_{102}^2a_{021}^2 + 4a_{201}^3a_{030}^2a_{003} - 24a_{012}^2a_{210}^2a_{111}^2\\
& + 24a_{012}a_{210}a_{111}^4 - 24a_{120}^2a_{102}^2a_{111}^2 + 4a_{300}^2a_{021}^3a_{003} + a_{300}^2a_{030}^2a_{003}^2\\
& + 24a_{111}^4a_{201}a_{021} - 12a_{120}^2a_{300}a_{003}a_{102}a_{021} + 4a_{012}^3a_{300}^2a_{030} - 27a_{012}^2a_{120}^2a_{201}^2\\
& - 3a_{201}^2a_{030}^2a_{102}^2 + 8a_{120}^3a_{102}^3 + 8a_{012}^3a_{210}^3 + 24a_{210}^2a_{003}a_{201}a_{021}^2\\
& + 4a_{300}a_{030}^2a_{102}^3 + 12a_{210}^2a_{111}^2a_{021}a_{003} + 12a_{210}a_{111}^2a_{102}^2a_{030} - 36a_{210}a_{111}^3a_{021}a_{102}\\
& + 8a_{201}^3a_{021}^3 - 8a_{111}^6 + 4a_{210}^3a_{003}^2a_{030} - 24a_{300}a_{030}a_{102}^2a_{111}a_{021}\\
& + 12a_{201}^2a_{030}a_{102}a_{111}a_{021} - 12a_{120}a_{210}a_{102}^3a_{030} - 12a_{120}a_{210}a_{111}^3a_{003} + 24a_{120}a_{300}a_{102}^2a_{021}^2\\
& - 12a_{300}a_{201}a_{021}^3a_{102} - 12a_{102}a_{201}a_{111}^3a_{030} - 24a_{210}^2a_{003}a_{102}a_{111}a_{030} - 24a_{300}a_{210}a_{021}^2a_{111}a_{003}\\
& - 20a_{300}a_{111}^3a_{030}a_{003} - 12a_{201}^2a_{030}a_{210}a_{021}a_{003} + 36a_{210}a_{111}^2a_{201}a_{030}a_{003} + 36a_{210}a_{102}a_{111}a_{021}^2a_{201}\\
& + 6a_{300}a_{030}a_{102}a_{210}a_{021}a_{003} - 6a_{300}a_{030}^2a_{102}a_{201}a_{003} + 6a_{210}a_{201}a_{021}a_{102}^2a_{030} + 12a_{300}a_{021}^2a_{111}^2a_{102}\\
& + 12a_{300}a_{201}a_{030}a_{003}a_{111}a_{021} + 12a_{012}a_{300}a_{210}a_{030}a_{111}a_{003} + 6a_{012}a_{300}a_{210}a_{021}^2a_{102}\\
& + 12a_{012}a_{120}a_{210}^2a_{111}a_{003} + 6a_{012}a_{120}a_{201}^2a_{030}a_{102} + 18a_{012}a_{300}a_{030}a_{102}a_{201}a_{021}\\
& + 6a_{012}a_{120}a_{300}a_{201}a_{030}a_{003} - 12a_{012}a_{120}a_{300}a_{030}a_{102}^2 - 12a_{012}a_{120}a_{210}a_{111}^2a_{102}\\
& + 36a_{012}a_{120}a_{201}^2a_{021}a_{111} + 36a_{012}a_{120}^2a_{111}a_{102}a_{201} - 24a_{012}a_{120}^2a_{300}a_{111}a_{003}\\
& + 6a_{012}a_{120}^2a_{210}a_{201}a_{003} + 18a_{012}a_{120}a_{300}a_{210}a_{021}a_{003} - 60a_{012}a_{120}a_{300}a_{111}a_{021}a_{102}\\
& + 6a_{012}^2a_{120}a_{300}a_{201}a_{021} - 24a_{012}^2a_{300}a_{201}a_{111}a_{030} - 6a_{012}a_{120}a_{210}a_{102}a_{201}a_{021}\\
& - 12a_{012}a_{210}a_{111}^2a_{201}a_{021} + 36a_{012}a_{300}a_{030}a_{102}a_{111}^2 + 36a_{012}^2a_{120}a_{210}a_{111}a_{201}\\
& - 12a_{012}^2a_{300}a_{030}a_{102}a_{210} + 12a_{012}^2a_{300}a_{111}a_{021}a_{210} - 24a_{120}a_{201}^2a_{030}a_{111}a_{003}\\
& - 12a_{120}a_{102}a_{201}^2a_{021}^2 + 36a_{120}a_{210}a_{102}^2a_{111}a_{021} + 6a_{120}a_{210}^2a_{003}a_{102}a_{021}\\
& - 60a_{012}a_{102}a_{201}a_{111}a_{030}a_{210} + 36a_{012}a_{210}^2a_{111}a_{021}a_{102} + 12a_{120}a_{300}a_{030}a_{102}a_{111}a_{003}\\
& - 60a_{120}a_{210}a_{201}a_{021}a_{111}a_{003} + 18a_{120}a_{210}a_{102}a_{201}a_{030}a_{003} - 12a_{120}a_{102}a_{111}^2a_{201}a_{021}\\
& + 12a_{120}a_{201}a_{111}a_{030}a_{102}^2 - 12a_{120}^3a_{102}a_{201}a_{003} - 12a_{120}a_{300}a_{003}a_{201}a_{021}^2\\
& + 36a_{120}a_{300}a_{111}^2a_{021}a_{003} - 6a_{120}a_{300}a_{003}^2a_{210}a_{030} - 12a_{120}^2a_{102}^2a_{201}a_{021}\\
& + 24a_{120}^2a_{201}^2a_{021}a_{003} + 12a_{120}^2a_{111}^2a_{201}a_{003} + 12a_{012}a_{201}^2a_{030}a_{111}^2\\
& - 12a_{012}a_{300}a_{111}^3a_{021} - 12a_{012}a_{120}^2a_{210}a_{102}^2 - 12a_{012}a_{201}^3a_{030}a_{021}\\
& - 12a_{012}a_{210}a_{201}^2a_{021}^2 - 36a_{012}a_{120}a_{111}^3a_{201} - 12a_{012}a_{210}^3a_{021}a_{003}\\
& + 24a_{012}a_{210}^2a_{102}^2a_{030} - 12a_{012}^3a_{300}a_{210}a_{120} + 24a_{012}^2a_{120}^2a_{300}a_{102}\\
& - 12a_{012}^2a_{120}a_{210}^2a_{102} + 12a_{012}^2a_{120}a_{300}a_{111}^2 - 12a_{012}^2a_{210}^2a_{201}a_{021}\\
& + 24a_{012}^2a_{201}^2a_{030}a_{210} + 12a_{120}^2a_{210}a_{102}a_{111}a_{003} - 12a_{012}a_{210}^2a_{201}a_{030}a_{003}\\
& + 12a_{012}a_{300}a_{201}a_{021}^2a_{111} - 6a_{012}a_{300}^2a_{030}a_{021}a_{003}.
\end{aligned}
\tag{A.2}
\]
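To illustrate how the invariant S of (A.1) enters (9.5), the following sketch evaluates S on a cubic given by its coefficients a_pqr in the normalization (9.3). The term table `S_TERMS` is our transcription of (A.1), and the index strings such as "300" stand for a_300; both the encoding and the sample numbers are illustrative assumptions. For the Weierstrass cubic x2^2 x0 − 4x1^3 + f x1 x0^2 + g x0^3 one has a_102 = 1/3, a_030 = −4, a_210 = f/3, a_300 = g, and the only surviving term gives S = 4f/27, in agreement with f = (27/4)S.

```python
from fractions import Fraction

# Each entry: (integer coefficient, factors a_pqr written as "pqr" strings).
S_TERMS = [
    (-1, "300 120 012 012"), (-2, "210 012 111 111"), (1, "300 030 102 012"),
    (1, "300 003 120 021"), (1, "030 003 210 201"), (-1, "210 120 102 012"),
    (-1, "210 201 012 021"), (-1, "120 201 102 021"), (-1, "030 201 201 012"),
    (-1, "003 120 120 201"), (-1, "300 102 021 021"), (1, "210 210 012 012"),
    (-1, "030 210 102 102"), (-1, "003 210 210 021"), (1, "120 120 102 102"),
    (1, "201 201 021 021"), (-2, "120 102 111 111"), (1, "111 111 111 111"),
    (-1, "300 030 003 111"), (-2, "201 021 111 111"), (1, "300 012 021 111"),
    (1, "030 201 102 111"), (1, "003 210 120 111"), (3, "210 102 021 111"),
    (3, "120 201 012 111"),
]

def invariant_S(a):
    """Evaluate S of (A.1); `a` maps "pqr" -> a_pqr, missing keys count as 0."""
    total = Fraction(0)
    for coeff, factors in S_TERMS:
        term = Fraction(coeff)
        for idx in factors.split():
            term *= a.get(idx, 0)
        total += term
    return total

def permute_indices(a, perm):
    """Relabel the variables: the index at position perm[k] moves to position k."""
    return {"".join(idx[k] for k in perm): v for idx, v in a.items()}

# Weierstrass cubic with arbitrary sample values of f and g.
f, g = Fraction(5, 7), Fraction(-3, 11)
weierstrass = {"102": Fraction(1, 3), "030": Fraction(-4),
               "210": f / 3, "300": g}
S_weierstrass = invariant_S(weierstrass)

# A generic cubic with arbitrary rational coefficients, used to test that S
# is unchanged when two of the variables are exchanged.
ALL_INDICES = ["300", "030", "003", "210", "201", "120", "021", "102", "012", "111"]
generic = {idx: Fraction(k + 2, k + 5) for k, idx in enumerate(ALL_INDICES)}
```

The Fermat cubic x0^3 + x1^3 + x2^3 has vanishing S in this normalization, and exchanging two variables leaves S of the generic cubic unchanged; both facts serve as checks on the transcription of (A.1).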
References

[1] O. J. Ganor and A. Hanany, Small E8 instantons and tensionless non-critical strings, Nucl. Phys. B474 (1996) 122–138.
[2] O. J. Ganor, Toroidal compactification of heterotic 6D non-critical strings down to four dimensions, Nucl. Phys. B488 (1997) 223–235.
[3] O. J. Ganor, D. R. Morrison and N. Seiberg, Branes, Calabi–Yau spaces, and toroidal compactification of the N = 1 six-dimensional E8 theory, Nucl. Phys. B487 (1997) 93–127.
[4] A. Klemm, P. Mayr and C. Vafa, BPS states of exceptional non-critical strings, Nucl. Phys. B (Proc. Suppl.) 58 (1997) 177–194.
[5] W. Lerche, P. Mayr and N. P. Warner, Non-critical strings, del Pezzo singularities and Seiberg–Witten curves, Nucl. Phys. B499 (1997) 125–148.
[6] J. A. Minahan, D. Nemeschansky and N. P. Warner, Partition functions for BPS states of the non-critical E8 string, Adv. Theor. Math. Phys. 1 (1998) 167–183.
[7] J. A. Minahan, D. Nemeschansky, C. Vafa and N. P. Warner, E-strings and N = 4 topological Yang–Mills theories, Nucl. Phys. B527 (1998) 581–623.
[8] S. Hosono, M.-H. Saito and J. Stienstra, On the mirror symmetry conjecture for Schoen’s Calabi–Yau 3-folds, in Integrable System and Algebraic Geometry, Proc. Taniguchi Symposium 1997, eds. M.-H. Saito, Y. Shimizu and K. Ueno, World Scientific, 1998, pp. 194–235.
[9] S. Hosono, M.-H. Saito and A. Takahashi, Holomorphic anomaly equation and BPS state counting of rational elliptic surface, Adv. Theor. Math. Phys. 3 (1999) 177–208.
[10] T.-M. Chiang, A. Klemm, S.-T. Yau and E. Zaslow, Local mirror symmetry: calculations and interpretations, Adv. Theor. Math. Phys. 3 (1999) 495–565.
[11] S. Hosono, A. Klemm, S. Theisen and S.-T. Yau, Mirror symmetry, mirror map and applications to complete intersection Calabi–Yau spaces, Nucl. Phys. B433 (1995) 501–552.
[12] R. Miranda and U. Persson, On extremal rational elliptic surfaces, Math. Z. 193 (1986) 537–558.
[13] O. DeWolfe, T. Hauer, A. Iqbal and B. Zwiebach, Uncovering infinite symmetries on [p, q] 7-branes: Kac–Moody algebras and beyond, Adv. Theor. Math. Phys. 3 (1999) 1835–1891.
[14] K. Saito, Extended affine root systems II (flat invariants), Publ. RIMS, Kyoto Univ. 26 (1990) 15–78.
[15] I. Satake, Automorphisms of the extended affine root systems and modular property for the flat theta invariants, Publ. RIMS, Kyoto Univ. 31 (1995) 1–32.
[16] M. Bershadsky, S. Cecotti, H. Ooguri and C. Vafa, Kodaira–Spencer theory of gravity and exact results for quantum string amplitudes, Commun. Math. Phys. 165 (1994) 311–427.
[17] T. Kawai and K. Yoshioka, String partition functions and infinite products, Adv. Theor. Math. Phys. 4 (2000) 397–485.
[18] A. N. Schellekens and N. P. Warner, Anomalies, characters and strings, Nucl. Phys. B287 (1987) 317–361.
[19] G. Curio and D. Lüst, A class of N = 1 dual string pairs and its modular superpotentials, Int. J. Mod. Phys. 12 (1997) 5847–5866.
[20] M. Fukae, Y. Yamada and S.-K. Yang, Mordell–Weil lattice via string junctions, Nucl. Phys. B572 (2000) 71–94.
[21] S. Hosono, M.-H. Saito and A. Takahashi, Relative Lefschetz action and BPS state counting, Int. Math. Res. Notice. 15 (2001) 783–816.
[22] B. Schoeneberg, Elliptic Modular Functions: An Introduction, Die Grundlehren der mathematischen Wissenschaften, Bd. 203, Springer-Verlag, 1974.
[23] K. Mohri, Y. Ohtake and S.-K. Yang, Dualities between string junctions and D-branes on del Pezzo surfaces, Nucl. Phys. B595 (2001) 138–164.
[24] K. Mohri, Y. Onjo and S.-K. Yang, Closed sub-monodromy problems, local mirror symmetry and branes on orbifolds, Rev. Math. Phys. 13(6) (2001) 675–715.
[25] G. Aldazabal, A. Font, L. E. Ibáñez and A. M. Uranga, New branches of string compactifications and their F-theory duals, Nucl. Phys. B492 (1996) 119–151.
[26] B. H. Lian and S.-T. Yau, Arithmetic properties of mirror map and quantum coupling, Commun. Math. Phys. 176 (1996) 163–191.
[27] P. Candelas, X. C. de la Ossa, P. S. Green and L. Parkes, A pair of Calabi–Yau manifolds as an exactly soluble superconformal theory, Nucl. Phys. B359 (1991) 21–74.
[28] M. Bershadsky, S. Cecotti, H. Ooguri and C. Vafa, Holomorphic anomalies in topological field theories, Nucl. Phys. B405 (1993) 279–304.
[29] R. Gopakumar and C. Vafa, M-theory and topological strings II, preprint.
[30] J. Bryan and R. Pandharipande, BPS states of curves in Calabi–Yau 3-folds, Geometry and Topology 5 (2001) 287–318.
[31] S. Katz, A. Klemm and C. Vafa, M-theory, topological strings and spinning black holes, Adv. Theor. Math. Phys. 3 (1999) 1445–1537.
[32] K. Wirthmüller, Root systems and Jacobi forms, Compositio Math. 82 (1992) 293–354.
[33] N. Seiberg and E. Witten, Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD, Nucl. Phys. B431 (1994) 484–550.
[34] T. Hauer and A. Iqbal, Del Pezzo surfaces and affine 7-brane backgrounds, J. High Energy Phys. 0001 (2000) 043.
[35] E. Looijenga, Rational surfaces with an anti-canonical cycle, Ann. Math. 114 (1981) 267–322.
[36] H. Sakai, Rational surfaces with affine root systems and geometry of the Painlevé equations, Commun. Math. Phys. 220 (2001) 165–221.
[37] A. Hurwitz und R. Courant, Vorlesungen über Allgemeine Funktionentheorie und Elliptische Funktionen, Die Grundlehren der mathematischen Wissenschaften, Bd. 3, Springer-Verlag, 1929.
[38] J. Tannery et J. Molk, Éléments de la Théorie des Fonctions Elliptiques, Tomes I–IV, Gauthier-Villars, 1893–1902; reprinted in two vols. Chelsea, New York, 1972.
[39] D. Hilbert, Theory of Algebraic Invariants, Cambridge Mathematical Library, Cambridge Univ. Press, 1993.
[40] B. Sturmfels, Algorithms in Invariant Theory, Texts and Monographs in Symbolic Computation, Springer, 1993.
[41] J. A. Minahan, D. Nemeschansky and N. P. Warner, Investigating the BPS spectrum of non-critical En strings, Nucl. Phys. B508 (1997) 64–106.
September 23, 2002 13:37 WSPC/148-RMP
00145
Reviews in Mathematical Physics, Vol. 14, No. 9 (2002) 977–1049 c World Scientific Publishing Company
THE MASTER WARD IDENTITY
M. DÜTSCH∗ and F.-M. BOAS†
Institut für Theoretische Physik, Universität Göttingen, Bunsenstrasse 9, D-37073 Göttingen, Germany ∗
[email protected] †
[email protected] Received 12 November 2001 Revised 2 May 2002
In the framework of perturbative quantum field theory (QFT) we propose a new, universal (re)normalization condition (called ‘master Ward identity’) which expresses the symmetries of the underlying classical theory. It implies for example the field equations, energy-momentum, charge- and ghost-number conservation, renormalized equaltime commutation relations and BRST-symmetry. It seems that the master Ward identity can nearly always be satisfied, the only exceptions we know are the usual anomalies. We prove the compatibility of the master Ward identity with the other (re)normalization conditions of causal perturbation theory, and for pure massive theories we show that the ‘central solution’ of Epstein and Glaser fulfills the master Ward identity, if the UV-scaling behavior of its individual terms is not relatively lowered. Application of the master Ward identity to the BRST-current of non-Abelian gauge theories generates an identity (called ‘master BRST-identity’) which contains the information which is needed for a local construction of the algebra of observables, i.e. the elimination of the unphysical fields and the construction of physical states in the presence of an adiabatically switched off interaction. Keywords: perturbative quantum field theory; renormalization; gauge field theories.
Contents
1 Introduction
2 Formulation of the master Ward identity
2.1 The symbolical algebra with internal and external derivative
2.2 Inductive construction of time ordered products, basic normalization conditions (N0)–(N3)
2.3 Normalization of time-ordered products of symbols with external derivative
2.4 The master Ward identity
3 Steps towards a proof of the master Ward identity
3.1 Proof of (Ñ)
3.2 Compatibility of the master Ward identity with (N3)
3.3 Proof of the master Ward identity for solely massive fields and not relatively lowered scaling degree
4 Applications of the master Ward identity
4.1 Field equation
4.2 Charge- and ghost-number conservation
4.3 Non-Abelian matter currents
4.4 The master BRST-identity
4.5 Local construction of observables in gauge theories
4.5.1 Massless gauge fields: determination of the interaction
4.5.2 Massless gauge fields: local construction of observables
4.5.3 Massive gauge fields
4.5.4 The interacting BRST-charge
5 Taking anomalies into account
5.1 General procedure and axial anomaly
5.2 Energy momentum tensor: conservation and trace anomaly
6 Conclusions
Appendix A: Feynman propagators
Appendix B: Explicit results for ∆µ used in the application of the MWI to the BRST-current
References
1. Introduction

A perturbative interacting quantum field theory is usually constructed in terms of time ordered products (‘T-products’) T(W1, …, Wn)(x1, …, xn) of Wick polynomials W1(x1), …, Wn(xn) of free fields. The T-products are ill-defined for coinciding points because they are (operator-valued) distributions. In the framework of the inductive construction of Bogoliubov [4] and Epstein/Glaser [19] (‘causal perturbation theory’) this can be formulated as follows: the T-products of n factors are known by induction as operator-valued distributions up to the total diagonal Dn := {(x1, …, xn) | x1 = · · · = xn}. The problem of renormalization is located in the extension of the T-products to Dn, for every n. This extension is always possible, but it is non-unique. The freedom is restricted by normalization conditions. They require that symmetries which are present outside Dn are maintained in the extension and that, for each term in the T-product, the order of the singularity at Dn is not increased by the extension. (The latter implies that an interaction with mass dimension ≤ 4 yields a renormalizable theory (by power counting).) Epstein/Glaser [19] (see also [5]) give a general formula (73) for the extension to Dn which satisfies the last requirement, but the other normalization conditions are not taken into account. So, the main problem of perturbative renormalization is to prove that there is an extension which fulfills all normalization conditions. In the framework of algebraic renormalization the corresponding problem is treated by means of the ‘quantum action principle’ [26, 27, 31], which states that the variation of Green’s functions (under a change of coordinates, a variation of the fields or a variation of a parameter) is equal to the insertion of a (local or space-time integrated) composite field operator.
Recently a local algebraic operator formulation of certain cases of the quantum action principle has been given by using causal perturbation theory, and the connection to our normalization conditions has been clarified [8].
The master Ward identity (we will use the abbreviation ‘MWI’) is a universal normalization condition supplementing the obvious ones. It is an explicit expression for
\[
\partial^\nu_x T(W, W_1, \dots, W_n)(x, x_1, \dots, x_n) - T(\partial^\nu W, W_1, \dots, W_n)(x, x_1, \dots, x_n).
\tag{1}
\]
Generally this difference cannot vanish for the following reason (footnote a): the Wick polynomials are built up from free fields, whereas the T-products are the building stones of the perturbative interacting fields [4]. However, the field equations of free and interacting fields are different. Computing the difference (1) by means of the Feynman rules and the normalization condition (N3) (see Sec. 2.2), it can be expressed solely by terms which contain the difference ∂^ν_x ⟨Ω, T(φ, χ)(x, xl)Ω⟩ − ⟨Ω, T(∂^ν φ, χ)(x, xl)Ω⟩ of Feynman propagators, where Ω is the Fock vacuum and φ, χ are free fields. The MWI requires that this structure is preserved in the process of renormalization (Sec. 2). For tree diagrams this is automatically satisfied, but for loop diagrams it is a hard task to show that there exists a normalization which fulfills the MWI and the other normalization conditions (Sec. 3). Unfortunately there are a few examples where this is impossible. However, the only obstructions we know are the usual, well-known anomalies of perturbative QFT (Sec. 5).

The master Ward identity expresses all symmetries which can be traced back to the field equations in classical field theory (footnote b). In particular we will demonstrate that the MWI implies
• the field equations for the interacting fields (Sec. 4.1),
• conservation of the energy-momentum tensor (Sec. 5.2),
• charge conservation in the presence of spinor fields (Sec. 4.2),
• ghost number conservation in the presence of fermionic ghost fields (Sec. 4.2), and
• the master BRST-identity (Sec. 4.4–4.5), which contains the full information of BRST-symmetry [2] for massless and massive gauge fields.

Footnote a: In particular this argument, the name ‘master Ward identity’ and the application of the MWI to the computation of a rigorous substitute for the equal-time commutator of interacting fields (3) are due to Klaus Fredenhagen.

Footnote b: In [9] we extensively work out the MWI in classical field theory. There the MWI can be formulated non-perturbatively: it is a consequence of the field equations and the fact that classical fields may be multiplied point-wise. Hence, the classical MWI always holds true. (This, together with the fact that in the perturbative expansion of classical fields solely tree diagrams appear, agrees with the triviality of the quantum MWI for tree diagrams.) The classical formulation of the MWI shows that there is a close connection to the Schwinger–Dyson equations.

The field equations, conservation of the energy-momentum tensor, charge and ghost number conservation have already been proved by using other methods of renormalization (see e.g. [41] for the field equation and [28] for the energy-momentum tensor, both are based on BPHZ-renormalization) or/and in
the framework of causal perturbation theory [7, 12, 29, 40]. Using the normal products of Zimmermann [41], Lowenstein has proved that it is allowed to take a partial derivative out of a Green’s function, if the degree of BPHZ-subtraction is lowered by one, see [28, Appendix B]. However, we are not aware of a formulation of the MWI in its full generality in any method of renormalization. Also the master BRST-identity is new to our knowledge. It is the answer to the obvious question: what results for [Q0 , T (W1 , . . . , Wn )(x1 , . . . , xn )]∓
(2)
if the MWI is satisfied? Thereby, W1 , . . . , Wn are arbitrary Wick monomials, Q0 is the generator of the BRST-transformation of the free fields and [· , ·]∓ means the Z2 -graded (with respect to the ghost number) commutator. There have been other approaches to formulate BRST-symmetry in the framework of causal perturbation theory. In particular the ‘perturbative gauge invariance’ of [11, 15, 35], which was further developed by [16, 21, 38], suffices for a consistent construction of the S-matrix in the adiabatic limit, provided this limit exists. However, this assumption holds certainly not true in massless non-Abelian gauge theories, it seems that the confinement is out of the reach of perturbation theory. In massive non-Abelian gauge theories the instability of physical particles (W - and Z-bosons, muon and tau etc.) is an obstacle for an S-matrix description with adiabatic limit. Our way out is to construct the observables locally (i.e. with the interaction adiabatically switched off, Sec. 4.5), as we have done it for QED [7]. For our operator formalism the BRST-charge operator of Kugo and Ojima [24] seems to be the adequate tool to define the BRST-transformation. But in contrast to this reference we do not perform the adiabatic limit and, hence, avoid the infrared divergences. The mentioned perturbative gauge invariance [11, 15] does not suffice for our local construction of observables in non-Abelian (massless or massive) gauge theories. But we show that ghost number conservation and the master BRST-identity contain all information which is needed for this construction. In particular we will see that the master BRST-identity implies the perturbative gauge invariance of [11, 15] (and even the generalization proposed in [18], which is called ‘generalized (free perturbative operator) gauge invariance’ in [16]). 
In spite of all these important implications of the MWI, it is difficult to give a direct physical interpretation of this identity (in its full generality) or to formulate the symmetry which is expressed by it. We give two partial answers:
• In classical field theory the MWI can be understood as the most general identity which can be obtained from the field equations and the fact that classical fields may be multiplied point-wise and factorize: $(AB)_{\mathcal L}(x) = A_{\mathcal L}(x)B_{\mathcal L}(x)$ (where $A, B$ are field polynomials and $A_{\mathcal L}, B_{\mathcal L}$ are the corresponding fields to the interaction $\mathcal L$), see [9]. But quantum fields may not be multiplied point-wise (because they are distributions) and, hence, the quantum MWI contains much more information than only the field equations.
September 23, 2002 13:37 WSPC/148-RMP
00145
The Master Ward Identity
981
• A particular case of the MWI is a formula for
\[ \partial_{x^0} T(A_{\mathcal L}(x)B_{\mathcal L}(y)) - T(\partial_{x^0}A_{\mathcal L}(x)B_{\mathcal L}(y)) , \]
where $A_{\mathcal L}, B_{\mathcal L}$ are interacting quantum fields to the interaction $\mathcal L$ (i.e. non-local formal power series in free fields) and $T(\cdots)$ means time ordering in $x$ and $y$. This difference can be interpreted as a rigorous substitute for the equal-time commutator of $A_{\mathcal L}$ and $B_{\mathcal L}$:^c defining heuristically
\[ \tilde T(A_{\mathcal L}(x)B_{\mathcal L}(y)) \overset{\rm def}{=} \Theta(x^0-y^0)A_{\mathcal L}(x)B_{\mathcal L}(y) + \Theta(y^0-x^0)B_{\mathcal L}(y)A_{\mathcal L}(x) \]
we in fact obtain
\[ \partial_{x^0}\tilde T(A_{\mathcal L}(x)B_{\mathcal L}(y)) - \tilde T(\partial_{x^0}A_{\mathcal L}(x)B_{\mathcal L}(y)) = \delta(x^0-y^0)[A_{\mathcal L}(x),B_{\mathcal L}(y)] . \tag{3} \]
However, $\tilde T(\cdots)$ is problematic: $\Theta(x^0-y^0)A(x)B(y)$ exists if $A$ and $B$ are free fields, but this does not hold for $A$ and $B$ being Wick polynomials, and for interacting fields the situation is even worse. In addition $\tilde T(\cdots)$ is non-covariant: for a Lorentz covariant T-product (denoted by $T(\cdots)$ in the following) and a free scalar field $\phi$ we must have the relation $\partial_\mu T(\phi(x)\partial_\nu\phi(y)) - T(\partial_\mu\phi(x)\partial_\nu\phi(y)) = C g_{\mu\nu}\delta(x-y)$ (where $C$ is an undetermined constant), which is obviously not satisfied by $\tilde T(\cdots)$. But fortunately there is the possibility that the non-covariant terms (i.e. the terms coming from $\tilde T(\cdots) - T(\cdots)$) cancel out with other unwanted terms. This indeed happens for the (interacting) quark currents $j^\mu_{a\mathcal L}$ in QCD: the (heuristic) equal-time commutator $[j^0_{a\mathcal L}(t,\vec x), j^k_{b\mathcal L}(t,\vec y)]$, $k = 1, 2, 3$, has an 'anomalous' term, the Schwinger term:
\[ [j^0_{a\mathcal L}(t,\vec x), j^k_{b\mathcal L}(t,\vec y)] = i f_{abc}\, j^k_{c\mathcal L}(t,\vec x)\,\delta^{(3)}(\vec x-\vec y) + \sum_{l=1}^{3} S^{kl}_{ab}\,\partial^l\delta^{(3)}(\vec x-\vec y) , \tag{4} \]
where $S^{kl}_{ab}\in\mathbb{C}$ is constant. In [23, (11)–(89)], it is postulated that the non-covariant terms of $\tilde T(\cdots)$ are compensated by the Schwinger terms:
\[ \partial^x_\mu T(j^\mu_{a\mathcal L}(x)j^\nu_{b\mathcal L}(y)) - T(\partial_\mu j^\mu_{a\mathcal L}(x)j^\nu_{b\mathcal L}(y)) = i f_{abc}\, j^\nu_{c\mathcal L}(x)\,\delta^{(4)}(x-y) . \tag{5} \]
We will show that this identity is in fact a consequence of the MWI (Sec. 4.3).
We return to the crucial question whether the MWI can be satisfied in agreement with all other normalization conditions. The compatibility with the other normalization conditions can be proved generally (Sec. 3.2). If all fields are massive there is a distinguished normalization of the T-products, the so-called central solution [19]. We prove that the central solutions fulfil the MWI (and all other normalization conditions) if the UV-scaling behavior of its individual terms is not relatively lowered (Sec. 3.3). This assumption holds in most cases. However, e.g. for the axial and pseudo-scalar triangle-diagram (89) it is violated, and this makes possible the appearance of the axial anomaly.
Footnote c: We recall the well-known fact that interacting fields to a sharp time do not exist, i.e. $\int d^3x\, f(\vec x)\int dx^0\,\delta(x^0-t)A_{\mathcal L}(x)$, $t\in\mathbb{R}$, $f\in\mathcal{D}(\mathbb{R}^3)$, is mathematically ill-defined.
M. Dütsch & F.-M. Boas
2. Formulation of the Master Ward Identity
2.1. The symbolical algebra with internal and external derivative
Let $\{\phi^{(k)} \,|\, k = 1,\dots,M\}$ be the free quantum fields in terms of which the model is defined. We assume that this set is closed with respect to taking the adjoint operator. In the larger set $\Phi \overset{\rm def}{=} \{\partial^a\phi^{(k)} \,|\, k = 1,\dots,M,\ a\in\mathbb{N}_0^4\}$ ($\partial^a$ is a partial derivative of arbitrary order) we neglect the free field equations and write $\Phi$ as a sequence $(\phi_l)_{l\in\mathbb{N}}$. To each $\phi_l$ we associate a symbol $\varphi_l \equiv {\rm sym}(\phi_l)$, $l\in\mathbb{N}$. Let $\mathcal{P}$ be the unital, Abelian ∗-algebra^d generated by these symbols. Thereby the symbols corresponding to a free quantum field and to its partial derivatives are linearly independent. The ∗-operation in $\mathcal{P}$ corresponds to taking the adjoint of the free field operators: $\varphi_l^* \equiv {\rm sym}(\phi_l)^* \overset{\rm def}{=} {\rm sym}(\phi_l^+)$. We define an internal derivative $\partial^\mu : \mathcal{P}\to\mathcal{P}$ by $\partial^\mu\varphi_l \equiv \partial^\mu\,{\rm sym}(\phi_l) \overset{\rm def}{=} {\rm sym}(\partial^\mu\phi_l)$ and the requirements that $\partial^\mu$ is linear and a derivation. Now we divide $\mathcal{P}$ by the ideal $\mathcal{J}$ which is generated by the free field equations (with respect to the internal derivative) and denote the resulting unital, Abelian ∗-algebra by $\mathcal{P}_0$,
\[ \mathcal{P}_0 \overset{\rm def}{=} \frac{\mathcal{P}}{\mathcal{J}} . \tag{6} \]
Let $\pi$ be the projection $\pi : \mathcal{P}\to\mathcal{P}_0 : A \mapsto A + \mathcal{J}$. Internal derivatives in $\mathcal{P}_0$ are defined by^e $\partial^\mu\pi(A) \overset{\rm def}{=} \pi(\partial^\mu A)$, and in this sense the free field equations are valid in $\mathcal{P}_0$. In addition we introduce an external derivative^f $\tilde\partial^\mu$ on $\mathcal{P}_0$ which generates new symbols $\tilde\partial^a A$ ($A\in\mathcal{P}_0$, $a\in\mathbb{N}_0^4$, i.e. $\tilde\partial^a$ means a higher external derivative of order $|a| = a_0+a_1+a_2+a_3$) and is required to be linear and a derivation. In particular we set $\tilde\partial^a 1 \overset{\rm def}{=} 0$, $\forall a\neq 0$. The Abelian, unital ∗-algebra (anticommuting in the case of Fermi fields) which is generated by these new symbols is denoted by $\tilde{\mathcal{P}}_0$:
\[ \tilde{\mathcal{P}}_0 \overset{\rm def}{=} \bigvee \{\tilde\partial^a A \,|\, A\in\mathcal{P}_0,\ a\in\mathbb{N}_0^4\} . \tag{7} \]
Next we extend the external and internal derivatives and the ∗-operation to maps $\tilde{\mathcal{P}}_0\to\tilde{\mathcal{P}}_0$. For the former two we set
\[ \tilde\partial^b\tilde\partial^a A \overset{\rm def}{=} \tilde\partial^{(b+a)} A , \qquad \partial^b\tilde\partial^a A \overset{\rm def}{=} \tilde\partial^a\partial^b A , \qquad \forall A\in\mathcal{P}_0 , \tag{8} \]
and require that $\tilde\partial^\mu$ and $\partial^\mu$ are linear and derivations. The ∗-operation is extended by $(\tilde\partial^a A)^* \overset{\rm def}{=} \tilde\partial^a(A^*)$ ($A\in\mathcal{P}_0$) and by requiring the usual algebraic relations: anti-linearity, $(BC)^* = C^*B^*$ and $B^{**} = B$, $\forall B, C\in\tilde{\mathcal{P}}_0$. Finally we introduce the space^g
\[ \mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0) \cong \tilde{\mathcal{P}}_0\otimes\mathcal{D}(\mathbb{R}^4) . \tag{9} \]
Footnote d: In the case of Fermi fields the symbols anticommute and we call them 'fermionic symbols'.
Footnote e: Note that this definition is independent from the choice of the representative $A$.
Footnote f: This external derivative has nothing to do with the exterior derivative of differential geometry.
Footnote g: A fermionic $V\in\tilde{\mathcal{P}}_0$ is paired (to $V\otimes f$) with a Grassmann-valued test function $f$, see e.g. [34, Appendix D].
The internal and external derivatives are defined on this space as the operators $\partial^\mu\otimes 1$ and $\tilde\partial^\mu\otimes 1$.
Remark 2.1. There exists a surjective algebra ∗-homomorphism $\tilde\sigma : \tilde{\mathcal{P}}_0\to\mathcal{P}$. This becomes clear from the formalism developed in [9]. Namely, we prove in appendix A of [9] that there exists a map $\sigma : \mathcal{P}_0\to\mathcal{P}$ (i.e. 'from free fields to fields') with the properties:
(i) $\pi\circ\sigma = 1$.
(ii) $\sigma$ is an algebra ∗-homomorphism, i.e. $\sigma$ is linear, $\sigma(AB) = \sigma(A)\sigma(B)$ and $\sigma(A^*) = \sigma(A)^*$.
(iii) The Lorentz transformation commutes with $\sigma\pi$.
(iv) $\sigma\pi(\mathcal{P}_1)\subset\mathcal{P}_1$, where $\mathcal{P}_1$ is the sub vector space of $\mathcal{P}$ with basis $(\varphi_l)_l$, i.e. the 'one-factor symbols'.
(v) $\sigma$ does not increase the mass dimension of the fields, i.e. $\sigma\pi(B)$ is a sum of terms with mass dimension $\le \dim(B)$. In particular we find $\sigma\pi(\varphi) = \varphi$, if $\varphi\in\mathcal{P}_1$ corresponds to a free field without any derivative.
(vi) $\bigvee\{\partial^a\sigma(A) \,|\, A\in\mathcal{P}_0,\ a\in\mathbb{N}_0^d\} = \mathcal{P}$.
We now extend $\sigma$ to a map $\tilde\sigma : \tilde{\mathcal{P}}_0\to\mathcal{P}$ by setting
\[ \tilde\sigma(\tilde\partial^a A) \overset{\rm def}{=} \partial^a\sigma(A) , \qquad A\in\mathcal{P}_0 , \tag{10} \]
and requiring that $\tilde\sigma$ is an algebra ∗-homomorphism. The property (vi) means that $\tilde\sigma$ is surjective. However, $\tilde\sigma$ is not injective. This follows from the following simple example: let $\varphi\in\mathcal{P}_1$ correspond to a free Klein Gordon field (without any derivative). Usually it holds $\sigma\pi(\partial^\nu\varphi) = \partial^\nu\varphi$. Then, $\tilde\sigma(\tilde\partial^\nu\pi\varphi) = \partial^\nu\varphi = \tilde\sigma(\partial^\nu\pi\varphi)$. This example does not appear if one introduces an additional field $\varphi^\mu$ which replaces $\partial^\mu\varphi$ (see [9, Secs. 2.3 and 3.2]), but $\tilde\sigma$ is not injective in that case either.
2.2. Inductive construction of time ordered products, basic normalization conditions (N0)–(N3)
The time-ordered product $T_n$ (also called 'T-product') is a linear, symmetrical^h map from $\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)^{\otimes n}$ into the (unbounded) operators on the Fock space of the free quantum fields.^i In particular linearity implies $T_n((\partial_\nu V^\nu)g\otimes\cdots) = 0$ if $\partial_\nu V^\nu$ vanishes due to the free field equations. All T-products $T_n(f_1\otimes\cdots\otimes f_n)$, $f_j\in\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$, $n\in\mathbb{N}$, have the same domain $\mathcal{D}$ which is a dense subspace of the Fock space and which is invariant under all T-products [19].
Footnote h: To distinguish the symmetry of $T_n$ from other symmetries we sometimes call it 'permutation symmetry'.
Footnote i: In [9] and [10] the arguments of $T_n$ are elements of $\mathcal{D}(\mathbb{R}^4,\mathcal{P})^{\otimes n}$. The map $\tilde\sigma$ (10) connects the two formalisms.
Physicists use mostly 'unsmeared T-products', which are defined by
\[ \int dx_1\dots dx_n\, T_n(V_1,\dots,V_n)(x_1,\dots,x_n)\,g_1(x_1)\dots g_n(x_n) \overset{\rm def}{=} T_n(V_1 g_1\otimes\cdots\otimes V_n g_n) , \tag{11} \]
where $g_1,\dots,g_n\in\mathcal{D}(\mathbb{R}^4)$, $V_1,\dots,V_n\in\tilde{\mathcal{P}}_0$. More precisely $(V_1,\dots,V_n)\mapsto T_n(V_1,\dots,V_n)$ is a linear and symmetrical map from $(\tilde{\mathcal{P}}_0)^{\otimes n}$ into the operator-valued distributions.
Let $\tilde{\mathcal{P}}_0\ni V = \prod_k \tilde\partial^{a^{(k)}}{\rm sym}(\phi_{j_k})$ (where $\phi_{j_k}\in\Phi$, $\forall k$). Then we define $T_1$ by
\[ T_1(Vg) \overset{\rm def}{=} \int dx\, :\!\prod_k \partial^{a^{(k)}}\phi_{j_k}\!:\!(x)\,g(x) , \qquad T_1(1g) \overset{\rm def}{=} \int dx\, g(x) , \qquad g\in\mathcal{D}(\mathbb{R}^4) , \tag{12} \]
and by linearity, where the double dots mean normal ordering of the free field operators. We point out that $T_1$ is not injective, because $T_1((\tilde\partial^a V)Wg) = T_1((\partial^a V)Wg)$, $V, W\in\tilde{\mathcal{P}}_0$. However, $T_1$ is injective if it is restricted to $\mathcal{D}(\mathbb{R}^4,\mathcal{P}_0)$. The T-products are required to satisfy causal factorization^j
\[ \text{(Causality)}\qquad T_n(f_1\otimes\cdots\otimes f_n) = T_k(f_1\otimes\cdots\otimes f_k)\,T_{n-k}(f_{k+1}\otimes\cdots\otimes f_n) \]
\[ \text{if}\quad ({\rm supp}\, f_1\cup\cdots\cup{\rm supp}\, f_k)\cap\big(({\rm supp}\, f_{k+1}\cup\cdots\cup{\rm supp}\, f_n)+\bar V_-\big) = \emptyset , \tag{13} \]
where $\bar V_-$ is the closed backward light cone in Minkowski space. Causality enables us
to construct inductively the T-products of higher orders $n\ge 2$: if the time ordered products of less than $n$ factors are everywhere defined, the time ordered product of $n$ factors is uniquely determined up to the total diagonal $D_n \overset{\rm def}{=} \{(x_1,\dots,x_n) \,|\, x_1 = \cdots = x_n\}$. Thus renormalization amounts to an extension, for every $n$, of time ordered products to $D_n$. This extension is always possible, but it is non-unique. It can be done such that the following normalization conditions hold. (Note that these conditions are automatically fulfilled on $\mathcal{D}(\mathbb{R}^{4n}\setminus D_n)$ due to the inductive procedure and causal factorization.)
Footnote j: This is the reason for the name 'time ordered product'.
• Poincaré covariance: Let $U$ be a unitary positive energy representation of the Poincaré group $\mathcal{P}_+^\uparrow$ in Fock space. $U$ induces an automorphic action $\alpha$ of $\mathcal{P}_+^\uparrow$ on $\mathcal{D}(\mathbb{R}^4,\mathcal{P}_0)$ by the definition
\[ T_1(\alpha_L(f)) \overset{\rm def}{=} {\rm Ad}\,U(L)(T_1(f)) , \qquad \forall f\in\mathcal{D}(\mathbb{R}^4,\mathcal{P}_0),\ L\in\mathcal{P}_+^\uparrow , \tag{14} \]
because $T_1$ is injective on this subspace. We extend $\alpha_L$ to $\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$ by the prescription that $(\tilde\partial^m\otimes 1)f$ transforms in the same way as $(\partial^m\otimes 1)f$, $m\in\mathbb{N}_0$ (where $\partial^m$ ($\tilde\partial^m$ resp.) denotes the $m$th power of the gradient $\partial$ ($\tilde\partial$ resp.)). More precisely let $f = \sum_i V_i\otimes g_i$, $V_i\in\mathcal{P}_0$, $g_i\in\mathcal{D}(\mathbb{R}^4)$. From (14) we know the transformation of $(\partial^m\otimes 1)f$, which can be written in the form
\[ \alpha_{(\Lambda,a)}(\partial^m\otimes 1)f = \sum_{i,j} (\partial^m V)_i\otimes D(\Lambda)_{ji}\, g_{(\Lambda,a)\,j} , \qquad L\equiv(\Lambda,a)\in\mathcal{P}_+^\uparrow ,\quad m\in\mathbb{N}_0 , \tag{15} \]
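where $g_{(\Lambda,a)}(x) \overset{\rm def}{=} g(\Lambda^{-1}(x-a))$. Then we define
\[ \alpha_{(\Lambda,a)}(\tilde\partial^m\otimes 1)f \overset{\rm def}{=} \sum_{i,j} (\tilde\partial^m V)_i\otimes D(\Lambda)_{ji}\, g_{(\Lambda,a)\,j} , \qquad (\Lambda,a)\in\mathcal{P}_+^\uparrow ,\quad m\in\mathbb{N}_0 . \tag{16} \]
One easily verifies $\alpha_{L_1L_2} = \alpha_{L_1}\alpha_{L_2}$ and that Eq. (14) holds true for the extended $\alpha_L$, i.e. for all $f\in\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$. The normalization condition expressing the Poincaré covariance of the time ordered products reads
\[ \text{(N1)}\qquad {\rm Ad}\,U(L)(T(f_1\otimes\cdots\otimes f_n)) = T(\alpha_L(f_1)\otimes\cdots\otimes\alpha_L(f_n)) , \qquad L\in\mathcal{P}_+^\uparrow . \]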
For pure massive theories the so-called 'central solution/extension' (see [19] and Sec. 3.3) is Poincaré covariant. For theories with massless fields the existence of a Poincaré covariant extension has been proved (in [39] and in the second paper of [11]) by tracing it back to a cohomological problem; an explicit solution has been given in [6].
• Unitarity: To explain what we mean by 'unitarity' we introduce the S-matrix (as a formal power series) which is the generating functional of the T-products
\[ S(f) = 1 + \sum_{n=1}^{\infty}\frac{i^n}{n!}\,T_n(f\otimes\cdots\otimes f) , \qquad f\in\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0) . \tag{17} \]
Since the zeroth order term does not vanish, it has a unique inverse in the sense of formal power series
\[ S(f)^{-1} = 1 + \sum_{n=1}^{\infty}\frac{(-i)^n}{n!}\,\bar T_n(f\otimes\cdots\otimes f) , \tag{18} \]
where the 'anti-chronological products' $\bar T(\cdots)$ can be expressed in terms of the time ordered products
\[ \bar T_n(f_1\otimes\cdots\otimes f_n) \overset{\rm def}{=} \sum_{P\in\mathcal{P}(\{1,\dots,n\})} (-1)^{|P|+n}\prod_{p\in P} T_{|p|}\big(\otimes_{j\in p} f_j\big) . \tag{19} \]
(Here $\mathcal{P}(\{1,\dots,n\})$ is the set of all ordered partitions of $\{1,\dots,n\}$, $|P|$ is the number of subsets in $P$ and $|p|$ is the number of elements of $p$.) The reason for the word 'anti-chronological' is that the $\bar T$-products satisfy anti-causal factorization, which means (13) with reversed order of the factors on the r.h.s. Unitarity of the S-matrix is expressed by
\[ S(f)^+ = S(f^*)^{-1} \tag{20} \]
($+$ means the adjoint on $\mathcal{D}$, $(\phi, B\psi) = (B^+\phi,\psi)$, $\phi,\psi\in\mathcal{D}$.) Hence, for the T-products we require the normalization condition
\[ \text{(N2)}\qquad T_n(f_1\otimes\cdots\otimes f_n)^+ = \bar T_n(f_1^*\otimes\cdots\otimes f_n^*) , \]
which can easily be satisfied by symmetrizing an arbitrary normalized T-product (see [19]).
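The combinatorics behind (18) and (19) can be checked in a toy setting. The following Python sketch is not part of the construction above: it assumes a purely commutative stand-in in which every $T_k(f\otimes\cdots\otimes f)$ is replaced by a rational number `t[k]` (arbitrary sample values), enumerates the ordered partitions of $\{1,\dots,n\}$, and verifies that the numbers obtained from (19) are the coefficients of the inverse series in (18). The function names are hypothetical, and the series are written in the variable $x = i\lambda$ so that all coefficients stay rational.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def ordered_partitions(elements):
    """Yield every ordered partition of `elements` as a tuple of non-empty blocks."""
    elements = tuple(elements)
    if not elements:
        yield ()
        return
    for size in range(1, len(elements) + 1):
        for first in combinations(elements, size):
            rest = tuple(e for e in elements if e not in first)
            for tail in ordered_partitions(rest):
                yield (first,) + tail


def antichronological(t, n):
    """Commutative toy version of Eq. (19): t[k] stands in for T_k(f x ... x f)."""
    total = Fraction(0)
    for partition in ordered_partitions(range(1, n + 1)):
        term = Fraction((-1) ** (len(partition) + n))
        for block in partition:
            term *= t[len(block)]
        total += term
    return total


def series_product(a, b, order):
    """Cauchy product of two truncated power series given as coefficient lists."""
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(order + 1)]


N = 5
t = {k: Fraction(k + 1, k + 2) for k in range(1, N + 1)}  # arbitrary sample values
tbar = {n: antichronological(t, n) for n in range(1, N + 1)}

# Coefficients of S and S^{-1} in the variable x = i*lambda, cf. (17) and (18).
S = [Fraction(1)] + [t[n] / factorial(n) for n in range(1, N + 1)]
S_inv = [Fraction(1)] + [(-1) ** n * tbar[n] / factorial(n) for n in range(1, N + 1)]
check = series_product(S, S_inv, N)  # should be the series of the constant 1
```

In this commutative toy model the order of the blocks only contributes a multiplicity; for operator-valued T-products the ordering of the factors in (19) matters and is what produces anti-causal factorization.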
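• Relation to T-products of sub-polynomials: Let $\mathcal{G}\subset\mathcal{P}_0$ be a linearly independent set of generators of $\mathcal{P}_0$, i.e. $\mathcal{G}$ is a (vector space) basis of $\pi\mathcal{P}_1$ (see Remark 2.1 for the definition of $\mathcal{P}_1$). Then $\tilde{\mathcal{G}} \overset{\rm def}{=} \{\tilde\partial^a\varphi \,|\, \varphi\in\mathcal{G},\ a\in\mathbb{N}_0^4\}$ is a set of linearly independent generators of $\tilde{\mathcal{P}}_0$. We define the commutator 'function' $\Delta_{\varphi,\chi}$ by
\[ i\int dx\,dy\, h(x)g(y)\,\Delta_{\varphi,\chi}(x-y) \overset{\rm def}{=} [T_1(\varphi h), T_1(\chi g)] , \qquad \varphi,\chi\in\tilde{\mathcal{G}} . \tag{21} \]
Every $V\in\mathcal{P}_0$ can uniquely be written as a polynomial in the generators $\mathcal{G}$. By partial differentiation in this sense we obtain a 'sub-polynomial'^k $\frac{\partial V}{\partial\varphi}$, $\forall\varphi\in\mathcal{G}$. For $f(x) = \sum_i V_i f_i(x)$, $f_i\in\mathcal{D}(\mathbb{R}^4)$, $V_i\in\mathcal{P}_0$, we set
\[ \frac{\partial f}{\partial\varphi} \overset{\rm def}{=} \sum_i \frac{\partial V_i}{\partial\varphi}\, f_i(x) . \tag{22} \]
For $\psi\in\tilde{\mathcal{G}}$ we analogously define $\frac{\partial}{\partial\psi}$ to be a linear derivation $\tilde{\mathcal{P}}_0\to\tilde{\mathcal{P}}_0$ ($\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)\to\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$ respectively) with
\[ \frac{\partial(\tilde\partial^a\chi)}{\partial(\tilde\partial^b\varphi)} \overset{\rm def}{=} \delta_{a,b}\,\frac{\partial\chi}{\partial\varphi} = \delta_{a,b}\,\delta_{\chi,\varphi} , \qquad \chi,\varphi\in\mathcal{G} . \tag{23} \]
Generally we call $W\in\tilde{\mathcal{P}}_0$ a subpolynomial of $V\in\tilde{\mathcal{P}}_0$ iff it is of the form $W = \frac{\partial^k V}{\partial\varphi_{i_1}\cdots\partial\varphi_{i_k}}$ for some $k\in\mathbb{N}_0$, $\varphi_{i_1},\dots,\varphi_{i_k}\in\tilde{\mathcal{G}}$. The derivation property of the commutator $[\cdot, T_1(\chi g)]$ implies
\[ [T_1(f), T_1(\chi g)] = i\sum_{\psi\in\tilde{\mathcal{G}}} T_1\Big(\frac{\partial f}{\partial\psi}\,\Delta_{\psi,\chi}\star g\Big) , \qquad \forall f\in\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0),\ \chi\in\tilde{\mathcal{G}},\ g\in\mathcal{D}(\mathbb{R}^4) , \tag{24} \]
where $\star$ means convolution. We now generalize the normalization condition (N3) of [7] to the present framework: we require
\[ \text{(N3)}\qquad [T_n(f_1\otimes\cdots\otimes f_n), T_1(\chi g)] = i\sum_{l=1}^{n}\sum_{\psi\in\tilde{\mathcal{G}}} T_n\Big(f_1\otimes\cdots\otimes\frac{\partial f_l}{\partial\psi}\,\Delta_{\psi,\chi}\star g\otimes\cdots\otimes f_n\Big) \tag{25} \]
where $f_1,\dots,f_n\in\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$, $\chi\in\tilde{\mathcal{G}}$. The r.h.s. is well-defined because $\Delta_{\psi,\chi}\star g$ is a smooth function.
Footnote k: If $\varphi\in\mathcal{G}$ is a fermionic symbol, the derivative $\frac{\partial}{\partial\varphi}$ is a graded derivation.
We point out that the defining properties of the T-products given so far (linearity, symmetry, causality, (N1), (N2) and (N3)) are purely algebraic conditions, they are independent from the choice of a state. In the realization of the T-products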
as operators in Fock space, (N3) becomes equivalent to the translation of the 'causal Wick expansion' of Epstein and Glaser^l into our formalism, see [19, Sec. 4].
• Scaling degree: (N3) gives the relation to time ordered products of sub-polynomials. Once these are known (in an inductive procedure), only the C-number part of the T-product (which is equal to the Fock vacuum expectation value of the T-product) has to be fixed. Due to translation invariance this scalar distribution depends on the relative coordinates only. Hence, the extension of the (operator valued) T-product to $D_n$ is reduced to the extension of a C-number distribution $t_0\in\mathcal{D}'(\mathbb{R}^{4(n-1)}\setminus\{0\})$ to $t\in\mathcal{D}'(\mathbb{R}^{4(n-1)})$. (We call $t$ an extension of $t_0$ if $t(f) = t_0(f)$, $\forall f\in\mathcal{D}(\mathbb{R}^{4(n-1)}\setminus\{0\})$.) The singularity of $t_0(y)$ and $t(y)$ at $y = 0$ is classified in terms of Steinmann's scaling degree [5, 37]
\[ {\rm sd}(t) \overset{\rm def}{=} \inf\big\{\delta\in\mathbb{R} ,\ \lim_{\lambda\downarrow 0}\lambda^\delta\, t(\lambda x) = 0\big\} . \tag{26} \]
Note
\[ {\rm sd}(\partial^a t)\le{\rm sd}(t)+|a| \qquad\text{and}\qquad {\rm sd}(\partial^a\delta^{(m)}) = m+|a| , \tag{27} \]
where $\delta^{(m)}$ denotes the $m$-dimensional δ-distribution. By definition ${\rm sd}(t_0)\le{\rm sd}(t)$, and the possible extensions are restricted by requiring
\[ \text{(N0)}\qquad {\rm sd}(t_0) = {\rm sd}(t) . \tag{28} \]
Then the extension is unique for ${\rm sd}(t_0) < 4(n-1)$, and in the general case there remains the freedom to add derivatives of the δ-distribution up to order $({\rm sd}(t_0)-4(n-1))$. In formula:
\[ t(y) + \sum_{|a|\le{\rm sd}(t_0)-4(n-1)} C_a\,\partial^a\delta(y) \tag{29} \]
is the general solution, where $t$ is a special extension [5, 30, 19], and the constants $C_a$ are restricted by (N1), (N2), permutation symmetries and the normalization conditions $(\tilde{\rm N})$ (normalization of time ordered products of symbols with external derivative) and (N) (MWI) below. For an interaction $\mathcal{L}$ with UV-dimension $\dim(\mathcal{L})\le 4$ the requirement (28) implies renormalizability by power counting, i.e. the number of indeterminate constants $C_a$ in $(T_n((g\mathcal{L})^{\otimes n}))_n$ ($g\in\mathcal{D}(\mathbb{R}^4)$) does not increase by going over to higher perturbative orders $n$.
In the seminal paper [19] Epstein and Glaser prove that there exists an extension to $D_n$ which fulfills the normalization conditions^m (N0)–(N3), but they say little about further symmetries which should be maintained in the extension, e.g. the field equations or gauge invariance. The MWI is a universal normalization condition which summarizes the request for most of these 'further symmetries'.
Footnote l: Epstein/Glaser do not use this name, but it appears e.g. in [5].
Footnote m: Here we neglect that Epstein and Glaser work with a different formalism, in particular they do not have the external derivative $\tilde\partial$.
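To get a feeling for the size of the freedom in (29), one can count the multi-indices $a\in\mathbb{N}_0^{4(n-1)}$ with $|a|\le{\rm sd}(t_0)-4(n-1)$. The Python sketch below does only this counting, before any of the restrictions from (N1), (N2), permutation symmetry, $(\tilde{\rm N})$ or (N) are imposed. The function names are hypothetical, the scaling degree is assumed to be an integer, and the sample values are illustrative rather than taken from a specific diagram.

```python
from itertools import product
from math import comb


def num_free_constants(sd_t0, n, spacetime_dim=4):
    """Count multi-indices a with |a| <= sd(t0) - d, where d = spacetime_dim * (n - 1).

    This is the number of constants C_a in Eq. (29) before any further
    normalization condition is imposed (sd_t0 is assumed to be an integer).
    """
    d = spacetime_dim * (n - 1)
    omega = sd_t0 - d
    if omega < 0:
        return 0  # the extension is unique
    # Multi-indices in d variables of total degree <= omega: binomial(omega + d, d).
    return comb(omega + d, d)


def brute_force_count(omega, d):
    """Cross-check of the closed formula by explicit enumeration (small d only)."""
    return sum(1 for a in product(range(omega + 1), repeat=d) if sum(a) <= omega)


unique_case = num_free_constants(sd_t0=2, n=2)       # sd(t0) < 4: no freedom
logarithmic_case = num_free_constants(sd_t0=4, n=2)  # only C_0 is free
quadratic_case = num_free_constants(sd_t0=6, n=2)    # all |a| <= 2 in four variables
```

For two vertices ($n = 2$) and ${\rm sd}(t_0) = 6$ the count is 15; Lorentz covariance (N1) and the other conditions then reduce this number considerably.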
2.3. Normalization of time-ordered products of symbols with external derivative
The aim of this subsection is to fix the normalization of time-ordered products of symbols with external derivative(s) in terms of time-ordered products without external derivative. This fixation is a necessary ingredient of the formulation of the MWI, because T-products of symbols with external derivatives unavoidably appear in the MWI. Heuristically the external derivative is a derivative which acts after having done the time-ordered contractions of the corresponding symbols (free fields resp.), e.g.
\[ T_{n+1}((\tilde\partial^\nu V)g\otimes W_1f_1\otimes\cdots\otimes W_nf_n) = \int dx\,dx_1\cdots dx_n\, g(x)f_1(x_1)\cdots f_n(x_n)\,\partial^\nu_x T(V,W_1,\dots,W_n)(x,x_1,\dots,x_n) \equiv -T_{n+1}(V\partial^\nu g\otimes W_1f_1\otimes\cdots\otimes W_nf_n) , \tag{30} \]
where $V, W_1,\dots,W_n\in\tilde{\mathcal{P}}_0$. However, there are other time ordered products involving factors with external derivatives such as $(\tilde\partial^\nu V)W$ which cannot be defined in this way in terms of time ordered products of factors without any external derivative. Hence we proceed in an alternative, recursive way: we give an explicit expression for the difference
\[ T_{n+1}((\tilde\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots\otimes W_nf_n) - T_{n+1}((\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots\otimes W_nf_n) \tag{31} \]
where $V, W, W_1,\dots,W_n\in\tilde{\mathcal{P}}_0$. For this purpose we introduce some notations: by means of the Feynman propagator $\Delta^F_{\chi,\psi}$
\[ \int dx\,dy\, f(x)g(y)\, i\Delta^F_{\chi,\psi}(x-y) \overset{\rm def}{=} \langle\Omega, T_2(\chi f\otimes\psi g)\Omega\rangle , \qquad \chi,\psi\in\tilde{\mathcal{G}},\ f,g\in\mathcal{D}(\mathbb{R}^4) , \tag{32} \]
(where $\Omega$ denotes the Fock vacuum) we define
\[ \delta^\mu_{\chi,\psi} \overset{\rm def}{=} \partial^\mu\Delta^F_{\chi,\psi} - \Delta^F_{\partial^\mu\chi,\psi} , \qquad \chi,\psi\in\tilde{\mathcal{G}} . \tag{33} \]
Inserting the causal factorization of $T_2(\cdots)(x,y)$ for $x\neq y$ we find that $\delta^\mu_{\cdots}$ is a local distribution, hence it has the form
\[ \delta^\mu_{\chi,\psi}(z) = \sum_{a\in\mathbb{N}_0^4} C^\mu_{\chi,\psi;a}\,\partial^a\delta(z) , \tag{34} \]
where the $C^\mu_{\chi,\psi;a}\in\mathbb{C}$ are constant numbers. Then we define
\[ \Delta^\mu_{\chi,\psi} : \mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)^{\times 2}\to\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0) , \]
\[ \Delta^\mu_{\chi,\psi}(Vg, Wf) = \sum_a C^\mu_{\chi,\psi;a}\,(-1)^{|a|}\sum_{0\le b\le a}\frac{a!}{b!(a-b)!}\,(\tilde\partial^b V)\,W\,(\partial^{(a-b)}g)\,f \tag{35} \]
where $|a| = a_0+a_1+a_2+a_3$ and $a! \equiv \prod_\mu a_\mu!$. This formula is motivated by the identity
\[ \int dx\,\partial^a\delta(x-y)\,V(x)g(x)W(y)f(y) = (-1)^{|a|}\sum_{0\le b\le a}\frac{a!}{b!(a-b)!}\,(\partial^b V)(y)\,W(y)\,(\partial^{(a-b)}g)(y)\,f(y) , \tag{36} \]
where $V(x)$ and $W(y)$ are here Wick polynomials (cf. (12)). The subtle point in the definition (35) is that the derivative on $V$ on the r.h.s. is an external one. This results from the derivation of the MWI in classical field theory [9]. And, if the derivative on $V$ were an internal one, we would get wrong results, e.g. for the BRST-transformation of the interacting gauge field in non-Abelian models (187).^n Note that $\Delta^\mu_{\chi,\psi}$ is not invariant with respect to the exchange of its arguments.
Footnote n: On the heuristic level of the Feynman rules this can be understood as follows (for simplicity we assume $|a| = 1$): one shifts the derivative $\partial$ from the difference (33) of Feynman propagators $\delta^\mu\sim\partial\delta(x-y)$ to $V(x)$, however the (time-ordered) contractions of the legs of $V$ are already performed, i.e. $\partial V$ must be an external derivative. Thereby the term $VW(\partial g)f$ is the boundary term.
Footnote o: In this calculation the indices $k$ of $\varphi_k$ and $\phi_k$ have nothing to do with the ones introduced in Sec. 2.1.
Footnote p: The following implication of (N3) is used here: the Feynman propagators $\Delta^F_{\varphi_j,\chi}$ which appear in (37) depend on $(\varphi_j,\chi)$ only. This means that for $(\varphi_j,\chi) = (\varphi_l,\psi)$ the undetermined parameters (298) in $\Delta^F_{\varphi_j,\chi}$ and $\Delta^F_{\varphi_l,\psi}$ have the same values. Note additionally $\Delta^F_{\varphi,\chi} = \pm\Delta^F_{\chi,\varphi}$ due to (32).
We are now going to compute the difference (31) on a heuristic level according to our prescription that the external derivative acts after contracting. Let^o $V = \prod_{k=1}^{m}\varphi_k$, $W = \prod_{k=m+1}^{p}\varphi_k$, $\varphi_k\in\tilde{\mathcal{G}}$, $\varphi_k = \tilde\partial^{a_k}{\rm sym}(\phi_k)$. We consider the sum of diagrams in which $\phi_1,\dots,\phi_l$ ($l\le m$) and $\phi_{m+1},\dots,\phi_q$ ($q\le p$) are contracted and $\phi_{l+1},\dots,\phi_m$, $\phi_{q+1},\dots,\phi_p$ are not. By means of the Feynman rules and the normalization condition^p (N3) we compute the contribution of this sum of diagrams to the first T-product in (31):
\[
\begin{aligned}
T_{n+1}((\tilde\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots) = \sum_{r_1,\dots,r_{l+q}} i^{l+q}\Big[\,
&\partial^\nu_x\big(\Delta^F_{\varphi_1,\cdot}(x-x_{r_1})\cdots\Delta^F_{\varphi_l,\cdot}(x-x_{r_l})\big)\,\Delta^F_{\varphi_{m+1},\cdot}(x-x_{r_{l+1}})\cdots\Delta^F_{\varphi_q,\cdot}(x-x_{r_{l+q-m}})\\
&\cdot :T(\cdots)(x_1,\dots,x_n)\prod_{k=l+1}^{m}\partial^{a_k}\phi_k(x)\cdot\prod_{k=q+1}^{p}\partial^{a_k}\phi_k(x):\\
&+\Delta^F_{\varphi_1,\cdot}(x-x_{r_1})\cdots\Delta^F_{\varphi_l,\cdot}(x-x_{r_l})\,\Delta^F_{\varphi_{m+1},\cdot}(x-x_{r_{l+1}})\cdots\Delta^F_{\varphi_q,\cdot}(x-x_{r_{l+q-m}})\\
&\cdot :T(\cdots)(x_1,\dots,x_n)\,\partial^\nu_x\Big(\prod_{k=l+1}^{m}\partial^{a_k}\phi_k(x)\Big)\cdot\prod_{k=q+1}^{p}\partial^{a_k}\phi_k(x):\Big] + \cdots ,
\end{aligned}
\tag{37}
\]
where the double dots simply mean that the $\phi_k(x)$, $k = l+1,\dots,m, q+1,\dots,p$ are not contracted. (Note that normal ordering is defined for monomials only, not for polynomials.) With (37) we obtain the following heuristic result for the difference (31)
\[
\begin{aligned}
\sum_{r_1,\dots,r_{l+q}} i^{l+q}\sum_{t=1}^{l}\,
&\Delta^F_{\varphi_1,\cdot}(x-x_{r_1})\cdots\delta^\nu_{\varphi_t,\cdot}(x-x_{r_t})\cdots\Delta^F_{\varphi_l,\cdot}(x-x_{r_l})\\
&\cdot\Delta^F_{\varphi_{m+1},\cdot}(x-x_{r_{l+1}})\cdots\Delta^F_{\varphi_q,\cdot}(x-x_{r_{l+q-m}})\\
&\cdot :T(\cdots)(x_1,\dots,x_n)\prod_{k=l+1}^{m}\partial^{a_k}\phi_k(x)\cdot\prod_{k=q+1}^{p}\partial^{a_k}\phi_k(x): + \cdots .
\end{aligned}
\tag{38}
\]
We now require that this structure is maintained in the process of renormalization:
\[
\begin{aligned}
(\tilde{\rm N})\qquad T_{n+1}((\tilde\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots\otimes W_nf_n) ={}& T_{n+1}((\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots\otimes W_nf_n)\\
&+ i\sum_{m=1}^{n}\sum_{\chi,\psi\in\tilde{\mathcal{G}}}(\pm)\,T_n\Big(\Delta^\nu_{\chi,\psi}\Big(\frac{\partial V}{\partial\chi}Wg,\ \frac{\partial W_m}{\partial\psi}f_m\Big)\otimes W_1f_1\otimes\cdots\hat m\cdots\otimes W_nf_n\Big)
\end{aligned}
\]
where $V, W, W_1,\dots,W_n\in\tilde{\mathcal{P}}_0$, the sign $(\pm)$ comes from permutations of Fermi operators and $\hat m$ means that the corresponding factor is omitted. We now assume that $(\tilde{\rm N})$ holds true to lower orders $\le n$. Then, due to causal factorization of time ordered products, we conclude that the condition $(\tilde{\rm N})$ is satisfied for ${\rm supp}(g\otimes f_1\otimes\cdots\otimes f_n)\cap D_{n+1} = \emptyset$. Hence $(\tilde{\rm N})$ is in fact a normalization condition. It can be satisfied by taking $(\tilde{\rm N})$ as the definition of the normalization of $T_{n+1}((\tilde\partial^\nu V)Wg\otimes W_1f_1\otimes\cdots\otimes W_nf_n)$. There is only one non-trivial step in this procedure: the compatibility with (N3). This is shown in Sec. 3.1.
In models with anomalies, i.e. terms which violate the MWI (see the next subsection), the normalization condition $(\tilde{\rm N})$ will be modified: in order that (30) holds true these anomalies must be taken into account in the difference (31), they give an additional contribution to the r.h.s. of $(\tilde{\rm N})$ (cf. Sec. 5).
In particular the normalization condition $(\tilde{\rm N})$ implies
\[ \Delta^F_{\tilde\partial^a\varphi_j,\tilde\partial^b\varphi_l} = (-1)^{|b|}\,\partial^a\partial^b\Delta^F_{\varphi_j,\varphi_l} , \qquad \varphi_j,\varphi_l\in\mathcal{G} , \tag{39} \]
and hence
\[ \delta^\mu_{\tilde\partial^a\varphi_j,\tilde\partial^b\varphi_l} = (-1)^{|b|}\,\partial^a\partial^b\delta^\mu_{\varphi_j,\varphi_l} , \qquad \varphi_j,\varphi_l\in\mathcal{G} . \tag{40} \]
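By repeated application of $(\tilde{\rm N})$ and $\langle\Omega, T_1(Uh)\Omega\rangle = 0$ for $\tilde{\mathcal{P}}_0\ni U\neq\lambda 1$, $\lambda\in\mathbb{C}$, one finds
\[ \Big\langle\Omega,\ T_2\Big(\Big(\prod_{k=1}^{r}\tilde\partial^{a^{(k)}}\varphi_{j_k}\Big)g\otimes\Big(\prod_{k=1}^{r}\tilde\partial^{b^{(k)}}\varphi_{l_k}\Big)f\Big)\Omega\Big\rangle = \Big\langle\Omega,\ T_2\Big(\Big(\prod_{k=1}^{r}\partial^{a^{(k)}}\varphi_{j_k}\Big)g\otimes\Big(\prod_{k=1}^{r}\partial^{b^{(k)}}\varphi_{l_k}\Big)f\Big)\Omega\Big\rangle \tag{41} \]
for $r > 1$, where $\varphi_m\in\mathcal{G}$ $\forall m$, $a^{(k)}\equiv(a^{(k)}_0, a^{(k)}_1, a^{(k)}_2, a^{(k)}_3)$ and similarly for $b^{(k)}$.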
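The multi-index combinatorics entering (35) and (36) is the Leibniz rule $\partial^a(Vg) = \sum_{0\le b\le a}\frac{a!}{b!(a-b)!}(\partial^bV)(\partial^{(a-b)}g)$; the extra sign $(-1)^{|a|}$ in (36) comes from shifting $\partial^a$ off the δ-distribution by partial integration. As a sanity check of the binomial weights only, the following Python sketch verifies the Leibniz rule on ordinary commuting polynomials. This is an assumption-laden stand-in: polynomials are stored as dictionaries from exponent tuples to integer coefficients, all names are hypothetical, and nothing here models Wick polynomials or the distinction between internal and external derivatives.

```python
from itertools import product
from math import factorial


def multi_factorial(a):
    """a! = prod_mu a_mu! for a multi-index a."""
    out = 1
    for a_mu in a:
        out *= factorial(a_mu)
    return out


def poly_mul(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            e = tuple(i + j for i, j in zip(ep, eq))
            out[e] = out.get(e, 0) + cp * cq
    return {e: c for e, c in out.items() if c != 0}


def poly_add(p, q, scale=1):
    """Return p + scale * q, dropping vanishing coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}


def poly_diff(p, a):
    """Apply the multi-index derivative d^a to the polynomial p."""
    out = {}
    for e, c in p.items():
        if all(ei >= ai for ei, ai in zip(e, a)):
            coeff = c
            for ei, ai in zip(e, a):
                coeff *= factorial(ei) // factorial(ei - ai)
            out[tuple(ei - ai for ei, ai in zip(e, a))] = coeff
    return out


def leibniz_rhs(V, g, a):
    """Sum over 0 <= b <= a of a!/(b!(a-b)!) (d^b V)(d^(a-b) g)."""
    total = {}
    for b in product(*(range(ai + 1) for ai in a)):
        a_minus_b = tuple(ai - bi for ai, bi in zip(a, b))
        weight = multi_factorial(a) // (multi_factorial(b) * multi_factorial(a_minus_b))
        term = poly_mul(poly_diff(V, b), poly_diff(g, a_minus_b))
        total = poly_add(total, term, weight)
    return total


# Sample polynomials in two variables (x0, x1); the values are arbitrary.
V = {(3, 1): 2, (1, 2): -1, (0, 0): 5}
g = {(2, 2): 1, (1, 0): 4}
a = (2, 1)
lhs = poly_diff(poly_mul(V, g), a)
rhs = leibniz_rhs(V, g, a)
```

Both sides agree for any choice of `V`, `g` and `a`; in (35) the same weights are used, but the derivative acting on $V$ is the external one.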
2.4. The master Ward identity
The MWI is an explicit formula for the difference
\[ \partial^\nu_x T(V,W_1,\dots,W_n)(x,x_1,\dots,x_n) - T(\partial^\nu V,W_1,\dots,W_n)(x,x_1,\dots,x_n) , \tag{42} \]
where $V, W_1,\dots,W_n\in\mathcal{P}_0$. It may be regarded as the postulate that the recursive definition $(\tilde{\rm N})$ reproduces, in the case $W = 1$ and $V, W_1,\dots,W_n\in\mathcal{P}_0$, the direct definition (30) (see the Remark below). However, this is a very technical and indirect way to the MWI. We found it by the following, intuitive procedure: the result of the Feynman rules for the difference (42) is obtained from (38) by choosing $\varphi_k\in\mathcal{G}$, $\forall k$, and putting $W = 1$ (i.e. $p = q = m$). The MWI requires that renormalization is done in such a way that this heuristic result is essentially preserved:
\[
\begin{aligned}
\text{(N)}\qquad -T_{n+1}(V\partial^\nu g\otimes W_1f_1\otimes\cdots\otimes W_nf_n) ={}& T_{n+1}((\partial^\nu V)g\otimes W_1f_1\otimes\cdots\otimes W_nf_n)\\
&+ i\sum_{m=1}^{n}\sum_{\chi,\psi\in\mathcal{G}}(\pm)\,T_n\Big(\Delta^\nu_{\chi,\psi}\Big(\frac{\partial V}{\partial\chi}g,\ \frac{\partial W_m}{\partial\psi}f_m\Big)\otimes W_1f_1\otimes\cdots\hat m\cdots\otimes W_nf_n\Big)
\end{aligned}
\tag{43}
\]
where $V, W_1,\dots,W_n\in\mathcal{P}_0$ (not $\in\tilde{\mathcal{P}}_0$), the sign $(\pm)$ is due to permutations of Fermi operators and $\hat m$ means that the corresponding factor is omitted. We recall that $\Delta^\mu$ contains external derivatives. To give the correct formula for the difference (42) one needs the external derivative or an equivalent formalism (for the latter see [9] and Remark 2.1). Similarly to $(\tilde{\rm N})$ the MWI (N) presupposes the normalization condition (N3), because (N3) is used in (37) and (38).
Remark 2.2. Instead of requiring $(\tilde{\rm N})$ and (N), one can take (30) and $(\tilde{\rm N})$ as the primary normalization conditions, because the latter two imply (N). This alternative and more compact formulation is the straightforward way to formulate the quantum MWI [9], when departing from classical field theory. However, the advantage of the present procedure is that it explicitly distinguishes the 'weak'
normalization condition $(\tilde{\rm N})$ (which only defines the normalization of the time ordered products with external derivatives) from the 'hard' one (N) (which expresses deep symmetries). This distinction plays an important role in our (incomplete) proof of the MWI (Sec. 3).
We now assume that (N) holds true to lower orders $\le n$. Then, due to causal factorization of time ordered products, we conclude that the condition (N) is satisfied for ${\rm supp}(g\otimes f_1\otimes\cdots\otimes f_n)\cap D_{n+1} = \emptyset$. Hence (N) is in fact a normalization condition. The compatibility with (N0)–(N2) is trivial and the compatibility with (N3) is proved in Sec. 3.2. The hard question is whether (N) can be satisfied by choosing suitable normalizations (which are compatible with the other normalization conditions). The answer depends on the model. We will see that the MWI implies that there is no axial anomaly and no trace anomaly of the energy momentum tensor. Hence it must be impossible to fulfil the MWI in these cases. Generally we call any term that violates the MWI (and cannot be removed by an admissible, finite renormalization of the T-products) an anomaly.
If there is at most one contraction between $V$ and $W_1,\dots,W_n$ (i.e. we have $l = 0$ or $l = 1$ and of course $p = q = m$ in (38)) the expression (38) is well-defined and (re)normalization can be done such that (38) gives the contribution of these diagrams to the difference (42). In other words one can fulfil the MWI (N) for these 'tree-like' diagrams. The anomalies must come from 'loop-like' diagrams. In Sec. 5 we give a more general formulation of the MWI which takes anomalies into account.
3. Steps Towards a Proof of the Master Ward Identity
We have to show that there exists a normalization of the T-products which satisfies $(\tilde{\rm N})$, (N) and also (N0)–(N3). The compatibility of $(\tilde{\rm N})$ and (N) with (N0)–(N2) is obvious, but the compatibility with (N3) requires some work which is done in the next two subsections. The proof of $(\tilde{\rm N})$ is then easily completed (Sec. 3.1). But a general proof of (N) is impossible, since it is well-known that there exist anomalies in certain models. If solely massive fields appear and if the scaling degrees (28) of the individual C-number distributions appearing in (N) are not relatively lowered,^q we can give a constructive proof of (N) (Sec. 3.3). More precisely we show that the so-called 'central solution' of Epstein and Glaser, which is a distinguished extension $t^{(c)}\in\mathcal{D}'(\mathbb{R}^k)$ of $t_0\in\mathcal{D}'(\mathbb{R}^k\setminus\{0\})$, satisfies (N) in this case. To simplify the formulas we restrict this section to bosonic fields; the inclusion of fermionic fields is obvious.
Footnote q: We explain what we mean by this expression for the example of the identity $\partial_\nu t_1^\nu = t_2$ ($t_1, t_2\in\mathcal{D}'(\mathbb{R}^k)$). According to (27) we naively expect ${\rm sd}(t_1)+1 = {\rm sd}(t_2)$. We say that the scaling degree of $t_1$ (or $t_2$ respectively) is relatively lowered if ${\rm sd}(t_1) < {\rm sd}(t_2)-1$ (or ${\rm sd}(t_2) < {\rm sd}(t_1)+1$ respectively). A relative pre-factor $m^a$ ($m$ = mass), $a > 0$, indicates a relatively lowered scaling degree. Let $\partial_\nu t_1^\nu = m^a t_2$ and we assume that $t_1$ and $t_2$ contain no global factor $m^b$ ($b\in\mathbb{R}\setminus\{0\}$). Then, for dimensional reasons, the scaling degree of $t_2$ is relatively lowered: ${\rm sd}(t_2) = {\rm sd}(t_1)+1-a$.
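3.1. Proof of $(\tilde{\rm N})$
The nontrivial part in the proof of $(\tilde{\rm N})$ is the compatibility with (N3). The keys to show this and the compatibility of (N) with (N3) (see Sec. 3.2) are the following two Lemmas:
Lemma 3.1. Let $V\in\tilde{\mathcal{P}}_0$, $\varphi,\psi\in\tilde{\mathcal{G}}$ and $f, h\in\mathcal{D}(\mathbb{R}^4)$. Then the following identities hold true within $\mathcal{D}(\mathbb{R}^4,\tilde{\mathcal{P}}_0)$:
\[ f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial(\partial^aV)}{\partial\varphi}\,\Delta_{\varphi,\psi}\star h = f\sum_{\varphi\in\tilde{\mathcal{G}}}\sum_{0\le b\le a}\frac{a!}{b!(a-b)!}\Big(\partial^b\frac{\partial V}{\partial\varphi}\Big)\,\partial^{(a-b)}\Delta_{\varphi,\psi}\star h \tag{44} \]
\[ f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial(\tilde\partial^aV)}{\partial\varphi}\,\Delta_{\varphi,\psi}\star h = f\sum_{\varphi\in\tilde{\mathcal{G}}}\sum_{0\le b\le a}\frac{a!}{b!(a-b)!}\Big(\tilde\partial^b\frac{\partial V}{\partial\varphi}\Big)\,\partial^{(a-b)}\Delta_{\varphi,\psi}\star h , \tag{45} \]
where again $a! \equiv \prod_\mu a_\mu!$.
Proof. We first prove (44) for $|a| = 1$, i.e.
\[ f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial(\partial^\mu V)}{\partial\varphi}\,\Delta_{\varphi,\psi}\star h = f\sum_{\varphi\in\tilde{\mathcal{G}}}\Big[\Big(\partial^\mu\frac{\partial V}{\partial\varphi}\Big)\Delta_{\varphi,\psi}\star h + \frac{\partial V}{\partial\varphi}\,\partial^\mu\Delta_{\varphi,\psi}\star h\Big] . \tag{46} \]
It suffices to consider the case in which $V$ is a monomial. The proof goes by induction on the degree of this monomial. The case $V = 1$ is trivial. Let $V = \chi W$, $\chi\in\tilde{\mathcal{G}}$, $W\in\tilde{\mathcal{P}}_0$. By assumption $W$ satisfies (46). Inserting now $V = \chi W$ into (46) and using this assumption most terms cancel and it remains to show
\[ f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial(\partial^\mu\chi)}{\partial\varphi}\,W\,\Delta_{\varphi,\psi}\star h = f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial\chi}{\partial\varphi}\,W\,\partial^\mu\Delta_{\varphi,\psi}\star h . \tag{47} \]
The l.h.s. is equal to $fW\Delta_{\partial^\mu\chi,\psi}\star h$ and the r.h.s. to $fW\partial^\mu\Delta_{\chi,\psi}\star h$. Obviously these two expressions agree.
To prove (44) for arbitrary $|a|$ we proceed by induction on $|a|$:
\[
\begin{aligned}
f\sum_{\varphi\in\tilde{\mathcal{G}}}\frac{\partial(\partial^\mu\partial^aV)}{\partial\varphi}\,\Delta_{\varphi,\psi}\star h
&= f\sum_{\varphi\in\tilde{\mathcal{G}}}\Big[\Big(\partial^\mu\frac{\partial(\partial^aV)}{\partial\varphi}\Big)\Delta_{\varphi,\psi}\star h + \frac{\partial(\partial^aV)}{\partial\varphi}\,\partial^\mu\Delta_{\varphi,\psi}\star h\Big]\\
&= f\sum_{\varphi\in\tilde{\mathcal{G}}}\sum_{0\le b\le a}\frac{a!}{b!(a-b)!}\Big[\Big(\partial^\mu\partial^b\frac{\partial V}{\partial\varphi}\Big)\partial^{(a-b)}\Delta_{\varphi,\psi}\star h + \Big(\partial^b\frac{\partial V}{\partial\varphi}\Big)\partial^\mu\partial^{(a-b)}\Delta_{\varphi,\psi}\star h\Big]\\
&= f\sum_{\varphi\in\tilde{\mathcal{G}}}\sum_{0\le b\le(a+e_\mu)}\frac{(a+e_\mu)!}{b!(a+e_\mu-b)!}\Big(\partial^b\frac{\partial V}{\partial\varphi}\Big)\partial^{(a+e_\mu-b)}\Delta_{\varphi,\psi}\star h ,
\end{aligned}
\tag{48}
\]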
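where $e_\mu = (0, \ldots, 1, \ldots, 0)$ with 1 at the $\mu$th position. First we have used (46) (with $V$ replaced by $\partial^a V$) and in the second equality sign we have inserted (44) (which is the inductive assumption) and $\partial^\mu \otimes 1$ applied to (44) (cf. (9)).

The proof of the second identity (45) is completely similar. One simply has to replace the internal derivatives $\partial^a \otimes 1$ by external ones $\tilde\partial^a \otimes 1$. In particular the validity of the equation corresponding to (47) relies on $\Delta_{\tilde\partial^\mu\chi,\psi} = \partial^\mu\Delta_{\chi,\psi}$.

By means of Lemma 3.1 we will prove

Lemma 3.2. Let $V, W \in \tilde{\mathcal P}_0$, $\chi, \psi, \kappa \in \tilde{\mathcal G}$ and $f, g, h \in \mathcal D(\mathbb R^4)$. Then

$$ \sum_{\varphi\in\tilde{\mathcal G}} \frac{\partial\Delta^\mu_{\chi,\psi}(V g, W f)}{\partial\varphi}\,\Delta_{\varphi,\kappa}\star h = \sum_{\varphi\in\tilde{\mathcal G}} \Bigl[\Delta^\mu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\varphi}\,g\,\Delta_{\varphi,\kappa}\star h,\ W f\Bigr) + \Delta^\mu_{\chi,\psi}\Bigl(V g,\ \frac{\partial W}{\partial\varphi}\,f\,\Delta_{\varphi,\kappa}\star h\Bigr)\Bigr] . \tag{49} $$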
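The weights $a!/(b!(a-b)!)$ in (44) and (45), and the passage from $a$ to $a + e_\mu$ in the last step of (48), rest on Pascal's rule for multi-indices. The following is a minimal sketch of that combinatorial step only; the function names `multi_binomial` and `pascal_step_holds` are illustrative and do not come from the paper.

```python
from itertools import product
from math import factorial, prod


def multi_factorial(a):
    """a! = prod_mu a_mu! for a multi-index a (tuple of non-negative ints)."""
    return prod(factorial(k) for k in a)


def multi_binomial(a, b):
    """a! / (b! (a-b)!) for multi-indices with 0 <= b <= a, and 0 otherwise."""
    if len(a) != len(b) or any(bk < 0 or bk > ak for ak, bk in zip(a, b)):
        return 0
    diff = tuple(ak - bk for ak, bk in zip(a, b))
    return multi_factorial(a) // (multi_factorial(b) * multi_factorial(diff))


def unit(mu, dim):
    """e_mu = (0, ..., 1, ..., 0) with 1 at position mu (0-based here)."""
    return tuple(1 if k == mu else 0 for k in range(dim))


def pascal_step_holds(a, mu):
    """Check C(a + e_mu, b) == C(a, b - e_mu) + C(a, b) for all 0 <= b <= a + e_mu.

    The first summand collects the terms in which the extra derivative hits the
    first factor, the second summand those in which it hits the second factor.
    """
    e = unit(mu, len(a))
    a_plus = tuple(x + y for x, y in zip(a, e))
    for b in product(*(range(k + 1) for k in a_plus)):
        b_minus = tuple(x - y for x, y in zip(b, e))
        if multi_binomial(a_plus, b) != multi_binomial(a, b_minus) + multi_binomial(a, b):
            return False
    return True
```

Calling `pascal_step_holds(a, mu)` for a few four-component multi-indices confirms that the two sums in the second line of (48) recombine with the coefficient $(a+e_\mu)!/(b!(a+e_\mu-b)!)$.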
Proof. Using the explicit form (35) for $\Delta^\mu$ the l.h.s. of (49) is equal to

$$ \sum_{\varphi\in\tilde{\mathcal G}} \sum_a C^\mu_{\chi,\psi;a}\,(-1)^{|a|} \sum_{0\le b\le a} \frac{a!}{b!(a-b)!} \Bigl[\frac{\partial(\tilde\partial^{(a-b)}V)}{\partial\varphi}\,W\,(\partial^b g) f + (\tilde\partial^{(a-b)}V)\,\frac{\partial W}{\partial\varphi}\,(\partial^b g) f\Bigr] \Delta_{\varphi,\kappa}\star h . \tag{50} $$

Again by means of (35) the r.h.s. of (49) can be written as

$$ \sum_{\varphi\in\tilde{\mathcal G}} \sum_a C^\mu_{\chi,\psi;a}\,(-1)^{|a|} \sum_{0\le b\le a} \frac{a!}{b!(a-b)!} \Bigl[\ \sum_{0\le c\le b} \frac{b!}{c!(b-c)!}\,\tilde\partial^{(a-b)}\Bigl(\frac{\partial V}{\partial\varphi}\Bigr) W\,(\partial^c g)(\partial^{(b-c)}\Delta_{\varphi,\kappa}\star h) f + (\tilde\partial^{(a-b)}V)\,\frac{\partial W}{\partial\varphi}\,(\partial^b g) f\,(\Delta_{\varphi,\kappa}\star h)\Bigr] . \tag{51} $$

Due to (45) the expressions (50) and (51) agree.

We now come to the proof of (Ñ), i.e. we show that there exists a normalization of the $T$-products which satisfies (N0), (N1), (N2), (N3) and (Ñ). Let the $T$-products fulfil the first four of these normalization conditions to all orders. In a double inductive procedure we assume that (Ñ) holds to lower orders $\le n$ and for all $T$-products to order $n + 1$ of sub-polynomials. More precisely, the second induction goes (for each fixed $n$) with respect to the 'polynomial degree' $d$ which is the sum of the degrees of the polynomials $V_1, \ldots, V_n \in \tilde{\mathcal P}_0$ in $T_n(V_1 g_1 \otimes \cdots \otimes V_n g_n)$: $d := |V_1| + \cdots + |V_n|$. Note $|\partial^a V| = |V| = |\tilde\partial^a V|$. By using (N3) we want to show
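that the commutators of the l.h.s. and of the r.h.s. of (Ñ) with $T_1(\kappa h)$ agree. The commutator of the l.h.s. is equal to

\begin{align}
& i \sum_{\varphi\in\tilde{\mathcal G}} \Bigl[\, T_{n+1}\Bigl(\Bigl[\frac{\partial(\tilde\partial^\nu V)}{\partial\varphi}\,W + (\tilde\partial^\nu V)\,\frac{\partial W}{\partial\varphi}\Bigr] g\,(\Delta_{\varphi,\kappa}\star h) \otimes W_1 f_1 \otimes \cdots\Bigr) \tag{52} \\
& \quad + \sum_j T_{n+1}\Bigl((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots \otimes \frac{\partial W_j}{\partial\varphi}\,f_j\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr)\Bigr] . \tag{53}
\end{align}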
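To compute the commutator of the r.h.s. of (Ñ) with $T_1(\kappa h)$ we use again (N3) and in addition Lemma 3.2. We obtain

\begin{align}
& i \sum_{\varphi\in\tilde{\mathcal G}} \Bigl[\, T_{n+1}\Bigl(\Bigl[\frac{\partial(\partial^\nu V)}{\partial\varphi}\,W + (\partial^\nu V)\,\frac{\partial W}{\partial\varphi}\Bigr] g\,(\Delta_{\varphi,\kappa}\star h) \otimes W_1 f_1 \otimes \cdots\Bigr) \tag{54} \\
& \quad + \sum_j T_{n+1}\Bigl((\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots \otimes \frac{\partial W_j}{\partial\varphi}\,f_j\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr) \tag{55} \\
& \quad + i \sum_{m=1}^n \sum_{\chi,\psi\in\tilde{\mathcal G}} T_n\Bigl(\Bigl[\Delta^\nu_{\chi,\psi}\Bigl(\Bigl[\frac{\partial^2 V}{\partial\varphi\,\partial\chi}\,W + \frac{\partial V}{\partial\chi}\frac{\partial W}{\partial\varphi}\Bigr] g\,(\Delta_{\varphi,\kappa}\star h),\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \tag{56} \\
& \qquad\quad + \Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,W g,\ \frac{\partial^2 W_m}{\partial\varphi\,\partial\psi}\,f_m\,(\Delta_{\varphi,\kappa}\star h)\Bigr)\Bigr] \otimes W_1 f_1 \otimes \cdots \hat m \cdots\Bigr) \tag{57} \\
& \quad + i \sum_{m,j\,(m\neq j)}\ \sum_{\chi,\psi\in\tilde{\mathcal G}} T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,W g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes \frac{\partial W_j}{\partial\varphi}\,f_j\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr)\Bigr] . \tag{58}
\end{align}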
Due to (Ñ) for subpolynomials we have the following equations:

(second term in (52)) = (second term in (54)) + (second term in (56))

and

(53) = (55) + (57) + (58) .

To get the equality of (52) + (53) and (54) + (55) + (56) + (57) + (58) it remains to show:

(first term in (52)) = (first term in (54)) + (first term in (56)) .   (∗)

To verify this we insert (46) with $\partial^\mu \otimes 1$ replaced by $\tilde\partial^\mu \otimes 1$ into the first term in (52) and the original (46) into the first term in (54). The remaining terms in (∗) cancel by means of (Ñ) for subpolynomials.
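From the result just proved we conclude that (Ñ) can be violated by a C-number only:

$$\begin{aligned}
& T_{n+1}((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots) - T_{n+1}((\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots) \\
& \quad - i \sum_{m=1}^n \sum_{\chi,\psi\in\tilde{\mathcal G}} T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,W g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots\Bigr) \\
& = \langle\Omega|T_{n+1}((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle - \langle\Omega|T_{n+1}((\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle \\
& \quad - i \sum_{m=1}^n \sum_{\chi,\psi\in\tilde{\mathcal G}} \langle\Omega|T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,W g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots\Bigr)|\Omega\rangle \\
& =: \tilde a(g, f_1, \ldots, f_n) .
\end{aligned} \tag{59}$$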
Due to causal factorization of the $T$-products and the validity of (Ñ) to lower orders $\le n$, the possible violation $\tilde a(g, f_1, \ldots, f_n)$ of (Ñ) must be local

$$ \tilde a(g, f_1, \ldots, f_n) = \int dx\, dx_1 \ldots dx_n \sum_{|a|=0}^{\omega} C_a\, \partial^a\delta(x_1 - x, \ldots, x_n - x)\, g(x) f_1(x_1) \cdots f_n(x_n) , \tag{60} $$

with unknown constants $C_a$ and

$$ \omega := \mathrm{sd}(\langle\Omega|T_{n+1}((\tilde\partial^\nu V) W, W_1, \ldots, W_n)|\Omega\rangle) - 4n . \tag{61} $$

After the finite renormalization

$$ \langle\Omega|T_{n+1}((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle \ \to\ \langle\Omega|T_{n+1}((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle - \tilde a(g, f_1, \ldots) \tag{62} $$

(Ñ) holds true. By construction (in particular (61)) this renormalization respects (N0). From the definition (59) of $\tilde a(g, f_1, \ldots)$ we see that (62) maintains (N1), (N2) and the permutation symmetry of $\langle\Omega|T_{n+1}((\tilde\partial^\nu V) W, W_1, \ldots)|\Omega\rangle$. However, in general (62) violates (N3), namely in the cases in which $T_{n+1}((\tilde\partial^\nu V) W, W_1, \ldots)$ appears on the r.h.s. of (N3).^r So we everywhere repair (N3) by a chain of finite renormalizations of $T$-products of order $n + 1$ with polynomial degree $d > |V| + |W| + |W_1| + \cdots + |W_n|$.^s It is obvious that this can be done such that (N0), (N1) and (N2) are preserved. The validity of (Ñ) up to order $n + 1$ and polynomial degree $|V| + |W| + |W_1| + \cdots + |W_n|$ is not touched by these renormalizations. So the inductive step is finished.

^r The cases of (N3) in which $T_{n+1}((\tilde\partial^\nu V) W, W_1, \ldots)$ appears on the l.h.s. remain true, because only the C-number part of $T_{n+1}((\tilde\partial^\nu V) W, W_1, \ldots)$ gets changed.
^s The vacuum expectation values of these $T$-products remain unchanged; solely the operator parts get renormalized.

In other words, the compatibility of the renormalization (62) with (N3) follows from the following general observation: (N3) determines the operator-valued map $T_n$ completely in terms of the C-valued map $\langle\Omega|T_n(\cdot)|\Omega\rangle : \mathcal D(\mathbb R^4, \tilde{\mathcal P}_0)^{\otimes n} \to \mathbb C$. However, (N3) does not give any relation among the vacuum expectation values of the $T$-products, they may be arbitrarily given. Hence, renormalizations of the vacuum expectation values of the $T$-products are not in conflict with (N3). We will use this second way of argumentation in the following.

3.2. Compatibility of the master Ward identity with (N3)

We start with $T$-products which fulfil (N0), (N1), (N2), (N3) and (Ñ) to all orders. We use the same double induction as in the preceding subsection: we assume that (N) holds to lower orders $\le n$ and for $T_{n+1}$ restricted to the elements of $\mathcal D(\mathbb R^4, \tilde{\mathcal P}_0)^{\otimes n+1}$ with a lower polynomial degree. By means of (N3) we are going to prove that the commutators of the l.h.s. and of the r.h.s. of (N) with $T_1(\kappa h)$ are equal. For the l.h.s. it results

\begin{align}
& -i \sum_{\varphi\in\mathcal G} \Bigl[\, T_{n+1}\Bigl(\frac{\partial V}{\partial\varphi}\,(\partial^\nu g)(\Delta_{\varphi,\kappa}\star h) \otimes W_1 f_1 \otimes \cdots\Bigr) \tag{63} \\
& \quad + \sum_{l=1}^n T_{n+1}\Bigl(V \partial^\nu g \otimes W_1 f_1 \otimes \cdots \otimes \frac{\partial W_l}{\partial\varphi}\,f_l\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr)\Bigr] . \tag{64}
\end{align}
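By using again (N3) and in addition Lemma 3.2 we compute the commutator of the r.h.s. and obtain

\begin{align}
& i \sum_{\varphi\in\mathcal G} \Bigl[\, T_{n+1}\Bigl(\frac{\partial(\partial^\nu V)}{\partial\varphi}\,g\,(\Delta_{\varphi,\kappa}\star h) \otimes W_1 f_1 \otimes \cdots\Bigr) \tag{65} \\
& \quad + \sum_{l=1}^n T_{n+1}\Bigl((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots \otimes \frac{\partial W_l}{\partial\varphi}\,f_l\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr) \tag{66} \\
& \quad + i \sum_{m=1}^n \sum_{\chi,\psi\in\mathcal G} \Bigl\{ T_n\Bigl(\Bigl[\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial^2 V}{\partial\varphi\,\partial\chi}\,g\,(\Delta_{\varphi,\kappa}\star h),\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \tag{67} \\
& \qquad\quad + \Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,g,\ \frac{\partial^2 W_m}{\partial\varphi\,\partial\psi}\,f_m\,(\Delta_{\varphi,\kappa}\star h)\Bigr)\Bigr] \otimes W_1 f_1 \otimes \cdots \hat m \cdots\Bigr) \tag{68} \\
& \qquad\quad + \sum_{l\,(l\neq m)} T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes \frac{\partial W_l}{\partial\varphi}\,f_l\,(\Delta_{\varphi,\kappa}\star h) \otimes \cdots\Bigr)\Bigr\}\Bigr] . \tag{69}
\end{align}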
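The validity of (N) for sub-polynomials implies

(64) = (66) + (68) + (69) .

It remains to prove

(63) = (65) + (67) .   (∗∗)

After inserting (46) into (65) this equation (∗∗) takes the form of (N) for some sub-polynomials, which holds true by the inductive assumption.

As in the case of (Ñ) (59)–(61) we conclude that the operator identity (N) can be violated by a local C-number $a(g, f_1, \ldots, f_n)$ only:

$$\begin{aligned}
a(g, f_1, \ldots, f_n) :={}& \langle\Omega|T_{n+1}(V \partial^\nu g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle + \langle\Omega|T_{n+1}((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle \\
& + i \sum_{m=1}^n \sum_{\chi,\psi\in\mathcal G} \langle\Omega|T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots\Bigr)|\Omega\rangle .
\end{aligned} \tag{70}$$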
The aim is now to remove $a(g, f_1, \ldots, f_n)$ by finite renormalizations of the vacuum expectation values $\langle\Omega|T(\cdots)|\Omega\rangle$ on the r.h.s. Such renormalizations are not in conflict with (N3) (see the end of the preceding subsection). So we have proved the compatibility of (N) with (N3).

We discuss the possibilities to remove $a(g, f_1, \ldots, f_n)$:

(A) The finite renormalization

$$ \langle\Omega|T_{n+1}((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle \ \to\ \langle\Omega|T_{n+1}((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots)|\Omega\rangle - a(g, f_1, \ldots) , \tag{71} $$

does this job and is compatible with (N0)–(N3), (Ñ) and permutation symmetry. However, this procedure works only if $\partial^\nu V \neq 0$ and if

$$ \mathrm{sd}(\langle\Omega|T_{n+1}(\partial^\nu V, W_1, \ldots, W_n)|\Omega\rangle) = \mathrm{sd}(\langle\Omega|T_{n+1}(V, W_1, \ldots, W_n)|\Omega\rangle) + 1 . $$

If the latter does not hold, one gets in conflict with (N0). This happens e.g. for the axial and pseudo-scalar triangle-diagram, see (89). In many important applications of the MWI $V$ corresponds to a conserved current (i.e. $\partial^\nu V = 0$), for example $V = \bar\psi\gamma\psi$, or if $V$ is the free ghost current (cf. Sec. 4.2 for both) or the free BRST-current (Sec. 4.4).

(B) If (71) does not work one tries to satisfy (N) by renormalizing also $\langle\Omega|T_{n+1}(V, W_1, \ldots)|\Omega\rangle$ and possibly $\langle\Omega|T_n(\Delta^\nu_{\chi,\psi}(\cdots), W_1 \cdots)|\Omega\rangle$. This method does not ensure success. In detail one proceeds in the following way:

(B1) $a(g, f_1, \ldots)$ has the form (60) with

$$ \omega := \mathrm{sd}(\langle\Omega|T_{n+1}(V, W_1, \ldots, W_n)|\Omega\rangle) + 1 - 4n . $$
Using symmetry properties (e.g. Poincaré covariance, permutation symmetries) of the r.h.s. of (70) the constants $C_a$ (we use the notation of (60)) can be strongly restricted.

(B2) One works out the freedoms of normalization (29) of $\langle\Omega|T_{n+1}(V, W_1, \ldots)|\Omega\rangle$, $\langle\Omega|T_{n+1}((\partial^\nu V), W_1 \cdots)|\Omega\rangle$ and possibly $\langle\Omega|T_n(\Delta^\nu(\cdots), W_1 \cdots)|\Omega\rangle$ (the second is only available if $\partial^\nu V \neq 0$) which respect (N0)–(N2), (Ñ) and permutation symmetry. Renormalizations of $\langle\Omega|T_n(\Delta^\nu(\cdots), W_1 \cdots)|\Omega\rangle$ are also restricted by the validity of the inductive assumption for (N).

(B3) One then tries to remove the remaining $a(g, f_1, \ldots)$ by using the freedoms which result from step (B2).

Because the restricted $a(g, f_1, \ldots)$ (step (B1)) and the free normalization polynomials (step (B2)) depend strongly on $(V, W_1, \ldots, W_n)$, one has to treat each combination $(V, W_1, \ldots, W_n)$ separately and this gives quite a lot of work. This method was used in [11] to prove 'perturbative gauge invariance' (which is Eq. (147) with $j_1 = \cdots = j_n = 0$) for $SU(N)$-Yang–Mills theories. To restrict $a(g, f_1, \ldots)$ sufficiently a weak assumption about the infrared behaviour was necessary. (However, if this assumption did not hold, the Green's functions would not exist.)

3.3. Proof of the master Ward identity for solely massive fields and not relatively lowered scaling degree

We return to the end of Sec. 2.2 and set

$$ \omega(t_0) := \mathrm{sd}(t_0) - 4(n - 1) . \tag{72} $$

A possible extension $t \in \mathcal D'(\mathbb R^{4(n-1)})$ of $t_0 \in \mathcal D'(\mathbb R^{4(n-1)} \setminus \{0\})$ which respects (N0) is given by (cf. [5, 30])

$$ (t^{(w)}, h) := (t_0, h^{(w)}), \qquad \forall h \in \mathcal D(\mathbb R^{4(n-1)}) \tag{73} $$

where

$$ h^{(w)}(x) := h(x) - w(x) \sum_{|a|=0}^{\omega(t_0)} \frac{x^a}{a!}\,(\partial^a h)(0) , \tag{74} $$
and $w \in \mathcal D(\mathbb R^{4(n-1)})$ is such that there exists a neighborhood $U$ of $0 \in \mathbb R^{4(n-1)}$ with $w|_U \equiv 1$. A change of $w$ alters the normalization of $t^{(w)}$. For $\omega(t_0) < 0$ we have $h^{(w)} = h$ in agreement with the fact that the extension is unique in that case. Because there is no Lorentz invariant $w \in \mathcal D(\mathbb R^{4(n-1)})$, the extension $t^{(w)}$ is not Lorentz covariant (in general) and one has to perform a finite renormalization (29) to restore this symmetry (see the second papers of [11, 39], as well as [6]). To avoid this one is tempted to choose $w \equiv 1$. But $h^{(w\equiv 1)}$ is not a test-function. However, if all fields are massive the infrared behavior is harmless and Epstein and Glaser [19] have shown that one may indeed choose $w \equiv 1$ in this case. The extension $t^{(c)} := t^{(w\equiv 1)}$ is called 'central solution' (or better 'central extension' in our framework) and it was pointed out that it preserves nearly all symmetries [19, 13, 34].^t

We are now going to show that the central extensions fulfil the MWI provided the scaling degree is not relatively lowered for the individual, contributing C-number distributions (a precise explanation of the latter expression is given below in (86), (87)). We define $\bar t^{\nu(c)}$, $t^{(c)}$ and $t^{\nu(c)}_{b;\chi,\psi}$ to be central extensions:

$$ \int dx\, dx_1 \cdots dx_n\ \bar t^{\nu(c)}(x_1 - x, \ldots, x_n - x)\, g(x) f_1(x_1) \cdots f_n(x_n) := \langle\Omega|T_{n+1}((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n)|\Omega\rangle^{(c)} , \tag{78} $$

$$ \int dx\, dx_1 \cdots dx_n\ t^{(c)}(x_1 - x, \ldots, x_n - x)\, g(x) f_1(x_1) \cdots f_n(x_n) := \langle\Omega|T_{n+1}(V g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n)|\Omega\rangle^{(c)} , \tag{79} $$

$$\begin{aligned}
& \sum_b \int dx\, dx_1 \cdots dx_n\ t^{\nu(c)}_{b;\chi,\psi}(x_1 - x_m, \ldots, \hat m, \ldots, x_n - x_m) \times (\partial^b\delta)(x_m - x)\, g(x) f_1(x_1) \cdots f_n(x_n) \\
& \quad := \langle\Omega|T_n\Bigl(\Delta^\nu_{\chi,\psi}\Bigl(\frac{\partial V}{\partial\chi}\,g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr)|\Omega\rangle^{(c)} ,
\end{aligned} \tag{80}$$
^t Epstein and Glaser have proved that one may choose $w \equiv 1$ for the method of distribution splitting. In this footnote we show how their result applies to our extension procedure (73). From Epstein and Glaser [19] we know $t_0 \in \mathcal S'(\mathbb R^{4(n-1)} \setminus \{0\})$ and hence $t^{(w)} \in \mathcal S'(\mathbb R^{4(n-1)})$, so we may use Fourier transformation. Epstein and Glaser have proved that in the massive case the Fourier transformation $\hat t^{(w)}(p)$ (and therefore any extension (29)) is analytic in a neighborhood of $p = 0$. Then they define the central extension $t^{(c)}$ by

$$ \partial^a \hat t^{(c)}(0) = 0 , \qquad \forall |a| \le \omega(t_0) . \tag{75} $$

The Fourier transformation of $t^{(w)}$ (73) reads [30]

$$ \hat t^{(w)}(p) = \hat t_0(p) - \sum_{|a|=0}^{\omega(t_0)} \frac{p^a}{a!}\,\partial^a(\widehat{t_0 w})(0) . \tag{76} $$

Note $\widehat{t_0 w} = (2\pi)^{-\frac{n}{2}}\,\hat t_0 \star \hat w \in C^\infty$, i.e. $\partial^a(\widehat{t_0 w})(0)$ exists. Using the definition (75) of the central extension we now find

$$ \hat t^{(c)}(p) = \hat t^{(w)}(p) - \sum_{|a|=0}^{\omega(t_0)} \frac{p^a}{a!}\,(\partial^a \hat t^{(w)})(0) = \hat t_0(p) - \sum_{|a|=0}^{\omega(t_0)} \frac{p^a}{a!}\,(\partial^a \hat t_0)(0) . \tag{77} $$

We see that we may set $w \equiv 1$ in (76) and hence also in (73), and that this choice is the central extension (75).
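where we have taken the definition (35) of $\Delta^\mu$ into account.^u The corresponding non-extended distributions are

$$ \bar t^\nu_0,\ t_0 \in \mathcal D'(\mathbb R^{4n} \setminus \{0\}) \quad\text{and}\quad t^\nu_{b;\chi,\psi;0} \in \mathcal D'(\mathbb R^{4(n-1)} \setminus \{0\}) . \tag{82} $$

In the preceding subsection we have learnt that the validity of (N3) reduces the proof of the MWI to the vacuum sector (see (70)). So we only have to show

$$ -\partial^\nu t^{(c)}(y_1, \ldots, y_n) = \bar t^{\nu(c)}(y_1, \ldots, y_n) + i \sum_{m=1}^n \sum_{\chi,\psi\in\mathcal G} t^{\nu(c)}_{b;\chi,\psi}(y_1 - y_m, \ldots, \hat m, \ldots, y_n - y_m)\,\partial^b\delta(y_m) , \tag{83} $$

where

$$ \partial^\nu := \partial^\nu_1 + \cdots + \partial^\nu_n . \tag{84} $$

By causal factorization and induction we know that this equation is fulfilled by the corresponding non-extended distributions (82). Setting $y \equiv (y_1, \ldots, y_n)$ we obtain

$$\begin{aligned}
-(\partial^\nu t^{(c)}(y), h(y))
&= \Bigl(t_0(y),\ \Bigl[\partial^\nu h(y) - \sum_{|a|=0}^{\omega(t_0)} \frac{y^a}{a!}\,(\partial^a\partial^\nu h)(0)\Bigr]\Bigr) \\
&= \Bigl(t_0(y),\ \partial^\nu\Bigl[h(y) - \sum_{|a|=0}^{\omega(t_0)+1} \frac{y^a}{a!}\,(\partial^a h)(0)\Bigr]\Bigr) \\
&= \Bigl(\bar t^\nu_0(y),\ \Bigl[h(y) - \sum_{|a|=0}^{\omega(t_0)+1} \frac{y^a}{a!}\,(\partial^a h)(0)\Bigr]\Bigr) \\
&\quad + i \sum_{m=1}^n \sum_{\chi,\psi\in\mathcal G} \Bigl(t^\nu_{b;\chi,\psi;0}(y_1 - y_m, \ldots, \hat m, \ldots, y_n - y_m)\,\partial^b\delta(y_m),\ \Bigl[h(y) - \sum_{|a|=0}^{\omega(t_0)+1} \frac{y^a}{a!}\,(\partial^a h)(0)\Bigr]\Bigr) .
\end{aligned} \tag{85}$$

^u From (35) we see that $\langle\Omega|T_n(\Delta^\mu(\cdots) \otimes \cdots \hat m \cdots)|\Omega\rangle^{(c)}$ is of the form

$$ \sum_b \tilde t_b(f_1 \otimes \cdots \otimes (\partial^b g) f_m \otimes \cdots \otimes f_n) = \sum_b \int dx\, dx_1 \cdots dx_n\ t_b(x_1 - x_m, \ldots, \hat m, \ldots, x_n - x_m)\,(\partial^b\delta)(x_m - x)\, g(x) f_1(x_1) \cdots f_n(x_n) , \tag{81} $$

$\tilde t_b \in \mathcal D'(\mathbb R^{4n})$, $t_b \in \mathcal D'(\mathbb R^{4(n-1)})$.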
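If the scaling degree is not relatively lowered, more precisely if

$$ \omega(\bar t^\nu_0) = \omega(t_0) + 1 \tag{86} $$

and

$$ \omega(t^\nu_{b;\chi,\psi;0}) = \omega(t_0) + 1 - |b| , \qquad \forall b ,\ \forall \chi, \psi \in \mathcal G , \tag{87} $$

then the terms in the final expression of (85) are the central extensions. For the $\bar t^\nu_0$-term this is obvious. To verify this statement for the $t^\nu_{b;\chi,\psi;0}$-terms (we omit the indices $\nu, \chi, \psi$ in the following) it suffices to consider the term $m = n$ and test-functions of the form $h(y_1, \ldots, y_n) = h_1(z_1, \ldots, z_{n-1})\,h_2(z_n)$ where

$$ z \equiv (z_1, \ldots, z_n) := (y_1 - y_n, \ldots, y_{n-1} - y_n, y_n) =: Ay , \qquad A \in SL(n, \mathbb R) . $$

Then we have

$$\begin{aligned}
h(y) - \sum_{|a|=0}^{\omega(t_{b;0})+|b|} \frac{y^a}{a!}\,(\partial^a h)(0)
&= (h_1 \otimes h_2)(z) - \sum_{|a|=0}^{\omega(t_{b;0})+|b|} \frac{y^a}{a!}\,((A^T\partial)^a (h_1 \otimes h_2))(0) \\
&= (h_1 \otimes h_2)(z) - \sum_{|a|=0}^{\omega(t_{b;0})+|b|} \frac{z^a}{a!}\,(\partial^a (h_1 \otimes h_2))(0)
\end{aligned}$$

where $A^T$ denotes the transposed matrix and we have used $y^a \cdot (A^T\partial)^a = (Ay)^a \cdot \partial^a$. We set $a = (\bar a, a_n)$ and $z \equiv (\bar z, z_n)$. Then the last term in (85) can be transformed in the following way:

$$\begin{aligned}
& \Bigl(t_{b;0}(y_1 - y_n, \ldots, y_{n-1} - y_n)\,\partial^b\delta(y_n),\ \Bigl[h(y) - \sum_{|a|=0}^{\omega(t_{b;0})+|b|} \frac{y^a}{a!}\,(\partial^a h)(0)\Bigr]\Bigr)_y \\
& = \Bigl(t_{b;0}(\bar z)\,\partial^b\delta(z_n),\ \Bigl[h_1(\bar z) h_2(z_n) - \sum_{|\bar a|+|a_n|=0}^{\omega(t_{b;0})+|b|} \frac{\bar z^{\bar a} z_n^{a_n}}{\bar a!\,a_n!}\,(\partial^{\bar a} h_1)(0)\,(\partial^{a_n} h_2)(0)\Bigr]\Bigr)_z \\
& = (-1)^{|b|}\,(\partial^b h_2)(0)\,\Bigl(t_{b;0}(\bar z),\ \Bigl[h_1(\bar z) - \sum_{|\bar a|=0}^{\omega(t_{b;0})} \frac{\bar z^{\bar a}}{\bar a!}\,(\partial^{\bar a} h_1)(0)\Bigr]\Bigr)_{\bar z} \\
& = (t^{(c)}_b(\bar z)\,\partial^b\delta(z_n),\ h_1(\bar z) h_2(z_n))_z \\
& = (t^{(c)}_b(y_1 - y_n, \ldots, y_{n-1} - y_n)\,\partial^b\delta(y_n),\ h(y))_y .
\end{aligned} \tag{88}$$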
Summing up we find the assertion (83) if (86) and (87) hold true, otherwise we have over-subtracted extensions. Note that this proof works also for $\bar t^\nu_0 = 0$ and hence $\bar t^{\nu(c)} = 0$. Obviously this method fails for extensions $t^{(w)}$ (73) with $w \in \mathcal D(\mathbb R^{4(n-1)})$, because additional terms $\sim \partial^\nu w$ appear in (85).^v

In case of the axial anomaly we set $j^\mu_A := \bar\psi\gamma^\mu\gamma^5\psi$, $j^\mu := \bar\psi\gamma^\mu\psi$ and $j_\pi := i\bar\psi\gamma^5\psi$, and have

$$\begin{aligned}
t^{\mu\lambda\tau(c)}(x, x_1, x_2) &= \langle\Omega|T_3(j^\mu_A, j^\lambda, j^\tau)(x, x_1, x_2)|\Omega\rangle^{(c)} , \\
\bar t^{\nu\mu\lambda\tau(c)}(x, x_1, x_2) &= 2m\,g^{\mu\nu}\,\langle\Omega|T_3(j_\pi, j^\lambda, j^\tau)(x, x_1, x_2)|\Omega\rangle^{(c)}
\end{aligned} \tag{89}$$

for the AVV-triangle diagram. The corresponding distributions for the AAA-triangle are obtained by replacing $j^\lambda, j^\tau$ by $j^\lambda_A, j^\tau_A$. All $t_{b;\ldots}$-distributions vanish. One finds $\omega(t^{(c)}) = 1$ and $\omega(\bar t^{(c)}) = 0 < \omega(t^{(c)}) + 1$.^w Hence, the present proof (85) does not apply.

^v For the method of distribution splitting the central solution in momentum space can be obtained by a dispersion integral [13, 19, 34]. In [13] this dispersion integral has been used to prove gauge invariance of QED. The present proof (85) is a kind of x-space version of that procedure, which yields a more general result. In addition, it has the advantage that it is not necessary to treat the cases of different external legs individually.
^w According to power counting one expects $\omega(\bar t^{(c)}) = 1$, but the ($\omega = 1$)-terms are proportional to the spinor trace $\mathrm{tr}(\gamma^5\, p_{1\mu}\gamma^\mu\, \gamma^\lambda\, p_{2\nu}\gamma^\nu\, \gamma^\tau\, p_{3\rho}\gamma^\rho) = 0$.

4. Applications of the Master Ward Identity

The main success of the MWI are its many, important and far-reaching consequences.

4.1. Field equation

Let us consider the pair $(\varphi, \chi)$ of symbols (corresponding to massive or massless free fields which fulfill the Klein–Gordon or wave equation) that is studied in Appendix A^x and let $W_1, \ldots, W_n \in \mathcal P_0$. We assume that $W_1, \ldots, W_n$ contain only zeroth and first (internal) derivatives of $\chi$. By applying twice the MWI and using the explicit expressions (304)–(308) for $\delta^\mu$ we obtain

$$\begin{aligned}
& T_{n+1}(\varphi(\Box + m^2) g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& = -T_{n+1}((\partial_\mu\varphi)\partial^\mu g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) + m^2\, T_{n+1}(\varphi g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& \quad - i \sum_{m=1}^n \sum_{\psi\in\mathcal G} (\pm) T_n\Bigl(\Delta^\mu_{\varphi,\psi}\Bigl(\partial_\mu g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& = i \sum_{m=1}^n \sum_{\psi\in\mathcal G} (\pm) T_n\Bigl(\Delta^\mu_{\partial_\mu\varphi,\psi}\Bigl(g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& \quad - i \sum_{m=1}^n (\pm)\, C\, T_n\Bigl(\frac{\partial W_m}{\partial(\partial^\mu\chi)}\,(\partial^\mu g) f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& = i \sum_{m=1}^n (\pm) T_n\Bigl(\frac{\partial W_m}{\partial\chi}\,g f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& \quad + i \sum_{m=1}^n (\pm) T_n\Bigl(\frac{\partial W_m}{\partial(\partial^\mu\chi)}\,(\partial^\mu g) f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) .
\end{aligned} \tag{90}$$

^x For simplicity we choose = 1 in (296).
This is the normalization condition (N4) of [7] and [3]. It is equivalent to

$$ T_{n+1}(\varphi g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) = i \sum_{l=1}^n \sum_{\psi\in\mathcal G} T_n\Bigl(W_1 f_1 \otimes \cdots \otimes \frac{\partial W_l}{\partial\psi}\,f_l\,\Delta^F_{\psi,\varphi}\star g \otimes \cdots \otimes W_n f_n\Bigr) + \cdots , \tag{91} $$

where the dots stand for the terms in which $\varphi g$ is not contracted. We see from this formula (91) that the normalization condition (N4) can always be satisfied without getting in conflict with (N0)–(N3), even if anomalies are present. Note that the final result (on the r.h.s. in (90)) is independent from the normalization constant $C$ which appears in the intermediate formula. This must be so, because the Feynman propagators $\Delta^F_{\psi,\varphi}$ in (91) do not contain this constant.

Generalizing Bogoliubov's idea [4] we define the interacting field $\Lambda_{gL}$ belonging to $\Lambda \in \mathcal D(\mathbb R^4, \tilde{\mathcal P}_0)$ and to the interaction $L \in \mathcal P_0$ in terms of the $T$-products by

$$ \Lambda_{gL} := S(gL)^{-1}\,\frac{d}{i\,d\lambda}\Big|_{\lambda=0} S(gL + \lambda\Lambda) = \sum_{n=0}^\infty \frac{i^n}{n!}\,R_{n+1}((Lg)^{\otimes n}; \Lambda) = T_1(\Lambda) + O(g) , \tag{92} $$

where the 'totally retarded products' $R_{n+1}(\ldots)$ (also called 'R-product') are defined by

$$ R_{n+1}(\Lambda_1 \otimes \cdots \otimes \Lambda_n; \Lambda) := \sum_{I\subset\{1,\ldots,n\}} (-1)^{|I|}\,\bar T(\otimes_{l\in I}\Lambda_l)\,T((\otimes_{j\in I^c}\Lambda_j) \otimes \Lambda) \tag{93} $$

and we have used (17) and (18). Similarly to the S-matrix (17), the interacting fields are formal power series. In the particular case $\Lambda = W f$, $W \in \mathcal P_0$, $f \in \mathcal D(\mathbb R^4)$ we write $W_{gL}(f)$ instead of $(W f)_{gL}$. Following [7] the condition (90) can easily be translated into an identity for $R_{n+1}(W_1 f_1 \otimes \cdots \otimes W_n f_n; \varphi(\Box + m^2) f)$. The latter implies the field equation

$$ (\Box + m^2)\varphi_{gL} = -g\Bigl(\frac{\partial L}{\partial\chi}\Bigr)_{gL} + \partial^\mu\Bigl(g\Bigl(\frac{\partial L}{\partial(\partial^\mu\chi)}\Bigr)_{gL}\Bigr) , \tag{94} $$

where $g$ is a test function.
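The subset sum in (93) can be made concrete by listing its $2^n$ terms. The sketch below only enumerates the index sets and signs; it does not model the operators themselves. The names `r_product_terms` and `format_term`, the ASCII labels `L1, ..., Ln, L` for $\Lambda_1, \ldots, \Lambda_n, \Lambda$, and the convention that the antichronological product of an empty tensor product is the identity are assumptions made for this illustration.

```python
from itertools import combinations


def r_product_terms(n):
    """Return (sign, I, I^c) for every subset I of {1, ..., n}, with sign = (-1)^|I|."""
    indices = tuple(range(1, n + 1))
    terms = []
    for size in range(n + 1):
        for subset in combinations(indices, size):
            complement = tuple(j for j in indices if j not in subset)
            terms.append(((-1) ** size, subset, complement))
    return terms


def format_term(sign, subset, complement):
    """Render one summand of (93) as text; Tbar of the empty product is omitted."""
    left = "Tbar(" + " (x) ".join(f"L{l}" for l in subset) + ") " if subset else ""
    right = "T(" + " (x) ".join([f"L{j}" for j in complement] + ["L"]) + ")"
    return ("+ " if sign > 0 else "- ") + left + right


if __name__ == "__main__":
    # Print the four summands of R_3(L1 (x) L2; L).
    for term in r_product_terms(2):
        print(format_term(*term))
```

For $n = 0$ the enumeration returns the single term $T(\Lambda)$, in line with the leading term $T_1(\Lambda)$ in (92).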
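The calculation (90) can be carried over to external derivatives by using (Ñ) instead of (N). More precisely let $W, W_1, \ldots, W_n \in \mathcal P_0$ and let us assume that $W_1, \ldots, W_n$ contain only zeroth and first (internal) derivatives of $\chi$. With that we obtain

$$\begin{aligned}
& T_{n+1}(((\tilde\Box + m^2)\varphi) W g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& = i \sum_{m=1}^n (\pm) T_n\Bigl(W\,\frac{\partial W_m}{\partial\chi}\,g f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& \quad + i \sum_{m=1}^n (\pm) T_n\Bigl(W\,\frac{\partial W_m}{\partial(\partial_\mu\chi)}\,(\partial_\mu g) f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& \quad + i \sum_{m=1}^n (\pm) T_n\Bigl((\tilde\partial_\mu W)\,\frac{\partial W_m}{\partial(\partial_\mu\chi)}\,g f_m \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr)
\end{aligned} \tag{95}$$

by proceeding analogously to (90), i.e. we have twice applied (Ñ). In the special case that no derivatives of $\chi$ are present, the last two terms on the r.h.s. vanish.

4.2. Charge- and ghost-number conservation

We consider massive or massless spinors $\psi, \bar\psi \in \mathcal P_0$ fulfilling the Dirac equation and in particular the matter current $j_\mu := \bar\psi\gamma_\mu\psi$ (which is conserved). We assume $W_1, \ldots, W_n \in \mathcal P_0$ and that no derivatives of $\psi$ and $\bar\psi$ are present. Charge conservation is expressed by the following Ward identity (N5) (charge) which is an immediate consequence of the master Ward identity (N)

$$\begin{aligned}
& -T_{n+1}(j_\mu\partial^\mu g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& = i \sum_{m=1}^n \Bigl[(\pm) T_n\Bigl(\Delta^\mu_{\gamma_\mu\psi,\bar\psi}\Bigl(\frac{\partial j_\mu}{\partial(\gamma_\mu\psi)}\,g,\ \frac{\partial W_m}{\partial\bar\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr) \\
& \qquad\quad + (\pm) T_n\Bigl(\Delta^\mu_{\bar\psi\gamma_\mu,\psi}\Bigl(\frac{\partial j_\mu}{\partial(\bar\psi\gamma_\mu)}\,g,\ \frac{\partial W_m}{\partial\psi}\,f_m\Bigr) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n\Bigr)\Bigr] \\
& = \sum_{m=1}^n T_n\Bigl(W_1 f_1 \otimes \cdots \otimes \Bigl(\bar\psi\,\frac{\partial W_m}{\partial\bar\psi} - \psi\,\frac{\partial W_m}{\partial\psi}\Bigr) g f_m \otimes \cdots \otimes W_n f_n\Bigr) .
\end{aligned} \tag{96}$$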
In the second step we have used the formulas (312) and (313) for $\delta^\mu$. Each monomial $W$ is an eigenvector of the operator $\bar\psi\,\frac{\partial}{\partial\bar\psi} - \psi\,\frac{\partial}{\partial\psi}$ with eigenvalue: (number of $\bar\psi$ in $W$) minus (number of $\psi$ in $W$), which we call 'spinor charge'. That this Ward identity can be satisfied by choosing suitable normalizations which are compatible with (N0)–(N4) has been proved in [7] for the case that $W_1, \ldots, W_n$ are sub-monomials of the QED-interaction $L = A_\mu\bar\psi\gamma^\mu\psi$.

We turn to models which contain pairs $(\tilde u_a, u_a)$ of massive or massless, scalar, but fermionic ghost fields, e.g. non-Abelian gauge theories (see Appendix A for the anti-commutators and Feynman propagators of the free ghost fields $\tilde u_a, u_a \in \mathcal P_0$). The free ghost current

$$ k^\mu = i \sum_a [u_a\,\partial^\mu\tilde u_a - \partial^\mu u_a\,\tilde u_a] \tag{97} $$

is conserved, because $u_a, \tilde u_a$ satisfy the Klein–Gordon or wave equation. Let $W_1, \ldots, W_n \in \mathcal P_0$ and we assume that only zeroth and first (internal) derivatives of $u_a$ and $\tilde u_a$ appear in $W_1, \ldots, W_n$. Similarly to (96) the MWI (N) implies the following Ward identity (N5) (ghost):

$$\begin{aligned}
& -T_{n+1}(k_\mu\partial^\mu h \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& = \sum_{m=1}^n T_n\Bigl(W_1 f_1 \otimes \cdots \otimes \Bigl[\Bigl(u_a\,\frac{\partial W_m}{\partial u_a} - C_{u_a}\,\partial^\mu u_a\,\frac{\partial W_m}{\partial(\partial^\mu u_a)} - \tilde u_a\,\frac{\partial W_m}{\partial\tilde u_a} + C_{u_a}\,\partial^\mu\tilde u_a\,\frac{\partial W_m}{\partial(\partial^\mu\tilde u_a)}\Bigr) h f_m \\
& \qquad + (1 + C_{u_a})\Bigl((\tilde\partial^\mu u_a)\,\frac{\partial W_m}{\partial(\partial^\mu u_a)}\,h f_m + u_a\,\frac{\partial W_m}{\partial(\partial^\mu u_a)}\,(\partial^\mu h) f_m \\
& \qquad\qquad - (\tilde\partial^\mu\tilde u_a)\,\frac{\partial W_m}{\partial(\partial^\mu\tilde u_a)}\,h f_m - \tilde u_a\,\frac{\partial W_m}{\partial(\partial^\mu\tilde u_a)}\,(\partial^\mu h) f_m\Bigr)\Bigr] \otimes \cdots \otimes W_n f_n\Bigr) ,
\end{aligned} \tag{98}$$

where the normalization constant $C$ appearing in (305), (308) is specified by a lower index $u_a$. Every monomial $W$ is an eigenvector of the operator

$$ \Theta_g := u_a\,\frac{\partial}{\partial u_a} + (\partial^\mu u_a)\,\frac{\partial}{\partial(\partial^\mu u_a)} - \tilde u_a\,\frac{\partial}{\partial\tilde u_a} - (\partial^\mu\tilde u_a)\,\frac{\partial}{\partial(\partial^\mu\tilde u_a)} \tag{99} $$

and the eigenvalue is the ghost number $g(W)$:

$$ \Theta_g W = g(W)\,W , \qquad g(W) \in \mathbb Z . \tag{100} $$

The identity (98) expresses ghost number conservation correctly if and only if

$$ C_{u_a} = -1 , \qquad \forall a . \tag{101} $$

With this normalization (N5) (ghost) takes the form

$$ -T_{n+1}(k_\mu\partial^\mu h \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) = \sum_{m=1}^n g(W_m)\,T_n(W_1 f_1 \otimes \cdots \otimes W_m f_m h \otimes \cdots \otimes W_n f_n) \tag{102} $$

for monomials $W_1, \ldots, W_n \in \mathcal P_0$. That the normalization condition (N5) (ghost) (with $C_{u_a} = -1$) has common solutions with (N0)–(N4) has been proved in [3] by using the method of [7, Appendix B]. (A slight restriction on $W_1, \ldots, W_n$ is used in that proof.)
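The eigenvalue statement (100) amounts to counting ghosts minus anti-ghosts in a monomial. The sketch below illustrates this bookkeeping under simplifying assumptions: a monomial is represented as a plain list of symbol names, the colour index $a$ is ignored, and the labels `"u"`, `"du"`, `"ut"`, `"dut"` (for $u$, $\partial^\mu u$, $\tilde u$, $\partial^\mu\tilde u$) as well as the function names are hypothetical and not taken from the paper.

```python
from collections import Counter

# Weight of each symbol under Theta_g in (99): the ghost u and its first
# derivative count +1, the anti-ghost u~ and its first derivative count -1.
# Any other symbol (e.g. gauge or matter fields) carries weight 0.
GHOST_WEIGHT = {"u": 1, "du": 1, "ut": -1, "dut": -1}


def ghost_number(monomial):
    """Ghost number g(W) of a monomial given as an iterable of symbol names."""
    counts = Counter(monomial)
    return sum(GHOST_WEIGHT.get(symbol, 0) * k for symbol, k in counts.items())


def ward_rhs_coefficients(monomials):
    """Coefficients g(W_m) multiplying T_n(... W_m f_m h ...) on the r.h.s. of (102)."""
    return [ghost_number(W) for W in monomials]


if __name__ == "__main__":
    # Three illustrative monomials: u, u~ A and A A.
    print(ward_rhs_coefficients([["u"], ["ut", "A"], ["A", "A"]]))
```

The sum of the returned coefficients is the total ghost number of the tensor product $W_1 \otimes \cdots \otimes W_n$.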
Remark 4.1. The (free) ghost charge $Q_g$ is defined by

$$ Q_g := \int_{x^0=\mathrm{const.}} d^3x\ T_1(k^0(x)) . \tag{103} $$

(N5) (ghost) implies the identity

$$ [Q_g, T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)] = \Bigl(\sum_{m=1}^n g(W_m)\Bigr)\,T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n) \tag{104} $$

as can be seen by a suitable choice of the test-function $h$ in (102). For the details of this conclusion as well as for the existence of $Q_g$ see the corresponding procedure (118)–(121) for the free BRST-current.

4.3. Non-Abelian matter currents

The aim of this subsection is to derive the identity (5) from the MWI. Let

$$ j^\mu_a := \bar\psi_\alpha\,\gamma^\mu\,\frac{(\lambda_a)_{\alpha\beta}}{2}\,\psi_\beta \tag{105} $$

(we use matrix notation for the spinor structure) and

$$ [\lambda_a, \lambda_b] = 2i f_{abc}\lambda_c , \tag{106} $$

where $(f_{abc})_{a,b,c}$ are the structure constants of some Lie algebra. We assume that the masses of the spinor fields are colour independent

$$ (i\gamma_\mu\partial^\mu - m)\psi_\alpha = 0 , \qquad \forall\alpha , \tag{107} $$

which implies

$$ \partial_\mu j^\mu_a = 0 . \tag{108} $$

We denote by $(A_a)_a$ the gauge fields and by $(u_a, \tilde u_a)_a$ the corresponding fermionic ghost fields, and consider an interaction of the form

$$ L = j^\mu_a A_{a\mu} + L_1(A, u, \tilde u) , \tag{109} $$

where $L_1(A, u, \tilde u)$ is a polynomial in the symbols $A, u, \tilde u$ and internal derivatives thereof. QCD fits in this framework: the quark fields $\psi_\alpha$ are in the fundamental representation of $SU(3)$. To apply the MWI we need

$$ i \sum_{\chi,\varphi} \Delta^\mu_{\chi,\varphi}\Bigl(\frac{\partial j_{a\mu}}{\partial\chi}\,f,\ \frac{\partial j^\nu_b}{\partial\varphi}\,h\Bigr) = \frac{f h}{4}\,\bar\psi\gamma^\nu[\lambda_a, \lambda_b]\psi = i f h\, f_{abc}\, j^\nu_c \tag{110} $$

((312) and (313) are used), and by contracting with $A_{b\nu}$ we obtain $i \sum \Delta^\mu_{\chi,\varphi}\bigl(\frac{\partial j_{a\mu}}{\partial\chi}\,f,\ \frac{\partial L}{\partial\varphi}\,g\bigr)$. So the MWI for $T(gL \otimes \cdots \otimes gL \otimes j^\mu_a\partial_\mu f)$ implies

$$ -R_{n+1}((gL)^{\otimes n}; j^\mu_a\partial_\mu f) = i n\,R_{n+1}((gL)^{\otimes(n-1)}; f_{abc} A_{b\nu} j^\nu_c f g) , \tag{111} $$
and hence

$$ j^\mu_{a\,gL}(\partial_\mu f) = (f_{abc} A_{b\nu} j^\nu_c)_{gL}(f g) , \tag{112} $$

which corresponds to the covariant conservation of the interacting classical current.

To formulate (5) we need the time-ordered product $T_{gL}(W_1 f_1 \otimes \cdots \otimes W_m f_m)$ of the interacting fields $W_{1\,gL}(f_1), \ldots, W_{m\,gL}(f_m)$, which is defined by generalizing (92) (cf. [4, 19])

$$\begin{aligned}
T_{gL}(W_1 f_1 \otimes \cdots \otimes W_m f_m)
&:= S(gL)^{-1}\,\frac{d^m}{i^m\,d\lambda_1 \cdots d\lambda_m}\Big|_{\lambda_1=\cdots=\lambda_m=0} S\Bigl(gL + \sum_{l=1}^m \lambda_l W_l f_l\Bigr) \\
&= \sum_{n=0}^\infty \frac{i^n}{n!}\,R_{n,m}((gL)^{\otimes n}; W_1 f_1 \otimes \ldots \otimes W_m f_m)
\end{aligned} \tag{113}$$

with^y

$$ R_{n,m}(g_1 V_1 \otimes \cdots \otimes g_n V_n; W_1 f_1 \otimes \cdots \otimes W_m f_m) := \sum_{I\subset\{1,\ldots,n\}} (-1)^{|I|}\,\bar T(\otimes_{l\in I}\, g_l V_l)\,T((\otimes_{j\in I^c}\, g_j V_j) \otimes (\otimes_{k=1}^m f_k W_k)) . \tag{114} $$

^y The connection to the notation (93) reads: $R_{n,1} \equiv R_{n+1}$.

By using (108) and (110) the MWI yields

$$ -R_{n,2}((gL)^{\otimes n}; j^\mu_a\partial_\mu f \otimes j^\nu_b h) = R_{n,1}((gL)^{\otimes n}; i f_{abc}\, j^\nu_c f h) + i n\,R_{n-1,2}((gL)^{\otimes(n-1)}; f_{acd} A_{c\tau} j^\tau_d f g \otimes j^\nu_b h) \tag{115} $$

which gives

$$ -T_{gL}(j^\mu_a\partial_\mu f \otimes j^\nu_b h) = i f_{abc}\, j^\nu_{c\,gL}(f h) - T_{gL}(f_{acd} A_{c\tau} j^\tau_d f g \otimes j^\nu_b h) . \tag{116} $$

Due to (112) this is the formulation of (5) in the framework of causal perturbation theory.

In the simple case that the gauge fields $A_a$ are external fields (which implies $L_1(A, u, \tilde u) \equiv 0$) and the spinor fields are massive ($m > 0$), the proof of Sec. 3.3 applies, i.e. the central extensions fulfil the MWI. (Note that no factor $m$ appears in (111) and (115), which indicates that the scaling degree is not lowered.)

4.4. The master BRST-identity

We consider free gauge fields $A^\mu_a$, $a = 1, \ldots, N$, with mass $m_a \ge 0$ in Feynman gauge and the corresponding free ghost fields $\tilde u_a, u_a$ with the same mass $m_a$. For each fixed value of $a$ and $\mu$ the field $A^\mu_a$ is quantized as a real scalar field satisfying the Klein–Gordon or wave equation, i.e. in the formalism of Appendix A we set $\varphi = A^\mu_a = \chi$, = 1. The free ghost fields fulfil the same algebraic relations as in
Sec. 4.2 and in Appendix A. For each massive gauge field $A^\mu_a$, $m_a > 0$, we introduce a free, real scalar field $\phi_a$ with the same mass $m_a$, which is quantized with a minus sign in the commutator, i.e. we have $\varphi = \phi_a = \chi$, = −1 in the formalism of Appendix A. (For the Fock space representation of these free fields see e.g. [35].) There is no obstacle to include spinor fields in our treatment of BRST-symmetry (Secs. 4.4 and 4.5), see [7], the last paper of [11, 15, 21].

The free BRST-current (cf. [15, 25])

$$ j^\mu := \sum_a [(\partial_\tau A^\tau_a + m_a\phi_a)\,\partial^\mu u_a - \partial^\mu(\partial_\tau A^\tau_a + m_a\phi_a)\,u_a] \tag{117} $$

is conserved, because $\partial_\tau A^\tau_a$, $u_a$ and $\phi_a$ fulfill the Klein–Gordon equation with the same mass $m_a$. We will see that the corresponding charge

$$ Q_0 := \int_{x^0=\mathrm{const.}} d^3x\ T_1(j^0)(x) , \tag{118} $$

is the generator of the BRST-transformation of the free fields and Wick monomials. $Q_0$ is nilpotent,

$$ 2Q_0^2 = [Q_0, Q_0]_+ = 0 , \tag{119} $$

because $[(\partial_\tau A^\tau_a + m_a\phi_a), (\partial_\rho A^\rho_b + m_b\phi_b)] = 0$. Without the scalar fields $\phi_a$ the charge $Q_0$ would not be nilpotent, if some gauge fields are massive. So, a main purpose of the scalar fields $\phi_a$ is to restore the nilpotency of $Q_0$. (For a rigorous definition of $Q_0$, with 4-dimensional smearing with a test function and taking a suitable limit, see [7] where a method of Requardt [33] is used.)

To obtain the master BRST-identity (i.e. the (anti)commutator of $Q_0$ with arbitrary $T$-products (2)) we will compute

$$ T_{n+1}(j_\mu\partial^\mu g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) , \qquad W_1, \ldots, W_n \in \mathcal P_0 , \tag{120} $$

by means of the MWI (N). Thereby we assume that $W_1, \ldots, W_n$ have an even or odd ghost number (no mixture). From this result we shall get $[Q_0, T(W_1, \ldots, W_n)]_\mp$ in the following way: let $\mathcal O$ be an open double cone with $\mathrm{supp}\, f_j \subset \mathcal O$, $\forall j = 1, \ldots, n$. Following [7, Appendix B] we choose $g$ to be equal to 1 on a neighborhood of $\bar{\mathcal O}$ and decompose $\partial^\mu g = b^\mu - a^\mu$ such that $\mathrm{supp}\, a^\mu \cap (\bar V_- + \mathcal O) = \emptyset$ and $\mathrm{supp}\, b^\mu \cap (\bar V_+ + \mathcal O) = \emptyset$. Then we apply causal factorization of the $T$-products:

$$\begin{aligned}
-T_{n+1}(j_\mu(\partial^\mu g) \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n)
&= T_1(j_\mu a^\mu)\,T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n) \mp T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)\,T_1(j_\mu b^\mu) \\
&= [T_1(j_\mu a^\mu), T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)]_\mp \mp T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)\,T_1(j_\mu\partial^\mu g) .
\end{aligned} \tag{121}$$

The last term on the r.h.s. vanishes because of $\partial^\mu j_\mu = 0$. Since $T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)$ is localized in $\mathcal O$, we may vary $a^\mu$ in the spatial complement of $\bar{\mathcal O}$ without affecting $[T_1(j_\mu a^\mu), T_n(W_1 f_1 \otimes \cdots)]_\mp$. In this way and by using $\partial^\mu j_\mu = 0$ we find

$$ [T_1(j_\mu a^\mu), T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)]_\mp = [Q_0, T_n(W_1 f_1 \otimes \cdots \otimes W_n f_n)]_\mp \tag{122} $$

(see [7, Appendix B] for details of this conclusion).
We start the computation of (120) with the simplest case: $n = 1$. We assume that the symbols in $W$ carry at most a first (internal) derivative (no higher derivatives) and give the calculation in detail

$$ -T_2(j_\mu(\partial^\mu g) \otimes W f) = i \sum_{\chi,\psi\in\mathcal G} T_1\Bigl(\Delta^\mu_{\chi,\psi}\Bigl(\frac{\partial j_\mu}{\partial\chi}\,g,\ \frac{\partial W}{\partial\psi}\,f\Bigr)\Bigr) . \tag{123} $$

The explicit results for the $\Delta^\mu$ with a non-vanishing contribution are listed in Appendix B. Thereby $C_{A_a}$ ($C_{1A_a}$ resp.), $C_{\phi_a}$ and $C_{u_a}$ mean the normalization constants $C$ ($C_1$ resp.) in the cases $\varphi = A^\mu_a = \chi$, $\varphi = \phi_a = \chi$ and $\varphi = \tilde u_a$, $\chi = u_a$. In the present context they may depend on the colour index $a$. Inserting (315)–(326) into (123) we obtain

$$ -T_2(j_\mu(\partial^\mu g) \otimes W f) = T_1(s_0(W)\,g f + [\cdots](\partial^\nu g) f + [\cdots](\partial^\nu\partial^\sigma g) f + [\cdots](\Box g) f) \tag{124} $$

by means of $T_1((\tilde\partial^a V) W g) = T_1((\partial^a V) W g)$, where

$$\begin{aligned}
s_0(W) := i\Bigl[& (\partial^\mu u_a)\,\frac{\partial W}{\partial A^\mu_a} + (\partial^\sigma\partial^\mu u_a)\,\frac{\partial W}{\partial(\partial^\sigma A^\mu_a)} - (\partial_\tau A^\tau_a + m_a\phi_a)\,\frac{\partial W}{\partial\tilde u_a} \\
& - (\partial_\nu(\partial_\tau A^\tau_a + m_a\phi_a))\,\frac{\partial W}{\partial(\partial_\nu\tilde u_a)} + m_a u_a\,\frac{\partial W}{\partial\phi_a} + m_a(\partial_\mu u_a)\,\frac{\partial W}{\partial(\partial_\mu\phi_a)}\Bigr] .
\end{aligned} \tag{125}$$

(The terms which are not written out depend on the normalization constants $C_{A_a}$, $C_{\phi_a}$, $C_{u_a}$ and $C_{1A_a}$.) Using $g f = f$ and $(\partial^a g) f = 0$, $\forall |a| \ge 1$ we end up with

$$ [Q_0, T_1(W f)]_\mp = T_1(s_0(W) f) , \tag{126} $$

where we have the anti-commutator iff $W$ has an odd ghost number. The normalization constants $C_{A_a}$, $C_{\phi_a}$, $C_{u_a}$ and $C_{1A_a}$ have dropped out on the r.h.s., as it must be since they do not appear on the l.h.s. of (126). The result (126) is the well-known free BRST-transformation of a Wick polynomial $T_1(W) \to T_1(s_0(W))$ (cf. [11]) which we have obtained here with quite a lot of calculations. Note that in our framework $s_0$ is a derivation $s_0 : \mathcal P_0 \to \mathcal P_0$. However, the advantage of the present method is that it can be used to compute commutators of $Q_0$ with $T$-products of higher orders.

For $n = 2$ in (120) we obtain

$$ -T_3(j_\mu(\partial^\mu g) \otimes W_1 f_1 \otimes W_2 f_2) = i \sum_{\chi,\psi\in\mathcal G} T_2\Bigl(\Delta^\mu_{\chi,\psi}\Bigl(\frac{\partial j_\mu}{\partial\chi}\,g,\ \frac{\partial W_1}{\partial\psi}\,f_1\Bigr) \otimes W_2 f_2\Bigr) + (\pm)[(W_1, f_1) \leftrightarrow (W_2, f_2)] \tag{127} $$

where $(\pm)$ is still a sign coming from permutations of fermionic operators. We insert the expressions (315)–(326) for the various $\Delta^\mu$. For given $f_1, f_2$ we then choose $g$
as in (121), hence gfj = fj and (∂ a g)fj = 0, ∀|a| ≥ 1. It results [Q0 , T2 (W1 f1 ⊗ W2 f2 )]∓ 1 µ 3 ∂W1 (∂ ua ) + (∂˜µ ua ) = iT2 4 4 ∂Aµ a 1 ∂W1 1 µ 2 ˜ ˜ + 2C1Aa ua + CAa ma ua + CAa + + 2C1Aa (∂ ∂µ ua ) − 2 2 ∂(∂ν Aνa ) 1 + − C1Aa (∂˜σ ∂ ν ua ) + + 2C1Aa (∂˜ν ∂ σ ua ) 2 ! 1 ∂W1 + − C1Aa (∂˜ν ∂˜σ ua ) 2 ∂(∂ σ Aνa ) − (∂τ Aτa + ma φa )
∂W1 + (−(1 + Cua )(∂˜ν )(∂τ Aτa + ma φa ) ∂u ˜a
+ Cua (∂ν (∂τ Aτa + ma φa ))) + ma ua
∂W1 ∂(∂ν u ˜a )
∂W1 ∂W1 + ma ((1 + Cφa )∂˜ν ua − Cφa ∂ν ua ) f1 ⊗ W2 f2 ∂φa ∂(∂ν φa )
+ (±)[(W1 , f1 ) ↔ (W2 , f2 )] .
(128)
To simplify this expression we insert the value $C_{u_a} = -1$, which is required by ghost number conservation (98)–(101). By means of $(\tilde{\mathrm N})$ we replace the external derivatives by internal ones:
\[ [Q_0, T_2(W_1f_1\otimes W_2f_2)]_\mp = \big[T_2(s_0(W_1)f_1\otimes W_2f_2) + T_1(G^{(1)}(W_1f_1, W_2f_2))\big] + (\pm)\big[(W_1,f_1)\leftrightarrow(W_2,f_2)\big], \tag{129} \]
where, by definition,
G(1) (W1 f1 , W2 f2 ) = −
X 3 ψ∈G
4
∆µua ,ψ
+ +
1 2 1 2 1 2
∂W2 ∂W1 f1 , f2 ∂(∂ν Aνa ) ∂ψ ∂W2 ∂W1 + 2C1Aa ∆µ∂˜ u ,ψ f f , 1 2 µ a ∂(∂ν Aνa ) ∂ψ ∂W1 ∂W2 − 2C1Aa ∆σ∂ν ua ,ψ f f , 1 2 ∂(∂ σ Aνa ) ∂ψ ∂W1 ∂W2 + 2C1Aa ∆ν∂ σ ua ,ψ f f , 1 2 ∂(∂ σ Aνa ) ∂ψ
+ CAa ∆µ∂µ ua ,ψ −
∂W1 ∂W2 f2 f1 , ∂Aµa ∂ψ
1 ∂W1 ∂W2 ν − C1Aa ∆∂˜σ ua ,ψ f1 , f2 + 2 ∂(∂ σ Aνa ) ∂ψ # ∂W ∂W 1 2 f1 , f2 . + ma (1 + Cφa )∆νua ,ψ ∂(∂ ν φa ) ∂ψ
(130)
Note that $G^{(1)}(\cdot\,,\cdot)$ is not invariant with respect to the exchange of the two arguments. Now we assume that $s_0(W_j)$ is a divergence, i.e. that there exists a (Lorentz) vector $(W'_{j\nu})_{\nu=0,\ldots,3}$, $W'_{j\nu}\in P_0$, with
\[ s_0(W_j) = i\,\partial^\nu W'_{j\nu}, \qquad j = 1, 2. \tag{131} \]
By means of the MWI (N) we shift this derivative to the test-function:
\[ [Q_0, T_2(W_1f_1\otimes W_2f_2)]_\mp = \big[-iT_2(W'_{1\nu}\,\partial^\nu f_1\otimes W_2f_2) + T_1\big(G((W_1,W'_1)f_1, W_2f_2)\big)\big] + (\pm)\big[(W_1,W'_1,f_1)\leftrightarrow(W_2,W'_2,f_2)\big], \tag{132} \]
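where
\[ G((W_1,W'_1)f_1, W_2f_2) = G^{(1)}(W_1f_1, W_2f_2) + G^{(2)}(W'_1f_1, W_2f_2), \tag{133} \]
with
\[ G^{(2)}(W'_1f_1, W_2f_2) := \sum_{\chi,\psi\in\mathcal{G}} \Delta^\nu_{\chi,\psi}\Big(\frac{\partial W'_{1\nu}}{\partial\chi}\,f_1,\ \frac{\partial W_2}{\partial\psi}\,f_2\Big). \tag{134} \]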
If we only know that $s_0(W_1)$ is a divergence, our final result reads
\[ [Q_0, T_2(W_1f_1\otimes W_2f_2)]_\mp = -iT_2(W'_{1\nu}\,\partial^\nu f_1\otimes W_2f_2) + T_1\big(G((W_1,W'_1)f_1, W_2f_2)\big) + (\pm)\big[T_2(s_0(W_2)f_2\otimes W_1f_1) + T_1(G^{(1)}(W_2f_2, W_1f_1))\big] \tag{135} \]
instead of (132).
The (n = 2)-calculation generalizes to higher orders n ≥ 2 in a straightforward way: let $W_1,\ldots,W_k, V_1,\ldots,V_{n-k}\in P_0$ with $s_0(W_j) = i\partial^\nu W'_{j\nu}$, $\forall j = 1,\ldots,k$, and $f_j, h_i\in\mathcal{D}(\mathbb{R}^4)$. For simplicity we assume that each polynomial $W_1,\ldots,W_k, V_1,\ldots,V_{n-k}$ has an even ghost number; otherwise some additional, obvious signs appear in the following formula. Setting $m := n-k$ we obtain
\[
\begin{aligned}
&[Q_0, T_n(W_1f_1\otimes\cdots\otimes W_kf_k\otimes V_1h_1\otimes\cdots\otimes V_mh_m)] \\
&\quad= -i\sum_{l=1}^{k} T_n(W_1f_1\otimes\cdots\otimes W'_{l\nu}\,\partial^\nu f_l\otimes\cdots\otimes W_kf_k\otimes V_1h_1\otimes\cdots) \\
&\qquad+ \sum_{l=1}^{m} T_n(W_1f_1\otimes\cdots\otimes V_1h_1\otimes\cdots\otimes s_0(V_l)h_l\otimes\cdots\otimes V_mh_m) \\
&\qquad+ \sum_{\substack{l,r=1\\ (l\neq r)}}^{k} T_{n-1}\big(G((W_l,W'_l)f_l, W_rf_r)\otimes W_1f_1\otimes\cdots\hat l\cdots\hat r\cdots\otimes W_kf_k\otimes V_1h_1\otimes\cdots\big) \\
&\qquad+ \sum_{l=1}^{k}\sum_{r=1}^{m}\Big[T_{n-1}\big(G((W_l,W'_l)f_l, V_rh_r)\otimes W_1f_1\otimes\cdots\hat l\cdots\otimes V_1h_1\otimes\cdots\hat r\cdots\big) \\
&\qquad\qquad\qquad\ + T_{n-1}\big(G^{(1)}(V_rh_r, W_lf_l)\otimes W_1f_1\otimes\cdots\hat l\cdots\otimes V_1h_1\otimes\cdots\hat r\cdots\big)\Big] \\
&\qquad+ \sum_{\substack{l,r=1\\ (l\neq r)}}^{m} T_{n-1}\big(G^{(1)}(V_lh_l, V_rh_r)\otimes W_1f_1\otimes\cdots\otimes V_1h_1\otimes\cdots\hat l\cdots\hat r\cdots\otimes V_mh_m\big),
\end{aligned}
\tag{136}
\]
where $\hat l$ or $\hat r$ means that the corresponding factor is omitted. We call this equation the 'master BRST-identity'. It is a consequence of the master Ward identity. Hence, the master BRST-identity is also a normalization condition. We point out that the G-terms (i.e. the terms in the last four lines) are explicitly known. We are not aware of any reference which gives a general formula for these terms, as is done here.
So far we have not spoken about the interaction $L\equiv L_0$; the master BRST-identity is a condition on T-products of arbitrary factors. Now we require that the interaction is $s_0$-invariant in some sense. The requirement $s_0L_0 = 0$ is too restrictive: it is not satisfied for physically relevant models. So we impose the weaker condition that $s_0L_0$ is a divergence:
\[ s_0L_0 = i\,\partial^\nu L_{1\nu}. \tag{137} \]
The requirements that (a) the master BRST-identity becomes particularly simple, and (b) it can be satisfied to all orders for T-products involving the interaction, are good criteria (among others) to restrict $L_0$ further. We will make (a) explicit by the formula
\[ G((L_0,L_1)f, L_0g) + (f\leftrightarrow g) = 0. \tag{138} \]
(b) means that anomalies (in the master BRST-identity) may not occur or must cancel. It is a hard job to work this out. For example, it is well known that in weak interactions the axial anomalies cancel only if the numbers of generations of leptons and quarks agree.
For an interaction L fulfilling (137) and (138) the validity of the master BRST-identity for $[Q_0, T_n(L,\ldots,L)]$ $\forall n\in\mathbb{N}$ implies
\[ [Q_0, S] = 0, \tag{139} \]
where S is the S-matrix in the adiabatic limit,
\[ S := \lim_{g\to 1} S(gL), \tag{140} \]
provided this limit exists [20]. Hence, S induces a well-defined operator on the physical Hilbert space $H_{\rm phys} = \ker Q_0/\operatorname{ran} Q_0$, which is unitary if $L_0^* = L_0$ and (N2) is satisfied [15, 16, 20, 21].
Having determined the interaction by using (137), (138) and other (quite obvious) requirements, we will show that the validity of the master BRST-identity and of the ghost number conservation (N5) (ghost) suffices for a local construction of observables in non-Abelian gauge theories. This is a generalization of the corresponding construction for QED in [7]. In particular we will obtain an explicit formula for the computation of the nonlinear term in the BRST-transformation of an arbitrary interacting field.

4.5. Local construction of observables in gauge theories

For massive gauge fields the procedure is more involved. So we first treat massless gauge fields and afterwards give the modifications for the massive case.

4.5.1. Massless gauge fields: determination of the interaction

Since we are considering solely massless fields ($m_a = 0$, $\forall a$), the scalar fields $\phi_a$ are superfluous. So we set $\phi_a\equiv 0$, $\forall a$. First we determine the interaction $L_0$ by the following requirements (cf. [16, 18, 21, 35, 38]):
(A) There exist $L_j\in(P_0)^{4^j}$, $j = 0, 1, \ldots, M$, which satisfy the ladder equations
\[ s_0L_j^{\mu_1\ldots\mu_j} = i\,\partial_{\mu_{j+1}}L_{j+1}^{\mu_1\ldots\mu_j\mu_{j+1}}, \quad j = 0, 1, \ldots, M-1, \qquad s_0L_M = 0. \tag{141} \]
(B) $L_j$ is a polynomial in the gauge field $A^\mu_a$ and in the fermionic ghost fields $u_a$, $\tilde u_a$, $a = 1,\ldots,N$, and internal derivatives of these symbols; each monomial has at least three factors.
(C) $L_j$ has UV-dimension ≤ 4.
(D) $L_j$ is a Lorentz tensor of rank j.
(E) $L_j$ has ghost number j:
\[ g(L_j) = j \tag{142} \]
(cf. (100)). Thereby we take into account that $s_0$ increases the ghost number by 1. We conclude that the ladder (141) stops at M ≤ 3 for trilinear terms.
(F) Unitarity (for $L_0$ only): $L_0^* = L_0$.
Following [38] we make the most general ansatz for $L_j$, j = 0, 1, 2, 3, which satisfies (B)–(F) and insert it into (141). The calculation excludes quadrilinear terms in $L_0$.
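The bound M ≤ 3 in (E) can be made explicit by a simple count. The sketch below assumes the usual ghost-number assignment $g(u_a) = 1$, $g(\tilde u_a) = -1$, $g(A^\mu_a) = 0$, with derivatives not changing the ghost number; the symbols $n_u$, $n_{\tilde u}$, $n_A$ are introduced here only for illustration.

```latex
% A trilinear monomial with n_u factors u, n_{\tilde u} factors \tilde u
% and n_A factors A (derivatives ignored) satisfies
\[
  n_u + n_{\tilde u} + n_A = 3, \qquad
  g = n_u - n_{\tilde u} \;\le\; n_u \;\le\; 3 ,
\]
% with equality g = 3 only for monomials of the type u\,u\,u (plus derivatives).
% Hence trilinear L_j with g(L_j) = j can exist only for j \le 3.
```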
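Using $F^{\nu\mu} := \partial^\nu A^\mu - \partial^\mu A^\nu$ the most general solution for $L_0$ reads
\[ L_0 = g_0 f_{abc}\Big(\tfrac12 A_{a\mu}A_{b\nu}F_c^{\nu\mu} - u_a\,\partial^\mu\tilde u_b\,A_{c\mu}\Big) - i\,s_0K_1 + \partial_\nu K_2^\nu, \tag{143} \]
where $f_{abc}$ must be totally antisymmetric and $g_0\in\mathbb{R}$ is a constant. This implies that the colour index takes at least N ≥ 3 values. The $K_j$ are trilinear polynomials with ghost number (j − 2). We assume that the colour tensor in $K_j$, j = 1, 2, is also totally antisymmetric (i.e. $K_j = h^j_{abc}\,\varphi^{(1)}_a\varphi^{(2)}_b\varphi^{(3)}_c$ with a totally antisymmetric $h^j$). Then one finds
\[ K_1 = g_0h^1_{abc}\,u_a\tilde u_b\tilde u_c, \qquad K_2^\mu = g_0h^2_{abc}\,A^\mu_au_b\tilde u_c. \tag{144} \]
The most general solutions for $L_j$, j ≥ 1, contain trilinear terms only. Assuming again that solely totally antisymmetric colour tensors appear, we obtain
\[
\begin{aligned}
L_1^\nu &= g_0f_{abc}\Big(A_{a\mu}u_bF_c^{\nu\mu} - \tfrac12 u_au_b\,\partial^\nu\tilde u_c\Big) - i\,s_0K_2^\nu + g_0h^3_{abc}\,\partial_\mu(u_aA^\nu_bA^\mu_c), \\
L_2^{\nu\mu} &= \tfrac12 g_0f_{abc}\,u_au_bF_c^{\nu\mu} - i\,g_0\,s_0(h^3_{abc}\,u_aA^\nu_bA^\mu_c) \\
&\quad+ g_0h^4_{abc}\big[u_au_b\,\partial^\nu A^\mu_c + 2u_a\,\partial^\nu u_b\,A^\mu_c - g^{\nu\mu}(u_au_b\,\partial_\lambda A^\lambda_c + 2u_a\,\partial_\lambda u_b\,A^\lambda_c)\big], \\
L_3^{\nu\mu\lambda} &= g_0h^4_{abc}\big[g^{\nu\mu}u_au_b\,\partial^\lambda u_c + g^{\lambda\nu}u_au_b\,\partial^\mu u_c\big],
\end{aligned}
\tag{145}
\]
where $h^3$, $h^4$ are totally antisymmetric. Note that the divergence $\partial_\mu$ of the $h^4$-term in $L_2^{\nu\mu}$ vanishes. To simplify the formulas we choose
\[ h^j_{abc} = 0, \qquad \forall j = 1, 2, 3, 4. \tag{146} \]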
$h^4 = 0$ is equivalent to $L_2^{\mu\nu} = -L_2^{\nu\mu}$ and also to $L_3 = 0$.
The requirements (A)–(F) used so far do not involve T-products; they are of first order perturbation theory. We now restrict $L_0$ further by (138), which can be interpreted as a requirement for second order tree diagrams, see (150). More precisely, we will work with the generalization of (138) to the ladder (141): in order that the master BRST-identity (136) implies the important equation ('generalized perturbative gauge invariance', see footnote z)
\[ [Q_0, T_n(L_{j_1}f_1\otimes\cdots\otimes L_{j_n}f_n)]_\mp = -i\sum_{l=1}^{n}(-1)^{j_1+\cdots+j_{l-1}}\,T_n(L_{j_1}f_1\otimes\cdots\otimes L^\nu_{j_l+1}\,\partial_\nu f_l\otimes\cdots\otimes L_{j_n}f_n), \tag{147} \]
we require
\[ G((L_j,L_{j+1})f, L_kg) + (-1)^{jk}\,G((L_k,L_{k+1})g, L_jf) = 0, \qquad j, k = 0, 1, 2, 3, \tag{148} \]
$\forall f, g\in\mathcal{D}(\mathbb{R}^4)$, where we set $L_4\equiv 0$. Or, with the simplification (146) (which will always be used in the following), j and k run only through the values j, k = 0, 1, 2. In the present case of solely massless fields this requirement can be fulfilled. It restricts the interaction $L_0$ further and determines the normalization constant $C_{A_a}$. Namely, using the simplification (146), one finds by explicit calculation that the requirement (148) holds true if and only if
\[ C_{A_a} = -\tfrac12, \qquad \forall a, \tag{149} \]
and the $f_{abc}$ fulfil the Jacobi identity [38] (see footnote aa). Hence, the $f_{abc}$ are structure constants of some Lie algebra. The total antisymmetry of $f_{abc}$ implies that this Lie algebra is isomorphic to a direct sum of Abelian and simple compact Lie algebras, see e.g. [35]. We point out that the Lie algebraic structure is not put in; it is a consequence of our requirements.

Footnote z: In [3] this identity is called the normalization condition (N6); in [16] it is called 'generalized (free perturbative operator) gauge invariance'. The importance and usefulness of this identity has also been pointed out in the earlier paper [18]. The particular case $j_1 = \cdots = j_n = 0$ is the 'perturbative gauge invariance' (or 'free perturbative operator gauge invariance') which has been proved in [11] for SU(N)-Yang-Mills theories. In [17] it has been shown that this perturbative gauge invariance implies the usual Slavnov–Taylor identities.

Footnote aa: The Jacobi identity and (149) are required even from the particular case $G((L_0,L_1)f, L_0g) + (f\leftrightarrow g) = 0$. This was demonstrated in [38] by reversing the calculation in [11]. The computation of the l.h.s. of (148) is lengthy. The straightforward way uses the definitions (130) and (134) of $G^{(1)}$ and $G^{(2)}$. To shorten the calculation one may choose $C_{1A_a} = 0 = C_{1u_a}$, because the terms $\sim C_{1A_a}, C_{1u_a}$ must drop out. This follows from the fact that $L_0$ (143), $L_1$ and $L_2$ (145) do not contain symbols with second or higher derivatives and, hence, the r.h.s. in
\[ T_1\big(G((L_j,L_{j+1})f, L_kg) + (-1)^{jk}\{(j,f)\leftrightarrow(k,g)\}\big) = [Q_0, T_2(L_jf\otimes L_kg)]\big|_{4\text{-legs}} + i\big(T_2(L^\nu_{j+1}\,\partial_\nu f\otimes L_kg) + (-1)^{jk}\{(j,f)\leftrightarrow(k,g)\}\big)\big|_{4\text{-legs}} \tag{150} \]
(cf. (132); $\ldots|_{4\text{-legs}}$ expresses that we mean the terms with 4 free field operators only) does not contain the constants $C_{1A_a}$ and $C_{1u_a}$ (according to the definition (301)). However, even with this simplification, it seems to be faster to compute the r.h.s. of (150) (by using the techniques of [11, 15]) instead of the straightforward computation of the l.h.s. Thereby, the T-products of second order must fulfill the normalization condition (N3), because the MWI presupposes this condition. Note that the derivation of (150) uses the MWI for tree diagrams only and, hence, (150) holds surely true.

Remark 4.2. (1) If we do not use the simplification (146), but assume that $K_1$ and $K_2$ are built up from the same colour tensor $f_{abc}$ as the first two terms in (143),
\[ K_1 = -i\beta_1g_0f_{abc}\,u_a\tilde u_b\tilde u_c, \qquad K_2^\mu = \beta_2g_0f_{abc}\,A^\mu_au_b\tilde u_c, \qquad \beta_1, \beta_2\in\mathbb{R} \tag{151} \]
(instead of (144)), we can determine the parameters $\beta_1$ and $\beta_2$ from the particular case (j, k) = (0, 0) of the requirement (148): additionally to (149) and the Jacobi identity this condition yields
\[ (\beta_1,\beta_2)\in\Big\{(0,0),\ \big(\tfrac12, 1\big),\ \big(-\tfrac12, 0\big),\ (1,1)\Big\} \tag{152} \]
by a generalization of the calculation (149), where $C_{u_a} = -1$ (101) is used. Note that the first two solutions in (152), and also the latter two, are obtained from each other by replacing $u$ by $\gamma\tilde u$ and $\tilde u$ by $(-\tfrac{1}{\gamma}u)$ in $L_0$, $\gamma\in i\mathbb{R}\setminus\{0\}$ arbitrary.
(2) In [14] it was found how the quadrilinear interactions of the usual Lagrangian formalism appear in our framework. Namely, from (143), (151) and (N3) it results
\[
\begin{aligned}
T_2(L_0,L_0)(x_1,x_2) = f_{abr}f_{cds}\cdot\Big(&\Delta^F_{\partial^\nu A^\mu_r,\,\partial^\tau A^\rho_s}(x_1-x_2)\big[{:}A_{a\mu}(x_1)A_{b\nu}(x_1)A_{c\rho}(x_2)A_{d\tau}(x_2){:} \\
&\quad+ (\beta_2-2\beta_1)^2\,g_{\nu\mu}g_{\tau\rho}\,{:}u_a(x_1)\tilde u_b(x_1)u_c(x_2)\tilde u_d(x_2){:}\big] \\
&+ \big\{\Delta^F_{\partial^\mu\tilde u_r,\,\partial^\nu u_s}(x_1-x_2)\,\beta_2(\beta_2-1)\,{:}A_{a\mu}(x_1)u_b(x_1)A_{c\nu}(x_2)\tilde u_d(x_2){:} + (x_1\leftrightarrow x_2)\big\}\Big) + \cdots,
\end{aligned}
\tag{153}
\]
where the terms which are denoted by the dots have no contributions of the form
\[ \delta(x_1-x_2)\,{:}B_1(x_1)B_2(x_1)B_3(x_2)B_4(x_2){:} \tag{154} \]
($B_1,\ldots,B_4$ are free field operators); they are disconnected terms, tree terms with propagators $\Delta^F(x_1-x_2)$ or $\partial^\nu\Delta^F(x_1-x_2)$, or loop terms. The terms of the form (154) correspond to the quadrilinear terms of the interaction Lagrangian of the conventional theory. So $C_{A_a} = -\tfrac12$ (149) yields the usual 4-gluon coupling (cf. the first paper of [11]) and, if $\beta_2 - 2\beta_1\neq 0$, a 4-ghost coupling. However, there is no $AuA\tilde u$-coupling coming from $C_{u_a} = -1$, because $\beta_2(\beta_2-1) = 0$.
(3) By using (N4) (91) we obtain for the interacting F-field ($F^{\mu\nu} := \partial^\mu A^\nu - \partial^\nu A^\mu$)
\[ F^{\mu\nu}_{a\,gL}(x) = \partial^\mu A^\nu_{a\,gL}(x) - \partial^\nu A^\mu_{a\,gL}(x) - 2C_{A_a}\,g_0\,g(x)\,f_{abc}\,(A^\mu_bA^\nu_c)_{gL}(x). \tag{155} \]
We see that the nonlinear term is due to the non-vanishing of $C_{A_a}$ and that it agrees with the usual nonlinear term if and only if $C_{A_a} = -\tfrac12$, in agreement with (149).
(4) In Sec. 4.5.3 we will see that our local construction of observables works also if one replaces (148) by the weaker condition (209) (with the specifications (210)–(211) and (216)–(217)), i.e. one allows additional 4-legs couplings to be introduced 'by hand'. This relaxed version of (148) has solutions for arbitrary $(\beta_1,\beta_2)\in\mathbb{R}^2$ (151) (at least in the case (j, k) = (0, 0)), see e.g. [18].

4.5.2. Massless gauge fields: local construction of observables

In [7] a general local construction of observables in gauge theories and of the physical Hilbert space (in which the observables are faithfully represented) is given. This construction relies on some assumptions which can be fulfilled in QED [7]. We are now going to generalize the latter result to the class of interactions we have selected in the preceding subsection, which includes non-Abelian gauge theories. Thereby we assume that ghost number conservation (N5) (ghost) and certain cases of the master BRST-identity (136) are satisfied.
As in [7] we start with the local algebra of interacting gauge and ghost fields (92)
\[ \mathcal{F}(\mathcal{O}) := \bigvee\{W_{gL}(f)\mid f\in\mathcal{D}(\mathcal{O}),\ W = A^\mu, u, \tilde u, \ldots\} \tag{156} \]
(the dots stand for polynomials in $A^\mu$, $u$ and $\tilde u$), where $\mathcal{O}$ is a double cone and $g(x) = 1$, $\forall x\in\mathcal{O}$. In [5] the crucial observation has been made that a change of the switching function g outside of $\mathcal{O}$ transforms all interacting fields $\in\mathcal{F}(\mathcal{O})$ by the same unitary transformation (see footnote bb). Therefore, the algebraic properties of $\mathcal{F}(\mathcal{O})$ are independent of the adiabatic limit $g(x)\to 1$, $\forall x$. Hence, we may avoid this limit, which saves us from infrared divergences. It seems that a consistent perturbative construction of massless non-Abelian gauge theories can be done only locally, i.e. without performing the adiabatic limit, due to the confinement.
The field algebra $\mathcal{F}(\mathcal{O})$ contains unphysical fields. The central problem in gauge theories is to eliminate the latter, i.e. to select the observables, and, in a second step, to construct (physical) states on the algebra of observables. We proceed as in [7]: roughly speaking we will define the observables to be the BRST-invariant fields. Thereby, we will define the BRST-transformation $\tilde s$ as the ($\mathbb{Z}_2$-graded) commutator with the (modified) interacting BRST-charge $Q_{gL}$ of Kugo and Ojima [24]. But in contrast to this reference we do not perform the adiabatic limit. The latter causes the complication that $Q_{gL}$ does not agree with its zeroth order contribution $Q_0$: it is a non-trivial formal power series (cf. [7, 16]).
The current belonging to $Q_{gL}$ is the interacting BRST-current $\tilde j^\mu_{gL}$. From our experience made in QED [7] we know that $\tilde j^\mu_{gL}$ should have the following properties:
(a) to zeroth order it agrees with the free BRST current $j^\mu$ (117),
(b) it is conserved up to (first) derivatives of the coupling 'constant' g: $\partial_\mu\tilde j^\mu_{gL}(x)\sim(\partial g)(x)$.
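Unfortunately, (b) does not hold true for the interacting field $j_{gL}$ (92) and (93), where $j_{gL}$ is constructed in terms of T-products satisfying the MWI (N): from (123) and (124) we get
\[ -T_2(j^\mu\,\partial_\mu f\otimes L_0g) = -iT_1(L_1^\nu f\,\partial_\nu g) + iT_1(M^\nu(\partial_\nu f)g), \tag{157} \]
where
\[ M^\nu := -L_1^\nu + \Big(\tfrac12 + 3C_{1A_a}\Big)\,\partial_\mu u_a\,\frac{\partial L_0}{\partial(\partial_\mu A_{a\nu})} + \tfrac34\,u_a\,\frac{\partial L_0}{\partial A_{a\nu}}. \tag{158} \]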
However, the wanted conservation property can be achieved by a change of the normalization of $T_{n+1}(j^\mu, L_{j_1},\ldots,L_{j_n})$: motivated by
\[ s_0(k^\mu) = j^\mu \tag{159} \]
(where $k^\mu$ is the (free) ghost current (97)) and generalized perturbative gauge invariance (147) we define
\[
\begin{aligned}
\tilde T_{n+1}(j^\mu f\otimes L_{j_1}f_1\otimes\cdots\otimes L_{j_n}f_n) :={}& [Q_0, T_{n+1}(k^\mu f\otimes L_{j_1}f_1\otimes\cdots\otimes L_{j_n}f_n)]_\mp \\
&+ i\sum_{l=1}^{n}(-1)^{j_1+\cdots+j_{l-1}}\,T_{n+1}(k^\mu f\otimes L_{j_1}f_1\otimes\cdots\otimes L^\nu_{j_l+1}\,\partial_\nu f_l\otimes\cdots\otimes L_{j_n}f_n).
\end{aligned}
\tag{160}
\]
By means of (159) and (147) we find that $\tilde T_{n+1}(j^\mu, L_{j_1},\ldots,L_{j_n})$ factorizes causally (13), e.g. $\tilde T_{n+1}(j, L_{\ldots},\ldots) = \tilde T_{l+1}(j, L_{\ldots},\ldots)\,T_{n-l}(L_{\ldots},\ldots)$. In addition it is symmetrical in $L_{j_1},\ldots,L_{j_n}$ and fulfills the normalization conditions (N1), (N2) and (N0). Hence, $T_{n+1}(j^\mu, L_{j_1},\ldots)\to\tilde T_{n+1}(j^\mu, L_{j_1},\ldots)$ is an admissible finite renormalization of T-products (solely the extension to $D_{n+1}$ is changed), which however violates (N3) and the MWI (N). We point out that the requirement $\partial_\mu\tilde j^\mu_{gL}(x)\sim(\partial g)(x)$ is not compatible with (N3), which can be seen by an explicit calculation of the first order tree diagrams of $\tilde j^\mu_{gL}$ (see footnote cc). To compute the divergence of (160) with respect to $j^\mu$ we first apply (N5) (ghost) (102) to all terms on the r.h.s. (where we use $g(L_j) = j$) and afterwards generalized gauge invariance (147). In this way we obtain
\[
\begin{aligned}
\tilde T_{n+1}(j^\mu\,\partial_\mu f\otimes L_{j_1}f_1\otimes\cdots\otimes L_{j_n}f_n) = i\sum_{l=1}^{n}(-1)^{j_1+\cdots+j_{l-1}}\big[&T_n(L_{j_1}f_1\otimes\cdots\otimes L^\nu_{j_l+1}\,f\,\partial_\nu f_l\otimes\cdots\otimes L_{j_n}f_n) \\
&- j_l\,T_n(L_{j_1}f_1\otimes\cdots\otimes L^\nu_{j_l+1}\,(\partial_\nu f)f_l\otimes\cdots\otimes L_{j_n}f_n)\big].
\end{aligned}
\tag{169}
\]

Footnote bb: An alternative proof of this fact, which applies also to classical field theory, is given in the second paper of [8].

Footnote cc: The first order tree diagrams of $\tilde j^\mu_{gL}$ read:
\begin{align}
\tilde j^{\mu\,(1)}_{gL}\big|_{\rm tree}(x) = g_0\int d^4x_1\,g(x_1)\big\{&-{:}\partial^\mu u_a(x)A_{\nu b}(x_1)F_c^{\nu\lambda}(x_1){:}\;f_{abc}\,\partial_\lambda D^{\rm ret}(x-x_1) \tag{161}\\
&+{:}u_a(x)A_{\nu b}(x_1)F_c^{\nu\lambda}(x_1){:}\;f_{abc}\,[\partial^\mu\partial_\lambda D^{\rm ret}(x-x_1) + C_3\,g^\mu_\lambda\,\delta(x-x_1)] \tag{162}\\
&+{:}\partial^\mu u_a(x)u_b(x_1)\partial^\lambda\tilde u_c(x_1){:}\;f_{abc}\,\partial_\lambda D^{\rm ret}(x-x_1) \tag{163}\\
&-{:}u_a(x)u_b(x_1)\partial^\lambda\tilde u_c(x_1){:}\;f_{abc}\,[\partial^\mu\partial_\lambda D^{\rm ret}(x-x_1) + C_2\,g^\mu_\lambda\,\delta(x-x_1)] \tag{164}\\
&+{:}\partial^\mu\partial_\nu A^\nu_a(x)A_{\lambda b}(x_1)u_c(x_1){:}\;f_{abc}\,\partial^\lambda D^{\rm ret}(x-x_1) \tag{165}\\
&-{:}\partial_\nu A^\nu_a(x)A_{\lambda b}(x_1)u_c(x_1){:}\;f_{abc}\,[\partial^\mu\partial^\lambda D^{\rm ret}(x-x_1) + C_1\,g^{\mu\lambda}\,\delta(x-x_1)]\big\} \tag{166}
\end{align}
($D^{\rm ret}$ is the retarded Green's function of the wave operator), where we have used the simplification (146) and $C_1$, $C_2$ and $C_3$ are undetermined normalization constants. The requirement $\partial_\mu\tilde j^\mu_{gL}(x)\sim(\partial g)(x)$ fixes the latter uniquely:
\[ C_1 = -1, \qquad C_2 = -\tfrac12, \qquad C_3 = 0. \tag{167} \]
If (N3) (or equivalently the causal Wick expansion [19], [5]) holds true, the propagators in (162) and (164) are both equal to
\[ i[\cdots] = \langle\Omega|R_2(A_\lambda;\,\partial^\mu\partial_\tau A^\tau)(x_1;x)|\Omega\rangle. \tag{168} \]
But this contradicts $C_2\neq C_3$ (167).
Conversely, this current conservation identity (169) implies the generalized perturbative gauge invariance (147) by proceeding similarly to (121). We denote by $\tilde R_{n+1}(L,\ldots,L;j)$ (where $L\equiv L_0$) the R-product (93) which is constructed in terms of $T_k(L,\ldots,L)$ and $\tilde T_{k+1}(j,L,\ldots,L)$, 1 ≤ k ≤ n. Then, the identity (169) implies
\[ \tilde R_{n+1}\big((Lg)^{\otimes n};\,j^\mu\,\partial_\mu f\big) = i\,n\,R_n\big((Lg)^{\otimes(n-1)};\,L_1^\nu f\,\partial_\nu g\big). \tag{170} \]
Analogously to (92) we define
\[ \tilde j^\mu_{gL}(f) := \sum_{n=0}^{\infty}\frac{i^n}{n!}\,\tilde R_{n+1}\big((Lg)^{\otimes n};\,j^\mu f\big), \tag{171} \]
and this interacting BRST-current has the wanted conservation property
\[ \tilde j^\mu_{gL}(\partial_\mu f) = -L^\nu_{1\,gL}(f\,\partial_\nu g), \tag{172} \]
which agrees precisely with the corresponding result for QED (formula (5.12) of the first paper of [7]). The difference $(\tilde j^\mu_{gL} - j^\mu_{gL})$ (where $j^\mu_{gL}$ still denotes the interacting field constructed in terms of T-products satisfying also the MWI and (N3)) is immediately obtained by applying the master BRST-identity:
\[ \tilde j^\mu_{gL}(f) - j^\mu_{gL}(f) = i\big(G^{(1)}(k^\mu f, Lg)\big)_{gL} + i\big(G((L,L_1)g, k^\mu f)\big)_{gL}. \tag{173} \]
The interacting BRST-charge operator is now defined by
\[ Q_{gL} := \int d^4x\,h^\mu(x)\,\tilde j_{\mu\,gL}(x), \tag{174} \]
where $L\equiv L_0$ and $h^\mu$ is a suitable test function (see [7] and Sec. 4.5.4). $Q_{gL}$ is a formal power series and the construction is such that the relations
\[ Q_{gL} = Q_{gL}^*, \qquad (Q_{gL})_0 = Q_0, \tag{175} \]
hold true (where $(\cdots)_0$ means the zeroth order) and that $Q_{gL}$ is nilpotent,
\[ (Q_{gL})^2 = 0. \tag{176} \]
The latter property is proved in Sec. 4.5.4 by using current conservation (172) and generalized gauge invariance (147). We point out that the conservation of the BRST-current (172) and the construction of the nilpotent BRST-charge (174)–(176) use the master BRST-identity for $T_n(L_0,\ldots,L_0)$ and $T_n(L_1,L_0,\ldots,L_0)$ ($\forall n\in\mathbb{N}$) only.
The BRST-transformation $\tilde s$ of the interacting fields $W_{gL}(f)$, $f\in\mathcal{D}(\mathcal{O})$, is then defined by the commutator with $Q_{gL}$ (or anti-commutator if W has an odd ghost number):
\[ \tilde s(W_{gL}(f)) := [Q_{gL}, W_{gL}(f)]_\mp, \qquad f\in\mathcal{D}(\mathcal{O}). \tag{177} \]
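For orientation, the conservation property (172) follows from (170) and (171) in a few lines. The sketch below assumes that the n = 0 term drops out because the free BRST-current is conserved, and that (92) defines interacting fields by the same series as (171) with $R$ in place of $\tilde R$.

```latex
\begin{align*}
\tilde j^\mu_{gL}(\partial_\mu f)
 &= \sum_{n=0}^{\infty}\frac{i^n}{n!}\,
    \tilde R_{n+1}\big((Lg)^{\otimes n};\, j^\mu\partial_\mu f\big)
    && \text{by (171)} \\
 &= \sum_{n=1}^{\infty}\frac{i^n}{n!}\; i\,n\,
    R_n\big((Lg)^{\otimes (n-1)};\, L_1^\nu f\,\partial_\nu g\big)
    && \text{by (170); the } n=0 \text{ term is } -T_1((\partial_\mu j^\mu) f) = 0 \\
 &= i^2\sum_{m=0}^{\infty}\frac{i^m}{m!}\,
    R_{m+1}\big((Lg)^{\otimes m};\, L_1^\nu f\,\partial_\nu g\big)
    && m = n-1 \\
 &= -\,L^\nu_{1\,gL}(f\,\partial_\nu g) .
\end{align*}
```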
We extend $\tilde s$ to a graded derivation $\mathcal{F}(\mathcal{O})\to\mathcal{F}(\mathcal{O})$. The local observables are selected by the definition [7]
\[ \mathcal{A}(\mathcal{O}) := \frac{\ker\tilde s}{\operatorname{ran}\tilde s}. \tag{178} \]
Following [7] we look for states on $\mathcal{A}(\mathcal{O})$ which take values in $\tilde{\mathbb{C}}$ (by which we mean the formal power series with coefficients in $\mathbb{C}$). Thereby, $a\in\tilde{\mathbb{C}}$ is called positive if there exists $b\in\tilde{\mathbb{C}}$ with $a = b^*b$, where $*$ means complex conjugation. In [7] it is shown that $\mathcal{A}(\mathcal{O})$ can be naturally represented on the cohomology of $Q_{gL}$,
\[ H_{\rm phys} := \frac{\ker Q_{gL}}{\operatorname{ran} Q_{gL}}, \tag{179} \]
and that the induced inner product on $H_{\rm phys}$ is positive definite in the just mentioned sense. Hence $H_{\rm phys}$ is a pre-Hilbert space and its elements are interpreted as physical states.
The master BRST-identity (136) yields also an explicit formula for the BRST-transformation of the interacting fields $\in\mathcal{F}(\mathcal{O})$, by which we will see that the definition (177) agrees with the usual BRST-transformation. (In addition this ensures the existence of non-trivial observables.) For this purpose we note that the master BRST-identity and the requirement (148) imply the following relation for the R-products (93):
\[
\begin{aligned}
[Q_0, R_{n+1}((L_0g)^{\otimes n}; Wf)]_\mp ={}& -i\,n\,R_{n+1}\big((L_0g)^{\otimes(n-1)}\otimes L_1^\nu\,\partial_\nu g;\ Wf\big) - i\,R_{n+1}\big((L_0g)^{\otimes n};\ W'_\nu\,\partial^\nu f\big) \\
&+ n\,R_n\big((L_0g)^{\otimes(n-1)};\ [G((L_0,L_1)g, Wf) + G((W,W')f, L_0g)]\big),
\end{aligned}
\tag{180}
\]
where we have assumed $s_0(W) = i\partial^\nu W'_\nu$. In Sec. 4.5.4 it will be shown that this identity implies the BRST-transformation formula
\[ \tilde s(W_{gL}(f)) = [Q_{gL}, W_{gL}(f)]_\mp = -i\,W'^{\,\nu}_{gL}(\partial_\nu f) + i\big(G((L,L_1)g, Wf) + G((W,W')f, Lg)\big)_{gL}, \qquad f\in\mathcal{D}(\mathcal{O}), \tag{181} \]
where $L\equiv L_0$. The last term (the G-term) is the nonlinear part of the BRST-transformation. In case that W is a single symbol we find that $[G((L,L_1)g, Wf) + G((W,W')f, Lg)]$ is quadratic in the symbols (because $L$ and $L_1$ are trilinear), in agreement with the usual BRST-transformation. (To prove (181) it will be shown that the terms $[(Q_{gL} - Q_0), W_{gL}(f)]$ cancel out with the terms $-i\,n\,R_{n+1}((L_0g)^{\otimes(n-1)}\otimes L_1^\nu\,\partial_\nu g;\ Wf)$.) If we do not assume that $s_0W$ is a divergence we end up with
\[ \tilde s(W_{gL}(f)) = (s_0W)_{gL}(f) + i\big(G^{(1)}(Wf, Lg) + G((L,L_1)g, Wf)\big)_{gL}, \tag{182} \]
instead of (181). We choose $W = k^\mu$, compare with (173) and find
\[ [Q_{gL}, k^\mu_{gL}] = \tilde j^\mu_{gL}. \tag{183} \]
Introducing the interacting ghost charge
\[ Q^u_{gL} := \int d^4x\,h^\mu(x)\,k_{\mu\,gL}(x), \tag{184} \]
where $h^\mu\in\mathcal{D}(\mathbb{R}^4)$ is chosen in precisely the same way as in $Q_{gL}$ (see Sec. 4.5.4), it results
\[ [Q_{gL}, Q^u_{gL}] = Q_{gL} \tag{185} \]
as in [24].
In most of the following examples the computation of $G^{(1)}$ and $G^{(2)}$ gives less work than it seems, because only very few terms contribute. We use the values $C_{u_a} = -1$ (101) and $C_{A_a} = -\tfrac12$ (149) without further mentioning it.
Example 4.1. BRST-transformation of $A^\mu_{a\,gL}(h)$:
\[
\begin{aligned}
G^{(1)}(L_0g, A^\mu_ah) &= 0, & G^{(1)}(A^\mu_ah, L_0g) &= -\tfrac34\,g_0f_{abc}A^\mu_bu_c\,hg, \\
G^{(2)}(L_1g, A^\mu_ah) &= \tfrac34\,g_0f_{abc}A^\mu_bu_c\,gh, & G^{(2)}(g^{\mu\nu}u_ah, L_0g) &= g_0f_{abc}A^\mu_bu_c\,hg.
\end{aligned}
\tag{186}
\]
Therefore,
\[ \tilde s(A^\mu_{a\,gL}(h)) = -i\,u_{a\,gL}(\partial^\mu h) + i\,g_0\,(f_{abc}A^\mu_bu_c)_{gL}(gh). \tag{187} \]
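To make the step from (186) to (187) explicit, here is a short bookkeeping sketch. It uses only the decomposition (133), the formula (181) and the four values in (186), with $W = A^\mu_a$ and $W'_\nu = g^\mu{}_\nu\,u_a$ (so that $s_0(A^\mu_a) = i\partial^\nu W'_\nu = i\partial^\mu u_a$).

```latex
\begin{align*}
G((L_0,L_1)g, A^\mu_a h) + G((A^\mu_a, W')h, L_0 g)
 &= G^{(1)}(L_0 g, A^\mu_a h) + G^{(2)}(L_1 g, A^\mu_a h) \\
 &\quad + G^{(1)}(A^\mu_a h, L_0 g) + G^{(2)}(g^{\mu\nu}u_a h, L_0 g) \\
 &= \Big(0 + \tfrac34 - \tfrac34 + 1\Big)\, g_0 f_{abc} A^\mu_b u_c\, gh
  \;=\; g_0 f_{abc} A^\mu_b u_c\, gh , \\[4pt]
-\,i\,W'^{\,\nu}_{gL}(\partial_\nu h) &= -\,i\,u_{a\,gL}(\partial^\mu h) .
\end{align*}
```

Inserting both lines into (181) reproduces (187); the two contributions $\pm\tfrac34$ cancel, and the whole nonlinear term comes from $G^{(2)}(g^{\mu\nu}u_ah, L_0g)$.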
Taking $g|_{{\rm supp}\,h} = 1$ into account, the last term takes the usual form $i\,g_0\,(f_{abc}A^\mu_bu_c)_{gL}(h)$. We see that $G^{(1)}(A^\mu_ah, L_0g)\neq 0$ gives a non-vanishing contribution to the nonlinear term in (187). This shows that the distinction of internal and external derivatives, and in particular the appearance of the external derivative in the definition of $\Delta^\mu$ (35), is crucial to obtain the correct BRST-transformation.
Example 4.2. BRST-transformation of $u_{a\,gL}(h)$:
\[
\begin{aligned}
G^{(1)}(L_0g, u_ah) &= 0, & G^{(1)}(u_ah, L_0g) &= 0, \\
G^{(2)}(L_1g, u_ah) &= -\tfrac12\,g_0f_{abc}u_bu_c\,gh, & G^{(2)}(0\,h, L_0g) &= 0.
\end{aligned}
\tag{188}
\]
Hence,
\[ \tilde s(u_{a\,gL}(h)) = -\tfrac{i}{2}\,g_0\,(f_{abc}u_bu_c)_{gL}(h). \tag{189} \]
Example 4.3. BRST-transformation of $\tilde u_{a\,gL}(h)$:
\[
\begin{aligned}
G^{(1)}(L_0g, \tilde u_ah) &= 0, & G^{(1)}(\tilde u_ah, L_0g) &= 0, \\
G^{(2)}(L_1g, \tilde u_ah) &= 0, & G^{(2)}(-\partial_\nu A^\nu_a\,h, L_0g) &= 0,
\end{aligned}
\tag{190}
\]
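where we have used $g^{\nu\mu}\,\dfrac{\partial L_0}{\partial(\partial^\nu A^\mu_a)} = 0$. So we obtain
\[ \tilde s(\tilde u_{a\,gL}(h)) = i\,A^\nu_{a\,gL}(\partial_\nu h). \tag{191} \]
Example 4.4. BRST-transformation of $F^{\mu\nu}_{a\,gL}(h)$:
\[
\begin{aligned}
G^{(1)}(L_0g, F_a^{\mu\nu}h) &= 0, \\
G^{(1)}(F_a^{\mu\nu}h, L_0g) &= \Big(\tfrac12 + 3C_{1A_a}\Big)g_0f_{abc}\,A^\nu_bu_c\,(\partial^\mu h)g - (\mu\leftrightarrow\nu), \\
G^{(2)}(L_1g, F_a^{\mu\nu}h) &= \Big(\tfrac12 + 3C_{1A_a}\Big)g_0f_{abc}\big[\tilde\partial^\mu(A^\nu_bu_c)\,gh + A^\nu_bu_c\,(\partial^\mu g)h\big] - (\mu\leftrightarrow\nu) + g_0f_{abc}\,F_b^{\mu\nu}u_c\,gh, \\
G^{(2)}(0\,h, L_0g) &= 0.
\end{aligned}
\tag{192}
\]
Now we apply the relation (30) (which holds obviously also for the R-products; cf. the remark at the end of Sec. 2.4):
\[ R_{n+1}\big(L_0g\otimes\cdots;\ f_{abc}\,\tilde\partial^\mu(A^\nu_bu_c)\,gh\big) = -R_{n+1}\big(L_0g\otimes\cdots;\ f_{abc}\,A^\nu_bu_c\,\partial^\mu(gh)\big). \tag{193} \]
Hence, inserting these formulas into (181), the terms $\sim(\tfrac12 + 3C_{1A_a})$ cancel and it remains
\[ \tilde s(F^{\mu\nu}_{a\,gL}(h)) = i\,g_0\,(f_{abc}F_b^{\mu\nu}u_c)_{gL}(h). \tag{194} \]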
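The cancellation of the terms proportional to $(\tfrac12 + 3C_{1A_a})$ can be spelled out as follows. This is a sketch that uses only (192) and (193); the equality sign $\simeq$ is our shorthand for "equal inside the R-products after applying (193)".

```latex
\begin{align*}
& A^\nu_b u_c\,(\partial^\mu h)\,g
  \;+\; \tilde\partial^\mu(A^\nu_b u_c)\,gh
  \;+\; A^\nu_b u_c\,(\partial^\mu g)\,h \\
&\quad\simeq\; A^\nu_b u_c\,(\partial^\mu h)\,g
  \;-\; A^\nu_b u_c\,\partial^\mu(gh)
  \;+\; A^\nu_b u_c\,(\partial^\mu g)\,h \\
&\quad=\; A^\nu_b u_c\big[(\partial^\mu h)g - (\partial^\mu g)h - g\,(\partial^\mu h) + (\partial^\mu g)h\big]
  \;=\; 0 .
\end{align*}
```

The same holds for the $(\mu\leftrightarrow\nu)$ terms, so only $g_0f_{abc}F_b^{\mu\nu}u_c\,gh$ survives, which gives (194) once $g|_{{\rm supp}\,h} = 1$ is used.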
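Example 4.5. BRST-transformation of $(\sum_a F_a^{\mu\nu}F_a^{\rho\tau})_{gL}(h)$ (we do not write $\sum_a$ but always perform this sum):
\[
\begin{aligned}
G^{(1)}(L_0g, F_a^{\mu\nu}F_a^{\rho\tau}h) &= 0, \\
G^{(1)}(F_a^{\mu\nu}F_a^{\rho\tau}h, L_0g) &= \Big(\tfrac12 + 3C_{1A_a}\Big)g_0f_{abc}\big[F_a^{\rho\tau}A^\nu_bu_c\,(\partial^\mu h)g + (\tilde\partial^\mu F_a^{\rho\tau})A^\nu_bu_c\,hg\big] - (\mu\leftrightarrow\nu) + \{(\mu,\nu)\leftrightarrow(\rho,\tau)\}, \\
G^{(2)}(L_1g, F_a^{\mu\nu}F_a^{\rho\tau}h) &= \Big(\tfrac12 + 3C_{1A_a}\Big)g_0f_{abc}\big[F_a^{\rho\tau}\,\tilde\partial^\mu(A^\nu_bu_c)\,gh + F_a^{\rho\tau}A^\nu_bu_c\,(\partial^\mu g)h\big] - (\mu\leftrightarrow\nu) \\
&\quad+ g_0f_{abc}\,F_a^{\rho\tau}F_b^{\mu\nu}u_c\,gh + \{(\mu,\nu)\leftrightarrow(\rho,\tau)\}, \\
G^{(2)}(0\,h, L_0g) &= 0.
\end{aligned}
\tag{195}
\]
The term $\sim FFu\,gh$ in $G^{(2)}(L_1g, FFh)$ drops out because $f_{abc}$ is totally antisymmetric. Inserting these formulas into (181), the terms $\sim(\tfrac12 + 3C_{1A_a})$ cancel again due to (30). So we obtain
\[ \tilde s\big((F_a^{\mu\nu}F_a^{\rho\tau})_{gL}(h)\big) = 0 \tag{196} \]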
and, hence, the corresponding equivalence class (cf. (178)) is a non-trivial observable.
Example 4.6. Due to the requirement $G((L_j,L_{j+1})f, L_kg) + (-1)^{jk}\{(j,f)\leftrightarrow(k,g)\} = 0$ (148) we can easily write down the BRST-transformation of $L_{j\,gL}$, j = 0, 1, 2:
\[
\begin{aligned}
\tilde s(L_{0\,gL}(h)) &= -i\,L^\nu_{1\,gL}(\partial_\nu h), \\
\tilde s(L^\nu_{1\,gL}(h)) &= -i\,L^{\nu\mu}_{2\,gL}(\partial_\mu h), \\
\tilde s(L^{\nu\mu}_{2\,gL}(h)) &= 0.
\end{aligned}
\tag{197}
\]
Remark 4.3. (1) Having determined the interaction $L_0$ we can explicitly write down the interacting field equations by means of (94): with the simplification (146) they read
\begin{align}
\Box A_{\mu\,a\,gL} &= g_0f_{abc}\,\partial^\nu[g\,(A_{\mu b}A_{\nu c})_{gL}] - g_0\,g\,f_{abc}\,(A^\nu_bF_{\nu\mu\,c})_{gL} + g_0\,g\,f_{abc}\,(u_b\,\partial_\mu\tilde u_c)_{gL}, \tag{198}\\
\Box u_{a\,gL} &= -g_0f_{abc}\,\partial_\mu[g\,(A^\mu_bu_c)_{gL}], \tag{199}\\
\Box\tilde u_{a\,gL} &= -g_0\,g\,f_{abc}\,(A^\mu_b\,\partial_\mu\tilde u_c)_{gL}. \tag{200}
\end{align}
They hold true everywhere; g need not be constant. In the classical limit $\hbar\to 0$ interacting fields factorize, $(VW)_{gL}(x) = V_{gL}(x)W_{gL}(x)$ (see [9]), and hence (198)–(200) go over into the usual Yang–Mills equations.
(2) One might think that the MWI agrees with the quantum Noether condition (QNC) [22] (see footnote dd) in the application to $\partial_\mu T(j^\mu, L,\ldots,L)$, where $j^\mu$ is a conserved current and L the interaction. But for j being the BRST-current (117) we explicitly see that the two conditions are different and that the QNC is less general. Namely, the QNC in terms of interacting fields is precisely the particular case $j_1 = \cdots = j_n = 0$ of (169). This equation is sufficient for perturbative gauge invariance ((147) for $j_1 = \cdots = j_n = 0$) and, if ghost number conservation holds true, it is also necessary. The QNC does not contain any information about the G-terms in the master BRST-identity (136). In addition we recall that (169) (and hence in particular the QNC in terms of interacting fields) is not compatible with the MWI (157) and (N3). Comparing (201) with (168) we find that this holds true also for the QNC in terms of T-products.
In general note that the QNC is formulated only for $T(j^\mu, L,\ldots,L)$ with $\partial_\mu j^\mu = 0$ and L the interaction, whereas the MWI deals with T-products of arbitrary factors. In particular $\partial^\nu W$ in (1) need not vanish and there are important applications with $\partial^\nu W\neq 0$, e.g. the field equation (Sec. 4.1), non-Abelian matter currents (Sec. 4.3), massive axial currents $j_A$ (89) (see Sec. 5.1), etc.

Footnote dd: Note that the QNC in terms of T-products (given in the first paper of [22]) does not agree with the QNC in terms of interacting fields (second paper): they normalize the interacting BRST-current differently. The second formulation fixes the BRST-current such that it is conserved up to terms $\sim\partial g$, where g is the coupling 'constant'; in particular it yields the values (167) for the constants $C_1$, $C_2$ and $C_3$ in (161)–(166). But the first formulation requires
\[ C_1 = -1, \qquad C_2 = 0, \qquad C_3 = 1. \tag{201} \]

4.5.3. Massive gauge fields

To simplify the notations we consider the simplest non-Abelian model, namely three massive gauge fields, $m_a > 0$, a = 1, 2, 3, and no massless fields. However, as long as anomalies are absent and there is a solution for the requirements (A)–(F) and (209)–(211), (216)–(217) on the interaction L given below, our method applies also to general models with arbitrary numbers of massive and massless gauge fields and spinor fields. We will find the well-known result that with the fields $A^\mu_a$, $u_a$, $\tilde u_a$ and $\phi_a$ (a = 1, 2, 3) only, a consistent construction of the model is impossible; more precisely, generalized perturbative gauge invariance (147) for second order tree diagrams cannot be satisfied [15]. We will solve this problem in the usual way: besides the scalar fields $(\phi_a)_{a=1,2,3}$, we introduce an additional real (free) scalar field H, the 'Higgs field', with arbitrary mass $m_H\ge 0$, which is quantized according to
\[ (\Box + m_H^2)H = 0, \qquad H^* = H, \qquad \Delta_{H,H} = -D_{m_H} \tag{202} \]
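and H commutes with all other free fields.
To determine the interaction $L_0$ we require the same properties (A)–(F) as in the massless case. The only modification is that $L_j$ is now a polynomial in $A^\mu_a$, $u_a$, $\tilde u_a$, $\phi_a$, a = 1, 2, 3, and H and internal derivatives of these symbols (again we solely admit monomials which have at least three factors). Proceeding as in the massless case we find the following particular solution of (A)–(F):
\begin{align}
L_0 ={}& g_0\Big\{f_{abc}\big[A_{a\mu}A_{b\nu}\,\partial^\nu A^\mu_c - u_a\,\partial^\mu\tilde u_b\,A_{c\mu}\big] + d_{abc}\,A^\mu_a\phi_b\,\partial_\mu\phi_c + e_{abc}\,A^\mu_aA_{b\mu}\phi_c + h_{abc}\,\tilde u_au_b\phi_c \notag\\
&\quad+ l_{ab}\Big[\frac{1}{m_b}\big(-HA^\mu_a\,\partial_\mu\phi_b + (\partial_\mu H)A^\mu_a\phi_b\big) + A^\mu_aA_{b\mu}H - H\tilde u_au_b - \frac{m_H^2}{2m_am_b}\,H\phi_a\phi_b\Big] + pH^3 + tH^4\Big\}, \tag{203}\\
L_1^\nu ={}& g_0\Big\{f_{abc}\Big[A_{a\mu}u_bF_c^{\nu\mu} - \tfrac12 u_au_b\,\partial^\nu\tilde u_c\Big] + 2e_{abc}\,u_aA^\nu_b\phi_c + d_{abc}\big[u_a\phi_b\,\partial^\nu\phi_c + m_c\,A^\nu_a\phi_bu_c\big] \notag\\
&\quad+ l_{ab}\Big[\frac{1}{m_b}\,u_a(\phi_b\,\partial^\nu H - H\,\partial^\nu\phi_b) + A^\nu_au_bH\Big]\Big\}, \tag{204}\\
L_2^{\nu\mu} ={}& g_0\,\frac{f_{abc}}{2}\,u_au_bF_c^{\nu\mu}, \tag{205}\\
L_3 ={}& 0, \tag{206}
\end{align}
where $f_{abc}\in\mathbb{R}$ is totally antisymmetric, $l_{ab}\in\mathbb{R}$ is symmetric and
\[ d_{abc} = f_{abc}\,\frac{m_b^2 + m_c^2 - m_a^2}{2m_bm_c}, \qquad e_{abc} = f_{abc}\,\frac{m_b^2 - m_a^2}{2m_c}, \qquad h_{abc} = f_{abc}\,\frac{m_a^2 + m_c^2 - m_b^2}{2m_c}, \qquad p, t\in\mathbb{R}. \tag{207} \]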
The most general solution for $L_0$ differs from the particular solution (203) by a coboundary $-i s_0 K_1$ and a divergence $\partial_\nu K_2^\nu$ as in (143). In addition one has the freedom to add terms with vanishing divergence to $L_1$ and $L_2$ as in (145). It is a peculiarity of the present model that the total antisymmetry of $f_{abc}$ implies the Jacobi identity, so that we obtain
\[ f_{abc} = \epsilon_{abc} \quad (= \text{structure constant of } SU(2)) \tag{208} \]
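The statement that total antisymmetry already forces the Jacobi identity for three indices can be checked by brute force. The following is a minimal sketch, not part of the original derivation; the helper names `levi_civita` and `jacobi_defect` are hypothetical and indices run over 0, 1, 2 instead of 1, 2, 3.

```python
from itertools import product

def levi_civita(a, b, c):
    """epsilon_abc for indices in {0, 1, 2} (closed form valid only for three values)."""
    return (a - b) * (b - c) * (c - a) // 2

def jacobi_defect(f, a, b, c, d, n=3):
    """sum_e ( f_abe f_ecd + f_bce f_ead + f_cae f_ebd ); zero iff Jacobi holds."""
    return sum(
        f(a, b, e) * f(e, c, d) + f(b, c, e) * f(e, a, d) + f(c, a, e) * f(e, b, d)
        for e in range(n)
    )

# Check the identity for every index combination.
jacobi_holds = all(
    jacobi_defect(levi_civita, a, b, c, d) == 0
    for a, b, c, d in product(range(3), repeat=4)
)
print(jacobi_holds)
```

Since every totally antisymmetric tensor with three indices taking three values is a multiple of $\epsilon_{abc}$, the check covers all admissible $f_{abc}$ of the present model up to the factor absorbed in $g_0$.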
by absorbing a constant factor in $g_0$. So far we could set $l_{ab} = 0$, $p = 0$, $t = 0$; in other words the Higgs field $H$ is not needed to satisfy (A)–(F). Now we come to an interesting complication of the massive case: the requirement $G((L_j, L_{j+1})f, L_k g) + (-1)^{jk} \{(j, f) \leftrightarrow (k, g)\} = 0$ (148) cannot be satisfied! To save generalized perturbative gauge invariance (147) we require instead the following weaker condition: there exist $N_{j,k} \in \mathcal{P}_0$, $j, k = 0, 1, 2$, such that
\[
\begin{aligned}
& G((L_j, L_{j+1})f, L_k g) + (-1)^{jk} G((L_k, L_{k+1})g, L_j f) + s_0(N_{j,k}) f g \\
& \qquad = -i \big[ N^\nu_{j+1,k} (\partial_\nu f) g + (-1)^{jk} N^\nu_{j,k+1} f (\partial_\nu g) \big] , \qquad j, k \in \{0, 1, 2\} , \quad \forall f, g \in \mathcal{D}(\mathbb{R}^4) ,
\end{aligned}
\tag{209}
\]
where $N_{3,k} = 0 = N_{k,3}$, $k = 0, 1, 2$, and that the finite renormalization
\[ T_2(L_j f \otimes L_k g) \to T_2^N(L_j f \otimes L_k g) \overset{\mathrm{def}}{=} T_2(L_j f \otimes L_k g) + T_1(N_{j,k} f g) \tag{210} \]
maintains the permutation symmetry of $T_2$ and the normalization conditions (N1), (N2), (N0), and preserves the ghost number: $[Q_g, T_2^N(L_j f \otimes L_k g)] = (j + k) T_2^N(L_j f \otimes L_k g)$, where we take $g(L_j) = j$ (142) into account (cf. (104)). This requirement can only be satisfied for $t = 0$ in (203). Hence, the $G$-terms in (209) are 4-legs terms and, therefore, we may restrict the $N_{j,k}$ to be 4-legs terms, too. In other words the $N_{j,k}$ are sums of monomials of degree four in $A^\mu_a$, $u_a$, $\tilde u_a$, $\phi_a$, $a = 1, 2, 3$, and in $H$ (without any derivative), which are Lorentz tensors of rank $(j + k)$ and satisfy
\[ g(N_{j,k}) = j + k , \qquad N^*_{j,k} = -N_{j,k} , \qquad N_{j,k} = (-1)^{jk} N_{k,j} . \tag{211} \]
These properties imply
\[ N_{2,1} = 0 = N_{1,2} , \qquad N_{2,2} = 0 , \qquad N^{\nu\mu}_{1,1} = -N^{\mu\nu}_{1,1} . \tag{212} \]
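If (209) and (210) are satisfied we indeed obtain
\[ [Q_0, T_2^N(L_{j_1} f_1 \otimes L_{j_2} f_2)]_\mp = -i \big( T_2^N(L^\nu_{j_1+1} \partial_\nu f_1 \otimes L_{j_2} f_2) + (-1)^{j_1} T_2^N(L_{j_1} f_1 \otimes L^\nu_{j_2+1} \partial_\nu f_2) \big) , \tag{213} \]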
by using the master BRST-identity (132). Turning to arbitrary orders $n \geq 3$ we look for a sequence of $T$-products $(T_n^N)_{n \in \mathbb{N}}$ (in the sense of Sec. 2.2)
• which satisfies the normalization conditions (N1), (N2) and (N0),
• which agrees as far as possible with the given sequence $(T_n)_{n \in \mathbb{N}}$ that satisfies all normalization conditions (also (N3), ($\tilde{\mathrm{N}}$) and the MWI (N)),
• and for which $T_2^N(L_j, L_k)$ is connected with $T_2(L_j, L_k)$ by (210) for all $j, k$.
For this purpose let $\mathcal{B} = \{L_0, L_1, L_2, B_1, B_2, \ldots\}$ be a (vector space) basis of $\tilde{\mathcal{P}}_0$. Due to causality (13) the renormalization terms $T_1(N_{j,k} f g)$ in (210) propagate to higher orders. More precisely we define
\[
\begin{aligned}
& T^N_{l+n}(B_{k_1} g_1 \otimes \cdots \otimes B_{k_l} g_l \otimes L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n) \\
& \quad \overset{\mathrm{def}}{=} \sum_{m=0}^{[n/2]} \sum_{\pi \in S_n} \frac{1}{2^m m! (n - 2m)!} \, \eta_\pi(j_1, \ldots, j_n) \, T_{l+n-m}\big( B_{k_1} g_1 \otimes \cdots \otimes B_{k_l} g_l \otimes N_{j_{\pi 1}, j_{\pi 2}} f_{\pi 1} f_{\pi 2} \otimes \cdots \\
& \qquad\qquad \otimes N_{j_{\pi(2m-1)}, j_{\pi 2m}} f_{\pi(2m-1)} f_{\pi 2m} \otimes L_{j_{\pi(2m+1)}} f_{\pi(2m+1)} \otimes \cdots \otimes L_{j_{\pi n}} f_{\pi n} \big) ,
\end{aligned}
\tag{214}
\]
where $B_{k_1}, \ldots, B_{k_l} \in \mathcal{B} \setminus \{L_0, L_1, L_2\}$ and $\eta_\pi(j_1, \ldots, j_n)$ is the sign coming from the permutation of Fermi operators in $(L_{j_1}, \ldots, L_{j_n}) \to (L_{j_{\pi 1}}, \ldots, L_{j_{\pi n}})$. We extend this definition to $\mathcal{D}(\mathbb{R}^4, \tilde{\mathcal{P}}_0)^{\otimes(l+n)}$ by requiring linearity and (permutation) symmetry. Obviously this $(T_n^N)_{n \in \mathbb{N}}$ solves our requirements. The formula (214) is a particular (simple) case of [32, Theorem 3.1], which is a precise formulation of a formula given in [4]. For later purpose we mention that the $T^N$-products (214) satisfy (N5) (ghost), because the $T$-products do so. In particular we have
\[ T^N_{n+1}(k^\mu \partial_\mu g \otimes L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n) = \sum_{l=1}^{n} j_l \, T^N_n(L_{j_1} f_1 \otimes \cdots \otimes L_{j_l} f_l g \otimes \cdots \otimes L_{j_n} f_n) . \tag{215} \]
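The weight $1/(2^m m! (n-2m)!)$ in (214) can be read as compensating the overcounting in the sum over $S_n$: each way of grouping $n$ vertices into $m$ unordered pairs plus $n - 2m$ single vertices is produced by exactly $2^m m! (n-2m)!$ permutations. The sketch below checks this counting by brute force. It ignores the Fermi signs $\eta_\pi$ (which compensate the signs of the reordered factors), and the function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def configurations(n, m):
    """Number of distinct ways to pick m unordered pairs out of n labelled
    vertices, counted by enumerating all permutations as in the sum over S_n."""
    seen = set()
    for pi in permutations(range(n)):
        # The first 2m entries are grouped pairwise, the rest stay single.
        pairs = frozenset(frozenset(pi[2 * i:2 * i + 2]) for i in range(m))
        rest = frozenset(pi[2 * m:])
        seen.add((pairs, rest))
    return len(seen)

def weight(n, m):
    """Inverse of the combinatorial factor appearing in (214)."""
    return 2 ** m * factorial(m) * factorial(n - 2 * m)

# Every configuration is hit by the same number of permutations,
# so (number of configurations) * weight must equal n!.
for n in range(1, 7):
    for m in range(n // 2 + 1):
        assert configurations(n, m) * weight(n, m) == factorial(n)
```

With this weight, each insertion pattern of the $N_{j,k}$ contributes exactly once to $T^N_{l+n}$.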
But in general the $T^N$-products violate (N3) and the MWI (N). To obtain generalized perturbative gauge invariance to orders $n \geq 3$ we additionally require
\[ G((L_j, L_{j+1})f, N_{k,l} g) + G^{(1)}(N_{k,l} g, L_j f) = 0 , \qquad j, k, l \in \{0, 1, 2\} , \quad \forall f, g \in \mathcal{D}(\mathbb{R}^4) , \tag{216} \]
and
\[ G^{(1)}(N_{k,l} f, N_{r,s} g) = 0 , \qquad k, l, r, s \in \{0, 1, 2\} , \quad \forall f, g \in \mathcal{D}(\mathbb{R}^4) . \tag{217} \]
Then, applying the master BRST-identity (136) to $[Q_0, T_n^N(L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n)]_\mp$ and taking (209), (216) and (217) into account, we find the wanted (modified) generalized perturbative gauge invariance$^{ee}$
\[ [Q_0, T_n^N(L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n)]_\mp = -i \sum_{l=1}^{n} (-1)^{j_1 + \cdots + j_{l-1}} T_n^N(L_{j_1} f_1 \otimes \cdots \otimes L^\nu_{j_l+1} \partial_\nu f_l \otimes \cdots \otimes L_{j_n} f_n) . \tag{218} \]
The particular case $j = k = l = 0$ of the requirements (209), (210) and (216) has been worked out for general models in [15, 21, 36]. We specialize these results to the present model:
(1) The second order requirement (209) and (210) is very restrictive: its restriction to $j = k = 0$ is satisfied if and only if the following relations (a)–(e) hold:
(a) The masses agree:$^{ff}$
\[ m \overset{\mathrm{def}}{=} m_1 = m_2 = m_3 . \tag{219} \]
(b) $f_{abc}$ satisfies the Jacobi identity. (In our simple model this is already known (208), but in the general case the Jacobi identity is obtained only at this stage, similarly to the massless case.)
(c) The $H$-coupling parameters take the values
\[ l_{ab} = \frac{\kappa m}{2} \delta_{ab} , \qquad t = 0 , \tag{220} \]
where $\kappa \in \{-1, 1\}$ is an undetermined sign, and $p$ is still free. In particular we see that the Higgs field (or another enlargement/modification of the model) is indispensable.
(d) The constants $C_{A_a}$, $C_{\phi_a}$ and $C_H$ have the values$^{gg}$
\[ C_{A_a} = -\frac{1}{2} , \qquad C_{\phi_a} = -1 , \quad \forall a , \qquad C_H = -1 . \tag{221} \]
(e) The polynomial $N_{0,0}$ reads
\[ N_{0,0} = i g_0^2 \left\{ \frac{m_H^2}{16 m^2} \left[ \Big( \sum_{a=1}^{3} \phi_a^2 \Big)^2 + 2 H^2 \sum_{a=1}^{3} \phi_a^2 \right] + \lambda H^4 \right\} \delta(x_1 - x_2) , \tag{222} \]
where $\lambda \in \mathbb{R}$ is a constant which is undetermined so far.

$^{ee}$ For $j_1 = \cdots = j_n = 0$ this is the formulation of perturbative gauge invariance for massive fields in [15, 21, 36].
$^{ff}$ For general models (219) is replaced by more complicated mass relations, see [15, 21, 36].
$^{gg}$ We recall that $C_{u_a} = -1$ has already been used in the derivation of the master BRST-identity.
(2) The third order requirement (216) fixes the remaining free parameters $p$ and $\lambda$ (which are the parameters of the $H$-self-couplings): the particular case $j = k = l = 0$ of (216) holds true if and only if
\[ p = \frac{m_H^2}{4m} , \qquad \lambda = -\frac{m_H^2}{16 m^2} . \tag{223} \]
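To see what the mass relation (219) does to the couplings, one can evaluate the coefficients of (207) at equal masses. The sketch below assumes the reading $d_{abc} = f_{abc}(m_b^2 + m_c^2 - m_a^2)/(2 m_b m_c)$, $e_{abc} = f_{abc}(m_b^2 - m_a^2)/(2 m_c)$, $h_{abc} = f_{abc}(m_a^2 + m_c^2 - m_b^2)/(2 m_c)$ of (207); the function names and the sample mass values are hypothetical.

```python
from fractions import Fraction

def couplings(f_abc, m_a, m_b, m_c):
    """Coefficients (d, e, h) of (207) for one index triple, in exact arithmetic."""
    d = f_abc * Fraction(m_b**2 + m_c**2 - m_a**2, 2 * m_b * m_c)
    e = f_abc * Fraction(m_b**2 - m_a**2, 2 * m_c)
    h = f_abc * Fraction(m_a**2 + m_c**2 - m_b**2, 2 * m_c)
    return d, e, h

def higgs_self_couplings(m, m_higgs):
    """Values of p and lambda from (223)."""
    p = Fraction(m_higgs**2, 4 * m)
    lam = -Fraction(m_higgs**2, 16 * m**2)
    return p, lam

# Equal masses m_1 = m_2 = m_3 = m (sample value m = 3) and f_123 = 1.
m = 3
d, e, h = couplings(1, m, m, m)
print(d, e, h)  # d = f/2, e = 0, h = f*m/2

p, lam = higgs_self_couplings(m, m_higgs=5)
print(p, lam)
```

Under this reading, for equal masses the $AA\phi$ coupling $e_{abc}$ drops out, $d_{abc}$ reduces to $f_{abc}/2$ and $h_{abc}$ to $f_{abc} m/2$, while (223) ties the two $H$-self-couplings together via $\lambda = -p/(4m)$.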
The $C_A$-, $C_u$-, $C_\phi$- and $C_H$-terms in the tree-diagram part of $T_2^N(L_0, L_0)$ and $N_{0,0}$ correspond to the quadrilinear terms in the interaction Lagrangian of the conventional theory (the latter are also of order $g_0^2$), cf. Remark 4.3(2). With this identification our resulting interaction agrees precisely with the $SU(2)$ Higgs–Kibble model, which is usually obtained by the Higgs mechanism. Here we have derived it in a completely different way (cf. [15, 16, 21, 36]). By inserting the explicit expressions (222) into the definition (130) of $G^{(1)}(\cdot\,, \cdot)$ we verify that the fourth order requirement (217) holds true for $k = l = r = s = 0$. We strongly presume that the requirements (209), (210), (216) and (217) can be fulfilled for all values of $j, k, l, r, s$.$^{hh}$ In the following we assume that this conjecture holds true and that the master BRST-identity is fulfilled. In particular we will use the modified generalized perturbative gauge invariance (218). From the time ordered products $(T_n^N)_n$ we obtain the corresponding antichronological products $(\bar T_n^N)_n$ by (19). In terms of the $T^N$- and $\bar T^N$-products we construct the totally retarded products $(R^N_{n+1})_n$ (93). The generating functional of the latter is the interacting field $\Lambda^N_{gL}$ or $W^N_{gL}(f)$ (92). Similarly to the original $R$-products, the $R^N$-products have retarded support
\[ \mathrm{supp}\, R^N_{n+1}(\cdots)(x_1, \ldots, x_n; x) \subset \{ (y_1, \ldots, y_n, y) \,|\, y_l \in y + \bar V_- , \ \forall l \} . \tag{224} \]
The proof of this support property uses only the causal or anti-causal factorization of the $T^N$- and $\bar T^N$-products (see [19]). The replacement $\Lambda_{gL} \to \Lambda^N_{gL}(f)$ is a finite renormalization of the interacting field. The field equation for $\varphi^N_{gL}$ (where $\varphi \in \mathcal{P}_0$ corresponds to a free field without any derivative) differs from the one of $\varphi_{gL}$: instead of (94) the master Ward identity implies
\[ (\Box + m^2) \varphi^N_{gL}(x) = -g(x) \Big( \frac{\partial L}{\partial \chi} \Big)^N_{gL}(x) - \frac{1}{2} (g(x))^2 \Big( \frac{\partial N_{0,0}}{\partial \chi} \Big)^N_{gL}(x) + \partial^x_\mu \Big[ g(x) \Big( \frac{\partial L}{\partial(\partial_\mu \chi)} \Big)^N_{gL}(x) \Big] , \tag{225} \]
where we use that $N_{0,0}$ contains no derivatives. The additional term corresponds to the contribution to the Euler–Lagrange equation of those quadrilinear terms (in the conventional Lagrangian) which belong to $N_{0,0}$. The contributions of the quadrilinear terms belonging to the $C_A$-, $C_u$-, $C_\phi$- and $C_H$-terms are contained in the first term on the r.h.s. of (225). For example the contribution of the $g_0^2 A^4$-coupling (which belongs to the $C_A$-terms) is contained in the $g_0 A F$-term in (198), because $F_{gL}$ (155) has a nonlinear term $\sim g_0 (AA)_{gL}$. The latter is indeed $\sim C_A$ in our framework.

$^{hh}$ Additionally we expect that these requirements determine $N_{1,0}$, $N_{1,1}$ and $N_{2,0}$ uniquely, similarly to $N_{0,0}$.

The construction of the interacting BRST-current $\tilde j_{gL}$ (171) must be modified correspondingly. We define
\[
\begin{aligned}
\tilde T^N_{n+1}(j^\mu f \otimes L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n) \overset{\mathrm{def}}{=} {} & [Q_0, T^N_{n+1}(k^\mu f \otimes L_{j_1} f_1 \otimes \cdots \otimes L_{j_n} f_n)]_\mp \\
& + i \sum_{l=1}^{n} (-1)^{j_1 + \cdots + j_{l-1}} T^N_{n+1}(k^\mu f \otimes L_{j_1} f_1 \otimes \cdots \otimes L^\nu_{j_l+1} \partial_\nu f_l \otimes \cdots \otimes L_{j_n} f_n) .
\end{aligned}
\tag{226}
\]
$\tilde T^N(j, L_{j_1}, \ldots)$ has the same properties as $\tilde T(j, L_{j_1}, \ldots)$ (160), in particular it satisfies causality (13) and the normalization condition (N0). Hence, $T(j, L_{j_1}, \ldots) \to \tilde T^N(j, L_{j_1}, \ldots)$ is a change of normalization. The divergence identity (169) holds true also for $(\tilde T^N_{n+1}, T^N_n)$, because $T^N(k^\mu, L_{j_1}, \ldots)$ fulfills (N5) (ghost) (215) and $T^N(L_{j_1}, \ldots)$ satisfies generalized perturbative gauge invariance (218). Let $\tilde R^N_{n+1}(L, \ldots, L; j)$ be the $R$-product (93) which is constructed in terms of $T^N_k(L, \ldots, L)$ and $\tilde T^N_{k+1}(j, L, \ldots, L)$, $1 \leq k \leq n$. Then we define $\tilde j^{N\mu}_{gL}(f)$ similarly to (171), replacing $\tilde R_{n+1}(\cdots)$ by $\tilde R^N_{n+1}(\cdots)$. Analogously to (172) we then find that this interacting BRST-current is conserved up to terms $\sim \partial g$, more precisely
\[ \tilde j^{N\mu}_{gL}(\partial_\mu f) = -L^{N\nu}_{1\,gL}(f \partial_\nu g) . \tag{227} \]
The interacting BRST-charge is now defined by
\[ Q^N_{gL} \overset{\mathrm{def}}{=} \int d^4x \, h^\mu(x) \, \tilde j^N_{\mu\,gL}(x) \tag{228} \]
instead of (174). As in the massless case the construction can be done such that $Q^N_{gL}$ fulfills (175) and is nilpotent (176) (see Sec. 4.5.4). We turn to the BRST-transformation of the interacting fields $\in \mathcal{F}(\mathcal{O})$ (156). Let $s_0(W) = i \partial^\nu W'_\nu$. The formula (180) is violated by the non-vanishing of $G((L_0, L_1)f, L_0 g) + (f \leftrightarrow g)$. It must be modified:
\[
\begin{aligned}
[Q_0, R^N_{n+1}((L_0 g)^{\otimes n}; W f)]_\mp = {} & -i n R^N_{n+1}((L_0 g)^{\otimes(n-1)} \otimes L^\nu_1 \partial_\nu g; W f) - i R^N_{n+1}((L_0 g)^{\otimes n}; W'_\nu \partial^\nu f) \\
& + n R^N_n\big( (L_0 g)^{\otimes(n-1)}; [G((L_0, L_1)g, W f) + G((W, W')f, L_0 g)] \big) ,
\end{aligned}
\tag{229}
\]
where for simplicity we assume that $W' \in \mathcal{P}_0$ does not contain any derivative of $\phi_a$ or $H$.$^{ii}$ This assumption ensures
\[ G((W, W')f, N_{0,0}\, g) = 0 , \qquad \forall f, g \in \mathcal{D}(\mathbb{R}^4) , \tag{230} \]
as can be seen by inserting the explicit expression (222) for $N_{0,0}$ into the definition (133) of $G(\cdot\,, \cdot)$. Moreover, one finds
\[ G^{(1)}(N_{0,0}\, g, W f) = 0 , \qquad \forall f, g \in \mathcal{D}(\mathbb{R}^4) , \tag{231} \]
by inserting (222) into the definition (130) of $G^{(1)}$. Analogously to the derivation of (218), the proof of (229) is a straightforward application of the master BRST-identity to $[Q_0, T^N_{l+1}(W f \otimes (L_0 g)^{\otimes l})]$, which uses (209), (216), (217), (230) and (231). The resulting formula is then translated into an identity for the $R$-products (93). As in the massless case, (229) yields the BRST-transformation
\[
\begin{aligned}
\tilde s(W^N_{gL}(f)) \overset{\mathrm{def}}{=} [Q^N_{gL}, W^N_{gL}(f)]_\mp = {} & -i W'^{\,\nu N}_{gL}(\partial_\nu f) \\
& + i \big( G((L, L_1)g, W f) + G((W, W')f, L g) \big)^N_{gL} , \qquad f \in \mathcal{D}(\mathcal{O})
\end{aligned}
\tag{232}
\]
(see Sec. 4.5.4).

4.5.4. The interacting BRST-charge

In this subsection we summarize the construction of the interacting BRST-charge given in [7]. With that we prove the properties (175) and the nilpotency. Finally we show that the identity (180) ((229) respectively) implies the BRST-transformation formula (181) ((232) respectively) for the interacting fields. We deal with massive gauge fields; however, the massless case is included: in all formulas it is allowed to set $m = 0$, $\phi \equiv 0$ and $H \equiv 0$ (which replaces $T^N$ by $T$ etc.). We assume that the double cone $\mathcal{O}$ (156) is centered at the origin. Let $r$ be the diameter of $\mathcal{O}$. The question is how to choose $g$ and $h^\mu$ such that $Q^N_{gL}$ (228) satisfies (175) and is nilpotent. As explained in [7] it is hard to avoid a volume divergence in $\int d^4x \, h^\mu(x) \tilde j^N_{\mu\,gL}(x)$, at least for massless fields. To get rid of this problem we proceed as in [7]: we embed our double cone $\mathcal{O}$ isometrically into the cylinder $\mathbb{R} \times C_L$ (the first factor denotes the time axis), where $C_L$ is a cube of length $L$, $L \gg r$, with metallic boundary conditions for each free gauge field $A_a$ ($a = 1, \ldots, N$), and Dirichlet boundary conditions for the free ghost fields $u_a$, $\tilde u_a$ and the free bosonic scalar fields $\phi_a$ and $H$. If we choose the compactification length $L$ big enough, the physical properties of the local algebra $\mathcal{F}(\mathcal{O})$ are unchanged. Following [7] we choose the switching function $g$ to fulfil
\[ g(x) = 1 , \qquad \forall x \in \mathcal{O} \cup \{ (x_0, \vec x) \,|\, |x_0| < \epsilon \} , \qquad (r \gg \epsilon > 0) \tag{233} \]

$^{ii}$ For $W \in [\{L_0, L_1, L_2\}]$ (the $[\cdots]$-bracket denotes the linear span) this assumption is not needed: $[G(\cdots) + G(\cdots)]$ vanishes and (229) follows immediately from (218).

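on $\mathbb{R} \times C_L$ and to have compact support in timelike directions. In addition let $h^\mu$ be such that
\[ h^\mu(x) \overset{\mathrm{def}}{=} \delta^\mu_0 \, h(x_0) , \quad \text{where } h \in \mathcal{D}([-\epsilon, \epsilon], \mathbb{R}) , \quad \int dx_0 \, h(x_0) = 1 . \tag{234} \]
Then
\[ Q^N_{gL} \overset{\mathrm{def}}{=} \int dx_0 \, h(x_0) \int_{C_L} d^3x \, \tilde j^N_{0\,gL}(x) \tag{235} \]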
is well-defined, because $(x_0, \vec x) \to h(x_0)$ is an admissible test function on $\mathbb{R} \times C_L$. $(Q^N_{gL})_0 = Q_0$ holds true, since $(\tilde j^N_{\mu\,gL})_0 = j_\mu$ is conserved. From the conservation of the interacting current, $\partial^\mu \tilde j^N_{\mu\,gL}(x) = 0$ for $x \in [-\epsilon, \epsilon] \times C_L$, we conclude that $Q^N_{gL}$ is independent of the choice of $h$. By (N2) and by the fact that $h$ and $g$ are real-valued we obtain $Q^{N*}_{gL} = Q^N_{gL}$. It remains to prove the nilpotency. For this purpose we first show
\[ Q^N_{gL} = Q_0 + L^{N\nu}_{1\,gL}(H \partial_\nu g) \tag{236} \]
where
\[ H(x) \equiv \tilde H(x_0) \overset{\mathrm{def}}{=} \int_{-\infty}^{x_0} dt \, [-h(t) + h(t - a)] \tag{237} \]
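and $a \in \mathbb{R}$ is such that the support of $(x_0, \vec x) \to h(x_0 - a)$ is earlier than the support of $g$:
\[ x_0 < y_0 , \qquad \forall x_0 \in \mathrm{supp}\, h(\cdot - a) \ \wedge \ \forall y_0 \text{ with } \exists \vec y \in C_L \text{ with } (y_0, \vec y) \in \mathrm{supp}\, g . \tag{238} \]
In particular we will need
\[ H(y) \partial g(y) = \partial g(y) , \qquad \forall y \in ((\mathrm{supp}\, H \cup \mathcal{O}) + \bar V_-) , \tag{239} \]
\[ \mathcal{O} \cap (\mathrm{supp}\,(H \partial g) + \bar V_-) = \emptyset \tag{240} \]
and
\[ \partial H(y) \, \partial g(y) = 0 . \tag{241} \]
Proof of (236). From our definitions we immediately obtain
\[ Q^N_{gL} = \int_{\mathbb{R} \times C_L} d^4x \, \tilde j^{N\mu}_{gL}(x) \, [-\partial_\mu H(x) + g_{\mu 0} \, h(x_0 - a)] . \tag{242} \]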
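The shape of the auxiliary function $H$ of (237) (not to be confused with the Higgs field) is easiest to see in an example: it vanishes before the support of $h(\cdot - a)$, equals 1 between the two bumps, and vanishes again after the support of $h$. The sketch below is only an illustration under simplifying assumptions: $h$ is replaced by a normalized triangular bump on $[-\epsilon, \epsilon]$ (not a smooth test function, but it has a closed-form primitive), and the names `bump_primitive` and `H_tilde` as well as the sample values of $\epsilon$ and $a$ are hypothetical.

```python
def bump_primitive(t, eps):
    """Integral from -infinity to t of the triangular bump
    h(s) = (eps - |s|) / eps**2 on [-eps, eps], which has total integral 1."""
    if t <= -eps:
        return 0.0
    if t >= eps:
        return 1.0
    if t <= 0:
        return (t + eps) ** 2 / (2 * eps ** 2)
    return 1.0 - (eps - t) ** 2 / (2 * eps ** 2)

def H_tilde(x0, eps, a):
    """H~(x0) = int_{-inf}^{x0} dt [ -h(t) + h(t - a) ], cf. (237)."""
    return -bump_primitive(x0, eps) + bump_primitive(x0 - a, eps)

eps, a = 1.0, -5.0          # h(. - a) is supported in [-6, -4], earlier than h
print(H_tilde(-7.0, eps, a))  # before both bumps
print(H_tilde(-2.0, eps, a))  # between the bumps
print(H_tilde(2.0, eps, a))   # after both bumps
```

In the region between the bumps, where $H = 1$, the factor $H$ in front of $\partial g$ can be dropped, which is the content of (239); $\partial H$ is supported inside the two bumps, where $g$ is constant by (233) and (238), in line with (241).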
Due to (238) and the retarded support of the $R$-products (224) we have
\[ \mathrm{supp}\,(\tilde j^N_{\mu\,gL} - j_\mu) \cap \mathrm{supp}\,((x_0, \vec x) \to h(x_0 - a)) = \emptyset . \tag{243} \]
So the contribution of $g_{\mu 0}\, h(x_0 - a)$ to (242) is $Q_0$, and by inserting (227) into the $\partial_\mu H$-term we obtain the assertion (236). The formula (236) manifestly shows that $Q^N_{gL}$ converges to $Q_0$ in the adiabatic limit $g(x) \to 1$, $\forall x$, provided this limit exists. For pure massive theories this limit exists indeed in the strong sense [20]. So, in the adiabatic limit of a pure massive gauge theory, one has the simplification that the BRST-cohomology is given in terms of $Q_0$ (cf. (139) and (140) and [16]). In the following proofs we will use [7, Proposition 2], which is a formula for the (anti-)commutator of two interacting fields:$^{jj}$
\[ [W^N_{gL}(f), V^N_{gL}(h)]_\mp = R^N_{gL}(W, V)(f, h) \mp R^N_{gL}(V, W)(h, f) , \tag{244} \]
where
\[ R^N_{gL}(W, V)(f, h) \overset{\mathrm{def}}{=} - \sum_{n=0}^{\infty} \frac{i^n}{n!} R^N_{n+2}((gL)^{\otimes n} \otimes W f; V h) . \tag{245} \]
Due to the retarded support (224) of the $R^N$-products we call $R^N_{gL}(W, V)(f, h)$ the retarded part (and $\mp R^N_{gL}(V, W)(h, f)$ the advanced part) of $[W^N_{gL}(f), V^N_{gL}(h)]_\mp$.

Proof of $(Q^N_{gL})^2 = 0$. Generalized perturbative gauge invariance (218) implies the relation
\[ [Q_0, R^N_{n+1}((Lg)^{\otimes n}; L^\mu_1 f)]_+ = -i n R^N_{n+1}((Lg)^{\otimes(n-1)} \otimes L^\rho_1 \partial_\rho g; L^\mu_1 f) - i R^N_{n+1}((Lg)^{\otimes n}; L^{\mu\rho}_2 \partial_\rho f) . \tag{246} \]
Due to $L^{\mu\rho}_2 = -L^{\rho\mu}_2$ we may require
\[ T^N_l(L, \ldots, L, L^{\mu\rho}_2) = -T^N_l(L, \ldots, L, L^{\rho\mu}_2) , \qquad \forall l \in \mathbb{N} . \tag{247} \]
This is an additional normalization condition, which is compatible with the other normalization conditions. Similarly to (N2), it can be satisfied by antisymmetrizing in $\mu \leftrightarrow \rho$ a $T^N_l(L, \ldots, L; L^{\mu\rho}_2)$ which satisfies the other normalization conditions. The corresponding $R^N_{n+1}(L, \ldots, L; L^{\mu\rho}_2)$ is then also antisymmetric in $\mu \leftrightarrow \rho$ and, hence, we have $L^{N\mu\rho}_{2\,gL} = -L^{N\rho\mu}_{2\,gL}$. By means of (236) and $Q_0^2 = 0$ we find
\[ 2 (Q^N_{gL})^2 = [Q^N_{gL}, Q^N_{gL}]_+ = 2 [Q_0, L^{N\mu}_{1\,gL}(H \partial_\mu g)]_+ + [L^{N\nu}_{1\,gL}(H \partial_\nu g), L^{N\mu}_{1\,gL}(H \partial_\mu g)]_+ . \tag{248} \]
Using (246) and (245) we obtain
\[ 2 [Q_0, L^{N\mu}_{1\,gL}(H \partial_\mu g)]_+ = -2i L^{N\mu\rho}_{2\,gL}(\partial_\rho (H \partial_\mu g)) - 2 R^N_{gL}(L^\nu_1, L^\mu_1)(\partial_\nu g, H \partial_\mu g) . \tag{249} \]
In the first term the $(\partial_\rho H)(\partial_\mu g)$-part vanishes by (241) and the $H \partial_\rho \partial_\mu g$-part is zero because of (247). In the remaining second term we first take (239) into account and then apply (244):
\[
\begin{aligned}
2 [Q_0, L^{N\mu}_{1\,gL}(H \partial_\mu g)]_+ & = -2 R^N_{gL}(L^\nu_1, L^\mu_1)(H \partial_\nu g, H \partial_\mu g) \\
& = -[L^{N\nu}_{1\,gL}(H \partial_\nu g), L^{N\mu}_{1\,gL}(H \partial_\mu g)]_+ .
\end{aligned}
\tag{250}
\]
Inserting this into (248) we see that $Q^N_{gL}$ is in fact nilpotent.

$^{jj}$ This formula, the retarded support of the $R$-products (224) and some further, quite obvious requirements can be viewed as the defining properties of the retarded products. They determine a direct construction of the $R_{n+1}$ ($n \in \mathbb{N}_0$) by induction on $n$. If wanted, the $T$-products can then be obtained by reversing (93), see [37], the second paper of [8] and [10].

Proof of the formula
\[ [Q^N_{gL}, W^N_{gL}(f)]_\mp = -i W'^{\,\nu N}_{gL}(\partial_\nu f) + i \big( G((L, L_1)g, W f) + G((W, W')f, L g) \big)^N_{gL} \tag{251} \]
for the BRST-transformation of the interacting fields. Let $f \in \mathcal{D}(\mathcal{O})$. According to (236) we have to compute two terms:
\[ [Q^N_{gL}, W^N_{gL}(f)]_\mp = [Q_0, W^N_{gL}(f)]_\mp + [L^{N\nu}_{1\,gL}(H \partial_\nu g), W^N_{gL}(f)]_\mp . \tag{252} \]
For the first one the identity (229) gives
\[ [Q_0, W^N_{gL}(f)]_\mp = -i W'^{\,\nu N}_{gL}(\partial_\nu f) + i \big( G((L, L_1)g, W f) + G((W, W')f, L g) \big)^N_{gL} - R^N_{gL}(L^\nu_1, W)(\partial_\nu g, f) , \tag{253} \]
where we have used (245). We turn to the other term in (252) and apply (244):
\[ [L^{N\nu}_{1\,gL}(H \partial_\nu g), W^N_{gL}(f)]_\mp = R^N_{gL}(L^\nu_1, W)(H \partial_\nu g, f) \mp R^N_{gL}(W, L^\nu_1)(f, H \partial_\nu g) . \tag{254} \]
The second term vanishes due to the support properties (224) and (240). Because of (239) we may omit the factor $H$ in the first term. Hence, we find that $[L^{N\nu}_{1\,gL}(H \partial_\nu g), W^N_{gL}(f)]_\mp$ cancels the last term in (253), and the assertion (251) remains.

Summing up, the following conditions on an interaction $L_0$ are sufficient for our local construction of observables.
• $L_0$ fulfills the conditions (A)–(F) given at the beginning of Sec. 4.5.1 or 4.5.3 respectively. In addition one can choose $L^{\mu\nu}_2$ such that it is antisymmetric in $\mu \leftrightarrow \nu$.
• There exist sums $N_{j,k} \in \mathcal{P}_0$ of monomials of degree four which satisfy (209)–(211) and (216)–(217).
• There is no restriction coming from ghost number conservation (N5) (ghost). In the relevant cases, this normalization condition has common solutions with (N0)–(N3), see [3, Appendix B.1].
• Simultaneously with (N0)–(N3) and (N5) (ghost), the master BRST-identity can be fulfilled for
\[ [Q_0, T_n(L_0, \ldots, L_0)] \quad \text{and} \quad [Q_0, T_n(L_1, L_0, \ldots, L_0)] \tag{255} \]
to all orders $n \in \mathbb{N}$.
The conditions mentioned so far suffice for the construction of the BRST-charge $Q^N_{gL}$. In our derivation of the BRST-transformation formula (232) of $W^N_{gL}$ ($W \in \mathcal{P}_0$ arbitrary) we have additionally used:
• the master BRST-identity for
\[ [Q_0, T_n(W, L_0, \ldots, L_0)] , \qquad \forall n \in \mathbb{N} , \tag{256} \]
• the properties (230) and (231) of $N_{0,0}$.

5. Taking Anomalies into Account

5.1. General procedure and axial anomaly

We recall that we understand by the expression 'anomaly' any term that violates the master Ward identity (N) and cannot be removed by an admissible finite renormalization of the $T$-products. The aim of this section is to take anomalies into account in the formulation of the MWI. This has consequences for the normalization condition ($\tilde{\mathrm{N}}$): We want to normalize $T((\tilde\partial^\nu V) \otimes W_1 f_1 \otimes \cdots)$ such that (30) holds true: in the special case $W = 1$, $V, W_1, \ldots, W_n \in \mathcal{P}_0$ the r.h.s. of ($\tilde{\mathrm{N}}$) should agree with the r.h.s. of (N). Hence, we will take the anomalies into account also in ($\tilde{\mathrm{N}}$). We proceed inductively with respect to the order $n$. Since we are not aware of a general proof that second order loop diagrams (i.e. $n = 1$ in (N)) are anomaly-free,$^{kk}$ we start with that case. We set
\[ -a^{(2)\nu}_{V,W}(g, f) \overset{\mathrm{def}}{=} T_2(V \partial^\nu g \otimes W f) + T_2((\partial^\nu V) g \otimes W f) + i \sum_{\chi, \psi \in \mathcal{G}} (\pm) T_1\Big( \Delta^\nu_{\chi,\psi}\Big( \frac{\partial V}{\partial \chi} g, \frac{\partial W}{\partial \psi} f \Big) \Big) , \qquad V, W \in \tilde{\mathcal{P}}_0 . \tag{257} \]
For later purpose we let $V, W \in \tilde{\mathcal{P}}_0$ (not only $V, W \in \mathcal{P}_0$ as in the MWI (N)). Due to causal factorization of the $T$-products we know that $a^{(2)}_{V,W}(g, f)$ is local. Therefore, there exists a unique $b^{(2)}_{V,W}(g, f) \in \mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)$ with$^{ll}$
\[ a^{(2)\nu}_{V,W}(g, f) = T_1(b^{(2)\nu}_{V,W}(g, f)) . \tag{258} \]
Let us now assume that we have already defined $a^{(m+1)}_{V,W_1,\ldots,W_m}$ and $b^{(m+1)}_{V,W_1,\ldots,W_m} \in \mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)$ for all $m < n$. We then set
\[
\begin{aligned}
-a^{(n+1)\nu}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n) \overset{\mathrm{def}}{=} {} & T_{n+1}(V \partial^\nu g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) + T_{n+1}((\partial^\nu V) g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& + i \sum_{m=1}^{n} \sum_{\chi, \psi \in \mathcal{G}} (\pm) T_n\Big( \Delta^\nu_{\chi,\psi}\Big( \frac{\partial V}{\partial \chi} g, \frac{\partial W_m}{\partial \psi} f_m \Big) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n \Big) \\
& + \sum_{k=1}^{n-1} \sum_{1 \leq m_1 < \ldots < m_k \leq n} (\pm) T_{n+1-k}\Big( b^{(k+1)\nu}_{V,W_{m_1},\ldots,W_{m_k}}(g, f_{m_1}, \ldots, f_{m_k}) \otimes W_1 f_1 \otimes \cdots \hat m_1 \cdots \hat m_k \cdots \otimes W_n f_n \Big) ,
\end{aligned}
\tag{259}
\]

$^{kk}$ In our proof of charge conservation (N5) (charge) (which is given in Appendix B of [7]) vacuum polarization plays an exceptional role, even to second order: our general argumentation does not yield $\partial^\mu_x T_2(j_\mu(x) j_\nu(y)) = 0$; an explicit calculation was necessary.
$^{ll}$ Remember $T_1(\mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)) = T_1(\mathcal{D}(\mathbb{R}^4, \tilde{\mathcal{P}}_0))$ and that $T_1$ is injective on $\mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)$.

where $V, W_1, \ldots, W_n \in \tilde{\mathcal{P}}_0$ and the possible signs $(\pm)$ come from the permutation of Fermi operators. By causal factorization and the definition of the $(b^{(k+1)})_{k<n}$, $a^{(n+1)}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n)$ is local. Hence, there exists a unique $b^{(n+1)}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n) \in \mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)$ with
\[ a^{(n+1)}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n) = T_1(b^{(n+1)}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n)) . \tag{260} \]
Obviously
\[ b^{(n+1)} : \mathcal{D}(\mathbb{R}^4, \tilde{\mathcal{P}}_0)^{\otimes(n+1)} \to \mathcal{D}(\mathbb{R}^4, \mathcal{P}_0) , \qquad V g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n \to b^{(n+1)}_{V,W_1,\ldots,W_n}(g, f_1, \ldots, f_n) \tag{261} \]
is linear and symmetrical in all factors except the first one. As a consequence of the normalization condition (N3) the $b^{(n+1)}_{V,W_1,\ldots,W_n}$, $V, W_1, \ldots, W_n \in \mathcal{P}_0$, $n$ fixed, are not independent. For example let $V$ be a sub-polynomial of $V'$ and $W_k$ a sub-polynomial of $W'_k$, $\forall k = 1, \ldots, n$ ($V = V'$ and $W_k = W'_k$ is admitted). Then $b^{(n+1)}_{V,W_1,\ldots,W_n} \neq 0$ implies $b^{(n+1)}_{V',W'_1,\ldots,W'_n} \neq 0$.$^{mm}$ However, the $b^{(n+1)}_{V,W_1,\ldots,W_n}$ are independent for different $n$, because the violations of the MWI coming from sub-diagrams are taken into account in (259) by the terms
\[ \sum_{k=1}^{n-1} \sum (\pm) T_{n+1-k}(b^{(k+1)}(\cdots) \otimes \cdots) . \]
Obviously the $b^{(k)}$ (259) and (260) depend on the normalization of the $T$-products. We assume that the latter fulfil (N0)–(N3) and ($\tilde{\mathrm{N}}$) in the following modified form:

($\tilde{\mathrm{N}}$)
\[
\begin{aligned}
T_{n+1}((\tilde\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) = {} & T_{n+1}((\partial^\nu V) W g \otimes W_1 f_1 \otimes \cdots \otimes W_n f_n) \\
& + i \sum_{m=1}^{n} \sum_{\chi, \psi \in \tilde{\mathcal{G}}} (\pm) T_n\Big( \Delta^\nu_{\chi,\psi}\Big( \frac{\partial V}{\partial \chi} W g, \frac{\partial W_m}{\partial \psi} f_m \Big) \otimes W_1 f_1 \otimes \cdots \hat m \cdots \otimes W_n f_n \Big) \\
& + \sum_{k=1}^{n} \sum_{1 \leq m_1 < \ldots < m_k \leq n} (\pm) T_{n+1-k}\Big( b^{(k+1)\nu}_{V,W_{m_1},\ldots,W_{m_k}}(g, f_{m_1}, \ldots, f_{m_k}) W \otimes W_1 f_1 \otimes \cdots \hat m_1 \cdots \hat m_k \cdots \otimes W_n f_n \Big) ,
\end{aligned}
\tag{262}
\]
where $V, W, W_1, \ldots, W_n \in \tilde{\mathcal{P}}_0$. Note that the sum over $k$ in the last term runs here up to $n$. Setting $W = 1$ and using the definition (259) and (260), we in fact obtain (30) (even for $V, W_1, \ldots, W_n \in \tilde{\mathcal{P}}_0$), which is the main reason for this modification of ($\tilde{\mathrm{N}}$). In Sec. 2 this implication relied on the validity of the MWI (N). This assumption is not needed here to get (30).

$^{mm}$ To see this we consider the $(V', W'_1, \ldots, W'_n)$-diagrams in which the additional factors of $V', W'_1, \ldots, W'_n$ are external legs. By amputating these external legs we obtain all $(V, W_1, \ldots, W_n)$-diagrams. (N3) requires that the non-amputated and amputated diagrams are equally normalized.

Also the modified ($\tilde{\mathrm{N}}$) fixes the normalization of the $T$-products of symbols with external derivatives in terms of $T$-products without external derivatives, namely by the following recursive procedure: For a monomial $W = \prod_s \tilde\partial^{a^{(s)}} \varphi_{r_s} \in \tilde{\mathcal{P}}_0$, $\varphi_{r_s} \in \mathcal{P}_0$, $a^{(s)} \in (\mathbb{N}_0)^4$, we define $|W| = \sum_s |a^{(s)}|$ where $|a^{(s)}| = a^{(s)}_0 + \cdots + a^{(s)}_3$. Let the normalization of $T_n(W_1, \ldots, W_n)$ with $|W_1| + \cdots + |W_n| = 0$ (i.e. $W_1, \ldots, W_n \in \mathcal{P}_0$) be given for all $n \in \mathbb{N}$. Then the determination of the $b^{(n)}_{W_1,\ldots,W_n}$ and of the normalization of $T_n(W_1, \ldots, W_n)$ with $|W_1| + \cdots + |W_n| > 0$ goes in a double inductive way: one makes a first induction with respect to the order $n$ and for each fixed $n$ a second induction with respect to $|W_1| + \cdots + |W_n|$. More precisely let $T_l(W_1, \ldots, W_l)$ and $b^{(l)}_{W_1,\ldots,W_l}$ be given for all $l \leq n$ and $W_1, \ldots, W_l \in \tilde{\mathcal{P}}_0$, and also for $l = n + 1$ if $|W_1| + \cdots + |W_{n+1}| < d$ ($d \in \mathbb{N}$). Then we determine by ($\tilde{\mathrm{N}}$) (262) the normalization of the $T_{n+1}(W_1, \ldots, W_{n+1})$ with $|W_1| + \cdots + |W_{n+1}| = d$ (this step does not take place for $d = 0$, because $T_{n+1}$ is given in that case). More precisely we use ($\tilde{\mathrm{N}}$) for $|V| + |W| + |W_1| + \cdots + |W_n| = d - 1$. Note that all $b^{(k+1)}$ and all $T$-products which appear in this case on the r.h.s. of ($\tilde{\mathrm{N}}$) are inductively given. Finally, from (259) and (260) we obtain the $b^{(n+1)}_{V,W_1,\ldots,W_n}$ with $|V| + |W_1| + \cdots + |W_n| = d$. Again we point out that thereby all terms which appear on the r.h.s. of (259) are inductively known. Starting this procedure with restricted $T$-products $(T_n|_{\mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)^{\otimes n}})_{n \in \mathbb{N}}$ which satisfy (N0)–(N3) (such $T$-products exist [19]) we end up with $T$-products which fulfil (N0)–(N3) and the modified ($\tilde{\mathrm{N}}$).
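The double induction can be pictured as a nested loop: the outer loop runs over the number of factors, the inner loop over the total number $d$ of external derivatives. The following is a purely schematic sketch of that ordering; it performs no field-theoretic computation. The function name `induction_steps`, the step labels, the cut-offs `max_order` and `max_degree`, and the choice to start at two factors are illustrative assumptions.

```python
def induction_steps(max_order, max_degree):
    """Yield the steps of the double induction in the order described above.

    For each number of factors ("order") and each total derivative degree d:
      * for d > 0, the T-product is first fixed through the modified (N~), (262);
      * then the anomaly term b is read off from (259) and (260).
    """
    for order in range(2, max_order + 1):          # outer induction: number of factors
        for degree in range(max_degree + 1):       # inner induction: |W_1| + ... = d
            if degree > 0:
                yield ("T from (262)", order, degree)
            yield ("b from (259),(260)", order, degree)

for step in induction_steps(max_order=3, max_degree=1):
    print(step)
```

Each step only uses data produced by earlier steps in this enumeration, which is the point of the remark that all terms on the right-hand sides are inductively known.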
To formulate the modified MWI we specialize to the case $V, W_1, \ldots, W_n \in \mathcal{P}_0$. Let us consider the set $\mathcal{T}$ of all sequences of $T$-products $(T_n)_{n \in \mathbb{N}}$ which satisfy the requirements of Sec. 2.2 (in particular causality and the normalization conditions (N0)–(N3)) and the modified ($\tilde{\mathrm{N}}$). We now define
\[ A((T_n)_{n \in \mathbb{N}}) \overset{\mathrm{def}}{=} (\bar b^{(n+1)})_{n \in \mathbb{N}} , \qquad \forall (T_n)_{n \in \mathbb{N}} \in \mathcal{T} , \tag{263} \]
where $\bar b^{(n+1)}$ is the restriction of $b^{(n+1)}$ (261) to $\mathcal{D}(\mathbb{R}^4, \mathcal{P}_0)^{\otimes(n+1)}$. The image $A(\mathcal{T})$ of this map is model dependent and it is usually hard work to get information about $A(\mathcal{T})$. If the zero-sequence (i.e. $0 \overset{\mathrm{def}}{=} (0, 0, \ldots)$) is an element of $A(\mathcal{T})$, which means that the model is anomaly-free, we are in the situation of Sec. 2.4: the MWI is then the normalization condition which forbids all $(T_n)_{n \in \mathbb{N}}$ which are not an element of $A^{-1}(0)$. If $0 \notin A(\mathcal{T})$ we choose a suitable (usually as simple as possible) $b \in A(\mathcal{T})$ and the master Ward identity is then the normalization condition that solely sequences $(T_n)_{n \in \mathbb{N}} \in A^{-1}(b)$ are allowed. We illustrate this by the example of the axial anomaly. Let $\mathcal{P}_\psi$ be the linear space which is generated by $L \overset{\mathrm{def}}{=} A_\mu j^\mu$, $j_A$, $j_\pi$ (cf. (89)) and all sub-monomials thereof. According to Bardeen [1] the simplest $b \in A(\mathcal{T})$ reads
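\[
\begin{aligned}
& b^{(n+1)}_{V,W_1,\ldots,W_n} = 0 , \qquad \forall\, n + 1 \neq 3 , \quad V, W_1, \ldots, W_n \in \mathcal{P}_\psi , \\
& b^{(3)\nu}_{j_{A\nu}, j^{\mu_1}, j^{\mu_2}}(g, f_1, f_2) = C \epsilon^{\mu_1 \mu_2 \rho\tau} g (\partial_\rho f_1)(\partial_\tau f_2) , \\
& b^{(3)\nu}_{j_{A\nu}, L, L}(g, f_1, f_2) = C \epsilon^{\mu_1 \mu_2 \rho\tau} g \, \partial_\rho(f_1 A_{\mu_1}) \partial_\tau(f_2 A_{\mu_2}) , \\
& b^{(3)\nu}_{j_{A\nu}, L, j^{\mu_2}}(g, f_1, f_2) = b^{(3)\nu}_{j_{A\nu}, j^{\mu_2}, L}(g, f_2, f_1) = C \epsilon^{\mu_1 \mu_2 \rho\tau} g \, \partial_\rho(f_1 A_{\mu_1}) \partial_\tau f_2 ,
\end{aligned}
\tag{264}
\]
$b^{(3)}_{j_A, j_A, j_A}(g, f_1, f_2) = \frac{1}{3} b^{(3)}_{j_A, j, j}(g, f_1, f_2)$ and $b^{(3)}_{V,W_1,W_2} = 0$ for all other $(V, W_1, W_2) \in (\mathcal{P}_\psi)^{\times 3}$, where $C$ is a well-known, fixed, complex number. Then particular cases of the MWI read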
\[
\begin{aligned}
T_{n+1}(j^\mu \partial_\mu f \otimes L g_1 \otimes \cdots \otimes L g_n) = {} & 0 , \\
T_{n+1}(j^\mu_A \partial_\mu f \otimes L g_1 \otimes \cdots \otimes L g_n) = {} & 2m \, T_{n+1}(j_\pi f \otimes L g_1 \otimes \cdots \otimes L g_n) \\
& + \sum_{1 \leq m_1 < m_2 \leq n} C \epsilon^{\mu_1 \mu_2 \rho\tau} \, T_{n-1}\big( f \partial_\rho(g_{m_1} A_{\mu_1}) \partial_\tau(g_{m_2} A_{\mu_2}) \otimes L g_1 \otimes \cdots \hat m_1 \cdots \hat m_2 \cdots \otimes L g_n \big) ,
\end{aligned}
\tag{265}
\]
which imply
\[
\begin{aligned}
j^\mu_{gL}(\partial_\mu f) & = 0 , \\
-j^\mu_{A\,gL}(\partial_\mu f) & = 2m \, j_{\pi\,gL}(f) - g_0^2 \frac{C}{8} \epsilon^{\mu_1 \mu_2 \rho\tau} (F_{\rho\mu_1} F_{\tau\mu_2})_{gL}(f) ,
\end{aligned}
\tag{266}
\]
where we assume $g(x) = g_0 = \mathrm{const.}$, $\forall x \in \mathrm{supp}\, f$. Non-vanishing anomalies $b^{(m+1)}_{V,W_1,\ldots,W_m}$ are not an obstacle to fulfilling the normalization condition (N4) and hence the field equation (94) (see Sec. 4.1), because (91) still solves (N4) (90). But the axial anomaly appears as an additional term in the charge conservation (N5) (charge) (96) and in the generalized perturbative gauge invariance (N6) (147) and hence also in the master BRST-identity, if axial fermions are present. However, for the non-Abelian gauge models studied in Sec. 4.5, we expect that the master BRST-identity can be satisfied in the relevant cases (255) and (256), and that therefore our local construction of observables works. But this remains to be proved.

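Part of the numerical factor in (266) comes from a purely algebraic identity: with $F_{\rho\mu} = \partial_\rho A_\mu - \partial_\mu A_\rho$ one has $\epsilon^{\mu_1\mu_2\rho\tau} F_{\rho\mu_1} F_{\tau\mu_2} = 4\, \epsilon^{\mu_1\mu_2\rho\tau} (\partial_\rho A_{\mu_1})(\partial_\tau A_{\mu_2})$. Combined with a factor $1/2$ that one would expect from the restriction $m_1 < m_2$ in (265), this is consistent with the $1/8$ in (266), assuming $\epsilon$ denotes the totally antisymmetric symbol. The sketch below checks the identity for an arbitrary integer matrix `M[rho][mu]` standing in for $\partial_\rho A_\mu$; all names and the sample matrix are hypothetical.

```python
from itertools import permutations

def parity(perm):
    """Sign of a permutation of (0, ..., n-1), computed by sorting with swaps."""
    p, sign = list(perm), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign

def epsilon_contract(X, Y):
    """sum over (mu1, mu2, rho, tau) of eps^{mu1 mu2 rho tau} X[rho][mu1] Y[tau][mu2]."""
    return sum(
        parity((m1, m2, r, t)) * X[r][m1] * Y[t][m2]
        for (m1, m2, r, t) in permutations(range(4))
    )

def field_strength(M):
    """F[rho][mu] = M[rho][mu] - M[mu][rho], the antisymmetric part times two."""
    return [[M[r][m] - M[m][r] for m in range(4)] for r in range(4)]

# Arbitrary sample values for the derivative matrix d_rho A_mu.
M = [[(3 * r * r + 7 * m + r * m * m) % 11 - 5 for m in range(4)] for r in range(4)]
F = field_strength(M)
assert epsilon_contract(F, F) == 4 * epsilon_contract(M, M)
```

The identity holds because each of the four terms in the expansion of $F F$ reduces to the same contraction after relabelling antisymmetrized indices.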
5.2. Energy momentum tensor: conservation and trace anomaly

We follow the procedure in [29]. Classically the canonical energy momentum tensor is the Noether current belonging to translation invariance (in time and space). Turning to QFT we consider a real, free, scalar field $\phi$ of mass $m \geq 0$. (In the formalism of Appendix A we set $\varphi \overset{\mathrm{def}}{=} \chi \overset{\mathrm{def}}{=} \phi$ and choose = 1.) The free canonical energy momentum tensor reads
\[ \Theta^{\mu\nu}_{0\,\mathrm{can}} = \partial^\mu \phi \, \partial^\nu \phi - \frac{1}{2} g^{\mu\nu} \partial^\rho \phi \, \partial_\rho \phi + \frac{1}{2} g^{\mu\nu} m^2 \phi^2 , \tag{267} \]
and this tensor is conserved due to the Klein–Gordon equation: $\partial_\mu \Theta^{\mu\nu}_{0\,\mathrm{can}} = 0$. Now we add an interaction of the form
\[ L = \lambda \phi^4 . \tag{268} \]
The interacting canonical energy momentum tensor is not simply the interacting field belonging to $\Theta^{\mu\nu}_{0\,\mathrm{can}}$; it has an additional term
\[ \Theta^{\mu\nu}_{\mathrm{can}\,gL}(f) = \Theta^{\mu\nu}_{0\,\mathrm{can}\,gL}(f) + g^{\mu\nu} L_{gL}(g f) . \tag{269} \]
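Let $W_1, \ldots, W_n$ be polynomials in $\phi$ (without any derivative). Applying twice the definition (259) and (260) we obtain the relation
\[
\begin{aligned}
T(W_1 g_1 \otimes \cdots \otimes W_n g_n \otimes \Theta^{\mu\nu}_{0\,\mathrm{can}} \partial_\mu f)
& = -i \sum_{k=1}^{n} T(W_1 g_1 \otimes \cdots \otimes (\partial^\nu W_k) g_k f \otimes \cdots \otimes W_n g_n) - A_1(f, g_1, \ldots, g_n) \\
& = -A_1(f, g_1, \ldots, g_n) + i \sum_{k=1}^{n} \big[ T(W_1 g_1 \otimes \cdots \otimes W_k \partial^\nu(g_k f) \otimes \cdots \otimes W_n g_n) + A_{2,k}(f, g_1, \ldots, g_n) \big] ,
\end{aligned}
\tag{270}
\]
where
\[
\begin{aligned}
A_1(f, g_1, \ldots, g_n) & \overset{\mathrm{def}}{=} \sum_{j=1}^{n} \sum_{m_1 < \cdots < m_j} T\Big( b^{(j+1)}_{\Theta^{\mu\nu}_{0\,\mathrm{can}}, W_{m_1}, \ldots, W_{m_j}\;\mu}(f, g_{m_1}, \ldots, g_{m_j}) \otimes W_1 g_1 \otimes \cdots \hat m_1 \cdots \hat m_j \cdots \otimes W_n g_n \Big) , \\
A_{2,k}(f, g_1, \ldots, g_n) & \overset{\mathrm{def}}{=} \sum_{l=1}^{n-1} \sum_{\substack{m_1 < \cdots < m_l \\ (m_j \neq k)}} T\Big( b^{(l+1)\nu}_{W_k, W_{m_1}, \ldots, W_{m_l}}(g_k f, g_{m_1}, \ldots, g_{m_l}) \otimes W_1 g_1 \cdots \hat m_1 \cdots \hat m_l \cdots \otimes W_n g_n \Big) .
\end{aligned}
\tag{271}
\]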
In [29] it is shown that there exists a normalization (which is compatible with Pn (N0)–(N3)nn) such that −A1 + i k=1 A2,k = 0. In the following we use this normalization. Then the identity (270) and (269) imply ν Θµν can gL (∂µ f ) = −LgL ((∂ g)f ) .
(272)
The energy momentum tensor is only conserved in space-time regions in which the coupling ‘constant’ $g$ is constant, in agreement with the fact that translation invariance is broken by a non-constant $g$. Unfortunately the trace of the canonical energy momentum tensor does not vanish, even for free fields. Following [29] and references cited therein, we assume $m = 0$ (and still $L = \lambda\phi^4$) and introduce the improved energy momentum tensor.$^{\mathrm{oo}}$ In (interacting) classical field theory it is defined by
\[ \Theta^{\mathrm{class}\,\mu\nu}_{\mathrm{imp}} := \Theta^{\mathrm{class}\,\mu\nu}_{\mathrm{can}} - \frac{1}{3}\big[\partial^\mu(\phi^{\mathrm{class}}\partial^\nu\phi^{\mathrm{class}}) - g^{\mu\nu}\partial^\rho(\phi^{\mathrm{class}}\partial_\rho\phi^{\mathrm{class}})\big] , \tag{273} \]
$^{\mathrm{nn}}$ So far no external derivatives are present. Hence, $(\tilde{N})$ plays no role.
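where $\Theta^{\mathrm{class}\,\mu\nu}_{\mathrm{can}}$ is given by the same formulas (267)–(269) as in QFT. This improved tensor is conserved and traceless. The latter relies on the field equation.
Now we are going to construct the corresponding tensor in QFT. We apply the definition (259) and (260) to $T((\phi\partial^\nu\phi)\partial^\mu f\otimes\cdots)$ and $(\tilde{N})$ (262) to $T((\phi\tilde\partial^\mu\partial^\nu\phi)f\otimes\cdots)$. So we obtain
\begin{align*}
-T((\phi\partial^\nu\phi)\partial^\mu f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) ={}& T((\partial^\mu\phi\,\partial^\nu\phi)f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) \\
&+ T((\phi\tilde\partial^\mu\partial^\nu\phi)f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) + A^{(n+1)\,\mu\nu}_{W_1,\dots,W_n}(f,g_1,\dots,g_n) , \tag{274}
\end{align*}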
where
\[ A^{(n+1)\,\mu\nu}_{W_1,\dots,W_n}(f,g_1,\dots,g_n) := \sum_{k=1}^{n}\ \sum_{m_1<\cdots<m_k} T\big(b^{(k+1)\,\mu}_{\phi\partial^\nu\phi,W_{m_1},\dots,W_{m_k}}(f,g_{m_1},\dots,g_{m_k})\otimes W_1 g_1\otimes\cdots\hat{m}_1\cdots\hat{m}_k\cdots\otimes W_n g_n\big) . \tag{275} \]
Here we have normalized $T(\partial^\nu\phi,W_{m_1},\dots,W_{m_k})$ according to (91) (N4), which implies
\[ b^{(k+1)}_{\partial^\nu\phi,W_{m_1},\dots,W_{m_k}} = 0 \tag{276} \]
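(there are no anomalies for tree-like diagrams, cf. Sec. 2.4). Hence, there are no $T(b^{(k+1)}\dots)$-terms in the application of $(\tilde{N})$ to $T((\phi\tilde\partial^\mu\partial^\nu\phi)f\otimes W_1 g_1\otimes\cdots)$. In particular, it follows from $(\tilde{N})$ that
\[ T((\phi\tilde\partial^\mu\partial^\nu\phi)f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) = T((\phi\tilde\partial^\nu\partial^\mu\phi)f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) . \tag{277} \]
For the interacting fields the identity (274) implies
\[ -(\phi\partial^\nu\phi)_{gL}(\partial^\mu f) = (\partial^\mu\phi\,\partial^\nu\phi)_{gL}(f) + (\phi\tilde\partial^\mu\partial^\nu\phi)_{gL}(f) + A^{\mu\nu}_g(f) , \tag{278} \]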
where
\[ A^{\mu\nu}_g(f) := \sum_{n=1}^{\infty} i^n \sum_{r=1}^{n} \frac{(-1)^{n-r}}{r!\,(n-r)!}\,\bar{T}\big((gL)^{\otimes(n-r)}\big)\,A^{(r+1)\,\mu\nu}_{L,\dots,L}(f,g,\dots,g) . \tag{279} \]
$^{\mathrm{oo}}$ In massive theories it is impossible, already at the classical level, to construct an energy momentum tensor $\Theta^{\mathrm{class}\,\mu\nu}$ which is conserved and traceless: the corresponding dilatation current $D^{\mathrm{class}\,\mu} \equiv x_\nu\Theta^{\mathrm{class}\,\mu\nu}$ would then be conserved, whereas a (non-vanishing) mass breaks dilatation invariance.
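Without further knowledge about $b^{(k+1)\,\mu}_{\phi\partial^\nu\phi,L,\dots,L}$ we cannot interpret $A^{\mu\nu}_g(f)$ as an interacting field, and $A^{\mu\nu}_g(f)$ need not be symmetrical in $\mu\leftrightarrow\nu$.$^{\mathrm{pp}}$ By means of (277) we find
\[ (\phi\partial^\nu\phi)_{gL}(\partial^\mu f) = (\phi\partial^\mu\phi)_{gL}(\partial^\nu f) - A^{\mu\nu}_g(f) + A^{\nu\mu}_g(f) . \tag{281} \]
Now we define the improvement tensor
\[ I^{\mu\nu}_{gL}(f) := -(\phi\partial^\nu\phi)_{gL}(\partial^\mu f) + g^{\mu\nu}(\phi\partial_\rho\phi)_{gL}(\partial^\rho f) . \tag{282} \]
By using (281) we find that it is conserved up to anomalous terms (i.e. terms which violate the MWI):
\[ I^{\mu\nu}_{gL}(\partial_\mu f) = A^{\mu\nu}_g(\partial_\mu f) - A^{\nu\mu}_g(\partial_\mu f) . \tag{283} \]
To compute the trace, we first mention
\[ T((\phi\tilde\partial^\mu\partial_\mu\phi)f\otimes W_1 g_1\otimes\cdots\otimes W_n g_n) = i\sum_{m=1}^{n} T\Big(W_1 g_1\otimes\cdots\otimes\phi\frac{\partial W_m}{\partial\phi}\,f g_m\otimes\cdots\otimes W_n g_n\Big) , \tag{284} \]
which is a consequence of $(\tilde{N})$ and (276). Therefore,
\[ (\phi\tilde\partial^\mu\partial_\mu\phi)_{gL}(f) = -\Big(\phi\frac{\partial L}{\partial\phi}\Big)_{gL}(fg) = -4L_{gL}(fg) . \tag{285} \]
From (282), (278) and (285) we obtain
\[ \frac{1}{3}I^{\mu}_{\ \mu\,gL}(f) = -(\partial^\mu\phi\,\partial_\mu\phi)_{gL}(f) + 4L_{gL}(fg) - A^{\mu}_{g\,\mu}(f) = \Theta^{\mu}_{\mathrm{can}\,\mu\,gL}(f) - A^{\mu}_{g\,\mu}(f) . \tag{286} \]
The improved energy momentum tensor is defined analogously to (273), namely
\[ \Theta^{\mu\nu}_{\mathrm{imp}\,gL}(f) := \Theta^{\mu\nu}_{\mathrm{can}\,gL}(f) - \frac{1}{3}I^{\mu\nu}_{gL}(f) . \tag{287} \]
$^{\mathrm{pp}}$ Due to (N0) the anomaly $a^{(k+1)\,\mu}_{\phi\partial^\nu\phi,L,\dots,L} = T_1(b^{(k+1)\,\mu}_{\phi\partial^\nu\phi,L,\dots,L})$ has the form
\begin{align*}
a^{(k+1)\,\mu}_{\phi\partial^\nu\phi,L,\dots,L}(f,g,\dots,g) = \int dx\,f(x)\int dy_1\cdots dy_k\,g(y_1)\cdots g(y_k)\,\Big\{ & P_4^{\mu\nu}(\partial_1,\dots,\partial_k)\,\delta(y_1-x,\dots,y_k-x) \\
& + \sum_{j\le l} {:}\phi(y_j)\phi(y_l){:}\ P^{\mu\nu}_{2,jl}(\partial_1,\dots,\partial_k)\,\delta(y_1-x,\dots,y_k-x) \\
& + \sum_{j\le l\le r\le s} {:}\phi(y_j)\phi(y_l)\phi(y_r)\phi(y_s){:}\ P^{\mu\nu}_{0,jlrs}\,\delta(y_1-x,\dots,y_k-x)\Big\} , \tag{280}
\end{align*}
where $P^{\mu\nu}_{m,\dots}(\partial_1,\dots,\partial_k)$ is a polynomial of degree $m$ in the partial derivatives $\partial_{y_1},\dots,\partial_{y_k}$, and the expression in the $\{\cdots\}$-bracket is symmetrical under permutations of $y_1,\dots,y_k$. $P^{\mu\nu}_{0,jlrs}$ is $\sim g^{\mu\nu}$ and, hence, symmetrical in $\mu\leftrightarrow\nu$. But, e.g. for $k = 2$, the terms $(\Box_1\partial_1^\mu\partial_2^\nu + \Box_2\partial_2^\mu\partial_1^\nu)\,\delta(y_1-x,y_2-x)$ and $({:}\phi^2(y_1){:}\,\partial_1^\mu\partial_2^\nu + {:}\phi^2(y_2){:}\,\partial_2^\mu\partial_1^\nu)\,\delta(y_1-x,y_2-x)$ do not have this $(\mu\leftrightarrow\nu)$-symmetry, and their contributions to $a^{(k+1)\,\mu}_{\phi\partial^\nu\phi,L,\dots,L}(\partial_\mu f,g,\dots,g) - a^{(k+1)\,\nu}_{\phi\partial^\mu\phi,L,\dots,L}(\partial_\mu f,g,\dots,g)$, and hence to $I^{\mu\nu}_{gL}(\partial_\mu f) = A^{\mu\nu}_g(\partial_\mu f) - A^{\nu\mu}_g(\partial_\mu f)$ (283), do not vanish.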
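Our results (272), (283) and (286) show that the improved tensor (287) is conserved and traceless up to anomalous terms:
\begin{align*}
\Theta^{\mu\nu}_{\mathrm{imp}\,gL}(\partial_\mu f) &= \Theta^{\mu\nu}_{\mathrm{can}\,gL}(\partial_\mu f) - \frac{1}{3}\big(A^{\mu\nu}_g(\partial_\mu f) - A^{\nu\mu}_g(\partial_\mu f)\big) \\
&= -L_{gL}((\partial^\nu g)f) - \frac{1}{3}\big(A^{\mu\nu}_g(\partial_\mu f) - A^{\nu\mu}_g(\partial_\mu f)\big) , \\
\Theta^{\mu}_{\mathrm{imp}\,\mu\,gL}(f) &= A^{\mu}_{g\,\mu}(f) . \tag{288}
\end{align*}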
In the literature ([29] and references cited therein) it is shown that the anomalous terms can be removed by a suitable normalization in one of the two equations in (288), but not simultaneously in both. Usually one gives priority to the conservation and allows for a trace anomaly. The latter breaks the dilatation invariance and gives rise to anomalous dimensions of the interacting fields.
Remark 5.1. We are going to show that the trace anomaly is of order $O(g^2)$ for the interaction (268). We have to verify that (270), (274), (277) and (284) can be fulfilled without any anomalous terms $A^{\dots}_{\dots}$ to first order in $g$. Due to (N3) we have
\begin{align*}
T_2(\partial^a\phi\,\tilde\partial^b\partial^c\phi,\phi^4)(x,y) ={}& {:}\partial^a\phi\,\partial^{b+c}\phi(x)\,\phi^4(y){:} \tag{289} \\
&+ 4\langle\Omega,T_2(\tilde\partial^b\partial^c\phi,\phi)(x,y)\Omega\rangle\ {:}\partial^a\phi(x)\,\phi^3(y){:} \tag{290} \\
&+ 4\langle\Omega,T_2(\partial^a\phi,\phi)(x,y)\Omega\rangle\ {:}\partial^{b+c}\phi(x)\,\phi^3(y){:} \tag{291} \\
&+ 6\langle\Omega,T_2(\partial^a\phi\,\tilde\partial^b\partial^c\phi,\phi^2)(x,y)\Omega\rangle\ {:}\phi^2(y){:}\,. \tag{292}
\end{align*}
For the tree diagrams (289), (290) and (291) the MWI holds true. An anomaly must come from the loop diagram (292), which is the two-legs sector. We define the normalization of $\langle\Omega,T_2(\phi\tilde\partial^\mu\partial^\nu\phi,\phi^2)\Omega\rangle$ by (274) with $A^{(2)\,\mu\nu}_{\phi^2} \equiv 0$. The $(\mu\leftrightarrow\nu)$-symmetry (277) holds, because all tensors of rank two are $\sim g^{\mu\nu}$ or $\sim p^\mu p^\nu$, where $p$ is the momentum belonging to the relative coordinate $(x-y)$. The $T$-products on the r.h. sides of (270) and (284) have four legs for $n = 1$ and $W_1 = L = \lambda\phi^4$. Hence, it remains to show that there exists a normalization such that
\[ \partial^x_\mu\langle\Omega,T_2(\partial^\mu\phi\,\partial^\nu\phi,\phi^2)(x,y)\Omega\rangle = \frac{1}{2}\,\partial^\nu_x\langle\Omega,T_2(\partial^\rho\phi\,\partial_\rho\phi,\phi^2)(x,y)\Omega\rangle \tag{293} \]
(which is (270)) and
\[ \partial^x_\mu\langle\Omega,T_2(\phi\partial^\mu\phi,\phi^2)(x,y)\Omega\rangle = \langle\Omega,T_2(\partial_\mu\phi\,\partial^\mu\phi,\phi^2)(x,y)\Omega\rangle \tag{294} \]
(which is (284)). An explicit calculation shows that this can in fact be done.$^{\mathrm{qq}}$
$^{\mathrm{qq}}$ The C-number distributions in (292) for $b = 0$ and the relevant values of $a$ and $c$ have essentially been calculated in the second paper of [11] (Sec. 2 and Appendix C).
6. Conclusions
The justifications to require the master Ward identity (as a normalization condition for the time-ordered products) are the following facts:
• In the classical limit $\hbar\to 0$ the MWI becomes an identity which always holds true [9].
• The MWI has many far-reaching and important consequences (see Sec. 4) which we would like to hold true in QFT.$^{\mathrm{rr}}$
• It seems that the MWI can nearly always be satisfied: it is compatible with the other normalization conditions (Sec. 3), and many consequences of the MWI (e.g. the field equation, charge- and ghost-number conservation, conservation of the energy momentum tensor and perturbative gauge invariance ((147) with $j_1 = \cdots = j_n = 0$) for $SU(N)$-Yang–Mills theories) have already been proved in the literature (Sec. 4). The only counter-examples we know are the usual anomalies of perturbative QFT.
Appendix A. Feynman Propagators
Let $\varphi,\chi\in G$ be the symbols corresponding to two massive or massless free fields (without derivatives) with the same mass, which satisfy the Klein–Gordon or wave equation
\[ (\Box + m^2)\varphi = 0 , \qquad (\Box + m^2)\chi = 0 , \qquad m\ge 0 , \tag{295} \]
and obey Bose or Fermi statistics. We assume that $T_1(\varphi g)$, $g\in\mathcal{D}(\mathbb{R}^4)$, (anti-)commutes with all free fields except $T_1(\chi h)$, $h\in\mathcal{D}(\mathbb{R}^4)$, and the same for $\varphi$ and $\chi$ exchanged. The non-vanishing (anti-)commutator is given by
\[ \Delta_{\varphi,\chi} = \epsilon D_m , \tag{296} \]
where $D_m$ is the (massive or massless) Pauli–Jordan distribution to the mass $m$, $\epsilon$ is a sign which depends on $(\varphi,\chi)$, and we have extended the notation (21) to anticommutators. For a bosonic real scalar field it is $\chi = \varphi$ and for a bosonic complex scalar field we have $\chi = \varphi^+$. In case of the fermionic ghost fields of non-Abelian gauge theories $\varphi$ and $\chi$ must be different: $\varphi = \tilde{u}_a$, $\chi = u_a$, $\epsilon = 1$, where $a$ is the color index. Alternatively one may also set $\varphi = u_a$, $\chi = \tilde{u}_a$, $\epsilon = -1$. Spinor fields will be treated later.
According to our definition (32) of the Feynman propagators and the normalization condition (N0), $\Delta^F_{\partial^a\varphi,\partial^b\chi}$ contains undetermined local terms if and only if
\[ \omega := \mathrm{sd}(\Delta^F_{\partial^a\varphi,\partial^b\chi}) - 4 \equiv -2 + |a| + |b| \ge 0 , \tag{297} \]
namely
\[ \Delta^F_{\partial^a\varphi,\partial^b\chi} = \epsilon\,(-1)^{|b|}\Big[\partial^a\partial^b D^F_m + \sum_{|c|=0}^{\omega} C^{(a,b)}_c\,\partial^c\delta\Big] , \tag{298} \]
$^{\mathrm{rr}}$ We discovered (or invented) the MWI by searching for a local construction of observables in non-Abelian quantum gauge theories. (In [7] this construction is given for QED.) We succeeded, provided several normalization conditions are fulfilled, see [3]. In order to prove that the latter have a common solution we looked for a universal formulation of these normalization conditions, and found the MWI.
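where $D^F_m$ is the massive or massless Feynman propagator, $\epsilon$ is the sign in (296), and the $C^{(a,b)}_c\in\mathbb{C}$ are constants. We give an explicit list of the undetermined terms for the lowest values of $|a| + |b|$:
\begin{align*}
\Delta^F_{\partial^\mu\varphi,\partial^\nu\chi} &= -\epsilon\big[\partial^\mu\partial^\nu D^F_m + C g^{\mu\nu}\delta\big] , \tag{299} \\
\Delta^F_{\partial^\mu\partial^\nu\varphi,\chi} = \Delta^F_{\varphi,\partial^\mu\partial^\nu\chi} &= \epsilon\Big[\partial^\mu\partial^\nu D^F_m - \frac{1}{4}g^{\mu\nu}\delta\Big] , \tag{300} \\
-\Delta^F_{\partial^\mu\partial^\nu\varphi,\partial^\lambda\chi} = \Delta^F_{\partial^\lambda\varphi,\partial^\mu\partial^\nu\chi} &= \epsilon\Big[\partial^\mu\partial^\nu\partial^\lambda D^F_m + C_1 g^{\mu\nu}\partial^\lambda\delta - \Big(\frac{1}{2} + 2C_1\Big)\big(g^{\mu\lambda}\partial^\nu\delta + g^{\nu\lambda}\partial^\mu\delta\big)\Big] , \tag{301} \\
\Delta^F_{\partial^\mu\partial^\nu\partial^\lambda\varphi,\chi} = -\Delta^F_{\varphi,\partial^\mu\partial^\nu\partial^\lambda\chi} &= \epsilon\Big[\partial^\mu\partial^\nu\partial^\lambda D^F_m - \frac{1}{6}\big(g^{\mu\nu}\partial^\lambda\delta + g^{\mu\lambda}\partial^\nu\delta + g^{\nu\lambda}\partial^\mu\delta\big)\Big] , \tag{302}
\end{align*}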
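where we have taken account of Poincaré covariance, symmetry with respect to exchange of Lorentz indices and
\[ \Delta^F_{\Box\partial^a\varphi,\partial^b\chi} = -m^2\Delta^F_{\partial^a\varphi,\partial^b\chi} = \Delta^F_{\partial^a\varphi,\Box\partial^b\chi} . \tag{303} \]
With these formulas and $(\Box + m^2)D^F_m = \delta$ we compute $\delta^\mu_{\chi,\psi} := \partial^\mu\Delta^F_{\chi,\psi} - \Delta^F_{\partial^\mu\chi,\psi}$ (33):
\begin{align*}
\delta^\mu_{\varphi,\chi} &= 0 , \tag{304} \\
\delta^\mu_{\varphi,\partial^\nu\chi} &= \epsilon\,C g^{\mu\nu}\delta , \tag{305} \\
\delta^\mu_{\partial^\nu\varphi,\chi} &= \frac{\epsilon}{4}\,g^{\mu\nu}\delta , \qquad \delta^\mu_{\partial_\mu\varphi,\chi} = \epsilon\,\delta , \tag{306} \\
\delta^\mu_{\partial^\tau\varphi,\partial^\nu\chi} &= \epsilon\Big[-\Big(C + \frac{1}{2} + 2C_1\Big)g^{\tau\nu}\partial^\mu\delta + C_1 g^{\mu\tau}\partial^\nu\delta - \Big(\frac{1}{2} + 2C_1\Big)g^{\mu\nu}\partial^\tau\delta\Big] , \tag{307} \\
\delta^\mu_{\partial_\mu\varphi,\partial^\nu\chi} &= -\epsilon\,(1 + C)\,\partial^\nu\delta , \tag{308} \\
\delta^\mu_{\partial_\mu\partial_\tau\varphi,\chi} &= \frac{3}{4}\,\epsilon\,\partial_\tau\delta , \tag{309} \\
\delta^\mu_{\partial_\mu\partial^\tau\varphi,\partial^\nu\chi} &= \epsilon\Big[\Big(-\frac{1}{2} + C_1\Big)\partial^\nu\partial^\tau\delta + \Big(\frac{1}{2} + 2C_1\Big)g^{\tau\nu}\Box\delta - C m^2 g^{\tau\nu}\delta\Big] . \tag{310}
\end{align*}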
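The numerical coefficients of the local terms in (300)–(302) can be cross-checked against the trace condition (303). The sketch below is not part of the original derivation. It assumes the reading of (300)–(302) in which the rank-three local term is a combination $a\,g^{\mu\nu}\partial^\lambda\delta + b\,g^{\mu\lambda}\partial^\nu\delta + c\,g^{\nu\lambda}\partial^\mu\delta$. Contracting $\mu$ with $\nu$ in four dimensions gives $(4a + b + c)\,\partial^\lambda\delta$, and (303) together with $(\Box + m^2)D^F_m = \delta$ requires this coefficient to be $-1$. The function names are hypothetical.

```python
from fractions import Fraction

DIM = 4  # space-time dimension, so g_{mu nu} g^{mu nu} = 4


def trace_rank2(coeff):
    """Contract coeff * g^{mu nu} * delta with g_{mu nu}."""
    return DIM * coeff


def trace_rank3(a, b, c):
    """Contract mu with nu in
    a g^{mu nu} d^lam + b g^{mu lam} d^nu + c g^{nu lam} d^mu;
    the result is the coefficient of d^lam delta."""
    return DIM * a + b + c


def local_terms_301(c1):
    """Coefficients (a, b, c) of the local terms as read from (301);
    c1 is the free normalization constant C_1."""
    half = Fraction(1, 2)
    return (c1, -(half + 2 * c1), -(half + 2 * c1))


def local_terms_302():
    """Coefficients (a, b, c) of the local terms as read from (302)."""
    sixth = Fraction(1, 6)
    return (-sixth, -sixth, -sixth)


if __name__ == "__main__":
    print(trace_rank2(Fraction(-1, 4)))                  # (300)
    print(trace_rank3(*local_terms_301(Fraction(3, 7))))  # (301), arbitrary C_1
    print(trace_rank3(*local_terms_302()))                # (302)
```

In this reading the trace condition is met for every value of $C_1$, which is consistent with $C_1$ remaining a free constant in (301).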
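For spinor fields with mass $m\ge 0$ obeying the Dirac equation we have
\[ \Delta_{\psi,\bar\psi} = -(i\gamma^\mu\partial_\mu + m)D_m \tag{311} \]
and find
\begin{align*}
\delta^\mu_{\gamma_\mu\psi,\bar\psi} &= -i\delta , \tag{312} \\
\delta^\mu_{\bar\psi\gamma_\mu,\psi} &= -i\delta . \tag{313}
\end{align*}
Appendix B. Explicit Results for $\Delta^\mu$ Used in the Application of the MWI to the BRST-Current
Let $j_\mu$ be the free BRST-current (117). We assume that each symbol in $W\in P_0$ carries at most a first (internal) derivative (no higher derivatives). Then the following
\[ \Delta^\mu_{\chi,\psi}\Big(\frac{\partial j_\mu}{\partial\chi}\,g,\ \frac{\partial W}{\partial\psi}\,f\Big) , \qquad \chi,\psi\in G , \tag{314} \]
do not vanish: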
∂W 1 ∂W f = (∂ µ ua ) µ gf , (315) χ = ∂τ Aτa : ∆µ∂τ Aτa ,Aν (∂ µ ua )g, b ∂Aνb 4 ∂Aa ∂W µ µ f ∆∂τ Aτa ,∂ σ Aν (∂ ua )g, b ∂(∂ σ Aνb ) 1 ∂W gf = CAa + + 2C1Aa (∂˜µ ∂µ ua ) 2 ∂(∂ν Aνa ) ∂W µ (∂ g)f + (∂µ ua ) ∂(∂ν Aνa ) ∂W ∂W σ ν µ ν ˜ gf + (∂ ua ) (∂ g)f − C1Aa (∂ ∂ ua ) ∂(∂ σ Aνa ) ∂(∂ ν Aµa ) 1 ∂W + 2C1Aa (∂˜ν ∂ σ ua ) gf + 2 ∂(∂ σ Aνa ) ∂W µ ν (∂ g)f , (316) + (∂ ua ) ∂(∂ µ Aνa ) ∂W ∂W µ τ f = −(∂τ Aτa + ma φa ) gf , (317) χ = ∂µ ua : ∆∂µ ua ,˜ub (∂τ Aa + ma φa )g, ∂u ˜b ∂u ˜a ∆µ∂µ ua ,∂ν u˜b (∂τ Aτa + ma φa )g,
∂W f ∂(∂ν u ˜b )
= −(1 + Cua ) (∂˜ν (∂τ Aτa + ma φa ))
∂W gf ∂(∂ν u˜a )
+ (∂τ Aτa
∂W (∂ν g)f , + ma φa ) ∂(∂ν u ˜a )
χ = ∂µ ∂τ Aτa : ∆µ∂µ ∂τ Aτ ,Aν a
− ua g,
b
∂W f ∂Aνb
=
(318)
3 ˜ν ∂W ∂W ν (∂ g)f , (∂ ua ) ν gf + ua 4 ∂Aa ∂Aνa (319)
∂W f a b ∂(∂ σ Aνb ) 1 ∂W − C1Aa · (∂˜ν ∂˜σ ua ) gf = 2 ∂(∂ σ Aνa ) ∂W ∂W σ ˜ + + (∂ ua ) (∂ ν g)f ∂(∂ ν Aσa ) ∂(∂ σ Aνa ) 1 ∂W ν σ ˜ a ) ∂W gf (∂ ∂ g)f − + 2C1Aa (u + ua σ ν ∂(∂ Aa ) 2 ∂(∂ν Aνa ) ∂W ∂W ν (∂ (g)f g)f + u + 2(∂˜ν ua ) a ∂(∂τ Aτa ) ∂(∂τ Aτa )
∆µ∂µ ∂τ Aτ ,∂ σ Aν
− ua g,
+ CAa m2a ua
∂W gf , ∂(∂τ Aτa )
(320)
χ = ua :
∆µua ,˜ub
−
(∂µ (∂τ Aτa
∆µua ,∂ν u˜b
−
∂W + ma φa ))g, f ∂u ˜b
(∂µ (∂τ Aτa
= 0,
∂W f + ma φa ))g, ∂(∂ν u ˜b )
= Cua (∂ν (∂τ Aτa + ma φa ))
∂W gf , ∂(∂ν u ˜a )
(321)
(322)
∂W f = 0, (323) χ = φa : ∆µφa ,φb (∂µ ua )g, ∂φb ∂W ∂W µ f = −Cφa (∂µ ua ) gf , (324) ∆φa ,∂ν φb (∂µ ua )g, ∂(∂ν φb ) ∂(∂µ φa ) χ = ∂µ φa : ∆µ∂µ φa ,φb
− ua g,
∆µ∂µ φa ,∂ν φb
∂W f ∂φb
− ua g,
∂W f ∂(∂ν φb )
= (1 + Cφa ) (∂˜ν ua )
∂W gf , ∂φa
= ua
(325)
∂W ∂W gf + ua (∂ν g)f , (326) ∂(∂ν φa ) ∂(∂ν φa )
where we have used the explicit expressions (304)–(310) for the $\delta^\mu$ and the definition (35) of $\Delta^\mu$.
Acknowledgments
We profited very much from discussions with Klaus Fredenhagen about all parts of this paper, including technical details. He gave us important stimuli. Together with the first author, he is working on a second paper about the master Ward identity [9], which has influenced this paper. We thank Raymond Stora for interesting comments, detailed questions and improvements of various formulations. We are also grateful to Joachim Freund, Dirk Prange and Karl-Henning Rehren for discussions. This paper was mainly written at the ‘II. Institut für Theoretische Physik der Universität Hamburg’. The first author was supported by the Deutsche Forschungsgemeinschaft.
References
[1] W. A. Bardeen, Anomalous Ward identities in spinor field theories, Phys. Rev. 184 (1969), 1848.
[2] C. Becchi, A. Rouet and R. Stora, Renormalization of the Abelian Higgs–Kibble model, Commun. Math. Phys. 42 (1975), 127. C. Becchi, A. Rouet and R. Stora, Renormalization of gauge theories, Ann. Phys. (N. Y.) 98 (1976), 287.
[3] F. M. Boas, Gauge theories in local causal perturbation theory, hep-th/0001014.
[4] N. N. Bogoliubov and D. V. Shirkov, Introduction to the Theory of Quantized Fields, New York, 1959.
[5] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Commun. Math. Phys. 208 (2000), 623.
[6] K. Bresser, G. Pinter and D. Prange, The Lorentz invariant extension of scalar theories, hep-th/9903266. D. Prange, Lorentz covariance in Epstein–Glaser renormalization, hep-th/9904136.
[7] M. Dütsch and K. Fredenhagen, A local (perturbative) construction of observables in gauge theories: the example of QED, Commun. Math. Phys. 203 (1999), 71. M. Dütsch and K. Fredenhagen, Deformation stability of BRST-quantization, in Proceedings of the Conference “Particles, Fields and Gravitation”, Lodz, Poland, 1998, pp. 324–333.
[8] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Commun. Math. Phys. 219 (2001), 5. M. Dütsch and K. Fredenhagen, Perturbative algebraic field theory, and deformation quantization, Proceedings of the Conference on Mathematical Physics in Mathematics and Physics, Siena, June 20–25, 2000.
[9] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger–Dyson equation in classical field theory, under preparation.
[10] M. Dütsch and K. Fredenhagen, Causal perturbation theory in terms of retarded products and perturbative algebraic field theory, under preparation.
[11] M. Dütsch, T. Hurth, K. Krahe and G. Scharf, Causal construction of Yang–Mills theories. I., N. Cimento A 106 (1993), 1029.
[12] [13] [14] [15]
[16] [17] [18] [19] [20] [21]
[22]
[23] [24] [25] [26]
[27]
M. Dütsch, T. Hurth, K. Krahe and G. Scharf, Causal construction of Yang–Mills theories. II., N. Cimento A 107 (1994), 375.
M. Dütsch, T. Hurth and G. Scharf, Causal construction of Yang–Mills theories. III., N. Cimento A 108 (1995), 679.
M. Dütsch, T. Hurth and G. Scharf, Causal construction of Yang–Mills theories. IV. Unitarity, N. Cimento A 108 (1995), 737.
M. Dütsch, On gauge invariance of Yang–Mills theories with matter fields, N. Cimento A 109 (1996), 1145.
M. Dütsch, K. Krahe and G. Scharf, Interacting fields in finite QED, N. Cimento A 103 (1990), 871.
M. Dütsch, K. Krahe and G. Scharf, Gauge invariance in finite QED, N. Cimento A 103 (1990), 903.
M. Dütsch, K. Krahe and G. Scharf, Scalar QED revisited, N. Cimento A 106 (1993), 277.
M. Dütsch and G. Scharf, Perturbative gauge invariance: the electroweak theory, Ann. Phys. (Leipzig) 8 (1999), 359.
A. Aste, M. Dütsch and G. Scharf, Perturbative gauge invariance: the electroweak theory II, Ann. Phys. (Leipzig) 8 (1999), 389.
M. Dütsch and B. Schroer, Massive vector mesons and gauge theory, J. Phys. A 33 (2000), 4317.
M. Dütsch, Slavnov–Taylor identities from the causal point of view, Int. J. Mod. Phys. A 12 (1997), 3205.
M. Dütsch, Non-uniqueness of quantized Yang–Mills theories, J. Phys. A 29 (1996), 7597.
H. Epstein and V. Glaser, The role of locality in perturbation theory, Ann. Inst. H. Poincaré A 19 (1973), 211.
H. Epstein and V. Glaser, Adiabatic limit in perturbation theory, in Renormalization theory, eds. G. Velo and A. S. Wightman, 1976, 193–254.
D. R. Grigore, On the uniqueness of the non-abelian gauge theories in Epstein–Glaser approach to renormalisation theory, Romanian J. Phys. 44 (1999), 853.
D. R. Grigore, The standard model and its generalisations in Epstein–Glaser approach to renormalisation theory, J. Phys. A 33 (2000), 8443.
D. R. Grigore, The standard model and its generalisations in Epstein–Glaser approach to renormalisation theory II: the fermion sector and the axial anomaly, J. Phys. A 34 (2001), 5429.
D. R. Grigore, The structure of the anomalies of gauge theories in the causal approach, to appear in J. Phys. A.
T. Hurth and K. Skenderis, Quantum Noether method, Nucl. Phys. B 514 (1999), 566.
T. Hurth and K. Skenderis, The quantum Noether condition in terms of interacting fields, Lect. Notes Phys. 558 (2000), 86.
C. Itzykson and J.-B. Zuber, Quantum Field Theory, McGraw-Hill, 1985.
T. Kugo and I. Ojima, Local covariant operator formalism of non-abelian gauge theories and quark confinement problem, Suppl. Progr. Theor. Phys. 66 (1979), 1.
F. Krahe, A causal approach to massive Yang–Mills theories, Acta Phys. Polonica B 27 (1996), 2453.
Y.-M. P. Lam, Perturbation Lagrangian theory for scalar fields: Ward–Takahashi identity and current algebra, Phys. Rev. D 6 (1972), 2145.
Y.-M. P. Lam, Equivalence theorem on Bogoliubov–Parasiuk–Hepp–Zimmermann renormalized Lagrangian field theories, Phys. Rev. D 7 (1973), 2943.
J. H. Lowenstein, Differential vertex operations in Lagrangian field theory, Commun. Math. Phys. 24 (1971), 1.
[28] J. H. Lowenstein, Normal-product quantization of currents in Lagrangian field theory, Phys. Rev. D 4 (1971), 2281.
[29] D. Prange, Energy momentum tensor and operator product expansion in local causal perturbation theory, hep-th/0009124. D. Prange, Energy momentum tensor in local causal perturbation theory, Ann. Phys. (Leipzig) 10 (2001), 497.
[30] D. Prange, Epstein–Glaser renormalization and differential renormalization, J. Phys. A 32 (1999), 2225.
[31] O. Piguet and S. P. Sorella, Algebraic Renormalization, Springer-Verlag, 1995.
[32] G. Pinter, Finite renormalizations in the Epstein–Glaser framework and renormalization of the S-matrix of $\phi^4$-theory, Ann. Phys. (Leipzig) 10 (2001), 333.
[33] M. Requardt, Symmetry conservation and integrals over local charge densities in quantum field theory, Commun. Math. Phys. 50 (1976), 259.
[34] G. Scharf, Finite Quantum Electrodynamics. The Causal Approach, Springer-Verlag, 1995.
[35] G. Scharf, Quantum Gauge Theories: A True Ghost Story, John Wiley and Sons, Inc., 2001.
[36] G. Scharf, General massive gauge theory, N. Cimento A 112 (1999), 619.
[37] O. Steinmann, Perturbation expansions in axiomatic field theory, Lecture Notes in Physics 11, Berlin-Heidelberg-New York, Springer-Verlag, 1971.
[38] R. Stora, Local gauge groups in quantum field theory: perturbative gauge theories, talk given at the workshop ‘Local quantum physics’ at the Erwin-Schroedinger-Institute, Vienna, 1997.
[39] R. Stora, Differential Algebras in Lagrangean Field Theory, ETH-Zürich Lectures, January-February, 1993. G. Popineau and R. Stora, A pedagogical remark on the main theorem of perturbative renormalization theory, unpublished preprint, 1982.
[40] R. Stora, Lagrangian field theory, summer school of theoretical physics about ‘particle physics’, Les Houches, 1971.
[41] W. Zimmermann, in Lectures on Elementary Particles and Quantum Field Theory, Brandeis Summer Institute in Theoretical Physics, 1970, ed. S. Deser.
November 7, 2002 14:27 WSPC/148-RMP
00149
Reviews in Mathematical Physics, Vol. 14, No. 10 (2002) 1051–1072 c World Scientific Publishing Company
QUASI-CLASSICAL VERSUS NON-CLASSICAL SPECTRAL ASYMPTOTICS FOR MAGNETIC SCHRÖDINGER OPERATORS WITH DECREASING ELECTRIC POTENTIALS
GEORGI D. RAIKOV
Departamento de Matemáticas, Universidad de Chile, Las Palmeras 3425, Casilla 653, Santiago, Chile
[email protected]
SIMONE WARZEL
Institut für Theoretische Physik, Universität Erlangen-Nürnberg, Staudtstrasse 7, D-91058 Erlangen, Germany
[email protected]
Received 25 December 2001
Revised 28 June 2002
We consider the Schrödinger operator $H(V)$ on $L^2(\mathbb{R}^2)$ or $L^2(\mathbb{R}^3)$ with constant magnetic field, and electric potential $V$ which typically decays at infinity exponentially fast or has a compact support. We investigate the asymptotic behaviour of the discrete spectrum of $H(V)$ near the boundary points of its essential spectrum. If the decay of $V$ is Gaussian or faster, this behaviour is non-classical in the sense that it is not described by the quasi-classical formulas known for the case where $V$ admits a power-like decay.
Keywords: Magnetic Schrödinger operators; spectral asymptotics.
Mathematics Subject Classification 2000: 35P20, 47B35
1. Introduction
Let $H(0) := (-i\nabla - A)^2$ be the Schrödinger operator with constant magnetic field of strength $b > 0$, essentially self-adjoint on $C_0^\infty(\mathbb{R}^d)$, $d = 2, 3$. The magnetic potential $A$ is chosen in the form
\[ A(\mathbf{x}) = \begin{cases} \left(-\dfrac{by}{2}, \dfrac{bx}{2}\right) & \text{if } d = 2 , \\[2mm] \left(-\dfrac{by}{2}, \dfrac{bx}{2}, 0\right) & \text{if } d = 3 . \end{cases} \]
In the two-dimensional case we identify the magnetic field with $\frac{\partial A_2}{\partial x} - \frac{\partial A_1}{\partial y} = b$, while in the three-dimensional case we identify it with $\operatorname{curl} A = (0, 0, b)$. Moreover, if $d = 2$, we write $\mathbf{x} = (x, y)\in\mathbb{R}^2$, and if $d = 3$, we write $\mathbf{x} = (X_\perp, z)$ with $X_\perp = (x, y)\in\mathbb{R}^2$ and $z\in\mathbb{R}$. Thus, in the latter case, $z$ is the variable along
the magnetic field, while $X_\perp$ are the variables on the plane perpendicular to it. Introducing the sequence of Landau levels $E_q := (2q + 1)b$, $q\in\mathbb{Z}_+ := \{0, 1, \dots\}$, we recall [7, 3] that
\[ \sigma(H(0)) = \sigma_{\mathrm{ess}}(H(0)) = \begin{cases} \bigcup_{q=0}^{\infty}\{E_q\} & \text{if } d = 2 , \\ [E_0,\infty) & \text{if } d = 3 . \end{cases} \tag{1.1} \]
Here $\sigma(H(0))$ denotes the spectrum of the operator $H(0)$, and $\sigma_{\mathrm{ess}}(H(0))$ denotes its essential spectrum.
Let $V : \mathbb{R}^d\to\mathbb{R}$ be a measurable, non-negative function which decays at infinity in a suitable sense, so that the operator $V^{1/2}H(0)^{-1/2}$ is compact. By Weyl’s theorem, $\sigma_{\mathrm{ess}}(H(0)) = \sigma_{\mathrm{ess}}(H(\pm V))$ where $H(\pm V) := H(0)\pm V$, and $\pm V$ is the electric potential of constant (positive or negative) sign.
The aim of the article is to investigate the behaviour of the discrete spectrum of the operator $H(\pm V)$ near the boundary points of its essential spectrum. This behaviour has been extensively studied in the literature in the case where $V$ admits power-like or slower decay at infinity (see [15, 16, 17, 18], [12, Chaps. 11 and 12]) and also in the special case where $d = 3$ and $V$ is axially symmetric with respect to the magnetic field (see [3, 21]). The novelty in the present paper is that we consider potentials $V$ which decay exponentially fast or have compact support and which at most asymptotically obey a certain symmetry. If $d = 3$, this type of decay of $V$ is supposed to take place in the directions perpendicular to the magnetic field while the decay in the $z$-direction could be much more general (see Theorems 2.3 and 2.4 below). If the decay of $V$ in the $(x, y)$-directions is Gaussian or super-Gaussian, we show that the discrete-spectrum behaviour of $H(\pm V)$ is not described by quasi-classical formulas known for the case of power-like decay.
The results of the present paper have been announced in [20]. After the initial submission of the paper, we became aware of the preprint [19]. It deals with the eigenvalue asymptotics for the Schrödinger and Dirac operators with full-rank magnetic fields, and compactly supported electric potentials of fixed sign. In particular, [19] extends our Theorem 2.2 to the case of full-rank magnetic fields in arbitrary even dimension. The methods of proof applied in [19] are variational ones similar to those used in the present paper.
This paper is organized as follows. In Sec. 2 we formulate our main results. Section 3 is devoted to the analysis of the eigenvalue asymptotics for compact operators of Toeplitz type. Section 4 contains the proofs of the results concerning the two-dimensional case. Finally, the proofs of the results for the three-dimensional case can be found in Sec. 5.
2. Formulation of Main Results
2.1. Basic notation
In order to formulate our main results we need the following notations. Let $T$ be a linear self-adjoint operator. Denote by $P_I(T)$ the spectral projection of $T$
corresponding to the open interval $I\subset\mathbb{R}$. Set
\begin{align*}
N(\lambda_1,\lambda_2;T) &:= \operatorname{rank} P_{(\lambda_1,\lambda_2)}(T) , \qquad \lambda_1,\lambda_2\in\mathbb{R} , \quad \lambda_1 < \lambda_2 , \\
N(\lambda;T) &:= \operatorname{rank} P_{(-\infty,\lambda)}(T) , \qquad \lambda\in\mathbb{R} .
\end{align*}
If $T$ is compact, we will also use the notations
\[ n_\pm(s;T) := \operatorname{rank} P_{(s,\infty)}(\pm T) , \qquad s > 0 . \tag{2.1} \]
(2.2)
lim
Moreover, fix a Landau level Eq , q ∈ Z+ , and an energy E 0 ∈ (Eq , Eq+1 ). (i) If 0 < β < 1, then we have b N (Eq + E, E 0 ; H(V )) = 1/β . E↓0 | ln E|1/β 2µ lim
(2.3)
(ii) If β = 1, then we have 1 N (Eq + E, E 0 ; H(V )) = . E↓0 | ln E| ln(1 + 2µ/b) lim
(2.4)
(iii) If 1 < β < ∞, then we have β N (Eq + E, E 0 ; H(V )) = . E↓0 (ln|lnE|)−1 |lnE| β−1 lim
(2.5)
The proof of Theorem 2.1 can be found in Sec. 4.2. It is evident from this proof that Theorem 2.1(iii) admits the following generalization, as the asymptotic coefficient in (2.5) is independent of $\mu$.
Corollary 2.1. Let $V$ be bounded and non-negative on $\mathbb{R}^2$. Assume that there exist $0 < \mu_1 < \mu_2 < \infty$ and $1 < \beta < \infty$ such that
\[ -\mu_2\le\liminf_{|\mathbf{x}|\to\infty}\frac{\ln V(\mathbf{x})}{|\mathbf{x}|^{2\beta}} , \qquad \limsup_{|\mathbf{x}|\to\infty}\frac{\ln V(\mathbf{x})}{|\mathbf{x}|^{2\beta}}\le -\mu_1 . \]
Then (2.5) remains valid.
The last theorem of this subsection concerns the case where $V$ has a compact support.
Theorem 2.2. Let $V$ be bounded and non-negative on $\mathbb{R}^2$. Assume that the support of $V$ is compact, and there exists a constant $C_- > 0$ such that $V\ge C_-$ on an open non-empty subset of $\mathbb{R}^2$. Moreover, let $q\in\mathbb{Z}_+$ and $E'\in(E_q,E_{q+1})$. Then we have
\[ \lim_{E\downarrow 0}\frac{N(E_q + E,E';H(V))}{(\ln|\ln E|)^{-1}|\ln E|} = 1 . \tag{2.6} \]
The proof of Theorem 2.2 is contained in Sec. 4.3.
Remark 2.1. Under the hypotheses of Theorems 2.1 or 2.2 we have $V\in L^1(\mathbb{R}^2)\cap L^\infty(\mathbb{R}^2)$. It is well-known that this inclusion implies that the operator $V^{1/2}(-\Delta + 1)^{-1/2}$ is compact. Hence, it follows from the diamagnetic inequality (see e.g. [3]) that the operator $V^{1/2}H(0)^{-1/2}$ is compact as well.
For further references, we introduce some additional notation which allows us to unify (2.3)–(2.6) into a single formula. For $\kappa\in(e,\infty)$ define the increasing functions $a^{(\beta)}_\mu$ by
\[ a^{(\beta)}_\mu(\kappa) := \begin{cases} \dfrac{b}{2}\left(\dfrac{\kappa}{\mu}\right)^{1/\beta} & \text{if } 0 < \beta < 1 , \\[3mm] \dfrac{\kappa}{\ln(1 + 2\mu/b)} & \text{if } \beta = 1 , \\[3mm] \dfrac{\beta}{\beta - 1}\,\dfrac{\kappa}{\ln\kappa} & \text{if } 1 < \beta < \infty , \\[3mm] \dfrac{\kappa}{\ln\kappa} & \text{if } \beta = \infty . \end{cases} \tag{2.7} \]
Then asymptotic relations (2.3)–(2.6) can be re-written as
\[ \lim_{E\downarrow 0}\frac{N(E_q + E,E';H(V))}{a^{(\beta)}_\mu(|\ln E|)} = 1 , \qquad 0 < \beta\le\infty . \tag{2.8} \]
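The four regimes in (2.7) are easy to mix up, so the following sketch evaluates $a^{(\beta)}_\mu(\kappa)$ directly from the case distinction. It is only an illustration of the formula: the function name `a_coeff` and the convention of passing `math.inf` for the compact-support case $\beta = \infty$ are assumptions made here, not notation from the paper.

```python
import math


def a_coeff(kappa: float, b: float, mu: float, beta: float) -> float:
    """Evaluate a_mu^(beta)(kappa) from (2.7).

    kappa must exceed e; beta = math.inf encodes the compact-support case.
    For beta > 1 the value does not depend on mu (cf. Remark 2.2).
    """
    if kappa <= math.e:
        raise ValueError("kappa must lie in (e, infinity)")
    if beta <= 0:
        raise ValueError("beta must be positive")
    if beta < 1:
        # sub-Gaussian decay: power of kappa, quasi-classical regime
        return 0.5 * b * (kappa / mu) ** (1.0 / beta)
    if beta == 1:
        # Gaussian decay: border-line case
        return kappa / math.log1p(2.0 * mu / b)
    if math.isinf(beta):
        # compact support
        return kappa / math.log(kappa)
    # super-Gaussian decay
    return beta / (beta - 1.0) * kappa / math.log(kappa)


if __name__ == "__main__":
    kappa = abs(math.log(1e-40))  # kappa = |ln E| for E = 1e-40
    for beta in (0.5, 1.0, 2.0, math.inf):
        print(beta, a_coeff(kappa, b=2.0, mu=1.0, beta=beta))
```

As $\beta\to\infty$ the prefactor $\beta/(\beta - 1)$ tends to 1, so the super-Gaussian branch approaches the compact-support branch continuously.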
Remark 2.2. Whenever we refer to functions (2.7) with $1 < \beta\le\infty$, we will write $a^{(\beta)}(\kappa)$ instead of $a^{(\beta)}_\mu(\kappa)$ because in this case they are independent of $\mu$.
Let us discuss the results of Theorems 2.1 and 2.2.
(1) Asymptotic relation (2.8) describes the behaviour of the infinite sequence of discrete eigenvalues of the operator $H(V)$ accumulating to the Landau level $E_q$, $q\in\mathbb{Z}_+$, from the right. Analogous results hold if we consider the eigenvalues of $H(-V)$ accumulating to $E_q$ from the left. Namely, (2.8) remains valid if we replace $N(E_q + E,E';H(V))$ by $N(E'',E_q - E;H(-V))$ with some $E''\in(E_{q-1},E_q)$ if $q > 0$, or by $N(E_0 - E;H(-V))$ if $q = 0$.
(2) Introduce the quasi-classical quantity
\[ N_{\mathrm{cl}}(E) := \frac{b}{2\pi}\,\big|\{\mathbf{x}\in\mathbb{R}^2 \mid V(\mathbf{x}) > E\}\big| , \qquad E > 0 , \]
where $|\cdot|$ denotes the Lebesgue measure. If $V\ge 0$ satisfies the asymptotics $V(\mathbf{x}) = v(\mathbf{x}/|\mathbf{x}|)\,|\mathbf{x}|^{-\alpha}(1 + o(1))$ as $|\mathbf{x}|\to\infty$ with some $v\in C(\mathbb{S}^1)$, $v > 0$, and some $0 < \alpha < \infty$, then $\lim_{E\downarrow 0}E^{2/\alpha}N_{\mathrm{cl}}(E) = \frac{b}{4\pi}\int_{\mathbb{S}^1}v(s)^{2/\alpha}\,ds$, and it has been shown that
\[ \lim_{E\downarrow 0}\frac{N(E_q + E,E';H(V))}{N_{\mathrm{cl}}(E)} = 1 , \tag{2.9} \]
assuming some regularity of $N_{\mathrm{cl}}(E)$ as $E\downarrow 0$ (see [15, Theorem 2.6], [12, Chap. 11]). On the other hand, if $V$ satisfies the assumptions of Theorem 2.1, then
\[ \lim_{E\downarrow 0}\frac{N_{\mathrm{cl}}(E)}{|\ln E|^{1/\beta}} = \frac{b}{2\mu^{1/\beta}} , \qquad 0 < \beta < \infty , \]
and if $V$ satisfies the assumptions of Theorem 2.2, then
\[ N_{\mathrm{cl}}(E) = O(1) , \qquad E\downarrow 0 . \]
Comparing (2.8) and (2.9), we see that they are different if and only if $1\le\beta\le\infty$. In case $\beta = 1$ the asymptotic orders of (2.8) and (2.9) coincide but their coefficients differ, although they have the same main asymptotic term in the strong magnetic field regime $b\to\infty$. In brief, asymptotic relation (2.8) is quasi-classical for potentials $V$ whose decay is slower than Gaussian ($0 < \beta < 1$), and it is non-classical for potentials whose decay is faster than Gaussian ($1 < \beta\le\infty$), while the Gaussian decay ($\beta = 1$) of $V$ is the border-line case. A similar transition from quasi-classical to non-classical behaviour as a function of the decay of the single-site potential, with Gaussian decay as the border-line case, has been detected in [10]. There the leading low-energy fall-off of the integrated density of states of a charged quantum particle in $\mathbb{R}^2$ subject to a perpendicular constant magnetic field and repulsive impurities randomly distributed according to Poisson’s law has been considered.
(3) The assumptions of Theorems 2.1 and 2.2 that $V$ be bounded and non-negative are not quite essential. For example, both theorems remain valid if we consider potentials $|\mathbf{x}|^{-\alpha}V(\mathbf{x})$ where $0 < \alpha < 2$, and $V$ satisfies the hypotheses of Theorems 2.1 or 2.2. Similarly, Theorem 2.1 holds also in the case where $V$ is allowed to change sign on a compact subset of $\mathbb{R}^2$.
(4) Let $\pi(\lambda)$ be the number of primes less than $\lambda > 0$. It is well-known that
\[ \lim_{\lambda\to\infty}\frac{\pi(\lambda)}{(\ln\lambda)^{-1}\lambda} = 1 \]
(see e.g. [9, Sec. 1.8, Theorem 6]). Hence, (2.6) can be re-written as
\[ \lim_{E\downarrow 0}\frac{N(E_q + E,E';H(V))}{\pi(|\ln E|)} = 1 . \]
2.3. Main results for three dimensions

In this subsection we formulate our main results concerning the case $d = 3$. In this case we will analyze the behaviour of $N(E_0 - E; H(-V))$ as $E \downarrow 0$. In order to define properly the operator $H(-V)$ we need the following lemma.

Lemma 2.1. Let $U \in L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$, and $v \in L^1(\mathbb{R})$. Assume that $0 \le V(X_\perp, z) \le U(X_\perp) v(z)$, $X_\perp \in \mathbb{R}^2$, $z \in \mathbb{R}$. Then the operator $V^{1/2} H(0)^{-1/2}$ is compact.

The proof of the lemma is elementary. Nevertheless, for the reader's convenience we include it in Sec. 5.2. Denote by $H(-V)$ the self-adjoint operator generated in $L^2(\mathbb{R}^3)$ by the quadratic form
\[ \int_{\mathbb{R}^3} \{ |i\nabla u + Au|^2 - V|u|^2 \}\, dx, \quad u \in D(H(0)^{1/2}), \]
which is closed and lower bounded in $L^2(\mathbb{R}^3)$ since the operator $V^{1/2} H(0)^{-1/2}$ is compact by Lemma 2.1.

Theorem 2.3. Let $0 < \mu < \infty$ and $0 < \beta < \infty$. Assume that there exist a constant $C > 0$ and a function $v \in L^1(\mathbb{R}; (1 + |z|)\, dz)$, which does not vanish identically, such that
\[ 0 \le V(x) \le C v(z), \quad x = (X_\perp, z) \in \mathbb{R}^3. \]
Moreover, suppose that for every $\delta > 0$ there exist a constant $r_\delta > 0$ and two non-negative functions $v_\delta^\pm \in L^1(\mathbb{R}; (1 + |z|)\, dz)$, which do not vanish identically, such that
\[ e^{-\delta |X_\perp|^{2\beta}} v_\delta^-(z) \le e^{\mu |X_\perp|^{2\beta}} V(X_\perp, z) \le e^{\delta |X_\perp|^{2\beta}} v_\delta^+(z) \]
for all $|X_\perp| \ge r_\delta$ and all $z \in \mathbb{R}$. Then we have
\[ \lim_{E \downarrow 0} \frac{N(E_0 - E; H(-V))}{a_\mu^{(\beta)}(|\ln \sqrt{E}|)} = 1. \tag{2.10} \]
The proof of Theorem 2.3 can be found in Sec. 5.4. Our last theorem treats the case where the projection of the support of $V$ onto the plane perpendicular to the magnetic field is compact. Denote by $\chi_{r, X_\perp'} : \mathbb{R}^2 \to \mathbb{R}$ the characteristic function of the disk $\{X_\perp \in \mathbb{R}^2 \mid |X_\perp - X_\perp'| < r\}$ of radius $r > 0$, centered at $X_\perp' \in \mathbb{R}^2$. If $X_\perp' = 0$, we will write $\chi_r$ instead of $\chi_{r,0}$.

Theorem 2.4. Assume that there exist four constants $r_\pm > 0$, $X_\perp^\pm \in \mathbb{R}^2$, and two non-negative functions $v^\pm \in L^1(\mathbb{R}; (1 + |z|)\, dz)$, which do not vanish identically, such that $V$ obeys the estimates
\[ \chi_{r_-, X_\perp^-}(X_\perp)\, v^-(z) \le V(x) \le \chi_{r_+, X_\perp^+}(X_\perp)\, v^+(z), \quad x = (X_\perp, z) \in \mathbb{R}^3. \]
Then we have
\[ \lim_{E \downarrow 0} \frac{N(E_0 - E; H(-V))}{a^{(\infty)}(|\ln \sqrt{E}|)} = 1. \tag{2.11} \]
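The proof of Theorem 2.4 is contained in Sec. 5.5. Let us discuss briefly the above results.

(1) In particular, Theorem 2.3 covers bounded negative potentials $-V$ which decay at infinity exponentially fast, i.e.
\[ \lim_{|x| \to \infty} \frac{\ln V(x)}{|x|^{2\beta}} = -\mu, \tag{2.12} \]
with some $0 < \beta < \infty$ and $0 < \mu < \infty$.

(2) Assume that $V \ge 0$ satisfies the asymptotics $V(x) = v(x/|x|)\, |x|^{-\alpha} (1 + o(1))$ as $|x| \to \infty$ with some $v \in C(S^2)$, $v > 0$, and some $2 < \alpha < \infty$. For $E > 0$ set
\[ \tilde{N}_{\mathrm{cl}}(E) := \frac{b}{2\pi} \left| \left\{ X_\perp \in \mathbb{R}^2 \;\middle|\; \int_{\mathbb{R}} V(X_\perp, z)\, dz > 2\sqrt{E} \right\} \right|, \]
where $|\cdot|$ denotes the Lebesgue measure. Under some supplementary regularity assumptions concerning the behaviour of $\tilde{N}_{\mathrm{cl}}(E)$ as $E \downarrow 0$ we have
\[ \lim_{E \downarrow 0} \frac{N(E_0 - E; H(-V))}{\tilde{N}_{\mathrm{cl}}(E)} = 1 \tag{2.13} \]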
(see [17], [18, Theorem 1(ii)], [15, Theorem 2.4(i)], [12, Chap. 12]). Theorem 2.3 shows that (2.13) remains valid if the decay of $V$ is slower than Gaussian in the sense that (2.12) holds with $0 < \beta < 1$. On the other hand, if this decay is Gaussian or faster in the sense that (2.12) holds with $\beta = 1$ or $1 < \beta \le \infty$, the asymptotics of $N(E_0 - E; H(-V))$ as $E \downarrow 0$ differs from (2.13).

3. Spectra of Auxiliary Operators of Toeplitz Type

3.1. Landau Hamiltonian and angular-momentum eigenstates

Let $d = 2$. In this case, by (1.1) the spectrum of $H(0)$ consists of the eigenvalues $E_q$, $q \in \mathbb{Z}_+$, which are of infinite multiplicity. Denote by $P_q$, $q \in \mathbb{Z}_+$, the spectral projection of $H(0)$ corresponding to the eigenvalue $E_q$. Our next goal is to introduce convenient orthonormal bases of the subspaces $P_q L^2(\mathbb{R}^2)$. For $x = (x, y) \in \mathbb{R}^2$, $q \in \mathbb{Z}_+$, and $k \in \mathbb{Z}_+ - q := \{-q, -q + 1, \ldots\}$ set
\[ \varphi_{q,k}(x) := \sqrt{\frac{q!}{(k+q)!}} \left[ \sqrt{\frac{b}{2}}\, (x + iy) \right]^k L_q^{(k)}\!\left( \frac{b|x|^2}{2} \right) \sqrt{\frac{b}{2\pi}}\, \exp\!\left( -\frac{b|x|^2}{4} \right) \tag{3.1} \]
where
\[ L_q^{(\alpha)}(\xi) := \sum_{m=0}^{q} \binom{q+\alpha}{q-m} \frac{(-\xi)^m}{m!}, \quad \xi \ge 0, \tag{3.2} \]
are the generalized Laguerre polynomials (see e.g. [8, Sec. 8.97]) which are defined in terms of the binomial coefficients $\binom{\alpha}{m} := \alpha(\alpha - 1) \cdots (\alpha - m + 1)/m!$ if $m \in \mathbb{Z}_+ \setminus \{0\}$, and $\binom{\alpha}{0} := 1$, for all $\alpha \in \mathbb{R}$. It is well-known that the functions $\varphi_{q,k}$, $k \in \mathbb{Z}_+ - q$, constitute an orthonormal basis in the $q$th Landau-level eigenspace $P_q L^2(\mathbb{R}^2)$, $q \in \mathbb{Z}_+$ (see e.g. [7, 11]). In fact, $\varphi_{q,k}$ is also an eigenfunction of the angular-momentum operator $-i(x\,\partial/\partial y - y\,\partial/\partial x)$ with eigenvalue $k$.

For further references we establish some useful properties of the Laguerre polynomials $L_q^{(\alpha)}$. We first recall [1, Sec. 22.2.12] their orthogonality relation
\[ \int_0^\infty \xi^\alpha e^{-\xi} L_q^{(\alpha)}(\xi)\, L_{q'}^{(\alpha)}(\xi)\, d\xi = \frac{\Gamma(\alpha + q + 1)}{q!}\, \delta_{q,q'} \tag{3.3} \]
valid for all $q, q' \in \mathbb{Z}_+$ and $\alpha > -1$. Here we have introduced Kronecker's delta $\delta_{q,q'}$ and Euler's gamma function $\Gamma(s) := \int_0^\infty t^{s-1} e^{-t}\, dt$, $s > 0$, such that $\Gamma(k + 1) = k!$ if $k \in \mathbb{Z}_+$, see e.g. [1, Chap. 6].

Lemma 3.1. Let $q \in \mathbb{Z}_+$. Then
\[ |L_q^{(k)}(\xi)| \le (k + q)^q\, e^{\xi/(k+q)} \tag{3.4} \]
holds for all $\xi \ge 0$ and all $k \ge 1 - q$. Moreover, one has the uniform convergence
\[ \lim_{k \to \infty} k^{-q} L_q^{(k)}(k\xi) = \frac{(1 - \xi)^q}{q!} \tag{3.5} \]
for all $0 \le \xi \le 1$.

Remark 3.1. An immediate consequence of (3.5) is the following lower bound on the pre-limit expression
\[ k^{-q} L_q^{(k)}(k\xi) \ge \frac{(1 - \xi_0)^q}{2\, q!} \tag{3.6} \]
which is valid for all $0 \le \xi \le \xi_0 < 1$ and sufficiently large $k$.

Proof of Lemma 3.1. The rough upper bound (3.4) is taken from [11, Eq. (42)]. For a proof of (3.5) we use (3.2) to obtain
\[ k^{-q} L_q^{(k)}(k\xi) = \sum_{m=0}^{q} k^{m-q} \binom{q+k}{q-m} \frac{(-\xi)^m}{m!}. \tag{3.7} \]
Asymptotic relation [1, Eq. 6.1.46] entails
\[ \lim_{k \to \infty} k^{m-q}\, \frac{\Gamma(k + q)}{\Gamma(k + m)} = 1. \tag{3.8} \]
The r.h.s. of (3.7) thus converges (uniformly on $[0, 1]$) towards $\sum_{m=0}^{q} \binom{q}{m} (-\xi)^m / q! = (1 - \xi)^q / q!$ by the binomial formula.

For $x, x' \in \mathbb{R}^2$ denote by $K_q(x, x') := \sum_{k=-q}^{\infty} \varphi_{q,k}(x)\, \overline{\varphi_{q,k}(x')}$ the integral kernel of the projection $P_q$, $q \in \mathbb{Z}_+$. It is well-known that
\[ K_q(x, x') = \frac{b}{2\pi}\, L_q^{(0)}\!\left( \frac{b|x - x'|^2}{2} \right) \exp\!\left( -\frac{b}{4} \left( |x - x'|^2 + 2i(x'y - xy') \right) \right) \tag{3.9} \]
(see e.g. [11]). Note that we have
\[ K_q(x, x) = \frac{b}{2\pi}, \quad x \in \mathbb{R}^2, \quad q \in \mathbb{Z}_+. \tag{3.10} \]
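The two statements of Lemma 3.1 can be probed numerically. The following minimal Python sketch evaluates the finite sum (3.2) for integer $k \ge -q$ (so that the binomial coefficient reduces to `math.comb`), checks the rough bound (3.4) on a small grid, and compares the pre-limit expression in (3.5) with its limit. All function names are hypothetical helpers chosen for illustration, and the grid and tolerances are assumptions, not values from the paper.

```python
import math

def laguerre(q: int, k: int, xi: float) -> float:
    """Generalized Laguerre polynomial L_q^{(k)}(xi) via the finite sum (3.2).

    Restricted to integers k >= -q, so the binomial coefficient is math.comb.
    """
    if k < -q:
        raise ValueError("this sketch only handles integers k >= -q")
    return sum(
        math.comb(q + k, q - m) * (-xi) ** m / math.factorial(m)
        for m in range(q + 1)
    )

def rough_bound(q: int, k: int, xi: float) -> float:
    """Right-hand side of (3.4); requires k + q >= 1."""
    return (k + q) ** q * math.exp(xi / (k + q))

def scaled(q: int, k: int, xi: float) -> float:
    """Pre-limit expression k^{-q} L_q^{(k)}(k xi) appearing in (3.5)."""
    return laguerre(q, k, k * xi) / k ** q

def limit_value(q: int, xi: float) -> float:
    """Limit (1 - xi)^q / q! on the right-hand side of (3.5)."""
    return (1.0 - xi) ** q / math.factorial(q)

if __name__ == "__main__":
    for k in (10, 100, 1000, 10**6):
        print(k, scaled(2, k, 0.3), limit_value(2, 0.3))
```

In this sketch the deviation from the limit shrinks roughly like $1/k$, which is consistent with the use of (3.8) in the proof above.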
3.2. Compact operators of Toeplitz type

In this subsection we investigate the eigenvalue asymptotics of auxiliary compact operators of Toeplitz type $P_q F P_q$ where $q \in \mathbb{Z}_+$ and $F$ is the multiplier by a real-valued function. The results obtained here will be essentially employed in the proofs of Theorems 2.1–2.4. First of all, note that $P_q F P_q = e^{2(2q+1)bt} P_q e^{-tH(0)} F e^{-tH(0)} P_q$, $t > 0$, $q \in \mathbb{Z}_+$. Hence, the diamagnetic inequality implies that $P_q F P_q$ is compact if the operator $|F|^{1/2} e^{\Delta t}$ (or, equivalently, $e^{\Delta t} |F|^{1/2}$) is compact for some $t > 0$ (see [3, Theorems 2.2, 2.3]). In particular, the following lemma holds.

Lemma 3.2 ([15, Lemma 5.1]). Let $F$ be real-valued and $F \in L^p(\mathbb{R}^2)$ for some $p \ge 1$. Then the operator $P_q F P_q$, $q \in \mathbb{Z}_+$, is self-adjoint and compact.

Lemma 3.3. Let $F : \mathbb{R}^2 \to \mathbb{R}$ satisfy the conditions of Lemma 3.2. Suppose in addition that $F$ is radially symmetric with respect to the origin, and bounded. Then the eigenvalues of the operator $P_q F P_q$ with domain $P_q L^2(\mathbb{R}^2)$, $q \in \mathbb{Z}_+$, are given by
\[ \langle F \varphi_{q,k}, \varphi_{q,k} \rangle = \frac{q!}{(k+q)!} \int_0^\infty F\big( (\sqrt{2\xi/b}, 0) \big)\, e^{-\xi}\, \xi^k\, L_q^{(k)}(\xi)^2\, d\xi, \quad k \in \mathbb{Z}_+ - q, \tag{3.11} \]
where $\langle \cdot, \cdot \rangle$ denotes the scalar product in $L^2(\mathbb{R}^2)$.

Proof. It suffices to take into account (3.1) and the radial symmetry of $F$.

Remark 3.2. Evidently, Lemma 3.3 is valid under more general assumptions. In particular, the boundedness condition is unnecessarily restrictive. However, we state the lemma in a simple form which is sufficient for our purposes.

3.3. Two examples of explicit eigenvalue asymptotics

For $x \in \mathbb{R}^2$ set $G_\mu^{(\beta)}(x) := \exp(-\mu |x|^{2\beta})$ where $0 < \mu < \infty$ and $0 < \beta < \infty$. According to Lemma 3.3 the eigenvalues of $P_q G_\mu^{(\beta)} P_q$ on $P_q L^2(\mathbb{R}^2)$ are given by
\[ \gamma_{q,k}^{(\beta)}(\mu) := \langle G_\mu^{(\beta)} \varphi_{q,k}, \varphi_{q,k} \rangle, \quad k \in \mathbb{Z}_+ - q. \tag{3.12} \]
Let $(a_\mu^{(\beta)})^{-1}$ denote the inverse function of $a_\mu^{(\beta)}$ defined in (2.7). Evidently,
\[ (a_\mu^{(\beta)})^{-1}(k) = \begin{cases} \mu\, (2k/b)^\beta & \text{if } 0 < \beta < 1, \\ k \ln(1 + 2\mu/b) & \text{if } \beta = 1. \end{cases} \tag{3.13} \]
Moreover, it is straightforward to verify that
\[ \lim_{k \to \infty} \frac{(a_\mu^{(\beta)})^{-1}(k)}{k \ln k} = \begin{cases} \dfrac{\beta - 1}{\beta} & \text{if } 1 < \beta < \infty, \\ 1 & \text{if } \beta = \infty. \end{cases} \tag{3.14} \]
The next proposition treats the asymptotics of $\gamma_{q,k}^{(\beta)}(\mu)$, $q \in \mathbb{Z}_+$, as $k \to \infty$. For $q = 0$ and $0 < \beta \le 1/2$ closely related asymptotic evaluations can be found in [21, Appendix].

Proposition 3.1. Let $q \in \mathbb{Z}_+$, $0 < \mu < \infty$, and $0 < \beta < \infty$. Then we have
\[ \lim_{k \to \infty} \frac{\ln \gamma_{q,k}^{(\beta)}(\mu)}{(a_\mu^{(\beta)})^{-1}(k)} = -1. \tag{3.15} \]

Proof. From (3.12) and Lemma 3.3 it follows that
\[ \gamma_{q,k}^{(\beta)}(\mu) = \frac{q!\, k!}{(k+q)!}\, J^{(\beta)}\big( k, \mu (2/b)^\beta \big) \tag{3.16} \]
where we have introduced the notation
\[ J^{(\beta)}(k, \lambda) := \frac{1}{k!} \int_0^\infty \xi^k\, e^{-\lambda \xi^\beta - \xi}\, L_q^{(k)}(\xi)^2\, d\xi. \]
Thanks to asymptotic relation (3.8) it remains to study the asymptotic behaviour of $J^{(\beta)}$ for large values of its first argument. For this purpose we distinguish three cases.

Case $0 < \beta < 1$. The claim follows from (3.8) and (3.13) with $0 < \beta < 1$, together with the asymptotic relation
\[ \lim_{k \to \infty} \frac{\ln J^{(\beta)}(k, \lambda)}{k^\beta} = -\lambda \tag{3.17} \]
valid for $\lambda > 0$ in this case. For a proof of (3.17) we construct asymptotically coinciding lower and upper bounds. To obtain a lower bound we suppose $k > -1$. The orthogonality relation (3.3) implies that $\xi^k e^{-\xi} L_q^{(k)}(\xi)^2\, q!/(k+q)!\, d\xi$ induces a probability measure on $[0, \infty)$ such that Jensen's inequality [14] yields
\[ J^{(\beta)}(k, \lambda) \ge \frac{(k+q)!}{k!\, q!} \exp\!\left( -\lambda\, \frac{q!}{(k+q)!} \int_0^\infty \xi^{k+\beta} e^{-\xi} L_q^{(k)}(\xi)^2\, d\xi \right). \tag{3.18} \]
We may now employ the combinatorial identity $L_q^{(k)}(\xi) = \sum_{m=0}^{q} \binom{m - \beta - 1}{m} L_{q-m}^{(k+\beta)}(\xi)$ [8, Eq. 8.974(2)], which implies that
\[ \begin{aligned} \frac{q!}{(k+q)!} \int_0^\infty \xi^{k+\beta} e^{-\xi} L_q^{(k)}(\xi)^2\, d\xi &= \frac{q!}{(k+q)!} \sum_{m,l=0}^{q} \binom{m-\beta-1}{m} \binom{l-\beta-1}{l} \int_0^\infty \xi^{k+\beta} e^{-\xi} L_{q-m}^{(k+\beta)}(\xi)\, L_{q-l}^{(k+\beta)}(\xi)\, d\xi \\ &= \sum_{m=0}^{q} \binom{m-\beta-1}{m}^{2} \frac{q!}{(q-m)!}\, \frac{\Gamma(k + q - m + \beta + 1)}{\Gamma(k + q + 1)}. \end{aligned} \tag{3.19} \]
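Here we have again used the orthogonality relation (3.3) in the last step. Using (3.8) this entails $\liminf_{k \to \infty} k^{-\beta} \ln J^{(\beta)}(k, \lambda) \ge -\lambda$. For the upper bound we suppose $k + q > 2$ and choose $\Xi_k$ as the (unique) maximum of the integrand in the r.h.s. of the estimate
\[ J^{(\beta)}(k, \lambda) \le \frac{(k+q)^{2q}}{k!} \int_0^\infty \xi^k\, e^{-\lambda \xi^\beta - (1 - 2/(k+q))\xi}\, d\xi \tag{3.20} \]
which was obtained by using (3.4). More precisely, we define $\Xi_k$ as the (unique) solution of the equation $\lambda \beta \Xi_k^\beta + (1 - 2/(k+q)) \Xi_k = k$. Splitting the integration in (3.20) into two parts with domain of integration restricted to $[0, \Xi_k)$ and $[\Xi_k, \infty)$, the two parts are estimated separately as follows. Using monotonicity of the integrand on $[0, \Xi_k)$ we obtain the bound
\[ \begin{aligned} \frac{1}{k!} \int_0^{\Xi_k} \xi^k\, e^{-\lambda \xi^\beta - (1 - 2/(k+q))\xi}\, d\xi &\le \frac{\Xi_k^{k+1}}{k!} \exp\!\big[ -\lambda \Xi_k^\beta - (1 - 2/(k+q)) \Xi_k \big] \\ &= \Xi_k\, \frac{k^k}{k!} \exp\!\big[ k \ln[\Xi_k/k] - (1 - 2/(k+q)) \Xi_k - \lambda \Xi_k^\beta \big] \\ &\le \Xi_k\, \frac{k^k}{k!}\, e^{-k} \exp\!\big[ -\lambda \Xi_k^\beta + 2\Xi_k/(k+q) \big] \end{aligned} \tag{3.21} \]
on the first part. For the last inequality we have used the fact that $\ln \xi \le \xi - 1$ for all $\xi > 0$. The second part is bounded according to
\[ \begin{aligned} \frac{1}{k!} \int_{\Xi_k}^\infty \xi^k\, e^{-\lambda \xi^\beta - (1 - 2/(k+q))\xi}\, d\xi &\le \exp[-\lambda \Xi_k^\beta]\, \frac{1}{k!} \int_0^\infty \xi^k\, e^{-(1 - 2/(k+q))\xi}\, d\xi \\ &= (1 - 2/(k+q))^{-k-1} \exp[-\lambda \Xi_k^\beta]. \end{aligned} \tag{3.22} \]
The sandwiching bounds $1 - \lambda \beta k^{\beta - 1} \le (1 - 2/(k+q)) \Xi_k / k \le 1$ imply $\lim_{k \to \infty} \Xi_k / k = 1$. Using this in (3.21) and (3.22), employing Stirling's asymptotic formula [1, Eq. 6.1.37]
\[ \lim_{k \to \infty} \frac{k^{k - 1/2}\, e^{-k}}{\Gamma(k)} = (2\pi)^{-1/2}, \tag{3.23} \]
and the fact that $\lim_{k \to \infty} (1 + 2/k)^k = e^2$, we obtain $\limsup_{k \to \infty} k^{-\beta} \ln J^{(\beta)}(k, \lambda) \le -\lambda$. This concludes the proof of (3.17).

Case $\beta = 1$. An explicit calculation yields
\[ \begin{aligned} J^{(1)}(k, \lambda) &= \frac{1}{k!} \int_0^\infty \xi^k\, e^{-(1+\lambda)\xi}\, L_q^{(k)}(\xi)^2\, d\xi \\ &= \frac{1}{k!} \sum_{m,l=0}^{q} \binom{q+k}{q-m} \binom{q+k}{q-l} \frac{(-1)^{m+l}}{m!\, l!} \int_0^\infty \xi^{k+m+l}\, e^{-(1+\lambda)\xi}\, d\xi \\ &= \sum_{m,l=0}^{q} \binom{q+k}{q-m} \binom{q+k}{q-l} \frac{(-1)^{m+l}}{m!\, l!}\, \frac{(k+l+m)!}{k!}\, (1+\lambda)^{-k-m-l-1}. \end{aligned} \tag{3.24} \]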
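Using (3.8) and proceeding similarly as in the second part of the proof of Lemma 3.1 one shows that the r.h.s. is asymptotically equal to
\[ (1+\lambda)^{-k-1}\, \frac{k^{2q}}{(q!)^2} \left[ \sum_{m=0}^{q} \binom{q}{m} \frac{(-1)^m}{(1+\lambda)^m} \right]^2 = (1+\lambda)^{-k-2q-1}\, \frac{(\lambda k)^{2q}}{(q!)^2} \tag{3.25} \]
which in turn implies that $\lim_{k \to \infty} k^{-1} \ln J^{(\beta)}(k, \lambda) = -\ln(1 + \lambda)$.

Case $1 < \beta < \infty$. The claim follows from (3.8) and (3.14) together with the asymptotic relation
\[ \lim_{k \to \infty} \frac{\ln J^{(\beta)}(k, \lambda)}{k \ln k} = -\frac{\beta - 1}{\beta} \tag{3.26} \]
valid for $\lambda > 0$ in this case. For a proof of (3.26) we construct asymptotically coinciding lower and upper bounds. The lower bound reads
\[ \begin{aligned} J^{(\beta)}(k, \lambda) &\ge e^{-\lambda k - k^{1/\beta}}\, \frac{1}{k!} \int_0^{k^{1/\beta}} \xi^k\, L_q^{(k)}(\xi)^2\, d\xi \\ &\ge e^{-\lambda k - k^{1/\beta}}\, \frac{k^{k+1}}{k!} \int_0^{k^{(1-\beta)/\beta}} \xi^k\, L_q^{(k)}(k\xi)^2\, d\xi \\ &\ge e^{-\lambda k - k^{1/\beta}}\, \frac{k^{k+1}}{(k+1)!}\, k^{\frac{1-\beta}{\beta}(k+1)}\, \frac{k^{2q}}{4^{q+1} (q!)^2}. \end{aligned} \tag{3.27} \]
Here the last inequality follows from (3.6) with $\xi_0 = 1/2$, and is valid for sufficiently large $k$ only. Using Stirling's asymptotic formula (3.23) in (3.27), we obtain $\liminf_{k \to \infty} (k \ln k)^{-1} \ln J^{(\beta)}(k, \lambda) \ge \frac{1-\beta}{\beta}$. For the upper bound we suppose $k + q > 2$ and use (3.4) in order to estimate the integrand in (3.20) from above. Thus we obtain
\[ J^{(\beta)}(k, \lambda) \le \frac{(k+q)^{2q}}{k!} \int_0^\infty \xi^k\, e^{-\lambda \xi^\beta}\, d\xi = \frac{(k+q)^{2q}}{\beta\, \lambda^{(k+1)/\beta}\, k!}\, \Gamma\!\left( \frac{k+1}{\beta} \right). \tag{3.28} \]
Stirling's formula (3.23) finally yields $\limsup_{k \to \infty} (k \ln k)^{-1} \ln J^{(\beta)}(k, \lambda) \le \frac{1-\beta}{\beta}$.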
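For the border-line case $\beta = 1$ the double sum (3.24) can be evaluated exactly, which gives a direct check of the leading term (3.25) and of the limit $k^{-1} \ln J^{(1)}(k, \lambda) \to -\ln(1 + \lambda)$. The sketch below is a minimal illustration under stated assumptions: $k \ge 0$ is an integer, $\lambda$ is passed as a rational number so that `fractions.Fraction` keeps the arithmetic exact, and all function names are hypothetical helpers that do not appear in the paper.

```python
import math
from fractions import Fraction

def j_beta_one(k: int, lam: Fraction, q: int) -> Fraction:
    """Exact value of J^{(1)}(k, lam) from the double sum (3.24), integer k >= 0."""
    base = 1 + lam
    total = Fraction(0)
    for m in range(q + 1):
        for l in range(q + 1):
            coeff = math.comb(q + k, q - m) * math.comb(q + k, q - l)
            sign = -1 if (m + l) % 2 else 1
            # (k + l + m)! / k! is an integer, so floor division is exact here.
            rising = math.factorial(k + l + m) // math.factorial(k)
            total += (
                Fraction(sign * coeff * rising, math.factorial(m) * math.factorial(l))
                * base ** (-(k + m + l + 1))
            )
    return total

def leading_term(k: int, lam: Fraction, q: int) -> Fraction:
    """Asymptotic expression (3.25): (1+lam)^(-k-2q-1) (lam k)^(2q) / (q!)^2."""
    return (1 + lam) ** (-(k + 2 * q + 1)) * (lam * k) ** (2 * q) / math.factorial(q) ** 2

def log_fraction(x: Fraction) -> float:
    """Natural logarithm of a positive Fraction, avoiding float underflow."""
    return math.log(x.numerator) - math.log(x.denominator)

if __name__ == "__main__":
    lam = Fraction(1, 2)
    for k in (50, 500, 2000):
        value = j_beta_one(k, lam, 1)
        print(k, float(value / leading_term(k, lam, 1)), log_fraction(value) / k)
```

For $q = 0$ the sum collapses to $(1 + \lambda)^{-k-1}$, so in that case the limit is attained up to a correction of order $1/k$. For $q \ge 1$ the polynomial prefactor $(\lambda k)^{2q}$ slows the convergence of $k^{-1} \ln J^{(1)}$ to order $(\ln k)/k$.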
The last topic in this section is the derivation of an asymptotic property of the eigenvalues
\[ \nu_{q,k}(r) := \langle \chi_r \varphi_{q,k}, \varphi_{q,k} \rangle, \quad k \in \mathbb{Z}_+ - q, \quad q \in \mathbb{Z}_+, \quad r > 0, \tag{3.29} \]
of the operator $P_q \chi_r P_q$ (see Lemma 3.3).

Proposition 3.2. Let $q \in \mathbb{Z}_+$ and $r > 0$. Then we have
\[ \lim_{k \to \infty} \frac{\ln \nu_{q,k}(r)}{k \ln k} = -1. \tag{3.30} \]
Remark 3.3. It follows from (3.30), (3.15), (3.13), and (3.14) with $\beta < \infty$, that
\[ \nu_{q,k}(r) = o\big( \gamma_{q,k}^{(\beta)}(\mu) \big), \quad k \to \infty, \tag{3.31} \]
for all $0 < \mu < \infty$ and $0 < \beta < \infty$.

Proof of Proposition 3.2. From Lemma 3.3 it follows that
\[ \nu_{q,k}(r) = \frac{q!}{(k+q)!} \int_0^{br^2/2} \xi^k\, e^{-\xi}\, L_q^{(k)}(\xi)^2\, d\xi. \tag{3.32} \]
In its turn, the integral in (3.32) is estimated as follows:
\[ \begin{aligned} \int_0^{br^2/2} \xi^k\, e^{-\xi}\, L_q^{(k)}(\xi)^2\, d\xi &\ge e^{-br^2/2}\, k^{k+1} \int_0^{br^2/(2k)} \xi^k\, L_q^{(k)}(k\xi)^2\, d\xi \\ &\ge e^{-br^2/2}\, \frac{k^{k+1}}{k+1} \left( \frac{br^2}{2k} \right)^{k+1} \frac{k^{2q}}{4^{q+1} (q!)^2}. \end{aligned} \tag{3.33} \]
Here the last inequality again is implied by (3.6), and is valid for sufficiently large $k$. Moreover, we may use (3.4) to estimate
\[ \int_0^{br^2/2} \xi^k\, e^{-\xi}\, L_q^{(k)}(\xi)^2\, d\xi \le (k+q)^{2q} \int_0^{br^2/2} \xi^k\, e^{-(1 - 2/(k+q))\xi}\, d\xi \le \frac{(k+q)^{2q}}{k+1} \left( \frac{br^2}{2} \right)^{k+1} \tag{3.34} \]
for all $k + q \ge 2$. The claim again follows with the help of Stirling's formula (3.23).

4. Proof of the Main Results for Two Dimensions

4.1. Reduction to a single Landau-level eigenspace

In this subsection we establish asymptotic estimates of $N(E_q + E, E'; H(V))$ as $E \downarrow 0$, which play a crucial role in the proof of Theorems 2.1 and 2.2. For this purpose, we recall in the following lemma a suitable version of the well-known Weyl inequalities for the eigenvalues of self-adjoint compact operators.

Lemma 4.1 ([5, Sec. 9.2, Theorem 9]). Let $T_1$ and $T_2$ be linear self-adjoint compact operators on a Hilbert space. Then for each $s > 0$ and $\varepsilon \in (0, 1)$ we have
\[ n_\pm(s(1 + \varepsilon); T_1) - n_\mp(s\varepsilon; T_2) \le n_\pm(s; T_1 + T_2) \le n_\pm(s(1 - \varepsilon); T_1) + n_\pm(s\varepsilon; T_2), \tag{4.1} \]
the counting functions $n_\pm$ being defined in (2.1).
Proposition 4.1. Let $E' \in (E_q, E_{q+1})$, $q \in \mathbb{Z}_+$. Assume that $V$ satisfies the hypotheses of Theorem 2.1 or Theorem 2.2. Then for every $\varepsilon \in (0, 1)$ we have
\[ n_+(E; (1 - \varepsilon) P_q V P_q) + O(1) \le N(E_q + E, E'; H(V)) \le n_+(E; (1 + \varepsilon) P_q V P_q) + O(1), \quad E \downarrow 0. \tag{4.2} \]

Proof. First of all, note that under the hypotheses of Theorems 2.1 and 2.2, $V$ satisfies the assumptions of Lemma 3.2, so that the operator $P_q V P_q$ is compact. Next, the generalized Birman–Schwinger principle (see e.g. [2, Theorem 1.3]) entails
\[ \begin{aligned} N(E_q + E, E'; H(V)) = {} & n_+\big( 1; V^{1/2} (E_q + E - H(0))^{-1} V^{1/2} \big) \\ & - n_+\big( 1; V^{1/2} (E' - H(0))^{-1} V^{1/2} \big) - \dim \operatorname{Ker}(H(V) - E'). \end{aligned} \tag{4.3} \]
Since the operator $V^{1/2} H(0)^{-1/2}$ is compact, the last two terms at the r.h.s. of (4.3), which are independent of $E$, are finite. Fix $\varepsilon \in (0, 1)$ and set $Q_q := \mathrm{Id} - P_q$. Applying (4.1) with $T_1 := V^{1/2} (E_q + E - H(0))^{-1} P_q V^{1/2}$ and $T_2 := V^{1/2} (E_q + E - H(0))^{-1} Q_q V^{1/2}$, we obtain
\[ \begin{aligned} n_+\big( 1; V^{1/2} (E_q + E - H(0))^{-1} V^{1/2} \big) \ge {} & n_+\big( 1/(1 - \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} P_q V^{1/2} \big) \\ & - n_-\big( \varepsilon/(1 - \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} Q_q V^{1/2} \big), \end{aligned} \tag{4.4} \]
\[ \begin{aligned} n_+\big( 1; V^{1/2} (E_q + E - H(0))^{-1} V^{1/2} \big) \le {} & n_+\big( 1/(1 + \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} P_q V^{1/2} \big) \\ & + n_+\big( \varepsilon/(1 + \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} Q_q V^{1/2} \big). \end{aligned} \tag{4.5} \]
Next, we deal with the first terms on the r.h.s. of (4.4) and (4.5). Since the non-zero singular numbers of the compact operators $P_q V^{1/2}$ and $V^{1/2} P_q$ coincide, we get
\[ n_+\big( 1/(1 \pm \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} P_q V^{1/2} \big) = n_+\big( E; (1 \pm \varepsilon) V^{1/2} P_q V^{1/2} \big) = n_+\big( E; (1 \pm \varepsilon) P_q V P_q \big). \tag{4.6} \]
Further, we estimate the second terms on the r.h.s. of (4.4) and (4.5). The operator inequality
\[ |E_q + E - H(0)|^{-1} Q_q = \sum_{l \in \mathbb{Z}_+,\, l \ne q} |E_q + E - E_l|^{-1} P_l \le C_q \sum_{l \in \mathbb{Z}_+} E_l^{-1} P_l = C_q H(0)^{-1}, \tag{4.7} \]
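valid for $E \in (0, E' - E_q)$, $E' \in (E_q, E_{q+1})$, and $C_q := E_{q+1}/(E_{q+1} - E')$, implies
\[ n_\pm\big( \varepsilon/(1 \pm \varepsilon); V^{1/2} (E_q + E - H(0))^{-1} Q_q V^{1/2} \big) \le n_+\big( \varepsilon/(1 \pm \varepsilon); C_q V^{1/2} H(0)^{-1} V^{1/2} \big). \tag{4.8} \]
Since the operator $V^{1/2} H(0)^{-1/2}$ is compact, the quantity on the r.h.s. of (4.8), which is independent of $E$, is finite for each $\varepsilon \in (0, 1)$. Putting together (4.3)–(4.8), we obtain (4.2).

4.2. Proof of Theorem 2.1

Pick $\delta \in (0, \mu)$. From (2.2) we conclude that there exists $r_\delta > 0$ such that $G_{\mu+\delta}^{(\beta)}(x) \le V(x) \le G_{\mu-\delta}^{(\beta)}(x)$ for all $x \in \mathbb{R}^2$ which satisfy $|x| > r_\delta$. Hence, we have
\[ G_{\mu+\delta}^{(\beta)}(x) - M \chi_{r_\delta}(x) \le V(x) \le G_{\mu-\delta}^{(\beta)}(x) + M \chi_{r_\delta}(x), \quad x \in \mathbb{R}^2, \tag{4.9} \]
with $M := \max\{1, \sup_{x \in \mathbb{R}^2} V(x)\}$ as $\sup_{x \in \mathbb{R}^2} G_\lambda^{(\beta)}(x) = 1$ for each $\lambda \in (0, \infty)$, $\beta \in (0, \infty)$. Let us pick $\varepsilon > 0$. According to Proposition 4.1 and (4.9) we have
\[ \begin{aligned} N(E_q + E, E'; H(V)) &\ge n_+\big( E; (1 - \varepsilon) P_q V P_q \big) + O(1) \\ &\ge n_+\big( E; (1 - \varepsilon) P_q [G_{\mu+\delta}^{(\beta)} - M \chi_{r_\delta}] P_q \big) + O(1), \quad E \downarrow 0, \end{aligned} \tag{4.10} \]
\[ \begin{aligned} N(E_q + E, E'; H(V)) &\le n_+\big( E; (1 + \varepsilon) P_q V P_q \big) + O(1) \\ &\le n_+\big( E; (1 + \varepsilon) P_q [G_{\mu-\delta}^{(\beta)} + M \chi_{r_\delta}] P_q \big) + O(1), \quad E \downarrow 0. \end{aligned} \tag{4.11} \]
Since $G_{\mu \pm \delta}^{(\beta)} \mp M \chi_{r_\delta}$ is bounded and radially symmetric, Lemma 3.3 implies that the eigenvalues of $P_q [G_{\mu \pm \delta}^{(\beta)} \mp M \chi_{r_\delta}] P_q$ are given by $\gamma_{q,k}^{(\beta)}(\mu \pm \delta) \mp M \nu_{q,k}(r_\delta)$, $k \in \mathbb{Z}_+ - q$ (see (3.12) and (3.29)). Therefore,
\[ n_+\big( E; (1 \mp \varepsilon) P_q [G_{\mu \pm \delta}^{(\beta)} \mp M \chi_{r_\delta}] P_q \big) = \#\big\{ k \in \mathbb{Z}_+ - q \;\big|\; (1 \mp \varepsilon) [\gamma_{q,k}^{(\beta)}(\mu \pm \delta) \mp M \nu_{q,k}(r_\delta)] > E \big\}. \tag{4.12} \]
Thanks to Proposition 3.1 and (3.31), there exists some $K_\varepsilon \in \mathbb{Z}_+ - q$ such that
\[ \gamma_{q,k}^{(\beta)}(\mu + \delta) - M \nu_{q,k}(r_\delta) \ge (1 - \varepsilon)\, \gamma_{q,k}^{(\beta)}(\mu + \delta) \ge (1 - \varepsilon) \exp\!\big[ -(1 + \varepsilon) (a_{\mu+\delta}^{(\beta)})^{-1}(k) \big], \tag{4.13} \]
\[ \gamma_{q,k}^{(\beta)}(\mu - \delta) + M \nu_{q,k}(r_\delta) \le (1 + \varepsilon)\, \gamma_{q,k}^{(\beta)}(\mu - \delta) \le (1 + \varepsilon) \exp\!\big[ -(1 - \varepsilon) (a_{\mu-\delta}^{(\beta)})^{-1}(k) \big] \tag{4.14} \]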
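for all $k \ge K_\varepsilon$. Using (4.10)–(4.14), we thus conclude that
\[ \liminf_{E \downarrow 0} \frac{N(E_q + E, E'; H(V))}{a_{\mu+\delta}^{(\beta)}\big( |\ln(E/(1 - \varepsilon)^2)| / (1 + \varepsilon) \big)} \ge 1, \tag{4.15} \]
\[ \limsup_{E \downarrow 0} \frac{N(E_q + E, E'; H(V))}{a_{\mu-\delta}^{(\beta)}\big( |\ln(E/(1 + \varepsilon)^2)| / (1 - \varepsilon) \big)} \le 1. \tag{4.16} \]
Letting $\varepsilon \downarrow 0$ and afterwards $\delta \downarrow 0$ in (4.15) and (4.16), and taking into account that
\[ \lim_{\varepsilon \downarrow 0} \lim_{\kappa \to \infty} \frac{a_{\mu \pm \delta}^{(\beta)}(\kappa/(1 \pm \varepsilon))}{a_{\mu \pm \delta}^{(\beta)}(\kappa)} = 1, \qquad \lim_{\delta \downarrow 0} \lim_{\kappa \to \infty} \frac{a_{\mu \pm \delta}^{(\beta)}(\kappa)}{a_\mu^{(\beta)}(\kappa)} = 1, \tag{4.17} \]
we obtain (2.8) with $\beta < \infty$ which is equivalent to (2.3)–(2.5).

4.3. Proof of Theorem 2.2

Its hypotheses imply that there exist $C_\pm > 0$, $r_\pm > 0$, and $x_\pm \in \mathbb{R}^2$, such that
\[ C_- \chi_{r_-, x_-}(x) \le V(x) \le C_+ \chi_{r_+, x_+}(x), \quad x \in \mathbb{R}^2. \tag{4.18} \]
Pick $\varepsilon \in (0, 1)$. Combining (4.2), (4.18), and the minimax principle, we get
\[ N(E_q + E, E'; H(V)) \ge n_+\big( E; (1 - \varepsilon) C_- P_q \chi_{r_-, x_-} P_q \big) + O(1), \quad E \downarrow 0, \tag{4.19} \]
\[ N(E_q + E, E'; H(V)) \le n_+\big( E; (1 + \varepsilon) C_+ P_q \chi_{r_+, x_+} P_q \big) + O(1), \quad E \downarrow 0. \tag{4.20} \]
For $x' = (x', y') \in \mathbb{R}^2$ define the magnetic translation $T_{x'}$ by
\[ (T_{x'} u)(x) := \exp\!\left( i\, \frac{b}{2} (x'y - xy') \right) u(x - x'), \quad x = (x, y) \in \mathbb{R}^2. \]
The unitary operator $T_{x'}$ commutes with $H(0)$, and hence with the projections $P_q$, $q \in \mathbb{Z}_+$ (see e.g. [11, Eq. 11]). Therefore,
\[ P_q \chi_{r_\pm, x_\pm} P_q = P_q T_{x_\pm} \chi_{r_\pm} T_{x_\pm}^* P_q = T_{x_\pm} P_q \chi_{r_\pm} P_q T_{x_\pm}^*. \tag{4.21} \]
Hence, the operators $P_q \chi_{r_\pm, x_\pm} P_q$ and $P_q \chi_{r_\pm} P_q$ are unitarily equivalent, and we have
\[ \begin{aligned} n_+\big( E; (1 \pm \varepsilon) C_\pm P_q \chi_{r_\pm, x_\pm} P_q \big) &= n_+\big( E; (1 \pm \varepsilon) C_\pm P_q \chi_{r_\pm} P_q \big) \\ &= \#\{ k \in \mathbb{Z}_+ - q \mid (1 \pm \varepsilon) C_\pm \nu_{q,k}(r_\pm) > E \} \\ &= \#\{ k \in \mathbb{Z}_+ - q \mid \ln \nu_{q,k}(r_\pm) + \ln((1 \pm \varepsilon) C_\pm) > \ln E \}. \end{aligned} \tag{4.22} \]
Taking into account (3.30), we find that (4.22) entails
\[ \lim_{E \downarrow 0} \frac{n_+\big( E; (1 \pm \varepsilon) C_\pm P_q \chi_{r_\pm, x_\pm} P_q \big)}{(\ln |\ln E|)^{-1}\, |\ln E|} = 1. \tag{4.23} \]
Putting together (4.19), (4.20), and (4.23), we obtain (2.5).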
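The step from (4.22) to (4.23) is a counting argument: if $\ln \nu_{q,k}$ behaves like $-k \ln k$, then the number of indices $k$ with $\ln \nu_{q,k} > \ln E$ grows like $|\ln E| / \ln|\ln E|$. The following minimal Python sketch illustrates this with model eigenvalues $\exp(-k \ln k)$, $k \ge 1$. This exact form is an assumption made purely for illustration; (3.30) only fixes the leading behaviour. The function names are likewise hypothetical.

```python
import math

def count_levels(abs_log_energy: float) -> int:
    """Number of integers k >= 1 with k*ln(k) < |ln E| (model eigenvalues exp(-k ln k))."""
    k = 1
    while k * math.log(k) < abs_log_energy:
        k += 1
    return k - 1

def predicted(abs_log_energy: float) -> float:
    """Leading-order prediction |ln E| / ln|ln E| from (4.23)."""
    return abs_log_energy / math.log(abs_log_energy)

if __name__ == "__main__":
    for big_l in (1e2, 1e4, 1e6):
        n = count_levels(big_l)
        print(big_l, n, n / predicted(big_l))
```

In this model the ratio of the count to the prediction decreases towards 1 only at a rate governed by $\ln\ln|\ln E| / \ln|\ln E|$, which is why the constants $\ln((1 \pm \varepsilon) C_\pm)$ in (4.22) do not affect the limit.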
5. Proof of Main Results for Three Dimensions

5.1. Auxiliary facts about Schrödinger operators in one dimension

This subsection contains some well-known facts from the spectral theory of one-dimensional Schrödinger operators. Let $v \in L^1(\mathbb{R})$ be real-valued and let $h(v)$ be the self-adjoint operator generated in $L^2(\mathbb{R})$ by the quadratic form $\int_{\mathbb{R}} \{ |u'|^2 - v|u|^2 \}\, dz$, $u \in W_2^1(\mathbb{R})$. It is closed and lower bounded since the operator $|v|^{1/2} (h(0) + 1)^{-1/2}$ is Hilbert–Schmidt, and hence compact.

Lemma 5.1 ([4, Secs. 2.4, 4.6], [13]). Let $0 \le v \in L^1(\mathbb{R}; (1 + |z|)\, dz)$, $g > 0$. Assume that $v$ does not vanish identically. Then we have
\[ 1 \le N(0; h(gv)) \le g \int_{\mathbb{R}} |z|\, v(z)\, dz + 1. \tag{5.1} \]
Note that if $0 < g \int_{\mathbb{R}} |z|\, v(z)\, dz < 1$, then by (5.1) the operator $h(gv)$ has a unique, strictly negative eigenvalue denoted in the sequel by $-\mathcal{E}(gv)$.

Lemma 5.2 ([6, Theorem 3.1], [13], [21]). Let the hypotheses of Lemma 5.1 hold. Then $\mathcal{E}(gv)$ obeys the asymptotics
\[ \sqrt{\mathcal{E}(gv)} = \frac{g}{2} \int_{\mathbb{R}} v(z)\, dz\, (1 + o(1)), \quad g \downarrow 0. \tag{5.2} \]

5.2. Proof of Lemma 2.1

Denote by $\mathcal{P}_q : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$, $q \in \mathbb{Z}_+$, the orthogonal projections corresponding to the $q$th Landau level. In other words,
\[ (\mathcal{P}_q u)(X_\perp, z) := \int_{\mathbb{R}^2} K_q(X_\perp, X_\perp')\, u(X_\perp', z)\, dX_\perp', \quad (X_\perp, z) \in \mathbb{R}^3, \]
where $K_q(X_\perp, X_\perp')$, $X_\perp, X_\perp' \in \mathbb{R}^2$, is the integral kernel of the orthogonal projection $P_q : L^2(\mathbb{R}^2) \to L^2(\mathbb{R}^2)$, introduced in (3.9). Let $N \ge 1$ and set $T := V^{1/2} H(0)^{-1/2}$ and $T_N := T \sum_{q=0}^{N} \mathcal{P}_q$. First, we show that $T_N$ is a Hilbert–Schmidt operator. To this end we estimate
\[ \|T_N\|_{\mathrm{HS}} \le \sum_{q=0}^{N} \|T \mathcal{P}_q\|_{\mathrm{HS}}. \]
Further, taking into account (3.9) and (3.10), we find that
\[ \|T \mathcal{P}_q\|_{\mathrm{HS}}^2 = \frac{b}{(2\pi)^2} \int_{\mathbb{R}} \frac{d\zeta}{\zeta^2 + E_q} \int_{\mathbb{R}^3} V(x)\, dx \le \frac{b}{4\pi}\, E_q^{-1/2}\, \|U\|_{L^1(\mathbb{R}^2)}\, \|v\|_{L^1(\mathbb{R})}. \tag{5.3} \]
Therefore, $T_N$ is Hilbert–Schmidt, and hence compact. Next we show that $\lim_{N \to \infty} \|T - T_N\| = 0$. Evidently,
\[ \|T - T_N\| \le \|U\|_{L^\infty(\mathbb{R}^2)}^{1/2}\, \big\| |v|^{1/2} (h(0) + E_{N+1})^{-1/2} \big\|. \tag{5.4} \]
Since the operator $|v|^{1/2} (h(0) + 1)^{-1/2}$ is compact in $L^2(\mathbb{R})$, we have $\lim_{N \to \infty} \| |v|^{1/2} (h(0) + E_{N+1})^{-1/2} \| = 0$. Consequently, the operator $T$ can be approximated in norm by the sequence of compact operators $T_N$. Hence, $T$ is a compact operator itself.

5.3. Reduction to one dimension

In this subsection we prove a proposition which can be regarded as the three-dimensional analogue of Proposition 4.1.

Proposition 5.1. Let $V \ge 0$. Suppose that there exist four non-negative functions $v^\pm \in L^1(\mathbb{R})$ and $U^\pm \in L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2)$ such that
\[ U^-(X_\perp)\, v^-(z) \le V(x) \le U^+(X_\perp)\, v^+(z), \quad x = (X_\perp, z) \in \mathbb{R}^3. \tag{5.5} \]
Then for every $\varepsilon > 0$ we have
\[ \sum_{k \in \mathbb{Z}_+} N\big( -E; h(\kappa_k^- v^-) \big) \le N(E_0 - E; H(-V)) \le \sum_{k \in \mathbb{Z}_+} N\big( -E; h((1 + \varepsilon) \kappa_k^+ v^+) \big) + O(1), \quad E \downarrow 0. \tag{5.6} \]
Here $h(v)$ is the operator defined at the beginning of Sec. 5.1, and $\kappa_k^\pm$, $k \in \mathbb{Z}_+$, stand for the respective eigenvalues of the compact operators $P_0 U^\pm P_0$ on $P_0 L^2(\mathbb{R}^2)$.

Proof. Set $\mathcal{Q}_0 := \mathrm{Id} - \mathcal{P}_0$ and denote by $Z_1(V)$ (respectively, by $Z_2(V)$) the self-adjoint operator generated in $\mathcal{P}_0 L^2(\mathbb{R}^3)$ (respectively, in $\mathcal{Q}_0 L^2(\mathbb{R}^3)$) by the closed, lower bounded quadratic form $\int_{\mathbb{R}^3} \{ |i\nabla u + Au|^2 - V|u|^2 \}\, dx$ defined for $u \in \mathcal{P}_0 D(H(0)^{1/2})$ (respectively, for $u \in \mathcal{Q}_0 D(H(0)^{1/2})$). Let $\varepsilon > 0$. Since $V \ge 0$, the minimax principle yields
\[ N(E_0 - E; Z_1(V)) \le N(E_0 - E; H(-V)) \le N(E_0 - E; Z_1((1 + \varepsilon) V)) + N(E_0 - E; Z_2((1 + \varepsilon^{-1}) V)). \tag{5.7} \]
It is easy to check that $\sigma_{\mathrm{ess}}(Z_2((1 + \varepsilon^{-1}) V)) = [E_1, \infty)$ for each $\varepsilon > 0$. Therefore,
\[ N(E_0 - E; Z_2((1 + \varepsilon^{-1}) V)) = O(1), \quad E \downarrow 0. \tag{5.8} \]
Set $V^\pm(x) := U^\pm(X_\perp)\, v^\pm(z)$, $x = (X_\perp, z)$. Then (5.5) implies
\[ N(E_0 - E; Z_1(V)) \ge N(E_0 - E; Z_1(V^-)), \tag{5.9} \]
\[ N(E_0 - E; Z_1((1 + \varepsilon) V)) \le N(E_0 - E; Z_1((1 + \varepsilon) V^+)). \tag{5.10} \]
Obviously, $Z_1(V^-)$ is unitarily equivalent to the orthogonal sum $\sum_{k \in \mathbb{Z}_+} \oplus\, (h(\kappa_k^- v^-) + E_0)$, while $Z_1((1 + \varepsilon) V^+)$ is unitarily equivalent to $\sum_{k \in \mathbb{Z}_+} \oplus\, (h((1 + \varepsilon) \kappa_k^+ v^+) + E_0)$. Thus the combination of (5.7)–(5.10) yields (5.6).
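5.4. Proof of Theorem 2.3

By the hypotheses of Theorem 2.3 we may pick $\delta \in (0, \mu)$ and choose $r_\delta > 0$ such that the assumptions of Proposition 5.1 are satisfied with
\[ U^\pm(X_\perp) = G_{\mu \mp \delta}^{(\beta)}(X_\perp) \pm M \chi_{r_\delta}(X_\perp), \quad v^+(z) = v_\delta^+(z) + v(z), \quad v^-(z) = v_\delta^-(z), \tag{5.11} \]
where, similarly to (4.9), $M := \max\{1, C\}$, and $C$ is the constant occurring in the formulation of Theorem 2.3. Accordingly, Lemma 3.3 implies that $\kappa_k^\pm = \gamma_{0,k}^{(\beta)}(\mu \mp \delta) \pm M \nu_{0,k}(r_\delta)$, $k \in \mathbb{Z}_+$. Now pick $\varepsilon \in (0, 1)$ and choose $K_\varepsilon$ such that $k \ge K_\varepsilon$ entails the following inequalities
\[ \begin{aligned} \gamma_{0,k}^{(\beta)}(\mu + \delta) - M \nu_{0,k}(r_\delta) &\ge (1 - \varepsilon)\, \gamma_{0,k}^{(\beta)}(\mu + \delta), \\ \gamma_{0,k}^{(\beta)}(\mu - \delta) + M \nu_{0,k}(r_\delta) &\le (1 + \varepsilon)\, \gamma_{0,k}^{(\beta)}(\mu - \delta), \\ (1 + \varepsilon)^2\, \gamma_{0,k}^{(\beta)}(\mu \mp \delta) \int_{\mathbb{R}} |z|\, v^\pm(z)\, dz &< 1. \end{aligned} \tag{5.12} \]
Taking into account (5.1) and Proposition 5.1, we get
\[ \begin{aligned} N(E_0 - E; H(-V)) &\ge \sum_{k \in \mathbb{Z}_+} N\big( -E; h((\gamma_{0,k}^{(\beta)}(\mu + \delta) - M \nu_{0,k}(r_\delta))\, v^-) \big) \\ &\ge \#\big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon \;\big|\; \mathcal{E}\big( (1 - \varepsilon)\, \gamma_{0,k}^{(\beta)}(\mu + \delta)\, v^- \big) > E \big\}. \end{aligned} \tag{5.13} \]
Similarly, we have
\[ \begin{aligned} N(E_0 - E; H(-V)) &\le \sum_{k \in \mathbb{Z}_+} N\big( -E; h((1 + \varepsilon)(\gamma_{0,k}^{(\beta)}(\mu - \delta) + M \nu_{0,k}(r_\delta))\, v^+) \big) + O(1) \\ &\le \#\big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon \;\big|\; \mathcal{E}\big( (1 + \varepsilon)^2\, \gamma_{0,k}^{(\beta)}(\mu - \delta)\, v^+ \big) > E \big\} + O(1), \quad E \downarrow 0. \end{aligned} \tag{5.14} \]
The last inequality in (5.14) results from splitting the series into two parts and using (5.1) to verify that the sum over $k \in \{0, 1, \ldots, K_\varepsilon - 1\}$ remains bounded as $E \downarrow 0$. Utilizing (5.2), choose $K_\varepsilon' \ge K_\varepsilon$ such that $k \ge K_\varepsilon'$ entails
\[ \sqrt{\mathcal{E}\big( (1 - \varepsilon)\, \gamma_{0,k}^{(\beta)}(\mu + \delta)\, v^- \big)} \ge \frac{(1 - \varepsilon)^2}{2}\, \gamma_{0,k}^{(\beta)}(\mu + \delta) \int_{\mathbb{R}} v^-(z)\, dz, \tag{5.15} \]
\[ \sqrt{\mathcal{E}\big( (1 + \varepsilon)^2\, \gamma_{0,k}^{(\beta)}(\mu - \delta)\, v^+ \big)} \le \frac{(1 + \varepsilon)^3}{2}\, \gamma_{0,k}^{(\beta)}(\mu - \delta) \int_{\mathbb{R}} v^+(z)\, dz. \tag{5.16} \]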
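Consequently,
\[ \#\big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon \;\big|\; \mathcal{E}\big( (1 - \varepsilon)\, \gamma_{0,k}^{(\beta)}(\mu + \delta)\, v^- \big) > E \big\} \ge \#\Big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon' \;\Big|\; \frac{(1 - \varepsilon)^2}{2}\, \gamma_{0,k}^{(\beta)}(\mu + \delta) \int_{\mathbb{R}} v^-(z)\, dz > \sqrt{E} \Big\}, \tag{5.17} \]
\[ \#\big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon \;\big|\; \mathcal{E}\big( (1 + \varepsilon)^2\, \gamma_{0,k}^{(\beta)}(\mu - \delta)\, v^+ \big) > E \big\} \le \#\Big\{ k \in \mathbb{Z}_+,\ k \ge K_\varepsilon' \;\Big|\; \frac{(1 + \varepsilon)^3}{2}\, \gamma_{0,k}^{(\beta)}(\mu - \delta) \int_{\mathbb{R}} v^+(z)\, dz > \sqrt{E} \Big\} + O(1), \quad E \downarrow 0. \tag{5.18} \]
Putting together (5.13)–(5.14) and (5.17)–(5.18), we obtain the asymptotic estimates
\[ N(E_0 - E; H(-V)) \ge \#\big\{ k \in \mathbb{Z}_+ \;\big|\; \ln \gamma_{0,k}^{(\beta)}(\mu + \delta) > \ln \sqrt{E} + O(1) \big\} + O(1), \tag{5.19} \]
\[ N(E_0 - E; H(-V)) \le \#\big\{ k \in \mathbb{Z}_+ \;\big|\; \ln \gamma_{0,k}^{(\beta)}(\mu - \delta) > \ln \sqrt{E} + O(1) \big\} + O(1), \tag{5.20} \]
valid as $E \downarrow 0$. Using Proposition 3.1 and proceeding as in the proof of Theorem 2.1, we find that (5.19) and (5.20) imply (2.10).

5.5. Proof of Theorem 2.4

Finally, in this subsection we give a sketch of the proof of Theorem 2.4, which is quite similar to, and only easier than, the proof of Theorem 2.3. First of all, note that the assumptions of Proposition 5.1 are satisfied with $U^\pm(X_\perp) = \chi_{r_\pm, X_\perp^\pm}(X_\perp)$, so that $\kappa_k^\pm = \nu_{0,k}(r_\pm)$ thanks to the unitary equivalence of the operators $P_0 \chi_{r_\pm, X_\perp^\pm} P_0$ and $P_0 \chi_{r_\pm} P_0$ established in Sec. 4.3. Proposition 5.1 and Lemma 5.1 then imply the asymptotic estimates
\[ \#\big\{ k \in \mathbb{Z}_+ \;\big|\; \ln \nu_{0,k}(r_-) > \ln \sqrt{E} + O(1) \big\} + O(1) \le N(E_0 - E; H(-V)) \le \#\big\{ k \in \mathbb{Z}_+ \;\big|\; \ln \nu_{0,k}(r_+) > \ln \sqrt{E} + O(1) \big\} + O(1), \tag{5.21} \]
which hold for $E \downarrow 0$, and are analogous to (5.19) and (5.20). Applying (3.30) and (3.14) with $\beta = \infty$, we conclude that (5.21) implies (2.11).

Acknowledgments

The authors are very grateful to Professor Grigori Rozenblum for indicating a gap in the proof of Propositions 3.1 and 3.2 for $q \ge 1$ in the first version of the paper.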
Acknowledgements are also due to both referees whose remarks contributed to the improvement of the article. A part of this work was done while G. Raikov was visiting the Friedrich-Alexander-Universität Erlangen–Nürnberg in the summer of 2001 as a DAAD Research Fellow. The financial support of DAAD and of the Chilean Science Foundation Fondecyt under Grants 1020737 and 7020737 is gratefully acknowledged. It is a pleasure for G. Raikov to express his gratitude to Professor Hajo Leschke for his warm hospitality. Both authors thank him for encouragement and several stimulating discussions.

References

[1] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied Mathematics Series 55, 1964.
[2] S. Alama, P. A. Deift and R. Hempel, Eigenvalue branches of the Schrödinger operator H − λW in a gap of σ(H), Commun. Math. Phys. 121 (1989), 291–321.
[3] J. Avron, I. Herbst and B. Simon, Schrödinger operators with magnetic fields. I. General interactions, Duke Math. J. 45 (1978), 847–883.
[4] M. S. Birman, On the spectrum of singular boundary value problems, Mat. Sbornik 55 (1961), 125–174 (in Russian); English translation in Amer. Math. Soc. Transl. (2) 53 (1966), 23–80.
[5] M. S. Birman and M. Z. Solomjak, Spectral Theory of Self-Adjoint Operators in Hilbert Space, Reidel, Dordrecht, 1987.
[6] R. Blankenbecler, M. L. Goldberger and B. Simon, The bound states of weakly coupled long-range one-dimensional quantum Hamiltonians, Ann. Phys. (NY) 108 (1977), 69–78.
[7] V. Fock, Bemerkung zur Quantelung des harmonischen Oszillators im Magnetfeld, Z. Physik 47 (1928), 446–448 (in German).
[8] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, corrected and enlarged edition, Academic, San Diego, 1980.
[9] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Clarendon Press, Oxford, 1954.
[10] T. Hupfer, H. Leschke and S. Warzel, Poissonian obstacles with Gaussian walls discriminate between classical and quantum Lifshits tailing in magnetic fields, J. Stat. Phys. 97 (1999), 725–750.
[11] T. Hupfer, H. Leschke and S. Warzel, Upper bounds on the density of states of single Landau levels broadened by Gaussian random potentials, J. Math. Phys. 42 (2001), 5626–5641.
[12] V. Ya. Ivrii, Microlocal Analysis and Precise Spectral Asymptotics, Springer, Berlin, 1998.
[13] M. Klaus, On the bound state of Schrödinger operators in one dimension, Ann. Phys. (NY) 108 (1977), 288–300.
[14] E. H. Lieb and M. Loss, Analysis, 2nd edition, Amer. Math. Soc., Providence, RI, 2001.
[15] G. D. Raikov, Eigenvalue asymptotics for the Schrödinger operator with homogeneous magnetic potential and decreasing electric potential. I. Behaviour near the essential spectrum tips, Commun. Partial Differential Equations 15 (1990), 407–434; Errata: Commun. Partial Differential Equations 18 (1993), 1977–1979.
[16] G. D. Raikov, Border-line eigenvalue asymptotics for the Schrödinger operator with electromagnetic potential, Integral Equations Operator Theory 14 (1991), 875–888.
[17] A. V. Sobolev, Asymptotic behavior of the energy levels of a quantum particle in a homogeneous magnetic field, perturbed by a decreasing electric field. I, Probl. Mat. Anal. 9 (1984), 67–84 (in Russian); English translation in J. Sov. Math. 35 (1986), 2201–2212.
[18] H. Tamura, Asymptotic distribution of eigenvalues for Schrödinger operators with homogeneous magnetic fields, Osaka J. Math. 25 (1988), 633–647.
[19] M. Melgaard and G. Rozenblum, Eigenvalue asymptotics for even-dimensional perturbed Dirac and Schrödinger operators with constant magnetic fields, preprint, 2002.
[20] G. D. Raikov and S. Warzel, Spectral asymptotics for magnetic Schrödinger operators with rapidly decreasing electric potentials, C. R. Acad. Sci. Paris, Ser. I 335 (2002), 683–688.
[21] S. N. Solnyshkin, Asymptotics of the energy of bound states of the Schrödinger operator in the presence of electric and homogeneous magnetic fields, Probl. Mat. Fiz. 10 (1982), 266–278 (in Russian); English translation in Sel. Math. Sov. 5 (1986), 297–306.
November 7, 2002 14:37 WSPC/148-RMP
00153
Reviews in Mathematical Physics, Vol. 14, No. 10 (2002) 1073–1098 © World Scientific Publishing Company
FERMION AND BOSON RANDOM POINT PROCESSES AS PARTICLE DISTRIBUTIONS OF INFINITE FREE FERMI AND BOSE GASES OF FINITE DENSITY
EUGENE LYTVYNOV
Institut für Angewandte Mathematik, Universität Bonn, Wegelerstr. 6, D-53115 Bonn, Germany

Received 24 November 2001
Revised 25 May 2002

The aim of this paper is to show that fermion and boson random point processes naturally appear from representations of CAR and CCR which correspond to gauge invariant generalized free states (also called quasi-free states). We consider particle density operators $\rho(x)$, $x \in \mathbb{R}^d$, in the representation of CAR describing an infinite free Fermi gas of finite density at both zero and finite temperature [6], and in the representation of CCR describing an infinite free Bose gas at finite temperature [5]. We prove that the spectral measure of the smeared operators $\rho(f) = \int dx\, f(x) \rho(x)$ (i.e., the measure $\mu$ which allows one to realize the $\rho(f)$'s as multiplication operators by $\langle \cdot, f \rangle$ in $L^2(d\mu)$) is a well-known fermion, respectively boson, process on the space of all locally finite configurations in $\mathbb{R}^d$.

Keywords: Boson process; canonical commutation relations; canonical anticommutation relations; currents; fermion process; particle densities; quasi-free states.
1. Introduction: Representations of Current Algebras; Boson and Fermion Processes

The nonrelativistic quantum mechanics of many identical particles may be described by means of a field $\psi(x)$, $x \in \mathbb{R}^d$, satisfying either canonical commutation relations (CCR) and describing bosons:
\[ [\psi(x), \psi(y)]_- = [\psi^*(x), \psi^*(y)]_- = 0, \qquad [\psi^*(x), \psi(y)]_- = \delta(x - y) \mathbf{1}, \tag{1.1} \]
or satisfying canonical anticommutation relations (CAR) and describing fermions:
\[ [\psi(x), \psi(y)]_+ = [\psi^*(x), \psi^*(y)]_+ = 0, \qquad [\psi^*(x), \psi(y)]_+ = \delta(x - y) \mathbf{1}. \tag{1.2} \]
Here, $[A, B]_\mp = AB \mp BA$ is the commutator (respectively anticommutator). The statistics of the system is thus determined by the algebra which is to be represented.
In the formulation of nonrelativistic quantum mechanics in terms of particle densities and currents, one defines
\[ \rho(x) := \psi^*(x)\psi(x), \qquad J(x) := (2i)^{-1}\big(\psi^*(x)\nabla\psi(x) - (\nabla\psi^*(x))\psi(x)\big). \tag{1.3} \]
Using CCR or CAR, one can formally compute the commutation relations satisfied by the smeared operators $\rho(f) := \int_{\mathbb{R}^d} dx\, f(x)\rho(x)$ and $J(v) := \int_{\mathbb{R}^d} dx\, v(x) \cdot J(x)$. These turn out to be
\[ [\rho(f_1), \rho(f_2)]_- = 0, \qquad [\rho(f), J(v)]_- = i\rho(v \cdot \nabla f), \qquad [J(v_1), J(v_2)]_- = -iJ(v_1 \cdot \nabla v_2 - v_2 \cdot \nabla v_1), \tag{1.4} \]
independently of whether one starts with CCR or CAR. Thus, in a nonrelativistic current theory, the particle statistics is not determined by a choice of an equal-time algebra, but instead may be determined by the choice of a representation of the algebra, see e.g. [16, 22, 24] and the references therein. Since the operators $\rho(f)$ and $J(v)$ are, generally speaking, unbounded, one usually starts with the study of the group $G$ obtained by exponentiating the algebra $g$ generated by the commutation relations (1.4). More precisely, considering $\rho(f)$ and $J(v)$ to be self-adjoint, the corresponding one-parameter groups are
\[ U(tf) = \exp[it\rho(f)], \qquad V(\phi^v_t) = \exp[itJ(v)], \tag{1.5} \]
where $\phi^v_t$ is the one-parameter group of diffeomorphisms (or flows) on $\mathbb{R}^d$ generated by the vector field $v$:
\[ \frac{\partial \phi^v_t}{\partial t} = v(\phi^v_t), \qquad \phi^v_{t=0}(x) = x. \]
From Eq. (1.4), the operators (1.5) satisfy the group law
\[ U(f_1)V(\psi_1)U(f_2)V(\psi_2) = U(f_1 + f_2 \circ \psi_1)V(\psi_2 \circ \psi_1). \]
Hence, the group $G$ is the semidirect product $S(\mathbb{R}^d) \wedge \mathrm{Diff}(\mathbb{R}^d)$, where $S(\mathbb{R}^d)$ is the Schwartz space of rapidly decreasing functions on $\mathbb{R}^d$ and $\mathrm{Diff}(\mathbb{R}^d)$ is a certain group of diffeomorphisms of $\mathbb{R}^d$ of Schwartz's type (and thus containing all diffeomorphisms with compact support, i.e. which are identical outside a compact set). Owing to their physical interpretation, the currents $\rho(f)$ and $J(v)$ (and the group $G$) can be taken as fundamental structures of quantum mechanics. It should be noted that, given a representation of the group $G$ and its current algebra $g$, corresponding operators $\psi(x)$, $\psi^*(x)$ may, in general, not exist.
Let $U(f)$ be a continuous unitary cyclic representation of $S(\mathbb{R}^d)$ in a Hilbert space $H$ with cyclic vector $\Omega$. The functional $L(f) := (U(f)\Omega, \Omega)$ satisfies the conditions of the Bochner–Minlos theorem, and hence is the Fourier transform of a probability measure $\mu$ on $S'(\mathbb{R}^d)$, the dual of $S(\mathbb{R}^d)$:
\[ L(f) = \int_{S'(\mathbb{R}^d)} \exp[i\langle\omega, f\rangle]\,\mu(d\omega). \]
Therefore, $H$ can be realized as $L^2(S'(\mathbb{R}^d); \mu)$, $\Omega = 1$, and $U(f)$ as the multiplication operator by $\exp[i\langle\cdot, f\rangle]$ in $L^2(S'(\mathbb{R}^d); \mu)$. Let now $U(f)V(\psi)$ be a continuous unitary cyclic representation of the group $S(\mathbb{R}^d) \wedge \mathrm{Diff}(\mathbb{R}^d)$ in $H$. Suppose additionally that the cyclic vector $\Omega$ is also cyclic for (the smaller family of unitary operators) $U(f)$. For physical reasons, one believes that, in the spinless case, the latter condition is always fulfilled as long as one does not deal with parastatistics. Then, realizing the Hilbert space $H$ as $L^2(S'(\mathbb{R}^d); \mu)$ just as above, by [22], one has that the measure $\mu$ is quasi-invariant for $\mathrm{Diff}(\mathbb{R}^d)$ and the operators $V(\psi)$ become
\[ (V(\psi)F)(\omega) = \chi_\psi(\omega)\, F(\psi^*\omega) \left( \frac{d\mu(\psi^*\omega)}{d\mu(\omega)} \right)^{1/2}, \qquad \mu\text{-a.e. } \omega \in S'(\mathbb{R}^d), \tag{1.6} \]
for all $F \in L^2(S'(\mathbb{R}^d); \mu)$. Here, $d\mu(\psi^*\omega)/d\mu(\omega)$ is the Radon–Nikodym derivative of the transformed measure with respect to the original measure, and $\chi_\psi(\omega)$ is a cocycle, i.e. $\chi_\psi(\cdot)$ is a complex-valued function of modulus one, depending on $\psi$, defined $\mu$-a.e., and satisfying, for each $\psi_1, \psi_2 \in \mathrm{Diff}(\mathbb{R}^d)$,
\[ \chi_{\psi_2}(\omega)\,\chi_{\psi_1}(\psi_2^*\omega) = \chi_{\psi_1 \circ \psi_2}(\omega) \qquad \mu\text{-a.e.} \]
A powerful method of constructing continuous unitary cyclic representations of $G$ has been the method of generating functionals. A continuous complex-valued function $E$ on $G$ is called a generating functional on $G$ if the following conditions are fulfilled:
(1) $E(e) = 1$, where $e$ is the identity element of $G$;
(2) $\sum_{i,j=1}^N \bar\lambda_i \lambda_j E(g_i^{-1} g_j) \ge 0$ for all $\lambda_i \in \mathbb{C}$, $g_i \in G$, $i = 1, \dots, N$, $N \in \mathbb{N}$.
By Araki's theorem [3], $E$ is a generating functional on $G$ if and only if there exists a continuous unitary cyclic representation $\pi$ of $G$ in $H$ with cyclic vector $\Omega$ such that $E(g) = (\Omega, \pi(g)\Omega)$, $g \in G$. Thus, one may implicitly construct unitary representations of $G$ by finding generating functionals.
In [23] (see also [1, 2, 48]), the case of an infinite free Bose gas at zero temperature with average particle density $\rho > 0$ was studied in the formalism of local current algebras.
Goldin et al. started by considering a system of $N$ bosons in a box of volume $V$. The physical Hilbert space $H$ is now $L^2_s(V^N)$, the subspace of $L^2(V^N)$ consisting of all symmetric functions (we also use the letter $V$ to denote the box itself). The Hamiltonian for $N$ bosons is given by $H_{N,V} = -\frac{1}{2}\sum_{i=1}^N \Delta_i$ with periodic boundary conditions, and the normalized ground state wave function is $\Omega_{N,V} = V^{-N/2}$. The representation of the group $G$ in the box $V$ is given by
\[ U_{N,V}(f)F(x_1, \dots, x_N) = \exp\Big( i \sum_{i=1}^N f(x_i) \Big) F(x_1, \dots, x_N), \]
\[ V_{N,V}(\psi)F(x_1, \dots, x_N) = F(\psi(x_1), \dots, \psi(x_N)) \prod_{i=1}^N \sqrt{J_\psi(x_i)}, \tag{1.7} \]
where $f$ and $\psi$ have support inside the box $V$ and $J_\psi(x) = \det(\partial\psi^k(x)/\partial x^l)_{k,l=1}^d$ is the Jacobian of the flow. Thus, one can write down the generating functional
\[ E_{N,V}(f, \psi) = (\Omega_{N,V}, U_{N,V}(f)V_{N,V}(\psi)\Omega_{N,V}) \tag{1.8} \]
of this representation and take the so-called $N/V$-limit, i.e. the limit as $N, V \to \infty$, $N/V \to \rho$. The limiting functional then has the form
\[ E(f, \psi) = \exp\Big( \rho \int_{\mathbb{R}^d} \big( e^{if(x)} \sqrt{J_\psi(x)} - 1 \big)\, dx \Big). \]
The authors then showed that $\Omega$ is cyclic for $U(f)$, $f \in S(\mathbb{R}^d)$, and hence this representation can be realized on the space $L^2(S'(\mathbb{R}^d); \mu)$, where the Fourier transform of the measure $\mu$ is equal to
\[ \int_{S'(\mathbb{R}^d)} \exp(i\langle\omega, f\rangle)\,\mu(d\omega) = \exp\Big( \rho \int_{\mathbb{R}^d} (e^{if(x)} - 1)\, dx \Big). \tag{1.9} \]
Thus, $\mu = \pi_\rho$ is the Poisson measure with intensity $\rho\, dx$. This measure is concentrated on the space $\Gamma_{\mathbb{R}^d}$ of all locally finite configurations in $\mathbb{R}^d$. As for the operators $V(\psi)$ in this representation, the general formula (1.6) now takes the following form: all the cocycles are identically equal to one and the Radon–Nikodym derivative is given by
\[ \frac{d\pi_\rho(\psi^*\gamma)}{d\pi_\rho(\gamma)} = \prod_{x \in \gamma} J_\psi(x), \qquad \pi_\rho\text{-a.e. } \gamma \in \Gamma_{\mathbb{R}^d}. \]
One may also derive an explicit formula for the action of the operators $J(v)$ in $L^2(\Gamma_{\mathbb{R}^d}; \pi_\rho)$, which in particular shows that these are certain differential operators on the configuration space (see [1] for details). Furthermore, it was shown in [23] that the representation of $G$ defined by (1.8) is unitarily equivalent to the representation in the symmetric Fock space $\mathcal{F}_s(L^2(\mathbb{R}^d))$ in which the operators $\rho(x)$, $J(x)$ are defined by formula (1.3) with
\[ \psi(x) = \psi_F(x) + \sqrt{\rho}, \qquad \psi^*(x) = \psi_F^*(x) + \sqrt{\rho} \tag{1.10} \]
and $\Omega$ is the vacuum vector in $\mathcal{F}_s(L^2(\mathbb{R}^d))$. In (1.10), $\psi_F(x)$, $\psi_F^*(x)$ are the standard annihilation and creation operators in the Fock space, respectively. In fact, this unitary equivalence has played a crucial role in the study of the representation defined by (1.8). On the other hand, the obtained unitary $I : \mathcal{F}_s(L^2(\mathbb{R}^d)) \to L^2(\Gamma_{\mathbb{R}^d}; \pi_\rho)$ coincides with the well-known chaos decomposition for the Poisson measure (e.g. [47]). The operators
\[ \rho(x) = \psi^*(x)\psi(x) = \psi_F^*(x)\psi_F(x) + \sqrt{\rho}\,\psi_F^*(x) + \sqrt{\rho}\,\psi_F(x) + \rho \]
are known in quantum probability as quantum Poisson white noise (e.g. [25, 32]). The measure $\pi_\rho$ can also be thought of as the spectral measure of the family $(\rho(f))_{f \in S(\mathbb{R}^d)}$ (cf. [11, Chap. 3] and [10, 27]). In fact, it was in Araki and Woods' paper [5] dealing with representations of CCR that the operators (1.10) first appeared in the description of an infinite free Bose gas at zero temperature.
In [33, 34] (see also [21]), a unitary cyclic representation of the group $G$ describing an infinite free Fermi gas at zero temperature was, in particular, studied.
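The Poisson functional (1.9) can be checked by hand in the simplest situation. The following is a minimal sketch, assuming a test function $f$ that equals a constant $c$ on a region of unit volume and vanishes elsewhere (an indicator, which is not a Schwartz function and is used only for illustration). Then $\langle\gamma, f\rangle = cN$ with $N$ Poisson distributed with mean $\rho$, and the right-hand side of (1.9) reduces to $\exp(\rho(e^{ic} - 1))$. The function names are hypothetical.

```python
import cmath
import math


def poisson_char_series(rho, c, terms=200):
    """E[exp(i*c*N)] for N ~ Poisson(rho), summed directly from the Poisson weights."""
    total = 0j
    weight = math.exp(-rho)              # P(N = 0)
    for n in range(terms):
        total += weight * cmath.exp(1j * c * n)
        weight *= rho / (n + 1)          # P(N = n + 1) obtained from P(N = n)
    return total


def poisson_char_closed(rho, c):
    """Right-hand side of (1.9) for f = c on a unit-volume region, 0 elsewhere."""
    return cmath.exp(rho * (cmath.exp(1j * c) - 1))


rho, c = 2.5, 0.7                        # illustrative values, not from the text
lhs = poisson_char_series(rho, c)
rhs = poisson_char_closed(rho, c)
print(abs(lhs - rhs))                    # discrepancy between the two sides
```

The series is truncated at a fixed number of terms, which is adequate only for moderate values of the density; larger densities would need more terms.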
Menikoff started with a system of $N$ free Fermi particles in a cubic box $V$ in $\mathbb{R}^3$ with edges of length $L$. If $k_f$ is the Fermi momentum of the system, the number of particles in the box $V$ is equal to
\[ N = \#\{ k_n = (2\pi/L)(n_1, n_2, n_3) : n_1, n_2, n_3 \in \mathbb{Z},\ |k_n| \le k_f \}. \tag{1.11} \]
The Hilbert space of the system is then $H = L^2_a(V^N)$, the subspace of $L^2(V^N)$ consisting of all antisymmetric functions, and the Hamiltonian of the system is $H_{N,V} = -\frac{1}{2}\sum_{i=1}^N \Delta_i$. The normalized ground state of $H_{N,V}$ is
\[ \Omega_{N,V}(x_1, \dots, x_N) = (V^{-N}/N!)^{1/2} \det(\exp(ik_n \cdot x_m))_{n,m=1}^N, \]
where the $k_n$'s are as in (1.11). The representation of the group $G$ in the box $V$ is given by the same formulas (1.7) but with $F \in L^2_a(V^N)$. By (1.11), we get for the average particle density
\[ N/V = N/L^3 \to \rho = (4/3)\pi (k_f/2\pi)^3 \qquad \text{as } L \to \infty. \]
Taking the limit $N, V \to \infty$, $N/V \to \rho$ of the generating functional $E_{N,V}(f, \psi)$ again given by (1.8), one gets the following results. Let
\[ \kappa(x) := (2\pi)^{-3} \int_{\{|\lambda| < k_f\}} e^{i\lambda x}\, d\lambda = 3\rho\, (\sin z - z\cos z)/z^3 \big|_{z = k_f|x|}, \qquad x \in \mathbb{R}^3, \]
and let
\[ R_n(y_1, \dots, y_n; x_1, \dots, x_n) := \det(\kappa(x_i - y_j))_{i,j=1}^n, \qquad n \in \mathbb{N}. \]
Then, the limiting generating functional is given by
\[ E(f, \psi) = 1 + \sum_{n=1}^\infty \int\!\cdots\!\int \prod_{i=1}^n \big[ \delta(x_i - y_i)(e^{if(x_i)} T_{x_i}(\psi) - 1) \big]\, dx_1\, dy_1 \cdots dx_n\, dy_n \times R_n(y_1, \dots, y_n; x_1, \dots, x_n), \tag{1.12} \]
where $T_x(\psi)g(x) := g(\psi(x))\sqrt{J_\psi(x)}$. In particular,
\[ L(f) = 1 + \sum_{n=1}^\infty \int_{(\mathbb{R}^3)^n} \Big( \prod_{i=1}^n (e^{if(x_i)} - 1) \Big) \det(\kappa(x_i - x_j))_{i,j=1}^n\, dx_1 \cdots dx_n, \tag{1.13} \]
which is the Fourier transform of a measure $\mu_a$ on $S'(\mathbb{R}^3)$. Furthermore, it follows from (1.13) that the measure $\mu_a$ has correlation functions
\[ k^{(n)}_{\mu_a}(x_1, \dots, x_n) = \det(\kappa(x_i - x_j))_{i,j=1}^n. \tag{1.14} \]
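The counting formula (1.11), the limiting density $\rho = (4/3)\pi(k_f/2\pi)^3$ and the closed form of $\kappa$ can be explored numerically. The sketch below is illustrative only: the function names and the box size are assumptions, the lattice count is a brute-force loop, and the value of the radial profile at the origin is taken as the limit $z \to 0$ of $3(\sin z - z\cos z)/z^3$, which equals 1.

```python
import math


def count_modes(L, kf):
    """Number of k_n = (2*pi/L)*(n1, n2, n3), n_i integers, with |k_n| <= kf, as in (1.11)."""
    radius = kf * L / (2 * math.pi)          # Fermi radius in units of 2*pi/L
    nmax = int(radius)                       # largest |n_i| that can occur
    r2 = radius * radius
    count = 0
    for n1 in range(-nmax, nmax + 1):
        for n2 in range(-nmax, nmax + 1):
            for n3 in range(-nmax, nmax + 1):
                if n1 * n1 + n2 * n2 + n3 * n3 <= r2:
                    count += 1
    return count


def limiting_density(kf):
    """rho = (4/3) * pi * (kf / (2*pi))**3."""
    return (4.0 / 3.0) * math.pi * (kf / (2 * math.pi)) ** 3


def kappa(r, kf):
    """Radial profile of kappa(x) at |x| = r."""
    rho = limiting_density(kf)
    z = kf * r
    if z < 1e-4:
        # leading Taylor terms of 3*(sin z - z cos z)/z**3 near z = 0
        return rho * (1 - z * z / 10)
    return 3 * rho * (math.sin(z) - z * math.cos(z)) / z ** 3


L, kf = 120.0, 1.0                           # illustrative box edge and Fermi momentum
n_modes = count_modes(L, kf)
print(n_modes / L ** 3, limiting_density(kf))  # finite-box density vs. its limit
print(kappa(0.0, kf), kappa(2.0, kf))          # kappa(0) is the density; kappa oscillates and decays
```

For a finite box the ratio $N/L^3$ differs from $\rho$ by a surface correction, so agreement is only approximate at any fixed $L$.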
Menikoff mentioned in [34] that from the existence of correlation functions it should follow that the measure µa is concentrated on ΓRd , however he could not prove it. Two important problems remained open after [33, 34]: (1) Is there any connection with the representation of CAR for an infinite free Fermi gas at zero
temperature? Do corresponding operators $\psi(x)$, $\psi^*(x)$ exist? (2) Is $\Omega$, the cyclic vector of the representation defined by (1.12), also cyclic for the $U(f)$'s, and if so, what is the form of the Radon–Nikodym derivative and the cocycles in formula (1.6) in this case?
When studying statistical properties of a chaotic beam of fermions by using the wave-packet formalism, Benard and Macchi [8] arrived at measures on the configuration space over a bounded volume whose correlation functions are given by a formula of type (1.14). In [29, 30], Macchi called a measure on the configuration space a fermion process if the respective correlation functions are given by (1.14) in which $\kappa(x - y)$ is a non-negative definite function. She gave sufficient conditions for the existence of such a measure. Fermion processes (also called determinantal random point fields) are often encountered in random matrix theory, probability theory, representation theory, and ergodic theory. We refer to the paper [43], which contains an exposition of recent as well as older results on the subject. Scaling limits of fermion point processes are proved in [44, 45, 46]. We also refer to the recent papers [41, 42] for a discussion of different problems connected with fermion processes on the configuration space over the lattice $\mathbb{Z}^d$.
In a parallel way, a boson process was defined and studied in [8, 28, 29]. This process is defined as a probability measure $\mu$ on $\Gamma_{\mathbb{R}^d}$ whose correlation functions are given by
\[ k^{(n)}_\mu(x_1, \dots, x_n) = \operatorname{per}(\kappa(x_i - x_j))_{i,j=1}^n, \]
where $\kappa(x - y)$ is again a non-negative definite function and the permanent $\operatorname{per} A$ of a matrix $A$ contains the same terms as the corresponding determinant $\det A$ but with constant positive signs for each product of matrix elements in place of the alternating positive and negative signs of the determinant. It should be mentioned that any boson process is a Cox process, i.e. a Poisson process with a random intensity measure, see [15, Sec. 8.5].
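The contrast between determinantal and permanental correlation functions can be seen on a few points. The sketch below assumes a hypothetical one-dimensional Gaussian kernel with $\kappa(0) = 1$, chosen only because it is non-negative definite and easy to evaluate; it is not the kernel of either gas discussed in the text. For such a kernel the determinant never exceeds the Poisson value $\kappa(0)^n$ (repulsion), while the permanent is never below it (bunching).

```python
import itertools
import math


def determinant(m):
    """Leibniz expansion; adequate for the tiny matrices used here."""
    n = len(m)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        # parity of the permutation from its inversion count
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        prod = 1.0
        for i in range(n):
            prod *= m[i][perm[i]]
        total += (-1) ** inv * prod
    return total


def permanent(m):
    """Same products as the determinant, all taken with a plus sign."""
    n = len(m)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        prod = 1.0
        for i in range(n):
            prod *= m[i][perm[i]]
        total += prod
    return total


def gaussian_kernel(x, y, width=1.0):
    """Hypothetical translation-invariant kernel kappa(x - y) with kappa(0) = 1."""
    return math.exp(-((x - y) ** 2) / (2 * width ** 2))


def kernel_matrix(points):
    return [[gaussian_kernel(x, y) for y in points] for x in points]


points = [0.0, 0.3, 1.1]                 # illustrative particle positions
m = kernel_matrix(points)
k_fermion = determinant(m)               # formula of type (1.14)
k_boson = permanent(m)                   # permanental analogue
k_poisson = 1.0                          # kappa(0)**3 for independent points
print(k_fermion, k_poisson, k_boson)
```

Placing two points at the same position makes two rows of the matrix equal, so the determinantal correlation function vanishes there, which is the point-process form of the exclusion principle.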
In [18, 20] (see also [19]), point processes were constructed and studied which correspond to locally normal states of a boson system, and which can be interpreted as the position distribution of the state. More precisely, let $L(\mathcal{F}_s(L^2(\mathbb{R}^d)))$ be the von Neumann algebra of all bounded linear operators in the symmetric Fock space $\mathcal{F}_s(L^2(\mathbb{R}^d))$, and let $A$ be its $C^*$-subalgebra obtained as the uniform closure of all local von Neumann algebras in $L(\mathcal{F}_s(L^2(\mathbb{R}^d)))$. Let $\omega$ be a locally normal state on $A$ (cf. [14]). As is well known, the symmetric Fock space $\mathcal{F}_s(L^2(\mathbb{R}^d))$ may be isomorphically realized as the $L^2$-space $L^2(\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}; \lambda)$, where $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}$ is the space of all finite configurations in $\mathbb{R}^d$ and $\lambda$ is the Lebesgue–Poisson measure on $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}$. (Notice the evident inclusion $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d} \subset \Gamma_{\mathbb{R}^d}$.) Then, every bounded function $F$ on $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}$ determines a bounded operator $M_F$ of multiplication by $F$. A function $F$ on $\Gamma_{\mathbb{R}^d}$ is called local if there exists a compact set $\Lambda \subset \mathbb{R}^d$ such that $F(\gamma) = F(\gamma \cap \Lambda)$ for all $\gamma \in \Gamma_{\mathbb{R}^d}$. The restriction of $F$ to $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}$ is a local function on $\Gamma^{\mathrm{fin}}_{\mathbb{R}^d}$, for which we preserve the notation $F$. In [20], it was proved that there exists a unique probability measure
$\mu_\omega$ on $\Gamma_{\mathbb{R}^d}$ such that, for all bounded local functions $F$ on $\Gamma_{\mathbb{R}^d}$,
\[ \int_{\Gamma_{\mathbb{R}^d}} F(\gamma)\,\mu_\omega(d\gamma) = \omega(M_F). \]
Some properties of such point processes were also studied in [20]. In [19], it was shown that, if the reduced density matrices $\rho^{(n)}_\omega(x_1, \dots, x_n; y_1, \dots, y_n)$ of the state $\omega$ exist and are continuous, then the correlation functions of the measure $\mu_\omega$ are given by
\[ k^{(n)}_{\mu_\omega}(x_1, \dots, x_n) = \rho^{(n)}_\omega(x_1, \dots, x_n; x_1, \dots, x_n). \tag{1.15} \]
Furthermore, the special case corresponding to the ideal Bose gas (cf. [14]) was studied in detail in [18]. By (1.15), $\mu_\omega$ is now the boson measure with
\[ \kappa(x) = \sum_{n=1}^\infty \frac{z^n}{(4\pi\beta n)^{d/2}} \exp[-|x|^2/(4n\beta)], \]
where $\beta$ is the inverse temperature and $z$ the activity. This boson measure was proved to be an infinitely divisible point process. It should also be noted that the local normality of the states we discuss below, in Sec. 2, was established in e.g. [14].
The aim of this paper is to show a connection between representations of CAR (respectively CCR) describing infinite free Fermi (respectively Bose) gases of finite density and the fermion (respectively boson) random point processes. In Sec. 2, we recall Araki and Wyss' representations of CAR in the antisymmetric Fock space that describe infinite free Fermi gases at both finite and zero temperature [6], and Araki and Woods' representation of CCR in the symmetric Fock space that describes an infinite free Bose gas at finite temperature [5]. The results of this section are essentially known (with the only exception that the corresponding “annihilation” and “creation” operators $\psi(x)$, $\psi^*(x)$ have not been treated without smearing). In Sec. 3, we prove that the corresponding particle density operators are well-defined and form a family $(\rho(f))_{f \in S(\mathbb{R}^d)}$ of commuting selfadjoint operators. Then, we introduce the space $H_\sharp$ ($\sharp = a$ in the fermionic case and $\sharp = s$ in the bosonic case) as the closed linear span of the vectors of the form $\rho(f_1) \cdots \rho(f_n)\Omega$. Restricted to this space, $(\rho(f))_{f \in S(\mathbb{R}^d)}$ evidently becomes a cyclic family. Using the spectral theory of cyclic families of commuting selfadjoint operators [11, 39], we then show that there exist a unique probability measure $\mu_\sharp$ on $S'(\mathbb{R}^d)$, the Schwartz space of tempered distributions, and a unitary operator $I_\sharp : H_\sharp \to L^2(S'(\mathbb{R}^d); \mu_\sharp)$ such that $I_\sharp\Omega = 1$ and $I_\sharp\rho(f)I_\sharp^{-1} = \langle\cdot, f\rangle\cdot$ for each $f \in S(\mathbb{R}^d)$. Next, we introduce an operator field $:\!\rho(x_1) \cdots \rho(x_n)\!:$ via a recurrence relation, and prove that
\[ :\!\rho(x_1) \cdots \rho(x_n)\!: \; = \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n). \]
Thus, $:\!\rho(x_1) \cdots \rho(x_n)\!:$ is nothing but a normal (Wick) product of $\rho(x_1), \dots, \rho(x_n)$. Using this and results of [12], we explicitly calculate the correlation functions of $\mu_\sharp$. This enables us, first, to show that $\mu_\sharp$ is concentrated on the configuration space $\Gamma_{\mathbb{R}^d}$, and second, to identify $\mu_a$ as a fermion process, and $\mu_s$ as a boson process. In particular, starting from the representation of CAR [6] corresponding to an infinite free Fermi gas at zero temperature, we arrive at the same probability measure $\mu_a$ as Menikoff did in [34].
Thus, the main results of the paper are as follows:
(1) We introduced the particle density operators (quantum white noise) $\rho(x)$, $x \in \mathbb{R}^d$, corresponding to an infinite free Fermi gas (of finite density) at zero and at finite temperature, respectively an infinite free Bose gas at finite temperature, and proved the well-definedness of $\rho(x)$ and the essential self-adjointness of the corresponding field $(\rho(f))_{f \in S(\mathbb{R}^d)}$.
(2) We proved a one-to-one correspondence between the operator field $(\rho(f))_{f \in S(\mathbb{R}^d)}$ and a fermion (respectively boson) point process $\mu_\sharp$ on $\Gamma_{\mathbb{R}^d}$. Furthermore, the constructed unitary isomorphism between the spaces $H_\sharp$ and $L^2(\Gamma_{\mathbb{R}^d}; \mu_\sharp)$ can be thought of as a kind of chaos decomposition of $L^2(\Gamma_{\mathbb{R}^d}; \mu_\sharp)$ (compare with the Poisson case).
We also note that, though the very existence of a fermion process under a slightly stronger condition on the function $\kappa$ in terms of its Fourier transform has been known before (cf. [13, Proposition 4.1]), as a by-product of our results we get a new proof of the existence of fermion (as well as boson) processes. It is still an open problem to show that the operators $J(v)$ may also be realized on $L^2(\Gamma_{\mathbb{R}^d}; \mu_\sharp)$, but we hope that the obtained unitary operator between the latter space and the corresponding subspace of the Fock space may be of some help in tackling this problem.
2. Infinite Free Fermi and Bose Gases of Finite Density
We first recall the construction of a cyclic representation of CAR whose state (constructed with respect to the cyclic vector) is a gauge invariant generalized free state. This representation is due to Araki and Wyss [6]. Generalized free states were first defined and studied by Shale and Stinespring [40]. Since then, generalized free states (also called quasi-free states) have been studied by several authors, see e.g. [4, 7, 14, 17, 31, 36, 38] and the references therein.
Let $H$ be a separable real Hilbert space and let $H_{\mathbb{C}}$ denote its complexification. We suppose that the scalar product in $H_{\mathbb{C}}$, denoted by $(\cdot, \cdot)_{H_{\mathbb{C}}}$, is antilinear in the first dot and linear in the second one. Let
\[ \mathcal{F}_a(H)\ (= \mathcal{F}_a(H_{\mathbb{C}})) := \bigoplus_{n=0}^\infty \mathcal{F}_a^{(n)}(H) \]
denote the antisymmetric Fock space over $H$. Here $\mathcal{F}_a^{(0)}(H) := \mathbb{C}$, and for $n \in \mathbb{N}$, $\mathcal{F}_a^{(n)}(H) := H_{\mathbb{C}}^{\wedge n}$, $\wedge$ standing for the antisymmetric tensor product. By $\mathcal{F}_{a,\mathrm{fin}}(H)$ we denote the subset of $\mathcal{F}_a(H)$ consisting of all elements $f = (f^{(n)})_{n=0}^\infty \in \mathcal{F}_a(H)$ for which $f^{(n)} = 0$, $n \ge N$, for some $N \in \mathbb{N}$. We endow $\mathcal{F}_{a,\mathrm{fin}}(H)$ with the topology
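of the topological direct sum of the spaces $\mathcal{F}_a^{(n)}(H)$. Thus, convergence in $\mathcal{F}_{a,\mathrm{fin}}(H)$ means uniform finiteness and coordinate-wise convergence. For $f \in H_{\mathbb{C}}$, we denote by $a(f)$ and $a^*(f)$ the standard annihilation and creation operators on $\mathcal{F}_a(H)$. They are defined on the domain $\mathcal{F}_{a,\mathrm{fin}}(H)$ through the formulas
\[ a(f)\, h_1 \wedge \cdots \wedge h_n := \frac{1}{\sqrt{n}} \sum_{i=1}^n (-1)^{i+1} (f, h_i)_{H_{\mathbb{C}}}\, h_1 \wedge \cdots \wedge h_{i-1} \wedge \check{h}_i \wedge h_{i+1} \wedge \cdots \wedge h_n, \]
\[ a^*(f)\, h_1 \wedge \cdots \wedge h_n := \sqrt{n+1}\, f \wedge h_1 \wedge \cdots \wedge h_n, \tag{2.1} \]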
where $h_1, \dots, h_n \in H_{\mathbb{C}}$ and $\check{h}_i$ denotes the absence of $h_i$. The operator $a^*(f)$ is the restriction to $\mathcal{F}_{a,\mathrm{fin}}(H)$ of the adjoint of $a(f)$ in $\mathcal{F}_a(H)$, and both $a(f)$ and $a^*(f)$ act continuously on $\mathcal{F}_{a,\mathrm{fin}}(H)$. The annihilation and creation operators satisfy CAR:
\[ [a(f), a(g)]_+ = 0, \qquad [a(g), a^*(f)]_+ = (g, f)_{H_{\mathbb{C}}}\mathbf{1} \]
for all $f, g \in H_{\mathbb{C}}$.
Let $K$ be a linear operator in $H_{\mathbb{C}}$ such that $0 \le K \le 1$. We take the direct sum $H \oplus H$ of two copies of the Hilbert space $H$, and construct the antisymmetric Fock space $\mathcal{F}_a(H \oplus H)$. For $f \in H_{\mathbb{C}}$, we denote $a_1(f) := a(f, 0)$, $a_2(f) := a(0, f)$ and analogously $a_i^*(f)$, $i = 1, 2$. Let also $K_1 := K^{1/2}$, $K_2 := (1 - K)^{1/2}$. We then set, for $f \in H_{\mathbb{C}}$,
\[ \psi(f) := a_2(K_2 f) + a_1^*(JK_1 f), \qquad \psi^*(f) := a_2^*(K_2 f) + a_1(JK_1 f), \tag{2.2} \]
where $J : H_{\mathbb{C}} \to H_{\mathbb{C}}$ is the operator of complex conjugation: $Jf := \bar{f}$. As is easily seen, the operators $\{\psi(f), \psi^*(f) \mid f \in H_{\mathbb{C}}\}$ again satisfy CAR. Let $H_i$ denote the closure of $\operatorname{Im} K_i$ in $H_{\mathbb{C}}$, $i = 1, 2$. Then, restricted to the subspace $\mathcal{F}_a(H_1 \oplus H_2)$, the operators $\{\psi(f), \psi^*(f) \mid f \in H_{\mathbb{C}}\}$ form a cyclic representation of CAR with cyclic vector $\Omega := (1, 0, 0, \dots)$, the vacuum in $\mathcal{F}_a(H \oplus H)$. Let $A_a(H_{\mathbb{C}})$ denote the $C^*$-algebra generated by the operators $\psi(f)$, $f \in H_{\mathbb{C}}$, and let $\omega_a$ be the state on $A_a(H_{\mathbb{C}})$ defined by $\omega_a(\Psi) := (\Psi\Omega, \Omega)_{\mathcal{F}_a(H_1 \oplus H_2)}$, $\Psi \in A_a(H_{\mathbb{C}})$. The $n$-point functions of $\omega_a$ are given by the formula
\[ \omega_a(\psi^*(f_n) \cdots \psi^*(f_1)\psi(g_1) \cdots \psi(g_m)) = \delta_{n,m} \det((f_i, Kg_j)_{H_{\mathbb{C}}}) \tag{2.3} \]
for all f1 , . . . , fn , g1 , . . . , gm ∈ HC . Thus, ωa is a gauge invariant generalized free state corresponding to the operator K. An analogous representation of CCR was constructed by Araki and Woods [5] (historically it preceded the representation of CAR [6]). Let us outline it. In the symmetric Fock space over H, denoted by Fs (H), we construct the standard annihilation and creation operators, b(f ) and b∗ (f ), which satisfy CCR: [b(f ), b(g)]− = 0 , [b(g), b∗ (f )]− = (g, f )HC 1
for all $f, g \in H_{\mathbb{C}}$. Let now $K$ be a bounded linear operator in $H_{\mathbb{C}}$ such that $K \ge 0$. We set $K_1 := K^{1/2}$, $K_2 := (1 + K)^{1/2}$. Analogously to (2.2), we define the following operators in $\mathcal{F}_s(H \oplus H)$:
\[ \varphi(f) := b_2(K_2 f) + b_1^*(JK_1 f), \qquad \varphi^*(f) := b_2^*(K_2 f) + b_1(JK_1 f) \tag{2.4} \]
for $f \in H_{\mathbb{C}}$. These operators again satisfy CCR and form a cyclic representation of CCR in the Hilbert space $\mathcal{F}_s(H_1 \oplus H_2)$, where $H_i$ is the closure of $\operatorname{Im} K_i$ in $H_{\mathbb{C}}$, $i = 1, 2$. Let $A_s(H_{\mathbb{C}})$ denote the $C^*$-algebra generated by the operators $\varphi(f)$, $f \in H_{\mathbb{C}}$, and let $\omega_s$ be the state on $A_s(H_{\mathbb{C}})$ defined by $\omega_s(\Psi) := (\Psi\Omega, \Omega)_{\mathcal{F}_s(H \oplus H)}$, $\Psi \in A_s(H_{\mathbb{C}})$. The $n$-point functions of $\omega_s$ are given by
\[ \omega_s(\varphi^*(f_n) \cdots \varphi^*(f_1)\varphi(g_1) \cdots \varphi(g_m)) = \delta_{n,m} \operatorname{per}((f_i, Kg_j)_{H_{\mathbb{C}}}). \tag{2.5} \]
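The fermionic construction (2.2) and the two-point case of (2.3) can be checked in a toy model. The sketch below assumes a one-dimensional $H_{\mathbb{C}}$ with $K$ equal to a number $k \in [0, 1]$, so that $\mathcal{F}_a(H \oplus H)$ is four-dimensional, and it realizes $a_1$, $a_2$ by a Jordan–Wigner-type pair of real matrices. This matrix realization and all helper names are illustrative assumptions and not part of the paper.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def lincomb(a, b, ca=1.0, cb=1.0):
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


I2 = [[1.0, 0.0], [0.0, 1.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]
A = [[0.0, 1.0], [0.0, 0.0]]         # single-mode annihilation: A|1> = |0>

a1 = kron(A, I2)                      # annihilation operator of the first copy
a2 = kron(Z, A)                       # second copy; the factor Z enforces anticommutation


def psi(k):
    """Toy version of (2.2): psi = sqrt(1 - k) * a2 + sqrt(k) * a1^*."""
    return lincomb(a2, transpose(a1), math.sqrt(1 - k), math.sqrt(k))


def anticommutator(x, y):
    return lincomb(matmul(x, y), matmul(y, x))


def vacuum_expectation(op):
    """(Omega, op Omega) for the vacuum Omega = (1, 0, 0, 0)."""
    return op[0][0]


k = 0.3                               # illustrative occupation number
p = psi(k)
car = anticommutator(p, transpose(p))                    # should be the identity
density = vacuum_expectation(matmul(transpose(p), p))    # should equal k, cf. (2.3)
print(car, density)
```

All matrices here are real, so the adjoint is the transpose; the vacuum expectation of $\psi^*\psi$ reproduces $k$, the one-mode analogue of $(f, Kg)_{H_{\mathbb{C}}}$.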
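We now proceed to consider an infinite free Fermi gas of finite density, which is a special case of representation (2.2). Let $H := L^2(\mathbb{R}^d; dx) = L^2(\mathbb{R}^d)$, and so $H_{\mathbb{C}} = L^2_{\mathbb{C}}(\mathbb{R}^d) = L^2(\mathbb{R}^d \to \mathbb{C}; dx)$. To fix notations, we define the Fourier transform of a function $f \in L^1_{\mathbb{C}}(\mathbb{R}^d)$ by
\[ Ff(\lambda) := \hat{f}(\lambda) := (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{-ix\cdot\lambda} f(x)\, dx, \qquad \lambda \in \mathbb{R}^d, \]
and the inverse Fourier transform by
\[ F^{-1}f(x) := \check{f}(x) := (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{i\lambda\cdot x} f(\lambda)\, d\lambda, \qquad x \in \mathbb{R}^d, \]
so that $F$ can be extended by continuity from $L^2_{\mathbb{C}}(\mathbb{R}^d) \cap L^1_{\mathbb{C}}(\mathbb{R}^d)$ to a unitary operator on $L^2_{\mathbb{C}}(\mathbb{R}^d)$, and $F^{-1}$ is the inverse operator of $F$. Let $k$ be the inverse Fourier transform of a function $\hat{k}$ satisfying the following conditions:
\[ 0 \le \hat{k} \le 1, \qquad \hat{k} \in L^1(\mathbb{R}^d). \tag{2.6} \]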
We define $K := F^{-1}\hat{k} \cdot F$, where $f\cdot$ denotes the operator of multiplication by a function $f$. Using this $K$, we construct the operators $\psi(f)$, $\psi^*(f)$ defined in the Fock space $\mathcal{F}_a(H \oplus H)$ by formula (2.2). Notice that now
\[ H_1 = F^{-1}L^2_{\mathbb{C}}(\operatorname{supp}\hat{k}; dx), \qquad H_2 = F^{-1}L^2_{\mathbb{C}}(\operatorname{supp}(1 - \hat{k}); dx). \]
This representation of CAR describes an infinite free Fermi gas with density distribution $\hat{k}(\cdot)$ in “momentum space” [6], see also [17]. In particular, if $\beta$ is the inverse temperature, $\mu$ the chemical potential, and $m$ the mass of a particle, the corresponding infinite free Fermi gas is described by
\[ \hat{k}(\lambda) = \frac{\exp\big(\beta\mu - \beta\frac{|\lambda|^2}{2m}\big)}{1 + \exp\big(\beta\mu - \beta\frac{|\lambda|^2}{2m}\big)}, \qquad \lambda \in \mathbb{R}^d. \tag{2.7} \]
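To see how the distribution (2.7) behaves as the temperature is lowered, one can evaluate it numerically. The following is a small sketch under stated assumptions: the function depends on $\lambda$ only through $|\lambda|$, the parameter values are arbitrary, and the formula is rewritten in the algebraically equivalent form $1/(1 + e^{-t})$ with $t = \beta(\mu - |\lambda|^2/2m)$ purely to avoid floating-point overflow.

```python
import math


def fermi_dirac(lam_norm, beta, mu, m=1.0):
    """k_hat(lambda) from (2.7) as a function of |lambda|, in an overflow-safe form."""
    t = beta * (mu - lam_norm ** 2 / (2 * m))
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


mu, m = 1.0, 1.0                       # illustrative chemical potential and mass
fermi_radius = math.sqrt(2 * m * mu)   # where the argument t changes sign
for beta in (1.0, 10.0, 1000.0):
    inside = fermi_dirac(0.5 * fermi_radius, beta, mu, m)
    outside = fermi_dirac(1.5 * fermi_radius, beta, mu, m)
    print(beta, inside, outside)       # tends to 1 inside and 0 outside as beta grows
```

For any $\beta$ the values stay in $[0, 1]$, as required by (2.6), and at $|\lambda| = \sqrt{2m\mu}$ the distribution equals $1/2$.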
For the limit $\beta \to \infty$ of zero temperature, we obtain
\[ \hat{k}(\lambda) = 1_{B(\sqrt{2m\mu})}(\lambda), \qquad \lambda \in \mathbb{R}^d, \tag{2.8} \]
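where $B(r)$ denotes the ball in $\mathbb{R}^d$ of radius $r > 0$ centered at the origin, and $1_X(\cdot)$ denotes the indicator of a set $X$. Notice that, in the case of (2.7), we have $H_1 = H_2 = H_{\mathbb{C}}$, while in the case of (2.8) $H_{\mathbb{C}} = H_1 \oplus H_2$.
We will now need a rigging of $\mathcal{F}_a(H)$. Let $D(\mathbb{R}^d)$ denote the space of all real-valued infinitely differentiable functions on $\mathbb{R}^d$ with compact support. For $p \in \mathbb{N}$, we define a weighted Sobolev space $S_p(\mathbb{R}^d)$ as the closure of $D(\mathbb{R}^d)$ with respect to the Hilbert norm
\[ \|f\|_p^2 := \int_{\mathbb{R}^d} A^p f(x)\, f(x)\, dx, \qquad f \in D(\mathbb{R}^d), \]
where
\[ Af(x) := -\Delta f(x) + (|x|^2 + 1)f(x), \qquad x \in \mathbb{R}^d \tag{2.9} \]
is the harmonic oscillator. We identify $S_0(\mathbb{R}^d) = L^2(\mathbb{R}^d)$ with its dual and obtain
\[ S(\mathbb{R}^d) := \operatorname*{proj\,lim}_{p \to \infty} S_p(\mathbb{R}^d) \subset L^2(\mathbb{R}^d) \subset \operatorname*{ind\,lim}_{p \to \infty} S_{-p}(\mathbb{R}^d) =: S'(\mathbb{R}^d). \]
We recall that the Fourier transform $F$ is a continuous bijection of $S_{\mathbb{C}}(\mathbb{R}^d)$ onto $S_{\mathbb{C}}(\mathbb{R}^d)$, and, extended by continuity, it is a continuous bijection of $S'_{\mathbb{C}}(\mathbb{R}^d)$ onto $S'_{\mathbb{C}}(\mathbb{R}^d)$. Here, $S_{\mathbb{C}}(\mathbb{R}^d)$ and $S'_{\mathbb{C}}(\mathbb{R}^d)$ denote the complexifications of $S(\mathbb{R}^d)$ and $S'(\mathbb{R}^d)$, respectively. Denoting $\Phi := S_{\mathbb{C}}(\mathbb{R}^d) \oplus S_{\mathbb{C}}(\mathbb{R}^d)$, $\Phi_p := S_{p,\mathbb{C}}(\mathbb{R}^d) \oplus S_{p,\mathbb{C}}(\mathbb{R}^d)$, and $\Phi' := S'_{\mathbb{C}}(\mathbb{R}^d) \oplus S'_{\mathbb{C}}(\mathbb{R}^d)$, we get $\Phi = \operatorname{proj\,lim}_{p \to \infty} \Phi_p$ and $\Phi' = \operatorname{ind\,lim}_{p \to \infty} \Phi_{-p}$. We set, for $n \in \mathbb{Z}_+$,
\[ \mathcal{F}_a^{(n)}(\Phi) := \operatorname*{proj\,lim}_{p \to \infty} \mathcal{F}_a^{(n)}(\Phi_p), \qquad \mathcal{F}_a^{(n)}(\Phi') := \operatorname*{ind\,lim}_{p \to \infty} \mathcal{F}_a^{(n)}(\Phi_{-p}). \]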
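Let $\mathcal{F}_{a,\mathrm{fin}}(\Phi)$ denote the topological direct sum of the spaces $\mathcal{F}_a^{(n)}(\Phi)$, $n \in \mathbb{Z}_+$. The dual of $\mathcal{F}_{a,\mathrm{fin}}(\Phi)$ with respect to the zero space $\mathcal{F}_a(H \oplus H)$ is $\mathcal{F}^*_{a,\mathrm{fin}}(\Phi) = \times_{n=0}^\infty \mathcal{F}_a^{(n)}(\Phi')$, the topological product of the spaces $\mathcal{F}_a^{(n)}(\Phi')$. It consists of all sequences of the form $F = (F^{(0)}, F^{(1)}, F^{(2)}, \dots)$ such that $F^{(n)} \in \mathcal{F}_a^{(n)}(\Phi')$, and convergence in $\mathcal{F}^*_{a,\mathrm{fin}}(\Phi)$ means coordinatewise convergence. Thus, we have constructed the nuclear triple
\[ \mathcal{F}_{a,\mathrm{fin}}(\Phi) \subset \mathcal{F}_a(H \oplus H) \subset \mathcal{F}^*_{a,\mathrm{fin}}(\Phi). \]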
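Noting that $\hat{k}^{1/2} \in L^2(\mathbb{R}^d)$ and $(1 - \hat{k})^{1/2} \in S'(\mathbb{R}^d)$, we define
\[ \kappa_1 := (2\pi)^{-d/2} F^{-1}\hat{k}^{1/2} \in L^2_{\mathbb{C}}(\mathbb{R}^d), \qquad \kappa_2 := (2\pi)^{-d/2} F^{-1}(1 - \hat{k})^{1/2} \in S'_{\mathbb{C}}(\mathbb{R}^d). \]
Then, for any $f \in L^2_{\mathbb{C}}(\mathbb{R}^d)$,
\[ K_1 f(x) = \kappa_1 * f(x) = \int_{\mathbb{R}^d} \kappa_1(x - y)f(y)\, dy, \qquad \text{a.e. } x \in \mathbb{R}^d, \]
and for any $f \in S_{\mathbb{C}}(\mathbb{R}^d)$, $K_2 f = \kappa_2 * f$, where the convolution of a generalized function with a test function is defined in the usual way. For each $x \in \mathbb{R}^d$, we define $\kappa_{1,x} \in L^2_{\mathbb{C}}(\mathbb{R}^d)$ and $\kappa_{2,x} \in S'_{\mathbb{C}}(\mathbb{R}^d)$ by
\[ \langle\kappa_{i,x}, f\rangle = \langle\kappa_i, f(x + \cdot)\rangle, \qquad f \in S_{\mathbb{C}}(\mathbb{R}^d),\ i = 1, 2, \tag{2.10} \]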
where $\langle\cdot, \cdot\rangle$ denotes the dual pairing (generated by the scalar product in $H_{\mathbb{C}}$). Then, for any $f \in S_{\mathbb{C}}(\mathbb{R}^d)$,
\[ K_i f(x) = \langle\kappa_{i,x}, f\rangle, \qquad \text{a.e. } x \in \mathbb{R}^d,\ i = 1, 2. \tag{2.11} \]
Using formulas (2.1), we can easily define, for each $(f_1, f_2) \in \Phi'$, an annihilation operator $a(f_1, f_2)$ acting continuously on $\mathcal{F}_{a,\mathrm{fin}}(\Phi)$, and a creation operator $a^*(f_1, f_2)$ acting continuously on $\mathcal{F}^*_{a,\mathrm{fin}}(\Phi)$. Analogously to the above, we then introduce operators $a_i(f)$ and $a_i^*(f)$, $i = 1, 2$, for each $f \in S'_{\mathbb{C}}(\mathbb{R}^d)$. Taking (2.10) and (2.11) into account, we now set, for each $x \in \mathbb{R}^d$,
\[ \psi(x) := a_2(\kappa_{2,x}) + a_1^*(\bar\kappa_{1,x}), \qquad \psi^*(x) := a_2^*(\kappa_{2,x}) + a_1(\bar\kappa_{1,x}). \]
These operators act continuously from $\mathcal{F}_{a,\mathrm{fin}}(\Phi)$ into $\mathcal{F}^*_{a,\mathrm{fin}}(\Phi)$, and we have the following integral representation: for each (real-valued) $f \in S(\mathbb{R}^d)$,
\[ \psi(f) = \int_{\mathbb{R}^d} dx\, f(x)\psi(x), \qquad \psi^*(f) = \int_{\mathbb{R}^d} dx\, f(x)\psi^*(x). \tag{2.12} \]
The integration in (2.12) and below is to be understood in the following sense: for example, the first equality in (2.12) means that $\langle\psi(f)G_1, G_2\rangle = \int_{\mathbb{R}^d} f(x)\langle\psi(x)G_1, G_2\rangle\, dx$ for any $G_1, G_2 \in \mathcal{F}_{a,\mathrm{fin}}(\Phi)$. The operators $\psi(x)$, $\psi^*(x)$ satisfy the CAR (1.2), the formulas making sense after integration with test functions.
Now, let us briefly consider the bosonic case. Let $H := L^2(\mathbb{R}^d)$ and let $k$ be the inverse Fourier transform of a function $\hat{k}$ satisfying the following conditions:
\[ 0 \le \hat{k} \le C \ \text{ for some } C \in (0, \infty), \qquad \hat{k} \in L^1(\mathbb{R}^d). \tag{2.13} \]
We define $K := F^{-1}\hat{k} \cdot F$, and using this $K$, we construct the operators $\varphi(f)$, $\varphi^*(f)$ defined on the symmetric Fock space $\mathcal{F}_s(H \oplus H)$ by formula (2.4). If we additionally suppose that $\hat{k}(x) > 0$ for a.e. $x \in \mathbb{R}^d$, then the obtained representation of CCR describes an infinite free Bose gas at finite temperature and with density distribution $\hat{k}$ in “momentum space” [5], see also [17]. Analogously to the above, we construct the triple
\[ \mathcal{F}_{s,\mathrm{fin}}(\Phi) \subset \mathcal{F}_s(H \oplus H) \subset \mathcal{F}^*_{s,\mathrm{fin}}(\Phi), \]
and using it, we make sense of the operators $\varphi(x)$, $\varphi^*(x)$, $x \in \mathbb{R}^d$. These satisfy the CCR (1.1) with $\psi$ replaced by $\varphi$.
3. Particle Density Operators and Their Spectral Measure
We will again consider the fermionic case in detail, and then outline the bosonic case.
3.1. Fermionic case
We suppose that (2.6) holds. For each $x \in \mathbb{R}^d$, we define a particle density operator
\[ \rho_a(x) := \psi^*(x)\psi(x). \]
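Since $\bar\kappa_{1,x} \in L^2(\mathbb{R}^d)$, the operator $\psi(x)$ acts continuously from $\mathcal{F}_{\mathrm{fin}}(\Phi)$ into $\mathcal{F}_{\mathrm{fin}}(H \oplus H)$, and $\psi^*(x)$ acts continuously from $\mathcal{F}_{\mathrm{fin}}(H \oplus H)$ into $\mathcal{F}^*_{\mathrm{fin}}(\Phi)$. Therefore, $\rho_a(x)$ is a well-defined, continuous operator from $\mathcal{F}_{\mathrm{fin}}(\Phi)$ into $\mathcal{F}^*_{\mathrm{fin}}(\Phi)$. We then define
\[ \rho_a(f) := \int_{\mathbb{R}^d} dx\, f(x)\rho_a(x), \qquad f \in S(\mathbb{R}^d). \]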
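Lemma 3.1. For each $f \in S(\mathbb{R}^d)$, the operator $\rho_a(f)$ is well-defined and continuous on $\mathcal{F}_{a,\mathrm{fin}}(H \oplus H)$.
Proof. Step 1. We first prove the statement for $\rho_{a,1}(f) := \int_{\mathbb{R}^d} dx\, f(x)\, a_2^*(\kappa_{2,x})\, a_1^*(\bar\kappa_{1,x})$. As is easily seen, it suffices to show that
\[ \int_{\mathbb{R}^d} dx\, f(x)\, \kappa_{2,x} \otimes \bar\kappa_{1,x} \in H^{\otimes 2}, \tag{3.1} \]
where $\otimes$ denotes the usual tensor product. For any $g, h \in S_{\mathbb{C}}(\mathbb{R}^d)$, we have
\begin{align*}
\Big\langle \int_{\mathbb{R}^d} dx\, f(x)\, \kappa_{2,x} \otimes \bar\kappa_{1,x},\ g \otimes h \Big\rangle
&:= \int_{\mathbb{R}^d} f(x)\langle\kappa_{2,x}, g\rangle\langle\bar\kappa_{1,x}, h\rangle\, dx \\
&= \int_{\mathbb{R}^d} f(x)(K_2 g)(x)(JK_1 Jh)(x)\, dx \\
&= \int_{\mathbb{R}^d} \overline{f(x)(K_1 Jh)(x)}\,(K_2 g)(x)\, dx \\
&= \int_{\mathbb{R}^d} \overline{(F(f \cdot K_1 Jh))(\lambda)}\,(FK_2 g)(\lambda)\, d\lambda \\
&= \int_{\mathbb{R}^d} (2\pi)^{-d/2}\, \overline{\big(\hat{f} * (\hat{k}^{1/2} \cdot \widehat{Jh})\big)(\lambda)}\,(1 - \hat{k})^{1/2}(\lambda)\,\hat{g}(\lambda)\, d\lambda \\
&= (2\pi)^{-d/2} \int_{\mathbb{R}^d} \overline{\int_{\mathbb{R}^d} \hat{k}^{1/2}(\xi)\,\widehat{Jh}(\xi)\,\hat{f}(\lambda - \xi)\, d\xi}\ (1 - \hat{k})^{1/2}(\lambda)\,\hat{g}(\lambda)\, d\lambda \\
&= \int_{\mathbb{R}^{2d}} (2\pi)^{-d/2}\, \overline{\hat{f}(\lambda + \xi)}\,(1 - \hat{k})^{1/2}(\lambda)\,\hat{k}^{1/2}(-\xi)\,\hat{g}(\lambda)\,\hat{h}(\xi)\, d\lambda\, d\xi. \tag{3.2}
\end{align*}
Since $|1 - \hat{k}| \le 1$ and $\hat{k}^{1/2}, \hat{f} \in L^2(\mathbb{R}^d)$, the function
\[ G_f(\lambda, \xi) := (2\pi)^{-d/2} \hat{f}(\lambda + \xi)(1 - \hat{k})^{1/2}(\lambda)\hat{k}^{1/2}(-\xi), \qquad \xi, \lambda \in \mathbb{R}^d, \]
belongs to $L^2(\mathbb{R}^{2d})$. Therefore, by (3.2),
\[ \Big\langle \int_{\mathbb{R}^d} dx\, f(x)\, \kappa_{2,x} \otimes \bar\kappa_{1,x},\ g \otimes h \Big\rangle = \langle F_{2d}^{-1}(G_f), g \otimes h\rangle, \qquad g, h \in S(\mathbb{R}^d), \]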
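where $F_{2d}$ denotes the Fourier transform on $L^2_{\mathbb{C}}(\mathbb{R}^{2d})$. By linearity and continuity, this implies
\[ \int_{\mathbb{R}^d} dx\, f(x)\, \kappa_{2,x} \otimes \bar\kappa_{1,x} = F_{2d}^{-1}(G_f) \in L^2(\mathbb{R}^d)^{\otimes 2}. \]
Step 2. We now prove the statement for
\[ \rho_{a,2}(f) := \int_{\mathbb{R}^d} dx\, f(x)\, a_2^*(\kappa_{2,x})\, a_2(\kappa_{2,x}). \]
For any $g_i, h_i \in S_{\mathbb{C}}(\mathbb{R}^d)$, $i = 1, 2$, we have
\[ \Big\langle \int_{\mathbb{R}^d} dx\, f(x)\, a_2^*(\kappa_{2,x})\, a_2(\kappa_{2,x})(g_1, g_2),\ (h_1, h_2) \Big\rangle = \int_{\mathbb{R}^d} f(x)\,\overline{K_2 g_2(x)}\,K_2 h_2(x)\, dx = \langle K_2(f \cdot K_2 g_2), h_2\rangle, \]
and therefore
\[ \int_{\mathbb{R}^d} dx\, f(x)\, a_2^*(\kappa_{2,x})\, a_2(\kappa_{2,x}) \restriction \mathcal{F}^{(1)}(\Phi) = 0 \oplus (K_2(f \cdot K_2)) =: A_{2,f}. \]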
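Evidently $A_{2,f}$ is continuous on $H_{\mathbb{C}} \oplus H_{\mathbb{C}}$. For any linear continuous operator $A$ on $H_{\mathbb{C}} \oplus H_{\mathbb{C}}$, we define the second quantization of $A$, denoted by $d\Gamma(A)$, as the operator in $\mathcal{F}_a(H \oplus H)$ with domain $D(d\Gamma(A)) := \mathcal{F}_{a,\mathrm{fin}}(H \oplus H)$, given by
\[ d\Gamma(A) \restriction \mathcal{F}_a^{(n)}(H \oplus H) := A \otimes 1 \otimes \cdots \otimes 1 + 1 \otimes A \otimes 1 \otimes \cdots \otimes 1 + \cdots + 1 \otimes \cdots \otimes 1 \otimes A. \]
$d\Gamma(A)$ acts continuously on $\mathcal{F}_{a,\mathrm{fin}}(H \oplus H)$. Then, an easy calculation shows that $\rho_{a,2}(f) = d\Gamma(A_{2,f})$.
Step 3. Analogously, we get
\begin{align*}
\rho_{a,3}(f) &:= \int_{\mathbb{R}^d} dx\, f(x)\, a_1(\bar\kappa_{1,x})\, a_1^*(\bar\kappa_{1,x}) \\
&= \int_{\mathbb{R}^d} f(x)\langle\bar\kappa_{1,x}, \bar\kappa_{1,x}\rangle\, dx\, \mathbf{1} - d\Gamma(A_{1,f}) \\
&= (2\pi)^{-d} \int_{\mathbb{R}^d} f(x)\, dx \int_{\mathbb{R}^d} \hat{k}(\lambda)\, d\lambda\, \mathbf{1} - d\Gamma(A_{1,f}),
\end{align*}
where $A_{1,f} := (JK_1 J(f \cdot JK_1 J)) \oplus 0$.
Step 4. Finally, since $\rho_{a,1}(f)$ is a continuous operator from $\mathcal{F}_a^{(n)}(H \oplus H)$ into $\mathcal{F}_a^{(n+2)}(H \oplus H)$ for each $n \in \mathbb{Z}_+$, and since
\[ \rho_{a,4}(f) := \int_{\mathbb{R}^d} dx\, f(x)\, a_1(\bar\kappa_{1,x})\, a_2(\kappa_{2,x}) \]
is its adjoint, we have that $\rho_{a,4}(f)$ acts continuously from $\mathcal{F}_a^{(n+2)}(H \oplus H)$ into $\mathcal{F}_a^{(n)}(H \oplus H)$, and hence, continuously on $\mathcal{F}_{a,\mathrm{fin}}(H \oplus H)$.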
Using the anticommutation relations (1.2), we easily prove the following
Lemma 3.2. For each $f_1, f_2 \in S(\mathbb{R}^d)$, we have on $\mathcal{F}_{a,\mathrm{fin}}(H \oplus H)$:
\[ \rho_a(f_1)\rho_a(f_2) = \rho_a(f_2)\rho_a(f_1). \tag{3.3} \]
We define a Hilbert space $H_a$ as the closure of the linear span of the set $\{\Omega, \rho_a(f_1) \cdots \rho_a(f_n)\Omega \mid f_1, \dots, f_n \in S(\mathbb{R}^d),\ n \in \mathbb{N}\}$ in $\mathcal{F}_a(H \oplus H)$.
Remark 3.1. It is not hard to see that $H_a$ is a subspace of the space
\[ \bigoplus_{n=0}^\infty P_{a,2n}(H_{1,\mathbb{C}}^{\otimes n} \otimes H_{2,\mathbb{C}}^{\otimes n}), \tag{3.4} \]
where Pa, 2n : (HC ⊕ HC )⊗2n → (HC ⊕ HC )∧2n is the antisymmetrization operator. Evidently, (3.4) is a subspace of Fa (H1 ⊕ H2 ). Whether Ha coincides with (3.4) or it is a proper subspace of it, is an open problem (see also Remark 3.3 below). We next define Ha, fin := Ha ∩ Fa, fin (H ⊕ H), Ha, fin being dense in Ha . Let us consider the ρa (f )’s as operators in Ha with domain Ha, fin . Lemma 3.3. The operators ρa (f ), f ∈ S(Rd ), are essentially selfadjoint in Ha . Proof. The operators are evidently symmetric. The proof of essential selfadjointness is quite standard (see e.g. [11, Chap. 3, Sec. 3.8] and [27, Lemma 4.1]), so we only outline it. (n) As easily seen from the proof of Lemma 3.1, we have for any g (n) ∈ Fa (H ⊕ H) ρa (f )g
(n)
=
4 X
ρa, j (f )g (n) ∈
j=1
M
Fa(i) (H ⊕ H) ,
i=n−2,n,n+2
and moreover kρa, j (f )g (n) kFa (H⊕H) ≤ C1 max{kf kL2(Rd ) , kf kL1(Rd ) , kf kL∞ (Rd ) }kg (n) kF (n) (H⊕H) a
(3.5)
for j = 1, . . . , 4 and some C1 > 0. From here, it is not hard to show that, for every (n) g (n) ∈ Fa (H ⊕ H), the series ∞ X kρa (f )m g (n) kFa (H⊕H) m t m! m=0
converges for 0 < t < (4C1 max{kf kL2(Rd ) , kf kL1 (Rd ) , kf kL∞ (Rd ) })−1 .
Therefore, any vector from $H_{a,\mathrm{fin}}$ is analytic for $\rho_a(f)$. By Nelson's analytic vector criterion (e.g. [37, Theorem X.39]), the lemma follows.

We denote by $\rho_a^\sim(f)$ the closure of $\rho_a(f)$ in $H_a$, which is a selfadjoint operator by Lemma 3.3.

Lemma 3.4. For any $f_1, f_2 \in S(\mathbb R^d)$, the operators $\rho_a^\sim(f_1)$ and $\rho_a^\sim(f_2)$ commute in the sense of their resolutions of the identity.
Proof. Since $\rho_a(f_1)$ is essentially selfadjoint, the set $(\rho_a(f_1) + i\mathbf 1) H_{a,\mathrm{fin}}$ is dense in $H_a$. Furthermore, $(\rho_a(f_1) + i\mathbf 1) H_{a,\mathrm{fin}} \subset H_{a,\mathrm{fin}}$. Thus, by the proof of Lemma 3.3, the operator $\rho_a(f_1) \restriction (\rho_a(f_1) + i\mathbf 1) H_{a,\mathrm{fin}}$ has a dense set of analytic vectors. Hence, the lemma follows from [11, Chap. 5, Theorem 1.15].

Theorem 3.1. Let $k$ be the inverse Fourier transform of a function $\hat k$ satisfying (2.6). Let the Hilbert space $H_a$ and the operators $\rho_a^\sim(f)$, $f \in S(\mathbb R^d)$, be defined as above. Then, there exist a unique probability measure $\mu_a$ on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$ ($B(S'(\mathbb R^d))$ denoting the Borel σ-algebra on $S'(\mathbb R^d)$) and a unique unitary operator $I_a : H_a \to L^2(S'(\mathbb R^d), B(S'(\mathbb R^d)); \mu_a)$ such that $I_a \Omega = 1$ and the following formula holds:
$$I_a \rho_a^\sim(f) I_a^{-1} = \langle \cdot, f \rangle \cdot, \qquad f \in S(\mathbb R^d). \tag{3.6}$$
Remark 3.2. In terms of the spectral theory of commuting selfadjoint operators (e.g. [11, 39]), Theorem 3.1 states that the family $(\rho_a^\sim(f))_{f \in S(\mathbb R^d)}$ has a spectral measure $\mu_a$ on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$. Furthermore, since the operators $\rho_a(f)$ have a Jacobi type form in $F_a(H_1 \oplus H_2)$, this result is close in spirit to [9, 27].

Proof of Theorem 3.1. Let $(h_k)_{k=0}^\infty$ be the sequence of Hermite functions forming an orthonormal basis in $L^2(\mathbb R^d)$ and let $a_k > 0$ be the eigenvalue of the operator $A$ (defined by (2.9)) belonging to the eigenvector $h_k$, $k \in \mathbb Z_+$. We denote by $\mathbb R^\infty := \mathbb R^{\mathbb Z_+}$ the space of all sequences of the form $x = (x_0, x_1, x_2, \dots)$, $x_k \in \mathbb R$, $k \in \mathbb Z_+$, and we endow $\mathbb R^\infty$ with the product topology. The Borel σ-algebra $B(\mathbb R^\infty)$ coincides with the cylinder σ-algebra $C_\sigma(\mathbb R^\infty)$.

Lemma 3.5. There exist a unique probability measure $\tilde\mu_a$ on $(\mathbb R^\infty, B(\mathbb R^\infty))$ and a unique unitary operator $\tilde I_a : H_a \to L^2(\mathbb R^\infty, B(\mathbb R^\infty); \tilde\mu_a)$ such that $\tilde I_a \Omega = 1$ and, for each $k \in \mathbb Z_+$, $\tilde I_a \rho_a^\sim(h_k) \tilde I_a^{-1} = x_k \cdot$, where $x_k \cdot$ denotes the operator of multiplication by $x_k$.

Proof. For $f \in S(\mathbb R^d)$, we have $f = \sum_{k=0}^\infty \langle f, h_k \rangle h_k$, where the series converges in each space $S_p(\mathbb R^d)$, $p \in \mathbb N$, and hence in $S(\mathbb R^d)$. Next, it follows from (3.5) that, for each fixed $G \in H_{a,\mathrm{fin}}$, the mapping
$$S(\mathbb R^d) \ni f \mapsto \rho_a(f) G \in H_a \tag{3.7}$$
is continuous. Therefore, $\Omega$ is a cyclic vector for the family $(\rho_a^\sim(h_k))_{k=0}^\infty$. Thus, $(\rho_a^\sim(h_k))_{k=0}^\infty$ is a countable family of commuting selfadjoint operators having a cyclic vector, and hence the lemma follows from [39, Chap. 1, Theorem 4].
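For each $p \in \mathbb N$, we define the following measurable function on $\mathbb R^\infty$:
$$\mathbb R^\infty \ni x = (x_k)_{k=0}^\infty \mapsto \|x\|_{-p}^2 := \sum_{k=0}^\infty x_k^2 a_k^{-p} \in \mathbb R_+ \cup \{+\infty\}.$$
Let
$$S_{-p} := \{x \in \mathbb R^\infty : \|x\|_{-p}^2 < \infty\}, \quad p \in \mathbb N, \qquad S' := \bigcup_{p \in \mathbb N} S_{-p}.$$
Evidently, $S_{-p}, S' \in B(\mathbb R^\infty)$. By using the monotone convergence theorem and Lemma 3.5, we get
$$\int_{\mathbb R^\infty} \|x\|_{-p}^2\, d\tilde\mu_a(x) = \int_{\mathbb R^\infty} \sum_{k=0}^\infty x_k^2 a_k^{-p}\, d\tilde\mu_a(x) = \sum_{k=0}^\infty a_k^{-p} \int_{\mathbb R^\infty} x_k^2\, d\tilde\mu_a(x) = \sum_{k=0}^\infty a_k^{-p} \|\psi(h_k)\Omega\|_{H_a}^2. \tag{3.8}$$
For some $C_2 > 0$,
$$\max\{\|f\|_{L^2(\mathbb R^d)}, \|f\|_{L^1(\mathbb R^d)}, \|f\|_{L^\infty(\mathbb R^d)}\} \le C_2 \|f\|_{S_d(\mathbb R^d)}, \qquad f \in S(\mathbb R^d), \tag{3.9}$$
and since the inclusion $S_d(\mathbb R^d) \hookrightarrow L^2(\mathbb R^d)$ is of Hilbert–Schmidt type,
$$\sum_{k=0}^\infty a_k^{-d} < \infty. \tag{3.10}$$
By (3.5), (3.8)–(3.10),
$$\int_{\mathbb R^\infty} \|x\|_{-2d}^2\, d\tilde\mu_a(x) \le C_1^2 C_2^2 \sum_{k=0}^\infty a_k^{-d} < \infty.$$
This yields that
$$\tilde\mu_a(S_{-2d}) = \tilde\mu_a(S') = 1. \tag{3.11}$$
Let $B(S')$ denote the trace σ-algebra of $B(\mathbb R^\infty)$ on $S'$. By (3.11), we can consider $\tilde\mu_a$ as a probability measure on $(S', B(S'))$. Noticing that the mapping
$$S' \ni x = (x_0, x_1, x_2, \dots) \mapsto Ex := \sum_{k=0}^\infty x_k h_k \in S'(\mathbb R^d)$$
is a measurable bijection, we define a probability measure $\mu_a$ on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$ by $\mu_a := \tilde\mu_a \circ E^{-1}$, and a unitary operator $U : L^2(S', B(S'); \tilde\mu_a) \to L^2(S'(\mathbb R^d), B(S'(\mathbb R^d)); \mu_a)$ by
$$UF(\omega) := F(E^{-1}\omega), \qquad \omega \in S'(\mathbb R^d).$$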
Setting $I_a := U \tilde I_a$, we get a unitary operator acting from $H_a$ onto $L^2(S'(\mathbb R^d); \mu_a)$ such that $I_a \Omega = 1$ and
$$I_a \rho_a^\sim(h_k) I_a^{-1} = \langle \cdot, h_k \rangle \cdot, \qquad k \in \mathbb Z_+. \tag{3.12}$$
Furthermore, using the continuity of mapping (3.7), we easily conclude from (3.12) that (3.6) holds. Thus, the theorem is proved.

The configuration space $\Gamma_{\mathbb R^d}$ over $\mathbb R^d$ is defined as the set of all locally finite subsets (configurations) in $\mathbb R^d$:
$$\Gamma_{\mathbb R^d} := \{\gamma \subset \mathbb R^d \mid |\gamma \cap \Lambda| < \infty \text{ for each compact } \Lambda \subset \mathbb R^d\}.$$
Here, $|\Lambda|$ denotes the cardinality of a set $\Lambda$. We can identify any $\gamma \in \Gamma_{\mathbb R^d}$ with the positive Radon measure $\sum_{x \in \gamma} \delta_x \in M(\mathbb R^d)$, where $\delta_x$ is the Dirac measure with mass at $x$, $\sum_{x \in \emptyset} \delta_x :=$ zero measure, and $M(\mathbb R^d)$ stands for the set of all positive Radon measures on $B(\mathbb R^d)$. The space $\Gamma_{\mathbb R^d}$ is endowed with the relative topology as a subset of the space $M(\mathbb R^d)$ with the vague topology. We denote by $B(\Gamma_{\mathbb R^d})$ the Borel σ-algebra on $\Gamma_{\mathbb R^d}$.

We endow $D(\mathbb R^d)$ with its natural projective limit topology and denote by $D'(\mathbb R^d)$ the dual space of $D(\mathbb R^d)$. One can show that $\Gamma_{\mathbb R^d}$ belongs to the cylinder σ-algebra $C_\sigma(D'(\mathbb R^d))$, and furthermore, the trace σ-algebra of $C_\sigma(D'(\mathbb R^d))$ on $\Gamma_{\mathbb R^d}$, respectively $S'(\mathbb R^d)$, coincides with $B(\Gamma_{\mathbb R^d})$, respectively $B(S'(\mathbb R^d))$. Thus, any probability measure $\nu$ on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$ can be considered as a measure on $(D'(\mathbb R^d), C_\sigma(D'(\mathbb R^d)))$, and if additionally $\nu(\Gamma_{\mathbb R^d}) = 1$, $\nu$ can be considered as a probability measure on $(\Gamma_{\mathbb R^d}, B(\Gamma_{\mathbb R^d}))$ as well.

Our next aim is to show that $\mu_a$ is supported by $\Gamma_{\mathbb R^d}$. To this end, let us recall the notion of correlation functions of a probability measure $\nu$ on $(\Gamma_{\mathbb R^d}, B(\Gamma_{\mathbb R^d}))$. Let $\hat\otimes$ stand for the symmetric tensor product. For any $g^{(n)} \in D(\mathbb R^d)^{\hat\otimes n}$ (= the space of all smooth, symmetric, compactly supported functions on $(\mathbb R^d)^n$), we define a function $\Gamma_{\mathbb R^d} \ni \gamma \mapsto \langle {:}\gamma^{\otimes n}{:}, g^{(n)} \rangle \in \mathbb R$ by
$$\langle {:}\gamma^{\otimes n}{:}, g^{(n)} \rangle = \sum_{x_1 \in \gamma}\ \sum_{x_2 \in \gamma,\, x_2 \ne x_1} \cdots \sum_{x_n \in \gamma,\, x_n \ne x_1, \dots, x_n \ne x_{n-1}} g^{(n)}(x_1, \dots, x_n) \tag{3.13}$$
(the number of the non-zero summands on the right hand side of (3.13) is finite). The functions $(k_\nu^{(n)})_{n=1}^\infty$, with $k_\nu^{(n)} : (\mathbb R^d)^n \to \mathbb R$ being measurable and symmetric, $n \in \mathbb N$, are called correlation functions of the measure $\nu$ if, for each $g^{(n)} \in D(\mathbb R^d)^{\hat\otimes n}$,
$$\int_{\Gamma_{\mathbb R^d}} \langle {:}\gamma^{\otimes n}{:}, g^{(n)} \rangle\, \nu(d\gamma) = \int_{(\mathbb R^d)^n} g^{(n)}(x_1, \dots, x_n)\, k_\nu^{(n)}(x_1, \dots, x_n)\, dx_1 \cdots dx_n \tag{3.14}$$
(if the measure $\nu$ has correlation functions, then these are a.s. uniquely defined).
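For a finite configuration, the sum in (3.13) runs over ordered $n$-tuples of pairwise distinct points. The following is a minimal sketch of that sum, assuming a small configuration of points on the real line and a hypothetical test function `g`. It also checks the $n = 2$ case of the "subtract the diagonal" identity that underlies the recursion for the kernels ${:}\gamma^{\otimes n}{:}$.

```python
from itertools import permutations, product

def wick_pairing(gamma, g, n):
    """Sum g over ordered n-tuples of pairwise distinct points of gamma, as in (3.13)."""
    return sum(g(*xs) for xs in permutations(gamma, n))

# Hypothetical finite configuration in R^1 and symmetric test function.
gamma = [1, 2, 5, 7]

def g(x, y):
    return x * y + x + y

distinct_pairs = wick_pairing(gamma, g, 2)
all_pairs = sum(g(x, y) for x, y in product(gamma, repeat=2))
diagonal = sum(g(x, x) for x in gamma)

# Sum over distinct pairs = full double sum minus the diagonal contribution.
print(distinct_pairs == all_pairs - diagonal)

# With g = 1 the pairing counts ordered n-tuples: N (N - 1) ... (N - n + 1).
count_triples = wick_pairing(gamma, lambda *xs: 1, 3)
print(count_triples)
```

Integrating this quantity against a measure $\nu$ on configurations and comparing with (3.14) is what identifies $k_\nu^{(n)}$ as the density of ordered $n$-tuples of distinct points.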
As easily seen from (3.13), the kernels ${:}\gamma^{\otimes n}{:} \in D'(\mathbb R^d)^{\hat\otimes n}$ satisfy the recursion relation
$${:}\gamma^{\otimes 1}(x){:} = \gamma(x),$$
$${:}\gamma^{\otimes (n+1)}(x_1, \dots, x_{n+1}){:} = \Big(\gamma(x_{n+1})\, {:}\gamma^{\otimes n}(x_1, \dots, x_n){:} - \sum_{i=1}^{n} \delta(x_{n+1} - x_i)\, {:}\gamma^{\otimes n}(x_1, \dots, x_n){:}\Big)^{\sim}, \qquad n \in \mathbb N, \tag{3.15}$$
where $(\cdot)^\sim$ denotes symmetrization of a function. Replacing $\gamma \in \Gamma_{\mathbb R^d}$ with an arbitrary $\omega \in D'(\mathbb R^d)$, we may now define ${:}\omega^{\otimes n}{:} \in D'(\mathbb R^d)^{\hat\otimes n}$ and introduce, analogously to (3.14), the notion of correlation functions $(k_\nu^{(n)})_{n=1}^\infty$ for any probability measure $\nu$ on $D'(\mathbb R^d)$. (We remark, however, that the introduction of correlation functions for a measure on $D'(\mathbb R^d)$ is only of "technical" nature, since one always expects that a measure having correlation functions is supported by $\Gamma_{\mathbb R^d}$; see the arguments below.)

Following (3.15), we introduce operators
$${:}\rho_a(x){:} = \rho_a(x),$$
$${:}\rho_a(x_{n+1})\rho_a(x_n) \cdots \rho_a(x_1){:} = \Big(\rho_a(x_{n+1})\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:} - \sum_{i=1}^{n} \delta(x_{n+1} - x_i)\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:}\Big)^{\sim}, \tag{3.16}$$
which make sense after integration with test functions. The following proposition shows that ${:}\rho_a(x_n) \cdots \rho_a(x_1){:}$ is the "normal product" of the operators $\rho_a(x_1), \dots, \rho_a(x_n)$ (compare with [35, Secs. 2.B and 2.C]).

Proposition 3.1. For each $n \in \mathbb N$ and $f_1, \dots, f_n \in S(\mathbb R^d)$, we have on $F_{a,\mathrm{fin}}(H \oplus H)$:
$$\int_{(\mathbb R^d)^n} dx_1 \cdots dx_n\, f_1(x_1) \cdots f_n(x_n)\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:} = \int_{(\mathbb R^d)^n} dx_1 \cdots dx_n\, f_1(x_1) \cdots f_n(x_n)\, \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n). \tag{3.17}$$

Proof. We first note that, for $n \ge 2$, the well-definedness of the operator on the right hand side of (3.17) on $F_{a,\mathrm{fin}}(H \oplus H)$ may be proved by arguments analogous to those in the proof of Lemma 3.1. We prove the proposition by induction. For $n = 1$, (3.17) is trivially satisfied. Suppose that (3.17) holds for some
$n \in \mathbb N$. Then, by the induction hypothesis, (1.2), and (3.16), we have
$${:}\rho_a(x_{n+1}) \cdots \rho_a(x_1){:}$$
$$= \Big(\psi^*(x_{n+1})\psi(x_{n+1})\psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n) - \sum_{i=1}^{n} \delta(x_{n+1} - x_i)\, \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n)\Big)^{\sim}$$
$$= \Big(-\psi^*(x_{n+1})\psi^*(x_n)\psi(x_{n+1})\psi^*(x_{n-1}) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n) - \sum_{i=1}^{n-1} \delta(x_{n+1} - x_i)\, \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n)\Big)^{\sim}$$
$$= \Big(\psi^*(x_{n+1})\psi^*(x_n)\psi^*(x_{n-1})\psi(x_{n+1})\psi^*(x_{n-2}) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n) - \delta(x_{n+1} - x_{n-1})\, \psi^*(x_{n-1})\psi^*(x_n)\psi^*(x_{n-2}) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n) - \sum_{i=1}^{n-1} \delta(x_{n+1} - x_i)\, \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n)\Big)^{\sim}$$
$$= \Big(\psi^*(x_{n+1})\psi^*(x_n)\psi^*(x_{n-1})\psi(x_{n+1})\psi^*(x_{n-2}) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n) - \sum_{i=1}^{n-2} \delta(x_{n+1} - x_i)\, \psi^*(x_n) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_n)\Big)^{\sim}$$
$$= \cdots = \big((-1)^n \psi^*(x_{n+1}) \cdots \psi^*(x_1)\psi(x_{n+1})\psi(x_1) \cdots \psi(x_n)\big)^{\sim} = \psi^*(x_{n+1}) \cdots \psi^*(x_1)\psi(x_1) \cdots \psi(x_{n+1}),$$
the formulas above making sense after integration with test functions.

Proposition 3.2. For any $f_1, \dots, f_n \in S(\mathbb R^d)$, $n \in \mathbb N$,
$$\Big(\int_{(\mathbb R^d)^n} dx_1 \cdots dx_n\, f_1(x_1) \cdots f_n(x_n)\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:}\, \Omega, \Omega\Big)_{H_a} = \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, \det(\kappa(x_i - x_j))_{i,j=1}^n\, dx_1 \cdots dx_n, \tag{3.18}$$
where $\kappa(x) := (2\pi)^{-d/2} k(x)$, $x \in \mathbb R^d$.

Proof. By Proposition 3.1,
$$\Big(\int_{(\mathbb R^d)^n} dx_1 \cdots dx_n\, f_1(x_1) \cdots f_n(x_n)\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:}\, \Omega, \Omega\Big)_{H_a}$$
$$= \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, \big(a_1(\bar\kappa_{1,x_n}) \cdots a_1(\bar\kappa_{1,x_1})\, a_1^*(\bar\kappa_{1,x_1}) \cdots a_1^*(\bar\kappa_{1,x_n})\Omega, \Omega\big)_{H_a}\, dx_1 \cdots dx_n$$
$$= \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, \sqrt{n!}\, \big(a_1(\bar\kappa_{1,x_n}) \cdots a_1(\bar\kappa_{1,x_1})\, \bar\kappa_{1,x_1} \wedge \cdots \wedge \bar\kappa_{1,x_n}, \Omega\big)_{H_a}\, dx_1 \cdots dx_n$$
$$= \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, n!\, \big(\bar\kappa_{1,x_1} \otimes \cdots \otimes \bar\kappa_{1,x_n},\ \bar\kappa_{1,x_1} \wedge \cdots \wedge \bar\kappa_{1,x_n}\big)_{H_a}\, dx_1 \cdots dx_n$$
$$= \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, \det\big((\bar\kappa_{1,x_i}, \bar\kappa_{1,x_j})_{H_{\mathbb C}}\big)_{i,j=1}^n\, dx_1 \cdots dx_n. \tag{3.19}$$
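Next, for any $f, g \in S_{\mathbb C}(\mathbb R^d)$,
$$\int_{(\mathbb R^d)^2} (\bar\kappa_{1,x}, \bar\kappa_{1,y})_{H_{\mathbb C}}\, f(x)\, \bar g(y)\, dx\, dy = \int_{(\mathbb R^d)^3} \kappa_1(x - z)\, \bar\kappa_1(y - z)\, f(x)\, \bar g(y)\, dx\, dy\, dz$$
$$= (K_1 g, K_1 f)_{H_{\mathbb C}} = (K g, f)_{H_{\mathbb C}} = (2\pi)^{-d/2} \int_{\mathbb R^d} \int_{\mathbb R^d} k(x - y)\, g(y)\, dy\, f(x)\, dx = \int_{(\mathbb R^d)^2} \kappa(y - x)\, f(x)\, \bar g(y)\, dx\, dy.$$
Hence,
$$(\bar\kappa_{1,x}, \bar\kappa_{1,y})_{H_{\mathbb C}} = \kappa(y - x) \quad \text{a.e. } (x, y) \in (\mathbb R^d)^2. \tag{3.20}$$
Furthermore, for each $x \in \mathbb R^d$,
$$(\bar\kappa_{1,x}, \bar\kappa_{1,x})_{H_{\mathbb C}} = \int_{\mathbb R^d} |\kappa_1(y - x)|^2\, dy = \int_{\mathbb R^d} |\kappa_1(y)|^2\, dy = \int_{\mathbb R^d} (2\pi)^{-d}\, \hat k(\lambda)\, d\lambda = (2\pi)^{-d/2} (F^{-1}\hat k)(0) = \kappa(0). \tag{3.21}$$
Evidently, for any $(x_1, \dots, x_n) \in (\mathbb R^d)^n$,
$$\det(\kappa(x_i - x_j))_{i,j=1}^n = \det(\kappa(x_j - x_i))_{i,j=1}^n. \tag{3.22}$$
Thus, (3.19)–(3.22) imply (3.18).

Corollary 3.1. The measure $\mu_a$ has correlation functions, which are given by
$$k_{\mu_a}^{(n)}(x_1, \dots, x_n) = \det(\kappa(x_i - x_j))_{i,j=1}^n, \qquad (x_1, \dots, x_n) \in (\mathbb R^d)^n,\ n \in \mathbb N. \tag{3.23}$$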
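The determinantal formula (3.23) can be explored numerically. The sketch below assumes $d = 1$ and a Gaussian kernel $\kappa(x) = e^{-x^2/2}$. This kernel is an illustrative choice and has not been checked against condition (2.6); the function names are hypothetical. The determinant is evaluated by the Leibniz formula, which is adequate for a handful of points.

```python
import math
from itertools import permutations

def kappa(x):
    """Illustrative kernel (assumption): Gaussian, positive definite, kappa(0) = 1."""
    return math.exp(-0.5 * x * x)

def perm_sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(m):
    """Leibniz formula; fine for the small matrices used here."""
    n = len(m)
    return sum(perm_sign(p) * math.prod(m[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def fermion_correlation(points):
    """k^(n)(x_1, ..., x_n) = det(kappa(x_i - x_j)) as in (3.23)."""
    return det([[kappa(x - y) for y in points] for x in points])

def hadamard_bound(n, sup_kappa):
    """Right-hand side of a bound of the form (sup|kappa|)^n n^(n/2)."""
    return sup_kappa ** n * n ** (n / 2)

two_point = fermion_correlation([0.0, 1.0])        # kappa(0)^2 - kappa(1)^2
coincident = fermion_correlation([0.3, 0.3, 1.0])  # two equal rows, so the determinant vanishes
four_point = fermion_correlation([0.0, 0.7, 1.5, 3.0])
print(two_point, coincident, four_point, hadamard_bound(4, 1.0))
```

The vanishing of the correlation function at coincident points is the repulsion that characterizes fermion (determinantal) point processes, and the growth $n^{n/2}$ from Hadamard's inequality is the type of bound used for $k_{\mu_a}^{(n)}$.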
Proof. By Theorem 3.1, we have, for any $f_1, \dots, f_n \in S(\mathbb R^d)$,
$$I_a^{-1}\big(\langle {:}\cdot^{\otimes n}{:}, f_1 \hat\otimes \cdots \hat\otimes f_n \rangle\big) = \int_{(\mathbb R^d)^n} (f_1 \hat\otimes \cdots \hat\otimes f_n)(x_1, \dots, x_n)\, {:}\rho_a(x_1) \cdots \rho_a(x_n){:}\, \Omega.$$
From here and Proposition 3.2 the statement easily follows.

Theorem 3.2. Let the conditions of Theorem 3.1 be fulfilled. Then, $\mu_a(\Gamma_{\mathbb R^d}) = 1$, the correlation functions of $\mu_a$ are given by (3.23), and the Fourier transform of $\mu_a$ is calculated as follows: for each $f \in S(\mathbb R^d)$,
$$\int e^{i\langle \omega, f \rangle}\, \mu_a(d\omega) = \sum_{n=0}^{\infty} \frac{1}{n!} \int_{(\mathbb R^d)^n} (e^{if(x_1)} - 1) \cdots (e^{if(x_n)} - 1)\, \det(\kappa(x_i - x_j))_{i,j=1}^n\, dx_1 \cdots dx_n. \tag{3.24}$$

Proof. We evidently have
$$|\kappa(x)| \le (2\pi)^{-d} \|\hat k\|_{L^1(\mathbb R^d)} \qquad \forall\, x \in \mathbb R^d.$$
Hence, by [34, Corollary 3] and Corollary 3.1,
$$|k_{\mu_a}^{(n)}(x_1, \dots, x_n)| \le \big((2\pi)^{-d} \|\hat k\|_{L^1(\mathbb R^d)}\big)^n n^{n/2} \qquad \forall\, (x_1, \dots, x_n) \in (\mathbb R^d)^n,\ n \in \mathbb N. \tag{3.25}$$
By [12, Theorem 2] (see also [26, Theorem 6.5]), the bound (3.25) implies that the measure $\mu_a$ is concentrated on $\Gamma_{\mathbb R^d}$. Finally, formula (3.24) follows in a standard way from Lemma 3.1 and bound (3.25) (see e.g. [12, Remark 2]).

By Theorem 3.2, $\mu_a$ is a fermion process [15, 29], or a determinantal random point field in terms of [43].

Remark 3.3. Let us suppose that, in addition to condition (2.6), the function $\hat k$ satisfies
$$\int_{\mathbb R^d} \hat k(\lambda) |\lambda|^n\, d\lambda < \infty, \qquad \forall\, n \in \mathbb N.$$
Then, using formula (1.3), for each $v \in V_0(\mathbb R^d)$, one can construct $J(v)$ as a selfadjoint operator in $F(H_1 \oplus H_2)$. Here, $V_0(\mathbb R^d)$ denotes the set of all smooth, compactly supported vector fields on $\mathbb R^d$. Thus, one gets a representation of the full algebra $g$ (see Sec. 1). However, it is still an open problem whether $H_a$ is an invariant subspace for the operators $J(v)$. If it were so, we could evidently construct a representation of the algebra $g$, as well as of the group $G$, in the space $L^2(\Gamma_{\mathbb R^d}; \mu_a)$.
3.2. Bosonic case

We now suppose that (2.13) holds. For each $x \in \mathbb R^d$, we introduce a particle density operator $\rho_s(x) := \varphi^*(x)\varphi(x)$, acting continuously from $F_s(\Phi)$ into $F_s^*(\Phi)$. Then, the operators
$$\rho_s(f) := \int_{\mathbb R^d} dx\, f(x)\, \rho_s(x), \qquad f \in S(\mathbb R^d),$$
act continuously on $F_{\mathrm{fin}}(H \oplus H)$. Using (1.1) and an estimate of type (3.5), we show that the $\rho_s(f)$ are essentially selfadjoint and their closures $\rho_s^\sim(f)$ constitute a cyclic family of commuting selfadjoint operators in the Hilbert space $H_s$, the closure of the linear span of the vectors
$$\{\Omega,\ \rho_s(f_1) \cdots \rho_s(f_n)\Omega \mid f_1, \dots, f_n \in S(\mathbb R^d),\ n \in \mathbb N\}$$
in $F_s(H \oplus H)$. We then construct the spectral measure $\mu_s$ of the operator family $(\rho_s^\sim(f))_{f \in S(\mathbb R^d)}$ as a probability measure on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$. Furthermore, with the help of formula (2.5) we show that $\mu_s$ has correlation functions, which are given by the following formula:
$$k_{\mu_s}^{(n)}(x_1, \dots, x_n) = \operatorname{per}(\kappa(x_i - x_j))_{i,j=1}^n, \qquad (x_1, \dots, x_n) \in (\mathbb R^d)^n,\ n \in \mathbb N, \tag{3.26}$$
where $\kappa(x) := (2\pi)^{-d/2} k(x)$. Next, for every bounded $\Lambda \in B(\mathbb R^d)$, we evidently have the following estimate:
$$\frac{1}{n!} \int_{\Lambda^n} |k_{\mu_s}^{(n)}(x_1, \dots, x_n)|\, dx_1 \cdots dx_n \le (|\Lambda| C_3)^n, \tag{3.27}$$
where $|\Lambda|$ denotes the volume of $\Lambda$ and $C_3 := \sup_{x \in \mathbb R^d} |\kappa(x)| < \infty$. Hence, by (3.27) and [12, Theorem 2], we get $\mu_s(\Gamma_{\mathbb R^d}) = 1$. Thus, we get the following

Theorem 3.3. Let $k$ be the inverse Fourier transform of a function $\hat k$ satisfying (2.13). Let the Hilbert space $H_s$ and the operators $\rho_s^\sim(f)$, $f \in S(\mathbb R^d)$, be defined as above. Then, there exist a unique probability measure $\mu_s$ on $(S'(\mathbb R^d), B(S'(\mathbb R^d)))$ and a unique unitary operator $I_s : H_s \to L^2(S'(\mathbb R^d); \mu_s)$ such that $I_s \Omega = 1$ and
$$I_s \rho_s^\sim(f) I_s^{-1} = \langle \cdot, f \rangle \cdot, \qquad f \in S(\mathbb R^d).$$
Furthermore, $\mu_s(\Gamma_{\mathbb R^d}) = 1$ and the correlation functions $(k_{\mu_s}^{(n)})_{n=0}^\infty$ of the measure $\mu_s$ are given by formula (3.26).
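The permanent in (3.26) and the pointwise bound behind (3.27) can be illustrated in the same way as the determinant in the fermionic case. The sketch assumes $d = 1$ and the Gaussian kernel $\kappa(x) = e^{-x^2/2}$, so that $C_3 = 1$. This kernel is an illustrative choice not checked against (2.13), and the function names are hypothetical. Since a permanent is a sum of $n!$ products of $n$ entries, $|\operatorname{per}(\kappa(x_i - x_j))| \le n!\, C_3^n$, which gives (3.27) after integration over $\Lambda^n$.

```python
import math
from itertools import permutations

def kappa(x):
    """Illustrative kernel (assumption): Gaussian with sup|kappa| = kappa(0) = 1."""
    return math.exp(-0.5 * x * x)

def permanent(m):
    """Sum over permutations without signs; fine for small matrices."""
    n = len(m)
    return sum(math.prod(m[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def boson_correlation(points):
    """k^(n)(x_1, ..., x_n) = per(kappa(x_i - x_j)) as in (3.26)."""
    return permanent([[kappa(x - y) for y in points] for x in points])

C3 = 1.0  # sup of |kappa| for the kernel above
two_point = boson_correlation([0.0, 1.0])  # kappa(0)^2 + kappa(1)^2
points = [0.0, 0.7, 1.5, 3.0]
four_point = boson_correlation(points)
pointwise_bound = math.factorial(len(points)) * C3 ** len(points)
print(two_point, four_point, pointwise_bound)
```

In contrast with the determinantal case, the two-point function here exceeds $\kappa(0)^2$, which reflects the clustering of points in a boson process.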
By [28, 29] (see also [15]), $\mu_s$ is a boson process.

Acknowledgments

I am grateful to Yu. Kondratiev for drawing my attention to the fermion processes and for his permanent interest in this work. I would like to thank S. Albeverio, G. Goldin and Yu. Samoilenko for useful discussions. I am also grateful to the referees of the paper for many suggestions on improvement of the first version of the paper. The financial support of SFB 256, DFG Research Projects 436 RUS 113/593, and BMBF Research Project UKR-004-99 is gratefully acknowledged.
References

[1] S. Albeverio, Yu. G. Kondratiev and M. Röckner, Analysis and geometry on configuration spaces, J. Func. Anal. 154 (1998) 444–500.
[2] S. Albeverio, Yu. G. Kondratiev and M. Röckner, Diffeomorphism groups and current algebras: configuration space analysis in quantum theory, Rev. Math. Phys. 11 (1999) 1–23.
[3] H. Araki, Factorizable representation of current algebra. Non commutative extension of the Lévy-Kinchin formula and cohomology of a solvable group with values in a Hilbert space, Publ. RIMS Kyoto Univ. 5 (1969/70) 361–422.
[4] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. RIMS Kyoto Univ. 6 (1970/71) 385–442.
[5] H. Araki and E. Woods, Representations of the C.C.R. for a nonrelativistic infinite free Bose gas, J. Math. Phys. 4 (1963) 637–662.
[6] H. Araki and W. Wyss, Representations of canonical anticommutation relation, Helv. Phys. Acta 37 (1964) 136–159.
[7] E. Balslev and A. Verbeure, States on Clifford algebras, Commun. Math. Phys. 7 (1968) 55–76.
[8] Ch. Benard and O. Macchi, Detection and "emission" processes of quantum particles in a "chaotic state", J. Math. Phys. 14 (1973) 155–167.
[9] Yu. M. Berezansky, Commutative Jacobi fields in Fock space, Integral Equations Operator Theory 30 (1998) 163–190.
[10] Yu. M. Berezansky, Poisson measure as the spectral measure of Jacobi field, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3 (2000) 121–139.
[11] Yu. M. Berezansky and Yu. G. Kondratiev, Spectral Methods in Infinite Dimensional Analysis, Kluwer, Dordrecht, 1994.
[12] Yu. M. Berezansky, Yu. G. Kondratiev, T. Kuna and E. Lytvynov, On a spectral representation for correlation measures in configuration space analysis, Meth. Func. Anal. and Topol. 5(4) (1999) 87–100.
[13] A. Borodin and G. Olshanski, Point processes and the infinite symmetric group. Part III: Fermion point processes, Preprint, 1998, available via http://xxx.lanl.gov/abs/math.RT/9804088.
[14] O. Bratteli and W. D. Robinson, Operator Algebras and Quantum-Statistical Mechanics I, II, Springer-Verlag, New York/Berlin, 1979, 1981.
[15] D. J. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes, Springer-Verlag, New York, 1988.
[16] R. Dashen and D. H. Sharp, Currents as coordinates for hadrons, Phys. Rev. 165 (1968) 1857–1867.
[17] G. F. Dell'Antonio, Structure of the algebra of some free systems, Comm. Math. Phys. 9 (1968) 81–117.
[18] K.-H. Fichtner, On the position distribution of the ideal Bose gas, Math. Nachr. 151 (1991) 59–67.
[19] K.-H. Fichtner and W. Freudenberg, Point processes and states of infinite boson systems, Preprint NTZ Leipzig, 1986.
[20] K.-H. Fichtner and W. Freudenberg, Point processes and the position distribution of infinite boson systems, J. Statist. Phys. 47 (1987) 959–978.
[21] A. Girard, Current algebras of free systems at finite temperature, J. Math. Phys. 14 (1973) 353–365.
[22] G. A. Goldin, Nonrelativistic current algebras as unitary representations of groups, J. Math. Phys. 12 (1971) 462–487.
[23] G. A. Goldin, J. Grodnik, R. T. Powers and D. H. Sharp, Nonrelativistic current algebra in the N/V limit, J. Math. Phys. 15 (1974) 88–100.
[24] G. A. Goldin, R. Menikoff and D. H. Sharp, Particle statistics from induced representations of a local current group, J. Math. Phys. 21 (1980) 650–664.
[25] R. L. Hudson and K. R. Parthasarathy, Quantum Ito's formula and stochastic evolutions, Commun. Math. Phys. 93 (1984) 301–323.
[26] Yu. G. Kondratiev and T. Kuna, Harmonic analysis on configuration spaces I. General theory, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 5 (2002) 201–203.
[27] E. W. Lytvynov, Multiple Wiener integrals and non-Gaussian white noises: a Jacobi field approach, Meth. Func. Anal. and Topol. 1 (1995) 61–85.
[28] O. Macchi, Distribution statistique des instants d'émission des photoélectrons d'une lumière thermique, C. R. Acad. Sci. Paris Ser. A 272 (1971) 437–440.
[29] O. Macchi, The coincidence approach to stochastic point processes, Adv. Appl. Prob. 7 (1975) 83–122.
[30] O. Macchi, The Fermion process - a model of stochastic point process with repulsive points, in Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the Eighth European Meeting of Statisticians (Tech. Univ. Prague, Prague, 1974), Vol. A, Reidel, Dordrecht, 1977, pp. 391–398.
[31] J. Manuceau and A. Verbeure, Representations of anticommutation relations and Bogolioubov transformations, Comm. Math. Phys. 8 (1968) 315–326.
[32] P. A. Meyer, Quantum Probability for Probabilists, Lecture Notes in Mathematics, Vol. 1538, Springer-Verlag, Berlin/New York, 1993.
[33] R. Menikoff, The Hamiltonian and generating functional for a nonrelativistic local current algebra, J. Math. Phys. 15 (1974) 1138–1152.
[34] R. Menikoff, Generating functionals determining representations of a nonrelativistic local current algebra in the N/V limit, J. Math. Phys. 15 (1974) 1394–1408.
[35] R. Menikoff and D. H. Sharp, Representations of a local current algebra: Their dynamical determination, J. Math. Phys. 16 (1975) 2341–2352.
[36] R. T. Powers and E. Størmer, Free states of the canonical anticommutation relations, Comm. Math. Phys. 16 (1970) 1–33.
[37] M. Reed and B. Simon, Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-Adjointness, Academic Press, New York, 1975.
[38] G. Rideau, On some representations of the anticommutation relations, Comm. Math. Phys. 9 (1968) 229–241.
[39] Y. S. Samoilenko, Spectral Theory of Families of Self-Adjoint Operators, Kluwer, Dordrecht, 1987.
[40] D. Shale and W. F. Stinespring, States on the Clifford algebra, Ann. Math. 80 (1964) 365–381.
[41] T. Shirai and Y. Takahashi, Random point field associated with certain Fredholm determinant II: fermion shift and its ergodic and Gibbs properties, Preprint, 2001, available via http://neptune.math.titech.ac.jp/shirai/preprint.html.
[42] T. Shirai and H. J. Yoo, Glauber dynamics for fermion point processes, Preprint, 2001, available via http://neptune.math.titech.ac.jp/shirai/preprint.html.
[43] A. Soshnikov, Determinantal random point fields, Russian Math. Surveys 55 (2000) 923–975.
[44] A. Soshnikov, Gaussian fluctuation for the number of particles in Airy, Bessel, sine, and other determinantal random point fields, J. Statist. Phys. 100 (2000) 491–522.
[45] A. Soshnikov, Gaussian limit for determinantal random point fields, Preprint, 2001, available via http://xxx.lanl.gov/abs/math.PR/0006037.
[46] H. Spohn, Interacting Brownian particles: A study of Dyson's model, in Hydrodynamic Behavior and Interacting Particle Systems, G. Papanicolaou (ed.) (Springer-Verlag, New York, 1987), pp. 151–179.
[47] D. Surgailis, On multiple Poisson stochastic integrals and associated Markov semigroups, Probab. Math. Statist. 3 (1984) 217–239.
[48] A. M. Vershik, I. M. Gelfand, and M. I. Graev, Representations of the group of diffeomorphisms, Russian Math. Surveys 30(6) (1975) 1–50.
November 6, 2002 9:0
WSPC/148-RMP
00152
Reviews in Mathematical Physics, Vol. 14, No. 10 (2002) 1099–1113 © World Scientific Publishing Company
POSITIVE AND NEGATIVE CORRELATIONS FOR CONDITIONAL ISING DISTRIBUTIONS
CAMILLO CAMMAROTA
Dipartimento di Matematica, Università di Roma “La Sapienza”, Piazzale A. Moro 2, 00185 Roma, Italia
[email protected]

Received 27 March 2002
Revised 18 July 2002

In the Ising model at zero external field with ferromagnetic first neighbors interaction the Gibbs measure is investigated using the group properties of the contours configurations. Correlation inequalities expressing positive dependence among groups and comparison among groups and cosets are used. An improved version of the Griffiths' inequalities is proved for the Gibbs measure conditioned to a subgroup. Examples of positive and negative correlations among the spin variables are proved under conditioning to a contour or to a separation line.

Keywords: Correlation inequality; positive correlation; negative correlation; Ising model; contours configurations; groups.

Mathematics Subject Classification 2000: 82B20; 60B15
1. Introduction

We consider the Ising model on a finite set $V$, which to fix the ideas is chosen to be a cube of the $d$-dimensional integer lattice. The configuration space is $S_V = \{-1, +1\}^V$ and an element $s \in S_V$ is a finite sequence $s = (s_i, i \in V)$. Two sites $i$ and $j$ are called first neighbors if their distance is 1; we denote by $B$ the set of the first neighbors pairs of $V$. The Hamiltonian is
$$H(s) = -\sum_{\{i,j\} \in B} J_{ij} s_i s_j,$$
where the interaction $J_{ij}$ is assumed to be ferromagnetic, i.e. $J_{ij} \ge 0$.
The Gibbs measure $\mu$ at inverse temperature $\beta$ with open boundary conditions is defined as
$$\mu(s) = \frac{e^{-\beta H(s)}}{\sum_{s \in S_V} e^{-\beta H(s)}}. \tag{1}$$
We denote by $C$ the set of unit lines (or surfaces in $d > 2$ dimensions) which separate the first neighbors pairs; if $c \in C$, we denote the neighboring pair that is separated by $c$ by $\{c_1, c_2\}$. Given the spins configuration $s$, the associated contours configuration $\gamma_s$ is the subset of $C$ defined as
$$\gamma_s = \{c \in C \mid s_{c_1} \ne s_{c_2}\}.$$
The contours configurations $\gamma_s$ are elements of $\Omega_C = \{0, 1\}^C$; the set of contours configurations is
$$\Gamma = \{\gamma \in \Omega_C \mid \gamma = \gamma_s \text{ for some } s \in S_V\}.$$
In order to simplify the notations we consider an interaction, denoted by $J$, which is not dependent on the pair, but our results do not depend on this assumption. On the contours configurations one can define the probability measure
$$\lambda_\Gamma(\gamma) = \frac{e^{-2\beta J |\gamma|}}{\sum_{\gamma \in \Gamma} e^{-2\beta J |\gamma|}}, \tag{2}$$
where $|\gamma|$ denotes the length of the contours configuration $\gamma$. The Gibbs measure can be expressed in terms of the contours by means of the obvious equation
$$\mu(s) = \lambda_\Gamma(\gamma_s). \tag{3}$$
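The passage from spins to contours can be checked by brute force on a small lattice. The sketch below assumes a $3 \times 2$ block of sites with open boundary conditions and identifies each unit line $c$ with the first-neighbor pair it separates. The function names and the parameter values are hypothetical. Because $\gamma_s = \gamma_{-s}$, the comparison is made at the level of the spin-flip pair $\{s, -s\}$: the Gibbs weight of the pair is compared with $\lambda_\Gamma(\gamma_s)$.

```python
import math
from itertools import product

def lattice(width, height):
    """Sites of a width x height block and its first-neighbor pairs."""
    sites = [(x, y) for x in range(width) for y in range(height)]
    bonds = [(a, b) for a in sites for b in sites
             if a < b and abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1]
    return sites, bonds

def contour(spins, bonds):
    """gamma_s: the set of unit lines (labelled by bonds) separating unequal spins."""
    return frozenset(b for b in bonds if spins[b[0]] != spins[b[1]])

def compare_measures(width=3, height=2, beta=0.7, coupling=1.0):
    sites, bonds = lattice(width, height)
    configs = [dict(zip(sites, vals)) for vals in product((-1, 1), repeat=len(sites))]
    # Gibbs weights exp(-beta H(s)) with H(s) = -J sum s_i s_j over first neighbors.
    gibbs = [math.exp(beta * coupling * sum(s[a] * s[b] for a, b in bonds)) for s in configs]
    z_spin = sum(gibbs)
    contours = [contour(s, bonds) for s in configs]
    gamma_set = set(contours)
    # Contour weights exp(-2 beta J |gamma|) as in Eq. (2).
    lam = {g: math.exp(-2.0 * beta * coupling * len(g)) for g in gamma_set}
    z_contour = sum(lam.values())
    max_error = 0.0
    for g in gamma_set:
        mu_pair = sum(w for w, c in zip(gibbs, contours) if c == g) / z_spin
        max_error = max(max_error, abs(mu_pair - lam[g] / z_contour))
    return len(configs), len(gamma_set), max_error

n_configs, n_contours, max_error = compare_measures()
print(n_configs, n_contours, max_error)
```

The agreement rests on the identity $-\beta H(s) = \beta J (|B| - 2|\gamma_s|)$: the factor $e^{\beta J |B|}$ cancels in the normalization, leaving the weight $e^{-2\beta J |\gamma_s|}$.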
If $C$ is any finite set, $\Omega_C$ has a natural group structure: the product of two sets is their symmetric difference and the identity is the empty set. The contours configurations set $\Gamma$ is in turn a subgroup of $\Omega_C$: the product of two contours configurations $\gamma_1$ and $\gamma_2$, denoted by $\gamma_1 \cdot \gamma_2$, is a contours configuration and the identity is the empty contours configuration. For other properties see Sec. 3.

The group structure of the classical lattice systems has been widely investigated in a general context (see for instance [7]). This structure has been recognized to be useful in connection with the correlation inequalities in [8] and used in [1]. In [3] the Gibbs measure was investigated as a Bernoulli one conditioned to a group and some correlation inequalities and monotonicity properties were proved; applications to the Ising model were also provided. In particular, the extension of the measure in Eq. (2) to any subgroup $G$ of $\Omega_C$, where $C$ is any finite set, was considered there. The conditional probability with respect to $G$ is defined by putting, for any $E \subset G$,
$$\lambda_G(E) = \sum_{\gamma \in E} \lambda_G(\gamma).$$
In [3] the following correlation inequalities were proved.
Proposition 1.1. Let $G$ be a subgroup of $\Omega_C$ and let $E$, $F$ be subgroups of $G$ and $T$ a coset of $E$; then
$$\lambda_G(E) \ge \lambda_G(T), \tag{4}$$
$$\lambda_G(E \cap F) \ge \lambda_G(E)\lambda_G(F). \tag{5}$$
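Inequalities (4) and (5) can be tested numerically on small groups. The sketch below assumes that subsets of a four-element set $C$ are encoded as bitmasks, so that the symmetric difference is the XOR of masks, and that the weight of $\gamma$ is $e^{-2\beta J |\gamma|}$ as in Eq. (2). The generators and the function names are hypothetical choices for illustration only, and a check on one example is of course not a proof.

```python
import math

def span(generators):
    """Subgroup of (subsets of C, symmetric difference) generated by the given bitmasks."""
    group = {0}
    for g in generators:
        group |= {h ^ g for h in group}
    return group

def weight(mask, beta_j):
    """Bernoulli-type weight exp(-2 beta J |gamma|), with |gamma| the number of set bits."""
    return math.exp(-2.0 * beta_j * bin(mask).count("1"))

def lam(event, group, beta_j):
    """lambda_G(event) for an event contained in the group G."""
    return sum(weight(m, beta_j) for m in event) / sum(weight(m, beta_j) for m in group)

def check_proposition(g_gens, e_gens, f_gens, beta_j, tol=1e-12):
    G, E, F = span(g_gens), span(e_gens), span(f_gens)
    if not (E <= G and F <= G):
        raise ValueError("E and F must be subgroups of G")
    # Inequality (4): every coset T = t . E has probability at most that of E.
    cosets_ok = all(lam({t ^ e for e in E}, G, beta_j) <= lam(E, G, beta_j) + tol for t in G)
    # Inequality (5): positive dependence between the subgroups E and F.
    product_ok = lam(E & F, G, beta_j) + tol >= lam(E, G, beta_j) * lam(F, G, beta_j)
    return cosets_ok, product_ok

# G: even-cardinality subsets of a 4-element set; E and F are subgroups of G.
result = check_proposition([0b0011, 0b0110, 0b1100], [0b0011, 0b1100], [0b0110], beta_j=0.4)
print(result)
```

At $\beta J = 0$ the weights are uniform and both inequalities hold with equality in this example, so the numerical tolerance `tol` matters there.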
The measure $\lambda_G$ can be considered as a Bernoulli measure on $\Omega_C$ conditioned to the set $G$. If $G$ is a subgroup, the inequality (5) states a positive dependence among pairs of events that are subgroups of $G$. It is reminiscent of the well known FKG inequality [4] for monotone events, but the kind of events considered here is very different.

The aim of this paper is to discuss some applications of these inequalities using the contours representation of the Ising model. In particular we shall give an improved version of the Griffiths' inequalities [6]. These essentially state that the spins variables are positively correlated. We shall prove that this is also true for the Gibbs measure conditioned to a subgroup. We remark that our results depend on symmetry arguments (the external field is zero) and on the assumption of first neighbors interaction between two valued variables. We use them to analyze a new type of problem. In particular we give examples of positive and negative correlations among the spins for the measure conditioned to a contour or to the separation line between the phases.

The next section is devoted to these applications; the other one contains a self-contained proof of the inequalities. The proof, which tries to simplify the one given in [3], is based on several ingredients. The first one is an expansion of the Gibbs weight based on the ferromagnetic assumption; the second one exploits the group properties of the contours configurations; the third one makes use of the FKG theorem. This theorem has been used as a powerful tool for proving both known and new inequalities. We refer for instance to [9] and [1].

2. Applications

We discuss the two-dimensional Ising model, but our arguments are independent of the dimensionality. We first consider open boundary conditions. The unit lines in $C$ are incident in vertices of semi-integer coordinates. Let us denote by $I$ the set of vertices which are surrounded by four sites of $V$. More precisely, if $(i_1, i_2)$ is a vertex of semi-integer coordinates, it belongs to $I$ if the four sites $(i_1 \pm 1/2, i_2 \pm 1/2)$ belong to $V$. The group $\Gamma$ is characterized by the following parity condition: a contours configuration $\gamma$ belongs to $\Gamma$ if the number of lines of $\gamma$ that are incident in any vertex of $I$ is even (zero, two or four). This group can be considered as the intersection of the local groups $K_i$ just defined:
$$\Gamma = \bigcap_{i \in I} K_i.$$
Let us denote by ∂I the set of vertices which are surrounded by a number of sites of V greater than zero and smaller than four (boundary vertexes). In ∂I the contours configurations do not satisfy the parity condition (the incident lines are zero or
one). Hence the contours can be "open" in $\partial I$.

The spins configurations set $S_V$ also has a natural group structure: the product of two spins configurations $s$ and $t$ is given by the ordinary pointwise product, that we denote by $s \times t$, with the obvious identity. We notice that
$$\gamma_s \cdot \gamma_t = \gamma_{s \times t}.$$
This implies that the map $s \to \gamma_s$ maps subgroups into subgroups. This map is not injective since $\gamma_s = \gamma_{-s}$. The cylindrical event $\{s \mid s_i = 1\}$ is a subgroup but the event $\{s \mid s_i = -1\}$ is not. We are interested in the conditional measure with respect to a given subset $G$ of $S_V$, defined as
$$\mu_G(s) = \frac{e^{-\beta H(s)}}{\sum_{s \in G} e^{-\beta H(s)}} \tag{6}$$
if $s \in G$ and zero otherwise. We consider only events which are closed under the spin flip operation. If $E$ is such an event, we denote by $\tilde E$ its image in the contours configurations. Hence the Gibbs measure of $E$ conditioned to $G$ is represented in terms of a conditioned measure on the contours by
$$\mu_G(E) = \lambda_{\tilde G}(\tilde E). \tag{7}$$
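The identity $\gamma_s \cdot \gamma_t = \gamma_{s \times t}$ and the non-injectivity of $s \to \gamma_s$ can be verified exhaustively on a small lattice. The sketch assumes a $3 \times 2$ block of sites and labels each unit line by the first-neighbor pair it separates. The function names are hypothetical.

```python
from itertools import product

def lattice(width, height):
    """Sites of a width x height block and its first-neighbor pairs."""
    sites = [(x, y) for x in range(width) for y in range(height)]
    bonds = [(a, b) for a in sites for b in sites
             if a < b and abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1]
    return sites, bonds

def contour(spins, sites, bonds):
    """gamma_s for a spin configuration given as a tuple ordered like `sites`."""
    value = dict(zip(sites, spins))
    return frozenset(b for b in bonds if value[b[0]] != value[b[1]])

def is_homomorphism(width=3, height=2):
    """Check gamma_s . gamma_t == gamma_{s x t} for all pairs (s, t)."""
    sites, bonds = lattice(width, height)
    configs = list(product((-1, 1), repeat=len(sites)))
    for s in configs:
        for t in configs:
            pointwise = tuple(a * b for a, b in zip(s, t))
            # The group product of contours is the symmetric difference.
            if contour(s, sites, bonds) ^ contour(t, sites, bonds) != contour(pointwise, sites, bonds):
                return False
    return True

def kernel(width=3, height=2):
    """Spin configurations mapped to the empty contours configuration."""
    sites, bonds = lattice(width, height)
    return [s for s in product((-1, 1), repeat=len(sites)) if not contour(s, sites, bonds)]

print(is_homomorphism(), kernel())
```

The kernel consists of the two constant configurations, which is why only events closed under the spin flip are transported faithfully to the contours.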
2.1. Comparison with the Griffiths' inequalities

We now consider the relationship between inequality (5) and the Griffiths' correlation inequalities [6]
$$\langle s_A \rangle \ge 0, \tag{8}$$
$$\langle s_A s_B \rangle \ge \langle s_A \rangle \langle s_B \rangle, \tag{9}$$
where $\langle\ \rangle$ denotes the expectation with respect to the Gibbs measure and $s_A = \prod_{i \in A} s_i$. We also denote by $\langle s_A \rangle_G$ the conditional expectation of $s_A$ with respect to $G$ and if $G$ is any subgroup we state an improved version of the inequalities.

Proposition 2.1. Let $G$ be any subgroup closed under the spin flip of the spins configurations space and $A$, $B$ disjoint lattice sets of even cardinality; then
$$\langle s_A \rangle_G \ge \langle s_A \rangle, \tag{10}$$
and so
$$\langle s_A \rangle_G \ge 0; \tag{11}$$
furthermore
$$\langle s_A s_B \rangle_G \ge \langle s_A \rangle_G \langle s_B \rangle_G. \tag{12}$$

Proof. Let us denote by $\chi_A$ the indicator function of the event $E_A = \{s \mid s_A = 1\}$. This event is a group and is closed under the spin flip. One has
$$s_A = 2\chi_A(s) - 1$$
and
\[ \langle s_A \rangle_G = 2\mu_G(E_A) - 1 = 2\lambda_{\tilde G}(\tilde E_A) - 1 , \]
where $\tilde G$ is a subgroup of Γ. Using the inequality (5) one has
\[ \lambda_{\tilde G}(\tilde E_A) = \frac{\lambda_\Gamma(\tilde E_A \cap \tilde G)}{\lambda_\Gamma(\tilde G)} \ge \lambda_\Gamma(\tilde E_A) , \]
and so
\[ \langle s_A \rangle_G \ge \langle s_A \rangle \ge 0 . \]
We also have
\[ \langle s_A s_B \rangle_G = \langle (2\chi_A - 1)(2\chi_B - 1) \rangle_G = 4\lambda_{\tilde G}(\tilde E_A \cap \tilde E_B) - 2\lambda_{\tilde G}(\tilde E_A) - 2\lambda_{\tilde G}(\tilde E_B) + 1 . \]
Since the events $\tilde E_A$ and $\tilde E_B$ are subgroups, the inequality (5) gives
\[ \lambda_{\tilde G}(\tilde E_A \cap \tilde E_B) \ge \lambda_{\tilde G}(\tilde E_A)\, \lambda_{\tilde G}(\tilde E_B) , \]
and then the inequality (12) follows.
Remark 2.1. The inequalities (11) and (12) can be obtained also using the standard method of proof in the natural spins representation (see [5]). According to this method the main point is now to prove that for any subgroup G and any lattice subset A one has
\[ \sum_{s \in G} s_A \ge 0 . \tag{13} \]
If $s'$ is a duplicated spins configuration, one obviously has
\[ \sum_{s \in G} \sum_{s' \in G} s_A s'_A \ge 0 . \]
Since s and $s'$ are both elements of the group G, there is a unique t ∈ G such that $s' = ts$. Hence the left hand side is
\[ \sum_{(s,s') \in G \times G} s_A s'_A = \sum_{(s,t) \in G \times G} s_A t_A s_A = |G| \sum_{t \in G} t_A \]
and this proves the inequality (13). If L is a coset of G it is natural to ask for the sign of
\[ \sum_{s \in L} s_A . \tag{14} \]
Since L = l × G for some l ∈ L, one has
\[ \sum_{s \in L} s_A = l_A \sum_{s \in G} s_A \]
and in general there is no simple inequality for this sum. Hence we are led to consider particular cosets and this can be naturally done in the contours description.
2.2. Conditioning to a contour or to a separation line
We are interested in conditioning to an event which is a cylinder in the contours. Given α ⊂ C the cylinder L(α) = {γ ∈ Γ|γ ∩ α = α} is the set of contours configurations that occupy the lines in α; given η ⊂ C the cylinder G(η) = {γ ∈ Γ|γ ∩ η = ∅} is the set of configurations that occupy only lines outside η. In the 0, 1 language, L(α) is a cylinder of 1’s and G(η) is a cylinder of 0’s. It is easy to check that G(η) is a subgroup of Γ and L(α) is a coset of G(α). If α ∩ η = ∅ we also consider the cylinder G(η) ∩ L(α), which is a coset of G(α ∪ η). In order to simplify the notations we denote the corresponding sets in the spins by the same notations.
We want to study the Gibbs measure conditioned to G(η) ∩ L(α). This can be done using the local character of these events and the additivity of the Hamiltonian in the contours configurations. Hence in order to have a 0 conditioning on η and a 1 conditioning on α it is sufficient to restrict the space of contours to those which lie outside α ∪ η. One has just to guarantee that these contours are compatible with α, in the sense that they satisfy the parity conditions. What follows is just a formal exposition of the above ideas. We consider the subset of $\Omega_{C \setminus (\alpha \cup \eta)}$ defined as
\[ \Gamma(\alpha \cup \eta) = \{ \rho \in \Omega_{C \setminus (\alpha \cup \eta)} \,|\, \rho \cup \alpha \in \Gamma \} . \]
Obviously there is a one to one map between Γ(α ∪ η) and L(α) ∩ G(η) given by γ = ρ ∪ α and one has
\[ |\gamma| = |\rho| + |\alpha| . \]
If one chooses α ∈ Γ, then Γ(α ∪ η) is a subgroup of $\Omega_{C \setminus (\alpha \cup \eta)}$ with the usual product. Indeed, the empty subset of C\(α ∪ η) is the identity and considering the elements ρ1 , ρ2 ∈ Γ(α ∪ η) one has ρ1 · ρ2 ∈ Γ(α ∪ η) since (ρ1 · ρ2 ) ∪ α ∈ Γ. The group Γ(α ∪ η) can also be considered as the intersection of local groups, according to the equation
\[ \Gamma(\alpha \cup \eta) = \bigcap_{i \in I} K_i(\alpha \cup \eta) , \]
where the parity condition now depends on the fact that both the unit lines of α and η are not to be occupied. We remark that this new space of contours configurations corresponds to a space of spins configurations only if α is a contours configuration. The Gibbs measure conditioned to L(α) ∩ G(η) is simply given by the measure on Γ(α ∪ η) defined as
\[ \lambda_{\Gamma(\alpha \cup \eta)}(\rho) = \frac{e^{-2\beta J |\rho|}}{\sum_{\rho \in \Gamma(\alpha \cup \eta)} e^{-2\beta J |\rho|}} . \tag{15} \]
Now the correspondence between the spins and the contours which takes the place of (3) is
\[ \mu_{L(\alpha) \cap G(\eta)}(s) = \lambda_{\Gamma(\alpha \cup \eta)}(\rho_s) \tag{16} \]
where $\rho_s$ is such that $\gamma_s = \rho_s \cup \alpha$. Let us now consider the particular case of conditioning to L(α). Just to fix the ideas choose α to be a simple loop, say the boundary of a square. In this case we say that i and j are separated by α if they can be joined by a path which intersects α exactly once, and they are not separated if there is a path which does not intersect α. For any contours configuration α we say that the sites are separated by α if the number of intersections is odd and they are not separated if this number is even. We can prove the following.
Proposition 2.2. Given two sites i and j and a contours configuration α, one has
\[ \langle s_i s_j \rangle_{L(\alpha)} \le 0 \tag{17} \]
if the two sites are separated by α and
\[ \langle s_i s_j \rangle_{L(\alpha)} \ge 0 \tag{18} \]
if they are not separated.
Proof. Conditioning to L(α), the set si = sj corresponds to “i and j are separated by an odd number of contours”, since one separating contour is given by α. Denoting by $F_{ij}$ = “i and j are separated by an even number of contours”, with obvious notations one has
\[ \langle s_i s_j \rangle_{L(\alpha)} = \lambda_{\Gamma(\alpha)}(F_{ij}^c) - \lambda_{\Gamma(\alpha)}(F_{ij}) . \]
The complement event $F_{ij}^c$ is a coset of the group $F_{ij}$. Using (4) in the configurations space Γ(α) one has
\[ \lambda_{\Gamma(\alpha)}(F_{ij}) \ge \lambda_{\Gamma(\alpha)}(F_{ij}^c) \]
and the inequality $\langle s_i s_j \rangle_{L(\alpha)} \le 0$ follows. The other stated inequality can be proved using a similar argument.
This provides an example of negative dependence; we refer to [10] where related problems and recent results are discussed. A simple consequence is the following one. Given two lattice sites i and j we denote by R0 (i, j) the event “i and j belong to the same cluster”, where, as usual, a cluster is a maximal connected set of sites having the same spin. We also denote by R1 (i, j) the complement, i.e. “the sites belong to different clusters”. We also denote by $\langle s_i s_j \rangle_0$ and $\langle s_i s_j \rangle_1$ the conditional expectations with respect to R0 (i, j)
and R1 (i, j). One obviously has $\langle s_i s_j \rangle_0 = 1$ and it is natural to ask for the sign of $\langle s_i s_j \rangle_1$. We can prove the following
Proposition 2.3. The expectation of si sj conditioned to “i and j belong to different clusters” is non positive, i.e.
\[ \langle s_i s_j \rangle_1 \le 0 . \tag{19} \]
Proof. Fix a site i and a spin configuration s. This defines the set η of the lines which separate the points of the cluster to which i belongs in s and the set α of the lines which separate the points of the cluster from the outside. If s ∈ R1 (i, j) there exist α and η such that s ∈ L(α) ∩ G(η) and one can so define a partition of R1 (i, j). We now have
\[ \langle s_i s_j \rangle_1 = \sum_{\alpha, \eta} \langle s_i s_j \rangle_{L(\alpha) \cap G(\eta)} \, \mu(L(\alpha) \cap G(\eta)) \tag{20} \]
where the sum is over all the elements of the partition. We also have
\[ \langle s_i s_j \rangle_{L(\alpha) \cap G(\eta)} = \lambda_{\Gamma(\alpha \cup \eta)}(F_{ij}^c) - \lambda_{\Gamma(\alpha \cup \eta)}(F_{ij}) \]
and using the argument used in the above proposition we get
\[ \langle s_i s_j \rangle_{L(\alpha) \cap G(\eta)} \le 0 \]
and the inequality (19) follows.
We now consider the separation line between the phases in the two-dimensional Ising model and, as usual, we put + and − boundary conditions respectively on the upper half and on the lower half of a square box. Let us denote by a and b the unit lines that separate the + spins from the − ones on the boundary. We define the separation line ξ as the maximal connected component of the contours configuration that joins a and b. Given two lattice sites i and j and a spins configuration s we say that they are separated by ξ if i can be connected, say, to the upper half boundary without crossing ξ and j can be connected to the lower half boundary without crossing ξ. The event “the points are (not) separated” is the set of spins configurations such that there is a line which separates (does not separate) the points and it is denoted by L1 (L0 ).
Proposition 2.4. In the model with separation line, given two lattice points i and j, one has
\[ \langle s_i s_j \rangle_{L_1} \le 0 \tag{21} \]
and
\[ \langle s_i s_j \rangle_{L_0} \ge 0 . \tag{22} \]
Proof. Let us denote by L(ξ) the event in the spins configurations that the line is ξ. Conditioning to this event is equivalent to considering the reduced contours
configurations Γ(ξ). If i and j are separated by ξ, the event si = sj corresponds to “i and j are separated by an odd number of contours” in Γ(ξ). Indeed any path that joins i and j crosses ξ an odd number of times. With the same notations used before one has
\[ \langle s_i s_j \rangle_{L(\xi)} = \lambda_{\Gamma(\xi)}(F_{ij}^c) - \lambda_{\Gamma(\xi)}(F_{ij}) \]
and so for any ξ this gives $\langle s_i s_j \rangle_{L(\xi)} \le 0$. Similarly one has $\langle s_i s_j \rangle_{L(\xi)} \ge 0$ if the sites are not separated by ξ. Using the same arguments of the above proposition the result follows.
3. Proof of the Inequalities
3.1. An expansion for the ferromagnetic Ising measure
We provide an expansion of the measure λΓ based on the factorization of the Gibbs weight of the contours configurations and on the ferromagnetic character of the interaction. We define $x = e^{-2\beta J} / (1 + e^{-2\beta J})$ and write the measure λΓ as
\[ \lambda_\Gamma(\gamma) = \frac{x^{|\gamma|} (1 - x)^{|\gamma^c|}}{\sum_{\gamma \in \Gamma} x^{|\gamma|} (1 - x)^{|\gamma^c|}} \tag{23} \]
where $\gamma^c$ denotes the complement of γ in C. This equation states that the Gibbs measure can be considered as a product one on the space ΩC conditioned to the subset Γ of contours configurations. We now define the probability measure on ΩC
\[ \nu_\Gamma(\omega) = \frac{(1 - 2x)^{|\omega|} x^{|C \setminus \omega|} |\Gamma_0^\omega|}{\sum_{\omega \subset C} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} |\Gamma_0^\omega|} \tag{24} \]
where we have used the notation
\[ E_0^\omega = \{ \gamma \,|\, \gamma \in E, \ \gamma \cap \omega = \emptyset \} . \]
Using this measure we can state the following representation.
Proposition 3.1. The ferromagnetic Ising measure λΓ can be represented in terms of the measure νΓ on ΩC as
\[ \lambda_\Gamma(E) = \sum_{\omega \subset C} \nu_\Gamma(\omega) \frac{|E_0^\omega|}{|\Gamma_0^\omega|} . \tag{25} \]
Proof. From the ferromagnetic hypothesis, J ≥ 0, one has x ≤ 1/2 and so the following expansion
\[ (1 - x)^{|\gamma^c|} = \sum_{\omega \subset \gamma^c} (1 - 2x)^{|\omega|} x^{|\gamma^c \setminus \omega|} \]
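has only non negative summands. We also get
\[ x^{|\gamma|} (1 - x)^{|\gamma^c|} = \sum_{\omega \subset \gamma^c} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} . \]
Let us compute the λΓ probability of an event E:
\[ \lambda_\Gamma(E) = \frac{\sum_{\gamma \in E} x^{|\gamma|} (1 - x)^{|\gamma^c|}}{\sum_{\gamma \in \Gamma} x^{|\gamma|} (1 - x)^{|\gamma^c|}} . \]
Using the above expansion the numerator is given by
\[ \sum_{\gamma \in E} \sum_{\omega \subset \gamma^c} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} = \sum_{\omega \subset C} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} |E_0^\omega| \tag{26} \]
where we have used
\[ \sum_{\gamma \in E, \ \gamma \subset \omega^c} 1 = |E_0^\omega| \]
and the denominator by
\[ \sum_{\gamma \in \Gamma} \sum_{\omega \subset \gamma^c} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} = \sum_{\omega \subset C} (1 - 2x)^{|\omega|} x^{|C \setminus \omega|} |\Gamma_0^\omega| . \tag{27} \]
Using the definition of νΓ , this completes the proof. We also write λΓ (E) = νΓ (fE ) where the right hand side denotes the average with respect to νΓ of the function fE on ΩC defined by
\[ f_E(\omega) = \frac{|E_0^\omega|}{|\Gamma_0^\omega|} . \]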
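The representation (25) can be verified numerically on a toy example. The sketch below makes several assumptions that are not in the paper: subsets of C are encoded as bitmasks over n hypothetical lines, the product of contours is XOR, and `Gamma` is an arbitrary subgroup chosen for illustration (not the contour group of an actual lattice); the helper names are likewise hypothetical.

```python
def popcount(m):
    """Number of lines in the subset encoded by the bitmask m."""
    return bin(m).count("1")


def span(generators):
    """Subgroup generated by the given bitmasks, with XOR as the group product."""
    group = {0}
    for g in generators:
        group |= {h ^ g for h in group}
    return group


def avoiding(E, omega):
    """E_0^omega: the elements of E that do not intersect omega."""
    return [g for g in E if g & omega == 0]


def lam_direct(E, Gamma, x, n):
    """lambda_Gamma(E) computed from the product weights of (23)."""
    def weight(g):
        return x ** popcount(g) * (1 - x) ** (n - popcount(g))
    return sum(weight(g) for g in E) / sum(weight(g) for g in Gamma)


def lam_via_nu(E, Gamma, x, n):
    """lambda_Gamma(E) computed as the nu_Gamma average of f_E, as in (25)."""
    num = den = 0.0
    for omega in range(2 ** n):
        w = (1 - 2 * x) ** popcount(omega) * x ** (n - popcount(omega))
        num += w * len(avoiding(E, omega))      # numerator (26)
        den += w * len(avoiding(Gamma, omega))  # denominator (27)
    return num / den


n = 4
Gamma = span([0b0011, 0b0110, 0b1100])  # hypothetical group of contours
E = span([0b0011, 0b1100])              # hypothetical event, a subgroup of Gamma
x = 0.3                                 # any value in (0, 1/2]
direct = lam_direct(E, Gamma, x, n)
via_nu = lam_via_nu(E, Gamma, x, n)
```

The two numbers agree up to rounding, which is just the statement that the sums (26) and (27) reproduce the numerator and the denominator of λΓ(E).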
We notice that while the contours configurations measure λΓ is a measure conditioned to the subset Γ, the measure νΓ is defined on the whole space ΩC .
3.2. The group structure of the contours
We now recall some of the properties of the group ΩC (for a general reference see for instance [2]). We shall use that the group is commutative and that ω −1 = ω. If G is a subgroup the binary relation ∼ defined by ω1 ∼ ω2 if and only if ω1 · ω2 ∈ G is an equivalence relation. The elements of the partition of ΩC so defined are the cosets of G. The subgroup itself is an element of the partition. Any coset L different from G is thus disjoint from G and is given by L = σ · G = {α ∈ ΩC |α = σ · ω, ω ∈ G} for any σ ∈ L. We shall use that G and L have the same cardinality: |G| = |L|. Given two cosets H, L the set H · L = {α ∈ ΩC |α = σ · ω, σ ∈ H, ω ∈ L}
is a coset. The set of the cosets of a group G is itself a group with respect to the product above defined, the identity being G. If F is a subgroup of G the quotient G/F is a group whose elements are the cosets of F . Hence its cardinality is given by the equation
\[ |G/F| = \frac{|G|}{|F|} . \tag{28} \]
We shall use the following result
Proposition 3.2. For any two subgroups E, F of the group G one has
\[ |E \cdot F| \, |E \cap F| = |E| \, |F| . \tag{29} \]
Proof. We notice that both E · F and E ∩ F are subgroups of G. We first consider the case E ∩ F = {∅}, i.e. the two groups have in common only the identity. In this case from the definitions it follows easily that
\[ |E \cdot F| = |E| \, |F| . \]
In the general case we consider the quotient with respect to E ∩ F of the groups E, F , E · F . These quotients, that we denote by E/(E ∩ F ), F/(E ∩ F ), (E · F )/(E ∩ F ), are groups, the identity being E ∩ F . Since the first two have in common only the identity, the above equation gives
\[ |(E \cdot F)/(E \cap F)| = |E/(E \cap F)| \, |F/(E \cap F)| . \]
From (28) it follows that
\[ \frac{|E \cdot F|}{|E \cap F|} = \frac{|E|}{|E \cap F|} \, \frac{|F|}{|E \cap F|} \]
and the proof is completed.
If σ ⊂ C we denote
\[ E_\alpha^\sigma = \{ \omega \in \Omega_C \,|\, \omega \in E, \ \omega \cap \sigma = \alpha \} . \tag{30} \]
In the sequel we shall use the following property, whose proof is a direct consequence of the definitions: $G_0^\sigma$ is a group and $G_\alpha^\sigma$ are its cosets. As a consequence one has
\[ |G_\alpha^\sigma| = |G_0^\sigma| . \tag{31} \]
Furthermore for any α and β
\[ G_\alpha^\sigma \cdot G_\beta^\sigma = G_{\alpha \cdot \beta}^\sigma . \tag{32} \]
Proof of the inequality (4). We use the representation (25) for λG (E) and λG (T ) and the obvious fact that if $T_0^\omega$ is non empty it is a coset of $E_0^\omega$. Hence $|E_0^\omega| \ge |T_0^\omega|$ and one easily gets the result.
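The counting identity (29) of Proposition 3.2 is easy to test on a small case. The following sketch assumes, for illustration only, that subsets of C are encoded as bitmasks with XOR as the product; the groups `G`, `E`, `F` and the helper names are hypothetical.

```python
def span(generators):
    """Subgroup generated by the given bitmasks, with XOR as the group product."""
    group = {0}  # the empty configuration is the identity
    for g in generators:
        group |= {h ^ g for h in group}
    return group


def product_set(E, F):
    """E . F = {e . f : e in E, f in F}."""
    return {e ^ f for e in E for f in F}


G = span([0b00011, 0b00110, 0b01100, 0b11000])  # hypothetical ambient group
E = span([0b00011, 0b00110])                    # hypothetical subgroup of G
F = span([0b00110, 0b01100])                    # hypothetical subgroup of G

EF = product_set(E, F)
E_cap_F = E & F

lhs = len(EF) * len(E_cap_F)  # |E . F| |E ∩ F|
rhs = len(E) * len(F)         # |E| |F|
```

Since E · F is contained in G, the same numbers also give |G| |E ∩ F| ≥ |E| |F|, which is the form of the statement used later in the proof of the correlation inequality.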
3.3. The FKG structure
The set ΩC has a natural order structure based on the partial order
\[ \omega_1 \le \omega_2 \quad \text{if} \quad \omega_1 \subset \omega_2 . \]
A function f on ΩC is called “increasing” if
\[ \omega_1 \le \omega_2 \ \Rightarrow \ f(\omega_1) \le f(\omega_2) . \]
A similar definition is given for decreasing functions. A probability measure µ on ΩC is said to be “positively associated” or to have the FKG property if for any two increasing (or decreasing) functions f , g the following inequality holds for the expectations with respect to µ
\[ \mu(f g) \ge \mu(f) \mu(g) . \tag{33} \]
An event is called “increasing” if its indicator function is such. Hence if two events A, B are both increasing, it follows
\[ \mu(A \cap B) \ge \mu(A) \mu(B) . \tag{34} \]
A sufficient condition for positive association is [4]
\[ \mu(\omega_1 \cup \omega_2) \mu(\omega_1 \cap \omega_2) \ge \mu(\omega_1) \mu(\omega_2) . \tag{35} \]
Using the spins language version of this condition one can get, as is well known, that the Ising ferromagnetic measure is associated. We are looking for a similar property in the contours language. We will first show that the probability measure νΓ has the FKG property. We then show that if E is a group then the function fE is monotonic and finally we use the representation (25). This will be sufficient to deduce the correlation inequality.
Proposition 3.3. The probability measure νΓ is FKG.
Proof. We shall prove the more general statement for the measure νG where G is any subgroup of ΩC . We shall check the sufficient condition
\[ \nu_G(\omega_1 \cup \omega_2) \nu_G(\omega_1 \cap \omega_2) \ge \nu_G(\omega_1) \nu_G(\omega_2) \]
which is equivalent to
\[ |G_0^{\omega_1 \cup \omega_2}| \, |G_0^{\omega_1 \cap \omega_2}| \ge |G_0^{\omega_1}| \, |G_0^{\omega_2}| . \tag{36} \]
We first consider the case ω1 ∩ ω2 = ∅ in which $G_0^{\omega_1 \cap \omega_2} = G$. Hence we have to prove
\[ |G_0^{\omega_1 \cup \omega_2}| \, |G| \ge |G_0^{\omega_1}| \, |G_0^{\omega_2}| . \tag{37} \]
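We have
\[ |G_0^{\omega_1}| = \sum_{\alpha_2 \subset \omega_2} |G_{\alpha_2}^{\omega_1 \cup \omega_2}| ; \qquad |G_0^{\omega_2}| = \sum_{\alpha_1 \subset \omega_1} |G_{\alpha_1}^{\omega_1 \cup \omega_2}| ; \qquad |G| = \sum_{\alpha_1 \subset \omega_1} \sum_{\alpha_2 \subset \omega_2} |G_{\alpha_1 \cup \alpha_2}^{\omega_1 \cup \omega_2}| . \]
The sets which appear in the sums, if non empty, are cosets of the group $G_0^{\omega_1 \cup \omega_2}$. From (32) if α1 ⊂ ω1 , α2 ⊂ ω2 , it easily follows that
\[ G_{\alpha_1}^{\omega_1 \cup \omega_2} \ne \emptyset , \ G_{\alpha_2}^{\omega_1 \cup \omega_2} \ne \emptyset \ \Rightarrow \ G_{\alpha_1 \cdot \alpha_2}^{\omega_1 \cup \omega_2} = G_{\alpha_1}^{\omega_1 \cup \omega_2} \cdot G_{\alpha_2}^{\omega_1 \cup \omega_2} \ne \emptyset \]
and obviously one has α1 · α2 = α1 ∪ α2 . Since all the sets that appear in the sums have the same cardinality (if non empty) one gets
\[ |G_{\alpha_1}^{\omega_1 \cup \omega_2}| \, |G_{\alpha_2}^{\omega_1 \cup \omega_2}| \le |G_0^{\omega_1 \cup \omega_2}| \, |G_{\alpha_1 \cup \alpha_2}^{\omega_1 \cup \omega_2}| . \]
Using this inequality we easily get (37). We now consider the case ω1 ∩ ω2 = τ ≠ ∅. We put τ1 = ω1 \τ , τ2 = ω2 \τ and since τ1 ∩ τ2 = ∅ we apply the above argument to the group $G_0^\tau$ in place of G, and this completes the proof.
Proposition 3.4. If E is a subgroup of G, the function defined on ΩC by
\[ f_E(\omega) = \frac{|E_0^\omega|}{|G_0^\omega|} \]
is increasing.
Proof. We have to prove that for each ω and i ∈ C\ω one has
\[ f_E(\omega) \le f_E(\omega \cup \{i\}) . \tag{38} \]
We use the notation
\[ E_{01}^{\omega i} = \{ \rho \in E \,|\, \rho \cap \omega = \emptyset, \ \rho \cap \{i\} = \{i\} \} \]
and a similar one for $E_{00}^{\omega i}$ and for G. Hence
\[ f_E(\omega \cup \{i\}) = \frac{|E_{00}^{\omega i}|}{|G_{00}^{\omega i}|} \]
and the inequality (38) is equivalent to
\[ |G_0^\omega| \, |E_{00}^{\omega i}| \ge |E_0^\omega| \, |G_{00}^{\omega i}| . \]
We have
\[ |E_0^\omega| = |E_{00}^{\omega i}| + |E_{01}^{\omega i}| ; \qquad |G_0^\omega| = |G_{00}^{\omega i}| + |G_{01}^{\omega i}| \tag{39} \]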
so the above inequality is equivalent to
\[ |G_{01}^{\omega i}| \, |E_{00}^{\omega i}| \ge |E_{01}^{\omega i}| \, |G_{00}^{\omega i}| . \tag{40} \]
If $E_{01}^{\omega i} = \emptyset$ this inequality is trivially true. Suppose that this set is non empty. It is a coset of the group $E_{00}^{\omega i}$ and so it has the same cardinality; in addition it follows that also the coset $G_{01}^{\omega i}$ is non empty and it has the same cardinality as $G_{00}^{\omega i}$. In this case (40) holds as an equality. We notice that if $G_{01}^{\omega i}$ were empty, also $E_{01}^{\omega i}$ would be such, since by hypothesis E ⊂ G.
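Both ingredients of the FKG argument, the lattice condition (36) and the monotonicity of fE stated in Proposition 3.4, can be checked exhaustively on a small example. As before, the encoding (bitmasks with XOR as the product) and the particular groups `G` and `E` are assumptions made only for this sketch.

```python
from fractions import Fraction


def span(generators):
    """Subgroup generated by the given bitmasks, with XOR as the group product."""
    group = {0}
    for g in generators:
        group |= {h ^ g for h in group}
    return group


def avoiding(H, omega):
    """H_0^omega: the elements of H that do not intersect omega."""
    return {g for g in H if g & omega == 0}


n = 4
G = span([0b0011, 0b0110, 0b1100])  # hypothetical group (even subsets of 4 lines)
E = span([0b0011])                  # hypothetical subgroup of G

size_G0 = [len(avoiding(G, w)) for w in range(2 ** n)]
size_E0 = [len(avoiding(E, w)) for w in range(2 ** n)]

# Condition (36): |G_0^(w1 ∪ w2)| |G_0^(w1 ∩ w2)| >= |G_0^w1| |G_0^w2|.
lattice_condition_holds = all(
    size_G0[w1 | w2] * size_G0[w1 & w2] >= size_G0[w1] * size_G0[w2]
    for w1 in range(2 ** n)
    for w2 in range(2 ** n)
)

# Inequality (38): adding one line i to omega cannot decrease f_E.
f_E = [Fraction(size_E0[w], size_G0[w]) for w in range(2 ** n)]
f_E_is_increasing = all(
    f_E[w] <= f_E[w | (1 << i)]
    for w in range(2 ** n)
    for i in range(n)
)
```

Exact rational arithmetic is used for fE so that the comparison in (38) is not affected by rounding.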
Proof of the correlation inequality (5). We prove the more general statement for any group G. From the representation (25) one gets
\[ \lambda_G(E \cap F) = \sum_{\omega \subset C} \nu_G(\omega) \frac{|(E \cap F)_0^\omega|}{|G_0^\omega|} . \]
We shall use for each ω ⊂ C the following inequality
\[ \frac{|(E \cap F)_0^\omega|}{|G_0^\omega|} \ge \frac{|E_0^\omega|}{|G_0^\omega|} \, \frac{|F_0^\omega|}{|G_0^\omega|} \tag{41} \]
and since the two functions at the right hand side are both increasing, the FKG theorem proves the statement. From (29), since E · F ⊂ G, one has
\[ |G| \, |E \cap F| \ge |E| \, |F| . \tag{42} \]
We now use that $(E \cap F)_0^\omega = E_0^\omega \cap F_0^\omega$ and the fact that $E_0^\omega$, $F_0^\omega$ are subgroups of $G_0^\omega$. The above inequality then gives (41).
Acknowledgment
We thank G. Gallavotti and S. Miracle Solé for stimulating conversations and G. Gallavotti for suggesting the inequality (13).
References
[1] B. Baumgartner, On the group structure, GKS and FKG inequalities for Ising models, J. Math. Phys. 24 (1983) 2197–2199.
[2] N. L. Biggs, Discrete Mathematics, Oxford Univ. Press, 1993.
[3] C. Cammarota and L. Russo, Bernoulli and Gibbs Probabilities of Subgroups of {0, 1}^S, Forum Math. 3 (1991) 401–414.
[4] C. Fortuin, P. W. Kasteleyn and J. Ginibre, Correlation inequalities on some partially ordered sets, Comm. Math. Phys. 22 (1971) 89–103.
[5] J. Ginibre, General Formulation of Griffiths’ inequalities, Comm. Math. Phys. 16 (1970) 310–328.
[6] R. Griffiths, Correlations in ferromagnets I and II, J. Math. Phys. 8 (1967) 478–489.
[7] C. Gruber, A. Hintermann and D. Merlini, Group Analysis of Lattice Systems, Lecture Notes in Physics 60, Springer-Verlag, 1977.
[8] D. G. Kelly and S. Sherman, General Griffiths’ inequalities on correlations in Ising ferromagnets, J. Math. Phys. 9 (1968) 466–484.
[9] J. L. Lebowitz, GHS and other Inequalities, Comm. Math. Phys. 35 (1974) 87–92.
[10] R. Pemantle, Towards a theory of negative dependence, J. Math. Phys. 41 (2000) 1371–1390.
November 7, 2002 15:28 WSPC/148-RMP
00151
Reviews in Mathematical Physics, Vol. 14, No. 10 (2002) 1115–1163 c World Scientific Publishing Company
TRI HAMILTONIAN VECTOR FIELDS, SPECTRAL CURVES AND SEPARATION COORDINATES
L. DEGIOVANNI and G. MAGNANO
Dipartimento di Matematica, Università di Torino, Via Carlo Alberto 10, I–10131 Torino, Italy
Received 15 March 2001
Revised 4 April 2002
We show that for a class of dynamical systems, Hamiltonian with respect to three distinct Poisson brackets (P0 , P1 , P2 ), separation coordinates are provided by the common roots of a set of bivariate polynomials. These polynomials, which generalise those considered by E. Sklyanin in his algebro-geometric approach, are obtained from the knowledge of: (i) a common Casimir function for the two Poisson pencils (P1 − λP0 ) and (P2 − µP0 ); (ii) a suitable set of vector fields, preserving P0 but transversal to its symplectic leaves. The framework is applied to Lax equations with spectral parameter, for which it not only establishes a theoretical link between the separation techniques of Sklyanin and of Magri, but also provides a more efficient “inverse” procedure to obtain separation variables, not involving the extraction of roots.
Keywords: Bihamiltonian systems; Poisson brackets; separability; Lax equations.
1. Introduction
The relationship between the Liouville integrability of a Hamiltonian system and the existence of a second conserved Poisson bracket (or “hamiltonian structure”) in its phase space, first discovered by Magri [1], has been thoroughly investigated in the past years. Bihamiltonian structures underlying all classical examples of integrable systems (both finite and infinite-dimensional) have been described by several authors, and almost all the relevant properties connected to integrability have been reinterpreted in terms of the geometry of bihamiltonian manifolds and vector fields. Recently, the classical problem of characterizing separable hamiltonians (i.e. those for which the Hamilton–Jacobi equation can be solved by separation of variables in a suitable system of canonical coordinates) has been also translated in the language of bihamiltonian geometry [2, 3]. A question which has not yet received a complete answer concerns the link between the bihamiltonian framework and the algebro-geometric methods of solution based on the isospectrality property of Lax equations [4, 5]. Although it is possible
to introduce bihamiltonian structures which naturally lead to Lax equations with a spectral parameter [6, 7], the role of the characteristic equation for the Lax operator (the “spectral curve” of the algebro-geometric approach) has not been clarified so far in the bihamiltonian perspective. The present work adds new elements in view of a connection between multihamiltonian structures, existence of separation coordinates and spectral curves, starting from an apparently marginal observation: some well-known integrable systems allow two distinct bihamiltonian descriptions, independently described by different authors and apparently unrelated (in spite of having one Poisson bracket in common). In this introductory section, we shall recall some relevant facts using the simplest example of such “trihamiltonian systems”, namely the generalized Euler–Poinsot rigid body. To motivate the reader to follow us through an exercise which could seem of little practical interest, let us anticipate that the interplay of the three Poisson structures leads to a new role played by the characteristic determinant of the Lax matrix, and this fact may eventually clarify the connection between Sklyanin’s algebro-geometrical construction of separation variables [8] and the bihamiltonian method recently proposed in [2, 3]. Indeed, the occurrence of more than two Poisson brackets on the same manifold is not new nor surprising by itself, and in some cases it is even a structural property, as for the so–called “Lie–Poisson pencils” described in [6]; in the sequel, we discuss the difference between such known cases of multihamiltonian structures and the trihamiltonian structure that we are presently considering. The simplest (nontrivial) example of Lax equation with spectral parameter is provided by the dynamics of a rigid body about a fixed point, in the absence of external forces (Poinsot rigid body). 
In the body reference frame, the motion is described by the Euler–Poisson–Lax equation
\[ \frac{dM}{dt} = [M, \Omega] , \tag{1.1} \]
where M and Ω are the skew-symmetric 3 × 3 matrices representing the angular momentum and the angular velocity, respectively, in the body reference frame. A straightforward consequence of (1.1) is that the trace of any power of the matrix M is a constant of motion: $\frac{d}{dt} \mathrm{Tr}(M^k) = 0$. Generalizing the system to M ∈ so(r), one obtains in this way at most $\frac{r}{2}$ independent constants of motion if r is even, or $\frac{r-1}{2}$ for r odd, which for r > 3 would not be enough to meet Liouville’s integrability condition. Assuming (I1 , I2 , I3 ) to be the eigenvalues of the inertia tensor, one can introduce the diagonal matrix J with diagonal elements $\frac{-I_1 + I_2 + I_3}{2}$, $\frac{I_1 - I_2 + I_3}{2}$, $\frac{I_1 + I_2 - I_3}{2}$; the linear relation between M and Ω can then be written in the following form:
\[ M = J\Omega + \Omega J ; \tag{1.2} \]
Manakov [9] has observed that the Euler–Poisson equation (1.1) and the inertia map (1.2) can be put together into a single Lax equation for a new Lax pair dependent on a formal parameter λ,
\[ \frac{d(M + \lambda J^2)}{dt} = [M + \lambda J^2, \Omega + \lambda J] . \tag{1.3} \]
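Manakov's observation amounts to the statement that, once the inertia map (1.2) holds, the λ and λ² coefficients of the commutator in (1.3) vanish, so that (1.3) reduces to (1.1). A minimal exact check with integer 3 × 3 matrices is sketched below; the numerical values of `J` and `Omega` and the helper names are hypothetical, chosen only for illustration. The same sketch also checks that the time derivatives of Tr(M^k) along (1.1) vanish.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
J = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]         # hypothetical diagonal matrix J
Omega = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]  # hypothetical skew-symmetric Omega
M = add(matmul(J, Omega), matmul(Omega, J))   # inertia map (1.2)
J2 = matmul(J, J)

# Coefficients of lambda^1 and lambda^2 in [M + lambda J^2, Omega + lambda J].
order1 = add(commutator(M, J), commutator(J2, Omega))
order2 = commutator(J2, J)

# d/dt Tr(M^k) = k Tr(M^(k-1) [M, Omega]) along the flow (1.1).
dM = commutator(M, Omega)
power = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]     # M^0
trace_derivatives = []
for k in range(1, 5):
    trace_derivatives.append(k * trace(matmul(power, dM)))
    power = matmul(power, M)
```

Integer entries keep every comparison exact, so the vanishing of `order1`, `order2` and of the trace derivatives is not a rounding artifact.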
The insertion of the parameter λ into the Lax equation leads to a wider number of constants of motion, the Manakov integrals. We denote them by $f_i^k$, according to the following convention:
\[ f_\lambda^{(k)} = \frac{1}{k} \mathrm{Tr}(J^{2k}) \lambda^k + \sum_{i=1}^{k} f_i^k \lambda^{k-i} . \tag{1.4} \]
For M ∈ so(r), the functions $f_i^k(M)$ vanish identically for i odd; the odd Manakov functions are however relevant for the “generalized Euler–Poinsot system”, with M ∈ gl(r), that we shall consider in the sequel. As is well known, Eq. (1.1) is Hamiltonian with respect to the Lie–Poisson bracket, defined on so(r) through the ad-invariant scalar product
\[ (A, B) = \mathrm{Tr}(A \cdot B) . \tag{1.5} \]
More precisely, given any function f : so(r) → R, one defines its gradient at a point M to be the matrix ∇f ∈ so(r) such that $\dot f = \langle df, \dot M \rangle = (\dot M, \nabla f)$; then, for any pair of functions,
\[ \{f, g\}_{LP} = (M, [\nabla f, \nabla g]) \tag{1.6} \]
is a Poisson bracket [10]. The Lie–Poisson bracket (1.6) is degenerate: an ad-invariant function f (M ) is in involution with any other function, i.e. is a Casimir function for the bracket (1.6). The Casimir functions, which include the traces of the powers of M , are automatically constants of motion, but they are irrelevant as far as the Liouville integrability of the system (with the Lie–Poisson bracket) is concerned. Therefore, the integrability of the system actually relies on the existence of the other Manakov first integrals. In 1996, Morosi and Pizzocchero [11] introduced a second Poisson bracket on so(r), defined as follows: let A ∈ gl(r) be a fixed matrix (for the Euler–Poinsot case, A ≡ J² ; notice that A need not belong to so(r)). With the same definition of scalar product and gradient as above, one sets
\[ \{f, g\}_{MP} = (M, \nabla f \cdot A \cdot \nabla g - \nabla g \cdot A \cdot \nabla f) . \tag{1.7} \]
One can check that the vector fields generated by the Manakov functions through the Poisson structure (1.7) are exactly the same as those generated through the Lie–Poisson structure (1.6), up to a rearrangement in the correspondence between
hamiltonians and vector fields. For instance, the physical hamiltonian generating the Euler–Poinsot dynamics through the Lie–Poisson bracket is $h_1 = \frac{1}{2} \mathrm{Tr}(\Omega M) = \mathrm{Tr}(\Omega^2 J)$, while the hamiltonian of the same vector field, relative to the Morosi–Pizzocchero bracket, is $h_2 = -\frac{1}{2} \mathrm{Tr}(\Omega J^{-1} M J^{-1})$. To simplify the notation, let us denote by P1 and P2 the Poisson tensors associated respectively to the brackets (1.6) and (1.7):
\[ \{f, g\}_{LP} = \langle dg, P_1 df \rangle , \qquad \{f, g\}_{MP} = \langle dg, P_2 df \rangle . \tag{1.8} \]
Denoting by X1 the vector field over so(r) corresponding to Eq. (1.1), the relation P1 dh1 = P2 dh2 is depicted by the diagram
\[ h_1 \xrightarrow{\ P_1\ } X_1 \xleftarrow{\ P_2\ } h_2 \tag{1.9} \]
where $h \xrightarrow{P} X$ is an abbreviation for $dh \mapsto X$, a convention that we shall use in analogous diagrams throughout this article. The diagram (1.9) is nothing but the elementary block of the Lenard–Magri recursion generating a whole family of quadratic first integrals hi (known as Miščenko functions), and the corresponding symmetry vector fields Xi :
/
4
5
?
:
:
@
:
E
"
#
E
"
E
$
!
&
'
(
*
%
&
A
+
,
1
2
7
8
<
B
C
(1.10) D
=
The Manakov first integrals can be generated by the same recursion procedure. Setting A ≡ J² , one has $\nabla f_\lambda^{(k)} = (M + \lambda A)^{k-1}$, then
\[ P_1 df_i^k = [M, \nabla f_i^k] = M \nabla f_i^{k-1} A - A \nabla f_i^{k-1} M = P_2 df_i^{k-1} \tag{1.11} \]
which correspond to Lenard–Magri diagrams starting with the P1-Casimir functions $f_k^k$:
[Diagram: Lenard–Magri chains for the pair (P1 , P2 ) starting from the P1-Casimir functions; the diagram is not recoverable from the extracted text.]
Notice that all the functions iteratively generated by Lenard–Magri recursion relations are automatically in involution with respect to both Poisson tensors P1 and P2 . The (elementary) proof of this fact will be recalled in the next section. Thanks to this property of bihamiltonian vector fields, one does not need to prove separately the involutivity of the first integrals of Manakov, and the complete integrability of the generalized Euler–Poinsot system is simply assessed by computing how many independent first integrals can be found in this way.
All the statements above remain valid if one extends the equation (1.1) to M ∈ gl(r). Both the Lie–Poisson bracket and the Morosi–Pizzocchero bracket can be introduced in gl(r) using the same definitions (1.6) and (1.7). The Morosi–Pizzocchero bracket is defined in terms of the matrix product (not of the commutator) and therefore is defined even more naturally on gl(r): it reduces on so(r) by orthogonal projection with respect to the scalar product (1.5), provided the matrix A is symmetric. Thus, for the Lax matrix L(λ) = Aλ + M the complete family of Manakov constants of motion can be found by the recursion procedure, which also ensures their mutual involutivity. Whenever A is symmetric (and positive), the dynamical system defined by (1.1) in gl(r) is a proper extension of the original Euler–Poinsot system. The flows of the original model are those associated to the even Manakov functions (these flows are tangent to so(r)), while the other flows of the enlarged system, generated by the odd Manakov functions, are orthogonal to so(r). In the larger phase space gl(r), however, one can obtain the full set of first integrals by another Lenard–Magri recursion, relative to a different bihamiltonian pair. The new Poisson bracket depends, as for (1.7), on the choice of the matrix A:
\[ \{f, g\}_A = (A, [\nabla g, \nabla f]) . \tag{1.12} \]
From now on, let us denote by P0 the Poisson tensor associated with (1.12). The Manakov functions are in bihamiltonian recursion also with respect to the pair (P0 , P1 ), but the sequences are arranged in a different way:
\[ P_1 df_i^k = [M, \nabla f_i^k] = [\nabla f_{i+1}^k, A] = P_0 df_{i+1}^k . \tag{1.13} \]
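The middle equality in (1.13) is a purely algebraic statement about the coefficients of (M + λA)^(k−1), and can be checked exactly on integer matrices. The sketch below assumes, consistently with the convention ∇f_λ^(k) = (M + λA)^(k−1), that ∇f_i^k is the coefficient of λ^(k−i); the matrices `A` and `M`, the value of `k` and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))


def zeros(n):
    return [[0] * n for _ in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def poly_mul(p, q):
    """Product of matrix polynomials stored as coefficient lists (index = power of lambda)."""
    n = len(p[0])
    out = [zeros(n) for _ in range(len(p) + len(q) - 1)]
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = add(out[i + j], matmul(a, b))
    return out


A = [[1, 0, 0], [0, 2, 0], [0, 0, 4]]   # hypothetical fixed matrix A
M = [[1, 2, 0], [3, -1, 1], [0, 5, 2]]  # hypothetical point of gl(3)
k = 4

lax = [M, A]                            # L(lambda) = M + lambda A
grad = [identity(3)]
for _ in range(k - 1):
    grad = poly_mul(grad, lax)          # coefficients of (M + lambda A)^(k-1)

# grad_f[i] plays the role of the gradient of f_i^k: coefficient of lambda^(k-i).
grad_f = {i: grad[k - i] for i in range(1, k + 1)}

recursion_holds = all(
    commutator(M, grad_f[i]) == commutator(grad_f[i + 1], A)
    for i in range(1, k)
)
ends_in_P1_casimir = commutator(M, grad_f[k]) == zeros(3)    # grad_f[k] = M^(k-1)
starts_in_P0_casimir = commutator(grad_f[1], A) == zeros(3)  # grad_f[1] = A^(k-1)
```

The two boundary checks correspond to the fact that each chain starts from a function whose gradient commutes with A and ends with one whose gradient commutes with M.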
Each integer power of L(λ) corresponds to a single finite Lenard–Magri sequence, starting from a Casimir function for P0 and ending with a Casimir function for P1 :
[Diagram (1.14): finite Lenard–Magri chains for the pair (P0 , P1 ), one for each power of L(λ); the diagram is not recoverable from the extracted text.]
A disadvantage of the new bihamiltonian structure (P0 , P1 ) is that it cannot be reduced (by restriction or by orthogonal projection) to so(r). On the other hand, (P0 , P1 ) leads naturally to the Lax equation with spectral parameter (1.3), which on the contrary is rather difficult to derive from the former pair (P1 , P2 ). To show this, we need to reexpress the Lenard–Magri recursion relations (1.14) in the language of Poisson pencils. Given a pair of Poisson tensors (P, Q) on a manifold M, assume that the λ-dependent bracket
\[ \{f, g\}_{P - \lambda Q} = \{f, g\}_P - \lambda \{f, g\}_Q = \langle df, (P - \lambda Q) dg \rangle ; \tag{1.15} \]
be a Poisson bracket, i.e. fulfill the Jacobi identity for any λ; in this case, P and Q are said to be compatible; (M, P, Q) becomes a bihamiltonian manifold (or PQ-manifold, following [12]), and one refers to (1.15) as its Poisson pencil. It is immediate to see that, given a sequence of functions $\{f_i\}_{i=0,\dots,N}$ such that
\[ \begin{aligned} 0 &= P\, df_0 \\ Q\, df_0 &= P\, df_1 \\ &\ \ \vdots \\ Q\, df_k &= P\, df_{k+1} \\ &\ \ \vdots \\ Q\, df_N &= 0 , \end{aligned} \tag{1.16} \]
then the polynomial in λ defined by
\[ f_\lambda = f_0 + f_1 \lambda + \cdots + f_N \lambda^N \tag{1.17} \]
is a Casimir function of the Poisson pencil, i.e. for any λ
\[ (P - \lambda Q)\, df_\lambda = 0 \tag{1.18} \]
(the differential of fλ is taken with respect to the coordinates on M, λ being regarded as a parameter). Conversely, given a λ-polynomial function fulfilling (1.18), its coefficients obey the Magri–Lenard recursion according to (1.16) and generate a sequence of commuting bihamiltonian vector fields. In the next section, we will recall the proof of the following relevant property, that we shall extensively use. Let gλ be a second Casimir function of the same Poisson pencil: then, not only are its coefficients gk in involution with each other, but they also Poisson-commute with all the coefficients fk of the other Casimir function fλ . Given a polynomial Casimir function fλ , each bihamiltonian vector field of the associated Lenard–Magri hierarchy Xk = P dfk = Qdfk−1 can be also represented by a Hamilton equation with spectral parameter. Having set $f_\lambda = \sum_{i=0}^{N} f_i \lambda^i$, consider for each positive integer k < N the polynomial
\[ f_\lambda^{(k)} \equiv f_0 \lambda^k + f_1 \lambda^{k-1} + \cdots + f_{k-1} \lambda + f_k ; \tag{1.19} \]
taking into account (1.16) it is easy to see that
\[ X_k = (P - \lambda Q)\, df_\lambda^{(k)} . \tag{1.20} \]
This formula holds true for formal power series (N = ∞); if the Casimir function fλ is instead expanded in Laurent series, $f_\lambda = \sum_{i=0}^{\infty} f_i \lambda^{-i}$, then the polynomial $f_\lambda^{(k)}$ is easily obtained upon multiplying by $\lambda^k$ and truncating to the nonnegative powers:
\[ f^{(k)}_\lambda = \left(\lambda^k f_\lambda\right)_+ . \tag{1.21} \]
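The truncation in (1.21) is purely a bookkeeping operation on the Laurent coefficients, and a minimal sketch may help to fix the index conventions. The function name `truncated_polynomial` and the string labels standing in for the coefficients $f_i$ are illustrative assumptions, not notation from the article.

```python
def truncated_polynomial(laurent_coeffs, k):
    """Coefficients of (lambda^k * f_lambda)_+ as a {power: coefficient} dict.

    laurent_coeffs[i] stands for the coefficient f_i of lambda^(-i) in the
    Laurent expansion f_lambda = sum_i f_i * lambda^(-i).
    """
    if k < 0 or k >= len(laurent_coeffs):
        raise ValueError("need 0 <= k < number of known coefficients")
    # lambda^k * f_i * lambda^(-i) carries the power k - i;
    # only the terms with k - i >= 0 survive the truncation.
    return {k - i: laurent_coeffs[i] for i in range(k + 1)}


# Hypothetical symbolic labels for the first few Laurent coefficients.
coeffs = ["f0", "f1", "f2", "f3", "f4"]
f2_poly = truncated_polynomial(coeffs, 2)
```

With these labels, `f2_poly` pairs the power 2 with `f0`, the power 1 with `f1` and the power 0 with `f2`, which is the polynomial (1.19) for $k = 2$.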
We are now ready to derive the Manakov equation (1.3) as a Hamilton equation with spectral parameter for the Poisson pencil $(P_1 - \lambda P_0)$ on $gl(r)$. In fact, it is easy to see that the trace of any power of the Lax matrix $A\lambda + M$ is a Casimir function of the Poisson pencil $(P_1 - \lambda P_0)$: by definition (1.6, 1.12),
\[ (P_1 - \lambda P_0)\,df = [A\lambda + M, \nabla f] : \tag{1.22} \]
for $f^{(k)}_\lambda = \frac{1}{k}\,\mathrm{Tr}(A\lambda + M)^k$ one has $\nabla f^{(k)}_\lambda = (A\lambda + M)^{k-1}$, which obviously commutes with $A\lambda + M$. The same happens for the Laurent series expansion of the trace of any half-integer power of $A + M\lambda^{-1}$. On account of (1.20), all the vector fields generated by the coefficients of these Casimir functions (which all mutually commute, by the property mentioned above) correspond to Lax equations with spectral parameter:
\[ X_k = (P_1 - \lambda P_0)\,df^{(k)}_\lambda = [A\lambda + M, \nabla f^{(k)}_\lambda] . \tag{1.23} \]
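The Casimir property behind (1.22) reduces to the statement that any power of the Lax matrix commutes with the Lax matrix itself. The following sketch checks this with exact rational arithmetic; the sample matrices `A`, `M` and the value of `lam` are arbitrary choices made for illustration, and the helper names are not taken from the article.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(X, p):
    """X raised to the nonnegative integer power p."""
    n = len(X)
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(p):
        R = matmul(R, X)
    return R


def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def is_zero(X):
    return all(entry == 0 for row in X for entry in row)


def lax_matrix(A, M, lam):
    """The Lax matrix A*lam + M."""
    n = len(A)
    return [[A[i][j] * lam + M[i][j] for j in range(n)] for i in range(n)]


# Arbitrary sample data: a diagonal A with distinct entries and a generic M.
A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
M = [[0, 1, -2], [3, 1, 4], [-1, 5, 2]]
lam = Fraction(7, 3)
L = lax_matrix(A, M, lam)

# Gradient of (1/k) Tr L^k is L^(k-1); its commutator with L must vanish.
trace_powers_are_casimirs = all(
    is_zero(commutator(L, matpow(L, k - 1))) for k in range(1, 6)
)

# For contrast, a generic gradient (here M itself) gives a nonzero bracket.
generic_gradient_is_not = not is_zero(commutator(L, M))
```

Both flags evaluate to true for this sample: the traces of powers are annihilated by the pencil, while a generic function is not.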
In particular, the Manakov equation (1.3) corresponds to the first vector field of the hierarchy associated to the Casimir function $f_\lambda = \frac{2}{3}\,\mathrm{Tr}(A + M\lambda^{-1})^{3/2}$, for which $\nabla f^{(1)}_\lambda = A^{1/2}\lambda + \Omega$. No comparably simple and natural connection exists between the other Poisson pencil $(P_2 - \lambda P_1)$ and the Lax–Manakov form of the equations. This setting can be generalized to cover the cases of more general Lax matrices with spectral parameter on $gl(r)$, of the form $L(\lambda) = A\lambda^k + M_1\lambda^{k-1} + \cdots + M_k$. The general framework is described in [6], and will be partially recalled in Sec. 4 below. So far, we have simply put together some results already present in the literature. Now, some questions arise naturally. We have seen a dynamical system which is bihamiltonian with respect to two independent PQ structures, $(P_0, P_1)$ and $(P_1, P_2)$; is that a pure accident, or is it a common situation? One can check by explicit computation that the Poisson tensors $P_0$ and $P_2$ are not only separately compatible with $P_1$, but also compatible with each other (that is not obvious, as compatibility is not a transitive relation). Does it make any sense to introduce the notion of a trihamiltonian structure $(P_0, P_1, P_2)$? Would it carry any additional information not already contained in either one of the PQ structures, each of which already allows one to characterize completely the dynamical system and its symmetries? The vector fields (1.23) on $gl(r)$ are indeed trihamiltonian. The full set of hamiltonians and vector fields generated by the traces of integer powers of $A\lambda + M$ fits into a single “planar” diagram (as was first pointed out by M. Ugaglia [13]), which could be regarded as the “trihamiltonian version” of the Lenard–Magri “linear” diagrams (1.14):
[Diagram (1.24): planar trihamiltonian recursion diagram for the hamiltonians and vector fields generated by the traces of the powers of $A\lambda + M$; the graphic is not recoverable from the source text.]
We have seen above that, for a PQ structure, any linear recursion starting from a Casimir function of $Q$ and ending with a Casimir function of $P$ corresponds to the existence of a $\lambda$-polynomial Casimir function of the Poisson pencil $(P - \lambda Q)$. Can one find a “generating polynomial” for the full trihamiltonian recursion? The answer is yes: as we shall see in detail in Sec. 2, if one considers two compatible Poisson pencils $(P_1 - \lambda P_0)$ and $(P_2 - \mu P_0)$, one can define a common Casimir function of the two pencils to be a bivariate polynomial $f_{\lambda\mu} = \sum h^i_j\,\lambda^j\mu^i$ such that
\[ (P_1 - \lambda P_0)\,df_{\lambda\mu} = 0 , \qquad (P_2 - \mu P_0)\,df_{\lambda\mu} = 0 \tag{1.25} \]
for any value of $(\lambda, \mu)$: then its coefficients $h^i_j$ fulfill the recursion relations represented in the diagram (1.24). Later on we will explain why the construction of two Poisson pencils, each one with its own spectral parameter, is here more fruitful than introducing a two-parameter pencil like $(P_0 - \lambda P_1 - \mu P_2)$. Up to this point, the reader might still regard the idea of trihamiltonian structures as an artifact of purely academic interest, a mere “variation on the theme” of bihamiltonian structures. Two results, presented in this article, suggest that the subject is worth investigating further. First, the trihamiltonian structure associated with (1.3) can be generalized, in quite a nontrivial way, to Lax equations for matrices of the form $L(\lambda) = A\lambda^n + M_1\lambda^{n-1} + \cdots + M_n$, which include several interesting systems such as the Lagrange top [14] and the finite-dimensional Dubrovin–Novikov reductions of the Gel’fand–Dickey soliton hierarchies [15]. Indeed, the generalization of the pencil $(P_1 - \lambda P_0)$ to the direct sum of $n$ copies of $gl(r)$ was already described in [6], but to our knowledge it is an entirely new result that the Morosi–Pizzocchero bracket is also a particular case of a more general structure existing on $gl(r)^n$, a Poisson tensor $P_2$ which turns out to be quadratically dependent on the dynamical variables $M_i$, apart from the linear case $n = 1$ already discussed.
The second striking fact is that for these trihamiltonian structures on $gl(r)^n$ there always exists a common Casimir function of the two pencils, which (for a generic choice of the matrix $A$) has the property that its coefficients form a maximal set of independent hamiltonians in involution (we stress that, in contrast, the recursion diagram for the traces of the powers of the Lax matrix includes infinitely many hamiltonians). This miraculous Casimir function is nothing but the characteristic determinant of the Lax matrix,
\[ f_{\lambda\mu} = \det|L(\lambda) - \mu\mathbf{1}| . \tag{1.26} \]
The corresponding recursion diagram features a sort of “fundamental molecule”, a “fingerprint” associated to the trihamiltonian structure $(P, Q_1, Q_2)$. For instance, the following diagram corresponds to the trihamiltonian recursion on the algebra $gl(3)$:

[Diagram (1.27): “fundamental molecule” of the trihamiltonian recursion on $gl(3)$; the graphic is not recoverable from the source text.]
while (1.28) is the “molecule” of the trihamiltonian structure on $gl(2)^2$:

[Diagram (1.28): “fundamental molecule” of the trihamiltonian structure on $gl(2)^2$; the graphic is not recoverable from the source text.]
The general form of the “fundamental molecule” for $gl(r)^n$ is given in Sec. 4 as Fig. 1.

[Fig. 1. Fundamental molecule for $gl(r)^n$. The graphic is not recoverable from the source text.]

Although it is a well known fact that the coefficients of the characteristic polynomial are in involution with respect to the usual Lie–Poisson bracket, in the bihamiltonian framework there was no apparent reason to introduce a bivariate polynomial $f_{\lambda\mu}$ in connection with the Lenard–Magri recursion. For a trihamiltonian structure, instead, it is quite natural to consider this object, and the characteristic polynomial of a Lax matrix becomes just a particular case of it, in exactly the same way as Lax equations with spectral parameter are a particular case of Hamilton equations, for the appropriate Poisson pencil (1.20). This opens a very interesting perspective. The characteristic equation
\[ \det|L(\lambda) - \mu\mathbf{1}| = 0 , \tag{1.29} \]
regarded as a polynomial equation for $(\lambda, \mu) \in \mathbb{C}^2$, defines the well-known spectral curve, i.e. the starting point for the algebro-geometric methods of linearisation [4, 5]. In the trihamiltonian framework, as we have seen, the characteristic determinant naturally occurs as the fundamental Casimir function of two pencils: yet this does not explain why the roots $(\lambda, \mu)$ of Eq. (1.29) should play any role at all. Now comes a third surprise: a fairly general construction presented in Sec. 3 shows that the equation $f_{\lambda\mu} = 0$ is the keystone for the construction of canonical separation coordinates for trihamiltonian systems. This result essentially derives from an observation by E. Sklyanin [8]. On algebro-geometric grounds, Sklyanin has found a “magic recipe” (“Take the poles of the properly normalized Baker–Akhiezer function and the corresponding eigenvalues of the Lax operator”), which essentially amounts to finding the common roots of (1.29) and of suitable minors (or linear combinations of minors) of the characteristic matrix $L(\lambda) - \mu\mathbf{1}$. In the examples considered by Sklyanin, the new variables $(\lambda_i, \mu_i)$ defined in this way turn out to be canonical with respect to a suitable Poisson bracket; as a direct consequence of Eq. (1.29), all the hamiltonians provided by the (nonconstant) coefficients of $f_{\lambda\mu}$ are then separable in the coordinates $(\lambda_i, \mu_i)$. However, Sklyanin himself remarks that “generally speaking, there is no guarantee that one obtains the canonical Poisson brackets [..] The key words in the above recipe are ‘the properly normalized’. The choice of the proper normalization can be quite nontrivial, and for some integrable models the problem remains unsolved”.
Independently of Sklyanin’s approach, Magri and his collaborators [2, 3] have recently shown that given (i) a PQ structure, (ii) a complete family of commuting hamiltonians defined by the Casimir functions of the Poisson pencil, and (iii) a set of vector fields, suitably normalized on the hamiltonians, which preserve the Poisson tensor $P$ but do not belong to its image, then one can define by projection (under some additional conditions on the vector fields) a reduced, kernel-free bihamiltonian structure; for this new PQ structure, a set of Darboux–Nijenhuis canonical coordinates can be obtained by a constructive procedure, and the original hamiltonians (properly reduced) turn out to be all simultaneously separable in these coordinates. The theoretical interest of both constructions goes well beyond the concrete applicability of these procedures. As a matter of fact, while Sklyanin’s recipe lacks a general, theoretically grounded rule to find the key element (the normalization of the BA function, or equivalently the proper linear combination of minors of the Lax matrix which should vanish), in Magri’s theory there is no practical recipe
to construct systematically sets of transversal vector fields fulfilling the necessary requirements. In both approaches, moreover, the final construction of separation coordinates involves finding the roots of polynomial equations, which even for rather simple examples turn out to be of order higher than three. As we show in this article, Magri’s procedure can be adapted to the trihamiltonian setup, without losing its geometric elegance, and actually making the theory even simpler and more symmetric (although less general). In this framework, the central role of the “generalized spectral equation” $f_{\lambda\mu} = 0$ becomes clear. Moreover, for the particular trihamiltonian structures that we are introducing on the spaces $gl(r)^n$, we have found a systematic way to produce the required transversal vector fields, and we will show that the resulting Darboux–Nijenhuis coordinates are exactly the roots of suitable combinations of minors of the characteristic matrix, much like Sklyanin’s coordinates; in this way, we provide for this class of systems the missing element in both Sklyanin’s and Magri’s prescriptions for the construction of separation variables. In addition, we show that our framework makes available a different strategy, which yields the inverse transformation (i.e. the matrix elements of the Lax operator as functions of the separation variables) by solving only a system of linear algebraic equations, thus bypassing the problem of finding roots of higher-order polynomials. Let us quote another important remark by Sklyanin [8]: “Separation of variables, understood generally enough, could be the most universal tool to solve integrable models [. . .] the standard construction of the action-angle variables from the poles of the Baker–Akhiezer function can be interpreted as a variant of separation of variables, and moreover, for many particular models it has a direct quantum counterpart”.
Therefore, a satisfactory hamiltonian setup for Sklyanin’s construction is likely to provide a link between hamiltonian and algebro-geometric integrability. In this sense, the equation $f_{\lambda\mu} = 0$, pointing towards a generalisation of the notion of spectral curve not relying on Lax representations, deserves some additional interest. The article is organized as follows: in Sec. 2 we recall, as concisely as possible, some facts about bihamiltonian structures which are necessary for the subsequent discussion; then, we present a theoretical setting for our class of trihamiltonian structures. In Sec. 3 we discuss the general method of construction of separation variables, i.e. the trihamiltonian version of Magri’s construction. We present in detail the proofs of some relevant propositions providing the theoretical background for all applications of our framework; furthermore, we show what the components of all relevant objects (Poisson structures, common Casimir function, transversal vector fields, etc.) look like in Darboux–Nijenhuis coordinates; this will be used in Sec. 4 to reconstruct the coordinate transformation. The fourth and last section is devoted to the application to Lax equations with spectral parameter on $gl(r)^n$; here we simply list the “ingredients of the recipe” without a general proof; this section is intended to present a concrete outcome in just enough detail to motivate the reader to deal with the theoretical construction of Sec. 3.
Throughout the article no attempt is made to present a geometric characterisation, or classification, of the trihamiltonian structures possessing the specific features considered. In particular, we deliberately avoid encompassing these features in a single definition of “special trihamiltonian structure”. In our discussion, the basic structure involved is sometimes presented as a triple of compatible Poisson tensors, but more often as a pair of Poisson pencils; the assumptions on this basic structure vary according to the context. In Sec. 2 we just reconsider the notion of “trihamiltonian recursion” associated with a common Casimir function of two Poisson pencils, without imposing particular conditions on the Poisson tensors besides their mutual compatibility; in such generality, indeed, nothing ensures that a common Casimir function exists at all (we will present a counterexample). In Sec. 3.1 we define a pair of Nijenhuis tensors using a set of vectorfields, which should fulfill a number of requirements: these Nijenhuis tensors are used to introduce suitable Darboux–Nijenhuis coordinates, and do not act as recursion operators for the trihamiltonian iteration described previously (throughout this subsection, the existence of a common Casimir function is irrelevant). In Sec. 3.2 the two objects, i.e. the common Casimir function (that we now require to be complete in a suitable sense) and the set of transversal vectorfields, are eventually combined together to construct a set of bivariate polynomials $S_\alpha(\lambda, \mu)$: then, a number of additional hypotheses are introduced to obtain the main result, i.e. that the common roots of these polynomials are Darboux–Nijenhuis coordinates, in which the Hamiltonians occurring as coefficients in the common Casimir function are separated (in the sense of Sklyanin).
Hence, the set of conditions to be imposed on the basic structure depends on whether one looks for the mere existence of iterated trihamiltonian vectorfields, or for a possible explicit construction of separation coordinates. In conclusion, we feel that at the present stage of understanding of the matter presented herein, one could hardly single out a simple and general definition which might be regarded as truly fundamental. The aim of the article is rather to display a number of nontrivial and rigorous (if not yet complete) arguments in favour of further investigations in this direction.

2. From Bi- to Tri-Hamiltonian Structures

2.1. Poisson pencils

As was already done (1.8) in the introductory section, we shall represent a Poisson bracket $\{\cdot, \cdot\}$ on a manifold $M$ by means of a contravariant antisymmetric tensorfield $P$, according to
\[ \langle dg, P\,df\rangle = \{f, g\} . \tag{2.1} \]
The names Poisson structure or hamiltonian structure are equivalently used, as is commonly done, to denote both the tensor P and the algebra of differentiable functions on M with the bilinear operation defined by the corresponding Poisson bracket. Of course, a contravariant antisymmetric tensorfield P defines a
hamiltonian structure only if (2.1) obeys the Jacobi identity; this condition corresponds to a differential identity on the components of $P$. In most of our applications, the tensor $P$ will not be of maximal rank; thus, the subalgebra of functions which are in involution with any other function may include non-constant functions, the Casimir functions. The Casimir functions are constants of motion for any hamiltonian vectorfield, i.e. for any vectorfield being the image of a closed one-form through the Poisson tensor $P$. Therefore, any trajectory of any possible hamiltonian system lies entirely on a common level set of all the Casimir functions. Generically, such a level set is a submanifold, the dimension of which equals the rank of the Poisson tensor. Upon reduction to any of these submanifolds, the Poisson tensor becomes invertible and therefore defines a symplectic structure. For this reason, the common level sets (for regular values) of the Casimir functions are called symplectic leaves. In contrast with the case of symplectic manifolds, the Lie derivative of the Poisson tensor can vanish along the flow of a given vectorfield $X$,
\[ L_X(P) = 0 , \tag{2.2} \]
without $X$ being even locally hamiltonian: the condition (2.2) can be fulfilled also by vectorfields which do not belong to the image of $P$, and are therefore transversal to the symplectic leaves. Such vectorfields will be called weakly hamiltonian. They play an important role in the sequel. Bihamiltonian structures were first introduced by Magri in [1].

Definition 2.1. Two Poisson tensors $P$ and $Q$ on a manifold $M$ are said to be compatible if any linear combination of the two tensors is again a Poisson tensor. Vectorfields which are hamiltonian with respect to both structures are called bihamiltonian vectorfields.

In this article, we borrow from the bihamiltonian theory the following facts:
(i) Two hamiltonians are associated to a single bihamiltonian vector field $X = P\,dh = Q\,dk$. Then, one can define two other vectorfields, namely $Q\,dh$ and $P\,dk$. In some cases, these turn out to be bihamiltonian as well, and the procedure can be iterated yielding a Magri–Lenard hierarchy of bihamiltonian vectorfields, as in (1.10).
(ii) Once a Magri–Lenard hierarchy has been constructed, all the vectorfields belonging to it are mutually commuting, and all their hamiltonians are in involution with respect to both $P$ and $Q$.
(iii) There are basically two ways to produce such hierarchies: if at least one of the Poisson tensors (say, $P$) is nondegenerate, then one can introduce the recursion operator (or Nijenhuis tensor)
\[ N = Q \cdot P^{-1} . \tag{2.3} \]
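A pointwise linear-algebra sketch of (2.3) may be useful. It assumes, purely for illustration, a canonical tensor $P$ on a four-dimensional space and a second tensor $Q$ of the same block shape whose only nonzero brackets pair the $i$-th coordinate with the $(i+2)$-th one and take the sample values 2 and 5; the numbers and helper names are assumptions and do not come from the article.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block_tensor(diag):
    """Antisymmetric 2m x 2m matrix [[0, D], [-D, 0]] with D = diag(diag)."""
    m = len(diag)
    T = [[Fraction(0)] * (2 * m) for _ in range(2 * m)]
    for i, d in enumerate(diag):
        T[i][m + i] = Fraction(d)
        T[m + i][i] = -Fraction(d)
    return T


# Sample values of the brackets defining Q at one point (hypothetical).
values = [2, 5]
P = block_tensor([1, 1])      # canonical Poisson tensor
Q = block_tensor(values)      # second tensor, same block shape

# For the canonical tensor P*P = -1, hence P^{-1} = -P.
P_inv = [[-entry for entry in row] for row in P]
identity = [[Fraction(int(i == j)) for j in range(4)] for i in range(4)]
inverse_is_correct = matmul(P, P_inv) == identity

# Recursion operator (2.3).
N = matmul(Q, P_inv)
```

For this sample, `N` is the diagonal matrix with entries 2, 5, 2, 5: each value appears twice, once for each member of a conjugate pair of coordinates.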
One can prove [12] that for any bihamiltonian vector field $X$, the vector field $NX$ is also bihamiltonian, so the hierarchy can be produced by iterated application of
the (1, 1) tensor field $N$. The sequence of hamiltonians is in turn generated by the adjoint recursion operator $N^* = P^{-1} \cdot Q$. Alternatively (for instance, if both Poisson tensors are degenerate), one can look for Casimir functions of the Poisson pencil $(Q - \lambda P)$, as already described in the Introduction. The classical proof of the involutivity property, which is the most relevant to our purposes, is so simple and elegant that we reproduce it here (further details can be found in [16, 17]):

Proposition 2.1. Let $f_\lambda$ and $g_\lambda$ be two Casimir functions of the Poisson pencil $(Q - \lambda P)$. Assume that $f_\lambda$ and $g_\lambda$ are expanded in power series in the parameter $\lambda$ of the pencil, $f_\lambda = \sum_{i=0} f_i\lambda^i$ and $g_\lambda = \sum_{i=0} g_i\lambda^i$. Then $\{f_j, f_k\} = \{g_j, g_k\} = \{f_j, g_k\} = 0$ for all $j, k$: this holds for both brackets $\{\,,\}_P$ and $\{\,,\}_Q$ associated with $P$ and $Q$ respectively.

Proof. The conditions $(Q - \lambda P)\,df_\lambda = 0$ and $(Q - \lambda P)\,dg_\lambda = 0$ are equivalent to $P\,df_i = Q\,df_{i+1}$ and $P\,dg_i = Q\,dg_{i+1}$. Moreover, one should have $Q\,df_0 = 0$ and $Q\,dg_0 = 0$, i.e. the lowest-order coefficients of both expansions should be Casimir functions for $Q$. One can assume $j < k$, without loss of generality. From definition (2.1) and the recursion relations above, $\{f_j, f_k\}_P = \{f_{j+1}, f_k\}_Q = \{f_{j+1}, f_{k-1}\}_P$. Whenever $k - j$ is even, applying the equality repeatedly one finds that $\{f_j, f_k\}_P = \{f_r, f_r\}_P = 0$, with $r = (j + k)/2$; otherwise, after $(k - j)$ steps one finds $\{f_j, f_k\}_P = \{f_k, f_j\}_P$, which proves the statement by the antisymmetry of the Poisson bracket. The same holds for $\{g_j, g_k\}_P$, and for the other bracket $\{\,,\}_Q$. Furthermore, applying the same iterative argument one finds $\{f_j, g_k\}_Q = \{f_{j+k}, g_0\}_Q$, and the latter bracket vanishes because $g_0$ is a Casimir function for $Q$. This proves that $\{f_j, g_k\}_Q = 0$ for all $j, k$. Since $\{f_j, g_k\}_P = \{f_{j+1}, g_k\}_Q$, one has $\{f_j, g_k\}_P = 0$ as well.

2.2. Casimir functions and trihamiltonian vector fields

Assume that a manifold $M$ is endowed with three Poisson tensors $P_0$, $P_1$ and $P_2$, pairwise compatible.
A natural question is whether a set of vectorfields that are hamiltonian with respect to all three structures can be generated by the coefficients of some “generating function”, analogous to the Casimir function $f_\lambda$ above, and whether the corresponding hamiltonians would then be automatically in involution. One might believe that the obvious generalization of the setting just described would consist in introducing a two-parameter Poisson pencil $P_0 - \lambda P_1 - \mu P_2$ and seeking its Casimir functions. Unfortunately, the coefficients of the Taylor series in the two parameters $(\lambda, \mu)$ do not fit into any useful recursion relation: from the Casimir equation
\[ (P_0 - \lambda P_1 - \mu P_2)\,df_{\lambda\mu} = 0 \qquad\text{for}\qquad f_{\lambda\mu} = \sum_{i,j=0}^{\infty} h^j_i\,\lambda^i\mu^j \]
one gets the relations
\[ \begin{aligned} P_0\,dh^0_0 &= 0 \\ P_0\,dh^0_{i+1} &= P_1\,dh^0_i &\qquad i &= 0, 1, \dots \\ P_0\,dh^{j+1}_0 &= P_2\,dh^j_0 &\qquad j &= 0, 1, \dots \\ P_0\,dh^{j+1}_{i+1} &= P_1\,dh^{j+1}_i + P_2\,dh^j_{i+1} &\qquad i, j &= 0, 1, \dots \end{aligned} \]
which neither provide trihamiltonian vector fields nor force the functions $h^j_i$ to be in involution. Let us consider instead a function $f_{\lambda\mu}$ that is simultaneously a Casimir function of the two distinct pencils $(P_1 - \lambda P_0)$ and $(P_2 - \mu P_0)$:
\[ (P_1 - \lambda P_0)\,df_{\lambda\mu} = 0 , \qquad (P_2 - \mu P_0)\,df_{\lambda\mu} = 0 . \]
In this case we obtain the following relations:
\[ \begin{aligned} P_1\,dh^j_0 &= 0 &\qquad j &= 0, 1, \dots \\ P_0\,dh^j_i &= P_1\,dh^j_{i+1} &\qquad i, j &= 0, 1, \dots \\ P_2\,dh^0_i &= 0 &\qquad i &= 0, 1, \dots \\ P_0\,dh^j_i &= P_2\,dh^{j+1}_i &\qquad i, j &= 0, 1, \dots \end{aligned} \]
graphically:

[Diagram (2.4): planar recursion diagram of the relations above; the graphic is not recoverable from the source text.]

The vectorfields $P_0\,dh^j_i = P_1\,dh^j_{i+1} = P_2\,dh^{j+1}_i$ are clearly trihamiltonian. Notice that it is possible to find a common Casimir function which can be (formally) expanded in a Taylor series with respect to the two parameters $\lambda$ and $\mu$ only if both $P_1$ and $P_2$ are degenerate Poisson tensors: in fact, for a fixed power
of $\lambda$, the lowest-order coefficient in $\mu$ must be a Casimir function of $P_2$, while the lowest-order coefficient in $\lambda$ for any fixed power of $\mu$ must be a Casimir function of $P_1$. If, moreover, $P_0$ is also degenerate, then it is possible to find Casimir functions which are polynomials in $\lambda$ and $\mu$. For such functions the recursion diagram is finite. In analogy with the bihamiltonian case, one has:

Proposition 2.2. Given a common Casimir function $f_{\lambda\mu} = \sum h^j_i\,\lambda^i\mu^j$ of two compatible Poisson pencils $P_1 - \lambda P_0$ and $P_2 - \mu P_0$, all the coefficients $h^j_i$ are in mutual involution with respect to all three Poisson brackets.

Proof. For any $i, j$ the functions $h^{(i)}_\lambda = \sum_k h^i_k\lambda^k$ and $h^{(j)}_\lambda = \sum_k h^j_k\lambda^k$, i.e. the coefficients of $\mu^i$ and $\mu^j$ in the expansion of $f_{\lambda\mu}$, are Casimir functions of the Poisson pencil $(P_1 - \lambda P_0)$; then, by Proposition 2.1 all their coefficients $h^j_i$ are in involution with respect to both $P_0$ and $P_1$. On the other hand, the functions $h^{(i)}_\mu = \sum_k h^k_i\mu^k$ and $h^{(j)}_\mu = \sum_k h^k_j\mu^k$ are Casimir functions of the other pencil $P_2 - \mu P_0$, hence the $h^j_i$ are in involution also with respect to $P_2$.

We remark that there are other possible ways to extend the bihamiltonian framework to the case in which there are more than two compatible Poisson structures. The idea of a trihamiltonian vector field was already considered, for example, in [18] and [19], where the three structures $P$, $Q$ and $S$ were however assumed to produce the iteration $P\,dh_i = Q\,dh_{i+1} = S\,dh_{i-1}$. In this approach, the third structure only supplies an additional relation which links vectorfields already belonging to the same Magri–Lenard hierarchy; in our framework, the third structure acts instead as a bridge linking different bihamiltonian hierarchies, and so allows one to collect a greater number of functions in involution in a single object: the common Casimir function. In our setup, the structure $P_0$ seems to play a distinguished role with respect to $P_1$ and $P_2$.
As a matter of fact, it is easy to figure out how to include in the picture also the Poisson pencil built from $P_1$ and $P_2$, but the $P_1$–$P_2$ recursion is already included in the diagram (2.4), and introducing a third pencil would be redundant. In the recursion diagram, all structures appear on an equal footing; on the other hand, in the applications that we have in mind there is a distinguished structure, so the “symmetry breaking” caused by the choice of two pencils is significant. We stress that, given three Poisson structures $P_0$, $P_1$ and $P_2$, mutually compatible and such that the two pencils $P_1 - \lambda P_0$, $P_2 - \mu P_0$ both admit Casimir functions, a common Casimir function as required in Proposition 2.2 may not exist at all. An obvious necessary condition is that at each point $x$ of the phase space $M$, and for generic values of the spectral parameters $(\lambda, \mu)$, the two subspaces $\ker(P_1 - \lambda P_0)$ and $\ker(P_2 - \mu P_0)$ have a nontrivial intersection in $T^*_x M$. For instance, let us consider the space $\mathbb{R}^5$, with coordinates $\{x_1, x_2, x_3, x_4, x_5\}$, endowed with the three Poisson tensors:
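\[ P_0 = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \quad P_1 = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 \end{pmatrix} , \quad P_2 = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{pmatrix} . \]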
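The $\mathbb{R}^5$ example can be checked mechanically. The sketch below assumes that the three tensors are $P_0 = \partial_1\wedge\partial_3 + \partial_2\wedge\partial_4$, $P_1 = \partial_1\wedge\partial_4 + \partial_2\wedge\partial_5$ and $P_2 = \partial_1\wedge\partial_4 + \partial_3\wedge\partial_5$, and it tests the differentials of the polynomials $x_3 + \lambda x_4 + \lambda^2 x_5$ and $x_2 + \mu x_1 - \mu^2 x_5$ at a few sampled integer values of the parameters only. The helper names are illustrative.

```python
from fractions import Fraction


def bivector(n, pairs):
    """Antisymmetric n x n matrix with +1 at (i, j), -1 at (j, i); 1-based."""
    P = [[0] * n for _ in range(n)]
    for i, j in pairs:
        P[i - 1][j - 1] = 1
        P[j - 1][i - 1] = -1
    return P


def pencil(P, Q, s):
    """Matrix of the pencil P - s*Q."""
    n = len(P)
    return [[P[i][j] - s * Q[i][j] for j in range(n)] for i in range(n)]


def apply(T, v):
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def rank(rows):
    """Rank by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r


def proportional(u, v):
    """True when all 2 x 2 minors of the pair (u, v) vanish."""
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i]
               for i in range(n) for j in range(i + 1, n))


P0 = bivector(5, [(1, 3), (2, 4)])
P1 = bivector(5, [(1, 4), (2, 5)])
P2 = bivector(5, [(1, 4), (3, 5)])


def grad_f(lam):   # differential of x3 + lam*x4 + lam^2*x5
    return [0, 0, 1, lam, lam ** 2]


def grad_g(mu):    # differential of x2 + mu*x1 - mu^2*x5
    return [mu, 1, 0, 0, -mu ** 2]


samples = range(-3, 4)
zero = [0] * 5
casimirs_ok = all(apply(pencil(P1, P0, s), grad_f(s)) == zero and
                  apply(pencil(P2, P0, s), grad_g(s)) == zero
                  for s in samples)
kernels_are_lines = all(rank(pencil(P1, P0, s)) == 4 and
                        rank(pencil(P2, P0, s)) == 4 for s in samples)
kernels_never_meet = not any(proportional(grad_f(l), grad_g(m))
                             for l in samples for m in samples)
```

At the sampled values each pencil has a one-dimensional kernel, spanned by the corresponding differential, and the two kernel lines never coincide, so the only covector annihilated by both pencils is zero.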
Although the pencils $P_1 - \lambda P_0$ and $P_2 - \mu P_0$ admit the Casimir functions $x_3 + \lambda x_4 + \lambda^2 x_5$ and $x_2 + \mu x_1 - \mu^2 x_5$, respectively, it is easy to check that a common Casimir function for both pencils does not exist. The simple local geometry of our trihamiltonian structures may be clarified by an example. Let us consider the “fundamental molecule” (1.27). The lowest dimension in which this diagram can be realised is 9. In fact, the diagram includes six functions, that we assume to be independent. Since the three vectorfields in the diagram commute, by Frobenius’ theorem there exists a coordinate system in which they coincide with coordinate vectorfields: $X_i \equiv \partial/\partial x_i$ for $i = 1, 2, 3$. The diagram shows that the hamiltonian $h^0_0$ is $P_0$-conjugate to $x_1$, while $h^0_1$ and $h^1_0$ are $P_0$-conjugate to $x_2$ and $x_3$ respectively. The other hamiltonians $h^0_2$, $h^1_1$ and $h^2_0$ are Casimir functions for $P_0$. Therefore, the 9 functions $x_i$ and $h^j_i$ should be functionally independent, and locally form a coordinate system: let $x_4 \equiv h^0_0$, $x_5 \equiv h^0_1$, $x_6 \equiv h^1_0$, $x_7 \equiv h^0_2$, $x_8 \equiv h^1_1$ and $x_9 \equiv h^2_0$. It can be read directly from the diagram (1.27) that in these coordinates the three tensors $P_0$, $P_1$ and $P_2$ have the following matrix components:
\[ P_0 = \begin{pmatrix} 0&0&0&1&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&1&0&0&0 \\ -1&0&0&0&0&0&0&0&0 \\ 0&-1&0&0&0&0&0&0&0 \\ 0&0&-1&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \end{pmatrix} \]
\[ P_1 = \begin{pmatrix} 0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&0 \\ -1&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \\ 0&-1&0&0&0&0&0&0&0 \\ 0&0&-1&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \end{pmatrix} \]
\[ P_2 = \begin{pmatrix} 0&0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&1 \\ 0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \\ -1&0&0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0&0 \\ 0&-1&0&0&0&0&0&0&0 \\ 0&0&-1&0&0&0&0&0&0 \end{pmatrix} . \]
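The multicanonical form can be tested against the recursion relations (2.4). The sketch below assumes the index convention in which $x_4, \dots, x_9$ stand for $h^0_0, h^0_1, h^1_0, h^0_2, h^1_1, h^2_0$ (upper index for the power of $\mu$, lower index for the power of $\lambda$), and takes the tensors to be $P_0 = \sum_{i=1}^{3} \partial_i\wedge\partial_{i+3}$, $P_1 = \partial_1\wedge\partial_5 + \partial_2\wedge\partial_7 + \partial_3\wedge\partial_8$ and $P_2 = \partial_1\wedge\partial_6 + \partial_2\wedge\partial_8 + \partial_3\wedge\partial_9$; both the convention and the helper names are assumptions made for this check.

```python
def bivector(n, pairs):
    """Antisymmetric n x n matrix with +1 at (i, j), -1 at (j, i); 1-based."""
    P = [[0] * n for _ in range(n)]
    for i, j in pairs:
        P[i - 1][j - 1] = 1
        P[j - 1][i - 1] = -1
    return P


P0 = bivector(9, [(1, 4), (2, 5), (3, 6)])
P1 = bivector(9, [(1, 5), (2, 7), (3, 8)])
P2 = bivector(9, [(1, 6), (2, 8), (3, 9)])


def hamiltonian_field(P, k):
    """Components of P dx_k, i.e. the k-th column of P (1-based k)."""
    return [row[k - 1] for row in P]


def coordinate_field(i, n=9):
    """Components of the coordinate vectorfield d/dx_i."""
    return [int(a == i - 1) for a in range(n)]


# X_i = P0 dx_{k0} = P1 dx_{k1} = P2 dx_{k2} for i = 1, 2, 3.
recursion = [(4, 5, 6), (5, 7, 8), (6, 8, 9)]
recursion_ok = all(
    hamiltonian_field(P0, k0) == hamiltonian_field(P1, k1)
    == hamiltonian_field(P2, k2) == coordinate_field(i)
    for i, (k0, k1, k2) in enumerate(recursion, start=1)
)

# Casimir coordinates of each tensor under the assumed convention.
casimirs = {"P0": (P0, [7, 8, 9]), "P1": (P1, [4, 6, 9]), "P2": (P2, [4, 5, 7])}
casimirs_ok = all(hamiltonian_field(P, k) == [0] * 9
                  for P, ks in casimirs.values() for k in ks)
```

Under these assumptions each of the three coordinate vectorfields is hamiltonian with respect to all three tensors, and each tensor has exactly three Casimir coordinates, in agreement with the equal ranks noted in the text.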
The fact that three independent Poisson tensors can be simultaneously put in canonical form is possible only because they are all degenerate (in the realization of minimal dimension, they ought to have the same rank), and their symplectic foliations are different. Having anticipated that the diagram (1.27) corresponds to the trihamiltonian structure of $gl(3)$, we have in fact shown that the latter trihamiltonian space admits local multicanonical coordinates. On the other hand, although six of the multicanonical coordinates coincide with the coefficients of the characteristic determinant of the Lax matrix $A\lambda + M$, the first three coordinates can be found only upon explicit integration of the dynamical system. Analogous considerations hold for the “fundamental molecule” of any space $gl(n)^\kappa$, and actually for any finite trihamiltonian recursion diagram (under the assumption that all the hamiltonians are independent, which is generically true for $gl(n)^\kappa$). The lowest dimension to accommodate a trihamiltonian structure admitting multicanonical coordinates is 4 (the “fundamental molecule” contains just one vector field and three hamiltonians); an example is the algebra $gl(2)$. The reader can easily find out the multicanonical form of the three Poisson tensors generating the $gl(2)^2$ diagram (1.28). Another type of coordinates, the separation coordinates, can instead be obtained explicitly in an alternative way, which does not require the integration of the vectorfields. This is explained in the next section.
3. Separation of Variables

3.1. Darboux–Nijenhuis coordinates

In this section we shall adapt to the trihamiltonian framework the construction of separation variables proposed by Falqui, Magri and Pedroni in [2, 3]. The basic notion involved in their construction is the definition of Darboux–Nijenhuis coordinates [12, 20]. Consider a bihamiltonian structure PQ on a manifold $M$, with $\dim(M) = 2m$, such that at least one of the Poisson tensors (say, $P$) is nondegenerate. Let $N$ be the recursion operator defined by (2.3); following Magri [12], we say that $M$ is endowed with a Poisson–Nijenhuis structure. We shall assume that $P$ and $Q$ are such that at a generic point of $M$, the recursion tensor $N$ has $m$ distinct (double) eigenvalues $\lambda_i$. If moreover the $m$ eigenvalues $\lambda_i$ are functionally independent when regarded as functions on $M$, then it has been proved [20] that (at least locally) $m$ other functions $\mu_i$ exist such that:
(i) the functions $(\lambda_i, \mu_i)$ form a system of coordinates on $M$;
(ii) in this coordinate system, the Poisson tensor $P$ is in canonical form, i.e. $\{\lambda_i, \mu_j\}_P = \delta_{ij}$ and $\{\lambda_i, \lambda_j\}_P = \{\mu_i, \mu_j\}_P = 0$, and the recursion tensor $N$ is diagonal.
The coordinates $(\lambda_i, \mu_i)$ are called Darboux–Nijenhuis coordinates. The property (ii) completely determines the $Q$-Poisson brackets of the coordinates: $\{\lambda_i, \mu_i\}_Q = \lambda_i$ and $\{\lambda_i, \mu_j\}_Q = \{\lambda_i, \lambda_j\}_Q = \{\mu_i, \mu_j\}_Q = 0$ for $i \neq j$. Suppose $h_i$ to be a set of $m$ hamiltonians, independent and in involution with respect to $P$ and $Q$ (although not necessarily generated by Magri–Lenard recursion). Falqui, Magri and Pedroni [2, 3] have recently found an intrinsic coupling condition with the recursion operator, ensuring that all the functions $h_i$ are separable in the Darboux–Nijenhuis coordinates. In the case of a degenerate PQ structure, one can sometimes perform a reduction onto a symplectic leaf of $P$, by projection along appropriate transversal vectorfields.
This allows one to compute explicitly not only the coordinates λi (which are the roots of the characteristic polynomial of N, or rather of its minimal polynomial), but also the other coordinates µi, as the values taken by a suitable polynomial p(λ) after the substitutions λ = λi (for the construction of p(λ), the exact statements and the proofs we refer the reader to [21]). For a trihamiltonian structure (P0, P1, P2) with a nondegenerate Poisson tensor P0, one is naturally led to introduce two recursion operators,
\[
N_1 = P_1 \cdot P_0^{-1} \quad\text{and}\quad N_2 = P_2 \cdot P_0^{-1}, \tag{3.1}
\]
so obtaining a compatible pair of Poisson–Nijenhuis structures. In this case, under appropriate conditions it is possible to obtain trihamiltonian Darboux–Nijenhuis coordinates, having the most simple and natural property that one could imagine in this context: Proposition 3.3. Let N1 , N2 be the two tensors defined by (3.1) from a compatible trihamiltonian structure on a 2m-dimensional manifold. If
(1) all the eigenspaces of N1 and N2 coincide (equivalently, N1 and N2 have the same centraliser);
(2) both N1 and N2 have m distinct eigenvalues, forming together a set of 2m independent functions, that we denote by λi and µi respectively (i = 1, . . . , m);
(3) for any pair of eigenvalues λi and µi, corresponding to the same common eigenspace, one has {λi, µi}P0 = 1;
then the eigenvalues of N1 and N2 (respectively denoted by λi and µi) form a Darboux–Nijenhuis coordinate system.

Proof. For any Nijenhuis recursion operator N and any of its eigenvalues λ, it is always true (see [12]) that N∗ dλ = λ dλ. The fact that N1 has m independent eigenvalues implies that all its eigenspaces are bidimensional, and the same holds for N2. We have also assumed that the eigenvalues of N1 and N2 corresponding to the ith common eigenspace are independent functions. Therefore, each eigenspace itself is spanned by the two differentials dλi and dµi:
\[
N_1^* d\lambda_i = \lambda_i\, d\lambda_i, \qquad N_2^* d\lambda_i = \mu_i\, d\lambda_i, \qquad
N_1^* d\mu_i = \lambda_i\, d\mu_i, \qquad N_2^* d\mu_i = \mu_i\, d\mu_i .
\]
From the definition (3.1) and the trivial fact that N1 P0 = P1 = P0 N1∗, at any point where λj ≠ 0 one has
\[
\{\lambda_i,\lambda_j\}_{P_0}
= \frac{1}{\lambda_j}\langle P_0\, d\lambda_i,\ \lambda_j\, d\lambda_j\rangle
= \frac{1}{\lambda_j}\langle P_0\, d\lambda_i,\ N_1^* d\lambda_j\rangle
= \frac{1}{\lambda_j}\langle P_0 N_1^* d\lambda_i,\ d\lambda_j\rangle
= \frac{\lambda_i}{\lambda_j}\{\lambda_i,\lambda_j\}_{P_0}
\]
and, since λi ≠ λj for i ≠ j, one should have {λi, λj}P0 = 0 for all i, j. In a similar way, using the recursion tensor N2 one obtains {µi, µj}P0 = 0 for all i, j. Furthermore,
\[
\{\lambda_i,\mu_j\}_{P_0}
= \frac{1}{\lambda_j}\langle P_0\, d\lambda_i,\ \lambda_j\, d\mu_j\rangle
= \frac{1}{\lambda_j}\langle P_0\, d\lambda_i,\ N_1^* d\mu_j\rangle
= \frac{1}{\lambda_j}\langle P_0 N_1^* d\lambda_i,\ d\mu_j\rangle
= \frac{\lambda_i}{\lambda_j}\{\lambda_i,\mu_j\}_{P_0}
\]
which entails {λi, µj}P0 = 0 for i ≠ j. The above results extend by continuity to points where λj = 0. The additional normalisation condition (3) ensures that (λi, µi) are canonical coordinates for P0; by construction, both tensors N1 and N2 are diagonal in these coordinates, which therefore are Darboux–Nijenhuis for the trihamiltonian structure.

In other terms, if two Poisson pencils are available, and the two recursion operators are “independent” and “compatible” in the sense given above, then all the
Darboux–Nijenhuis variables are obtained as eigenvalues (in the usual bihamiltonian setup, only half of these variables are defined as eigenvalues). The theorem also clarifies that trihamiltonian structures generated by a single recursion operator, P1 = N P0 and P2 = N P1 = N²P0, often encountered in the literature, are not suitable for our purposes.

To obtain a twofold Poisson–Nijenhuis manifold also when the Poisson tensor P0 is degenerate, we exploit a projection technique, which has been already used for a different purpose in [22]. Let us start with two compatible Poisson tensors P0 and P1. The rank of P0 is assumed for simplicity to be constant on the phase manifold M (in the sequel, dim M = 2m + k, k being the corank of P0). The kernel of P0 is then pointwise spanned by the differentials of k Casimir functions cα. The main ingredient in our approach is a set of k independent vectorfields Zα (spanning a distribution Z) having these properties:
\[
\begin{aligned}
&\text{(i) normalization:} && Z_\alpha(c_\beta) = \delta_{\alpha\beta}\,;\\
&\text{(ii) integrability:} && [Z_\alpha, Z_\beta] \in Z\,;\\
&\text{(iii) symmetry for } P_0\text{:} && L_{Z_\alpha} P_0 = 0\,.
\end{aligned}
\tag{3.2}
\]
The normalization property (i) entails that all the vectorfields Zα are transversal to the symplectic leaves of P0. The concrete possibility of finding such a set of transversal vectorfields is not ensured a priori. Here we shall assume their existence, and in the last section of the article we will give an explicit recipe to find them for a relevant class of trihamiltonian structures. The vectorfields Zα induce a decomposition on TM and T∗M:
\[
\forall\,\theta \in T^*M,\quad \theta = \theta_\perp + \theta_\parallel,\quad\text{where } \theta_\perp = \langle Z_\alpha, \theta\rangle\, dc_\alpha\,;
\qquad
\forall\,X \in TM,\quad X = X_\perp + X_\parallel,\quad\text{where } X_\perp = X(c_\alpha)\, Z_\alpha .
\]
Lemma 3.4. The decomposition satisfies the following properties:
\[
\langle \theta_\parallel, Z_\alpha\rangle = 0, \qquad X_\parallel(c_\alpha) = 0 .
\]
Proof.
\[
\langle \theta_\parallel, Z_\alpha\rangle = \langle \theta - \theta_\perp, Z_\alpha\rangle = \langle \theta, Z_\alpha\rangle - \langle \theta, Z_\beta\rangle\, Z_\alpha(c_\beta) = 0\,;
\qquad
X_\parallel(c_\alpha) = X(c_\alpha) - X_\perp(c_\alpha) = X(c_\alpha) - X(c_\beta)\, Z_\beta(c_\alpha) = 0 .
\]
Using the above-defined decomposition, one proves that

Lemma 3.5. The Assumptions (3.2) imply [Zα, Zβ] = 0.

Proof. [Zα, Zβ] ∈ Z entails $[Z_\alpha, Z_\beta]_\parallel = 0$, but one also has $[Z_\alpha, Z_\beta]_\perp = 0$ because [Zα, Zβ](cγ) = Zα(δβγ) − Zβ(δαγ) = 0.
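The decomposition and Lemma 3.4 can be illustrated by plain linear algebra at a single point. The following is only a minimal sketch: the dimension (m = 1, k = 2) and all numerical components of dcα and Zα are hypothetical, chosen solely so that the normalization Zα(cβ) = δαβ holds.

```python
from fractions import Fraction as F

# Hypothetical data at one point of a 4-dimensional manifold (m = 1, k = 2).
# dc[a]: components of the differential dc_a; Z[a]: components of Z_a.
# The numbers are arbitrary, subject only to <Z_a, dc_b> = delta_ab.
dc = [[F(0), F(0), F(1), F(0)],
      [F(0), F(0), F(0), F(1)]]
Z = [[F(2), F(-1), F(1), F(0)],
     [F(3), F(5), F(0), F(1)]]

def pair(vec, covec):
    """Natural pairing <vec, covec>."""
    return sum(v * c for v, c in zip(vec, covec))

def split_covector(theta):
    """Return (theta_perp, theta_par), with theta_perp = <Z_a, theta> dc_a."""
    perp = [sum(pair(Z[a], theta) * dc[a][i] for a in range(len(Z)))
            for i in range(len(theta))]
    par = [t - p for t, p in zip(theta, perp)]
    return perp, par

def split_vector(X):
    """Return (X_perp, X_par), with X_perp = X(c_a) Z_a."""
    perp = [sum(pair(X, dc[a]) * Z[a][i] for a in range(len(Z)))
            for i in range(len(X))]
    par = [x - p for x, p in zip(X, perp)]
    return perp, par

theta = [F(1), F(2), F(3), F(4)]   # an arbitrary covector
X = [F(7), F(-2), F(5), F(1)]      # an arbitrary vector
theta_perp, theta_par = split_covector(theta)
X_perp, X_par = split_vector(X)

# Lemma 3.4: theta_par annihilates every Z_a, and X_par annihilates every c_a.
lemma_holds = (all(pair(Z[a], theta_par) == 0 for a in range(2))
               and all(pair(X_par, dc[a]) == 0 for a in range(2)))
```

Exact rational arithmetic is used so that the two vanishing conditions of Lemma 3.4 can be tested with equality rather than with a tolerance.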
We are now able to introduce a new tensor P˜1 by setting
\[
\tilde P_1(\theta, \varphi) = P_1(\theta_\parallel, \varphi_\parallel)\,; \tag{3.3}
\]
the kernel of the “deformed” tensor P˜1 contains all the differentials of the Casimir functions of P0. Still, the tensor P˜1 is not automatically a Poisson tensor. From the beginning, we have assumed that LZα P0 = 0 (3.2). This is equivalent to requiring that P0 be projectable along the flows of the vectorfields Zα, i.e. for any pair of functions (f, g) such that Zα(f) = Zα(g) = 0 everywhere, their Poisson bracket be also constant along the same flows: Zα({f, g}P0) = 0. So far, nothing ensures that the new tensor P˜1 is projectable in the same sense. Whenever it turns out to be so, i.e.
\[
L_{Z_\alpha} \tilde P_1 = 0\,, \tag{3.4}
\]
then the P˜1-bracket P˜1(df, dg) can be reduced to Zα-invariant functions as well. For a Zα-invariant function f one has $df = (df)_\parallel$, so the P˜1-bracket of such functions coincides with their P1-bracket. Then, if P˜1 is projectable, the Jacobi identity is straightforwardly proved for Zα-invariant functions. We claim that (3.4), supplemented by some auxiliary conditions, ensures both the Jacobi identity and the compatibility with P0 for the P˜1-bracket of arbitrary functions on M.

Proposition 3.6. If (i) all the vectorfields Zα fulfill both (3.2) and (3.4); (ii) all the Casimir functions cα are in mutual involution with respect to P1, and (iii) the functions cα generate bihamiltonian vectorfields, i.e. one can find k functions hα such that V α ≡ P1 dcα = P0 dhα, then P˜1 is a Poisson tensor compatible with P0.

The proof requires several steps.

Lemma 3.7. If the vectorfields V α ≡ P1 dcα are hamiltonian also with respect to P0, then the Poisson tensor P1 and the tensor P˜1 differ by a Lie derivative of P0:
\[
\tilde P_1 = P_1 + L_{X_{P_1}} P_0\,, \tag{3.5}
\]
where (summation is understood on repeated indices) XP1 = hα Zα.

Proof. Since P1(dcα, dcβ) = {hα, cβ}P0 = 0 (cβ being a Casimir function of P0), one has
\[
\begin{aligned}
\tilde P_1(df, dg) &= P_1\big((df)_\parallel, (dg)_\parallel\big) = P_1\big(df - (df)_\perp,\ dg - (dg)_\perp\big)\\
&= P_1(df, dg) + Z_\alpha(f) Z_\beta(g)\, P_1(dc_\alpha, dc_\beta) - Z_\alpha(g)\, P_1(df, dc_\alpha) - Z_\alpha(f)\, P_1(dc_\alpha, dg)\\
&= P_1(df, dg) - Z_\alpha(g)\, P_0(df, dh_\alpha) - Z_\alpha(f)\, P_0(dh_\alpha, dg)\,,
\end{aligned}
\tag{3.6}
\]
which proves the statement since
\[
\begin{aligned}
(L_{X_{P_1}} P_0)(df, dg) &= h_\alpha\big[ Z_\alpha(\{f, g\}_{P_0}) - \{Z_\alpha(f), g\}_{P_0} - \{f, Z_\alpha(g)\}_{P_0} \big] - Z_\alpha(f)\{h_\alpha, g\}_{P_0} - Z_\alpha(g)\{f, h_\alpha\}_{P_0}\\
&= h_\alpha\big[(L_{Z_\alpha} P_0)(df, dg)\big] - Z_\alpha(f)\{h_\alpha, g\}_{P_0} - Z_\alpha(g)\{f, h_\alpha\}_{P_0}\\
&= -Z_\alpha(f)\{h_\alpha, g\}_{P_0} - Z_\alpha(g)\{f, h_\alpha\}_{P_0}\,.
\end{aligned}
\]
This also proves that, under the given hypotheses, LXP1 P0 = V α ∧ Zα. If condition (iii) in the statement of Proposition 3.6 is fulfilled, then (3.4) can be rewritten in terms of the tensor P1, using the previous Lemma:

Lemma 3.8. Upon assuming (3.5), (3.4) is equivalent to
\[
L_{Z_\alpha} P_1 = [V^\beta, Z_\alpha] \wedge Z_\beta\,. \tag{3.7}
\]
Proof. Here and in the sequel we exploit the properties of the Schouten bracket, a bilinear operator defined on contravariant, antisymmetric tensors of arbitrary rank p (p-vectors). We refer the reader to [23] for the general theory; the properties that we shall use are the following:
• if P is a p-vector and Q is a q-vector, then $[P, Q]_S = (-1)^{pq}[Q, P]_S$;
• if, moreover, R is an r-vector, $[P, Q \wedge R]_S = [P, Q]_S \wedge R + (-1)^{q(p+1)}\, Q \wedge [P, R]_S$;
• if X is an ordinary vectorfield, then $[X, P]_S \equiv L_X P$;
• $L_X [P, Q]_S = [L_X P, Q]_S + [P, L_X Q]_S$;
• if P is a bivector, then it is a Poisson tensor if and only if $[P, P]_S = 0$;
• if P and Q are both Poisson tensors, then they are compatible if and only if $[P, Q]_S = 0$.
Proving the Lemma is now straightforward: LZα P1 = LZα (P˜1 − LXP1 P0 ) = −LZα LXP1 P0 = −[Zα , V β ∧ Zβ ]S = −[Zα , V β ] ∧ Zβ − V β ∧ [Zα , Zβ ] = [V β , Zα ] ∧ Zβ . The condition (3.7) is also a projectability condition: it means that the tensor P1 itself can be reduced to Zα -invariant functions, but in this case one cannot identify the image of P1 through projection (on the quotient space) with the restriction of P1 on a submanifold transversal to the vectorfields Zα (for instance, a symplectic leaf of P0 ). The “deformed” tensor P˜1 is exactly constructed in such a way that its restriction to that leaf coincides with the reduction of P1 under projection.
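The identity (3.5) of Lemma 3.7, together with the formula LXP1 P0 = V α ∧ Zα, can be checked by hand in a constant-coefficient toy case. The sketch below is not taken from the paper: the model on R³ with coordinates (x, y, c), the numbers a, b, q and the function h = ax + by are all hypothetical. It assumes the conventions P(θ, φ) = θi P^{ij} φj, (P df)(g) = {f, g}, and (V ∧ Z)^{ij} = V^i Z^j − Z^i V^j.

```python
from fractions import Fraction as F

# Hypothetical toy model on R^3 with coordinates (x, y, c); all tensors constant.
a, b, q = F(2), F(3), F(5)
n = 3
P0 = [[F(0), F(1), F(0)],
      [F(-1), F(0), F(0)],
      [F(0), F(0), F(0)]]           # c is a Casimir function of P0
P1 = [[F(0), q, b],
      [-q, F(0), -a],
      [-b, a, F(0)]]
dc = [F(0), F(0), F(1)]             # differential of the Casimir function c
Z = [F(0), F(0), F(1)]              # transversal vectorfield d/dc, Z(c) = 1
dh = [a, b, F(0)]                   # differential of h = a*x + b*y

def ham(P, df):
    """Vectorfield P df, acting on g as {f, g} = sum_ij df_i P^ij dg_j."""
    return [sum(df[i] * P[i][j] for i in range(n)) for j in range(n)]

def par(theta):
    """theta_par = theta - <Z, theta> dc."""
    s = sum(z * t for z, t in zip(Z, theta))
    return [t - s * d for t, d in zip(theta, dc)]

def bracket(P, theta, phi):
    return sum(theta[i] * P[i][j] * phi[j] for i in range(n) for j in range(n))

basis = [[F(int(i == j)) for j in range(n)] for i in range(n)]
# Deformed tensor (3.3): P1_tilde(theta, phi) = P1(theta_par, phi_par).
P1_tilde = [[bracket(P1, par(basis[i]), par(basis[j])) for j in range(n)]
            for i in range(n)]

V = ham(P1, dc)                     # V = P1 dc, which here equals P0 dh
# Lie derivative of the constant tensor P0 along X = h Z; dX[i][k] = d_k X^i.
dX = [[Z[i] * dh[k] for k in range(n)] for i in range(n)]
lie = [[-sum(P0[k][j] * dX[i][k] + P0[i][k] * dX[j][k] for k in range(n))
        for j in range(n)] for i in range(n)]
wedge = [[V[i] * Z[j] - Z[i] * V[j] for j in range(n)] for i in range(n)]
rhs = [[P1[i][j] + lie[i][j] for j in range(n)] for i in range(n)]
```

In this toy case the deformation simply removes the row and the column of P1 associated with the Casimir coordinate c, so that P1_tilde has dc in its kernel, as stated after (3.3).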
Now we can finally prove the main statement.

Proof. We need to show that $[\tilde P_1, \tilde P_1]_S = 0$ and $[P_0, \tilde P_1]_S = 0$. By assumption, $[P_0, P_0]_S = [P_1, P_1]_S = [P_0, P_1]_S = 0$. Taking the Lie derivative of $[P_0, P_0]_S$ one finds that $[P_0, L_X P_0]_S = 0$ for any vectorfield X, and from (3.5) it follows that $[P_0, \tilde P_1]_S = [P_0, P_1 + L_{X_{P_1}} P_0]_S \equiv 0$. To prove the first equality, observe furthermore that the vectorfields V α are assumed to be hamiltonian for P1, hence $[P_1, V^\alpha]_S \equiv L_{V^\alpha} P_1 = 0$. Then
\[
\begin{aligned}
[\tilde P_1, \tilde P_1]_S &= [P_1, P_1]_S + 2[P_1, V^\alpha \wedge Z_\alpha]_S + [L_{X_{P_1}} P_0, V^\alpha \wedge Z_\alpha]_S\\
&= 2[P_1, V^\alpha]_S \wedge Z_\alpha - 2 V^\alpha \wedge [P_1, Z_\alpha]_S + [L_{X_{P_1}} P_0, V^\alpha]_S \wedge Z_\alpha - V^\alpha \wedge [L_{X_{P_1}} P_0, Z_\alpha]_S\\
&= 2 V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta + [V^\alpha, V^\beta \wedge Z_\beta]_S \wedge Z_\alpha - V^\alpha \wedge [Z_\alpha, V^\beta \wedge Z_\beta]_S\\
&= 2 V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta + [V^\alpha, V^\beta] \wedge Z_\beta \wedge Z_\alpha + V^\beta \wedge [V^\alpha, Z_\beta] \wedge Z_\alpha\\
&\quad - V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta - V^\alpha \wedge V^\beta \wedge [Z_\alpha, Z_\beta]\\
&= 2 V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta - V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta - V^\alpha \wedge [Z_\alpha, V^\beta] \wedge Z_\beta = 0\,.
\end{aligned}
\]
We have seen that P0 and P˜1 are compatible and can both be reduced by projection along the flows of the vectorfields Zα on any symplectic leaf of P0. Then, on each leaf one can define a recursion tensor and look for Darboux–Nijenhuis coordinates; but there is no evidence that such coordinates can be extended to some neighborhood of the symplectic leaf in M. However, a Nijenhuis tensor N1 can be directly defined on the full manifold M in the following way. To any vectorfield X over M one can associate the vectorfield $X_\parallel$ which is, by construction, everywhere tangent to the symplectic leaves of P0. Consider now any one-form θX such that $X_\parallel = P_0\,\theta_X$; we set
\[
N_1 X = \tilde P_1\, \theta_X\,. \tag{3.8}
\]
The tensor N1 is actually independent of the choice of θX , because the latter is defined up to an element of the kernel of P0 , which is also the kernel of P˜1 . From the definition, it follows immediately that N1 Zα = 0, and that for any hamiltonian vectorfield Xf = P0 df one has N1 Xf = P˜1 df . Proposition 3.9. Under the same hypotheses of the previous proposition, the tensor N1 defined by (3.8) is a Nijenhuis tensor. Proof. We need to show that the Nijenhuis torsion tensor TN of N1 vanishes: TN (X, Y ) ≡ [N1 X, N1 Y ] − N1 [N1 X, Y ] − N1 [X, N1 Y ] + N12 [X, Y ] = 0 .
The Nijenhuis torsion tensor acts pointwise on vectors; thus, it is sufficient to show that TN(X, Y) vanishes on any pair of elements of a basis of the tangent space to M at any point. In our case, such a basis is provided by a set of P0-hamiltonian vectorfields spanning the tangent spaces to the symplectic leaves, and by the transversal vectorfields Zα. We already know that N1 Zα = 0 and [Zα, Zβ] = 0, thus TN(Zα, Zβ) = 0 for any pair (α, β). More generally, for any vectorfield X
\[
T_N(X, Z_\alpha) = N_1\big(N_1 [X, Z_\alpha] - [N_1 X, Z_\alpha]\big) = N_1\big(L_{Z_\alpha}(N_1 X) - N_1 L_{Z_\alpha} X\big) = N_1\big(L_{Z_\alpha}(N_1)\, P_0\, \theta_X\big)
\]
but since LZα P˜1 = LZα P0 = 0, one has LZα(N1) P0 = 0. Finally, we evaluate the torsion tensor on two hamiltonian vectorfields Xf = P0 df and Xg = P0 dg; for notational convenience, we set Yf = P˜1 df. This part of the proof is well known and can be found, in greater detail, in [20]: it is shown there that the compatibility of P˜1 and P0, i.e. $[P_0, \tilde P_1]_S = 0$, implies
\[
[X_f, Y_g] + [Y_f, X_g] = X_{\{f,g\}_{\tilde P_1}} + Y_{\{f,g\}_{P_0}}\,. \tag{3.9}
\]
Using the fact that Yf = N1 Xf, applying once again N1 to both sides of (3.9), and rearranging terms, one gets
\[
Y_{\{f,g\}_{\tilde P_1}} = N_1 [X_f, N_1 X_g] + N_1 [N_1 X_f, X_g] - N_1^2 [X_f, X_g]\,;
\]
on the other hand, $Y_{\{f,g\}_{\tilde P_1}} = [Y_f, Y_g] = [N_1 X_f, N_1 X_g]$ and so TN(Xf, Xg) = 0.

These results suggest the following strategy: given a trihamiltonian structure, choose one of the Poisson tensors, say P0; a complete set of Casimir functions cα can be directly read out from the “fundamental molecule”, as well as the functions hα and kα used in (3.6). In this case the vectorfields P1 dcα and P2 dcα are automatically commuting, so the conditions (ii) and (iii) of Proposition 3.6 are fulfilled. If one is able to find a complete set of vectorfields fulfilling all of (3.2) and being symmetries of both P˜1 and P˜2 (in the sequel, we shortly write “a good set of transversal symmetries”), then one can apply the above procedure to each of the pairs (P0, P1) and (P0, P2) separately, obtaining in this way a pair of Nijenhuis tensors N1 and N2 on the full manifold M. This fact, however, does not yet ensure the existence of Darboux–Nijenhuis coordinates. The proof of the existence of Darboux–Nijenhuis coordinates has been given in Proposition 3.3 only for pairs of Nijenhuis tensors fulfilling some additional requirements, first of all that of being non-degenerate. We have not addressed the general problem of the existence of a canonical form for a non-regular Nijenhuis tensor; incidentally, to our knowledge non-regular Nijenhuis tensors have never been previously considered in connection with finite-dimensional integrable systems. In the next section, we will rather characterise a particular class of trihamiltonian systems
for which Darboux–Nijenhuis coordinates exist, are separation variables for the Hamiltonians occurring in the “fundamental molecule”, and can even be constructed without having to compute eigenvalues. Although the assumptions that we make could seem rather artificial and difficult to test in practice, we will eventually show that this class of systems is not empty, and contains relevant examples described by Lax equations (with spectral parameter).

3.2. Sklyanin separation of trihamiltonian systems

We first recall the separability criterion introduced by Sklyanin [8]. Let hi be m hamiltonians in involution for a nondegenerate Poisson tensor P, and let (λi, µi)i=1,...,m be a system of canonical coordinates for P. If m functions Wi of m + 2 variables exist such that
\[
\begin{aligned}
W_1\big(\lambda_1, \mu_1;\ h_1(\lambda_i, \mu_i), \ldots, h_m(\lambda_i, \mu_i)\big) &= 0\\
&\ \,\vdots\\
W_m\big(\lambda_m, \mu_m;\ h_1(\lambda_i, \mu_i), \ldots, h_m(\lambda_i, \mu_i)\big) &= 0
\end{aligned}
\tag{3.10}
\]
identically, then all hamiltonians hi are separable in the coordinates (λi, µi). This setting refers to the symplectic case (as it should since the notion of separability of the Hamilton–Jacobi equation makes sense only in symplectic manifolds, namely on cotangent bundles). In the case of degenerate Poisson manifolds, one can discuss separability on symplectic leaves. Upon reduction, one may expect to find Sklyanin functions Wi depending on auxiliary parameters labeling the symplectic leaves (in our case, the Casimir functions cα). This is indeed what happens in the example that we shall now discuss: first, we introduce a set of equations built from the common Casimir function fλµ (generating the “fundamental molecule”) and the transversal vectorfields Zα, and we list sufficient conditions for the roots of these equations to form, together with the Casimir functions cα, a system of generalised Darboux–Nijenhuis coordinates (λi, µi, cα) on M. Then, we produce m functions Wi, each depending on the ith pair of coordinates (λi, µi), the k Casimir functions cα, and other m arguments. Once the latter are replaced by the remaining m Hamiltonians hab(λj, µj, cα) of the fundamental molecule, the functions Wi vanish identically for any value of the coordinates cα, thus on all symplectic leaves simultaneously. As previously, we assume that all the three Poisson tensors (P0, P1, P2) have the same rank 2m, and the dimension of the manifold is 2m + k. We also assume that the common Casimir function fλµ is complete, i.e. that among its coefficients one can find m + k independent functions, including k Casimir functions for each of the three Poisson tensors (different tensors may indeed have some common Casimir functions). In the cases that we shall consider (for instance, the trihamiltonian spaces gl(r)n), the polynomial fλµ has exactly m + k non-constant coefficients.
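Returning to the criterion (3.10), a standard elementary illustration may help to fix the ideas. It is not taken from the construction discussed here: the potentials U1, U2 and the particular choice of h1, h2 below are assumptions made only for the sake of the example, with m = 2.

```latex
% Hypothetical example: two uncoupled one-dimensional systems (m = 2).
\[
E_i = \tfrac12 \mu_i^2 + U_i(\lambda_i) \quad (i = 1, 2), \qquad
h_1 = E_1 + E_2, \qquad h_2 = E_1 - E_2, \qquad \{h_1, h_2\} = 0 .
\]
% Sklyanin functions: each W_i involves only the i-th canonical pair
% and the hamiltonians, i.e. m + 2 = 4 arguments.
\[
W_1(\lambda_1, \mu_1; h_1, h_2) = \tfrac12 \mu_1^2 + U_1(\lambda_1) - \tfrac12 (h_1 + h_2), \qquad
W_2(\lambda_2, \mu_2; h_1, h_2) = \tfrac12 \mu_2^2 + U_2(\lambda_2) - \tfrac12 (h_1 - h_2) .
\]
% Both W_i vanish identically once h_1, h_2 are expressed through the
% coordinates.  Setting \mu_i = dS_i / d\lambda_i, the Hamilton--Jacobi
% equations split into two ordinary differential equations:
\[
\tfrac12 \Big(\frac{dS_1}{d\lambda_1}\Big)^2 + U_1(\lambda_1) = \tfrac12 (h_1 + h_2), \qquad
\tfrac12 \Big(\frac{dS_2}{d\lambda_2}\Big)^2 + U_2(\lambda_2) = \tfrac12 (h_1 - h_2) .
\]
```

In this trivial case the separation relations are affine in the hamiltonians, as is also the case for the relations obtained below from the common Casimir function fλµ.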
The leading role in our construction is played by the derivatives of the common Casimir function fλµ along the transversal vectorfields Zα . We denote these k
functions (still depending on the two parameters λ, µ) by Sα (λ, µ) = Zα (fλµ ) .
(3.11)
The letter S is chosen because of the coincidence with Sklyanin’s minors [8], in the particular case discussed in Sec. 4. Having assumed that the common Casimir function fλµ is polynomial in both parameters, the functions Sα(λ, µ) are polynomials as well. We shall prove that, whenever appropriate conditions are met, the common roots (λi, µi) of the polynomials Sα(λ, µ) are Darboux–Nijenhuis coordinates and fulfill Sklyanin’s separability condition (3.10). Note that we use the abbreviated notation F(λ, µ)|λi,µi for F(λ, µ)|λ=λi,µ=µi.

Proposition 3.10. If a good set of transversal symmetries Zα fulfills in addition the following requirements:
(1) all second directional derivatives of the complete common Casimir function fλµ vanish identically, i.e.
\[
Z_\alpha(Z_\beta(f_{\lambda\mu})) = 0 \quad\text{for all } \alpha, \beta\,; \tag{3.12}
\]
(2) the polynomials Sα(λ, µ) = Zα(fλµ) have 2m functionally independent common roots {λi, µi};
(3) the following equality holds:
\[
\{S_\alpha(\lambda, \mu), S_\beta(\lambda, \mu)\}_{P_0}\Big|_{\lambda_i, \mu_i}
= \Big(\frac{\partial S_\alpha}{\partial\lambda}\frac{\partial S_\beta}{\partial\mu} - \frac{\partial S_\alpha}{\partial\mu}\frac{\partial S_\beta}{\partial\lambda}\Big)\Big|_{\lambda_i, \mu_i}\,, \tag{3.13}
\]
and for any i there is at least one pair (α, β) for which both sides are not identically vanishing;
then, the 2m functions (λi, µi) form a system of Darboux–Nijenhuis coordinates on each symplectic leaf of P0, and moreover Zα(λi) = Zα(µi) = 0.

Proof. First, we compute the parameter-dependent vectorfield associated to the common Casimir function fλµ under the deformed Poisson pencil (P˜1 − λP0):
\[
(\tilde P_1 - \lambda P_0)\, df_{\lambda\mu} = (P_1 + L_{X_{P_1}} P_0 - \lambda P_0)\, df_{\lambda\mu} = (L_{X_{P_1}} P_0)\, df_{\lambda\mu}\,. \tag{3.14}
\]
But
\[
(L_{X_{P_1}} P_0)\, df_{\lambda\mu} = - \sum_{\alpha=1}^{k} S_\alpha(\lambda, \mu) \cdot P_0\, dh_\alpha : \tag{3.15}
\]
in fact, from the definition of the deformation vectorfields (3.6), one sees that the vectorfield (LXP1 P0) dfλµ acts on an arbitrary function g as follows:
\[
\big[(L_{X_{P_1}} P_0)\, df_{\lambda\mu}\big](g)
= - \sum_{\alpha=1}^{k} Z_\alpha(f_{\lambda\mu})\{h_\alpha, g\}_{P_0} - \sum_{\alpha=1}^{k} Z_\alpha(g)\{f_{\lambda\mu}, h_\alpha\}_{P_0}
= - \sum_{\alpha=1}^{k} S_\alpha(\lambda, \mu) \cdot (P_0\, dh_\alpha)(g)\,.
\]
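Taking the derivative along Zα of (3.14), one obtains
\[
(\tilde P_1 - \lambda P_0)\, dS_\alpha(\lambda, \mu) = - \sum_{\beta=1}^{k} S_\beta(\lambda, \mu) \cdot P_0\, d[Z_\alpha(h_\beta)]\,,
\]
(we exploited the fact that Zα is a symmetry for both P0 and P˜1). Since (λi, µi) is a pair of roots of Sα, one finds that $[(\tilde P_1 - \lambda P_0)\, dS_\alpha(\lambda, \mu)]|_{\lambda_i, \mu_i} = 0$ and that
\[
0 = d[S_\alpha(\lambda_i, \mu_i)] = [dS_\alpha(\lambda, \mu)]\big|_{\lambda_i, \mu_i} + \frac{\partial S_\alpha}{\partial\lambda}\Big|_{\lambda_i, \mu_i} d\lambda_i + \frac{\partial S_\alpha}{\partial\mu}\Big|_{\lambda_i, \mu_i} d\mu_i\,;
\]
then,
\[
\frac{\partial S_\alpha}{\partial\lambda}\Big|_{\lambda_i, \mu_i} (\tilde P_1\, d\lambda_i - \lambda_i P_0\, d\lambda_i) + \frac{\partial S_\alpha}{\partial\mu}\Big|_{\lambda_i, \mu_i} (\tilde P_1\, d\mu_i - \lambda_i P_0\, d\mu_i) = 0\,.
\]
On the other hand,
\[
Z_\beta(\lambda_i)\, \frac{\partial S_\alpha}{\partial\lambda}\Big|_{\lambda_i, \mu_i} + Z_\beta(\mu_i)\, \frac{\partial S_\alpha}{\partial\mu}\Big|_{\lambda_i, \mu_i}
= Z_\beta[S_\alpha(\lambda_i, \mu_i)] - Z_\beta[S_\alpha(\lambda, \mu)]\big|_{\lambda_i, \mu_i} = 0\,,
\]
having assumed Zβ[Sα(λ, µ)] = 0 and Sα(λi, µi) = 0. Let S be the k × 2 matrix whose rows are $\big(\frac{\partial S_\alpha}{\partial\lambda}, \frac{\partial S_\alpha}{\partial\mu}\big)$, and let S(i) denote the same matrix after the substitution (λ, µ) → (λi, µi). The condition (3.13) entails that S(i) has maximal rank for all i = 1, . . . , m, so the previous results imply that
\[
Z_\alpha(\lambda_i) = Z_\alpha(\mu_i) = 0\,, \qquad \tilde P_1\, d\lambda_i = \lambda_i P_0\, d\lambda_i\,, \qquad \tilde P_1\, d\mu_i = \lambda_i P_0\, d\mu_i\,,
\]
and the same holds for P˜2. Repeating now the same argument used in the proof of Proposition 3.3 one obtains that {λi, λj}P0 = 0 and {µi, µj}P0 = 0 for all (i, j), and that {λi, µj}P0 = 0 for i ≠ j. To get the remaining canonical bracket, one should again rely on (3.13): for any pair of polynomials F(λ, µ) and G(λ, µ) such that F(λi, µi) = G(λi, µi) = 0, one has
\[
\{F(\lambda, \mu), G(\lambda, \mu)\}_{P_0}\big|_{\lambda_i, \mu_i} = \{\lambda_i, \mu_i\}_{P_0} \Big(\frac{\partial F}{\partial\lambda}\frac{\partial G}{\partial\mu} - \frac{\partial F}{\partial\mu}\frac{\partial G}{\partial\lambda}\Big)\Big|_{\lambda_i, \mu_i}\,;
\]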
then, if (3.13) holds and both sides are nonvanishing, one concludes {λi, µi}P0 = 1.

Next, we prove that Sklyanin’s separability condition holds in the new coordinates.

Proposition 3.11. Let (λi, µi, cα)i=1,...,m, α=1,...,k be the generalised Darboux–Nijenhuis coordinates associated to the projection along the vectorfields Zα according to Proposition 3.10. For each i = 1, . . . , m
\[
f_{\lambda\mu}\big|_{\lambda_i, \mu_i} = p_i(\lambda_i, \mu_i) \tag{3.16}
\]
where pi(λ, µ) are polynomials with constant coefficients. Hence, the coordinates (λi, µi) and the remaining m hamiltonians hab (restricted to the symplectic leaf Σ) fulfill the separability condition (3.10), with Wi(λi, µi; hab) = fλµ|λi,µi − pi(λi, µi).

Proof. From the previous proposition, we know that
\[
S_\alpha(\lambda_i, \mu_i) = 0\,, \qquad Z_\alpha(\lambda_i) = 0\,, \qquad Z_\alpha(\mu_i) = 0 : \tag{3.17}
\]
together with the normalization condition (3.2)(i), these imply that in the coordinate system (λi, µi, cα) one has
\[
Z_\alpha \equiv \frac{\partial}{\partial c_\alpha}\,. \tag{3.18}
\]
We denote by f(i) ≡ fλµ|λi,µi the function obtained by replacing the spectral parameters (λ, µ) with the ith pair of coordinates (λi, µi). On account of (3.17),
\[
Z_\alpha(f_{(i)}) = S_\alpha(\lambda_i, \mu_i) = 0
\]
thus the m functions f(i), which would in principle depend on all the coordinates (λj, µj, cα), actually do not depend on the Casimir coordinates cα. To prove that each f(i) depends only on the pair (λi, µi), we exploit the properties of Darboux–Nijenhuis coordinates: for any function g, one has
\[
\{g, \lambda_j\}_{P_0} = \frac{\partial g}{\partial\mu_j} \quad\text{and}\quad \{g, \lambda_j\}_{\tilde P_1} = \lambda_j \frac{\partial g}{\partial\mu_j}\,.
\]
Assume that for a function g and some i, one has {g, λj}P˜1 − λi{g, λj}P0 = 0 identically for any j. Then $(\lambda_j - \lambda_i)\,\frac{\partial g}{\partial\mu_j} \equiv 0$ and g cannot depend on µj for j ≠ i. If the difference {g, µj}P˜2 − µi{g, µj}P0 vanishes for any j as well, then g depends only on the pair (λi, µi). So what we need to prove is that, for any j ≠ i,
\[
\{f_{(i)}, \lambda_j\}_{\tilde P_1} - \lambda_i \{f_{(i)}, \lambda_j\}_{P_0} = \{f_{(i)}, \mu_j\}_{\tilde P_2} - \mu_i \{f_{(i)}, \mu_j\}_{P_0} = 0\,. \tag{3.19}
\]
The differential of the function f(i) is given by
\[
df_{(i)} = (df_{\lambda\mu})\big|_{\lambda_i, \mu_i} + \frac{\partial f_{\lambda\mu}}{\partial\lambda}\Big|_{\lambda_i, \mu_i} d\lambda_i + \frac{\partial f_{\lambda\mu}}{\partial\mu}\Big|_{\lambda_i, \mu_i} d\mu_i\,.
\]
Thus, the vectorfield defined by applying the tensor (P˜1 − λi P0) to the differential of the function f(i) can be obtained from (3.14) by replacing the parameters (λ, µ) with the ith pair of coordinates (λi, µi), and adding two terms proportional to ∂fλµ/∂λ and ∂fλµ/∂µ respectively. Applying this vectorfield to a coordinate λj, with j ≠ i, one gets:
\[
\begin{aligned}
\{f_{(i)}, \lambda_j\}_{\tilde P_1} - \lambda_i \{f_{(i)}, \lambda_j\}_{P_0}
&= \{f_{\lambda\mu}, \lambda_j\}_{(\tilde P_1 - \lambda P_0)}\big|_{\lambda_i, \mu_i}\\
&\quad + \big(\{\lambda_i, \lambda_j\}_{\tilde P_1} - \lambda_i \{\lambda_i, \lambda_j\}_{P_0}\big)\, \frac{\partial f_{\lambda\mu}}{\partial\lambda}\Big|_{\lambda_i, \mu_i}\\
&\quad + \big(\{\mu_i, \lambda_j\}_{\tilde P_1} - \lambda_i \{\mu_i, \lambda_j\}_{P_0}\big)\, \frac{\partial f_{\lambda\mu}}{\partial\mu}\Big|_{\lambda_i, \mu_i}\,.
\end{aligned}
\]
By hypothesis, λj is in involution with both λi and µi for j ≠ i; therefore, only the first line survives, but due to (3.14) the r.h.s. is equal to $-\sum_{\alpha=1}^{k} S_\alpha(\lambda_i, \mu_i)\{h_\alpha, \lambda_j\}_{P_0}$, which once again vanishes on account of (3.17): half of (3.19) is proved. Repeating the whole argument for the coordinate µj, upon replacing P˜1 with P˜2, one proves the full statement.

Thus, the search for separation variables is completely translated into the problem of finding a set of “good” transversal vectorfields. We leave three questions open. First, suppose that one were able to find (in some other way) a set of generalised Darboux–Nijenhuis coordinates for the “deformed” Poisson triple (P0, P˜1, P˜2); would these coordinates be roots of the polynomials Sα (without imposing further conditions)? Second question: is requirement (3) in Proposition 3.10 really necessary, or is it already implied by the previous assumptions? We could not find any concrete example in which (3.2) and the requirements (1), (2) of Proposition 3.10 are satisfied, but (3) fails to hold. One could then suspect that the equality (3) can be derived from the other (simpler) assumptions. The trouble with the requirement (3) is that it is both difficult to check and lacking a clear geometrical significance (see footnote a); nevertheless, we have not yet succeeded in replacing it with another condition equally ensuring that {λi, µi}P0 = 1. Third open problem: under our assumptions, we have obtained Sklyanin’s separation condition with Wi = f(i) − pi(λi, µi).
In the algebro-geometric setting, one deals with a more particular situation, namely pi(λi, µi) = p(λi, µi) for a fixed polynomial p(λ, µ) not depending on i, so all separation variables are (pairwise) roots of a single polynomial fλµ − p(λ, µ), defining the spectral curve of the system. At the moment, we do not know which additional conditions would ensure this stronger type of separability, which is encountered in the examples that we discuss below. In the next subsection we shall produce an example showing that the occurrence of a single spectral curve does not follow automatically from our assumptions.

Footnote a. In principle, it would be possible to consider the manifold M × R² (with the spectral parameters λ, µ regarded as two additional real coordinates), endowed with the direct sum of the structure P0 on M with the Poisson structure on R² for which (λ, µ) are canonical coordinates. Then, (3) could be rephrased by saying that all the Poisson brackets of the functions Sα (regarded as functions of 2m + k + 2 variables) should vanish on each submanifold described by the equations λ = λi, µ = µi. Still, this does not seem to help very much.

3.3. Canonical form of trihamiltonian structures in separation coordinates

As we have seen, given a trihamiltonian structure (P0, P1, P2) and a complete common Casimir polynomial fλµ, one needs to find a good set of transversal vectorfields to produce separation variables. Finding such vectorfields is, in general, a difficult task. On the other hand, upon assuming that such vectorfields exist, one can explicitly compute the components of all the relevant objects as they would be in separation coordinates; in some cases, this provides a concrete procedure to obtain “backwards” the change of variables. From the previous discussion, we know that in a generalised Darboux–Nijenhuis coordinate system:
(1) as far as the 2m × 2m block corresponding to the coordinates (λi, µi) is considered, the tensor P0 is in canonical form, while the tensors P˜1 and P˜2 are obtained from P0 by applying the diagonal recursion operators N1 and N2, having the coordinates λi and µi, respectively, as (double) eigenvalues;
(2) the components of the complete tensors P0, P˜1 and P˜2 in the coordinates (λi, µi, cα) are obtained by simply adding k null rows and k null columns to the respective 2m × 2m matrix (by “null” we mean that all the corresponding entries are vanishing);
(3) the transversal vectorfields Zα are coordinate vectorfields: Zα = ∂/∂cα;
(4) the relation between the original Poisson tensors P1 and P2 and the “deformed” ones, P˜1 and P˜2, is given by (3.5);
(5) each pair of conjugate coordinates (λi, µi) is a root of the equation fλµ − pi(λ, µ) = 0 for some polynomial pi.
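Items (1) and (2) of the list above can be tested in a small case. The sketch below is only an illustration under stated assumptions: it takes m = 2 and k = 1 (a hypothetical size, smaller than the gl(3) example), orders the coordinates as (λ1, λ2, µ1, µ2, c1), and evaluates the Schouten brackets in components as the cyclic sums of P^{il} ∂l Q^{jk} + Q^{il} ∂l P^{jk}. Partial derivatives are computed as exact unit differences, which is legitimate only because every entry of these tensors is affine in the coordinates.

```python
from fractions import Fraction as F
from itertools import product

m, k = 2, 1
n = 2 * m + k            # coordinates ordered (l1, l2, mu1, mu2, c1)

def tensor(weight):
    """Bivector with {l_i, mu_i} = weight(x, i) and all other brackets zero."""
    def P(x):
        M = [[F(0)] * n for _ in range(n)]
        for i in range(m):
            M[i][m + i] = weight(x, i)
            M[m + i][i] = -weight(x, i)
        return M
    return P

P0 = tensor(lambda x, i: F(1))          # canonical block plus k null rows/columns
P1t = tensor(lambda x, i: x[i])         # N1 has eigenvalues l_i
P2t = tensor(lambda x, i: x[m + i])     # N2 has eigenvalues mu_i

def d(P, x, l):
    """Exact derivative along coordinate l (valid for affine entries only)."""
    shifted = list(x)
    shifted[l] += 1
    A, B = P(shifted), P(x)
    return [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]

def schouten(P, Q, x):
    """Components of the Schouten bracket [P, Q]_S at x, up to normalisation."""
    Px, Qx = P(x), Q(x)
    dP = [d(P, x, l) for l in range(n)]
    dQ = [d(Q, x, l) for l in range(n)]
    def term(i, j, r):
        return sum(Px[i][l] * dQ[l][j][r] + Qx[i][l] * dP[l][j][r]
                   for l in range(n))
    return {(i, j, r): term(i, j, r) + term(j, r, i) + term(r, i, j)
            for i, j, r in product(range(n), repeat=3)}

x = [F(3), F(-2), F(5, 7), F(4), F(11)]   # an arbitrary sample point
tensors = [P0, P1t, P2t]
all_compatible = all(all(v == 0 for v in schouten(P, Q, x).values())
                     for P in tensors for Q in tensors)
```

The check covers both the Jacobi identity of each tensor (P = Q) and the pairwise compatibility (P ≠ Q); a single sample point suffices here only because the bracket components of affine tensors of this special form vanish identically, so this is a sanity check rather than a proof.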
The latter information allows one to find the explicit expression of all the hamiltonians hij as functions of (λi, µi, cα), once the polynomials pi are fixed (they may be separately determined or arbitrarily chosen, as we explain below). In fact, let us assume as before that the common Casimir polynomial fλµ contains exactly m + k independent hamiltonians hij. Let us single out the k hamiltonians which are Casimir functions for P0, which we denote as above by cα, and denote the remaining (independent) hamiltonians as hA, with A = 1, . . . , m. We impose the conditions
\[
\begin{aligned}
f_{(1)}(\lambda_1, \mu_1) &= p_1(\lambda_1, \mu_1)\\
&\ \,\vdots\\
f_{(m)}(\lambda_m, \mu_m) &= p_m(\lambda_m, \mu_m)
\end{aligned}
\tag{3.20}
\]
which form a linear system of m independent equations in the m unknowns hA. Solving it, one finds hA = hA(λi, µi, cα). Next, one produces the deformation vectorfields XP1 and XP2 according to Lemma 3.7. Then, one can compute the components of the two Poisson tensors P1 and P2:
\[
P_1 = \tilde P_1 - L_{X_{P_1}} P_0 \quad\text{and}\quad P_2 = \tilde P_2 - L_{X_{P_2}} P_0\,. \tag{3.21}
\]
It is also important to remark that one can obtain the polynomials Sα(λ, µ) = ∂fλµ/∂cα as well. This fact will be used in the applications. To fix the ideas, we work out a concrete example. We remark that in this way we shall display a concrete case where the requirements of Proposition 3.10 can be directly tested, showing that the set of systems fulfilling our requirements is indeed not empty (other examples can be produced in the same way, starting from different “fundamental molecules”). Take the gl(3) “fundamental molecule” represented in (1.27). We set c1 ≡ h02, c2 ≡ h11 and c3 ≡ h20; we also simplify the notation for the remaining hamiltonians by setting h1 ≡ h00, h2 ≡ h01, and h3 ≡ h10. The common Casimir polynomial (apart from possible constant terms, which may anyhow be compensated in the polynomials pi) becomes
\[
f_{\lambda\mu} = h_1 + h_2 \lambda + c_1 \lambda^2 + h_3 \mu + c_2 \lambda\mu + c_3 \mu^2\,. \tag{3.22}
\]
We leave the three polynomials p1, p2 and p3 undetermined; for brevity, we write p(1) for p1(λ1, µ1), and so on. Imposing (3.20), and writing $\Delta \equiv \lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3$ for the common denominator, the hamiltonians read as follows:
\[
\begin{aligned}
h_1 = \Delta^{-1}\big\{ & (\lambda_2\mu_3 - \lambda_3\mu_2)\,p_{(1)} + (\lambda_3\mu_1 - \lambda_1\mu_3)\,p_{(2)} + (\lambda_1\mu_2 - \lambda_2\mu_1)\,p_{(3)}\\
& + [(\lambda_3 - \lambda_2)\lambda_2\lambda_3\mu_1 + (\lambda_1 - \lambda_3)\lambda_1\lambda_3\mu_2 + (\lambda_2 - \lambda_1)\lambda_1\lambda_2\mu_3]\,c_1\\
& + [(\mu_3 - \mu_2)\lambda_2\lambda_3\mu_1 + (\mu_1 - \mu_3)\lambda_1\lambda_3\mu_2 + (\mu_2 - \mu_1)\lambda_1\lambda_2\mu_3]\,c_2\\
& + [(\mu_2 - \mu_3)\mu_2\mu_3\lambda_1 + (\mu_3 - \mu_1)\mu_1\mu_3\lambda_2 + (\mu_1 - \mu_2)\mu_1\mu_2\lambda_3]\,c_3 \big\}\,,\\[4pt]
h_2 = \Delta^{-1}\big\{ & (\mu_2 - \mu_3)\,p_{(1)} + (\mu_3 - \mu_1)\,p_{(2)} + (\mu_1 - \mu_2)\,p_{(3)}\\
& + [(\lambda_2^2 - \lambda_3^2)\mu_1 + (\lambda_3^2 - \lambda_1^2)\mu_2 + (\lambda_1^2 - \lambda_2^2)\mu_3]\,c_1\\
& + [(\lambda_2 - \lambda_1)\mu_1\mu_2 + (\lambda_1 - \lambda_3)\mu_1\mu_3 + (\lambda_3 - \lambda_2)\mu_2\mu_3]\,c_2\\
& + [(\mu_2 - \mu_1)\mu_1\mu_2 + (\mu_1 - \mu_3)\mu_1\mu_3 + (\mu_3 - \mu_2)\mu_2\mu_3]\,c_3 \big\}\,,\\[4pt]
h_3 = \Delta^{-1}\big\{ & (\lambda_3 - \lambda_2)\,p_{(1)} + (\lambda_1 - \lambda_3)\,p_{(2)} + (\lambda_2 - \lambda_1)\,p_{(3)}\\
& + [(\lambda_1 - \lambda_2)\lambda_1\lambda_2 + (\lambda_3 - \lambda_1)\lambda_1\lambda_3 + (\lambda_2 - \lambda_3)\lambda_2\lambda_3]\,c_1\\
& + [(\mu_1 - \mu_2)\lambda_1\lambda_2 + (\mu_3 - \mu_1)\lambda_1\lambda_3 + (\mu_2 - \mu_3)\lambda_2\lambda_3]\,c_2\\
& + [(\mu_3^2 - \mu_2^2)\lambda_1 + (\mu_1^2 - \mu_3^2)\lambda_2 + (\mu_2^2 - \mu_1^2)\lambda_3]\,c_3 \big\}\,.
\end{aligned}
\]
The deformation vectorfields (3.6) are
\[
X_{P_1} = h_2 \frac{\partial}{\partial c_1} + h_3 \frac{\partial}{\partial c_2}\,, \qquad
X_{P_2} = h_2 \frac{\partial}{\partial c_2} + h_3 \frac{\partial}{\partial c_3}\,.
\]
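The closed-form hamiltonians can be cross-checked against a direct solution of the linear system (3.20) for the polynomial (3.22). The following sketch does this by Cramer's rule in exact rational arithmetic; the sample values of λi, µi, cα and of the numbers p(i) are arbitrary assumptions, chosen only so that the determinant of the system does not vanish.

```python
from fractions import Fraction as F

# Arbitrary sample values (assumptions made for this check only).
l1, l2, l3 = F(2), F(-1), F(3)          # lambda_1, lambda_2, lambda_3
m1, m2, m3 = F(5), F(7), F(-4)          # mu_1, mu_2, mu_3
c1, c2, c3 = F(1, 2), F(-3), F(2)       # Casimir coordinates
p1, p2, p3 = F(11), F(-6), F(9)         # p_(1), p_(2), p_(3) as free numbers

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

pts = [(l1, m1, p1), (l2, m2, p2), (l3, m3, p3)]
# (3.20) with (3.22): h1 + h2*l + h3*m = p - c1*l^2 - c2*l*m - c3*m^2 at each root.
A = [[F(1), l, m] for l, m, _ in pts]
rhs = [p - c1 * l**2 - c2 * l * m - c3 * m**2 for l, m, p in pts]

def cramer(col):
    """Solve for one unknown by replacing column `col` of A with rhs."""
    M = [[rhs[r] if j == col else A[r][j] for j in range(3)] for r in range(3)]
    return det3(M) / det3(A)

h_solved = [cramer(0), cramer(1), cramer(2)]

# Closed forms, with the common denominator Delta.
D = l1*m2 - l2*m1 + l2*m3 - l3*m2 + l3*m1 - l1*m3
h1 = ((l2*m3 - l3*m2)*p1 + (l3*m1 - l1*m3)*p2 + (l1*m2 - l2*m1)*p3
      + ((l3 - l2)*l2*l3*m1 + (l1 - l3)*l1*l3*m2 + (l2 - l1)*l1*l2*m3)*c1
      + ((m3 - m2)*l2*l3*m1 + (m1 - m3)*l1*l3*m2 + (m2 - m1)*l1*l2*m3)*c2
      + ((m2 - m3)*m2*m3*l1 + (m3 - m1)*m1*m3*l2 + (m1 - m2)*m1*m2*l3)*c3) / D
h2 = ((m2 - m3)*p1 + (m3 - m1)*p2 + (m1 - m2)*p3
      + ((l2**2 - l3**2)*m1 + (l3**2 - l1**2)*m2 + (l1**2 - l2**2)*m3)*c1
      + ((l2 - l1)*m1*m2 + (l1 - l3)*m1*m3 + (l3 - l2)*m2*m3)*c2
      + ((m2 - m1)*m1*m2 + (m1 - m3)*m1*m3 + (m3 - m2)*m2*m3)*c3) / D
h3 = ((l3 - l2)*p1 + (l1 - l3)*p2 + (l2 - l1)*p3
      + ((l1 - l2)*l1*l2 + (l3 - l1)*l1*l3 + (l2 - l3)*l2*l3)*c1
      + ((m1 - m2)*l1*l2 + (m3 - m1)*l1*l3 + (m2 - m3)*l2*l3)*c2
      + ((m3**2 - m2**2)*l1 + (m1**2 - m3**2)*l2 + (m2**2 - m1**2)*l3)*c3) / D
h_closed = [h1, h2, h3]
```

Comparing h_closed with h_solved at a few such points is a quick way to catch a transcription slip in the lengthy numerators.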
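Starting from (rows and columns being ordered as λ1, λ2, λ3, µ1, µ2, µ3, c1, c2, c3)
\[
P_0 = \begin{pmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
-1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
\tilde P_1 = \begin{pmatrix}
0 & 0 & 0 & \lambda_1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & \lambda_2 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & \lambda_3 & 0 & 0 & 0\\
-\lambda_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & -\lambda_2 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -\lambda_3 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
\tilde P_2 = \begin{pmatrix}
0 & 0 & 0 & \mu_1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & \mu_2 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & \mu_3 & 0 & 0 & 0\\
-\mu_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & -\mu_2 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -\mu_3 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]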
November 7, 2002 15:28 WSPC/148-RMP
00151
Tri–Hamiltonian Vector Fields, Spectral Curves and Separation Coordinates
1149
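one can obtain the matrix expressions for the tensors $P_1 = \tilde P_1 - L_{X_{P_1}} P_0$ and $P_2 = \tilde P_2 - L_{X_{P_2}} P_0$. They coincide with the latter two matrices above, respectively, as far as the 6 × 6 upper left blocks are concerned, while the remaining three rows and three columns are rather complicated for both tensors (and it would be pointless to write them down here). It is immediate to check that the tensors $P_0$, $\tilde P_1$ and $\tilde P_2$ above are pairwise compatible Poisson tensors, and that the coordinate vectorfields $Z_\alpha = \partial/\partial c_\alpha$ are symmetries of all of them. Then, it is enough to reverse the steps of the proof of Proposition 3.6 to show that $P_1$ and $P_2$ are both Poisson tensors compatible with $P_0$. One can then check directly that the function (3.22) with the three hamiltonians computed above is a Casimir function for $P_0$, $P_1$ and $P_2$, as expected. The polynomials $S_\alpha(\lambda,\mu;\lambda_i,\mu_i)$ are
\[
\begin{aligned}
S_1 = {} & \frac{(\lambda_3 - \lambda_2)\lambda_2\lambda_3\mu_1 + (\lambda_1 - \lambda_3)\lambda_1\lambda_3\mu_2 + (\lambda_2 - \lambda_1)\lambda_1\lambda_2\mu_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3} \\
& + \frac{(\lambda_2^2 - \lambda_3^2)\mu_1 + (\lambda_3^2 - \lambda_1^2)\mu_2 + (\lambda_1^2 - \lambda_2^2)\mu_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\lambda \\
& + \frac{(\lambda_1 - \lambda_2)\lambda_1\lambda_2 + (\lambda_3 - \lambda_1)\lambda_1\lambda_3 + (\lambda_2 - \lambda_3)\lambda_2\lambda_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\mu + \lambda^2 , \\[4pt]
S_2 = {} & \frac{(\mu_3 - \mu_2)\lambda_2\lambda_3\mu_1 + (\mu_1 - \mu_3)\lambda_1\lambda_3\mu_2 + (\mu_2 - \mu_1)\lambda_1\lambda_2\mu_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3} \\
& + \frac{(\lambda_2 - \lambda_1)\mu_1\mu_2 + (\lambda_1 - \lambda_3)\mu_1\mu_3 + (\lambda_3 - \lambda_2)\mu_2\mu_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\lambda \\
& + \frac{(\mu_1 - \mu_2)\lambda_1\lambda_2 + (\mu_3 - \mu_1)\lambda_1\lambda_3 + (\mu_2 - \mu_3)\lambda_2\lambda_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\mu + \lambda\mu , \\[4pt]
S_3 = {} & \frac{(\mu_2 - \mu_3)\mu_2\mu_3\lambda_1 + (\mu_3 - \mu_1)\mu_1\mu_3\lambda_2 + (\mu_1 - \mu_2)\mu_1\mu_2\lambda_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3} \\
& + \frac{(\mu_2 - \mu_1)\mu_1\mu_2 + (\mu_1 - \mu_3)\mu_1\mu_3 + (\mu_3 - \mu_2)\mu_2\mu_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\lambda \\
& + \frac{(\mu_3^2 - \mu_2^2)\lambda_1 + (\mu_1^2 - \mu_3^2)\lambda_2 + (\mu_2^2 - \mu_1^2)\lambda_3}{\lambda_1\mu_2 - \lambda_2\mu_1 + \lambda_2\mu_3 - \lambda_3\mu_2 + \lambda_3\mu_1 - \lambda_1\mu_3}\,\mu + \mu^2 .
\end{aligned}
\]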
These polynomials meet all the conditions of Proposition 3.10: first, they no longer depend on the coordinates $c_\alpha$; second, the reader can easily check that $(\lambda_1,\mu_1)$, $(\lambda_2,\mu_2)$ and $(\lambda_3,\mu_3)$ are pairs of common roots of the polynomials $S_1$, $S_2$ and $S_3$; as far as the third condition is concerned, one should check that for any $i = 1, 2, 3$ the two matrices $\bigl\|\{S_\alpha, S_\beta\}_{P_0}\bigr\|_{\lambda_i,\mu_i}$ and $\bigl\|\frac{\partial S_\alpha}{\partial\lambda}\frac{\partial S_\beta}{\partial\mu} - \frac{\partial S_\alpha}{\partial\mu}\frac{\partial S_\beta}{\partial\lambda}\bigr\|_{\lambda_i,\mu_i}$ (with $\alpha, \beta = 1, 2, 3$) are not identically vanishing and coincide. Direct computation shows that this is indeed the case: for $i = 1$, the two matrices are both equal to
\[
\begin{pmatrix}
0 & (\lambda_2 - \lambda_1)(\lambda_3 - \lambda_1) & (\lambda_2 - \lambda_1)(\mu_3 - \mu_1) + (\mu_2 - \mu_1)(\lambda_3 - \lambda_1) \\
(\lambda_1 - \lambda_2)(\lambda_3 - \lambda_1) & 0 & (\mu_2 - \mu_1)(\mu_3 - \mu_1) \\
(\lambda_1 - \lambda_2)(\mu_3 - \mu_1) + (\mu_1 - \mu_2)(\lambda_3 - \lambda_1) & (\mu_1 - \mu_2)(\mu_3 - \mu_1) & 0
\end{pmatrix}
\]
and similar expressions, with the appropriate permutations of indices, are found for i = 2, 3. It is worthwhile to remark that one can produce, choosing arbitrarily the polynomials pi (λ, µ), infinitely many families of hamiltonians which are separable according to Sklyanin’s criterion, but do not coincide with the coefficients of a single spectral curve, unless one sets p1 (λ, µ) ≡ p2 (λ, µ) ≡ p3 (λ, µ) ≡ p(λ, µ). Notice that the choice of the constant polynomials pi does not affect the polynomials Sα . Actually, if one starts from a fixed trihamiltonian structure and is able to find vectorfields Zα fulfilling all the requirements listed in Proposition 3.10, then the polynomials pi are determined a posteriori simply by plugging separately each pair of common roots of the polynomials Sα into fλµ . The above reconstruction of the trihamiltonian structure goes in the reverse direction: the polynomials pi are arbitrary and determine at the same time the hamiltonians and the Poisson tensors P1 and P2 . Thus, in our framework the constant terms in Sklyanin’s separation polynomials Wi , and a fortiori in the spectral curve (whenever it exists), are not directly encoded in the hamiltonian structure underlying a dynamical system and its symmetries (some aspects connected to the arbitrariness of the constant part of the spectral curve equation have been addressed by J. Harnad in [24]). We have seen that the constant polynomials pi are determined by the choice of a set of transversal vectorfields, or equivalently of a system of separation variables: different sets of polynomials pi (and, eventually, different spectral curves) may be associated with different sets of separation variables.
Indeed, from the example given above one might believe that different choices of the polynomials pi lead to different hamiltonians (hence, to distinct dynamical systems), but in fact these are — by construction — nothing but the same hamiltonians expressed in two different coordinate systems: the same holds for the components of both P1 and P2 . On the contrary, the transversal vectorfields and the deformed structures P˜1 and P˜2 intrinsically depend on the polynomials p(i) . In conclusion, the spectral curve appears to be an additional datum with respect to the purely hamiltonian structure of an integrable system; its hamiltonian interpretation, however, is deeply connected to the existence of particular canonical coordinates, as (separately) suggested by Sklyanin and Magri. For trihamiltonian structures (with a suitable projection onto a symplectic leaf), we have found a sound connection between separation coordinates and the vanishing of spectral polynomials.
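The gl(3) example of this section can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original construction: it assumes the Casimir polynomial in the form $f_{\lambda\mu} = h_1 + h_2\lambda + c_1\lambda^2 + h_3\mu + c_2\lambda\mu + c_3\mu^2$, solves the three conditions $f(\lambda_i,\mu_i) = p_i$ for $(h_1, h_2, h_3)$ by Cramer's rule, and obtains $S_\alpha = \partial f/\partial c_\alpha$ from the fact that the hamiltonians are affine in the $c_\alpha$. The function names and the sample values are hypothetical.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def hamiltonians(lam, mu, p, c):
    """Solve f(lam_i, mu_i) = p_i for [h1, h2, h3] by Cramer's rule."""
    rows = [[F(1), lam[i], mu[i]] for i in range(3)]
    rhs = [p[i] - c[0] * lam[i] ** 2 - c[1] * lam[i] * mu[i] - c[2] * mu[i] ** 2
           for i in range(3)]
    d = det3(rows)
    hs = []
    for col in range(3):
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        hs.append(det3(m) / d)
    return hs

def casimir(x, y, h, c):
    """f = h1 + h2 x + c1 x^2 + h3 y + c2 x y + c3 y^2."""
    return h[0] + h[1] * x + c[0] * x ** 2 + h[2] * y + c[1] * x * y + c[2] * y ** 2

def S(alpha, x, y, lam, mu):
    """S_alpha = df/dc_alpha: since h is affine in c, use p = 0 and c = e_alpha."""
    e = [F(int(k == alpha)) for k in range(3)]
    h = hamiltonians(lam, mu, [F(0)] * 3, e)
    return casimir(x, y, h, e)

# Arbitrary sample point (hypothetical values, chosen so the denominator is nonzero).
lam = [F(1), F(2), F(4)]
mu = [F(3), F(5), F(-1)]
p = [F(7), F(-2), F(11)]
c = [F(2), F(-3), F(5)]
h = hamiltonians(lam, mu, p, c)

# Each pair (lam_i, mu_i) is a common root of S_1, S_2, S_3.
roots_ok = all(S(a, lam[i], mu[i], lam, mu) == 0 for a in range(3) for i in range(3))
print(roots_ok)
```

The constant term of $S_1$ computed this way can be compared with the closed expression given in the text, which is a quick consistency test of the signs in the formulas for $h_1$, $h_2$ and $h_3$.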
4. Separation Coordinates for Lax Equations with Spectral Parameter

In this section we apply the techniques discussed so far to a particular class of dynamical systems, represented by Lax equations with spectral parameter. More precisely, we shall restrict to the following situation:

(1) the Lax operator is an $r \times r$ matrix polynomial of degree $n$ in the spectral parameter, $L(\lambda) = A\lambda^n + M_1\lambda^{n-1} + \cdots + M_n$, with a constant leading term $A \in gl(r)$;

(2) the constant matrix $A$ should commute only with linear combinations of its powers (including $A^0 \equiv \mathbb{1}$). Hence, if $A$ is diagonalisable, it should have distinct eigenvalues; more generally, the canonical Jordan form of $A$ should not contain Jordan blocks proportional to the identity of dimension higher than one. This property is generic on gl(r), but the requirement rules out some cases considered in the literature [2, 3]. A consequence of this requirement is that $A$ may be nilpotent, with $A^r = 0$, but $A^k$ should not vanish for $k < r$;

(3) the variable matrices $M_i$ are generic matrices belonging to gl(r): we are not considering restrictions to proper subalgebras, nor other types of reductions.

This setting includes classical models such as the Euler–Poinsot top (for r = 3, n = 1) and the Lagrange top (for r = 3, n = 2), in the sense which has already been explained in the introduction: the classical models are properly embedded into larger systems, but can be recovered as a subset of the flows leaving invariant the appropriate submanifold (namely, the subalgebra of antisymmetric matrices). Other important examples such as the Kovalewska top, or the Dubrovin–Novikov finite-dimensional reductions [15] of the Gel’fand–Dickey soliton hierarchies, are strictly related to the systems that we are considering but cannot be directly obtained by simple restriction “a posteriori”. The application of our framework to the periodic Toda lattice or to other models without a constant leading term in the Lax matrix has not been investigated yet.
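Requirement (2) can be tested mechanically. By a standard linear algebra fact, a matrix commutes only with polynomials in itself exactly when its minimal polynomial has degree $r$, that is, when $\mathbb{1}, A, \ldots, A^{r-1}$ are linearly independent. The sketch below checks this rank condition with exact fractions; the helper names are hypothetical and the test matrices are illustrative choices.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def commutes_only_with_its_powers(a):
    """True when 1, A, ..., A^(r-1) are linearly independent."""
    r = len(a)
    a = [[F(x) for x in row] for row in a]
    power = [[F(int(i == j)) for j in range(r)] for i in range(r)]
    flattened = []
    for _ in range(r):
        flattened.append([x for row in power for x in row])
        power = matmul(power, a)
    return rank(flattened) == r

distinct = [[1, 0, 0], [0, 2, 0], [0, 0, 4]]      # distinct eigenvalues
rotation = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]     # eigenvalues i, -i, 0
jordan = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]        # nilpotent, A^2 != 0
repeated = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]      # repeated eigenvalue, diagonal
short_nilpotent = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]  # A^2 = 0 with r = 3

for name, a in [("distinct", distinct), ("rotation", rotation), ("jordan", jordan),
                ("repeated", repeated), ("short_nilpotent", short_nilpotent)]:
    print(name, commutes_only_with_its_powers(a))
```

The last two matrices fail the test, in agreement with the remarks in item (2): a diagonalisable matrix with a repeated eigenvalue, and a nilpotent matrix whose powers vanish before the $r$-th one.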
There are two possible approaches to the construction of multihamiltonian structures for Lax equations of the type considered. Most authors regard them as dynamical systems on loop algebras $\widetilde{gl}(r) \equiv gl(r)((\lambda))$, and use the R-matrix technique to define compatible Poisson brackets, which can be reduced to finite-dimensional quotient spaces identified with the linear spaces of fixed-order polynomials in $\lambda$ [7]. According to another approach, one considers the direct sum of $n$ copies of the Lie algebra gl(r), defines a suitable Lie algebra structure on this vector space (different from the direct product structure), and an appropriate scalar product; in this way, one gets a natural Lie–Poisson bracket on $gl(r)^n$. The other Poisson structures are obtained by a deformation procedure, i.e. are defined as Lie derivatives of the Lie–Poisson tensor along suitable vectorfields [6]. In this approach, the dynamical variable is an $n$-tuple of matrices $(M_1, \ldots, M_n)$, while the fixed matrix $A$ occurs in the definition of the deformation vectorfields which produce the Poisson structures; the Lax matrix $L(\lambda)$ itself arises as a by-product, in connection with the Hamilton
equations with spectral parameter which naturally represent the trihamiltonian flows. The two approaches are substantially equivalent for our purposes. We shall follow here the second approach, but we stress that most of the definitions could be rephrased in the R-matrix language. As anticipated in the introduction, we are not giving here all the proofs. The proof that the tensors P0 and P1 defined below are compatible Poisson tensors can be found, for instance, in [6] or [7]. The only proof which is included concerns the fact that the characteristic determinant of the Lax matrix is the fundamental common Casimir polynomial for our trihamiltonian structure.
4.1. Affine Lie–Poisson pencils
Let us consider the linear space $gl(r)^n \equiv \bigoplus^n gl(r)$; we shall denote elements of this space by $M \equiv (M_1, \ldots, M_n)$, with $M_i \in gl(r)$. The scalar product on gl(r) defined by (1.5) is extended “componentwise” to $gl(r)^n$:
\[
(A, B) = \sum_{i=1}^{n} \mathrm{Tr}(A_i B_i) . \tag{4.1}
\]
Using this scalar product, the gradient of a function $f : gl(r)^n \to \mathbb{R}$ is again an element of $gl(r)^n$, that we denote by $(\nabla_1 f, \ldots, \nabla_n f)$. We define on $gl(r)^n$ a first Lie–Poisson structure $P^{(n)}$ (depending on $M_i$ in a strictly linear way) by setting
\[
\{f, g\}_{P^{(n)}} = \sum_{i=1}^{n} \Bigl( \nabla_i g , \sum_{k=i}^{n} [M_k, \nabla_{k-i+1} f] \Bigr) . \tag{4.2}
\]
It is convenient to represent the Poisson tensor $P^{(n)}$ as a matrix of linear operators acting on the column vector $(\nabla_1 f, \ldots, \nabla_n f)$:
\[
P^{(n)} = \begin{pmatrix}
[M_1, \cdot\,] & [M_2, \cdot\,] & \cdots & [M_{n-1}, \cdot\,] & [M_n, \cdot\,] \\
[M_2, \cdot\,] & [M_3, \cdot\,] & \cdots & [M_n, \cdot\,] & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
[M_{n-1}, \cdot\,] & [M_n, \cdot\,] & \cdots & 0 & 0 \\
[M_n, \cdot\,] & 0 & \cdots & 0 & 0
\end{pmatrix} .
\]
Following [6], from this first structure it is possible to obtain a sequence of other $n$ affine Poisson structures $P^{(n-1)}, \ldots, P^{(0)}$, all mutually compatible, by the iterative formula
\[
P^{(k-1)} = -\frac{1}{k+1}\, L_X P^{(k)} , \qquad k = n, \ldots, 1 , \tag{4.3}
\]
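where the components of the deformation vectorfield $X$ are affine functions, determined by the fixed matrix $A \in gl(r)$:
\[
X = \begin{pmatrix}
0 & \cdots & \cdots & \cdots & 0 \\
n-1 & \ddots & & & \vdots \\
0 & \ddots & \ddots & & \vdots \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & 0
\end{pmatrix}
\begin{pmatrix} M_1 \\ \vdots \\ \vdots \\ \vdots \\ M_n \end{pmatrix}
+ \begin{pmatrix} nA \\ 0 \\ \vdots \\ \vdots \\ 0 \end{pmatrix} .
\]
The first pencil of the trihamiltonian structure that we shall consider is defined by the tensors $P^{(1)}$ and $P^{(0)}$ of the sequence; their expression is
\[
\begin{aligned}
\{f, g\}_{P^{(1)}} = {} & (\nabla_1 g, [\nabla_{n-1} f, A]) + \sum_{i=2}^{n-1} \Bigl( \nabla_i g , [\nabla_{n-i} f, A] + \sum_{k=1}^{i-1} [\nabla_{n-i+k} f, M_k] \Bigr) + (\nabla_n g, [M_n, \nabla_n f]) , \\
\{f, g\}_{P^{(0)}} = {} & (\nabla_1 g, [\nabla_n f, A]) + \sum_{i=2}^{n} \Bigl( \nabla_i g , [\nabla_{n-i+1} f, A] + \sum_{k=1}^{i-1} [\nabla_{n-i+k+1} f, M_k] \Bigr) .
\end{aligned}
\]
In matrix representation,
\[
P^{(1)} = \begin{pmatrix}
0 & \cdots & \cdots & 0 & [\,\cdot\,, A] & 0 \\
0 & \cdots & 0 & [\,\cdot\,, A] & [\,\cdot\,, M_1] & 0 \\
\vdots & & & & \vdots & \vdots \\
0 & [\,\cdot\,, A] & \cdots & \cdots & [\,\cdot\,, M_{n-3}] & 0 \\
[\,\cdot\,, A] & [\,\cdot\,, M_1] & \cdots & \cdots & [\,\cdot\,, M_{n-2}] & 0 \\
0 & \cdots & \cdots & \cdots & 0 & [M_n, \cdot\,]
\end{pmatrix} ,
\]
\[
P^{(0)} = \begin{pmatrix}
0 & \cdots & \cdots & 0 & [\,\cdot\,, A] \\
0 & \cdots & 0 & [\,\cdot\,, A] & [\,\cdot\,, M_1] \\
\vdots & & & & \vdots \\
0 & [\,\cdot\,, A] & \cdots & \cdots & [\,\cdot\,, M_{n-2}] \\
[\,\cdot\,, A] & [\,\cdot\,, M_1] & \cdots & \cdots & [\,\cdot\,, M_{n-1}]
\end{pmatrix} . \tag{4.4}
\]
The Poisson pencil $P^{(1)} - \lambda P^{(0)}$ is called the affine Lie–Poisson pencil on gl(r). From the matrix representation given above, it is easy to see that any function $f_\lambda = \sum_i h_i \lambda^i$ such that
\[
\begin{cases}
\nabla_k f_\lambda - \lambda \nabla_{k+1} f_\lambda = 0 , & k = 1, \ldots, n-1 , \\
[A\lambda^n + M_1\lambda^{n-1} + \cdots + M_n , \nabla_n f_\lambda] = 0
\end{cases} \tag{4.5}
\]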
is a Casimir function of the affine Lie–Poisson pencil. This observation leads to the definition of the Lax matrix $L(\lambda) = A\lambda^n + M_1\lambda^{n-1} + \cdots + M_n$, and one immediately sees that the trace of any power of $L(\lambda)$ fulfills (4.5). Each Casimir function of this type generates a bihamiltonian hierarchy according to the prescriptions (1.19) and (1.20). The corresponding Hamilton equations with spectral parameter are equivalent to Lax equations for $L(\lambda)$: the $k$-th flow of the hierarchy, using the same notation as in (1.19), is represented by
\[
\dot L(\lambda) = [L(\lambda), \nabla_n f_\lambda^{(k)}] . \tag{4.6}
\]
In the sequel, we revert to the notation used in the previous part of the paper, setting $P_0 \equiv P^{(0)}$ and $P_1 \equiv P^{(1)}$. However, on $gl(r)^n$ the third compatible structure $P_2$, necessary to construct the appropriate second pencil, does not belong to the above-defined sequence of affine Lie–Poisson tensors: in particular, it has nothing to do with the structure $P^{(2)}$, defined by (4.3) for $n \ge 2$ (for this reason we had to adopt a different notation). The most efficient method to find the third Poisson tensor $P_2$ still starts from the linear Lie–Poisson structure (4.2), but instead of applying a deformation vectorfield one exploits the Lax–Nijenhuis equation. This equation has been introduced in [25] in connection with the following question: “Given a Lax equation, and a Poisson structure $P$ for which the traces of the powers of the Lax matrix are in involution, does there exist a second compatible Poisson structure $Q$ such that the same constants of motion are iteratively linked in a Magri–Lenard hierarchy?” It turns out that, if such a second Poisson structure exists, the two derivatives of the Lax matrix $L$ along the two vectorfields $P\,dh$ and $Q\,dh$ generated by any hamiltonian $h$ are linked by the following relation:
\[
L_{Q\,dh} L = L_{P\,dh}(L^2) + [L, \alpha(dh)] ,
\]
for some matrix $\alpha$ algebraically depending on the differential of the hamiltonian $h$. In some cases, this equation allows one to determine completely the second Poisson structure $Q$. This may happen if the Lax matrix is not generic, and in particular if $L$ is assumed to have a fixed degree with respect to some grading in the Lie algebra; for instance, when $L$ is tridiagonal (Toda), or is a polynomial of fixed degree in a spectral parameter (the case we are dealing with). Then, its square $L^2$ usually has a different degree, and the unknown element $\alpha$ occurring in the r.h.s. becomes determined by a compatibility requirement, i.e. its commutator with $L$ should cancel exactly the terms of higher degree in the derivative of $L^2$.
We refer the reader to [25] for a complete presentation of the method. We omit the details of the computation in our case: a key point is that one should plug in the Lax–Nijenhuis equation a slightly modified Lax matrix polynomial, namely the “convoluted” polynomial L∗ (λ) = A+M1 λ+· · ·+Mn λn . Then, starting from the Poisson tensor P0 , one finds the following new Poisson bracket:
\[
\begin{aligned}
\{f, g\}_{P_2} = {} & \sum_{i=1}^{n} \Bigl( \nabla_i g , \sum_{k=i}^{n} \bigl( M_k \nabla_{k-i+1} f\, A - A\, \nabla_{k-i+1} f\, M_k \bigr) \Bigr) \\
& + \sum_{i=2}^{n} \Bigl( \nabla_i g , \sum_{k=i}^{n} \sum_{l=1}^{i-1} \bigl( M_k \nabla_{k+l-i+1} f\, M_l - M_l \nabla_{k+l-i+1} f\, M_k \bigr) \Bigr) .
\end{aligned} \tag{4.7}
\]
For generic $n$, writing down the representation of this Poisson tensor as a matrix of linear operators, analogous to (4.4), would be rather cumbersome. The reader can easily figure out the general form from the representations of the Poisson tensors corresponding to $n = 1, 2, 3$ respectively. For $n = 1$,
\[
P_2 = M(\cdot)A - A(\cdot)M ;
\]
for $n = 2$,
\[
P_2 = \begin{pmatrix}
M_1(\cdot)A - A(\cdot)M_1 & M_2(\cdot)A - A(\cdot)M_2 \\
M_2(\cdot)A - A(\cdot)M_2 & M_2(\cdot)M_1 - M_1(\cdot)M_2
\end{pmatrix} ;
\]
for $n = 3$,
\[
P_2 = \begin{pmatrix}
M_1(\cdot)A - A(\cdot)M_1 & M_2(\cdot)A - A(\cdot)M_2 & M_3(\cdot)A - A(\cdot)M_3 \\
M_2(\cdot)A - A(\cdot)M_2 & M_3(\cdot)A - A(\cdot)M_3 + M_2(\cdot)M_1 - M_1(\cdot)M_2 & M_3(\cdot)M_1 - M_1(\cdot)M_3 \\
M_3(\cdot)A - A(\cdot)M_3 & M_3(\cdot)M_1 - M_1(\cdot)M_3 & M_3(\cdot)M_2 - M_2(\cdot)M_3
\end{pmatrix} .
\]
For n = 1 one recovers the Poisson structure of Morosi and Pizzocchero [11] already described in the Introduction. The quadratic Poisson structures for n > 1 have never been presented in the previous literature, to our knowledge. In the R-matrix language, they can be obtained by a suitable modification of the so-called Sklyanin bracket [26].
4.2. Fundamental Casimir polynomial

We have anticipated in the introduction that the characteristic determinant of the Lax matrix $L(\lambda)$ provides a complete common Casimir polynomial for the two pencils $P_1 - \lambda P_0$ and $P_2 - \mu P_0$. We shall now prove this statement. We already know that the traces of the powers of the Lax matrix are Casimir functions for the first pencil, as they fulfill (4.5). The same holds for the coefficients of each power of $\mu$ in the characteristic polynomial $f_{\lambda\mu} = \det(L(\lambda) - \mu\mathbb{1})$, since these coefficients are functionally dependent on the traces of the powers of $L(\lambda)$. Then, what remains to check is that the polynomials in $\mu$ which occur as coefficients of each power of $\lambda$ in $f_{\lambda\mu}$ are Casimir functions for the second pencil. In the following computation, we denote the components of the gradient of $f_{\lambda\mu}$ by $V_i = \nabla_i f_{\lambda\mu}$ (notice that each matrix $V_i$ still depends on both $\lambda$ and $\mu$), and we systematically make use of the fact that
\[
V_k = \lambda V_{k+1} , \quad \text{i.e.} \quad V_k = \lambda^{n-k} V_n . \tag{4.8}
\]
The condition $(P_2 - \mu P_0)\, df_{\lambda\mu} = 0$ translates into the following system of $n$ equations:
\[
\begin{cases}
[\mu V_n, A] - \sum_{k=1}^{n} (M_k V_k A - A V_k M_k) = 0 , \\[4pt]
[\mu V_{n-i+1}, A] + \sum_{k=1}^{i-1} [\mu V_{n-i+k+1}, M_k] - \sum_{k=i}^{n} (M_k V_{k-i+1} A - A V_{k-i+1} M_k) \\
\qquad - \sum_{k=i}^{n} \sum_{l=1}^{i-1} (M_k V_{k+l-i+1} M_l - M_l V_{k+l-i+1} M_k) = 0 , \qquad \text{for } i = 2, \ldots, n ,
\end{cases}
\]
which, after some algebraic manipulations taking account of (4.8) and of some of the equalities following from $(P_1 - \lambda P_0)\, df_{\lambda\mu} = 0$, can be shown to be equivalent to
\[
(L(\lambda) - \mu\mathbb{1})\, V_n M_i = M_i V_n\, (L(\lambda) - \mu\mathbb{1}) \quad \text{for all } i = 1, \ldots, n . \tag{4.9}
\]
Now, for the characteristic polynomial $f_{\lambda\mu} = \det(L(\lambda) - \mu\mathbb{1})$ one has $V_n = \nabla_n f_{\lambda\mu} = f_{\lambda\mu} \cdot (L(\lambda) - \mu\mathbb{1})^{-1}$, therefore (4.9) is straightforwardly satisfied. Under the hypotheses listed at the beginning of this section (the phase space is the full space $gl(r)^n$, without constraints, and $A$ is suitably generic), the non-constant coefficients of the characteristic equation are functionally independent and define exactly $\frac{1}{2} nr(r+1)$ hamiltonians. Figure 1 displays the corresponding “fundamental molecule”. One sees directly from the diagram that, for each one of the Poisson tensors $P_0$, $P_1$ and $P_2$, the set of hamiltonians $h^j_i$ includes exactly $nr$ Casimir functions, therefore the rank of the Poisson tensor cannot exceed $nr^2 - nr = nr(r-1)$; on the other hand, the remaining $\frac{1}{2} nr(r-1)$ hamiltonians are in mutual involution, so the rank of the Poisson tensor cannot be less than $nr(r-1)$. Therefore, the fundamental Casimir polynomial $f_{\lambda\mu}$ is complete. If $A$ does not fulfill our requirements, what happens is that (i) the Poisson structures $P_1$ and $P_2$ have a larger kernel, so there are other independent Casimir functions not occurring in the characteristic determinant; (ii) some coefficients of the characteristic determinant vanish identically, and as a consequence (iii) one cannot find properly normalised transversal vectorfields using the recipe presented below.
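For $n = 1$ the statement can be verified directly with exact arithmetic. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $L(\lambda) = A\lambda + M$, uses Jacobi's formula to identify the gradient of $\det(L(\lambda) - \mu\mathbb{1})$ with the adjugate $V$ of $L(\lambda) - \mu\mathbb{1}$, and evaluates the residual $[\mu V, A] - (MVA - AVM)$ of the single equation that remains when $n = 1$. All function names and sample matrices are hypothetical.

```python
from fractions import Fraction as F

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Transpose of the cofactor matrix, so that m * adjugate(m) = det(m) * 1."""
    n = len(m)
    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)] for i in range(n)]

def char_matrix(A, M, lam, mu):
    """L(lam) - mu * 1 for the degree-one Lax matrix L(lam) = A * lam + M."""
    r = len(A)
    return [[A[i][j] * lam + M[i][j] - (mu if i == j else 0) for j in range(r)]
            for i in range(r)]

def second_pencil_residual(A, M, lam, mu):
    """[mu V, A] - (M V A - A V M) with V = adjugate(L(lam) - mu * 1)."""
    V = adjugate(char_matrix(A, M, lam, mu))
    commutator = sub(matmul(V, A), matmul(A, V))
    quadratic = sub(matmul(matmul(M, V), A), matmul(matmul(A, V), M))
    return sub(scale(mu, commutator), quadratic)

A = [[1, 0, 0], [0, 2, 0], [0, 0, 4]]
M = [[2, -1, 3], [0, 5, 1], [4, -2, -3]]
lam, mu = F(3, 2), F(-5, 7)
print(second_pencil_residual(A, M, lam, mu))
```

The residual vanishes identically because $M = (L(\lambda) - \mu\mathbb{1}) - \lambda A + \mu\mathbb{1}$ and $V$ is a two-sided adjugate, which is the $n = 1$ shadow of the manipulation leading to (4.9).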
4.3. Transversal vectorfields and separation coordinates

The next step towards the construction of separation coordinates consists in finding a set of $P_0$-transversal vectorfields, normalized on the $nr$ Casimir functions which one gets from the fundamental Casimir polynomial. We state without proof the general recipe:

(1) Choose a matrix $W_1 \in gl(r)$ of rank one such that
\[
\begin{aligned}
\mathrm{Tr}(W_1) &= 0 , \\
\mathrm{Tr}(W_1 A) &= 0 , \\
&\;\;\vdots \\
\mathrm{Tr}(W_1 A^{r-2}) &= 0 , \\
\mathrm{Tr}(W_1 A^{r-1}) &= (-1)^{r-1} ;
\end{aligned} \tag{4.10}
\]
this condition makes sense because we know that $A^k \neq 0$ for $k < r$ (the $r$ linear equations above do not determine $W_1$ uniquely, but any such $W_1$ will work).

(2) Compute the adjoint of the characteristic matrix $L(\lambda) - \mu\mathbb{1}$, that we shall denote by $L^\dagger(\lambda,\mu)$; by definition, $(L(\lambda) - \mu\mathbb{1}) \cdot L^\dagger(\lambda,\mu) = f_{\lambda\mu}\mathbb{1}$. The entries of $L^\dagger(\lambda,\mu)$ are the cofactors of $L(\lambda) - \mu\mathbb{1}$, therefore $L^\dagger(\lambda,\mu)$ is a polynomial of order $n(r-1)$ in $\lambda$ and of order $(r-1)$ in $\mu$.

(3) Let $W_\lambda$ be the matrix polynomial of the form
\[
W_\lambda = W_1\lambda^{n-1} + W_2\lambda^{n-2} + \cdots + W_n \tag{4.11}
\]
with
\[
\begin{aligned}
W_2 &= W_1 (u_{0,2}\mathbb{1} + u_{1,2} A + \cdots + u_{r-2,2} A^{r-2}) , \\
&\;\;\vdots \\
W_n &= W_1 (u_{0,n}\mathbb{1} + u_{1,n} A + \cdots + u_{r-2,n} A^{r-2}) .
\end{aligned} \tag{4.12}
\]
The $(r-1) \times (n-1)$ coefficients $u_{i,j}$ are scalar functions on $gl(r)^n$, which should be determined by the following condition: take the polynomial $\mathrm{Tr}\bigl(L^\dagger(\lambda,\mu)\, W_\lambda\bigr)$; for each power of $\mu$ separately, the $n$ highest coefficients in $\lambda$ should vanish, except for the coefficient of $\lambda^{n(r-1)+(n-1)}$, which is always equal to one because of (4.10). Namely, the terms which should be canceled with the appropriate choice of $u_{i,j}$ are the following:
\[
\begin{aligned}
&\lambda^k , && n(r-1) \le k \le n(r-1) + (n-2) , \\
&\mu\lambda^k , && (n-1)(r-1) \le k \le (n-1)(r-1) + (n-1) , \\
&\;\;\vdots \\
&\mu^{n-1}\lambda^k , && 0 \le k \le (n-1) .
\end{aligned}
\]
This gives a linear system of equations for the coefficients $u_{i,j}$. The system is triangular and can always be solved.
(4) Denoting a tangent vector on $gl(r)^n$ by $v = [\dot M_1, \ldots, \dot M_n]$, introduce the $n$ vectorfields
\[
\begin{aligned}
v^0_1 &= [0, \ldots, 0, 0, W_1] , \\
v^0_2 &= [0, \ldots, 0, W_1, W_2] , \\
&\;\;\vdots \\
v^0_n &= [W_1, \ldots, W_{n-1}, W_n] .
\end{aligned} \tag{4.13}
\]

(5) Take for $k = 1, \ldots, r-1$ the product of $W_\lambda$ with the $k$-th power of the Lax matrix $L(\lambda)$:
\[
W^{(k)}_\lambda = \frac{W_\lambda L^k}{\lambda^{nk}} . \tag{4.14}
\]
Consider the highest $n$ terms in the expansion $W^{(k)}_\lambda = W^{(k)}_1\lambda^{n-1} + W^{(k)}_2\lambda^{n-2} + \cdots + W^{(k)}_n\lambda^0 + \cdots$ (in particular, one has $W^{(k)}_1 \equiv W_1 A^k$): other $n(r-1)$ vectorfields $v^k_j$, for $j = 1, \ldots, n$ and $k = 1, \ldots, (r-1)$, are defined analogously to (4.13):
\[
\begin{aligned}
v^k_1 &= [0, \ldots, 0, 0, W^{(k)}_1] , \\
v^k_2 &= [0, \ldots, 0, W^{(k)}_1, W^{(k)}_2] , \\
&\;\;\vdots \\
v^k_n &= [W^{(k)}_1, \ldots, W^{(k)}_{n-1}, W^{(k)}_n] .
\end{aligned} \tag{4.15}
\]

For the $nr$ vectorfields constructed in this way, the following holds:

Proposition 4.12. The vectorfields $v^k_j$ are symmetries of $P_0$ and fulfill the normalization condition
\[
v^k_j\bigl(h^l_{m+(r-k-1)n}\bigr) = \delta^{kl}\delta_{jm} \quad \text{for } k, l = 0, \ldots, (r-1) \text{ and } j, m = 1, \ldots, n ; \tag{4.16}
\]
moreover, for all $k$, $j$, $l$ and $m$ in the given ranges,
\[
v^k_j\bigl(v^l_m(f_{\lambda\mu})\bigr) = 0 \quad \text{and} \quad [v^k_j, v^l_m] = 0 . \tag{4.17}
\]
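Step (1) of the recipe can be made explicit when $A$ is diagonal with distinct eigenvalues $a_1, \ldots, a_r$. The sketch below assumes the particular choice in which all rows of $W_1$ are equal, with entries $w_j = (-1)^{r-1}/\prod_{i \neq j}(a_j - a_i)$; the sign factor for general $r$ is an assumption made here so that the last condition in (4.10) holds, and the function names are hypothetical.

```python
from fractions import Fraction as F

def w_row(eigs):
    """Common row of W1: w_j = (-1)^(r-1) / prod_{i != j} (a_j - a_i)."""
    r = len(eigs)
    sign = (-1) ** (r - 1)
    row = []
    for j, a in enumerate(eigs):
        prod = F(1)
        for i, b in enumerate(eigs):
            if i != j:
                prod *= F(a - b)
        row.append(sign / prod)
    return row

def make_W1(eigs):
    """Rank-one matrix whose rows all equal w_row(eigs)."""
    w = w_row(eigs)
    return [list(w) for _ in eigs]

def trace_W1_Ak(eigs, k):
    """Tr(W1 A^k) for A = diag(eigs): only the diagonal of W1 contributes."""
    W1 = make_W1(eigs)
    return sum(W1[i][i] * F(eigs[i]) ** k for i in range(len(eigs)))

eigs3 = (1, 2, 4)
print(make_W1(eigs3)[0])
print([trace_W1_Ak(eigs3, k) for k in range(3)])
```

The vanishing of the lower traces is the classical Lagrange interpolation identity $\sum_j a_j^k / \prod_{i \neq j}(a_j - a_i) = 0$ for $k < r - 1$, which is why a matrix with equal rows is a convenient solution of (4.10).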
To ensure that the P0 -transversal vectorfields vjk provide a set of (separation) Darboux–Nijenhuis coordinates, one should also check conditions (2) and (3) of Proposition 3.7. However, this verification may be postponed: let us tentatively assume that these conditions are satisfied. Then, one could find separation coordinates in which the Poisson tensors P0 , P1 and P2 assume the “canonical” form described in Sec. 3.3; in particular, the affine Lie–Poisson tensor P0 becomes canonical in the usual sense. In separation coordinates, we already know the explicit form of the nr polynomials Sα (λ, µ); the latter should coincide, up to the change of coordinates, with
the polynomials $S^k_j(\lambda,\mu) \equiv v^k_j(f_{\lambda\mu})$ that one can also compute in terms of the original variables of the Lax equation. Hence, one should recover the separation coordinates by taking the common roots of the latter polynomials. Actually, it is sufficient to compute the common roots of the polynomials $v^k_1(f_{\lambda\mu})$. The vectorfields $v^k_1$ are constant: namely, we have seen that they are defined by $\dot M_i = 0$ for $i = 1, \ldots, (n-1)$ and $\dot M_n = W_1 A^k$. The corresponding polynomials $S^k_1(\lambda,\mu)$ are the derivatives of the characteristic determinant $f_{\lambda\mu}$ along constant vectorfields, and therefore are nothing but linear combinations of cofactors of the characteristic matrix $L(\lambda) - \mu\mathbb{1}$, with constant coefficients determined by the matrices $W_1$ and $A$. Therefore, we recover Sklyanin’s recipe, with a definite prescription of the normalization to be used.$^b$ If one were able to compute $\frac{1}{2} nr(r-1)$ pairs of common roots $(\lambda_i,\mu_i)$ of the polynomials $S^k_1(\lambda,\mu)$, then one would only have to check a posteriori that the new variables $(\lambda_i,\mu_i)$ are canonical for the Poisson structure $P_0$. Since the polynomials $S^k_1(\lambda,\mu)$ are independent linear combinations of cofactors of the characteristic matrix, $(\lambda_i,\mu_i)$ are also roots of the fundamental Casimir polynomial, and this would be enough to say that all the hamiltonians become separable in the new coordinates. In this case there is no ambiguity on the constant part of $f_{\lambda\mu}$ itself, which turns out to depend only on the choice of $A$. Notice that in the new coordinates on $gl(r)^n$, given by the roots $(\lambda_i,\mu_i)$ and by the $nr$ Casimir functions of $P_0$, the vectorfields $v^k_j$ automatically become coordinate vectorfields, due to the normalization and to the property (4.17). However, as we have stressed in the introduction, finding explicitly the roots of the polynomials $S^k_1(\lambda,\mu)$ is impossible in general.
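The statement that the polynomials $S^k_1$ are linear combinations of cofactors can be illustrated for $n = 1$, $r = 3$. The sketch below is a hedged example with hypothetical names and sample values: it assumes $A = \mathrm{diag}(1, 2, 4)$ and the equal-row choice of $W_1$, evaluates $S^k_1(\lambda,\mu) = \mathrm{Tr}\bigl(L^\dagger(\lambda,\mu)\, W_1 A^k\bigr)$ using the adjugate, and compares it with the change of the determinant along the direction $W_1 A^k$, which is exact here because that direction has rank one.

```python
from fractions import Fraction as F

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Transpose of the cofactor matrix."""
    n = len(m)
    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)] for i in range(n)]

def char_matrix(A, M, lam, mu):
    """L(lam) - mu * 1 with L(lam) = A * lam + M."""
    r = len(A)
    return [[A[i][j] * lam + M[i][j] - (mu if i == j else 0) for j in range(r)]
            for i in range(r)]

def direction(W1, A, k):
    """Constant tangent vector W1 A^k."""
    D = W1
    for _ in range(k):
        D = matmul(D, A)
    return D

def S(k, lam, mu, A, M, W1):
    """Derivative of det(A lam + M - mu 1) along W1 A^k, as Tr(adjugate * direction)."""
    prod = matmul(adjugate(char_matrix(A, M, lam, mu)), direction(W1, A, k))
    return sum(prod[i][i] for i in range(len(A)))

A = [[1, 0, 0], [0, 2, 0], [0, 0, 4]]
W1 = [[F(1, 3), F(-1, 2), F(1, 6)] for _ in range(3)]   # equal rows, rank one
M = [[2, -1, 3], [0, 5, 1], [4, -2, -3]]
lam, mu = F(3, 2), F(-2, 3)
print([S(k, lam, mu, A, M, W1) for k in range(3)])
```

Setting $M = 0$ isolates the leading terms: with this normalization the three polynomials reduce to $\lambda^2$, $\lambda\mu$ and $\mu^2$, matching the pure quadratic terms of $S_1$, $S_2$, $S_3$ listed at the end of Sec. 3.3.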
A more effective method is the following one: since we know the expression of the polynomials $S^k_j(\lambda,\mu)$ in both coordinate systems, we can simply equate the corresponding coefficients for each polynomial to get a set of algebraic equations linking the two coordinate systems. Luckily enough, while the coefficients of $S^k_j(\lambda,\mu)$ are in general rather complicated rational functions of the separation coordinates, it is easy to see that a number of coefficients are linear functions of the entries of the Lax matrix $L(\lambda)$. It turns out that one can provide in this way a complete set of equations relating the two coordinate systems, which either are all linear (in the lower dimensional cases) or can be reduced to linear equations. In this way one can effectively compute the inverse coordinate transformation, i.e. express the original variables as functions of the separation coordinates. The mapping cannot be explicitly inverted in general, but it is sufficient for some purposes, for instance to check that the Poisson tensors transform as expected, proving a posteriori that all conditions are met.

$^b$ To be honest, exactly because Sklyanin’s prescription does not fix the normalisation of the Baker–Akhiezer function, to prove the equivalence of the two procedures one cannot rely on the simple comparison of the final results; it would be necessary to compare each object involved in the two constructions, which we have not done yet.
Let us present a concrete example. We have already displayed in Sec. 3.3 the form of the polynomials $S_\alpha$ for the system associated to the Lie algebra gl(3). We now compute the same polynomials for the Lax matrix $A\lambda + M$, with $M \in gl(3)$ and $A$ diagonal with distinct eigenvalues $(\alpha, \beta, \gamma)$ (we choose this case because it corresponds to the generalised Euler–Poinsot rigid body, as discussed in the introduction). We choose a set of orthonormal coordinates $(x_1, \ldots, x_9)$ in gl(3), setting
\[
M = \begin{pmatrix}
x_7 & \frac{1}{\sqrt{2}}(x_1 + x_4) & \frac{1}{\sqrt{2}}(x_2 + x_5) \\
\frac{1}{\sqrt{2}}(x_4 - x_1) & x_8 & \frac{1}{\sqrt{2}}(x_3 + x_6) \\
\frac{1}{\sqrt{2}}(x_5 - x_2) & \frac{1}{\sqrt{2}}(x_6 - x_3) & x_9
\end{pmatrix} . \tag{4.18}
\]
The three Casimir functions for $P_0$ (see (1.27)) are
\[
\begin{aligned}
h^0_2 &= \beta\gamma x_7 + \alpha\gamma x_8 + \alpha\beta x_9 , \\
h^1_1 &= -(\beta + \gamma)x_7 - (\alpha + \gamma)x_8 - (\alpha + \beta)x_9 , \\
h^2_0 &= x_7 + x_8 + x_9 .
\end{aligned} \tag{4.19}
\]
Hence, there is a linear correspondence between the coordinates $(x_7, x_8, x_9)$ and the coordinates $(c_1, c_2, c_3)$. Choosing $W_1$ to have all the rows equal to each other, following step (1) above one finds
\[
W_1 = \begin{pmatrix}
\frac{1}{(\alpha-\beta)(\alpha-\gamma)} & \frac{1}{(\gamma-\beta)(\alpha-\beta)} & \frac{1}{(\beta-\gamma)(\alpha-\gamma)} \\
\frac{1}{(\alpha-\beta)(\alpha-\gamma)} & \frac{1}{(\gamma-\beta)(\alpha-\beta)} & \frac{1}{(\beta-\gamma)(\alpha-\gamma)} \\
\frac{1}{(\alpha-\beta)(\alpha-\gamma)} & \frac{1}{(\gamma-\beta)(\alpha-\beta)} & \frac{1}{(\beta-\gamma)(\alpha-\gamma)}
\end{pmatrix} . \tag{4.20}
\]
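There are only three constant transversal vectorfields in this case, with $\dot M = W_1$, $\dot M = W_1 A$ and $\dot M = W_1 A^2$ respectively. Taking the derivatives of $f_{\lambda\mu} = \det(A\lambda + M - \mu\mathbb{1})$ along these three vectorfields one easily computes the three polynomials $S_1(\lambda,\mu)$, $S_2(\lambda,\mu)$ and $S_3(\lambda,\mu)$. Let us write down only the coefficients which are relevant to our purpose:
\[
\begin{aligned}
S_1 = {} & \lambda^2 + \frac{1}{\sqrt{2}} \bigl[ -(\alpha - 2\gamma + \beta)x_1 + (\alpha - 2\beta + \gamma)x_2 + (2\alpha - \beta - \gamma)x_3 \\
& + \sqrt{2}(\gamma - \beta)x_7 + \sqrt{2}(\alpha - \gamma)x_8 + \sqrt{2}(\beta - \alpha)x_9 \\
& + (\alpha - \beta)x_4 - (\alpha - \gamma)x_5 + (\beta - \gamma)x_6 \bigr] [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \mu \\
& + \frac{1}{\sqrt{2}} \bigl[ \gamma(\alpha - 2\gamma + \beta)x_1 - \beta(\alpha - 2\beta + \gamma)x_2 - \alpha(2\alpha - \beta - \gamma)x_3 \\
& + \sqrt{2}(\gamma - \beta)(-\gamma + \alpha - \beta)x_7 - \sqrt{2}(\alpha - \gamma)(\alpha - \beta + \gamma)x_8 \\
& + \sqrt{2}(\alpha - \beta)(\alpha - \gamma + \beta)x_9 + \alpha(\gamma - \beta)x_6 - \gamma(\alpha - \beta)x_4 \\
& + \beta(\alpha - \gamma)x_5 \bigr] [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \lambda + \cdots , \\[6pt]
S_2 = {} & \lambda\mu + \frac{1}{\sqrt{2}} \bigl[ (2\alpha\gamma - \beta\gamma - \alpha\beta)x_2 + (\alpha\gamma - 2\beta\gamma + \alpha\beta)x_3 \\
& + (-2\alpha\beta + \beta\gamma + \alpha\gamma)x_1 + \sqrt{2}(\alpha - \gamma)\beta x_8 - \sqrt{2}(\alpha - \beta)\gamma x_9 \\
& - \alpha(-\beta + \gamma)x_6 + \sqrt{2}\alpha(-\beta + \gamma)x_7 + \gamma(\alpha - \beta)x_4 \\
& - \beta(\alpha - \gamma)x_5 \bigr] [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \mu \\
& + \frac{1}{\sqrt{2}} \bigl[ -\beta(2\alpha\gamma - \beta\gamma - \alpha\beta)x_2 - \alpha(\alpha\gamma - 2\beta\gamma + \alpha\beta)x_3 - \gamma(-2\alpha\beta + \beta\gamma + \alpha\gamma)x_1 \\
& - \sqrt{2}\alpha\gamma(\alpha - \gamma)x_8 + \sqrt{2}\alpha\beta(\alpha - \beta)x_9 + \alpha^2(-\beta + \gamma)x_6 \\
& - \sqrt{2}\gamma\beta(-\beta + \gamma)x_7 - \gamma^2(\alpha - \beta)x_4 \\
& + \beta^2(\alpha - \gamma)x_5 \bigr] [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \lambda + \cdots , \\[6pt]
S_3 = {} & \mu^2 + \frac{1}{\sqrt{2}} \bigl[ (\gamma^2\alpha - \beta\gamma^2 - \alpha^2\beta + \alpha^2\gamma)x_2 + (\gamma^2\alpha - \beta\gamma^2 + \beta^2\alpha - \beta^2\gamma)x_3 \\
& + (-\beta^2\alpha + \beta^2\gamma - \alpha^2\beta + \alpha^2\gamma)x_1 - \sqrt{2}(-\alpha\beta + \alpha\gamma - \beta\gamma)(\alpha - \gamma)x_8 \\
& - \sqrt{2}(-\alpha\beta + \alpha\gamma + \beta\gamma)(\alpha - \beta)x_9 - (\alpha\gamma - \beta\gamma + \alpha\beta)(-\beta + \gamma)x_6 \\
& + \sqrt{2}(\alpha\gamma - \beta\gamma + \alpha\beta)(-\beta + \gamma)x_7 + (-\alpha\beta + \alpha\gamma + \beta\gamma)(\alpha - \beta)x_4 \\
& + (-\alpha\beta + \alpha\gamma - \beta\gamma)(\alpha - \gamma)x_5 \bigr] \cdot [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \mu \\
& + \frac{1}{\sqrt{2}} \bigl[ -\gamma(-\beta^2\alpha + \beta^2\gamma - \alpha^2\beta + \alpha^2\gamma)x_1 - \sqrt{2}\alpha\gamma\beta(-\beta + \gamma)x_7 \\
& - \beta(\gamma^2\alpha - \beta\gamma^2 - \alpha^2\beta + \alpha^2\gamma)x_2 - \sqrt{2}\alpha\beta\gamma(\alpha - \gamma)x_8 \\
& - \alpha(\gamma^2\alpha - \beta\gamma^2 + \beta^2\alpha - \beta^2\gamma)x_3 + \sqrt{2}\alpha\beta\gamma(\alpha - \beta)x_9 \\
& - (-\alpha\beta + \alpha\gamma + \beta\gamma)(\alpha - \beta)\gamma x_4 - (-\alpha\beta + \alpha\gamma - \beta\gamma)(\alpha - \gamma)\beta x_5 \\
& + (\alpha\gamma - \beta\gamma + \alpha\beta)(-\beta + \gamma)\alpha x_6 \bigr] \cdot [(\gamma - \beta)(\alpha - \beta)(\alpha - \gamma)]^{-1} \lambda + \cdots .
\end{aligned}
\]
Therefore, there are exactly six coefficients which are linear in the coordinates $x_i$. One can check that they are mutually independent and also independent from the other three equations already found from (4.19): the determinant of the full 9 × 9 linear system is exactly equal to one. By equating the polynomials above to the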
corresponding expressions listed at the end of Sec. 3.3, one finds the coordinate transformation. On the other hand, all three polynomials contain quadratic terms in λ and µ, and to find the common roots of two of them one should solve an equation of order four in λ (or in µ), which is possible in principle but gives a result which is of little practical use. The generalization of the Lagrange top can be treated in the same way. The phase space is gl(3)2 , and as above we use as coordinates suitable orthonormal linear combinations of the entries of M1 and M2 , that we denoteby xi , i = 1, . . . , 18. The 0 1 0 constant matrix for this case is chosen to be A = −1 0 0 . The Casimir functions 0
0
0
for P0 occurring in the characteristic determinant fλµ = det(A λ2 + M1 λ + M2 ) are h04 , h05 , h12 , h13 , h20 and h21 ; they are linear except for h04 and h12 which are quadratic. There are 6 transversal vectorfields, three of which are constant. Each of the three corresponding polynomials, S40 (λ, µ), S21 (λ, µ) and S02 (λ, µ), has exactly three coefficients which are linear in the coordinates xi , namely the coefficients of µ, λµ and λ3 . Then, one has a total of 13 independent linear equations which can be solved for the 13 coordinates x1 , . . . , x9 , x15 , . . . , x18 . The coefficients of λ and λ2 in the polynomials S40 , S21 and S02 , are linear in the remaining five coordinates x10 , . . . , x14 , which can thus be computed as well. On the contrary, the direct procedure ` a la Sklyanin, i.e. computing the common roots of the bivariate polynomials Sα , is not viable due to the order of the polynomials themselves. Acknowledgments We are much indebted to Franco Magri, Gregorio Falqui, Marco Pedroni and Sergio Benenti for several helpful discussions and exchanges of information on work in progress, and to the referee for useful comments. This work is supported by the national MURST research project “Geometry of Integrable Systems”. References [1] F. Magri, A simple model of the integrable Hamiltonian equation, J. Math. Phys. 19 (1978) 1156–1162. [2] G. Falqui, F. Magri and M. Pedroni, in preparation. [3] G. Falqui and M. Pedroni, Separation of variables for bi-Hamiltonian systems, nlin.SI/0204029. [4] P. A. Griffiths, Linearizing flows and cohomological interpretation of lax equations, Amer. J. Math. 107 (1985) 1445–1483. [5] M. Adler, P. van Moerbeke, Completely integrable systems, Euclidean Lie algebras, and curves, Adv. Math. 38 (1980) 267–317; Linearization of Hamiltonian systems, Jacobi varieties and representation theory, Adv. Math. 38 (1980) 318–379. [6] G. Magnano, Bihamiltonian Approach to Lax Equations with Spectral Parameter, Acc. Sc. Torino Mem. 
Sc. Fis. 19–20 (1995–1996) 159–209. [7] A. G. Reyman and M. A. Semënov-Tyan-Shansky, Compatible Poisson structures for Lax equations: an r-matrix approach, Phys. Lett. A130 (1988) 456–460. [8] E. Sklyanin, Separations of variables, new trends, Progr. Theo. Phys. Suppl. 118 (1995) 35–60.
[9] S. V. Manakov, Note on the integration of euler’s equation of the dynamics of an n-dimensional rigid body, Funkt. Anal. Appl. 10 (1976) 93–94. [10] A. G. Reyman, M. A. Sem¨enov-Tyan-Shansky, Group-theoretical methods in the theory of finite-dimensional integrable systems, in Encyclopaedia of Mathematical Sciences, eds. V. I. Arnold and S. P. Novikov, Springer-Verlag, Berlin 1994, pp. 116– 225. [11] C. Morosi and L. Pizzocchero, On the Euler equation: bi-hamiltonian structures and integrals in involution, Lett. Math. Phys. 37 (1996) 117–135. [12] F. Magri and C. Morosi, A geometrical characterization of integrable hamiltonian systems through the theory of Poisson–Nijenhuis manifolds, Quaderno S 19/1984 of the Departement of Mathematics of the University of Milano, unpublished. [13] M. Ugaglia, Sistemi dinamici su algebre di Lie: funzioni di Casimir e significato del parametro spettrale, unpublished BSc Thesis, University of Torino (1994), and private communication. [14] T. Ratiu, Euler–Poisson Equations on Lie Algebras and the n-dimensional Heavy Rigid Body, American J. Math. 104 (1982) 409–448. ¯ [15] G. Falqui, F. Magri and M. Pedroni, Bihamiltonian geometry, Darboux coverings and linearization of the KP hierarchy, Comm. Math. Phys. 197 (1998) 303–324. [16] P. Casati, F. Magri and M. Pedroni, Bi-hamiltonian manifolds and τ -function, in Mathematical Aspects of Classical Field Theory (1991), eds. M. J. Gotay et al., Contemporary Mathematics, 132, American Mathematical Society, Providence, 1992. [17] G. Magnano and F. Magri, Poisson–Nijenhuis structures and sato hierarchy, Rev. Math. Phys. 3 (1991) 403–466. [18] W. Oevel and O. Ragnisco, R-matrices and Higher Poisson Brackets for Integrable Systems, Phys. A161 (1989) 181–220. [19] C. Morosi and L. Pizzocchero, r-matrix theory, formal casimirs and the periodic toda lattice, J. Math. Phys. 37 (1996) 4484–4513. [20] T. 
Marsico, Una caratterizzazione geometrica dei sistemi che ammettono rappresentazione alla Lax estesa, unpublished PhD Thesis, University of Milano (1995). [21] G. Falqui, F. Magri, M. Pedroni and J. P. Zubelli, A bi-hamiltonian theory for stationary kdv flows and their separability, Reg. Chaotic Dyn. 5 (2000) 33–52. [22] L. Degiovanni, F. Magri and V. Sciacca, On deformation of Poisson manifolds of hydrodynamic type, preprint, the Departement of Mathematics and Applications of the University of Palermo, submitted for publication. [23] I. Vaisman, Lectures on the Geometry of Poisson Manifolds, Prog. Math. 118 (1994). [24] J. Harnad, Loop groups, R-matrices and separation of variables, in Integrable Systems: from Classical to Quantum (Montr´eal, QC, 1999), CRM Proc. Lecture Notes, 26, Amer. Math. Soc., Providence 2000, pp. 21–54. [25] F. Magri and Y. Kosmann-Schwarzbach, Lax–Nijenhuis operators for integrable systems, J. Math. Phys. 37 (1996) 6173–6197. [26] M. A. Sem¨enov-Tyan-Shansky, What is a classical r-matrix?, Funct. Anal. Appl. 17 (1983) 259–272.
November 25, 2002 16:35 WSPC/148-RMP
00150
Reviews in Mathematical Physics, Vol. 14, No. 11 (2002) 1165–1280 © World Scientific Publishing Company
ON THE SCATTERING THEORY OF MASSLESS NELSON MODELS
C. GÉRARD Centre de Mathématiques, UMR 7640 CNRS, École Polytechnique, 91128 Palaiseau Cedex, France Received 17 April 2001 Revised 6 June 2002
We study the scattering theory for a class of non-relativistic quantum field theory models describing a confined non-relativistic atom interacting with a massless relativistic bosonic field. We construct invariant spaces $\mathcal{H}_c^{\pm}$ which are defined in terms of propagation properties for large times and which consist of states containing a finite number of bosons in the region $\{|x| \ge ct\}$ for $t \to \pm\infty$. We show the existence of asymptotic fields and we prove that the associated asymptotic CCR representations preserve the spaces $\mathcal{H}_c^{\pm}$ and induce on these spaces representations of Fock type. For these induced representations, we prove the property of geometric asymptotic completeness, which gives a characterization of the vacuum states in terms of propagation properties. Finally we show that a positive commutator estimate implies the asymptotic completeness property, i.e. the fact that the vacuum states of the induced representations coincide with the bound states of the Hamiltonian. Keywords: Non-relativistic quantum electrodynamics; scattering theory; asymptotic completeness.
1. Introduction In this section we describe the class of models that we will consider in this paper, discuss the hypotheses and describe the main results.
1.1. Massless Nelson models
We will consider in this paper a quantum field theory model which describes a confined atom interacting with a field of massless scalar bosons. This model is usually called the Nelson model (see [27, 2, 4, 26]). It was originally introduced in [27] as a phenomenological model of non-relativistic particles interacting with a quantized scalar field. The atom is described with the Hilbert space
$$\mathcal{K} := L^2(\mathbb{R}^{3P}, dx),$$
where $x = (x_1, \dots, x_P)$, $x_i$ is the position of particle $i$, and the Hamiltonian:
$$K := \sum_{i=1}^{P} -\frac{1}{2m_i}\Delta_i + \sum_{i<j} V_{ij}(x_i - x_j) + W(x_1, \dots, x_P),$$
where $m_i$ is the mass of particle $i$, $V_{ij}$ is the interaction potential between particles $i$ and $j$ and $W$ is an external confining potential. We will assume
(H0) $V_{ij}$ is $\Delta$-bounded with relative bound 0, $W \in L^2_{\rm loc}(\mathbb{R}^{3P})$, $W(x) \ge c_0|x|^{2\alpha} - c_1$, $c_0 > 0$, $\alpha > 0$.
It follows from (H0) that $K$ is symmetric and bounded below on $C_0^{\infty}(\mathbb{R}^{3P})$. We still denote by $K$ its Friedrichs extension. Moreover we have $D((K + b)^{\frac12}) \subset H^1(\mathbb{R}^{3P}) \cap D(|x|^{\alpha})$, which implies that
$$|x|^{\alpha}(K + b)^{-\frac12} \ \text{is bounded}. \tag{1.1}$$
Note also that (H0) implies that $K$ has compact resolvent on $L^2(\mathbb{R}^{3P})$. The one-particle space for bosons is
$$\mathfrak{h} := L^2(\mathbb{R}^3, dk),$$
where the observable $k$ is the boson momentum. An important role will be played by the observable
$$x := i\frac{d}{dk}, \quad \text{acting on } \mathfrak{h}.$$
The observable $x$ has the interpretation of the Newton–Wigner position. In fact the one-particle space for relativistic massless scalar bosons can be written as $L^2(\mathbb{R}^3, \frac{dk}{|k|})$. In this representation the selfadjoint operator
$$x_{\rm NW} := i|k|^{\frac12}\frac{\partial}{\partial k}|k|^{-\frac12}$$
is called the Newton–Wigner position observable (see e.g. [29, Chap. 3c]). By the unitary map $h(k) \mapsto |k|^{-\frac12}h(k)$ between $L^2(\mathbb{R}^3, \frac{dk}{|k|})$ and $L^2(\mathbb{R}^3, dk)$ the observable $x_{\rm NW}$ is sent onto the observable $x$. Hence $x$ has the interpretation of the Newton–Wigner position.
The bosonic field is described with the Fock space $\Gamma(\mathfrak{h})$ and the Hamiltonian $d\Gamma(|k|)$. The non-interacting system is described with the Hilbert space $\mathcal{H} := \mathcal{K} \otimes \Gamma(\mathfrak{h})$ and the Hamiltonian
$$H_0 := K \otimes \mathbb{1}_{\Gamma(\mathfrak{h})} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(|k|).$$
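The unitary equivalence between the Newton–Wigner observable $x_{\rm NW}$ and $x$ stated above can be checked in two lines. The following is only a sketch: the symbol $T$ for the unitary map is introduced here for illustration and is not the paper's notation, and domain questions are ignored.

```latex
\begin{aligned}
(Th)(k) &:= |k|^{-\frac12}h(k), \qquad
\|Th\|^2_{L^2(\mathbb{R}^3,dk)} = \int |k|^{-1}|h(k)|^2\,dk
  = \|h\|^2_{L^2(\mathbb{R}^3,\frac{dk}{|k|})},\\
T\,x_{\rm NW}\,T^{-1}
  &= |k|^{-\frac12}\Big(i|k|^{\frac12}\frac{\partial}{\partial k}|k|^{-\frac12}\Big)|k|^{\frac12}
   = i\frac{\partial}{\partial k} = x .
\end{aligned}
```

The first line shows that $T$ preserves norms; the second shows that the weights $|k|^{\pm\frac12}$ cancel, leaving the plain derivative.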
We assume that the interaction is of the form
$$V := \sum_{j=1}^{P} \phi(\check v_j(x_j)), \tag{1.2}$$
for
$$\phi(\check v_j(x_j)) = \frac{1}{\sqrt2}\int \big(v_j(k)e^{-ik\cdot x_j}a^*(k) + \bar v_j(k)e^{ik\cdot x_j}a(k)\big)\,dk,$$
where $\check f$ denotes the inverse Fourier transform of $f$ and the functions $v_j$ satisfy
(I0) $\int (1 + |k|^{-1})|v_j(k)|^2\,dk < \infty$, $1 \le j \le P$.
The Hamiltonian describing the interacting system is now:
$$H := H_0 + V.$$
The assumption (I0) implies, using Proposition A.1, that $\phi(\check v_j(x_j))$ is $H_0$-bounded with infinitesimal bound and hence that $H$ is selfadjoint and bounded below on $D(H_0)$. Note that the interaction is translation invariant (although the full Hamiltonian $H$ is not because of the confining potential $W$). Note also that using the notation introduced in (2.1) we can write:
$$V = \phi(v),$$
where $v \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$ is defined by
$$v\psi(x_1, \dots, x_P) = \sum_{j=1}^{P} e^{-ik\cdot x_j}v_j(k)\psi(x_1, \dots, x_P). \tag{1.3}$$
1.2. Scattering theory for confined Nelson models
The mathematical framework of scattering theory for confined Nelson models, known as the LSZ approach, is based on the asymptotic Weyl operators. These are defined as the limits:
$$W^{\pm}(f) := \text{s-}\lim_{t\to\pm\infty} e^{itH}W(f_t)e^{-itH},$$
where $f_t = e^{-it\omega(k)}f$ and $f$ belongs to a suitably chosen dense subspace $\mathfrak{h}_0$ of $\mathfrak{h}$. Once constructed they define two regular CCR representations called the asymptotic CCR representations. The asymptotic fields $\phi^{\pm}(f)$ are the hermitian fields associated to these representations. In very broad terms, the basic goal of scattering theory is to study the nature of these representations and in particular to understand the nature of their Fock sub-representations (if they exist). To discuss the scattering theory of confined Nelson models in more detail, we will first generalize the discussion to include the massive case, i.e. consider a
dispersion relation $\omega(k) = (k^2 + m^2)^{\frac12}$ for $m \ge 0$, and introduce some terminology: a Nelson model satisfying (H0) and (I0) is called infrared convergent if assumption (I3) below is satisfied, i.e.
$$\int (1 + \omega(k)^{-2})|v_j(k)|^2\,dk < \infty, \quad 1 \le j \le P,$$
and infrared divergent if (I3) is not satisfied, i.e.
$$\int (1 + \omega(k)^{-2})|v_j(k)|^2\,dk = +\infty, \quad \text{for some } j.$$
Note that if $m > 0$, (I0) implies (I3), i.e. massive Nelson models are always IR convergent. Note also that a massless model with an infrared cutoff (i.e. such that $v_j(k) \equiv 0$ for $|k| \le \epsilon$) is clearly IR convergent and is actually very similar to (and in some aspects simpler than) a massive Nelson model. In the physical case with an ultraviolet-cutoff interaction, we have $v_j(k) = \omega(k)^{-\frac12}\chi(k)$ for $\chi \in C_0^{\infty}(\mathbb{R}^3)$, so the massless Nelson model is IR divergent.
Let us now discuss two basic results on confined Nelson models.
• It is known (see [10] in the massive case and [17] in the massless case) that IR convergent Nelson models admit a ground state in Hilbert space, and (see [26]) that IR divergent Nelson models do not admit a ground state in Hilbert space (an elementary proof of this fact can be found in [12]). It is believed but not proved that IR divergent Nelson models do not have bound states at all.
• The existence of asymptotic fields is known to hold both for IR convergent and IR divergent Nelson models. A proof is given in Sec. 8 under the (very weak) assumption (I4). (It turns out that the behavior of $v_j(k)$ for small $k$ does not play any role for the existence of asymptotic fields.) The natural vector space $\mathfrak{h}_0$ is then $D(\omega^{-\frac12})$.
Finally let us point out that the bound states of the Hamiltonian play a fundamental role because it is easy to see that they are vacua for the asymptotic CCR representations.
1.2.1. IR convergent Nelson models
For IR convergent Nelson models, due to the existence of bound states, the asymptotic CCR representations admit a non trivial sub-representation of Fock type (i.e.
unitarily equivalent to a direct sum of Fock representations). One can then define isometric operators Ω± called the wave operators between a direct sum of copies of Fock spaces and subspaces H± of H. One can then ask the following two fundamental questions: (1) Are the asymptotic CCR representations entirely of Fock type? If this property holds the wave operators are unitary.
(2) Are the spaces $\mathcal{K}^{\pm}$ of vacua for the asymptotic CCR representations identical to the space of bound states of the Hamiltonian? This second property is called the asymptotic completeness property.
Properties (1) and (2) were first proved in [10] for massive Nelson models. Later they were proved in [16] by similar methods for non confined massless Nelson models, with an infrared cutoff on the interaction, for energies below the ionization energy of the atom. Let us finally mention the paper by Spohn [33] where the author considers a quantized photon field interacting with a confined electron in the dipole approximation. The confining potential is supposed to be a small perturbation of a quadratic potential and hence the full Hamiltonian is a small perturbation of a solvable, quadratic Hamiltonian. It is then possible to prove asymptotic completeness directly using a Dyson expansion for the full evolution. Unfortunately the method of [33] does not seem to extend to more general interactions.
1.2.2. IR divergent Nelson models
For IR divergent Nelson models, we expect that $H$ has no bound states in Hilbert space, and therefore that the asymptotic CCR representations contain no sub-representation of Fock type. The basic framework for confined IR divergent Nelson models is studied in [12], using ideas from [15]. Note that in [15] (see also [28]) the more complicated translation invariant model was studied, where the Haag–Ruelle approach is used instead of the LSZ approach. It turns out that any question concerning the scattering theory of an IR divergent Nelson model can be reduced to a similar question for an IR convergent Nelson model. In fact it is shown in [12] that there exist an IR convergent Nelson model $H_{\rm ren}$, called the renormalized Hamiltonian, an element $g$ in the dual $\mathfrak{h}_0'$ of $\mathfrak{h}_0$ such that $g \notin \mathfrak{h}$, and unitary maps $U^{\pm}$ on $\mathcal{H}$ such that:
$$W^{\pm}(f)U^{\pm} = U^{\pm}W^{\pm}_{\rm ren}(f)e^{-i\,{\rm Im}(f,g)}, \quad f \in \mathfrak{h}_0,$$
where $W^{\pm}_{\rm ren}(f)$ are the asymptotic Weyl operators for $H_{\rm ren}$.
The factor $e^{-i\,{\rm Im}(f,g)}$ corresponds to a phase translation and, since $g \notin \mathfrak{h}$, indicates that the asymptotic CCR representations for an IR divergent Nelson model should be coherent state representations. Moreover from the above formula, we see that any information on the asymptotic CCR representations for $H_{\rm ren}$ immediately gives information on the asymptotic CCR representations for $H$.
For example from the fact that the representations $W^{\pm}_{\rm ren}$ admit a Fock sub-representation, we see that $W^{\pm}$ admit a coherent state sub-representation. Similarly if asymptotic completeness holds for $H_{\rm ren}$, then the CCR representations for $H$ are coherent state representations. Note also that the Hamiltonian $H_{\rm ren}$ is exactly the
Hamiltonian considered by Arai [4], where the Nelson model is considered in a non-Fock representation. Finally let us mention that for IR divergent Nelson models, it is also possible to define the modified wave operators and the scattering operator.
1.3. Results and methods
We now describe the results and methods of this paper. We start by briefly recalling how asymptotic completeness was shown in [10] for the massive case. The answer to question (1) is rather easy in the massive case, and relies on the fact that the total number of particles is dominated by the energy. Question (2) is more difficult, even in the massive case. In [10], this problem was solved in two steps: first a direct geometric characterization of the asymptotic vacua, in terms of their propagation properties for large times, is obtained: one shows that the asymptotic vacua coincide with the states having no particles in $\{|x| \ge \epsilon t\}$ for large $t$ and $\epsilon > 0$ arbitrarily small. This property is called in [10] the geometric asymptotic completeness. In a second step this geometric characterization of the asymptotic vacua is combined with a Mourre estimate to obtain the asymptotic completeness.
In this paper we give some partial answers to the second problem for IR convergent massless Nelson models. Since IR convergent massless Nelson models admit bound states in the Hilbert space, we expect that properties (1) and (2) should also hold in this case. Two problems arise in extending the results of [10] to the massless case. The first problem is that one needs a bound on the number of asymptotically free particles. This problem shows up in connection with property (1) and property (2). The second problem is the lack of smoothness of the dispersion relation $|k|$ at $k = 0$. Since we cannot a priori exclude bosons of small momenta, propagation estimates with this dispersion relation are not easy to obtain. Let us now describe the new methods used in this paper to deal with these problems:
1.3.1. Singularity of the dispersion relation
To handle this difficulty we will use a trick due to Derezinski and Jaksic in [13]. The idea is to add to the system a field of non-physical bosons with dispersion relation $-|k|$. Note that there is an analogy with a method used by Jaksic and Pillet in [25] for the study of return to equilibrium for similar models at positive temperature, where particles of negative energy appear as holes in the equilibrium distribution. The next step is to go to polar coordinates $r = |k|$ and to glue together the two Fock spaces of bosons of positive/negative energy. In this way one obtains a Fock space over $\mathfrak{h}^e = L^2(\mathbb{R}, d\sigma) \otimes L^2(S^2)$ with the (smooth) dispersion relation $\sigma$. This construction is described in detail in Sec. 3.3 and leads to the so-called expanded objects, like the expanded Hilbert space $\mathcal{H}^e$ and Hamiltonian $H^e$.
All the analytical work will be done on expanded objects. Results on asymptotic observables or asymptotic fields for the expanded Hamiltonian $H^e$ can be converted to the original Hamiltonian $H$ using results shown in Secs. 3.10 and 8.6. Note however that a result for the expanded Hamiltonian, based on a one-particle observable $a$ on $\mathfrak{h}^e$, converts to a result for the original Hamiltonian only if $a$ commutes with the projection $\mathbb{1}_{\{\sigma \ge 0\}}$. This is not the case for the observable $s = i\frac{\partial}{\partial\sigma}$, which plays a key role in our paper. Therefore a lot of technical work will be needed to overcome this difficulty in Secs. 10 and 11, by replacing $s$ by another observable commuting with $\mathbb{1}_{\{\sigma \ge 0\}}$.
1.3.2. Bound on the number of particles
To show that the asymptotic CCR representations are of Fock type is equivalent to showing that the asymptotic number operators (see Sec. 8.2) have dense domains, which are then equal to the range of the wave operators. Experience from time-dependent scattering theory suggests that it is better to replace this algebraic description of the range of the wave operators by a geometric description in terms of propagation properties for large, but finite times. This is done in our paper in the following way: We construct in Sec. 5 projections $P_c^{e\pm}$ for $0 < c < 1$, commuting with $H^e$, whose ranges $\mathcal{H}_c^{e\pm}$ are the states in $\mathcal{H}^e$ having only a finite number of particles in $\{|s| \ge c'|t|\}$ for each $c < c'$. Converting these results to $H$, we obtain spaces $\mathcal{H}_c^{\pm}$ which are invariant under the evolution and which contain the states having a finite number of particles in $\{|x| \ge c't\}$ for each $c < c'$. We show in Theorems 12.3 and 12.5 that the spaces $\mathcal{H}_c^{\pm}$ have the following properties:
(1) $\mathcal{H}_c^{\pm}$ are non trivial if the Hamiltonian has bound states.
(2) The asymptotic CCR representations preserve $\mathcal{H}_c^{\pm}$ and are of Fock type when restricted to $\mathcal{H}_c^{\pm}$. Recalling that the ranges of the wave operators $\Omega^{\pm}$ are denoted by $\mathcal{H}^{\pm}$ this property means that $\mathcal{H}_c^{\pm} \subset \mathcal{H}^{\pm}$.
(3) On $\mathcal{H}_c^{\pm}$ the geometric asymptotic completeness holds: the asymptotic vacua in $\mathcal{H}_c^{\pm}$ are exactly the states in $\mathcal{H}_c^{\pm}$ having no particles in $\{|x| \ge c't\}$ for all $c < c'$, $t \to \pm\infty$.
(4) If a Mourre estimate holds on an energy interval $\Delta$ with the generator of dilations as conjugate operator, then a restricted version of asymptotic completeness holds on $\Delta$: the asymptotic vacua in $\mathcal{H}_c^{\pm}$ with energy in $\Delta$ coincide with the bound states of the Hamiltonian in $\Delta$.
The proof of geometric asymptotic completeness is done by working with $H^e$ and introducing asymptotic partitions of unity and geometric inverse wave operators as in [10]. The simpler approach to geometric asymptotic completeness used in [11] does not seem to be applicable here, since it relied on the fact that in the massive case the wave operators are known to be unitary.
Let us also note that all the observables used in [10] to show geometric asymptotic completeness are unbounded observables dominated only by the number
operator. This was not an issue in the massive case, since these observables are then bounded by the total energy. In the massless case, this is no longer true and we have to use different observables to prove corresponding propagation estimates.
There are two questions which remain open: first of all one would like to show that the spaces $\mathcal{H}_c^{\pm}$ are equal to the whole Hilbert space $\mathcal{H}$. This would imply that the asymptotic CCR representations are of Fock type and that the wave operators are unitary. We believe that it should be easier to show that $\mathcal{H}_c^{\pm} = \mathcal{H}$ than to show that $\mathcal{H}^{\pm} = \mathcal{H}$ since we have a geometric description of $\mathcal{H}_c^{\pm}$ instead of the algebraic description of $\mathcal{H}^{\pm}$ given by the asymptotic number operators. A more modest question would be to show that the spaces $\mathcal{H}_c^{\pm}$ for different $0 < c < 1$ are all identical, which is very likely since the speed of propagation for massless bosons is equal to 1, so no particles should be found in the intermediate region $\{c_1t \le |x| \le c_2t\}$ for $0 < c_1 < c_2 < 1$.
The second remaining open problem is to show a Mourre estimate for the Hamiltonian $H$ outside of a discrete set of points. Up to now a Mourre estimate has been shown only for sufficiently small coupling constant $g$ and outside some intervals whose size depends on $g$ (see [32, 7, 13]).
1.4. Hypotheses
Let us now state the various hypotheses that we will impose on the coupling functions $v_j$ in the sequel. In the formulation of conditions (I2) and (I5) one introduces polar coordinates $\tilde\sigma = |k|$, $\omega = \frac{k}{|k|}$ (see (3.1)).
In Sec. 4, we will impose:
(I1) $\int (1 + |k|^{-1-2\epsilon_0})|v_j(k)|^2\,dk < \infty$, $1 \le j \le P$, $\epsilon_0 > 0$.
This condition will be needed to obtain sharp estimates on the growth of the total number of particles along the evolution. In Sec. 5, we will impose:
(I2) $\tilde\sigma v_j(\tilde\sigma\omega) \in H_0^{\mu}(\mathbb{R}^+) \otimes L^2(S^2)$, $1 \le j \le P$, $\mu > 0$,
where the space $H_0^{\mu}(\mathbb{R}^+)$ is the closure of $C_0^{\infty}(]0, +\infty[)$ in the topology of $H^{\mu}(\mathbb{R})$. This condition will allow us to construct $H$-invariant spaces $\mathcal{H}_c^{+}$ containing a finite number of particles in the region $|x| \ge ct$, for $0 < c < 1$. In Sec. 3.2, we impose:
(I3) $\int (1 + |k|^{-2})|v_j(k)|^2\,dk < \infty$, $1 \le j \le P$.
Conditions (I3) and (H0) for $\alpha > 0$ imply that $H$ admits a ground state in the Hilbert space $\mathcal{H}$. This fact has two important consequences: firstly the CCR representation given by the asymptotic Weyl operators $W^{+}(h)$ constructed in Sec. 8 admits a Fock sub-representation (see Sec. 8.2). Secondly the spaces $\mathcal{H}_c^{+}$ are non trivial (see Theorem 5.6).
In Sec. 8 we impose:
(I4) $v_j \in H^{\mu_1}_{\rm loc}(\mathbb{R}^3)$, $1 \le j \le P$, $\mu_1 > 0$.
This condition will allow us to construct the asymptotic fields. In Sec. 4.5 we impose:
(I5) $(1 + |\tilde\sigma|^{-\frac12})\left(1 - \frac{\Delta_\omega}{\tilde\sigma^2}\right)^{\mu_2}\tilde\sigma v_j(\tilde\sigma\omega) \in L^2(\mathbb{R}^+) \otimes L^2(S^2)$, $1 \le j \le P$, $\mu_2 > 0$.
This assumption will be needed to control the angular part of the observable $|x| = (-\Delta_k)^{\frac12}$.
To illustrate the meaning of these various conditions, let us consider a rotationally invariant coupling function $v_j$ of the form:
$$v_j(k) = |k|^{\rho}\chi(|k|), \tag{1.4}$$
where $\chi \in C_0^{\infty}(\mathbb{R})$ is an ultraviolet cutoff. Then:
(I0) is satisfied if $\rho > -1$.
(I1) is satisfied if $\rho > -1 + \epsilon_0$.
(I2) is satisfied if $\rho > \mu - 1$. In fact it is easy to see that $\tilde\sigma^{\rho+1}\chi(\tilde\sigma) \in H_0^{\mu}(\mathbb{R}^+)$ if $\mu < \rho + 1$.
(I3) is satisfied if $\rho > -\frac12$.
(I4) and (I5) are satisfied for all values of $\rho$.
The main results of the paper, formulated in Secs. 8 and 12, hold under (H0), (I0), (I2), (I5) for $\alpha > 1$, $\mu > 1$, $\mu_2 > 1$. Hence we see that for a coupling function of the form (1.4), the results of the paper hold for $\rho > 0$.
1.5. Plan of the paper
Let us now describe the plan of the paper. In Sec. 2 we define some notation and recall some notions introduced in [10]. In Sec. 3 we describe the abstract framework in which we will work for most of the paper. In this framework, the original Hilbert space and Hamiltonian are denoted by $\mathcal{H}$ and $H$ respectively. We introduce the so-called expanded objects, in particular the expanded Hilbert space $\mathcal{H}^e$ and the expanded Hamiltonian $H^e$ which will play an important role. The one-particle space is now $L^2(\mathbb{R}, d\sigma) \otimes L^2(S^2)$ and the one-particle kinetic energy for the expanded Hamiltonian is simply the operator of multiplication by $\sigma$. A number of basic technical estimates are also proved in this section.
Section 4 is devoted to estimating the growth of the number observable along the evolution. We show that if the interaction is of size $O(|k|^{\epsilon_0})$ near $k = 0$, then the number of particles is bounded by $t^{\rho_0}$ when $t \to +\infty$, where $\rho_0$ depends on $\epsilon_0$. We also prove some estimates on the growth of the ‘angular’ part of $|x|$ along the evolution which will be useful in Sec. 11.
Most of the analytical work will be done on the expanded objects. In Sec. 5, we construct the spaces $\mathcal{H}_c^{e+}$ described in Sec. 1.3. In Sec. 6, we construct an asymptotic partition of unity on $\mathcal{H}_c^{e+}$. Using this partition of unity, we can split a state in $\mathcal{H}_c^{e+}$ into pieces having a fixed number of particles in $\{|s| \ge c't\}$ for $c < c'$, where $s$ is the operator canonically conjugate to $\sigma$. In Sec. 7, we construct geometric inverse wave operators on $\mathcal{H}_c^{e+}$.
The asymptotic fields and the wave operators both for $H$ and $H^e$ are constructed in Sec. 8 and their relationship is studied. In Sec. 9 we prove the geometric asymptotic completeness on the spaces $\mathcal{H}_c^{e+}$ for $H^e$. Sections 10 and 11 are devoted to a reinterpretation of the spaces $\mathcal{H}_c^{e+}$. Originally these spaces are described in terms of the observable $s$. As explained in Sec. 1.3, this description is not convenient to obtain corresponding spaces for $H$, which is the reason why another description with a different observable is given.
In Sec. 12 we prove the main results of this paper for the original Hamiltonian $H$. The construction and properties of the spaces $\mathcal{H}_c^{+}$ are obtained from the results of Secs. 5, 8, 9 and from functorial arguments, using the alternative description of $\mathcal{H}_c^{e+}$ in Sec. 11. Finally in Sec. 13, we study the consequences of a Mourre estimate for $H$ and show that it implies the asymptotic completeness restricted to $\mathcal{H}_c^{+}$. Various technical results are collected in the Appendix.
2. Notation
2.1. General notation
We collect some notations that will be used throughout the paper.
2.1.1. Function spaces
We will denote by $C_{\infty}(\mathbb{R}^n)$ the space of continuous functions on $\mathbb{R}^n$ tending to 0 at infinity. We set
$$S^0(\mathbb{R}^n) = \{f \in C^{\infty}(\mathbb{R}^n) \mid |\partial_x^{\alpha}f(x)| \le C_{\alpha}, \ \alpha \in \mathbb{N}^n\}.$$
We denote by $H^s(\mathbb{R}^n)$ the Sobolev space of order $s \in \mathbb{R}$.
2.1.2. Hilbert spaces
If $\mathcal{H}$ is a Hilbert space, we denote by $B(\mathcal{H})$, respectively $U(\mathcal{H})$ the set of bounded, respectively unitary operators on $\mathcal{H}$.
If $H$ is a bounded below selfadjoint operator on $\mathcal{H}$, we will denote by the letter $b$ a constant such that $H + b \ge \mathbb{1}$. If $H$ is a selfadjoint operator on $\mathcal{H}$ and $\mathbb{R} \ni t \mapsto \Phi(t) \in B(\mathcal{H})$ is an operator-valued function, we denote by $D\Phi(t)$ the Heisenberg derivative:
$$D\Phi(t) = \partial_t\Phi(t) + [H, i\Phi(t)].$$
For $u \in \mathcal{H}$, we set $u_t = e^{-itH}u$.
Often the Hamiltonian $H$ can be written as a sum $H = H_0 + V$, where $H_0$ is a ‘free’ Hamiltonian and $V$ an interaction term. In this case we denote by $D_0$ the free Heisenberg derivative associated to $H_0$:
$$D_0\Phi(t) = \partial_t\Phi(t) + [H_0, i\Phi(t)].$$
If $\mathbb{R} \ni t \mapsto \Phi(t)$ is a map with values in linear operators on $\mathcal{H}$ and $N$ is a positive selfadjoint operator on $\mathcal{H}$ we will say that
$$\Phi(t) \in O(N^{\alpha})t^{\mu} \quad \text{for } \alpha \in \mathbb{R}^+, \ \mu \in \mathbb{R}$$
if $D(N^{\alpha}) \subset D(\Phi(t))$ for $t \in \mathbb{R}$ and $\|\Phi(t)(N + 1)^{-\alpha}\| \in O(t^{\mu})$. The notation $\Phi(t) \in o(N^{\alpha})t^{\mu}$ is defined similarly. If $A$, $B$ are two selfadjoint operators, we denote by ${\rm ad}_A B$ the expression ${\rm ad}_A B = [A, B]$. Usually the commutator $[A, B]$ is first defined as a quadratic form on $D(A) \cap D(B)$ and then extended as an operator on some domain. The precise meaning of ${\rm ad}_A B$ will either be specified or clear from the context.
Finally we recall (see [3]) that if $A$ is a selfadjoint operator and $B \in B(\mathcal{H})$, one says that $B \in C^1(A)$ if the map $\mathbb{R} \ni s \mapsto e^{isA}Be^{-isA} \in B(\mathcal{H})$ is $C^1$ for the strong topology. If $H$ is a selfadjoint operator, one says that $H \in C^1(A)$ if for some $z \in \mathbb{C}\setminus\sigma(H)$, $(z - H)^{-1} \in C^1(A)$. If $H \in C^1(A)$ then the quadratic form $[(z - H)^{-1}, iA]$ extends from $D(A)$ to a bounded quadratic form on $\mathcal{H}$ and
$$\frac{d}{ds}e^{isA}(z - H)^{-1}e^{-isA}\Big|_{s=0} = [A, i(z - H)^{-1}] = (z - H)^{-1}[A, iH](z - H)^{-1}.$$
For $0 < \epsilon < 1$, we say that $H \in C^{1+\epsilon}(A)$ if $H \in C^1(A)$ and the map $\mathbb{R} \ni s \mapsto e^{isA}[(z - H)^{-1}, iA]e^{-isA} \in B(\mathcal{H})$ is $C^{\epsilon}$ for the norm topology.
2.2. Fock space notation
2.2.1. Fock spaces
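Let $\mathfrak{h}$ be a Hilbert space, which we will call the one-particle space. Let $\otimes_s^n\mathfrak{h}$ denote the symmetric $n$th tensor power of $\mathfrak{h}$. Let $S_n$ denote the orthogonal projection of $\otimes^n\mathfrak{h}$ onto $\otimes_s^n\mathfrak{h}$. The Fock space over $\mathfrak{h}$ is the direct sum
$$\Gamma(\mathfrak{h}) := \bigoplus_{n=0}^{\infty}\otimes_s^n\mathfrak{h}.$$
$\Omega$ will denote the vacuum vector, i.e. the vector $1 \in \mathbb{C} = \otimes_s^0\mathfrak{h}$. The number operator $N$ is defined as
$$N|_{\otimes_s^n\mathfrak{h}} = n\mathbb{1}.$$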
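The space of finite particle vectors, for which $\mathbb{1}_{[n,+\infty]}(N)u = 0$ for some $n \in \mathbb{N}$, will be denoted by $\Gamma_{\rm fin}(\mathfrak{h})$. For $h \in \mathfrak{h}$ we denote by $a^*(h)$, $a(h)$ the creation and annihilation operators, by $\phi(h) = \frac{1}{\sqrt2}(a^*(h) + a(h))$ the field operators and by $W(h) = e^{i\phi(h)}$ the Weyl operators (see e.g. [10, Sec. 2]).
It is convenient to extend the definition of $a^*(v)$, $a(v)$ in the following way: Suppose that $\mathcal{K}$ is a Hilbert space. If $v \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$, then we can define $a^*(v)$, $a(v)$, $\phi(v)$ as unbounded operators on $\mathcal{K} \otimes \Gamma(\mathfrak{h})$ by:
$$\begin{aligned}
a^*(v)|_{\mathcal{K}\otimes\otimes_s^n\mathfrak{h}} &:= \sqrt{n + 1}\,(\mathbb{1}_{\mathcal{K}} \otimes S_{n+1})(v \otimes \mathbb{1}_{\otimes_s^n\mathfrak{h}}),\\
a(v) &:= (a^*(v))^*,\\
\phi(v) &:= \frac{1}{\sqrt2}(a(v) + a^*(v)).
\end{aligned} \tag{2.1}$$
They satisfy the estimates
$$\|a^{\sharp}(v)(N + 1)^{-\frac12}\| \le \|v\|, \tag{2.2}$$
where $\|v\|$ is the norm of $v$ in $B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$. If $b$ is an operator on $\mathfrak{h}$, we define the operator
$$d\Gamma(b) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}), \qquad d\Gamma(b)|_{\otimes_s^n\mathfrak{h}} := \sum_{j=1}^{n}\underbrace{\mathbb{1} \otimes \cdots \otimes \mathbb{1}}_{j-1} \otimes\, b \otimes \underbrace{\mathbb{1} \otimes \cdots \otimes \mathbb{1}}_{n-j}.$$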
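If $\mathfrak{h}_i$, $i = 1, 2$ are Hilbert spaces, $q : \mathfrak{h}_1 \to \mathfrak{h}_2$ is a bounded linear operator, one defines
$$\Gamma(q) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2), \qquad \Gamma(q)|_{\otimes_s^n\mathfrak{h}_1} := q \otimes \cdots \otimes q.$$
If $q$, $r$ are operators from $\mathfrak{h}_1$ to $\mathfrak{h}_2$ one defines
$$d\Gamma(q, r) : \Gamma(\mathfrak{h}_1) \to \Gamma(\mathfrak{h}_2), \qquad d\Gamma(q, r)|_{\otimes_s^n\mathfrak{h}_1} := \sum_{j=1}^{n}\underbrace{q \otimes \cdots \otimes q}_{j-1} \otimes\, r \otimes \underbrace{q \otimes \cdots \otimes q}_{n-j}.$$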
Let us now introduce some notation related to Heisenberg derivatives. Let $\omega$ be a selfadjoint operator on $\mathfrak{h}$. We denote by $d_0$ the Heisenberg derivative associated to $\omega$:
$$d_0 = \frac{\partial}{\partial t} + [\omega, i\,\cdot\,], \quad \text{acting on } B(\mathfrak{h}).$$
Let $D_0$ be the Heisenberg derivative associated to the Hamiltonian $H_0 = d\Gamma(\omega)$. Then for a function $\mathbb{R} \ni t \mapsto b(t) \in B(\mathfrak{h})$, we have:
$$D_0\,d\Gamma(b(t)) = d\Gamma(d_0 b(t)).$$
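The identity $D_0\,d\Gamma(b(t)) = d\Gamma(d_0 b(t))$ can be seen formally on each $n$-particle sector. The following is a sketch that ignores domain questions; the shorthand $\omega_j$, $b_l$ for an operator acting in the $j$th, respectively $l$th, tensor factor is introduced here for illustration only.

```latex
\begin{aligned}
d\Gamma(\omega)\big|_{\otimes_s^n\mathfrak{h}} &= \sum_{j=1}^{n}\omega_j, \qquad
d\Gamma(b)\big|_{\otimes_s^n\mathfrak{h}} = \sum_{l=1}^{n} b_l, \qquad
[\omega_j, b_l] = 0 \ \text{ for } j \neq l,\\
[d\Gamma(\omega), i\,d\Gamma(b)]\big|_{\otimes_s^n\mathfrak{h}}
  &= \sum_{j,l=1}^{n}[\omega_j, i b_l]
   = \sum_{j=1}^{n}[\omega, i b]_j
   = d\Gamma([\omega, i b])\big|_{\otimes_s^n\mathfrak{h}},\\
D_0\,d\Gamma(b(t))
  &= d\Gamma(\partial_t b(t)) + d\Gamma([\omega, i b(t)])
   = d\Gamma(d_0 b(t)).
\end{aligned}
```

The key point is that operators acting in different tensor factors commute, so only the diagonal terms $j = l$ survive in the double sum.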
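2.2.2. Operators $P_k(f)$ and $Q_k(f)$
We now recall some objects introduced in [10] which will play an important role in the sequel. Let $f_0$, $f_{\infty}$ be operators on $\mathfrak{h}$. Let $f := (f_0, f_{\infty})$. We define the operators $P_k(f) = P_k(f_0, f_{\infty})$ and $Q_k(f) = Q_k(f_0, f_{\infty})$ for $k \in \mathbb{N}$ by setting
$$P_k(f) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}), \qquad P_k(f)|_{\otimes_s^n\mathfrak{h}} := \sum_{\sharp\{i \,|\, \epsilon_i = \infty\} = k} f_{\epsilon_1} \otimes \cdots \otimes f_{\epsilon_n},$$
where $\epsilon_i = 0, \infty$ and
$$Q_k(f) := \sum_{j=0}^{k} P_j(f).$$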
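We will sometimes denote $P_k(f)$ by $P_k(f_0, f_{\infty})$ if $f = (f_0, f_{\infty})$. For $f = (f_0, f_{\infty})$ and $g = (g_0, g_{\infty})$ we define
$$\begin{aligned}
dP_k(f, g) &: \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}),\\
dP_k(f, g)|_{\otimes_s^n\mathfrak{h}} &:= \sum_{\sharp\{i \,|\, \epsilon_i = \infty\} = k} f_{\epsilon_1} \otimes \cdots \otimes f_{\epsilon_{j-1}} \otimes g_0 \otimes f_{\epsilon_{j+1}} \otimes \cdots \otimes f_{\epsilon_n}\\
&\quad + \sum_{\sharp\{i \,|\, \epsilon_i = \infty\} = k-1} f_{\epsilon_1} \otimes \cdots \otimes f_{\epsilon_{j-1}} \otimes g_{\infty} \otimes f_{\epsilon_{j+1}} \otimes \cdots \otimes f_{\epsilon_n},
\end{aligned}$$
and
$$dQ_k(f, g) := \sum_{j=0}^{k} dP_j(f, g).$$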
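To make the combinatorics of $P_k(f)$ and $Q_k(f)$ concrete, here is a minimal Python sketch for the simplest case where the one-particle space is one-dimensional, so that $f_0$ and $f_\infty$ are plain numbers and every tensor product becomes an ordinary product. This scalar reduction and the function names `p_k_scalar` and `q_k_scalar` are assumptions made for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_k_scalar(f0, f_inf, n, k):
    """Scalar analogue of P_k(f) on the n-particle sector.

    Sums f_{e_1} * ... * f_{e_n} over all index sequences (e_1, ..., e_n)
    in {0, inf}^n with exactly k entries equal to inf.
    """
    total = 0
    for eps in product((0, 1), repeat=n):  # 1 stands for the index "inf"
        if sum(eps) == k:
            term = 1
            for e in eps:
                term *= f_inf if e else f0
            total += term
    return total


def q_k_scalar(f0, f_inf, n, k):
    """Scalar analogue of Q_k(f) = P_0(f) + ... + P_k(f)."""
    return sum(p_k_scalar(f0, f_inf, n, j) for j in range(k + 1))


if __name__ == "__main__":
    f0, f_inf = Fraction(1, 4), Fraction(3, 4)
    # In the scalar case P_k reduces to a binomial term.
    print(p_k_scalar(f0, f_inf, 3, 1) == comb(3, 1) * f0**2 * f_inf)
    # If f0 + f_inf = 1, summing over all k gives 1 on each sector.
    print(q_k_scalar(f0, f_inf, 3, 3) == 1)
```

In this scalar case $P_k$ is the binomial term $\binom{n}{k} f_0^{n-k} f_\infty^{k}$, and $Q_n$ equals $(f_0 + f_\infty)^n$ on the $n$-particle sector, which is 1 when $f_0 + f_\infty = 1$.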
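2.2.3. Canonical map
Let $\mathfrak{h}_i$, $i = 1, 2$ be Hilbert spaces. Let $p_i$ be the projection of $\mathfrak{h}_1 \oplus \mathfrak{h}_2$ onto $\mathfrak{h}_i$, $i = 1, 2$. We define
$$U : \Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2) \to \Gamma(\mathfrak{h}_1) \otimes \Gamma(\mathfrak{h}_2),$$
by
$$\begin{aligned}
U\Omega &= \Omega \otimes \Omega,\\
U a^{\sharp}(h) &= (a^{\sharp}(p_1h) \otimes \mathbb{1} + \mathbb{1} \otimes a^{\sharp}(p_2h))U, \quad h \in \mathfrak{h}_1 \oplus \mathfrak{h}_2.
\end{aligned} \tag{2.3}$$
Since the vectors $a^*(h_1) \cdots a^*(h_n)\Omega$ form a total family in $\Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$, and since $U$ preserves the canonical commutation relations, $U$ extends as a unitary operator from $\Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$ to $\Gamma(\mathfrak{h}_1) \otimes \Gamma(\mathfrak{h}_2)$.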
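A quick consistency check on the unitarity of $U$ is a dimension count in the finite-dimensional case: the $n$-particle sector of $\Gamma(\mathfrak{h}_1 \oplus \mathfrak{h}_2)$ should have the same dimension as the part of $\Gamma(\mathfrak{h}_1) \otimes \Gamma(\mathfrak{h}_2)$ with $n$ particles in total. The Python sketch below assumes $\dim\mathfrak{h}_i = d_i \ge 1$; the helper names `dim_sym` and `dim_split` are hypothetical and used only for this illustration.

```python
from math import comb


def dim_sym(d, n):
    """Dimension of the n-th symmetric tensor power of a d-dimensional space (d >= 1)."""
    return comb(d + n - 1, n)


def dim_split(d1, d2, n):
    """Dimension of the total-particle-number-n part of Gamma(h1) (x) Gamma(h2).

    Sums over k particles in h1 and n - k particles in h2.
    """
    return sum(dim_sym(d1, k) * dim_sym(d2, n - k) for k in range(n + 1))


if __name__ == "__main__":
    for d1, d2, n in [(1, 1, 2), (2, 3, 3), (4, 2, 5)]:
        print(d1, d2, n, dim_sym(d1 + d2, n) == dim_split(d1, d2, n))
```

Equal dimensions are of course only a necessary condition; the actual unitarity comes from the argument via the CCR given above.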
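2.2.4. Operators $\check\Gamma(j)$ and $d\check\Gamma(j, k)$
Let $j_0, j_\infty \in B(\mathfrak{h})$. Set $j = (j_0, j_\infty)$. We identify $j$ with the operator
\[ j : \mathfrak{h} \to \mathfrak{h} \oplus \mathfrak{h}, \qquad jh := (j_0 h, j_\infty h) . \]
We have
\[ j^* : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \qquad j^*(h_0, h_\infty) = j_0^* h_0 + j_\infty^* h_\infty , \]
and
\[ j^* j = j_0^* j_0 + j_\infty^* j_\infty . \]
By second quantization, we obtain the map $\Gamma(j) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h} \oplus \mathfrak{h})$. Let $U$ denote the canonical map between $\Gamma(\mathfrak{h} \oplus \mathfrak{h})$ and $\Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h})$ introduced above. We define
\[ \check\Gamma(j) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}), \qquad \check\Gamma(j) := U \Gamma(j) . \]
Another formula defining $\check\Gamma(j)$ is
\[ \check\Gamma(j) \prod_{i=1}^{n} a^*(h_i)\Omega := \prod_{i=1}^{n} \big(a^*(j_0 h_i) \otimes \mathbb{1} + \mathbb{1} \otimes a^*(j_\infty h_i)\big)\, \Omega \otimes \Omega, \quad h_i \in \mathfrak{h} . \tag{2.4} \]
Let $N_0 = N \otimes \mathbb{1}$, $N_\infty = \mathbb{1} \otimes N$ acting on $\Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h})$. Then if we denote by $I_k$ the natural isometry between $\bigotimes^n \mathfrak{h}$ and $\bigotimes^{n-k} \mathfrak{h} \otimes \bigotimes^{k} \mathfrak{h}$, we have:
\[ \mathbb{1}_{\{k\}}(N_\infty)\, \check\Gamma(j)\big|_{\bigotimes_s^n \mathfrak{h}} = I_k \sqrt{\frac{n!}{(n-k)!\,k!}}\; \underbrace{j_0 \otimes \cdots \otimes j_0}_{n-k} \otimes \underbrace{j_\infty \otimes \cdots \otimes j_\infty}_{k} . \]
Finally we set
\[ \check\Gamma_k(j) := \mathbb{1}_{\{k\}}(N_\infty)\, \check\Gamma(j) . \]
Let $j = (j_0, j_\infty)$, $k = (k_0, k_\infty)$ be maps from $\mathfrak{h}$ to $\mathfrak{h} \oplus \mathfrak{h}$. We set
\[ d\check\Gamma(j, k) : \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}), \qquad d\check\Gamma(j, k) := U\, d\Gamma(j, k) . \]
The operator $d\check\Gamma(1, k) = U\, d\Gamma(k)$ will be denoted simply by $d\check\Gamma(k)$.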
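2.2.5. Scattering identification operator
Let
\[ i : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}, \qquad (h_0, h_\infty) \mapsto h_0 + h_\infty . \]
An important role in scattering theory is played by the following identification operator (see [23]):
\[ I := \Gamma(i) U^* = \check\Gamma(i^*)^* : \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}) \to \Gamma(\mathfrak{h}) . \]
Note that since $\|i\| = \sqrt{2}$, the operator $\Gamma(i)$ is unbounded. Another formula defining $I$ is:
\[ I \prod_{i=1}^{n} a^*(h_i)\Omega \otimes \prod_{i=1}^{p} a^*(g_i)\Omega := \prod_{i=1}^{p} a^*(g_i) \prod_{i=1}^{n} a^*(h_i)\Omega, \quad h_i, g_i \in \mathfrak{h} . \tag{2.5} \]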
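If $\mathfrak{h} = L^2(\mathbb{R}^d, dk)$, then we can write still another formula for $I$:
\[ I\, u \otimes \psi = \frac{1}{(p!)^{1/2}} \int \psi(k_1, \ldots, k_p)\, a^*(k_1) \cdots a^*(k_p)\, u \, dk, \quad u \in \Gamma(\mathfrak{h}), \ \psi \in \bigotimes_s^p \mathfrak{h} . \tag{2.6} \]
We deduce from (2.5) that
\[ I\, (N + 1)^{-k/2} \otimes \mathbb{1} \ \text{ restricted to } \ \Gamma(\mathfrak{h}) \otimes \bigotimes_s^k \mathfrak{h} \ \text{ is bounded.} \tag{2.7} \]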
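Let $j_0, j_\infty \in B(\mathfrak{h})$ be such that $0 \le j_0 \le 1$, $0 \le j_\infty \le 1$, and $j_0 + j_\infty = 1$. Let $j = (j_0, j_\infty) : \mathfrak{h} \to \mathfrak{h} \oplus \mathfrak{h}$, as above. Clearly $0 \le j^* j \le 1$, hence $\|j\| \le 1$, and therefore $\check\Gamma(j)$ is a bounded operator. We have $ij = 1$, hence
\[ I\, \check\Gamma(j) = \mathbb{1} . \]
We also have
\[ I\, \mathbb{1}_{\{0, \ldots, k\}}(N_\infty)\, \check\Gamma(j) = Q_k(j), \qquad I\, \mathbb{1}_{\{k\}}(N_\infty)\, \check\Gamma(j) = P_k(j) . \tag{2.8} \]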
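The two claims "$0 \le j^* j \le 1$" and "$I \check\Gamma(j) = \mathbb{1}$" can be unpacked as in the following sketch. It assumes that $j_0, j_\infty$ are selfadjoint (as implied by $0 \le j_\epsilon \le 1$) and that the functorial identity $\Gamma(a)\Gamma(b) = \Gamma(ab)$ may be applied on finite-particle vectors; the second point is an assumption here because $\Gamma(i)$ is unbounded.

```latex
\begin{aligned}
% 0 <= j_eps <= 1 implies j_eps^2 <= j_eps by the spectral theorem
j^* j &= j_0^2 + j_\infty^2 \;\le\; j_0 + j_\infty \;=\; 1 , \\
% composition of i and j on one-particle vectors
i\, j\, h &= j_0 h + j_\infty h \;=\; h , \qquad h \in \mathfrak{h} , \\
% U^* U = 1 since U is unitary
I\, \check\Gamma(j) &= \Gamma(i)\, U^* U\, \Gamma(j)
   \;=\; \Gamma(i j) \;=\; \Gamma(1) \;=\; \mathbb{1} .
\end{aligned}
```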
2.2.6. Use of sub- and superscripts
To help the reader with the notation, we briefly describe the use of various sub- and superscripts in the paper. Asymptotic observables obtained by letting the time $t$ tend to $+\infty$ will be denoted with the superscript $+$. Observables depending on a constant $c$, which has the meaning of a speed of propagation, will be denoted with the subscript $c$. In addition to the original objects (e.g. Hilbert spaces, Hamiltonians, asymptotic observables, wave operators, etc.), we will consider two other families of associated objects:
Expanded objects, which correspond to the addition of non-physical bosons of negative energy to the system, and which will be denoted by adding a superscript $e$ to the corresponding original objects. Sometimes an object defined in the expanded framework has no counterpart in the original framework, in which case we will omit the superscript $e$.
Extended objects, which correspond to the addition of asymptotically free bosons, and which will be denoted with a subscript ext (with some exceptions).
3. Massless Pauli–Fierz Hamiltonians
We describe in this section an abstract framework introduced in [10] in which we will work for most of the paper. We also define the expanded objects, which correspond to adding bosons of negative energy to the system. Finally we prove various basic estimates which will be needed in the sequel.
3.1. The abstract setup
We describe now the abstract framework in which we will work for most of the paper. The models that we will introduce describe a small system (e.g. an atom or a spin) interacting with a scalar bosonic field. Using the terminology of [10] we can call this class of models massless Pauli–Fierz models.
The small system is described with a separable Hilbert space $\mathcal{K}$ and a bounded below selfadjoint operator $K$ on $\mathcal{K}$. Without loss of generality we will assume that $K$ is positive.
Let $\mathfrak{h} := L^2(\mathbb{R}^+, d\tilde\sigma) \otimes \mathfrak{g}$, where $\mathfrak{g}$ is some auxiliary separable Hilbert space, be the one-particle boson space. The Hilbert space of the interacting system is
\[ \mathcal{H} := \mathcal{K} \otimes \Gamma(\mathfrak{h}) . \]
The one-particle energy is the operator $\tilde\sigma$ of multiplication by $\tilde\sigma$ on $\mathfrak{h}$. The free Hamiltonian describing the non-interacting system is
\[ H_0 := K \otimes \mathbb{1}_{\Gamma(\mathfrak{h})} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(\tilde\sigma) . \]
The interaction is described by an operator $v \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$. Note that since $\mathcal{K}$ and $\mathfrak{g}$ are separable, we can consider $v$ as a function
\[ \mathbb{R}^+ \ni \tilde\sigma \mapsto v(\tilde\sigma) \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{g}) , \]
defined a.e. $\tilde\sigma$ by setting
\[ v(\tilde\sigma)\psi := (v\psi)(\tilde\sigma), \quad \psi \in \mathcal{K} , \]
and identifying $\mathcal{K} \otimes L^2(\mathbb{R}^+, d\tilde\sigma) \otimes \mathfrak{g}$ with $L^2(\mathbb{R}^+, d\tilde\sigma; \mathcal{K} \otimes \mathfrak{g})$. The Hamiltonian describing the interacting system is now
\[ H := H_0 + \phi(v) , \]
acting on $\mathcal{H}$, where $\phi(v)$ is defined in (2.1).
We will assume that
\[ (1 + \tilde\sigma^{-1/2})\, v \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h}) , \tag{I'0} \]
which implies by Proposition A.1 that $\phi(v)$ is $H_0$-bounded with infinitesimal bound and hence that $H$ is selfadjoint and bounded below on $D(H_0)$. In terms of the function $v(\tilde\sigma)$, (I'0) is equivalent to
\[ \int_0^{+\infty} (1 + |\tilde\sigma|^{-1})\, \|v(\tilde\sigma)\|^2 \, d\tilde\sigma < \infty . \]
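The link between the operator form of (I'0) and the integral condition rests on comparing the two weights. The following sketch shows only that comparison and the resulting implication from the integral condition to the operator bound; the norms inside the integrals are those of $\mathcal{K} \otimes \mathfrak{g}$ and $B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{g})$, and the pointwise bound $\|v(\tilde\sigma)\psi\| \le \|v(\tilde\sigma)\|\,\|\psi\|$ is the only other ingredient.

```latex
\begin{aligned}
% weight comparison, valid for every \tilde\sigma > 0
1 + \tilde\sigma^{-1} \;\le\; (1 + \tilde\sigma^{-1/2})^2
   &\;\le\; 2\,(1 + \tilde\sigma^{-1}) , \\
% consequence for psi in K
\|(1 + \tilde\sigma^{-1/2})\, v \psi\|^2
   &= \int_0^{+\infty} (1 + \tilde\sigma^{-1/2})^2\,
      \|v(\tilde\sigma)\psi\|^2 \, d\tilde\sigma \\
   &\le 2\, \|\psi\|^2 \int_0^{+\infty} (1 + \tilde\sigma^{-1})\,
      \|v(\tilde\sigma)\|^2 \, d\tilde\sigma .
\end{aligned}
```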
We will denote by $N$ the number operator on $\mathcal{H}$:
\[ N = \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(\mathbb{1}) . \]
We now explain how to cast the massless Nelson Hamiltonian into this framework. Let $H$ be a massless Nelson Hamiltonian as introduced in Sec. 1.1. On the one-particle space $L^2(\mathbb{R}^3, dk)$ we introduce polar coordinates by the unitary map:
\[ u : L^2(\mathbb{R}^3, dk) \to L^2(\mathbb{R}^+, d\tilde\sigma) \otimes \mathfrak{g}, \qquad u\psi(\tilde\sigma, \omega) := \tilde\sigma\, \psi(\tilde\sigma\omega) , \tag{3.1} \]
for $\mathfrak{g} = L^2(S^2)$. We lift the unitary map $u$ to a map $\mathbb{1}_{\mathcal{K}} \otimes \Gamma(u)$ from $\mathcal{K} \otimes \Gamma(L^2(\mathbb{R}^3, dk))$ into $\mathcal{K} \otimes \Gamma(\mathfrak{h})$, and the free Hamiltonian $K \otimes \mathbb{1} + \mathbb{1} \otimes d\Gamma(|k|)$ becomes $K \otimes \mathbb{1} + \mathbb{1} \otimes d\Gamma(\tilde\sigma)$. The interaction $\phi(v)$ becomes $\phi(uv)$, which we will still denote by $\phi(v)$. If we represent as in Sec. 1.1 $v \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h})$ by a function
\[ \mathbb{R}^3 \ni k \mapsto v(k) \in B(\mathcal{K}) \quad \text{a.e. } k , \]
then $uv$ is represented by the function
\[ v(\tilde\sigma) = \tilde\sigma\, v(\tilde\sigma\omega), \quad \text{a.e. } \tilde\sigma , \]
where $v(\tilde\sigma\omega)$ for fixed $\tilde\sigma$ is an element of $B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{g})$. The Pauli–Fierz Hamiltonian (still denoted by $H$) obtained in this way is said to be associated with the Nelson Hamiltonian $H$.
3.2. Existence of bound states
The existence of bound states of $H$ in the Hilbert space $\mathcal{H}$ is an important property of the Hamiltonian $H$. In particular it implies that the CCR representation given by the asymptotic fields constructed in Sec. 8.1 admits a Fock sub-representation. In this subsection we recall a result of [17] proving the existence of a ground state for $H$ under appropriate conditions on the interaction $v$. For related results see [5, 6, 4, 19, 26]. We introduce the following conditions:
\[ (K + i)^{-1} \ \text{is compact on } \mathcal{K} . \tag{H'0} \]
\[ (1 + \tilde\sigma^{-1})\, v\, (K + 1)^{-1/2} \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h}) . \tag{I'3} \]
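In terms of the function $v(\tilde\sigma)$, (I'3) is equivalent to
\[ \int_0^{+\infty} \Big(1 + \frac{1}{\tilde\sigma^2}\Big)\, \|v(\tilde\sigma)(K + 1)^{-1/2}\|^2_{B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{g})} \, d\tilde\sigma < \infty . \]
The following result is shown in [17, Theorem 1].
Theorem 3.1. Assume hypotheses (H'0), (I'0), (I'3). Then $\inf \mathrm{spec}(H)$ is an eigenvalue of $H$. In other words $H$ admits a ground state in $\mathcal{H}$.
The condition corresponding to (I'3) for the concrete Nelson Hamiltonian is (I3) introduced in Sec. 1.1. Hence we obtain:
Theorem 3.2. Assume hypotheses (H0) for $\alpha > 0$, (I0), (I3). Then $\inf \mathrm{spec}(H)$ is an eigenvalue of $H$. In other words $H$ admits a ground state in $\mathcal{H}$.
3.3. Expanded objects
We describe in this subsection the expanded objects, corresponding to the addition of non-physical bosons of negative energy. This idea appeared first in [13]. We use the notation in Sec. 3.1. Let
\[ \tilde{\mathcal{H}}^e := \mathcal{K} \otimes \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h}), \qquad \tilde H^e := H \otimes \mathbb{1}_{\Gamma(\mathfrak{h})} - \mathbb{1}_{\mathcal{K} \otimes \Gamma(\mathfrak{h})} \otimes d\Gamma(\tilde\sigma) , \]
acting on $\tilde{\mathcal{H}}^e$. As the sum of two commuting selfadjoint operators, $\tilde H^e$ is selfadjoint on its natural domain and essentially selfadjoint on $D(H) \otimes D(d\Gamma(\tilde\sigma))$. We set
\[ \mathfrak{h}^e := L^2(\mathbb{R}, d\sigma) \otimes \mathfrak{g}, \qquad \mathcal{H}^e := \mathcal{K} \otimes \Gamma(\mathfrak{h}^e) , \]
and consider the unitary map
\[ w : \mathfrak{h} \oplus \mathfrak{h} \to \mathfrak{h}^e, \qquad h_1 \oplus h_2 \mapsto h \ \text{ with } \ h(\sigma) := \begin{cases} h_1(\sigma), & \sigma \ge 0 , \\ h_2(-\sigma), & \sigma < 0 . \end{cases} \tag{3.2} \]
If $U : \Gamma(\mathfrak{h} \oplus \mathfrak{h}) \to \Gamma(\mathfrak{h}) \otimes \Gamma(\mathfrak{h})$ is the canonical map defined in Sec. 2.2, we set
\[ W : \tilde{\mathcal{H}}^e \to \mathcal{K} \otimes \Gamma(\mathfrak{h}^e) = \mathcal{H}^e, \qquad W := \mathbb{1}_{\mathcal{K}} \otimes \Gamma(w) U^{-1} . \]
We set also
\[ v^e := \mathbb{1}_{\mathcal{K}} \otimes w\, (v \oplus 0) \in B(\mathcal{K}, \mathcal{K} \otimes \mathfrak{h}^e) , \tag{3.3} \]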
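where $v \oplus 0$ is an element of $B(\mathcal{K}, \mathcal{K} \otimes (\mathfrak{h} \oplus \mathfrak{h}))$. In terms of operator-valued functions, we have
\[ v^e(\sigma) = v(\sigma)\, \mathbb{1}_{\{\sigma \ge 0\}} . \tag{3.4} \]
Note also that
\[ w\, (\tilde\sigma \oplus -\tilde\sigma)\, w^* = \sigma , \]
where $\sigma$ is the operator of multiplication by $\sigma$ on $\mathfrak{h}^e = L^2(\mathbb{R}, d\sigma) \otimes \mathfrak{g}$. Using the tensorial properties of $U$ (see e.g. [10, Sec. 2.7]), we obtain:
\[ W \tilde H^e W^* =: H^e , \]
where
\[ H^e = K \otimes \mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(\sigma) + \phi(v^e) . \]
On $\mathcal{H}^e$, we denote by $N^e = \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(\mathbb{1})$ the number operator and by
\[ H_0^e = K \otimes \mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(\sigma) \]
the 'free' expanded Hamiltonian.
3.4. Conversion of asymptotic observables
In this subsection we explain how to deduce results for the scattering theory of $H$ from corresponding results for the scattering theory of $H^e$. We start by describing the canonical embedding of $\mathcal{H}$ into $\mathcal{H}^e$. Let
\[ I_\Omega : \mathcal{H} \to \tilde{\mathcal{H}}^e = \mathcal{H} \otimes \Gamma(\mathfrak{h}), \qquad u \mapsto u \otimes \Omega , \]
where $\Omega \in \Gamma(\mathfrak{h})$ is the vacuum vector. We have
\[ I_\Omega^* I_\Omega = \mathbb{1}_{\mathcal{H}}, \qquad I_\Omega I_\Omega^* = \mathbb{1}_{\mathcal{H}} \otimes |\Omega\rangle\langle\Omega|, \qquad I_\Omega\, e^{-itH} = e^{-it\tilde H^e} I_\Omega . \]
If we set
\[ j : \mathfrak{h} \to \mathfrak{h}^e, \qquad h \mapsto \mathbb{1}_{\{\sigma \ge 0\}} h , \]
then $W I_\Omega = \mathbb{1}_{\mathcal{K}} \otimes \Gamma(j)$ is an isometry from $\mathcal{H}$ into $\mathcal{H}^e$ and
\[ W I_\Omega\, e^{-itH} = e^{-itH^e}\, W I_\Omega . \tag{3.5} \]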
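The intertwining relation (3.5) can be traced back to the fact that the vacuum is annihilated by $d\Gamma(\tilde\sigma)$. A short sketch, writing $\tilde H^e = H \otimes \mathbb{1} - \mathbb{1} \otimes d\Gamma(\tilde\sigma)$ for the operator on $\mathcal{H} \otimes \Gamma(\mathfrak{h})$ introduced in Sec. 3.3 and using that its two summands commute:

```latex
\begin{aligned}
% the group factorizes because the two summands commute
e^{-it\tilde H^e} &= e^{-itH} \otimes e^{\,it\, d\Gamma(\tilde\sigma)} , \\
% d\Gamma(\tilde\sigma)\Omega = 0, so the vacuum is invariant
e^{-it\tilde H^e} I_\Omega u
  &= e^{-itH}u \otimes e^{\,it\, d\Gamma(\tilde\sigma)}\Omega
   = e^{-itH}u \otimes \Omega
   = I_\Omega\, e^{-itH} u , \\
% conjugating with W and using W \tilde H^e W^* = H^e
W I_\Omega\, e^{-itH}
  &= W e^{-it\tilde H^e} I_\Omega
   = e^{-itH^e}\, W I_\Omega .
\end{aligned}
```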
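Let us now describe how to convert various asymptotic observables. Let $b \in B(\mathfrak{h}^e)$, $b = b^*$, be such that
\[ \mathbb{1}_{\{\sigma \le 0\}}\, b\, \mathbb{1}_{\{\sigma \ge 0\}} = 0 . \tag{3.6} \]
We then set $b_\pm := \mathbb{1}_{\{\pm\sigma \ge 0\}}\, b\, \mathbb{1}_{\{\pm\sigma \ge 0\}}$. Note that $b_+$ can be identified with $j^* b j \in B(\mathfrak{h})$.
Lemma 3.3. Let $b \in B(\mathfrak{h}^e)$, $b = b^*$ with $\mathbb{1}_{\{\sigma \le 0\}}\, b\, \mathbb{1}_{\{\sigma \ge 0\}} = 0$. Then
(i) $I_\Omega^* W^{-1} \Gamma(b) = \Gamma(b_+) I_\Omega^* W^{-1}$, $\quad \Gamma(b) W I_\Omega = W I_\Omega \Gamma(b_+)$, $\quad \Gamma(b_+) = I_\Omega^* W^{-1} \Gamma(b) W I_\Omega$.
(ii) $I_\Omega^* W^{-1} f(d\Gamma(b)) W I_\Omega = f(d\Gamma(b_+))$, $\quad f \in C_\infty(\mathbb{R})$.
Proof. Because of the hypothesis on $b$ we have $w^{-1} b w = b_+ \oplus b_-$. Hence
\[ W^{-1} \Gamma(b) W = U \Gamma(w^{-1} b w) U^{-1} = U \Gamma(b_+ \oplus b_-) U^{-1} = \Gamma(b_+) \otimes \Gamma(b_-) . \]
This easily implies (i). By the same argument
\[ I_\Omega^* W^{-1} e^{-it\, d\Gamma(b)} W I_\Omega = I_\Omega^* W^{-1} \Gamma(e^{-itb}) W I_\Omega = \Gamma(e^{-itb_+}) = e^{-it\, d\Gamma(b_+)} . \]
This proves (ii) for $f(\lambda) = e^{-it\lambda}$. By a density argument (ii) holds for all $f \in C_\infty(\mathbb{R})$.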
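The following proposition describes how to deduce existence of asymptotic observables for $H$ from corresponding results for $H^e$.
Proposition 3.4. Let $\mathbb{R} \ni t \mapsto b_t \in B(\mathfrak{h}^e)$, with $b_t = b_t^*$, $b_t \ge 0$, $\sup_{t \in \mathbb{R}} \|b_t\| < \infty$ and $\mathbb{1}_{\{\sigma \le 0\}}\, b_t\, \mathbb{1}_{\{\sigma \ge 0\}} = 0$. Let $b_{+t} = \mathbb{1}_{\{\sigma \ge 0\}}\, b_t\, \mathbb{1}_{\{\sigma \ge 0\}}$.
(I) Assume that
\[ \operatorname*{s-lim}_{t \to +\infty} e^{itH^e} \Gamma(b_t)\, e^{-itH^e} = \Gamma^{e+} \ \text{exists}, \qquad [H^e, \Gamma^{e+}] = 0 . \]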
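Then
(i) $\operatorname*{s-lim}_{t \to +\infty} e^{itH} \Gamma(b_{+t})\, e^{-itH} = \Gamma^{+}$ exists,
(ii) $[H, \Gamma^{+}] = 0$,
(iii) $\Gamma^{e+} W I_\Omega = W I_\Omega \Gamma^{+}$, $\quad \Gamma^{+} = I_\Omega^* W^{-1} \Gamma^{e+} W I_\Omega$,
(iv) $I_\Omega^* W^{-1} \Gamma^{e+} = \Gamma^{+} I_\Omega^* W^{-1}$.
(II) Assume that
\[ \operatorname*{s-lim}_{t \to +\infty} e^{itH^e} (d\Gamma(b_t) + \lambda)^{-1} e^{-itH^e} =: R^{e+}(\lambda) \ \text{exists for } \lambda \in \mathbb{C} \setminus \mathbb{R}^-, \qquad [H^e, R^{e+}(\lambda)] = 0 . \]
Then
(i) $\operatorname*{s-lim}_{t \to +\infty} e^{itH} (d\Gamma(b_{+t}) + \lambda)^{-1} e^{-itH} =: R^{+}(\lambda)$ exists for $\lambda \in \mathbb{C} \setminus \mathbb{R}^-$,
(ii) $R^{+}(\lambda) = I_\Omega^* W^{-1} R^{e+}(\lambda) W I_\Omega$,
(iii) $[H, R^{+}(\lambda)] = 0$.
(III) By Proposition A.7 the limits
\[ P^{e+} := \operatorname*{s-lim}_{\epsilon \to 0} \epsilon^{-1} R^{e+}(\epsilon^{-1}), \qquad P^{+} := \operatorname*{s-lim}_{\epsilon \to 0} \epsilon^{-1} R^{+}(\epsilon^{-1}) \]
exist and are orthogonal projections. Then
(i) $P^{+} = I_\Omega^* W^{-1} P^{e+} W I_\Omega$,
(ii) $P^{e+} = W\, (P^{+} \otimes \mathbb{1}_{\Gamma(\mathfrak{h})})\, W^{-1}$.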
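Proof. (I) follows from Lemma 3.3(i) and the identity $e^{-itH^e} W I_\Omega = W I_\Omega e^{-itH}$. (II) follows from exactly the same arguments, using Lemma 3.3(ii) instead. (III)(i) follows directly from (II)(ii). Proving (III)(ii) is equivalent to showing that
\[ W^{-1} P^{e+} W = P^{+} \otimes \mathbb{1}_{\Gamma(\mathfrak{h})} . \]
We have:
\[ W^{-1} P^{e+} W = \operatorname*{s-lim}_{\epsilon \to 0}\, \operatorname*{s-lim}_{t \to +\infty} e^{it\tilde H^e} \big(\mathbb{1} + \epsilon\, d\Gamma(b_{+t}) \otimes \mathbb{1} + \epsilon\, \mathbb{1} \otimes d\Gamma(b_{-t})\big)^{-1} e^{-it\tilde H^e} , \]
\[ P^{+} \otimes \mathbb{1}_{\Gamma(\mathfrak{h})} = \operatorname*{s-lim}_{\epsilon \to 0}\, \operatorname*{s-lim}_{t \to +\infty} e^{it\tilde H^e} \big(\mathbb{1} + \epsilon\, d\Gamma(b_{+t}) \otimes \mathbb{1}\big)^{-1} e^{-it\tilde H^e} . \]
Using that
\[ [\mathbb{1}_{\mathcal{H}} \otimes (N + 1)^{-1},\, e^{-it\tilde H^e}] = 0 , \]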
and
\[ \Big\| \Big( \big(\mathbb{1} + \epsilon\, d\Gamma(b_{+t}) \otimes \mathbb{1} + \epsilon\, \mathbb{1} \otimes d\Gamma(b_{-t})\big)^{-1} - \big(\mathbb{1} + \epsilon\, d\Gamma(b_{+t}) \otimes \mathbb{1}\big)^{-1} \Big)\, \mathbb{1} \otimes (N + 1)^{-1} \Big\| \le C\epsilon , \]
we obtain that
\[ \big(W^{-1} P^{e+} W - P^{+} \otimes \mathbb{1}_{\Gamma(\mathfrak{h})}\big)\, \mathbb{1}_{\mathcal{H}} \otimes (N + 1)^{-1} = 0 , \]
which proves (III)(ii).
3.5. Properties of the expanded Hamiltonian
We use the notation of Secs. 1.1 and 3.3. The main problem encountered when working with the Hamiltonian $H^e$ is that it is not bounded below. As a consequence we cannot use energy cutoffs $\chi(H^e)$ to control error terms in propagation estimates. To overcome this difficulty we will use the fact that $H^e$ commutes with other observables. For example $H^e$ commutes with the Hamiltonians
\[ H_+^e := K \otimes \mathbb{1} + \mathbb{1} \otimes d\Gamma(\sigma_+) + \phi(v^e) \]
and
\[ H_-^e := \mathbb{1} \otimes d\Gamma(\sigma_-) , \]
for $\sigma_\pm = \mathbb{1}_{\{\pm\sigma \ge 0\}}\, \sigma$. Note that $H^e = H_+^e + H_-^e$ and that $H_+^e$ is selfadjoint on $D(K \otimes \mathbb{1} + \mathbb{1} \otimes d\Gamma(\sigma_+))$, using (I'0). As a consequence $H^e$ commutes with the Hamiltonian
\[ L := H_+^e - H_-^e = K \otimes \mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(|\sigma|) + \phi(v^e) . \]
We deduce as in Sec. 3.1 from hypothesis (I'0) and (3.4) that $L$ is selfadjoint and bounded below on $D(L_0)$, for
\[ L_0 = K \otimes \mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}} \otimes d\Gamma(|\sigma|) . \]
It is easy to see that $D(L) = D(H_+^e) \cap D(H_-^e)$, that $H^e$ is essentially selfadjoint on $D(L)$ and that
\[ H^e = H_0^e + \phi(v^e) \quad \text{on } D(L) . \]
In the sequel propagation estimates for $H^e$ will contain cutoffs $\chi(L)$, which will be used to control error terms. For later use we collect below various basic properties of $L$.
Lemma 3.5. Assume (I'0). Then
(i) $(z - L)^{-1} \in C^1(N^e)$ for $z \in \mathbb{C} \setminus \sigma(L)$,
(ii) $(z - L)^{-1} N^e = N^e (z - L)^{-1} + i (z - L)^{-1} \phi(i v^e) (z - L)^{-1}$, for $z \in \mathbb{C} \setminus \sigma(L)$, as an identity on $D(N^e)$,
(iii) $\chi(L)$ preserves $D((N^e)^r)$ for $r \in \mathbb{R}^+$, $\chi \in C_0^\infty(\mathbb{R})$, and $(L + i)(N^e)^r \chi(L)(N^e + 1)^{-r}$ is bounded for $r \in \mathbb{R}^+$.
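Proof. We have
\[ L_s := e^{isN^e} L\, e^{-isN^e} = L_0 + \phi(e^{is} v^e) . \]
Since $\phi(e^{is} v^e)$ is $L_0^{1/2}$-bounded, we have $D(L_s) = D(L_0)$ and
\[ \|L_0 (L_s - z)^{-1}\| \le C |\mathrm{Im}\, z|^{-1}, \quad z \in K \Subset \mathbb{C} \setminus \mathbb{R} , \]
uniformly for $|s| \le 1$. Then
\[ s^{-1}\big((z - L_s)^{-1} - (z - L)^{-1}\big) = s^{-1} (z - L_s)^{-1} (L_s - L)(z - L)^{-1} = s^{-1} (z - L_s)^{-1} \phi\big((e^{is} - 1) v^e\big)(z - L)^{-1} . \]
Using Proposition A.1, we see that
\[ (L_0 + 1)^{-1/2} \phi\big(s^{-1}(e^{is} - 1) v^e\big)(L_0 + 1)^{-1/2} \to (L_0 + 1)^{-1/2} \phi(i v^e)(L_0 + 1)^{-1/2} \quad \text{in norm} , \]
and
\[ \big((z - L_s)^{-1} - (z - L)^{-1}\big)(L_0 + 1)^{1/2} \to 0 \quad \text{in norm} \]
when $s \to 0$. Hence
\[ s^{-1}\big((z - L_s)^{-1} - (z - L)^{-1}\big) \to (z - L)^{-1} \phi(i v^e)(z - L)^{-1} \quad \text{in norm} , \]
when $s \to 0$. This proves (i) and (ii). To prove (iii) we use the identity
\[ (N^e + 1)(z - L)^{-1}(N^e + 1)^{-1} = (z - L)^{-1} - i (z - L)^{-1} \phi(i v^e)(z - L)^{-1}(N^e + 1)^{-1} . \]
By induction, using the fact that $\mathrm{ad}^k_{N^e} \phi(v^e) = i^{-k} \phi(i^k v^e)$, we obtain that
\[ \|(L + i)(N^e + 1)^k (z - L)^{-1}(N^e + 1)^{-k}\| \in O(|\mathrm{Im}\, z|^{-C_k}), \quad z \in K \Subset \mathbb{C} \setminus \mathbb{R}, \quad k \in \mathbb{N} . \]
Using then the functional calculus formula (see e.g. [20, 9]):
\[ \chi(A) = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, (z - A)^{-1} \, dz \wedge d\bar z , \tag{3.7} \]
where $A$ is a selfadjoint operator and $\tilde\chi \in C_0^\infty(\mathbb{C})$ is an almost-analytic extension of $\chi$ satisfying
\[ \tilde\chi|_{\mathbb{R}} = \chi, \qquad |\partial_{\bar z} \tilde\chi(z)| \le C_n |\mathrm{Im}\, z|^n, \quad n \in \mathbb{N} , \]
we obtain (iii) for $r \in \mathbb{N}$. Then we extend the result to $r \in \mathbb{R}^+$ by interpolation.
Lemma 3.6. Assume (I'0). Let $b = b(\sigma)$ be a bounded real function supported in $\{|\sigma| \ge \epsilon_0\}$, $\epsilon_0 > 0$, and $B = d\Gamma(b)$. Then
(i) $(z - L)^{-1} \in C^1(B)$,
(ii) $(z - L)^{-1} B = B (z - L)^{-1} + i (z - L)^{-1} \phi(i b v^e)(z - L)^{-1}$, for $z \in \mathbb{C} \setminus \sigma(L)$, as an identity on $D(B)$,
(iii) $B^k (L + i)^{-k}$ is bounded for $k \in \mathbb{N}$.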
Proof. (i) and (ii) can be shown as in Lemma 3.5, introducing $L_s = e^{isB} L\, e^{-isB} = L_0 + \phi(e^{isb} v^e)$. Since $\mathrm{supp}\, b \subset \{|\sigma| \ge \epsilon_0\}$, $B(L + i)^{-1}$ is bounded, which proves (iii) for $k = 1$. To prove (iii) for arbitrary $k$, we commute repeatedly factors of $B$ through $(L + i)^{-1}$, using (ii), until each factor of $B$ is followed by a factor of $(L + i)^{-1}$. Commutation of $B$ with $(L + i)^{-1}$ produces an extra factor of $(L + i)^{-1} \phi(i b v^e)(L + i)^{-1}$. Moreover $\mathrm{ad}^k_B \phi(v^e) = i^{-k} \phi(i^k b^k v^e)$ is $L_0$-bounded. The details are left to the reader.
The Hamiltonians $H_\pm^e$ have similar properties.
Lemma 3.7. Assume (I'0). Then
(i) $(z - H_\pm^e)^{-1} \in C^1(N^e)$ for $z \in \mathbb{C} \setminus \sigma(H_\pm^e)$,
(ii) $(z - H_\pm^e)^{-1} N^e = N^e (z - H_\pm^e)^{-1} + i (z - H_\pm^e)^{-1} \phi(i v^e)(z - H_\pm^e)^{-1}$, for $z \in \mathbb{C} \setminus \sigma(H_\pm^e)$, as an identity on $D(N^e)$,
(iii) $\chi(H_\pm^e)$ preserves $D((N^e)^r)$ for $r \in \mathbb{R}^+$, $\chi \in C_0^\infty(\mathbb{R})$, and $(H_\pm^e + i)(N^e)^r \chi(H_\pm^e)(N^e + 1)^{-r}$ is bounded for $r \in \mathbb{R}^+$.
A consequence of Lemma 3.7 is
\[ e^{-itH^e} \chi(L) \ \text{preserves } D((N^e)^r), \quad \text{for } \chi \in C_0^\infty(\mathbb{R}), \ r \in \mathbb{R}^+ . \tag{3.8} \]
In fact we can write $e^{-itH^e} \chi(L)$ as $e^{-itH_+^e} \chi_1(H_+^e)\, e^{-itH_-^e} \chi_2(H_-^e)\, \chi(L)$, for some $\chi_1, \chi_2 \in C_0^\infty(\mathbb{R})$, and apply Lemma 3.7(iii). A consequence of Lemma 3.5 and (3.8) is
Proposition 3.8. Assume (I'0). Then
\[ \|(N^e + 1)^r e^{-itH^e} \chi(L)(N^e + 1)^{-r}\| \le C_r \langle t \rangle^r, \quad \chi \in C_0^\infty(\mathbb{R}), \quad r \in \mathbb{R}^+ . \tag{3.9} \]
Proof. We will prove the proposition for $r \in \mathbb{N}$ by induction and then argue by interpolation. Let $u_1 \in D(N^e)$, $u_2 \in D(L)$ and consider
\[ f(t) = (u_{2t}, N^e \chi(L) u_{1t}) \]
(note that $f(t)$ is finite by (3.8)). We have
\[ \begin{aligned} f'(t) &= i (H^e u_{2t}, N^e \chi(L) u_{1t}) - i (u_{2t}, N^e H^e \chi(L) u_{1t}) \\ &= i (H_0^e u_{2t}, N^e \chi(L) u_{1t}) - i (u_{2t}, N^e H_0^e \chi(L) u_{1t}) + i (\phi(v^e) u_{2t}, N^e \chi(L) u_{1t}) - i (u_{2t}, N^e \phi(v^e) \chi(L) u_{1t}) \\ &= -(u_{2t}, \phi(i v^e) \chi(L) u_{1t}) . \end{aligned} \]
By Proposition A.1, we obtain
\[ |f'(t)| \le C \|u_2\| \, \|u_1\| , \]
which proves (3.9) for $r = 1$.
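The passage from the bound on $f'(t)$ to the case $r = 1$ of (3.9) can be spelled out as follows. This is a sketch under the assumptions that $u_{jt}$ stands for $e^{-itH^e} u_j$, that $\langle t \rangle = (1 + t^2)^{1/2}$, and that $N^e \chi(L)(N^e + 1)^{-1}$ is bounded by Lemma 3.5(iii); the constants $C, C'$ are not tracked.

```latex
\begin{aligned}
% integrate the derivative bound from 0 to t
|f(t)| &\le |f(0)| + \int_0^{|t|} |f'(s)| \, ds
   \;\le\; \|u_2\| \, \|N^e \chi(L) u_1\| + C\,|t|\, \|u_2\| \, \|u_1\| \\
   &\le C' \langle t \rangle \, \|u_2\| \, \|(N^e + 1) u_1\| , \\
% u_2 ranges over the dense set D(L); e^{itH^e} is unitary
\|N^e \chi(L)\, e^{-itH^e} u_1\|
   &= \sup_{\|u_2\| = 1} |f(t)|
   \;\le\; C' \langle t \rangle \, \|(N^e + 1) u_1\| .
\end{aligned}
```

Since $\chi(L)$ commutes with $e^{-itH^e}$, the last line bounds $N^e e^{-itH^e} \chi(L)(N^e + 1)^{-1}$ by $C' \langle t \rangle$, and the remaining term $e^{-itH^e} \chi(L)(N^e + 1)^{-1}$ is uniformly bounded.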
Assume now that (3.9) holds for all $r' < r$. Let $u_1 \in D((N^e)^r)$, $u_2 \in D(L)$. Again we differentiate
\[ f(t) = (u_{2t}, (N^e)^r \chi(L) u_{1t}) , \]
and obtain
\[ f'(t) = (u_{2t}, [\phi(v^e), i (N^e)^r] u_{1t}) . \]
The commutator $[\phi(v^e), i (N^e)^r]$ can be written as a sum of terms of the form $\phi(i^\alpha v^e)(N^e)^\beta$ for $\beta \le r - 1$. We write
\[ \phi(i^\alpha v^e)(N^e)^\beta \chi(L) = \phi(i^\alpha v^e)(L + i)^{-1}\, (L + i)(N^e)^\beta \chi_1(L)(N^e + 1)^{-\beta}\, (N^e + 1)^\beta \chi(L) , \]
for $\chi_1 \chi = \chi$. By Proposition A.1 and Lemma 3.5(iii), we obtain
\[ |f'(t)| \le C \|u_2\| \, \|(N^e + 1)^{r-1} \chi(L) u_{1t}\| \le C \langle t \rangle^{r-1} \|u_2\| \, \|(N^e + 1)^{r-1} u_1\| , \]
by the induction hypothesis. This proves (3.9) for $r$.
Finally we state a lemma analogous to Lemma 3.5 for the Hamiltonian $H$.
Lemma 3.9. Assume (I'0). Then
(i) $(z - H)^{-1} \in C^1(N)$ for $z \in \mathbb{C} \setminus \sigma(H)$,
(ii) $(z - H)^{-1} N = N (z - H)^{-1} + i (z - H)^{-1} \phi(i v)(z - H)^{-1}$, for $z \in \mathbb{C} \setminus \sigma(H)$, as an identity on $D(N)$,
(iii) $\chi(H)$ preserves $D(N^r)$ for $r \in \mathbb{R}^+$, $\chi \in C_0^\infty(\mathbb{R})$, and $(H + i) N^r \chi(H)(N + 1)^{-r}$ is bounded for $r \in \mathbb{R}^+$.
The proof is completely similar to that of Lemma 3.5.
3.6. Bounds on field operators
Lemma 3.10. Assume (I'0) and let $h_i \in D(\tilde\sigma^{1/2} + \tilde\sigma^{-1/2})$, $1 \le i \le n$. Then
\[ \Big\| \prod_{1}^{n} \phi(h_i)\, (H + b)^{-n/2} \Big\| \le C_n \prod_{1}^{n} \|(1 + \tilde\sigma^{1/2} + \tilde\sigma^{-1/2}) h_i\| . \]
Lemma 3.11. Assume (I'0) and let $h_i \in D(|\sigma|^{1/2} + |\sigma|^{-1/2})$, $1 \le i \le n$. Then:
\[ \Big\| \prod_{1}^{n} \phi(h_i)\, (L + b)^{-n/2} \Big\| \le C_n \prod_{1}^{n} \|(1 + |\sigma|^{1/2} + |\sigma|^{-1/2}) h_i\| . \]
The proofs of Lemmas 3.10 and 3.11 being completely similar, we prove only Lemma 3.11.
Proof. Let first $B \ge 1$, $A$ be two selfadjoint operators with $D(B^{1/2}) \subset D(A)$. Then
\[ A B^{-1/2} = \pi^{-1} \int_0^\infty s^{-1/2} A (s + B)^{-1} \, ds , \tag{3.10} \]
where the integral is norm convergent on $D(B^\epsilon)$ for any $\epsilon > 0$. As bounded operators on $\mathcal{H}^e$, we have:
\[ A (s + B)^{-1} = (s + B)^{-1} A - (s + B)^{-1} [A, B] (s + B)^{-1} . \tag{3.11} \]
If we assume that $[A, B]$ extends from a bounded quadratic form on $D(B)$ to a bounded quadratic form on $D(B^{1/2})$, we deduce from (3.10), (3.11) that
\[ [A, B^{-1/2}] = -\pi^{-1} \int_0^\infty s^{-1/2} (s + B)^{-1} [A, B] (s + B)^{-1} \, ds \tag{3.12} \]
satisfies
\[ \|[A, B^{-1/2}]\| \le C \|B^{-1/2} [A, B] B^{-1/2}\| . \tag{3.13} \]
For $B = L + b$, $A = \phi(h)$, $h \in D(|\sigma|^{1/2})$, we have
\[ [A, B] = -i \phi(i |\sigma| h) + i\, \mathrm{Im}(h, v^e)_{\mathfrak{h}^e} . \]
By Proposition A.1 and (I'0), we obtain
\[ \|B^{-1/2} [A, B] B^{-1/2}\| \le C \|\langle\sigma\rangle^{1/2} h\| , \]
and hence using (3.13)
\[ \|[\phi(h), (L + b)^{-1/2}]\| \le C \|\langle\sigma\rangle^{1/2} h\| . \tag{3.14} \]
Similarly we have
\[ \|\mathrm{ad}_{\phi(h_1)} \mathrm{ad}_{\phi(h_2)} L\| \le C \|\langle\sigma\rangle^{1/2} h_1\| \, \|\langle\sigma\rangle^{1/2} h_2\| , \tag{3.15} \]
and
\[ \mathrm{ad}_{\phi(h_1)} \cdots \mathrm{ad}_{\phi(h_l)} L = 0, \quad \text{for } l \ge 3 . \tag{3.16} \]
We deduce easily from the identity (3.12) that
\[ \|\mathrm{ad}_{\phi(h_1)} \cdots \mathrm{ad}_{\phi(h_l)} (L + b)^{-1/2}\| \le C_l \prod_{1}^{l} \|\langle\sigma\rangle^{1/2} h_i\| . \tag{3.17} \]
Let us now prove the lemma. We consider more generally products of factors of
\[ \phi(h_1), \qquad \mathrm{ad}_{\phi(h_1)} \cdots \mathrm{ad}_{\phi(h_l)} R \qquad \text{and} \qquad R , \]
for $R = (L + b)^{-1/2}$. If a product $P$ contains $n$ factors of $\phi(h_i)$ (for different $i$) and $p$ factors of $R$, we define its degree $d(P)$ to be equal to $n$ and its weight $w(P)$ to be equal to $n - p$. Note that $d(P_1 P_2) = d(P_1) + d(P_2)$, $w(P_1 P_2) = w(P_1) + w(P_2)$. We claim that a product $P$ of zero weight is a bounded operator, which in particular implies the lemma. The claim is clearly true in two cases: if the degree of $P$ is zero, and if each factor of $\phi(h_i)$ in $P$ is followed by a factor of $R$ and the weight of $P$ is zero. In this last case we say that $P$ is controlled. Commuting $\phi(h)$ with a factor $R$ produces an extra term $\mathrm{ad}_{\phi(h)} R$ of zero weight, and so does commuting $\phi(h)$ with a factor of $\mathrm{ad}_{\phi(h_1)} \cdots \mathrm{ad}_{\phi(h_l)} R$. Hence we can move around the factors of $\phi(h_i)$ in a product $P$ of zero weight until we get a controlled product of zero weight, producing error terms of zero weight and strictly lower
degrees. Iterating this procedure, we see that $P$ is a bounded operator. The fact that
\[ \Big\| \prod_{1}^{n} \phi(h_i)\, (L + b)^{-n/2} \Big\| \le C_n \prod_{1}^{n} \|(1 + |\sigma|^{1/2} + |\sigma|^{-1/2}) h_i\| \]
then follows from (3.17) and Proposition A.1.
4. Number Estimates
In this section we prove some bounds on the growth of the number observable along the evolution which take into account the infrared behavior of the interaction. We consider abstract Pauli–Fierz Hamiltonians as introduced in Sec. 3.1.
The estimates in Secs. 4.1 and 4.2 show that if the interaction behaves for small $k$ like $|k|^{-1+\epsilon_0}$ for $\epsilon_0 > 0$ (see hypothesis (I'1) below and the discussion in Sec. 1.4), the total number of particles (both for $H$ and $H^e$) is bounded by $|t|^\delta$ for all $\delta > (1 + \epsilon_0)^{-1}$. As explained in Sec. 3.5, propagation estimates shown in Secs. 5 and 6 will contain cutoffs $\chi(L)$. The estimates in Sec. 4.2 will be used for $H^e$ to bound commutators between $\chi(L)$ and second-quantized observables based on the operator
\[ s = i \frac{\partial}{\partial\sigma} . \]
In Secs. 4.3 and 4.4, we prove that for large times no particles are found with momentum smaller than $t^{-\delta}$ for $\delta > \epsilon_0^{-1}$. This fact will be used in Sec. 11 to reformulate geometric asymptotic completeness for $H^e$ in terms of the observable |s|0 introduced in Sec. 10.2. Finally Sec. 4.5 contains rather easy estimates on the 'angular part' of $|x|$, needed for the final description of geometric asymptotic completeness for $H$.
We introduce the following strengthened version of (I'0):
\[ \int_0^\infty (1 + |\tilde\sigma|^{-1-2\epsilon_0})\, \|v^*(\tilde\sigma) v(\tilde\sigma)\|_{B(\mathcal{K})} \, d\tilde\sigma < \infty, \quad \epsilon_0 > 0 . \tag{I'1} \]
Note that (I'1) implies that
\[ \int_{|\tilde\sigma| \le r} (1 + |\tilde\sigma|^{-1})\, \|v^*(\tilde\sigma) v(\tilde\sigma)\|_{B(\mathcal{K})} \, d\tilde\sigma \le C r^{2\epsilon_0}, \quad 0 < r \le 1 . \tag{4.1} \]
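The step from (I'1) to (4.1) is a pointwise comparison of weights on the region $|\tilde\sigma| \le r \le 1$. A short sketch, in which $\epsilon_0 > 0$ is the infrared exponent of (I'1) and the constant $2$ is not claimed to be optimal:

```latex
\begin{aligned}
% valid for 0 < |\tilde\sigma| <= r <= 1
1 + |\tilde\sigma|^{-1}
  &\le 2\, |\tilde\sigma|^{-1}
   = 2\, |\tilde\sigma|^{2\epsilon_0}\, |\tilde\sigma|^{-1-2\epsilon_0}
   \le 2\, r^{2\epsilon_0}\, \big(1 + |\tilde\sigma|^{-1-2\epsilon_0}\big) , \\
% integrate against the norm of v^* v over |\tilde\sigma| <= r
\int_{|\tilde\sigma| \le r} (1 + |\tilde\sigma|^{-1})\,
   \|v^*(\tilde\sigma) v(\tilde\sigma)\|_{B(\mathcal{K})} \, d\tilde\sigma
  &\le 2\, r^{2\epsilon_0} \int_0^\infty
   \big(1 + |\tilde\sigma|^{-1-2\epsilon_0}\big)\,
   \|v^*(\tilde\sigma) v(\tilde\sigma)\|_{B(\mathcal{K})} \, d\tilde\sigma .
\end{aligned}
```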
For the Nelson Hamiltonians, the condition (I1) in Sec. 1.1 implies that the associated Pauli–Fierz Hamiltonian satisfies (I'1).
4.1. Case of $H$
We consider first the Hamiltonian $H$ introduced in Sec. 3.1. In Proposition 4.3 below we show that under hypothesis (I'1) the number operator grows at most like $t^{(1+\epsilon_0)^{-1}}$ along the evolution. In the sequel we will use Proposition 4.2 which contains essentially the same information.
Let $f \in C_0^\infty(\mathbb{R})$ be an even function with $f(\lambda) \equiv 1$ near $0$, $0 \le f \le 1$, $\lambda f'(\lambda) \le 0$. Let
\[ r_t := f(t^{\rho_0} \tilde\sigma), \qquad N_t := d\Gamma(r_t), \quad \text{for } \rho_0 = (1 + \epsilon_0)^{-1} . \]
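Lemma 4.1. Let $\chi \in C_0^\infty(\mathbb{R})$, $F \in S^0(\mathbb{R})$. Then
\[ \Big[ \chi(H), F\Big(\frac{N_t}{t^\delta}\Big) \Big] \in O(t^{-\delta - \rho_0 \epsilon_0}) . \]
Proof. Using formula (3.7), we write
\[ \Big[ \chi(H), F\Big(\frac{N_t}{t^\delta}\Big) \Big] = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, (z - H)^{-1} \Big[ H, F\Big(\frac{N_t}{t^\delta}\Big) \Big] (z - H)^{-1} \, dz \wedge d\bar z . \]
We have $[H, F(\frac{N_t}{t^\delta})] = [\phi(v), F(\frac{N_t}{t^\delta})]$. To estimate this term, we use a commutator expansion lemma (see e.g. [9, Lemma C.3.1]). We have $\mathrm{ad}^j_{N_t} \phi(v) = (-i)^j \phi(i^j r_t^j v)$. This is an unbounded operator, but the remainder terms in the commutator expansion can be estimated using
\[ \|\phi(i^j r_t^j v)(H_0 + 1)^{-1/2}\| \le C_j \Big( \int (1 + |\tilde\sigma|^{-1})\, |r_t(\tilde\sigma)|\, \|v^*(\tilde\sigma) v(\tilde\sigma)\|_{B(\mathcal{K})} \, d\tilde\sigma \Big)^{1/2} \le C t^{-\rho_0 \epsilon_0} , \tag{4.2} \]
by (4.1), and using the fact that $H_0$ commutes with $N_t$. We obtain that
\[ \Big[ \phi(v), F\Big(\frac{N_t}{t^\delta}\Big) \Big] (H_0 + i)^{-1} \in O(t^{-\delta - \rho_0 \epsilon_0}) . \tag{4.3} \]
This implies the lemma by the standard argument, using the properties of $\tilde\chi$.
Proposition 4.2. Assume (I'0), (I'1) and let $\delta > (1 + \epsilon_0)^{-1}$. Then
(i) For $G \in C_0^\infty(]0, +\infty[)$, $\chi \in C_0^\infty(\mathbb{R})$ we have:
\[ \int_1^\infty \Big\| G\Big(\frac{N_t}{t^\delta}\Big) \chi(H)\, e^{-itH} u \Big\|^2 \frac{dt}{t} \le C \|u\|^2, \quad u \in D(N) . \]
(ii) For $F \in C_0^\infty(\mathbb{R})$, $0 \le F \le 1$, $F(s) \equiv 1$ near $0$,
\[ \operatorname*{s-lim}_{t \to +\infty} e^{itH} F\Big(\frac{N_t}{t^\delta}\Big) e^{-itH} = \mathbb{1} . \]
Proof. We pick a function $F(\lambda) \in C_0^\infty(\mathbb{R})$, with $\mathrm{supp}\, F \subset\, ]0, +\infty[$, $F'(\lambda) = G^2(\lambda)$ for $G \in C_0^\infty(]0, +\infty[)$. For $\chi \in C_0^\infty(\mathbb{R})$, we set
\[ \Phi(t) = \chi(H) F\Big(\frac{N_t}{t^\delta}\Big) \chi(H) . \]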
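Note that by Lemma 3.9 $e^{-itH} \chi(H)$ preserves $D(N)$. We compute the Heisenberg derivative of $\Phi(t)$ as a quadratic form on $D(N)$:
\[ \begin{aligned} D\Phi(t) = {} & -\delta\, \chi(H) F'\Big(\frac{N_t}{t^\delta}\Big) \frac{N_t}{t^{\delta+1}} \chi(H) + \frac{1}{t^\delta}\, \chi(H) F'\Big(\frac{N_t}{t^\delta}\Big) d\Gamma(d_t)\, \chi(H) \\ & + \chi(H) \Big[ \phi(v), i F\Big(\frac{N_t}{t^\delta}\Big) \Big] \chi(H) , \end{aligned} \tag{4.4} \]
for
\[ d_t = d_0 r_t = \rho_0 t^{\rho_0 - 1} \tilde\sigma f'(t^{\rho_0} \tilde\sigma) \le 0 . \]
To estimate $[\phi(v), i F(\frac{N_t}{t^\delta})]$, we use the commutator expansion lemma as in the proof of Lemma 4.1. We obtain using (4.3):
\[ \Big\| \chi(H) \Big[ \phi(v), i F\Big(\frac{N_t}{t^\delta}\Big) \Big] \chi(H) \Big\| \le C \Big\| \Big[ \phi(v), i F\Big(\frac{N_t}{t^\delta}\Big) \Big] (H_0 + i)^{-1} \Big\| \in O(t^{-\delta - \rho_0 \epsilon_0}) . \tag{4.5} \]
Plugging (4.5) into (4.4), we obtain
\[ D\Phi(t) \le -\frac{\delta}{t}\, \chi(H) (\lambda F')\Big(\frac{N_t}{t^\delta}\Big) \chi(H) + O(t^{-\delta - \rho_0 \epsilon_0}) . \]
We pick $\delta > (1 + \epsilon_0)^{-1}$ so that $\delta + \rho_0 \epsilon_0 > 1$. By Proposition A.3 this proves (i).
Let us now prove (ii). Let $u = \chi(H) v$, $\chi \in C_0^\infty(\mathbb{R})$, $v \in D(N)$. By Lemma 3.9 $u_t \in D(H) \cap D(N)$. We have
\[ \begin{aligned} \partial_t (u_t, N_t u_t) &= (u_t, d\Gamma(d_t) u_t) - (u_t, \phi(i r_t v) u_t) \\ &\le C_0 \|(H + b)^{1/2} u\|^2 \, \|\phi(i r_t v)(H_0 + 1)^{-1/2}\| \\ &\le C_0\, t^{-\rho_0 \epsilon_0} \|(H + b)^{1/2} u\|^2 , \end{aligned} \]
using (4.2) and the fact that $d_t \le 0$. Integrating from $1$ to $t$ we obtain
\[ (u_t, N_t u_t) \le C_0\, t^{1 - \rho_0 \epsilon_0} \|(H + b)^{1/2} u\|^2 + C_1 \|(N + 1)^{1/2} u\|^2 . \tag{4.6} \]
Hence for $\delta > (1 + \epsilon_0)^{-1}$, $F \in C^\infty(]0, +\infty[)$, $F$ bounded, we have:
\[ \Big( u_t, F\Big(\frac{N_t}{t^\delta}\Big) u_t \Big) \le C_1 t^{-\delta} (u_t, N_t u_t) \in o(1) . \]
By a density argument this proves (ii).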
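The exponent bookkeeping in the proof of Proposition 4.2 reduces to elementary arithmetic with $\rho_0 = (1 + \epsilon_0)^{-1}$. The following sketch records it; the explicit constant $1/\rho_0$ in the last line comes from this computation and is not taken from the text.

```latex
\begin{aligned}
% the basic identity between rho_0 and epsilon_0
\rho_0 \epsilon_0 &= \frac{\epsilon_0}{1 + \epsilon_0} = 1 - \rho_0 , \\
% integrability of the error term in part (i)
\delta + \rho_0 \epsilon_0 > 1
  &\iff \delta > \rho_0 = (1 + \epsilon_0)^{-1} , \\
% time integral of t^{-rho_0 epsilon_0} used in part (ii)
\int_1^t s^{-\rho_0 \epsilon_0} \, ds
  &= \frac{t^{\,1 - \rho_0 \epsilon_0} - 1}{1 - \rho_0 \epsilon_0}
   \le \frac{t^{\rho_0}}{\rho_0} ,
  \qquad t^{-\delta}\, t^{\rho_0} \to 0 \ \text{ as } t \to +\infty
  \ \text{ when } \delta > \rho_0 .
\end{aligned}
```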
Let us state the following corollary of Proposition 4.2, which will not be used in the sequel:
Proposition 4.3. Assume (I'0), (I'1) and let $\delta > (1 + \epsilon_0)^{-1}$. Then for $F \in C_0^\infty(\mathbb{R})$, $0 \le F \le 1$, $F(s) \equiv 1$ near $0$:
\[ \operatorname*{s-lim}_{t \to +\infty} e^{itH} F\Big(\frac{N}{t^\delta}\Big) e^{-itH} = \mathbb{1} . \]
Proof. As above we write for $u \in D(H) \cap D(N)$:
\[ (u_t, N u_t) = (u_t, N_t u_t) + (u_t, (N - N_t) u_t) . \]
We have
\[ N - N_t = d\Gamma((1 - r_t)) \le t^{\rho_0} (H_0 + 1) . \tag{4.7} \]
Since $\rho_0 = 1 - \rho_0 \epsilon_0$, we deduce from (4.6) that
\[ (u_t, N u_t) \le C_0\, t^{1 - \rho_0 \epsilon_0} \|(H + b)^{1/2} u\|^2 + C \|(N + 1)^{1/2} u\|^2 . \tag{4.8} \]
Then we argue as in the proof of Proposition 4.2(ii).
4.2. Case of $H^e$
The results of Sec. 4.1 extend trivially to the case of the expanded Hamiltonian $H^e$. We set again
\[ r_t = f(t^{\rho_0} \sigma), \qquad N_t^e := d\Gamma(r_t) . \tag{4.9} \]
We observe that if $W$ is the unitary map introduced in Sec. 3.3 then, using the fact that $r_t$ is even, we have
\[ W^* H^e W = H \otimes \mathbb{1} - \mathbb{1} \otimes d\Gamma(\tilde\sigma), \qquad W^* L W = H \otimes \mathbb{1} + \mathbb{1} \otimes d\Gamma(\tilde\sigma), \qquad W^* N_t^e W = N_t \otimes \mathbb{1} + \mathbb{1} \otimes N_t . \]
This allows one to deduce directly the results of this subsection from those of Sec. 4.1. The details are left to the reader. The analog of Lemma 4.1 is
Lemma 4.4. Assume (I'0), (I'1). Let $\chi, \chi_1 \in C_0^\infty(\mathbb{R})$, $F \in S^0(\mathbb{R})$. Then
\[ \Big[ \chi(L), F\Big(\frac{N_t^e}{t^\delta}\Big) \Big], \quad \Big[ \chi(H^e), F\Big(\frac{N_t^e}{t^\delta}\Big) \Big] \chi_1(L) \ \in O(t^{-\delta - \rho_0 \epsilon_0}) . \]
Proposition 4.5. Assume (I'0), (I'1) and let $\delta > (1 + \epsilon_0)^{-1}$. Then
(i) for $G \in C_0^\infty(]0, +\infty[)$, $\chi \in C_0^\infty(\mathbb{R})$ we have:
\[ \int_1^{+\infty} \Big\| G\Big(\frac{N_t^e}{t^\delta}\Big) \chi(L)\, e^{-itH^e} u \Big\|^2 \frac{dt}{t} \le C \|u\|^2, \quad u \in D(N^e) . \]
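(ii) For $F \in C_0^\infty(\mathbb{R})$, $0 \le F \le 1$, $F(s) \equiv 1$ near $0$,
\[ \operatorname*{s-lim}_{t \to +\infty} e^{itH^e} F\Big(\frac{N_t^e}{t^\delta}\Big) e^{-itH^e} = \mathbb{1} . \]
The following lemma will be used in later sections to control the number operator along the evolution.
Lemma 4.6. Let $N_t^e$ be the operator introduced in (4.9). Let $F \in C_0^\infty(\mathbb{R})$. Then for $\delta > (1 + \epsilon_0)^{-1}$:
\[ (N^e)^\alpha F\Big(\frac{N_t^e}{t^\delta}\Big) (L + b)^{-\alpha} \in O(t^{\delta\alpha}), \quad 0 \le \alpha \le 1 . \]
Proof. By interpolation it suffices to consider the case $\alpha = 1$. By Lemma A.2, we deduce from $(1 - r_t)^2 \le t^{2\rho_0} |\sigma|^2$ that
\[ d\Gamma(1 - r_t)^2 \le t^{2\rho_0} (L_0 + b)^2 \le C t^{2\rho_0} (L + b)^2 , \]
since $(L + b)^{-1}(L_0 + b)$ is bounded. Using that $N^e = N_t^e + d\Gamma((1 - r_t))$ this yields:
\[ \begin{aligned} (L + b)^{-1} F\Big(\frac{N_t^e}{t^\delta}\Big) (N^e)^2 F\Big(\frac{N_t^e}{t^\delta}\Big) (L + b)^{-1} \le {} & 2\, (L + b)^{-1} F\Big(\frac{N_t^e}{t^\delta}\Big) (N_t^e)^2 F\Big(\frac{N_t^e}{t^\delta}\Big) (L + b)^{-1} \\ & + 2\, (L + b)^{-1} F\Big(\frac{N_t^e}{t^\delta}\Big) d\Gamma(1 - r_t)^2 F\Big(\frac{N_t^e}{t^\delta}\Big) (L + b)^{-1} \\ \le {} & C t^{2\delta} + C t^{2\rho_0} . \end{aligned} \]
4.3. Sharper estimates for $H$
In this subsection, we prove sharper estimates on the localization of bosons of small momenta. We pick a cutoff function $g \in C^\infty(\mathbb{R})$ with
\[ g(s) = 0 \ \text{for } |s| \le \tfrac{1}{2}, \qquad g(s) = 1 \ \text{for } |s| \ge 1, \qquad s\, g'(s) \ge 0 . \tag{4.10} \]
We set
\[ g^t := g(t^\delta R \tilde\sigma) , \]
for an exponent $\delta > 0$ and a constant $R \ge 1$.
Lemma 4.7. Assume (I'0), (I'1). Then for $\chi \in C_0^\infty(\mathbb{R})$:
\[ [\chi(H), \Gamma(g^t)] \in O(t^{-\delta \epsilon_0}) . \tag{4.11} \]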
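Proof. We write
\[ [\chi(H), \Gamma(g^t)] = \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, (z - H)^{-1} [H, \Gamma(g^t)] (z - H)^{-1} \, dz \wedge d\bar z , \]
where $\tilde\chi$ is an almost-analytic extension of $\chi$. On $D(H)$, $H = H_0 + \phi(v)$, and
\[ [H_0, \Gamma(g^t)] = 0, \qquad [\phi(v), \Gamma(g^t)] = \frac{1}{\sqrt 2}\, a^*((1 - g^t) v)\, \Gamma(g^t) - \frac{1}{\sqrt 2}\, \Gamma(g^t)\, a((1 - g^t) v) . \]
By Proposition A.1, we obtain:
\[ \|(H_0 + 1)^{-1/2} [\phi(v), \Gamma(g^t)] (H_0 + 1)^{-1/2}\| \le C \|(1 - g^t) \tilde\sigma^{-1/2} v\| , \tag{4.12} \]
and hence
\[ \|(z - H)^{-1} [H, \Gamma(g^t)] (z - H)^{-1}\| \le C \|(1 - g^t) \tilde\sigma^{-1/2} v\| \, |\mathrm{Im}\, z|^{-2}, \quad z \in \mathrm{supp}\, \tilde\chi . \]
By (4.1) we have
\[ \|(1 - g^t) \tilde\sigma^{-1/2} v\| \in O(R^{-\epsilon_0} t^{-\delta \epsilon_0}) . \]
This implies the lemma.
Proposition 4.8. Assume hypotheses (I'0), (I'1) and $\delta \epsilon_0 > 1$. Then for $\chi \in C_0^\infty(\mathbb{R})$:
(i) $\displaystyle \int_1^{+\infty} \|d\Gamma(g^t, d_0 g^t)^{1/2} \chi(H)\, e^{-itH} u\|^2 \, dt \le C \|u\|^2$, $\quad u \in D(N)$,
(ii) $\operatorname*{s-lim}_{t \to +\infty} e^{itH} \Gamma(g^t)\, e^{-itH} =: \Gamma^+(g, R)$ exists,
(iii) $[\Gamma^+(g, R), H] = 0$.
Proof. Let $\Phi(t) = \chi(H) \Gamma(g^t) \chi(H)$. By Lemma 3.9 $\chi(H) e^{-itH}$ preserves $D(N)$ and for $u \in D(N)$ the function $\mathbb{R} \ni t \mapsto (u_t, \Phi(t) u_t)$ is $C^1$ with derivative $(u_t, \chi(H) D\Gamma(g^t) \chi(H) u_t)$, where the Heisenberg derivative $D\Gamma(g^t)$ equals
\[ D\Gamma(g^t) = d\Gamma(g^t, d_0 g^t) + [\phi(v), i \Gamma(g^t)] . \]
We have
\[ d_0 g^t = R \delta t^{\delta - 1} \tilde\sigma\, g'(t^\delta R \tilde\sigma) \ge 0 , \]
and hence $d\Gamma(g^t, d_0 g^t) \ge 0$. Next by (4.12), we have
\[ \chi(H) [\phi(v), i \Gamma(g^t)] \chi(H) \in O(t^{-\delta \epsilon_0}) . \]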
Hence if $\delta \epsilon_0 > 1$ we obtain (i) by Proposition A.3 with $D = D(N)$. To prove (ii) we write for $\chi \in C_0^\infty(\mathbb{R})$
\[ e^{itH} \Gamma(g^t)\, e^{-itH} \chi^2(H) u = e^{itH} \chi(H) \Gamma(g^t) \chi(H)\, e^{-itH} u + o(1) , \]
by Lemma 4.7 and argue by density. (iii) follows similarly from Lemma 4.7.
Theorem 4.9. Assume hypotheses (I'0), (I'1) and $\delta \epsilon_0 > 1$. Then $\Gamma^+(g, 1) = \mathbb{1}$, i.e.:
\[ e^{-itH} u = \Gamma(g(t^\delta \tilde\sigma))\, e^{-itH} u + o(1), \quad u \in \mathcal{H} . \]
Theorem 4.9 means that for large times no particles are found with momentum smaller than $t^{-\delta}$ for $\delta > \epsilon_0^{-1}$, while Proposition 4.3 means that for large times the number of particles with momentum smaller than $t^{-\delta}$ for $\delta > (1 + \epsilon_0)^{-1}$ is less than $t^\delta$.
Proof. We claim first that
\[ \operatorname*{w-lim}_{R \to +\infty} \Gamma^+(g, R) = \mathbb{1} . \tag{4.13} \]
To prove (4.13) we will apply Proposition A.6. For $u \in D(N)$, we have:
\[ \partial_t (u_t, \chi(H) \Gamma(g^t) \chi(H) u_t) = (u_t, \chi(H)\, d\Gamma(g^t, d_0 g^t)\, \chi(H) u_t) + (u_t, \chi(H) [\phi(v), i \Gamma(g^t)] \chi(H) u_t) . \tag{4.14} \]
The first term in the r.h.s. of (4.14) is positive and by (4.12) the second term is bounded by $C R^{-\epsilon_0} t^{-\delta \epsilon_0}$. Clearly we have
\[ \operatorname*{w-lim}_{R \to +\infty} \Gamma(g(t^\delta R \tilde\sigma)) = \mathbb{1}, \quad t \in \mathbb{R} . \]
Applying then Proposition A.6 we obtain that if $u = \chi(H) u$,
\[ \lim_{R \to +\infty} (u, \Gamma^+(g, R) u) = \|u\|^2 . \]
By density this implies (4.13). Now we use the fact that for $\delta > \delta'$
\[ g(t^\delta \tilde\sigma) \ge g(t^{\delta'} R \tilde\sigma) , \]
for fixed $R$ and $t \ge T_R$. Hence if we denote by $\Gamma'^{+}(g, R)$ the observable in Proposition 4.8 with the exponent $\delta'$, we have:
\[ \Gamma'^{+}(g, R) \le \Gamma^+(g, 1) \le \mathbb{1} . \]
Letting $R \to +\infty$ and using (4.13) we obtain that $\Gamma^+(g, 1) = \mathbb{1}$.
We now state a lemma which will be used to control the number operator along the evolution, using the cutoffs $\Gamma(g^t)$.
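The comparison of the two cutoffs for $t \ge T_R$ follows from the monotonicity of $g$ on $[0, +\infty[$, which is what the condition $s\, g'(s) \ge 0$ in (4.10) expresses. A sketch with one admissible choice of $T_R$ (this explicit value is derived here and is not stated in the text):

```latex
\begin{aligned}
% for \tilde\sigma >= 0, t > 0 and \delta > \delta'
t^{\delta} \tilde\sigma \;\ge\; t^{\delta'} R\, \tilde\sigma
  &\iff t^{\,\delta - \delta'} \ge R
   \iff t \ge T_R := R^{1/(\delta - \delta')} , \\
% g is nondecreasing on [0, +\infty[ because s g'(s) >= 0
g(t^{\delta} \tilde\sigma) &\ge g(t^{\delta'} R\, \tilde\sigma) ,
  \qquad t \ge T_R .
\end{aligned}
```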
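Lemma 4.10. (i) $N^\alpha \Gamma(g^t) \chi(H) \in O(t^{\delta\alpha})$, $0 \le \alpha \le 1$, $\chi \in C_0^\infty(\mathbb{R})$.
(ii) Let $g_1^t = g_1(t^\delta \tilde\sigma)$ where $g_1 \in C^\infty(\mathbb{R})$ is such that $0 \notin \mathrm{supp}\, g_1$, $g_1 g = g$. Then
\[ N^2 \Gamma(g^t) \chi(H) \Gamma(g_1^t) \chi(H) \in O(t^{2\delta}), \quad \chi \in C_0^\infty(\mathbb{R}) . \]
Proof. Let us first prove (i). Since $D(H) = D(H_0)$, it suffices to estimate $N^\alpha \Gamma(g^t)(d\Gamma(\tilde\sigma) + 1)^{-\alpha}$. On the $n$-particle sector, we have:
\[ n^\alpha \prod_{1}^{n} g(t^\delta \tilde\sigma_i) \Big( \sum_{1}^{n} |\tilde\sigma_i| + 1 \Big)^{-\alpha} \le C t^{\delta\alpha} , \]
which proves (i). To prove (ii) we write:
\[ \begin{aligned} N^2 \Gamma(g^t) \chi(H) \Gamma(g_1^t) \chi(H) &= \frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, N^2 \Gamma(g^t)(z - H)^{-1} \Gamma(g_1^t) \chi(H) \, dz \wedge d\bar z \\ &= -\frac{i}{2\pi} \int_{\mathbb{C}} \partial_{\bar z} \tilde\chi(z)\, N \Gamma(g^t)(z - H)^{-1} \phi(i v)(z - H)^{-1} \Gamma(g_1^t) \chi(H) \, dz \wedge d\bar z \\ &\quad + N \Gamma(g^t) \chi(H)\, N \Gamma(g_1^t) \chi(H) , \end{aligned} \]
using Lemma 3.9(ii). The second term is $O(t^{2\delta})$ by (i). Using the fact that $D(H) = D(H_0)$, we write:
\[ \|N \Gamma(g^t)(z - H)^{-1} \phi(i v)(z - H)^{-1}\| \le C \|N \Gamma(g^t)(H_0 + 1)^{-1}\| \, \|\phi(i v)(H_0 + 1)^{-1}\| \, \|(H_0 + 1)(z - H)^{-1}\| \le C t^\delta |\mathrm{Im}\, z|^{-2} , \]
for $z \in \mathrm{supp}\, \tilde\chi$. This proves (ii).
4.4. Sharper estimates for $H^e$
Let $g$ be as in (4.10) and set
\[ g^t = g(t^\delta \sigma) . \]
Then by exactly the same arguments as in Sec. 4.3, replacing cutoffs in $H$ by cutoffs in $L$, we obtain:
Lemma 4.11. Assume (I'0), (I'1). Then for $\chi, \chi_1 \in C_0^\infty(\mathbb{R})$:
\[ [\chi(L), \Gamma(g^t)], \quad [\chi(H^e), \Gamma(g^t)] \chi_1(L) \ \in O(t^{-\delta \epsilon_0}) . \]
Theorem 4.12. Assume hypotheses (I'0), (I'1) and $\delta \epsilon_0 > 1$. Then
\[ e^{-itH^e} u = \Gamma(g(t^\delta \sigma))\, e^{-itH^e} u + o(1), \quad u \in \mathcal{H}^e . \]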
The proofs are analogous to Lemma 4.7 and Theorem 4.9 and left to the reader. We now state a lemma analogous to Lemma 4.10 for the Hamiltonian $H^e$.

Lemma 4.13. (i) $(N^e)^{\alpha}\Gamma(g^t)\chi(L) \in O(t^{\delta\alpha})$, $0 \le \alpha \le 1$, $\chi \in C_0^\infty(\mathbb{R})$.
(ii) Let $g_1^t = g_1(t^{\delta}\sigma)$ where $g_1 \in C^\infty(\mathbb{R})$ is such that $0 \notin \operatorname{supp} g_1$, $g_1 g = g$. Then
\[ (N^e)^{2}\Gamma(g^t)\chi(L)\Gamma(g_1^t)\chi(L) \in O(t^{2\delta}), \quad \chi \in C_0^\infty(\mathbb{R}). \]

The proof is identical to the proof of Lemma 4.10, replacing cutoffs in $H$ by cutoffs in $L$.

4.5. Some auxiliary estimates for $H$

In this subsection we consider a positive selfadjoint operator $C$ acting on $\mathfrak{h}$ such that $[C, \tilde\sigma] = 0$ and we prove some estimates on the growth of $C$ along the evolution. These estimates will be used for the Nelson Hamiltonian in Sec. 12 for the observable $C = -\Delta_\omega/\tilde\sigma^{2}$. We assume

(I'5) \[ (1+|\tilde\sigma|^{-1/2})\langle C\rangle^{\mu_2} v (K+1)^{-1/2},\ \ (1+|\tilde\sigma|^{-1/2})\langle C\rangle^{\mu_2}(K+1)^{-1/2} v \in B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h}), \quad \mu_2 > 0. \]

Note that (I'5) implies
\[
\|(1+|\tilde\sigma|^{-1/2})F(C \ge R)v(K+1)^{-1/2}\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h})}
+ \|(1+|\tilde\sigma|^{-1/2})F(C \ge R)(K+1)^{-1/2}v\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h})}
\le C_0 R^{-\mu_2}. \tag{4.15}
\]
The corresponding assumption for the Nelson Hamiltonian is (I5) introduced in Sec. 1.1. In fact let $P(\omega, \partial_\omega)$ be the expression of $-\Delta_\omega$ in some local coordinates on $S^2$. Then
\[ -\frac{\Delta_\omega}{\tilde\sigma^{2}}\big(e^{-i\tilde\sigma x.\omega}\tilde\sigma v_j(\tilde\sigma\omega)\big) = e^{-i\tilde\sigma x.\omega}\frac{1}{\tilde\sigma^{2}}P(\omega, \partial_\omega - dx.\omega)\tilde\sigma v_j(\tilde\sigma\omega), \]
where $dx.\omega$ is the differential of the function $\omega \mapsto x.\omega$. Since $dx.\omega \in O(|x|)$, we obtain, using (1.1) to control powers of $x$, that if (H0) holds for $\alpha > 0$ and (I5) holds for $\mu_2 > 0$ then (I'5) holds for the associated Pauli–Fierz Hamiltonian with the exponent $\mu_2$ replaced by $\inf(\alpha, \mu_2)$.

Let $F \in C_0^\infty(\mathbb{R})$, $0 \le F \le 1$, $F(\lambda) \equiv 1$ for $|\lambda| \le \frac12$, $F(\lambda) \equiv 0$ for $|\lambda| \ge 1$ and $\lambda F'(\lambda) \le 0$. We set for $\rho, R > 0$:
\[ c_t := F\Big(\frac{C}{Rt^{\rho}}\Big). \tag{4.16} \]

Lemma 4.14. Assume (I'0), (I'5). Then for $\chi \in C_0^\infty(\mathbb{R})$:
\[ [\chi(H), \Gamma(c_t)] \in O(t^{-\mu_2\rho}). \]

Using almost-analytic extensions, we are reduced to estimating
\[ [(z-H)^{-1}, \Gamma(c_t)] = (z-H)^{-1}[H, \Gamma(c_t)](z-H)^{-1}. \]
We have
\[ [H, \Gamma(c_t)] = \frac{1}{\sqrt 2}a^{*}((1-c_t)v)\Gamma(c_t) - \frac{1}{\sqrt 2}\Gamma(c_t)a((1-c_t)v). \]
Using (I'5) and Proposition A.1, we obtain
\[ \|(H+i)^{-1}[H, \Gamma(c_t)](H+i)^{-1}\| \le C_0 t^{-\rho\mu_2}, \tag{4.17} \]
uniformly in $R \ge 1$. This implies that
\[ \|(z-H)^{-1}[H, \Gamma(c_t)](z-H)^{-1}\| \le C_0|\mathrm{Im}\, z|^{-2}t^{-\rho\mu_2}, \quad z \in K \Subset \mathbb{C}\setminus\mathbb{R}, \]
which implies the lemma.

Proposition 4.15. Assume (I'0), (I'5). Assume $\rho$ in (4.16) is such that $\rho\mu_2 > 1$. Then
(i) $\int_1^{+\infty}\|d\Gamma(c_t, d_0c_t)^{1/2}\chi(H)u_t\|^{2}\,dt \le C\|u\|^{2}$, $u \in D(N)$, $\chi \in C_0^\infty(\mathbb{R})$,
(ii) $\operatorname*{s-lim}_{t\to+\infty} e^{itH}\Gamma(c_t)e^{-itH} =: P^{+}(R)$ exists,
(iii) $[P^{+}(R), H] = 0$.

By the standard argument, we compute for $u, v \in D(N)$ the derivative of the function
\[ t \mapsto (v_t, \chi(H)\Gamma(c_t)\chi(H)u_t), \]
which equals
\[ (v_t, \chi(H)d\Gamma(c_t, d_0c_t)\chi(H)u_t) + (v_t, \chi(H)[\phi(v), i\Gamma(c_t)]\chi(H)u_t). \]
By (4.17) the second term is integrable in norm if $\rho\mu_2 > 1$. We have
\[ d_0c_t = -\rho F'\Big(\frac{C}{Rt^{\rho}}\Big)\frac{C}{Rt^{\rho+1}} \ge 0, \]
hence $d\Gamma(c_t, d_0c_t) \ge 0$. The estimate (i) follows then from Proposition A.3. The existence of the limit (ii) follows from (i), Proposition A.4 and Lemma 4.14. Property (iii) follows from Lemma 4.14.

Theorem 4.16. Assume (I'0), (I'5). Assume $\rho$ in (4.16) is such that $\rho\mu_2 > 1$. Then $P^{+}(1) = \mathbf{1}$, i.e.:
\[ e^{-itH}u = \Gamma\Big(F\Big(\frac{C}{t^{\rho}}\Big)\Big)e^{-itH}u + o(1), \quad u \in \mathcal{H}. \]

Proof. We first claim that
\[ \operatorname*{w-lim}_{R\to\infty} P^{+}(R) = \mathbf{1}. \tag{4.18} \]
To prove (4.18) we apply Proposition A.6. We have for $u \in D(N)$, $\chi \in C_0^\infty(\mathbb{R})$:
\[ \frac{d}{dt}(u_t, \chi(H)\Gamma(c_t)\chi(H)u_t) \ge -Ct^{-\rho\mu_2}\|u\|^{2}, \quad \text{uniformly in } R. \]
On the other hand $\Gamma(c_t) \le \mathbf{1}$ and
\[ \operatorname*{w-lim}_{R\to\infty}\Gamma(c_t) = \mathbf{1}, \quad \forall t \in \mathbb{R}. \]
Hence (4.18) follows from Proposition A.6. Finally we use the fact that for $\rho' > \rho$:
\[ F\Big(\frac{C}{Rt^{\rho}}\Big) \le F\Big(\frac{C}{t^{\rho'}}\Big), \quad \text{for fixed } R \text{ and } t \ge T(R). \]
If we denote by $\Gamma'(c_t)$, $P'^{+}(R)$ the same observables with the exponent $\rho'$, we obtain $\Gamma(c_t) \le \Gamma'(c_t) \le \mathbf{1}$ and hence
\[ P^{+}(R) \le P'^{+}(1) \le \mathbf{1}. \]
Letting $R \to \infty$ and using (4.18) we get that $P'^{+}(1) = \mathbf{1}$. This proves the theorem.

5. Number of Asymptotically Free Particles

In this section we consider the expanded Hamiltonian $H^e$ introduced in Sec. 3.3. On $\mathfrak{h}^e = L^2(\mathbb{R}, d\sigma)\otimes\mathfrak{g}$, we denote by
\[ s = i\frac{\partial}{\partial\sigma} \]
the observable conjugate to $\sigma$, which we interpret as a position. The main result is Theorem 5.6 where we construct for $0 < c < 1$ $H^e$-invariant subspaces $\mathcal{H}_c^{e+}$ describing states which contain a finite number of particles in the region $\{s \ge ct\}$. Finally in Sec. 5.3 we show the rather trivial fact that no propagation takes place in the region $\{s \le -ct\}$ for $0 < c < 1$.

Let us fix $f \in C^\infty(\mathbb{R})$, such that
\[ 0 \le f \le 1, \quad f' \ge 0, \quad f \equiv 0 \text{ for } s \le \alpha_0, \quad f \equiv 1 \text{ for } s \ge \alpha_1, \tag{5.1} \]
for $\alpha_0 < \alpha_1$. We set
\[ b_{ct} := f\Big(\frac{s-ct}{t^{\rho}}\Big), \quad B_{ct} = d\Gamma(b_{ct}), \tag{5.2} \]
where the constants $0 < c \le 1$, $0 < \rho < 1$ will be fixed later. We assume in this section the following hypothesis:

(I'2) \[ (K+1)^{-1/2}v^e(\cdot),\ v^e(\cdot)(K+1)^{-1/2} \in H_0^{\mu}(\mathbb{R}^{+})\otimes B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{g}), \]

where for $\mu > 0$ the space $H_0^{\mu}(\mathbb{R}^{+})$ is the closure of $C_0^\infty(]0,+\infty[)$ in the topology of $H^{\mu}(\mathbb{R})$. Note that (I'2) implies
\[
\begin{aligned}
\|F(|s| \ge R)v^e(K+1)^{-1/2}\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h}^e)} &\le CR^{-\mu}, \\
\|F(|s| \ge R)(K+1)^{-1/2}v^e\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h}^e)} &\le CR^{-\mu}, \quad R \ge 1.
\end{aligned} \tag{5.3}
\]
The corresponding condition for concrete Nelson Hamiltonians is (I2), introduced in Sec. 1.1. In fact we note that
\[ \partial_{\tilde\sigma}\big(e^{-i\tilde\sigma x.\omega}\tilde\sigma v_j(\tilde\sigma\omega)\big) = e^{-i\tilde\sigma x.\omega}(\partial_{\tilde\sigma} - ix.\omega)\tilde\sigma v_j(\tilde\sigma\omega). \]
Using then (1.1) to control powers of $x$ we see that if (H0) holds for $\alpha > 0$ and (I2) holds for $\mu > 0$ then (I'2) holds for the associated Pauli–Fierz Hamiltonian with the exponent $\mu$ replaced by $\inf(\alpha, \mu)$.
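The identity used above is just the product rule. As a small worked step (the shorthand $h(\tilde\sigma) := \tilde\sigma v_j(\tilde\sigma\omega)$ is introduced here only for illustration and is not notation from the paper):

```latex
\[
\partial_{\tilde\sigma}\big(e^{-i\tilde\sigma x.\omega}h(\tilde\sigma)\big)
 = (-ix.\omega)\,e^{-i\tilde\sigma x.\omega}h(\tilde\sigma)
   + e^{-i\tilde\sigma x.\omega}\,\partial_{\tilde\sigma}h(\tilde\sigma)
 = e^{-i\tilde\sigma x.\omega}\,(\partial_{\tilde\sigma} - ix.\omega)\,h(\tilde\sigma),
\]
% iterating k times (x.\omega does not depend on \tilde\sigma):
\[
\partial_{\tilde\sigma}^{k}\big(e^{-i\tilde\sigma x.\omega}h(\tilde\sigma)\big)
 = e^{-i\tilde\sigma x.\omega}\,(\partial_{\tilde\sigma} - ix.\omega)^{k}\,h(\tilde\sigma).
\]
```

Each derivative in $\tilde\sigma$ therefore costs at most one factor of size $|x|$, which is presumably why a bound on powers of $x$ such as (1.1) is needed to transfer the regularity of $v_j$ to the coupling function.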
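5.1. Technical preparations

Proposition 5.1. Assume (I'0) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$. Assume that $0 < c < 1$ or that $c = 1$ and $\alpha_1 < 0$. Then for $\chi \in C_0^\infty(\mathbb{R})$:
(i) \[ \int_1^{+\infty}\frac{dt}{t}\,\|d\Gamma(d_0b_{ct})^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)u_t\|^{2} \le C\|u\|^{2}, \quad u \in D(N^e), \ \lambda > 0, \]
(ii) \[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(L)(B_{ct}+\lambda)^{-1}\chi(L)e^{-itH^e} \text{ exists}, \quad \forall \lambda \in \mathbb{C}\setminus\mathbb{R}^{-}. \]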
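Proof. Let us first fix $\lambda > 0$ and set
\[ \Phi(t) = \chi(L)(B_{ct}+\lambda)^{-1}\chi(L). \]
For $u \in D(N^e)$ the function
\[ \mathbb{R}^{+} \ni t \mapsto f(t) = (u_t, \Phi(t)u_t) \]
is $C^1$ with derivative
\[ \partial_t f(t) = (u_t, \chi(L)D\Phi(t)\chi(L)u_t). \]
Note that by (3.8) $e^{-itH^e}\chi(L)$ preserves $D(N^e)$. Since $H^e = H_0^e + \phi(v^e)$ on $D(L)$, we have:
\[ [H^e, i(B_{ct}+\lambda)^{-1}] = [H_0^e, i(B_{ct}+\lambda)^{-1}] + [\phi(v^e), i(B_{ct}+\lambda)^{-1}] \]
on $D(L)$. Since $\mathcal{K}\otimes\Gamma_{\rm fin}(\mathfrak{h}^e)\cap D(L)$ is dense in $D(L)$ we can compute $D(B_{ct}+\lambda)^{-1}$ on finite vectors. We obtain
\[ D_0(B_{ct}+\lambda)^{-1} = -(B_{ct}+\lambda)^{-1}d\Gamma(c_t)(B_{ct}+\lambda)^{-1}, \]
for
\[ c_t = d_0b_{ct} = f'\Big(\frac{s-ct}{t^{\rho}}\Big)\Big(\frac{1-c}{t^{\rho}} - \rho\,\frac{s-ct}{t^{\rho+1}}\Big). \]
Similarly
\[ [\phi(v^e), i(B_{ct}+\lambda)^{-1}] = (B_{ct}+\lambda)^{-1}\phi(ib_{ct}v^e)(B_{ct}+\lambda)^{-1}. \]
Since $D(L) = D(L_0)$, $(K+1)^{1/2}(L+i)^{-1}$ is bounded, and we have:
\[
\begin{aligned}
\|[\phi(v^e), i(B_{ct}+\lambda)^{-1}](L+i)^{-1}\|
&\le \|(B_{ct}+\lambda)^{-1}\phi(ib_{ct}v^e)(K+1)^{-1/2}(B_{ct}+\lambda)^{-1}\| \\
&\le C\|b_{ct}^{1/2}v^e(K+1)^{-1/2}\| \le Ct^{-\mu},
\end{aligned}
\]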
using Proposition A.1 and Assumption (I0 2). Next we note that if 0 < c < 1 or c = 1 and α1 < 0 s − ct c for t 1 . ct ≥ ρ f 0 t tρ
(5.4)
Applying then Proposition A.3 with $D = D(N^e)$, we obtain
\[ \int_1^{+\infty}\frac{dt}{t}\,\|d\Gamma(c_t)^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)u_t\|^{2} \le C\|u\|^{2}, \quad u \in D(N^e), \ \lambda > 0. \tag{5.5} \]
This proves (i). For $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$, we have, as quadratic forms on $D(L)\cap D(N^e)$:
\[
\begin{aligned}
D_0(B_{ct}+\lambda)^{-1} &= -(B_{ct}+\lambda)^{-1}d\Gamma(c_t)(B_{ct}+\lambda)^{-1} \\
&= -(B_{ct}+1)^{-1}d\Gamma(c_t)^{1/2}R(t)\,d\Gamma(c_t)^{1/2}(B_{ct}+1)^{-1},
\end{aligned} \tag{5.6}
\]
for $R(t) = (B_{ct}+1)^{2}(B_{ct}+\lambda)^{-2} \in O(1)$. Moreover (5.4) is still valid for $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$. Applying then (5.6), (5.4) and the estimate (5.5) for $\lambda = 1$, we obtain (ii) by Proposition A.4.

The next three lemmas will be needed in the proof of Theorem 5.5 to get rid of the cutoffs $\chi(L)$ in the statements of Proposition 5.1. Note that commutators between functions of $L$ and functions of $B_{ct}$ are bounded only by the number operator, since $[|\sigma|, is] = \mathbf{1}$. Therefore we introduce cutoffs in $N_t^e$ to control these error terms.

Lemma 5.2. Let $f_t \in C^\infty(\mathbb{R})$ with $|\partial_s^{\alpha}f_t| \le C_\alpha t^{-\rho\alpha}$, $\alpha \in \mathbb{N}$. Then
\[ [f_t(s), |\sigma|] \in O(t^{-\rho}). \]

Proof. We have
\[ \|[f_t(s), |\sigma|]\| = t^{-\rho}\|[f_t(t^{\rho}s), |\sigma|]\|. \]
Note that $g_t(s) = f_t(t^{\rho}s)$ satisfies $|\partial_s^{\alpha}g_t| \le C_\alpha$, $\alpha \in \mathbb{N}$. It remains to check that $[g_t(s), |\sigma|]$ is bounded, which follows by writing $|\sigma|$ as $r_1(\sigma) + r_2(\sigma)$, where $r_1(\sigma)$ is bounded and $r_2(\sigma) \in C^\infty(\mathbb{R})$, $|\partial_\sigma^{\alpha}r_2(\sigma)| \le C_\alpha(1+|\sigma|)^{1-|\alpha|}$.

Lemma 5.3. Assume (I'0) for $\epsilon_0 > 0$, (I'2) for $\mu > 0$. Let $N_t^e$ be defined in (4.9). Assume the exponent $\rho$ in (5.2) is such that $\rho > \delta > (1+\epsilon_0)^{-1}$. Then for $\chi, F \in C_0^\infty(\mathbb{R})$, $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$:
\[ (L+i)[(B_{ct}+\lambda)^{-1}, \chi(L)]F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L) \in o(1). \]

Proof. We write
\[
(L+i)[(B_{ct}+\lambda)^{-1}, \chi(L)]F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)
= \frac{i}{2\pi}\int \frac{\partial\tilde\chi}{\partial\bar z}(z)\,(L+i)(z-L)^{-1}[(B_{ct}+\lambda)^{-1}, L](z-L)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)\,dz\wedge d\bar z.
\]
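By Lemma 3.5 $(z-L)^{-1}$ preserves $D(N^e)$. On $D(L)\cap D(N^e)$ we have:
\[ [(B_{ct}+\lambda)^{-1}, L] = [(B_{ct}+\lambda)^{-1}, L_0] + [(B_{ct}+\lambda)^{-1}, \phi(v^e)]. \]
By (5.4)
\[ [(B_{ct}+\lambda)^{-1}, \phi(v^e)](z-L)^{-1} \in O(|\mathrm{Im}\, z|^{-1}t^{-\mu}). \]
Next on $D(L)\cap D(N^e)$
\[ [(B_{ct}+\lambda)^{-1}, L_0] = -(B_{ct}+\lambda)^{-1}d\Gamma([b_{ct}, |\sigma|])(B_{ct}+\lambda)^{-1}. \tag{5.7} \]
By Lemma 5.2, $[b_{ct}, |\sigma|] \in O(t^{-\rho})$ and hence $[(B_{ct}+\lambda)^{-1}, L_0] \in O(N^e)t^{-\rho}$. By Lemma 3.5 we have
\[ (N^e+1)(z-L)^{-1}(N^e+1)^{-1} \in O(|\mathrm{Im}\, z|^{-2}), \quad z \in \operatorname{supp}\tilde\chi, \tag{5.8} \]
and by Lemma 4.6 $N^eF\big(\frac{N_t^e}{t^{\delta}}\big)\chi(L) \in O(t^{\delta})$. Finally we obtain that for $u \in D(N^e)$:
\[
\Big\|(L+i)(z-L)^{-1}[(B_{ct}+\lambda)^{-1}, L](z-L)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)u\Big\|
\le C\big(t^{\delta-\rho}|\mathrm{Im}\, z|^{-3} + t^{-\mu}|\mathrm{Im}\, z|^{-1}\big)\|u\|, \quad z \in \operatorname{supp}\tilde\chi.
\]
Using the properties of $\tilde\chi$ this proves the lemma.

Proposition 5.4. Assume (I'0), (I'1) for $\epsilon_0$, (I'2) for $\mu > 0$. Then for $\rho > \delta > (1+\epsilon_0)^{-1}$, $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$, $\chi \in C_0^\infty(\mathbb{R})$:
\[ (B_{ct}+\lambda)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi^{2}(L) = \chi(L)(B_{ct}+\lambda)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L) + o(1). \]

Proof. We combine Lemma 5.3 and Lemma 4.4.

5.2. Asymptotic projections

Theorem 5.5. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) with $\mu > 1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0) > 1$. Then:
(i) for each $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$ the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(B_{ct}+\lambda)^{-1}e^{-itH^e} =: R_c^{e+}(\lambda) \text{ exists.} \]
(ii) $[R_c^{e+}(\lambda), L] = [R_c^{e+}(\lambda), H^e] = 0$;
(iii) The limit
\[ \operatorname*{s-lim}_{\epsilon\to 0}\epsilon^{-1}R_c^{e+}(\epsilon^{-1}) =: \hat P_c^{e+} \text{ exists} \]
and is an orthogonal projection;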
(iv) $[H^e, \hat P_c^{e+}] = [L, \hat P_c^{e+}] = 0$,
\[ u = \hat P_c^{e+}u \iff \operatorname*{s-lim}_{\epsilon\to 0}\,\operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(\epsilon B_{ct}+\mathbf{1})^{-1}e^{-itH^e}u = u. \]

The projections $\hat P_c^{e+}$ are constructed by a standard pseudo-resolvent argument. In fact it is easy to see that the operators $R_c^{e+}(\lambda)$ form a pseudo-resolvent family, i.e. satisfy the resolvent identity. From this family a selfadjoint operator $N_c^{e+}$ (with a possibly non-dense domain) can be constructed. The operator $N_c^{e+}$ can be seen as the (formal) limit:
\[ N_c^{e+} = \lim_{t\to+\infty} e^{itH^e}B_{ct}e^{-itH^e}, \]
i.e. as the asymptotic number of particles in $s \ge ct$. The range of $\hat P_c^{e+}$ is the closure of the domain of $N_c^{e+}$, i.e. the closure of the space of states where this number is finite. In the sequel, only the range of $\hat P_c^{e+}$ will play a role and we will not consider the associated selfadjoint operator $N_c^{e+}$. Note also that the projections $\hat P_c^{e+}$ depend on the choice of the cutoff function $f$ in (5.2). We introduce projections independent of the choice of $f$ in the next theorem.

Theorem 5.6. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) with $\mu > 1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0) > 1$. Let for $0 < c < 1$:
\[ P_c^{e+} := \inf_{c < c'}\hat P_{c'}^{e+}, \quad \mathcal{H}_c^{e+} := \operatorname{Ran} P_c^{e+}. \]
Then:
(i) $P_c^{e+}$ is an orthogonal projection independent of the choice of the function $f$ in (5.2).
(ii) $[H^e, P_c^{e+}] = [L, P_c^{e+}] = 0$,
(iii) $\Omega^{e+}(\mathcal{H}_{\rm pp}(H^e)\otimes\Gamma(\mathfrak{h}^e)) \subset \mathcal{H}_c^{e+}$, where the wave operator $\Omega^{e+}$ is defined in Sec. 8.5.

The space $\mathcal{H}_c^{e+}$ can be understood as the space of states having a finite number of particles in the region $\{s \ge c't\}$ for all $c' > c$. By part (iii) of Theorem 5.6 we know that if $H^e$ has bound states, in particular under the assumptions (H'0), (I'3), then $\mathcal{H}_c^{e+}$ is non-trivial.

Proof. Let $f_1, f_2$ be two functions such that $0 \le f_i \le 1$, $f_i' \in C_0^\infty(\mathbb{R})$, $f_i' \ge 0$ and $f_i \equiv 0$ for $s \ll -1$, $f_i \equiv 1$ for $s \gg 1$. Clearly there exists $s_0$ such that $f_1(s) \le f_2(s+s')$ for any $s' \ge s_0$. This implies that if $c_1 > c_2$
\[ f_1\Big(\frac{s-c_1t}{t^{\rho}}\Big) \le f_2\Big(\frac{s-c_2t}{t^{\rho}}\Big), \quad t \ge T. \tag{5.9} \]
Let us denote by $B_{i,ct}$ the observable defined in (5.2) for $f = f_i$ and by $R_{i,c}^{e+}(\lambda)$, $\hat P_{i,c}^{e+}$ the objects constructed in Theorem 5.5 for $f = f_i$. It follows from (5.9) that
\[ (B_{1,c_1t}+\lambda)^{-1} \ge (B_{2,c_2t}+\lambda)^{-1} \quad \text{for } t \ge T, \ \lambda > 0, \]
hence
\[ R_{1,c_1}^{e+}(\lambda) \ge R_{2,c_2}^{e+}(\lambda), \quad \lambda > 0, \]
and
\[ \hat P_{1,c_1}^{e+} \ge \hat P_{2,c_2}^{e+} \quad \text{if } c_1 > c_2. \tag{5.10} \]
If we take $f_1 = f_2 = f$, we obtain that the family of projections $\hat P_c^{e+}$ is increasing w.r.t. $c$, which shows the existence of $P_c^{e+}$. Using again (5.10) we obtain that $P_c^{e+}$ does not depend on $f$. (ii) follows from Theorem 5.5. It remains to prove (iii). We first note that $\mathcal{H}_{\rm pp}(H^e) \subset \mathcal{H}_c^{e+}$. In fact this is a direct consequence of the fact that for $\epsilon > 0$ $(\epsilon B_{ct}+1)^{-1}$ tends strongly to $\mathbf{1}$ when $t \to +\infty$. Next we use the fact proved in Theorem 8.7 that the asymptotic Weyl operators $W^{e+}(h)$ preserve the space $\mathcal{H}_c^{e+}$. These two observations imply (iii).

Proof of Theorem 5.5. Let us first prove (i). By density it suffices to show the existence of
\[ \lim_{t\to+\infty} e^{itH^e}(B_{ct}+\lambda)^{-1}e^{-itH^e}u \tag{5.11} \]
for $u = \chi(L)u$, $\chi \in C_0^\infty(\mathbb{R})$. Let us pick an exponent $\delta$ with $\rho > \delta > (1+\epsilon_0)^{-1}$, which is possible since $\rho(1+\epsilon_0) > 1$. By Proposition 4.5(ii), we have for $F \in C_0^\infty(\mathbb{R})$, $F \equiv 1$ near $0$:
\[
\begin{aligned}
e^{itH^e}(B_{ct}+\lambda)^{-1}e^{-itH^e}u
&= e^{itH^e}(B_{ct}+\lambda)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)e^{-itH^e}u + o(1) \\
&= e^{itH^e}(B_{ct}+\lambda)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)e^{-itH^e}u + o(1) \\
&= e^{itH^e}\chi(L)(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N_t^e}{t^{\delta}}\Big)e^{-itH^e}u + o(1) \\
&= e^{itH^e}\chi(L)(B_{ct}+\lambda)^{-1}\chi(L)e^{-itH^e}u + o(1),
\end{aligned}
\]
where we used Proposition 4.5 and Proposition 5.4. Hence by Proposition 5.1 the limit (5.11) exists. The first statement of (ii) follows by the arguments above. Let us now prove the second statement of (ii). It suffices to show that
\[ R_c^{e+}(\lambda) = e^{it_1H^e}R_c^{e+}(\lambda)e^{-it_1H^e}, \quad \forall t_1 \in \mathbb{R}, \]
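or equivalently
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\big((B_{ct}+\lambda)^{-1} - (B_{c,t-t_1}+\lambda)^{-1}\big)e^{-itH^e} = 0. \tag{5.12} \]
We have
\[ (B_{ct}+\lambda)^{-1} - (B_{c,t-t_1}+\lambda)^{-1} = -(B_{ct}+\lambda)^{-1}(B_{ct} - B_{c,t-t_1})(B_{c,t-t_1}+\lambda)^{-1}, \]
and
\[ B_{ct} - B_{c,t-t_1} = d\Gamma(b_{ct} - b_{c,t-t_1}). \]
Since $\|b_{ct} - b_{c,t-t_1}\| \in O(t^{-\rho})$, this gives
\[ (B_{ct}+\lambda)^{-1} - (B_{c,t-t_1}+\lambda)^{-1} \in O(N^e)t^{-\rho}. \tag{5.13} \]
Let $u \in \mathcal{H}^e$ with $\chi(L)u = u$ for $\chi \in C_0^\infty(\mathbb{R})$. We pick $\delta$ with $\rho > \delta > (1+\epsilon_0)^{-1}$ and write
\[
e^{itH^e}\big((B_{ct}+\lambda)^{-1} - (B_{c,t-t_1}+\lambda)^{-1}\big)e^{-itH^e}u
= e^{itH^e}\big((B_{ct}+\lambda)^{-1} - (B_{c,t-t_1}+\lambda)^{-1}\big)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)e^{-itH^e}u + o(1).
\]
Combining (5.13) and Lemma 4.6 we obtain (5.12). This completes the proof of (ii). Statements (iii) and (iv) follow from Proposition A.7.

5.3. Soft propagation estimates

In this subsection we show rather easy propagation estimates. More precisely we show that for any state in $\mathcal{H}^e$ there is no propagation in the region $\{s \le -ct\}$ for $0 < c < 1$. We fix a cutoff function $f_1 \in C^\infty(\mathbb{R})$ such that for some $\alpha_4 < \alpha_3 < 0$:
\[ 0 \le f_1 \le 1, \quad \operatorname{supp} f_1 \subset\, ]-\infty, \alpha_3], \quad f_1 \equiv 1 \text{ in } ]-\infty, \alpha_4], \quad f_1'(s) \le 0, \tag{5.14} \]
and set for $0 < \rho < 1$, $0 < c < 1$:
\[ b_{1,t} = f_1\Big(\frac{s+ct}{t^{\rho}}\Big), \quad B_{1,t} := d\Gamma(b_{1,t}). \tag{5.15} \]

Proposition 5.7. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ in (5.15) such that $\rho(1+\epsilon_0) > 1$. Then:
(i) $R_1^{+}(\epsilon) := \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(1+\epsilon B_{1,t})^{-1}e^{-itH^e}$ exists,
(ii) $[R_1^{+}(\epsilon), H^e] = [R_1^{+}(\epsilon), L] = 0$,
(iii) $\operatorname*{s-lim}_{\epsilon\to 0} R_1^{+}(\epsilon) = \mathbf{1}$.

Proposition 5.7 means that any state has a finite number of particles in the region $\{s \le -ct\}$.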
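To see why a statement of the form (iii) encodes finiteness of a particle number, the following elementary spectral computation may help. It is only a heuristic sketch: $N$ stands for a hypothetical densely defined selfadjoint operator $N \ge 0$ with spectral family $E_\lambda$, playing the role of the formal asymptotic number of particles in $\{s \le -ct\}$; the paper itself works with the pseudo-resolvent $R_1^{+}(\epsilon)$ and Proposition A.7 instead.

```latex
% Assumption (illustration only): N >= 0 selfadjoint, densely defined,
% with spectral family E_\lambda.
\[
\big\|(1+\epsilon N)^{-1}u - u\big\|^{2}
 = \int_{[0,+\infty)} \Big(\frac{\epsilon\lambda}{1+\epsilon\lambda}\Big)^{2}\, d(u, E_\lambda u)
 \;\xrightarrow[\epsilon\to 0]{}\; 0 ,
\]
% by dominated convergence, since the integrand is bounded by 1 and tends
% to 0 pointwise, while d(u, E_\lambda u) is a finite measure of mass \|u\|^2.
```

Conversely, if the strong limit of $(1+\epsilon N)^{-1}$ as $\epsilon \to 0$ were a proper projection, the vectors outside its range would be those on which the number is "infinite", which is the situation allowed for $\hat P_c^{e+}$ in Theorem 5.5 but excluded here.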
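Proof. We first prove the existence of
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(L)(1+\epsilon B_{1,t})^{-1}\chi(L)e^{-itH^e}. \tag{5.16} \]
Arguing exactly as in the proof of Proposition 5.1, we obtain for $\lambda \in \mathbb{C}\setminus\mathbb{R}^{-}$:
\[
\begin{aligned}
\chi(L)D(B_{1,t}+\lambda)^{-1}\chi(L) = {}& -\chi(L)(B_{1,t}+\lambda)^{-1}d\Gamma(c_{1,t})(B_{1,t}+\lambda)^{-1}\chi(L) \\
& - \chi(L)(B_{1,t}+\lambda)^{-1}\phi(ib_{1,t}v^e)(B_{1,t}+\lambda)^{-1}\chi(L),
\end{aligned}
\]
for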
s + ct 1+c s + ct − ρ tρ tρ tρ+1 s + ct 1 0 , for t 1 . ≤ −cf1 tρ tρ
c1,t = d0 b1,t = f10
Moreover by Proposition A.1:
\[ \|\chi(L)(B_{1,t}+\lambda)^{-1}\phi(ib_{1,t}v^e)(B_{1,t}+\lambda)^{-1}\chi(L)\| \le C\|b_{1,t}^{1/2}v^e(K+1)^{-1/2}\| \le Ct^{-\mu}, \]
by (I'2). Arguing as in Sec. 5.1 we obtain the existence of the limit (5.16). As in the proof of Theorem 5.5, using an analog of Proposition 5.4 we obtain then the existence of $R_1^{+}(\epsilon)$. Property (ii) can be shown as in Theorem 5.5. Let us now prove (iii). By Proposition A.7 we obtain the existence of $\operatorname*{s-lim}_{\epsilon\to 0}R_1^{+}(\epsilon)$, and it suffices to show that
\[ \operatorname*{w-lim}_{\epsilon\to 0} R_1^{+}(\epsilon) = \mathbf{1}. \tag{5.17} \]
By density it suffices to consider states u ∈ He such that u = χ(L)u for some χ ∈ C0∞ (R). We will apply Proposition A.6 to Φ (t) = χ(L)(1 + B1,t )−1 χ(L). We have: χ(L)D(1 + B1,t )−1 χ(L) = −χ(L)(1 + B1,t )−1 dΓ(c1,t )(1 + B1,t )−1 χ(L) − χ(L)(1 + B1,t )−1 φ(ib1,t v e )(1 + B1,t )−1 χ(L) . Using the fact that k(1 + B1,t )−1 (1 + B1,t )− 2 k ≤ C− 2 , uniformly in t , 1
1
and Proposition A.1, we obtain 1
2 v(K + 1)− 2 k ≤ Ct−µ , kχ(L)(1 + B1,t )−1 φ(ib1,t v e )(1 + B1,t )−1 χ(L)k ≤ Ckb1,t 1
(5.18) uniformly in . Since c1,t ≤ 0, we obtain χ(L)D(1 + B1,t )−1 χ(L) ≥ −Ct−µ , uniformly in .
(5.19)
Clearly $\operatorname*{w-lim}_{\epsilon\to 0}(1+\epsilon B_{1,t})^{-1} = \mathbf{1}$ for $t > 0$ and $0 \le R_1^{+}(\epsilon) \le \mathbf{1}$. Applying Proposition A.6 we obtain (5.17).

Let now $f_0 \in C^\infty(\mathbb{R})$ be a cutoff function such that:
\[ f_0 \equiv 1 \text{ in } s \le \alpha_1, \quad f_0 \equiv 0 \text{ in } s \ge \alpha_2, \quad f_0' \le 0. \tag{5.20} \]
Here the constants $\alpha_1 < \alpha_2$ are such that $0 < \alpha_1 < \alpha_2$. We set
\[ f_R^t = f_0\Big(\frac{-s-ct}{Rt^{\rho}}\Big), \tag{5.21} \]
for $R \ge 1$ and $0 < \rho < 1$ as in (5.2). The following two lemmas are analogous to Lemmas 6.1 and 6.2 for $k = 0$ and their proofs are similar.

Lemma 5.8. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 0$. Assume the constants $\rho, \delta$ are chosen so that $\rho > \delta > (1+\epsilon_0)^{-1}$, $\mu > \delta/2$. Then for $\chi_1, \chi_2, F \in C_0^\infty(\mathbb{R})$:
\[ [\Gamma(f_R^t), \chi_1(L)]F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_2(L) \in o(1). \]

Lemma 5.9. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 0$. Assume the constants $\rho, \delta$ are chosen so that $\rho > \delta > (1+\epsilon_0)^{-1}$, $\mu > \delta/2$. Then for $\chi_1, \chi_2, F \in C_0^\infty(\mathbb{R})$:
\[ \Gamma(f_R^t)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_1(L)\chi_2(L) = \chi_1(L)\Gamma(f_R^t)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_2(L) + o(1). \]

The following lemma is analogous to Proposition 6.3.

Lemma 5.10. Assume (I'0), (I'2) for $\mu > 1$. Let $B_{1,t}$ be defined in (5.15). Then for $\chi \in C_0^\infty(\mathbb{R})$, $\lambda > 0$, $R \ge 1$ large enough:
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(L)\Gamma(f_R^t)(B_{1,t}+\lambda)^{-1}\chi(L)e^{-itH^e} \text{ exists.} \tag{5.22} \]

Proof. As in the proof of Proposition 6.3 we compute
\[
\begin{aligned}
D\chi(L)\Gamma(f_R^t)(B_{1,t}+\lambda)^{-1}\chi(L) = {}& \chi(L)D\Gamma(f_R^t)(B_{1,t}+\lambda)^{-1}\chi(L) \\
& + \chi(L)\Gamma(f_R^t)D(B_{1,t}+\lambda)^{-1}\chi(L).
\end{aligned} \tag{5.23}
\]
By the proof of Proposition 5.7, we have:
\[ |(v, \chi(L)\Gamma(f_R^t)D(B_{1,t}+\lambda)^{-1}\chi(L)u)| \le C\|R_1(t)u\|\,\|R_1(t)v\|, \tag{5.24} \]
uniformly in $R$, where $R_1(t)$ is integrable along the evolution. Let us now consider the first term in (5.23). We have:
\[ [\phi(v^e), i\Gamma(f_R^t)] = \frac{i}{\sqrt 2}\Gamma(f_R^t)a((1-f_R^t)v^e) - \frac{i}{\sqrt 2}a^{*}((1-f_R^t)v^e)\Gamma(f_R^t). \]
By (5.20)
\[ \operatorname{supp}(1-f_R^t) \subset \{s \le -ct - \alpha_1Rt^{\rho}\}, \]
and
\[ b_{1,t} \equiv 1 \text{ in } \{s \le -ct + \alpha_4t^{\rho}\}. \]
Since $\alpha_1 > 0$, $b_{1,t} \equiv 1$ on $\operatorname{supp}(1-f_R^t)$ for $t \ge 0$, $R \ge R_0$. Applying then Proposition A.1, we obtain:
\[ \|[\phi(v^e), \Gamma(f_R^t)](B_{1,t}+\lambda)^{-1}\chi(L)\| \le C\|(1-f_R^t)v^e(K+1)^{-1/2}\| \le Ct^{-\mu}, \tag{5.25} \]
uniformly in $R \ge R_0$, by (I'2). Finally
\[
\begin{aligned}
D_0\Gamma(f_R^t) &= d\Gamma(f_R^t, d_0f_R^t), \\
d_0f_R^t &= f_0'\Big(\frac{-s-ct}{Rt^{\rho}}\Big)\Big(\frac{-1-c}{Rt^{\rho}} + \frac{\rho}{Rt^{\rho+1}}(s+ct)\Big)
\ge \frac{c}{Rt^{\rho}}\,|f_0'|\Big(\frac{-s-ct}{Rt^{\rho}}\Big),
\end{aligned} \tag{5.26}
\]
uniformly in $R \ge R_0$. From (5.25), (5.26), (5.24) we obtain:
\[ |(v, D\chi(L)\Gamma(f_R^t)(B_{1,t}+\lambda)^{-1}\chi(L)u)| \le \sum_{i=1}^{3}\|R_i(t)u\|\,\|R_i(t)v\|, \tag{5.27} \]
uniformly in $R \ge R_0$, where $R_i(t)$ are integrable along the evolution. This implies that the limit (5.22) exists.

The following proposition is an improvement on Proposition 5.7. It means that asymptotically there are no particles in $\{s \le -ct\}$.

Proposition 5.11. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ such that $\rho(1+\epsilon_0) > 1$. Then
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Gamma\Big(f_0\Big(\frac{-s-ct}{t^{\rho}}\Big)\Big)e^{-itH^e} = \mathbf{1}. \]

Proof. We denote by $f_{R,\rho}^t$ the operator in (5.21) to emphasize the dependence on the exponent $\rho$. Using Lemma 5.9, Proposition 5.4, Proposition 4.5 and a density argument as in the proof of Theorem 5.5(i), we deduce from Lemma 5.10 the existence of
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Gamma(f_{R,\rho}^t)(1+\epsilon B_{1,t})^{-1}e^{-itH^e}, \quad \forall \epsilon > 0. \tag{5.28} \]
By Proposition 5.7 and a density argument, we obtain the existence of
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Gamma(f_{R,\rho}^t)e^{-itH^e} =: \Gamma^{+}_{R,\rho}, \]
and the fact that the limit (5.28) equals $\Gamma^{+}_{R,\rho}R_1^{+}(\epsilon)$.
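Next we apply Proposition A.5 to obtain:
\[ \operatorname*{w-lim}_{R\to+\infty}\Gamma^{+}_{R,\rho}R_1^{+}(\epsilon) = R_1^{+}(\epsilon). \tag{5.29} \]
In fact the integrability uniformly in $R$ (Condition (A.2)) follows from (5.27) and
\[ \operatorname*{w-lim}_{R\to+\infty}\Gamma(f_{R,\rho}^t) = \mathbf{1}, \quad \forall t > 0, \]
since $f_{R,\rho}^t \equiv 1$ in $\{s \ge -ct - \alpha_1Rt^{\rho}\}$ and $\alpha_1 > 0$. Applying Proposition A.5 we obtain (5.29). Applying then Proposition 5.7(iii) we obtain
\[ \operatorname*{w-lim}_{R\to+\infty}\Gamma^{+}_{R,\rho} = \mathbf{1}. \tag{5.30} \]
Let now $\rho_1$ with $\rho_1(1+\epsilon_0) > 1$ and $\rho > \rho_1$. We claim that
\[ f_{1,\rho}^t \ge f_{R,\rho_1}^t, \quad \text{for } t \ge T_R. \tag{5.31} \]
In fact $\operatorname{supp} f_{R,\rho_1}^t \subset \{s \ge -ct - \alpha_2Rt^{\rho_1}\}$ and $f_{1,\rho}^t \equiv 1$ in $\{s \ge -ct - \alpha_1t^{\rho}\}$, so $f_{1,\rho}^t \equiv 1$ on $\operatorname{supp} f_{R,\rho_1}^t$ for $t \ge T_R$, since $0 < \alpha_1 < \alpha_2$ and $\rho > \rho_1$. By (5.31) $\Gamma^{+}_{R,\rho_1} \le \Gamma^{+}_{1,\rho} \le \mathbf{1}$, and hence $\Gamma^{+}_{1,\rho} = \mathbf{1}$ by (5.30).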
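The threshold time in the claim (5.31) can be made explicit. The following is a short worked step using only the support properties stated above; the closed formula for $T_R$ is one admissible choice and is not taken from the paper.

```latex
% f^t_{1,\rho} = 1 on {s >= -ct - \alpha_1 t^\rho},
% supp f^t_{R,\rho_1} \subset {s >= -ct - \alpha_2 R t^{\rho_1}}.
% The second set is contained in the first as soon as
\[
-ct - \alpha_2 R\,t^{\rho_1} \;\ge\; -ct - \alpha_1 t^{\rho}
\iff \alpha_2 R\,t^{\rho_1} \le \alpha_1 t^{\rho}
\iff t^{\rho-\rho_1} \ge \frac{\alpha_2 R}{\alpha_1},
\]
% so, since \rho > \rho_1 and 0 < \alpha_1 < \alpha_2, one may take
\[
T_R := \Big(\frac{\alpha_2 R}{\alpha_1}\Big)^{\frac{1}{\rho-\rho_1}} .
\]
```

Note that $T_R \to +\infty$ as $R \to +\infty$, which is harmless because the comparison is only used for the limits $t \to +\infty$ at fixed $R$.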
6. Asymptotic Partition of Unity

In this section we construct in Theorem 6.4 an asymptotic partition of unity on the spaces $\mathcal{H}_c^{e+}$ constructed in Sec. 5. This partition of unity allows one to cut a state in $\mathcal{H}_c^{e+}$ into pieces having a definite number of particles in the region $\{s \ge ct\}$. The partition of unity is constructed using the operators $P_k(f)$ for a pair of cutoff functions $(f_0(s), f_\infty(s))$ defined in Sec. 2.2. For technical reasons we will also need to consider in Sec. 6.2 a particular family of cutoffs $(f_0(s), f_{\infty,\epsilon}(s))$ and to prove a weak convergence result when $\epsilon \to 0$.

6.1. Asymptotic cutoffs

Let us fix two functions $f_0, f_\infty \in C^\infty(\mathbb{R})$ with $0 \le f_\epsilon \le 1$, $\epsilon = 0, \infty$ and
\[
\begin{aligned}
& f_0 \equiv 1 \text{ in } s \le \alpha_1, \quad f_0 \equiv 0 \text{ in } s \ge \alpha_2, \\
& f_\infty \equiv 0 \text{ in } s \le \alpha_1, \quad f_\infty \equiv 1 \text{ in } s \ge \alpha_2, \\
& f_0' \le 0, \quad f_\infty' \ge 0.
\end{aligned} \tag{6.1}
\]
Here the constants $\alpha_1 < \alpha_2$ are such that $\alpha_0 < \alpha_1 < \alpha_2$ where the constant $\alpha_0$ is fixed in Sec. 5. We set $f = (f_0, f_\infty)$, $f^t = (f_0^t, f_\infty^t)$ for
\[ f_\epsilon^t := f_\epsilon\Big(\frac{s-ct}{t^{\rho}}\Big), \tag{6.2} \]
for constants $0 < c \le 1$ and $0 < \rho < 1$. We consider in this section the localization operators $P_k(f)$, $Q_k(f)$ defined in Sec. 2.2.
We recall (see [10, Lemma 2.9])
\[ \|P_k(f)\| \le 1, \quad \|Q_k(f)\| \le 1 \quad \text{if } f_0 + f_\infty \le 1. \]
Using then the definition of $P_k(f)$, $Q_k(f)$ we notice that
\[ \|P_k(f)\| \le \alpha^{-k}, \quad \|Q_k(f)\| \le \frac{1-\alpha^{-k}}{1-\alpha}, \tag{6.3} \]
if
\[ f_0 + \alpha f_\infty \le 1, \quad \alpha > 0. \tag{6.4} \]
Indeed it suffices to consider the new cutoffs $\tilde f = (f_0, \alpha f_\infty)$ and use that $P_k(f) = \alpha^{-k}P_k(\tilde f)$. In this section we will always assume that $(f_0, f_\infty)$ satisfy (6.4). We recall the following identities [10, Lemma 2.11]:
\[
\begin{aligned}
D_0P_k(f^t) &= dP_k(f^t, d_0f^t), \\
[\phi(v), iP_k(f^t)] &= \frac{i}{\sqrt 2}\big(a^{*}((1-f_0^t)v)P_k(f^t) - a^{*}(f_\infty^tv)P_{k-1}(f^t) \\
&\qquad\quad - P_k(f^t)a((1-f_0^t)v) + P_{k-1}(f^t)a(f_\infty^tv)\big).
\end{aligned} \tag{6.5}
\]
The next two lemmas, analogous to Lemma 5.3 and Proposition 5.4, are needed to get rid of the cutoffs $\chi(L)$ in the statement of Proposition 6.3.

Lemma 6.1. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 0$. Assume the constants $\rho, \delta$ are chosen so that $\rho > \delta > (1+\epsilon_0)^{-1}$, $\mu > \delta/2$. Then for $\chi_1, \chi_2, F \in C_0^\infty(\mathbb{R})$:
\[ [P_k(f^t), \chi_1(L)]F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_2(L) \in o(1). \]

Proof. By the argument above we may assume that $f_0 + f_\infty \le 1$. We have
\[ [P_k(f^t), \chi_1(L)] = \frac{i}{2\pi}\int\frac{\partial\tilde\chi_1}{\partial\bar z}(z)(z-L)^{-1}[L, P_k(f^t)](z-L)^{-1}\,dz\wedge d\bar z. \tag{6.6} \]
On $D(N^e)\cap D(L)$ we have:
\[ [L, P_k(f^t)] = dP_k(f^t, [|\sigma|, f^t]) + [\phi(v^e), P_k(f^t)]. \]
By Lemma 5.2 we have
\[ [|\sigma|, f_\epsilon^t] \in O(t^{-\rho}), \quad \epsilon = 0, \infty. \]
Applying then [10, Lemma 2.11], we get
\[ dP_k(f^t, [|\sigma|, f^t]) \in O(N^e)t^{-\rho}. \tag{6.7} \]
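Using then (6.5) and (I'2), we obtain
\[ \|(K+1)^{-1/2}[\phi(v^e), P_k(f^t)](N^e+1)^{-1/2}\| \in O(t^{-\mu}). \tag{6.8} \]
By Lemma 3.5 for $\alpha = 1$ and an interpolation argument for $\alpha = \frac12$ we obtain
\[ \|(N^e+1)^{\alpha}(z-L)^{-1}(N^e+1)^{-\alpha}\| \le C|\mathrm{Im}\, z|^{-2}, \quad \alpha = \tfrac12, 1, \quad z \in B \Subset \mathbb{C}\setminus\mathbb{R}. \]
Recall also from Lemma 4.6 that $(N^e)^{\alpha}F\big(\frac{N_t^e}{t^{\delta}}\big)\chi_2(L) \in O(t^{\delta\alpha})$. This yields
\[ \Big\|(z-L)^{-1}[L, P_k(f^t)](z-L)^{-1}F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_2(L)\Big\| \le C(t^{-\rho+\delta} + t^{-\mu+\delta/2})|\mathrm{Im}\, z|^{-3}. \]
Using (6.6) we obtain the lemma.

The following lemma follows from Lemmas 4.4 and 6.1.

Lemma 6.2. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 0$. Assume the constants $\rho, \delta$ are chosen so that $\rho > \delta > (1+\epsilon_0)^{-1}$, $\mu > \delta/2$. Then for $\chi_1, \chi_2, F \in C_0^\infty(\mathbb{R})$:
\[ P_k(f^t)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_1(L)\chi_2(L) = \chi_1(L)P_k(f^t)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi_2(L) + o(1). \]

We recall that the observable $B_{ct}$ was defined in (5.2). For $f = (f_0, f_\infty)$, $g \in B(\mathfrak{h}^e)$ we define the operator $R_k(f, g)$ as
\[ R_k(f, g)\big|_{\bigotimes_s^n\mathfrak{h}^e} := \sum_{j=1}^{n}\ \sum_{\sharp\{i \mid \epsilon_i = \infty\} = k} f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_{j-1}}\otimes g\otimes f_{\epsilon_{j+1}}\otimes\cdots\otimes f_{\epsilon_n}. \tag{6.9} \]
If $f_0 + \alpha f_\infty \le 1$, we see as in [10, Lemma 2.11] that
\[ |(v, R_k(f, g)u)| \le \alpha^{-k}\|g\|_{B(\mathfrak{h}^e)}\|(N^e)^{1/2}u\|\,\|(N^e)^{1/2}v\|. \]

Proposition 6.3. Assume (I'0), (I'2) for $\mu > 1$. Assume $0 < c < 1$ or $c = 1$ and $\alpha_2 < 0$. For $\chi \in C_0^\infty(\mathbb{R})$, $\lambda > 0$:
(i) \[ \int_1^{+\infty}\|R_k(f^t, |g_\epsilon^t|)^{1/2}(B_{ct}+\lambda)^{-1/2}\chi(L)u_t\|^{2}\,dt \le C\|u\|^{2}, \quad u \in D(N^e), \ \epsilon = 0, \infty, \]
if $g_\epsilon^t = d_0f_\epsilon^t$.
(ii) \[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(L)P_k(f^t)(B_{ct}+\lambda)^{-1}\chi(L)e^{-itH^e} \text{ exists.} \]

Proof. Let
\[ \Phi_k(t) = \chi(L)(B_{ct}+\lambda)^{-1}P_k(f^t)\chi(L), \quad \lambda > 0. \]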
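Note that $[P_k(f^t), B_{ct}] = 0$. For $u \in D(N^e)\cap D(L)$ the function $t \mapsto (u_t, \Phi_k(t)u_t)$ is $C^1$ with derivative $(u_t, D\Phi_k(t)u_t)$ and
\[
\begin{aligned}
D\Phi_k(t) = {}& \chi(L)D(B_{ct}+\lambda)^{-1}P_k(f^t)\chi(L) \\
& + \chi(L)(B_{ct}+\lambda)^{-1}\big(dP_k(f^t, d_0f^t) + [\phi(v^e), iP_k(f^t)]\big)\chi(L).
\end{aligned} \tag{6.10}
\]
We observe that $b_{ct}$ defined in (5.2) is equal to $1$ on $\operatorname{supp}(1-f_0^t)$ and on $\operatorname{supp} f_\infty^t$. Using the fact that $B_{ct}$ commutes with $P_k(f^t)$, we obtain
\[
\begin{aligned}
\|\chi(L)(B_{ct}+\lambda)^{-1}[\phi(v^e), iP_k(f^t)]\chi(L)\|
\le {}& C\|(B_{ct}+\lambda)^{-1}a^{*}((1-f_0^t)(K+1)^{-1/2}v^e)\| \\
& + C\|(B_{ct}+\lambda)^{-1}a((1-f_0^t)(K+1)^{-1/2}v^e)\| \\
& + C\|(B_{ct}+\lambda)^{-1}a^{*}(f_\infty^t(K+1)^{-1/2}v^e)\| \\
& + C\|(B_{ct}+\lambda)^{-1}a(f_\infty^t(K+1)^{-1/2}v^e)\|.
\end{aligned}
\]
Applying Proposition A.1, we obtain
\[
\begin{aligned}
\|\chi(L)(B_{ct}+\lambda)^{-1}[\phi(v^e), iP_k(f^t)]\chi(L)\|
&\le C\|(1-f_0^t)(K+1)^{-1/2}v^e\| + C\|f_\infty^t(K+1)^{-1/2}v^e\| \\
&\le Ct^{-\mu},
\end{aligned} \tag{6.11}
\]
by (I'2). On the other hand we have on the $n$-particle sector:
\[
\begin{aligned}
dP_k(f, g) = {}& \sum_{j=1}^{n}\ \sum_{\sharp\{i \mid \epsilon_i = \infty\} = k} f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_{j-1}}\otimes g_0\otimes f_{\epsilon_{j+1}}\otimes\cdots\otimes f_{\epsilon_n} \\
& + \sum_{j=1}^{n}\ \sum_{\sharp\{i \mid \epsilon_i = \infty\} = k-1} f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_{j-1}}\otimes g_\infty\otimes f_{\epsilon_{j+1}}\otimes\cdots\otimes f_{\epsilon_n} \\
= {}& R_k(f, g_0) + R_{k-1}(f, g_\infty).
\end{aligned}
\]
Finally as in the proof of Proposition 5.1:
\[ D(B_{ct}+\lambda)^{-1} = -(B_{ct}+\lambda)^{-1}d\Gamma(c_t)(B_{ct}+\lambda)^{-1} + [\phi(v^e), i(B_{ct}+\lambda)^{-1}], \]
and by (5.4)
\[ \|\chi(L)[\phi(v^e), i(B_{ct}+\lambda)^{-1}]\| \in O(t^{-\mu}). \tag{6.12} \]
Note also that
\[ (B_{ct}+\lambda)^{-1}d\Gamma(c_t)P_k(f^t)(B_{ct}+\lambda)^{-1} = (B_{ct}+\lambda)^{-1}d\Gamma(c_t)^{1/2}P_k(f^t)\,d\Gamma(c_t)^{1/2}(B_{ct}+\lambda)^{-1}, \tag{6.13} \]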
since $P_k(f^t)$ commutes with $B_{ct}$ and $d\Gamma(c_t)$. Using (6.12), (6.13) and Proposition 5.1, we see that the first term on the r.h.s. of (6.10) is integrable along the evolution. Let us now consider the second term. Assume first that $f_0 + f_\infty = 1$. Then $d_0f_0^t = -d_0f_\infty^t$, and hence
\[ dP_k(f^t, d_0f^t) = R_k(f^t, g_0^t) - R_{k-1}(f^t, g_0^t), \tag{6.14} \]
where we set $R_{-1}(f, g) = 0$. Next
\[ g_0^t = d_0f_0^t = f_0'\Big(\frac{s-ct}{t^{\rho}}\Big)\Big(\frac{1-c}{t^{\rho}} - \rho\,\frac{s-ct}{t^{\rho+1}}\Big). \]
If $0 < c < 1$ or $c = 1$ and $\alpha_2 < 0$ we have $g_0^t \le 0$ for $t \gg 1$. By (6.11), Proposition 5.1 and Proposition A.3, we obtain
\[ \int_1^{+\infty}\|R_0(f^t, |g_0^t|)^{1/2}(B_{ct}+\lambda)^{-1/2}\chi(L)u_t\|^{2}\,dt \le C\|u\|^{2}, \quad u \in D(N^e). \]
Using then (6.14) we obtain by induction on $k$:
\[ \int_1^{+\infty}\|R_k(f^t, |g_0^t|)^{1/2}(B_{ct}+\lambda)^{-1/2}\chi(L)u_t\|^{2}\,dt \le C\|u\|^{2}, \quad u \in D(N^e). \tag{6.15} \]
Let us now assume that $f_0 + \alpha f_\infty \le 1$. Introducing the cutoffs $\tilde f = (f_0, \alpha f_\infty)$, we may assume that $f_0 + f_\infty \le 1$. Since $f_0 \le 1 - f_\infty$, $f_\infty \le 1 - f_0$, we have:
\[
\begin{aligned}
R_k(f^t, |g_0^t|) &\le R_k(l^t, |g_0^t|) \quad \text{for } l^t = (f_0^t, 1-f_0^t), \\
R_k(f^t, |g_\infty^t|) &\le R_k(l^t, |g_\infty^t|) \quad \text{for } l^t = (1-f_\infty^t, f_\infty^t).
\end{aligned} \tag{6.16}
\]
If we set $l^t = (f_0^t, 1-f_0^t)$ then $g_0^t = d_0l_0^t$ and if we set $l^t = (1-f_\infty^t, f_\infty^t)$ then $g_\infty^t = -d_0l_0^t$. Hence (i) follows from (6.16) and the estimates (6.15) for the two choices of $l^t$ above. Property (ii) follows from (i) and Proposition A.4.
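The sign claim $g_0^t \le 0$ for large $t$, used in the proof above, can be checked directly from (6.1). The following is a worked step, writing $x := (s-ct)/t^{\rho}$ (a shorthand introduced here) and using only that $f_0' \le 0$ is supported in $[\alpha_1, \alpha_2]$:

```latex
\[
g_0^t = t^{-\rho}\, f_0'(x)\,\Big((1-c) - \frac{\rho\,x}{t}\Big),
\qquad x = \frac{s-ct}{t^{\rho}} \in [\alpha_1,\alpha_2] \ \text{on } \operatorname{supp} f_0'(x).
\]
% Case 0 < c < 1: since x <= \alpha_2,
\[
(1-c) - \frac{\rho\,x}{t} \;\ge\; (1-c) - \frac{\rho\,\max(\alpha_2,0)}{t} \;\ge\; \frac{1-c}{2}
\quad\text{for } t \ge \frac{2\rho\,\max(\alpha_2,0)}{1-c}.
\]
% Case c = 1 and \alpha_2 < 0: then x <= \alpha_2 < 0, so
\[
(1-c) - \frac{\rho\,x}{t} = -\frac{\rho\,x}{t} \;\ge\; \frac{\rho\,|\alpha_2|}{t} \;>\; 0 .
\]
% In both cases the bracket is positive and f_0' <= 0, hence g_0^t <= 0.
```

In the second case the lower bound on the bracket is only of order $t^{-1}$ rather than $t^{-\rho}$, which is the reason the case $c = 1$ needs the separate sign condition on $\alpha_2$.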
Theorem 6.4. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0) > 1$. Fix $0 < c < 1$ and $c < c' < 1$. Let us denote $f^t = (f_0^t, f_\infty^t)$ defined in (6.2) by $f_c^t$ to indicate the dependence on the constant $c$. Then
(i) the limit
\[ P^{+}_{c'k}(f_0, f_\infty) := \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}P_k(f_{c'}^t)e^{-itH^e} \text{ exists on } \mathcal{H}_c^{e+}, \]
(ii) $[P^{+}_{c'k}(f_0, f_\infty), H^e] = 0$,
(iii) $[P^{+}_{c'k}(f_0, f_\infty), L] = 0$,
(iv) if $f_0 + f_\infty = 1$ then
\[ \operatorname{s-}\sum_{k=0}^{+\infty}P^{+}_{c'k}(f_0, f_\infty) = \mathbf{1} \text{ on } \mathcal{H}_c^{e+}. \]
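At fixed time the analogue of (iv) is purely algebraic. The sketch below assumes that, on the $n$-particle sector, $P_k(f)$ is the sum of the tensor products $f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_n}$ with exactly $k$ indices equal to $\infty$; this is the form used in the proof of (6.22) below, while the actual definition is in Sec. 2.2, which is not reproduced here.

```latex
% Assumed form of P_k(f) on the n-particle sector (see lead-in):
\[
P_k(f)\big|_{n} = \sum_{\sharp\{i \mid \epsilon_i=\infty\}=k}
   f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_n},
\qquad \epsilon_i \in \{0,\infty\}.
\]
% Summing over k expands the n-fold tensor power of f_0 + f_\infty:
\[
\sum_{k=0}^{n} P_k(f)\big|_{n}
 = \sum_{\epsilon\in\{0,\infty\}^{n}} f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_n}
 = (f_0+f_\infty)^{\otimes n} = \mathbf{1}
\quad\text{if } f_0+f_\infty = 1 .
\]
```

The content of (iv) is that this identity survives the limit $t \to +\infty$ on $\mathcal{H}_c^{e+}$, i.e. that no mass escapes to $k = \infty$; this is where the bound (6.22) and the finiteness of the asymptotic particle number enter.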
For $k = 0$ the asymptotic cutoffs $P^{+}_{c'k}(f)$ take a simpler form. In fact we have $P_0(f_0, f_\infty) = \Gamma(f_0)$. We denote $P^{+}_{c'0}(f_0, f_\infty)$ by $\Gamma^{e+}_{c'}(f_0)$ and we have
\[ \Gamma^{e+}_{c'}(f_0) = \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Gamma(f^t_{c'0})e^{-itH^e} \text{ on } \mathcal{H}_c^{e+}, \quad \text{for } 0 < c < c' < 1. \tag{6.17} \]

Proof. Let us first prove (i). By the definition of $\mathcal{H}_c^{e+}$ it suffices to prove the theorem on $\operatorname{Ran}\hat P^{e+}_{c'}$ for $c < c'$. Changing notation we may replace $c'$ by $c$. By Theorem 5.5 we may restrict ourselves to vectors $u \in \operatorname{Ran}\hat P_c^{e+}$ such that $u = \chi(L)u$, $\chi \in C_0^\infty(\mathbb{R})$. Moreover for each $u \in \operatorname{Ran}\hat P_c^{e+}$ and $\epsilon_1 > 0$ there exists $\epsilon > 0$ such that
\[ e^{-itH^e}u = (\epsilon B_{ct}+1)^{-1}e^{-itH^e}u + e^{-itH^e}r_\epsilon + o(1), \tag{6.18} \]
with $\|r_\epsilon\| \le \epsilon_1$. We pick now $\delta > 0$ such that $\rho > \delta$, $\mu > \delta/2$ and $\delta(1+\epsilon_0) > 1$, which is possible since $\rho(1+\epsilon_0) > 1$, $\mu > 1$, and consider the observable $N_t^e$ constructed in Sec. 4.2. If $F \in C_0^\infty(\mathbb{R})$, $F \equiv 1$ near $0$ we have:
\[
\begin{aligned}
P_k(f^t)e^{-itH^e}u &= P_k(f^t)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi^{2}(L)e^{-itH^e}u + o(1) \\
&= \chi(L)P_k(f^t)\chi(L)F\Big(\frac{N_t^e}{t^{\delta}}\Big)e^{-itH^e}u + o(1) \\
&= \chi(L)P_k(f^t)\chi(L)e^{-itH^e}u + o(1),
\end{aligned} \tag{6.19}
\]
where we used successively Proposition 4.5, Lemma 4.4, Lemma 6.2 and Proposition 4.5 again. Next we write using (6.18):
\[
\begin{aligned}
\chi(L)P_k(f^t)\chi(L)e^{-itH^e}u
&= \chi(L)P_k(f^t)e^{-itH^e}u \\
&= \chi(L)P_k(f^t)(\epsilon B_{ct}+1)^{-1}e^{-itH^e}u + \chi(L)P_k(f^t)e^{-itH^e}r_\epsilon + o(1) \\
&= \chi(L)P_k(f^t)(\epsilon B_{ct}+1)^{-1}\chi(L)e^{-itH^e}u + \chi(L)P_k(f^t)e^{-itH^e}r_\epsilon + o(1).
\end{aligned}
\]
Hence to prove (i) it suffices to prove the existence of
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(L)P_k(f^t)(\epsilon B_{ct}+1)^{-1}\chi(L)e^{-itH^e}u, \]
which is shown in Proposition 6.3. (iii) follows from the same arguments as in (6.19). In fact using Lemma 6.2 we obtain that if $\chi(L)u = u$ then $\chi_1(L)P_k^{+}(f)u = P_k^{+}(f)\chi_1(L)u$, which proves (iii). To prove (ii), it suffices to prove that
\[ e^{it_1H^e}P_k^{+}(f)e^{-it_1H^e} = P_k^{+}(f), \quad \forall t_1 \in \mathbb{R}, \]
or equivalently
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\big(P_k(f^t) - P_k(f^{t-t_1})\big)e^{-itH^e} = 0. \tag{6.20} \]
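Using [10, Lemma 2.11], we have:
\[ P_k(f^t) - P_k(f^{t-t_1}) = -\int_0^{t_1}dP_k(f^t, \partial_tf^{t-r})\,dr. \]
Since $\partial_tf^t \in O(t^{-\rho})$, we obtain
\[ \big(P_k(f^t) - P_k(f^{t-t_1})\big) \in O(N^e)t^{-\rho}. \tag{6.21} \]
For $u \in \mathcal{H}_c^{e+}$ with $u = \chi(L)u$, $\chi \in C_0^\infty(\mathbb{R})$, we have:
\[ \big(P_k(f^t) - P_k(f^{t-t_1})\big)e^{-itH^e}u = \big(P_k(f^t) - P_k(f^{t-t_1})\big)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L)e^{-itH^e}u + o(1), \]
by Proposition 4.5. Using then (6.21) and Lemma 4.6, we obtain
\[ \big(P_k(f^t) - P_k(f^{t-t_1})\big)F\Big(\frac{N_t^e}{t^{\delta}}\Big)\chi(L) \in O(t^{\delta-\rho}), \]
which proves (6.20) since $\rho > \delta$.

Let us now prove (iv). We claim that if $f = (f_0, f_\infty)$, $f_0 + f_\infty \le 1$ and $b \equiv 1$ on $\operatorname{supp} f_\infty$ then
\[ \Big\|\sum_{k=m}^{\infty}P_k(f)(d\Gamma(b)+\lambda)^{-1}\Big\| \le \frac{1}{m+\lambda}. \tag{6.22} \]
In fact on the $n$-particle sector we have:
\[
\sum_{k=m}^{\infty}P_k(f)(d\Gamma(b)+\lambda)^{-1}
= \sum_{m \le \sharp\{i \mid \epsilon_i = \infty\} \le n} f_{\epsilon_1}\otimes\cdots\otimes f_{\epsilon_n}\Big(\sum_{i=1}^{n}b_i + \lambda\Big)^{-1}
\le (m+\lambda)^{-1}.
\]
Assume now that $f_0 + f_\infty = 1$. Since $\sum_{k=0}^{m}P_k^{+}(f) \le \mathbf{1}$, to prove (iv) it suffices by density to show that
\[ \lim_{m\to\infty}\Big(\mathbf{1} - \sum_{k=0}^{m-1}P_k^{+}(f)\Big)\epsilon^{-1}R_c^{e+}(\epsilon^{-1})u = 0, \quad \forall \epsilon > 0, \]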
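where $R_c^{e+}(\lambda)$ is defined in Theorem 5.5. Now
\[
\begin{aligned}
\Big\|\Big(\mathbf{1} - \sum_{k=0}^{m-1}P_k^{+}(f)\Big)\epsilon^{-1}R_c^{e+}(\epsilon^{-1})u\Big\|
&= \lim_{t\to+\infty}\Big\|e^{itH^e}\sum_{k=m}^{\infty}P_k(f^t)(\epsilon B_{ct}+1)^{-1}e^{-itH^e}u\Big\| \\
&\le (\epsilon m+1)^{-1}\|u\|,
\end{aligned}
\]
by (6.22). This proves (iv).

6.2. Weak limits

We will consider now for technical purposes a specific choice of the cutoffs $f_0, f_\infty$. We set
\[
g(t) := \begin{cases} \exp\big(-(t-\alpha_1)^{-1}(\alpha_2-t)^{-1}\big), & t \in\, ]\alpha_1, \alpha_2[, \\ 0, & t \notin\, ]\alpha_1, \alpha_2[, \end{cases}
\qquad
g_\epsilon(t) := \begin{cases} \exp\big(-(t-\alpha_1-\epsilon)^{-1}(\alpha_2-t)^{-1}\big), & t \in\, ]\alpha_1+\epsilon, \alpha_2[, \\ 0, & t \notin\, ]\alpha_1+\epsilon, \alpha_2[, \end{cases}
\]
for $0 < \epsilon < \frac12(\alpha_2-\alpha_1)$. Clearly $g_\epsilon \le g$. Let
\[ C = \int_{-\infty}^{+\infty}g(t)\,dt, \quad C_\epsilon = \int_{-\infty}^{+\infty}g_\epsilon(t)\,dt, \]
and note that $\lim_{\epsilon\to 0}C_\epsilon = C$. We set
\[ f_\infty(s) = C^{-1}\int_{-\infty}^{s}g(s')\,ds', \quad f_{\infty,\epsilon}(s) = C_\epsilon^{-1}\int_{-\infty}^{s}g_\epsilon(s')\,ds'. \tag{6.23} \]
Since $g_\epsilon \le g$, we have
\[ f_{\infty,\epsilon} \le \frac{C}{C_\epsilon}f_\infty, \quad f_{\infty,\epsilon}' \le \frac{C}{C_\epsilon}f_\infty', \]
and
\[
\begin{aligned}
& f_\infty(s) \equiv 1 \text{ for } s \ge \alpha_2, \quad f_\infty(s) \equiv 0 \text{ for } s \le \alpha_1, \\
& f_{\infty,\epsilon}(s) \equiv 1 \text{ for } s \ge \alpha_2, \quad f_{\infty,\epsilon}(s) \equiv 0 \text{ for } s \le \alpha_1+\epsilon.
\end{aligned}
\]
Note also that by [9, Lemma A.4.1], $f_\infty^{1/2}, f_{\infty,\epsilon}^{1/2} \in C^\infty(\mathbb{R})$. Next we set
\[ f_0 := 1 - f_\infty, \tag{6.24} \]
and again by [9, Lemma A.4.1] $f_0^{1/2} \in C^\infty(\mathbb{R})$. The following lemma summarizes the properties of $f_0$, $f_{\infty,\epsilon}$.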
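Lemma 6.5.
(i) $f_0 + f_\infty = 1$,
(ii) $\exists\,\alpha>0$, $\forall\,\epsilon>0$: $f_0 + \alpha f_{\infty,\epsilon}\le1$,
(iii) $\forall\,\epsilon>0$, $\exists\,\alpha>0$: $f_0^{1/2} + \alpha f_{\infty,\epsilon}^{1/2}\le1$.

Proof. (i) is obvious. (ii) follows from the fact that $C_\epsilon f_{\infty,\epsilon}\le Cf_\infty$. Since $f_0\le1$ and $f_{\infty,\epsilon}\equiv0$ in $\{s\le\alpha_1+\epsilon\}$, we have: $\forall\,\alpha>0$,
\[ f_0^{1/2}(s) + \alpha f_{\infty,\epsilon}^{1/2}(s)\le1 \quad\text{in } \{s\le\alpha_1+\epsilon\}. \]
So it suffices to verify that
\[ \inf_{s>\alpha_1+\epsilon}\frac{1-f_0^{1/2}(s)}{f_{\infty,\epsilon}^{1/2}(s)} > 0, \]
or equivalently
\[ \inf_{s>\alpha_1+\epsilon}\frac{(1-f_0^{1/2}(s))^2}{f_{\infty,\epsilon}(s)} > 0. \]
If $s>\alpha_1+\epsilon$, $f_\infty(s)\ge r_\epsilon>0$, hence
\[ \frac{1}{f_{\infty,\epsilon}(s)}(1-f_0^{1/2}(s))^2 \ge (1-f_0^{1/2}(s))^2 \ge (1-(1-r_\epsilon)^{1/2})^2 > 0. \]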
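As a numerical sanity check of the cutoff construction (6.23), (6.24) and of the three bounds of Lemma 6.5, the following Python sketch tabulates $g$, $g_\epsilon$, $f_\infty$, $f_{\infty,\epsilon}$ and $f_0$ on a grid. The parameter values ($\alpha_1=0$, $\alpha_2=1$, $\epsilon=0.25$), the grid, the trapezoid rule and all function names are illustrative assumptions and do not come from the paper; a finite grid can only suggest the inequalities, not prove them.

```python
import math

def bump(t, left, right):
    """exp(-1/((t-left)(right-t))) on ]left, right[, zero elsewhere."""
    if t <= left or t >= right:
        return 0.0
    return math.exp(-1.0 / ((t - left) * (right - t)))

def cumulative(values, grid):
    """Running trapezoid integral of `values` over `grid`, starting at 0."""
    out = [0.0]
    for i in range(1, len(grid)):
        step = grid[i] - grid[i - 1]
        out.append(out[-1] + 0.5 * (values[i] + values[i - 1]) * step)
    return out

def check_lemma_6_5(alpha1=0.0, alpha2=1.0, eps=0.25, n=4001):
    """Hypothetical helper: worst-case grid values for the bounds of Lemma 6.5."""
    assert 0.0 < eps < 0.5 * (alpha2 - alpha1)
    lo, hi = alpha1 - 0.5, alpha2 + 0.5
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    G = cumulative([bump(t, alpha1, alpha2) for t in grid], grid)
    G_eps = cumulative([bump(t, alpha1 + eps, alpha2) for t in grid], grid)
    C, C_eps = G[-1], G_eps[-1]           # normalizing constants C and C_eps
    f_inf = [x / C for x in G]            # f_infinity of (6.23)
    f_inf_eps = [x / C_eps for x in G_eps]
    f0 = [max(0.0, 1.0 - x) for x in f_inf]  # f_0 of (6.24)
    alpha = C_eps / C                     # candidate constant for (ii)
    # Domination f_{inf,eps} <= (C / C_eps) f_inf; should be <= 0 up to rounding.
    max_dom = max(b - (C / C_eps) * a for a, b in zip(f_inf, f_inf_eps))
    max_i = max(abs(a + b - 1.0) for a, b in zip(f0, f_inf))
    max_ii = max(a + alpha * b for a, b in zip(f0, f_inf_eps))
    # Candidate constant for (iii): infimum of the ratio where f_{inf,eps} > 0.
    alpha_eps = min((1.0 - math.sqrt(a)) / math.sqrt(b)
                    for a, b in zip(f0, f_inf_eps) if b > 0.0)
    max_iii = max(math.sqrt(a) + alpha_eps * math.sqrt(b)
                  for a, b in zip(f0, f_inf_eps))
    return {"alpha": alpha, "alpha_eps": alpha_eps, "max_dom": max_dom,
            "max_i": max_i, "max_ii": max_ii, "max_iii": max_iii}

if __name__ == "__main__":
    for name, value in check_lemma_6_5().items():
        print(name, value)
```

Here `alpha = C_eps / C` is one admissible constant for (ii) at the chosen $\epsilon$, and `alpha_eps` is the grid infimum appearing in the proof of (iii); both are positive in this example.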
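Proposition 6.6. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and pick $\rho$ in (6.2) such that $\rho(1+\epsilon_0)>1$. Then for $0<c<c'<1$
\[ P^+_{c'k}(f_0,f_\infty) = \mathrm{w\text{-}}\lim_{\epsilon\to0} P^+_{c'k}(f_0,f_{\infty,\epsilon}) \quad\text{on } \mathcal{H}_c^{e+}. \]

Proof. As in the proof of Theorem 6.4 it suffices to prove the proposition on $\operatorname{Ran}\hat P^{e+}_{c'}$. Changing notation, we may replace $c'$ by $c$. By density it suffices to prove that
\[ \mathrm{w\text{-}}\lim_{\epsilon\to0}\chi(L)P^+_{ck}(f_0,f_{\infty,\epsilon})R_c^{e+}(\lambda)\chi(L) = \chi(L)P^+_{ck}(f_0,f_\infty)R_c^{e+}(\lambda)\chi(L), \]
for $\chi\in C_0^\infty(\mathbb{R})$, $\lambda>0$. Let us omit the index $c$ to simplify notation. We will apply Proposition A.5. To do this we need to estimate, uniformly in $\epsilon$, the Heisenberg derivative of
\[ \Phi_\epsilon(t) = \chi(L)P_k(f_0^t,f^t_{\infty,\epsilon})(B_{ct}+\lambda)^{-1}\chi(L). \]
By (6.12) and (6.13) we have:
\[ |(u_2,\chi(L)P_k(f_0^t,f^t_{\infty,\epsilon})D(B_{ct}+\lambda)^{-1}\chi(L)u_1)| \le Ct^{-\mu}\|P_k(f_0^t,f^t_{\infty,\epsilon})\|\,\|u_1\|\,\|u_2\| \]
\[ \qquad + \|P_k(f_0^t,f^t_{\infty,\epsilon})\|\,\|d\Gamma(c_t)^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)u_2\|\,\|d\Gamma(c_t)^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)u_1\|. \]
By Lemma 6.5(ii) we have $\|P_k(f_0^t,f^t_{\infty,\epsilon})\|\le\alpha^{-k}$ uniformly in $\epsilon$.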
Let us now consider the terms coming from $DP_k(f_0^t,f^t_{\infty,\epsilon})$. By (6.11):
\[ \|\chi(L)(B_{ct}+\lambda)^{-1}[\phi(v^e), iP_k(f_0^t,f^t_{\infty,\epsilon})]\chi(L)\| \le C\|(1-f_0^t)(K+1)^{-1/2}v^e\| + C\|f^t_{\infty,\epsilon}(K+1)^{-1/2}v^e\| \]
\[ \qquad \le C\|(1-f_0^t)(K+1)^{-1/2}v^e\| + C\|f^t_\infty(K+1)^{-1/2}v^e\| \le Ct^{-\mu}, \]
uniformly in $0<\epsilon<\frac12(\alpha_2-\alpha_1)$, since $f_{\infty,\epsilon}\le C_0f_\infty$. Finally
\[ D_0P_k(f_0^t,f^t_{\infty,\epsilon}) = R_k((f_0^t,f^t_{\infty,\epsilon}),d_0f_0^t) + R_{k-1}((f_0^t,f^t_{\infty,\epsilon}),d_0f^t_{\infty,\epsilon}). \]
Since
\[ f_{\infty,\epsilon}\le C_0f_\infty, \qquad f'_{\infty,\epsilon}\le C_0f'_\infty \]
uniformly for $0<\epsilon<\frac12(\alpha_2-\alpha_1)$, we have:
\[ |R_k((f_0^t,f^t_{\infty,\epsilon}),d_0f_0^t)|^{1/2} \le C_0^{k/2}R_k((f_0^t,f^t_\infty),|d_0f_0^t|)^{1/2}, \]
\[ |R_{k-1}((f_0^t,f^t_{\infty,\epsilon}),d_0f^t_{\infty,\epsilon})|^{1/2} \le C_0^{k/2}R_{k-1}((f_0^t,f^t_\infty),|d_0f^t_\infty|)^{1/2}. \]
This yields
\[ |(u_2,\chi(L)(B_{ct}+\lambda)^{-1}D_0P_k(f_0^t,f^t_{\infty,\epsilon})\chi(L)u_1)| \le C_0^k\bigl(\|R_1(t)\chi(L)u_1\|\,\|R_1(t)\chi(L)u_2\| + \|R_2(t)\chi(L)u_1\|\,\|R_2(t)\chi(L)u_2\|\bigr), \]
for
\[ R_1(t) = (B_{ct}+\lambda)^{-1/2}R_k((f_0^t,f^t_\infty),|d_0f_0^t|)^{1/2}, \qquad R_2(t) = (B_{ct}+\lambda)^{-1/2}R_{k-1}((f_0^t,f^t_\infty),|d_0f^t_\infty|)^{1/2}. \]
By (I'2), Proposition 5.1 and Proposition 6.3, Hypothesis (A.2) of Proposition A.5 is satisfied. Hypothesis (A.1) is clearly satisfied since
\[ \|P_k(f_0^t,f^t_{\infty,\epsilon})\|\le\alpha^{-k}, \quad\text{uniformly in } 0<\epsilon<\tfrac12(\alpha_2-\alpha_1), \]
by Lemma 6.5(ii). Finally
\[ \mathrm{w\text{-}}\lim_{\epsilon\to0}P_k(f_0^t,f^t_{\infty,\epsilon}) = P_k(f_0^t,f^t_\infty), \quad \forall\,t\ge0, \]
and hence Hypothesis (A.3) of Proposition A.5 is satisfied. Applying Proposition A.5 we obtain the proposition.

7. Geometric Inverse Wave Operators

This section is devoted to the construction of geometric inverse wave operators on the spaces $\mathcal{H}_c^{e+}$. This is an essential step in the proof of geometric asymptotic completeness on $\mathcal{H}_c^{e+}$. The key technical result in this section is Lemma 7.3.
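7.1. Extended objects

We first define so-called extended objects, which provide a convenient framework for scattering theory (see [10, Sec. 3.4]). Let
\[ \mathcal{H}^e_{\rm ext} := \mathcal{H}^e\otimes\Gamma(\mathfrak{h}^e), \]
\[ H^e_{\rm ext} := H^e\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{H}^e}\otimes d\Gamma(\sigma), \quad\text{acting on } \mathcal{H}^e_{\rm ext}, \]
\[ L_{\rm ext} := L\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{H}^e}\otimes d\Gamma(|\sigma|), \quad\text{acting on } \mathcal{H}^e_{\rm ext}. \]
We set
\[ N_0^e := N^e\otimes\mathbb{1}, \qquad N_\infty^e := \mathbb{1}\otimes N^e, \qquad N^e_{\rm ext} := N_0^e + N_\infty^e. \]
The interpretation of the tensor product $\mathcal{H}^e_{\rm ext}$ is as follows: $\Gamma(\mathfrak{h}^e)$ contains the asymptotically free bosons while $\mathcal{H}^e$ contains the atom and the bosons staying close to it. We define also the extended Heisenberg derivatives (see [10, Sec. 3.4]):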
ˇ 0 f (t) := ∂ f (t) + (σ ⊕ σ)if (t) − if (t)σ , d ∂t f (t) ∈ B(he , he ⊕ he ) , ˇ 0 F (t) := ∂ F (t) + (dΓ(σ) ⊗ 1l + 1l ⊗ dΓ(σ))iF (t) − iF (t)dΓ(σ) , D ∂t F (t) ∈ B(Γ(he ), Γ(he ) ⊗ Γ(he )) , ∂ ˇ DB(t) := B(t) + H ext iB(t) − iB(t)H e , ∂t B(t) ∈ B(He , Hext ) . Note that with the notation in Sec. 2.2 we have ˇ0f ) . ˇ d ˇ 0 dΓ(f ) = dΓ( D ˇ ˇ k (j) defined in Sec. 2.2 for the In this section, we will use the operators Γ(j), Γ following choice of j. We pick cutoff functions j0 , j∞ satisfying (6.1) and (6.4). We t ) for set j t = (j0t , j∞ s − ct , 0 < c ≤ 1 , 0 < ρ < 1 , = 0, ∞ . (7.1) jt = j tρ 7.2. Technical estimates Lemma 7.1. Assume (I0 0), (I0 1) for 0 > 0, (I0 2) for µ > 0 and let ρ > δ > (1 + 0 )−1 , µ > δ/2. Then for χ1 , χ2 ∈ C0∞ R: e ˇ k (j t ) − Γ ˇ k (j t )χ1 (L))F Nt χ2 (L) ∈ o(1) . (χ1 (Lext )Γ tδ
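Proof. Considering $\tilde j = (j_0,\alpha j_\infty)$ and noting that $\check\Gamma_k(j^t) = \alpha^{-k}\check\Gamma_k(\tilde j^t)$, we may assume that $j_0+j_\infty\le1$ and hence $j_0^2+j_\infty^2\le1$. Since $\check\Gamma_k(j^t) = \mathbb{1}_{\{k\}}(N^e_\infty)\check\Gamma(j^t)$ and $N^e_\infty$ commutes with $L_{\rm ext}$, it suffices to prove the lemma for $\check\Gamma(j^t)$. We write
\[ \bigl(\chi_1(L_{\rm ext})\check\Gamma(j^t) - \check\Gamma(j^t)\chi_1(L)\bigr)F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_2(L) \]
\[ \quad = \frac{i}{2\pi}\int\frac{\partial\tilde\chi_1}{\partial\bar z}(z)(z-L_{\rm ext})^{-1}\bigl(L_{\rm ext}\check\Gamma(j^t) - \check\Gamma(j^t)L\bigr)(z-L)^{-1}F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_2(L)\,dz\wedge d\bar z. \]
On $D(L)$ we have:
\[ L = K\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}}\otimes d\Gamma(|\sigma|) + \phi(v^e), \]
and on $D(L_{\rm ext})$
\[ L_{\rm ext} = K\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}}\otimes d\Gamma(|\sigma|)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \phi(v^e)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{K}}\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\otimes d\Gamma(|\sigma|). \]
By [10, Lemma 2.14]:
\[ \phi(v^e)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\check\Gamma(j^t) - \check\Gamma(j^t)\phi(v^e) = \frac{1}{\sqrt2}\Bigl(\bigl(a^*((1-j_0^t)v^e)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} - \mathbb{1}_{\Gamma(\mathfrak{h}^e)}\hat\otimes\,a^*(j^t_\infty v^e)\bigr)\check\Gamma(j^t) - \check\Gamma(j^t)a((1-j_0^t)v^e)\Bigr), \tag{7.2} \]
where the twisted tensor product $\hat\otimes$ is defined as follows: let $T:\mathcal{K}\otimes\Gamma(\mathfrak{h}^e)\otimes\Gamma(\mathfrak{h}^e)\to\Gamma(\mathfrak{h}^e)\otimes\mathcal{K}\otimes\Gamma(\mathfrak{h}^e)$ be the unitary operator defined by $T\psi\otimes u_1\otimes u_2 = u_1\otimes\psi\otimes u_2$. Then if $B$ is an operator on $\mathcal{K}\otimes\Gamma(\mathfrak{h}^e)$, we set
\[ \mathbb{1}_{\Gamma(\mathfrak{h}^e)}\hat\otimes B := T^{-1}\,\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\otimes B\,T. \]
By [10, Lemma 2.16]:
\[ L_0^{\rm ext}\check\Gamma(j^t) - \check\Gamma(j^t)L_0 = d\check\Gamma(j^t,k^t), \]
for $k^t = (k_0^t,k_\infty^t)$, $k_\epsilon^t = [|\sigma|,j_\epsilon^t]$. By Lemma 5.2, $k_\epsilon^t\in O(t^{-\rho})$ and by [10, Lemma 2.16] we have:
\[ \bigl(L_0^{\rm ext}\check\Gamma(j^t) - \check\Gamma(j^t)L_0\bigr)\in O(N^e)\,t^{-\rho}. \tag{7.3} \]
Using then (7.2), Proposition A.1 and Hypothesis (I'2), we have:
\[ \|(L_{\rm ext}+i)^{-1}\bigl(\phi(v^e)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\check\Gamma(j^t) - \check\Gamma(j^t)\phi(v^e)\bigr)(N+1)^{-1/2}\| \le C\|(1-j_0^t)(K+1)^{-1/2}v^e\| + C\|j^t_\infty(K+1)^{-1/2}v^e\| \le Ct^{-\mu}. \tag{7.4} \]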
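Now using (5.8) and Lemma 4.6 we obtain
\[ \Bigl\|(z-L_{\rm ext})^{-1}\bigl(L_{\rm ext}\check\Gamma(j^t) - \check\Gamma(j^t)L\bigr)(z-L)^{-1}F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_2(L)\Bigr\| \le C(t^{\delta-\rho} + t^{-\mu+\delta/2})|\operatorname{Im}z|^{-4}, \quad z\in\operatorname{supp}\tilde\chi_1. \]
This implies the lemma.

Lemma 7.2. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>0$ and $\rho>\delta>(1+\epsilon_0)^{-1}$, $\mu>\delta/2$. Then for $\chi_1,\chi_2\in C_0^\infty(\mathbb{R})$:
\[ \check\Gamma_k(j^t)F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_1(L)\chi_2(L) - \chi_1(L_{\rm ext})\check\Gamma_k(j^t)F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_2(L)\in o(1). \]

Proof. We combine Lemma 7.1 and Lemma 4.4.

In the following lemma we use the operators $R_k(f,g)$ introduced in (6.9).

Lemma 7.3. Assume $j_0+\alpha j_\infty\le1$. Let $r_\epsilon^t = d_0j_\epsilon^t$, $\epsilon = 0,\infty$. Then for $u\in\mathcal{H}^e$, $v\in\mathcal{H}^e_{\rm ext}$:
\[ |(v,\check D_0\check\Gamma_k(j^t)u)| \le \bigl((u,R_k(j^t,|r_0^t|)u) + (u,R_{k-1}(j^t,|r_\infty^t|)u)\bigr)^{1/2} \]
\[ \qquad\times\bigl(\alpha^{-k}(v,R_0(j^t,|r_0^t|)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}v) + (v,\mathbb{1}_{\mathcal{K}\otimes\Gamma(\mathfrak{h}^e)}\otimes R_{k-1}(j^t,|r_\infty^t|)v)\bigr)^{1/2}. \]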
Proof. To lighten notation we will suppress the exponent t in jt , rt . On the n-particle sector, we have (see Sec. 2.2): k
ˇ k (j) = Ik Γ
! 12
n
j0 ⊗ · · · ⊗ j0 ⊗ j∞ ⊗ · · · ⊗ j∞ , | {z } | {z } n−k
k
so ˇ k (j) = Ik ˇ 0Γ D
k n
n−k X
! 12
n
Let k n
j0 ⊗ · · · ⊗ r0 ⊗ · · · ⊗ j0 ⊗ j∞ ⊗ · · · ⊗ j∞ ^
i=1
k
+ Ik
R=
! 12
!
i n X
j0 ⊗ · · · ⊗ j0 ⊗ j∞ ⊗ · · · ⊗ r∞ ⊗ · · · ⊗ j∞ . ^
i=n−k+1
n X i=1
i
r ⊗ · · · ⊗ j∞ u), u ∈ (u, j0 ⊗ · · · ⊗ ^ i
n O s
he ,
where ri = r0 if i ≤ n − k, ri = r∞ if i > n − k. We claim that R = (u, dPk (j, r)u) . In fact
(7.5)
d I(x) , R= dx x=0
for I(x) =
k
! (u, j0 (x) ⊗ · · · j0 (x) ⊗ j∞ (x) ⊗ · · · ⊗ j∞ (x)u) , {z } | {z } |
n
n−k
k
Nn
e and j (x) = j + xr . Since u ∈ s h this equals Pk (j(x)) (this identity does not hold if u is not symmetric w.r.t. permutations). Hence (7.5) follows from [10, Lemma 2.11]. We now write
r ⊗ · · · j∞ as Ai Bi Ai , Ik j0 ⊗ · · · ^ i
for 1
1
1
2 | 2 ⊗ · · · j∞ , Ai = j02 ⊗ · · · |r ^
i
Bi = Ik 1l ⊗ · · · sign r ⊗ · · · 1l . ^
i
Note that kBi k ≤ 1. We have ˇ k (j)u)| ˇ 0Γ |(v, D !1 n k 2 X = (v, Ai Bi Ai u) n i=1
≤
≤
k
! 12
n
n X
kAi vkkAi uk
i=1
k n
!
n X
! 12 kAi uk2
i=1
n X
! 12 kAi vk2
i=1
By the identity (7.5) we have: ! n k X kAi uk2 n i=1 = (u, dPk (j, |r|)u) = (u, Rk (j, |r0 |)u) + (u, Rk−1 (j, |r∞ |)u) ,
.
where Rk (f, g) is defined in (6.9). On the other hand n X
kAi vk2
i=1
=
v,
n−k X
! j0 ⊗ · · · ⊗ |r0 | ⊗ · · · ⊗ j0 ⊗ j∞ ⊗ · · · ⊗ j∞ v ^
i=1
+
v,
i n X
!
j0 ⊗ · · · ⊗ j0 ⊗ j∞ ⊗ · · · ⊗ |r∞ | ⊗ · · · ⊗ j∞ v ^
i=n−k+1
i
≤ α−k (v, R0 (j, |r0 |) ⊗ 1lNk he v) + (v, 1lNn−k he ⊗ Rk−1 (j, |r∞ |)v) , Nn using the fact that j∞ ≤ α−1 . This proves the lemma for u ∈ K ⊗ s he and Nn−k e Nk e v ∈ K ⊗ s h ⊗ s h . To prove the lemma for arbitrary u ∈ He , v ∈ Hext we set Πn = 1l{n} (N e ) ,
e Πext n = 1l{n} (Next ) ,
and note that ˇ k (j)Πn = Πext D ˇΓ ˇ k (j) . ˇ 0Γ D n The estimate for arbitrary u, v follows from the estimate for Πn u, Πext n v and the Cauchy–Schwarz inequality. 7.3. Number of asymptotically free particles e . We set In this subsection we extend the results in Sec. 5 to Hext ext e = Bct ⊗ 1lΓ(he ) + 1lHe ⊗ Bct , acting on Hext , Bct
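where $B_{ct}$ is defined in Sec. 5. By exactly the same arguments as in Sec. 5, we obtain

Proposition 7.4. Assume (I'0), (I'2) for $\mu>1$. Assume that $0<c<1$ or that $c=1$ and $\alpha_1<0$. Then for $\chi\in C_0^\infty(\mathbb{R})$:
\[ \int_1^{+\infty}\bigl\|\bigl(d\Gamma(d_0b_t)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)} + \mathbb{1}_{\mathcal{H}^e}\otimes d\Gamma(d_0b_t)\bigr)^{1/2}(B^{\rm ext}_{ct}+\lambda)^{-1}\chi(L_{\rm ext})e^{-itH^e_{\rm ext}}u\bigr\|^2\,\frac{dt}{t} \le C\|u\|^2, \]
for $u\in D(N^e_{\rm ext})$, $\lambda>0$.

Theorem 7.5. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0)>1$. Then: (i) for each $\lambda\in\mathbb{C}\setminus\mathbb{R}^-$ the limit
\[ \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH^e_{\rm ext}}(B^{\rm ext}_{ct}+\lambda)^{-1}e^{-itH^e_{\rm ext}} =: R^+_{c\,{\rm ext}}(\lambda) \quad\text{exists}. \]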
e (ii) [Rc+ext (λ), Lext ] = [Rc+ext (λ), Hext ] = 0. (iii) The limit
s- lim −1 Rc+ext (−1 ) =: Pˆc+ext exists →0
and is an orthogonal projection. (iv) e , Pˆc+ext ] = [Lext , Pˆc+ext ] = 0 , [Hext ext + λ)−1 e−itHext u = u . u = Pˆc+ext u ⇔ s- lim s- lim eitHext (Bct e
→0
e
t→+∞
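Theorem 7.6. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0)>1$. Let for $0<c<1$:
\[ P^+_{c\,{\rm ext}} := \inf_{c<c'}\hat P^+_{c'\,{\rm ext}}, \qquad \mathcal{H}^{e+}_{c\,{\rm ext}} := \operatorname{Ran}P^+_{c\,{\rm ext}}. \]
Then:
(i) $P^+_{c\,{\rm ext}}$ is an orthogonal projection independent of the choice of the function $f$ in (5.2).
(ii) $[H^e_{\rm ext},P^+_{c\,{\rm ext}}] = [L_{\rm ext},P^+_{c\,{\rm ext}}] = 0$.
(iii) $\mathcal{H}^{e+}_{c\,{\rm ext}} = \mathcal{H}^{e+}_c\otimes\Gamma(\mathfrak{h}^e)$.

Proof. Parts (i) and (ii) can be shown exactly as in Theorem 5.6. To prove (iii) we have to show that for $0<c\le1$
\[ \hat P^+_{c\,{\rm ext}} = \hat P^{e+}_c\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}, \]
which means
\[ \mathrm{s\text{-}}\lim_{\epsilon\to0}\epsilon^{-1}R^+_{c\,{\rm ext}}(\epsilon^{-1}) = \mathrm{s\text{-}}\lim_{\epsilon\to0}\epsilon^{-1}R^{e+}_c(\epsilon^{-1})\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}. \tag{7.6} \]
We note that
\[ \bigl\|\bigl((\epsilon B^{\rm ext}_{ct}+\mathbb{1})^{-1} - (\epsilon B_{ct}+\mathbb{1})^{-1}\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\bigr)\mathbb{1}_{\mathcal{H}^e}\otimes(N^e+1)^{-1}\bigr\| \le C\epsilon. \]
Since $\mathbb{1}\otimes N^e$ commutes with $H^e_{\rm ext}$, we obtain
\[ \bigl\|\bigl(\epsilon^{-1}R^+_{c\,{\rm ext}}(\epsilon^{-1}) - \epsilon^{-1}R^{e+}_c(\epsilon^{-1})\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}\bigr)\mathbb{1}_{\mathcal{H}^e}\otimes(N^e+1)^{-1}\bigr\| \le C\epsilon. \]
This proves (7.6) by a density argument.

7.4. Geometric inverse wave operators

Theorem 7.7. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and pick $\rho$ in (5.2) such that $\rho(1+\epsilon_0)>1$. Fix $0<c<1$ and $c<c'<1$. Let $j^t = (j_0^t,j_\infty^t)$ be constructed as in (7.1) with the constant $c'$. Then
(i) the limit
\[ W_k^+(j) := \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH^e_{\rm ext}}\check\Gamma_k(j^t)e^{-itH^e} \quad\text{exists on } \mathcal{H}^{e+}_c; \]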
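(ii) for $\chi\in C_0^\infty(\mathbb{R})$
\[ \chi(L_{\rm ext})W_k^+(j) = W_k^+(j)\chi(L); \]
(iii) for $\chi\in C_0^\infty(\mathbb{R})$
\[ \chi(H^e_{\rm ext})W_k^+(j) = W_k^+(j)\chi(H^e); \]
(iv) let $f_0$ be as in (6.1) with $f_0j_0 = j_0$. Then
\[ W_k^+(j) = \Gamma^{e+}_{c'}(f_0)\otimes\mathbb{1}_{\Gamma(\mathfrak{h}^e)}W_k^+(j); \]
(v) for all $c''>c$, $\lambda>0$ we have:
\[ R^+_{c''\,{\rm ext}}(\lambda)W_k^+(j) = W_k^+(j)R^{e+}_{c''}(\lambda); \]
(vi) $W_k^+(j)\mathcal{H}^{e+}_c\subset\mathcal{H}^{e+}_{c\,{\rm ext}}$;
(vii) the limit
\[ \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH^e}\check\Gamma_k(j^t)^*e^{-itH^e_{\rm ext}} \quad\text{exists on } \mathcal{H}^{e+}_{c\,{\rm ext}} \]
and equals $W_k^+(j)^*$.

Proof. Let us first prove (i). Note first that since $j_0^2+\alpha^2j_\infty^2\le1$ we have $\|\check\Gamma_k(j^t)\|\le\alpha^{-k}$ and hence $\check\Gamma_k(j^t)$ is uniformly bounded in $t$. By the definition of $\mathcal{H}^{e+}_c$ it suffices to show the existence of the limit on $\operatorname{Ran}\hat P^{e+}_{c'}$. Changing notation we may replace $c'$ by $c$. By Theorem 5.5 we may restrict ourselves to vectors $u\in\operatorname{Ran}\hat P^{e+}_c$ such that $u = \chi(L)u$, $\chi\in C_0^\infty(\mathbb{R})$. Arguing as in the proof of Theorem 6.4, it suffices to show the existence of
\[ \lim_{t\to+\infty}e^{itH^e_{\rm ext}}\check\Gamma_k(j^t)e^{-itH^e}R^{e+}_c(\lambda)u, \]
for $\lambda>0$. We pick now $\delta>0$ such that $\rho>\delta$, $\mu>\delta/2$ and $\delta(1+\epsilon_0)>1$, and consider the observable $N_t^e$ constructed in Sec. 4.2. If $F\in C_0^\infty(\mathbb{R})$, $F\equiv1$ near 0, we have by Proposition 4.5:
\[ e^{-itH^e}R^{e+}_c(\lambda)u = e^{-itH^e}\chi_2(L)R^{e+}_c(\lambda)u = F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi_2(L)e^{-itH^e}R^{e+}_c(\lambda)u + o(1). \]
Using again Lemmas 7.2, 4.4 and Proposition 4.5 we have:
\[ e^{itH^e_{\rm ext}}\check\Gamma_k(j^t)e^{-itH^e}R^{e+}_c(\lambda)u = e^{itH^e_{\rm ext}}\chi(L_{\rm ext})\check\Gamma_k(j^t)F\Bigl(\frac{N_t^e}{t^\delta}\Bigr)\chi(L)e^{-itH^e}R^{e+}_c(\lambda)u + o(1) \]
\[ \qquad = e^{itH^e_{\rm ext}}\chi(L_{\rm ext})\check\Gamma_k(j^t)e^{-itH^e}R^{e+}_c(\lambda)u + o(1) \]
\[ \qquad = e^{itH^e_{\rm ext}}\chi(L_{\rm ext})\check\Gamma_k(j^t)(B_{ct}+\lambda)^{-1}\chi(L)e^{-itH^e}u + o(1), \tag{7.7} \]
where in the last step we used the definition of $R^{e+}_c(\lambda)$. For $u_1\in D(L)\cap D(N^e)$, $u_2\in D(L_{\rm ext})\cap D(N^e_{\rm ext})$ the function
\[ \mathbb{R}^+\ni t\mapsto\bigl(e^{-itH^e_{\rm ext}}u_2,\chi(L_{\rm ext})\check\Gamma_k(j^t)(B_{ct}+\lambda)^{-1}\chi(L)e^{-itH^e}u_1\bigr) \]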
is C 1 with derivative: e ˇΓ ˇ k (j t )(Bct + λ)−1 χ(L)e−itH e u1 ) (e−itHext χ(Lext )u2 , D
ˇ k (j t )D(Bct + λ)−1 χ(L)e−itH u1 ) + (e−itHext χ(Lext )u2 , Γ e
e
=: I1 (t) + I2 (t) . Let us first estimate I2 (t). As in Proposition 5.1 we have: D(Bct + λ)−1 χ(L) = −(Bct + λ)−1 dΓ(ct )(Bct + λ)−1 χ(L) + O(t−u ) . ˇ k (j t ) in Sec. 2.2 and the fact that j t commutes with bt From the expression of Γ and ct we see that ˇ k (j t )(Bct + λ)−1 dΓ(ct )(Bct + λ)−1 Γ ext ˇ k (j t )dΓ(ct ) 2 (Bct + λ)−1 . = (Bct + λ)−1 (dΓ(ct ) ⊗ 1l + 1l ⊗ dΓ(ct )) 2 Γ 1
1
Hence e ext ˇ k (j t )kk(dΓ(ct ) ⊗ 1l + 1ldΓ(ct )) 12 (Bct + λ)−1 χ(Lext )e−itHext u2 k |I2 (t)| ≤ kΓ
× kdΓ(ct ) 2 (Bct + λ)−1 e−itH u1 k e
1
+ Ct−µ ku1 k ku2k .
(7.8)
Let us now estimate I1 (t). We have: ˇ k (j t ) + iφ(v e ) ⊗ 1lΓ ˇΓ ˇ k (j t ) = D ˇ 0Γ ˇ k (j t ) − iΓ ˇ k (j t )φ(v e ) D ˇ k (j t ) + C1 (t) . ˇ 0Γ =: D We use the identity (7.2) and the fact that ext ˇ k (j t ) ˇ k (j t )(Bct + λ)−1 = (Bct + λ)−1 Γ Γ
to obtain ext + λ)−1 k kχ(Lext)C1 (t)(Bct + λ)−1 χ(L)k ≤ Cka∗ ((1 − j0t )v e (K + 1)− 2 ) ⊗ 1l(Bct 1
t e ext v (K + 1)− 2 ) ⊗ 1l(Bct + λ)−1 k + ka∗ (j∞ 1
+ Cka((1 − j0t )v e (K + 1)− 2 )(Bct + λ)−1 k . 1
t , we obtain by Since by (6.1) and (5.1) bct ≡ 1 on supp(1 − j0t ) and on supp j∞ Proposition A.1 that
kχ(Lext )C1 (t)(Bct + λ)−1 χ(L)k t e v (K + 1)− 2 k) ≤ C(k(1 − j0t )v e (K + 1)− 2 k + kj∞ 1
≤ Ct−µ ,
1
(7.9)
by (I0 2). Next we apply Lemma 7.3 and the fact that 1 ext ˇ k (j t )(Bct + λ)−1 = (Bct ˇ 0Γ ˇ k (j t )(Bct + λ)− 12 ˇ 0Γ + λ)− 2 D D
to obtain ˇ k (j t )(Bct + λ)−1 χ(L)u1 )| ˇ 0Γ |(u2 , χ(Lext )D ≤ ((χ(L)u1 , Rk (j t , |r0t |)(Bct + λ)−1 χ(L)u1 ) t |)(Bct + λ)−1 χ(L)u1 )) 2 + (χ(L)u1 , Rk−1 (j t , |r∞ 1
ext × (α−k (χ(Lext )u2 , R0 (j t , |r0t |) ⊗ 1l(Bct + λ)−1 χ(Lext )u2 ) t ext |)(Bct + λ)−1 χ(Lext )u2 )) 2 , + (χ(Lext )u2 , 1l ⊗ Rk−1 (j t , |r∞ 1
(7.10)
for rt = d0 jt . We note that by exactly the same proof, estimates similar to those of Proposition 6.3 with Rk (f t , |gt |) replaced by either Rk (f t , |gt |) ⊗ 1l or 1l ⊗ e ext and L by Lext hold for the evolution e−itHext . Rk (f t , |gt |), Bct by Bct Combining (7.8), (7.9), (7.10) and Propositions 5.1, 6.3, 7.4 we obtain the existence of the limit in (i), by Proposition A.4. Property (ii) follows from Lemma 7.2, arguing as in (7.7). To prove (iii) it suffices as in the proof of Theorem 6.4(ii) to show that e ˇ k (j t ) − Γ ˇ k (j t−t1 ))e−itH e = 0 , s- lim eitHext (Γ
t→+∞
∀ t1 ∈ R .
By [10, Lemma 2.16]: Z ˇ k (j t−t1 ) = − ˇ k (j t ) − Γ Γ
t1
ˇ k (j t , ∂t j t−r )dr , dΓ
0 t ≤ 1, we have by [10, Lemma 2.16]: and since j0t + αj∞
ˇ k (j t , ∂t j t−r )(N e + 1)−1 k ≤ Ck∂t j t−r k ≤ Ct−ρ . kdΓ Next we argue as in the proof of Theorem 6.4 using that
e
e
(N + 1)F Nt χ(L) ∈ O(tδ ) .
δ t Property (iv) follows from the fact that ˇ k (j t ) = Γ ˇ k (j t ), if f0 j0 = j0 . Γ(f0t ) ⊗ 1lΓ Property (v) follows from the fact that if Bct is the observable defined in (5.2) for any constant 0 < c0 ≤ 1 we have: ˇ k (j t ) . ˇ k (j t )(Bct + λ)−1 = (B ext + λ)−1 Γ Γ ct
Property (vi) follows from (v) and Theorem 7.6. Finally the existence of the limit (vii) follows from exactly the same arguments as those used to prove (i).

Finally we prove a result similar to Proposition 6.6.

Proposition 7.8. Assume the hypotheses of Theorem 7.7. Let $j = (f_0,f_\infty)$, $j_\epsilon = (f_0,f_{\infty,\epsilon})$, where $f_0$, $f_\infty$, $f_{\infty,\epsilon}$ are defined in (6.23), (6.24). Then
\[ W_k^+(j) = \mathrm{w\text{-}}\lim_{\epsilon\to0}W_k^+(j_\epsilon). \]

Proof. We apply again Proposition A.5. By density it suffices to show that for $\lambda>0$, $\chi\in C_0^\infty(\mathbb{R})$:
\[ \chi(L_{\rm ext})W_k^+(j)\chi(L)R^{e+}_c(\lambda) = \mathrm{w\text{-}}\lim_{\epsilon\to0}\chi(L_{\rm ext})W_k^+(j_\epsilon)\chi(L)R^{e+}_c(\lambda). \]
To check Hypothesis (A.2) of Proposition A.5 we have to consider the Heisenberg derivative of
\[ \Phi_\epsilon(t) = \chi(L_{\rm ext})\check\Gamma_k(j^t_\epsilon)(B_{ct}+\lambda)^{-1}\chi(L). \]
The estimates (7.8)–(7.10) and the fact that
\[ j_0+\alpha j_{\infty,\epsilon}\le1, \qquad j_{\infty,\epsilon}\le Cj_\infty, \qquad j'_{\infty,\epsilon}\le Cj'_\infty, \quad\text{uniformly in } \epsilon, \]
show that Hypothesis (A.2) is satisfied. Similarly (A.1) holds since $\|\check\Gamma_k(j^t_\epsilon)\|\le\alpha^{-k}$. Finally
\[ \mathrm{w\text{-}}\lim_{\epsilon\to0}\check\Gamma_k(j^t_\epsilon) = \check\Gamma_k(j^t), \quad \forall\,t\in\mathbb{R}. \]
Applying Proposition A.5 we obtain the proposition.

8. Asymptotic Fields and Wave Operators

This section is devoted to asymptotic fields and wave operators for $H$ and $H^e$. The case of $H$ is treated in Secs. 8.1, 8.2, while the case of $H^e$ is treated in Secs. 8.4, 8.5, by arguments similar to those used in the massive case (see [10]). The conversion of scattering objects from $H^e$ to $H$ is described in Sec. 8.6. Finally in Sec. 8.7 it is shown that the asymptotic Weyl operators $W^{e+}(f)$ preserve the spaces $\mathcal{H}^{e+}_c$ and define on them representations of Fock type.

8.1. Asymptotic fields for H

In this section we show the existence of asymptotic Weyl operators and asymptotic fields for the Nelson Hamiltonian introduced in Sec. 1.1. Similar results can be shown under corresponding hypotheses for abstract Pauli–Fierz models introduced in Sec. 3.1 (see the remark at the beginning of Sec. 8.4). We recall that the one-particle space is $\mathfrak{h} = L^2(\mathbb{R}^3,dk)$. We set
\[ h_t := e^{-it|k|}h, \quad h\in\mathfrak{h}, \]
and
\[ \mathfrak{h}_0 := \{h\in\mathfrak{h}\ |\ |k|^{-1/2}h\in\mathfrak{h}\} \]
equipped with the graph topology. In this section, we assume Condition (I4) introduced in Sec. 1.4. Introducing the operator $v\in B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h})$ defined in (1.3) we see that if (H0) holds for $\alpha>0$, then (I4) implies: $\forall\,\epsilon>0$, $\exists\,C_\epsilon$ such that
\[ \|F(|x|\ge R)F(\epsilon\le|k|\le\epsilon^{-1})(K+1)^{-1/2}v\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h})} \le C_\epsilon R^{-\inf(\alpha,\mu_1)}, \]
\[ \|F(|x|\ge R)F(\epsilon\le|k|\le\epsilon^{-1})v(K+1)^{-1/2}\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h})} \le C_\epsilon R^{-\inf(\alpha,\mu_1)}. \tag{8.1} \]
In fact this follows from the fact that $\partial_k(e^{-ik.x}v_j(k)) = e^{-ik.x}(\partial_k - ix)v_j(k)$, if we use (1.1) to control the powers of $x$ appearing when differentiating $e^{-ik.x}v_j(k)$.

Theorem 8.1. Assume (H0) for $\alpha>1$, (I0), (I4) for $\mu_1>1$. Then
(i) for $h\in\mathfrak{h}_0$ the asymptotic Weyl operator
\[ W^+(h) := \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}W(h_t)e^{-itH} \quad\text{exists}. \tag{8.2} \]
(ii) The map $\mathfrak{h}_0\ni h\mapsto W^+(h)\in U(\mathcal{H})$ is strongly continuous for the topology of $\mathfrak{h}_0$.
(iii) $W^+(h)W^+(g) = e^{-i\operatorname{Im}(h,g)}W^+(h+g)$.
(iv) $e^{itH}W^+(h)e^{-itH} = W^+(h_{-t})$.

Proof. We have
\[ W(h_t) = e^{-itH_0}W(h)e^{itH_0}, \tag{8.3} \]
which implies that as a quadratic form on $D(H_0)$ one has:
\[ \partial_tW(h_t) = [-H_0,iW(h_t)]. \]
Since on $D(H_0)$, $H = H_0+\phi(v)$, we have as quadratic forms on $D(H) = D(H_0)$:
\[ \partial_t\,e^{itH}W(h_t)e^{-itH} = e^{itH}[\phi(v),iW(h_t)]e^{-itH} = ie^{itH}W(h_t)\operatorname{Im}(h_t,v)e^{-itH}. \]
Integrating this relation we obtain, first as a quadratic form identity on $D(H)$, and then by a simple argument as an operator identity on $\mathcal{H}$:
\[ e^{itH}W(h_t)e^{-itH}(H+i)^{-1} - W(h)(H+i)^{-1} = i\int_0^te^{isH}W(h_s)\operatorname{Im}(h_s,v)(H+i)^{-1}e^{-isH}\,ds. \]
For $h\in C_0^\infty(\mathbb{R}^3\setminus\{0\})$, we obtain by stationary phase arguments and (8.1):
\[ \|\operatorname{Im}(h_t,v)(H+i)^{-1}\| \le C\|(h_t,v(K+1)^{-1/2})\| + C\|(h_t,(K+1)^{-1/2}v)\| \le Ct^{-\mu_1}. \]
The existence of the limit (8.2) follows for $h\in C_0^\infty(\mathbb{R}^3\setminus\{0\})$. Next we use the identity:
\[ W(h_1) - W(h_2) = W(h_1)\bigl(\mathbb{1} - e^{-\frac{i}{2}\operatorname{Im}(h_1,h_2)}\bigr) + e^{-\frac{i}{2}\operatorname{Im}(h_1,h_2)}W(h_1)\bigl(\mathbb{1} - W(h_1-h_2)\bigr). \]
Using that
\[ \bigl|\mathbb{1} - e^{-\frac{i}{2}\operatorname{Im}(h_1,h_2)}\bigr| \le C|\operatorname{Im}(h_1,h_2)| \le C\|h_1-h_2\|\sqrt{\|h_1\|^2+\|h_2\|^2}, \qquad \|(\mathbb{1} - W(h_1-h_2))u\| \le \|\phi(h_1-h_2)u\|, \]
we obtain that for $\|h_1\|,\|h_2\|\le R$,
\[ \|(W(h_1) - W(h_2))(H+i)^{-1}\| \le C_R\|h_1-h_2\|_{\mathfrak{h}_0}. \tag{8.4} \]
Since $C_0^\infty(\mathbb{R}^3\setminus\{0\})$ is dense in $\mathfrak{h}_0$, we deduce from (8.4) the existence of the limit
\[ \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}W(h_t)e^{-itH}\chi(H), \quad\text{for } \chi\in C_0^\infty(\mathbb{R}),\ h\in\mathfrak{h}_0. \]
By density this proves the existence of the limit (8.2) for all $h\in\mathfrak{h}_0$. Statement (ii) follows directly from (8.4). Statements (iii) and (iv) are immediate.

Theorem 8.2. Assume (H0) for $\alpha>1$, (I0), (I4) for $\mu_1>1$. Then:
(i) there exists for $h\in\mathfrak{h}_0$ a selfadjoint operator $\phi^+(h)$, called the asymptotic field, such that $W^+(sh) = e^{is\phi^+(h)}$, $s\in\mathbb{R}$.
(ii) For $h\in\mathfrak{h}_0$, $D((H+b)^{1/2})\subset D(\phi^+(h))$ and:
\[ \phi^+(h)(H+b)^{-1/2} = \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}\phi(h_t)(H+b)^{-1/2}e^{-itH}, \]
\[ \|\phi^+(h)(H+b)^{-1/2}\| \le C\|(1+|k|^{-1/2})h\|. \]
For $h_i\in\mathfrak{h}_0\cap D(|k|^{1/2})$, $1\le i\le n$, $n\ge2$, $D((H+b)^{n/2})\subset D(\Pi_1^n\phi^+(h_i))$ and
\[ \Pi_{i=1}^n\phi^+(h_i)(H+b)^{-n/2} = \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}\Pi_{i=1}^n\phi(h_{i,t})(H+b)^{-n/2}e^{-itH}, \]
\[ \|\Pi_1^n\phi^+(h_i)(H+b)^{-n/2}\| \le C_n\Pi_1^n\|(1+|k|^{1/2}+|k|^{-1/2})h_i\|. \]
(iii) The operators $\phi^+(h)$ satisfy, in the sense of quadratic forms on $D(\phi^+(h_1))\cap D(\phi^+(h_2))$, the canonical commutation relations
\[ [\phi^+(h_2),\phi^+(h_1)] = i\operatorname{Im}(h_2|h_1). \]
Note that the estimates on the domain of $\Pi_{i=1}^n\phi^+(h_i)$ described in (ii) are better for $n=1$ than for arbitrary $n\ge2$.

Proof. (i) and (iii) follow by general arguments from the fact that $\mathfrak{h}_0\ni h\mapsto W^+(h)$ is a regular CCR representation (see e.g. [10, Sec. 2.2]). We will prove (ii) for arbitrary $n$ and then explain the modifications for the case $n=1$. We first prove the existence of the norm limit
\[ \lim_{t\to+\infty}e^{itH}\Pi_1^n\phi(h_{i,t})(H+b)^{-n/2}e^{-itH} =: R^+(h_1,\dots,h_n), \tag{8.5} \]
1
for hi ∈ h0 ∩ D(|k| 2 ). We deduce from the identity (8.3) that the Heisenberg derivative of Πn1 φ(hi,t )(H + b)−n/2 defined as a quadratic form on D(H) equals [φ(v), iΠn1 φ(hi,t )](H + b)−n/2 . Since [φ(v), iφ(hi,t )] = Im(hi,t , v) is bounded and using Lemma 3.10, (8.1) and stationary phase arguments as in the proof of Theorem 8.1 we obtain the existence of the limit (8.5) for hi ∈ C0∞ (R3 \{0}). A density argument and the norm continuity of (h1 , . . . , hn ) ∈ (h0 ∩ D(|k| 2 ))n 7→ Πn1 φ(hi )(H + b)−n/2 1
shown in Lemma 3.10 proves the existence of the limit (8.5) for arbitrary hi ∈ 1 h0 ∩ D(|k| 2 ). It follows then again from Lemma 3.10 that kR+ (h1 , . . . , hn )k ≤ Cn Πn1 k(1 + |k| 2 + |k|− 2 )hi k . 1
1
(8.6)
Let us now complete the proof of (ii) by induction on n. The proof of (ii) for n = 1 1 needed to start the induction argument will be given later. Let hi ∈ h0 ∩ D(|k| 2 ), 1 ≤ i ≤ n. We have to show that D((H + b)n/2 ) ⊂ D(Πn1 φ+ (hi )) ,
(8.7)
Πn1 φ+ (hi )(H + b)−n/2 = R+ (h1 , . . . , hn ) .
(8.8)
and then that To prove (8.7) we have to show that for u ∈ H: sup ks−1 (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 uk < ∞ . s∈R
By the induction assumption, D(H + b)n/2 ⊂ Πn2 φ+ (hi ) and Πn2 φ+ (hi )(H + b)−n/2 u = lim eitH Πn2 φ(hi,t )(H + b)−n/2 e−itH u . t→+∞
(8.9)
Using (8.9) and the fact that eitH W (h1,t )e−itH is uniformly bounded in t, we get: s−1 (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 u = lim s−1 eitH (W (sh1,t ) − 1l)Πn2 φ(hi,t )(H + b)−n/2 e−itH u . t→+∞
(8.10)
Hence ks−1 (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 uk ≤ sup ks−1 (W (sh1,t ) − 1l)Πn2 φ(hi,t )(H + b)−n/2 e−itH uk t∈R
≤ sup kφ(h1,t )Πn2 φ(hi,t )(H + b)−n/2 k kuk t∈R
≤ CΠn1 k(1 + |k| 2 + |k|− 2 )hi k kuk , 1
1
by Lemma 3.10. This proves (8.7). To prove (8.8), it suffices to show that for v ∈ D, D a dense subspace of H: lim (is)−1 (v, (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 u) = (v, R+ (h1 , . . . , hn )u) .
s→0
By (8.10) we have: (is)−1 (v, (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 u) = lim (is)−1 (eitH (W (sh1,t ) − 1l)e−itH v, eitH Πn2 φ(hi,t )(H + b)−n/2 u) . t→+∞
Since |s
−1
(eisλ − 1) − iλ| ≤ C0 |s||λ|2 , we have using Lemma 3.10:
k((is)−1 (W (sh1,t ) − 1l) − φ(h1,t ))(H + b)−1 k ≤ C|s|, uniformly in t . Hence for v ∈ D(H), we have: lim (is)−1 (v, (W + (sh1 ) − 1l)Πn2 φ+ (hi )(H + b)−n/2 u)
s→0
= lim lim (is)−1 (eitH (W (sh1,t ) − 1l)e−itH v, eitH Πn2 φ(hi,t )(H + b)−n/2 u) s→0 t→+∞
= lim (e−itH φ(hi,t )e−itH v, eitH Πn2 φ(hi,t )(H + b)−n/2 u) t→∞
= lim (v, eitH Πn1 φ(hi,t )(H + b)−n/2 u) t→+∞
= (v, R+ (h1 , . . . , hn )u) , as claimed. The fact that kΠn1 φ+ (hi )(H + b)−n/2 k ≤ Cn Πn1 k(1 + |k| 2 + |k|− 2 )hi k 1
1
follows then from (8.6). Let us now prove (ii) in the case n = 1. The existence of the limit (8.5) for n = 1 and h1 ∈ h0 follows from the same arguments, using Proposition A.1 instead 1 1 of Lemma 3.10. The proof of the fact that R+ (h1 )(H + b)− 2 = φ+ (h1 )(H + b)− 2 is also similar to the general case, using Proposition A.1 instead of Lemma 3.10.
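The elementary inequality $|s^{-1}(e^{is\lambda}-1) - i\lambda| \le C_0|s||\lambda|^2$ used in the proof above holds with $C_0 = 1/2$, because $|e^{ix}-1-ix|\le x^2/2$ for real $x$. The Python sketch below checks this on a sample grid; the grid values and the function names are illustrative assumptions, and the check is numerical only.

```python
import cmath

def remainder(s, lam):
    """|s^{-1}(e^{i s lam} - 1) - i lam| for real s != 0 and real lam."""
    return abs((cmath.exp(1j * s * lam) - 1.0) / s - 1j * lam)

def worst_ratio(s_values, lam_values):
    """Largest value of remainder / (|s| lam^2) over the sample grid."""
    return max(remainder(s, lam) / (abs(s) * lam * lam)
               for s in s_values for lam in lam_values)

if __name__ == "__main__":
    s_values = [x / 100.0 for x in range(-300, 301) if x != 0]
    lam_values = [0.5, 1.0, 2.0, 5.0, 10.0]
    # The ratio stays below 1/2 and approaches 1/2 as s * lam tends to 0.
    print(worst_ratio(s_values, lam_values))
```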
The following theorem follows directly from Theorem 8.2 and from general properties of regular CCR representations (see e.g. [10, Sec. 3.3]).

Theorem 8.3. Assume (H0) for $\alpha>1$, (I0), (I4) for $\mu_1>1$. Then
(i) for any $h\in\mathfrak{h}_0$, the asymptotic creation and annihilation operators defined on $D(a^{+\sharp}(h)) := D(\phi^+(h))\cap D(\phi^+(ih))$ by
\[ a^{+*}(h) := \frac{1}{\sqrt2}(\phi^+(h) - i\phi^+(ih)), \qquad a^+(h) := \frac{1}{\sqrt2}(\phi^+(h) + i\phi^+(ih)), \]
are closed.
(ii) The operators $a^{+\sharp}$ satisfy, in the sense of forms on $D(a^{+\sharp}(h_1))\cap D(a^{+\sharp}(h_2))$, the canonical commutation relations
\[ [a^+(h_1),a^{+*}(h_2)] = (h_1|h_2)\mathbb{1}, \qquad [a^+(h_2),a^+(h_1)] = [a^{+*}(h_2),a^{+*}(h_1)] = 0. \]
(iii)
\[ e^{itH}a^{+\sharp}(h)e^{-itH} = a^{+\sharp}(h_{-t}). \tag{8.11} \]
(iv) For $h\in\mathfrak{h}_0$, $D((H+b)^{1/2})\subset D(a^{+\sharp}(h))$ and
\[ a^{+\sharp}(h)(H+b)^{-1/2} = \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}a^\sharp(h_t)(H+b)^{-1/2}e^{-itH}, \]
\[ \|a^{+\sharp}(h)(H+b)^{-1/2}\| \le C\|(1+|k|^{-1/2})h\|. \]
For $h_i\in\mathfrak{h}_0\cap D(|k|^{1/2})$, $1\le i\le n$, $n\ge2$, $D((H+b)^{n/2})\subset D(\Pi_1^na^{+\sharp}(h_i))$ and
\[ \Pi_1^na^{+\sharp}(h_i)(H+b)^{-n/2} = \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH}\Pi_1^na^\sharp(h_{i,t})(H+b)^{-n/2}e^{-itH}, \]
\[ \|\Pi_1^na^{+\sharp}(h_i)(H+b)^{-n/2}\| \le C_n\Pi_1^n\|(1+|k|^{1/2}+|k|^{-1/2})h_i\|. \]
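As a consistency check, the first commutation relation for the asymptotic creation and annihilation operators can be recovered formally from the relation $[\phi^+(f),\phi^+(g)] = i\operatorname{Im}(f|g)$ for the asymptotic fields. The computation below assumes that the scalar product $(\cdot|\cdot)$ is antilinear in its first argument, a convention that is not restated in this section, and it is meant in the sense of quadratic forms only.

```latex
\begin{aligned}
[a^{+}(h_1), a^{+*}(h_2)]
 &= \tfrac12\Bigl([\phi^+(h_1),\phi^+(h_2)] - i[\phi^+(h_1),\phi^+(ih_2)]
    + i[\phi^+(ih_1),\phi^+(h_2)] + [\phi^+(ih_1),\phi^+(ih_2)]\Bigr) \\
 &= \tfrac12\Bigl(i\,\mathrm{Im}(h_1|h_2) + \mathrm{Im}(h_1|ih_2)
    - \mathrm{Im}(ih_1|h_2) + i\,\mathrm{Im}(ih_1|ih_2)\Bigr) \\
 &= \tfrac12\Bigl(i\,\mathrm{Im}(h_1|h_2) + \mathrm{Re}(h_1|h_2)
    + \mathrm{Re}(h_1|h_2) + i\,\mathrm{Im}(h_1|h_2)\Bigr) \\
 &= (h_1|h_2),
\end{aligned}
```

where the third line uses $(h_1|ih_2) = i(h_1|h_2)$, $(ih_1|h_2) = -i(h_1|h_2)$ and $(ih_1|ih_2) = (h_1|h_2)$.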
8.2. Asymptotic vacuum spaces and operators for H

In this subsection, we recall the construction of the asymptotic vacuum spaces and of the wave operators, see e.g. [21], [10, Sec. 5.3], [11, Sec. 10.2]. We define the asymptotic vacuum space
\[ \mathcal{K}^+ := \{u\in\mathcal{H}\ |\ a^+(h)u = 0,\ h\in\mathfrak{h}_0\}. \]
The asymptotic space is defined as
\[ \mathcal{H}^+_{\rm ext} := \mathcal{K}^+\otimes\Gamma(\mathfrak{h}). \]

Proposition 8.4. (i) $\mathcal{K}^+$ is a closed $H$-invariant space. (ii) $\mathcal{K}^+$ is included in the domain of $\Pi_1^pa^{+\sharp}(h_i)$, for $h_i\in\mathfrak{h}_0$. (iii) $\mathcal{H}_{\rm pp}(H)\subset\mathcal{K}^+$.

Proof. (i) and (ii) follow by the general properties of CCR representations (see e.g. [11, Sec. 4]). The fact that $\mathcal{K}^+$ is $H$-invariant follows from (8.11). To prove (iii) we verify that for $u\in D(H)$, $Hu = \lambda u$, $h\in\mathfrak{h}_0$, $a(h_t)e^{-itH}u = e^{-it\lambda}a(h_t)u\in o(1)$.

The asymptotic Hamiltonian is defined by
\[ H^+ := K^+\otimes\mathbb{1} + \mathbb{1}\otimes d\Gamma(|k|), \quad\text{acting on } \mathcal{H}^+_{\rm ext}, \]
for $K^+ := H_{|\mathcal{K}^+}$.
We also define the wave operator
\[ \Omega^+:\mathcal{K}^+\otimes\Gamma_{\rm fin}(\mathfrak{h}_0)\to\mathcal{H}, \]
\[ \Omega^+\psi\otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{+*}(h_1)\cdots a^{+*}(h_p)\psi, \quad h_1,\dots,h_p\in\mathfrak{h}_0,\ \psi\in\mathcal{K}^+. \tag{8.12} \]
It follows from general properties of CCR representations that $\Omega^+$ is isometric (see e.g. [11, Proposition 4.2]). Hence we can uniquely extend $\Omega^+$ as an isometric map
\[ \Omega^+:\mathcal{H}^+_{\rm ext}\to\mathcal{H} \]
such that
\[ a^{+\sharp}(h)\Omega^+ = \Omega^+\,\mathbb{1}\otimes a^\sharp(h), \quad h\in\mathfrak{h}_0, \qquad H\Omega^+ = \Omega^+H^+. \]
We set $\mathcal{H}^+ := \operatorname{Ran}\Omega^+$. Finally we give another description of $\mathcal{H}^+$ using the notion of asymptotic number operator (see e.g. [11, Sec. 4.2]), which we now recall. We first recall some facts about quadratic forms. We will assume that a positive quadratic form is defined on the whole space $\mathcal{H}$ and takes values in $[0,\infty]$. The domain of a positive quadratic form $b$ is then defined as
\[ D(b) := \{u\in\mathcal{H}\ |\ b(u)<\infty\}. \]
The sum of closed forms is a closed form, and the supremum of a family of closed forms is a closed form. For each finite dimensional space $\mathfrak{f}\subset\mathfrak{h}_0$, one defines
\[ n^+_{\mathfrak{f}}(u) := \sum_{i=1}^{\dim\mathfrak{f}}\|a^+(h_i)u\|^2, \]
where $\{h_i\}$ is an orthonormal basis of $\mathfrak{f}$. (If $u\notin D(a^+(h_i))$ for some $i$, then $n^+_{\mathfrak{f}}(u) = \infty$.) The quadratic form $n^+_{\mathfrak{f}}$ does not depend on the choice of the basis $\{h_i\}$ of $\mathfrak{f}$. The quadratic form $n^+$ is defined by
\[ n^+(u) := \sup_{\mathfrak{f}}n^+_{\mathfrak{f}}(u), \quad u\in\mathcal{H}. \]
We can associate to $n^+$ a selfadjoint operator (with an a priori non dense domain) denoted by $N^+$, called the asymptotic number operator. Then as shown in [11, Sec. 4.2]:
\[ \mathcal{H}^+ = \operatorname{Ran}\Omega^+ = \overline{D(N^+)}. \]
One can associate a number operator $N_\pi$ to any regular CCR representation $\mathfrak{h}\ni h\mapsto W_\pi(h)\in U(\mathcal{H})$ on a Hilbert space $\mathcal{H}$ (see e.g. [11, Sec. 4.2]). The regular CCR representation is of Fock type if $N_\pi$ has a dense domain.
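8.3. Extended wave operators for H

Let us first define extended objects similar to those introduced in Sec. 6 for $H^e$. We set:
\[ \mathcal{H}_{\rm ext} := \mathcal{H}\otimes\Gamma(\mathfrak{h}), \qquad H_{\rm ext} := H\otimes\mathbb{1} + \mathbb{1}\otimes d\Gamma(|k|), \quad\text{acting on } \mathcal{H}_{\rm ext}. \]
Note that $\mathcal{H}^+_{\rm ext}\subset\mathcal{H}_{\rm ext}$. By Theorem 8.3 we can define the extended wave operator $\Omega^+_{\rm ext}$ as follows:
\[ \Omega^+_{\rm ext}:\psi\otimes\Pi_1^na^*(h_i)\Omega\mapsto\Pi_1^na^{*+}(h_i)\psi, \]
for $\psi\in D((H+b)^{n/2})$, $h_i\in D(|k|^{-1/2}+|k|^{1/2})$, $1\le i\le n$. The extended wave operator is then an unbounded operator from $\mathcal{H}_{\rm ext}$ into $\mathcal{H}$ with domain
\[ D(\Omega^+_{\rm ext}) = \bigoplus_{n=0}^\infty D((H+b)^{n/2})\otimes\bigotimes_s^nD(|k|^{-1/2}+|k|^{1/2}). \]
By (8.11) we have
\[ \Omega^+_{\rm ext}e^{-itH_{\rm ext}} = e^{-itH}\Omega^+_{\rm ext}, \]
and
\[ \Omega^+_{{\rm ext}\,|\mathcal{H}^+_{\rm ext}} = \Omega^+. \]
By Theorem 8.3, we can rewrite as in [10, Sec. 5.6]:
\[ \Omega^+_{\rm ext}\psi\otimes u = \lim_{t\to+\infty}e^{itH}Ie^{-itH_{\rm ext}}\psi\otimes u, \]
for $\psi\in D((H+b)^{n/2})$, $u\in\otimes_s^nD(|k|^{-1/2}+|k|^{1/2})$, and $I$ the scattering identification operator defined in Sec. 2.2. In particular if $\psi\in\mathcal{K}^+$, $u\in\otimes_s^nD(|k|^{-1/2}+|k|^{1/2})$ then
\[ \Omega^+\psi\otimes u = \lim_{t\to+\infty}e^{itH}Ie^{-itH_{\rm ext}}\psi\otimes u. \]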
8.4. Asymptotic fields for $H^e$

We now collect results similar to those of Sec. 8.1 for the expanded Hamiltonian $H^e$ defined in Sec. 3.3. If we restrict ourselves to expanded Hamiltonians $H^e$ obtained from a massless Nelson Hamiltonian $H$, then all these results are obtained from the results in Sec. 8.1 in the following way: first, the results for $\mathcal{H}$, $H$ immediately give corresponding results for $\tilde{\mathcal{H}}^e$, $\tilde H^e$, since $\tilde H^e$ acts as the free Hamiltonian $-d\Gamma(|k|)$ on the second component of $\tilde{\mathcal{H}}^e$. Then we use functorial properties of the unitary map $\mathcal{W}$ defined in (3.3) to obtain results for $\mathcal{H}^e$, $H^e$.

Remark. All the results in this section are also valid for general massless Pauli–Fierz Hamiltonians. In this case it is more convenient to follow the inverse path, i.e. to start with the expanded Hamiltonian $H^e$ and then to go back to $H$. In this case we can for example assume that

(I'4) $v^e(\sigma)(K+1)^{-1/2},\ (K+1)^{-1/2}v^e(\sigma)\in H^{\mu_1}_{\rm loc}(\mathbb{R}\setminus\{0\},B(\mathcal{K},\mathcal{K}\otimes\mathfrak{g}))$, $\mu_1>0$.
Note that (I'4) implies: $\forall\,\epsilon>0$, $\exists\,C_\epsilon$ such that
\[ \|F(|s|\ge R)F(\epsilon\le|\sigma|\le\epsilon^{-1})(K+1)^{-1/2}v\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h}^e)} \le C_\epsilon R^{-\mu_1}, \]
\[ \|F(|s|\ge R)F(\epsilon\le|\sigma|\le\epsilon^{-1})v(K+1)^{-1/2}\|_{B(\mathcal{K},\mathcal{K}\otimes\mathfrak{h}^e)} \le C_\epsilon R^{-\mu_1}. \tag{8.13} \]
The proofs of Secs. 8.1, 8.2 and 8.3 extend under Conditions (I'0), (I'4) for $\mu_1>1$, if we replace where appropriate cutoffs in $H$ by cutoffs in $L$. In this way we extend the results of Secs. 8.4 and 8.5 to the case of general expanded Hamiltonians. Using then functorial properties of $\mathcal{W}$ we can extend the results of Secs. 8.1, 8.2 and 8.3 to abstract massless Pauli–Fierz Hamiltonians satisfying (I'0) and (I'4) for $\mu_1>1$. The details are left to the interested reader.

We set
\[ h_t := e^{-it\sigma}h, \quad h\in\mathfrak{h}^e, \]
and
\[ \mathfrak{h}^e_0 := \{h\in\mathfrak{h}^e\ |\ |\sigma|^{-1/2}h\in\mathfrak{h}^e\}, \]
equipped with the graph topology. For $h\in\mathfrak{h}^e$, we set $h_\pm = \mathbb{1}_{\{\pm\sigma\ge0\}}h$ and note that $h_+\in\mathfrak{h}_0$ if $h\in\mathfrak{h}^e_0$. By the arguments outlined above, we obtain directly the following results:

Theorem 8.5. Assume (I'0) and (I'4) for $\mu_1>1$. Then:
(i) for $h\in\mathfrak{h}^e_0$ the asymptotic Weyl operators
\[ \mathrm{s\text{-}}\lim_{t\to+\infty}e^{itH^e}W(h_t)e^{-itH^e} =: W^{e+}(h) \quad\text{exist}. \]
(ii) The map $\mathfrak{h}^e_0\ni h\mapsto W^{e+}(h)\in U(\mathcal{H}^e)$ is strongly continuous for the topology of $\mathfrak{h}^e_0$.
(iii) $W^{e+}(h)W^{e+}(g) = e^{-i\operatorname{Im}(h,g)}W^{e+}(h+g)$.
(iv) $e^{itH^e}W^{e+}(h)e^{-itH^e} = W^{e+}(h_{-t})$.
Let $\phi^{e+}(h)$, $a^{\sharp e+}(h)$ be the asymptotic fields and creation-annihilation operators obtained from $W^{e+}(h)$. Then
\[ W^{e+}(h) = \mathcal{W}\bigl(W^+(h_+)\otimes W(h_-)\bigr)\mathcal{W}^{-1}, \]
\[ \phi^{e+}(h) = \mathcal{W}\bigl(\phi^+(h_+)\otimes\mathbb{1} + \mathbb{1}\otimes\phi(h_-)\bigr)\mathcal{W}^{-1}, \]
\[ a^{\sharp e+}(h) = \mathcal{W}\bigl(a^{\sharp+}(h_+)\otimes\mathbb{1} + \mathbb{1}\otimes a^\sharp(h_-)\bigr)\mathcal{W}^{-1}. \]
On the Scattering Theory of Massless Nelson Models
Let us note the following consequence of Theorem 8.5:
\[ e^{itL}W^{e+}(h)e^{-itL} = W^{e+}(e^{it|\sigma|}h),\quad h\in\mathfrak h^e_0. \tag{8.14} \]
In fact this follows from Theorem 8.1(iv) and the fact that $\mathcal W^{-1}L\mathcal W = H\otimes\mathbf 1_{\Gamma(\mathfrak h)}+\mathbf 1_{\mathcal H}\otimes d\Gamma(|k|)$. Another result which follows from the proof of Theorem 8.2, using Lemma 3.11 instead of Lemma 3.10, is:
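Theorem 8.6. For $h\in\mathfrak h^e_0$, $D((L+b)^{1/2})\subset D(\phi^{e+}(h))$ and
\[
\begin{aligned}
\phi^{e+}(h)(L+b)^{-1/2} &= \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\phi(h_t)(L+b)^{-1/2}e^{-itH^e},\\
\|\phi^{e+}(h)(L+b)^{-1/2}\| &\le C\|(1+|\sigma|^{-1/2})h\| .
\end{aligned}
\]
For $h_i\in\mathfrak h^e_0\cap D(|\sigma|^{1/2})$, $1\le i\le n$, $n\ge2$, $D((L+b)^{n/2})\subset D(\prod_{1}^{n}\phi^{e+}(h_i))$ and
\[
\begin{aligned}
\prod_{i=1}^n\phi^{e+}(h_i)(L+b)^{-n/2} &= \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\prod_{i=1}^n\phi(h_{i,t})(L+b)^{-n/2}e^{-itH^e},\\
\Big\|\prod_{1}^n\phi^{e+}(h_i)(L+b)^{-n/2}\Big\| &\le C_n\prod_{1}^n\|(1+|\sigma|^{1/2}+|\sigma|^{-1/2})h_i\| .
\end{aligned}
\]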
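8.5. Asymptotic spaces and wave operators for $H^e$

As in Sec. 8.2 we define
\[ \mathcal K^{e+} := \{u\in\mathcal H^e \mid a^{e+}(h)u=0,\ \forall h\in\mathfrak h^e_0\}, \]
which is a closed $H^e$- and $L$-invariant vector space. We define the wave operator
\[
\begin{aligned}
&\Omega^{e+}:\ \mathcal K^{e+}\otimes\Gamma_{\rm fin}(\mathfrak h^e_0)\to\mathcal H^e,\\
&\Omega^{e+}\,\psi\otimes a^*(h_1)\cdots a^*(h_p)\Omega := a^{*e+}(h_1)\cdots a^{*e+}(h_p)\psi,\quad h_1,\dots,h_p\in\mathfrak h^e_0,\ \psi\in\mathcal K^{e+},
\end{aligned}
\tag{8.15}
\]
which uniquely extends as an isometry
\[ \Omega^{e+}:\ \mathcal K^{e+}\otimes\Gamma(\mathfrak h^e) =: \mathcal H^{e+}_{\rm ext}\to\mathcal H^e . \]
We set: $\mathcal H^{e+} := {\rm Ran}\,\Omega^{e+}$. We also define the unbounded extended wave operator $\Omega^{e+}_{\rm ext}$ from $\mathcal H^{e}_{\rm ext}$ into $\mathcal H^e$ with domain
\[ D(\Omega^{e+}_{\rm ext}) = \bigoplus_{n=0}^{\infty} D((L+b)^{n/2})\otimes\bigotimes_s^n D(|\sigma|^{-1/2}+|\sigma|^{1/2}), \]
by
\[ \Omega^{e+}_{\rm ext}\,\psi\otimes u = \lim_{t\to+\infty} e^{itH^e} I e^{-itH^e_{\rm ext}}\,\psi\otimes u \]
for $\psi\in D((L+b)^{n/2})$, $u\in\bigotimes_s^n D(|\sigma|^{-1/2}+|\sigma|^{1/2})$. In particular if $\psi\in\mathcal K^{e+}$, $u\in\bigotimes_s^n D(|\sigma|^{-1/2}+|\sigma|^{1/2})$ then
\[ \Omega^{e+}\psi\otimes u = \lim_{t\to+\infty} e^{itH^e} I e^{-itH^e_{\rm ext}}\,\psi\otimes u . \]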
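8.6. Conversion of scattering objects

In this subsection we describe how to relate scattering objects for $H$ to scattering objects for $H^e$, using the canonical embedding $\mathcal W I_\Omega$ introduced in Sec. 3.4. The results below follow easily from the definition of $\mathcal W$ and $I_\Omega$ in Sec. 3 and the formulas in Theorem 8.5. We have
\[ \mathcal W I_\Omega W^{+}(h)u = W^{e+}(jh)\mathcal W I_\Omega u,\quad h\in\mathfrak h_0,\ u\in\mathcal H, \tag{8.16} \]
where $j:\mathfrak h\to\mathfrak h^e$ is the isometry defined in (3.5). Similarly for $\psi\in\mathcal K^{+}$, $h_i\in\mathfrak h_0$, $1\le i\le n$:
\[ \prod_1^n a^{*+}(h_i)\psi = I_\Omega^{*}\mathcal W^{-1}\prod_1^n a^{*e+}(jh_i)\mathcal W I_\Omega\psi . \]
Also
\[ u\in\mathcal K^{+}\ \Leftrightarrow\ \mathcal W I_\Omega u\in\mathcal K^{e+} . \tag{8.17} \]
This implies that
\[ \Omega^{+} = I_\Omega^{*}\mathcal W^{-1}\times\Omega^{e+}\times\mathcal W I_\Omega\otimes\Gamma(j), \tag{8.18} \]
where we consider $\mathcal W I_\Omega\otimes\Gamma(j)$ as a map:
\[ \mathcal W I_\Omega\otimes\Gamma(j):\ \mathcal K^{+}\otimes\Gamma(\mathfrak h)\to\mathcal K^{e+}\otimes\Gamma(\mathfrak h^e) . \]
Finally let $N^{e+}$ be the asymptotic number operator associated to the representation $\mathfrak h^e_0\ni h\mapsto W^{e+}(h)$, which is defined as in Sec. 8.2. Then
\[ N^{e+} = \mathcal W\big(N^{+}\otimes\mathbf 1_{\Gamma(\mathfrak h)}+\mathbf 1_{\mathcal H}\otimes N\big)\mathcal W^{-1}, \]
and hence:
\[ D(N^{e+}) = \mathcal W\big(D(N^{+})\otimes\Gamma(\mathfrak h^e)\big) . \tag{8.19} \]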
8.7. Properties of the spaces $\mathcal H^{e+}_c$

In this subsection we describe properties of the spaces $\mathcal H^{e+}_c$ connected with the asymptotic fields. Note that Hypothesis (I'2) implies (I'4).

Theorem 8.7. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and $\rho(1+\epsilon_0)>1$. Let $0<c<1$. Then
(i) $\Omega^{e+}(\mathcal H_{\rm pp}(H^e)\otimes\Gamma(\mathfrak h^e))\subset\mathcal H^{e+}_c\subset\mathcal H^{e+}$.
(ii) $W^{e+}(h):\mathcal H^{e+}_c\to\mathcal H^{e+}_c$ for $h\in\mathfrak h^e_0$.
(iii) $\mathfrak h^e_0\ni h\mapsto W^{e+}(h)\in U(\mathcal H^{e+}_c)$ is a regular CCR representation of Fock type.

Proof. We will use the notation of Sec. 5. Let us first prove (i). To prove the first inclusion it suffices by density to show that if $u\in D(H^e)$ with $H^eu=\lambda u$, $u=\chi(L)u$ for $\chi\in C_0^\infty(\mathbb R)$ and $h_i\in\mathfrak h^e_0\cap D(|\sigma|^{1/2})$ for $1\le i\le n$ we have $v=\prod_{i=1}^n a^{*e+}(h_i)u\in\mathcal H^{e+}_c$. This will follow from the fact that
\[ \operatorname*{s-lim}_{\epsilon\to0}\epsilon^{-1}R^{e+}_{c'}(\epsilon^{-1})v = v,\quad\text{for } c'>c . \tag{8.20} \]
As usual to simplify notation we denote $c'$ by $c$. By Theorem 8.6, we have:
\[ \epsilon^{-1}R^{e+}_{c}(\epsilon^{-1})v = \lim_{t\to+\infty} e^{itH^e}(\epsilon B_{ct}+\mathbf 1)^{-1}\prod_{i=1}^n a^*(h_{i,t})\chi(L)e^{-it\lambda}u, \]
and hence
\[ \lim_{\epsilon\to0}\|v-\epsilon^{-1}R^{e+}_c(\epsilon^{-1})v\| \le \lim_{\epsilon\to0}\sup_{t\in\mathbb R^+}\Big\|(\epsilon B_{ct}+\mathbf 1)^{-1}\epsilon B_{ct}\prod_{i=1}^n a^*(h_{i,t})\chi(L)u\Big\| . \]
Since
\[ \|(\epsilon B_{ct}+\mathbf 1)^{-1}\epsilon B_{ct}\|\le1,\qquad \Big\|\prod_{i=1}^n a^*(h_{i,t})\chi(L)\Big\|\le C_n\prod_{i=1}^n\|(1+|\sigma|^{-1/2}+|\sigma|^{1/2})h_i\|, \]
by Lemma 3.11, it suffices by density to show that
\[ \lim_{\epsilon\to0}\sup_{t\in\mathbb R^+}\Big\|(\epsilon B_{ct}+\mathbf 1)^{-1}\epsilon B_{ct}\prod_{i=1}^n a^*(h_{i,t})\chi(L)u\Big\| = 0, \]
for $u\in D(N^e)$. This follows from the fact that
\[ \Big\|B_{ct}\prod_{i=1}^n a^*(h_{i,t})(N^e+1)^{-n/2-1}\Big\|\le C_n\prod_{i=1}^n\|h_i\|,\qquad \|(N^e+1)^k\chi(L)(N^e+1)^{-k}\|<\infty, \]
by Proposition 3.8. This proves (8.20) and hence the first inclusion in (i).

To prove the second inclusion, it suffices to show that $\mathcal H^{e+}_c\subset\overline{D(N^{e+})}=\mathcal H^{e+}$, where $N^{e+}$ is the asymptotic number operator associated to the representation $\mathfrak h^e_0\ni h\mapsto W^{e+}(h)$. By a density argument, it suffices to show that if $u\in\mathcal H^e$ with $\chi(L)u=u$, $\chi\in C_0^\infty(\mathbb R)$ and $\lambda>0$ then $R^{e+}_c(\lambda)u\in D((N^{e+})^{1/2})$. We have for $u\in\mathcal H^e$, $\chi(L)u=u$ and $h\in\mathfrak h^e_0$:
\[
\begin{aligned}
a^{e+}(h)R^{e+}_c(\lambda)u = a^{e+}(h)\chi(L)R^{e+}_c(\lambda)u &= \lim_{t\to+\infty} e^{itH^e}a(h_t)\chi(L)e^{-itH^e}R^{e+}_c(\lambda)u\\
&= \lim_{t\to+\infty} e^{itH^e}a(h_t)\chi(L)(B_{ct}+\lambda)^{-1}e^{-itH^e}u,
\end{aligned}
\]
using Theorem 8.6 and the fact that $a(h_t)\chi(L)$ is uniformly bounded in $t$, by Proposition A.1. Then
\[
\begin{aligned}
\lim_{t\to+\infty} e^{itH^e}a(h_t)\chi(L)(B_{ct}+\lambda)^{-1}e^{-itH^e}u
&= \lim_{t\to+\infty} e^{itH^e}a(h_t)\chi(L)(B_{ct}+\lambda)^{-1}F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u\\
&= \lim_{t\to+\infty} e^{itH^e}a(h_t)(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u,
\end{aligned}
\]
by Proposition 4.5 and Lemma 5.3.
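Let now $\mathfrak f\subset\mathfrak h^e_0$ be a finite dimensional space, $h_1,\dots,h_p$ an orthonormal basis of $\mathfrak f$. Let $h_i^\epsilon\in C_0^\infty(\mathbb R\setminus\{0\})\otimes\mathfrak g$ such that $h_i^\epsilon\to h_i$ in $\mathfrak h^e_0$ when $\epsilon\to0$. Since by Theorem 8.6
\[ \mathfrak h^e_0\ni h\mapsto a^{e+}(h)(L+b)^{-1/2}\in B(\mathcal H^e) \]
is norm continuous, we have
\[ \sum_{i=1}^p\|a^{e+}(h_i)\chi(L)u\|^2 = \lim_{\epsilon\to0}\sum_{i=1}^p\|a^{e+}(h_i^\epsilon)\chi(L)u\|^2,\quad u\in\mathcal H^e . \tag{8.21} \]
Next we observe that by stationary phase estimates, we have for $h\in C_0^\infty(\mathbb R\setminus\{0\})\otimes\mathfrak g$:
\[ (1-b_{ct}^{1/2})h_t\in O(t^{-\infty}) . \tag{8.22} \]
Let now
\[ P_t^\epsilon = \sum_{i=1}^p|h^\epsilon_{i,t}\rangle\langle h^\epsilon_{i,t}| = e^{-it\sigma}P_0^\epsilon e^{it\sigma} . \]
By (8.22) we have:
\[ \|(1-b_{ct}^{1/2})P_t^\epsilon\|\le C_{\epsilon,n_0}t^{-n_0},\quad\forall\,n_0\in\mathbb N, \]
and hence
\[
\begin{aligned}
P_t^\epsilon &= b_{ct}^{1/2}P_t^\epsilon b_{ct}^{1/2}+b_{ct}^{1/2}P_t^\epsilon(1-b_{ct}^{1/2})+(1-b_{ct}^{1/2})P_t^\epsilon\\
&\le b_{ct}^{1/2}P_t^\epsilon b_{ct}^{1/2}+C_{\epsilon,n_0}t^{-n_0}\\
&\le \|P_0^\epsilon\|\,b_{ct}+C_{\epsilon,n_0}t^{-n_0} .
\end{aligned}
\tag{8.23}
\]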
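The algebra behind (8.23) can be sanity-checked in finite dimensions. The sketch below is only an illustration under stated assumptions: random real symmetric matrices stand in for $b_{ct}$ (spectrum in $[0,1]$) and for the finite-rank operator $P_t^\epsilon$, and the function names are hypothetical. It checks the exact three-term decomposition and the bound $b^{1/2}Pb^{1/2}\le\|P\|\,b$; the $O(t^{-\infty})$ remainder estimate is not modelled.

```python
import numpy as np


def psd_sqrt(m):
    """Square root of a symmetric positive semidefinite matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T


def check_decomposition(n=6, rank=3, seed=0):
    """Return (max error of the decomposition, min eigenvalue of ||P|| b - b^{1/2} P b^{1/2})."""
    rng = np.random.default_rng(seed)
    # b: symmetric matrix with spectrum in [0, 1] (toy stand-in for b_ct)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    b = (q * rng.uniform(0.0, 1.0, n)) @ q.T
    # P: finite-rank positive matrix sum_i |h_i><h_i| (toy stand-in for P_t^eps)
    h = rng.standard_normal((n, rank))
    p = h @ h.T
    s = psd_sqrt(b)
    one = np.eye(n)
    # P = b^{1/2} P b^{1/2} + b^{1/2} P (1 - b^{1/2}) + (1 - b^{1/2}) P
    rebuilt = s @ p @ s + s @ p @ (one - s) + (one - s) @ p
    identity_error = np.abs(rebuilt - p).max()
    # b^{1/2} P b^{1/2} <= ||P|| b, i.e. the difference is positive semidefinite
    gap = np.linalg.norm(p, 2) * b - s @ p @ s
    min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()
    return identity_error, min_eig
```

The decomposition is a purely algebraic identity, so the first returned value is zero up to rounding; the second being nonnegative reflects $P\le\|P\|\mathbf 1$ conjugated by $b^{1/2}$.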
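Hence we obtain
\[ \sum_{i=1}^p a^*(h^\epsilon_{i,t})a(h^\epsilon_{i,t}) = d\Gamma(P_t^\epsilon)\le\|P_0^\epsilon\|\,d\Gamma(b_{ct})+C_{\epsilon,n_0}N^e t^{-n_0},\quad\forall\,n_0\in\mathbb N . \tag{8.24} \]
We now write for $u\in\mathcal H^e$, $\chi(L)u=u$:
\[
\begin{aligned}
\sum_{i=1}^p\|a^{e+}(h_i^\epsilon)R^{e+}_c(\lambda)u\|^2
&= \lim_{t\to+\infty}\sum_{i=1}^p\Big\|a(h^\epsilon_{i,t})(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u\Big\|^2\\
&\le \lim_{t\to+\infty}\|P_0^\epsilon\|\,\Big\|B_{ct}^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)e^{-itH^e}u\Big\|^2\\
&\quad+\lim_{t\to+\infty}C_{\epsilon,n_0}t^{-n_0}\Big\|(N^e)^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u\Big\|^2 .
\end{aligned}
\]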
We have
\[
\begin{aligned}
\Big\|(N^e)^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)\Big\|
&\le C\,\|(N^e)^{1/2}\chi(L)(N^e+1)^{-1/2}\|\,\Big\|(N^e+1)^{1/2}F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)\Big\|\\
&\le C_0t^{\delta/2},
\end{aligned}
\]
by Lemma 3.5(iii) and Lemma 4.6. On the other hand:
\[ \Big\|B_{ct}^{1/2}(B_{ct}+\lambda)^{-1}\chi(L)F\Big(\frac{N^e_t}{t^\delta}\Big)e^{-itH^e}u\Big\|^2\le\lambda^{-1}\|u\|^2 . \]
This yields
\[ \sum_{i=1}^p\|a^{e+}(h_i^\epsilon)R^{e+}_c(\lambda)u\|^2\le C(\|P_0^\epsilon\|+\lambda^{-1})\|u\|^2, \tag{8.25} \]
uniformly in $\epsilon$, $p$. Note that since $h_i^\epsilon\to h_i$ in $\mathfrak h^e$ and $h_1,\dots,h_p$ is an orthonormal family, we have $\|P_0^\epsilon\|\to1$ when $\epsilon\to0$. Using (8.21) and letting $\epsilon\to0$ in (8.25), we get:
\[ \sum_{i=1}^p\|a^{e+}(h_i)R^{e+}_c(\lambda)u\|^2\le C(1+\lambda^{-1})\|u\|^2, \]
uniformly in $p$. By the definition of $D(N^{e+})$ recalled in Sec. 8.2 (see also [11, Sec. 4.2]), this implies that $R^{e+}_c(\lambda)u\in D((N^{e+})^{1/2})$ for any $\lambda>0$ and completes the proof of (i).

Let us now prove (ii). We have to show that for $u\in\mathcal H^{e+}_c$, $h\in\mathfrak h^e_0$ and $c<c'<1$:
\[ \lim_{\epsilon\to0}\big(\mathbf 1-\epsilon^{-1}R^{e+}_{c'}(\epsilon^{-1})\big)W^{e+}(h)u = 0 . \]
Since $\mathbf 1-\epsilon^{-1}R_{c'}(\epsilon^{-1})$ is uniformly bounded in $\epsilon$ it suffices to show by density that for $\lambda>0$:
\[ \lim_{\epsilon\to0}\big(\mathbf 1-\epsilon^{-1}R^{e+}_{c'}(\epsilon^{-1})\big)W^{e+}(h)R^{e+}_{c'}(\lambda)u = 0 . \tag{8.26} \]
We set $c'=c$ to shorten notation and we have:
\[ \big(\mathbf 1-\epsilon^{-1}R^{e+}_c(\epsilon^{-1})\big)W^{e+}(h)R^{e+}_c(\lambda)u = \lim_{t\to+\infty} e^{itH^e}\big(\mathbf 1-(\epsilon B_{ct}+\mathbf 1)^{-1}\big)e^{i\phi(h_t)}(B_{ct}+\lambda)^{-1}e^{-itH^e}u . \]
From the identity
\[ e^{-i\phi(h)}d\Gamma(b)e^{i\phi(h)} = d\Gamma(b)+\phi(ibh)-\tfrac12{\rm Re}(bh,h) \]
for $h\in\mathfrak h^e$, $b\in B(\mathfrak h^e)$, we obtain:
\[ \big(\mathbf 1-(\epsilon B_{ct}+\mathbf 1)^{-1}\big)e^{i\phi(h_t)}(B_{ct}+\lambda)^{-1} = \epsilon(\epsilon B_{ct}+\mathbf 1)^{-1}e^{i\phi(h_t)}\Big(B_{ct}+\phi(ib_{ct}h_t)-\tfrac12{\rm Re}(b_{ct}h_t,h_t)\Big)(B_{ct}+\lambda)^{-1} . \]
By Proposition A.1 this yields
\[ \big\|\big(\mathbf 1-(\epsilon B_{ct}+\mathbf 1)^{-1}\big)e^{i\phi(h_t)}(B_{ct}+\lambda)^{-1}\big\|\le\epsilon\,C(\lambda)\big(\|b_{ct}^{1/2}h_t\|+\|b_{ct}h_t\|+1\big) . \]
Since $\|b_{ct}\|\le1$, $\|h_t\|=\|h\|$, we obtain (8.26). Finally property (iii) follows from (i), (ii) and Theorem 8.5.

We end this section with another similar result.

Proposition 8.8. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and $\rho(1+\epsilon_0)>1$. Let $0<c<c'<1$. Let $f_0$ be a cutoff function as in (6.1). Then
\[ {\rm Ran}\,\Gamma^{e+}_{c'}(f_0)\subset\mathcal K^{e+} . \]
Proof. By a density argument using the fact that $\mathcal K^{e+}$ is closed, it suffices to show that if $u\in\mathcal H^{e+}_c$, $\chi(L)u=u$ for $\chi\in C_0^\infty(\mathbb R)$, then
\[ (L+i)^{-1}a^{e+}(h)\Gamma^{e+}_{c'}(f_0)u = 0,\quad\forall\,h\in\mathfrak h^e_0 . \tag{8.27} \]
Since by Theorem 8.6 the map $\mathfrak h^e_0\ni h\mapsto a^{*e+}(h)(L+i)^{-1}\in B(\mathcal H^e)$ is norm continuous, it suffices to prove (8.27) for $h\in C_0^\infty(\mathbb R\setminus\{0\})\otimes\mathfrak g$. Let us again set $c'=c$ to simplify the notation. We have by Theorem 8.6, Proposition 4.5 and the fact that $(L+i)^{-1}a(h_t)$ is uniformly bounded:
\[
\begin{aligned}
(L+i)^{-1}a^{e+}(h)\Gamma^{e+}_c(f_0)u
&= \lim_{t\to+\infty} e^{itH^e}(L+i)^{-1}a(h_t)\Gamma(f_0^t)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u\\
&= \lim_{t\to+\infty} e^{itH^e}(L+i)^{-1}a(f_1^th_t)\Gamma(f_0^t)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)e^{-itH^e}u,
\end{aligned}
\]
where $f_1^t=f_1\big(\frac{s-ct}{t^\rho}\big)$, $f_1$ a cutoff function as $f_0$ with $f_1f_0=f_0$. By stationary phase estimates, since $0<c<1$, $\|f_1^th_t\|_{\mathfrak h^e}\in O(t^{-\infty})$ for $h\in C_0^\infty(\mathbb R\setminus\{0\})\otimes\mathfrak g$ and hence
\[
\begin{aligned}
\Big\|a(f_1^th_t)\Gamma(f_0^t)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)\Big\|
&\le\|f_1^th_t\|\,\Big\|(N^e+1)^{1/2}\Gamma(f_0^t)F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)\Big\|\\
&\le\|f_1^th_t\|\,\Big\|(N^e+1)^{1/2}F\Big(\frac{N^e_t}{t^\delta}\Big)\chi(L)\Big\|\in O(t^{-\infty}),
\end{aligned}
\]
by Lemma 4.6. Hence (8.27) holds for all $h\in C_0^\infty(\mathbb R\setminus\{0\})\otimes\mathfrak g$, which completes the proof.

9. Geometric Asymptotic Completeness for $H^e$

In this section we prove the geometric asymptotic completeness for $H^e$. This property is a geometric characterization of the space $\mathcal K^{e+}_c=\mathcal K^{e+}\cap\mathcal H^{e+}_c$. The space $\mathcal K^{e+}_c$ is the space of vacuum states in $\mathcal H^{e+}_c$. We show in Theorem 9.5 that those states are localized in the region $\{|s|\le c't\}$, for any $c<c'$. We assume in this section Hypotheses (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$. We pick the constant $0<\rho<1$ such that $\rho(1+\epsilon_0)>1$.

9.1. Technical preparations

Lemma 9.1. Let $j_0,j_\infty,b\in B(\mathfrak h^e)$, $j_0,j_\infty\ge0$, $j_0+\alpha j_\infty\le\mathbf 1$ for $\alpha>0$. Then for $u_1,u_2\in\mathcal H^e$:
\[ |(u_2,P_k(j_0,j_\infty+b)u_1)-(u_2,P_k(j_0,j_\infty)u_1)|\le\sum_{r=1}^k\alpha^{r-k}\|d\Gamma(|b|)^{r/2}u_2\|\,\|d\Gamma(|b|)^{r/2}u_1\| . \]
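Proof. We have on $\bigotimes_s^n\mathfrak h^e$:
\[
\begin{aligned}
P_k(j_0,j_\infty+b)-P_k(j_0,j_\infty)
&= \sum_{r=1}^k\ \sum_{1\le j_1,\dots,j_r\le n}\ \sum_{\sharp\{i\,|\,\epsilon_i=\infty\}=k-r} j_{\epsilon_1}\otimes\cdots\otimes\underset{\widehat{j_1}}{b}\otimes j_{\epsilon_{j_1+1}}\otimes\cdots\otimes\underset{\widehat{j_r}}{b}\otimes\cdots\otimes j_{\epsilon_n}\\
&= \sum_{r=1}^k\ \sum_{1\le j_1,\dots,j_r\le n} M_{j_1,\dots,j_r}T_{j_1,\dots,j_r},
\end{aligned}
\]
for
\[
\begin{aligned}
M_{j_1,\dots,j_r} &= \sum_{\sharp\{i\,|\,\epsilon_i=\infty\}=k-r} j_{\epsilon_1}\otimes\cdots\otimes\underset{\widehat{j_1}}{\mathbf 1}\otimes j_{\epsilon_{j_1+1}}\otimes\cdots\otimes\underset{\widehat{j_r}}{\mathbf 1}\otimes\cdots\otimes j_{\epsilon_n},\\
T_{j_1,\dots,j_r} &= \mathbf 1\otimes\cdots\otimes\underset{\widehat{j_1}}{b}\otimes\mathbf 1\otimes\cdots\otimes\underset{\widehat{j_r}}{b}\otimes\cdots\otimes\mathbf 1 .
\end{aligned}
\]
Note that
\[ \sum_{1\le j_1,\dots,j_r\le n}|T_{j_1,\dots,j_r}| = (d\Gamma(|b|))^r . \]
Since $j_0,j_\infty\ge0$, $j_0+\alpha j_\infty\le\mathbf 1$, we obtain by replacing $j_\infty$ by $\alpha j_\infty$ that
\[ \|M_{j_1,\dots,j_r}\|\le\alpha^{r-k} . \tag{9.1} \]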
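For $u_1,u_2\in\bigotimes_s^n\mathfrak h^e$, we obtain:
\[
\begin{aligned}
&|(u_2,P_k(j_0,j_\infty+b)u_1)-(u_2,P_k(j_0,j_\infty)u_1)|\\
&\quad\le\sum_{r=1}^k\ \sum_{1\le j_1,\dots,j_r\le n}\alpha^{r-k}\,\||T_{j_1,\dots,j_r}|^{1/2}u_1\|\,\||T_{j_1,\dots,j_r}|^{1/2}u_2\|\\
&\quad\le\sum_{r=1}^k\Big(\sum_{1\le j_1,\dots,j_r\le n}\alpha^{r-k}(u_1,|T_{j_1,\dots,j_r}|u_1)\Big)^{1/2}\times\Big(\sum_{1\le j_1,\dots,j_r\le n}\alpha^{r-k}(u_2,|T_{j_1,\dots,j_r}|u_2)\Big)^{1/2}\\
&\quad=\sum_{r=1}^k\alpha^{r-k}\|d\Gamma(|b|)^{r/2}u_1\|\,\|d\Gamma(|b|)^{r/2}u_2\|,
\end{aligned}
\]
by (9.1).

We pick for $0<\epsilon<1$ a cutoff function $F_\epsilon\in C_0^\infty([\epsilon,\epsilon^{-1}])$ with $0\le F_\epsilon\le1$ and set
\[ b_\epsilon := F_\epsilon(\sigma) . \]

Lemma 9.2. The operator $(L+i)^{-k/2}I\,\mathbf 1\otimes\Gamma(b_\epsilon)$ is bounded on $\mathcal H^e\otimes\bigotimes_s^k\mathfrak h^e$ for $k\in\mathbb N$.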
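Proof. We first claim that if $B$ is an open set included in $\mathbb R\setminus\{0\}\times S^2$, $p\in\mathbb N$ then
\[ \int_{B^p}\|a(\sigma_1,\omega_1)\cdots a(\sigma_p,\omega_p)u\|^2\,d\sigma_1\cdots d\sigma_p\,d\omega_1\cdots d\omega_p\le C_p\|(L+b)^{p/2}u\|^2,\quad u\in\mathcal H^e . \tag{9.2} \]
In fact we write $\mathfrak h^e = L^2(B,d\sigma d\omega)\oplus L^2(B^c,d\sigma d\omega) =: \mathfrak h^e_B\oplus\mathfrak h^{e\perp}_B$. Let $U$ be the canonical map from $\Gamma(\mathfrak h^e)$ into $\Gamma(\mathfrak h^e_B)\otimes\Gamma(\mathfrak h^{e\perp}_B)$. Then
\[ U a(\sigma,\omega) = (a(\sigma,\omega)\otimes\mathbf 1)U\ \text{ for }(\sigma,\omega)\in B,\qquad U\,d\Gamma(\mathbf 1_B) = N^e\otimes\mathbf 1\,U . \tag{9.3} \]
This yields
\[
\begin{aligned}
\int_{B^p}\Big\|\prod_1^p a(\sigma_i,\omega_i)u\Big\|^2 d\sigma_1\cdots d\sigma_p\,d\omega_1\cdots d\omega_p
&= \int_{B^p}\Big\|\prod_1^p a(\sigma_i,\omega_i)\otimes\mathbf 1\,Uu\Big\|^2 d\sigma_1\cdots d\sigma_p\,d\omega_1\cdots d\omega_p\\
&= \big(N^e\cdots(N^e-p+1)\otimes\mathbf 1\,Uu,Uu\big)\\
&\le C_p\big((N^e+1)^p\otimes\mathbf 1\,Uu,Uu\big) = C_p\|(d\Gamma(\mathbf 1_B)+\mathbf 1)^{p/2}u\|^2,
\end{aligned}
\]
by (9.3). Using then Lemma 3.6(iii), we obtain (9.2).

Next for $u,v\in\mathcal H^e$, $\psi\in\bigotimes_s^k\mathfrak h^e$, we have
\[
\begin{aligned}
&|(u,(L+i)^{-k/2}I\,\mathbf 1\otimes\Gamma(b_\epsilon)v\otimes\psi)|\\
&\quad= \frac{1}{(k!)^{1/2}}\Big|\int\psi(\sigma_1,\omega_1,\dots,\sigma_k,\omega_k)\prod_{i=1}^kF_\epsilon(\sigma_i)\,\big(a(\sigma_1,\omega_1)\cdots a(\sigma_k,\omega_k)(L-i)^{-k/2}u,v\big)_{\mathcal H^e}\,d\sigma_1\cdots d\sigma_k\,d\omega_1\cdots d\omega_k\Big|\\
&\quad\le C_{k,\epsilon}\|\psi\|_{\otimes_s^k\mathfrak h^e}\|u\|_{\mathcal H^e}\|v\|_{\mathcal H^e},
\end{aligned}
\]
by (9.2) for $B=\,]\epsilon,\epsilon^{-1}[\,\times S^2$ and the Cauchy–Schwarz inequality.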
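9.2. Geometric asymptotic completeness

Proposition 9.3. Let $0\le c<c'\le1$. Let $j_0,j_\infty$ satisfying (6.1) and (6.4) with $j_\infty^{1/2}\in C^\infty(\mathbb R)$ and let $j_0^t,j_\infty^t$ be as in (7.1) for the constant $c'$. Then
\[ \operatorname*{w-lim}_{t\to+\infty} e^{itH^e}P_k\big(j_0^t,(j_\infty^t)^{1/2}b_\epsilon(j_\infty^t)^{1/2}\big)e^{-itH^e} =: P_k^{+}\big(j_0,j_\infty^{1/2}b_\epsilon j_\infty^{1/2}\big) \]
exists on $\mathcal H^{e+}_c$ and equals $\Omega^{e+}\,\mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)$.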
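Proof. By the definition of $\mathcal H^{e+}_c$ it suffices to show the existence of the limit on ${\rm Ran}\,\hat P^{e+}_{c'}$. Changing notation we may replace $c'$ by $c$. We first claim that for $\chi_1\in C_0^\infty(\mathbb R)$:
\[ \chi_1(L)\Omega^{e+}\,\mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j) = \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi_1(L)I\,\mathbf 1\otimes\Gamma(b_\epsilon)\check\Gamma_k(j^t)e^{-itH^e}\quad\text{on } {\rm Ran}\,\hat P^{e+}_c . \tag{9.4} \]
First since by Lemma 9.2, $\chi_1(L)I\,\mathbf 1\otimes\Gamma(b_\epsilon)$ is bounded, it suffices to prove (9.4) for $u\in{\rm Ran}\,\hat P^{e+}_c$ with $u=\chi(L)u$ for $\chi\in C_0^\infty(\mathbb R)$. We note first that by Theorem 7.7(ii):
\[ W_k^{+}(j)u = W_k^{+}(j)\chi(L)u = \chi(L_{\rm ext})W_k^{+}(j)u . \]
Moreover by Theorem 7.7(iv) and Proposition 8.8
\[ \mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)u\in\mathcal K^{e+}\otimes\Gamma(\mathfrak h^e) = D(\Omega^{e+}) . \]
Also
\[ \mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)u = \chi(L_{\rm ext})\,\mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)u\in D((L+i)^{k/2})\otimes\bigotimes_s^kD(|\sigma|^{-1/2}+|\sigma|^{1/2})\subset D(\Omega^{e+}_{\rm ext}) . \]
Hence for $\chi_1\in C_0^\infty(\mathbb R)$:
\[
\begin{aligned}
\chi_1(L)\Omega^{e+}\,\mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)u
&= \chi_1(L)\Omega^{e+}_{\rm ext}\,\mathbf 1\otimes\Gamma(b_\epsilon)W_k^{+}(j)u\\
&= \lim_{t\to+\infty} e^{itH^e}\chi_1(L)I\,\mathbf 1\otimes\Gamma(b_\epsilon)e^{-itH^e_{\rm ext}}W_k^{+}(j)u .
\end{aligned}
\]
Since by Lemma 9.2 $\chi_1(L)I\,\mathbf 1\otimes\Gamma(b_\epsilon)$ is bounded, we can apply the chain rule and obtain (9.4). Next we note that
\[ I\,\mathbf 1\otimes\Gamma(b_\epsilon)\check\Gamma_k(j^t) = P_k(j_0^t,b_\epsilon j_\infty^t), \]
which implies
\[ \|\chi_1(L)P_k(j_0^t,b_\epsilon j_\infty^t)\|\le C,\quad\text{uniformly in } t . \tag{9.5} \]
Note also that since $b_\epsilon\le1$, we have
\[ j_0^t+\alpha(j_\infty^t)^{1/2}b_\epsilon(j_\infty^t)^{1/2}\le j_0^t+\alpha j_\infty^t\le1, \]
and hence
\[ \|P_k\big(j_0^t,(j_\infty^t)^{1/2}b_\epsilon(j_\infty^t)^{1/2}\big)\|\le C,\quad\text{uniformly in } t . \tag{9.6} \]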
By (9.4), (9.5), (9.6) and a density argument, to prove the proposition it suffices to show that for $u,v\in D((N^e)^\infty)$:
\[ \lim_{t\to+\infty}\Big(e^{-itH^e}v,\ \chi(L)\big(P_k(j_0^t,b_\epsilon j_\infty^t)-P_k(j_0^t,(j_\infty^t)^{1/2}b_\epsilon(j_\infty^t)^{1/2})\big)\chi(L)e^{-itH^e}u\Big) = 0 . \tag{9.7} \]
We have $b_\epsilon j_\infty^t = (j_\infty^t)^{1/2}b_\epsilon(j_\infty^t)^{1/2}+r_t^\epsilon$, for
\[ r_t^\epsilon = [b_\epsilon,(j_\infty^t)^{1/2}](j_\infty^t)^{1/2},\qquad r_t^\epsilon\in O(t^{-\rho}) . \]
Let $b_{1,\epsilon}$ be a cutoff function similar to $b_\epsilon$ such that $b_{1,\epsilon}\equiv1$ on ${\rm supp}\,b_\epsilon$. By p.d.o. calculus:
\[ (r_t^{\epsilon*}r_t^\epsilon)^p = b_{1,\epsilon}^p(r_t^{\epsilon*}r_t^\epsilon)^pb_{1,\epsilon}^p+r^\epsilon_{p,t},\qquad r^\epsilon_{p,t}\in O(t^{-\infty}),\quad p\in\mathbb N . \]
Let us fix $k\ge1$ and set:
\[ r_\infty(t) := \sup_{1\le p\le k}\|r^\epsilon_{p,t}\|^{1/2p}\in O(t^{-\infty}) . \]
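We have
\[ (r_t^{\epsilon*}r_t^\epsilon)^p\le C^{2p}t^{-2p\rho}b_{1,\epsilon}^{2p}+r_\infty(t)^{2p}\mathbf 1,\quad p\le k . \]
Since the function $\lambda\mapsto\lambda^{1/2}$ is matrix monotone, this gives
\[
\begin{aligned}
|r_t^\epsilon|^p &\le\big(C^{2p}t^{-2p\rho}b_{1,\epsilon}^{2p}+r_\infty(t)^{2p}\mathbf 1\big)^{1/2}\\
&\le C^pt^{-p\rho}b_{1,\epsilon}^p+r_\infty(t)^p\mathbf 1\le\big(Ct^{-\rho}b_{1,\epsilon}+r_\infty(t)\mathbf 1\big)^p,\quad p\le k .
\end{aligned}
\tag{9.8}
\]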
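The two ingredients of (9.8) can be illustrated numerically. The following is a minimal sketch under stated assumptions, not part of the proof: random positive semidefinite matrices stand in for the operators, and the function names are hypothetical. It checks that $A\le B$ implies $A^{1/2}\le B^{1/2}$ (operator monotonicity of the square root), and evaluates the scalar chain $(x^{2p}+y^{2p})^{1/2}\le x^p+y^p\le(x+y)^p$ for $x,y\ge0$, which is what the last two inequalities of (9.8) reduce to since $b_{1,\epsilon}$ commutes with $\mathbf 1$.

```python
import numpy as np


def psd_sqrt(m):
    """Square root of a symmetric positive semidefinite matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T


def sqrt_monotone_gap(n=5, seed=1):
    """Smallest eigenvalue of B^{1/2} - A^{1/2} for random 0 <= A <= B."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, n))
    a = x @ x.T                  # A >= 0
    y = rng.standard_normal((n, n))
    b = a + y @ y.T              # B = A + (positive term) >= A
    gap = psd_sqrt(b) - psd_sqrt(a)
    return np.linalg.eigvalsh((gap + gap.T) / 2).min()


def scalar_chain(x, y, p):
    """Return the three quantities compared in the scalar version of (9.8)."""
    return (x ** (2 * p) + y ** (2 * p)) ** 0.5, x ** p + y ** p, (x + y) ** p
```

For instance, `scalar_chain(0.3, 0.7, 3)` returns three numbers in nondecreasing order, and `sqrt_monotone_gap()` is nonnegative up to rounding.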
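Let $f(t)$ be the expression on the l.h.s. of (9.7). By Lemma 9.1, we have
\[ |f(t)|\le\sum_{r=1}^k\alpha^{r-k}\|d\Gamma(|r_t^\epsilon|)^{r/2}\chi(L)e^{-itH^e}v\|\,\|d\Gamma(|r_t^\epsilon|)^{r/2}\chi(L)e^{-itH^e}u\| . \tag{9.9} \]
Using then (9.8) we deduce from Lemma A.2 that:
\[ d\Gamma(|r_t^\epsilon|)^p\le\big(Ct^{-\rho}d\Gamma(b_{1,\epsilon})+r_\infty(t)N\big)^p,\quad p\le k . \tag{9.10} \]
Using now (9.10), we obtain for $r\le k$:
\[
\begin{aligned}
\|d\Gamma(|r_t^\epsilon|)^{r/2}\chi(L)e^{-itH^e}u\|^2
&\le\big(\chi(L)e^{-itH^e}u,\ (Ct^{-\rho}d\Gamma(b_{1,\epsilon})+r_\infty(t)N)^r\chi(L)e^{-itH^e}u\big)\\
&\le\sum_{j=0}^rC^jt^{-j\rho}r_\infty(t)^{r-j}\|d\Gamma(b_{1,\epsilon})^j\chi(L)e^{-itH^e}u\|\,\|(N^e)^{r-j}\chi(L)e^{-itH^e}u\| .
\end{aligned}
\]
By Lemma 3.6 $d\Gamma(b_{1,\epsilon})^j\chi(L)$ is bounded. By Proposition 3.8 and the fact that $u\in D((N^e)^\infty)$ we know that $\|(N^e)^{r-j}\chi(L)e^{-itH^e}u\|\le Ct^{r-j}$. Hence
\[ \lim_{t\to+\infty}\|d\Gamma(|r_t^\epsilon|)^{r/2}\chi(L)e^{-itH^e}u\| = 0, \]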
t→+∞
which proves (9.7). Theorem 9.4. Let 0 < c < c0 < 1. Let f0 , f∞ be defined in (6.23), (6.24). Let j = (f0 , f∞ ), j t be as in (7.1) for the constant c0 . Then Ωe+ Wk+ (j) = Pk+ (j) on Hce+ . Proof. By the same argument as in Proposition 9.3, we set c0 = c and reduce ourselves to prove the theorem on RanPˇce+ . We first note that Wk+ (j) Ran Pˇce+ ⊂ Ke+ ⊗
k O
he ⊂ D(Ωe+ ) ,
s
by Proposition 8.8 and Theorem 7.7(iv). By Proposition 6.6 Pk+ (j) = w- lim Pk+ (f0 , f∞, ) , →0
November 25, 2002 16:35 WSPC/148-RMP
1250
00150
C. G´ erard
and by Proposition 7.8 Wk+ (j) = w- lim Wk+ (f0 , f∞, ) . →0
Hence it suffices to prove that for any 0 > 0: Ωe+ Wk+ (f0 , f∞,0 ) = Pk+ (f0 , f∞,0 ) .
(9.11)
We note the following identity, similar to those in [10, Lemma 2.14], valid for r0 , r∞ , b ∈ B(he ), 0 ≤ r0 + αr∞ ≤ 1, 0 ≤ b ≤ 1, r = (r0 , r∞ ): ˇ ˇ ∗ 1l ⊗ Γ(b)1l{k} (N∞ )Γ(r) = Pk (r02 , r∞ br∞ ) . Γ(r) 1
(9.12) 1
2 Let us fix the constant 0 . We will apply (9.12) to r0 = f02 , r∞ = f∞, 0 , b = b , where b is defined in Lemma 9.2. Note that by Lemma 6.5 there exists α > 0 such 1 1 2 that f02 + αf∞, 0 ≤ 1, so we can apply this identity. By Theorem 7.7(i) and (vii), (9.12) and the chain rule of wave operators, we get: 1
1
2 2 Wk+∗ (r)1l ⊗ Γ(b )Wk+ (r) = Pk+ (f0 , f∞, 0 b f∞,0 ) .
By Proposition 9.3, we obtain Wk+ (r)∗ 1l ⊗ Γ(b )Wk+ (r) = Ωe+ 1l ⊗ Γ(b )Wk+ (f0 , f∞,0 ) . Now since Wk+ (r), Ωe+ are bounded operators: s- lim Wk+ (r)∗ 1l ⊗ Γ(b )Wk+ (r) = Wk+ (r)∗ Wk+ (r) = Pk+ (f0 , f∞,0 ) , →0
s- lim Ω+ 1l ⊗ Γ(b )Wk+ (f0 , f∞,0 ) = Ω+ Wk+ (f0 , f∞,0 ) . →0
Hence (9.11) holds and this completes the proof of the theorem.

The following theorem is the so-called geometric asymptotic completeness. It provides a geometric characterization of the asymptotic vacuum states.

Theorem 9.5. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and let $\rho(1+\epsilon_0)>1$. Let $0<c<c'<1$. Let $f_1\in C^\infty(\mathbb R)$ be a cutoff function such that $0\le f_1\le1$, $f_1\equiv1$ in $\{s\le\alpha_1\}$, $f_1\equiv0$ in $\{s\ge\alpha_2\}$ and $f_1^t=f_1\big(\frac{s-c't}{t^\rho}\big)$. Then $\Gamma^{e+}_{c'}(f_1)$ defined in (6.17) is equal to the orthogonal projection on $\mathcal K^{e+}_c := \mathcal K^{e+}\cap\mathcal H^{e+}_c$.

Proof. By Proposition 8.8 we know that ${\rm Ran}\,\Gamma^{e+}_{c'}(f_1)\subset\mathcal K^{e+}$ and since $\Gamma^{e+}_{c'}(f_1)$ clearly preserves $\mathcal H^{e+}_c$, we have ${\rm Ran}\,\Gamma^{e+}_{c'}(f_1)\subset\mathcal K^{e+}_c$. Let us prove the converse inclusion. By Theorem 8.7, $\mathfrak h^e_0\ni h\mapsto W^{e+}(h)\in U(\mathcal H^{e+}_c)$ is a regular CCR representation of Fock type. Hence the restriction $\Omega^{e+}_c$ of the wave operator $\Omega^{e+}$ to $\mathcal K^{e+}_c\otimes\Gamma(\mathfrak h^e)$:
\[ \Omega^{e+}_c:\ \mathcal K^{e+}_c\otimes\Gamma(\mathfrak h^e)\to\mathcal H^{e+}_c \]
is unitary. Let now $j=(f_0,f_\infty)$ as in Theorem 9.4 for a constant $c''$ with $c<c''<c'$. Let
\[ W^{+}(j) = \bigoplus_{k=0}^{\infty}W_k^{+}(j) . \]
Since $\|W_k^{+}(j)\|\le1$ and ${\rm Ran}\,W_k^{+}(j)\subset\mathcal H^e\otimes\bigotimes_s^k\mathfrak h^e$, we have $\|W^{+}(j)\|\le1$. Next we note that by Theorem 7.6(iii) and Theorem 7.7(vi) we have
\[ W^{+}(j)\mathcal H^{e+}_c\subset\mathcal H^{e+}_c\otimes\Gamma(\mathfrak h^e) . \]
Moreover by Theorem 7.7(iv) and the fact that $f_0^tf_1^t=f_0^t$ for $t$ large enough, we have
\[ W^{+}(j) = \Gamma^{e+}_{c'}(f_1)\otimes\mathbf 1\,W^{+}(j) . \tag{9.13} \]
By Proposition 8.8 this implies that
\[ W^{+}(j)\mathcal H^{e+}_c\subset\mathcal K^{e+}_c\otimes\Gamma(\mathfrak h^e) . \]
Finally by Theorem 9.4 and Theorem 6.4(iv)
\[ \Omega^{e+}_cW^{+}(j) = \mathbf 1\quad\text{on }\mathcal H^{e+}_c, \]
which means that
\[ W^{+}(j) = (\Omega^{e+}_c)^{-1} . \]
By (9.13) this implies that $\Gamma^{e+}_{c'}(f_1)=\mathbf 1$ on $\mathcal K^{e+}_c$, and hence that $\Gamma^{e+}_{c'}(f_1)$ is the orthogonal projection on $\mathcal K^{e+}_c$.
10. 1-Particle Space Estimates

This section is devoted to some estimates on the one-particle space $L^2(\mathbb R^3,dk)$. They will be used in Secs. 11 and 12 to construct the spaces analogous to $\mathcal H^{e+}_c$ for the Nelson Hamiltonian. The need for these estimates can be understood as follows: The space $\mathcal H^{e+}_c$ is constructed using the observable $s=i\partial_\sigma$ acting on $\mathfrak h^e=L^2(\mathbb R,d\sigma)\otimes\mathfrak g$. This observable has the drawback that it does not commute with the projection $\mathbf 1_{\mathbb R^+}(\sigma)$ and hence does not satisfy the condition (3.6) in Sec. 3.4. In Sec. 10.2 we introduce the observable $|s|_0$ which is the square root of the Laplacian $-\frac{\partial^2}{\partial\sigma^2}$ with Dirichlet condition at 0 and satisfies (3.6). We estimate the difference between some functions of $s$ and $|s|_0$. It will allow us in Sec. 11 to reinterpret the space $\mathcal H^{e+}_c$ using the observable $|s|_0$. In this way a space $\mathcal H^{+}_c$ can be constructed on $\mathcal H$ using the abstract arguments in Sec. 3.4.

In Sec. 12 we describe the space $\mathcal H^{+}_c$ replacing the observable $|s|_0$ by the more physical position observable $|x|$. We note that $|x|$ is the square root of the Laplacian $-\Delta_k$ acting on $L^2(\mathbb R^3,dk)$. Going to polar coordinates we see that $|x|$ is the square root of $-\frac{\partial^2}{\partial\tilde\sigma^2}-\frac{\Delta_\omega}{\tilde\sigma^2}$ acting on $L^2(\mathbb R^+,d\tilde\sigma)\otimes L^2(S^2)$ with a Dirichlet condition at 0. Again we need to estimate the difference between functions of $|x|$ and functions of $|s|_0$. This is done in Sec. 10.1, by introducing cutoffs in the angular part $-\frac{\Delta_\omega}{\tilde\sigma^2}$ of the Laplacian. The use of these cutoffs in Sec. 12 will be justified using the results of Sec. 4.5.
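10.1. Case of $\mathfrak h$

We use the notation of Sec. 1.1. We will consider the observable
\[ |x| = (-\Delta_k)^{1/2} . \]
Note that $-\Delta_k$ with domain $H^2(\mathbb R^3)$ is also the Friedrichs extension of $-\Delta_k$ on $C_0^\infty(\mathbb R^3\setminus\{0\})$, since $H_0^1(\mathbb R^3\setminus\{0\})=H^1(\mathbb R^3)$. Let
\[ u:\ L^2(\mathbb R^3,dk)\to L^2(\mathbb R^+,d\tilde\sigma)\otimes L^2(S^2),\qquad u\phi(\tilde\sigma,\omega) = \tilde\sigma\phi(\tilde\sigma\omega) \]
be the unitary map introduced in Sec. 3.1. We have $uC_0^\infty(\mathbb R^3\setminus\{0\}) = C_0^\infty(]0,+\infty[)\otimes C^\infty(S^2)$ and on $C_0^\infty(]0,+\infty[)\otimes C^\infty(S^2)$ we have:
\[ u(-\Delta_k)u^{-1} = -\frac{\partial^2}{\partial\tilde\sigma^2}-\frac{\Delta_\omega}{\tilde\sigma^2}, \]
where $\Delta_\omega$ is the Laplacian on $S^2$. By the above remark this means that $u(-\Delta_k)u^{-1}$ is the Friedrichs extension of $-\frac{\partial^2}{\partial\tilde\sigma^2}-\frac{\Delta_\omega}{\tilde\sigma^2}$ on $C_0^\infty(]0,+\infty[)\otimes C^\infty(S^2)$. Let now $s_0^2$ be the Friedrichs extension of $-\frac{\partial^2}{\partial\tilde\sigma^2}$ on $C_0^\infty(]0,+\infty[)\otimes C^\infty(S^2)$, i.e.
\[ s_0^2 = -\frac{\partial^2}{\partial\tilde\sigma^2},\qquad D(s_0^2) = H_0^1(]0,+\infty[)\otimes L^2(S^2)\cap H^2(]0,+\infty[)\otimes L^2(S^2) . \]
Then we set
\[ a_0 := (s_0^2)^{1/2},\qquad a := u|x|u^{-1} . \tag{10.1} \]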
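The polar-coordinate formula for $u(-\Delta_k)u^{-1}$ used in this subsection can be checked by a direct computation. The following is a sketch for smooth $\psi$ supported away from $\tilde\sigma=0$; it takes as input the standard expression of the Laplacian in spherical coordinates, which is not derived in the text, and writes $\phi(\tilde\sigma\omega)=\tilde\sigma^{-1}\psi(\tilde\sigma,\omega)$ so that $u\phi=\psi$.

```latex
\begin{align*}
\Delta_k\phi &= \partial_{\tilde\sigma}^2\phi+\frac{2}{\tilde\sigma}\,\partial_{\tilde\sigma}\phi
               +\frac{1}{\tilde\sigma^2}\,\Delta_\omega\phi ,\\
\partial_{\tilde\sigma}\phi &= \frac{\partial_{\tilde\sigma}\psi}{\tilde\sigma}-\frac{\psi}{\tilde\sigma^2},
\qquad
\partial_{\tilde\sigma}^2\phi = \frac{\partial_{\tilde\sigma}^2\psi}{\tilde\sigma}
   -\frac{2\,\partial_{\tilde\sigma}\psi}{\tilde\sigma^2}+\frac{2\psi}{\tilde\sigma^3},\\
\partial_{\tilde\sigma}^2\phi+\frac{2}{\tilde\sigma}\,\partial_{\tilde\sigma}\phi
  &= \frac{\partial_{\tilde\sigma}^2\psi}{\tilde\sigma},\\
u(-\Delta_k)u^{-1}\psi &= \tilde\sigma\,(-\Delta_k\phi)
  = -\partial_{\tilde\sigma}^2\psi-\frac{\Delta_\omega\psi}{\tilde\sigma^2},\\
\|\phi\|^2_{L^2(\mathbb R^3)} &= \int_{S^2}\!\int_0^{+\infty}|\phi(\tilde\sigma\omega)|^2\,
   \tilde\sigma^2\,d\tilde\sigma\,d\omega
  = \int_{S^2}\!\int_0^{+\infty}|\psi(\tilde\sigma,\omega)|^2\,d\tilde\sigma\,d\omega .
\end{align*}
```

The factor $\tilde\sigma$ in the definition of $u$ thus does two things at once: it absorbs the Jacobian $\tilde\sigma^2$ of polar coordinates, and it removes the first-order radial term of the Laplacian.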
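We note that for $u\in D(a_0)=H_0^1(]0,+\infty[)\otimes L^2(S^2)$, we have:
\[ \|a_0u\|^2 = \int_0^{+\infty}\Big\|\frac{\partial}{\partial\tilde\sigma}u\Big\|^2d\tilde\sigma, \]
hence for $z\in\mathbb C\setminus\mathbb R$
\[ \Big\|\frac{\partial}{\partial\tilde\sigma}(a_0\pm z)^{-1}\Big\| = \|a_0(a_0\pm z)^{-1}\|\le1 . \tag{10.2} \]
By duality we also have
\[ \Big\|(a_0\pm z)^{-1}\frac{\partial}{\partial\tilde\sigma}\Big\|\le1 . \tag{10.3} \]
Let now $f\in C^\infty(\mathbb R)$ with
\[ f(\lambda)\equiv1\ \text{for }\lambda\gg1,\qquad f\equiv0\ \text{for }\lambda\ll-1 . \tag{10.4} \]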
We set for $0<c\le1$, $0<\rho<1$:
\[
\begin{aligned}
b_{0,t} &:= f\Big(\frac{a_0-ct}{t^\rho}\Big)+f\Big(\frac{-a_0-ct}{t^\rho}\Big),\\
b_t &:= f\Big(\frac{a-ct}{t^\rho}\Big)+f\Big(\frac{-a-ct}{t^\rho}\Big) .
\end{aligned}
\]
We also set for $\delta,\rho_1>0$:
\[
\begin{aligned}
g &:= F\Big(\frac{-\Delta_\omega}{t^{\rho_1}\tilde\sigma^2}\le1\Big)F(t^\delta\tilde\sigma\ge1),\\
g_1 &:= F\Big(\frac{-\Delta_\omega}{t^{\rho_1}\tilde\sigma^2}\le2\Big)F\Big(t^\delta\tilde\sigma\ge\frac12\Big),
\end{aligned}
\tag{10.5}
\]
(10.6)
a20 (1 − g1 )(z 2 − a20 )−1 g ∈ O(tN δ |Im z|−N −2 hzi) + O(t(N +2)δ |Im z|−N −1 ), N ∈ N . (10.7) Proof. We have (z 2 − a20 )−1 g = g(z 2 − a20 )−1 + (z 2 − a20 )−1 [a20 , g](z 2 − a20 )−1 , [a20 , g] = −g 0
∂ 0 ∂ − g , for g 0 = ∂σ˜ g , ∂σ ˜ ∂σ ˜
and hence (1 − g1 )(z 2 − a20 )−1 g = (1 − g1 )(z 2 − a20 )−1 [a20 , g](z 2 − a20 )−1 .
(10.8)
We observe that k∂σ˜α gk ∈ O(tδα ). Moreover if g2 is another cutoff similar to g with g2 g = g, g2 g1 = g2 , we have [a20 , g] = g2 [a20 , g] and (1 − g1 )(z 2 − a20 )−1 g = (1 − g1 )(z 2 − a20 )−1 [a20 , g2 ](z 2 − a20 )−1 [a20 , g](z 2 − a20 )−1 . If we iterate N times the identity (10.8), we can write (1 − g1 )(z 2 − a20 )−1 g as a finite sum of terms of the form 2 2 −1 . (1 − g1 )(z 2 − a20 )−1 ΠN j=1 Rj (z − a0 )
(10.9)
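The commutator expansion behind (10.8) is a purely algebraic resolvent identity, so it can be checked on matrices. The sketch below is an illustration only: a random symmetric positive matrix stands in for $a_0^2$ and a random diagonal matrix for the multiplication operator $g$ (both are assumptions, as is the function name). It verifies $(z^2-A)^{-1}G = G(z^2-A)^{-1}+(z^2-A)^{-1}[A,G](z^2-A)^{-1}$.

```python
import numpy as np


def resolvent_commutator_defect(n=6, z=0.7 + 0.9j, seed=2):
    """Max entrywise defect of the resolvent commutator identity used for (10.8)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, n))
    a2 = x @ x.T                               # toy stand-in for a_0^2 (symmetric, >= 0)
    g = np.diag(rng.uniform(0.0, 1.0, n))      # toy stand-in for the cutoff g
    # z is non-real with nonzero real part, so z^2 is non-real and z^2 - a2 is invertible
    r = np.linalg.inv(z ** 2 * np.eye(n) - a2)
    lhs = r @ g
    rhs = g @ r + r @ (a2 @ g - g @ a2) @ r
    return np.abs(lhs - rhs).max()
```

Multiplying the identity on the left by $1-g_1$ and using $(1-g_1)g=0$ removes the first term on the right, which is exactly how (10.8) arises.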
Using the form of $[a_0^2,g]$ given above and the estimates (10.2), (10.3), we see that we have $\|(a_0\pm z)^{-1}R_j\|\le Ct^\delta$, or $\|R_j(a_0\pm z)^{-1}\|\le Ct^\delta$. We can rewrite (10.9) as
\[ (1-g_1)(z-a_0)^{-1}\Big(\prod_{j=1}^N(z+a_0)^{-1}R_j(z-a_0)^{-1}\Big)(z+a_0)^{-1} . \tag{10.10} \]
We have
\[ \|(z+a_0)^{-1}R_j(z-a_0)^{-1}\|\le C|{\rm Im}\,z|^{-1}t^\delta, \]
which proves (10.6). To prove (10.7), we write
\[
\begin{aligned}
a_0^2(1-g_1)(z^2-a_0^2)^{-1}g &= (1-g_1)a_0^2(z^2-a_0^2)^{-1}g-[a_0^2,g_1](z^2-a_0^2)^{-1}g\\
&= z(1-g_1)(z^2-a_0^2)^{-1}g-[a_0^2,g_1](z^2-a_0^2)^{-1}g .
\end{aligned}
\]
Now $[a_0^2,g_1]=-2g_1'\partial_{\tilde\sigma}-g_1''$. Using the expression analogous to (10.10) with $1-g_1$ replaced by $[a_0^2,g_1]$, and (10.2), we obtain
\[ [a_0^2,g_1](z^2-a_0^2)^{-1}g\in O(t^{(N+2)\delta}|{\rm Im}\,z|^{-N-1}),\quad\forall\,N\in\mathbb N . \]
Using also (10.6) we obtain (10.7).

Proof of Lemma 10.1. Let us first prove (i). The function
\[ f_t:\ \lambda\mapsto f\Big(\frac{\lambda-ct}{t^\rho}\Big)+f\Big(\frac{-\lambda-ct}{t^\rho}\Big) \]
is equal to 1 for $|\lambda|\ge c_0t$ and satisfies $|\partial_\lambda^\alpha f_t|\le C_\alpha t^{-\rho\alpha}$, $\alpha\in\mathbb N$. This implies that the function
\[ \chi_t(\lambda) = (f_t(\lambda)+\mu+R)^{-1}-(1+\mu+R)^{-1} \]
satisfies
\[ {\rm supp}\,\chi_t\subset\{|\lambda|\le c_0t\},\qquad |\partial_\lambda^\alpha\chi_t(\lambda)|\le C_\alpha t^{-\rho\alpha},\ \alpha\in\mathbb N,\ \text{uniformly in } R\ge0 . \tag{10.11} \]
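The qualitative shape of $f_t$ (even in $\lambda$, vanishing near $\lambda=0$, equal to 1 for large $|\lambda|$) is easy to see on a concrete example. The sketch below is only an illustration: `smooth_step` is one admissible choice of $f$ satisfying (10.4), and the values `c=0.5`, `rho=0.75` are arbitrary assumptions, not taken from the text.

```python
import math


def smooth_step(x):
    """A C-infinity step: equals 0 for x <= -1 and 1 for x >= 1."""
    def bump(u):
        return math.exp(-1.0 / u) if u > 0 else 0.0
    s = (x + 1.0) / 2.0
    # at least one of s and 1 - s is positive, so the denominator never vanishes
    return bump(s) / (bump(s) + bump(1.0 - s))


def f_t(lam, t, c=0.5, rho=0.75):
    """f_t(lambda) = f((lambda - ct)/t^rho) + f((-lambda - ct)/t^rho)."""
    scale = t ** rho
    return smooth_step((lam - c * t) / scale) + smooth_step((-lam - c * t) / scale)
```

With `t = 100.0` the transition zone around $|\lambda| = ct = 50$ has width of order $t^\rho\approx 31.6$: `f_t(0.0, 100.0)` vanishes, `f_t(100.0, 100.0)` equals 1, and `f_t(60.0, 100.0)` lies strictly between, with the same value at `-60.0`. This evenness is what allows the resolvent formula (10.13) for even functions to be applied to $\chi_t$.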
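Using the construction in [9, Proposition C.2.1], we can find an almost-analytic extension $\tilde\chi_t$ of $\chi_t$ such that
\[
\begin{aligned}
&{\rm supp}\,\tilde\chi_t\subset\{z\in\mathbb C\mid|{\rm Re}\,z|\le c_0t,\ |{\rm Im}\,z|\le c_0t^\rho\},\\
&|\partial_{\bar z}\tilde\chi_t(z)|\le C_N|{\rm Im}\,z|^Nt^{-\rho(N+1)},\ N\in\mathbb N,\ \text{uniformly in } R\ge0 .
\end{aligned}
\tag{10.12}
\]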
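The contour-integral step that follows rests on the elementary identity $(z-A)^{-1}+(z+A)^{-1}=2z(z^2-A^2)^{-1}$ for a selfadjoint $A$ and non-real $z$. A minimal numerical sketch, with a random real symmetric matrix as a stand-in for $A$ (an assumption for illustration, as is the function name):

```python
import numpy as np


def even_resolvent_defect(n=5, z=0.4 + 1.1j, seed=3):
    """Max entrywise defect of (z-A)^{-1} + (z+A)^{-1} = 2z (z^2 - A^2)^{-1}."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, n))
    a = (x + x.T) / 2                      # real symmetric, hence selfadjoint
    one = np.eye(n)
    lhs = np.linalg.inv(z * one - a) + np.linalg.inv(z * one + a)
    rhs = 2 * z * np.linalg.inv(z ** 2 * one - a @ a)
    return np.abs(lhs - rhs).max()
```

The identity follows from $z^2-A^2=(z-A)(z+A)$; it is what reduces functions of $A$ to resolvents of $A^2$ when the function is even.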
We observe that if $\chi\in C_0^\infty(\mathbb R)$ is an even function and $A$ a selfadjoint operator, we have
\[
\begin{aligned}
\chi(A) &= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi(z)(z-A)^{-1}\,dz\wedge d\bar z\\
&= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi(z)\,\frac12\big((z-A)^{-1}+(z+A)^{-1}\big)\,dz\wedge d\bar z\\
&= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi(z)\,z(z^2-A^2)^{-1}\,dz\wedge d\bar z,
\end{aligned}
\tag{10.13}
\]
using the identity $(z-A)^{-1}+(z+A)^{-1}=2z(z^2-A^2)^{-1}$. Applying this identity to the even function $\chi_t$, we have:
\[
\begin{aligned}
(1-g_1)(b_{0,t}+\mu+R)^{-1}g &= (1-g_1)\chi_t(a_0)g\\
&= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi_t(z)\,z(1-g_1)(z^2-a_0^2)^{-1}g\,dz\wedge d\bar z .
\end{aligned}
\]
(i) follows then from (10.6) and (10.12), since $\rho>\delta$.

Let us now prove (ii). We denote again by $\chi_t(\lambda)$ the function $f_t(\lambda)-1$ which is even and satisfies (10.11). We have by (10.13)
\[
\begin{aligned}
(b_t-b_{0,t})g &= (f_t(a)-f_t(a_0))g\\
&= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi_t(z)\,z(z^2-a^2)^{-1}(a^2-a_0^2)(z^2-a_0^2)^{-1}g\,dz\wedge d\bar z .
\end{aligned}
\]
Next we write
\[
\begin{aligned}
(z^2-a^2)^{-1}(a^2-a_0^2)(z^2-a_0^2)^{-1}g &= (z^2-a^2)^{-1}(a^2-a_0^2)(1-g_1)(z^2-a_0^2)^{-1}g\\
&\quad+(z^2-a^2)^{-1}(a^2-a_0^2)g_1(z^2-a_0^2)^{-1}g\\
&=: I_1(z)+I_2(z)
\end{aligned}
\]
and estimate separately the two terms. We have
\[
\begin{aligned}
\|I_1(z)\| &= \|(z^2-a^2)^{-1}(a^2-a_0^2)(1-g_1)(z^2-a_0^2)^{-1}g\|\\
&\le\|(z^2-a^2)^{-1}a^2\|\,\|(1-g_1)(z^2-a_0^2)^{-1}g\|+\|(z^2-a^2)^{-1}\|\,\|a_0^2(1-g_1)(z^2-a_0^2)^{-1}g\|\\
&\le C_N|{\rm Im}\,z|^{-N-2}t^{N\delta}+C_N|{\rm Im}\,z|^{-N-4}\langle z\rangle t^{N\delta}+C_N|{\rm Im}\,z|^{-N-3}t^{(N+2)\delta},
\end{aligned}
\tag{10.14}
\]
using (10.6), (10.7) and the fact that $\|a^2(z^2-a^2)^{-1}\|\le1$, $\|(z^2-a^2)^{-1}\|\le|{\rm Im}\,z|^{-2}$. This yields
\[ \Big\|\int\partial_{\bar z}\tilde\chi_t(z)\,zI_1(z)\,dz\wedge d\bar z\Big\|\in O(t^{-\infty}), \tag{10.15} \]
using (10.12) and the fact that $\rho>\delta$.

Let us now estimate $I_2(z)$. We have $\|(a^2-a_0^2)g_1\|\le Ct^{\rho_1}$. A sharp estimate for $(z^2-a^2)^{-1}$ where $a$ is any selfadjoint operator is
\[ \|(z^2-a^2)^{-1}\|\le C\inf\big(|{\rm Im}\,z|^{-1}|{\rm Re}\,z|^{-1},\ |{\rm Im}\,z|^{-2}\big) . \]
Let us now estimate
\[ \Big\|\int\partial_{\bar z}\tilde\chi_t(z)\,zI_2(z)\,dz\wedge d\bar z\Big\| . \]
Recall that $\partial_{\bar z}\tilde\chi_t$ is supported in $\{z\in\mathbb C\mid|{\rm Re}\,z|\le c_0t,\ |{\rm Im}\,z|\le c_0t^\rho\}$. We cut the integral in three parts:
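\[
\begin{aligned}
R_1 &= \Big\|\int_{|{\rm Re}\,z|\le1}\partial_{\bar z}\tilde\chi_t(z)\,zI_2(z)\,dz\wedge d\bar z\Big\|\\
&\le\int_{|{\rm Re}\,z|\le1}|\partial_{\bar z}\tilde\chi_t(z)|\,\langle z\rangle|{\rm Im}\,z|^{-4}t^{\rho_1}\,dz\wedge d\bar z\\
&\le\int_{|{\rm Re}\,z|\le1,\ |{\rm Im}\,z|\le c_0t^\rho}\langle z\rangle t^{\rho_1-5\rho}\,dz\wedge d\bar z\\
&\le Ct^{\rho_1-3\rho} ;
\end{aligned}
\]
\[
\begin{aligned}
R_2 &= \Big\|\int_{|{\rm Re}\,z|\ge c_1|{\rm Im}\,z|,\ |{\rm Re}\,z|\ge1}\partial_{\bar z}\tilde\chi_t(z)\,zI_2(z)\,dz\wedge d\bar z\Big\|\\
&\le\int_{|{\rm Re}\,z|\ge c_1|{\rm Im}\,z|,\ |{\rm Re}\,z|\ge1}|\partial_{\bar z}\tilde\chi_t(z)|\,\langle z\rangle|{\rm Re}\,z|^{-2}|{\rm Im}\,z|^{-2}t^{\rho_1}\,dz\wedge d\bar z\\
&\le\int_{|{\rm Re}\,z|\ge c_1|{\rm Im}\,z|,\ |{\rm Re}\,z|\ge1}\langle z\rangle|{\rm Re}\,z|^{-2}t^{\rho_1-3\rho}\,dz\wedge d\bar z\\
&\le Ct^{\rho_1-2\rho}\log t ;
\end{aligned}
\]
\[
\begin{aligned}
R_3 &= \Big\|\int_{|{\rm Re}\,z|\le c_1|{\rm Im}\,z|,\ |{\rm Re}\,z|\ge1}\partial_{\bar z}\tilde\chi_t(z)\,zI_2(z)\,dz\wedge d\bar z\Big\|\\
&\le\int_{|{\rm Re}\,z|\le c_1|{\rm Im}\,z|}|\partial_{\bar z}\tilde\chi_t(z)|\,\langle z\rangle|{\rm Im}\,z|^{-4}t^{\rho_1}\,dz\wedge d\bar z\\
&\le\int_{|{\rm Re}\,z|\le c_1|{\rm Im}\,z|,\ |{\rm Im}\,z|\le c_0t^\rho}t^\rho\,t^{\rho_1-5\rho}\,dz\wedge d\bar z\\
&\le Ct^{\rho_1-2\rho} .
\end{aligned}
\]
This yields
\[ \Big\|\int\partial_{\bar z}\tilde\chi_t(z)\,zI_2(z)\,dz\wedge d\bar z\Big\|\le Ct^{\rho_1-2\rho}\log t . \]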
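Using (10.15) this proves (ii).

10.2. Case of $\mathfrak h^e$

In this subsection, we prove similar results on the Hilbert space $\mathfrak h^e=L^2(\mathbb R,d\sigma)\otimes\mathfrak g$. We recall that on $\mathfrak h^e$ we defined the observable $s=i\frac{\partial}{\partial\sigma}$, so that $|s|=\big(-\frac{\partial^2}{\partial\sigma^2}\big)^{1/2}$. We define the observable $|s|_0$ by
\[ s_0^2 = -\frac{\partial^2}{\partial\sigma^2}\ \text{ with Dirichlet condition at } 0, \]
i.e.
\[ D(s_0^2) = H^2(\mathbb R\setminus\{0\})\cap H_0^1(\mathbb R\setminus\{0\}),\qquad |s|_0 := (s_0^2)^{1/2} . \]
Let again $f\in C^\infty(\mathbb R)$ with $f(\lambda)\equiv1$ for $\lambda\gg1$, $f\equiv0$ for $\lambda\ll-1$. We set for $0<c\le1$, $0<\rho<1$:
\[
\begin{aligned}
b_{0,t} &:= f\Big(\frac{|s|_0-ct}{t^\rho}\Big)+f\Big(\frac{-|s|_0-ct}{t^\rho}\Big),\\
b_t &:= f\Big(\frac{s-ct}{t^\rho}\Big)+f\Big(\frac{-s-ct}{t^\rho}\Big),
\end{aligned}
\]
and for $\delta>0$:
\[ g = F(t^\delta|\sigma|\ge1),\qquad g_1 = F\Big(t^\delta|\sigma|\ge\frac12\Big), \]
so that $g_1g=g$. The following lemma is analogous to Lemma 10.1.

Lemma 10.3. Assume $\rho>\delta$. Then
(i) $(1-g_1)(b_t+\mu+R)^{-1}g\in O(t^{-\infty})$, for $\mu\in\mathbb C\setminus\mathbb R^-$, uniformly for $R\ge0$,
(ii) $(b_t-b_{0,t})g\in O(t^{-\infty})$.

Proof. Let us first prove (ii). We apply the identity (10.13) to the even ($t$-dependent) function
\[ \chi_t(\lambda) = f\Big(\frac{\lambda-ct}{t^\rho}\Big)+f\Big(\frac{-\lambda-ct}{t^\rho}\Big)-1, \]
and obtain
\[
\begin{aligned}
b_t-b_{0,t} &= \chi_t(|s|)-\chi_t(|s|_0)\\
&= \frac{i}{2\pi}\int\partial_{\bar z}\tilde\chi_t(z)\,z\big((z^2-s^2)^{-1}-(z^2-s_0^2)^{-1}\big)\,dz\wedge d\bar z,
\end{aligned}
\]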
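where $\tilde\chi_t$ is an almost-analytic extension of $\chi_t$ satisfying (10.12). We recall the identity (see [1, Theorem 3.1.2]):
\[ (z^2-s^2)^{-1}-(z^2-s_0^2)^{-1} = \frac{i}{2z}|\phi_z\rangle\langle\phi_{\bar z}|\quad\text{for } {\rm Im}\,z>0, \]
where $\phi_z(\sigma)=e^{iz|\sigma|}$. We have:
\[ \|\phi_z\|\le C|{\rm Im}\,z|^{-1/2},\qquad \|g\phi_z\|\le C|{\rm Im}\,z|^{-1/2}e^{-t^{-\delta}|{\rm Im}\,z|} . \]
This gives
\[ \big\|\big((z^2-s^2)^{-1}-(z^2-s_0^2)^{-1}\big)g\big\|\le C|{\rm Im}\,z|^{-1}e^{-t^{-\delta}|{\rm Im}\,z|},\quad{\rm Im}\,z\ne0 . \tag{10.16} \]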
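The two norm bounds on $\phi_z$ used for (10.16) can be made explicit. The computation below is a sketch for ${\rm Im}\,z>0$ in the scalar case (the factor $\mathfrak g$ is ignored), under the assumption that $g=F(t^\delta|\sigma|\ge1)$ takes values in $[0,1]$ and is supported in $\{|\sigma|\ge t^{-\delta}\}$.

```latex
\begin{align*}
\|\phi_z\|^2 &= \int_{\mathbb R} e^{-2\,{\rm Im}\,z\,|\sigma|}\,d\sigma
   = 2\int_0^{+\infty} e^{-2\,{\rm Im}\,z\,\sigma}\,d\sigma = \frac{1}{{\rm Im}\,z},\\
\|g\phi_z\|^2 &\le 2\int_{t^{-\delta}}^{+\infty} e^{-2\,{\rm Im}\,z\,\sigma}\,d\sigma
   = \frac{e^{-2t^{-\delta}\,{\rm Im}\,z}}{{\rm Im}\,z},\\
\|\phi_z\| &= ({\rm Im}\,z)^{-1/2},\qquad
\|g\phi_z\| \le ({\rm Im}\,z)^{-1/2}\,e^{-t^{-\delta}\,{\rm Im}\,z} .
\end{align*}
```

The exponential factor comes entirely from the cutoff $g$ removing the region $|\sigma|<t^{-\delta}$ where $\phi_z$ is largest; it is this factor that, combined with (10.12), produces the $O(t^{-\infty})$ decay in part (ii) of Lemma 10.3.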
We deduce from (10.16) and (10.12) that
\[
\begin{aligned}
\|(b_t-b_{0,t})g\| &\le C_N\int_{{\rm supp}\,\tilde\chi_t}|z|\,|{\rm Im}\,z|^{N-1}t^{-\rho(N+1)}e^{-t^{-\delta}|{\rm Im}\,z|}\,dz\wedge d\bar z\\
&\le C_N\int_{{\rm supp}\,\tilde\chi_t}|z|\,t^{(\delta-\rho)N}\,dz\wedge d\bar z\in O(t^{-\infty}),
\end{aligned}
\]
since $\rho>\delta$. This proves (ii). To prove (i), we write
\[ (1-g_1)(b_t+\mu+R)^{-1}g = -(1-g_1)(b_t+\mu+R)^{-1}[b_t,g](b_t+\mu+R)^{-1} . \]
By p.d.o. calculus, $[b_t,g]=g_2[b_t,g]$, where $g_2=F(t^\delta|\sigma|\ge\frac32)$, and $[b_t,g]\in O(t^{\delta-\rho})$. Iterating this argument we obtain (i).

11. Reinterpretation of the Spaces $\mathcal H^{e+}_c$

In this section we describe the spaces $\mathcal H^{e+}_c$ using the observable $|s|_0$ introduced in Sec. 10.2. It will allow us in Sec. 12 to construct corresponding spaces $\mathcal H^{+}_c$ for the original Hamiltonian $H$.

11.1. Preliminary results

In this subsection we show that the spaces $\mathcal H^{e+}_c$ can also be defined with a cutoff function in $s$ which is even. This easy result uses the fact shown in Sec. 5.3 that there is no propagation in the region $\{s\le-ct\}$. Let $f\in C^\infty(\mathbb R)$ satisfying (5.1) for $0<\alpha_0<\alpha_1$. We set for $0<\rho<1$:
\[ b_{ct} := f\Big(\frac{s-ct}{t^\rho}\Big)+f\Big(\frac{-s-ct}{t^\rho}\Big),\qquad B_{ct} := d\Gamma(b_{ct}) . \tag{11.1} \]
(The reader should compare (11.1) with (5.2).)

Theorem 11.1. Assume (I'0), (I'1) for $\epsilon_0>0$, (I'2) for $\mu>1$ and pick $\rho$ in (11.1) such that $\rho(1+\epsilon_0)>1$. Then:
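(i) for each $\lambda\in\mathbb C\setminus\mathbb R^-$ the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(B_{ct}+\lambda)^{-1}e^{-itH^e} =: \tilde R^{+}_c(\lambda)\quad\text{exists}. \]
(ii) $[\tilde R^{+}_c(\lambda),L] = [\tilde R^{+}_c(\lambda),H^e] = 0$.
(iii) $\operatorname*{s-lim}_{\epsilon\to0}\epsilon^{-1}\tilde R^{+}_c(\epsilon^{-1}) =: \hat P^{e+}_c$, where the orthogonal projection $\hat P^{e+}_c$ is defined in Theorem 5.5.

It follows from Theorems 11.1 and 5.6 that $u\in\mathcal H^{e+}_c$ if and only if there exists $c'>c$ such that
\[ \lim_{\epsilon\to0}\lim_{t\to+\infty} e^{itH^e}(\epsilon B_{c't}+\mathbf 1)^{-1}e^{-itH^e}u = u . \]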
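Proof. We set $f_1(s)=f(-s)$ and note that $f_1$ satisfies (5.14). Let
\[
\begin{aligned}
b_{+,t} &= f\Big(\frac{s-ct}{t^\rho}\Big),\qquad B_{+,t} := d\Gamma(b_{+,t}),\\
b_{-,t} &= f_1\Big(\frac{s+ct}{t^\rho}\Big),\qquad B_{-,t} := d\Gamma(b_{-,t}),
\end{aligned}
\]
so that $B_{ct}=B_{+,t}+B_{-,t}$. By Theorem 5.5 and Proposition 5.7 we know that for all $\lambda,\lambda'\in\mathbb C\setminus\mathbb R^+$ the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(\lambda-B_{+,t})^{-1}(\lambda'-B_{-,t})^{-1}e^{-itH^e}\quad\text{exists}. \]
Note that the functions $\mathbb R^2\ni(s,s')\mapsto(\lambda-s)^{-1}(\lambda'-s')^{-1}$ for $\lambda,\lambda'\in\mathbb C\setminus\mathbb R$ are total in $C_\infty(\mathbb R^2)$. Hence for all $\chi\in C_\infty(\mathbb R^2)$, the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\chi(B_{+,t},B_{-,t})e^{-itH^e}\quad\text{exists}. \]
We claim that
\[ \operatorname*{s-lim}_{\epsilon\to0}\ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\big(\epsilon(B_{+,t}+B_{-,t})+\mathbf 1\big)^{-1}e^{-itH^e} = \hat P^{e+}_c, \tag{11.2} \]
where $\hat P^{e+}_c$ is defined in Theorem 5.5. By density using Proposition 5.7(iii), it suffices to show that
\[ \operatorname*{s-lim}_{\epsilon\to0}\ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Big(\big(\epsilon(B_{+,t}+B_{-,t})+\mathbf 1\big)^{-1}-(\epsilon B_{+,t}+\mathbf 1)^{-1}\Big)e^{-itH^e}R_1^{+}(\epsilon_0) = 0, \tag{11.3} \]
for any $\epsilon_0>0$, where $R_1^{+}(\epsilon_0)$ is defined in Proposition 5.7. Now
\[
\begin{aligned}
&\Big(\big(\epsilon(B_{+,t}+B_{-,t})+\mathbf 1\big)^{-1}-(\epsilon B_{+,t}+\mathbf 1)^{-1}\Big)(\epsilon_0B_{-,t}+\mathbf 1)^{-1}\\
&\quad= -\epsilon\big(\epsilon(B_{+,t}+B_{-,t})+\mathbf 1\big)^{-1}(\epsilon B_{+,t}+\mathbf 1)^{-1}B_{-,t}(\epsilon_0B_{-,t}+\mathbf 1)^{-1}\\
&\quad= O(\epsilon\,\epsilon_0^{-1})\quad\text{uniformly in } t .
\end{aligned}
\]
This proves (11.3) and hence (11.2). Statements (i) and (ii) follow from Theorem 5.5 and Proposition 5.7. Statement (iii) follows from (11.2).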
11.2. Reinterpretation of the space $\mathcal{H}_c^{e+}$

We now want to replace the observable $b_{ct}$ by an observable $b_{c,0,t}$ which commutes with the projections $\mathbb{1}_{\{\pm\sigma \ge 0\}}$. Let $|s|_0$ be the observable defined in Sec. 10.2. We set
\[ b_{c,0,t} = f\Bigl(\frac{|s|_0 - ct}{t^\rho}\Bigr) + f\Bigl(\frac{-|s|_0 - ct}{t^\rho}\Bigr), \qquad B_{c,0,t} := d\Gamma(b_{c,0,t}). \]

Proposition 11.2. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ in (11.1) such that $\rho\epsilon_0 > 1$. Then for each $\lambda \in \mathbb{C}\setminus\mathbb{R}^-$ the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(B_{c,0,t} + \lambda)^{-1}e^{-itH^e} \]
exists and equals
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(B_{ct} + \lambda)^{-1}e^{-itH^e} = \hat R_c^{e+}(\lambda). \]

The following consequence of Theorems 11.1 and 5.6 gives the final description of the space $\mathcal{H}_c^{e+}$:

Theorem 11.3. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ in (11.1) such that $\rho\epsilon_0 > 1$. Then $u \in \mathcal{H}_c^{e+}$ if and only if there exists $c' > c$ such that
\[ \lim_{\epsilon\to 0}\lim_{t\to+\infty} e^{itH^e}(\epsilon B_{c',0,t} + \mathbb{1})^{-1}e^{-itH^e}u = u. \]
Proof of Proposition 11.2. We drop the subscript $c$ to simplify the notation. We will use the notation in Sec. 10.2. Recall that we have set:
\[ g = F(t^\delta|\sigma| \ge 1), \qquad g_1 = F\bigl(t^\delta|\sigma| \ge \tfrac{1}{2}\bigr), \]
for $\delta > 0$. We fix $\delta < \rho$ with $\delta\epsilon_0 > 1$ so that the results of Sec. 4.4 apply. By a density argument, using Theorem 4.12 and Lemma 4.11, it suffices to prove that for $\chi \in C_0^\infty(\mathbb{R})$:
\[ \bigl((\lambda + d\Gamma(b_t))^{-1} - (\lambda + d\Gamma(b_{0,t}))^{-1}\bigr)\Gamma(g_1)\chi(L)\Gamma(g)\chi(L) \in o(1). \tag{11.4} \]
We first claim that
\[ (\mathbb{1} - \Gamma(g_1))(\lambda + d\Gamma(b_t))^{-1}\Gamma(g) \in O(N)t^{-\infty}. \]
In fact on the $n$-particle sector:
\[ \mathbb{1} - \Gamma(g_1) = \sum_{j=1}^{n} \mathbb{1}\otimes\cdots\otimes\mathbb{1}\otimes(1 - g_{1,j})\otimes g_{1,j+1}\otimes\cdots\otimes g_{1,n}, \]
so
\[ \|(\mathbb{1} - \Gamma(g_1))(\lambda + d\Gamma(b_t))^{-1}\Gamma(g)\| \le n\sup_{R\ge 0}\|(1 - g_1)(b_t + \lambda + R)^{-1}g\| \in O(N^e)t^{-\infty} \tag{11.5} \]
by Lemma 10.3(i). Now we write
\begin{align*}
\bigl((\lambda + d\Gamma(b_{0,t}))^{-1} - (\lambda + d\Gamma(b_t))^{-1}\bigr)\Gamma(g)
&= (\lambda + d\Gamma(b_{0,t}))^{-1}d\Gamma(b_t - b_{0,t})(\lambda + d\Gamma(b_t))^{-1}\Gamma(g) \\
&= (\lambda + d\Gamma(b_{0,t}))^{-1}d\Gamma(b_t - b_{0,t})\Gamma(g_1)(\lambda + d\Gamma(b_t))^{-1}\Gamma(g) \\
&\quad + (\lambda + d\Gamma(b_{0,t}))^{-1}d\Gamma(b_t - b_{0,t})(\mathbb{1} - \Gamma(g_1))(\lambda + d\Gamma(b_t))^{-1}\Gamma(g) \\
&=: I_1 + I_2.
\end{align*}
By Lemma 10.3(ii), we have
\[ d\Gamma(b_t - b_{0,t})\Gamma(g_1) \in O(N^e)t^{-\infty}, \tag{11.6} \]
and hence
\[ \|I_1\chi(L)\Gamma(g_1)\chi(L)\| \le Ct^{-\infty}\|(N^e + 1)\Gamma(g)\chi(L)\| \in O(t^{-\infty}) \]
by Lemma 4.13(i). Similarly by (11.5), we have
\[ (\lambda + d\Gamma(b_{0,t}))^{-1}d\Gamma(b_t - b_{0,t})(\mathbb{1} - \Gamma(g_1))(\lambda + d\Gamma(b_t))^{-1}\Gamma(g) \in O((N^e)^2)t^{-\infty}, \tag{11.7} \]
and hence if $g_2$ is such that $gg_2 = g$, $g_1g_2 = g_2$ we have
\[ \|I_2\chi(L)\Gamma(g_1)\chi(L)\| = \|I_2\Gamma(g_2)\chi(L)\Gamma(g_1)\chi(L)\| \le Ct^{-\infty}\|(N^e)^2\Gamma(g_2)\chi(L)\Gamma(g_1)\chi(L)\| \le Ct^{-\infty}, \]
by Lemma 4.13(ii). This proves (11.4) and completes the proof of the proposition.

11.3. Reinterpretation of $\Gamma^{+}(f_0)$

Let $f_0$ be a cutoff function as in (6.1) with $0 < \alpha_0 < \alpha_1 < \alpha_2$. We recall that the observable $\Gamma_c^{e+}(f_0)$ was defined in (6.17).

Proposition 11.4. Assume (I'0), (I'1) for $\epsilon_0 > 0$, (I'2) for $\mu > 1$ and pick $\rho$ in (11.1) such that $\rho\epsilon_0 > 1$. Then for $0 < c < c' < 1$:
\[ \Gamma_{c'}^{e+}(f_0) = \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\,\Gamma\Bigl(f_0\Bigl(\frac{|s|_0 - c't}{t^\rho}\Bigr)\Bigr)e^{-itH^e} \quad\text{on } \mathcal{H}_c^{e+}. \]

Proof. Let us replace $c'$ by $c$ to simplify notation. By Proposition 5.11 we have
\[ \Gamma_c^{e+}(f_0) = \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\Gamma(c_t)e^{-itH^e}, \]
for
\[ c_t = f_0\Bigl(\frac{s - ct}{t^\rho}\Bigr)f_0\Bigl(\frac{-s - ct}{t^\rho}\Bigr). \]
Let
\[ c_{0,t} = f_0\Bigl(\frac{|s|_0 - ct}{t^\rho}\Bigr)f_0\Bigl(\frac{-|s|_0 - ct}{t^\rho}\Bigr), \]
and note that
\[ c_{0,t} = f_0\Bigl(\frac{|s|_0 - ct}{t^\rho}\Bigr) \quad\text{for } t \gg 1, \]
since $|s|_0 \ge 0$. As in the proof of Proposition 11.2, we set $g = F(t^\delta|\sigma| \ge 1)$ for $\delta < \rho$ with $\delta\epsilon_0 > 1$ so that the results of Sec. 4.4 apply. The function
\[ \chi_t(\lambda) = f_0\Bigl(\frac{\lambda - ct}{t^\rho}\Bigr)f_0\Bigl(\frac{-\lambda - ct}{t^\rho}\Bigr) \]
is an even function of $\lambda$, satisfying (10.11). As in the proof of Lemma 10.3(ii), we have
\[ (c_t - c_{0,t})g \in O(t^{-\infty}). \tag{11.8} \]
As in the proof of Proposition 11.2, it suffices to show that for $\chi \in C_0^\infty(\mathbb{R})$:
\[ (\Gamma(c_t) - \Gamma(c_{0,t}))\Gamma(g)\chi(L) \in o(1). \tag{11.9} \]
We claim that if $a, b, g \in B(\mathfrak{h}^e)$ with $0 \le a, b, g \le 1$ then
\[ \|(\Gamma(b) - \Gamma(a))\Gamma(g)(N^e + 1)^{-1}\| \le \|(b - a)g\|. \tag{11.10} \]
To prove (11.10), we write on the $n$-particle sector
\[ \Gamma(b) - \Gamma(a) = \sum_{i=1}^{n} b_1\otimes\cdots\otimes b_{i-1}\otimes(b_i - a_i)\otimes a_{i+1}\otimes\cdots\otimes a_n. \]
Using then (11.8) and (11.10) we get
\[ (\Gamma(c_t) - \Gamma(c_{0,t}))\Gamma(g) \in O(N^e)t^{-\infty}. \]
Next
\begin{align*}
\|(\Gamma(c_t) - \Gamma(c_{0,t}))\Gamma(g)\chi(L)\|
&= \|(\Gamma(c_t) - \Gamma(c_{0,t}))\Gamma(g)\Gamma(g_1)\chi(L)\| \\
&\le \|(\Gamma(c_t) - \Gamma(c_{0,t}))\Gamma(g)(N^e + 1)^{-1}\|\,\|(N^e + 1)\Gamma(g_1)\chi(L)\| \in O(t^{-\infty}),
\end{align*}
by Lemma 4.13(i). This proves (11.9) and completes the proof of the proposition.
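The telescoping decomposition behind (11.10) can be checked in the simplest commuting case, where each one-particle operator is replaced by a number in [0, 1] and the tensor product becomes an ordinary product. The sketch below is only an illustration under that scalar assumption; the helper names `product` and `telescoping_terms` and the sample values are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from functools import reduce


def product(values):
    """Ordinary product, standing in for b_1 (x) ... (x) b_n in the scalar case."""
    return reduce(lambda x, y: x * y, values, Fraction(1))


def telescoping_terms(b, a):
    """Terms b_1...b_{i-1} (b_i - a_i) a_{i+1}...a_n for i = 1..n."""
    n = len(b)
    return [product(b[:i]) * (b[i] - a[i]) * product(a[i + 1:]) for i in range(n)]


# Scalar stand-ins with 0 <= a_i, b_i <= 1 (illustrative values only).
a = [Fraction(1, 2), Fraction(1, 3), Fraction(3, 4)]
b = [Fraction(2, 3), Fraction(1, 4), Fraction(1, 1)]

terms = telescoping_terms(b, a)
lhs = product(b) - product(a)
gap = max(abs(x - y) for x, y in zip(b, a))

# The n terms sum to Gamma(b) - Gamma(a) restricted to this scalar model,
# and each term is bounded by |b_i - a_i| because all other factors lie in [0, 1].
assert sum(terms) == lhs
assert abs(lhs) <= len(a) * gap
```

Since each of the n terms is bounded by the one-particle difference, the total is bounded by n times that difference, which is the origin of the factor $(N^e + 1)^{-1}$ in (11.10).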
12. Scattering Theory for H

This section contains the main results of this paper. We first construct for $0 < c < 1$ spaces $\mathcal{H}_c^{+}$ containing a finite number of particles in the region $\{|x| \ge c't\}$ for any $c < c'$. We then show that the asymptotic Weyl operators $W^{+}(h)$ induce on $\mathcal{H}_c^{+}$ a regular CCR representation of Fock type. Finally we prove the geometric asymptotic completeness property, which states that the vacuum states of this induced representation contain no particles in the region $\{|x| \ge c't\}$, for any $c < c'$. We start with an easy technical lemma.

Lemma 12.1. Hypotheses (I2) for $\mu > 0$, (I5) for $\mu_2 > 0$ imply hypothesis (I4) for $\mu_1 = \inf(\mu, 2\mu_2)$.

From Lemma 12.1 we see that if (I2), (I5) are satisfied for $\mu > 1$, $\mu_2 > 1$ then (I4) is satisfied for $\mu_1 > 1$.

Proof. Let $\epsilon > 0$ and set $v_{j,\epsilon} = \chi(\epsilon \le |k| \le \epsilon^{-1})v_j$. We drop the index $j$ to simplify notation. Going to polar coordinates as in Sec. 10, we have by (I2):
\[ (-\partial_{\tilde\sigma}^2 + 1)^{\mu/2}\,\tilde\sigma v_\epsilon(\tilde\sigma\omega) \in L^2(\mathbb{R}^+, d\tilde\sigma)\otimes L^2(S^2), \tag{12.1} \]
and by (I5)
\[ \Bigl(-\frac{\Delta_\omega}{\tilde\sigma^2} + 1\Bigr)^{\mu_2}\tilde\sigma v_\epsilon(\tilde\sigma\omega) \in L^2(\mathbb{R}^+, d\tilde\sigma)\otimes L^2(S^2). \tag{12.2} \]
Since $v_\epsilon$ has support in $\tilde\sigma$ included in $]0, +\infty[$, (12.2) is equivalent to
\[ (-\Delta_\omega + 1)^{\mu_2}\,\tilde\sigma v_\epsilon(\tilde\sigma\omega) \in L^2(\mathbb{R}^+, d\tilde\sigma)\otimes L^2(S^2). \tag{12.3} \]
Clearly (12.1) and (12.3) imply that
\[ (-\partial_{\tilde\sigma}^2 - \Delta_\omega + 1)^{\mu_1/2}\,\tilde\sigma v_\epsilon(\tilde\sigma\omega) \in L^2(\mathbb{R}^+, d\tilde\sigma)\otimes L^2(S^2), \tag{12.4} \]
for $\mu_1 = \inf(\mu, 2\mu_2)$. Again because of the support of $v_\epsilon$, (12.4) implies that
\[ \Bigl(-\partial_{\tilde\sigma}^2 - \frac{\Delta_\omega}{\tilde\sigma^2} + 1\Bigr)^{\mu_1/2}\tilde\sigma v_\epsilon(\tilde\sigma\omega) \in L^2(\mathbb{R}^+, d\tilde\sigma)\otimes L^2(S^2). \tag{12.5} \]
This can be shown by a direct computation for $\mu_1 \in \mathbb{N}$ and then extended to $\mu_1 \in \mathbb{R}^+$ by interpolation. Going back to the original coordinates we see that (12.5) implies (I4) for $\mu_1$.

12.1. Number of asymptotically free particles

Let $f \in C^\infty(\mathbb{R})$ be a cutoff function such that
\[ 0 \le f \le 1, \quad f' \ge 0, \quad f \equiv 0 \text{ for } s \le \alpha_0, \quad f \equiv 1 \text{ for } s \ge \alpha_1, \tag{12.6} \]
for $0 < \alpha_0 < \alpha_1$. We set
\[ b_{ct} := f\Bigl(\frac{|x| - ct}{t^\rho}\Bigr), \qquad B_{ct} = d\Gamma(b_{ct}), \tag{12.7} \]
for constants $0 < c < 1$, $0 < \rho < 1$.
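To make the definitions (12.6) and (12.7) concrete, the following sketch builds one admissible cutoff f and evaluates the symbol b_ct at a few radii. It assumes the position representation, in which the second quantization of a multiplication operator acts on an n-particle configuration as the sum of that function over the particles. The particular smooth step, the parameter values and all function names are illustrative choices, not taken from the paper.

```python
import math


def bump(s):
    """exp(-1/s) for s > 0 and 0 otherwise; smooth on the real line."""
    return math.exp(-1.0 / s) if s > 0 else 0.0


def cutoff(s, alpha0=1.0, alpha1=2.0):
    """Smooth nondecreasing f with f = 0 for s <= alpha0 and f = 1 for s >= alpha1."""
    u = (s - alpha0) / (alpha1 - alpha0)
    return bump(u) / (bump(u) + bump(1.0 - u))


def b_ct(r, t, c=0.5, rho=0.8):
    """Symbol f((|x| - c t) / t**rho) evaluated at radius r = |x|."""
    return cutoff((r - c * t) / t**rho)


def count_outside(radii, t, c=0.5, rho=0.8):
    """Value of dGamma(b_ct) on a configuration with the given radii (sum over particles)."""
    return sum(b_ct(r, t, c, rho) for r in radii)


t = 100.0
# Two particles well inside |x| < c t and one far beyond the transition zone of width t**rho.
print(count_outside([10.0, 30.0, 200.0], t))
```

In this picture B_ct counts, up to the smooth transition layer, the particles that have travelled farther than ct, which is why bounded resolvents of B_ct single out states with finitely many such particles.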
Proposition 12.2. Assume (H0) for $\alpha > 1$, (I0), (I1) for $\epsilon_0 > 1$, (I2) for $\mu > 1$, (I5) for $\mu_2 > 1$ and choose $\rho$ in (12.7) such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$. Then
(i) the limit
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH}(B_{ct} + \lambda)^{-1}e^{-itH} =: \hat R_c^{+}(\lambda) \]
exists for $\lambda \in \mathbb{C}\setminus\mathbb{R}^-$.
(ii) $[\hat R_c^{+}(\lambda), H] = 0$.
(iii) $\hat P_c^{+} := \operatorname*{s-lim}_{\epsilon\to 0}\epsilon^{-1}\hat R_c^{+}(\epsilon^{-1})$ exists and is an orthogonal projection.

Theorem 12.3. Assume (H0) for $\alpha > 1$, (I0), (I1) for $\epsilon_0 > 1$, (I2) for $\mu > 1$, (I5) for $\mu_2 > 1$ and choose $\rho$ in (12.7) such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$. Let
\[ P_c^{+} := \inf_{c < c'}\hat P_{c'}^{+}, \qquad \mathcal{H}_c^{+} := \operatorname{Ran} P_c^{+}. \]
Then
(i) $P_c^{+}$ is an orthogonal projection independent of the choice of the function $f$ in (12.7).
(ii) $[H, P_c^{+}] = 0$.
(iii) $u \in \mathcal{H}_c^{+} \Leftrightarrow WI_\Omega u \in \mathcal{H}_c^{e+}$, where $\mathcal{H}_c^{e+}$ is defined in Theorem 5.6.
(iv) $\Omega^{+}(\mathcal{H}_{pp}(H)\otimes\Gamma(\mathfrak{h})) \subset \mathcal{H}_c^{+} \subset \mathcal{H}^{+}$, where the space $\mathcal{H}^{+}$ is defined in Sec. 8.2.
(v) $W^{+}(h) : \mathcal{H}_c^{+} \to \mathcal{H}_c^{+}$ for $h \in \mathfrak{h}_0$.
(vi) $\mathfrak{h}_0 \ni h \mapsto W^{+}(h) \in U(\mathcal{H}_c^{+})$ is a regular CCR representation of Fock type.

Proof of Proposition 12.2. We will use the notation and results in Secs. 11, 10 and 3.4. Note also that to consider a Nelson Hamiltonian as an abstract Pauli–Fierz Hamiltonian one has to introduce polar coordinates using the unitary transformation $u$ defined in Sec. 3.1. To lighten notation, we will omit this transformation and its extension $\Gamma(u)$ to Fock spaces in the computations below. For example the observable $|x|$ will be identified with the observable $a = u|x|u^{-1}$ considered in Sec. 10.1. By Proposition 11.2:
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}(B_{c,0,t} + \lambda)^{-1}e^{-itH^e} = \hat R_c^{e+}(\lambda), \qquad \lambda \in \mathbb{C}\setminus\mathbb{R}^-. \tag{12.8} \]
Note that because of the Dirichlet condition at 0 in the definition of $s_0^2$ (see Sec. 10.1) we have:
\[ \mathbb{1}_{\{\sigma\ge 0\}}\,b_{c,0,t}\,\mathbb{1}_{\{\sigma\le 0\}} = 0. \]
Hence $b_{c,0,t}$ satisfies property (3.6) in Sec. 3.4. Moreover
\[ b_{c,+,t} = \mathbb{1}_{\{\sigma\ge 0\}}\,b_{c,0,t}\,\mathbb{1}_{\{\sigma\ge 0\}} = f\Bigl(\frac{a_0 - ct}{t^\rho}\Bigr) + f\Bigl(\frac{-a_0 - ct}{t^\rho}\Bigr), \]
where $a_0$ is defined in (10.1). Note also that since $|x| \ge 0$
\[ b_{ct} = f\Bigl(\frac{a - ct}{t^\rho}\Bigr) + f\Bigl(\frac{-a - ct}{t^\rho}\Bigr), \]
where $a$ is defined in (10.1). We deduce then from Proposition 3.4 that
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH}(d\Gamma(b_{c,+,t}) + \lambda)^{-1}e^{-itH} =: \hat R_{c,0}^{+}(\lambda) \]
exists for $\lambda \in \mathbb{C}\setminus\mathbb{R}^-$ and
\[ [H, \hat R_{c,0}^{+}(\lambda)] = 0. \]
We claim now that
\[ \hat R_{c,0}^{+}(\lambda) = \operatorname*{s-lim}_{t\to+\infty} e^{itH}(B_{ct} + \lambda)^{-1}e^{-itH}, \tag{12.9} \]
which will prove (i) and (ii). Property (iii) follows then from Proposition A.7.

Let us now prove (12.9). Let $g, g_1 \in B(\mathfrak{h})$ be defined in (10.5) for exponents $\rho_1, \delta$ such that $\rho_1\mu_2 > 1$, $\delta\epsilon_0 > 1$. Using Theorems 4.9 and 4.16 for $C = -\Delta_\omega/\tilde\sigma^2$, we have:
\[ e^{-itH}u = \Gamma(g)e^{-itH}u + o(1), \qquad u \in \mathcal{H}. \tag{12.10} \]
By a density argument and using Lemmas 4.7 and 4.14, (12.9) will follow from the fact that
\[ \bigl((d\Gamma(b_{ct}) + \lambda)^{-1} - (d\Gamma(b_{c,+,t}) + \lambda)^{-1}\bigr)\Gamma(g)\chi(H)\Gamma(g_1)\chi(H) \in o(1), \tag{12.11} \]
for $\chi \in C_0^\infty(\mathbb{R})$. Let us prove (12.11) following the proof of (11.4). First
\[ (\mathbb{1} - \Gamma(g_1))(d\Gamma(b_{ct}) + \lambda)^{-1}\Gamma(g) \in O(N)t^{-\infty}, \tag{12.12} \]
using Lemma 10.1(i) and the same argument as in the proof of (11.5). Next we write:
\begin{align*}
\bigl((d\Gamma(b_{c,+,t}) + \lambda)^{-1} - (d\Gamma(b_{ct}) + \lambda)^{-1}\bigr)\Gamma(g)
&= (d\Gamma(b_{c,+,t}) + \lambda)^{-1}d\Gamma(b_{ct} - b_{c,+,t})(d\Gamma(b_{ct}) + \lambda)^{-1}\Gamma(g) \\
&= (d\Gamma(b_{c,+,t}) + \lambda)^{-1}d\Gamma(b_{ct} - b_{c,+,t})\Gamma(g_1)(d\Gamma(b_{ct}) + \lambda)^{-1}\Gamma(g) \\
&\quad + (d\Gamma(b_{c,+,t}) + \lambda)^{-1}d\Gamma(b_{ct} - b_{c,+,t})(\mathbb{1} - \Gamma(g_1))(d\Gamma(b_{ct}) + \lambda)^{-1}\Gamma(g) \\
&= I_1 + I_2.
\end{align*}
By Lemma 10.1(ii)
\[ d\Gamma(b_{ct} - b_{c,+,t})\Gamma(g_1) \in O(N)t^{\rho_1 - 2\rho}\log t, \]
and hence
\[ \|I_1\chi(H)\Gamma(g_1)\chi(H)\| \le Ct^{\rho_1 - 2\rho}\log t\,\|N\Gamma(g)\chi(H)\|. \]
Applying then Lemma 4.10, we obtain
\[ \|I_1\chi(H)\Gamma(g_1)\chi(H)\| \in O(t^{\rho_1 - 2\rho + \delta}\log t). \]
Similarly by (12.12), we have
\[ I_2 \in O(N^2)t^{-\infty}, \]
and hence if $g_2$ is another operator analogous to $g$ such that $gg_2 = g$, $g_1g_2 = g_1$, then
\[ \|I_2\chi(H)\Gamma(g_1)\chi(H)\| \le Ct^{-\infty}\|N^2\Gamma(g_2)\chi(H)\Gamma(g_1)\chi(H)\| \le Ct^{-\infty}, \]
by the same argument as in the proof of Lemma 4.10(ii). Since $\rho$ is such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$, we can pick exponents $\rho_1, \delta$ in the definition of $g$ with $\delta\epsilon_0 > 1$, $\rho_1\mu_2 > 1$ and $\rho > \delta$, $\rho > \rho_1$. Hence (12.9) holds and the proof is complete.

Proof of Theorem 12.3. Applying first Proposition 3.4 and using (12.8), (12.9) we obtain that
\[ \hat P_c^{+} = I_\Omega^{*}W^{-1}\hat P_c^{e+}WI_\Omega, \qquad \hat P_c^{e+} = W\bigl(\hat P_c^{+}\otimes\mathbb{1}_{\Gamma(\mathfrak{h})}\bigr)W^{-1}, \]
where $\hat P_c^{e+}$ is defined in Theorem 5.6, and hence:
\[ P_c^{+} = I_\Omega^{*}W^{-1}P_c^{e+}WI_\Omega, \qquad P_c^{e+} = W\bigl(P_c^{+}\otimes\mathbb{1}_{\Gamma(\mathfrak{h})}\bigr)W^{-1}. \]
Clearly this implies (i), (ii), (iii). We note next that
\begin{align*}
u \in \mathcal{H}_{pp}(H) &\Leftrightarrow WI_\Omega u \in \mathcal{H}_{pp}(H^e), \tag{12.13} \\
u \in \mathcal{H}^{+} = D(N^{+}) &\Leftrightarrow WI_\Omega u \in \mathcal{H}^{e+} = D(N^{e+}),
\end{align*}
by (8.19) and
\[ u \in \Omega^{+}(\mathcal{H}_{pp}(H)\otimes\Gamma(\mathfrak{h})) \Leftrightarrow WI_\Omega u \in \Omega^{e+}(\mathcal{H}_{pp}(H^e)\otimes\Gamma(\mathfrak{h}_2)), \]
by (8.18) and (12.13). This proves the two inclusions in (iv), using the corresponding inclusions in Theorem 8.7. Finally (v) follows from (8.16) and the corresponding statement in Theorem 8.7.
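The choice of exponents made at the end of the proof of Proposition 12.2 can be written out explicitly. The following is a short worked check, writing $\epsilon_0$ for the constant in hypothesis (I1) and assuming only the stated conditions $0 < \rho < 1$, $\rho\epsilon_0 > 1$ and $\rho\mu_2 > 1$.

```latex
% Since \rho\epsilon_0 > 1 and \rho\mu_2 > 1, the two intervals below are nonempty:
\[
  \delta \in \bigl(\epsilon_0^{-1}, \rho\bigr), \qquad
  \rho_1 \in \bigl(\mu_2^{-1}, \rho\bigr).
\]
% Then \delta\epsilon_0 > 1, \rho_1\mu_2 > 1, \rho > \delta, \rho > \rho_1, and
\[
  \rho_1 - 2\rho + \delta = (\rho_1 - \rho) + (\delta - \rho) < 0,
\]
% so t^{\rho_1 - 2\rho + \delta}\log t \to 0 as t \to +\infty, which is what the
% estimate of I_1 requires.
```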
12.2. Operators $\Gamma_c^{+}(f_0)$

Let $f_0 \in C^\infty(\mathbb{R})$ be such that
\[ 0 \le f_0 \le 1, \quad f_0' \le 0, \quad f_0 \equiv 1 \text{ for } s \le \alpha_1, \quad f_0 \equiv 0 \text{ for } s \ge \alpha_2, \tag{12.14} \]
for $0 < \alpha_0 < \alpha_1 < \alpha_2$. Let
\[ f_{0,c,t} := f_0\Bigl(\frac{|x| - ct}{t^\rho}\Bigr), \quad\text{acting on } \mathfrak{h} = L^2(\mathbb{R}^3, dk), \tag{12.15} \]
for constants $0 < c < 1$, $0 < \rho < 1$.

Theorem 12.4. Assume (H0) for $\alpha > 1$, (I0), (I1) for $\epsilon_0 > 1$, (I2) for $\mu > 1$, (I5) for $\mu_2 > 1$ and choose $\rho$ in (12.15) such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$. Then for $0 < c < c' < 1$:
(i) $\operatorname*{s-lim}_{t\to+\infty} e^{itH}\Gamma(f_{0,c',t})e^{-itH} =: \Gamma_{c'}^{+}(f_0)$ exists on $\mathcal{H}_c^{+}$,
(ii) $[\Gamma_{c'}^{+}(f_0), H] = 0$,
(iii) $WI_\Omega\Gamma_{c'}^{+}(f_0) = \Gamma_{c'}^{e+}(f_0)WI_\Omega$.

Proof. We use the notation in Secs. 10, 11 and in the proof of Proposition 12.2. By Proposition 11.4 we know that
\[ \Gamma_{c'}^{e+}(f_0) = \operatorname*{s-lim}_{t\to+\infty} e^{itH^e}\,\Gamma\Bigl(f_0\Bigl(\frac{|s|_0 - c't}{t^\rho}\Bigr)\Bigr)e^{-itH^e} \quad\text{on } \mathcal{H}_c^{e+}. \]
Since by Theorem 12.3: $u \in \mathcal{H}_c^{+} \Leftrightarrow WI_\Omega u \in \mathcal{H}_c^{e+}$, we deduce from Proposition 3.4 and the fact that $|s|_0$ satisfies (3.6) that
\[ \operatorname*{s-lim}_{t\to+\infty} e^{itH}\,\Gamma\Bigl(f_0\Bigl(\frac{a_0 - c't}{t^\rho}\Bigr)\Bigr)e^{-itH} =: \Gamma_{c',0}^{+}(f_0) \quad\text{exists on } \mathcal{H}_c^{+} \]
and
\[ [\Gamma_{c',0}^{+}(f_0), H] = 0, \qquad WI_\Omega\Gamma_{c',0}^{+}(f_0) = \Gamma_{c'}^{e+}(f_0)WI_\Omega. \]
To prove the theorem, it remains to prove that
\[ \Gamma_{c',0}^{+}(f_0) = \operatorname*{s-lim}_{t\to+\infty} e^{itH}\,\Gamma\Bigl(f_0\Bigl(\frac{|x| - c't}{t^\rho}\Bigr)\Bigr)e^{-itH} \quad\text{on } \mathcal{H}_c^{+}. \tag{12.16} \]
Using (12.10) and a density argument, (12.16) will follow from
\[ \Bigl(\Gamma\Bigl(f_0\Bigl(\frac{a_0 - c't}{t^\rho}\Bigr)\Bigr) - \Gamma\Bigl(f_0\Bigl(\frac{|x| - c't}{t^\rho}\Bigr)\Bigr)\Bigr)\Gamma(g)\chi(H) \in o(1), \tag{12.17} \]
for $\chi \in C_0^\infty(\mathbb{R})$. Let us replace $c'$ by $c$ to simplify notation. We claim that
\[ \Bigl(f_0\Bigl(\frac{|x| - ct}{t^\rho}\Bigr) - f_0\Bigl(\frac{a_0 - ct}{t^\rho}\Bigr)\Bigr)g \in O(t^{\rho_1 - 2\rho}\log t). \tag{12.18} \]
In fact set $f(s) = 1 - f_0(s)$. The function $f$ satisfies condition (10.4) in Sec. 10. Then
\[ f\Bigl(\frac{-s - ct}{t^\rho}\Bigr) + f\Bigl(\frac{s - ct}{t^\rho}\Bigr) = 1 - f_0\Bigl(\frac{s - ct}{t^\rho}\Bigr) \quad\text{for } s \ge 0,\ t \gg 1. \]
Applying Lemma 10.1(ii) we obtain (12.18). Using (12.18) and (11.10), we obtain
\[ \Bigl\|\Bigl(\Gamma\Bigl(f_0\Bigl(\frac{|x| - ct}{t^\rho}\Bigr)\Bigr) - \Gamma\Bigl(f_0\Bigl(\frac{a_0 - ct}{t^\rho}\Bigr)\Bigr)\Bigr)\Gamma(g)(N + 1)^{-1}\Bigr\| \in O(t^{\rho_1 - 2\rho}\log t). \]
By Lemma 4.10, we obtain
\[ \Bigl\|\Bigl(\Gamma\Bigl(f_0\Bigl(\frac{|x| - ct}{t^\rho}\Bigr)\Bigr) - \Gamma\Bigl(f_0\Bigl(\frac{a_0 - ct}{t^\rho}\Bigr)\Bigr)\Bigr)\Gamma(g)\chi(H)\Bigr\| \le Ct^{\rho_1 - 2\rho}\log t\,\|(N + 1)\Gamma(g_1)\chi(H)\| \le Ct^{\rho_1 - 2\rho + \delta}\log t, \]
if $g_1$ is as $g$ with $g_1g = g$. As in the proof of Proposition 12.2, we can choose $\rho_1, \delta$ such that $\rho_1 - 2\rho + \delta < 0$. This proves (12.17) and completes the proof of the theorem.

12.3. Geometric asymptotic completeness

Theorem 12.5. Assume (H0) for $\alpha > 1$, (I0), (I1) for $\epsilon_0 > 1$, (I2) for $\mu > 1$, (I5) for $\mu_2 > 1$. Let $0 < \rho < 1$ be such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$ and let $0 < c < c' < 1$. Let $f_0$ be a cutoff function satisfying (12.14). Then the operator $\Gamma_{c'}^{+}(f_0)$ is equal to the orthogonal projection on the space
\[ \mathcal{K}_c^{+} := \mathcal{K}^{+}\cap\mathcal{H}_c^{+}. \]

Proof. By Theorem 12.3(iii) and identity (8.17) we have
\[ u \in \mathcal{K}_c^{+} \Leftrightarrow WI_\Omega u \in \mathcal{K}_c^{e+}. \]
By Theorem 12.4,
\[ \Gamma_{c'}^{+}(f_0) = I_\Omega^{*}W^{-1}\Gamma_{c'}^{e+}(f_0)WI_\Omega. \]
The theorem follows then from the corresponding result in Theorem 9.5.

13. The Mourre Estimate and Its Consequences

In this section we study the consequences of a Mourre estimate for the Hamiltonian $H$ for the spaces $\mathcal{H}_c^{+}$. We show that if a Mourre estimate holds on an energy interval $\Delta$ with the generator of dilations as conjugate operator, then the space $\mathbb{1}_\Delta(H)\mathcal{K}_c^{+}$ of asymptotic vacua in $\mathcal{H}_c^{\pm}$ with energy in $\Delta$ coincides with the space of bound states of $H$ in $\Delta$.
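Let $a := -\tfrac{1}{2}(k\cdot D_k + D_k\cdot k)$ acting on $\mathfrak{h} = L^2(\mathbb{R}^3, dk)$ be the generator of dilations on the one-particle space. Let $A = \mathbb{1}_{\mathcal{K}}\otimes d\Gamma(a)$. We introduce the following hypothesis on the coupling functions $v_j$ defined in Sec. 1.1:
\[ \int (1 + |k|^{-1})\,\bigl||a|^{1+\epsilon}v_j(k)\bigr|^2\,dk < \infty, \qquad \int |k|^{2+2\epsilon}|v_j(k)|^2\,dk < \infty, \qquad 1 \le j \le P,\ 0 \le \epsilon \le 1. \tag{I6} \]

Lemma 13.1. Assume (I0), (H0) for $\alpha > 1$ and (I6) for $\epsilon > 0$. Then $H \in C^{1+\epsilon_0}(A)$ for $\epsilon_0 = \inf(\alpha - 1, \epsilon)$.

Proof. Let $v \in B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h})$ be defined in (1.3). We first claim that under hypothesis (I6) we have:
\[ (1 + |k|^{-\frac{1}{2}})\,|a|^{1+\epsilon}\langle x\rangle^{-1-\epsilon}v \in B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h}). \]
It suffices to prove the claim for $\epsilon = 0, 1$ and then argue by interpolation. The proof of the claim for $\epsilon = 0, 1$ is easy if we note the identity
\[ a(e^{-ik\cdot x_j}v_j) = e^{-ik\cdot x_j}(a + k\cdot x_j)v_j, \]
and use the factor of $\langle x\rangle$ to control the powers of $x_j$ appearing when computing $a^i v$ for $i = 1, 2$. We deduce from our claim that under Hypothesis (H0) for $\alpha > 1$ and (I6) for $\epsilon > 0$, we have
\[ (1 + |k|^{-\frac{1}{2}})(K + b)^{-\frac{1}{2}}a^{1+\epsilon_0}v \in B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h}), \qquad \epsilon_0 = \inf(\alpha - 1, \epsilon). \tag{13.1} \]
Another easy observation is that for $v_s = e^{isa}v$ we have
\[ \|(1 + |k|^{-\frac{1}{2}})v_s\|_{B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h})} \le C, \quad\text{uniformly in } |s| \le 1. \tag{13.2} \]
We first claim that the map
\[ \mathbb{R} \ni s \mapsto e^{isA}(z - H)^{-1}e^{-isA}(H_0 + b)^{\frac{1}{2}} \in B(\mathcal{H}) \]
is $C^1$ for the norm topology. In fact let
\[ H(s) := e^{isA}He^{-isA} = e^{-s}H_0 + \phi(e^{isa}v). \]
We have $D(H(s)) = D(H_0)$ and $\|(H(s) + i)^{-1}(H_0 + i)\| \le C$ uniformly for $|s| \le 1$. We compute
\begin{align*}
& s^{-1}\bigl((z - H(s))^{-1} - (z - H)^{-1}\bigr)(H_0 + b)^{\frac{1}{2}} \\
&= s^{-1}(z - H(s))^{-1}(H(s) - H)(z - H)^{-1}(H_0 + b)^{\frac{1}{2}} \\
&= s^{-1}(e^{-s} - 1)(z - H(s))^{-1}H_0(z - H)^{-1}(H_0 + b)^{\frac{1}{2}} \\
&\quad + (z - H(s))^{-1}\phi\bigl(s^{-1}(e^{isa} - \mathbb{1})v\bigr)(z - H)^{-1}(H_0 + b)^{\frac{1}{2}} \\
&= s^{-1}(e^{-s} - 1)(z - H(s))^{-1}H_0(z - H)^{-1}(H_0 + b)^{\frac{1}{2}} \\
&\quad + (z - H(s))^{-1}(H_0 + b)^{\frac{1}{2}}(H_0 + b)^{-\frac{1}{2}}(K + b)^{\frac{1}{2}} \\
&\qquad\times (K + b)^{-\frac{1}{2}}\phi\bigl(s^{-1}(e^{isa} - \mathbb{1})v\bigr)(z - H)^{-1}(H_0 + b)^{\frac{1}{2}}.
\end{align*}
Using (13.1) and Proposition A.1, we obtain that
\[ \lim_{s\to 0}s^{-1}\bigl((z - H(s))^{-1} - (z - H)^{-1}\bigr)(H_0 + b)^{\frac{1}{2}} = (z - H)^{-1}(-H_0 + \phi(iav))(z - H)^{-1}(H_0 + b)^{\frac{1}{2}} \]
in norm, which proves in particular that $H \in C^1(A)$. It remains to prove that the map
\[ \mathbb{R} \ni s \mapsto e^{isA}(z - H)^{-1}(-H_0 + \phi(iav))(z - H)^{-1}e^{-isA} \in B(\mathcal{H}) \]
is $C^{\epsilon_0}$ for the norm topology. We write
\begin{align*}
e^{isA}(z - H)^{-1}H_0(z - H)^{-1}e^{-isA}
&= e^{isA}(z - H)^{-1}e^{-isA}(H_0 + b)^{\frac{1}{2}} \\
&\quad\times (H_0 + b)^{-\frac{1}{2}}e^{isA}H_0e^{-isA}(H_0 + b)^{-\frac{1}{2}} \\
&\quad\times (H_0 + b)^{\frac{1}{2}}e^{isA}(z - H)^{-1}e^{-isA}.
\end{align*}
The first and third terms in the product are $C^1$ in norm. The second term is equal to $e^{-s}H_0(H_0 + b)^{-1}$ and hence is also $C^1$ in norm. This shows that
\[ \mathbb{R} \ni s \mapsto e^{isA}(z - H)^{-1}H_0(z - H)^{-1}e^{-isA} \]
is $C^1$ in norm. We consider next
\begin{align*}
e^{isA}(z - H)^{-1}\phi(iav)(z - H)^{-1}e^{-isA}
&= e^{isA}(z - H)^{-1}e^{-isA}(H_0 + b)^{\frac{1}{2}} \\
&\quad\times (H_0 + b)^{-\frac{1}{2}}\phi(e^{isa}iav)(H_0 + b)^{-\frac{1}{2}} \\
&\quad\times (H_0 + b)^{\frac{1}{2}}e^{isA}(z - H)^{-1}e^{-isA}.
\end{align*}
Again the first and third terms in the product are $C^1$ in norm. The second term we write as
\[ (H_0 + b)^{-\frac{1}{2}}(K + b)^{\frac{1}{2}}\phi\bigl(e^{isa}(K + b)^{-\frac{1}{2}}iav\bigr)(H_0 + b)^{-\frac{1}{2}}. \]
Using (13.1) and Proposition A.1 we see that the second term is $C^{\epsilon_0}$ in norm. This completes the proof of the lemma.

Lemma 13.2. Let $f_t(x) \in C^\infty(\mathbb{R})$ with $|\partial_x^\alpha f_t(x)| \le C_\alpha t^{-\rho|\alpha|}$. Then
(i) for $F \in C^\infty(\mathbb{R})$ with $\partial_\lambda^\alpha F \in O(\langle\lambda\rangle^{-|\alpha|})$, we have:
\[ \Bigl[\Gamma(f_t), F\Bigl(\frac{A}{t}\Bigr)\Bigr] \in O(N)t^{-\rho}; \]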
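(ii) if $\operatorname{supp} f_t \subset \{|x| \le ct\}$, $|f_t| \le 1$ then
\[ \Gamma(f_t)\frac{A}{t}\Gamma(f_t) \le c\,d\Gamma(|k|) + CNt^{-\rho}. \]

Proof. Let us first prove (i). We set $F(\lambda) = (\lambda + i)F_{-1}(\lambda)$, with $\partial_\lambda^\alpha F_{-1}(\lambda) \in O(\langle\lambda\rangle^{-1-\alpha})$. We have
\[ \Bigl[\Gamma(f_t), F\Bigl(\frac{A}{t}\Bigr)\Bigr] = \Bigl[\Gamma(f_t), \frac{A}{t}\Bigr]F_{-1}\Bigl(\frac{A}{t}\Bigr) + \Bigl(\frac{A}{t} + i\Bigr)\Bigl[\Gamma(f_t), F_{-1}\Bigl(\frac{A}{t}\Bigr)\Bigr]. \tag{13.3} \]
Now
\[ \Bigl[\Gamma(f_t), \frac{A}{t}\Bigr] = d\Gamma\Bigl(f_t, \Bigl[f_t, \frac{a}{t}\Bigr]\Bigr) \in O(N)t^{-\rho}, \]
using [10, Lemma 2.8]. Let us estimate the second term in (13.3). Let $\tilde F_{-1} \in C^\infty(\mathbb{C})$ be an almost-analytic extension of $F_{-1}$ satisfying (see e.g. [9, Proposition C.2.2]):
\[ \Bigl|\frac{\partial\tilde F_{-1}}{\partial\bar z}(z)\Bigr| \le C_N\langle z\rangle^{-2-N}|\operatorname{Im} z|^N, \quad N \in \mathbb{N}, \qquad \operatorname{supp}\tilde F_{-1} \subset \{z \mid |\operatorname{Im} z| \le C\langle\operatorname{Re} z\rangle\}. \]
We have
\[ \Bigl(\frac{A}{t} + i\Bigr)\Bigl[\Gamma(f_t), F_{-1}\Bigl(\frac{A}{t}\Bigr)\Bigr] = \frac{i}{2\pi}\int_{\mathbb{C}}\partial_{\bar z}\tilde F_{-1}(z)\Bigl(\frac{A}{t} + i\Bigr)\Bigl(z - \frac{A}{t}\Bigr)^{-1}\Bigl[\Gamma(f_t), \frac{A}{t}\Bigr]\Bigl(z - \frac{A}{t}\Bigr)^{-1}dz\wedge d\bar z \in O(N)t^{-\rho}, \]
using the properties of $\tilde F_{-1}$ and the fact that $N$ commutes with $A$ and $\Gamma(f_t)$. This completes the proof of (i).

Let us now prove (ii). We have
\begin{align*}
\Gamma(f_t)\frac{A}{t}\Gamma(f_t)
&= \frac{1}{2}\Gamma(f_t)\,d\Gamma\Bigl(\frac{x}{t}\cdot D_x\Bigr)\Gamma(f_t) + \frac{1}{2}\Gamma(f_t)\,d\Gamma\Bigl(D_x\cdot\frac{x}{t}\Bigr)\Gamma(f_t) \\
&= \frac{1}{2}\Gamma(f_t^2)\,d\Gamma\Bigl(\frac{x}{t}\cdot D_x\Bigr) + \frac{1}{2}d\Gamma\Bigl(D_x\cdot\frac{x}{t}\Bigr)\Gamma(f_t^2) + O(N)t^{-\rho}.
\end{align*}
Hence on the $n$-particle sector we have
\[ \Gamma(f_t)\frac{A}{t}\Gamma(f_t) = \frac{1}{2}\sum_{i=1}^{n}(a_ib_i + b_ia_i) + O(N)t^{-\rho}, \]
for
\[ a_i = \prod_{j=1}^{n}f_t^2(x_j)\,\frac{x_i}{t}, \qquad b_i = D_{x_i}. \]
Note that we have the following identity
\[ (ab + ba)^2 = 4ba^2b + 2[[a, b], ab] + [a, b]^2. \tag{13.4} \]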
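Identity (13.4) is purely algebraic, so it can be sanity-checked on any pair of non-commuting matrices. The sketch below does this with exact integer arithmetic; the helper functions and the two test matrices are arbitrary illustrative choices, not objects from the paper.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(*ms):
    """Entrywise sum of several matrices of the same size."""
    n = len(ms[0])
    return [[sum(m[i][j] for m in ms) for j in range(n)] for i in range(n)]


def scale(c, x):
    """Multiply every entry of x by the scalar c."""
    return [[c * v for v in row] for row in x]


def comm(x, y):
    """Commutator [x, y] = xy - yx."""
    return add(matmul(x, y), scale(-1, matmul(y, x)))


# Arbitrary non-commuting integer test matrices (illustrative values only).
a = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
b = [[0, 1, -2], [5, 1, 0], [1, -3, 1]]

ab, ba = matmul(a, b), matmul(b, a)
s = add(ab, ba)
lhs = matmul(s, s)                                   # (ab + ba)^2
c = comm(a, b)
rhs = add(
    scale(4, matmul(matmul(b, matmul(a, a)), b)),    # 4 b a^2 b
    scale(2, comm(c, ab)),                           # 2 [[a, b], ab]
    matmul(c, c),                                    # [a, b]^2
)
assert lhs == rhs
```

The identity follows by writing $ab = ba + [a, b]$ and expanding $(2ba + [a, b])^2$; the check above only confirms the bookkeeping of the commutator terms.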
This yields
\[ (a_ib_i + b_ia_i)^2 \le 4b_ia_i^2b_i + O(t^{-2\rho}) \le 4c^2b_i^2 + O(t^{-2\rho}), \]
using the properties of $f_t$. Using the fact that the function $\lambda \mapsto \lambda^{\frac{1}{2}}$ is matrix monotone (see [8, Sec. 2.2.2]), we obtain
\[ \pm\frac{1}{2}(a_ib_i + b_ia_i) \le c|b_i| + O(t^{-\rho}). \]
Summing over $i$, we get
\[ \pm\Gamma(f_t)\frac{A}{t}\Gamma(f_t) \le c\,d\Gamma(|k|) + CNt^{-\rho}, \]
which proves (ii).

The following theorem is the main result of this section. It means that if a Mourre estimate holds on an energy interval $\Delta$, then on the range of $\mathbb{1}_\Delta(H)$ the space of asymptotic vacua in $\mathcal{H}_c^{+}$ coincides with the space of bound states for $c$ small enough.

Theorem 13.3. Assume (H0) for $\alpha > 1$, (I0), (I1) for $\epsilon_0 > 1$, (I2) for $\mu > 1$, (I5) for $\mu_2 > 1$ and (I6) for $\epsilon > 0$. Let $0 < \rho < 1$ be such that $\rho\epsilon_0 > 1$, $\rho\mu_2 > 1$ and let $0 < c < c' < 1$. Let $\Delta \subset \mathbb{R}$ be an open interval such that the following Mourre estimate holds on $\Delta$:
\[ \mathbb{1}_\Delta(H)[H, iA]\mathbb{1}_\Delta(H) \ge c_0\mathbb{1}_\Delta(H) + R, \]
where $c_0 > 0$ and $R \in B(\mathcal{H})$ is compact. Then for $0 < c < c(\Delta, c_0)$ we have
\[ \mathbb{1}_\Delta(H)\mathcal{K}_c^{+} = \mathbb{1}_\Delta(H)\mathcal{H}_{pp}(H). \]

Proof. To prove the theorem, it suffices to prove that for $c$ small enough
\[ \mathbb{1}_\Delta(H)\mathcal{H}_{cont}(H)\cap\mathcal{K}_c^{+} = \{0\}, \tag{13.5} \]
where $\mathcal{H}_{cont}(H)$ is the continuous spectral subspace of $H$. In fact (13.5) implies that $\mathbb{1}_\Delta(H)\mathcal{K}_c^{+} \subset \mathbb{1}_\Delta(H)\mathcal{H}_{pp}(H)$. The fact that $\mathcal{H}_{pp}(H) \subset \mathcal{K}_c^{+}$ is shown in Proposition 8.4. We first recall that it follows from the fact that $H \in C^1(A)$ and that $H$ satisfies a Mourre estimate on $\Delta$ that:
(i) $\sigma_{pp}(H)$ is locally finite in $\Delta$,
(ii) $\forall\,\lambda \in \Delta\setminus\sigma_{pp}(H)$, $\forall\,\epsilon > 0$, there exists $\delta > 0$ such that
\[ \mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)[H, iA]\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H) \ge (c_0 - \epsilon)\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H). \tag{13.6} \]
Let now $f_0 \in C^\infty(\mathbb{R})$ satisfy (12.14) and let $\lambda \in \Delta\setminus\sigma_{pp}(H)$. We will show that for $\delta$ and $c$ small enough, we have:
\[ \|\Gamma_c^{+}(f_0)\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)\| < 1. \tag{13.7} \]
Note that by Theorem 12.5 $\Gamma_{c'}^{+}(f_0)\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)$ is equal to the orthogonal projection on the space $\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)\mathcal{K}_c^{+}$ for $0 < c < c'$. Hence (13.7) implies that for $c$ small enough
\[ \mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)\mathcal{K}_c^{+} = \{0\}, \]
which implies (13.5).

Let us now prove (13.7). We deduce first from (13.6) and the fact that $H \in C^{1+\epsilon_0}(A)$ for some $\epsilon_0 > 0$ that for any $\epsilon > 0$ we have
\[ e^{-itH}\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)u = F\Bigl(\frac{A}{t}\Bigr)e^{-itH}\mathbb{1}_{[\lambda-\delta,\lambda+\delta]}(H)u + o(1), \tag{13.8} \]
where $F \in C^\infty(\mathbb{R})$, $0 \le F \le 1$, is supported in $\{\lambda \ge c_0 - 2\epsilon\}$ and equal to 1 in $\{\lambda \ge c_0 - \epsilon\}$. This abstract result is due to [31]. A proof under the hypotheses above can be found in [18].

Let now $u \in D(N^{\frac{1}{2}})$ and $\chi \in C_0^\infty(]\lambda - \delta, \lambda + \delta[)$. We recall that it follows from (4.8) and the fact that $\chi(H)$ preserves $D(N^{\frac{1}{2}})$ that
\[ \|(N + 1)^{\frac{1}{2}}\chi(H)e^{-itH}u\| \le Ct^{(1+\epsilon_0)^{-1}}\|(N + 1)^{\frac{1}{2}}u\|. \tag{13.9} \]
We have, using (13.8), (13.9), Lemma 13.2(i) and the fact that $\rho > (1 + \epsilon_0)^{-1}$:
\begin{align*}
(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})^2\chi(H)e^{-itH}u)
&= \Bigl(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})^2F\Bigl(\frac{A}{t}\Bigr)\chi(H)e^{-itH}u\Bigr) + o(1) \\
&= \Bigl(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})F\Bigl(\frac{A}{t}\Bigr)\Gamma(f_{0,c,t})\chi(H)e^{-itH}u\Bigr) \\
&\quad + O(t^{-\rho})\|(N + 1)^{\frac{1}{2}}\chi(H)e^{-itH}u\|^2 + o(1) \\
&= \Bigl(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})F\Bigl(\frac{A}{t}\Bigr)\Gamma(f_{0,c,t})\chi(H)e^{-itH}u\Bigr) + o(1).
\end{align*}
Next we have
\begin{align*}
(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})^2\chi(H)e^{-itH}u)
&= \Bigl(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})F\Bigl(\frac{A}{t}\Bigr)\Gamma(f_{0,c,t})\chi(H)e^{-itH}u\Bigr) + o(1) \\
&\le \Bigl(e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})\frac{A}{(c_0 - \epsilon)t}\Gamma(f_{0,c,t})\chi(H)e^{-itH}u\Bigr) + o(1) \\
&\le \Bigl(e^{-itH}u, \frac{c + \epsilon_1}{c_0 - \epsilon}\chi(H)d\Gamma(|k|)\chi(H)e^{-itH}u\Bigr) + o(1),
\end{align*}
using Lemma 13.2(ii) and the fact that $\operatorname{supp} f_{0,c,t} \subset \{|x| \le (c + \epsilon_1)t\}$ for all $\epsilon_1 > 0$, $t \ge T(\epsilon_1)$. Next we have
\[ \chi(H)d\Gamma(|k|)\chi(H) \le c_1(\Delta)\chi^2(H). \]
Picking $c$ such that $cc_1(\Delta) < c_0$, we obtain for $\epsilon, \epsilon_1$ small enough
\[ (e^{-itH}u, \chi(H)\Gamma(f_{0,c,t})^2\chi(H)e^{-itH}u) < (u, \chi^2(H)u) + o(1). \]
This yields
\[ \|\Gamma_c^{+}(f_0)\chi(H)u\| < \|\chi(H)u\|, \qquad u \in D(N^{\frac{1}{2}}), \]
and hence proves (13.7) by density. This completes the proof of the theorem.

Acknowledgment

This work is based on methods developed with Jan Dereziński in a series of papers on the scattering theory for quantum field theory models. I would like to thank him very much for this friendly collaboration and for many helpful discussions. I would also like to thank Jean Ginibre and Giorgio Velo for useful conversations. Finally a part of this work was supported by the grant Nr 2 P03A 019 15 financed by Komitet Badań Naukowych. The research of the author is supported in part by NATO Collaborative Linkage Grant 976047.

Appendix A

A.1. Operator bounds

The following proposition is shown in [13, Proposition 4.1].

Proposition A.1. Let $v \in B(\mathcal{K}, \mathcal{K}\otimes\mathfrak{h}^e)$, $\omega$ be a selfadjoint operator on $\mathfrak{h}^e$. Assume that $\omega \ge 0$ and that $\omega$ is invertible on the range of $v$. Then
\begin{align*}
\|a(v)u\|^2 &\le \|v^{*}\omega^{-1}v\|\,(u, d\Gamma(\omega)u), \\
\|a^{*}(v)u\|^2 &\le (u, v^{*}vu) + \|v^{*}\omega^{-1}v\|\,(u, d\Gamma(\omega)u), \\
\|\phi(v)u\|^2 &\le (u, v^{*}vu) + 2\|v^{*}\omega^{-1}v\|\,(u, d\Gamma(\omega)u),
\end{align*}
for $u \in D(d\Gamma(\omega)^{\frac{1}{2}})$.

Lemma A.2. Let $a, b$ be two selfadjoint operators on $\mathfrak{h}$ such that $0 \le a^p \le b^p$ for each $0 \le p \le k$, $p, k \in \mathbb{N}$. Then
\[ (d\Gamma(a))^k \le (d\Gamma(b))^k. \]
Proof. We first note that if $a_i, b_i \in B(\mathcal{H}_i)$, $i = 1, 2$ with $0 \le a_i \le b_i$ then $a_1\otimes a_2 \le b_1\otimes b_2$. Next on the $n$-particle sector, since the operators acting on different factors commute, we have:
\[ d\Gamma(a)^k = \sum_{i_1+\cdots+i_n=k}\frac{k!}{i_1!\cdots i_n!}\,a^{i_1}\otimes\cdots\otimes a^{i_n} \le \sum_{i_1+\cdots+i_n=k}\frac{k!}{i_1!\cdots i_n!}\,b^{i_1}\otimes\cdots\otimes b^{i_n} = d\Gamma(b)^k, \]
which completes the proof of the lemma.

A.2. Propagation estimates and existence of limits

In this subsection we formulate two generalizations of standard arguments due to Sigal–Soffer [30]. Their proofs are analogous to the standard ones.

Proposition A.3. Let $\mathcal{H}$ be a Hilbert space, $D \subset \mathcal{H}$ a dense subspace, $H$ a selfadjoint operator on $\mathcal{H}$ and $\mathbb{R}^+ \ni t \mapsto \Phi(t) \in B(\mathcal{H})$ a function with $\sup_{t\ge 0}\|\Phi(t)\| < \infty$. Assume that for $u \in D$ the function:
\[ f(t) = (u_t, \Phi(t)u_t) \in C^1(\mathbb{R}) \quad\text{if } u_t = e^{-itH}u, \]
and
\[ \frac{d}{dt}f(t) \ge (u_t, R^{*}(t)R(t)u_t) - \sum_{i=1}^{n}(u_t, R_i^{*}(t)R_i(t)u_t), \]
where
\[ \int_1^{+\infty}\|R_i(t)u_t\|^2\,dt \le C\|u\|^2, \qquad u \in D,\ 1 \le i \le n. \]
Then
\[ \int_1^{+\infty}\|R(t)u_t\|^2\,dt \le C\|u\|^2, \qquad u \in D. \]
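Since the proof is only referred to as standard, the integration step behind Proposition A.3 is sketched here for convenience. It assumes the differential inequality in the form $\frac{d}{dt}f \ge \|R(t)u_t\|^2 - \sum_i\|R_i(t)u_t\|^2$ and uses only $\|u_t\| = \|u\|$ and the uniform bound on $\Phi(t)$.

```latex
% Integrate the differential inequality from 1 to T and use |f(t)| \le \sup_t \|\Phi(t)\| \, \|u\|^2:
\[
  \int_1^T \|R(t)u_t\|^2\,dt
  \le f(T) - f(1) + \sum_{i=1}^n \int_1^T \|R_i(t)u_t\|^2\,dt
  \le \Bigl(2\sup_{t\ge 0}\|\Phi(t)\| + nC\Bigr)\|u\|^2 .
\]
% The right-hand side does not depend on T, so letting T \to +\infty gives the claim
% (with a different constant C).
```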
Proposition A.4. Let $\mathcal{H}_i$, $D_i$, $H_i$, $i = 1, 2$ be as in Proposition A.3. Let $\mathbb{R}^+ \ni t \mapsto \Phi(t) \in B(\mathcal{H}_1, \mathcal{H}_2)$ be a function with $\sup_{t\ge 0}\|\Phi(t)\| < \infty$. Assume that for $u_i \in D_i$ the function
\[ f(t) = (u_{2,t}, \Phi(t)u_{1,t}) \in C^1(\mathbb{R}) \]
and
\[ \Bigl|\frac{df(t)}{dt}\Bigr| \le \sum_{j=1}^{n}\|B_{2,j}(t)u_{2,t}\|\,\|B_{1,j}(t)u_{1,t}\|, \]
where
\[ \int_1^{+\infty}\|B_{i,j}(t)u_{i,t}\|^2\,dt \le C\|u_i\|^2, \qquad u_i \in D_i,\ i = 1, 2,\ 1 \le j \le n. \]
Then
\[ \operatorname*{s-lim}_{t\to+\infty}e^{itH_2}\Phi(t)e^{-itH_1} \quad\text{exists.} \]
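The standard argument behind Proposition A.4 is a Cook-type Cauchy estimate. The sketch below spells it out under the hypotheses as stated above; the symbol $\tilde\Phi(t)$ is shorthand introduced here and is not notation from the paper.

```latex
% Write \tilde\Phi(t) = e^{itH_2}\Phi(t)e^{-itH_1}, so that f(t) = (u_2, \tilde\Phi(t)u_1).
% For 1 \le s \le t, integrating df/d\tau and applying the Cauchy--Schwarz inequality gives
\[
  |(u_2, (\tilde\Phi(t) - \tilde\Phi(s))u_1)|
  \le \sum_{j=1}^n \Bigl(\int_s^t \|B_{2,j}(\tau)u_{2,\tau}\|^2 d\tau\Bigr)^{1/2}
                   \Bigl(\int_s^t \|B_{1,j}(\tau)u_{1,\tau}\|^2 d\tau\Bigr)^{1/2}
  \le C^{1/2}\|u_2\| \sum_{j=1}^n \Bigl(\int_s^t \|B_{1,j}(\tau)u_{1,\tau}\|^2 d\tau\Bigr)^{1/2}.
\]
% Taking the supremum over u_2 \in D_2 with \|u_2\| \le 1 (D_2 is dense) shows that
% \tilde\Phi(t)u_1 is Cauchy as t \to +\infty for every u_1 \in D_1; the uniform bound on
% \|\Phi(t)\| then extends the convergence to all of \mathcal{H}_1.
```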
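A.3. Existence of limits of asymptotic observables

In this subsection we give two different methods to show the existence of weak or strong limits of asymptotic observables.

Proposition A.5. Let $\mathcal{H}_i$, $D_i$, $H_i$, $i = 1, 2$, be as in Proposition A.3. Let for $\epsilon \in [0, 1[$:
\[ \mathbb{R}^+ \ni t \mapsto \Phi_\epsilon(t) \in B(\mathcal{H}_1, \mathcal{H}_2) \]
be such that
\[ \sup_{t\in\mathbb{R}^+}\|\Phi_\epsilon(t)\| < \infty\ \ \forall\,\epsilon \ge 0, \qquad \sup_{\epsilon\in[0,1[}\|\Phi_\epsilon(0)\| < \infty; \tag{A.1} \]
\[ f_\epsilon(t) = (v_t, \Phi_\epsilon(t)u_t) \in C^1(\mathbb{R}^+) \quad\text{for } u \in D_1,\ v \in D_2; \]
\[ \Bigl|\frac{df_\epsilon(t)}{dt}\Bigr| \le \sum_{j=1}^{n}\|R_{1,j}(t)u_t\|\,\|R_{2,j}(t)v_t\| \ \text{ uniformly in } \epsilon \in [0, 1[, \ \text{ with } \int_1^{+\infty}\|R_{i,j}(t)u_t\|^2\,dt \le C\|u\|^2,\ u \in D_i,\ i = 1, 2,\ 1 \le j \le n; \tag{A.2} \]
\[ \operatorname*{w-lim}_{\epsilon\to 0}\Phi_\epsilon(t) = \Phi_0(t)\ \ \forall\,t \gg 1, \quad\text{respectively}\quad \operatorname*{s-lim}_{\epsilon\to 0}\Phi_\epsilon(t) = \Phi_0(t)\ \ \forall\,t \gg 1. \tag{A.3} \]
Then
(i) $\operatorname*{s-lim}_{t\to+\infty}e^{itH}\Phi_\epsilon(t)e^{-itH} =: \Phi_\epsilon^{+}$ exists $\forall\,\epsilon \in [0, 1[$,
(ii) $\operatorname*{w-lim}_{\epsilon\to 0}\Phi_\epsilon^{+} = \Phi_0^{+}$, respectively $\operatorname*{s-lim}_{\epsilon\to 0}\Phi_\epsilon^{+} = \Phi_0^{+}$.

Proof. (i) follows from Proposition A.4. It follows from (A.1), (A.2) that $\Phi_\epsilon^{+}$ is uniformly bounded in $\epsilon$. Hence to prove (ii) it suffices by density to show that
\[ \lim_{\epsilon\to 0}(v, \Phi_\epsilon^{+}u) - (v, \Phi_0^{+}u) = 0, \qquad v \in D_2,\ u \in D_1, \]
respectively that
\[ \lim_{\epsilon\to 0}\ \sup_{v\in D_2,\,\|v\|\le 1}|(v, \Phi_\epsilon^{+}u) - (v, \Phi_0^{+}u)| = 0, \qquad u \in D_1. \]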
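We have
\begin{align*}
(v, \Phi_\epsilon^{+}u) - (v, \Phi_0^{+}u)
&= (v_T, \Phi_\epsilon(T)u_T) - (v_T, \Phi_0(T)u_T) \\
&\quad + \int_T^{+\infty}\frac{d}{dt}(v_t, \Phi_\epsilon(t)u_t)\,dt - \int_T^{+\infty}\frac{d}{dt}(v_t, \Phi_0(t)u_t)\,dt.
\end{align*}
The sum of the last two terms is less than
\[ 2\sum_{j=1}^{n}\Bigl(\int_T^{+\infty}\|R_{1,j}(t)u_t\|^2\,dt\Bigr)^{\frac{1}{2}}\Bigl(\int_T^{+\infty}\|R_{2,j}(t)v_t\|^2\,dt\Bigr)^{\frac{1}{2}} \le C\sum_{j=1}^{n}\Bigl(\int_T^{+\infty}\|R_{1,j}(t)u_t\|^2\,dt\Bigr)^{\frac{1}{2}}\|v\|, \]
uniformly in $\epsilon$ by (A.2). For $T \gg 1$ this is less than $\alpha\|v\|$ for fixed $u \in D_1$, uniformly in $\epsilon$. Then (ii) follows from the fact that for fixed $T$, $\Phi_\epsilon(T)$ converges weakly (respectively strongly) to $\Phi_0(T)$ when $\epsilon \to 0$.

Proposition A.6. Let $\mathcal{H}$, $D$, $H$ be as in Proposition A.3. Let for $\epsilon \in\ ]0, 1[$
\[ \mathbb{R}^+ \ni t \mapsto \Phi_\epsilon(t) \in B(\mathcal{H}) \quad\text{selfadjoint}, \]
be such that for fixed $\epsilon$, $\Phi_\epsilon(t)$ satisfies the hypotheses of Proposition A.4 with $\mathcal{H}_i = \mathcal{H}$, $D_i = D$ and $H_i = H$, $i = 1, 2$. It follows that
\[ \Phi_\epsilon^{+} = \operatorname*{s-lim}_{t\to+\infty}e^{itH}\Phi_\epsilon(t)e^{-itH} \quad\text{exists } \forall\,\epsilon > 0. \]
Assume that
\[ 0 \le \Phi_\epsilon^{+} \le \mathbb{1}; \]
\[ \frac{d}{dt}(u_t, \Phi_\epsilon(t)u_t) \ge -\|R(t)u_t\|^2, \quad \epsilon \in\ ]0, 1[,\ u \in D, \quad\text{where } \int_1^{+\infty}\|R(t)u_t\|^2\,dt \le C\|u\|^2,\ u \in D; \]
\[ \operatorname*{w-lim}_{\epsilon\to 0}\Phi_\epsilon(t) = \mathbb{1}, \qquad \forall\,t \gg 1. \]
Then
\[ \operatorname*{w-lim}_{\epsilon\to 0}\Phi_\epsilon^{+} = \mathbb{1}. \]

Proof. Since $\Phi_\epsilon^{+}$ is uniformly bounded, it suffices by density to show that
\[ \lim_{\epsilon\to 0}(u, \Phi_\epsilon^{+}u) = (u, u), \qquad u \in D. \]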
We have
\begin{align*}
(u, \Phi_\epsilon^{+}u) &= (u_T, \Phi_\epsilon(T)u_T) + \int_T^{+\infty}\frac{d}{dt}(u_t, \Phi_\epsilon(t)u_t)\,dt \\
&\ge (u_T, \Phi_\epsilon(T)u_T) - \int_T^{+\infty}\|R(t)u_t\|^2\,dt.
\end{align*}
For $\alpha > 0$ we first choose $T \gg 1$ such that the second term is less than $\alpha$, then $\epsilon_0$ such that for $\epsilon < \epsilon_0$ the first term is greater than $(u, u) - \alpha$. We obtain $\liminf_{\epsilon\to 0}(u, \Phi_\epsilon^{+}u) \ge \|u\|^2$. Since $(u, \Phi_\epsilon^{+}u) \le \|u\|^2$ this proves the proposition.

A.4. Existence of some projections

In this subsection we show the existence of some projections, using pseudo-resolvent arguments.

Proposition A.7. Let $\mathcal{H}$, $H$ be as in Proposition A.3 and let $\mathbb{R}^+ \ni t \mapsto B_t$, where $B_t$ is a selfadjoint operator on $\mathcal{H}$, $B_t \ge 0$. Assume that $\forall\,\lambda \in \mathbb{C}\setminus\mathbb{R}$:
\[ R^{+}(\lambda) = \operatorname*{s-lim}_{t\to+\infty}e^{itH}(B_t + \lambda)^{-1}e^{-itH} \quad\text{exists.} \]
Then
(i) for $\chi \in C_\infty(\mathbb{R})$ the limit
\[ \chi^{+} = \operatorname*{s-lim}_{t\to+\infty}e^{itH}\chi(B_t)e^{-itH} \quad\text{exists;} \]
(ii) if $\chi \in C_0(\mathbb{R})$, $0 \le \chi \le 1$, $\chi$ decreasing, $\chi \equiv 1$ near 0 and $\chi_n(\lambda) = \chi(n^{-1}\lambda)$, then
\[ \operatorname*{s-lim}_{n\to+\infty}\chi_n^{+} = P^{+} \quad\text{exists} \]
and is an orthogonal projection independent of the choice of $\chi$;
(iii)
\[ P^{+} = \operatorname*{s-lim}_{\epsilon\to 0}\,\operatorname*{s-lim}_{t\to+\infty}e^{itH}(\epsilon B_t + \mathbb{1})^{-1}e^{-itH}. \]

Proof. (i) The functions $s \mapsto (s + \lambda)^{-1}$ for $\lambda \in \mathbb{C}\setminus\mathbb{R}$ are total in $C_\infty(\mathbb{R})$ by the Stone–Weierstrass theorem. Hence the limit $\chi^{+}$ exists for all $\chi \in C_\infty(\mathbb{R})$.

(ii) Clearly we have $[\chi_n^{+}, \chi_m^{+}] = 0$ $\forall\,n, m$ and $\chi_n^{+} \le \chi_{n+1}^{+} \le \mathbb{1}$. Hence $P^{+} = \operatorname*{w-lim}_{n\to+\infty}\chi_n^{+}$ exists. For $m \ge n_0n$ with $n_0$ large enough, we have $\chi_n^{+}\chi_m^{+} = \chi_n^{+}$. Letting $m \to +\infty$, we obtain $\chi_n^{+}P^{+} = \chi_n^{+}$. Letting then $n \to +\infty$ we obtain $(P^{+})^2 = P^{+}$, i.e. $P^{+}$ is a projection. We also have $\chi_{m(n)} \le \chi_n^2 \le \chi_n$, for $m(n) \ll n$, $m(n) \to +\infty$ when $n \to +\infty$. Hence $\chi_{m(n)}^{+} \le (\chi_n^{+})^2 \le \chi_n^{+}$. Letting $n \to +\infty$, we get $P^{+} = \operatorname*{w-lim}_{n\to+\infty}(\chi_n^{+})^2$. Then we compute
\[ \lim_{n\to+\infty}\|(P^{+} - \chi_n^{+})u\|^2 = \lim_{n\to+\infty}(u, ((\chi_n^{+})^2 - P^{+})u) = 0, \]
which shows that $P^{+} = \operatorname*{s-lim}_{n\to+\infty}\chi_n^{+}$. To prove that $P^{+}$ is independent of the choice of $\chi$, we note that if $\chi_1, \chi_2$ are two such functions, we have $\chi_{1,m(n)} \le \chi_{2,n}$ for $m(n) \to +\infty$ when $n \to +\infty$. This yields $\chi_{1,m(n)}^{+} \le \chi_{2,n}^{+}$ and proves the statement by letting $n \to +\infty$.

To prove (iii) it suffices to show that if $\chi \in C_\infty(\mathbb{R})$, $0 \le \chi \le 1$, $\chi$ decreasing on $\mathbb{R}^+$ and $\chi(0) = 1$ then $\operatorname*{s-lim}_{n\to+\infty}\chi_n^{+} = P^{+}$. For such $\chi$ and fixed $\epsilon_0 > 0$, we can find a function $\chi_1$ satisfying the conditions of (ii) such that $\|\chi - \chi_1\|_\infty \le \epsilon_0$. Then the statement follows from the fact that $\operatorname*{s-lim}_{n\to+\infty}\chi_{1,n}^{+} = P^{+}$, by (ii).
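Statement (iii) is the form in which the projection appears in Theorem 11.1(iii) and Proposition 12.2(iii). The elementary identity connecting the two forms is recorded below as a worked step. It uses only $B_t \ge 0$ and $\epsilon > 0$, and it assumes that the limit $R^{+}(\lambda)$ also exists at the real point $\lambda = \epsilon^{-1} > 0$, as is the case in Proposition 12.2(i).

```latex
\[
  \epsilon^{-1}\,(B_t + \epsilon^{-1})^{-1}
  = \bigl(\epsilon(B_t + \epsilon^{-1})\bigr)^{-1}
  = (\epsilon B_t + 1)^{-1},
\]
% hence, with R^+(\lambda) as in Proposition A.7,
\[
  \epsilon^{-1} R^+(\epsilon^{-1})
  = \operatorname*{s-lim}_{t\to+\infty} e^{itH}(\epsilon B_t + 1)^{-1}e^{-itH},
  \qquad
  P^+ = \operatorname*{s-lim}_{\epsilon\to 0}\, \epsilon^{-1}R^+(\epsilon^{-1}).
\]
```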
References
[1] S. Albeverio, F. Gesztesy, R. Høgh-Krohn and H. Holden, Solvable Models in Quantum Mechanics, Text and Monographs in Physics, Springer, 1988.
[2] Z. Ammari, Asymptotic completeness for a renormalized non-relativistic Hamiltonian in quantum field theory: the Nelson model (preprint).
[3] W. Amrein, A. Boutet de Monvel and V. Georgescu, C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians, Birkhäuser, Basel–Boston–Berlin, 1996.
[4] A. Arai, Ground state of the massless Nelson model without infrared cutoff in a non-Fock representation (preprint).
[5] A. Arai and M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151 (1997) 455–503.
[6] V. Bach, J. Fröhlich and I. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137(2) (1998) 299–395.
[7] V. Bach, J. Fröhlich, I. Sigal and A. Soffer, Positive commutators and the spectrum of Pauli–Fierz Hamiltonian of atoms and molecules, Comm. Math. Phys. 207 (1999) 557–587.
[8] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Text and Monographs in Physics, Springer, Berlin, 1981.
[9] J. Dereziński and C. Gérard, Scattering Theory of Classical and Quantum N-Particle System, Text and Monographs in Physics, Springer, 1997.
[10] J. Dereziński and C. Gérard, Asymptotic completeness in quantum field theory, massive Pauli–Fierz Hamiltonians, Rev. Math. Phys. 11 (1999) 383–450.
[11] J. Dereziński and C. Gérard, Spectral and scattering theory of spatially cut-off P(ϕ)2 Hamiltonians, Comm. Math. Phys. 213 (2000) 39–125.
[12] J. Dereziński and C. Gérard, Scattering theory of infrared-divergent Pauli–Fierz Hamiltonians, in preparation.
[13] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators (preprint).
[14] J. Fröhlich, Existence of dressed one-electron states in a class of persistent models, Fortschr. Phys. 22 (1974) 159–198.
[15] J. Fröhlich, On the infrared problem in a model of scalar electrons and massless scalar bosons, Ann. Inst. Henri Poincaré A 19 (1973) 1–103.
[16] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic electromagnetic fields in models of quantum-mechanical matter interacting with the quantized radiation field (preprint).
[17] C. Gérard, On the existence of ground states for massless Pauli–Fierz Hamiltonians, Ann. Henri Poincaré 1(3) (2000) 443–459.
[18] C. Gérard and F. Nier, Scattering theory for the perturbations of periodic Schrödinger operators, J. Math. Kyoto Univ. 38 (1998) 595–634.
[19] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics (preprint).
[20] B. Helffer and J. Sjöstrand, Equation de Schrödinger avec champ magnétique et equation de Harper, Springer Lecture Notes in Physics 345, 1989, pp. 118–197.
[21] R. Høgh-Krohn, On the spectrum of the space cutoff: P(ϕ): Hamiltonian in 2 spacetime dimensions, Comm. Math. Phys. 21 (1971) 256–260.
[22] R. Høgh-Krohn, Asymptotic limits in some models of quantum field theory, I, J. Math. Phys. 9 (1968) 2075–2079.
[23] M. Hübner and H. Spohn, Radiative decay: non-perturbative approaches, Rev. Math. Phys. 7 (1995) 363–387.
[24] M. Hübner and H. Spohn, Spectral properties of the spin-boson Hamiltonian, Ann. Inst. H. Poincaré 62 (1995) 289–323.
[25] V. Jakšić and C. A. Pillet, On a model for quantum friction. III. Ergodic properties of the spin-boson system, Comm. Math. Phys. 178 (1996) 627–651.
[26] J. Lőrinczi, R. Minlos and H. Spohn, The infrared behavior in Nelson's model of a quantum particle coupled to a massless scalar field (preprint).
[27] E. Nelson, Interaction of non-relativistic particles with a quantized scalar field, J. Math. Phys. 5 (1964) 1190–1197.
[28] A. Pizzo, One particle (improper) states and scattering states in Nelson's massless model (preprint).
[29] S. Schweber, An Introduction to Relativistic Quantum Field Theory, New York, Harper and Row, 1961.
[30] I. Sigal and A. Soffer, The N-particle scattering problem: asymptotic completeness for short-range quantum systems, Ann. of Math. 125 (1987) 35–108.
[31] I. Sigal and A. Soffer, Local decay and velocity bounds (preprint).
[32] E. Skibsted, Spectral analysis of N-body systems coupled to a bosonic field, Rev. Math. Phys. 10(7) (1998) 989–1026.
[33] H. Spohn, Asymptotic completeness for Rayleigh scattering, J. Math. Phys. 38(5) (1997) 2281–2296.
December 11, 2002 9:10 WSPC/148-RMP
00155
Reviews in Mathematical Physics, Vol. 14, No. 12 (2002) 1281–1334 © World Scientific Publishing Company
WZW BRANES AND GERBES
KRZYSZTOF GAWĘDZKI and NUNO REIS
Laboratoire de Physique, ENS-Lyon, 46, Allée d'Italie, F-69364 Lyon, France
Received 30 July 2002
We reconsider the role that bundle gerbes play in the formulation of the WZW model on closed and open surfaces. In particular, we show how an analysis of bundle gerbes on groups covered by SU(N) permits us to determine the spectrum of symmetric branes in the boundary version of the WZW model with such groups as the target. We also describe a simple relation between the open string amplitudes in the WZW models based on simply connected groups and in their simple-current orbifolds.
Keywords: Boundary conformal field theory; strings in group manifolds.
1. Introduction
The WZW (Wess–Zumino–Witten) model [46], a version of a two-dimensional sigma model with a group manifold G as the target, constitutes an important laboratory for conformal field theory (CFT). It is a source of numerous rational models of CFT [36] and a building block of certain string vacua [7]. It is also closely connected to the topological 3-dimensional Chern–Simons gauge theory [18, 47]. It has been clear from the very start that the model involves topological effects of a new type which are due to the presence of the topological Wess–Zumino term in its action functional. The topological intricacies of the model appear already at the classical level as global obstructions in the definition of the action functional. Those obstructions lead to the quantization of the coupling constant (the level) of the model. The phenomenon is similar to the Dirac quantization of the magnetic monopole charge but in the loop space rather than in the physical space. In the physical space, it involves closed 3-forms instead of magnetic field 2-forms. Quite straightforward for simply connected groups G, this effect becomes more subtle for non-simply connected ones, leading to more involved selection rules for the level and, possibly, multiple (theta-)vacua of the quantum theory [19]. As is well known, a convenient mathematical framework for the Dirac monopoles, their quantization and the Bohm–Aharonov effect is provided by the theory of line bundles with hermitian connections. Up to isomorphism, such bundles may be characterized by certain sheaf cohomology classes. More exactly, they correspond to the elements of the real version of the degree 2 Deligne cohomology
[17, 24]. It was realized in [25] that the Deligne cohomology in degree 3 provides a mathematical language to treat the topological intricacies in the WZW model. The theory is somewhat analogous to the degree 2 case when the original space is replaced by its loop space. Indeed, a third degree real Deligne class determines a (unique up to isomorphism) hermitian line bundle with connection on the loop space [25]. The degree 3 theory appears, however, to be much richer. In particular, one of the basic constructions in degree 2, that of the parallel transport along curves, becomes that of the "parallel transport" around two-dimensional surfaces which may have different topology. For closed surfaces one obtains the U(1)-valued "holonomies" that enter the Feynman amplitudes of classical field configurations in the WZW model. For surfaces with boundary, the amplitudes take instead values in the product of lines associated to the boundary loops. New phenomena appear when the boundary components or their pieces are restricted to special submanifolds (D-branes) over which the Deligne cohomology class trivializes. The discussion in [25] extends easily to that case, as was briefly evoked in [26]. This is precisely the situation that one confronts when studying boundary conditions in the WZW models that preserve (half of the) symmetries of the bulk theory. One of the main points of this paper is to show how the order 3 Deligne classes enter in the classification of such boundary conditions, i.e. of the WZW branes. Although the whole discussion may be made using the cohomological language, it is convenient to have at one's disposal geometric objects whose isomorphism classes are characterized by the degree 3 real Deligne cohomology classes. This was recognized in [6], which proposed to use the theory of "gerbes" [29] to provide for such objects.
It seems that the most appropriate geometric notions are those of (hermitian) bundle gerbes with connection defined in [38] and of their stable isomorphisms introduced in [39]. The bundle gerbes with connection are simple geometric objects whose stable isomorphism classes are exactly described by the order 3 real Deligne cohomology. Their use allows one to translate the cohomological discussions of [25] to a more geometric language which is indeed useful when discussing the issues related to branes. See also [4, 10, 13, 30, 48] for the discussions of gerbes in different, although related, contexts. The paper is organized as follows. In Sec. 2 we recall the essential points of [25], with some of the details relegated to Sec. 10.1, and discuss their translation to the bundle-gerbe language. In Sec. 3 we present an explicit construction of gerbes over the SU(N) groups. Section 4 is devoted to the case of non-simply connected groups covered by SU(N). In the parenthetical Sec. 5, we explain how to define gerbes on discrete quotient spaces, an issue which was previously discussed in the context of discrete torsion in [42]. How the construction from Sec. 4 fits into this general scheme is shown in Appendix B. In Sec. 6, we describe the line bundles with connection on the loop spaces induced by gerbes and relevant for the geometric description of closed string amplitudes. Section 7 shows how those line bundles may be trivialized when restricted to loop spaces of branes and how to describe the brane structure in terms of gerbes. We also discuss the line bundles induced by
gerbes on the space of paths with ends on branes, the open string counterpart of the loop space construction. In Sec. 8.1, we examine branes in the SU(N) groups and in Sec. 8.2 the ones in the groups covered by SU(N). In the latter case, we obtain a completely explicit description of the (symmetric) branes confirming the results based on studying consistency of quantum amplitudes. In Sec. 9, we evoke the bearing that the geometric constructions discussed in this paper have on the spectrum and the boundary partition functions of the WZW models based on the groups covered by SU(N). We identify a general relation, that seems at least partially new in the context of the WZW theory, between the spaces of states for the boundary WZW models with non-simply connected groups and the ones for the models based on the covering groups. Section 10 briefly indicates how to extend this relation to general open string quantum amplitudes. In Sec. 11, we present a local description of the line bundles over the loop spaces and open path spaces induced by gerbes, discussed before in more abstract terms. Conclusions give a brief summary of what was achieved in the paper and list some open problems. More technical calculations referred to in the main text have been collected in Appendices.
2. Topological Action Functionals and Gerbes
Let us start by recalling some basic points of [25], changing the notation to a more up-to-date one.
2.1. Dirac monopoles and line bundles
Suppose that $B$ is a (magnetic field) closed 2-form on a manifold $M$. To describe a particle of unit charge moving in such a field along a trajectory $\varphi(t)$, one has to add to the action functional the coupling term $\int \varphi^* A$, where $A = d^{-1}B$ is the vector potential of $B$, i.e. a 1-form such that $dA = B$. The problem arises when $B$ is not exact so that there is no global $A$ (as for the magnetic field of a monopole).
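As a minimal numerical sketch of this obstruction, assume that the monopole field on the unit sphere is normalized as $B = \frac{n}{2}\sin\theta\, d\theta \wedge d\phi$ (this normalization and the function name `monopole_flux` are illustrative choices, not taken from [25]). If $B = dA$ held globally, Stokes' theorem would force the flux of $B$ through the closed sphere to vanish; the code below evaluates that flux numerically so that it can be compared with $2\pi n$.

```python
import math

def monopole_flux(n, steps=2000):
    """Midpoint-rule integral of B = (n/2) sin(theta) dtheta ^ dphi over the unit sphere.

    The normalization of B is an assumption of this sketch.
    """
    dtheta = math.pi / steps
    # The integrand does not depend on phi, so the phi integral contributes 2*pi.
    theta_integral = sum(
        0.5 * n * math.sin((k + 0.5) * dtheta) * dtheta for k in range(steps)
    )
    return 2.0 * math.pi * theta_integral

if __name__ == "__main__":
    for n in (1, 2, 3):
        # Print the charge and the flux in units of 2*pi.
        print(n, monopole_flux(n) / (2.0 * math.pi))
```

For $n \neq 0$ the flux does not vanish, so under the stated normalization no global vector potential can exist on the sphere.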
Dirac's solution of the problem, when translated to a geometric language, is to define the Feynman amplitudes $e^{i\int \varphi^* d^{-1}B}$ of closed particle paths $\varphi$ as holonomies in a hermitian line bundle $L$ with (hermitian) connection $\nabla$ of curvature $\mathrm{curv}(\nabla) = B$, provided such a bundle exists. This is the case if the closed 2-form $\frac{1}{2\pi}B$ is integral in the sense that its periods over closed 2-cycles in $M$ are integers. Let $(O_i)$ be a sufficiently fine open covering of $M$. We shall use the standard notation $O_{ij}$, $O_{ijk}$ etc. for the multiple intersections of the sets $O_i$. A choice of local sections $s_i : O_i \to L$ of length 1 gives rise to the local data $(g_{ij}, A_i)$ for $L$ such that $s_j = g_{ij} s_i$ and $\nabla s_i = \frac{1}{i} A_i s_i$. They have the following properties:
(1) $g_{ij} = g_{ji}^{-1} : O_{ij} \to U(1)$ and on $O_{ijk}$
$$ g_{jk}\, g_{ik}^{-1}\, g_{ij} = 1 , \tag{2.1} $$
(2) $A_i$ are real 1-forms on $O_i$ such that
$$ dA_i = B , \tag{2.2} $$
(3) on $O_{ij}$
$$ A_j - A_i = i\, g_{ij}^{-1} dg_{ij} . \tag{2.3} $$
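Relation (2.3) can be checked on the standard two-patch description of the monopole bundle over the sphere. The following sketch assumes the potentials $A_N = \frac{n}{2}(1-\cos\theta)\,d\phi$ and $A_S = -\frac{n}{2}(1+\cos\theta)\,d\phi$ on the complements of the south and of the north pole, and the transition function $g_{NS} = e^{in\phi}$ on their overlap. These explicit formulas, the value of `N_CHARGE` and all function names are illustrative assumptions. With $i = N$ and $j = S$, relation (2.3) reads $A_S - A_N = i\, g_{NS}^{-1} dg_{NS}$, which the code tests with a finite difference.

```python
import cmath
import math

N_CHARGE = 2  # hypothetical monopole charge n (must be an integer)

def a_north(theta):
    """dphi-component of A_N = (n/2)(1 - cos theta) dphi (assumed form)."""
    return 0.5 * N_CHARGE * (1.0 - math.cos(theta))

def a_south(theta):
    """dphi-component of A_S = -(n/2)(1 + cos theta) dphi (assumed form)."""
    return -0.5 * N_CHARGE * (1.0 + math.cos(theta))

def g_ns(phi):
    """Transition function g_NS = exp(i n phi) on the overlap of the two patches."""
    return cmath.exp(1j * N_CHARGE * phi)

def rhs_of_2_3(phi, h=1e-6):
    """dphi-component of i g^{-1} dg, with dg/dphi taken by a central difference."""
    dg = (g_ns(phi + h) - g_ns(phi - h)) / (2.0 * h)
    return 1j * dg / g_ns(phi)

def max_violation(samples=50):
    """Largest deviation between the two sides of (2.3) over sample points of the overlap."""
    worst = 0.0
    for k in range(1, samples):
        theta = math.pi * k / samples
        phi = 2.0 * math.pi * k / samples
        lhs = a_south(theta) - a_north(theta)
        worst = max(worst, abs(lhs - rhs_of_2_3(phi)))
    return worst
```

In this sketch the transition function is single-valued on the overlap only for integer $n$, which is the form the integrality of $\frac{1}{2\pi}B$ takes for this covering.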
If $s'_i$ correspond to a different choice of local sections so that $s'_i = f_i s_i$ then
$$ g'_{ij} = g_{ij}\, f_j f_i^{-1} \quad\text{and}\quad A'_i = A_i + i f_i^{-1} df_i . \tag{2.4} $$
The local data also naturally restrict to finer coverings. Two collections of local data are considered equivalent if they are related by (2.4) when restricted to a sufficiently fine common covering. The equivalence classes $w = [g_{ij}, A_i]$ may be viewed as (real, degree 2) Deligne (hyper-)cohomology classes [25]. The class of local data depends only on the bundle $L$ with connection and not on the choice of its local sections. Besides, isomorphic bundles give rise to the same Deligne class. Denote by $W(M, B)$ the set of equivalence classes $w$ of local data corresponding to fixed $B$ and by $w(L)$ the class of local data of the bundle $L$. In fact, the line bundle $L$ together with its hermitian structure and connection may be reconstructed from the local data $(g_{ij}, A_i)$ up to isomorphism. One just takes the disjoint union $\bigsqcup_i (O_i \times \mathbb{C}) \equiv \bigcup_i (O_i \times \{i\} \times \mathbb{C})$ of trivial bundles and divides it by the equivalence relation
$$ (x, i, g_{ij} z) \sim (x, j, z) . \tag{2.5} $$
The covariant derivative given by $\nabla = d + \frac{1}{i} A_i$ on $O_i$ defines a connection on the quotient bundle. It follows that the elements of $W(M, B)$ are in one-to-one correspondence with the isomorphism classes of hermitian line bundles with connections. In particular, $W(M, B)$ is non-empty if and only if the 2-form $\frac{1}{2\pi}B$ is integral. In the latter case, the cohomology group $H^1(M, U(1))$ acts on $W(M, B)$ in a free, transitive way, i.e. $W(M, B)$ is an $H^1(M, U(1))$-torsor. The action sends $w = [g_{ij}, A_i]$ to $uw = [u_{ij} g_{ij}, A_i]$, where $(u_{ij})$ is the Čech cocycle representing $u \in H^1(M, U(1))$. The latter group may also be viewed as that of characters of the fundamental group of $M$. The line bundles corresponding to $w$ and to $uw$ have holonomies differing by the corresponding character. In particular, the set $W(M, 0)$ of isomorphism classes of flat hermitian line bundles may be identified with the group $H^1(M, U(1))$. The multiplication in $H^1(M, U(1))$ corresponds to the tensor product of flat bundles. The holonomy $\mathcal{H}(\varphi)$ in $L$ along a closed loop $\varphi : \ell \to M$ may be expressed using the local data of $L$. One splits $\ell$ into small closed intervals $b$ with common vertices $v$ in such a way that $\varphi(b) \subset O_{i_b}$ for some $i_b$, choosing also for each vertex $v$ an index $i_v$ so that $\varphi(v) \in O_{i_v}$. Then
$$ \mathcal{H}(\varphi) = \exp\Big[ i \sum_b \int_b \varphi^* A_{i_b} \Big] \prod_{v \in b} g_{i_v i_b}(\varphi(v)) , \tag{2.6} $$
where the product $\prod_{v \in b}$ is taken with the convention that the entry following it is inverted if $v$ is the beginning of $b$. Since the holonomy depends only on the isomorphism class of $L$, the right hand side depends only on the class $w$ of the local data $(g_{ij}, A_i)$. More generally, for arbitrary curves $\varphi : \ell \to M$, the parallel transport
in $L$ defines an element in $L^{-1}_{\varphi(v_-)} \otimes L_{\varphi(v_+)}$, where $v_\pm$ are the ends of $\ell$, $L_m$ denotes the fiber of $L$ over $m \in M$, and $L^{-1}$ is the bundle dual to $L$. Using local sections $s_{i_\pm}$ such that $\varphi(v_\pm) \in O_{i_\pm}$, this element may be represented by a number that is still given by the right hand side of (2.6). The value of (2.6) changes now upon changing the indices $i_\pm$ assigned to the endpoints of $\ell$ according to the identifications (2.5). It also changes when one changes the local data $(g_{ij}, A_i)$ within the class $w$, but in a way consistent with isomorphisms of the line bundles reconstructed from such data.
2.2. Topological actions in two-dimensional field theories
In two-dimensional field theory, for example in the WZW model (see [25] for another example), one needs to make sense of action functionals written formally as $\int \phi^* d^{-1}H$, where $H$ is a closed but (possibly) not exact real 3-form on the target manifold $M$ in which the two-dimensional field $\phi$ takes values. This may be done in analogy to the one-dimensional prescription (2.6). Suppose that, for a sufficiently fine covering $(O_i)$, one may choose the local data $(g_{ijk}, A_{ij}, B_i)$ with the following properties:
(1) $g_{ijk} = g_{\sigma(i)\sigma(j)\sigma(k)}^{\mathrm{sign}(\sigma)} : O_{ijk} \to U(1)$ and on $O_{ijkl}$
$$ g_{jkl}\, g_{ikl}^{-1}\, g_{ijl}\, g_{ijk}^{-1} = 1 , \tag{2.7} $$
(2) $A_{ij} = -A_{ji}$ are real 1-forms on $O_{ij}$ and on $O_{ijk}$
$$ A_{jk} - A_{ik} + A_{ij} = i\, g_{ijk}^{-1} dg_{ijk} , \tag{2.8} $$
(3) $B_i$ are real 2-forms on $O_i$ such that
$$ dB_i = H , \tag{2.9} $$
(4) on $O_{ij}$,
$$ B_j - B_i = dA_{ij} . \tag{2.10} $$
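A simple way to produce functions satisfying the cocycle condition (2.7) is to take $g_{ijk} = \chi_{jk}^{-1}\chi_{ik}\chi_{ij}^{-1}$ for arbitrary $\chi_{ij} = \chi_{ji}^{-1}$. The sketch below checks this for randomly chosen constant phases on a hypothetical covering by five sets. The restriction to constant functions, the number of sets and the helper names are assumptions made for illustration only; with constant phases one may take $A_{ij} = 0$ in (2.8).

```python
import cmath
import itertools
import random

def random_cochain(num_sets, seed=0):
    """Random constant U(1)-valued chi_ij with chi_ji = chi_ij^{-1} (toy data)."""
    rng = random.Random(seed)
    chi = {}
    for i, j in itertools.combinations(range(num_sets), 2):
        chi[(i, j)] = cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi))
        chi[(j, i)] = 1.0 / chi[(i, j)]
    return chi

def coboundary(chi, i, j, k):
    """g_ijk = chi_jk^{-1} chi_ik chi_ij^{-1}."""
    return chi[(i, k)] / (chi[(j, k)] * chi[(i, j)])

def cocycle_defect(chi, num_sets):
    """Largest |g_jkl g_ikl^{-1} g_ijl g_ijk^{-1} - 1| over quadruples of distinct indices."""
    worst = 0.0
    for i, j, k, l in itertools.permutations(range(num_sets), 4):
        value = (coboundary(chi, j, k, l) * coboundary(chi, i, j, l)
                 / (coboundary(chi, i, k, l) * coboundary(chi, i, j, k)))
        worst = max(worst, abs(value - 1.0))
    return worst

def antisymmetry_defect(chi, num_sets):
    """Largest |g_ijk g_jik - 1|, i.e. the failure of g to be inverted by a transposition."""
    worst = 0.0
    for i, j, k in itertools.permutations(range(num_sets), 3):
        worst = max(worst, abs(coboundary(chi, i, j, k) * coboundary(chi, j, i, k) - 1.0))
    return worst
```

Both defects vanish up to rounding, reflecting the fact that the Čech coboundary of a coboundary is trivial.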
Such local data naturally restrict to finer coverings. Following [25], we shall consider two collections of local data equivalent if, upon restriction to a common sufficiently fine covering,
$$ g'_{ijk} = g_{ijk}\, \chi_{jk}^{-1} \chi_{ik} \chi_{ij}^{-1} , \tag{2.11a} $$
$$ A'_{ij} = A_{ij} + \Pi_j - \Pi_i - i \chi_{ij}^{-1} d\chi_{ij} , \tag{2.11b} $$
$$ B'_i = B_i + d\Pi_i \tag{2.11c} $$
for $\chi_{ij} = \chi_{ji}^{-1} : O_{ij} \to U(1)$ and real 1-forms $\Pi_i$ on $O_i$. The equivalence classes $w = [g_{ijk}, A_{ij}, B_i]$ may be viewed as Deligne (hyper-)cohomology classes in degree three [25]. The set $W(M, H)$ of the classes corresponding to a given closed 3-form $H$ is non-empty if and only if $\frac{1}{2\pi}H$ is integral in the sense that all its
periods over closed 3-cycles in $M$ are integers. If this is the case then $W(M, H)$ is an $H^2(M, U(1))$-torsor, with the cohomology group $H^2(M, U(1))$ acting on $W(M, H)$ by
$$ (g_{ijk}, A_{ij}, B_i) \mapsto (u_{ijk} g_{ijk}, A_{ij}, B_i) . \tag{2.12} $$
If $H^3(M, \mathbb{Z})$ (or $H_2(M, \mathbb{Z})$) is without torsion, the above action is equivalent to
$$ (g_{ijk}, A_{ij}, B_i) \mapsto (g_{ijk}, A_{ij}, B_i + F) , \tag{2.13} $$
where $F$ is a closed 2-form on $M$. In the latter case, the class in $W(M, H)$ does not change if and only if $\frac{1}{2\pi}F$ is an integral 2-form. The equivalence of the two actions follows from the isomorphism $H^2(M, U(1)) \cong H^2(M, \mathbb{R})/H^2(M, 2\pi\mathbb{Z})$. Let $\phi$ be a map from a compact oriented surface $\Sigma$ to $M$. One may triangulate $\Sigma$ in such a way that for each triangle $c$ there is an index $i_c$ such that $\phi(c) \subset O_{i_c}$. We shall also choose indices $i_b$ for edges $b$ and $i_v$ for vertices $v$ so that $\phi(b) \subset O_{i_b}$ and $\phi(v) \in O_{i_v}$. The formal amplitudes $e^{i\int \phi^* d^{-1}H}$ may now be defined, as was first proposed in [2], by
$$ \mathcal{A}(\phi) = \exp\Big[ i \sum_c \int_c \phi^* B_{i_c} + i \sum_{b \subset c} \int_b \phi^* A_{i_c i_b} \Big] \prod_{v \in b \subset c} g_{i_c i_b i_v}(\phi(v)) , \tag{2.14} $$
with similar orientation conventions as in (2.6). It is straightforward to check that if $\partial\Sigma = \emptyset$ then $\mathcal{A}(\phi)$ is independent of the choices of the triangulation and of the assignment of the covering indices and does not change under restrictions of the local data to finer coverings and under the equivalences (2.11). Assume now that $\Sigma$ has a boundary $\partial\Sigma = \bigsqcup_s \ell_s$ with the boundary components $\ell_s$ that may be parametrized by the standard circle $S^1$. In this case the expression (2.14) still does not change if one modifies the triangulation and the index assignment in the interior of $\Sigma$, but it does change if the changes concern the boundary data. One may abstract from those changes a definition of a hermitian line bundle $\mathcal{L}$ over the space $LM$ of loops in $M$ (or over the quotient of the latter by orientation-preserving reparametrizations) in such a way that
$$ \mathcal{A}(\phi) \in \bigotimes_s \mathcal{L}_{\phi|_{\ell_s}} . \tag{2.15} $$
The transition functions of the line bundle $\mathcal{L}$ have been constructed in [25], where it was also shown that $\mathcal{L}$ carries a natural connection whose curvature 2-form $\Omega$ is given by
$$ \langle \delta_1\varphi, \delta_2\varphi \,|\, \Omega(\varphi) \rangle = \int_\ell \varphi^*\, \iota(\delta_2\varphi)\, \iota(\delta_1\varphi) H . \tag{2.16} $$
For completeness, we include the explicit expressions from [25] in Sec. 10.1. For an equivalent choice of the local data $(g_{ijk}, A_{ij}, B_i)$, the bundle $\mathcal{L}$ changes to an isomorphic one so that one obtains a natural map from $W(M, H)$ to $W(LM, \Omega)$.
2.3. Bundle gerbes with connections
As we already mentioned, there are simple geometric objects, the (hermitian line) bundle gerbes with connection, whose appropriate isomorphism classes are described by elements of $W(M, H)$. Let us briefly recall this concept [38, 39]. Suppose that we are given a manifold map $\pi : Y \to M$ which admits local sections $\sigma_i : O_i \to Y$ over the sets of a sufficiently fine covering of $M$. Let $Y^{[n]} = Y \times_M Y \cdots \times_M Y$ denote the $n$-fold fiber product of $Y$, i.e. $Y^{[n]} = \{(y_1, \dots, y_n) \in Y^n \mid \pi(y_1) = \cdots = \pi(y_n)\}$. We shall denote by $\pi^{[n]}$ the obvious map from $Y^{[n]}$ to $M$ and by $p_{n_1 \cdots n_k}$ the projection of $(y_1, \dots, y_n)$ to $(y_{n_1}, \dots, y_{n_k})$. A hermitian line bundle gerbe $\mathcal{G}$ over $M$ with connection of curvature $H$ (a gerbe, for short) is a quadruple $(Y, B, L, \mu)$ where
(1) $B$ is a 2-form on $Y$ such that
$$ dB = \pi^* H , \tag{2.17} $$
(2) $L$ is a hermitian line bundle with a connection $\nabla$ over $Y^{[2]}$ with curvature
$$ \mathrm{curv}(\nabla) = p_2^* B - p_1^* B , \tag{2.18} $$
(3) $\mu$ is an isomorphism of hermitian line bundles with connection over $Y^{[3]}$
$$ \mu : p_{12}^* L \otimes p_{23}^* L \to p_{13}^* L , \tag{2.19} $$
(4) as isomorphisms between the line bundles $p_{12}^* L \otimes p_{23}^* L \otimes p_{34}^* L$ and $p_{14}^* L$ over $Y^{[4]}$
$$ \mu \circ (\mu \otimes \mathrm{id}) = \mu \circ (\mathrm{id} \otimes \mu) . \tag{2.20} $$
The 2-form $B$ is called the curving of the gerbe. The isomorphism $\mu$ defines a structure of a groupoid on $L$ with the bilinear product $\mu : L_{(y_1,y_2)} \otimes L_{(y_2,y_3)} \to L_{(y_1,y_3)}$. The associativity of the product is guaranteed by (2.20). The bundle $L$ restricted to the diagonal composed of the elements $(y, y)$ may be naturally trivialized by the choice of the units of the groupoid multiplication and $\mu$ determines a natural isomorphism between $\kappa^* L$ and $L^{-1}$, where $\kappa(y_1, y_2) = (y_2, y_1)$. In order to elucidate the abstract definition copied from [38] (except for fixing the curving $B$ of the gerbe), let us immediately provide examples. First, for an exact 3-form $H = dB$, the quadruple $(M, B, M^{[2]} \times \mathbb{C}, \cdot)$ with $Y = M$, with the trivial bundle $L$ over $M^{[2]} \cong M$, and with $\mu$ determined by the product of complex numbers, is a gerbe with curvature $H$. Given a map $\pi : Y \to M$ admitting local sections and a hermitian line bundle $N$ over $Y$ with connection of curvature $F$, a simple example of a gerbe is provided by $\mathcal{G}_N = (Y, F, p_1^* N^{-1} \otimes p_2^* N, \mu)$ with $\mu$ given by the obvious identification between $(N_{y_1}^{-1} \otimes N_{y_2}) \otimes (N_{y_2}^{-1} \otimes N_{y_3})$ and $N_{y_1}^{-1} \otimes N_{y_3}$. This is a gerbe with vanishing curvature. Following [38, 39], we shall call gerbes $\mathcal{G}_N$ trivial. Trivial gerbes are useful to recognize when a bundle $N$ is isomorphic to a pullback $\pi^* P$ of a hermitian bundle with connection on $M$. This is the case if and only if there exists a unit length flat
section $D$ (called descent data) of the bundle $p_1^* N^{-1} \otimes p_2^* N$ of the trivial gerbe $\mathcal{G}_N$ over $Y^{[2]}$ such that
$$ \mu(D \circ p_{12} \otimes D \circ p_{23}) = D \circ p_{13} . \tag{2.21} $$
$D(y_1, y_2) : N_{y_1} \to N_{y_2}$ then defines an equivalence relation on $\bigsqcup_{y \in \pi^{-1}(m)} N_y$. Taking $P_m$ as the set of the equivalence classes, one obtains canonically a bundle $P$ and an isomorphism of $N$ with $\pi^* P$. We shall say that $P$ is obtained from $N$ and $D$ by the descent principle. The next example will be central to our application of gerbes. Let $(g_{ijk}, A_{ij}, B_i)$ be local data on $M$ as described in the previous subsection. Take for $Y$ the disjoint union $\bigsqcup_i O_i$ with $\pi(x, i) = x$. Then $Y^{[n]} = \bigsqcup_{(i_1, \dots, i_n)} O_{i_1 \cdots i_n}$ and the projections $p_{n_1 \cdots n_k}$ are the inclusions of $O_{i_1 \cdots i_n}$ into $O_{i_{n_1} \cdots i_{n_k}}$. We take as $L$ the trivial hermitian line bundle $Y^{[2]} \times \mathbb{C}$. The connection on $L$ will be given by $\nabla = d + \frac{1}{i} A_{ij}$ on $O_{ij}$ and the isomorphism $\mu$ by the multiplication by $g_{ijk}$ on $O_{ijk}$. The relation (2.17) is then assured by (2.9) and the equality (2.18) by (2.10). That $\mu$ preserves the connections follows from (2.8) and its associativity (2.20) is a consequence of (2.7). Conversely, given a gerbe, one may define local data $(g_{ijk}, A_{ij}, B_i)$ in the following way. One first chooses local sections $\sigma_i : O_i \to Y$ that induce local sections $\sigma_{i_1 \cdots i_n} \equiv (\sigma_{i_1}, \dots, \sigma_{i_n})$ of $Y^{[n]}$ defined on intersections $O_{i_1 \cdots i_n}$. If the covering of $M$ is sufficiently fine, one may also choose unit length sections $s_{ij} : \sigma_{ij}(O_{ij}) \to L$ so that $s_{ji} = s_{ij}^{-1} \circ \kappa$. One defines the local data $(g_{ijk}, A_{ij}, B_i)$ by the relations
$$ B_i = \sigma_i^* B , \tag{2.22a} $$
$$ \sigma_{ij}^* (\nabla s_{ij}) = \tfrac{1}{i} A_{ij}\, s_{ij} \circ \sigma_{ij} , \tag{2.22b} $$
$$ \mu \circ (s_{ij} \circ \sigma_{ij} \otimes s_{jk} \circ \sigma_{jk}) = g_{ijk}\, s_{ik} \circ \sigma_{ik} . \tag{2.22c} $$
Properties (2.9) and (2.10) follow from (2.17) and (2.18). Equation (2.8) arises by covariantly differentiating (2.22c) along directions tangent to $\sigma_{ijk}(O_{ijk})$ with the use of relations $p_{12} \circ \sigma_{ijk} = \sigma_{ij}$ etc. and of the fact that $\mu$ preserves the connections. Finally, the cocycle condition (2.7) follows from the associativity (2.20). For a trivial gerbe $\mathcal{G}_N$, we may take $s_{ij} = (\chi_{ij}^{-1} \circ \pi^{[2]})(s_i^{-1} \otimes s_j)$, where $s_i$ are unit length sections $s_i : \sigma_i(O_i) \to N$ and $\chi_{ij} = \chi_{ji}^{-1} : O_{ij} \to U(1)$. One obtains then the local data
$$ (\chi_{jk}^{-1} \chi_{ik} \chi_{ij}^{-1},\ \Pi_j - \Pi_i - i \chi_{ij}^{-1} d\chi_{ij},\ d\Pi_i) , \tag{2.23} $$
where $\Pi_i$ is defined by the relation $\sigma_i^* (\nabla s_i) = \frac{1}{i} \Pi_i\, s_i \circ \sigma_i$. Let us show that the class $w \in W(M, H)$ of the local data $(g_{ijk}, A_{ij}, B_i)$ depends only on the gerbe $\mathcal{G}$ and not on the choices of local sections used in the construction of the data. First, restricting the sections $\sigma_i$ and, accordingly, $s_{ij}$ to a finer covering produces the restriction of the local data to that covering which, by definition, does not change the class in $W(M, H)$. For two choices of local sections, one may assume that they have been already restricted to a common covering with the
sets $O_i$ sufficiently small. One has then to compare the local data induced by the two families of sections $\sigma_i$, $s_{ij}$ and $\sigma'_i$, $s'_{ij}$. Let $\tilde\sigma_i = (\sigma_i, \sigma'_i) : O_i \to Y^{[2]}$ and let $s_i : \tilde\sigma_i(O_i) \to L$ be unit length sections of $L$. The relations
$$ (s_i \circ \tilde\sigma_i)^{-1} (s_{ij} \circ \sigma_{ij}) (s_j \circ \tilde\sigma_j) = \chi_{ij}\, s'_{ij} \circ \sigma'_{ij} , \tag{2.24} $$
where on the left hand side the sections of $L$ are multiplied using $\mu$, define $U(1)$-valued functions $\chi_{ij} = \chi_{ji}^{-1}$ on $O_{ij}$. Let $\Pi_i$ be 1-forms on $O_i$ given by
$$ \tilde\sigma_i^* \nabla s_i = \tfrac{1}{i} \Pi_i\, s_i \circ \tilde\sigma_i . \tag{2.25} $$
Relation (2.11c) follows then from (2.18) and (2.22a). Similarly, identity (2.11b) is a consequence of (2.22b), (2.24) and the fact that $\mu$ commutes with the covariant derivation. Finally, the associativity of the product defined by $\mu$ together with (2.22c) and (2.24) implies (2.11a). This shows that the local data $(g_{ijk}, A_{ij}, B_i)$ and $(g'_{ijk}, A'_{ij}, B'_i)$ define the same class $w \in W(M, H)$ which, consequently, depends only on the gerbe $\mathcal{G}$. We shall denote this class by $w(\mathcal{G})$. Clearly, the class of the gerbe constructed from the local data $(g_{ijk}, A_{ij}, B_i)$ is the class of those data. It is natural to inquire when two gerbes $\mathcal{G} = (Y, B, L, \mu)$ and $\mathcal{G}' = (Y', B', L', \mu')$ on $M$ with curvature $H$ define the same class $w \in W(M, H)$. A sufficient condition is that $Y = Y'$, $B = B'$ and that there exists a bundle isomorphism $\iota : L \to L'$ preserving the remaining structures. We shall call such gerbes isomorphic. It is clear, however, that this is not a necessary condition. For example, two gerbes constructed from equivalent local data on different open coverings of $M$ define the same class in $W(M, H)$ but may have spaces $Y$ and $Y'$ with different numbers of components. The appropriate geometric notion of a stable isomorphism of gerbes was introduced in [39]. It provides a necessary and sufficient condition for the equality $w(\mathcal{G}_1) = w(\mathcal{G}_2)$. We shall describe it now. Let $\mathcal{G} = (Y, B, L, \mu)$ be a gerbe and let $\omega : Z \to M$ be another map with local sections. Given also a map $\sigma : Z \to Y$ commuting with the projections on $M$, the pullback gerbe $\sigma^* \mathcal{G}$ will be defined as $(Z, \sigma^* B, \sigma^{[2]*} L, \sigma^{[3]*} \mu)$. It has the same curvature as $\mathcal{G}$. For two gerbes with the same $Y$, one may define their tensor product by taking the tensor product of the hermitian line bundles with connections over $Y^{[2]}$ and the tensor product $\mu \otimes \mu'$ as the groupoid multiplication. The curvings and the curvatures add under such operation.
Let $\mathcal{G} = (Y, B, L, \mu)$ and $\mathcal{G}' = (Y', B', L', \mu')$ be two arbitrary gerbes over $M$. Take $Z = Y \times_M Y'$ and let $\sigma : Z \to Y$ and $\sigma' : Z \to Y'$ be the projections on the components in $Y \times_M Y'$. By definition, gerbes $\mathcal{G}$ and $\mathcal{G}'$ are stably isomorphic if there exists a line bundle $N$ over $Z$ and a line bundle isomorphism
$$ \iota : \sigma^{[2]*} L \otimes p_1^* N^{-1} \otimes p_2^* N \to \sigma'^{[2]*} L' \tag{2.26} $$
defining an isomorphism between the gerbes $\sigma^* \mathcal{G} \otimes \mathcal{G}_N$ and $\sigma'^* \mathcal{G}'$. In particular, this requires that the curvature $F$ of $N$ be equal to $\sigma'^* B' - \sigma^* B$. The stable isomorphism of gerbes is an equivalence relation.
The line bundle isomorphism $\iota$ will be called the stable isomorphism between $\mathcal{G}$ and $\mathcal{G}'$. In general, it is not unique. If $\iota'$ is another stable isomorphism between $\mathcal{G}$ and $\mathcal{G}'$ corresponding to a line bundle $N'$ over $Z$ then it necessarily differs from $\iota$ by an isomorphism between the trivial gerbes $\mathcal{G}_N$ and $\mathcal{G}_{N'}$. Such an isomorphism defines descent data for the bundle $N^{-1} \otimes N'$ so that, canonically, $N' \cong N \otimes \omega^* P$ for a bundle $P$ on $M$. Since $N$ and $N'$ have the same curvature, $P$ has to be a flat bundle. Conversely, the gerbes $\mathcal{G}_N$ and $\mathcal{G}_{N'}$ for $N' = N \otimes \omega^* P$ are canonically isomorphic if $P$ is a flat bundle on $M$.
Remark 2.1. It is easy to see that any pullback gerbe $\sigma^* \mathcal{G}$ is stably isomorphic to $\mathcal{G}$. Indeed, taking as the bundle $N$ over $Z \times_M Y$ the pullback of $L$ by the map $\sigma \times \mathrm{Id}$ from $Z \times_M Y$ to $Y^{[2]}$, we observe that the groupoid multiplication $\mu$ defines a stable isomorphism
$$ L_{(\sigma(z_1), \sigma(z_2))} \otimes N_{(\sigma(z_1), y_1)}^{-1} \otimes N_{(\sigma(z_2), y_2)} \to L_{(y_1, y_2)} \tag{2.27} $$
between $\sigma^* \mathcal{G}$ and $\mathcal{G}$. In fact, two gerbes $\mathcal{G}$ and $\mathcal{G}'$ are stably isomorphic if and only if they become isomorphic after the pullback to a common $Z$ (not necessarily equal to $Y \times_M Y'$) and the tensor multiplication by a trivial gerbe. Clearly, stably isomorphic gerbes have the same curvature $H$. Moreover, as is easy to see, they give rise to the same class $w \in W(M, H)$. Indeed, under pullbacks of gerbes the local data do not change if we use the local sections $\sigma \circ \sigma_i : O_i \to Y$ and $s_{ij}$ for the gerbe before the pullback and $\sigma_i : O_i \to Z$ and $s_{ij} \circ \sigma^{[2]}$ for the pullback gerbe. Similarly, under tensor multiplication by a trivial gerbe, the local data change by (2.23), hence again stay in the same class. The converse is also true: if $w(\mathcal{G}) = w(\mathcal{G}')$ then $\mathcal{G}$ and $\mathcal{G}'$ are stably isomorphic. To prove this, it is enough to show that any gerbe $\mathcal{G} = (Y, B, L, \mu)$ is stably isomorphic to the one constructed from its local data associated to the sections $\sigma_i$ and $s_{ij}$. This follows from the fact that the pullback of $\mathcal{G}$ by the map $\sigma : \bigsqcup_i O_i \to Y$ equal to $\sigma_i$ on each $O_i$ is isomorphic to the local data gerbe, with the corresponding line bundle isomorphism given by the sections $s_{ij}$ (just recall how the local data and the corresponding gerbe are defined). Summarizing: there is a one-to-one correspondence between the cohomology classes in $W(M, H)$ and the stable isomorphism classes of gerbes with curvature $H$.
3. Gerbes on Groups SU(N)
Let $G$ be a connected, simply connected, simple compact group and let $\mathfrak{g}$ be its Lie algebra. We shall denote by tr the non-degenerate bilinear invariant form on $\mathfrak{g}$ which allows one to identify $\mathfrak{g}$ with its dual, by $\mathfrak{t}$ the Cartan subalgebra of $\mathfrak{g}$, by $r$ the rank of $\mathfrak{g}$, by $\Delta$ the set of the roots $\alpha$, by $\phi$ the highest root, by $\alpha^\vee$ and $\phi^\vee$ the coroots, by $e_\alpha$ the step generators corresponding to roots $\alpha$, by $\alpha_i$, $\alpha_i^\vee$, $\lambda_i$, and $\lambda_i^\vee$
for $i = 1, \dots, r$, the simple roots, coroots, weights and coweights and by $Q$, $Q^\vee$, $P$ and $P^\vee$ the corresponding lattices. In particular,
$$ \mathfrak{g}^{\mathbb{C}} = \mathfrak{t}^{\mathbb{C}} \oplus \Big( \bigoplus_{\alpha \in \Delta} \mathbb{C} e_\alpha \Big) \tag{3.1} $$
is the root decomposition of the complexification of $\mathfrak{g}$. The standard normalization of tr requires that the long roots have length square 2 so that $\alpha^\vee = 2\alpha / \mathrm{tr}\, \alpha^2$. The highest root $\phi = \phi^\vee = \sum_i k_i^\vee \alpha_i^\vee$, where $k_i^\vee$ are the dual Kac labels. The dual Coxeter number $h^\vee = 1 + \sum_i k_i^\vee$. The positive Weyl chamber $C_W \subset \mathfrak{t}$ is composed of $\tau \in \mathfrak{t}$ such that $\mathrm{tr}\, \tau\alpha_i \geq 0$ for each $i$ and the positive Weyl alcove $A_W$ is its subset restricted by the additional inequality $\mathrm{tr}\, \tau\phi \leq 1$. It is the $r$-dimensional simplex in $\mathfrak{t}$ with vertices $0$ and $\frac{1}{k_i^\vee} \lambda_i^\vee$. On $G$ we shall consider the unique up to normalization left- and right-invariant real closed 3-form
$$ H = \frac{1}{12\pi}\, \mathrm{tr}\, (g^{-1} dg)^3 . \tag{3.2} $$
It will be convenient to parametrize the Lie algebra and the Lie group elements using the adjoint action of $G$. Elements in $\mathfrak{g}$ and in $G$ may be written, respectively, as
$$ \gamma \tau \gamma^{-1} \quad\text{and}\quad \gamma e^{2\pi i \tau} \gamma^{-1} \tag{3.3} $$
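for some $\gamma \in G$ and $\tau \in \mathfrak{t}$. The group elements $\gamma$ are determined up to the right multiplication by $\gamma_0$ in the isotropy subgroups $G^0_\tau$ and $G_\tau$ composed of elements of $G$ commuting with $\tau$ and with $e^{2\pi i \tau}$, respectively. Clearly, $G^0_\tau \subset G_\tau$. The subgroups $G^0_\tau$ and $G_\tau$ are connected. They correspond to the Lie subalgebras $\mathfrak{g}^0_\tau$ and $\mathfrak{g}_\tau$ of $\mathfrak{g}$ with complexifications
$$ \mathfrak{g}^{0\,\mathbb{C}}_\tau = \mathfrak{t}^{\mathbb{C}} \oplus \Big( \bigoplus_{\alpha \in \Delta^0_\tau} \mathbb{C} e_\alpha \Big) , \qquad \mathfrak{g}^{\mathbb{C}}_\tau = \mathfrak{t}^{\mathbb{C}} \oplus \Big( \bigoplus_{\alpha \in \Delta_\tau} \mathbb{C} e_\alpha \Big) , \tag{3.4} $$
where
$$ \Delta^0_\tau = \{\alpha \in \Delta \mid \mathrm{tr}\, \tau\alpha = 0\} , \qquad \Delta_\tau = \{\alpha \in \Delta \mid \mathrm{tr}\, \tau\alpha \in \mathbb{Z}\} . \tag{3.5} $$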
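The sets (3.5) are easy to compute in the example $G = SU(N)$. The sketch below assumes the standard description of the roots of $su(N)$ by ordered pairs $(p, q)$ with $p \neq q$, for which $\mathrm{tr}\, \tau\alpha = a_p - a_q$ when $\tau = \mathrm{diag}(a_0, \dots, a_{N-1})$; this labeling and the function names are illustrative assumptions. Exact rational arithmetic avoids rounding issues when testing whether a pairing is an integer.

```python
from fractions import Fraction
from itertools import permutations

def root_pairings(tau):
    """tr(tau alpha) = a_p - a_q for the roots of su(N) labeled by pairs (p, q), p != q."""
    return {(p, q): tau[p] - tau[q] for p, q in permutations(range(len(tau)), 2)}

def delta0(tau):
    """Roots with tr(tau alpha) = 0, cf. the first set in (3.5)."""
    return {pq for pq, value in root_pairings(tau).items() if value == 0}

def delta(tau):
    """Roots with tr(tau alpha) an integer, cf. the second set in (3.5)."""
    return {pq for pq, value in root_pairings(tau).items() if value.denominator == 1}

if __name__ == "__main__":
    interior = [Fraction(1, 3), Fraction(0), Fraction(-1, 3)]   # a_0 - a_2 < 1
    boundary = [Fraction(1, 2), Fraction(0), Fraction(-1, 2)]   # a_0 - a_2 = 1
    print(delta(interior) == delta0(interior))
    print(delta(boundary) - delta0(boundary))
```

In these two sample cases the two sets coincide when $a_0 - a_{N-1} < 1$ and differ by the pair of roots $\pm\phi$ when $a_0 - a_{N-1} = 1$.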
The sets of Lie algebra and group elements (3.3) with fixed $\tau$ form, respectively, the (co)adjoint orbit $\mathcal{O}_\tau \subset \mathfrak{g}$ and the conjugacy class $\mathcal{C}_\tau \subset G$. We have
$$ \mathcal{O}_\tau \cong G/G^0_\tau \quad\text{and}\quad \mathcal{C}_\tau \cong G/G_\tau \tag{3.6} $$
so that $\mathcal{O}_\tau$ and $\mathcal{C}_\tau$ are connected and simply connected. The choice of $\tau$ in the parametrizations (3.3) may be fixed if we demand that $\tau \in C_W$ or $\tau \in A_W$, respectively. Consider the open subsets $U_0 \subset \mathfrak{g}$ and $O_0 \subset G$ composed of elements of the form (3.3) for $\tau \in A_W$ such that $\mathrm{tr}\, \tau\phi < 1$. They are related by the exponential map $\mathfrak{g} \ni X \mapsto e^{2\pi i X} \in G$. Using the parametrization (3.3) it is easy to see that the exponential map is injective on $U_0$ because $G^0_\tau = G_\tau$
if $\mathrm{tr}\, \tau\phi < 1$. Indeed, the last inequality implies that $\mathrm{tr}\, \tau\alpha < 1$ for all positive roots $\alpha$. Similarly one shows that the derivative of the exponential map is invertible on $U_0$. It follows that the exponential map is a diffeomorphism between $U_0$ and $O_0$. Composing the latter with the homotopy $(t, X) \mapsto tX$ of $U_0$ and using the Poincaré Lemma, one may obtain a 2-form $B_0$ on $O_0$ such that $dB_0 = H$. Explicitly, in the parametrization (3.3),
$$ B_0(\gamma e^{2\pi i \tau} \gamma^{-1}) = Q(\gamma e^{2\pi i \tau} \gamma^{-1}) + i\, \mathrm{tr}\, \tau (\gamma^{-1} d\gamma)^2 , \tag{3.7} $$
where
$$ Q(\gamma e^{2\pi i \tau} \gamma^{-1}) = \frac{1}{4\pi}\, \mathrm{tr}\, (\gamma^{-1} d\gamma)\, e^{2\pi i \tau} (\gamma^{-1} d\gamma)\, e^{-2\pi i \tau} . \tag{3.8} $$
These 2-forms will be the building blocks for the local data of a gerbe on $G$ with curvature $H$ for $G = SU(N)$. The group $SU(N)$ has rank $r = N - 1$. It is simply laced so that $\alpha_i = \alpha_i^\vee$ and $\lambda_i = \lambda_i^\vee$. For the Cartan subalgebra composed of the diagonal $su(N)$ matrices, we may take
$$ \alpha_i = \mathrm{diag}(0, \dots, 1, -1, \dots, 0) , \tag{3.9a} $$
$$ \lambda_i = \mathrm{diag}\Big( \tfrac{N-i}{N}, \dots, \tfrac{N-i}{N}, \tfrac{-i}{N}, \dots, \tfrac{-i}{N} \Big) \tag{3.9b} $$
with 1 and the last $\frac{N-i}{N}$ at the $(i-1)$th place counting from zero. The highest root is $\phi = \mathrm{diag}(1, 0, \dots, 0, -1)$, the Kac labels are $k_i^\vee = 1$ and the dual Coxeter number is equal to $N$. The center of $SU(N)$ is composed of the elements $z_i = e^{2\pi i \lambda_i}$ for $i = 0, 1, \dots, r$, where we set $\lambda_0 = 0$. Let us consider the sets $O_i = z_i O_0 \subset SU(N)$. We may define 2-forms $B_i$ on $O_i$ by the pullback of $B_0$ from $O_0$:
$$ B_i(g) = B_0(z_i^{-1} g) . \tag{3.10} $$
Clearly, $dB_i = H$. If $g = \gamma e^{2\pi i \tau} \gamma^{-1}$ then $z_i^{-1} g = \gamma e^{2\pi i (\tau - \lambda_i)} \gamma^{-1}$. For each $i$ there is an element $w_i$ in the normalizer $N(T) \subset G$ of the Cartan subgroup $T \subset G$ such that if $\tau \in A_W$ then also $w_i (\tau - \lambda_i) w_i^{-1} \equiv \sigma_i(\tau)$ is in $A_W$. Explicitly, the element $w_i$ induces the Weyl group transformations
$$ \mathrm{diag}(a_0, \dots, a_r) \mapsto w_i\, \mathrm{diag}(a_0, \dots, a_r)\, w_i^{-1} = \mathrm{diag}(a_i, \dots, a_r, a_0, \dots, a_{i-1}) \tag{3.11} $$
and $\sigma_i(\lambda_j) = \lambda_{[j-i]}$, where $[j-i] = (j-i) \bmod N$. It follows that
$$ z_i^{-1} \gamma e^{2\pi i \tau} \gamma^{-1} = \gamma e^{2\pi i (\tau - \lambda_i)} \gamma^{-1} = \gamma w_i^{-1} e^{2\pi i \sigma_i(\tau)} w_i \gamma^{-1} . \tag{3.12} $$
Substituting into (3.7) and (3.8), we obtain Bi (γe2πiτ γ −1 ) = Q(γe2πiτ γ −1 ) + i tr(τ − λi )(γ −1 dγ)2 .
(3.13)
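The relation $\sigma_i(\lambda_j) = \lambda_{[j-i]}$ can be checked by hand from (3.9b) and (3.11). The following Python sketch, which is an illustration and not part of the original construction, does this with exact rational arithmetic. It assumes that diagonal Cartan elements are stored as tuples of their entries; the function names are hypothetical.

```python
from fractions import Fraction


def fundamental_weight(i, N):
    """Diagonal entries of lambda_i as in (3.9b); lambda_0 = 0."""
    return tuple([Fraction(N - i, N)] * i + [Fraction(-i, N)] * (N - i))


def sigma(i, tau, N):
    """sigma_i(tau) = w_i (tau - lambda_i) w_i^{-1}, using the cyclic shift (3.11)."""
    lam = fundamental_weight(i, N)
    shifted = [t - l for t, l in zip(tau, lam)]
    # (a_0, ..., a_r) -> (a_i, ..., a_r, a_0, ..., a_{i-1})
    return tuple(shifted[i:] + shifted[:i])


def check_sigma_on_weights(N):
    """True if sigma_i(lambda_j) = lambda_{[j-i]} for all i, j in 0..N-1."""
    return all(
        sigma(i, fundamental_weight(j, N), N)
        == fundamental_weight((j - i) % N, N)
        for i in range(N)
        for j in range(N)
    )
```

For instance, `check_sigma_on_weights(5)` runs over all 25 pairs $(i, j)$ for $SU(5)$.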
The sets $O_i$ are composed of group elements $g$ such that $\tau \in A_W$ and $\operatorname{tr}\tau\alpha_i > 0$ in the parametrization (3.3). For such $\tau$, $G_\tau = G^0_{\tau-\lambda_i}$. In terms of the eigenvalues
of the unitary matrices $g$ given by the entries of $e^{2\pi i\tau} = \operatorname{diag}(e^{2\pi i a_0}, \dots, e^{2\pi i a_r})$ such that $a_0 \geq \cdots \geq a_r$ and $a_0 - a_r \leq 1$, the sets $O_i$ are defined by the inequality $a_i > a_{i+1}$ and $O_0$ by $a_0 - a_r < 1$. Clearly, $G = \bigcup_{i=0,1,\dots,r} O_i$. In the first step in the construction of the gerbe on $SU(N)$ with curvature $H$ we set

$$Y = \bigsqcup_{i=0,1,\dots,r} O_i \quad\text{and}\quad B|_{O_i} = B_i . \tag{3.14}$$

As discussed before, the fiber products $Y^{[n]} = \bigsqcup O_{i_1\cdots i_n}$. To continue the construction of the gerbe, note that on the intersections $O_{ij}$ that form $Y^{[2]}$ (with $i, j = 0, 1, \dots$),

$$(B_j - B_i)(\gamma e^{2\pi i\tau}\gamma^{-1}) = -i\operatorname{tr}\lambda_{ij}(\gamma^{-1}d\gamma)^2 , \tag{3.15}$$

where $\lambda_{ij} \equiv \lambda_j - \lambda_i$. The expression on the right hand side coincides with the one for the Kirillov–Kostant symplectic form $F_{\lambda_{ij}}$ on the (co)adjoint orbit $\mathcal{O}_{\lambda_{ij}}$ passing through $\lambda_{ij}$. More exactly, there is a map $O_{ij} \ni g \to \rho_{ij}(g) \in \mathcal{O}_{\lambda_{ij}}$ such that

$$B_j - B_i = \rho_{ij}^* F_{\lambda_{ij}} . \tag{3.16}$$

This map is defined as follows. For $g \in O_{ij}$, there exist two Lie algebra elements $X_i, X_j \in U_0$ such that $z_i^{-1}g = e^{2\pi i X_i}$ and $z_j^{-1}g = e^{2\pi i X_j}$. Then $\rho_{ij}(g) = X_i - X_j$. Explicitly, if $g = \gamma e^{2\pi i\tau}\gamma^{-1}$ then, as may be seen from (3.12), $X_i = \gamma w_i^{-1}\sigma_i(\tau)w_i\gamma^{-1} = \gamma(\tau - \lambda_i)\gamma^{-1}$ and similarly for $X_j$. Hence

$$\rho_{ij}(\gamma e^{2\pi i\tau}\gamma^{-1}) = \gamma\lambda_{ij}\gamma^{-1} . \tag{3.17}$$
Another way to see that the map (3.17) is well defined is to check that if $\operatorname{tr}\tau\alpha_i > 0$ and $\operatorname{tr}\tau\alpha_j > 0$ for $\tau \in A_W$ then the isotropy subgroup $G_\tau$ necessarily is contained in the isotropy subgroup $G^0_{\lambda_{ij}}$. Since for $i < j$

$$\lambda_{ij} = \operatorname{diag}\Big(\underbrace{\tfrac{i-j}{N}, \dots, \tfrac{i-j}{N}}_{i\ \text{times}},\ \underbrace{\tfrac{N+i-j}{N}, \dots, \tfrac{N+i-j}{N}}_{(j-i)\ \text{times}},\ \underbrace{\tfrac{i-j}{N}, \dots, \tfrac{i-j}{N}}_{(N-j)\ \text{times}}\Big) , \tag{3.18}$$

the isotropy subgroup $G^0_{\lambda_{ij}}$ is composed of block matrices $\gamma_0$ that preserve the subspace $V_{ij} \subset \mathbb{C}^N$ of vectors with vanishing first $i$ and last $N - j$ coordinates and its orthogonal complement. The coadjoint orbit $\mathcal{O}_{\lambda_{ij}} \cong G/G^0_{\lambda_{ij}}$ may be identified with the Grassmannian $Gr_{ij}$ of $(j-i)$-dimensional subspaces $\gamma(V_{ij})$ in $\mathbb{C}^N$ with $\gamma \in SU(N)$.

The Kirillov–Kostant theory [32, 33] provides an explicit construction of a hermitian line bundle $L_\lambda$ over the coadjoint orbit $\mathcal{O}_\lambda$ with connection of curvature $F_\lambda$, provided that $\lambda$ is a weight, which holds for $\lambda = \lambda_{ij}$. The bundle is obtained by dividing the trivial line bundle over $G$ by the equivalence relation

$$L_\lambda = (G \times \mathbb{C})/\underset{\lambda}{\sim} , \tag{3.19}$$
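where

$$(\gamma, \zeta) \underset{\lambda}{\sim} (\gamma\gamma_0, \chi_\lambda(\gamma_0)^{-1}\zeta) \tag{3.20}$$

for $\gamma_0 \in G^0_\lambda$. We shall denote the corresponding equivalence classes by $[\gamma, \zeta]_\lambda$. Above, $\chi_\lambda : G^0_\lambda \to U(1)$ stands for the group homomorphism (character) such that

$$\partial_t|_{t=0}\,\chi_\lambda(e^{itX_0}) = i\operatorname{tr}\lambda X_0 \quad\text{for}\quad X_0 \in \mathfrak{g}^0_\lambda . \tag{3.21}$$

Existence of $\chi_\lambda$ is guaranteed if $\lambda$ is a weight. The formula

$$\nabla = d + \operatorname{tr}\lambda(\gamma^{-1}d\gamma) \tag{3.22}$$

defines a connection in the trivial bundle over $G$ that descends to the bundle $L_\lambda$. The curvature of that connection is equal to $F_\lambda = -i\operatorname{tr}\lambda(\gamma^{-1}d\gamma)^2$. For $i < j$,

$$\chi_{\lambda_{ij}}(\gamma_0) = \det(\gamma_0|_{V_{ij}}) . \tag{3.23}$$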
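On the Cartan subgroup, (3.23) can be compared directly with the defining property (3.21). For a traceless diagonal $X_0 = \operatorname{diag}(x_0, \dots, x_r)$ one has $\det(e^{itX_0}|_{V_{ij}}) = \exp(it\sum_{l=i}^{j-1} x_l)$, so (3.21) amounts to $\operatorname{tr}\lambda_{ij}X_0 = \sum_{l=i}^{j-1} x_l$. The Python sketch below, an illustration with hypothetical function names, verifies this identity from the explicit form (3.18). It only tests diagonal elements, not the full isotropy subalgebra.

```python
from fractions import Fraction


def lambda_ij(i, j, N):
    """Diagonal entries of lambda_ij = lambda_j - lambda_i for i < j, as in (3.18)."""
    return (
        [Fraction(i - j, N)] * i
        + [Fraction(N + i - j, N)] * (j - i)
        + [Fraction(i - j, N)] * (N - j)
    )


def trace_pairing(i, j, x):
    """tr(lambda_ij X0) for the diagonal matrix X0 = diag(x)."""
    N = len(x)
    return sum(l * xl for l, xl in zip(lambda_ij(i, j, N), x))


def block_trace(i, j, x):
    """Trace of X0 restricted to V_ij, i.e. over coordinates i, ..., j-1."""
    return sum(x[i:j])


def check_character(x):
    """True if tr(lambda_ij X0) equals the block trace for all i < j (x traceless)."""
    N = len(x)
    return all(
        trace_pairing(i, j, x) == block_trace(i, j, x)
        for i in range(N)
        for j in range(i + 1, N + 1)
    )
```

The tracelessness of $X_0$ is what removes the contribution of the common shift $\frac{i-j}{N}$ in (3.18).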
We shall also use an alternative description of the bundle $L_{\lambda_{ij}}$. Let, for $i < j$, $E_{ij}$ be the tautological vector bundle over the Grassmannian $Gr_{ij}$ whose fiber at $\gamma(V_{ij})$ is this very subspace in $\mathbb{C}^N$. Clearly, $E_{ij}$ is a $(j-i)$-dimensional subbundle of the trivial bundle $Gr_{ij} \times \mathbb{C}^N$ from which it inherits the hermitian structure. It may be equipped with the connection

$$\nabla = P_{\gamma(V_{ij})}\, d \tag{3.24}$$

where $P_{\gamma(V_{ij})} = \gamma P_{V_{ij}}\gamma^{-1}$ denotes the orthogonal projection in $\mathbb{C}^N$ on $\gamma(V_{ij})$. The line bundle $L_{\lambda_{ij}}$ may be identified with the top exterior power $\wedge^{j-i}E_{ij}$ of the bundle $E_{ij}$ by the mapping

$$[\gamma, \zeta]_{\lambda_{ij}} \mapsto \zeta\,\gamma e_i \wedge \cdots \wedge \gamma e_{j-1} , \tag{3.25}$$

where $e_l$, $l = 0, 1, \dots, r$, are vectors of the canonical basis of $\mathbb{C}^N$. It is easy to see that this mapping is compatible with the equivalence relation (3.20) and that it preserves the connection if we equip $\wedge^{j-i}E_{ij}$ with the one inherited from $E_{ij}$.

We may perform now the next step in the construction of the gerbe $\mathcal{G} = (Y, B, L, \mu)$ on $SU(N)$ with $Y$ and $B$ given by (3.14). We shall define the line bundle $L$ with connection over $Y^{[2]} = \bigsqcup O_{ij}$ by

$$L|_{O_{ij}} := \rho_{ij}^* L_{\lambda_{ij}} . \tag{3.26}$$

Equation (3.16) guarantees that the curvature of $L$ satisfies requirement (2.18). We still have to construct the isomorphism $\mu$ providing $L$ with the groupoid structure. It will be given by the isomorphisms between the bundles on the triple intersections $O_{ijk}$

$$\mu_{ijk} : \rho_{ij}^* L_{\lambda_{ij}} \otimes \rho_{jk}^* L_{\lambda_{jk}} \to \rho_{ik}^* L_{\lambda_{ik}} . \tag{3.27}$$

We may assume that $i < j < k$. Then the isomorphism $\mu_{ijk}$ is determined by the natural map

$$(\gamma e_i \wedge \cdots \wedge \gamma e_{j-1}) \otimes (\gamma e_j \wedge \cdots \wedge \gamma e_{k-1}) \mapsto \gamma e_i \wedge \cdots \wedge \gamma e_{j-1} \wedge \gamma e_j \wedge \cdots \wedge \gamma e_{k-1} \tag{3.28}$$

and the associativity (2.20) becomes obvious.
This ends the construction of the gerbe $\mathcal{G} = (Y, B, L, \mu)$ on the special unitary group $SU(N)$. Since $H^2(G, U(1)) = \{1\}$ for simply connected groups, $\mathcal{G}$ is, up to stable isomorphism, a unique gerbe on $SU(N)$ with curvature $H$ given by (3.2). The tensor powers $\mathcal{G}^k = (Y, kB, L^k, \mu^k)$ of $\mathcal{G}$ for $k \in \mathbb{Z}$ give the gerbes on $SU(N)$ with curvature $kH$, again unique up to stable isomorphism. In particular, $\mathcal{G}^{-1}$ is the gerbe dual to $\mathcal{G}$ ($\mu^{-1}$ is the inverse of the transpose of $\mu$).

4. Gerbes on Groups Covered by SU(N)

Let us consider now the case of non-simply connected groups $G'$, quotients of simply connected groups $G$ by a subgroup $Z$ of their center. The closed 3-form $H$ of (3.2) descends from $G$ to $G'$ to a 3-form $H'$ and we shall be interested in the gerbes on $G'$ with curvature proportional to $H'$. We shall restrict ourselves to the case when $G = SU(N)$ and $G' = SU(N)/Z$, where $Z$ is a cyclic group of order $N'$ such that $N = N'N''$. More explicitly, $Z = \{z_a \mid N'' \text{ divides } a\}$. As was shown in [19], the forms $\frac{1}{2\pi}kH'$ on $SU(N)/Z$ are integral for even $k$ if $N'$ is even and $N''$ is odd and for integer $k$ in the other cases. In the present section, we shall construct the corresponding gerbes $\mathcal{G}'_k$ on $G'$. Not surprisingly in view of the discussion in [19, Appendix 1], the construction reduces to solving a simple cohomological problem in the group cohomology of $Z \cong \mathbb{Z}_{N'}$, see Appendix A for a brief summary on discrete group cohomology. The resulting gerbe will still be unique up to stable isomorphism since $H^2(G', U(1)) = \{1\}$ in the case at hand.

We shall take $Y' = \bigsqcup O_i$ with $O_i$ the open subsets of $SU(N)$ constructed before. $\pi'$ will be the natural projection from $Y'$ on the quotient group $SU(N)/Z$. For the curving of the gerbe $\mathcal{G}'_k$, we shall take the 2-form $B'$ equal to $kB_i$ on $O_i$, see (3.10). We have

$$Y'^{[n]} = \{((g, i), (z_{a_1}^{-1}g, i'_1), \dots, (z_{a_{n-1}}^{-1}g, i'_{n-1})) \mid z_{a_m} \in Z\} , \tag{4.1}$$

for $g \in O_{i i_1 \cdots i_{n-1}}$, where $i_m = [i'_m + a_m]$. We may then identify

$$Y'^{[n]} \cong \bigsqcup_{a_1, \dots, a_{n-1}}\ \bigsqcup_{i, i_1, \dots, i_{n-1}} O_{i i_1 \cdots i_{n-1}} . \tag{4.2}$$
The hermitian line bundle $L'$ with connection over $O_{ij} \subset Y'^{[2]}$ should have the curvature

$$p_2^*(kB_{j'}) - p_1^*(kB_i) = kB_j - kB_i = -ik\operatorname{tr}\lambda_{ij}(\gamma^{-1}d\gamma)^2 \tag{4.3}$$

in the parametrization $g = \gamma e^{2\pi i\tau}\gamma^{-1}$. We set

$$L'|_{O_{ij}} = \rho_{ij}^* L^k_{\lambda_{ij}} \tag{4.4}$$

where $L_{\lambda_{ij}}$ is the line bundle over the coadjoint orbit $\mathcal{O}_{\lambda_{ij}}$ described in the previous section and $\rho_{ij}(\gamma e^{2\pi i\tau}\gamma^{-1}) = \gamma\lambda_{ij}\gamma^{-1}$. Recall that the elements in $L^k_{\lambda_{ij}}$ may be viewed as equivalence classes $[\gamma, \zeta]_{k\lambda_{ij}}$, see (3.20).

In the next step we should construct the isomorphism $\mu'$ of line bundles over $Y'^{[3]}$ defining the groupoid multiplication in $L'$, see (2.19). The elements in $p_{12}^* L'$
over $g \in O_{ijl} \subset Y'^{[3]}$ with $j = [j' + a]$ and $l = [l' + b]$ are given by the classes $[\gamma, \zeta]_{k\lambda_{ij}}$. Those in $p_{13}^* L'$ by the classes $[\gamma, \zeta]_{k\lambda_{il}}$. As for the elements of $p_{23}^* L'$, they correspond to the classes $[\gamma w_a^{-1}, \zeta]_{k\lambda_{j'[l-a]}}$ since $z_a^{-1}g = \gamma w_a^{-1} e^{2\pi i\sigma_a(\tau)} w_a \gamma^{-1}$, see (3.12), and $z_b z_a^{-1} = z_{[b-a]}$. The isomorphism $\mu'$ has then to be given by

$$\mu'\big([\gamma, \zeta]_{k\lambda_{ij}} \otimes [\gamma w_a^{-1}, \zeta']_{k\lambda_{j'[l-a]}}\big) = [\gamma, u_{ijl}\,\zeta\zeta']_{k\lambda_{il}} \tag{4.5}$$

for $U(1)$-valued functions $u_{ijl}$ on $O_{ijl}$ whose dependence on $a$ and $b$ has been suppressed in the notation. These functions must be constant for $\mu'$ to preserve the connections. Note that $\mu'$ depends on the choice of matrices $w_a$ defined up to the multiplication by elements of the Cartan subgroup $T$. The latter dependence may, however, be absorbed in the choice of $u_{ijl}$. As we show in Appendix B by a direct verification, the associativity of the product defined by $\mu'$ imposes the condition

$$u_{j'[l-a][n-a]}\, u_{iln}^{-1}\, u_{ijn}\, u_{ijl}^{-1} = \chi_{k\lambda_{l'[n-b]}}(w_b w_a^{-1} w_{[b-a]}^{-1}) . \tag{4.6}$$

Upon taking $i = j' = l' = n' = 0$ and setting $u_{0ab} \equiv u_{a[b-a]}$, relation (4.6) reduces (upon the shift $b \mapsto [a+b]$, $c \mapsto [a+b+c]$) to the condition

$$u_{bc}\, u_{[a+b]c}^{-1}\, u_{a[b+c]}\, u_{ab}^{-1} = \chi_{k\lambda_c}(w_{[a+b]} w_a^{-1} w_b^{-1}) \equiv U_{abc} \tag{4.7}$$

which may be interpreted in terms of the discrete group cohomology $H^*(Z, U(1))$ with coefficients in $U(1)$, see Appendix A. The $U(1)$-valued 3-cochain $(U_{abc})$ on the cyclic group $Z$ satisfies the cocycle condition

$$U_{bcd}\, U_{[a+b]cd}^{-1}\, U_{a[b+c]d}\, U_{ab[c+d]}^{-1}\, U_{abc} = 1 \tag{4.8}$$

easy to verify with the use of the relation

$$\chi_{k\lambda_d}(w_c t w_c^{-1}) = \chi_{k\lambda_{[c+d]}}(t)\,\chi_{k\lambda_c}^{-1}(t) \tag{4.9}$$

holding for $t \in T$. The condition (4.7) requires that $(U_{abc})$ be a coboundary. This does not have to be always the case since $H^3(Z, U(1)) \cong \mathbb{Z}_{N'}$, see Appendix A. Given a solution $(u_{ab})$ of (4.7),

$$u_{ijl} = u_{a[b-a]}\,\chi_{k\lambda_{l'}}(w_b w_a^{-1} w_{[b-a]}^{-1}) \tag{4.10}$$

solves (4.6), as a straightforward check with the use of (4.9) shows.

We still have to study when (4.7) may be satisfied. In the action on the vectors of the canonical bases of $\mathbb{C}^N$, the matrices $w_a$ take the form $w_a e_l = u_a e_{[l-a]}$, where $u_a$ are diagonal matrices such that $\det(u_a) = (-1)^{a(N-a)}$ assuring that $\det(w_a) = 1$. In particular, we may take $u_a$ proportional to the unit matrix:

$$u_a = \begin{cases} 1 & \text{for } N' \text{ odd or } N'' \text{ even} , \\ (-1)^{\frac{N''}{N'} a'(N'-a')} & \text{for } N' \text{ even and } N'' \text{ odd} , \end{cases} \tag{4.11}$$

where $a = a'N''$ (here and below, $(-1)^x \equiv e^{\pi i x}$). For that choice,

$$w_{[a+b]}\, w_a^{-1}\, w_b^{-1} = \begin{cases} 1 \\ (-1)^{\frac{N''}{N'} m_{a'b'}} , \end{cases} \tag{4.12}$$
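respectively, for $m_{a'b'} = -2a'b'$ and the associativity condition (4.7) becomes

$$u_{bc}\, u_{[a+b]c}^{-1}\, u_{a[b+c]}\, u_{ab}^{-1} = \begin{cases} 1 , \\ (-1)^{\frac{k(N'')^2}{N'} m_{a'b'} c'} , \end{cases} \tag{4.13}$$

and may be solved by taking

$$u_{ab} = \begin{cases} 1 & \text{for } N' \text{ odd or } N'' \text{ even} , \\ (-1)^{-\frac{k(N'')^2}{N'} a'(N'-a') b'} & \text{for } N' \text{ even, } N'' \text{ odd and } k \text{ even} . \end{cases} \tag{4.14}$$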
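The second line of (4.14) can be tested against the second line of (4.13) by brute force. The Python sketch below is an illustration, not part of the original argument. It assumes that the reduced labels $a', b', c'$ are taken in $\{0, \dots, N'-1\}$, stores each phase $(-1)^x$ through its rational exponent $x$, and compares exponents modulo 2; the names `u_exponent` and `solves_associativity` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def u_exponent(a, b, k, N1, N2):
    """Exponent x of u_ab = (-1)^x = exp(i*pi*x), second line of (4.14).

    a, b are the reduced labels a', b' in {0, ..., N1 - 1}; N1 = N', N2 = N''.
    """
    return -Fraction(k * N2**2, N1) * a * (N1 - a) * b


def solves_associativity(k, N1, N2):
    """Check (4.13) with right-hand side (-1)^{k N''^2 m c' / N'}, m = -2 a' b'."""
    K = Fraction(k * N2**2, N1)
    for a, b, c in product(range(N1), repeat=3):
        lhs = (
            u_exponent(b, c, k, N1, N2)
            - u_exponent((a + b) % N1, c, k, N1, N2)
            + u_exponent(a, (b + c) % N1, k, N1, N2)
            - u_exponent(a, b, k, N1, N2)
        )
        rhs = K * (-2 * a * b) * c
        # (-1)^x depends on x only modulo 2
        if (lhs - rhs) % 2 != 0:
            return False
    return True
```

With $N' = 4$, $N'' = 1$, $k = 2$ the check passes, while for $N' = 2$, $N'' = 1$, $k = 1$ it fails already at $a' = b' = c' = 1$. The failure only shows that this particular candidate does not work for odd $k$; it is not by itself a proof that no solution exists.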
There is no solution for $N'$ even and $N''$ and $k$ odd. This ends the construction of the gerbes $\mathcal{G}'_k$ on $SU(N)/Z$ with curvature $kH'$ for all the values of $k$ where the latter is an integral 3-form. Clearly, $\mathcal{G}'_k \cong \mathcal{G}'^{\,k}_1$ for $N'$ odd or $N''$ even and $\mathcal{G}'_k \cong \mathcal{G}'^{\,k/2}_2$ for $N'$ even and $N''$ odd.

5. Gerbes on Discrete Quotients

The above construction provides an illustration of a more general one of gerbes on spaces of orbits of a discrete group. The general case, that we shall briefly discuss in the present section which is somewhat parenthetical with respect to the main course of the exposition, sheds more light on the appearance of discrete group cohomology (the so called "discrete torsion" [45]), as was noticed first in [42].

Suppose that $\mathcal{G}$ is a gerbe on $M$ with curvature $H$ and that a finite group $\Gamma$ acts on $M$ preserving $H$. We may ask the question if the action preserves $\mathcal{G}$ in the sense that for each $\gamma \in \Gamma$ the gerbe $\gamma\mathcal{G} = (Y_\gamma, B, L, \mu)$, where $Y_\gamma = Y$ as the space but has the projection on $M$ replaced by $\gamma \circ \pi$, is stably isomorphic to $\mathcal{G}$. Recall that this means that there exists a hermitian line bundle $N^\gamma$ over $Z_\gamma = Y \times_M Y_\gamma$ with connection of curvature $F^\gamma$ such that

$$\sigma_\gamma^* B = \sigma^* B + F^\gamma \tag{5.1}$$

with $\sigma, \sigma_\gamma$ denoting the projections from $Z_\gamma$ to $Y$ and $Y_\gamma$, and that there exists an isomorphism

$$\iota^\gamma : \sigma^{[2]*} L \otimes p_1^*(N^\gamma)^{-1} \otimes p_2^* N^\gamma \to \sigma_\gamma^{[2]*} L \tag{5.2}$$

of hermitian line bundles with connection that preserves the groupoid multiplication. In particular,

$$L_{(y_1, y_2)} \otimes (N^\gamma)^{-1}_{(y_1, y'_1)} \otimes N^\gamma_{(y_2, y'_2)} \xrightarrow{\ \iota^\gamma\ } L_{(y'_1, y'_2)} \tag{5.3}$$

if $\pi(y_1) = \pi(y_2) = \gamma\pi(y'_1) = \gamma\pi(y'_2)$. For $\gamma = 1$ one may take $N^1 = L$ and $\iota^1$ defined by $\mu$.

Suppose now that $\Gamma$ acts without fixed points so that $M/\Gamma$ is non-singular. We would like to construct a gerbe $\mathcal{G}_\Gamma = (Y_\Gamma, B_\Gamma, L_\Gamma, \mu_\Gamma)$ on $M/\Gamma$ with the curvature equal to the projection $H'$ of $H$. We shall set $Y_\Gamma = Y$ with the projection $\pi_\Gamma$ on
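$M/\Gamma$ given by the composition of $\pi : Y \to M$ with the canonical projection on the quotient space. The curving $B_\Gamma$ will be taken equal to $B$. Note that

$$Y_\Gamma^{[2]} = \bigsqcup_{\gamma \in \Gamma} Z_\gamma . \tag{5.4}$$

Let us take

$$L_\Gamma|_{Z_\gamma} = N^\gamma \otimes \pi_1^* P^\gamma , \tag{5.5}$$

where $P^\gamma$ is a flat line bundle on $M$ and $\pi_1(y, y') = \pi(y)$. Relation (5.1) assures then that the curvature of $L_\Gamma$ is related to the curving by (2.18). We still have to define the groupoid multiplication $\mu_\Gamma$ in $L_\Gamma$ that is an isomorphism of line bundles over

$$Y_\Gamma^{[3]} = \bigsqcup_{\gamma_1, \gamma_2 \in \Gamma} \{(y, y', y'') \in Y^3 \mid \pi(y) = \gamma_1\pi(y'),\ \pi(y') = \gamma_2\pi(y'')\} . \tag{5.6}$$

On the $(\gamma_1, \gamma_2)$ component of $Y_\Gamma^{[3]}$,

$$\mu_\Gamma : p_{12}^* N^{\gamma_1} \otimes \pi_1^* P^{\gamma_1} \otimes p_{23}^* N^{\gamma_2} \otimes \pi_2^* P^{\gamma_2} \to p_{13}^* N^{\gamma_1\gamma_2} \otimes \pi_1^* P^{\gamma_1\gamma_2} , \tag{5.7}$$

where $\pi_n = \pi \circ p_n$. A necessary condition for existence of $\mu_\Gamma$ is that the bundle

$$p_{12}^*(N^{\gamma_1})^{-1} \otimes p_{23}^*(N^{\gamma_2})^{-1} \otimes p_{13}^* N^{\gamma_1\gamma_2} \equiv \tilde{R}^{\gamma_1, \gamma_2} \tag{5.8}$$

be isomorphic to a pullback $\pi_1^* R^{\gamma_1, \gamma_2}$ of a flat bundle over $M$. That this condition is fulfilled may be seen the following way. First note that the map $\iota^\gamma$ of (5.3) defines isomorphisms

$$(N^\gamma)_{(y_1, y'_1)} \to N^\gamma_{(y_2, y'_2)} \otimes L_{(y_1, y_2)} \otimes L^{-1}_{(y'_1, y'_2)} . \tag{5.9}$$

Combining the latter with the groupoid multiplication in $L$ one obtains canonical isomorphisms between the fibers of the bundle $\tilde{R}^{\gamma_1, \gamma_2}$ over the triples $(y_1, y'_1, y''_1)$ and $(y_2, y'_2, y''_2)$ in $Y_\Gamma^{[3]}$ with the same projections on $M$. Such isomorphisms define the descent data for the bundle $\tilde{R}^{\gamma_1, \gamma_2}$, see the discussion around (2.21). The existence of canonical bundle $R^{\gamma_1, \gamma_2}$ and of the canonical isomorphism $\tilde{R}^{\gamma_1, \gamma_2} \cong \pi_1^* R^{\gamma_1, \gamma_2}$ follows then by the descent principle. Besides, there exists a canonical isomorphism

$$R^{\gamma_1, \gamma_2} \otimes R^{\gamma_1\gamma_2, \gamma_3} \cong R^{\gamma_1, \gamma_2\gamma_3} \otimes (\gamma_1^{-1})^* R^{\gamma_2, \gamma_3} , \tag{5.10}$$

as may be easily seen on the level of $\tilde{R}$ bundles.

To construct isomorphisms $\mu_\Gamma$ of (5.7) becomes then equivalent to specifying a family of isomorphisms $\iota^{\gamma_1, \gamma_2}$ of flat bundles over $M$

$$\iota^{\gamma_1, \gamma_2} : P^{\gamma_1} \otimes (\gamma_1^{-1})^* P^{\gamma_2} \to P^{\gamma_1\gamma_2} \otimes R^{\gamma_1, \gamma_2} . \tag{5.11}$$

The associativity of $\mu_\Gamma$ becomes the condition

$$\iota^{\gamma_1\gamma_2, \gamma_3}\, \iota^{\gamma_1, \gamma_2} = \iota^{\gamma_1, \gamma_2\gamma_3}\, (\gamma_1^{-1})^* \iota^{\gamma_2, \gamma_3} \tag{5.12}$$

for isomorphisms between the bundles $P^{\gamma_1} \otimes (\gamma_1^{-1})^* P^{\gamma_2} \otimes (\gamma_1^{-1})^*(\gamma_2^{-1})^* P^{\gamma_3}$ and the target bundles $P^{\gamma_1\gamma_2\gamma_3} \otimes R^{\gamma_1, \gamma_2} \otimes R^{\gamma_1\gamma_2, \gamma_3}$ and $P^{\gamma_1\gamma_2\gamma_3} \otimes R^{\gamma_1, \gamma_2\gamma_3} \otimes (\gamma_1^{-1})^* R^{\gamma_2, \gamma_3}$ naturally identified due to (5.10).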
Given a family $\iota^{\gamma_1, \gamma_2}$ of isomorphisms (5.11) such that (5.12) holds, we may obtain another family by multiplying $\iota^{\gamma_1, \gamma_2}$ by a $U(1)$-valued 2-cocycle $u_{\gamma_1, \gamma_2}$ satisfying

$$u_{\gamma_1, \gamma_2}\, u_{\gamma_1\gamma_2, \gamma_3}\, u_{\gamma_1, \gamma_2\gamma_3}^{-1}\, u_{\gamma_2, \gamma_3}^{-1} = 1 . \tag{5.13}$$

The new family gives another solution for associative $\mu_\Gamma$. The coboundary choice $u_{\gamma_1, \gamma_2} = v_{\gamma_1\gamma_2}\, v_{\gamma_1}^{-1}\, v_{\gamma_2}^{-1}$ with $U(1)$-valued $v_\gamma$ leads to an isomorphic gerbe on $M/\Gamma$ whereas the choices leading to non-trivial elements of $H^2(\Gamma, U(1))$ may give stably non-isomorphic gerbes. The construction of gerbes $\mathcal{G}'_k$ on $SU(N)/Z$ in the preceding section is an illustration of the general procedure described here, as we explain in detail in Appendix C.

Although above we have assumed that $\Gamma$ acts without fixed points on $M$, the above construction still goes through for general orbifolds provided that we redefine the spaces $Y_\Gamma^{[n]}$ as

$$Y_\Gamma^{[n]} := \{(y, y_1, \gamma_1, \dots, y_{n-1}, \gamma_{n-1}) \mid \pi(y) = \gamma_m\pi(y_m)\} , \tag{5.14}$$

i.e. keeping track of $\gamma_m \in \Gamma$ (which could be recovered from the $y_m$'s for free action of $\Gamma$). The resulting "orbifold gerbes" provide a natural tool for the treatment of strings on orbifolds in the background of closed 3-forms, see also [43].
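That every coboundary $u_{\gamma_1, \gamma_2} = v_{\gamma_1\gamma_2} v_{\gamma_1}^{-1} v_{\gamma_2}^{-1}$ satisfies the cocycle condition (5.13) can be confirmed mechanically. The Python sketch below is an illustration under simplifying assumptions: $\Gamma$ is taken to be the cyclic group $\mathbb{Z}_n$ written additively (the text allows any finite group), and a phase $e^{2\pi i x}$ is stored as the rational number $x$ modulo 1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def coboundary(v, n):
    """Phase exponents of u_{g1,g2} = v_{g1 g2} v_{g1}^{-1} v_{g2}^{-1} on Z_n.

    v[g] is a rational number x standing for the phase exp(2*pi*i*x).
    """
    return {
        (g1, g2): (v[(g1 + g2) % n] - v[g1] - v[g2]) % 1
        for g1, g2 in product(range(n), repeat=2)
    }


def is_two_cocycle(u, n):
    """Check (5.13): u_{g1,g2} u_{g1 g2,g3} u_{g1,g2 g3}^{-1} u_{g2,g3}^{-1} = 1."""
    return all(
        (
            u[(g1, g2)]
            + u[((g1 + g2) % n, g3)]
            - u[(g1, (g2 + g3) % n)]
            - u[(g2, g3)]
        ) % 1 == 0
        for g1, g2, g3 in product(range(n), repeat=3)
    )
```

Any list `v` of `n` rational numbers can be passed to `coboundary`; the result always passes `is_two_cocycle`, whereas a generic assignment of phases to pairs does not.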
6. Bundle Gerbes and Loop-Space Line Bundles

The construction of the amplitudes $\mathcal{A}(\phi)$ described in Sec. 2.2 with the use of the local data may be easily translated to the language of gerbes. In particular, the construction of an isomorphism class of hermitian line bundles with connection over the loop space $LM$ from a class $w \in W(M, H)$ may be lifted to a canonical assignment of a hermitian line bundle with connection to a gerbe on $M$. In the present subsection, we shall describe those constructions that gain in simplicity when formulated with use of gerbes.

Let $\mathcal{G} = (Y, B, L, \mu)$ be a gerbe on $M$ with curvature $H$ and, as before, $\sigma_i : O_i \to Y$ be local sections. Let $\phi$ be a map from a compact surface $\Sigma$ to $M$. For a sufficiently fine triangulation of $\Sigma$ and a label assignment $(c, b) \mapsto (i_c, i_b)$ such that $\phi(c) \subset O_{i_c}$ and $\phi(b) \subset O_{i_b}$, let us set

$$\phi_c = \sigma_{i_c} \circ \phi|_c , \qquad \phi_b = \sigma_{i_b} \circ \phi|_b , \qquad \phi_{cb} = \sigma_{i_c i_b} \circ \phi|_b , \tag{6.1}$$

the latter for $b \subset c$. These are lifts to $Y$ or to $Y^{[2]}$ of restrictions of $\phi$ to the elementary cells. Denoting by $\mathcal{H}(\cdot)$ the parallel transport in $L$, we define:

$$\mathcal{A}(\phi) = \exp\Big[ i \sum_c \int_c \phi_c^* B \Big] \bigotimes_{b \subset c} \mathcal{H}(\phi_{cb}) . \tag{6.2}$$
Fig. 1.
Since $\mathcal{H}(\phi_{cb}) \in \bigotimes_{v \in \partial b} L_{(y_c, y_b)}$, where $y_c = \phi_c(v)$ and $y_b = \phi_b(v)$ (with the convention that the dual fiber is taken if $v$ is the beginning of $b$),

$$\mathcal{A}(\phi) \in \bigotimes_{v \in b \subset c} L_{(y_c, y_b)} . \tag{6.3}$$

The point is that if $\partial\Sigma = \emptyset$ then there is a canonical isomorphism, defined by the gerbe multiplication $\mu$, between the line in (6.3) and the complex line $\mathbb{C}$ so that the amplitude $\mathcal{A}(\phi)$ may be naturally interpreted as a number. Indeed, fixing a vertex $v$ and going around as in Fig. 1, we gather the contribution

$$L_{(y_{b_1}, y_{c_1})} \otimes L_{(y_{c_1}, y_{b_2})} \otimes \cdots \otimes L_{(y_{c_n}, y_{b_1})} \tag{6.4}$$

to the line in (6.3) which is trivialized by subsequent application of $\mu$. That the result does not depend on where we start numbering the cells may be seen by choosing $y_v \in Y$ with $\pi(y_v) = \phi(v)$ and inserting $L_{(y_{b_r}, y_v)} \otimes L_{(y_v, y_{b_r})} \cong \mathbb{C}$ at every second place in the chain (6.4). We may now use $\mu$ to trivialize the blocks

$$L_{(y_v, y_{b_r})} \otimes L_{(y_{b_r}, y_{c_r})} \otimes L_{(y_{c_r}, y_{b_{r+1}})} \otimes L_{(y_{b_{r+1}}, y_v)} . \tag{6.5}$$

It is easy to check that the number obtained for $\mathcal{A}(\phi)$ coincides with the one defined by the expression (2.14) with the use of the local data obtained from the sections $\sigma_i$ and $s_{ij}$. We give the proof in Appendix D. From the results of [25], it follows now that $\mathcal{A}(\phi)$ does not depend on the choices of the local sections $\sigma_i$, of the lifts $\phi_c$, $\phi_b$ and $\phi_{cb}$, nor of the triangulation of $\Sigma$. Additionally, $\mathcal{A}(\phi)$ is invariant under the composition of $\phi$ with orientation-preserving diffeomorphisms of $\Sigma$ and it goes to its inverse for diffeomorphisms reversing the orientation.

Suppose now that $\partial\Sigma = \bigsqcup \ell_s$. Consider the same expression (6.2). Proceeding as before, we may canonically reduce the line in (6.3) to

$$\bigotimes_s \bigotimes_{v \in b \subset \ell_s} L_{(y_v, y_b)} , \tag{6.6}$$
Fig. 2.
see Fig. 2 which replaces Fig. 1 in the boundary situation. Let, for a closed loop $\varphi : \ell \to M$ and a sufficiently fine split of $\ell$,

$$\mathcal{L}_\varphi = \bigotimes_{v \in b \subset \ell} L_{(y_v, y_b)} \tag{6.7}$$

with $y_v \in Y$ such that $\pi(y_v) = \varphi(v)$, and $y_b = \varphi_b(v)$, where $\varphi_b$ lifts $\varphi|_b$ to $Y$. Let us show that the lines (6.7) are canonically isomorphic for different choices of $y_v$ and $\varphi_b$. Let $\varphi'_b$ and $y'_v$, $y'_b = \varphi'_b(v)$ be another choice. Note that the parallel transport in $L$ along $(\varphi_b, \varphi'_b) : b \to Y^{[2]}$ defines a canonical trivialization of the line $\bigotimes_{v \in b \subset \ell} L_{(y_b, y'_b)}$. Similarly, the line $\bigotimes_{v \in b \subset \ell} L_{(y'_v, y_v)}$ is canonically trivial since each factor is accompanied by its dual. Using also the product $\mu$, we obtain a chain of canonical isomorphisms

$$\bigotimes_{v \in b \subset \ell} L_{(y_v, y_b)} \cong \bigotimes_{v \in b \subset \ell} \big(L_{(y'_v, y_v)} \otimes L_{(y_v, y_b)} \otimes L_{(y_b, y'_b)}\big) \cong \bigotimes_{v \in b \subset \ell} L_{(y'_v, y'_b)} . \tag{6.8}$$

Associativity of $\mu$ assures that the resulting isomorphisms are transitive so that we may free ourselves from the choice of local lifts in the definition of $\mathcal{L}_\varphi$ by passing to equivalence classes of elements related by the isomorphisms (6.8). Similarly, if we pass to a finer split of $\ell$ and use the restrictions of maps $\varphi_b$ to the new intervals, setting also $y_v = \varphi_b(v)$ for new vertices in the interior of the old intervals, then the net result on $\mathcal{L}_\varphi$ is to add trivial factors on the right hand side of (6.7). Dropping them is compatible with isomorphisms (6.8). In order to lose memory of the split used in (6.7), one may then define the projective limit $\mathcal{L}(\mathcal{G})_\varphi$ over trivializations of the lines obtained for fixed trivializations. All in all, we obtain this way a canonical hermitian line bundle $\mathcal{L}(\mathcal{G})$ over the loop space $LM$. Note that, by construction, $\mathcal{L}(\mathcal{G})_\varphi$ is invariant under orientation-preserving reparametrizations of $\ell$ and that the change of orientation gives rise to the dual line.

Comparing the lines (6.6) and (6.7) we infer that if $\partial\Sigma = \bigsqcup \ell_s$ then the amplitude (2.14) may be canonically defined as an element of the product of lines of $\mathcal{L}(\mathcal{G})$:

$$\mathcal{A}(\phi) \in \bigotimes_s \mathcal{L}(\mathcal{G})_{\phi|_{\ell_s}} . \tag{6.9}$$

The hermitian line bundle $\mathcal{L}(\mathcal{G})$ may be equipped with a (hermitian) connection such that the parallel transport along the curve in the loop space $LM$ defined by $\phi : [0, 1] \times \ell \to M$ is given by $\mathcal{A}(\phi)$. The curvature of this connection is equal to
the 2-form $\Omega$ on $LM$ defined in (2.16). The amplitudes $\mathcal{A}(\phi)$ for arbitrary surfaces $\Sigma$ provide a generalization of the parallel transport in the loop space.

How do the line bundle $\mathcal{L}(\mathcal{G})$ and the amplitude $\mathcal{A}(\phi)$ depend on the gerbe? First, the line bundles $\mathcal{L}(\mathcal{G})$ and $\mathcal{L}(\sigma^*\mathcal{G})$, where $\sigma^*\mathcal{G}$ is a pullback gerbe, are canonically isomorphic. Second, an isomorphism between gerbes $\mathcal{G}_1$ and $\mathcal{G}_2$ induces an isomorphism of the bundles $\mathcal{L}(\mathcal{G}_1)$ and $\mathcal{L}(\mathcal{G}_2)$. Third, the line bundle $\mathcal{L}(\mathcal{G}_1 \otimes \mathcal{G}_2)$ is canonically isomorphic to $\mathcal{L}(\mathcal{G}_1) \otimes \mathcal{L}(\mathcal{G}_2)$. Finally, for a trivial gerbe $\mathcal{G}_N$,

$$\mathcal{L}_\varphi = \bigotimes_{v \in b \subset \ell} \big(N^{-1}_{y_v} \otimes N_{y_b}\big) \cong \bigotimes_{v \in b \subset \ell} N_{y_b} \cong \mathbb{C} \tag{6.10}$$

where the last isomorphism is given by the parallel transport in $N$ along $\varphi_b$. It follows that a stable isomorphism between gerbes induces an isomorphism of the corresponding line bundles over $LM$. If the gerbe $\mathcal{G}$ is constructed from the local data then $\mathcal{L}(\mathcal{G})$ is canonically isomorphic to the line bundle $\mathcal{L}$ over $LM$ constructed from the local data described in Sec. 10.1. In the language of the trivialization (6.10), the isomorphism between the lines $\mathcal{L}_\varphi$ corresponding to the isomorphic trivial gerbes $\mathcal{G}_{N'}$ and $\mathcal{G}_N$ for $N' \cong N \otimes \pi^* P$ with $P$ a flat bundle on $M$ is given by the multiplication by the holonomy of $P$ along $\varphi$. It follows that the change of stable isomorphism between two gerbes obtained by composition with the isomorphism between the trivial gerbes $\mathcal{G}_N$ and $\mathcal{G}_{N'}$ multiplies the isomorphism of the line bundles over $LM$ by the holonomy of $P$.

7. Gerbes and Branes

We have shown in the previous section that, given a gerbe $\mathcal{G} = (Y, B, L, \mu)$ on $M$ of curvature $H$, the formal amplitudes $e^{i\int \phi^* d^{-1}H}$ of classical fields $\phi : \Sigma \to M$ defined on the worldsheet $\Sigma$ with boundary may be given sense as elements in the tensor product of lines of the line bundle $\mathcal{L}(\mathcal{G})$ canonically associated to $\mathcal{G}$, see (6.9). In general, $\mathcal{L}(\mathcal{G})$ is a non-trivial bundle so that the amplitudes cannot be naturally defined as numbers. Suppose, however, that the field $\phi$ is restricted by the boundary conditions

$$\phi(\ell_s) \subset D_s \subset M , \tag{7.1}$$

forcing its values on the boundary loops $\ell_s$ of $\Sigma$ to belong to submanifolds $D_s$ of $M$. Suppose moreover that $D_s$ are chosen so that the line bundle $\mathcal{L}(\mathcal{G})$ restricted to the space $LD_s$ of loops in $D_s$ becomes trivial. Upon a choice of trivializations of $\mathcal{L}(\mathcal{G})|_{LD_s}$, the amplitude $\mathcal{A}(\phi)$ may then be assigned a numerical value. Note that it is not necessary to require that the trivializations of $\mathcal{L}(\mathcal{G})|_{LD_s}$ flatten the connection.

Let us assume that a submanifold $D \subset M$ is such that the restriction of the 3-form $H$ to $D$ is exact: $H|_D = dQ$. To the 2-form $Q$, we may associate a gerbe $\mathcal{K} = (D, Q, D \times \mathbb{C}, \cdot)$ over $D$ with curvature $H|_D$. Note that the corresponding hermitian line bundle $\mathcal{L}(\mathcal{K})$ over $LD$ is trivial but that its connection has a nontrivial curvature if $H|_D \neq 0$. A natural way to assure the triviality of the restricted
bundle $\mathcal{L}(\mathcal{G})|_{LD}$ and to provide for its trivializations is to assume that the restriction $\mathcal{G}_D = (Y_D, B_D, L_D, \mu_D)$ of the gerbe $\mathcal{G}$ to $D$, where $Y_D = \pi^{-1}(D)$, $B_D = B|_{Y_D}$ etc., is stably isomorphic to the gerbe $\mathcal{K}$. Explicitly, this means that there exist: a line bundle $N$ over $Y_D$ with connection of curvature $F$ such that

$$B_D + F = \pi_D^* Q \tag{7.2}$$

and an isomorphism

$$\iota : L|_{Y_D^{[2]}} \otimes p_1^* N^{-1} \otimes p_2^* N \to Y_D^{[2]} \times \mathbb{C} \tag{7.3}$$

of line bundles with connection over $Y_D^{[2]}$, compatible with the groupoid multiplication. By definition, a brane $\mathcal{D}$ of $\mathcal{G}$ with support $D$ and curving $Q$ is the quadruple $(D, Q, N, \iota)$. We shall consider two branes represented by collections $(D, Q, N, \iota)$ and $(D, Q, N', \iota')$ equivalent if the line bundles $N$ and $N'$ are isomorphic and $\iota$ and $\iota'$ are intertwined by the induced isomorphism of the trivial gerbes (note that such isomorphism is not unique). Non-equivalent branes with fixed support and curving correspond to $N' \cong N \otimes \pi_D^* P$, where $P$ is a non-trivial flat bundle over $D$. A choice of $N$ and $\iota$ induces canonically an isomorphism between $\mathcal{L}(\mathcal{G})|_{LD}$ and the trivial hermitian line bundle $\mathcal{L}(\mathcal{K})$, see (6.10), with equivalent choices leading to the same isomorphism. Non-equivalent choices give rise to isomorphisms differing by multiplication by holonomy in a flat line bundle $P$ over $D$.

Given a gerbe $\mathcal{G}$ with curvature $H$, we may ask which submanifolds $D$ with $H|_D = dQ$ support branes with curving $Q$. The obstructions to stable isomorphism of the gerbes $\mathcal{G}_D$ and $\mathcal{K}$ lie in the cohomology group $H^2(D, U(1))$ that acts freely and transitively on the set $W(D, H|_D)$ of stable isomorphism classes of gerbes on $D$ with curvature $H|_D$. If the obstruction vanishes, then the cohomology group $H^1(D, U(1))$ (the group of isomorphism classes of flat hermitian bundles on $D$) acts freely and transitively on the set of equivalence classes (moduli) of branes $\mathcal{D}$ with curving $Q$ supported by $D$. We shall see this at work in the next two sections.

Let $IM$ be the space of open curves (strings) $\varphi : [0, \pi] \to M$ and $\mathcal{G} = (Y, B, L, \mu)$ a gerbe on $M$ with curvature $H$. The same construction that associated to $\mathcal{G}$ a line bundle with connection over the loop space $LM$, when applied to open curves, induces a hermitian line bundle with connection $\mathcal{N}$ over the space

$$\mathcal{Y} = \{(\varphi, y_0, y_1) \in IM \times Y^2 \mid \pi(y_0) = \varphi(0),\ \pi(y_1) = \varphi(\pi)\} . \tag{7.4}$$

The line bundle $\mathcal{N}$ is composed from the fibers $\mathcal{L}_\varphi$ of (6.7), with all the identifications as before except that one has to keep the memory of $y_v = y_0$ and $y_v = y_1$ for the end point vertices. The parallel transport in $\mathcal{N}$ along a curve in $IM$ is still determined by the amplitude $\mathcal{A}(\phi)$ defined by (6.2) for $\phi : [0, 1] \times [0, \pi] \to M$. The curvature of $\mathcal{N}$ is given by the closed 2-form

$$\Omega_{IM}(\varphi, y_0, y_1) = \Omega(\varphi) + B(y_0) - B(y_1) , \tag{7.5}$$

on $\mathcal{Y}$, where $\Omega$ is defined by (2.16) with $\ell = [0, \pi]$.
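Given two branes $\mathcal{D}_0$ and $\mathcal{D}_1$ with supports $D_0$ and $D_1$ of gerbe $\mathcal{G}$, we may consider in the space $IM$ of open strings the subspace

$$I_{D_0 D_1}M = \{\varphi : [0, \pi] \to M \mid \varphi(0) \in D_0,\ \varphi(\pi) \in D_1\} . \tag{7.6}$$

A slight modification of the construction described above permits now to define over $I_{D_0 D_1}M$ a hermitian line bundle $\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}(\mathcal{G}) \equiv \mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}$ with connection by setting

$$(\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1})_\varphi = (N_0)_{y_0} \otimes \mathcal{L}_\varphi \otimes (N_1)^{-1}_{y_1} , \tag{7.7}$$

where $\mathcal{L}_\varphi$ is given by (6.7). Due to the isomorphism (7.3), the lines obtained this way are canonically isomorphic also for different choices of $y_0$ and $y_1$, giving rise upon their identification to the fibers of $\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}$. A choice of equivalent branes leads to (non-canonically) isomorphic bundles. The parallel transport in $\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}$ is determined by

$$\mathcal{A}_{\mathcal{D}_0\mathcal{D}_1}(\phi) = \mathcal{A}(\phi) \otimes \Big( \bigotimes_{b \subset \ell_s} \mathcal{H}_s(\phi_b) \Big) \tag{7.8}$$

for $\phi : [0, 1] \times [0, \pi] \to M$, where $\ell_s$ denotes the piece of the boundary of $[0, 1] \times [0, \pi]$ mapped into $D_s$ for $s = 0, 1$, $\mathcal{H}_s(\phi_b)$ stands for the parallel transport in $N_s$ along a lift $\phi_b$ of $\phi|_b$ to $Y$ and $\mathcal{A}(\phi)$ is given by (6.2). The curvature of $\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}$ is given by the 2-form on $I_{D_0 D_1}M$

$$\Omega_{\mathcal{D}_0\mathcal{D}_1} = \Omega + e_1^* Q_1 - e_0^* Q_0 , \tag{7.9}$$

where $\Omega$ is as in (7.5) and $e_s$ are the evaluation maps,

$$\varphi \xrightarrow{\ e_0\ } \varphi(0) , \qquad \varphi \xrightarrow{\ e_1\ } \varphi(\pi) . \tag{7.10}$$

More generally, suppose that $\phi : \Sigma \to M$ satisfies the boundary conditions (7.1) for $\ell_s$ being closed disjoint subintervals of the boundary loops of $\Sigma$, see Fig. 3. Then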
Fig. 3.
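the amplitude defined by (7.8) satisfies

$$\mathcal{A}_{(\mathcal{D}_s)}(\phi) \in \Big( \bigotimes_{(s, s')} (\mathcal{L}_{\mathcal{D}_s\mathcal{D}_{s'}})_{\phi|_{\ell_{(s, s')}}} \Big) \otimes \Big( \bigotimes_m \mathcal{L}_{\phi|_{\ell_m}} \Big) , \tag{7.11}$$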
where $\ell_{(s, s')}$ are the boundary intervals bordering $\ell_s$ and $\ell_{s'}$ and $\ell_m$ are the boundary loops that do not contain intervals $\ell_s$. The curves $\phi|_{\ell_{(s, s')}}$ stretch between the submanifolds $D_s$ and $D_{s'}$ and the line bundles $\mathcal{L}_{\mathcal{D}_s\mathcal{D}_{s'}}$ correspond to that geometry. The expression (7.8) generalizes the definition of the amplitude of a field to the case when local boundary conditions are imposed on pieces of the boundary of $\Sigma$.

In the quantum field theory, the amplitudes $\mathcal{A}_{\mathcal{D}_0\mathcal{D}_1}(\phi)$ are "summed" (with additional scalar weights) over all fields satisfying the boundary conditions (7.1) resulting, at least formally, in a vector in the tensor product of Hilbert spaces of states, see Sec. 10 below. To each interval $\ell_{(s, s')}$ there corresponds a factor $\mathcal{H}_{\mathcal{D}_s\mathcal{D}_{s'}}$, the Hilbert space of states of the string stretching between the branes $\mathcal{D}_s$ and $\mathcal{D}_{s'}$, and to each $\ell_m$ a factor $\mathcal{H}$, the closed string space of states. Geometrically, spaces $\mathcal{H}_{\mathcal{D}_0\mathcal{D}_1}$ are formed of sections of the corresponding line bundles $\mathcal{L}_{\mathcal{D}_0\mathcal{D}_1}$ and the space $\mathcal{H}$ of sections of $\mathcal{L}$ (more precisely, they are Hilbert space completions of spaces of sections). Even without going into the detailed construction of such spaces of states, the geometric classification of branes discussed above allows one to obtain the spectrum of branes. We shall illustrate that in the next section on the example of the $SU(N)$ WZW theory and of its versions with groups covered by $SU(N)$.

8. Branes in the WZW Model

In the WZW model, the candidate for the simplest form of the boundary condition that guarantees the conservation of half of the current algebra symmetries is to require that the values of the field $g : \Sigma \to G$ on the boundary loops $\ell_s \subset \partial\Sigma$ belong to the conjugacy classes $\mathcal{C}_{\tau_s} \subset G$. The closed 3-form $H$ of (3.2) becomes exact when restricted to a conjugacy class: $H|_{\mathcal{C}_\tau} = dQ_\tau$, where $Q_\tau$ is given by the expression (3.8) with constant $\tau$.
The preservation of the current algebra symmetries requires that one sticks to that choice (or to its multiples) for the curving of branes supported by $\mathcal{C}_\tau$, see [1] or [28]. Now it is easy to check for which conjugacy classes the restriction of the gerbe on $G$ with curvature $kH$ is stably isomorphic to the gerbe $\mathcal{K} = (\mathcal{C}_\tau, kQ_\tau, \mathcal{C}_\tau \times \mathbb{C}, \cdot)$.

8.1. SU(N) groups

Recall that for $G = SU(N)$, the gerbe $\mathcal{G}^k = (Y, kB, L^k, \mu^k)$ with curvature $kH$ (unique, up to stable isomorphism) has $Y = \bigsqcup_{i=0}^{r} O_i$. We shall choose a conjugacy class $\mathcal{C}_\tau$ and denote $Z = \pi^{-1}(\mathcal{C}_\tau) = Y_{\mathcal{C}_\tau}$, where $\pi$ is the projection from $Y$ to $SU(N)$. Thus $Z = \bigsqcup Z_i$ where $Z_i = \mathcal{C}_\tau \cap O_i$. Since the sets $O_i \subset SU(N)$ are invariant under conjugations, $Z_i$ are either empty or equal to $\mathcal{C}_\tau$. Let $\omega = \pi|_Z$ and
σ : Z → Y be the natural inclusion. We have to compare the restriction of the gerbe G k to Cτ with the pullback gerbe σ ∗ K. On Zi the difference of the two curvings kQτ − kBi |Cτ ≡ Fτ i has the form Fτ i (γe2πiτ γ −1 ) = −ki tr(τ − λi )(γ −1 dγ)2 ,
(8.1)
see (3.13). If the two gerbes $\mathcal{G}^k|_{\mathcal{C}_\tau}$ and $\mathcal{K}$ are stably isomorphic then the closed forms $F_{\tau i}$ must be curvature forms of hermitian line bundles $N|_{Z_i}$ and hence $\frac{1}{2\pi}F_{\tau i}$ must be integral. This condition is equivalent to the requirement that $k\tau$ be a weight. Indeed, in our case, $Z_i = \mathcal{C}_\tau$ may be identified with the coadjoint orbits $\mathcal{O}_{k(\tau-\lambda_i)}$ by the maps $\gamma e^{2\pi i\tau}\gamma^{-1}\mapsto\gamma k(\tau-\lambda_i)\gamma^{-1}$ since the isotropy groups satisfy $G_\tau = G^0_{\tau-\lambda_i}$. Upon this identification, $F_{\tau i}$ becomes the Kirillov–Kostant symplectic form and integrality of $\frac{1}{2\pi}F_{\tau i}$ requires that $k(\tau-\lambda_i)$ be a weight.

In the latter case, the line bundles $N|_{Z_i}$ may be taken to be the pullbacks of the Kirillov–Kostant bundles $L_{k(\tau-\lambda_i)}$ by the identification of $Z_i$ with $\mathcal{O}_{k(\tau-\lambda_i)}$. Since the conjugacy classes are simply connected, the resulting line bundle $N$ over $Z$ is unique up to isomorphism. The mapping
$$[\gamma,\zeta]_{k\lambda_{ij}}\otimes[\gamma,\zeta']^{-1}_{k(\tau-\lambda_i)}\otimes[\gamma,\zeta'']_{k(\tau-\lambda_j)}\ \overset{\iota}{\longmapsto}\ (y_1,y_2,\zeta\zeta'\zeta'') \tag{8.2}$$
for $y_1 = (g,i)$, $y_2 = (g,j)$ and $g = \gamma e^{2\pi i\tau}\gamma^{-1}\in\mathcal{C}_\tau$, which is well defined because $\chi_{k\lambda_{ij}}(\gamma_0)\,\chi^{-1}_{k(\tau-\lambda_i)}(\gamma_0)\,\chi_{k(\tau-\lambda_j)}(\gamma_0) = 1$ for $\gamma_0\in G_\tau$, determines then the unique isomorphism (7.3) that commutes with the groupoid multiplication. It provides a stable isomorphism between the gerbes $\mathcal{G}^k|_{\mathcal{C}_\tau}$ and $\mathcal{K}$. Other choices of $N$ and $\iota$ lead to equivalent branes in the present case.

We thus obtain for the SU(N) WZW model a family of branes labeled by the weights $\lambda\in kA_W$, supported by the conjugacy classes $\mathcal{C}_\tau$ with $\lambda = k\tau$. The weights in the dilated Weyl alcove $kA_W$ are called “integrable at level k” [31] and they also label the irreducible highest-weight representations of the level $k$ current algebra and the bulk primary fields of the model with conformal weights $h(\lambda) = \frac{\mathrm{tr}\,\lambda(\lambda+2\rho)}{2(k+h^\vee)}$.

8.2. Groups covered by SU(N)

We shall consider branes supported by the conjugacy classes in $G' = SU(N)/Z$ for $Z\cong\mathbb{Z}_{N'}$. Each conjugacy class in $G'$ is an image under the canonical projection from $G = SU(N)$ to $G'$ of a conjugacy class $\mathcal{C}_\tau\subset G$ with $\tau\in A_W$. Recall that the elements $z_i$ of the center of $SU(N)$ act on $A_W$ by $\tau\mapsto\sigma_i(\tau)$. The conjugacy classes $\mathcal{C}_{\sigma_a(\tau)}$ in $G$ for different $z_a\in Z$ project to the same class in $G'$. This way the conjugacy classes in $G'$ may be labeled by the $Z$-orbits $[\tau]$ of elements in $A_W$. We shall denote the class in $G'$ corresponding to $[\tau]$ by $\mathcal{C}'_{[\tau]}$. For any $\tau\in[\tau]$, $\mathcal{C}'_{[\tau]}$ may be canonically identified with $\mathcal{C}_\tau/Z_\tau$, where $Z_\tau$ is the subgroup of $Z$ leaving $\tau$ unchanged (it depends only on the orbit $[\tau]$). The 3-form $H'$ restricted to $\mathcal{C}'_{[\tau]}$ still satisfies $H'|_{\mathcal{C}'_{[\tau]}} = dQ'_\tau$, with $Q'_\tau$ denoting the projection of $Q_\tau$ to the quotient space $\mathcal{C}_\tau/Z_\tau$ and defining a 2-form on $\mathcal{C}'_{[\tau]}$ that does not depend on $\tau\in[\tau]$. We obtain then the gerbe $\mathcal{K}' = (\mathcal{C}'_{[\tau]}, kQ'_\tau, \mathcal{C}'_{[\tau]}\times\mathbb{C}, \cdot\,)$ on $\mathcal{C}'_{[\tau]}$.
Let us consider the space $Z' = \pi'^{-1}(\mathcal{C}'_{[\tau]}) = \bigsqcup_{\tau\in[\tau]}\bigsqcup_i Z_{\tau i}$, with $Z_{\tau i} = \mathcal{C}_\tau\cap O_i$. Let $\sigma': Z'\to Y'$ be the inclusion map and $\omega' = \pi'|_{Z'}$. We have to compare the restriction to $\mathcal{C}'_{[\tau]}$ of the gerbe $\mathcal{G}'_k = (Y', B', L', \mu')$ on $G'$ constructed in Sec. 4 to the pullback gerbe $\sigma'^*\mathcal{K}'$. Over $Z_{\tau i}$ the difference of the curvings is $F_{\tau i}$, see (8.1), and the existence of the stable isomorphism between the two gerbes requires again that $k\tau$ be a weight, similarly as in the simply connected case. Let $N'$ denote the line bundle over $Z'$ that over $Z_{\tau i}$ coincides with the pullback of the Kirillov–Kostant bundle $L_{k(\tau-\lambda_i)}$ by the identification of $Z_{\tau i}$ with the coadjoint orbit $\mathcal{O}_{k(\tau-\lambda_i)}$. In order to construct the primed version of the bundle isomorphism (7.3), let us consider the pairs $(y_1,y_2)\in Z'^{[2]}$ with
$$y_1 = (g,i),\qquad y_2 = (z_a^{-1}g, j'),\qquad g = \gamma e^{2\pi i\tau}\gamma^{-1},\qquad z_a^{-1}g = \gamma w_a^{-1}e^{2\pi i\sigma_a(\tau)}w_a\gamma^{-1}$$
and the mapping
$$[\gamma,\zeta]_{k\lambda_{ij}}\otimes[\gamma,\zeta']^{-1}_{k(\tau-\lambda_i)}\otimes[\gamma w_a^{-1},\zeta'']_{k(\sigma_a(\tau)-\lambda_{j'})}\ \overset{\iota'}{\longmapsto}\ (y_1,y_2,v_{\tau,a}\,\zeta\zeta'\zeta''), \tag{8.3}$$
where $j = [j'+a]$ and $v_{\tau,a}\in U(1)$. Note that $\iota'$ is well defined since the isotropy subgroups satisfy $G_\tau = G^0_{\tau-\lambda_i} = G^0_{\tau-\lambda_j} = w_a^{-1}G^0_{\sigma_a(\tau)-\lambda_{j'}}w_a\subset G^0_{\lambda_{ij}}$ and for $\gamma_0\in G_\tau$, the product $\chi_{k\lambda_{ij}}(\gamma_0)\,\chi^{-1}_{k(\tau-\lambda_i)}(\gamma_0)\,\chi_{k(\sigma_a(\tau)-\lambda_{j'})}(w_a\gamma_0w_a^{-1})$ is equal to 1 since $\chi_{k(\sigma_a(\tau)-\lambda_{j'})}(w_a\gamma_0w_a^{-1}) = \chi_{k(\tau-\lambda_j)}(\gamma_0)$. We have to choose $v_{\tau,a}$ so that the isomorphism $\iota'$ of hermitian line bundles with connections preserves also the groupoid multiplication. Let
$$V_{\tau,ab} = \chi_{k\sigma_{[a+b]}(\tau)}(w_{[a+b]}w_a^{-1}w_b^{-1})\,u_{ab}, \tag{8.4}$$
where $(u_{ab})$, a solution of (4.7), enters via (4.10) the definition of the groupoid multiplication $\mu'$, see (4.5). As we prove in Appendix E, the requirement to preserve the groupoid multiplication imposes the cohomological relation
$$V_{\tau,ab} = v_{\sigma_a(\tau),b}\,v^{-1}_{\tau,[a+b]}\,v_{\tau,a}. \tag{8.5}$$
In Appendix F, we show that the 2-cochain $(V_{\tau,ab})$ on $Z\cong\mathbb{Z}_{N'}$ with values in the group $U(1)^{[\tau]}$ of $U(1)$-valued functions on the $Z$-orbit $[\tau]$ is a 2-cocycle, i.e. that
$$V_{\sigma_a(\tau),bc}\,V^{-1}_{\tau,[a+b]c}\,V_{\tau,a[b+c]}\,V^{-1}_{\tau,ab} = 1. \tag{8.6}$$
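As a numerical illustration (not part of the original argument), the following sketch checks that any 2-cochain of the coboundary form (8.5) automatically satisfies the cocycle condition (8.6). The model is an assumption made only for the example: $Z$ is taken to be the cyclic group of order `n`, the orbit $[\tau]$ is modeled by `m` points with `m` dividing `n`, and $\sigma_a$ acts by `t -> (t + a) % m`. The function names are hypothetical.

```python
import cmath
import random


def random_cochain(n, m, seed=0):
    """Random U(1)-valued 1-cochain v[t][a], t in the orbit (size m), a in Z_n."""
    rng = random.Random(seed)
    return [[cmath.exp(2j * cmath.pi * rng.random()) for _ in range(n)]
            for _ in range(m)]


def coboundary(v, n, m):
    """V[t][a][b] = v[sigma_a(t)][b] * v[t][a+b]^{-1} * v[t][a], as in (8.5).

    Toy action (an assumption): sigma_a(t) = (t + a) % m with m dividing n.
    """
    return [[[v[(t + a) % m][b] / v[t][(a + b) % n] * v[t][a]
              for b in range(n)] for a in range(n)] for t in range(m)]


def max_cocycle_defect(V, n, m):
    """Largest deviation from 1 of the left hand side of (8.6)."""
    worst = 0.0
    for t in range(m):
        for a in range(n):
            for b in range(n):
                for c in range(n):
                    lhs = (V[(t + a) % m][b][c] / V[t][(a + b) % n][c]
                           * V[t][a][(b + c) % n] / V[t][a][b])
                    worst = max(worst, abs(lhs - 1))
    return worst
```

For instance, `max_cocycle_defect(coboundary(random_cochain(6, 3), 6, 3), 6, 3)` should vanish up to rounding errors. The converse statement, that every cocycle is such a coboundary, is the cohomological input taken from Appendix A and is not tested here.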
Equation (8.5) requires that it be a coboundary, i.e. that it defines a trivial element in the cohomology group $H^2(Z,U(1)^{[\tau]})$. This always holds since $H^2(Z,U(1)^{[\tau]}) = \{1\}$, see Appendix A.

The multiplication of a solution of (8.5) by 1-cocycles $(v'_{\tau,a})$ satisfying the relation $v'_{\sigma_a(\tau),b}\,v'^{-1}_{\tau,[a+b]}\,v'_{\tau,a} = 1$ gives all other solutions. Solutions differing by 1-boundaries $v''_{\sigma_a(\tau)}v''^{-1}_{\tau}$ lead to equivalent branes and the set of equivalence classes of branes supported by $\mathcal{C}'_{[\tau]}$ forms a $H^1(Z,U(1)^{[\tau]})$-torsor. Since $H^1(Z,U(1)^{[\tau]})\cong Z_\tau$, see Appendix A, and $Z_\tau\cong H^1(\mathcal{C}'_{[\tau]},U(1))$ and describes the moduli of flat line bundles on $\mathcal{C}'_{[\tau]}$, this agrees with the general result about the classification of branes, see Sec. 7.

Let $Z_\tau$ be a cyclic subgroup of $Z$ of order $n'$ and let $n''$ and $m''$ be such that $n'n'' = N$, $n'm'' = N'$ and $n'' = m''N''$. Explicitly,
$$Z_\tau = \{z_a \mid a = a''n''\ \ \text{for}\ \ a'' = 0,1,\dots,n'-1\}. \tag{8.7}$$
In order to generate all classes in $H^1(Z,U(1)^{[\tau]})$ it is enough to take
$$v'_{\tau,a} = (-1)^{\frac{2ar}{N}}, \tag{8.8}$$
i.e. $\tau$-independent and equal to the characters of $Z$. Besides, $r$ above may be restricted to integers between 0 and $n'-1$ since there exists $(v''_\tau)$ such that $(-1)^{\frac{2an'}{N}} = (-1)^{\frac{2a'}{m''}} = v''_{\sigma_a(\tau)}v''^{-1}_{\tau}$.

It remains to describe explicitly a single solution of (8.5). Let $\tau_0\in[\tau]$. Note that
$$k\tau_0 = \sum_{i'=0}^{n''-1}n_{i'}\sum_{a''=0}^{n'-1}\lambda_{i'+a''n''} \tag{8.9}$$
with $\sum_{i'=0}^{n''-1}n_{i'} = \frac{k}{n'}$ so that $n'$ has to divide $k$. The complicated case is when $N'$ is even and $N''$ is odd and we shall deal with it first. Here, for $\frac{k}{n'}$ odd, $n'$ must be even since $k$ is necessarily even. Let us choose the elements $w_a\in SU(N)$ inducing the Weyl group transformations as at the end of Sec. 4. Let for $u\in U(1)$,
$$\chi_\lambda(u) = u^{\sum_{i=0}^{N-1}i\,n_i}\quad\text{if}\quad\lambda = \sum_i n_i\lambda_i, \tag{8.10a}$$
$$\psi(a,b) = \chi^{-1}_{k\lambda_a}(u_b)\cdot\begin{cases}1 & \text{for } \frac{k}{n'} \text{ even},\\[2pt] (-1)^{-\frac{ab}{n''}} & \text{for } \frac{k}{n'} \text{ odd},\end{cases} \tag{8.10b}$$
where $u_a\in U(1)$ are given by (4.11). The first formula may be viewed as extending characters $\chi_\lambda$ to constant diagonal $U(N)$-matrices. The following properties of $\psi(a,b)$ are straightforward to verify:
$$\psi(a+n'',b) = \psi(a,b+N) = \psi(a,b),\qquad\psi(0,b) = 1, \tag{8.11a}$$
$$\psi([a+b],c) = \psi(a,c)\,\psi(b,c), \tag{8.11b}$$
$$\psi(a,[b+c])^{-1}\,\psi(a,b)\,\psi(a,c) = \chi_{k\lambda_a}(u_{[b+c]}u_b^{-1}u_c^{-1}). \tag{8.11c}$$
To each fixed weight $\lambda_0 = k\tau_0$ with $\tau_0\in[\tau]$, we may assign a solution $(v^{\lambda_0}_{\tau,a})$ of (8.5) given by
$$v^{\lambda_0}_{\tau_0,a} = \chi^{-1}_{k\tau_0}(u_a)\,\chi_{k\lambda_a}(u_a)\cdot\begin{cases}1 & \text{for } \frac{k}{n'} \text{ even},\\[2pt] (-1)^{\frac{a^2}{2n''}} & \text{for } \frac{k}{n'} \text{ odd},\end{cases} \tag{8.12a}$$
$$v^{\lambda_0}_{\sigma_c(\tau_0),a} = \psi(c,a)^{-1}\,v^{\lambda_0}_{\tau_0,a}. \tag{8.12b}$$
We show in Appendix G that $(v^{\lambda_0}_{\tau,a})$ indeed solves (8.5) for $N'$ even and $N''$ odd. For $N'$ odd or $N''$ even, since $V_{\tau,ab} = 1$, we may take the trivial solution of (8.5) $v^{\lambda_0}_{\tau,a} = 1$ (with a superfluous dependence on $\lambda_0$).

As we mentioned above, a general solution of (8.5) is obtained by multiplying the particular solution $(v^{\lambda_0}_{\tau,a})$ by $(v'_{\tau,a})$ of (8.8). The bundle isomorphisms $\iota'$ obtained this way determine stable isomorphisms between the restriction of gerbe $\mathcal{G}'_k$ to $\mathcal{C}'_{[\tau]}$ and gerbe $\mathcal{K}'$. Consequently, they determine the branes $(\mathcal{C}'_{[\tau]}, kQ'_\tau, N', \iota')$ supported by the conjugacy class $\mathcal{C}'_{[\tau]}$ in $SU(N)/Z$. As discussed in Sec. 7, using the above structure, one may define the Wess–Zumino amplitudes (7.11) for the fields $\phi:\Sigma\to SU(N)/Z$ satisfying boundary conditions (7.1) on subintervals of the boundary.

Note that the solutions $(v_{\tau,a})$ of (8.5) differing by 1-coboundaries $(v''_{\sigma_a(\tau)}v''^{-1}_{\tau})$ and giving rise to equivalent branes coincide for $a\in Z_\tau$. Conversely, two solutions coinciding on $Z_\tau$ necessarily differ by a 1-coboundary $(v''_{\sigma_a(\tau)}v''^{-1}_{\tau})$. Indeed, they differ by a 1-cocycle $(v'_{\tau,a})$ such that $v'_{\tau,a} = 1$ for $a\in Z_\tau$. Setting $v''_{\tau_0} = 1$ for fixed $\tau_0\in[\tau]$ and $v''_{\sigma_a(\tau_0)} = v'_{\tau_0,a}$ assures then that $v'_{\tau,a} = v''_{\sigma_a(\tau)}v''^{-1}_{\tau}$. Hence there is a one-to-one correspondence between the moduli of branes supported by $\mathcal{C}'_{[\tau]}$ and the restrictions of the solutions $(v_{\tau,a})$ of (8.5) to $a\in Z_\tau$. The latter may be taken as products of the restrictions to $Z_\tau$ of the special solutions $(v^{\lambda_0}_{\tau,a})$ assigned to $\lambda_0 = k\tau_0$ with $\tau_0\in[\tau]$ by characters $\psi_{\lambda_0}$ of $Z_\tau$ given by the right hand side of (8.8) with $a\in Z_\tau$. As follows from (8.12), two pairs $(\lambda_0,\psi_{\lambda_0})$ and $(\lambda'_0,\psi'_{\lambda'_0})$ give rise to the same restricted solution if
$$\lambda'_0 = {}^b\lambda_0\quad\text{and}\quad\psi'_{\lambda'_0}(a) = \phi_{\lambda_0}(b,a)\,\psi_{\lambda_0}(a), \tag{8.13}$$
with ${}^b\lambda_0 = k\sigma_b^{-1}(\tau_0)$ and
$$\phi_{\lambda_0}(b,a) = \psi(b,a)\,v^{\lambda_0}_{\tau_0,a}/v^{\lambda'_0}_{\tau'_0,a}. \tag{8.14}$$
Note that for any $b\in Z$, $\phi_{\lambda_0}(b,a)$ must be a character of $Z_\tau$ in its dependence on $a$ and that it satisfies a cocycle condition $\phi_{\lambda_0}(b,a)\,\phi_{{}^b\lambda_0}(c,a) = \phi_{\lambda_0}([b+c],a)$.

The upshot of the above discussion is that the set of moduli of symmetric branes in the $SU(N)/Z$ WZW theory may be identified with the set of equivalence classes $[\lambda_0,\psi_{\lambda_0}]$ where $\lambda_0$ runs through the integrable weights and $\psi_{\lambda_0}$ through the characters of $Z_{\tau_0}$ for $\lambda_0 = k\tau_0$, with the equivalence relation given by (8.13). This description of branes, obtained here from the Lagrangian considerations, agrees with the description of symmetric branes in simple current extension conformal field theories conjectured in [22, 44]. The general classification of the branes proposed there, based on consistency considerations, involves equivalence classes of primary fields and characters of their “central stabilizers” that, for the SU(N) WZW theory, reduce to the ordinary stabilizers $Z_\tau$ in the simple current group $Z$. The cocycle $\phi_{\lambda_0}(b,a)$ is not unique. If we multiply the special solution $(v^{\lambda_0}_{\tau,a})$ by a $\lambda_0$-dependent character $\rho_{\lambda_0}(a)$ of $Z$, then
$$\phi_{\lambda_0}(b,a)\ \mapsto\ \phi_{\lambda_0}(b,a)\,\rho_{\lambda_0}(a)/\rho_{{}^b\lambda_0}(a). \tag{8.15}$$
As we show in Appendix H, upon an appropriate choice of $\rho_{\lambda_0}(a)$, $\phi_{\lambda_0}(b,a)$ may be reduced to 1. In other words, it is possible to choose the solution $(v^{\lambda_0}_{\tau,a})$ so that, when restricted to $Z_\tau$, it does not depend on $\lambda_0$.

9. Partition Functions

Among the elementary quantum amplitudes of the WZW model are the partition functions. We shall describe them here in the simplest geometries: those of a torus for the bulk theory and of an annulus for the boundary one, relating in the latter case the Lagrangian description with the use of gerbes to what was known from previous work. Although we shall concentrate on the example of the WZW model based on groups covered by SU(N), the general picture should be similar for other WZW models.

9.1. Bulk case

The toroidal level $k$ partition functions are formally given by the functional integral over the toroidal amplitudes
$$Z(\tau) = \int e^{-S_\sigma(\phi)}A(\phi)\,D\phi, \tag{9.1}$$
where fields $\phi$ map the torus $T_\tau = \mathbb{C}/(2\pi\mathbb{Z}+\tau\mathbb{Z})$ with the modular parameter $\tau$ from the upper half plane^a to the group $G$, the sigma model action functional
$$S_\sigma(\phi) = \frac{k}{4\pi i}\int\mathrm{tr}\,(\phi^{-1}\partial\phi)(\phi^{-1}\bar\partial\phi), \tag{9.2}$$
and the amplitude $A(\phi)$ is obtained with the use of the gerbe on $G$ with curvature $kH$. In the Hamiltonian language,
$$Z(\tau) = \mathrm{tr}_{\mathcal{H}}\,e^{2\pi i\tau(L_0-\frac{c}{24})-2\pi i\bar\tau(\bar L_0-\frac{c}{24})}, \tag{9.3}$$
where $\mathcal{H}$ is the closed-string Hilbert space composed of sections of the bundle $L$ over the loop group $LG$, operators $L_0$, $\bar L_0$ are the Virasoro generators and $c = \frac{k\dim(G)}{k+h^\vee}$ is the Virasoro central charge of the theory. For the connected, simply-connected simple compact groups,
$$\mathcal{H}\cong\bigoplus_\lambda\hat V_\lambda\otimes\hat V_\lambda, \tag{9.4}$$
where the sum is over the integrable weights, i.e. such that $\lambda\in kA_W$, and $\hat V_\lambda$ carries the unitary level $k$ irreducible representation of the current algebra $\hat{\mathbf{g}}$ associated with group $G$ and the related action of the Virasoro algebra given by the Sugawara construction.

^a The modular parameter, for which we use the traditional notation, should not be confused with the Weyl alcove elements also denoted by $\tau$ in the present paper.

Integrable weights $\lambda$ label the primary fields of the model with the fusion rule
$$\lambda_0 * \lambda_1 = \sum_\lambda N_{\lambda_0\lambda_1}^{\ \lambda}\,\lambda. \tag{9.5}$$
The decomposition (9.4) implies that
$$Z(\tau) = \sum_\lambda|\hat\chi_\lambda(\tau)|^2, \tag{9.6}$$
where $\hat\chi_\lambda(\tau) = \mathrm{tr}_{\hat V_\lambda}\,e^{2\pi i\tau(L_0-\frac{c}{24})}$ are the (restricted) level $k$ affine characters of $\hat{\mathbf{g}}$ satisfying
$$\hat\chi_\lambda(\tau) = e^{-2\pi i h(\lambda)}\,\hat\chi_\lambda(\tau+1) = \sum_{\lambda'}S_\lambda^{\lambda'}\,\hat\chi_{\lambda'}\Big(-\frac{1}{\tau}\Big), \tag{9.7}$$
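where $h(\lambda)$ is the conformal weight of the primary field corresponding to $\lambda$ and $S_\lambda^{\lambda'} = S_{\lambda'}^{\lambda} = \overline{S_{\bar\lambda}^{\lambda'}}$ are the elements of a unitary modular matrix $S$ which enter the Verlinde formula for the fusion coefficients:
$$N_{\lambda_0\lambda_1}^{\ \lambda} = \sum_{\lambda'}\frac{S_{\lambda_0}^{\lambda'}\,S_{\lambda_1}^{\lambda'}\,\overline{S_{\lambda}^{\lambda'}}}{S_0^{\lambda'}}. \tag{9.8}$$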
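For orientation, the Verlinde formula (9.8) can be evaluated numerically in the simplest case $G = SU(2)$. The sketch below is an illustration under stated assumptions, not part of the original text: it uses the standard level $k$ modular matrix of $SU(2)$, $S_{ab} = \sqrt{2/(k+2)}\,\sin\big(\pi(a+1)(b+1)/(k+2)\big)$ with labels $a, b = 0,\dots,k$ (twice the spin), and compares the result with the known truncated Clebsch–Gordan rule. The function names are hypothetical.

```python
import math


def su2_S(k):
    """Modular S-matrix of SU(2) at level k; labels are n = 0..k (twice the spin)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def verlinde(k):
    """Fusion coefficients N[a][b][c] computed from the Verlinde formula (9.8)."""
    S = su2_S(k)
    r = range(k + 1)
    # For SU(2) the matrix S is real and symmetric, so the complex
    # conjugation in (9.8) plays no role here.
    return [[[round(sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in r))
              for c in r] for b in r] for a in r]


def su2_fusion_rule(k, a, b, c):
    """Known SU(2) level-k rule: 1 if c appears in the fusion of a and b, else 0."""
    return int((a + b + c) % 2 == 0
               and abs(a - b) <= c <= min(a + b, 2 * k - a - b))
```

Comparing `verlinde(k)[a][b][c]` with `su2_fusion_rule(k, a, b, c)` for all labels gives a quick consistency check that the coefficients are nonnegative integers of the expected form.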
The toroidal partition functions for all connected non-simply connected simple compact groups were first obtained in [19]. For $G' = SU(N)/Z$ with $Z\cong\mathbb{Z}_{N'}$, the integer level $k\ge0$ is restricted by the condition of integrality of the 3-form $\frac{k}{2\pi}H'$ which reads
$$\frac{kN'}{2}\,\mathrm{tr}\,\lambda_{N''}^2 = \frac{1}{2}\,kN''(N'-1)\in\mathbb{Z}. \tag{9.9}$$
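The condition (9.9) is easy to tabulate. The following sketch is only an illustration: it assumes the standard value $\mathrm{tr}\,\lambda_j^2 = j(N-j)/N$ for the fundamental weights of $SU(N)$ (with long roots of length squared 2), checks that the two sides of (9.9) agree, and lists which levels are admissible. It also tests when $h(J) = \frac{k}{2}\mathrm{tr}\,\lambda_{N''}^2$ is itself an integer, the case referred to below as a pure simple current extension. The function names are hypothetical.

```python
from fractions import Fraction


def tr_fund_weight_sq(N, j):
    """tr(lambda_j^2) = j (N - j) / N for the j-th fundamental weight of SU(N)."""
    return Fraction(j * (N - j), N)


def level_allowed(k, N_prime, N_dprime):
    """Integrality condition (9.9) for G' = SU(N)/Z_{N'} with N = N' N''."""
    N = N_prime * N_dprime
    lhs = Fraction(k * N_prime, 2) * tr_fund_weight_sq(N, N_dprime)
    rhs = Fraction(k * N_dprime * (N_prime - 1), 2)
    assert lhs == rhs  # the two expressions in (9.9) agree identically
    return lhs.denominator == 1


def is_pure_extension(k, N_prime, N_dprime):
    """True when h(J) = (k/2) tr(lambda_{N''}^2) is an integer."""
    N = N_prime * N_dprime
    return (Fraction(k, 2) * tr_fund_weight_sq(N, N_dprime)).denominator == 1
```

For $SU(2)/\mathbb{Z}_2$ (`N_prime = 2`, `N_dprime = 1`) this selects the even levels, with $h(J)$ integral exactly when $k$ is a multiple of 4; for $SU(3)/\mathbb{Z}_3$ every level passes the test.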
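The closed string Hilbert space
$$\mathcal{H}'\cong\bigoplus_{z_a\in Z}\ \bigoplus_{\lambda\in C_a}\hat V_\lambda\otimes\hat V_{{}^a\lambda}, \tag{9.10}$$
where, as before, for integrable weights $\lambda = k\tau = \sum_{i=0}^{N-1}n_i\lambda_i$ with $\sum_i n_i = k$, the transformed weight
$${}^a\lambda = k\sigma_a^{-1}(\tau) = \sum_i n_i\lambda_{[i+a]}, \tag{9.11}$$
and where
$$C_a = \Big\{\lambda\ \Big|\ \mathrm{tr}\,\lambda\lambda_{N''}+\frac{ka}{2N''}\,\mathrm{tr}\,\lambda_{N''}^2\in\mathbb{Z}\Big\} = \Big\{\lambda\ \Big|\ -\frac{\sum_i i\,n_i}{N'}+\frac{ka(N'-1)}{2N'}\in\mathbb{Z}\Big\}. \tag{9.12}$$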
Consequently, the partition function
$$Z'(\tau) = \sum_{z_a\in Z}\ \sum_{\lambda\in C_a}\hat\chi_\lambda(\tau)\,\overline{\hat\chi_{{}^a\lambda}(\tau)} = Z'(\tau+1) = Z'\Big(-\frac{1}{\tau}\Big). \tag{9.13}$$
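The modular invariance stated in (9.13) can be checked numerically for $G' = SU(2)/\mathbb{Z}_2$. The sketch below is an illustration under stated assumptions rather than part of the original text: it uses the standard $SU(2)$ level $k$ matrices $S$ and $T$ (with $h_n = n(n+2)/(4(k+2))$ and $c = 3k/(k+2)$), labels weights by $n = n_1 = 0,\dots,k$ so that $\sum_i i\,n_i = n$, and takes ${}^1\lambda$ to be $k-n$. Writing $Z' = \sum_{n,m}M_{nm}\,\hat\chi_n\overline{\hat\chi_m}$, invariance amounts to $SM = MS$ and $T_nM_{nm}\overline{T_m} = M_{nm}$. The function names are hypothetical.

```python
import cmath
import math


def su2_S(k):
    """Modular S-matrix of SU(2) at level k; labels n = 0..k (twice the spin)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def su2_T(k):
    """Diagonal entries exp(2 pi i (h_n - c/24)) of the modular T-matrix."""
    c = 3.0 * k / (k + 2)
    return [cmath.exp(2j * math.pi * (n * (n + 2) / (4.0 * (k + 2)) - c / 24))
            for n in range(k + 1)]


def in_C(k, a, n):
    """Membership in C_a of (9.12) for SU(2)/Z_2: -n/2 + k a/4 must be an integer."""
    return (-2 * n + k * a) % 4 == 0


def mass_matrix(k):
    """M[n][m] = multiplicity of chi_n * conj(chi_m) in Z'(tau) of (9.13)."""
    M = [[0] * (k + 1) for _ in range(k + 1)]
    for a in (0, 1):
        for n in range(k + 1):
            if in_C(k, a, n):
                M[n][k - n if a else n] += 1
    return M


def invariance_defect(k):
    """Largest violation of S M = M S and of T_n M_nm conj(T_m) = M_nm."""
    S, T, M = su2_S(k), su2_T(k), mass_matrix(k)
    r = range(k + 1)
    s_defect = max(abs(sum(S[i][l] * M[l][j] - M[i][l] * S[l][j] for l in r))
                   for i in r for j in r)
    t_defect = max(abs(M[i][j] * (T[i] * T[j].conjugate() - 1))
                   for i in r for j in r)
    return max(s_defect, t_defect)
```

The defect vanishes up to rounding for even $k$, where (9.9) holds, and is of order one for odd $k$, where the set $C_1$ is empty and the truncated sum is not invariant.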
One obtains this way a family of modular invariant sesquilinear combinations of characters $\hat\chi_\lambda(\tau)$, for example for the case of the SU(2) group, the A and D series in the ADE classification [8] of modular invariants. The above expressions for the toroidal partition functions coincide with the ones for the “simple current extensions” [34] of the SU(N) WZW theory by the simple current group $Z$ generated by the simple current $J$ corresponding to the integrable weight $k\lambda_{N''}$ with the fusion rule
$$J^{a'} * \lambda = {}^a\lambda, \tag{9.14}$$
for $a = a'N''$. The restriction on the level $k$ is expressed by demanding that the conformal weight $h(J) = \frac{k}{2}\,\mathrm{tr}\,\lambda_{N''}^2$ multiplied by the order $N'$ of $J$ be an integer, which coincides with condition (9.9). The requirement $\lambda\in C_a$ is expressed by the monodromy charge with respect to the simple current $J$ and its modulo 2 refinement. The monodromy charge
$$Q_J(\lambda) = h(\lambda)+h(J)-h(J*\lambda)\ \mathrm{mod}\ 1 = -\mathrm{tr}\,\lambda\lambda_{N''}\ \mathrm{mod}\ 1 = \frac{\sum_i i\,n_i}{N'}\ \mathrm{mod}\ 1 \tag{9.15}$$
is an important quantity of the conformal field theory. It is conserved in fusion and it relates the matrix elements of the modular matrix $S_\lambda^{\lambda'}$ along the orbits of $Z$:
$$S_{{}^a\lambda}^{\lambda'} = e^{2\pi i a'Q_J(\lambda')}\,S_\lambda^{\lambda'}. \tag{9.16}$$
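In the $SU(2)$ case the relation (9.16) can be verified directly. The sketch below is illustrative only and rests on assumptions not spelled out in the text: the standard level $k$ modular matrix of $SU(2)$ with labels $n = 0,\dots,k$, the simple current acting as $n\mapsto k-n$, and the monodromy charge read off from (9.15) with $N' = 2$ as $Q_J = n/2$ mod 1. The function names are hypothetical.

```python
import cmath
import math


def su2_S(k):
    """Modular S-matrix of SU(2) at level k; labels n = 0..k (twice the spin)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def monodromy_charge(n):
    """Q_J = (sum_i i n_i) / N' mod 1 from (9.15); for SU(2), N' = 2 and the sum is n."""
    return (n % 2) / 2


def simple_current_defect(k):
    """Largest violation of S[J * a][b] = exp(2 pi i Q_J(b)) S[a][b], with J * a = k - a."""
    S = su2_S(k)
    r = range(k + 1)
    return max(abs(S[k - a][b]
                   - cmath.exp(2j * math.pi * monodromy_charge(b)) * S[a][b])
               for a in r for b in r)
```

Here the phase is simply $(-1)^b$, so the check reduces to the elementary identity $S_{k-a,b} = (-1)^bS_{ab}$ for the sine matrix.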
The condition $\lambda\in C_a$ is equivalent to demanding that
$$Q_J(\lambda)+a'X\in\mathbb{Z}, \tag{9.17}$$
where
$$X = -\frac{k}{2}\,\mathrm{tr}\,\lambda_{N''}^2\ \mathrm{mod}\ 1 = -\frac{kN''(N'-1)}{2N'}\ \mathrm{mod}\ 1 \tag{9.18}$$
so that $2X = Q_J(J)$ mod 1. When $h(J) = \frac{k}{2}\,\mathrm{tr}\,\lambda_{N''}^2$ is an integer, the sets $C_a$ coincide for different $a$ and are preserved by the action of $Z$ on the set of integrable weights. In this case, the “pure simple current extension” in the terminology of [34], the partition function (9.13) may be rewritten as
$$Z'(\tau) = \frac{1}{N'}\sum_{\lambda\in C_0}\Big|\sum_{z_a\in Z}\hat\chi_{{}^a\lambda}(\tau)\Big|^2 = \sum_{[\lambda],\ \lambda\in C_0}|Z_\lambda|\,\Big|\sum_{\lambda\in[\lambda]}\hat\chi_\lambda(\tau)\Big|^2, \tag{9.19}$$
where $[\lambda]$ runs through the set of the $Z$-orbits in the set of integrable weights and $Z_\lambda$ denote the corresponding isotropy subgroups of $Z$ (we shall use this notation also below).

9.2. Boundary case

For the WZW model based on the simply-connected group $G = SU(N)$, the open string annular partition function corresponding to branes $D_s$ supported by the conjugacy classes $\mathcal{C}_{\tau_s}$ with $s = 0, 1$ is formally given by the functional integral expression
$$Z_{D_0D_1}(T) = \int e^{-S_\sigma(\phi)}A_{D_0D_1}(\phi)\,D\phi, \tag{9.20}$$
where $\phi: [0,T]\times[0,\pi]\to G$ with
$$\phi(t,0)\in\mathcal{C}_{\tau_0},\qquad\phi(t,\pi)\in\mathcal{C}_{\tau_1},\qquad\phi(T,x) = \phi(0,x), \tag{9.21}$$
i.e. $\phi$ is periodic in the time direction. In the Hamiltonian language,
$$Z_{D_0D_1}(T) = \mathrm{tr}_{\mathcal{H}_{D_0D_1}}\,e^{-T(L_0-\frac{c}{24})}, \tag{9.22}$$
where the open string Hilbert space $\mathcal{H}_{D_0D_1}$ is composed of sections of the bundle $L_{D_0D_1}$ over the space of open paths in $G$. Space $\mathcal{H}_{D_0D_1}$ carries a unitary representation of the current algebra $\hat{\mathbf{g}}$ that decomposes into the irreducible representations according to
$$\mathcal{H}_{D_0D_1}\cong\bigoplus_\lambda W_{\lambda_0\lambda}^{\lambda_1}\otimes\hat V_\lambda, \tag{9.23}$$
where $\lambda_s = k\tau_s$ and the multiplicity spaces $W_{\lambda_0\lambda}^{\lambda_1}$ may be naturally identified [27, 28] with the spaces of 3-point conformal blocks of the bulk group SU(N) level $k$ WZW theory on the sphere with insertions of the primary fields corresponding to the integrable weights $\lambda_0$, $\bar\lambda_1$ and $\lambda$. In particular, the dimension of the multiplicity spaces is given by the fusion coefficients:
$$\dim(W_{\lambda_0\lambda}^{\lambda_1}) = N_{\lambda_0\lambda}^{\lambda_1}. \tag{9.24}$$
As a result, the annular partition functions of the SU(N) WZW theory take the form
$$Z_{D_0D_1}(T) = \sum_\lambda N_{\lambda_0\lambda}^{\lambda_1}\,\hat\chi_\lambda\Big(\frac{Ti}{2\pi}\Big). \tag{9.25}$$
In the dual Hamiltonian description, the annular partition functions may be described as closed string matrix elements
$$Z_{D_0D_1}(T) = \langle D_1|\,e^{-\frac{2\pi^2}{T}(L_0+\bar L_0-\frac{c}{12})}\,|D_0\rangle. \tag{9.26}$$
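The states $|D_s\rangle$ corresponding to branes $D_s$ belong to (a completion of) the closed string Hilbert space $\mathcal{H}$. They are combinations of the so called Ishibashi states $|\lambda\rangle$ representing the identity operators of the representation spaces $\hat V_\lambda$:
$$|\lambda\rangle = \sum_i e_i^\lambda\otimes e_i^\lambda \tag{9.27}$$
for any orthonormal basis $(e_i^\lambda)$ of $\hat V_\lambda$. Explicitly [9], for the branes $D_s$ supported by the conjugacy classes $\mathcal{C}_{\tau_s}$ with $\lambda_s = k\tau_s$,
$$|D_s\rangle = \sum_\lambda\frac{S_{\lambda_s}^{\lambda}}{\sqrt{S_0^{\lambda}}}\,|\lambda\rangle. \tag{9.28}$$
Since
$$\langle\lambda|\,e^{-\frac{2\pi^2}{T}(L_0+\bar L_0-\frac{c}{12})}\,|\lambda'\rangle = \delta_{\lambda\lambda'}\,\hat\chi_{\lambda'}\Big(\frac{2\pi i}{T}\Big), \tag{9.29}$$
the right hand side of (9.26) is then equal to
$$\sum_{\lambda'}\frac{S_{\lambda_0}^{\lambda'}\,\overline{S_{\lambda_1}^{\lambda'}}}{S_0^{\lambda'}}\,\hat\chi_{\lambda'}\Big(\frac{2\pi i}{T}\Big), \tag{9.30}$$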
which indeed coincides with the right hand side of (9.25) in virtue of the modular property (9.7) of the affine characters and the Verlinde formula (9.8).

For the non-simply-connected group $G' = SU(N)/Z$, the annular partition function corresponding to branes $D'_s$ supported by the conjugacy classes $\mathcal{C}'_{[\tau_s]}\subset G'$ is given by the primed versions of (9.20) and (9.22):
$$Z'_{D'_0D'_1}(T) = \int e^{-kS_\sigma(\phi')}A'_{D'_0D'_1}(\phi')\,D\phi' = \mathrm{tr}_{\mathcal{H}'_{D'_0D'_1}}\,e^{-T(L_0-\frac{c}{24})}. \tag{9.31}$$
The functional integral is now over fields $\phi': [0,T]\times[0,\pi]\to G'$ such that
$$\phi'(t,0)\in\mathcal{C}'_{[\tau_0]},\qquad\phi'(t,\pi)\in\mathcal{C}'_{[\tau_1]},\qquad\phi'(T,x) = \phi'(0,x). \tag{9.32}$$
Each $\phi'$ may be lifted in $N'$ different ways to a twisted periodic map $\phi: [0,T]\times[0,\pi]\to G$ such that
$$\phi(t,0)\in\mathcal{C}_{\tau_0},\qquad\phi(t,\pi)\in\mathcal{C}_{\tau_1},\qquad\phi(T,x) = z_a\phi(0,x) \tag{9.33}$$
for $\tau_s\in[\tau_s]$ and $z_a\in Z_{\tau_0}\cap Z_{\tau_1}\subset Z$. Expressing the functional integral over fields $\phi'$ in terms of the one over their lifts leads to a natural representation for the boundary partition functions of the $G'$ WZW theory which does not seem to have appeared in the literature, although it is related to the well studied string theory construction of the branes in orbifold theories, see e.g. [14, 15]. Unlike the amplitudes $A'_{D'_0D'_1}(\phi')$, which are complex numbers, the ones of field $\phi$ are line-bundle valued:
$$A_{D_0D_1}(\phi)\in(L_{\tilde D_0\tilde D_1})_{\tilde\varphi}\otimes(L_{D_0D_1})^{-1}_{\varphi}, \tag{9.34}$$
where $\varphi(x) = \phi(T,x)$, $\tilde\varphi = \phi(0,x) = z_a^{-1}\varphi(x)$ and $D_s$ are the branes of the group $G$ theory supported by the conjugacy classes $\mathcal{C}_{\tau_s}$.
How do the amplitudes $A_{D_0D_1}(\phi)$ relate to $A'_{D'_0D'_1}(\phi')$? The point is that the bundle gerbe $\mathcal{G}'_k$ on $G'$ together with the branes $D'_s$, determine canonically for any $\varphi\in I_{\mathcal{C}_{\tau_0}\mathcal{C}_{\tau_1}}$, $z_a\in Z$, and $\tilde\varphi = z_a^{-1}\varphi$ a non-zero element
$$\Phi(\tilde\varphi,\varphi)\in(L_{\tilde D_0\tilde D_1})^{-1}_{\tilde\varphi}\otimes(L_{D_0D_1})_{\varphi}, \tag{9.35}$$
where the branes $D_s = (\mathcal{C}_{\tau_s}, kQ_{\tau_s}, N_s, \iota_s)$ are obtained by the restriction of the branes $D'_s = (\mathcal{C}'_{[\tau_s]}, kQ'_{\tau_s}, N'_s, \iota'_s)$ to the conjugacy classes $\mathcal{C}_{\tau_s}$ for $\tau_s\in[\tau_s]$. We shall call such group $G$ theory branes $D_s$ compatible with $D'_s$. Now, for $\varphi = \phi(T,\cdot)$ and $\tilde\varphi = \phi(0,\cdot)$, with $z_a\in Z_{\tau_0}\cap Z_{\tau_1}$,
$$A'_{D'_0D'_1}(\phi') = \langle A_{D_0D_1}(\phi),\,\Phi(\tilde\varphi,\varphi)\rangle \tag{9.36}$$
in the natural pairing. We describe a construction of the elements $\Phi(\tilde\varphi,\varphi)$, possessing the multiplicative property
$$\Phi(\tilde{\tilde\varphi},\tilde\varphi)\otimes\Phi(\tilde\varphi,\varphi) = \Phi(\tilde{\tilde\varphi},\varphi) \tag{9.37}$$
for $\tilde\varphi = z_a^{-1}\varphi$ and $\tilde{\tilde\varphi} = z_b^{-1}\tilde\varphi$, in Appendix I. The construction relies on the concept of branes as formulated in the preceding sections. The multiplication by $\Phi$ determines an isomorphism between the line bundles $L_{\tilde D_0\tilde D_1}$ and $L_{D_0D_1}$ covering the map $z_a^{-1}\varphi\mapsto\varphi$ between the spaces $I_{\mathcal{C}_{\sigma_a(\tau_0)}\mathcal{C}_{\sigma_a(\tau_1)}}$ and $I_{\mathcal{C}_{\tau_0}\mathcal{C}_{\tau_1}}$ of open paths in $G$. Altogether, one obtains the action of $Z$ on the bundle
$$\tilde L'_{D'_0D'_1} = \bigcup_{(D_s)}L_{D_0D_1} \tag{9.38}$$
where the union is taken over the branes $D_s$ compatible with $D'_s$, $s = 0, 1$. The multiplicativity of this action follows from (9.37). The line bundle $L'_{D'_0D'_1}$ over the space $I'_{D'_0D'_1}$ of open paths in $G'$ is canonically isomorphic with the quotient bundle $\tilde L'_{D'_0D'_1}/Z$. By the formula
$$(a\Psi)(\varphi) = \Psi(z_a^{-1}\varphi)\otimes\Phi(z_a^{-1}\varphi,\varphi), \tag{9.39}$$
the action of $Z$ may be carried to sections of the line bundle $\tilde L'_{D'_0D'_1}$ in the way that maps sections of $L_{\tilde D_0\tilde D_1}$ to those of $L_{D_0D_1}$. Finally, sections of the line bundle $L'_{D'_0D'_1}$ may be identified with sections of $\tilde L'_{D'_0D'_1}$ invariant under the action of $Z$.

The above gives rise to the following simple picture of the open string space of states $\mathcal{H}'_{D'_0D'_1}$ of the group $G'$ WZW theory. The maps $\Psi\mapsto a\Psi$ induce the (unitary) transformations
$$U(a): \mathcal{H}_{\tilde D_0\tilde D_1}\to\mathcal{H}_{D_0D_1} \tag{9.40}$$
between the open string spaces of states of the group $G$ WZW theory. It may be shown that those transformations commute with the current algebra and hence also Virasoro algebra actions. Put together, they define a representation $U$ of the group $Z$ in the Hilbert space
$$\tilde{\mathcal{H}}'_{D'_0D'_1} = \bigoplus_{\tau_0\in[\tau_0],\ \tau_1\in[\tau_1]}\mathcal{H}_{D_0D_1}. \tag{9.41}$$
When restricted to $Z_{\tau_0}\cap Z_{\tau_1}\subset Z$, this representation acts diagonally, i.e. within the group $G$ open string spaces $\mathcal{H}_{D_0D_1}$ with fixed $D_s$. The open string space of states for the group $G'$ theory may be naturally identified with the $Z$-invariant families of states of the group $G$ theory:
$$\mathcal{H}'_{D'_0D'_1}\simeq P\,\tilde{\mathcal{H}}'_{D'_0D'_1}, \tag{9.42}$$
for
$$P = \frac{1}{N'}\sum_{z_a\in Z}U(a) \tag{9.43}$$
denoting the projector on the $Z$-invariant subspace. The scalar product in the space $\mathcal{H}'_{D'_0D'_1}$ should, however, be divided by $N'$ with respect to the one inherited from $\tilde{\mathcal{H}}'_{D'_0D'_1}$ to avoid the overcount.

For the annular partition function of the group $G'$ theory one obtains this way the Hamiltonian expression:
$$Z'_{D'_0D'_1}(T) = \frac{1}{N'}\sum_{\tau_0\in[\tau_0],\ \tau_1\in[\tau_1]}\ \sum_{z_a\in Z_{\tau_0}\cap Z_{\tau_1}}\mathrm{tr}_{\mathcal{H}_{D_0D_1}}\,e^{-T(L_0-\frac{c}{24})}\,U(a). \tag{9.44}$$
This is indeed compatible with the functional integral formula if we rewrite the functional integral over fields $\phi'$ in (9.31) in terms of the one over fields $\phi$ using relation (9.36) and the equality of the sigma model actions $S_\sigma(\phi') = S_\sigma(\phi)$. The factor $\frac{1}{N'}$ takes care of the $N'$-fold overcount due to the fact that there are $N'$ fields $\phi$ corresponding to each $\phi'$.

The commutation of the maps $U(a)$ of (9.40) with the current algebra action implies that they descend to the multiplicity spaces in the decomposition (9.23):
$$U(a): W_{\lambda_0\lambda}^{\lambda_1}\to W_{{}^a\lambda_0\lambda}^{{}^a\lambda_1}. \tag{9.45}$$
We may then rewrite expression (9.44) for the open string partition function as
$$Z'_{D'_0D'_1}(T) = \frac{1}{N'}\sum_{\tau_0\in[\tau_0],\ \tau_1\in[\tau_1]}\ \sum_{z_a\in Z_{\tau_0}\cap Z_{\tau_1}}\ \sum_\lambda\big(\mathrm{tr}_{W_{\lambda_0\lambda}^{\lambda_1}}U(a)\big)\,\hat\chi_\lambda\Big(\frac{Ti}{2\pi}\Big), \tag{9.46}$$
i.e. in terms of the traces of the action of simple currents on the spaces of 3-point conformal blocks. The actions of simple currents on spaces of genus zero conformal blocks have been defined, up to phases, in [23]. In the action (9.45), the phase freedom is fixed by the choice of brane structures $D'_s$ on the conjugacy classes $\mathcal{C}'_{[\tau_s]}$ in $G'$. Under the change of the brane structures $D'_s$ twisting the isomorphisms $\iota'$ of (8.3) by the multiplication of $v_{\tau_s,a}$ by $v'_{\tau_s,a}$ of (8.8) with $r = r_s$,
$$\Phi(\tilde\varphi,\varphi)\ \mapsto\ (-1)^{\frac{2ar_0}{N}}(-1)^{-\frac{2ar_1}{N}}\,\Phi(\tilde\varphi,\varphi) \tag{9.47}$$
inducing the transformation
$$U(a)\ \mapsto\ (-1)^{\frac{2ar_0}{N}}(-1)^{-\frac{2ar_1}{N}}\,U(a), \tag{9.48}$$
i.e. multiplying the representation $U$ by the ratio of characters of $Z$.
The annular partition functions for the simple current extension conformal field theories have been described in [22, 44], see also [40]. They fit into the general scheme identified in [3]. In the dual Hamiltonian description they are given by the primed version of the matrix elements (9.26). The states $|D'_s\rangle$ in (a completion of) the closed string boundary space $\mathcal{H}'$ may now be expressed as combinations of the Ishibashi states $|\lambda,z_a\rangle = \sum_i e_i^\lambda\otimes e_i^\lambda$ in the diagonal components of $\mathcal{H}'$, see (9.10). Let us denote by $E$ the corresponding set of labels, i.e.
$$E = \{(\lambda,a)\mid\lambda\in C_a,\ z_a\in Z_\lambda\}. \tag{9.49}$$
Explicitly, for the branes $D'_s$ supported by the conjugacy classes $\mathcal{C}'_{[\tau_s]}$ corresponding to the equivalence classes $[\lambda_s,\psi_{\lambda_s}]$, where $\lambda_s = k\tau_s$ and $\psi_{\lambda_s}$ are characters of $Z_{\tau_s}$, see the end of Sec. 8.2,
$$|D'_s\rangle = \sum_{(\lambda,a)\in E}\frac{\Psi_{D'_s}^{(\lambda,a)}}{\sqrt{S_0^{\lambda}}}\,|\lambda,a\rangle \tag{9.50}$$
with [22]
$$\Psi_{D'_s}^{(\lambda,a)} = \frac{\sqrt{N'}}{|Z_{\lambda_s}|}\,S_{\lambda_s}^{\lambda}(a)\,\psi_{\lambda_s}(a). \tag{9.51}$$
Here $S_\lambda^{\lambda'}(a)$ are the matrix elements of modified unitary modular matrices non-zero only if $z_a\in Z_\lambda\cap Z_{\lambda'}$. They satisfy the identity
$$S_{{}^b\lambda}^{\lambda'}(a) = \phi_\lambda(b,a)^{-1}\,e^{2\pi i b'(Q_J(\lambda')+a'X)}\,S_\lambda^{\lambda'}(a), \tag{9.52}$$
see [21, 22]. The last relation, together with (9.17), assures that the right hand side of (9.51) is independent of the choice of $\lambda_s\in[\lambda_s]$ if $\phi_\lambda(b,a)$ is the same as the one used in the definition (8.13) of the equivalence classes $[\lambda_s,\psi_{\lambda_s}]$. Recall that the latter was fixed up to the transformations (8.15) with the help of which, as shown in Appendix H, it could be reduced to 1. As described in [21], up to a phase that does not depend on $\lambda$ and $\lambda'$, the matrix elements $S_\lambda^{\lambda'}(a)$ are equal to the entries $\check S_{\check\lambda}^{\check\lambda'}$ of the modular matrix of the WZW theory based on the so called “orbit Lie algebra”. In our case the latter theory is the level $\frac{k}{\check n'}$ one with group $SU(\check n'')$, where $\check n'$ is the order of the subgroup $Z_a$ of $Z$ generated by $z_a$ and $N = \check n'\check n''$. Writing
$$\lambda = \sum_{\check\iota=0}^{\check n''-1}n_{\check\iota}\sum_{\check a=0}^{\check n'-1}\lambda_{\check\iota+\check a\check n''} \tag{9.53}$$
with $\sum_{\check\iota=0}^{\check n''-1}n_{\check\iota} = \frac{k}{\check n'}$, the corresponding weight of the orbit Lie algebra is
$$\check\lambda = \sum_{\check\iota=0}^{\check n''-1}n_{\check\iota}\,\check\lambda_{\check\iota}, \tag{9.54}$$
where $\check\lambda_{\check\iota}$ are the fundamental weights of $SU(\check n'')$. One has [21]
$$S_\lambda^{\lambda'}(a) = (-1)^{-\frac{k((\check n')^2-1)\check n''}{4\check n'}}\,\check S_{\check\lambda}^{\check\lambda'}. \tag{9.55}$$
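In particular, $S_\lambda^{\lambda'}(a)$ depends on $a$ only through $Z_a$. For fixed $a$, the action of $Z$ on the integral weights $\lambda$ such that $z_a\in Z_\lambda$ descends to the one of the quotient group $\check Z\equiv Z/Z_a$ on the weights $\check\lambda$. The induced action is generated by the fusion with the simple current $\check J$ of the $SU(\check n'')$ theory with the weight $\frac{k}{\check n'}\check\lambda_1$. The identity (9.16) for the orbit group implies now that for $z_b\in Z$,
$$\check S_{({}^b\lambda)^\vee}^{\check\lambda'} = e^{2\pi i\check b\,Q_{\check J}(\check\lambda')}\,\check S_{\check\lambda}^{\check\lambda'}. \tag{9.56}$$
In Appendix H we show that
$$e^{2\pi i\check b\,Q_{\check J}(\check\lambda)} = e^{2\pi i b'(Q_J(\lambda)+a'X)}, \tag{9.57}$$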
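so that (9.52) holds with $\phi_\lambda(b,a)\equiv1$, which is compatible with the results of Sec. 8.2.

Relations (9.50) and (9.51), with the use of (9.29) and (9.7), lead to the following expression for the annular partition function:
$$Z'_{D'_0D'_1}(T) = \sum_\lambda N_{D'_0\lambda}^{D'_1}\,\hat\chi_\lambda\Big(\frac{Ti}{2\pi}\Big), \tag{9.58}$$
where
$$N_{D'_0\lambda}^{D'_1} = \sum_{(\lambda',a)\in E}\frac{\overline{\Psi_{D'_1}^{(\lambda',a)}}\,\Psi_{D'_0}^{(\lambda',a)}\,S_\lambda^{\lambda'}}{S_0^{\lambda'}} = \frac{N'}{|Z_{\lambda_0}||Z_{\lambda_1}|}\sum_{z_a\in Z_{\lambda_0}\cap Z_{\lambda_1}}\ \sum_{\lambda'\in C_a}\frac{S_{\lambda_0}^{\lambda'}(a)\,\overline{S_{\lambda_1}^{\lambda'}(a)}\,S_\lambda^{\lambda'}}{S_0^{\lambda'}}\,\frac{\psi_{\lambda_0}(a)}{\psi_{\lambda_1}(a)}, \tag{9.59}$$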
where, again, the right hand side does not depend on the choice of $\lambda_s\in[\lambda_s]$. The orthogonality relations [22]
$$\sum_{D'}\Psi_{D'}^{(\lambda,a)}\,\overline{\Psi_{D'}^{(\lambda',a')}} = \delta_{\lambda\lambda'}\,\delta_{aa'} \tag{9.60}$$
guarantee that the matrices $N_\lambda = (N_{D'_0\lambda}^{D'_1})$, whose entries have to be nonnegative integers, represent the fusion algebra [3]:
$$\sum_{D'_1}N_{D'_0\lambda}^{D'_1}\,N_{D'_1\lambda'}^{D'_2} = \sum_{\lambda''}N_{\lambda\lambda'}^{\ \lambda''}\,N_{D'_0\lambda''}^{D'_2}. \tag{9.61}$$
Search for the representations of the fusion algebra by matrices with entries that are nonnegative integers (the so called “NIM’s”) has been the basis of the approach to classification of boundary conformal field theories developed in [3], see also [41].

Expressions (9.58) with (9.59) are compatible with relation (9.46) if we assume the following formula for the traces of the action of $Z_{\lambda_0}\cap Z_{\lambda_1}$ on the spaces of 3-point conformal blocks conjectured (up to multiplication by characters) in [21]:
$$\mathrm{tr}_{W_{\lambda_0\lambda}^{\lambda_1}}U(a) = \sum_{\lambda'}\frac{S_{\lambda_0}^{\lambda'}(a)\,\overline{S_{\lambda_1}^{\lambda'}(a)}\,S_\lambda^{\lambda'}}{S_0^{\lambda'}}\,\frac{\psi_{\lambda_0}(a)}{\psi_{\lambda_1}(a)}. \tag{9.62}$$
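Indeed, the summation of (9.62) over the weights in fixed orbits gives
$$\sum_{\lambda_0\in[\lambda_0],\ \lambda_1\in[\lambda_1]}\mathrm{tr}_{W_{\lambda_0\lambda}^{\lambda_1}}U(a) = \frac{1}{|Z_{\lambda_0}||Z_{\lambda_1}|}\sum_{\lambda'}\ \sum_{z_b,z_c\in Z}\frac{\overline{S_{{}^b\lambda_1}^{\lambda'}(a)}\,S_{{}^c\lambda_0}^{\lambda'}(a)\,S_\lambda^{\lambda'}}{S_0^{\lambda'}}\,\frac{\psi_{\lambda_0}(a)}{\psi_{\lambda_1}(a)}. \tag{9.63}$$
Note that the sum over $\lambda'$ is effectively restricted to integrable weights fixed by $z_a$. With the use of transformation property (9.52), the sums over $z_b$, $z_c$ may be factored out as $\big|\sum_{z_b\in Z}e^{2\pi i b'(Q_J(\lambda')+a'X)}\big|^2$. Since the sum inside that factor divided by $N'$ represents the characteristic function of $C_a$, the identity (9.63) reduces to the relation
$$\sum_{\lambda_0\in[\lambda_0],\ \lambda_1\in[\lambda_1]}\mathrm{tr}_{W_{\lambda_0\lambda}^{\lambda_1}}U(a) = \frac{(N')^2}{|Z_{\lambda_0}||Z_{\lambda_1}|}\sum_{\lambda'\in C_a}\frac{S_{\lambda_0}^{\lambda'}(a)\,\overline{S_{\lambda_1}^{\lambda'}(a)}\,S_\lambda^{\lambda'}}{S_0^{\lambda'}}\,\frac{\psi_{\lambda_0}(a)}{\psi_{\lambda_1}(a)}, \tag{9.64}$$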
where on the right hand side the weights $\lambda_s\in[\lambda_s]$ are fixed and the sum over $\lambda'$ is additionally constrained by the requirement $\lambda'\in C_a$. Equation (9.64), when inserted into (9.46), reproduces (9.58).

One may also consider partition functions that do not resolve different branes with the same support. Summing $Z'_{D'_0D'_1}(T)$ over the different brane structures $D'_0$ and $D'_1$ supported by the conjugacy classes $\mathcal{C}'_{[\tau_0]}$ and $\mathcal{C}'_{[\tau_1]}$, respectively, freezes $z_a$ in (9.46) to 1 and leads to the unresolved partition functions
$$Z'_{\mathcal{C}'_{[\tau_0]}\mathcal{C}'_{[\tau_1]}}(T) = \sum_{\lambda_0\in[\lambda_0],\ \lambda_1\in[\lambda_1]}\frac{|Z_{\lambda_0}||Z_{\lambda_1}|}{N'}\,\mathrm{tr}_{\mathcal{H}_{D_0D_1}}\,e^{-T(L_0-\frac{c}{24})} = \sum_{\lambda_0\in[\lambda_0],\ \lambda_1\in[\lambda_1]}\frac{|Z_{\lambda_0}||Z_{\lambda_1}|}{N'}\sum_\lambda N_{\lambda_0\lambda}^{\lambda_1}\,\hat\chi_\lambda\Big(\frac{Ti}{2\pi}\Big), \tag{9.65}$$
where, as usual, $\lambda_s = k\tau_s$. Rewriting the sums over the orbits $[\lambda_s]$ as sums over the group $Z$ and using the symmetry of the fusion coefficients $N_{{}^a\lambda_0\lambda}^{{}^a\lambda_1} = N_{\lambda_0\lambda}^{\lambda_1}$, we finally obtain
$$Z'_{\mathcal{C}'_{[\tau_0]}\mathcal{C}'_{[\tau_1]}}(T) = \sum_{z_a\in Z}\ \sum_\lambda N_{\lambda_0\lambda}^{{}^a\lambda_1}\,\hat\chi_\lambda\Big(\frac{Ti}{2\pi}\Big), \tag{9.66}$$
where on the right hand side, $\lambda_s$ are arbitrary elements in the $Z$-orbits $[\lambda_s]$. The relation between the annular partition functions with resolved and unresolved branes was discussed in [35] for the case of pure simple current extensions.

The geometric approach based on gerbes should allow one to recover within the Lagrangian framework similar relations to the ones discussed above for general orbifold conformal field theories.
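[Fig. 4. Surface $\Sigma$ mapped into $G$, with boundary intervals $\ell_0$, $\ell_1$, $\ell_2$ labeled by the branes $D_0$, $D_1$, $D_2$.]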
10. Quantum Amplitudes

The general quantum amplitudes of the WZW theory based on group $G$ are formally given by the functional integrals
$$A_{(D_s)}(\Sigma) = \int e^{-S_\sigma(\phi)}A_{(D_s)}(\phi)\,D\phi \tag{10.1}$$
over fields $\phi:\Sigma\to G$ satisfying boundary conditions (7.1) on closed disjoint subintervals $\ell_s$ of the boundary loops of $\Sigma$, with the amplitude $A_{(D_s)}(\phi)$ as in (7.11). Accordingly, we should have
$$A_{(D_s)}(\Sigma)\in\bigotimes_{(s,s')}\mathcal{H}_{D_sD_{s'}}\otimes\bigotimes_m\mathcal{H}. \tag{10.2}$$
The (purely) open string amplitudes have no external closed string factors $\mathcal{H}$ (although they may have closed string states propagating in loops). In particular, if $\Sigma$ is a disc $O$ and fields $\phi$ are constrained to map three disjoint subintervals of the boundary into the supports of three branes $D_s$, $s = 0, 1, 2$, see Fig. 4, one obtains the quantum open string amplitude
$$A_{D_0D_1D_2}(O)\in\mathcal{H}_{D_0D_1}\otimes\mathcal{H}_{D_1D_2}\otimes\mathcal{H}_{D_2D_0} \tag{10.3}$$
that encodes the operator product expansion of the boundary operators. The functional integral representation, together with the geometric interpretation of the relation (9.42) between the spaces of open string states for group $G$ and group $G'$ WZW models leads to the following relation between the amplitudes (10.3) for the two cases:
$$A'_{D'_0D'_1D'_2}(O) = (N')^2\,P\otimes P\otimes P\,\tilde A'_{D'_0D'_1D'_2}(O) \tag{10.4}$$
with $P$ given by (9.43) and
$$\tilde A'_{D'_0D'_1D'_2}(O) = \bigoplus_{(D_s)}A_{D_0D_1D_2}(O)\in\bigoplus_{(D_s)}\mathcal{H}_{D_0D_1}\otimes\mathcal{H}_{D_1D_2}\otimes\mathcal{H}_{D_2D_0}\subset\tilde{\mathcal{H}}'_{D'_0D'_1}\otimes\tilde{\mathcal{H}}'_{D'_1D'_2}\otimes\tilde{\mathcal{H}}'_{D'_2D'_0}, \tag{10.5}$$
see (9.41). The direct sums above are over branes $D_s$ compatible with $D'_s$, $s = 0, 1, 2$. An analogous relation, with $(N')^2$ replaced by $(N')^{M-1}$, holds for the disc amplitudes with $M$ subintervals mapped into brane supports. The latter amplitudes allow one to reconstruct the general open string amplitudes of the WZW theory by gluing. In this way one obtains a simple relation between the quantum open string amplitudes for the theory based on the simply connected group $G$ and for its simple current orbifolds. We postpone a more detailed discussion of that point to a future publication.

11. Local Constructions

Finally, let us describe the local versions of the constructions that induce, from a bundle gerbe on a manifold $M$, various geometric structures on the spaces of closed or open curves (strings) in $M$.

11.1. Closed strings

We start by recalling from [25] the construction of local data for a line bundle on $LM$ from local data $(g_{ijk}, A_{ij}, B_i)$ of a gerbe on an open covering $(O_i)$ of $M$. Given a split of the circle $S^1$ into closed intervals $b$ with common vertices $v$ and an assignment $b \mapsto i_b$ and $v \mapsto i_v$, which we collectively abbreviate as $I$, consider the open subset
\[ O_I = \{ \varphi \in LM \mid \varphi(b) \subset O_{i_b} , \ \varphi(v) \in O_{i_v} \} \tag{11.1} \]
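of the loop space $LM$. The sets $O_I$ for different choices of $I$ cover $LM$ (we may discard those $I$ for which $O_I = \emptyset$). Let us define 1-forms $A_I$ on $O_I \subset LM$ by
\[ \langle \delta\varphi, A_I(\varphi) \rangle = \sum_b \int_b \varphi^* \iota(\delta\varphi) B_{i_b} + \sum_{v \in b} \langle \delta\varphi(v), A_{i_v i_b} \rangle \tag{11.2} \]
with the usual sign convention in $\sum_{v \in b}$. If $I$ is the collection of $(b, v, i_b, i_v)$ and $J$ the one of $(b', v', j_{b'}, j_{v'})$, consider the split of $S^1$ into the intersections $\bar b$ of the intervals $b$ and $b'$ and denote by $\bar v$ its vertices. The new split inherits two label assignments from the original ones. We set
\[ i_{\bar b} = i_b \ \text{if} \ \bar b \subset b , \qquad j_{\bar b} = j_{b'} \ \text{if} \ \bar b \subset b' , \]
\[ i_{\bar v} = \begin{cases} i_v & \text{if } \bar v = v , \\ i_b & \text{if } \bar v \subset \mathrm{int}(b) , \end{cases} \qquad j_{\bar v} = \begin{cases} j_{v'} & \text{if } \bar v = v' , \\ j_{b'} & \text{if } \bar v \subset \mathrm{int}(b') , \end{cases} \tag{11.3} \]
where $\mathrm{int}(b)$ denotes the interior of $b$. Let us define functions $g_{IJ} : O_{IJ} \to U(1)$,
\[ g_{IJ}(\varphi) = \exp\Big[ i \sum_{\bar b} \int_{\bar b} \varphi^* A_{j_{\bar b} i_{\bar b}} \Big] \prod_{\bar v \in \bar b} \big( g_{i_{\bar v} j_{\bar v} j_{\bar b}}(\varphi(\bar v)) / g_{i_{\bar v} i_{\bar b} j_{\bar b}}(\varphi(\bar v)) \big) . \tag{11.4} \]
The collection $(g_{IJ}, A_I)$ provides local data for a hermitian line bundle $L$ with connection over $LM$. The curvature of $L$ is given by the 2-form $\Omega$ on $LM$, see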
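(2.16). If $(g'_{ijk}, A'_{ij}, B'_i)$ are equivalent local data related to the original ones by (2.11), then
\[ g'_{IJ} = g_{IJ} f_J f_I^{-1} , \qquad A'_I = A_I + i f_I^{-1} df_I \tag{11.5} \]
for
\[ f_I^{-1}(\varphi) = \exp\Big[ i \sum_b \int_b \varphi^* \Pi_{i_b} \Big] \prod_{v \in b} \chi_{i_v i_b}(\varphi(v)) , \tag{11.6} \]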
i.e. the local data $(g_{IJ}, A_I)$ change to equivalent ones, see (2.4).

11.2. Open strings

We may apply the previous constructions to the case of open curves. Using the same formulae (11.1), (11.2) and (11.4) to define an open covering $(O_I)$ of $IM$, 1-forms $A_I$ on $O_I$ and $U(1)$-valued functions on $O_{IJ}$, we obtain, however, a different structure. Now on $O_{IJK}$, $O_I$ and $O_{IJ}$, respectively,
\[ g_{IJ} \, g_{IK}^{-1} \, g_{JK} = (g_{i_1 j_1 k_1} \circ e_1) / (g_{i_0 j_0 k_0} \circ e_0) , \]
\[ dA_I = \Omega + e_0^* B_{i_0} - e_1^* B_{i_1} , \]
\[ A_J - A_I - i g_{IJ}^{-1} dg_{IJ} = e_0^* A_{i_0 j_0} - e_1^* A_{i_1 j_1} \tag{11.7} \]
with $\Omega$ given by (2.16) and $e_s$ being the evaluation maps of (7.10). Let $D$ be a submanifold of $M$ and let $Q$ be a 2-form on $D$ such that $dQ = H|_D$. Suppose, moreover, that $\Pi_i$ are one-forms on $\bar O_i = O_i \cap D$ and that $\chi_{ij} = \chi_{ji}^{-1}$ are $U(1)$-valued functions on $\bar O_{ij}$ such that

(1) On $\bar O_i$,
\[ Q = B_i + d\Pi_i , \tag{11.8} \]
(2) On $\bar O_{ij}$,
\[ 0 = A_{ij} + \Pi_j - \Pi_i - i \chi_{ij}^{-1} d\chi_{ij} , \tag{11.9} \]
(3) On $\bar O_{ijk}$,
\[ 1 = g_{ijk} \, \chi_{jk}^{-1} \chi_{ik} \chi_{ij}^{-1} . \tag{11.10} \]
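The algebraic content of condition (11.10) can be illustrated pointwise in a toy setting. The following is only a sketch under stated assumptions, not part of the construction: it fixes a single point, writes $U(1)$ additively (a phase $e^{2\pi i x}$ is stored as the rational number $x$ mod 1), draws hypothetical values $\chi_{ij} = \chi_{ji}^{-1}$ at random, and defines $g_{ijk}$ by solving (11.10). All function names and the number of patches are choices made for the example.

```python
from fractions import Fraction
from itertools import permutations
import random


def random_chi(n_patches, denom=12, seed=0):
    """Hypothetical values chi_ij at one point, stored additively mod 1,
    with chi_ji = -chi_ij (i.e. chi_ij = chi_ji^{-1} multiplicatively)."""
    rng = random.Random(seed)
    chi = {}
    for i in range(n_patches):
        for j in range(i + 1, n_patches):
            x = Fraction(rng.randrange(denom), denom)
            chi[(i, j)] = x
            chi[(j, i)] = (-x) % 1
    return chi


def g_from_chi(chi, i, j, k):
    """Solve (11.10) for g_ijk, i.e. g_ijk = chi_ij chi_jk chi_ik^{-1}."""
    return (chi[(i, j)] + chi[(j, k)] - chi[(i, k)]) % 1


def satisfies_11_10(chi, i, j, k):
    """Check 1 = g_ijk chi_jk^{-1} chi_ik chi_ij^{-1} in additive notation."""
    g = g_from_chi(chi, i, j, k)
    return (g - chi[(j, k)] + chi[(i, k)] - chi[(i, j)]) % 1 == 0


def cech_coboundary_of_g(chi, i, j, k, l):
    """g_jkl g_ikl^{-1} g_ijl g_ijk^{-1}, written additively.
    The value 0 means that g satisfies the cocycle condition on (i, j, k, l)."""
    return (g_from_chi(chi, j, k, l) - g_from_chi(chi, i, k, l)
            + g_from_chi(chi, i, j, l) - g_from_chi(chi, i, j, k)) % 1
```

In this toy model any $g_{ijk}$ obtained from (11.10) automatically satisfies the cocycle condition on quadruple overlaps, which is the pointwise shadow of the statement that the restricted gerbe is trivialized by the data $(\chi_{ij}, \Pi_i)$.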
Provided that the covering $(O_i)$ is sufficiently fine, the existence of $(\chi_{ij}, \Pi_i)$ with the above properties is equivalent to the existence of a stable isomorphism between the restriction to $D$ of the gerbe $\mathcal{G}$ constructed from the local data $(g_{ijk}, A_{ij}, B_i)$ and the gerbe on $D$ obtained from the local data $(Q|_{\bar O_i}, 0, 1)$. The choice of $(\chi_{ij}, \Pi_i)$ providing the stable isomorphism is determined up to the local data $(h_{ij}, R_i)$ of a flat hermitian line bundle over $D$. The $U(1)$-valued functions $f_I$ defined on the open subsets $\bar O_I \subset LD$ by (11.6) now satisfy
\[ g_{IJ}|_{\bar O_{IJ}} = f_J^{-1} f_I \tag{11.11} \]
and define a trivialization of the line bundle $L|_{LD}$ defined on $LD$ from the local data $(g_{IJ}|_{\bar O_{IJ}})$. They allow one to assign a numerical value
\[ A_D(\phi) = A(\phi) f_I^{-1}(\phi|_\ell) \tag{11.12} \]
to the field $\phi : \Sigma \to M$ if $\partial\Sigma$ is composed of a single loop $\ell$ mapped by $\phi$ into $D$. Here $A(\phi)$ is given by the expression (2.14) and the collection $I$ is obtained by restricting the triangulation of $\Sigma$ and the corresponding label assignment to the boundary loop $\ell$. The result depends neither on the choice of the triangulation of $\Sigma$ nor on the label assignments. It does not change under the passage to equivalent local data $(g_{ijk}, A_{ij}, B_i)$ on $M$ if we absorb the transformations (2.11) in the choice of $(\chi_{ij}, \Pi_i)$ in (11.8) to (11.10). We may, however, always modify that choice by the local data $(h_{ij}, R_i)$ of a flat hermitian bundle $P$ on $D$. Such a modification multiplies the amplitude $A_D(\phi)$ by the holonomy of $P$ along $\phi|_\ell$. It gives the local description of the change of the stable isomorphism between the restricted gerbe and the gerbe constructed from the 2-form $Q$, as discussed in Sec. 7. The generalization of the above discussion to the case when $\Sigma$ has multiple boundary components with $\phi$ mapping (some of) them into branes is straightforward.

Let $D_0$ and $D_1$ be two submanifolds of $M$ and $H|_{D_s} = dQ_s$, with the choices of the data $(\chi^s_{ij}, \Pi^s_i)$ as above for each $D_s$. Consider the subspace $I_{D_0 D_1} \subset IM$ of curves ending on the branes, see (7.6). We may adapt the definitions (11.2) and (11.4) to the present case by defining
\[ A^D_I = A_I + e_0^* \Pi^0_{i_0} - e_1^* \Pi^1_{i_1} , \qquad g^D_{IJ} = g_{IJ} \, (\chi^0_{i_0 j_0} \circ e_0 / \chi^1_{i_1 j_1} \circ e_1) . \tag{11.13} \]
One obtains this way local data $(g^D_{IJ}, A^D_I)$ of a hermitian line bundle $L_{D_0 D_1}$ with connection over $I_{D_0 D_1}M$. The terms added in (11.2) and (11.4) are sensitive to the modification of $(\chi^s_{ij}, \Pi^s_i)$ by local data of flat bundles $P_s$ on $D_s$. The net result is the multiplication of $L_{D_0 D_1}$ by $e_0^* P_0 \otimes e_1^* P_1^{-1}$. This does not affect the curvature $\Omega_D$ of $L_{D_0 D_1}$ given by (7.9). Upon a change to the equivalent local data $(g'_{ijk}, A'_{ij}, B'_i)$, the relations (11.5) still hold for $f_I$ given by (11.6), provided we absorb the changes in the choice of $(\chi'^s_{ij}, \Pi'^s_i)$. The line bundle $L_{D_0 D_1}$ constructed from the local data $(g^D_{IJ}, A^D_I)$ is canonically isomorphic to the line bundle $L_{D_0 D_1}(\mathcal{G})$ for the gerbe $\mathcal{G}$ obtained from the local data $(g_{ijk}, A_{ij}, B_i)$, provided that one uses the data $(\chi^s_{ij}, \Pi^s_i)$ to construct the stable isomorphism $\iota_s$ of (7.3) between the restrictions of $\mathcal{G}$ and $K_s$.
12. Conclusions

We have shown how the concept of a bundle gerbe with connection may be applied to resolve Lagrangian ambiguities in defining sigma models in the presence of the antisymmetric tensor field $B$ determined locally up to closed form contributions.
This was done both in closed string geometry and for open strings stretching between branes. Application of that approach to the WZW models based on groups covered by $SU(N)$ has allowed us to recover, within the Lagrangian approach, the classification of symmetric branes. It has also allowed us to make precise a straightforward relation between the quantum open string amplitudes for the WZW models based on simply connected groups and for their simple current orbifolds. Those relations are simpler than the ones for closed string amplitudes, where the appearance of twisted sectors complicates the analysis. They should extend to more general orbifold theories.

There are further problems to which one may try to apply the geometric methods based on gerbes within the Lagrangian approach to conformal field theory. The case of non-orientable worldsheets has not been discussed in the present paper. The analysis of gerbes entering the WZW models with other groups,$^{b}$ with applications to the classification of symmetric and symmetry-breaking branes in general non-simply connected groups, including the $SO(2N)/\mathbb{Z}_2$ case with discrete torsion, is an open problem. Neither have the coset models been treated within this framework. Supersymmetric extensions require a modification of the approach presented here to take care of the fermionic anomalies [20]. This issue was tackled in the recent preprint [11]. Finally, the bundle gerbes should be useful in analyzing open string models coupled to non-abelian Chan–Paton degrees of freedom, including the fractional branes in general orbifolds [14], and in the description of the Ramond–Ramond brane charges [4]. We plan to return to some of those issues in the future.

Appendix A

For the reader's convenience, we gather here the basic facts about discrete group cohomology, see [5, 12, 16]. Let $\Gamma$ be a discrete group with elements $\gamma_0, \gamma_1, \ldots$ and $U$ an abelian group on which $\Gamma$ acts (possibly trivially). We shall use the multiplicative notation for the product both in $\Gamma$ and $U$ and for the action of $\Gamma$ on $U$. In our applications, $\Gamma$ will be a subgroup $Z$ of the center of $SU(N)$ and $U$ will be equal to $U(1)$ or to the group of $U(1)$-valued functions on the orbit of $Z$ in the Weyl alcove of $su(N)$. In general, the abelian group $C^n(\Gamma, U)$ of $n$-cochains on $\Gamma$ with values in $U$ is composed of maps
\[ \Gamma^n \ni (\gamma_1, \ldots, \gamma_n) \mapsto u_{\gamma_1, \ldots, \gamma_n} \in U . \tag{A.1} \]
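Consider the group homomorphisms $d : C^n(\Gamma, U) \to C^{n+1}(\Gamma, U)$ defined by the formula
\[ (du)_{\gamma_1, \ldots, \gamma_{n+1}} = (\gamma_1 u_{\gamma_2, \ldots, \gamma_{n+1}}) \Big( \prod_{m=1}^{n} u^{(-1)^m}_{\gamma_1, \ldots, \gamma_m \gamma_{m+1}, \ldots, \gamma_{n+1}} \Big) \, u^{(-1)^{n+1}}_{\gamma_1, \ldots, \gamma_n} . \tag{A.2} \]

$^{b}$ The gerbes on other simple simply-connected groups have recently been explicitly constructed in Ref. 37.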
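For $n = 0, 1, 2, 3$, the cases relevant for this paper, this gives
\[ (du)_{\gamma} = (\gamma u) u^{-1} , \]
\[ (du)_{\gamma_1, \gamma_2} = (\gamma_1 u_{\gamma_2}) \, u^{-1}_{\gamma_1 \gamma_2} \, u_{\gamma_1} , \]
\[ (du)_{\gamma_1, \gamma_2, \gamma_3} = (\gamma_1 u_{\gamma_2, \gamma_3}) \, u^{-1}_{\gamma_1 \gamma_2, \gamma_3} \, u_{\gamma_1, \gamma_2 \gamma_3} \, u^{-1}_{\gamma_1, \gamma_2} , \]
\[ (du)_{\gamma_1, \gamma_2, \gamma_3, \gamma_4} = (\gamma_1 u_{\gamma_2, \gamma_3, \gamma_4}) \, u^{-1}_{\gamma_1 \gamma_2, \gamma_3, \gamma_4} \, u_{\gamma_1, \gamma_2 \gamma_3, \gamma_4} \, u^{-1}_{\gamma_1, \gamma_2, \gamma_3 \gamma_4} \, u_{\gamma_1, \gamma_2, \gamma_3} . \tag{A.3} \]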
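As an illustrative aside, the coboundary formula (A.2) can be sketched in code for the special case $\Gamma = \mathbb{Z}_p$ acting trivially on $U(1)$. The sketch below is a toy under stated assumptions and not part of the paper's argument: $U(1)$ is written additively (a phase $e^{2\pi i x}$ is stored as the rational number $x$ mod 1), cochains are dictionaries indexed by tuples of group elements, and the names `coboundary`, `random_cochain` and `count_one_cocycles` are hypothetical.

```python
from fractions import Fraction
from itertools import product
import random


def coboundary(u, n, p):
    """Return du on (n+1)-tuples for an n-cochain u on Z_p with trivial action,
    following (A.2) in additive notation for U(1)."""
    du = {}
    for g in product(range(p), repeat=n + 1):
        # first term: gamma_1 acting on u_{gamma_2..gamma_{n+1}} (trivial action)
        val = u[g[1:]]
        # middle terms: gamma_m and gamma_{m+1} are multiplied (added mod p)
        for m in range(1, n + 1):
            merged = g[:m - 1] + ((g[m - 1] + g[m]) % p,) + g[m + 1:]
            val += (-1) ** m * u[merged]
        # last term: drop gamma_{n+1}
        val += (-1) ** (n + 1) * u[g[:n]]
        du[g] = val % 1
    return du


def random_cochain(n, p, denom=7, seed=0):
    """A hypothetical n-cochain with random rational phases k/denom."""
    rng = random.Random(seed)
    return {g: Fraction(rng.randrange(denom), denom)
            for g in product(range(p), repeat=n)}


def count_one_cocycles(p):
    """Count 1-cochains with values in the p-th roots of unity that satisfy du = 1."""
    count = 0
    for values in product(range(p), repeat=p):
        u = {(g,): Fraction(values[g], p) for g in range(p)}
        if all(v == 0 for v in coboundary(u, 1, p).values()):
            count += 1
    return count
```

Applying `coboundary` twice to a random cochain gives the trivial cochain, which is the additive form of $d^2 = 1$. For the trivial action, 1-cocycles are homomorphisms $\mathbb{Z}_p \to U(1)$, whose values are automatically $p$-th roots of unity, and 1-coboundaries are trivial, so `count_one_cocycles(p)` counts the elements of $H^1(\mathbb{Z}_p, U(1))$.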
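The square of $d$ vanishes and the cohomology groups of $\Gamma$ with values in $U$ are defined as
\[ H^n(\Gamma, U) = \frac{\{ u \in C^n(\Gamma, U) \mid du = 1 \}}{d C^{n-1}(\Gamma, U)} . \tag{A.4} \]
For the special case of $\Gamma \cong \mathbb{Z}_p$ with generator $\gamma_0$ and $n = 1, 2, \ldots$,
\[ H^{2n}(\Gamma, U) \cong \frac{\{ u \in U \mid \gamma u = u \ \text{for} \ \gamma \in \Gamma \}}{\{ \prod_{r=0}^{p-1} (\gamma_0^r u) \mid u \in U \}} , \qquad H^{2n-1}(\Gamma, U) \cong \frac{\{ u \in U \mid \prod_{r=0}^{p-1} (\gamma_0^r u) = 1 \}}{\{ (\gamma_0 u) u^{-1} \mid u \in U \}} , \tag{A.5} \]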
see [12, 16]. In particular, for the trivial action of $\Gamma$ on $U(1)$,
\[ H^{2n}(\Gamma, U(1)) \cong \{1\} , \qquad H^{2n-1}(\Gamma, U(1)) \cong \mathbb{Z}_p \tag{A.6} \]
and for $U(1)^T$ being the group of $U(1)$-valued functions on the set $T$ with the action of $\Gamma$ induced from that on $T$,
\[ H^{2n}(\Gamma, U(1)^T) \cong \{1\} , \qquad H^{2n-1}(\Gamma, U(1)^T) \cong \mathop{\times}_{[\tau] \in T/\Gamma} Z_\tau , \tag{A.7} \]
where $Z_\tau$ denotes the stabilizer subgroup of $\tau \in T$ (which depends only on the $\Gamma$-orbit $[\tau]$ of $\tau$). The discrete group cohomology should be distinguished from the one for Lie groups appearing also in the paper. The latter is defined as for general topological spaces, e.g. by the Čech construction.

Appendix B

To check the associativity of the product $\mu'$ of (4.5) over $O_{ijln} \subset Y'^{[4]}$ with $j = [j' + a]$, $l = [l' + b]$ and $n = [n' + c]$, we first calculate
\[ \mu'\big( \mu'( [\gamma, \zeta]_{k\lambda_{ij}} \otimes [\gamma w_a^{-1}, \zeta']_{k\lambda_{j'[l-a]}} ) \otimes [\gamma w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} \big) = \mu'\big( [\gamma, u_{ijl} \zeta\zeta']_{k\lambda_{il}} \otimes [\gamma w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} \big) = [\gamma, u_{ijl} u_{iln} \zeta\zeta'\zeta'']_{k\lambda_{in}} . \tag{B.1} \]
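On the other hand, using the identity
\[ [\gamma w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} = [\gamma w_a^{-1} w_{[b-a]}^{-1} w_{[b-a]} w_a w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} = [\gamma w_a^{-1} w_{[b-a]}^{-1}, \chi_{k\lambda_{l'[n-b]}}(w_{[b-a]} w_a w_b^{-1}) \zeta'']_{k\lambda_{l'[n-b]}} \tag{B.2} \]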
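that follows from the equivalence relation (3.20) since $w_{[b-a]} w_a w_b^{-1}$ lies in the Cartan subgroup $T$, we infer that
\[ \mu'\big( [\gamma w_a^{-1}, \zeta']_{k\lambda_{j'[l-a]}} \otimes [\gamma w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} \big) = [\gamma w_a^{-1}, u_{j'[l-a][n-a]} \, \chi_{k\lambda_{l'[n-b]}}(w_{[b-a]} w_a w_b^{-1}) \zeta'\zeta'']_{k\lambda_{j'[n-a]}} . \tag{B.3} \]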
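Another application of $\mu'$ then gives:
\[ \mu'\big( [\gamma, \zeta]_{k\lambda_{ij}} \otimes \mu'( [\gamma w_a^{-1}, \zeta']_{k\lambda_{j'[l-a]}} \otimes [\gamma w_b^{-1}, \zeta'']_{k\lambda_{l'[n-b]}} ) \big) = [\gamma, u_{ijn} u_{j'[l-a][n-a]} \, \chi_{k\lambda_{l'[n-b]}}(w_{[b-a]} w_a w_b^{-1}) \zeta\zeta'\zeta'']_{k\lambda_{in}} . \tag{B.4} \]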
Comparing (B.1) and (B.4), we find condition (4.6) for the associativity of $\mu'$.

Appendix C

We describe here the construction of the "quotient gerbe" $\mathcal{G}^{k\prime}$ on $G' = SU(N)/Z$ along the lines of Sec. 5. The resulting gerbe coincides with the gerbe on $G'$ constructed in Sec. 4. With $M = SU(N)$ and the gerbe $\mathcal{G}^k = (Y, kB, L^k, \mu^k)$ constructed in Sec. 3, we shall take as $\Gamma$ the subgroup $Z \cong \mathbb{Z}_{N'}$ of the center of $SU(N)$ acting on $M$ by the (left or right) multiplication. Recall that $Y = \bigsqcup_{i=0}^{r} O_i$. For $\gamma = z_a \in \Gamma$, where $a$ is divisible by $N'' = N/N'$,
\[ Z_\gamma = Y \times_M Y_\gamma = \{ ((g, i), (z_a^{-1} g, j')) \mid g \in O_i , \ z_a^{-1} g \in O_{j'} \} \cong \bigsqcup_{i,j} O_{ij} , \tag{C.1} \]
where $j = [j' + a]$. We may take the bundle $N^\gamma$ over $Z_\gamma$ to be equal to $\rho_{ij}^* L_{k\lambda_{ij}}$ over the $O_{ij}$ component. Since
\[ k B_{j'}(z_a^{-1} g) - k B_i(g) = k B_j(g) - k B_i(g) = k \rho_{ij}^* F_{\lambda_{ij}}(g) , \tag{C.2} \]
the relation (5.1) is satisfied. The isomorphism $\iota_\gamma$ of (5.3), upon taking
\[ y_1 = (g, i_1) , \quad y_2 = (g, i_2) , \quad y'_1 = (z_a^{-1} g, j'_1) , \quad y'_2 = (z_a^{-1} g, j'_2) \tag{C.3} \]
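with $g = \gamma e^{2\pi i \tau} \gamma^{-1} \in O_{i_1 i_2 j_1 j_2}$, may be defined by
\[ \iota_\gamma\big( [\gamma, \zeta]_{k\lambda_{i_1 i_2}} \otimes [\gamma, \zeta']^{-1}_{k\lambda_{i_1 j_1}} \otimes [\gamma, \zeta'']_{k\lambda_{i_2 j_2}} \big) = [\gamma w_a^{-1}, \zeta\zeta'\zeta'']_{k\lambda_{j'_1 j'_2}} . \tag{C.4} \]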
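For $Y_\Gamma = Y$ with the projection on $G'$ and $(y, y', y'') \in Y_\Gamma^{[3]}$,
\[ y = (g, i) , \quad y' = (z_a^{-1} g, j') , \quad y'' = (z_b^{-1} g, l') , \quad j = [j' + a] , \quad l = [l' + b] , \tag{C.5} \]
the map
\[ [\gamma, \zeta]^{-1}_{k\lambda_{ij}} \otimes [\gamma w_a^{-1}, \zeta']^{-1}_{k\lambda_{j'[l-a]}} \otimes [\gamma, \zeta'']_{k\lambda_{il}} \mapsto [g, \zeta^{-1} \zeta'^{-1} \zeta'', ijl] \tag{C.6} \]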
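defines for $\gamma_1 = z_a$ and $\gamma_2 = z_a^{-1} z_b$ an isomorphism between the bundle $\tilde R^{\gamma_1, \gamma_2}$, see (5.8), and the pullback of the flat bundle
\[ R^{\gamma_1, \gamma_2} = \Big( \bigsqcup_{i, j', l'} O_{ijl} \times \mathbb{C} \Big) \Big/ \sim \tag{C.7} \]
over $SU(N)$, where the equivalence relation $\sim$ is defined by
\[ (g, \zeta_1, i_1 j_1 l_1) \sim (g, \zeta_2, i_2 j_2 l_2) \quad \text{if} \quad \zeta_1 = \zeta_2 \, \chi_{k\lambda_{l'_1 l'_2}}(w_b w_a^{-1} w_{[b-a]}^{-1}) . \tag{C.8} \]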
For $(y, y', y'', y''') \in Y_\Gamma^{[4]}$ with $y''' = (z_c^{-1} g, n')$, $n = [n' + c]$ and $\gamma_3 = z_b^{-1} z_c$, the isomorphism (5.10) identifies
\[ [g, \zeta, ijl] \otimes [g, \zeta', iln] \cong \chi_{k\lambda_{l'[n-b]}}(w_b w_a^{-1} w_{[b-a]}^{-1}) \cdot [g, \zeta, ijn] \otimes [z_a^{-1} g, \zeta', j'[l-a][n-a]] . \tag{C.9} \]
The flat bundle $P^\gamma$ will be taken trivial and the isomorphisms $\iota_{\gamma_1, \gamma_2}$ of (5.11) will be defined by
\[ (g, \zeta) \otimes (z_a^{-1} g, \zeta') \mapsto (g, \zeta\zeta') \otimes [g, u_{ijl}, ijl] \tag{C.10} \]
for $g \in O_{ijl}$ and $u_{ijl} \in U(1)$. According to the definition of the classes $[g, u_{ijl}, ijl]$, see (C.10), we must have
\[ u_{i_1 j_1 l_1} = \chi_{k\lambda_{l'_1 l'_2}}(w_b w_a^{-1} w_{[b-a]}^{-1}) \, u_{i_2 j_2 l_2} \tag{C.11} \]
for the same $a$ and $b$ that we suppressed in the notation for $u_{ijl}$. Equation (4.10) is a special case of the above relation. Property (5.12) reduces to (4.6), which is consistent with the transformation properties (C.11). It is then enough to consider $u_{0ab} \equiv u_{a[b-a]}$, in which case (4.6) reduces to (4.7).

Appendix D

We shall prove here that the amplitude $A(\phi)$ of Eq. (6.2), when interpreted as a number following the procedure described in Sec. 6, coincides with the expression (2.14). First, note that $\int_c \phi_c^* B = \int_c \phi^* B_{i_c}$. Next, observe that the holonomies in $L$ are
\[ H(\phi_{cb}) = \exp\Big( i \int_b \phi^* A_{i_c i_b} \Big) \bigotimes_{v \in b} s_{i_c i_b}(y_c, y_b) . \]
We then have to compute the numbers assigned for every vertex $v$ to
\[ s_{i_{b_1} i_{c_1}}(y_{b_1}, y_{c_1}) \otimes s_{i_{c_1} i_{b_2}}(y_{c_1}, y_{b_2}) \otimes \cdots \otimes s_{i_{c_n} i_{b_1}}(y_{c_n}, y_{b_1}) , \]
see Fig. 1. For $i_v$ such that $v \in O_{i_v}$ and $y_v = \sigma_{i_v}(\phi(v))$, we shall insert at every second place in the last chain the tensor $s_{i_{b_r} i_v}(y_{b_r}, y_v) \otimes s_{i_v i_{b_r}}(y_v, y_{b_r})$ mapped by $\mu$ to $1 \in L_{(y_{b_r}, y_{b_r})}$. This allows us to split the chain into the blocks
\[ s_{i_v i_{b_r}}(y_v, y_{b_r}) \otimes s_{i_{b_r} i_{c_r}}(y_{b_r}, y_{c_r}) \otimes s_{i_{c_r} i_{b_{r+1}}}(y_{c_r}, y_{b_{r+1}}) \otimes s_{i_{b_{r+1}} i_v}(y_{b_{r+1}}, y_v) . \]
The latter give rise under $\mu$ to the factors $g_{i_v i_{b_r} i_{c_r}}(\phi(v)) \, g^{-1}_{i_v i_{b_{r+1}} i_{c_r}}(\phi(v))$ which build up the product appearing in (2.14).

Appendix E

Let us show that the isomorphism $\iota'$ of (8.3) between $L' \otimes p_1^* N'^{-1} \otimes p_2^* N'$ and the trivial bundle $Z'^{[2]} \times \mathbb{C}$ intertwines the groupoid multiplication if and only if the relation (8.5) holds. Let $y_1 = (g, i)$, $y_2 = (z_a^{-1} g, j')$, and $y_3 = (z_b^{-1} g, l')$ for $g = \gamma e^{2\pi i \tau} \gamma^{-1}$. Consider the elements
\[ f_1 = [\gamma, 1]_{k\lambda_{ij}} \otimes [\gamma, 1]^{-1}_{k(\tau - \lambda_i)} \otimes [\gamma w_a^{-1}, 1]_{k(\sigma_a(\tau) - \lambda_{j'})} \tag{E.1} \]
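in the fiber $(L' \otimes p_1^* N'^{-1} \otimes p_2^* N')_{(y_1, y_2)}$ and
\[ f_2 = [\gamma w_a^{-1}, 1]_{k\lambda_{j'[l'+b-a]}} \otimes [\gamma w_a^{-1}, 1]^{-1}_{k(\sigma_a(\tau) - \lambda_{j'})} \otimes [\gamma w_a^{-1} w_{[b-a]}^{-1}, 1]_{k(\sigma_b(\tau) - \lambda_{l'})} \tag{E.2} \]
in the fiber $(L' \otimes p_1^* N'^{-1} \otimes p_2^* N')_{(y_2, y_3)}$.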
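The product of those two elements is
\[ \mu'(f_1 \otimes f_2) = [\gamma, u_{ijl}]_{k\lambda_{il}} \otimes [\gamma, 1]^{-1}_{k(\tau - \lambda_i)} \otimes [\gamma w_a^{-1} w_{[b-a]}^{-1}, 1]_{k(\sigma_b(\tau) - \lambda_{l'})} , \tag{E.3} \]
see (4.5), where the last tensor factor may be rewritten as
\[ [\gamma w_b^{-1}, \chi_{k(\sigma_b(\tau) - \lambda_{l'})}(w_b w_a^{-1} w_{[b-a]}^{-1})]_{k(\sigma_b(\tau) - \lambda_{l'})} . \tag{E.4} \]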
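From the definition of the isomorphism $\iota'$ we have
\[ \iota'(f_1) = (y_1, y_2, v_{\tau, a}) , \tag{E.5a} \]
\[ \iota'(f_2) = (y_2, y_3, v_{\sigma_a(\tau), [b-a]}) , \tag{E.5b} \]
\[ \iota'(\mu'(f_1 \otimes f_2)) = (y_1, y_3, v_{\tau, b} \, u_{ijl} \, \chi_{k(\sigma_b(\tau) - \lambda_{l'})}(w_b w_a^{-1} w_{[b-a]}^{-1})) . \tag{E.5c} \]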
The product of the first two elements of the trivial bundle over $Z'^{[2]}$ is equal to the third one if and only if
\[ v_{\tau, a} \, v_{\sigma_a(\tau), [b-a]} = v_{\tau, b} \, u_{ijl} \, \chi_{k(\sigma_b(\tau) - \lambda_{l'})}(w_b w_a^{-1} w_{[b-a]}^{-1}) = v_{\tau, b} \, \chi_{k\sigma_b(\tau)}(w_b w_a^{-1} w_{[b-a]}^{-1}) \, u_{a[b-a]} , \tag{E.6} \]
where the last equality follows from (4.10). Upon the shift $b \mapsto [a + b]$ this reduces to (8.5).

Appendix F

Here we prove the cocycle identity (8.6). The left hand side is equal to
\[ \chi_{k\sigma_{[a+b+c]}(\tau)}(w_{[b+c]} w_b^{-1} w_c^{-1}) \, \chi^{-1}_{k\sigma_{[a+b+c]}(\tau)}(w_{[a+b+c]} w_{[a+b]}^{-1} w_c^{-1}) \cdot \chi_{k\sigma_{[a+b+c]}(\tau)}(w_{[a+b+c]} w_a^{-1} w_{[b+c]}^{-1}) \, \chi^{-1}_{k\sigma_{[a+b]}(\tau)}(w_{[a+b]} w_a^{-1} w_b^{-1}) \cdot u_{bc} \, u^{-1}_{[a+b]c} \, u_{a[b+c]} \, u^{-1}_{ab} . \tag{F.1} \]
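With the use of the identity
\[ \chi_{k\sigma_a(\tau)}(w_a t w_a^{-1}) = \chi_{k\tau}(t) \, \chi^{-1}_{k\lambda_a}(t) \tag{F.2} \]
holding for $t$ in the Cartan subgroup and of relation (4.7), this may be rewritten as
\[ \chi_{k\sigma_{[a+b+c]}(\tau)}(w_{[b+c]} w_b^{-1} w_c^{-1}) \, \chi^{-1}_{k\sigma_{[a+b+c]}(\tau)}(w_{[a+b+c]} w_{[a+b]}^{-1} w_c^{-1}) \cdot \chi_{k\sigma_{[a+b+c]}(\tau)}(w_{[a+b+c]} w_a^{-1} w_{[b+c]}^{-1}) \, \chi^{-1}_{k\sigma_{[a+b+c]}(\tau)}(w_c w_{[a+b]} w_a^{-1} w_b^{-1} w_c^{-1}) \tag{F.3} \]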
which is equal to 1 by the multiplicativity of the characters.

Appendix G

Let us prove that $v_{\tau, a} = v^{\lambda_0}_{\tau, a}$ given by (8.12) provides a special solution of (8.5) for $N'$ even and $N''$ odd. For $\tau = \sigma_d(\tau_0)$, we have
\[ v_{\tau, a} \, v_{\sigma_a(\tau), b} \, v^{-1}_{\tau, [a+b]} = \psi([a + d], b)^{-1} \psi(d, [a + b]) \psi(d, a)^{-1} \, v_{\tau_0, a} \, v_{\tau_0, b} \, v^{-1}_{\tau_0, [a+b]} = \psi(a, b)^{-1} \chi^{-1}_{k\lambda_d}(u_{[a+b]} u_a^{-1} u_b^{-1}) \, v_{\tau_0, a} \, v_{\tau_0, b} \, v^{-1}_{\tau_0, [a+b]} , \tag{G.1} \]
where we have used the relations (8.11). Since, by (F.2),
\[ V_{\sigma_d(\tau_0), ab} = V_{\tau_0, ab} \, \chi^{-1}_{k\lambda_d}(u_{[a+b]} u_a^{-1} u_b^{-1}) , \tag{G.2} \]
the identity (8.5) will follow if we show that
\[ v_{\tau_0, a} \, v_{\tau_0, b} \, v^{-1}_{\tau_0, [a+b]} = V_{\tau_0, ab} \, \psi(a, b) . \tag{G.3} \]
The left hand side is −1 −1 χ−1 kτ0 (ub )χkτ0 (u[a+b] )χkτ0 (ua )χkλb (ub )χkλ[a+b] (u[a+b] )χkλa (ua )
( ·
1 b2 −[a+b]2 +a2 2n00
(−1) (
−1 −1 −1 −1 −1 −1 = χkτ0 (u[a+b] u−1 a ub )χkλ[a+b] (u[a+b] ua ub )χkλa (ub )χkλb (ua ) ·
1 (−1)− n00 ab
(G.4)
and it indeed coincides with the right hand side, as may be seen from the relation
\[ V_{\tau, ab} = \chi_{k\tau}(w_a^{-1} w_b^{-1} w_{[a+b]}) \, \chi^{-1}_{k\lambda_{[a+b]}}(w_a^{-1} w_b^{-1} w_{[a+b]}) \, u_{ab} \tag{G.5} \]
following from the definition (8.4) and (F.2).

Appendix H

Here we show that the modification (8.15) with an appropriately chosen character $\rho_{\lambda_0}(a)$ of $Z$ depending on $\lambda_0 = k\tau_0$ with $\tau_0 \in [\tau]$ trivializes the cocycle $\phi_{\lambda_0}(b, a)$ of (8.14). We only have to consider the case of $N'$ even and $N''$ odd since for the other cases $\phi_{\lambda_0}(b, a) = 1$. First, for $z_a \in Z_\tau$, i.e. for $a = a'' n''$,
a00 (n0 −a00 )n00 bk − n0
ψ(b, a) = (−1) 1 =
·
1
k even , n0 k for 0 odd , n for
(−1)−a00 b
k even , n0 k for 0 odd , n
for
(−1)a00 b(n00 −1)
(H.1)
see (8.10b). Let ρ0λ0 (a) = (−1)a with ρ1 defined for
k n0
Pn00 −1 i0 =0
i0 ni0
,
a
ρ1λ0 (a) = (−1) n00
Pn00 −1 i0 =0
i0 ni0
,
(H.2)
odd. A direct check shows that for a = a00 n00 , 0 χ−1 λ0 (ua )ρλ0 (a)
(H.3)
does not depend on the choice of τ0 ∈ [τ ]. It follows that, for nk0 even, φλ0 is Pn00 −1 trivialized by (8.15) if we take ρ = ρ0 . Finally, for nk0 odd, i0 =0 i0 ni0 preserves or changes its parity under the shift λ0 7→ bλ0 for b even or odd, respectively, so that ρ0bλ0 (a)ρ1λ0 (a) ρ0λ0 (a)ρ1bλ0 (a)
00
= (−1)a
b(n00 −1)
(H.4)
and, as the result, φλ0 (b, a) is trivialized by (8.15) with ρ = ρ1 . On the other hand, relations (9.52) and (9.56), together with the proportionality ˇ of the matrix elements Sλλ0 and Sˇλˇλ0 imply that 0
0
ˇ
ˇ
φλ0 (b, a) = e2πib (QJ (λ)+a X)−2πibQJˇ(λ)
(H.5)
with λ another weight such that aλ = λ. In particular, φλ0 (b, a) appearing in (9.52) is λ0 -independent. With the use of expressions (9.15), together with (9.53) and (9.54), one checks that 0
ˇ
ˇ
e2πib QJ (λ)−2πibQJˇ(λ) ˇ bk( n 0 −1)
= (−1)
0
n ˇ =
(−1)b
for N 0 even, N 00 odd,
1
otherwise .
k odd , n ˇ0
(H.6)
On the right hand side, the condition that nˇk0 be odd may be replaced by the 00 requirement that nk0 be odd if we at the same time we replace (−1)b by (−1)a b for a = a00 n00 . On the other hand, (−1)−a00 b for N 0 even, N 00 odd, k odd , a00 bk(N −1) 0 0 n0 (H.7) = e2πia b X = (−1)− n0 1 otherwise . It follows that φλ0 (b, a), as given by (H.5), is equal to 1.
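Appendix I

We construct here the canonical element $\Phi(\tilde\varphi, \varphi)$ of (9.35), where $\varphi : [0, \pi] \to G$ with $\varphi(0) \in C_{\tau_0}$, $\varphi(\pi) \in C_{\tau_1}$ and $\tilde\varphi = z_a^{-1} \varphi$ for $z_a \in Z$. Let us choose, for a sufficiently fine split (partition) of $[0, \pi]$ into subintervals $b$, arbitrary lifts $\phi_b$ and $\tilde\phi_b$ of $\varphi|_b$ and $\tilde\varphi|_b$ to $Y = \bigsqcup O_i$. In other words,
\[ \phi_b = (\varphi|_b, i_b) , \qquad \tilde\phi_b = (\tilde\varphi|_b, \tilde i_b) \tag{I.1} \]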
for some choice of indices such that $\varphi(b) \subset O_{i_b}$ and $\tilde\varphi(b) \subset O_{\tilde i_b}$. Let $\psi_b$ denote the mapping from $b \subset [0, \pi]$ to $Y'^{[2]}$ defined by
\[ \psi_b(x) = (\tilde\phi_b(x), \phi_b(x)) . \tag{I.2} \]
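We also choose lifts $y_v$ and $\tilde y_v$ to $Y$ of $\varphi(v)$ and $\tilde\varphi(v)$, respectively, for vertices $v$ of the partition of $[0, \pi]$. Let us set
\[ \Phi(\tilde\varphi, \varphi) = \iota'^{-1}_0(\tilde y_{\{0\}}, y_{\{0\}}, 1) \otimes \iota'^{-1}_1(y_{\{\pi\}}, \tilde y_{\{\pi\}}, 1) \otimes \Big( \bigotimes_{b \subset [0, \pi]} H_{L'}(\psi_b) \Big) , \tag{I.3} \]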
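where $\iota'_s$ are the bundle isomorphisms given by (8.3). We shall show how $\Phi(\tilde\varphi, \varphi)$ may be considered in a canonical way as an element of the line $(L_{\tilde D_0 \tilde D_1})^{-1}_{\tilde\varphi} \otimes (L_{D_0 D_1})_{\varphi}$. As it stands,
\[ \Phi(\tilde\varphi, \varphi) \in L'_{(\tilde y_{\{0\}}, y_{\{0\}})} \otimes (N'_0)^{-1}_{\tilde y_{\{0\}}} \otimes (N'_0)_{y_{\{0\}}} \otimes L'_{(y_{\{\pi\}}, \tilde y_{\{\pi\}})} \otimes (N'_1)^{-1}_{y_{\{\pi\}}} \otimes (N'_1)_{\tilde y_{\{\pi\}}} \otimes \Big( \bigotimes_{v \in b \subset [0, \pi]} L'_{(\tilde y_b, y_b)} \Big) \tag{I.4} \]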
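where $y_b = \phi_b(v)$ and $\tilde y_b = \tilde\phi_b(v)$. With the use of the groupoid multiplication, the last factor is canonically isomorphic to
\[ \bigotimes_{v \in b \subset [0, \pi]} \big( L'_{(\tilde y_b, \tilde y_v)} \otimes L'_{(\tilde y_v, y_v)} \otimes L'_{(y_v, y_b)} \big) . \tag{I.5} \]
But the line
\[ L'_{(\tilde y_{\{0\}}, y_{\{0\}})} \otimes \Big( \bigotimes_{v \in b \subset [0, \pi]} L'_{(\tilde y_v, y_v)} \Big) \otimes L'_{(y_{\{\pi\}}, \tilde y_{\{\pi\}})} \tag{I.6} \]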
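is canonically trivial since the factors appear in dual pairs. We infer that, in a canonical way,
\[ \Phi(\tilde\varphi, \varphi) \in (N'_0)^{-1}_{\tilde y_{\{0\}}} \otimes \Big( \bigotimes_{v \in b \subset [0, \pi]} L'_{(\tilde y_b, \tilde y_v)} \Big) \otimes (N'_1)_{\tilde y_{\{\pi\}}} \otimes (N'_0)_{y_{\{0\}}} \otimes \Big( \bigotimes_{v \in b \subset [0, \pi]} L'_{(y_v, y_b)} \Big) \otimes (N'_1)^{-1}_{y_{\{\pi\}}} . \tag{I.7} \]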
Recalling that, by construction, the line bundles $N'_s$ over the subsets $\pi^{-1}(C_{\tau_s}) \subset Y$ coincide with the bundles $N_s$ and using the definitions (6.7) and (7.7), we infer that the last line is canonically isomorphic with the line $(L_{\tilde D_0 \tilde D_1})^{-1}_{\tilde\varphi} \otimes (L_{D_0 D_1})_{\varphi}$. The fact that the isomorphisms $\iota'_s$ preserve the groupoid multiplication and the associativity of the groupoid multiplication in $L'$ result in the canonical identification (9.37). We leave the details to the reader. Finally, formula (9.36) is a consequence of the fact that, from the point of view of the group $G'$, multiplication by $\Phi$ gives the canonical isomorphism used to identify two different realizations of the same fiber of the line bundle $L'_{D'_0 D'_1}$.

Acknowledgment

The first author is a membre du C.N.R.S. The second author is supported by the grant Praxis XXI/BD/18138/98 from FCT (Portugal).

References

[1] A. Yu. Alekseev and V. Schomerus, D-branes in the WZW model, Phys. Rev. D60 (1999) R061901–R061902.
[2] O. Alvarez, Topological quantization and cohomology, Commun. Math. Phys. 100 (1985) 279–309.
[3] R. E. Behrend, P. A. Pearce, V. B. Petkova and J.-B. Zuber, Boundary conditions in rational conformal field theories, Nucl. Phys. B579 (2000) 707–773.
[4] P. Bouwknegt, A. L. Carey, V. Mathai, M. K. Murray and D. Stevenson, Twisted K-theory and K-theory of bundle gerbes, arXiv:hep-th/0106194.
[5] K. S. Brown, Cohomology of Groups, Graduate Texts in Mathematics, Vol. 87, Springer, Berlin, 1982.
[6] J.-L. Brylinski, Loop Spaces, Characteristic Classes and Geometric Quantization, Prog. Math. 107, Birkhäuser, Boston, 1993.
[7] C. Callan, J. Harvey and A. Strominger, Worldbrane actions for string solitons, Nucl. Phys. B367 (1991) 60–82.
[8] A. Cappelli, C. Itzykson and J.-B. Zuber, The A-D-E classification of minimal and $A_1^{(1)}$ conformal invariant theories, Commun. Math. Phys. 113 (1987) 1–26.
[9] J. L. Cardy, Boundary conditions, fusion rules and the Verlinde formula, Nucl. Phys. B324 (1989) 581–598.
[10] A. Carey, J. Mickelsson and M. Murray, Bundle gerbes applied to quantum field theory, Rev. Math. Phys. 12 (2000) 65–90.
[11] A. L. Carey, S. Johnson and M. K. Murray, Holonomy on D-Branes, arXiv:hep-th/0204199.
[12] H. Cartan, Homologie des Groupes, III–IV, Séminaire H. Cartan, ENS, 1951–1952.
[13] D. S. Chatterjee, On gerbes, Cambridge University thesis, 1998.
[14] D.-E. Diaconescu, M. R. Douglas and J. Gomis, Fractional branes and wrapped branes, J. High Energy Phys. (1998) (2) 013.
[15] D.-E. Diaconescu and J. Gomis, Fractional branes and boundary states in orbifold theories, J. High Energy Phys. (2000) (10) 001.
[16] S. Eilenberg, Homologie des Groupes, I–II, Séminaire H. Cartan, ENS, 1951–1952.
[17] H. Esnault and E. Viehweg, Deligne–Beilinson cohomology, in Beilinson Conjectures on Special Values of L-Functions, eds. M. Rapoport, N. Schappacher and P. Schneider, Perspectives in Mathematics Vol. 4, Academic Press, 1988, pp. 43–92.
[18] G. Felder, J. Fröhlich, J. Fuchs and C. Schweigert, Conformal boundary conditions and three-dimensional topological field theory, Phys. Rev. Lett. 84 (2000) 1659–1662.
[19] G. Felder, K. Gawędzki and A. Kupiainen, Spectra of Wess–Zumino–Witten models with arbitrary simple groups, Commun. Math. Phys. 117 (1988) 127–158.
[20] D. S. Freed and E. Witten, Anomalies in string theory with D-branes, arXiv:hep-th/9907189.
[21] J. Fuchs, B. Schellekens and C. Schweigert, A matrix S for all simple current extensions, Nucl. Phys. B473 (1996) 323–366.
[22] J. Fuchs, L. R. Huiszoon, A. N. Schellekens, C. Schweigert and J. Walcher, Boundaries, crosscaps and simple currents, Phys. Lett. B495 (2000) 427–434.
[23] J. Fuchs and C. Schweigert, The action of outer automorphisms on bundles of chiral blocks, Commun. Math. Phys. 206 (1999) 691–736.
[24] P. Gajer, Geometry of Deligne cohomology, arXiv:alg-geom/9601025.
[25] K. Gawędzki, Topological actions in two-dimensional quantum field theories, in Non-Perturbative Quantum Field Theory, eds. G. 't Hooft, A. Jaffe, G. Mack, P. K. Mitter and R. Stora, Plenum Press, New York, 1988, pp. 101–142.
[26] K. Gawędzki, Conformal field theory: a case study, in Conformal Field Theory: New Non-Perturbative Methods in String and Field Theory, eds. Y. Nutku, C. Saclioglu and T. Turgut, Perseus, 2000.
[27] K. Gawędzki, Boundary WZW, G/H, G/G and CS theories, arXiv:hep-th/0108044.
[28] K. Gawędzki, I. Todorov and P. Tran-Ngoc-Bich, Canonical quantization of the boundary Wess–Zumino–Witten model, arXiv:hep-th/0101170.
[29] J. Giraud, Cohomologie Non-Abélienne, Grundl. Vol. 179, Springer, 1971.
[30] N. Hitchin, Lectures on special Lagrangian submanifolds, arXiv:math.DG/9907034.
[31] V. G. Kac, Infinite Dimensional Lie Algebras, Cambridge University Press, 1985.
[32] A. Kirillov, Elements of the Theory of Representations, Springer, Berlin, Heidelberg, New York, 1975.
[33] B. Kostant, Quantization and Unitary Representations, Lecture Notes in Math., Vol. 170, Springer, Berlin, 1970, pp. 87–207.
[34] M. Kreuzer and A. N. Schellekens, Simple currents versus orbifolds with discrete torsion – a complete classification, Nucl. Phys. B411 (1994) 97–121.
[35] K. Matsubara, V. Schomerus and M. Smedback, Open strings in simple current orbifolds, Nucl. Phys. B626 (2002) 53–72.
[36] G. Moore and N. Seiberg, Taming the conformal Zoo, Phys. Lett. B220 (1989) 422–430.
[37] E. Meinrenken, The basic gerbe over a compact simple Lie group, arXiv:math.DG/0209194.
[38] M. K. Murray, Bundle gerbes, J. London Math. Soc. 54(2) (1996) 403–416.
[39] M. K. Murray and D. Stevenson, Bundle gerbes: stable isomorphisms and local theory, J. London Math. Soc. 62(2) (2000) 925–937.
[40] V. B. Petkova and J.-B. Zuber, From CFT's to graphs, Nucl. Phys. B463 (1996) 161–193.
[41] V. B. Petkova and J.-B. Zuber, Conformal boundary conditions and what they teach us, arXiv:hep-th/0103007.
[42] E. R. Sharpe, Discrete torsion and gerbes I, II, arXiv:hep-th/9909108 and 9909120.
[43] E. R. Sharpe, Discrete torsion, quotient stacks, and string orbifolds, arXiv:math.DG/0110156.
[44] C. Schweigert, J. Fuchs and J. Walcher, Conformal field theory, boundary conditions and applications to string theory, arXiv:hep-th/0011109.
[45] C. Vafa, Modular invariance and discrete torsion on orbifolds, Nucl. Phys. B273 (1986) 592–606.
[46] E. Witten, Non-abelian bosonization in two dimensions, Commun. Math. Phys. 92 (1984) 455–472.
[47] E. Witten, Quantum field theory and the Jones polynomial, Commun. Math. Phys. 121 (1989) 351–399.
[48] M. Mackaay and R. Picken, Holonomy and parallel transport for Abelian gerbes, arXiv:math.DG/0007053.
December 11, 2002 9:50 WSPC/148-RMP
00154
Reviews in Mathematical Physics, Vol. 14, No. 12 (2002) 1335–1401 c World Scientific Publishing Company
EUCLIDEAN GIBBS STATES OF QUANTUM LATTICE SYSTEMS
S. ALBEVERIO
Abteilung für Stochastik, Universität Bonn, D 53115 Bonn (Germany); SFB 256; BiBoS Research Center, Bielefeld (Germany); CERFIM, Locarno and USI (Switzerland)
[email protected]

YU. KONDRATIEV
Fakultät für Mathematik, Universität Bielefeld, D 33615 Bielefeld (Germany); SFB 256; BiBoS Research Center, Bielefeld (Germany); Institute of Mathematics, Kiev (Ukraine)
[email protected]

YU. KOZITSKY
Instytut Matematyki, Uniwersytet Marii Curie-Skłodowskiej, PL 20-031 Lublin (Poland)
[email protected]

M. RÖCKNER
Fakultät für Mathematik, Universität Bielefeld, D 33615 Bielefeld (Germany)
[email protected]

Received 22 March 2001
Revised 16 June 2002

An approach to the description of the Gibbs states of lattice models of interacting quantum anharmonic oscillators, based on integration in infinite dimensional spaces, is described in a systematic way. Its main feature is the representation of the local Gibbs states by means of certain probability measures (local Euclidean Gibbs measures). This makes it possible to employ the machinery of conditional probability distributions, known in classical statistical physics, and to define the Gibbs state of the whole system as a solution of the equilibrium (Dobrushin–Lanford–Ruelle) equation. With the help of this representation the Gibbs states are extended to a certain class of unbounded multiplication operators, which includes the order parameter and the fluctuation operators describing the long range ordering and the critical point respectively. It is shown that the local Gibbs states converge, when the mass of the particle tends to infinity, to the states of the corresponding classical model. A lattice approximation technique, which allows one to prove for the local Gibbs states analogs of known correlation inequalities, is developed. As a result, certain new inequalities are derived. By means of them, a number of statements describing physical properties of the model are proved. Among them are: the existence of the long-range order for low temperatures and large values of the particle mass; the suppression of the critical point behavior for small values of the mass and for all temperatures; the uniqueness of the Euclidean Gibbs states for all temperatures and for the values of the mass less than a certain threshold value, dependent on the temperature.

Keywords: Green function; correlation inequality; critical point; phase transition.

Mathematics Subject Classification 2000: 82B10
Contents

1. Introduction
2. Euclidean Formalism for Quantum Gibbs States
2.1. Local Gibbs States
2.2. Basic Gaussian Measure
2.3. Euclidean Gibbs States
3. Classical Limits
4. Green Functions for Unbounded Operators
5. Lattice Approximation
6. Basic Inequalities
7. More Inequalities
7.1. Scalar Domination
7.2. Zero Boundary Domination
7.3. Refined Gaussian Upper Bound
8. Applications
8.1. Existence of the Long Range Order
8.2. Normality of Fluctuations and Suppression of Critical Points
8.3. Uniqueness of Gibbs States
Appendix
References
1. Introduction Gibbs states of quantum systems living on a lattice IL are constructed as positive normalized functionals on von Neumann algebras whose elements (observables) represent physical quantities characterizing a given system (see [27, 52]). If the algebra of observables of each subsystem in a finite Λ ⊂ IL may be regarded as a C ∗ -algebra of bounded operators on a Hilbert space, the construction of the Gibbs states is performed within an algebraic approach, which now is quite well elaborated [27]. But if one needs to include into consideration also unbounded operators, the situation becomes much more complicated and the construction of Gibbs states even for simple models turns into a very hard task (for more details on this problem see the discussion in [50, Chap. IV, pp. 169, 170] and [51]). In 1975, an approach to the construction of Gibbs states of lattice systems of interacting quantum particles performing D-dimensional oscillations around their equilibrium positions has been initiated [1]. This approach employs the integration theory in path spaces (see also [2, 17, 18, 21, 44, 45, 52, 61, 77]). It is based on the fact, discovered by R. Høegh-Krohn [49], that the C ∗ -algebra of observables of every
December 11, 2002 9:50 WSPC/148-RMP
00154
Euclidean Gibbs States of Quantum Lattice Systems
1337
subsystem in a finite subset $\Lambda \subset \mathbb{L}$ is spanned by the operators of a certain type constructed with bounded multiplication operators. The essence of the approach is that the states at a given temperature $T = \beta^{-1}$ taken on such operators (Green functions) are written as expectations with respect to a probability measure $\mu_{\beta,\Lambda}$ on a certain infinite-dimensional space, obtained as a perturbation of a Gaussian measure. Then, for 'nice' perturbing functions, it was proved that, in the thermodynamic limit $\Lambda \nearrow \mathbb{L}$, the weak limit $\mu_{\beta,\Lambda} \Rightarrow \mu_\beta$ exists. The Gibbs state of the whole system as a functional was reconstructed by means of $\mu_\beta$, analogously to the case of the Euclidean quantum field theory (see [1, Sec. 4, in particular Theorem 4.1] for the reconstruction of the Gibbs state). That is the reason why $\mu_\beta$ is often called a Euclidean Gibbs state of a quantum system. This approach was further developed in [17–21, 44, 45, 53]. As a result, it has become possible to develop substantially the theory of Gibbs states in the models of quantum anharmonic crystals employing unbounded operators. In particular, for a model of this type, the convergence at the critical point of the states taken on fluctuation operators was proven [4]; this is the only result of this kind obtained for quantum models. In this article, we intend to describe the most important aspects of this approach in a systematic way. Though being mainly a review article based on our works [5–8, 19, 20, 53–57], this paper contains some new results (see the last paragraph of this introduction). Since the Euclidean Gibbs state $\mu_\beta$ is a measure, in order to establish the set of all possible such states, one can apply the machinery of conditional probability distributions, known in classical statistical physics (see [34, 35, 43]). This was done in [8, 12–15, 64].
Certain information regarding the properties of the systems with large values of D may be obtained by means of perturbation arguments with respect to 1/D, as it has been done in [16]. Starting from [1], as a main tool in studying such states, various cluster expansion techniques were employed [10, 60, 63, 65, 69]. As a result, the existence and certain properties of the Euclidean Gibbs states (ergodicity, decay of correlations) were obtained for high temperatures [65], or for all temperatures in the case of the one-dimensional lattice [63, 65]. In [60], for small values of the particle’s mass, the convergence of corresponding cluster expansions was proved for all values of the temperature including zero. This made it possible to prove the existence of temperature and ground states and to describe a number of properties of these states. The convergence of cluster expansions implies analyticity in the coupling parameter which, for systems of particles moving on compact manifolds (considered in [10]), or for the case of ‘gentle’ anharmonicity (studied in [1]), corresponds to the uniqueness of the Euclidean Gibbs states. However, for systems with unbounded oscillations (and hence described by unbounded operators), as in the case considered in this work, it is impossible to recover the uniqueness of the states from the convergence of a cluster expansion. An alternative approach consists in establishing correlation inequalities, as it has been used in solving various problems of classical statistical physics [26, 28, 36–42, 48, 62, 75, 84]. To apply such inequalities to the Euclidean Gibbs states one should approximate
them by classical (i.e. non-quantum) Gibbs measures. In the Euclidean quantum field theory this is known as the lattice approximation technique [76, 77]. Starting from the early seventies, great efforts have been made to generalize the traditional algebraic schemes of the construction of states on $C^*$-algebras to the algebras of unbounded operators [68, 72, 73]. The status quo in this domain, as well as an extensive bibliography, may be found in [50, 51]. It should be stressed here that within such an algebraic approach only the states for finite families of particles of the type considered in this work have been constructed. Thus the Euclidean approach remains so far the only method which allows one to construct the Gibbs states for the infinite systems of quantum particles described by unbounded operators. We consider the following quantum lattice system. To each point of the lattice $\mathbb{L} = \mathbb{Z}^d$, $d \in \mathbb{N}$, there is attached a quantum particle (oscillator) with the reduced mass $m = m_{\rm ph}/\hbar^2$ ($m_{\rm ph}$ being the physical mass), which has an unstable equilibrium position at this point. Such particles perform $D$-dimensional oscillations around their equilibrium positions and interact via an attractive potential. Similar objects have been studied for many years as quite realistic models of crystalline substances undergoing structural phase transitions, one of the most spectacular phenomena of contemporary statistical physics (see [30, 31, 71, 81]). They are also used as parts of the models which describe strong electron-electron correlations caused by the interaction of electrons with oscillating ions [41, 82, 83]. In the case considered, the phase transition is connected with the appearance of macroscopic displacements of particles (a long-range order), which break the $O(D)$-symmetry possessed by the model, when the dimensions $d$, $D$, the mass $m$, the temperature $\beta^{-1}$, and the parameters of the potential energy satisfy certain conditions.
These phenomena were studied mathematically in various papers, see e.g. [19, 36, 53, 67, 86, 87]. The essential problem in this context is to understand how a quantum model becomes more and more classical, i.e. how (and whether) the quantum Gibbs states converge to the corresponding classical Gibbs states. On the other hand, it is equally important to understand the role of quantum effects in phase transitions in such models. As was justified on the physical level [71] and observed experimentally (see [85] and [30, Chap. 2.5.4.3]), quantum effects may suppress the long-range ordering. For the one-dimensional oscillations (i.e. for $D = 1$), this was proved in [87]. Later on it was shown in [5, 6] ($D = 1$) and [54, 55, 56] ($D \in \mathbb{N}$) that not only the long-range order but also any critical anomaly of the displacements of particles from the equilibrium positions is suppressed at all temperatures if the model is 'strongly quantum', which may occur in particular if the mass $m$ is small enough. Another important problem of the mathematical theory of models which exhibit such phenomena is the uniqueness of their Gibbs states. Such uniqueness would imply the absence of all critical anomalies and, all the more, of the long-range order. Therefore, one may expect the uniqueness of Gibbs states at all values of the temperature for 'strongly quantum' models. First the uniqueness of the Euclidean Gibbs states for the model considered in this work (for $D = 1$) was proved to occur
under conditions which were irrelevant to the 'quantumness' of the model (e.g. for high temperatures). This was done in [12–15] by means of the logarithmic Sobolev inequalities. Then in [8] the mentioned uniqueness was proved to hold for $D = 1$ and for every fixed inverse temperature $\beta$ if the mass $m$ is less than some bound $m_*$ depending on $\beta$. The present paper is organized as follows. In Sec. 2 we describe the models which will be considered throughout the article. Necessary facts from the theory of local Gibbs states of such models are also presented there. Thereafter, we introduce a Gaussian measure, which plays a key role in our approach. Then its properties, which we use in the sequel, are described in detail. By means of this measure we define local Euclidean Gibbs measures corresponding to different boundary conditions. The Green functions constructed by bounded multiplication operators for the periodic and zero boundary conditions are written as moments of the Euclidean Gibbs measures. Moreover, by means of such measures, we introduce the Green functions corresponding to nonzero boundary conditions. Then we give the definition of the Euclidean Gibbs state for the whole system as a solution of the Dobrushin–Lanford–Ruelle equation. In Sec. 3, the results of which were announced in [7], we show that such states converge, when $m \to +\infty$, to the states isomorphic to the Gibbs states of the corresponding classical models. Section 4 is based on [54–57]. It is dedicated to the extension of the Green functions (and hence of the local Gibbs states) to a certain class of unbounded operators, which includes the order parameter and fluctuation operators describing the long-range ordering and the critical points of the models considered. In Sec. 5 we prove that the local Euclidean Gibbs measures may be approximated by finite-dimensional measures corresponding to general ferromagnets.
This allows us to prove analogs of known correlation inequalities for the moments of the local Euclidean Gibbs states (Sec. 6). In Sec. 7 we use these inequalities to prove a number of new inequalities, such as scalar domination, zero boundary domination, refined Gaussian upper bound. In Sec. 8, which is based on [5, 6, 8, 19, 53–57], we apply these results to the description of certain physical properties of the models considered. Thus, we prove the existence of the long-range order (Theorem 8.1). By means of the scalar domination inequality we show that the fluctuations of the displacement of particles remain normal, at all temperatures and for all dimensions of the oscillations, if the energy of zero-point oscillations of a given particle exceeds a certain value proportional to the energy of its interaction with the rest of the particles. In particular, this occurs when the smallest distance between the energy levels of the corresponding one-dimensional isolated oscillator is large enough or its mass is small enough (Theorem 8.3). Under a similar condition we prove that the Euclidean Gibbs state of the whole system is unique (Theorem 8.4). To this end we use the zero boundary domination inequality. General infinite dimensional methods we use in this article may be found in [23, 59]. Now let us mention which new results are contained in the present article. In Sec. 2 we give a complete description of the properties of the basic Gaussian measure
(Lemmas 2.2–2.4). In Sec. 3 we give a complete proof of Theorems 3.2, 3.3; in [7] these theorems were only announced. In Sec. 4 we prove that the Green functions, constructed in the Euclidean region by certain unbounded operators, may be analytically continued to the same domain as the functions corresponding to bounded operators, although the former functions cannot be bounded uniformly in this domain (Theorem 4.1). Here we also prove that the Green functions corresponding to nonempty boundary conditions, and constructed by certain unbounded multiplication operators, are continuous in the Euclidean domain (Theorem 4.2). The lattice approximation technique has been known in the context of quantum fields at least since the seventies [76]. Section 5 gives a version of this technique with a complete proof adapted to the models we consider. The proof of Theorem 7.4 is also new. A similar statement was proved in [6] but by means of a much more complicated technique. Theorem 8.2, proved in Sec. 8, is a generalization of a similar statement proved in [54]. Finally, the uniqueness of Euclidean Gibbs states (Theorem 8.4) is proved here for more general models than in [8].

2. Euclidean Formalism for Quantum Gibbs States

2.1. Local Gibbs states

As mentioned above, we consider a countable system of interacting quantum particles with the reduced mass $m$, performing $D$-dimensional oscillations around their equilibrium positions, which form a lattice $\mathbb{L} = \mathbb{Z}^d$. The oscillations of the particle having its equilibrium position at $l \in \mathbb{L}$ are described by the momentum and displacement operators $\{p_l, q_l\}$ obeying the canonical commutation relations and densely defined on the complex Hilbert space $\mathcal{H}_l = L^2(\mathbb{R}^D_l)$. The whole system is described by the formal Hamiltonian
\[
H = \frac{1}{2} \sum_{l,l' \in \mathbb{L}} d_{ll'} (q_l, q_{l'}) + \sum_{l \in \mathbb{L}} H_l , \tag{2.1}
\]
\[
H_l = \frac{1}{2m} (p_l, p_l) + \frac{1}{2} (q_l, q_l) + V(q_l) , \tag{2.2}
\]
where $(\cdot\,,\cdot)$ stands for the scalar product in $\mathbb{R}^D$ and $d_{ll'}$ are the elements of the dynamical matrix. The one-particle potential $V$ is supposed to be $O(D)$-invariant, i.e.,
\[
V(x) = v((x, x)) . \tag{2.3}
\]
Generally, regarding the function $v$ we will assume that it is continuous on $\mathbb{R}_+ \stackrel{\rm def}{=} [0, +\infty)$ and obeys the following condition
\[
v(\xi) \ge a\xi + b , \quad \forall \xi \in \mathbb{R}_+ , \tag{2.4}
\]
with certain positive $a$ and $b \in \mathbb{R}$. Sometimes we will impose more restrictive conditions:
(V1) $v$ is a polynomial of order $r \ge 2$, convex on $\mathbb{R}_+$;
(V2) $v$ has the form
\[
v(\xi) = \frac{1}{2} a \xi + \sum_{s=2}^{r} b_s \xi^s , \quad r \ge 2 , \quad a \in \mathbb{R} , \quad b_s \ge 0 , \quad b_r > 0 . \tag{2.5}
\]
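Clearly, a function $v$ which obeys (V2) also obeys (V1). For $p \in \mathbb{Z}$, let
\[
S_p = \Big\{ x = (x_l)_{l \in \mathbb{L}} \ \Big|\ \sum_{l \in \mathbb{L}} (1 + |l|)^{2p} x_l^2 < \infty \Big\} ,
\]
where $|l|$ is the Euclidean distance on $\mathbb{L} = \mathbb{Z}^d \subset \mathbb{R}^d$. Let also
\[
S \stackrel{\rm def}{=} \bigcap_{p \in \mathbb{N}} S_p , \qquad S' \stackrel{\rm def}{=} \bigcup_{p \in \mathbb{N}} S_{-p} . \tag{2.6}
\]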
These sets, equipped with the projective-limit ($S$) and inductive-limit ($S'$) topologies respectively, constitute a pair of Schwartz spaces that are mutually dual with respect to the Hilbert space $S_0 = l^2(\mathbb{L})$. The dynamical matrix $(d_{ll'})_{l,l' \in \mathbb{L}}$ is supposed to possess the following properties:
(D1) $d_{ll'}$ is invariant under translations on $\mathbb{L}$;
(D2) $d_{ll'} \le 0$ (ferroelectricity), $d_{ll} = 0$;
(D3) for every $l \in \mathbb{L}$, $(d_{ll'})_{l' \in \mathbb{L}}$ belongs to $S$.
The formal Hamiltonian cannot be defined directly and is "represented" by local Hamiltonians $H_\Lambda$, indexed by finite subsets $\Lambda \subset \mathbb{L}$, which are essentially self-adjoint and lower bounded (due to (2.4)) operators acting in the complex Hilbert space $\mathcal{H}_\Lambda = L^2(\mathbb{R}^{D|\Lambda|})$ ($|\cdot|$ stands for cardinality). In standard situations, and also in this article, it is enough to consider Hamiltonians indexed by the boxes
\[
\Lambda = \{ l = (l_1, \dots, l_d) \mid l_j^0 \le l_j \le l_j^1 ,\ j = 1, \dots, d ;\ l_j^0 < l_j^1 ,\ l_j^0, l_j^1 \in \mathbb{Z} \} .
\]
For a box $\Lambda$, let $P(\Lambda)$ denote the partition of $\mathbb{L}$ by the boxes which are obtained as translations of $\Lambda$. Let also $T$ be the group of all translations of $\mathbb{L}$, and $T(\Lambda) \subset T$ be its subgroup consisting of the translations which generate $P(\Lambda)$, i.e. $P(\Lambda) = \{ t(\Lambda) \mid t \in T(\Lambda) \}$, where $t(\Lambda) = \{ t(l) \mid l \in \Lambda \}$. Then the dynamical matrix $(d^\Lambda_{ll'})_{l,l' \in \Lambda}$, obeying periodic conditions on the boundaries of $\Lambda$, and the corresponding local Hamiltonian $H_\Lambda$ are introduced as follows:
\[
d^\Lambda_{ll'} = \min\{ d_{l\,t(l')} : t \in T(\Lambda) \} , \tag{2.7}
\]
\[
H_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d^\Lambda_{ll'} (q_l, q_{l'}) + \sum_{l \in \Lambda} H_l . \tag{2.8}
\]
The dynamical matrix (dΛ ll0 )l,l0 ∈Λ is invariant with respect to the translations on the torus which one obtains by identifying the opposite boundaries of the box Λ.
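The periodization rule (2.7) can be illustrated by a minimal sketch on a one-dimensional lattice. The coupling `d_infinite`, its finite range, the box size and the truncation `max_shift` are all hypothetical choices made for illustration only; they are not taken from the article.

```python
import numpy as np

def d_infinite(l, lp, J=1.0):
    """Hypothetical finite-range coupling on Z with d <= 0 and d_ll = 0 (assumption)."""
    r = abs(l - lp)
    if r == 1:
        return -J
    if r == 2:
        return -J / 4.0
    return 0.0

def periodic_matrix(n, max_shift=3):
    """d^Lambda_{ll'} = min over translations t of d_{l, t(l')} for the box {0, ..., n-1}.

    Translations in T(Lambda) are shifts by multiples of n; since the coupling has
    finite range, a few shifts are enough to attain the minimum.
    """
    shifts = range(-max_shift, max_shift + 1)
    d_box = np.empty((n, n))
    for l in range(n):
        for lp in range(n):
            d_box[l, lp] = min(d_infinite(l, lp + t * n) for t in shifts)
    return d_box

d_box = periodic_matrix(6)
# Sites 0 and 5 are neighbours on the torus, so they couple like nearest neighbours.
print(d_box[0, 5], d_box[0, 4], d_box[0, 3])
```

In this toy case the resulting matrix depends only on the difference of the indices modulo the box size, which is the translation invariance on the torus mentioned above.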
These translations constitute a factor-group $T/T(\Lambda)$. The local Hamiltonian which corresponds to the zero boundary condition is
\[
H^{(0)}_\Lambda = \frac{1}{2} \sum_{l,l' \in \Lambda} d_{ll'} (q_l, q_{l'}) + \sum_{l \in \Lambda} H_l . \tag{2.9}
\]
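For a box $\Lambda$, the local periodic Gibbs state $\gamma_{\beta,\Lambda}$ at a given value of the temperature $T = \beta^{-1}$ is defined on $\mathfrak{A}_\Lambda$, the $C^*$-algebra of all bounded operators on $\mathcal{H}_\Lambda$, as the following positive normalized functional:
\[
\gamma_{\beta,\Lambda}(A) = \frac{\operatorname{trace}(A \exp(-\beta H_\Lambda))}{\operatorname{trace} \exp(-\beta H_\Lambda)} . \tag{2.10}
\]
The state $\gamma^{(0)}_{\beta,\Lambda}$ corresponding to the zero boundary condition is defined in the same way but with the Hamiltonian $H^{(0)}_\Lambda$ (2.9) instead of $H_\Lambda$. Given a box $\Lambda$ and $t \in \mathbb{R}$, we introduce the following automorphisms of $\mathfrak{A}_\Lambda$:
\[
a^\Lambda_t(A) = \exp(itH_\Lambda) A \exp(-itH_\Lambda) , \qquad
a^{0,\Lambda}_t(A) = \exp(itH^{(0)}_\Lambda) A \exp(-itH^{(0)}_\Lambda) . \tag{2.11}
\]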
A significant role in the construction of the Gibbs states on the algebras $\mathfrak{A}_\Lambda$ is played by multiplication operators. Recall that, for a function $A : \mathbb{R}^{D|\Lambda|} \to \mathbb{C}$, the multiplication operator $A \in \mathfrak{A}_\Lambda$ acts on $\Psi \in \mathcal{H}_\Lambda$ as follows:
\[
(A\Psi)(x) = A(x)\Psi(x) .
\]
The components $q_l^{(\alpha)}$, $\alpha = 1, 2, \dots, D$, $l \in \Lambda$, of the displacement operator are multiplication operators, but they do not belong to $\mathfrak{A}_\Lambda$ since they are unbounded. R. Høegh-Krohn in [49] proved the following assertion (for more details see also [1] and [45]).

Proposition 2.1. $\mathfrak{A}_\Lambda$ is the smallest strongly closed linear space containing all operators of the form
\[
a^\Lambda_{t_1}(A_1)\, a^\Lambda_{t_2}(A_2) \cdots a^\Lambda_{t_n}(A_n) ,
\]
with all possible $n \in \mathbb{N}$, $t_1, \dots, t_n \in \mathbb{R}$ and $A_1, \dots, A_n$ being bounded continuous functions $A_j : \mathbb{R}^{D|\Lambda|} \to \mathbb{C}$. The same remains true if one replaces $a^\Lambda_t$ with $a^{0,\Lambda}_t$.

For $A_1, \dots, A_n \in \mathfrak{A}_\Lambda$ and $t_1, \dots, t_n \in \mathbb{R}$, the temporal Green functions corresponding to the periodic and zero boundary conditions are
\[
G^{\beta,\Lambda}_{A_1,\dots,A_n}(t_1, \dots, t_n) = \gamma_{\beta,\Lambda}\big(a^\Lambda_{t_1}(A_1) \cdots a^\Lambda_{t_n}(A_n)\big) , \tag{2.12}
\]
\[
G^{0,\beta,\Lambda}_{A_1,\dots,A_n}(t_1, \dots, t_n) = \gamma^{(0)}_{\beta,\Lambda}\big(a^{0,\Lambda}_{t_1}(A_1) \cdots a^{0,\Lambda}_{t_n}(A_n)\big) . \tag{2.13}
\]
For a domain $O \subset \mathbb{C}^n$, let $\mathrm{Hol}(O)$ stand for the set of all complex-valued functions holomorphic in $O$. Let also
\[
D^\beta_n \stackrel{\rm def}{=} \{ (t_1, \dots, t_n) \in \mathbb{C}^n \mid 0 < \Im(t_1) < \Im(t_2) < \cdots < \Im(t_n) < \beta \} . \tag{2.14}
\]
By virtue of [1, Sec. 3] and [49, Sec. 2], we prove the following statement.

Lemma 2.1. For every $A_1, \dots, A_n \in \mathfrak{A}_\Lambda$,
(a) $G^{\beta,\Lambda}_{A_1,\dots,A_n}$ may be extended to a holomorphic function on $D^\beta_n$;
(b) this extension (which will also be written as $G^{\beta,\Lambda}_{A_1,\dots,A_n}$) is continuous on the closure $\bar{D}^\beta_n$ of $D^\beta_n$; moreover, for all $(t_1, \dots, t_n) \in \bar{D}^\beta_n$,
\[
|G^{\beta,\Lambda}_{A_1,\dots,A_n}(t_1, \dots, t_n)| \le \|A_1\| \cdots \|A_n\| , \tag{2.15}
\]
where $\|\cdot\|$ stands for the operator norm;
(c) for every $\xi_1, \dots, \xi_n \in \mathbb{R}$, the set
\[
D^\beta_n(\xi_1, \dots, \xi_n) \stackrel{\rm def}{=} \{ (t_1, \dots, t_n) \in D^\beta_n \mid \Re(t_j) = \xi_j ,\ j = 1, \dots, n \} \tag{2.16}
\]
is such that for arbitrary $F, G \in \mathrm{Hol}(D^\beta_n)$, the equality $F = G$ on $D^\beta_n(\xi_1, \dots, \xi_n)$ implies that these functions are equal on the whole $D^\beta_n$.
The Green function $G^{0,\beta,\Lambda}_{A_1,\dots,A_n}$ has the same properties.

Proof. It is known (see [24, p. 57]) that the Hamiltonian $H_\Lambda$ (2.8) has a discrete spectrum consisting of positive eigenvalues $E_s$, $s \in \mathbb{N}$. The corresponding eigenfunctions $\Psi_s$ constitute an orthonormal basis of the space $L^2(\mathbb{R}^{D|\Lambda|})$. We set
\[
H_\Lambda \Psi_s = E_s \Psi_s , \qquad A_{s,s'} = (A\Psi_s, \Psi_{s'})_{L^2(\mathbb{R}^{D|\Lambda|})} . \tag{2.17}
\]
Then
\[
\begin{aligned}
G^{\beta,\Lambda}_{A_1,\dots,A_n}(t_1, \dots, t_n) = \frac{1}{Z_{\beta,\Lambda}} \sum_{s_1,\dots,s_n \in \mathbb{N}} & (A_1)_{s_1,s_2} \exp[i(t_2 - t_1)E_{s_2}] \\
& \times \cdots \times (A_{n-1})_{s_{n-1},s_n} \exp[i(t_n - t_{n-1})E_{s_n}] \\
& \times (A_n)_{s_n,s_1} \exp[i(t_1 - t_n + i\beta)E_{s_1}] ,
\end{aligned} \tag{2.18}
\]
where
\[
Z_{\beta,\Lambda} = \operatorname{trace}\{\exp(-\beta H_\Lambda)\} . \tag{2.19}
\]
Each element of the Dirichlet series (2.18) is an entire function of $(t_1, \dots, t_n)$. Hence its modulus achieves its maximal value on $\bar{D}^\beta_n$ on the boundary of this set, that is, at the points $\Im(t_1) = \Im(t_2) = \cdots = \Im(t_k) = 0$ and $\Im(t_{k+1}) = \cdots = \Im(t_n) = \beta$ with $k$ running from 1 to $n$. For such $(t_1, \dots, t_n)$, one has
\[
\big|\big(a^\Lambda_{t_1}(A_1) \cdots a^\Lambda_{t_n}(A_n) \exp[-\beta H_\Lambda]\Psi_s, \Psi_s\big)_{L^2(\mathbb{R}^{D|\Lambda|})}\big|
\le \big|\big(K_{k+1} \cdots K_n K_1 \cdots K_k \Psi_s, \Psi_s\big)_{L^2(\mathbb{R}^{D|\Lambda|})}\big| \exp[-\beta E_s] , \tag{2.20}
\]
where
\[
K_j = a^\Lambda_{\theta_j}(A_j) , \qquad \theta_j = \Re(t_j) , \qquad j = 1, \dots, n . \tag{2.21}
\]
The number $k$ depends on $s$. Obviously,
\[
\big|\big(K_{k+1} \cdots K_n K_1 \cdots K_k \Psi_s, \Psi_s\big)_{L^2(\mathbb{R}^{D|\Lambda|})}\big| \le \|K_{k+1} \cdots K_n K_1 \cdots K_k\| \le \|K_1\| \cdots \|K_n\| ,
\]
yielding
\[
\big|\operatorname{trace}\{a^\Lambda_{t_1}(A_1) \cdots a^\Lambda_{t_n}(A_n) \exp[-\beta H_\Lambda]\}\big| \le \|K_1\| \cdots \|K_n\| Z_{\beta,\Lambda} .
\]
Moreover,
\[
\|K_j\| = \|A_j\| ,
\]
since $a^\Lambda_\theta$ is a norm preserving automorphism of $\mathfrak{A}_\Lambda$. Thus, the mentioned Dirichlet series converges uniformly on $\bar{D}^\beta_n$, which proves claims (a) and (b). To prove (c) one observes that $D^\beta_n(\xi_1, \dots, \xi_n)$ is a generating manifold (see e.g. [74, p. 444]), hence it is an inner uniqueness set for the functions from $\mathrm{Hol}(D^\beta_n)$. The latter means that every $F \in \mathrm{Hol}(D^\beta_n)$ which is zero on this set is identically zero on the whole $D^\beta_n$.

The restrictions of the functions $G^{\beta,\Lambda}$, $G^{0,\beta,\Lambda}$ to $D^\beta_n(0, \dots, 0)$, i.e.
\[
\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) = G^{\beta,\Lambda}_{A_1,\dots,A_n}(i\tau_1, \dots, i\tau_n) , \tag{2.22}
\]
\[
\Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) = G^{0,\beta,\Lambda}_{A_1,\dots,A_n}(i\tau_1, \dots, i\tau_n) , \tag{2.23}
\]
are called temperature (Matsubara) Green functions. Writing them in the form of the series (2.18) one immediately concludes that they have the following property:
\[
\begin{aligned}
\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1 + \theta, \dots, \tau_n + \theta) &= \Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) , \\
\Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1 + \theta, \dots, \tau_n + \theta) &= \Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) ,
\end{aligned} \tag{2.24}
\]
for every $\theta \in I_\beta \stackrel{\rm def}{=} [0, \beta]$, where addition is modulo $\beta$.
In view of Proposition 2.1, the Green functions defined by (2.12), (2.13) with bounded multiplication operators fully determine the states $\gamma_{\beta,\Lambda}$, $\gamma^{(0)}_{\beta,\Lambda}$. Claim (c) of Lemma 2.1 yields in turn that these states are determined by the Matsubara functions (2.22), (2.23) constructed with such operators.

2.2. Basic Gaussian measure

The essence of the Euclidean approach is that the Matsubara functions may be written as moments of probability measures. We begin the construction of such measures with the introduction of a Gaussian measure, which plays a key role in
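the sequel. Given $\beta$, let $X_\beta$ stand for the real Hilbert space $L^2(I_\beta \to \mathbb{R}^D)$ equipped with the scalar product and norm, respectively,
\[
\langle \omega, \omega' \rangle_\beta = \int_{I_\beta} (\omega(\tau), \omega'(\tau))\, d\tau , \qquad \|\omega\|_\beta = \sqrt{\langle \omega, \omega \rangle_\beta} . \tag{2.25}
\]
On this space we define the following operator
\[
S_\beta = (-m\Delta_\beta + 1)^{-1}\, \mathbf{1} , \tag{2.26}
\]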
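where $\Delta_\beta$ is the Laplacian, $m$ is the reduced mass, and $\mathbf{1}$ is the identity operator in $\mathbb{R}^D$. This operator is strictly positive and trace class. Thus it determines on $X_\beta$ an isotropic (i.e. $O(D)$-invariant) Gaussian measure $\chi_\beta$ having the Laplace transform
\[
\int_{X_\beta} \exp\{\langle \varphi, \omega \rangle_\beta\}\, \chi_\beta(d\omega) = \exp\Big\{ \frac{1}{2} \langle S_\beta \varphi, \varphi \rangle_\beta \Big\} . \tag{2.27}
\]
This measure describes a $D$-dimensional quantum harmonic oscillator with the mass $m$. Sometimes, to indicate its dependence on the mass, we shall write $\chi^m_\beta$. The integral kernel of the operator (2.26) may be written as follows:
\[
S^{\alpha\alpha'}_\beta(\tau, \tau') = \frac{\delta_{\alpha\alpha'}}{2\sqrt{m}} \cdot \frac{\exp((\beta - |\tau - \tau'|)/\sqrt{m}) + \exp(|\tau - \tau'|/\sqrt{m})}{\exp(\beta/\sqrt{m}) - 1} , \tag{2.28}
\]
where $\delta_{\alpha\alpha'}$, $\alpha, \alpha' = 1, \dots, D$, stands for the Kronecker delta. Employing this kernel one can show that
\[
\langle (\omega(\tau) - \omega(\tau'), \omega(\tau) - \omega(\tau')) \rangle_{\chi_\beta} \le \frac{D}{m} \cdot |\tau - \tau'|_\beta , \tag{2.29}
\]
where $|\tau - \tau'|_\beta \stackrel{\rm def}{=} \min\{ |\tau - \tau'|, \beta - |\tau - \tau'| \}$. Here and further on we write
\[
\langle f \rangle_\mu = \int f\, d\mu . \tag{2.30}
\]
Given $\tau, \tau' \in I_\beta$, we set
\[
\xi_1 = \omega(\tau) - \omega(\tau') , \qquad \xi_2 = \omega(\tau) , \qquad |\xi_j|^2 = (\xi_j, \xi_j) .
\]
For the random variables $\xi_j$, $j = 1, 2$, one can show that
\[
\langle |\xi_j|^{2p} \rangle_{\chi_\beta} = [C_p \langle |\xi_j|^2 \rangle_{\chi_\beta}]^p , \qquad p \in \mathbb{N} , \tag{2.31}
\]
where $C_p$ is a constant depending on $p$ and $D$ only. Thus, one has from (2.29)
\[
\langle |\omega(\tau) - \omega(\tau')|^{2p} \rangle_{\chi_\beta} \le (C_p D/m)^p\, |\tau - \tau'|^p_\beta . \tag{2.32}
\]
Further, by means of (2.29) and (2.31) one gets that
\[
\int_{X_\beta} \exp[a(\omega(\tau), \omega(\tau))]\, \chi_\beta(d\omega) < \infty , \qquad \forall a < a_* , \tag{2.33}
\]
where
\[
a_* = \frac{2\sqrt{m}}{D} \cdot \frac{e^{\beta/\sqrt{m}} - 1}{e^{\beta/\sqrt{m}} + 1} . \tag{2.34}
\]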
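The closed form (2.28) and the estimate (2.29) can be checked numerically. The sketch below compares (2.28), for equal component indices, with the truncated Fourier series of the periodic Green function of $-m\,d^2/d\tau^2 + 1$ on a circle of length $\beta$, and then evaluates both sides of (2.29). The parameter values and the truncation length `n_modes` are arbitrary choices for illustration.

```python
import numpy as np

def kernel(t, beta, m):
    """Closed form (2.28) for alpha = alpha', as a function of t = |tau - tau'| in [0, beta]."""
    s = np.sqrt(m)
    return (np.exp((beta - t) / s) + np.exp(t / s)) / (2.0 * s * (np.exp(beta / s) - 1.0))

def kernel_fourier(t, beta, m, n_modes=200_000):
    """Truncated series (1/beta) * sum over kappa in Z of cos(k t) / (m k^2 + 1), k = 2 pi kappa / beta."""
    k = 2.0 * np.pi * np.arange(1, n_modes + 1) / beta
    return (1.0 + 2.0 * np.sum(np.cos(k * t) / (m * k**2 + 1.0))) / beta

beta, m, D = 1.0, 0.5, 3
ts = np.linspace(0.0, beta, 11)
closed = np.array([kernel(t, beta, m) for t in ts])
series = np.array([kernel_fourier(t, beta, m) for t in ts])

# Left side of (2.29): 2 D [S(0) - S(t)]; right side: (D / m) |t|_beta.
lhs = 2.0 * D * (kernel(0.0, beta, m) - closed)
rhs = D / m * np.minimum(ts, beta - ts)
print(np.max(np.abs(closed - series)), np.all(lhs <= rhs + 1e-12))
```

The agreement with the Fourier series reflects the fact that the kernel is diagonal in the trigonometric basis with eigenvalues $(mk^2+1)^{-1}$; the inequality holds because the kernel difference is concave in $|\tau - \tau'|$ with slope $1/(2m)$ at the origin.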
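We set
\[
C_\beta = \{ \omega \in C(I_\beta) \mid \omega(0) = \omega(\beta) \} , \tag{2.35}
\]
and
\[
C^\sigma_\beta = \{ \omega \in C_\beta \mid (\forall \sigma \in (0, 1/2))\ (\exists K_\sigma(\omega) > 0)\ (\forall \tau, \tau' \in I_\beta)\ \ |\omega(\tau) - \omega(\tau')| \le K_\sigma(\omega) |\tau - \tau'|^\sigma_\beta \} . \tag{2.36}
\]
Clearly, $C_\beta$ is a subspace of the Banach space $C(I_\beta)$, thus in the topology induced from this space it is also a Banach space. The periodicity of the functions from $C_\beta$ is related to the property (2.24).

Lemma 2.2. The measure $\chi_\beta$ is concentrated on $C^\sigma_\beta$. There exists $a > 0$ such that
\[
\int_{X_\beta} \exp\{ a \|\omega\|^2_{C_\beta} \}\, \chi_\beta(d\omega) < \infty . \tag{2.37}
\]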
Proof. The first statement follows from [77, p. 43, estimate (2.32) and Theorem 5.1]. Since the measure $\chi_\beta$ is concentrated on $C_\beta \supset C^\sigma_\beta$, one can apply Fernique's theorem (see e.g. [33, p. 16]), which gives (2.37).

The result just proven allows us to consider $\chi_\beta$ as a measure on the Banach space $C_\beta$. Recall that a family of probability measures $\mathcal{M}$ on a topological space $X$ is called tight in this space if, for any $\varepsilon > 0$, there exists a compact subset $A_\varepsilon \subset X$ such that $\mu(X \setminus A_\varepsilon) \le \varepsilon$ for all $\mu \in \mathcal{M}$. A measure $\mu$ is called tight if the family $\{\mu\}$ is tight.

Lemma 2.3. For every $m_0 > 0$, the family of measures $\{ \chi^m_\beta \mid m \ge m_0 \}$ is tight in $C_\beta$.

The proof of this lemma will be based on a tightness criterion, for which we take Theorem 8.2 from Billingsley's book [25, p. 55] (see footnote a). The modulus of continuity of $\omega \in C_\beta$ is defined as
\[
\phi(\omega, \delta) = \sup\{ |\omega(\tau) - \omega(\tau')| \ :\ |\tau - \tau'|_\beta < \delta \} , \qquad 0 < \delta \le \beta/2 . \tag{2.38}
\]
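Proposition 2.2. The family of measures $\{ \mu_\theta \mid \theta \in \Theta \}$ is tight in $C_\beta$ if and only if these two conditions hold:
(i) For each positive $\eta$, there exists an $a$ such that
\[
\mu_\theta(\{ \omega \mid |\omega(0)| > a \}) \le \eta , \qquad \forall \theta \in \Theta . \tag{2.39}
\]
(ii) For each positive $\varepsilon$ and $\eta$, there exists a $\delta \in (0, \beta/2)$ such that
\[
\mu_\theta(\{ \omega \mid \phi(\omega, \delta) \ge \varepsilon \}) \le \eta , \qquad \forall \theta \in \Theta . \tag{2.40}
\]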
a This theorem gives a criterion for sequences, but just after the proof the author remarks how it can be generalized to an arbitrary family of measures.
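If $\{ \mu_\theta \mid \theta \in \Theta \}$ is a sequence $\{ \mu_M \mid M \in \mathbb{N} \}$, the above conditions need to be satisfied only for $M > M_0$, with $M_0$ depending on $\varepsilon$ and $\eta$ only. To employ this criterion we shall use the Chebyshev inequality (see e.g. [25, p. 223])
\[
\mu_\theta(\{ \omega \mid F(\omega) \ge a \}) \le \frac{1}{a} \cdot \langle F \rangle_{\mu_\theta} , \tag{2.41}
\]
which holds for any nonnegative and integrable function $F$.

Proof of Lemma 2.3. First we prove that the condition (i) holds. By (2.28) and (2.41) one has
\[
\begin{aligned}
\chi^m_\beta(\{ \omega \mid |\omega(0)| > a \}) &= \chi^m_\beta(\{ \omega \mid |\omega(0)|^2 > a^2 \}) \le \frac{1}{a^2} \cdot \langle (\omega(0), \omega(0)) \rangle_{\chi^m_\beta} \\
&= \frac{1}{a^2} \sum_{\alpha=1}^{D} S^{\alpha\alpha}_\beta(0, 0) = \frac{D}{2a^2\sqrt{m}} \cdot \frac{\exp(\beta/\sqrt{m}) + 1}{\exp(\beta/\sqrt{m}) - 1} \\
&\le \frac{D}{2a^2\sqrt{m_0}} \cdot \frac{\exp(\beta/\sqrt{m_0}) + 1}{\exp(\beta/\sqrt{m_0}) - 1} , \qquad \forall m \ge m_0 .
\end{aligned} \tag{2.42}
\]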
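The last inequality in (2.42) relies on the Chebyshev bound being nonincreasing in the mass. A minimal numerical sketch of this monotonicity follows; the values of $\beta$, $D$, $a$ and $m_0$ are arbitrary illustrative choices.

```python
import math

def bound_i(m, beta, D, a):
    """D / (2 a^2 sqrt(m)) * (e^x + 1) / (e^x - 1) with x = beta / sqrt(m), as in (2.42)."""
    x = beta / math.sqrt(m)
    return D / (2.0 * a**2 * math.sqrt(m)) * (math.exp(x) + 1.0) / (math.exp(x) - 1.0)

beta, D, a, m0 = 2.0, 3, 5.0, 0.25
masses = [m0 * 1.5**j for j in range(20)]          # an increasing grid of masses m >= m0
values = [bound_i(m, beta, D, a) for m in masses]  # the bound evaluated on the grid
print(values[0], values[-1], D / (a**2 * beta))    # the bound decreases towards D / (a^2 beta)
```

Writing $u = 1/\sqrt{m}$, the bound is proportional to $u \coth(\beta u/2)$, which is increasing in $u$; hence the value at $m_0$ dominates all values with $m \ge m_0$, and a single threshold $a$ serves the whole family.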
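To prove (ii) we shall use the estimates obtained in [22] by means of the Garsia–Rodemich–Rumsey lemma. For $\omega \in C^\sigma_\beta$, one has (see (2.36))
\[
\phi(\omega, \delta) \le K_\sigma(\omega)\, \delta^\sigma , \qquad \forall \sigma \in (0, 1/2) . \tag{2.43}
\]
Given $\sigma \in (0, 1/2)$, let us take $p \in \mathbb{N}$ such that $p > (1 - 2\sigma)^{-1}$. For this $\sigma$ and $p$, one has by (2.41) and (2.43)
\[
\chi^m_\beta(\{ \omega \mid \phi(\omega, \delta) \ge \varepsilon \}) = \chi^m_\beta(\{ \omega \mid [\phi(\omega, \delta)]^{2p} \ge \varepsilon^{2p} \}) \le \frac{\delta^{2p\sigma}}{\varepsilon^{2p}} \cdot \langle [K_\sigma(\omega)]^{2p} \rangle_{\chi^m_\beta} . \tag{2.44}
\]
Taking into account (2.32) and applying [22, p. 203, estimate (3b)], we get
\[
\langle [K_\sigma(\omega)]^{2p} \rangle_{\chi^m_\beta} \le \frac{C_{p,\sigma} D^p}{m^p} \cdot \frac{\beta^{p(1-2\sigma)}}{p(1 - 2\sigma) - 1} ,
\]
with $C_{p,\sigma}$ depending solely on $D$, $p$, and $\sigma$. Employing this estimate in (2.44) one gets (2.40).

As a strictly positive trace class operator, $S_\beta$ possesses eigenvectors, the set of which, $E_\beta$, spans the space $X_\beta$. This set may be written as follows:
\[
E_\beta \stackrel{\rm def}{=} \{ \epsilon_k \mid k \in K \} , \qquad K = \Big\{ k = \frac{2\pi}{\beta}\kappa \ \Big|\ \kappa \in \mathbb{Z} \Big\} , \tag{2.45}
\]
\[
\epsilon_k = (\epsilon^\alpha_k)_{\alpha = 1, \dots, D} , \qquad \epsilon^\alpha_k(\tau) = e_k(\tau)\, \iota^\alpha ,
\]
\[
e_k(\tau) = \sqrt{\frac{2}{\beta}} \cos k\tau \ \ (k > 0) , \qquad e_k(\tau) = -\sqrt{\frac{2}{\beta}} \sin k\tau \ \ (k < 0) , \qquad e_0(\tau) = 1/\sqrt{\beta} ,
\]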
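where $\iota^\alpha$, $\alpha = 1, \dots, D$, constitute the canonical basis of $\mathbb{R}^D$. Let $P^\alpha_k$, $k \in K$, $\alpha = 1, \dots, D$, stand for the projector from $X_\beta$ onto the subspace spanned by $\epsilon^\alpha_k$. Then the operator $S_\beta$ may be written in the canonical form
\[
S_\beta = \sum_{\alpha=1}^{D} \sum_{k \in K} (mk^2 + 1)^{-1} P^\alpha_k . \tag{2.46}
\]
Below we consider the sequences $\{ \chi_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$, $\mathbb{Z}_+ \stackrel{\rm def}{=} \mathbb{N} \cup \{0\}$, of Gaussian measures on $X_\beta$ having zero means and the covariance operators
\[
S_{\lambda,M} = \sum_{\alpha=1}^{D} \sum_{k \in K} \lambda^{(M)}_k P^\alpha_k , \qquad \lambda^{(M)}_k \ge 0 . \tag{2.47}
\]
We shall assume that each sequence $\{ \lambda^{(M)} = (\lambda^{(M)}_k)_{k \in K} \mid M \in \mathbb{Z}_+ \}$ converges in $l^1$ to $\lambda = ([mk^2 + 1]^{-1})_{k \in K}$ when $M \to \infty$. Therefore, the sequence of operators $\{ S_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ converges to $S_\beta$ in the trace norm. Given a measure $\chi_{\lambda,M}$ (respectively $\chi_\beta$), a finite-dimensional approximation $\chi^{(N)}_{\lambda,M}$ (respectively $\chi^{(N)}_\beta$), $N \in 2\mathbb{Z}_+$, i.e., $N = 2L$, $L \in \mathbb{Z}_+$, is the measure which has the covariance operator $S^{(N)}_{\lambda,M}$ (respectively $S^{(N)}_\beta$), given as follows:
\[
S^{(N)}_{\lambda,M} = \sum_{\alpha=1}^{D} \sum_{k \in K_N} \lambda^{(M)}_k P^\alpha_k , \qquad
S^{(N)}_\beta = \sum_{\alpha=1}^{D} \sum_{k \in K_N} (mk^2 + 1)^{-1} P^\alpha_k . \tag{2.48}
\]
Here
\[
K_N \stackrel{\rm def}{=} \Big\{ k = \frac{2\pi}{\beta}\kappa \ \Big|\ \kappa = -(L - 1), \dots, L \Big\} . \tag{2.49}
\]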
Throughout this paper we deal with the weak convergence of measures on metric spaces (see e.g. [25, 66]). For a measure space $(X, \mathcal{B}(X))$, where $X$ is a real separable metric space and $\mathcal{B}(X)$ is the Borel $\sigma$-algebra of its subsets, let $\mathcal{M}(X)$ be the space of all probability measures defined on $X$. Let $C_b(X)$ stand for the space of all bounded real-valued continuous functions on $X$. The weak topology on the space $\mathcal{M}(X)$ is defined in such a way that a net of measures $\{\mu_\theta\}$ converges to $\mu \in \mathcal{M}(X)$ in this topology (then we write $\mu_\theta \Rightarrow \mu$) if
\[
\int f\, d\mu_\theta \to \int f\, d\mu , \qquad \forall f \in C_b(X) .
\]
Regarding the measures on separable Hilbert spaces, [66, p. 182, Lemma 5.1] implies the following.

Proposition 2.3. Let a net of Gaussian measures $\{\chi_\theta\}$ on a separable Hilbert space $\mathcal{H}$ be given. Let also each $\chi_\theta$ have zero mean and covariance $S_\theta$, which is a
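positive trace class operator on $\mathcal{H}$. Suppose that the net $\{S_\theta\}$ converges in the trace norm to an operator $S$. Then there exists a Gaussian symmetric measure $\chi$ on $\mathcal{H}$ such that its covariance operator is $S$ and $\chi_\theta \Rightarrow \chi$ in $\mathcal{H}$.

Employing this fact we prove the following lemma.

Lemma 2.4. Let the sequence $\{ S_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ converge to $S_\beta$ in the trace norm. Then the sequence of measures $\{ \chi_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ converges weakly in the Banach space $C_\beta$ to the measure $\chi_\beta$, i.e., for every $F \in C_b(C_\beta)$, one has
\[
\int_{C_\beta} F(\omega)\, \chi_{\lambda,M}(d\omega) \to \int_{C_\beta} F(\omega)\, \chi_\beta(d\omega) , \qquad M \to \infty .
\]
Proof. By Proposition 2.3 the assumed convergence of the sequence $\{S_{\lambda,M}\}$ yields the weak convergence in $X_\beta$ of the sequences of finite-dimensional approximations $\chi^{(N)}_{\lambda,M}$ to $\chi^{(N)}_\beta$ for every $N \in 2\mathbb{Z}_+$. Since all these measures are concentrated on finite-dimensional subspaces of $C_\beta \subset X_\beta$, each sequence $\{ \chi^{(N)}_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ converges weakly to $\chi^{(N)}_\beta$ also in $C_\beta$. If we show that the sequence $\{ \chi_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ is tight in $C_\beta$, the stated convergence will follow from Theorem 8.1 of Billingsley's book [25, p. 54]. One observes that
\[
\langle (\omega(0), \omega(0)) \rangle_{\chi_{\lambda,M}} = \sum_{\alpha=1}^{D} S^{\alpha\alpha}_{\lambda,M}(0, 0) = \operatorname{trace} S_{\lambda,M} .
\]
Since the sequence $\{ \operatorname{trace} S_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ is bounded, Proposition 2.2(i) is satisfied. Similarly,
\[
\langle (\omega(\tau) - \omega(\tau'), \omega(\tau) - \omega(\tau')) \rangle_{\chi_{\lambda,M}} = 2 \sum_{\alpha=1}^{D} [S^{\alpha\alpha}_{\lambda,M}(0, 0) - S^{\alpha\alpha}_{\lambda,M}(\tau, \tau')] . \tag{2.50}
\]
But
\[
\begin{aligned}
S^{\alpha\alpha}_{\lambda,M}(0, 0) - S^{\alpha\alpha}_{\lambda,M}(\tau, \tau') &= S^{\alpha\alpha}_\beta(0, 0) - S^{\alpha\alpha}_\beta(\tau, \tau') + [S^{\alpha\alpha}_{\lambda,M}(0, 0) - S^{\alpha\alpha}_\beta(0, 0)] - [S^{\alpha\alpha}_{\lambda,M}(\tau, \tau') - S^{\alpha\alpha}_\beta(\tau, \tau')] \\
&\stackrel{\rm def}{=} S^{\alpha\alpha}_\beta(0, 0) - S^{\alpha\alpha}_\beta(\tau, \tau') + I_M(0, 0) - I_M(\tau, \tau') .
\end{aligned} \tag{2.51}
\]
Further,
\[
I_M(0, 0) = \operatorname{trace}[S_{\lambda,M} - S_\beta] \to 0 , \qquad M \to +\infty , \tag{2.52}
\]
\[
|I_M(\tau, \tau')| = \Big| \sum_{k \in K} ([S_{\lambda,M} - S_\beta]\epsilon_k, \epsilon_k) \Big| \le \frac{2D}{\beta} \sum_{k \in K} |\lambda^{(M)}_k - (mk^2 + 1)^{-1}| \to 0 , \qquad M \to +\infty . \tag{2.53}
\]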
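Taking into account (2.52), (2.53) in (2.51) and (2.32), one concludes that there exists $M_0$ such that the estimate
\[
\langle |\omega(\tau) - \omega(\tau')|^{2p} \rangle_{\chi_{\lambda,M}} \le (C_p D/m)^p\, |\tau - \tau'|^p_\beta
\]
holds for all $M > M_0$. Now we may proceed as in proving Lemma 2.3, where the estimate (2.32) and the Garsia–Rodemich–Rumsey lemma implied Proposition 2.2(ii). Thus the sequence $\{ \chi_{\lambda,M} \mid M \in \mathbb{Z}_+ \}$ is tight.

2.3. Euclidean Gibbs states

Given $\beta$ and a box $\Lambda$, we write
\[
\Omega_{\beta,\Lambda} = \{ \omega_\Lambda = (\omega_l)_{l \in \Lambda} \mid \omega_l \in C_\beta \} , \tag{2.54}
\]
and
\[
X_{\beta,\Lambda} = \{ \omega_\Lambda = (\omega_l)_{l \in \Lambda} \mid \omega_l \in X_\beta \} . \tag{2.55}
\]
Since $\Lambda$ is finite, one may equip $\Omega_{\beta,\Lambda}$ and $X_{\beta,\Lambda}$ with the usual Banach space and Hilbert space structures respectively. Then the space $\Omega_{\beta,\Lambda}$ may be densely embedded into $X_{\beta,\Lambda}$. Let $\mathcal{B}(\Omega_{\beta,\Lambda})$ stand for the Borel $\sigma$-algebra of the subsets of $\Omega_{\beta,\Lambda}$. Further, set
\[
\chi_{\beta,\Lambda}(d\omega_\Lambda) = \bigotimes_{l \in \Lambda} \chi_\beta(d\omega_l) . \tag{2.56}
\]
The latter measure is concentrated on
\[
\Omega^\sigma_{\beta,\Lambda} = \{ \omega_\Lambda = (\omega_l)_{l \in \Lambda} \mid \omega_l \in C^\sigma_\beta \} . \tag{2.57}
\]
Set
\[
E^V_{\beta,\Lambda}(\omega_\Lambda) = \frac{1}{2} \sum_{l,l' \in \Lambda} d^\Lambda_{ll'} \langle \omega_l, \omega_{l'} \rangle_\beta + \sum_{l \in \Lambda} \int_{I_\beta} V(\omega_l(\tau))\, d\tau , \tag{2.58}
\]
and
\[
E^V_{\beta,\Lambda}(\omega_\Lambda | 0) = \frac{1}{2} \sum_{l,l' \in \Lambda} d_{ll'} \langle \omega_l, \omega_{l'} \rangle_\beta + \sum_{l \in \Lambda} \int_{I_\beta} V(\omega_l(\tau))\, d\tau . \tag{2.59}
\]
Under the assumptions made regarding $V$ and $(d_{ll'})_{l,l' \in \mathbb{L}}$, both $E^V_{\beta,\Lambda}$ and $E^V_{\beta,\Lambda}(\cdot | 0)$ are continuous functions from $\Omega_{\beta,\Lambda}$ to $\mathbb{R}$. Thereafter, we may introduce the local Euclidean Gibbs measures corresponding to the periodic and zero boundary conditions. These are respectively the following probability measures on $\Omega_{\beta,\Lambda}$:
\[
\mu_{\beta,\Lambda}(d\omega_\Lambda) = \frac{1}{Z_{\beta,\Lambda}} \exp\{ -E^V_{\beta,\Lambda}(\omega_\Lambda) \}\, \chi_{\beta,\Lambda}(d\omega_\Lambda) , \tag{2.60}
\]
\[
\mu_{\beta,\Lambda}(d\omega_\Lambda | 0) = \frac{1}{Z_{\beta,\Lambda}(0)} \exp\{ -E^V_{\beta,\Lambda}(\omega_\Lambda | 0) \}\, \chi_{\beta,\Lambda}(d\omega_\Lambda) , \tag{2.61}
\]
where $Z_{\beta,\Lambda}$, $Z_{\beta,\Lambda}(0)$ are the normalizing constants.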
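The energy functionals (2.58), (2.59) can be evaluated approximately on a time grid. The sketch below treats the scalar case $D = 1$ with a Riemann sum over $I_\beta$; the potential coefficients, the nearest-neighbour coupling on a ring of four sites and the grid size are hypothetical choices for illustration.

```python
import numpy as np

def v(xi, a=-1.0, b2=0.5):
    """Hypothetical potential of the form (V2) with r = 2: v(xi) = a xi / 2 + b2 xi^2."""
    return 0.5 * a * xi + b2 * xi**2

def energy(paths, d, beta):
    """Riemann-sum version of the energy functional for D = 1.

    paths[l, j] = omega_l(tau_j) on the grid tau_j = j * beta / n, j = 0, ..., n-1;
    d is the dynamical matrix of the box (periodic for (2.58), plain for (2.59)).
    """
    n = paths.shape[1]
    dtau = beta / n
    gram = paths @ paths.T * dtau             # approximates <omega_l, omega_l'>_beta
    interaction = 0.5 * np.sum(d * gram)
    self_energy = np.sum(v(paths**2)) * dtau  # sum over l of int V(omega_l(tau)) dtau, V(x) = v(x^2)
    return interaction + self_energy

beta, n_sites, n_grid = 2.0, 4, 64
d = np.zeros((n_sites, n_sites))
for l in range(n_sites):
    d[l, (l + 1) % n_sites] = d[l, (l - 1) % n_sites] = -0.3  # assumed ring coupling

c = 0.7
constant_paths = np.full((n_sites, n_grid), c)
E_const = energy(constant_paths, d, beta)
weight = np.exp(-E_const)  # unnormalized density with respect to the product Gaussian measure
```

For constant paths the Riemann sum is exact, so the result can be compared with $\beta[\tfrac12 c^2 \sum_{l,l'} d_{ll'} + |\Lambda|\, v(c^2)]$; the exponential of minus the energy is the unnormalized density appearing in (2.60), (2.61).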
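By means of these measures one can write the Green functions (2.22), (2.23), constructed with the multiplication operators $A_1, \dots, A_n \in \mathfrak{A}_\Lambda$, as follows [1, 45]
$$
\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \int_{\Omega_{\beta,\Lambda}} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\,\mu_{\beta,\Lambda}(d\omega_\Lambda) , \tag{2.62}
$$
$$
\Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \int_{\Omega_{\beta,\Lambda}} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\,\mu_{\beta,\Lambda}(d\omega_\Lambda|0) . \tag{2.63}
$$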
The Gibbs states of the whole lattice system which correspond to the periodic and zero boundary conditions are obtained as limits of the above states $\gamma_{\beta,\Lambda}$, $\gamma^{(0)}_{\beta,\Lambda}$ when $\Lambda \nearrow \mathbb{L}$. More precisely, let $\mathcal{L}$ be a sequence of boxes ordered by inclusion and such that $\bigcup_{\Lambda\in\mathcal{L}} \Lambda = \mathbb{L}$. For $\Lambda_1 \subset \Lambda_2$, one may introduce a natural norm-preserving embedding $\mathfrak{A}_{\Lambda_1} \subset \mathfrak{A}_{\Lambda_2}$, which defines an increasing sequence of algebras $\{\mathfrak{A}_\Lambda \mid \Lambda \in \mathcal{L}\}$. In a standard way [27], this sequence defines a quasi-local algebra of observables $\mathfrak{A}$. Two sequences $\mathcal{L}$, $\mathcal{L}'$ are said to be equivalent if the corresponding quasi-local algebras coincide. A standard sequence $\mathcal{L}$ is the sequence of boxes $\{\Lambda_L \mid L \in \mathbb{N}\}$, where $\Lambda_L = (-L, L]^d \cap \mathbb{Z}^d$. In the sequel, all (thermodynamic) limits $\Lambda \nearrow \mathbb{L}$ are taken over a sequence $\mathcal{L}$ which is equivalent to the standard one. The mentioned Gibbs states of the whole lattice system are defined as the thermodynamic limits of the local Gibbs states $\gamma_{\beta,\Lambda}$, $\gamma^{(0)}_{\beta,\Lambda}$. The existence of periodic Gibbs states for similar models was shown in [21] (see also [63–65]).

As was mentioned above, the great advantage of the Euclidean approach lies in the fact that, due to the above relationship between the Green functions and local Gibbs states, one may apply to the quantum case the machinery of conditional probability distributions, which form the base of modern classical equilibrium statistical physics (see e.g. [34, 35, 43] and the references therein). To this end, along with the Gibbs measures (2.60), (2.61), which correspond to the periodic and zero boundary conditions respectively, we introduce conditional local Gibbs measures. They will describe the Gibbs states of the particles contained in the box $\Lambda$ and interacting between themselves and with fixed configurations of particles outside $\Lambda$. Such configurations determine conditions for the measures we are going to introduce.
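Since the complements of boxes $\Lambda$, in which we shall fix configurations, are infinite subsets of the lattice $\mathbb{L}$, we employ the spaces $\Omega_{\beta,\Lambda}$, introduced in (2.54), (2.55), also for infinite subsets $\Lambda$; in particular, we shall use $\Omega_\beta$ standing for $\Omega_{\beta,\mathbb{L}}$. We equip such spaces with the product topology and with the σ-algebra $\mathcal{B}(\Omega_{\beta,\Lambda})$ generated by the cylinder subsets
$$
\{\omega_\Lambda = (\omega_l)_{l\in\Lambda} \mid (\omega_l)_{l\in\Delta} \in B_\Delta\} , \qquad B_\Delta = \times_{l\in\Delta} B_l ,
$$
with finite $\Delta \subset \mathbb{L}$ and Borel subsets $\{B_l \subset C_\beta \mid l \in \Delta\}$. For $\Delta \subset \Lambda \subset \mathbb{L}$, we write $\omega_\Delta \times \zeta_{\Lambda\setminus\Delta}$ for the configuration $(\xi_l)_{l\in\Lambda}$ such that $\xi_l = \omega_l$ for $l \in \Delta$, and $\xi_l = \zeta_l$ for $l \in \Lambda\setminus\Delta$. Given a sequence of boxes $\mathcal{L}$, in order to have the collection of all the spaces $\{\Omega_{\beta,\Lambda}, \Lambda \in \mathcal{L}\}$ ordered by inclusion, we introduce the following mappings. For $\Delta \subset \Lambda$, we set $\omega_\Delta \mapsto \omega_\Delta \times 0_{\Lambda\setminus\Delta} \in \Omega_{\beta,\Lambda}$, where $0_\Lambda$ is the zero configuration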
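in $\Omega_{\beta,\Lambda}$. Hence one may consider every configuration $\omega_\Delta$ as an element of all $\Omega_{\beta,\Lambda}$ with $\Delta \subset \Lambda$. Besides, we define $\Omega_{\beta,\Lambda} \ni \omega_\Lambda \mapsto (\omega_\Lambda)_{\Lambda'} \in \Omega_{\beta,\Lambda'}$, as a configuration such that $\omega_l = 0$ for $l \in \Lambda'\setminus\Lambda$. Obviously, $(\omega_\Lambda)_{\Lambda'} = 0_{\Lambda'}$ if $\Lambda \cap \Lambda' = \emptyset$. Let
$$
\Omega^{t}_\beta \stackrel{\rm def}{=} \{\zeta \in \Omega_\beta \mid (\|\zeta_l\|_\beta)_{l\in\mathbb{L}} \in S'\} . \tag{2.64}
$$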
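Given $\zeta \in \Omega_\beta$ and a box $\Lambda$, we put
$$
\mu_{\beta,\Lambda}(B|\zeta) = 0 , \qquad \zeta \in \Omega_\beta \setminus \Omega^{t}_\beta , \quad B \in \mathcal{B}(\Omega_{\beta,\Lambda}) , \tag{2.65}
$$
and for $\zeta \in \Omega^{t}_\beta$,
$$
\mu_{\beta,\Lambda}(d\omega_\Lambda|\zeta) = \frac{1}{Z_{\beta,\Lambda}(\zeta)} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta)\}\,\chi_{\beta,\Lambda}(d\omega_\Lambda) , \tag{2.66}
$$
where
$$
Z_{\beta,\Lambda}(\zeta) \stackrel{\rm def}{=} \int_{\Omega_{\beta,\Lambda}} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta)\}\,\chi_{\beta,\Lambda}(d\omega_\Lambda) ,
$$
is the local partition function subject to the external boundary condition $\zeta_{\Lambda^c}$, $\Lambda^c = \mathbb{L}\setminus\Lambda$, and
$$
E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta) = E_{\beta,\Lambda}(\omega_\Lambda|\zeta) + \sum_{l\in\Lambda}\int_{I_\beta} V(\omega_l(\tau))\,d\tau , \tag{2.67}
$$
$$
E_{\beta,\Lambda}(\omega_\Lambda|\zeta) = \frac{1}{2}\sum_{l,l'\in\Lambda} d_{ll'} \langle\omega_l,\omega_{l'}\rangle_\beta + \sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'} \langle\omega_l,\zeta_{l'}\rangle_\beta . \tag{2.68}
$$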
Here $V$ is the same as in (2.2). Under the assumptions made regarding $V$ and $d_{ll'}$, both $E_{\beta,\Lambda}(\cdot|\zeta)$, $E^{V}_{\beta,\Lambda}(\cdot|\zeta)$ are continuous functions from $\Omega_{\beta,\Lambda}$ to $\mathbb{R}$ for all $\zeta \in \Omega^{t}_\beta$. The function $E_{\beta,\Lambda}(\cdot|\zeta)$ describes the interaction of the particles in $\Lambda$ with one another and with the fixed configuration $\zeta_{\Lambda^c}$. Clearly, for $\zeta \in \Omega^{t}_\beta$, $\mu_{\beta,\Lambda}(\cdot|\zeta)$ is a probability measure. For $\zeta = 0$, it coincides with the measure (2.61). Thus, along with the Green functions (2.62), (2.63) we introduce the temperature Green function which corresponds to the external boundary condition $\zeta_{\Lambda^c}$
$$
\Gamma^{\zeta,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \int_{\Omega_{\beta,\Lambda}} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\,\mu_{\beta,\Lambda}(d\omega_\Lambda|\zeta) . \tag{2.69}
$$
Here $A_1, \dots, A_n$ are multiplication operators such that for every $\tau_1, \dots, \tau_n \in I_\beta$, the function $\Omega_{\beta,\Lambda} \ni \omega_\Lambda \mapsto A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))$ is $\mu_{\beta,\Lambda}(\cdot|\zeta)$-integrable for every $\zeta \in \Omega_\beta$, which obviously holds for $A_1, \dots, A_n \in \mathfrak{A}_\Lambda$. Note that the above temperature Green function is defined only for multiplication operators; there is no a priori information regarding its analytic and continuity properties (except for $\zeta = 0$), even in the case of bounded operators.
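For $B \in \mathcal{B}(\Omega_\beta)$ and $\omega \in \Omega_\beta$, let $\delta_B(\omega)$ take the value 1 if $\omega$ belongs to $B$, and 0 otherwise. Then for a finite $\Lambda \subset \mathbb{L}$, $\zeta \in \Omega_\beta$, $B \in \mathcal{B}(\Omega_\beta)$, we set
$$
\pi_{\beta,\Lambda}(B|\zeta) \stackrel{\rm def}{=} \int_{\Omega_{\beta,\Lambda}} \delta_B(\omega_\Lambda \times \zeta_{\Lambda^c})\,\mu_{\beta,\Lambda}(d\omega_\Lambda|\zeta) . \tag{2.70}
$$
These probability kernels satisfy the consistency conditions
$$
\int_{\Omega_\beta} \pi_{\beta,\Lambda}(d\omega|\zeta)\,\pi_{\beta,\Delta}(B|\omega) = \pi_{\beta,\Lambda}(B|\zeta) , \tag{2.71}
$$
which hold for arbitrary pairs of finite subsets $\Delta \subset \Lambda \subset \mathbb{L}$, and any $B \in \mathcal{B}(\Omega_\beta)$, $\zeta \in \Omega_\beta$ (for more details on such consistency conditions see e.g. [43]).

Definition 2.1. A probability measure $\mu$ on the measure space $(\Omega_\beta, \mathcal{B}(\Omega_\beta))$ is said to be a Euclidean Gibbs state of the model considered at the inverse temperature $\beta$ if it satisfies the Dobrushin–Lanford–Ruelle (DLR) equilibrium equation
$$
\int_{\Omega_\beta} \mu(d\omega)\,\pi_{\beta,\Lambda}(B|\omega) = \mu(B) , \tag{2.72}
$$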
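for all finite $\Lambda \subset \mathbb{L}$ and $B \in \mathcal{B}(\Omega_\beta)$.

In order to exclude the states with no physical relevance we impose some a priori conditions restricting the growth of the sequences of moments (see [12, 47]).

Definition 2.2. The class $\mathcal{G}_\beta$ of tempered Gibbs measures consists of the Gibbs states $\mu$ defined above, the moments of which obey the condition $(\langle \|\omega_l\|_\beta \rangle_\mu)_{l\in\mathbb{L}} \in S'$.

3. Classical Limits

In this section $\mathcal{L}$, $\mathcal{L}_{\rm fin}$ will stand for the set of all, respectively of all finite, subsets of $\mathbb{L}$. Given $\Lambda \in \mathcal{L}$, let us consider the subset of $\Omega_{\beta,\Lambda}$ consisting of constant trajectories, that is
$$
\Omega^{qc}_{\beta,\Lambda} \stackrel{\rm def}{=} \{\omega_\Lambda \in \Omega_{\beta,\Lambda} \mid (\forall\, l \in \Lambda)(\exists\, x_l \in \mathbb{R}^D)(\forall\, \tau \in I_\beta)\ \omega_l(\tau) = x_l\} \simeq (\mathbb{R}^D)^\Lambda . \tag{3.1}
$$
We also set
$$
\Omega_\beta \supset \Omega^{qc}_{\beta,\mathbb{L}} \stackrel{\rm def}{=} \Omega^{qc}_\beta \simeq (\mathbb{R}^D)^{\mathbb{L}} . \tag{3.2}
$$
For $\Lambda \in \mathcal{L}$, let $\mathcal{B}^{qc}_{\beta,\Lambda}$ be the σ-algebra generated by the cylinder subsets of $\Omega^{qc}_{\beta,\Lambda}$, which is isomorphic to the corresponding σ-algebra $\mathcal{B}((\mathbb{R}^D)^\Lambda)$ generated by the cylinder subsets of $(\mathbb{R}^D)^\Lambda$ but, on the other hand, is a subalgebra of $\mathcal{B}_{\beta,\Lambda} \stackrel{\rm def}{=} \mathcal{B}(\Omega_{\beta,\Lambda})$. For every $B \in \mathcal{B}_{\beta,\Lambda}$, let
$$
C(B) \stackrel{\rm def}{=} B \cap \Omega^{qc}_{\beta,\Lambda} . \tag{3.3}
$$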
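We write
$$
\mathcal{B}^{qc}_{\beta,\Lambda} \ni C \simeq A \in \mathcal{B}((\mathbb{R}^D)^\Lambda) , \tag{3.4}
$$
for the pair of subsets $C \in \mathcal{B}^{qc}_{\beta,\Lambda}$, $A \in \mathcal{B}((\mathbb{R}^D)^\Lambda)$ which are isomorphic in the above sense. This means that they consist of exactly those $\omega_\Lambda$ and $x_\Lambda$, for which $\omega_l(\tau) = x_l$ for all $\tau \in I_\beta$ and $l \in \Lambda$. Consider the following Gaussian measure
$$
\varpi_{\beta,\Lambda}(dx_\Lambda) \stackrel{\rm def}{=} \prod_{l\in\Lambda} \varpi_\beta(dx_l) , \qquad x_\Lambda \in (\mathbb{R}^D)^\Lambda , \quad \Lambda \in \mathcal{L}_{\rm fin} , \tag{3.5}
$$
$$
\varpi_\beta(dx_l) \stackrel{\rm def}{=} \left(\frac{\beta}{2\pi}\right)^{D/2} \exp\left\{-\frac{\beta}{2}(x_l, x_l)\right\} dx_l , \qquad x_l \in \mathbb{R}^D . \tag{3.6}
$$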
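For $\Lambda \in \mathcal{L}_{\rm fin}$, let $\chi^{qc}_{\beta,\Lambda}$ be the Gaussian measure on $\Omega_{\beta,\Lambda}$ such that for every $B \in \mathcal{B}_{\beta,\Lambda}$ one has
$$
\chi^{qc}_{\beta,\Lambda}(B) = \varpi_{\beta,\Lambda}(A) , \tag{3.7}
$$
where $A \simeq C(B)$, which is defined by (3.3), (3.4). This means that
$$
\chi^{qc}_{\beta,\Lambda}(B) = \chi^{qc}_{\beta,\Lambda}(C(B)) , \tag{3.8}
$$
i.e. $\chi^{qc}_{\beta,\Lambda}$ is supported on $\mathcal{B}^{qc}_{\beta,\Lambda}$. Making use of these measures we construct the periodic and conditional Gibbs measures following the scheme (2.60), (2.58) and (2.66)–(2.67). Thus we set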
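$$
\mu^{qc}_{\beta,\Lambda}(d\omega_\Lambda) \stackrel{\rm def}{=} \frac{1}{Z^{qc}_{\beta,\Lambda}} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda)\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{3.9}
$$
and for $\zeta \in \Omega^{t}_\beta$,
$$
\mu^{qc}_{\beta,\Lambda}(d\omega_\Lambda|\zeta) \stackrel{\rm def}{=} \frac{1}{Z^{qc}_{\beta,\Lambda}(\zeta)} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta)\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{3.10}
$$
where $E^{V}_{\beta,\Lambda}(\cdot)$ and $E^{V}_{\beta,\Lambda}(\cdot|\zeta)$ are given by (2.58) and (2.67) respectively. Here, as above,
$$
Z^{qc}_{\beta,\Lambda} \stackrel{\rm def}{=} \int_{\Omega_{\beta,\Lambda}} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda)\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{3.11}
$$
and
$$
Z^{qc}_{\beta,\Lambda}(\zeta) \stackrel{\rm def}{=} \int_{\Omega_{\beta,\Lambda}} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta)\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{3.12}
$$
are the normalizing constants. We remark that the measures (3.9), (3.10) are defined on the same space as $\mu_{\beta,\Lambda}(\cdot)$ and $\mu_{\beta,\Lambda}(\cdot|\zeta)$ given by (2.60) and (2.66) respectively. Further, (3.8) implies that
$$
\mu^{qc}_{\beta,\Lambda}(B) = \mu^{qc}_{\beta,\Lambda}(C(B)) ; \qquad \mu^{qc}_{\beta,\Lambda}(B|\zeta) = \mu^{qc}_{\beta,\Lambda}(C(B)|\zeta) , \quad \forall\, \zeta \in \Omega_\beta . \tag{3.13}
$$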
By means of the conditional Gibbs measures (3.10) we define the family of probability kernels $\{\pi^{qc}_{\beta,\Lambda}(\cdot|\zeta) \mid \Lambda \in \mathcal{L}_{\rm fin}\}$ (setting as above $\pi^{qc}_{\beta,\Lambda}(\cdot|\zeta) = 0$ for $\zeta \in \Omega_\beta \setminus \Omega^{t}_\beta$), and hence the corresponding Euclidean Gibbs states. The family of such Euclidean tempered Gibbs states will be denoted $\mathcal{G}^{qc}_\beta$. The members of this family will be called quasi-classical Gibbs states.

Now let us construct the Gibbs states for the classical model described by the Hamiltonian
$$
H^{cl} = \sum_{l\in\mathbb{L}} [(x_l, x_l)/2 + V(x_l)] + \frac{1}{2}\sum_{l,l'\in\mathbb{L}} d_{ll'}(x_l, x_{l'}) , \tag{3.14}
$$
where $V$ is the same as in (2.2), which means that in this case only the potential energy of the oscillators described by (2.1)–(2.3) is taken into account. Heuristically, this potential energy may be obtained from (2.1) by passing to the limit $m \to +\infty$. For $\Lambda \in \mathcal{L}_{\rm fin}$, we set
$$
I_\Lambda(x_\Lambda) = \sum_{l\in\Lambda} V(x_l) + \frac{1}{2}\sum_{l,l'\in\Lambda} d^{\Lambda}_{ll'}(x_l, x_{l'}) , \tag{3.15}
$$
and
$$
I_\Lambda(x_\Lambda|y) = \sum_{l\in\Lambda} V(x_l) + \frac{1}{2}\sum_{l,l'\in\Lambda} d_{ll'}(x_l, x_{l'}) + \sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'}(x_l, y_{l'}) , \tag{3.16}
$$
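where $y = (y_l)_{l\in\mathbb{L}} \in S'$ determines the boundary conditions outside $\Lambda$ and plays here the same role as $\zeta$ does in the case of quantum Euclidean Gibbs states. It is not difficult to show that $I_\Lambda(\cdot)$ and $I_\Lambda(\cdot|y)$ are continuous functions on $(\mathbb{R}^D)^\Lambda$, $\Lambda \in \mathcal{L}_{\rm fin}$. The periodic and conditional Gibbs measures for the classical model are introduced respectively as
$$
\rho_{\beta,\Lambda}(dx_\Lambda) = \frac{1}{Y_{\beta,\Lambda}} \exp\{-\beta I_\Lambda(x_\Lambda)\}\,\varpi_{\beta,\Lambda}(dx_\Lambda) , \tag{3.17}
$$
$$
\rho_{\beta,\Lambda}(dx_\Lambda|y) = \frac{1}{Y_{\beta,\Lambda}(y)} \exp\{-\beta I_\Lambda(x_\Lambda|y)\}\,\varpi_{\beta,\Lambda}(dx_\Lambda) , \tag{3.18}
$$
where $Y_{\beta,\Lambda}$, $Y_{\beta,\Lambda}(y)$ are the corresponding normalizing constants. As above, $\{\rho_{\beta,\Lambda}(\cdot|y) \mid \Lambda \in \mathcal{L}_{\rm fin}\}$ defines the family of probability kernels, and, thereby, the family of classical Gibbs states. We will denote this family by $\mathcal{G}^{cl}_\beta$.

For $\zeta, \tilde\zeta \in \Omega_\beta$, we write $\zeta \sim \tilde\zeta$ if for every $l \in \mathbb{L}$,
$$
\int_{I_\beta} \zeta_l(\tau)\,d\tau = \int_{I_\beta} \tilde\zeta_l(\tau)\,d\tau . \tag{3.19}
$$
For $y \in (\mathbb{R}^D)^{\mathbb{L}}$, let $\Upsilon_\beta(y)$ be the equivalence class consisting of such $\zeta$ that
$$
\beta^{-1}\int_{I_\beta} \zeta_l(\tau)\,d\tau = y_l , \qquad \forall\, l \in \mathbb{L} . \tag{3.20}
$$
We write $y \in \Upsilon_\beta(y)$ assuming that the former $y$ stands for the configuration of constant loops $\omega_l(\tau) = y_l$, $l \in \mathbb{L}$ and $\tau \in I_\beta$.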
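The role of the equivalence classes (3.19), (3.20) can be illustrated numerically. The following is only a minimal sketch, assuming $D = 1$, $I_\beta = [0, \beta]$, and that $\langle\cdot,\cdot\rangle_\beta$ is the $L^2$ inner product on $I_\beta$; the loops `zeta1`, `zeta2` and all parameter values are hypothetical. It checks that a constant loop pairs identically with any two loops having the same mean $y$, the common value being $\beta x y$.

```python
import math

def pairing(omega, zeta, beta, n=64):
    """Rectangle-rule value of the L^2 pairing of two beta-periodic loops.

    For trigonometric polynomials of degree below n this rule is exact
    up to rounding, which is all that is needed here.
    """
    h = beta / n
    return h * sum(omega(i * h) * zeta(i * h) for i in range(n))

# Hypothetical data: a constant loop x and two loops with the same mean y.
beta, x, y = 2.0, 0.7, -1.3

def constant_loop(tau):
    return x

def zeta1(tau):
    return y + 0.5 * math.sin(2.0 * math.pi * tau / beta)

def zeta2(tau):
    return y - 2.0 * math.cos(4.0 * math.pi * tau / beta)

pair1 = pairing(constant_loop, zeta1, beta)
pair2 = pairing(constant_loop, zeta2, beta)
expected = beta * x * y  # only the mean of zeta enters
```

Under these assumptions `pair1` and `pair2` agree with `expected` up to rounding, so on constant loops the boundary term of (2.68) depends on $\zeta$ only through its class $\Upsilon_\beta(y)$.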
Since all the quasi-classical kernels $\pi^{qc}_{\beta,\Lambda}(\cdot|\zeta)$, as measures on $\Omega_\beta$, are concentrated on $\Omega^{qc}_\beta$ (see (3.8), (3.10)), every solution of the DLR equation constructed by means of such kernels has the same property.

Proposition 3.1. For every $\mu \in \mathcal{G}^{qc}_\beta$ and all $B \in \mathcal{B}_\beta$,
$$
\mu(B) = \mu(C(B)) , \tag{3.21}
$$
i.e. every quasi-classical Euclidean Gibbs state is supported by the configurations consisting of constant loops.

Our first theorem in this section establishes the relationship between the families $\mathcal{G}^{qc}_\beta$ and $\mathcal{G}^{cl}_\beta$.

Theorem 3.1. For every $\mu \in \mathcal{G}^{qc}_\beta$, there exists $\rho \in \mathcal{G}^{cl}_\beta$, such that
$$
\rho(A) = \mu(B) = \mu(C(B)) , \tag{3.22}
$$
for all $A \in \mathcal{B}((\mathbb{R}^D)^{\mathbb{L}})$ and $B \in \mathcal{B}_\beta$, where $C(B) \simeq A$ in the sense (3.2). The mapping $\mu \mapsto \rho$ (3.22) is a bijection.

Proof. By construction, the measure spaces $(\Omega^{qc}_\beta, \mathcal{B}(\Omega^{qc}_\beta))$ and $((\mathbb{R}^D)^{\mathbb{L}}, \mathcal{B}((\mathbb{R}^D)^{\mathbb{L}}))$ are isomorphic. On the other hand, since every equivalence class $\Upsilon_\beta$ contains exactly one element of $\Omega^{qc}_\beta$, the latter space and the corresponding factor space are isomorphic as well. Also by construction (3.7), (3.10), (3.18), every solution $\mu$ of the DLR equation constructed with the help of the quasi-classical kernels defines by (3.22) a measure $\rho$ on $((\mathbb{R}^D)^{\mathbb{L}}, \mathcal{B}((\mathbb{R}^D)^{\mathbb{L}}))$, which solves the corresponding DLR equation in this space, and vice versa.

In this section $\chi^{m}_\beta$, $\chi^{m}_{\beta,\Lambda}$, $\mu^{m}_{\beta,\Lambda}(\cdot|\zeta)$ will stand for the measures (2.27), (2.56), (2.66) respectively. In such a way we indicate their dependence on the mass $m$. We shall speak about a net of measures $\{\mu^{m}_{\beta,\Lambda}\}$ assuming the net $\{\mu^{m}_{\beta,\Lambda}(\cdot|\zeta) \mid m \ge m_0\}$ with a certain positive $m_0$.

Theorem 3.2. Let $\beta > 0$, $\Lambda \in \mathcal{L}_{\rm fin}$, and $y \in (\mathbb{R}^D)^{\mathbb{L}}$ be chosen. Then for every $\zeta \in \Upsilon_\beta(y)$, the net of measures $\{\mu^{m}_{\beta,\Lambda}(\cdot|\zeta)\}$ converges weakly in $\Omega_{\beta,\Lambda}$, as $m \to +\infty$, to the measure $\mu^{qc}_{\beta,\Lambda}(\cdot|\zeta) = \mu^{qc}_{\beta,\Lambda}(\cdot|y)$.

Theorem 3.3. For every $\beta > 0$, $\Lambda \in \mathcal{L}_{\rm fin}$, and $\xi \in \Omega^{qc}_\beta$, the conditional Gibbs measure $\mu^{qc}_{\beta,\Lambda}(\cdot|\xi)$, given by (3.10), is a weak limit in $\Omega_{\beta,\Lambda}$, as $m \to +\infty$, of the net of measures $\{\mu^{m}_{\beta,\Lambda}(\cdot|\zeta)\}$ with arbitrary $\zeta \in \Upsilon_\beta(\xi)$.

Remark 3.1. Similar statements may be proven also for the measures (2.60) and corresponding quasi-classical periodic measures.

The proof of the two just stated theorems is based upon the following lemmas.
Lemma 3.1. For every box Λ and any β > 0, the net of measures χm β,Λ converges given by (3.7). weakly in the Hilbert space Xβ,Λ to the measure χqc β,Λ Proof. Since, for a box Λ, χm β,Λ is a product measure (see (2.56)), it is enough to prove this lemma for a one-point box. By (3.6), (3.7), one has ( !) Z Z Z χβ
D/2 exp{hϕ, ωiβ }χqc β (dω) = (β/2π)
exp
x,
RD
ϕ(τ )dτ Iβ
× exp{−β(x, x)/2}dx ) ( Z Z 1 0 0 (ϕ(τ ), ϕ(τ ))dτ dτ = exp − 2β Xβ Xβ = exp{−h0 , ϕi2β /2} ,
(3.23)
where k belongs to the base Eβ given by (2.45). This implies that the covariance operator Sβqc of this measure may be written as follows D X
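$$
S^{qc}_\beta = \sum_{\alpha=1}^{D} P^{\alpha}_{0} . \tag{3.24}
$$
Then by (2.46) one obtains
$$
{\rm trace}\,(S_\beta - S^{qc}_\beta) = \sum_{k\in K\setminus\{0\}} \frac{D}{mk^2 + 1} \le \sum_{k\in K\setminus\{0\}} \frac{D}{mk^2} = \frac{\beta^2 D}{2\pi^2 m} \sum_{n\in\mathbb{N}} n^{-2} \to 0 , \qquad \text{when } m \to +\infty . \tag{3.25}
$$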
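The estimate (3.25) is easy to probe numerically. The sketch below is an illustration only: it assumes that $K = \{2\pi\kappa/\beta \mid \kappa \in \mathbb{Z}\}$ (the set is defined outside this section, so this form is an assumption consistent with the last equality in (3.25)), truncates the series at a hypothetical cutoff `kappa_max`, and uses $\sum_n n^{-2} = \pi^2/6$ for the bound.

```python
import math

def trace_gap(m, beta, D=1, kappa_max=20000):
    """Partial sum of sum_{k != 0} D / (m k^2 + 1) with k = 2*pi*kappa/beta."""
    total = 0.0
    for kappa in range(1, kappa_max + 1):
        k = 2.0 * math.pi * kappa / beta
        # The factor 2 accounts for the pair of modes +k and -k.
        total += 2.0 * D / (m * k * k + 1.0)
    return total

def trace_bound(m, beta, D=1):
    """Right-hand side of (3.25): beta^2 D / (2 pi^2 m) * (pi^2 / 6)."""
    return beta * beta * D / (12.0 * m)

masses = (1.0, 10.0, 100.0)
gaps = [trace_gap(m, beta=1.5, D=3) for m in masses]
bounds = [trace_bound(m, beta=1.5, D=3) for m in masses]
```

Each entry of `gaps` stays below the corresponding entry of `bounds`, and both decrease like $1/m$, which is the mechanism behind the convergence $\chi^{m}_\beta \to \chi^{qc}_\beta$.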
Now one may use Proposition 2.3 which yields the convergence to be proven.

Lemma 3.2. For every box $\Lambda$ and any $\beta > 0$, the net of measures $\{\chi^{m}_{\beta,\Lambda}\}$ converges weakly in the Banach space $\Omega_{\beta,\Lambda}$ to the measure $\chi^{qc}_{\beta,\Lambda}$. Hence, for arbitrary $F \in C_b(\Omega_{\beta,\Lambda})$, one has
$$
\int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\,\chi^{m}_{\beta,\Lambda}(d\omega_\Lambda) \to \int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) , \qquad m \to +\infty .
$$
Proof. Again, it is enough to prove this lemma for a one-element box $\Lambda$. By Lemma 2.3 the net $\{\chi^{m}_\beta\}$ is tight in the Banach space $C_\beta$. On the other hand, by the above lemma, each net of finite-dimensional approximations $\{\chi^{m,M}_\beta\}$ converges to $\chi^{qc,M}_\beta$ in $C_\beta$ since it converges in the Hilbert space $X_\beta$ (see also the proof of Lemma 2.4). Thus, by the mentioned Theorem 8.1 of Billingsley's book [25, p. 54], the same convergence holds for the net $\{\chi^{m}_\beta\}$.
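Proof of Theorems 3.2 and 3.3. Let $y \in (\mathbb{R}^D)^{\mathbb{L}} \setminus S'$, then every $\zeta$ belongs to $\Omega_\beta \setminus \Omega^{t}_\beta$ since by (3.20)
$$
\|\zeta_l\|_\beta \le \|\zeta_l\|_{C(I_\beta)} \le |y_l| , \qquad \forall\, l \in \mathbb{L} .
$$
Thus, every member of the net $\{\mu^{m}_{\beta,\Lambda}(\cdot|\zeta)\}$, as well as its limit, is the zero measure. For $y \in S'$, one has $\zeta \in \Omega^{t}_\beta$, and the members of the net given by (2.66)–(2.67) now may be written as follows
$$
\mu^{m}_{\beta,\Lambda}(d\omega_\Lambda|\zeta) = F_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi^{m}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{3.26}
$$
where
$$
F_{\beta,\Lambda}(\omega_\Lambda|\zeta) = \frac{1}{Z_{\beta,\Lambda}(\zeta)} \exp\Bigl\{-\sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'} \langle\omega_l,\zeta_{l'}\rangle_\beta\Bigr\}\,\Psi_{\beta,\Lambda}(\omega_\Lambda) , \tag{3.27}
$$
where
$$
\Psi_{\beta,\Lambda}(\omega_\Lambda) = \exp\Bigl\{-\frac{1}{2}\sum_{l,l'\in\Lambda} d_{ll'} \langle\omega_l,\omega_{l'}\rangle_\beta - \sum_{l\in\Lambda}\int_{I_\beta} V(\omega_l(\tau))\,d\tau\Bigr\} . \tag{3.28}
$$
Since $\zeta \in \Omega^{t}_\beta$ and the dynamical matrix satisfies the condition (D3), Sec. 2.1, both $F_{\beta,\Lambda}(\cdot|\zeta)$, $\Psi_{\beta,\Lambda}$ belong to $C_b(\Omega_{\beta,\Lambda})$. Moreover, $G F_{\beta,\Lambda}(\cdot|\zeta) \in C_b(\Omega_{\beta,\Lambda})$, for all $\zeta \in \Omega^{t}_\beta$ and any $G \in C_b(\Omega_{\beta,\Lambda})$. Thus by Lemma 3.2, one has
$$
\int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda)\,\mu^{m}_{\beta,\Lambda}(d\omega_\Lambda|\zeta) = \int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda) F_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi^{m}_{\beta,\Lambda}(d\omega_\Lambda) \to \int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda) F_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) ,
$$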
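when $m \to \infty$. But
$$
\begin{aligned}
\int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda) F_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda)
&= \frac{1}{Z_{\beta,\Lambda}(\zeta)} \int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda)\Psi_{\beta,\Lambda}(\omega_\Lambda) \exp\Bigl\{-\sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'} \langle\omega_l,\zeta_{l'}\rangle_\beta\Bigr\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) \\
&= \frac{1}{Z_{\beta,\Lambda}(\zeta)} \int_{\mathbb{R}^{D|\Lambda|}} \hat G(x_\Lambda)\hat\Psi_{\beta,\Lambda}(x_\Lambda) \exp\Bigl\{-\sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'} \Bigl(x_l, \int_{I_\beta}\zeta_{l'}(\tau)\,d\tau\Bigr)\Bigr\}\,\varpi_{\beta,\Lambda}(dx_\Lambda) \\
&= \frac{1}{Z_{\beta,\Lambda}(\zeta)} \int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda)\Psi_{\beta,\Lambda}(\omega_\Lambda) \exp\Bigl\{-\sum_{l\in\Lambda,\,l'\in\Lambda^c} d_{ll'} \langle\omega_l, y_{l'}\rangle_\beta\Bigr\}\,\chi^{qc}_{\beta,\Lambda}(d\omega_\Lambda) \\
&= \int_{\Omega_{\beta,\Lambda}} G(\omega_\Lambda)\,\mu^{qc}_{\beta,\Lambda}(d\omega_\Lambda|y) ,
\end{aligned}
\tag{3.29}
$$
where
$$
\hat G(x_\Lambda) = G(\omega_\Lambda) , \qquad \omega_\Lambda(\tau) = x_\Lambda ,
$$
and similarly $\hat\Psi_{\beta,\Lambda}$. The proof of Theorem 3.3 is straightforward.

4. Green Functions for Unbounded Operators

The most spectacular phenomenon described by the model considered in this work is the spontaneous $O(D)$-symmetry breaking, which occurs when the fluctuations of displacements of particles become large. Since the displacement operators $q_l$, $l \in \mathbb{L}$ are unbounded, to study this phenomenon we should extend the local Gibbs states, as well as the corresponding Green functions, to certain classes of unbounded multiplication operators. To this end we will use representations like (2.62), (2.63), which make it possible to replace bounded functions by suitable integrable unbounded functions.

Theorem 4.1. Let the functions $A_1, \dots, A_n : \mathbb{R}^{D|\Lambda|} \to \mathbb{C}$ be such that for every $\beta > 0$ and every $\tau \in I_\beta$, the functions $\Omega_{\beta,\Lambda} \ni \omega_\Lambda \mapsto A_j(\omega_\Lambda(\tau))$, $j = 1, \dots, n$, are $\mu_{\beta,\Lambda}$ (respectively $\mu^{(0)}_{\beta,\Lambda}$) integrable. Then, for the corresponding multiplication operators $A_1, \dots, A_n$, the Green function (2.62) (respectively (2.63)) can be analytically continued to the domain $\mathcal{D}^{\beta}_n$ defined by (2.14).

Proof. In view of the statement (c) of Lemma 2.1, it is enough to show that there exists a function $F \in {\rm Hol}(\mathcal{D}^{\beta}_n)$, such that its restriction to the set $\mathcal{D}^{\beta}_n(0, \dots, 0)$ coincides with the function (2.62) (respectively (2.63)). Let us show this in the case of periodic boundary conditions. By (2.62), for any $\delta > 0$, the operators $\hat A_j \stackrel{\rm def}{=} A_j \exp(-\delta H_\Lambda)$, $j = 1, \dots, n$ are bounded since
$$
{\rm trace}\,\{A_j \exp(-\delta H_\Lambda)\} = Z_{\delta,\Lambda} \int_{\Omega_\delta} A_j(\omega_\Lambda(0))\,\mu_{\delta,\Lambda}(d\omega_\Lambda) < \infty . \tag{4.1}
$$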
Given $\delta \in (0, \beta)$, we take positive $\delta_1, \dots, \delta_n$, such that $\delta_1 + \cdots + \delta_n = \delta$. Then, for $0 \le \tau_1 \le \cdots \le \tau_n \le \beta$, one has
$$
\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \frac{Z_{\beta-\delta,\Lambda}}{Z_{\beta,\Lambda}}\,\Gamma^{\beta-\delta,\Lambda}_{\hat A_1,\dots,\hat A_n}(\hat\tau_1,\dots,\hat\tau_n) , \tag{4.2}
$$
where the arguments of the Green function on the right-hand side are
$$
\hat\tau_1 = \tau_1 , \qquad \hat\tau_k = \tau_k - (\delta_1 + \cdots + \delta_{k-1}) , \quad k = 2, \dots, n , \tag{4.3}
$$
and satisfy the condition $0 \le \hat\tau_1 \le \cdots \le \hat\tau_n \le \beta - \delta$. By Lemma 2.1, the function on the right-hand side of (4.2) can be continued to a function holomorphic in $(\hat t_1, \dots, \hat t_n) \in \mathcal{D}^{\beta-\delta}_n$. Let $\widehat{\mathcal{D}}^{\beta-\delta}_{\delta_1,\dots,\delta_n}$ stand for the set of values of $(t_1, \dots, t_n) \in \mathcal{D}^{\beta}_n$, such that $t_1 = \hat t_1$, $t_k = \hat t_k + i(\delta_1 + \cdots + \delta_{k-1})$, $k = 2, \dots, n$ with $(\hat t_1, \dots, \hat t_n) \in \mathcal{D}^{\beta-\delta}_n$. Then the left-hand side of (4.2) can be continued to a function of $(t_1, \dots, t_n)$ holomorphic in $\widehat{\mathcal{D}}^{\beta-\delta}_{\delta_1,\dots,\delta_n}$, which is an open subset of $\mathcal{D}^{\beta}_n$. But
$$
\mathcal{D}^{\beta}_n = \bigcup \widehat{\mathcal{D}}^{\beta-(\delta_1+\cdots+\delta_n)}_{\delta_1,\dots,\delta_n} ,
$$
where the union is taken over all $\delta_1, \dots, \delta_n$ running through the interval $(0, \beta)$ and obeying the condition $\delta_1 + \cdots + \delta_n < \beta$. Thus $\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}$ can be continued to the whole $\mathcal{D}^{\beta}_n$.

In contrast to the case of bounded operators (cf. Lemma 2.1(b)), one cannot expect that, for unbounded operators, the extended Green functions $G^{\beta,\Lambda}_{A_1,\dots,A_n}$ are uniformly bounded on $\bar{\mathcal{D}}^{\beta}_n$ and continuous on its boundaries. To get such a continuity we impose additional restrictions on the functions $A_1, \dots, A_n$.

Definition 4.1. A continuous function $A : \mathbb{R}^{D|\Lambda|} \to \mathbb{C}$ belongs to the family $P^{(D)}_\Lambda$ if for arbitrary $a > 0$, the function
$$
\mathbb{R}^{D|\Lambda|} \ni x_\Lambda \mapsto |A(x_\Lambda)| \exp\Bigl\{-a \sum_{l\in\Lambda} |x_l|^2\Bigr\} \tag{4.4}
$$
is bounded on $\mathbb{R}^{D|\Lambda|}$.

Here $|x_l|$ stands for the Euclidean norm of $x_l \in \mathbb{R}^D$. In the case of one-point boxes, i.e. for $|\Lambda| = 1$, we will simply write $P^{(D)}$. It is worth noting that under point-wise multiplication $P^{(D)}_\Lambda$ is an algebra.
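As a worked illustration of Definition 4.1, the following sketch estimates the supremum of the weighted function (4.4) on a grid. It is only a numerical aid under stated assumptions: it takes $D|\Lambda| = 1$, and the helper `weighted_sup`, the grid size and the two test functions are hypothetical choices, not part of the original argument. A polynomial such as $x^4$ passes the test, while $\exp(x^2)$ fails it for $a < 1$.

```python
import math

def weighted_sup(A, a, radius=50.0, steps=20001):
    """Grid estimate of sup over |x| <= radius of |A(x)| * exp(-a * x^2)."""
    best = 0.0
    for i in range(steps):
        x = -radius + 2.0 * radius * i / (steps - 1)
        best = max(best, abs(A(x)) * math.exp(-a * x * x))
    return best

# x^4 * exp(-a x^2) attains its maximum at x^2 = 2/a, with value 4 / (a e)^2.
a = 0.5
poly_sup = weighted_sup(lambda x: x ** 4, a)
poly_exact = 4.0 / (a * math.e) ** 2

# exp(x^2) is not in the family: for a = 0.5 the weighted function is
# exp(x^2 / 2), so the grid estimate keeps growing with the radius.
grow_small = weighted_sup(lambda x: math.exp(x * x), a, radius=5.0)
grow_large = weighted_sup(lambda x: math.exp(x * x), a, radius=10.0)
```

The algebra property noted above can be seen from the same weight: $|AB| e^{-a|x|^2} = (|A| e^{-a|x|^2/2})(|B| e^{-a|x|^2/2})$, and each factor is bounded when $A$ and $B$ belong to the family.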
Corollary 4.1. For arbitrary $A_1, \dots, A_n \in P^{(D)}_\Lambda$, the temperature Green functions (2.62), (2.63) may be continued analytically in accordance with Theorem 4.1.

Indeed, by (2.33), functions from $P^{(D)}_\Lambda$ are integrable.

Theorem 4.2. Given a box $\Lambda$, let $A_1, \dots, A_n$ belong to $P^{(D)}_\Lambda$. Then for all $\zeta \in \Omega_\beta$, the Green functions (2.62), (2.63), (2.69) are continuous functions of $(\tau_1, \dots, \tau_n) \in I^{n}_\beta$.

Proof. In view of (2.66) one may rewrite (2.69) as follows
$$
\Gamma^{\zeta,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \int_{\Omega_{\beta,\Lambda}} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\,\Psi_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi_{\beta,\Lambda}(d\omega_\Lambda) , \tag{4.5}
$$
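where
$$
\Psi_{\beta,\Lambda}(\omega_\Lambda|\zeta) \stackrel{\rm def}{=} \frac{1}{Z_{\beta,\Lambda}(\zeta)} \exp\{-E^{V}_{\beta,\Lambda}(\omega_\Lambda|\zeta)\} . \tag{4.6}
$$
All $A_j$ are continuous, thus all the functions $\Omega_{\beta,\Lambda} \ni \omega_\Lambda \mapsto A_j(\omega_\Lambda(\tau))$ are continuous as well. Set
$$
R(\omega_\Lambda) \stackrel{\rm def}{=} \max_{j=1,\dots,n}\ \sup_{\tau_j\in I_\beta} |A_j(\omega_\Lambda(\tau_j))| . \tag{4.7}
$$
By Lemma 2.2 the latter function is $\chi_{\beta,\Lambda}$-integrable since all $A_j$ belong to $P^{(D)}_\Lambda$. Hence
$$
\phi(d\omega_\Lambda) \stackrel{\rm def}{=} [R(\omega_\Lambda)]^n\,\Psi_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi_{\beta,\Lambda}(d\omega_\Lambda)
$$
is a measure on $\Omega_{\beta,\Lambda}$. It is tight because, for finite $\Lambda$, $\Omega_{\beta,\Lambda}$ is a Polish space. Therefore, for every $\varepsilon > 0$, there exists a compact subset $\Omega^{\varepsilon}_{\beta,\Lambda} \subset \Omega_{\beta,\Lambda}$ such that
$$
\phi(\Omega_{\beta,\Lambda} \setminus \Omega^{\varepsilon}_{\beta,\Lambda}) < \frac{\varepsilon}{4} . \tag{4.8}
$$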
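For $\delta > 0$, let
$$
\Upsilon_\delta \stackrel{\rm def}{=} \sup \bigl|\Gamma^{\zeta,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) - \Gamma^{\zeta,\beta,\Lambda}_{A_1,\dots,A_n}(\tau'_1,\dots,\tau'_n)\bigr| , \tag{4.9}
$$
where the supremum is taken over the subset of $I^{2n}_\beta$ defined by the condition
$$
\max_{j=1,\dots,n} |\tau_j - \tau'_j| < \delta .
$$
For such $\delta$ and $\omega_\Lambda \in \Omega_{\beta,\Lambda}$, we set
$$
W_\delta(\omega_\Lambda) \stackrel{\rm def}{=} \max_{j=1,\dots,n}\ \sup_{|\tau_j-\tau'_j|<\delta} |A_j(\omega_\Lambda(\tau_j)) - A_j(\omega_\Lambda(\tau'_j))| . \tag{4.10}
$$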
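Since all $A_j$ are continuous functions from $\mathbb{R}^{D|\Lambda|}$ to $\mathbb{C}$, in order that $\Omega^{\varepsilon}_{\beta,\Lambda}$ be compact it is necessary and sufficient that the following conditions be satisfied simultaneously (see [66, p. 213]):
$$
\text{(i)}\quad \lim_{\delta\searrow 0}\ \sup_{\omega_\Lambda\in\Omega^{\varepsilon}_{\beta,\Lambda}} W_\delta(\omega_\Lambda) = 0 , \tag{4.11}
$$
$$
\text{(ii)}\quad \sup_{\omega_\Lambda\in\Omega^{\varepsilon}_{\beta,\Lambda}} R(\omega_\Lambda) < \infty , \tag{4.12}
$$
where $R$ was defined by (4.7). Now let us estimate $\Upsilon_\delta$. From (4.9), (4.7) and (4.10), one obtains
$$
\Upsilon_\delta \le n \int_{\Omega^{\varepsilon}_{\beta,\Lambda}} W_\delta(\omega_\Lambda)[R(\omega_\Lambda)]^{n-1}\,\Psi_{\beta,\Lambda}(\omega_\Lambda|\zeta)\,\chi_{\beta,\Lambda}(d\omega_\Lambda) + 2\phi(\Omega_{\beta,\Lambda} \setminus \Omega^{\varepsilon}_{\beta,\Lambda}) .
$$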
In view of (4.11) and (4.12), one can choose δ small enough making the first term in the right-hand side of the latter formula less than ε/2. The second one has already been estimated by (4.8). The stated continuity of the Green functions (2.62) and (2.63) may be proven just in the same way.
5. Lattice Approximation

In the following two sections our aim is to prove, for the Euclidean measures (2.60), (2.61) and (2.66), correlation inequalities analogous to the inequalities known in the Euclidean quantum field theory (see e.g. [76, 77]). In the subsequent sections we use these basic inequalities to get a number of new correlation inequalities, which in turn are used in studying physical properties of our models. The basic inequalities we are going to prove concern the one-dimensional oscillations, which does not preclude their application to the vector case given below. Thus, we put $D = 1$ in the following two sections. Since we will use the mentioned inequalities for the moments not only of the above introduced local Gibbs measures, we prove them in a more general setting. Given a box $\Lambda$ and $\xi \in \Omega_{\beta,\Lambda}$, we define the following measure
$$
\begin{aligned}
\varrho_{\beta,\Lambda}(d\omega_\Lambda|\xi) &= \Phi_{\beta,\Lambda}(\omega_\Lambda|\xi)\,\chi_{\beta,\Lambda}(d\omega_\Lambda) \\
&\stackrel{\rm def}{=} \frac{1}{Y_{\beta,\Lambda}(\xi)} \exp\Bigl\{-\frac{1}{2}\sum_{l,l'\in\Lambda} J_{ll'} \langle\omega_l,\omega_{l'}\rangle_\beta - \sum_{l\in\Lambda} \langle\omega_l,\xi_l\rangle_\beta - \sum_{l\in\Lambda}\int_{I_\beta} W(\omega_l(\tau))\,d\tau\Bigr\}\,\chi_{\beta,\Lambda}(d\omega_\Lambda) ,
\end{aligned}
\tag{5.1}
$$
with certain nonpositive $J_{ll'} = J_{l'l}$, $\xi \in X_\beta$ and $W(x) = w((x, x))$, $w$ being a polynomial satisfying (2.4). Clearly, every measure (2.60), (2.61) and (2.66) may be written in this form. The Gaussian measure $\chi_\beta$ is determined by its covariance operator $S_\beta$ given by (2.26). Since $D = 1$, the base $E_\beta$ (2.45) consists of the eigenfunctions $\{e_k \mid k \in K\}$. In this case the canonical representation (2.46) may be rewritten as follows
$$
S_\beta = \sum_{k\in K} \frac{1}{mk^2 + 1}\,P_k . \tag{5.2}
$$
Now we choose $N = 2L$, $L \in \mathbb{N}$ and set
$$
\lambda^{(N)}_k \stackrel{\rm def}{=} \frac{1}{m\left[\dfrac{2N}{\beta}\sin\left(\dfrac{\beta}{2N}k\right)\right]^2 + 1} , \tag{5.3}
$$
and
$$
S^{(N)}_\beta \stackrel{\rm def}{=} \sum_{k\in K_N} \lambda^{(N)}_k P_k . \tag{5.4}
$$
It is a technical exercise to prove the following statement.
Proposition 5.1. The sequence of finite-rank operators $\{S^{(N)}_\beta\}$ converges in the trace norm, when $N \to \infty$, to the operator $S_\beta$.

Let $\chi^{(N)}_\beta$ be the Gaussian measure on $X_\beta$ having $S^{(N)}_\beta$ as a covariance operator. This measure may also be written in a coordinate representation. To this end we introduce Gaussian measures on $\mathbb{R}$, $\sigma^{(N)}_k$, $k \in K$, such that
$$
\int_{\mathbb{R}} \exp(ixy)\,\sigma^{(N)}_k(dy) = \exp\left\{-\frac{1}{2}\lambda^{(N)}_k x^2\right\} , \tag{5.5}
$$
with $\lambda^{(N)}_k$ given by (5.3). Then
$$
\chi^{(N)}_\beta(d\omega) = \bigotimes_{k\in K_N} \sigma^{(N)}_k(d\hat\omega_k) \bigotimes_{k\in K\setminus K_N} \delta(\hat\omega_k)\,d\hat\omega_k , \tag{5.6}
$$
where
$$
\omega(\tau) = \sum_{k\in K} \hat\omega(k) e_k(\tau) , \qquad \hat\omega(k) = \int_{I_\beta} \omega(\tau) e_k(\tau)\,d\tau , \tag{5.7}
$$
and $\delta$ is the Dirac δ-function on $\mathbb{R}$. Proposition 5.1 and Lemma 2.4 immediately yield

Lemma 5.1. The sequence of Gaussian measures $\{\chi^{(N)}_\beta\}$ converges weakly in the Banach space $C_\beta$ to the measure $\chi_\beta$.
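The convergence stated in Proposition 5.1 can be explored numerically. The sketch below is a hedged illustration, not the proof: it assumes $K = \{2\pi\kappa/\beta \mid \kappa \in \mathbb{Z}\}$ and $K_N = \{2\pi\kappa/\beta \mid \kappa = -(L-1), \dots, L\}$ with $N = 2L$ (both sets are defined outside this section, so these forms are assumptions suggested by (5.24)), and it truncates the infinite tail at a hypothetical cutoff `tail`.

```python
import math

def lam_exact(k, m):
    """Eigenvalue 1 / (m k^2 + 1) of S_beta, see (5.2)."""
    return 1.0 / (m * k * k + 1.0)

def lam_lattice(k, m, beta, N):
    """Eigenvalue lambda_k^(N) of S_beta^(N), see (5.3)."""
    s = (2.0 * N / beta) * math.sin(beta * k / (2.0 * N))
    return 1.0 / (m * s * s + 1.0)

def trace_distance(m, beta, L, tail=20000):
    """Truncated trace norm of S_beta^(N) - S_beta for N = 2L."""
    N = 2 * L
    total = 0.0
    # Modes inside K_N: the two operators differ by their eigenvalues.
    for kappa in range(-(L - 1), L + 1):
        k = 2.0 * math.pi * kappa / beta
        total += abs(lam_lattice(k, m, beta, N) - lam_exact(k, m))
    # Modes outside K_N: S_beta^(N) vanishes there, so S_beta contributes fully.
    for kappa in range(L + 1, tail + 1):
        total += lam_exact(2.0 * math.pi * kappa / beta, m)
    for kappa in range(L, tail + 1):
        total += lam_exact(-2.0 * math.pi * kappa / beta, m)
    return total

distances = [trace_distance(1.0, 1.0, L) for L in (4, 16, 64)]
```

With these assumptions the entries of `distances` decrease as $L$ grows, roughly like $1/L$, in line with the trace-norm convergence of $S^{(N)}_\beta$ to $S_\beta$.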
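Employing the sequence $\{\chi^{(N)}_\beta \mid N \in \mathbb{N}\}$, we will construct by means of (2.60) and (2.66) approximations of the measure $\varrho_{\beta,\Lambda}(\cdot|\xi)$ (5.1), and hence of its moments such as the Green functions (5.14). This means that, for integrable functions $F : \Omega_{\beta,\Lambda} \to \mathbb{C}$, the integrals
$$
\langle F \rangle_{\varrho_{\beta,\Lambda}(\cdot|\xi)} \stackrel{\rm def}{=} \int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\,\varrho_{\beta,\Lambda}(d\omega_\Lambda|\xi) = \int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\Phi_{\beta,\Lambda}(\omega_\Lambda|\xi)\,\chi_{\beta,\Lambda}(d\omega_\Lambda) \tag{5.8}
$$
will be approximated by
$$
\int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\Phi_{\beta,\Lambda}(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) = \int_{\Omega_{\beta,\Lambda}} F^{(N)}(\omega_\Lambda)\Phi^{(N)}_{\beta,\Lambda}(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) , \tag{5.9}
$$
where
$$
\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) \stackrel{\rm def}{=} \bigotimes_{l\in\Lambda} \chi^{(N)}_\beta(d\omega_l) , \tag{5.10}
$$
and
$$
F^{(N)}(\omega_\Lambda) \stackrel{\rm def}{=} F(\omega^{(N)}_\Lambda) , \qquad \Phi^{(N)}_{\beta,\Lambda}(\omega_\Lambda|\xi) \stackrel{\rm def}{=} \Phi_{\beta,\Lambda}(\omega^{(N)}_\Lambda|\xi) , \tag{5.11}
$$
$$
\omega^{(N)}_\Lambda = (\omega^{(N)}_l)_{l\in\Lambda} , \qquad \omega^{(N)}_l \stackrel{\rm def}{=} \sum_{k\in K_N} P_k \omega_l . \tag{5.12}
$$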
The reason to use such approximations is that the integral on the right-hand side of (5.9) may be rewritten as an integral over a finite-dimensional space. To the latter integral one may apply the classical ferromagnetic interpretation, which would yield the correlation inequalities we are going to get. To do this we should define more precisely the class of functions $F$ for which such an interpretation makes sense. First of all we will need the following functions
$$
F(\omega_\Lambda) = A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) , \tag{5.13}
$$
with $A_1, \dots, A_n \in P^{(1)}_\Lambda$, which determine the Green functions
$$
\Gamma^{(\xi)}_{A_1,\dots,A_n}(\tau_1,\dots,\tau_n) = \int_{\Omega_{\beta,\Lambda}} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\,\varrho_{\beta,\Lambda}(d\omega_\Lambda|\xi) \stackrel{\rm def}{=} \langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) \rangle_{\varrho_{\beta,\Lambda}(\cdot|\xi)} . \tag{5.14}
$$
The latter functions are continuous on $I^{n}_\beta$ in view of Theorem 4.2. Since the $\omega^{(N)}_\Lambda$ belong to a finite-dimensional subspace of $X_{\beta,\Lambda}$, they can be written as linear combinations of $\omega_\Lambda((\nu/N)\beta)$, $\nu = 0, \dots, N-1$, which may be chosen as variables for the mentioned finite-dimensional integrals. Therefore, for $F$ given by (5.13), it would be much more convenient to construct such approximations if the arguments $\tau_1, \dots, \tau_n$ belonged to $Q_\beta \subset I_\beta$, where $Q_\beta$ consists of the values of $\tau$ for which $\tau/\beta$ is rational. Then, for given $\tau_1, \dots, \tau_n \in Q_\beta$, one can find $\nu_1, \dots, \nu_n, N \in \mathbb{N}$, such that $\tau_j = (\nu_j/N)\beta$, $j = 1, \dots, n$. In this case, we obtain ferromagnetic approximations of the Green functions (5.14) only for the arguments belonging to $Q_\beta$. But in view of the continuity of these functions, this will be enough since $Q_\beta$ is dense in $I_\beta$. Following this way we will deal with such basic types of functions $\Omega_{\beta,\Lambda} \to \mathbb{R}$:

(i) $\omega_\Lambda \mapsto \omega_\Lambda(\tau)$, $\tau \in Q_\beta$;

(ii) $\omega_\Lambda \mapsto \langle\omega_l,\omega_{l'}\rangle_\beta$, $\omega_\Lambda \mapsto \langle\omega_l,\xi_l\rangle_\beta$, $l, l' \in \Lambda$, $\xi \in \Omega_{\beta,\Lambda}$; (5.15)

(iii) $\omega_\Lambda \mapsto \int_{I_\beta} W(\omega_l(\tau))\,d\tau$.

Definition 5.1. Given $\tau_1, \dots, \tau_n$, $n \in \mathbb{Z}_+$, the family $F_{\beta,\Lambda}(\tau_1, \dots, \tau_n)$ consists of the continuous functions $F : \Omega_{\beta,\Lambda} \to \mathbb{C}$ which are compositions of the functions $\omega_\Lambda \mapsto (\omega_\Lambda(\tau_1), \dots, \omega_\Lambda(\tau_n))$ with functions $\mathbb{R}^{n|\Lambda|} \to \mathbb{C}$, such that for all $a_1, \dots, a_n > 0$,
$$
|F(\omega_\Lambda)| \exp\Bigl\{-\sum_{l\in\Lambda}\sum_{j=1}^{n} a_j [\omega_l(\tau_j)]^2\Bigr\} < \infty . \tag{5.16}
$$
Clearly, the functions $F$ having the form (5.13) with $A_1, \dots, A_n \in P^{(1)}_\Lambda$ belong to the family just introduced.
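The choice of a common lattice for rational times, $\tau_j = (\nu_j/N)\beta$ with even $N$, can be made concrete. The sketch below is a minimal illustration: the function name `lattice_indices` and its `multiple` parameter are hypothetical, and the times are passed as the ratios $\tau_j/\beta$. Increasing `multiple` produces ever finer admissible lattices for the same times.

```python
from fractions import Fraction
from math import lcm

def lattice_indices(ratios, multiple=1):
    """Return (N, [nu_j]) with N even and tau_j / beta = nu_j / N.

    ratios   -- the rational numbers tau_j / beta, given as Fractions
    multiple -- positive integer; larger values give finer lattices
    """
    common = lcm(*(r.denominator for r in ratios))
    N = 2 * multiple * common  # the factor 2 keeps N even
    nus = [int(r * N) for r in ratios]  # exact, since r * N is an integer
    return N, nus

ratios = [Fraction(1, 3), Fraction(1, 4), Fraction(5, 6)]
N1, nus1 = lattice_indices(ratios)              # coarsest even lattice
N2, nus2 = lattice_indices(ratios, multiple=2)  # a refinement of it
```

Here `N1` is 24 with indices `[8, 6, 20]`, and `N2` is 48 with indices `[16, 12, 40]`; both reproduce the same ratios $\nu_j/N$.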
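Thereafter, we choose $\tau_1, \dots, \tau_n \in Q_\beta$, $n \in \mathbb{Z}_+$, $\xi \in \Omega_{\beta,\Lambda}$ and keep them fixed. Then for $n \ge 1$, there exist sequences tending to infinity $\{N^{(m)} \mid m \in \mathbb{N}\}$, $\{\nu^{(m)}_j \mid m \in \mathbb{N}\}$, $j = 1, \dots, n$, such that for all $m \in \mathbb{N}$,
$$
\tau_j = \frac{\nu^{(m)}_j}{N^{(m)}}\,\beta , \qquad j = 1, \dots, n . \tag{5.17}
$$
Below we drop the superscript $(m)$ assuming that $N$ and $\nu_j$ tend to infinity in such a way that (5.17) holds. We also suppose that all $N$ are even. The set of values of such $N$ satisfying (5.17) depends on the choice of $\{\tau_j, j = 1, \dots, n\}$; we denote it by $\mathcal{N}(\tau_1, \dots, \tau_n)$.

Theorem 5.1. For every $F \in F_{\beta,\Lambda}(\tau_1, \dots, \tau_n)$ and all $\xi \in \Omega_{\beta,\Lambda}$, the following convergence holds
$$
\int_{\Omega_{\beta,\Lambda}} F^{(N)}(\omega_\Lambda)\Phi^{(N)}(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) \to \int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\Phi(\omega_\Lambda|\xi)\,\chi_{\beta,\Lambda}(d\omega_\Lambda) \tag{5.18}
$$
as $\mathcal{N}(\tau_1, \dots, \tau_n) \ni N \to \infty$.

Proof. By (5.6) and (5.10) one has
$$
\int_{\Omega_{\beta,\Lambda}} F^{(N)}(\omega_\Lambda)\Phi^{(N)}(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) = \int_{\Omega_{\beta,\Lambda}} F(\omega_\Lambda)\Phi(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda) .
$$
By Lemma 5.1, for any box $\Lambda$, the sequence of product measures $\{\chi^{(N)}_{\beta,\Lambda} \mid N \in \mathcal{N}(\tau_1, \dots, \tau_n)\}$ converges weakly in $\Omega_{\beta,\Lambda}$ to the measure $\chi_{\beta,\Lambda}$. On the other hand, for every $\xi \in \Omega_{\beta,\Lambda}$, the function $F\Phi(\cdot|\xi)$ is bounded and continuous on $\Omega_{\beta,\Lambda}$, which yields (5.18).

Now let $G : \mathbb{R}^{n|\Lambda|} \to \mathbb{C}$ be such that $F(\omega_\Lambda) = G(\omega_\Lambda(\tau_1), \dots, \omega_\Lambda(\tau_n))$. Then $F^{(N)}(\omega_\Lambda) = G(\omega^{(N)}_\Lambda(\tau_1), \dots, \omega^{(N)}_\Lambda(\tau_n))$. Our next statement gives the ferromagnetic representation of the above approximating integrals.

Theorem 5.2. For every $\tau_1, \dots, \tau_n \in Q_\beta$, all $F \in F_{\beta,\Lambda}(\tau_1, \dots, \tau_n)$, any $\xi \in \Omega_{\beta,\Lambda}$, and all $N \in \mathcal{N}(\tau_1, \dots, \tau_n)$ the following representation holds
$$
\begin{aligned}
\int_{\Omega_{\beta,\Lambda}} F^{(N)}(\omega_\Lambda)\Phi^{(N)}_{\beta,\Lambda}(\omega_\Lambda|\xi)\,\chi^{(N)}_{\beta,\Lambda}(d\omega_\Lambda)
&= K^{(N)}_{\beta,\Lambda}(\xi) \int_{\mathbb{R}^{N|\Lambda|}} G(S_\Lambda(\nu_1), \dots, S_\Lambda(\nu_n))\,\rho_{\beta,\Lambda}(dS_\Lambda|X) \\
&= K^{(N)}_{\beta,\Lambda}(\xi)\,\langle G(S_\Lambda(\nu_1), \dots, S_\Lambda(\nu_n)) \rangle_{\rho_{\beta,\Lambda}(\cdot|X)} .
\end{aligned}
\tag{5.19}
$$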
Here $K^{(N)}_{\beta,\Lambda}(\xi)$ is a positive constant, $\nu_j = (\tau_j/\beta)N$, $j = 1, \dots, n$, and the probability measure $\rho_{\beta,\Lambda}(\cdot|X)$ has the form
$$
\begin{aligned}
\rho_{\beta,\Lambda}(dS_\Lambda|X) \stackrel{\rm def}{=} \frac{1}{C_{\beta,\Lambda}(X)} \exp\Bigl\{ &-\frac{1}{2}\sum_{l,l'\in\Lambda} J_{ll'} \sum_{\nu=0}^{N-1} S_l(\nu)S_{l'}(\nu) - \sum_{l\in\Lambda}\sum_{\nu=0}^{N-1} S_l(\nu)X_l(\nu) \\
&- \frac{mN^2}{2\beta^2} \sum_{l\in\Lambda}\sum_{\nu=0}^{N-1} [S_l(\nu+1) - S_l(\nu)]^2 \Bigr\} \times \bigotimes_{l\in\Lambda}\bigotimes_{\nu=0}^{N-1} \sigma^{(N)}(dS_l(\nu)) ,
\end{aligned}
\tag{5.20}
$$
where the coefficients $J_{ll'}$ are the same as in (5.1), $\{X_l(\nu) \mid l \in \Lambda, \nu = 0, \dots, N-1\}$ is a certain real vector depending on $\xi$ ($X = 0$ for $\xi = 0$), and
$$
\sigma^{(N)}(dS_l(\nu)) = \exp\left\{-\frac{\beta}{N} W\left(\sqrt{\frac{N}{\beta}}\,S_l(\nu)\right) - \frac{1}{2}[S_l(\nu)]^2\right\} dS_l(\nu) . \tag{5.21}
$$
It should be noted here that by [76] the measure $\rho_{\beta,\Lambda}(\cdot|X)$ corresponds to a general type ferromagnet, whereas $\rho_{\beta,\Lambda}(\cdot|0)$ corresponds to an even ferromagnet.

Proof of Theorem 5.2. First we change the variables in the integral on the right-hand side of (5.9) by means of the following Fourier transformations (cf. (5.7))
$$
\omega_l(\tau) = \sum_{k\in K} \hat\omega_l(k) e_k(\tau) , \qquad \hat\omega_l(k) = \int_{I_\beta} \omega_l(\tau) e_k(\tau)\,d\tau , \tag{5.22}
$$
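where the functions $e_k$, $k \in K$ were defined by (2.45). Then, for $Q_\beta \ni \tau = (\nu/N)\beta$, one has
$$
\omega^{(N)}_l(\tau) = \sum_{k\in K_N} \hat\omega_l(k)\,e_k\left(\frac{\nu}{N}\beta\right) = \sqrt{\frac{N}{\beta}} \sum_{p\in P_N} \hat\omega_l\left(\frac{N}{\beta}p\right) \varepsilon_p(\nu) , \tag{5.23}
$$
where
$$
P_N \stackrel{\rm def}{=} \left\{ p = \frac{2\pi}{N}\kappa \;\Big|\; \kappa = -(L-1), \dots, L \right\} , \tag{5.24}
$$
and for $\nu = 0, 1, \dots, N-1$ (cf. (2.45)),
$$
\varepsilon_p(\nu) = \sqrt{\frac{2}{N}} \cos p\nu \ \ (p > 0) , \qquad \varepsilon_p(\nu) = -\sqrt{\frac{2}{N}} \sin p\nu \ \ (p < 0) , \qquad \varepsilon_0(\nu) = \frac{1}{\sqrt{N}} . \tag{5.25}
$$
For the functions of the type (ii) taken at $\omega^{(N)}_\Lambda$, one has
$$
\langle\omega^{(N)}_l, \omega^{(N)}_{l'}\rangle_\beta = \sum_{k\in K_N} \hat\omega_l(k)\hat\omega_{l'}(k) = \sum_{p\in P_N} \hat\omega_l\left(\frac{N}{\beta}p\right) \hat\omega_{l'}\left(\frac{N}{\beta}p\right) , \tag{5.26}
$$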
and (N )
, ξl iβ =
hωl
X
Z ω ˆ l (k)ξˆl (k) ,
ξˆl (k) =
k∈KN
Iβ
ξl (τ )ek (τ )dτ .
(5.27)
As for functions of the type (iii), instead of (5.22) it is more convenient to use the following transformation 1 X ω ˜ l (k) exp(ikτ ) , ωl (τ ) = √ β k∈K Z 1 ωl (τ ) exp(−ikτ )dτ . ω ˜ l (k) = √ β Iβ
(5.28)
Then one has Z Iβ
Z r X (N ) W ωl (τ ) dτ = ws s=1
Iβ
i2s h (N ) ωl (τ ) dτ .
(5.29)
Further Z Iβ
i2s h (N ) ωl (τ ) dτ = β −s
X
Z ×
ω ˜ l (k1 ) · · · ω ˜ l (k2s )
k1 ,...,k2s ∈KN
Iβ
= β −s+1
exp[i(k1 + · · · + k2s )τ ]dτ X
ω ˜ l (k1 ) · · · ω ˜ l (k2s )
k1 ,...,k2s ∈KN
× δ(k1 + · · · + k2s ) .
(5.30)
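Here $\delta(0) = 1$ and $\delta(k) = 0$ for $k \neq 0$. Now we introduce the variables (quasi-spins) $S_l(\nu)$:

$$\begin{aligned}
S_l(\nu) &= \sqrt{\frac{\beta}{N}}\; \omega_l\!\left(\frac{\nu}{N}\beta\right), \qquad l \in \Lambda, \quad \nu = 0, 1, \dots, N-1; \\
\hat S_l(p) &= \hat\omega_l\!\left(\frac{N}{\beta}p\right), \qquad \tilde S_l(p) = \tilde\omega_l\!\left(\frac{N}{\beta}p\right), \qquad p \in P_N;
\end{aligned} \tag{5.31}$$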
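for which one has

$$\begin{aligned}
S_l(\nu) &= \sum_{p\in P_N} \hat S_l(p)\, \varepsilon_p(\nu) = \frac{1}{\sqrt{N}} \sum_{p\in P_N} \tilde S_l(p) \exp(ip\nu); \\
\hat S_l(p) &= \sum_{\nu=0}^{N-1} S_l(\nu)\, \varepsilon_p(\nu), \qquad \tilde S_l(p) = \frac{1}{\sqrt{N}} \sum_{\nu=0}^{N-1} S_l(\nu) \exp(-ip\nu).
\end{aligned} \tag{5.32}$$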
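The exponential form of the transformation (5.32) is a finite discrete Fourier transform, so it can be checked numerically. The following is a minimal sketch, assuming $N = 2L$ (so that $P_N$ has exactly $N$ elements); the function names and the sample configuration are illustrative and not part of the paper.

```python
import cmath
import math

def momenta(L):
    """P_N = {2*pi*kappa/N : kappa = -(L-1), ..., L}, assuming N = 2L."""
    N = 2 * L
    return [2 * math.pi * kappa / N for kappa in range(-(L - 1), L + 1)]

def to_momentum(S):
    """S~(p) = N^{-1/2} * sum_nu S(nu) * exp(-i p nu), second line of (5.32)."""
    N = len(S)
    return [sum(S[nu] * cmath.exp(-1j * p * nu) for nu in range(N)) / math.sqrt(N)
            for p in momenta(N // 2)]

def to_sites(S_tilde):
    """S(nu) = N^{-1/2} * sum_p S~(p) * exp(i p nu), first line of (5.32)."""
    N = len(S_tilde)
    ps = momenta(N // 2)
    return [sum(St * cmath.exp(1j * p * nu) for St, p in zip(S_tilde, ps)) / math.sqrt(N)
            for nu in range(N)]

# Arbitrary (hypothetical) quasi-spin configuration with N = 6 time slices.
S = [0.3, -1.2, 0.7, 2.0, -0.5, 0.1]
S_tilde = to_momentum(S)
S_back = to_sites(S_tilde)

# Round trip error and both sides of the Parseval identity.
round_trip_error = max(abs(a - b) for a, b in zip(S_back, S))
norm_sites = sum(s * s for s in S)
norm_momentum = sum(abs(st) ** 2 for st in S_tilde)
```

The round trip recovers the original configuration up to rounding, and the two norms coincide, which is the Parseval identity for the quasi-spins.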
S. Albeverio et al.
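Then (5.30) may be rewritten in the following way

$$\begin{aligned}
\int_{I_\beta} \big[\omega_l^{(N)}(\tau)\big]^{2s} d\tau
&= \beta^{-s+1} \sum_{k_1,\dots,k_{2s}\in K_N} \tilde S_l\!\left(\frac{\beta}{N}k_1\right) \cdots \tilde S_l\!\left(\frac{\beta}{N}k_{2s}\right) \delta(k_1 + \cdots + k_{2s}) \\
&= \frac{1}{N^s \beta^{s-1}} \sum_{\nu_1,\dots,\nu_{2s}=0}^{N-1} S_l(\nu_1) \cdots S_l(\nu_{2s}) \sum_{k_1,\dots,k_{2s}\in K_N} \delta(k_1 + \cdots + k_{2s}) \exp\left[ -\frac{i\beta}{N}(k_1\nu_1 + \cdots + k_{2s}\nu_{2s}) \right] \\
&= \frac{1}{N^s \beta^{s-1}} \sum_{\nu_1,\dots,\nu_{2s}=0}^{N-1} S_l(\nu_1) \cdots S_l(\nu_{2s}) \\
&\qquad \times \sum_{\kappa_1,\dots,\kappa_{2s-1}=-L+1}^{L} \exp\left[ -\frac{2\pi i}{N}\kappa_1(\nu_1 - \nu_{2s}) \right] \times \cdots \times \exp\left[ -\frac{2\pi i}{N}\kappa_{2s-1}(\nu_{2s-1} - \nu_{2s}) \right] \\
&= \frac{N^{2s-1}}{N^s \beta^{s-1}} \sum_{\nu=0}^{N-1} [S_l(\nu)]^{2s} = \frac{\beta}{N} \sum_{\nu=0}^{N-1} \left[ \sqrt{\frac{N}{\beta}}\, S_l(\nu) \right]^{2s}.
\end{aligned}$$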
Returning to (5.29) one obtains

$$\int_{I_\beta} W\big(\omega_l^{(N)}(\tau)\big)\, d\tau = \frac{\beta}{N} \sum_{\nu=0}^{N-1} W\!\left( \sqrt{\frac{N}{\beta}}\, S_l(\nu) \right). \tag{5.33}$$
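As a concrete reading of (5.31) and (5.33), the sketch below builds the quasi-spins from a sample path and evaluates the lattice expression for the potential term. The path `omega` and the polynomial `W` are hypothetical choices made only for illustration.

```python
import math

def quasi_spins(omega, beta, N):
    """S(nu) = sqrt(beta/N) * omega(nu * beta / N), nu = 0, ..., N-1, as in (5.31)."""
    return [math.sqrt(beta / N) * omega(nu * beta / N) for nu in range(N)]

def lattice_potential(W, S, beta):
    """Right-hand side of (5.33): (beta/N) * sum_nu W(sqrt(N/beta) * S(nu))."""
    N = len(S)
    return beta / N * sum(W(math.sqrt(N / beta) * s) for s in S)

beta, N = 2.0, 8
omega = lambda tau: math.cos(2 * math.pi * tau / beta)  # hypothetical beta-periodic path
W = lambda x: x ** 2                                    # toy even polynomial

S = quasi_spins(omega, beta, N)
value = lattice_potential(W, S, beta)
```

For this path and this `W`, the exact integral over $I_\beta$ equals $\beta/2$, and the lattice sum reproduces it because the path contains a single low Fourier mode.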
Accordingly,

$$\langle \omega_l^{(N)}, \omega_{l'}^{(N)} \rangle_\beta = \sum_{p\in P_N} \hat S_l(p)\, \hat S_{l'}(p) = \sum_{\nu=0}^{N-1} S_l(\nu)\, S_{l'}(\nu), \tag{5.34}$$
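and

$$\langle \omega_l^{(N)}, \xi_l \rangle_\beta = \sum_{k\in K_N} \hat\omega_l(k)\, \hat\xi_l(k) = \sum_{\nu=0}^{N-1} S_l(\nu)\, X_l(\nu), \qquad X_l(\nu) \overset{\mathrm{def}}{=} \sum_{p\in P_N} \hat\xi_l\!\left(\frac{N}{\beta}p\right) \varepsilon_p(\nu). \tag{5.35}$$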
Finally, (5.23) takes the form

$$\omega_l(\tau_j) = \omega_l\!\left(\frac{\nu_j}{N}\beta\right) = \sqrt{\frac{N}{\beta}}\, S_l(\nu_j), \qquad j = 1, \dots, n. \tag{5.36}$$
The next step is to introduce the measure on finite-dimensional space which would have the above mentioned ferromagnetic properties and such that the integral
on the right-hand side of (5.9) could be substituted by the integral over this finite-dimensional space. Here we use the representation (5.6) and construct a finite-dimensional analogue of $\chi_\beta^{(N)}$. To this end we introduce the following Gaussian measure on $\mathbb{R}^N$:

$$\phi_\beta^{(N)}(d\hat S) = \bigotimes_{p\in P_N} \phi_p^{(N)}(d\hat S(p)), \tag{5.37}$$

where the measure $\phi_p^{(N)}$ satisfies (5.5) with (cf. (5.3))

$$\theta_p^{(N)} \overset{\mathrm{def}}{=} \frac{1}{m\,[2N/\beta]^2 (\sin p/2)^2 + 1} = \lambda^{(N)}_{Np/\beta}. \tag{5.38}$$

It is clear that $\varpi_k^{(N)} = \phi^{(N)}_{\beta k/N}$, where the former measure defines by (5.6) the measure $\chi_\beta^{(N)}$. On the other hand, this new Gaussian measure may be written in the coordinates $\{S(\nu),\ \nu = 0, \dots, N-1\}$, related to $\{\hat S(p),\ p \in P_N\}$ by the transformation (5.32), as follows

$$\phi_\beta^{(N)}(dS) = \frac{1}{C_{\beta,N}} \exp\left\{ -\frac{mN^2}{2\beta^2} \sum_{\nu=0}^{N-1} [S(\nu+1) - S(\nu)]^2 - \frac{1}{2} \sum_{\nu=0}^{N-1} [S(\nu)]^2 \right\} \bigotimes_{\nu=0}^{N-1} dS(\nu), \tag{5.39}$$
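with the convention $S(N) = S(0)$ and the normalizing constant $C_{\beta,N}$. Therefore, the measure $\phi_\beta^{(N)}$ may be regarded as the Gibbs measure of a chain of unbounded (Gaussian) spins. Due to the choice of the numbers (5.38), the interaction is ferromagnetic and of the nearest-neighbor type.

Now we define the measure which will correspond to $\chi_{\beta,\Lambda}^{(N)}$ given by (5.10). It is

$$\begin{aligned}
\phi_{\beta,\Lambda}^{(N)}(dS_\Lambda) &\overset{\mathrm{def}}{=} \bigotimes_{l\in\Lambda} \phi_\beta^{(N)}(dS_l) \\
&= \frac{1}{[C_{\beta,N}]^{|\Lambda|}} \exp\left\{ -\frac{mN^2}{2\beta^2} \sum_{l\in\Lambda} \sum_{\nu=0}^{N-1} [S_l(\nu+1) - S_l(\nu)]^2 - \frac{1}{2} \sum_{l\in\Lambda} \sum_{\nu=0}^{N-1} [S_l(\nu)]^2 \right\} \bigotimes_{l\in\Lambda} \bigotimes_{\nu=0}^{N-1} dS_l(\nu).
\end{aligned} \tag{5.40}$$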
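The statement that the chain measure (5.39) is diagonal in the Fourier modes, with variances given by (5.38), can be verified directly. The sketch below writes the exponent of (5.39) as $-\tfrac12 (S, AS)$ and checks that a cosine mode is an eigenvector of $A$ with eigenvalue $1/\theta_p^{(N)}$. The parameter values and function names are arbitrary, illustrative assumptions.

```python
import math

def apply_quadratic_form(S, m, beta):
    """Apply the matrix A of the exponent in (5.39), written as -(1/2) (S, A S),
    using the periodic convention S(N) = S(0)."""
    N = len(S)
    c = m * N ** 2 / beta ** 2
    return [c * (2 * S[nu] - S[(nu + 1) % N] - S[(nu - 1) % N]) + S[nu]
            for nu in range(N)]

def inverse_theta(p, m, beta, N):
    """1/theta_p = m * (2N/beta)^2 * sin^2(p/2) + 1, cf. (5.38)."""
    return m * (2 * N / beta) ** 2 * math.sin(p / 2) ** 2 + 1

m, beta, N = 0.7, 1.5, 8      # arbitrary test parameters
kappa = 3
p = 2 * math.pi * kappa / N   # an element of P_N

mode = [math.cos(p * nu) for nu in range(N)]
image = apply_quadratic_form(mode, m, beta)
eigenvalue = inverse_theta(p, m, beta, N)

# Largest deviation from the eigenvector relation A * mode = eigenvalue * mode.
residual = max(abs(a - eigenvalue * b) for a, b in zip(image, mode))
```

The eigenvalue is at least 1 for every $p$, so the Gaussian measure is well defined, and the off-diagonal entries of $A$ are negative, which is the ferromagnetic nearest-neighbor coupling mentioned in the text.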
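By construction we have that

$$\int_{\Omega_{\beta,\Lambda}} F^{(N)}\big(\omega_\Lambda^{(N)}\big)\, \Phi_{\beta,\Lambda}\big(\omega_\Lambda^{(N)}|\xi\big)\, \chi_{\beta,\Lambda}^{(N)}(d\omega_\Lambda) = \frac{C_{\beta,\Lambda}(X)}{Y_{\beta,\Lambda}(\xi)} \left( \frac{N}{\beta} \right)^{n/2} \int_{\mathbb{R}^{N|\Lambda|}} G(S_\Lambda(\nu_1), \dots, S_\Lambda(\nu_n))\, \rho_{\beta,\Lambda}(dS_\Lambda|X), \tag{5.41}$$

where the measure $\rho_{\beta,\Lambda}(\cdot|X)$ is given by (5.20) and (5.21).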
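6. Basic Inequalities

In this section we use the lattice approximation to prove a number of basic inequalities for the moments, like (5.8), of the measure (5.1) with the function $w$ satisfying the condition (V1) given in Sec. 2.

Theorem 6.1 ([FKG Inequality]). Given $\Lambda$, $\beta$, and $\tau_1, \dots, \tau_n \in I_\beta$, let the functions $F, G \in \mathcal{F}_{\beta,\Lambda}(\tau_1, \dots, \tau_n)$ increase when every chosen $\omega_l(\tau_j)$, $l \in \Lambda$, $j = 1, \dots, n$, increases. Then the following inequality

$$\langle FG \rangle_{\varrho_{\beta,\Lambda}(\cdot|\xi)} \geq \langle F \rangle_{\varrho_{\beta,\Lambda}(\cdot|\xi)}\, \langle G \rangle_{\varrho_{\beta,\Lambda}(\cdot|\xi)} \tag{6.1}$$

holds for all $\xi \in \Omega_{\beta,\Lambda}$.

The proof follows from Theorems 5.1 and 5.2 and the fact that the measure (5.20) corresponds to a general type ferromagnet, for which the FKG inequality holds (see [76, Theorem VIII.16]).

Below $\varrho_{\beta,\Lambda}$ will stand for the measure (5.1) with $\xi = 0$.

Theorem 6.2 ([GKS Inequalities]). Given $\Lambda$ and $\beta$, let the real valued functions $A_1, \dots, A_{n+m} \in P_\Lambda^{(1)}$, $n, m \in \mathbb{N}$, have the properties: (a) every $A_j$ depends only on the values of $x_{l_j}$ with a certain $l_j \in \Lambda$; (b) every $A_j$ is either an odd monotone growing function of $x_{l_j}$ or an even positive function, monotone growing on $[0, +\infty)$. Then the following inequalities hold

$$\langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) \rangle_{\varrho_{\beta,\Lambda}} \geq 0, \tag{6.2}$$

$$\begin{aligned}
&\langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\, A_{n+1}(\omega_\Lambda(\tau_{n+1})) \cdots A_{n+m}(\omega_\Lambda(\tau_{n+m})) \rangle_{\varrho_{\beta,\Lambda}} \\
&\qquad \geq \langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) \rangle_{\varrho_{\beta,\Lambda}}\, \langle A_{n+1}(\omega_\Lambda(\tau_{n+1})) \cdots A_{n+m}(\omega_\Lambda(\tau_{n+m})) \rangle_{\varrho_{\beta,\Lambda}}.
\end{aligned} \tag{6.3}$$

The proof follows from Theorems 5.1 and 5.2 and the fact that the measure (5.20) with $X = 0$ corresponds to an even ferromagnet, for which the GKS inequalities hold (see [76, Theorem VIII.14]).

Corollary 6.1. For all $\tau_1, \dots, \tau_{n+m} \in I_\beta$, the Green functions (2.62), (2.63) obey the following inequalities:

$$\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) \geq 0, \qquad \Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) \geq 0; \tag{6.4}$$

$$\begin{aligned}
\Gamma^{\beta,\Lambda}_{A_1,\dots,A_{n+m}}(\tau_1, \dots, \tau_{n+m}) &\geq \Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n)\, \Gamma^{\beta,\Lambda}_{A_{n+1},\dots,A_{n+m}}(\tau_{n+1}, \dots, \tau_{n+m}), \\
\Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_{n+m}}(\tau_1, \dots, \tau_{n+m}) &\geq \Gamma^{0,\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n)\, \Gamma^{0,\beta,\Lambda}_{A_{n+1},\dots,A_{n+m}}(\tau_{n+1}, \dots, \tau_{n+m}).
\end{aligned} \tag{6.5}$$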
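For $\varphi_\Lambda = (\varphi_l)_{l\in\Lambda} \in \mathcal{X}_{\beta,\Lambda}$, we set

$$F(\varphi_\Lambda) = \int_{\mathcal{X}_{\beta,\Lambda}} \exp\left\{ \sum_{l\in\Lambda} \langle \varphi_l, \omega_l \rangle_\beta \right\} \varrho_{\beta,\Lambda}(d\omega_\Lambda). \tag{6.6}$$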
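$F(\varphi_\Lambda)$ is an entire real analytic function, which means that the expansion

$$F(\varphi_\Lambda) = \sum_{n=0}^{\infty} \frac{1}{n!}\, F^{(n)}(\varphi_\Lambda, \dots, \varphi_\Lambda), \tag{6.7}$$

where every $F^{(n)}(\cdot, \dots, \cdot)$ is an $n$-linear bounded functional on $\mathcal{X}_{\beta,\Lambda}$, converges absolutely on bounded subsets of the Hilbert space $\mathcal{X}_{\beta,\Lambda}$. These functionals may be written in the integral form

$$F^{(n)}(\varphi_\Lambda, \dots, \varphi_\Lambda) = \sum_{l_1,\dots,l_n\in\Lambda} \int_{I_\beta^n} F_{l_1,\dots,l_n}(\tau_1, \dots, \tau_n)\, \varphi_{l_1}(\tau_1) \cdots \varphi_{l_n}(\tau_n)\, d\tau_1 \cdots d\tau_n, \tag{6.8}$$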
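with the kernels being the moments of the measure $\varrho_{\beta,\Lambda}$, i.e.

$$F_{l_1,\dots,l_n}(\tau_1, \dots, \tau_n) = \langle \omega_{l_1}(\tau_1) \cdots \omega_{l_n}(\tau_n) \rangle_{\varrho_{\beta,\Lambda}},$$

which means in turn that they are the Green functions (5.14) with $A_j(x_\Lambda) = x_{l_j}$. These kernels are continuous as functions of $\tau_1, \dots, \tau_n$. Since $F(0) = 1$, the function $\log F(\varphi_\Lambda)$ is a real analytic function in a neighborhood of the point $\varphi_\Lambda = 0$, where it can be expanded similarly to (6.7)

$$U(\varphi_\Lambda) \overset{\mathrm{def}}{=} \log F(\varphi_\Lambda) = \sum_{n=0}^{\infty} \frac{1}{n!}\, U^{(n)}(\varphi_\Lambda, \dots, \varphi_\Lambda), \tag{6.9}$$

and

$$U^{(n)}(\varphi_\Lambda, \dots, \varphi_\Lambda) = \sum_{l_1,\dots,l_n\in\Lambda} \int_{I_\beta^n} U_{l_1,\dots,l_n}(\tau_1, \dots, \tau_n)\, \varphi_{l_1}(\tau_1) \cdots \varphi_{l_n}(\tau_n)\, d\tau_1 \cdots d\tau_n. \tag{6.10}$$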
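The passage from the moments in (6.7) to the semi-invariants in (6.9) is the usual moment-to-cumulant relation. As a toy illustration only, the sketch below treats a single real variable (no lattice, no time dependence) with a density proportional to $\exp(-a x^2/2 - b x^4)$ and computes the second, fourth and sixth cumulants from the moments. The choice of weight, the quadrature and all names are assumptions made for the example.

```python
import math

def moments(a, b, orders=(2, 4, 6), half_width=6.0, n=20000):
    """Moments of the probability density proportional to exp(-a x^2/2 - b x^4),
    computed with a plain Riemann sum on [-half_width, half_width]."""
    h = 2 * half_width / n
    xs = [-half_width + i * h for i in range(n + 1)]
    ws = [math.exp(-0.5 * a * x * x - b * x ** 4) for x in xs]
    Z = sum(ws)
    return {k: sum(w * x ** k for w, x in zip(ws, xs)) / Z for k in orders}

def even_cumulants(mom):
    """Cumulants of orders 2, 4, 6 of a symmetric distribution (odd moments vanish)."""
    m2, m4, m6 = mom[2], mom[4], mom[6]
    k2 = m2
    k4 = m4 - 3 * m2 ** 2
    k6 = m6 - 15 * m4 * m2 + 30 * m2 ** 3
    return k2, k4, k6

mom = moments(a=0.0, b=1.0)
k2, k4, k6 = even_cumulants(mom)
```

For this purely quartic weight the fourth cumulant comes out negative and the sixth positive, which is the one-variable analogue of the sign pattern for the kernels $U_{l_1,\dots,l_{2n}}$ discussed in this section.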
Theorem 6.3 ([Lebowitz Inequalities]). Given $\Lambda$ and $\beta$, the following inequality

$$U_{l_1,\dots,l_4}(\tau_1, \dots, \tau_4) \leq 0 \tag{6.11}$$

holds for all $\tau_1, \dots, \tau_4 \in I_\beta$ and $l_1, \dots, l_4 \in \Lambda$.

The proof follows from Theorems 5.1 and 5.2 and the fact that the Lebowitz inequality holds for the measure (5.20) with the function $w$ satisfying the condition (V1) given in Sec. 2 and with $X = 0$ (see [84, Theorem 2.4 and Corollary 2.5]).

For Gaussian random variables $X_1, \dots, X_{2n}$, $n \in \mathbb{N}$, with zero mean, one has

$$\langle X_1 \cdots X_{2n} \rangle = \sum_{\sigma\in S_n} \prod_{k=1}^{n} \langle X_{\sigma(2k-1)} X_{\sigma(2k)} \rangle,$$
where the sum is taken over all partitions of the set $\{1, \dots, 2n\}$ into unordered pairs of its distinct elements. If in the latter expression one has "$\leq$" instead of "$=$", the variables $X_1, \dots, X_{2n}$ are said to obey the Gaussian upper bound principle.

Theorem 6.4 ([Gaussian Upper Bound]). Given $\Lambda$ and $\beta$, the following inequality

$$\langle \omega_{l_1}(\tau_1) \cdots \omega_{l_{2n}}(\tau_{2n}) \rangle_{\varrho_{\beta,\Lambda}} \leq \sum_{\sigma\in S_n} \prod_{k=1}^{n} \langle \omega_{l_{\sigma(2k-1)}}(\tau_{\sigma(2k-1)})\, \omega_{l_{\sigma(2k)}}(\tau_{\sigma(2k)}) \rangle_{\varrho_{\beta,\Lambda}} \tag{6.12}$$

holds for all values of $l_1, \dots, l_{2n} \in \Lambda$ and $\tau_1, \dots, \tau_{2n} \in I_\beta$.

The proof follows from Theorems 5.1 and 5.2 and the fact that the Gaussian upper bound principle holds for the measure (5.20) with $X = 0$ (see [40, Sec. 12]).

Theorem 6.5. Under the conditions of Theorem 6.3 let the potential $W$, which defines the measures (5.1), (5.20), have the form

$$W(x) = \frac{1}{2} a x^2 + b x^4, \qquad a \in \mathbb{R}, \quad b > 0. \tag{6.13}$$

Then the following inequalities

$$(-1)^{n-1}\, U_{l_1,\dots,l_{2n}}(\tau_1, \dots, \tau_{2n}) \geq 0 \tag{6.14}$$
hold for all $n \in \mathbb{N}$, all $\tau_1, \dots, \tau_{2n} \in I_\beta$ and $l_1, \dots, l_{2n} \in \Lambda$.

The proof follows from Theorems 5.1 and 5.2 and the fact that the above sign rule holds for the measure (5.20) with $X = 0$ and $W$ given by (6.13), which can be deduced from Shlosman's results [75] for the Ising model by means of the classical Ising approximation (for more details see [76, Chap. IX]).

7. More Inequalities

7.1. Scalar domination

Here we follow [55] and assume that the measures (2.60), (2.61) and the Green functions (2.62), (2.63) describe the vector model (2.1)–(2.9) with $D > 1$ and with the potential $V$ (2.3) obeying the condition (V2), Sec. 2.1. Since we will compare the Green functions for this model with similar functions for the corresponding scalar model, we need a special notation for the latter ones. Let the Green functions $\tilde\Gamma^{\beta,\Lambda}$ and $\tilde\Gamma^{0,\beta,\Lambda}$ be defined also by (2.62) and (2.63) respectively but for the model (2.1)–(2.9) with $D = 1$.

Theorem 7.1 ([Scalar Domination]). For a box $\Lambda$, let the local Gibbs measures be defined by (2.60), (2.61) with the potential $V$ obeying the condition (V2), Sec. 2.1 and with arbitrary $D > 1$. Let the functions $A_1, \dots, A_n \in P_\Lambda^{(D)}$,
$n \in \mathbb{N}$, have the following property: there exist $\alpha \in \{1, 2, \dots, D\}$ and functions $\tilde A_1, \dots, \tilde A_n \in P_\Lambda^{(1)}$ satisfying the conditions of Theorem 6.2 and such that $A_j(x_\Lambda) = \tilde A_j(x_\Lambda^\alpha)$, $j = 1, \dots, n$. Then for arbitrary $\tau_1, \dots, \tau_n \in I_\beta$,

$$0 \leq \Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) \leq \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n). \tag{7.1}$$
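Remark 7.1. It is important that all $A_j$ depend on their $x_\Lambda^\alpha$ with one and the same $\alpha$. The first inequality above is a $D$-dimensional version of the first GKS inequality (6.4). The second inequality in (7.1) describes scalar domination. The same inequalities hold also for the zero boundary condition.

Proof of Theorem 7.1. For the $\alpha$ mentioned in the hypothesis, let us decompose

$$\mathcal{X}_\beta = \bar{\mathcal{X}}_\beta \times \mathcal{X}_\beta^\alpha, \qquad \mathcal{X}_{\beta,\Lambda} = \bar{\mathcal{X}}_{\beta,\Lambda} \times \mathcal{X}_{\beta,\Lambda}^\alpha,$$

which means that every $\omega_\Lambda \in \mathcal{X}_{\beta,\Lambda}$ is regarded as $\omega_\Lambda = (\bar\omega_\Lambda, \omega_\Lambda^\alpha)$, where a $(D-1)$-dimensional vector $\bar\omega_\Lambda$ belongs to $\bar{\mathcal{X}}_{\beta,\Lambda}$ (or to $\bar{\mathcal{X}}_\beta$ for one-element $\Lambda$), whereas the scalar $\omega_\Lambda^\alpha$ is supposed to belong to the Hilbert space $\mathcal{X}_{\beta,\Lambda}^\alpha$ (respectively to $\mathcal{X}_\beta^\alpha$ for one-element $\Lambda$). Then the Gaussian measure $\chi_\beta$ can also be decomposed

$$\chi_\beta(d\omega) = (\bar\chi_\beta \otimes \chi_\beta^\alpha)(d\bar\omega, d\omega^\alpha), \tag{7.2}$$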
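where the Gaussian measures $\bar\chi_\beta$, $\chi_\beta^\alpha$ are defined on the Hilbert spaces $\bar{\mathcal{X}}_\beta$ and $\mathcal{X}_\beta^\alpha$ respectively. The potential $V$ (2.3) may be written

$$V(x) = v((\bar x, \bar x)) + v((x^\alpha)^2) + \sum_{s=2}^{r-1} (x^\alpha)^{2s} B_s(\bar x), \qquad B_s(\bar x) \geq 0. \tag{7.3}$$

The nonnegativity of $B_s(\bar x)$ follows from the condition (V2). Set

$$Q(\bar\omega, \omega^\alpha) \overset{\mathrm{def}}{=} \int_{I_\beta} \sum_{s=2}^{r-1} (\omega^\alpha(\tau))^{2s} B_s(\bar\omega(\tau))\, d\tau. \tag{7.4}$$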
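Then by means of the decomposition (7.2) one may write the measure (2.60) as follows

$$\mu_{\beta,\Lambda}(d\omega_\Lambda) = C_{\beta,\Lambda} \exp\left\{ -\sum_{l\in\Lambda} Q(\bar\omega_l, \omega_l^\alpha) \right\} (\bar\mu_{\beta,\Lambda} \otimes \mu_{\beta,\Lambda}^\alpha)(d\bar\omega_\Lambda, d\omega_\Lambda^\alpha), \tag{7.5}$$

where $C_{\beta,\Lambda}$ is the normalization constant and the Gibbs measures $\bar\mu_{\beta,\Lambda}$ and $\mu_{\beta,\Lambda}^\alpha$ describe systems of $(D-1)$- and one-dimensional interacting anharmonic oscillators respectively. This allows us to rewrite (2.62) in the following way

$$\Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n) = C_{\beta,\Lambda} \int_{\bar{\mathcal{X}}_{\beta,\Lambda}} \Xi(1|\bar\omega_\Lambda, \tau_1, \dots, \tau_n)\, \Theta(1|\bar\omega_\Lambda)\, \bar\mu_{\beta,\Lambda}(d\bar\omega_\Lambda), \tag{7.6}$$

where, for $\vartheta \in [0, 1]$, we have set

$$\Xi(\vartheta|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) = \frac{1}{\Theta(\vartheta|\bar\omega_\Lambda)} \int_{\mathcal{X}_{\beta,\Lambda}^\alpha} A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) \exp\left\{ -\vartheta \sum_{l\in\Lambda} Q(\bar\omega_l, \omega_l^\alpha) \right\} \mu_{\beta,\Lambda}^\alpha(d\omega_\Lambda^\alpha), \tag{7.7}$$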
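and

$$\Theta(\vartheta|\bar\omega_\Lambda) = \int_{\mathcal{X}_{\beta,\Lambda}^\alpha} \exp\left\{ -\vartheta \sum_{l\in\Lambda} Q(\bar\omega_l, \omega_l^\alpha) \right\} \mu_{\beta,\Lambda}^\alpha(d\omega_\Lambda^\alpha). \tag{7.8}$$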
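Now let the functions $\tilde A_1, \dots, \tilde A_n$ be such that $A_j(\omega_\Lambda(\tau_j)) = \tilde A_j(\omega_\Lambda^\alpha(\tau_j))$, as it is supposed in Theorem 7.1. Then

$$\Xi(0|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) = \int_{\mathcal{X}_{\beta,\Lambda}^\alpha} \tilde A_1(\omega_\Lambda^\alpha(\tau_1)) \cdots \tilde A_n(\omega_\Lambda^\alpha(\tau_n))\, \mu_{\beta,\Lambda}^\alpha(d\omega_\Lambda^\alpha) = \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n). \tag{7.9}$$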
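As a function of $\vartheta$, $\Xi$ is continuous on $[0, 1]$ and differentiable on $(0, 1)$, where one has

$$\begin{aligned}
\frac{\partial}{\partial\vartheta} \Xi(\vartheta|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) = -\sum_{l\in\Lambda} \sum_{s=1}^{r-1} \int_{I_\beta} B_s(\bar\omega_l(t)) \Big\{ &\big\langle \tilde A_1(\omega_\Lambda^\alpha(\tau_1)) \cdots \tilde A_n(\omega_\Lambda^\alpha(\tau_n))\, (\omega_l^\alpha(t))^{2s} \big\rangle_\phi \\
&- \big\langle \tilde A_1(\omega_\Lambda^\alpha(\tau_1)) \cdots \tilde A_n(\omega_\Lambda^\alpha(\tau_n)) \big\rangle_\phi\, \big\langle (\omega_l^\alpha(t))^{2s} \big\rangle_\phi \Big\}\, dt.
\end{aligned} \tag{7.10}$$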
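Here (see (2.30)), for a fixed $\bar\omega_\Lambda \in \bar{\mathcal{X}}_{\beta,\Lambda}$, the measure $\phi$ is defined on $\mathcal{X}_{\beta,\Lambda}^\alpha$ as follows

$$\phi(d\omega_\Lambda^\alpha) = \frac{1}{\Theta(\vartheta|\bar\omega_\Lambda)} \exp\left\{ -\vartheta \sum_{l\in\Lambda} Q(\bar\omega_l, \omega_l^\alpha) \right\} \mu_{\beta,\Lambda}^\alpha(d\omega_\Lambda^\alpha).$$

Since the measure $\mu_{\beta,\Lambda}^\alpha$ and the functions $\tilde A_1, \dots, \tilde A_n$, $\omega_\Lambda^\alpha(t) \mapsto (\omega_l^\alpha(t))^{2s}$, satisfy the conditions of Theorem 6.2, the estimate (6.5) yields in (7.10)

$$\frac{\partial}{\partial\vartheta} \Xi(\vartheta|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) \leq 0,$$

for all $\vartheta \in (0, 1)$, $\bar\omega_\Lambda \in \bar{\mathcal{X}}_{\beta,\Lambda}$, and $\tau_1, \dots, \tau_n \in I_\beta$. The latter fact and the estimate (6.4) yield in turn

$$0 \leq \Xi(1|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) \leq \Xi(0|\bar\omega_\Lambda, \tau_1, \dots, \tau_n) = \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n). \tag{7.11}$$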
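Using this double inequality in (7.6), we obtain

$$\begin{aligned}
0 \leq \Gamma^{\beta,\Lambda}_{A_1,\dots,A_n}(\tau_1, \dots, \tau_n)
&\leq \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n) \\
&\qquad \times C_{\beta,\Lambda} \int_{\bar{\mathcal{X}}_{\beta,\Lambda}} \int_{\mathcal{X}_{\beta,\Lambda}^\alpha} \exp\left\{ -\sum_{l\in\Lambda} Q(\bar\omega_l, \omega_l^\alpha) \right\} (\bar\mu_{\beta,\Lambda} \otimes \mu_{\beta,\Lambda}^\alpha)(d\bar\omega_\Lambda, d\omega_\Lambda^\alpha) \\
&= \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n) \int_{\mathcal{X}_{\beta,\Lambda}} \mu_{\beta,\Lambda}(d\omega_\Lambda) = \tilde\Gamma^{\beta,\Lambda}_{\tilde A_1,\dots,\tilde A_n}(\tau_1, \dots, \tau_n).
\end{aligned}$$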
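The above theorem admits a generalization. One observes that (7.1) may be rewritten

$$0 \leq \langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n)) \rangle_{\mu_{\beta,\Lambda}} \leq \langle \tilde A_1(\omega_\Lambda(\tau_1)) \cdots \tilde A_n(\omega_\Lambda(\tau_n)) \rangle_{\mu_{\beta,\Lambda}^\alpha}.$$

Theorem 7.2. Let the conditions of Theorem 7.1 be satisfied. Then for every $\mu_{\beta,\Lambda}$-integrable function $F : \Omega_{\beta,\Lambda} \to \mathbb{R}_+$, which does not depend on the $x_\Lambda^\alpha$ mentioned in this theorem, the following inequalities

$$0 \leq \langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\, F(\omega_\Lambda) \rangle_{\mu_{\beta,\Lambda}} \leq \langle \tilde A_1(\omega_\Lambda(\tau_1)) \cdots \tilde A_n(\omega_\Lambda(\tau_n)) \rangle_{\mu_{\beta,\Lambda}^\alpha} \cdot \langle F(\omega_\Lambda) \rangle_{\mu_{\beta,\Lambda}} \tag{7.12}$$

hold for all $\tau_1, \dots, \tau_n \in I_\beta$.

To prove this theorem, one writes (cf. (7.6))

$$\langle A_1(\omega_\Lambda(\tau_1)) \cdots A_n(\omega_\Lambda(\tau_n))\, F(\omega_\Lambda) \rangle_{\mu_{\beta,\Lambda}} = C_{\beta,\Lambda} \int_{\bar{\mathcal{X}}_{\beta,\Lambda}} \Xi(1|\bar\omega_\Lambda, \tau_1, \dots, \tau_n)\, F(\omega_\Lambda)\, \Theta(1|\bar\omega_\Lambda)\, \bar\mu_{\beta,\Lambda}(d\bar\omega_\Lambda).$$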
Then employing (7.11) one gets (7.12).

7.2. Zero boundary domination

Here we consider the scalar case, thus the measures (2.60), (2.61), and (2.66) describe the model (2.1)–(2.9) with $D = 1$. The potential $V$ is supposed to obey the condition (V2) given in Sec. 2.1. This model will be compared with the model described by the Hamiltonian (2.1), (2.2) but with the following one-particle potential

$$\hat V(x) \overset{\mathrm{def}}{=} 2V\!\left(\frac{x}{\sqrt{2}}\right) = \frac{1}{2} a x^2 + \sum_{s=2}^{r} 2^{1-s} b_s x^{2s}, \qquad x \in \mathbb{R}, \tag{7.13}$$

instead of $V$ given by (2.3). Here the parameter $a$ and all $b_s$, $s = 2, \dots, r$, are the same as in (2.3). The polynomials $V$, $\hat V$ obey the relation

$$V\!\left(\frac{x-y}{\sqrt{2}}\right) + V\!\left(\frac{x+y}{\sqrt{2}}\right) = \hat V(x) + \hat V(y) + W(x|y), \tag{7.14}$$

where

$$W(x|y) = W(y|x) = \sum_{s=1}^{r-1} b_s(x)\, y^{2s}, \qquad b_s(x) = \sum_{p=s+1}^{r} \binom{2p}{2s} 2^{1-p}\, b_p\, x^{2(p-s)}. \tag{7.15}$$
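The polynomial identity (7.14) with the coefficients (7.15) follows from the binomial expansion of $(x \pm y)^{2p}$, and it is easy to confirm numerically. The sketch below assumes $V(x) = \tfrac12 a x^2 + \sum_{s=2}^{r} b_s x^{2s}$, which is the form implied by (7.13); the coefficient values and function names are arbitrary test choices.

```python
import math

def V(x, a, b):
    """V(x) = a x^2 / 2 + sum_{s=2}^{r} b[s] * x^(2s); b is a dict {s: b_s}, s = 2..r."""
    return 0.5 * a * x * x + sum(bs * x ** (2 * s) for s, bs in b.items())

def V_hat(x, a, b):
    """V^(x) = 2 V(x / sqrt(2)), the first equality in (7.13)."""
    return 2 * V(x / math.sqrt(2), a, b)

def W(x, y, b):
    """W(x|y) = sum_{s=1}^{r-1} b_s(x) y^(2s) with b_s(x) from (7.15)."""
    r = max(b)
    total = 0.0
    for s in range(1, r):
        bs_x = sum(math.comb(2 * p, 2 * s) * 2 ** (1 - p) * b[p] * x ** (2 * (p - s))
                   for p in range(max(s + 1, 2), r + 1))
        total += bs_x * y ** (2 * s)
    return total

a = -1.0                 # a may be negative (double well)
b = {2: 0.5, 3: 0.25}    # hypothetical coefficients b_2, b_3, so r = 3
x, y = 0.8, -1.3

lhs = V((x - y) / math.sqrt(2), a, b) + V((x + y) / math.sqrt(2), a, b)
rhs = V_hat(x, a, b) + V_hat(y, a, b) + W(x, y, b)
```

Both sides agree to rounding error, and `W(x, y, b)` equals `W(y, x, b)`, in line with the symmetry stated in (7.15).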
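Then the measures constructed with $\hat V$ by (2.60) and (2.61) will be denoted as $\hat\mu_{\beta,\Lambda}$ and $\hat\mu_{\beta,\Lambda}^{(0)}$, respectively. Let also

$$K^{\zeta}_{ll'}(\tau, \tau') \overset{\mathrm{def}}{=} \langle \omega_l(\tau)\omega_{l'}(\tau') \rangle - \langle \omega_l(\tau) \rangle \langle \omega_{l'}(\tau') \rangle, \qquad l, l' \in \Lambda, \quad \zeta \in \Omega_\beta, \quad \tau, \tau' \in I_\beta, \tag{7.16}$$

where the expectations are taken with respect to the measure $\mu_{\beta,\Lambda}(\cdot|\zeta)$ (2.66). Further, for the same $l, l'$ and $\tau, \tau'$, let

$$\hat K^{0}_{ll'}(\tau, \tau') \overset{\mathrm{def}}{=} \int_{\Omega_{\beta,\Lambda}} \omega_l(\tau)\omega_{l'}(\tau')\, \hat\mu^{(0)}_{\beta,\Lambda}(d\omega_\Lambda). \tag{7.17}$$

Theorem 7.3. For arbitrary $\zeta \in \Omega_\beta$, all $\tau, \tau' \in I_\beta$ and $l, l' \in \Lambda$, the following estimates hold

$$0 \leq K^{\zeta}_{ll'}(\tau, \tau') \leq \hat K^{0}_{ll'}(\tau, \tau'). \tag{7.18}$$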
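Proof. The case $\zeta \in \Omega_\beta \setminus \Omega_\beta^t$ is trivial. For $\zeta \in \Omega_\beta^t$, we rewrite (7.16) as follows

$$\begin{aligned}
K^{\zeta}_{ll'}(\tau, \tau') = \frac{1}{[Z_{\beta,\Lambda}(\zeta)]^2} \int\!\!\int_{\Omega_{\beta,\Lambda}\times\Omega_{\beta,\Lambda}} &\frac{\omega_l(\tau) - \tilde\omega_l(\tau)}{\sqrt{2}} \cdot \frac{\omega_{l'}(\tau') - \tilde\omega_{l'}(\tau')}{\sqrt{2}} \\
\times \exp\Bigg\{ &-\sum_{\lambda\in\Lambda,\, \lambda'\in\Lambda^c} d_{\lambda\lambda'} \langle \omega_\lambda + \tilde\omega_\lambda, \zeta_{\lambda'} \rangle_\beta - \frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d_{\lambda\lambda'} \big[ \langle \omega_\lambda, \omega_{\lambda'} \rangle_\beta + \langle \tilde\omega_\lambda, \tilde\omega_{\lambda'} \rangle_\beta \big] \\
&- \sum_{\lambda\in\Lambda} \int_{I_\beta} [V(\omega_\lambda(t)) + V(\tilde\omega_\lambda(t))]\, dt \Bigg\} \bigotimes_{\lambda\in\Lambda} (\chi_\beta \otimes \chi_\beta)(d\omega_\lambda, d\tilde\omega_\lambda).
\end{aligned}$$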
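Now we apply the following transformation in the space $\Omega_{\beta,\Lambda} \times \Omega_{\beta,\Lambda}$:

$$\begin{aligned}
\omega_l(\tau) &= (\xi_l(\tau) + \eta_l(\tau))/\sqrt{2}, & \xi_l(\tau) &= (\omega_l(\tau) - \tilde\omega_l(\tau))/\sqrt{2}, \\
\tilde\omega_l(\tau) &= (-\xi_l(\tau) + \eta_l(\tau))/\sqrt{2}; & \eta_l(\tau) &= (\omega_l(\tau) + \tilde\omega_l(\tau))/\sqrt{2};
\end{aligned} \tag{7.19}$$

which yields

$$\begin{aligned}
K^{\zeta}_{ll'}(\tau, \tau') = [Z_{\beta,\Lambda}(\zeta)]^{-2} \int\!\!\int_{\mathcal{X}_{\beta,\Lambda}\times\mathcal{X}_{\beta,\Lambda}} &\xi_l(\tau)\xi_{l'}(\tau') \exp\Bigg\{ -\sqrt{2} \sum_{\lambda\in\Lambda,\, \lambda'\in\Lambda^c} d_{\lambda\lambda'} \langle \eta_\lambda, \zeta_{\lambda'} \rangle_\beta \\
&- \frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d_{\lambda\lambda'} \big[ \langle \xi_\lambda, \xi_{\lambda'} \rangle_\beta + \langle \eta_\lambda, \eta_{\lambda'} \rangle_\beta \big] - \sum_{\lambda\in\Lambda} Q(\xi_\lambda, \eta_\lambda) \\
&- \sum_{\lambda\in\Lambda} \int_{I_\beta} [\hat V(\xi_\lambda(t)) + \hat V(\eta_\lambda(t))]\, dt \Bigg\} \bigotimes_{\lambda\in\Lambda} (\chi_\beta \otimes \chi_\beta)(d\xi_\lambda, d\eta_\lambda),
\end{aligned} \tag{7.20}$$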
where (see (7.14) and (7.15))

$$Q(\xi_\lambda, \eta_\lambda) = \int_{I_\beta} W(\xi_\lambda(t)|\eta_\lambda(t))\, dt = \sum_{p=1}^{r-1} \int_{I_\beta} b_p(\eta_\lambda(t))\, [\xi_\lambda(t)]^{2p}\, dt. \tag{7.21}$$
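Since, for $V$ obeying the condition (V2) in Sec. 2.1, all $b_p$ are nonnegative, all the coefficients $b_p(\eta_\lambda(t))$ are nonnegative for all $\eta_\lambda(t)$. For $\vartheta \in [0, 1]$, we set

$$\Xi_{ll'}(\vartheta|\eta_\Lambda, \tau, \tau') \overset{\mathrm{def}}{=} \langle \xi_l(\tau)\xi_{l'}(\tau') \rangle_{\phi_\vartheta(\cdot|\eta_\Lambda)}, \tag{7.22}$$

where the expectation is taken with respect to the measure

$$\phi_\vartheta(d\xi_\Lambda|\eta_\Lambda) = \frac{1}{\Theta(\vartheta|\eta_\Lambda)} \exp\Bigg\{ -\vartheta \sum_{\lambda\in\Lambda} Q(\xi_\lambda, \eta_\lambda) - \frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d_{\lambda\lambda'} \langle \xi_\lambda, \xi_{\lambda'} \rangle_\beta - \sum_{\lambda\in\Lambda} \int_{I_\beta} \hat V(\xi_\lambda(t))\, dt \Bigg\} \bigotimes_{\lambda\in\Lambda} \chi_\beta(d\xi_\lambda), \tag{7.23}$$

where

$$\Theta(\vartheta|\eta_\Lambda) \overset{\mathrm{def}}{=} \int_{\mathcal{X}_{\beta,\Lambda}} \exp\Bigg\{ -\vartheta \sum_{\lambda\in\Lambda} Q(\xi_\lambda, \eta_\lambda) - \frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d_{\lambda\lambda'} \langle \xi_\lambda, \xi_{\lambda'} \rangle_\beta - \sum_{\lambda\in\Lambda} \int_{I_\beta} \hat V(\xi_\lambda(t))\, dt \Bigg\} \bigotimes_{\lambda\in\Lambda} \chi_\beta(d\xi_\lambda). \tag{7.24}$$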
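One observes that $\Xi_{ll'}$ is a continuous function of $\vartheta \in [0, 1]$. It is differentiable on $(0, 1)$, where its derivative is

$$\begin{aligned}
\frac{\partial}{\partial\vartheta} \Xi_{ll'}(\vartheta|\eta_\Lambda, \tau, \tau') = -\frac{1}{\Theta(\vartheta|\eta_\Lambda)} \sum_{p=1}^{r-1} \int_{I_\beta} \sum_{\lambda\in\Lambda} b_p(\eta_\lambda(t)) \Big\{ &\big\langle [\xi_\lambda(t)]^{2p}\, \xi_l(\tau)\xi_{l'}(\tau') \big\rangle_{\phi_\vartheta(\cdot|\eta_\Lambda)} \\
&- \big\langle [\xi_\lambda(t)]^{2p} \big\rangle_{\phi_\vartheta(\cdot|\eta_\Lambda)} \big\langle \xi_l(\tau)\xi_{l'}(\tau') \big\rangle_{\phi_\vartheta(\cdot|\eta_\Lambda)} \Big\}\, dt.
\end{aligned}$$

For every $\eta_\Lambda \in \Omega_{\beta,\Lambda}$, the measure (7.23) has the form (5.1) with $\xi = 0$, thus the GKS inequalities (6.2), (6.3) hold for its moments. This yields

$$\frac{\partial}{\partial\vartheta} \Xi_{ll'}(\vartheta|\eta_\Lambda, \tau, \tau') \leq 0,$$

hence, since at $\vartheta = 0$ the measure (7.23) no longer depends on $\eta_\Lambda$ and coincides with $\hat\mu^{(0)}_{\beta,\Lambda}$,

$$0 \leq \Xi_{ll'}(1|\eta_\Lambda, \tau, \tau') \leq \Xi_{ll'}(0|\eta_\Lambda, \tau, \tau') = \hat K^{0}_{ll'}(\tau, \tau'), \tag{7.25}$$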
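for all $\eta_\Lambda$, $l, l' \in \Lambda$, $\tau, \tau' \in I_\beta$. On the other hand, (7.20) may be rewritten

$$\begin{aligned}
K^{\zeta}_{ll'}(\tau, \tau') = [Z_{\beta,\Lambda}(\zeta)]^{-2} \int_{\mathcal{X}_{\beta,\Lambda}} &\Xi_{ll'}(1|\eta_\Lambda, \tau, \tau')\, \Theta(1|\eta_\Lambda) \exp\Bigg\{ -\sqrt{2} \sum_{\lambda\in\Lambda,\, \lambda'\in\Lambda^c} d_{\lambda\lambda'} \langle \eta_\lambda, \zeta_{\lambda'} \rangle_\beta \\
&- \frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d_{\lambda\lambda'} \langle \eta_\lambda, \eta_{\lambda'} \rangle_\beta - \sum_{\lambda\in\Lambda} \int_{I_\beta} \hat V(\eta_\lambda(t))\, dt \Bigg\} \bigotimes_{\lambda\in\Lambda} \chi_\beta(d\eta_\lambda).
\end{aligned} \tag{7.26}$$

Applying here (7.25) one arrives at (7.18).

Now let us return to the measures (2.60), (2.61), for which one may write

$$\mu_{\beta,\Lambda}(d\omega_\Lambda) = \frac{Z^{(0)}_{\beta,\Lambda}}{Z_{\beta,\Lambda}} \exp\Bigg\{ -\frac{1}{2} \sum_{l,l'\in\Lambda} [d^{\Lambda}_{ll'} - d_{ll'}]\, \langle \omega_l, \omega_{l'} \rangle_\beta \Bigg\}\, \mu^{(0)}_{\beta,\Lambda}(d\omega_\Lambda).$$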
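Taking into account (D2) in Sec. 2.1 and the GKS inequalities, one easily proves the following statement.

Proposition 7.1. For every pair $l, l' \in \Lambda$ and all $\tau, \tau' \in I_\beta$, the following estimate holds

$$\hat K^{0}_{ll'}(\tau, \tau') \leq \int_{\mathcal{X}_{\beta,\Lambda}} \omega_l(\tau)\omega_{l'}(\tau')\, \hat\mu_{\beta,\Lambda}(d\omega_\Lambda) \overset{\mathrm{def}}{=} \hat K_{ll'}(\tau, \tau'). \tag{7.27}$$

Combining this estimate with (7.18) one obtains that

$$K^{\zeta}_{ll'}(\tau, \tau') \leq \hat K_{ll'}(\tau, \tau'), \tag{7.28}$$

which holds for arbitrary $\zeta \in \Omega_\beta$, all $\tau, \tau' \in I_\beta$ and $l, l' \in \Lambda$.

7.3. Refined Gaussian upper bound

For the periodic local Gibbs measure (2.60), we write

$$K_{ll'}(\tau, \tau') = \int_{\Omega_{\beta,\Lambda}} \omega_l(\tau)\omega_{l'}(\tau')\, \mu_{\beta,\Lambda}(d\omega_\Lambda), \tag{7.29}$$

and

$$K_\Lambda = \frac{1}{\beta|\Lambda|} \sum_{l,l'\in\Lambda} \int\!\!\int_{I_\beta^2} K_{ll'}(\tau, \tau')\, d\tau\, d\tau' = \sum_{l'\in\Lambda} \int_{I_\beta} K_{ll'}(\tau, \tau')\, d\tau'. \tag{7.30}$$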
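For the one-point box $\Lambda = \{l\}$, we write simply $K$. This parameter depends on $\beta$, $m$, and the parameters of the potential $V$. We show that by varying these quantities, $K$ can be made arbitrarily small. Set

$$d = -\sum_{l'\in\mathbb{L}} d_{ll'}. \tag{7.31}$$

Recall that the limit $\Lambda \nearrow \mathbb{L}$ is taken over a sequence of boxes $\mathcal{L}$.

Theorem 7.4. For the model described by the Hamiltonian (2.1), (2.2) with $D = 1$ and with the one-particle potential $V$ satisfying (V1), given in Sec. 2.1, let $K$, defined by (7.30) with $\Lambda = \{l\}$, obey the condition

$$K < 1/d. \tag{7.32}$$

Then there exists $\Lambda_0 \in \mathcal{L}$ such that for all $\Lambda > \Lambda_0$, the following estimate

$$K_\Lambda \leq \frac{K}{1 - dK} \tag{7.33}$$

holds.

Proof. For the potential $V$ satisfying (V1), we set

$$\sigma_\beta(d\omega_l) \overset{\mathrm{def}}{=} \exp\left\{ -\int_{I_\beta} V(\omega_l(t))\, dt \right\} \chi_\beta(d\omega_l), \tag{7.34}$$

where $\chi_\beta$ is the Gaussian measure defined by (2.27). Clearly, $\sigma_\beta$ is a finite measure on $C_\beta$, which belongs to the BFS class (see [40]). For $\vartheta \in [0, 1]$, we set

$$\begin{aligned}
\mu(d\omega_\Lambda) &\overset{\mathrm{def}}{=} \frac{1}{Z_{\beta,\Lambda}(\vartheta)} \exp\Bigg\{ -\frac{\vartheta}{2} \sum_{l,l'\in\Lambda} d^{\Lambda}_{ll'} \langle \omega_l, \omega_{l'} \rangle_\beta \Bigg\} \bigotimes_{l\in\Lambda} \sigma_\beta(d\omega_l), \\
Z_{\beta,\Lambda}(\vartheta) &\overset{\mathrm{def}}{=} \int_{\Omega_{\beta,\Lambda}} \exp\Bigg\{ -\frac{\vartheta}{2} \sum_{l,l'\in\Lambda} d^{\Lambda}_{ll'} \langle \omega_l, \omega_{l'} \rangle_\beta \Bigg\} \bigotimes_{l\in\Lambda} \sigma_\beta(d\omega_l).
\end{aligned} \tag{7.35}$$

This measure has the form (5.1) with $\xi = 0$, thus its moments obey the GKS and the FKG inequalities. Set

$$K_{ll'}(\vartheta|\tau, \tau') = \langle \omega_l(\tau)\omega_{l'}(\tau') \rangle_\mu. \tag{7.36}$$

For every $\vartheta \in [0, 1]$, this is a nonnegative (e.g., by (6.1)) and continuous (by Theorem 4.2) function of $\tau, \tau'$. One can easily show that it is also continuous on $[0, 1]$ and differentiable on $(0, 1)$, as a function of $\vartheta$. Further, set (see (6.9), (6.10))

$$\begin{aligned}
U_{l_1,\dots,l_4}(\vartheta|t_1, \dots, t_4) = {}&\langle \omega_{l_1}(t_1) \cdots \omega_{l_4}(t_4) \rangle_\mu - \langle \omega_{l_1}(t_1)\omega_{l_2}(t_2) \rangle_\mu \langle \omega_{l_3}(t_3)\omega_{l_4}(t_4) \rangle_\mu \\
&- \langle \omega_{l_1}(t_1)\omega_{l_3}(t_3) \rangle_\mu \langle \omega_{l_2}(t_2)\omega_{l_4}(t_4) \rangle_\mu - \langle \omega_{l_1}(t_1)\omega_{l_4}(t_4) \rangle_\mu \langle \omega_{l_2}(t_2)\omega_{l_3}(t_3) \rangle_\mu.
\end{aligned} \tag{7.37}$$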
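For $V$ satisfying (V1) in Sec. 2.1, and for a ferroelectric interaction $\vartheta d^{\Lambda}_{ll'}$, the semi-invariant $U_{l_1,\dots,l_4}$ satisfies the Lebowitz inequality (6.11). It is also continuous as a function of $t_1, \dots, t_4$ and $\vartheta$. By (7.35), (7.36), one has

$$\frac{\partial}{\partial\vartheta} K_{ll'}(\vartheta|\tau, \tau') = -\frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d^{\Lambda}_{\lambda\lambda'} \int_{I_\beta} \big\{ U_{\lambda,\lambda',l,l'}(\vartheta|t, t, \tau, \tau') + 2K_{\lambda l}(\vartheta|t, \tau)\, K_{\lambda' l'}(\vartheta|t, \tau') \big\}\, dt. \tag{7.38}$$

Setting

$$K_\Lambda(\vartheta) = \frac{1}{\beta|\Lambda|} \sum_{l,l'\in\Lambda} \int\!\!\int_{I_\beta^2} K_{ll'}(\vartheta|\tau, \tau')\, d\tau\, d\tau' = \sum_{l'\in\Lambda} \int_{I_\beta} K_{ll'}(\vartheta|\tau, \tau')\, d\tau', \tag{7.39}$$

we get from (7.38)

$$\frac{d}{d\vartheta} K_\Lambda(\vartheta) = -\Psi(\vartheta) + d_\Lambda [K_\Lambda(\vartheta)]^2. \tag{7.40}$$

Here

$$d_\Lambda \overset{\mathrm{def}}{=} -\sum_{l'\in\Lambda} d^{\Lambda}_{ll'} \nearrow d, \qquad \Lambda \nearrow \mathbb{L}, \tag{7.41}$$

and

$$\Psi(\vartheta) \overset{\mathrm{def}}{=} \frac{1}{2|\Lambda|\beta} \sum_{l_1,\dots,l_4\in\Lambda} d^{\Lambda}_{l_1 l_2} \int_{I_\beta^3} U_{l_1,\dots,l_4}(\vartheta|\tau, \tau', t, t)\, d\tau\, d\tau'\, dt \geq 0, \tag{7.42}$$

for all $\vartheta \in [0, 1]$, where we have taken into account (2.7), (6.11), and (D2) in Sec. 2.1. Set

$$R_\Lambda(\vartheta) = \frac{K_\Lambda(0)}{1 - d_\Lambda K_\Lambda(0)\vartheta}. \tag{7.43}$$

By (7.35), (7.39), $K_\Lambda(0) = K$. In view of (7.32) and (7.41) one has

$$K d_\Lambda < 1, \qquad \forall \Lambda \in \mathcal{L},$$

which means that, for any $\Lambda$, $R_\Lambda(\vartheta)$ is differentiable on $(0, 1)$, where

$$\frac{d}{d\vartheta} R_\Lambda(\vartheta) = d_\Lambda [R_\Lambda(\vartheta)]^2. \tag{7.44}$$

Set

$$P_\Lambda(\vartheta) = K_\Lambda(\vartheta) + R_\Lambda(\vartheta) \geq 0, \qquad Q_\Lambda(\vartheta) = K_\Lambda(\vartheta) - R_\Lambda(\vartheta). \tag{7.45}$$

Employing (7.40), (7.44) one has

$$\frac{d}{d\vartheta} Q_\Lambda(\vartheta) = -\Psi(\vartheta) + d_\Lambda P_\Lambda(\vartheta) Q_\Lambda(\vartheta), \qquad Q_\Lambda(0) = 0. \tag{7.46}$$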
In view of (7.42) one has

$$\frac{d}{d\vartheta} Q_\Lambda(0) \leq 0,$$

which implies

$$Q_\Lambda(\vartheta) \leq 0, \qquad \forall \vartheta \in [0, 1]. \tag{7.47}$$
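The comparison behind (7.47) can be illustrated numerically: a solution of the Riccati-type equation (7.40) with $\Psi \geq 0$ stays below the solution $R_\Lambda$ of (7.44) started from the same value. The sketch below is only an illustration, with a hypothetical constant $\Psi$ and arbitrary values for $K_\Lambda(0)$ and $d_\Lambda$; the integrator is a standard fourth-order Runge-Kutta scheme.

```python
def integrate_K(K0, d, psi, steps=2000):
    """Integrate dK/dtheta = -psi(theta) + d * K^2 on [0, 1] with K(0) = K0,
    using the classical fourth-order Runge-Kutta method."""
    h = 1.0 / steps
    f = lambda t, K: -psi(t) + d * K * K
    K, t = K0, 0.0
    for _ in range(steps):
        k1 = f(t, K)
        k2 = f(t + h / 2, K + h * k1 / 2)
        k3 = f(t + h / 2, K + h * k2 / 2)
        k4 = f(t + h, K + h * k3)
        K += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return K

K0, d = 0.5, 1.0                 # hypothetical values with K0 * d < 1, cf. (7.32)
bound = K0 / (1 - d * K0)        # R(1) from (7.43)

K_gauss = integrate_K(K0, d, lambda t: 0.0)   # Psi = 0: K coincides with R
K_anh = integrate_K(K0, d, lambda t: 0.2)     # Psi > 0: K stays below R
```

With $\Psi = 0$ the numerical solution reproduces $R_\Lambda(1)$, and with a positive $\Psi$ it ends strictly below it, as (7.47) requires.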
In fact, $Q_\Lambda(\vartheta) \leq 0$ in a right neighborhood of zero. Since the function $Q_\Lambda$ is continuous, to become positive it should vanish at a point where its derivative is positive. But this is impossible in view of (7.46) and (7.42). Then $Q_\Lambda(1) \leq 0$, which yields (7.33).

Remark 7.2. The measure (7.35) would be Gaussian if one took $V$ in (7.34) to be identically zero. In this case one would get an equality in (7.33). This is the reason why the latter estimate is called a Gaussian upper bound. It is in fact a refined upper bound because $K$ is computed for the non-Gaussian measure (7.34).

8. Applications

8.1. Existence of the long range order

The appearance of the long range order is an effect of the phase transition, which occurs when the fluctuations of the displacements of particles become large. In this subsection, we describe the results of [19, 20, 53] and show the appearance of the long range order for the model (2.1)–(2.3) with $D = 1$, $d \geq 3$ and

$$d_{ll'} = -J\delta_{|l-l'|,1}; \qquad J > 0. \tag{8.1}$$

It should be mentioned here that, for similar models, the existence of the long range order, even for $d = 2$, may be shown by means of Peierls' arguments [11]. On the other hand, for the $\phi^4$-type anharmonicity (i.e., $r = 2$ in (2.5)), the existence of the long-range order was proved in [36, 67]. To prove the appearance of the long range order we introduce an order parameter. Here we will use the following one (more on this subject can be found in [38])

$$\Pi(\beta) = \lim_{\Lambda \nearrow \mathbb{L}} \gamma_{\beta,\Lambda} \left\{ \left( \frac{1}{|\Lambda|} \sum_{l\in\Lambda} q_l \right)^{2} \right\}, \tag{8.2}$$

where $\gamma_{\beta,\Lambda}$ is the periodic local Gibbs state introduced in (2.10). The value $\beta_*$ of the inverse temperature $\beta$, such that $\Pi(\beta) = 0$ for $\beta \leq \beta_*$, and $\Pi(\beta) > 0$ for $\beta > \beta_*$, will be called a critical inverse temperature.

Theorem 8.1. For the system of anharmonic oscillators described by the Hamiltonian (2.1)–(2.3) with $D = 1$, $d \geq 3$, and with the polynomial $v$ which is strictly
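convex on $\mathbb{R}_+$ and such that the polynomial $\xi/2 + v(\xi)$ has a minimum at some $\xi = \xi_0$, there exists $\bar m$ such that, for $m > \bar m$, there exists a critical inverse temperature.

Proof. Having in mind the periodic conditions on the boundaries of the box

$$\Lambda = (-L, L]^d \cap \mathbb{L}, \qquad L \in \mathbb{N},$$

we will use the Fourier transformation of the following form

$$\hat q_p = \frac{1}{\sqrt{|\Lambda|}} \sum_{l\in\Lambda} q_l \exp(ipl), \qquad p = (p_1, \dots, p_d) \in \Lambda^*, \qquad \Lambda^* \overset{\mathrm{def}}{=} \left\{ p \;\middle|\; p_j = -\pi + \frac{\pi}{L}\nu_j,\ \nu_j = 1, 2, \dots, 2L,\ j = 1, \dots, d \right\}. \tag{8.3}$$

Denote

$$D_\Lambda(p) = \int_{I_\beta} \Gamma^{\beta,\Lambda}_{\hat q_{-p}, \hat q_p}(0, \tau)\, d\tau. \tag{8.4}$$

Suppose that there exist positive $B_p$ and $C_p$, independent of $\Lambda$ and such that

$$D_\Lambda(p) \leq B_p, \qquad \gamma_{\beta,\Lambda}\{[\hat q_p, [H_\Lambda, \hat q_{-p}]]\} \leq C_p, \tag{8.5}$$

where $[\cdot, \cdot]$ stands for the commutator. By means of the estimate obtained in [38, p. 363] one gets the following bound

$$\gamma_{\beta,\Lambda}\{\hat q_p \hat q_{-p}\} \leq \frac{1}{2} \sqrt{B_p C_p}\, \coth\left( \frac{\beta}{2} \sqrt{\frac{C_p}{B_p}} \right). \tag{8.6}$$

In our case

$$[\hat q_p, [H_\Lambda, \hat q_{-p}]] = \frac{1}{m}.$$

On the other hand, the infrared estimates [38, 42] yield

$$D_\Lambda(p) \leq \frac{1}{2JE(p)}, \tag{8.7}$$

where $J$ is defined by (8.1) and

$$E(p) = \sum_{j=1}^{d} [1 - \cos(p_j)].$$

Employing these estimates in (8.6), that is, taking $B_p = 1/(2JE(p))$ and $C_p = 1/m$, one obtains

$$\gamma_{\beta,\Lambda}\{\hat q_p \hat q_{-p}\} \leq \frac{1}{2} \cdot \frac{1}{\sqrt{2mJE(p)}} \coth\left( \beta \sqrt{\frac{JE(p)}{2m}} \right). \tag{8.8}$$

By (8.3)

$$\left( \frac{1}{|\Lambda|} \sum_{l\in\Lambda} q_l \right)^{2} = \frac{1}{|\Lambda|}\, \hat q_0^{\,2}. \tag{8.9}$$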
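On the other hand,

$$\sum_{l\in\Lambda} q_l^2 = \hat q_0^{\,2} + \sum_{p\in\Lambda^*\setminus\{0\}} \hat q_p \hat q_{-p}, \tag{8.10}$$

which yields

$$\gamma_{\beta,\Lambda}\{\hat q_0^{\,2}\} = |\Lambda|\, \gamma_{\beta,\Lambda}\{q_l^2\} - \sum_{p\in\Lambda^*\setminus\{0\}} \gamma_{\beta,\Lambda}\{\hat q_p \hat q_{-p}\}. \tag{8.11}$$

Here we have taken into account that the periodic Gibbs state $\gamma_{\beta,\Lambda}$ is invariant under translations from $T/T(\Lambda)$. Then by (8.9), one has

$$\gamma_{\beta,\Lambda}\left\{ \left( \frac{1}{|\Lambda|} \sum_{l\in\Lambda} q_l \right)^{2} \right\} = \gamma_{\beta,\Lambda}\{q_l^2\} - \frac{1}{|\Lambda|} \sum_{p\in\Lambda^*\setminus\{0\}} \gamma_{\beta,\Lambda}\{\hat q_p \hat q_{-p}\}. \tag{8.12}$$

Making use of (8.8) and passing to the limit $\Lambda \nearrow \mathbb{L}$ we obtain the following estimate for the order parameter (8.2)

$$\Pi(\beta) \geq \gamma_{\beta,\Lambda}\{q_l^2\} - \frac{1}{2} \cdot \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \frac{1}{\sqrt{2mJE(p)}} \coth\left( \beta \sqrt{\frac{JE(p)}{2m}} \right) dp. \tag{8.13}$$

The latter integral is convergent for $d \geq 3$. Getting back to the representations (2.11), (2.12), (2.22), (2.62), (7.35), and (7.36) one obtains

$$\gamma_{\beta,\Lambda}\{q_l^2\} = K_{ll}(1|0, 0),$$

with $d_{ll'}$ given by (8.1). By means of (7.37) one may rewrite (7.38) as follows

$$\frac{\partial}{\partial\vartheta} K_{ll'}(\vartheta|\tau, \tau') = -\frac{1}{2} \sum_{\lambda,\lambda'\in\Lambda} d^{\Lambda}_{\lambda\lambda'} \int_{I_\beta} \big\{ \langle \omega_l(\tau)\omega_{l'}(\tau')\omega_\lambda(t)\omega_{\lambda'}(t) \rangle_\mu - \langle \omega_l(\tau)\omega_{l'}(\tau') \rangle_\mu \langle \omega_\lambda(t)\omega_{\lambda'}(t) \rangle_\mu \big\}\, dt \geq 0.$$

Here, to obtain the latter estimate, we have used the GKS inequality (6.3), which obviously holds for the moments of the measure (7.35). This estimate yields

$$\gamma_{\beta,\Lambda}\{q_l^2\} = K_{ll}(1|0, 0) \geq K_{ll}(0|0, 0) = \langle \omega_l(0)^2 \rangle_{\sigma_\beta} = \frac{\mathrm{trace}\{q_l^2 \exp(-\beta H_l)\}}{\mathrm{trace}\{\exp(-\beta H_l)\}}, \tag{8.14}$$

where the Hamiltonian $H_l$ and the measure $\sigma_\beta$ are given by (2.2) and (7.34) respectively. Now, as above, we shall use the spectral properties of the Hamiltonian $H_l$. Its spectrum consists of nondegenerate eigenvalues $\epsilon_s$, $s \in \mathbb{N}$, $\epsilon_s < \epsilon_{s+1}$, which correspond to the eigenfunctions $\psi_s$ constituting an orthonormal base of the space $L^2(\mathbb{R})$. Setting $q_s^2 = (q_l^2 \psi_s, \psi_s)_{L^2(\mathbb{R})}$, we have

$$K_{ll}(0|0, 0) = \left( \sum_{s\in\mathbb{N}} q_s^2\, e^{-\beta\epsilon_s} \right) \Bigg/ \left( \sum_{s\in\mathbb{N}} e^{-\beta\epsilon_s} \right).$$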
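Multiplying numerator and denominator by $e^{\beta\epsilon_1}$ and passing to the limit $\beta \to +\infty$ we get

$$\lim_{\beta\to+\infty} K_{ll}(0|0, 0) = \int_{\mathbb{R}} x^2 \psi_1^2(x)\, dx. \tag{8.15}$$

The following semi-classical result was proven in [32, 78, 79]. For a double-well potential $V(x) + x^2/2$ possessing nondegenerate minima at the points $\pm x_0$, and for any $\varepsilon > 0$, one has

$$\liminf_{m\to+\infty} \int_{B_\varepsilon^{\pm}} \psi_1(x)^2\, dx = \frac{1}{2}, \tag{8.16}$$

where $B_\varepsilon^{\pm} = [\pm x_0 - \varepsilon, \pm x_0 + \varepsilon]$. Therefore, given $\varepsilon > 0$ and any $\delta > 0$, one may find $m_{\varepsilon,\delta} > 0$, such that for all $m \geq m_{\varepsilon,\delta}$,

$$\int_{B_\varepsilon^{\pm}} \psi_1^2(x)\, dx \geq \frac{1}{2}(1 - \delta).$$

Then, for such $m$,

$$\int_{\mathbb{R}} x^2 \psi_1^2(x)\, dx \geq 2 \int_{B_\varepsilon^{+}} x^2 \psi_1^2(x)\, dx \geq 2 (x_0 - \varepsilon)^2 \int_{B_\varepsilon^{+}} \psi_1^2(x)\, dx \geq (x_0 - \varepsilon)^2 (1 - \delta).$$

Suppose that the parameters of the model are such that the following inequality

$$x_0^2 > \frac{1}{\sqrt{8mJ}} \cdot \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \frac{dp}{\sqrt{E(p)}} \overset{\mathrm{def}}{=} \frac{I_d}{\sqrt{8mJ}} \tag{8.17}$$

holds. The latter integral converges for $d \geq 3$. Then one may choose positive $\varepsilon$ and $\delta$ such that

$$(x_0 - \varepsilon)^2 (1 - \delta) > \frac{I_d}{\sqrt{8mJ}}. \tag{8.18}$$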
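The constant $I_d$ in (8.17) has no closed form quoted in the text, but it is easy to estimate. The sketch below uses a midpoint rule on a uniform grid (an even number of points per axis, so the integrable singularity at $p = 0$ is never sampled) and then tests the inequality (8.17) for given parameters. The grid size and the sample parameter values are illustrative assumptions, and the quadrature is only a rough estimate.

```python
import math
from itertools import product

def lattice_integral(d, n=16):
    """Midpoint-rule estimate of I_d = (2 pi)^(-d) * int_{[-pi,pi]^d} dp / sqrt(E(p)),
    with E(p) = sum_j (1 - cos p_j). n must be even so that p = 0 is not a grid point."""
    h = 2 * math.pi / n
    pts = [-math.pi + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for p in product(pts, repeat=d):
        total += 1.0 / math.sqrt(sum(1 - math.cos(pj) for pj in p))
    # (2 pi)^(-d) * h^d * sum  ==  sum / n^d
    return total / n ** d

def long_range_order_condition(x0, m, J, d=3):
    """Check the sufficient condition (8.17): x0^2 > I_d / sqrt(8 m J)."""
    return x0 ** 2 > lattice_integral(d) / math.sqrt(8 * m * J)

I3 = lattice_integral(3)
heavy = long_range_order_condition(x0=1.0, m=100.0, J=1.0)   # large mass
light = long_range_order_condition(x0=0.1, m=0.01, J=1.0)    # small mass, shallow wells
```

The condition is met for a heavy particle with well-separated minima and fails for a light particle with minima close to the origin, in line with the role of the mass in Theorem 8.1.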
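Since $\coth x$ is a monotone decreasing function on $\mathbb{R}_+$ and $\coth x > 1$ for all $x > 0$, one may find, taking into account also (8.15), a positive $\beta_0(\varepsilon, \delta)$ such that

$$K_{ll}(0|0, 0) > \frac{1}{2} \cdot \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \frac{1}{\sqrt{2mJE(p)}} \coth\left( \beta \sqrt{\frac{JE(p)}{2m}} \right) dp,$$

for all $\beta > \beta_0(\varepsilon, \delta)$. Then by (8.14) and (8.13) one gets

$$\Pi(\beta) > 0,$$

for $m \geq m(\varepsilon, \delta)$ and $\beta > \beta_0(\varepsilon, \delta)$. Now we fix $\varepsilon$ and $\delta$ in such a way that $\beta_0(\varepsilon, \delta)$ has its smallest possible value $\beta_*$. Then we take $\bar m$ to be the value of $m(\varepsilon, \delta)$ at such $\varepsilon$ and $\delta$.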
8.2. Normality of fluctuations and suppression of critical points

In this subsection, we follow [6, 54, 56] and consider the model described by (2.1)–(2.3) with arbitrary $D \in \mathbb{N}$ and with the potential $V$ satisfying (V2) in Sec. 2.1. At the critical point of the model a strong dependence between the oscillations of particles appears. At this point the fluctuations of the displacement of particles become large (abnormal). More on abnormal fluctuations in such and similar systems can be found in [3, 29, 46, 86]. To describe the fluctuations we introduce the fluctuation operator

$$Q_\Lambda \overset{\mathrm{def}}{=} \frac{1}{\sqrt{|\Lambda|}} \sum_{l\in\Lambda} q_l, \tag{8.19}$$

where $\Lambda$ is again a box. If the Green functions (2.62), (2.63), constructed with the help of $Q_\Lambda$, remain bounded when $\Lambda \nearrow \mathbb{L}$, the fluctuations may be regarded as normal (under certain additional conditions this implies normality in the usual sense [62]). At the critical point the fluctuations become so large that in order to preserve the mentioned boundedness one should use an abnormal normalization, i.e., one has to employ the following operator

$$Q_{\lambda,\Lambda} = \lambda(\Lambda) Q_\Lambda = \frac{\lambda(\Lambda)}{\sqrt{|\Lambda|}} \sum_{l\in\Lambda} q_l, \tag{8.20}$$

where $\{\lambda(\Lambda) \in \mathbb{R} \mid \Lambda \in \mathcal{L}\}$ is a sequence converging to zero and $\mathcal{L}$ is a sequence of boxes exhausting the lattice $\mathbb{L}$. Typically, $\lambda(\Lambda) \sim |\Lambda|^{-\sigma}$, where $\sigma < 1/2$ is a critical exponent. For $\beta > \beta_*$, the fluctuations destroy the $O(D)$-symmetry and $\lambda(\Lambda)$ is to be set to $|\Lambda|^{-1/2}$ (cf. (8.2)). In what follows, the above mentioned normality of fluctuations corresponds to the suppression of the critical point behavior of the model considered.

Definition 8.1. Given $\beta > 0$, let the sequence of the Green functions

$$\big\{ \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha_1)},\dots,Q_\Lambda^{(\alpha_{2n})}}(\tau_1, \dots, \tau_{2n}) \;\big|\; \Lambda \in \mathcal{L} \big\}$$

be bounded uniformly on $I_\beta$ for all $n \in \mathbb{N}$, any $\alpha_1, \dots, \alpha_{2n} = 1, 2, \dots, D$, and any sequence of boxes $\mathcal{L}$. Then the fluctuations of the displacements of particles are said to be normal at this temperature.

Let $H_l$ be the Hamiltonian (2.2) describing a one-dimensional (i.e. $D = 1$) oscillator. Its spectrum consists of the nondegenerate eigenvalues $\epsilon_s$, $s \in \mathbb{N}$. Set

$$\Delta = \min\{\epsilon_{s+1} - \epsilon_s : s \in \mathbb{N}\}. \tag{8.21}$$

Theorem 8.2. Let the particle mass $m$, the interaction parameter $d$ given by (7.31), and the spectral parameter $\Delta$ obey the condition

$$m\Delta^2 > d. \tag{8.22}$$

Then, for any $D \in \mathbb{N}$, the fluctuations of the displacements of particles remain normal at all temperatures.
The proof of this theorem will be given below. Our next statement shows that the condition (8.22) may be satisfied for small values of the mass $m$.

Theorem 8.3. There exists $\kappa > 0$ such that

$$\lim_{m\searrow 0} m^{(r-1)/(r+1)} (m\Delta^2) = \kappa, \tag{8.23}$$

where $r$ is the same as in (2.5).

Proof. Recall that the Hamiltonian $H_l$ acts in the Hilbert space $\mathcal{H}_l = L^2(\mathbb{R})$. Given $\delta > 0$, consider the following unitary operator on $\mathcal{H}_l$

$$(U_\delta \psi)(x) = \delta^{1/2} \psi(\delta x).$$

Then

$$U_\delta \left( \frac{d}{dx} \right) U_\delta^{-1} = \delta^{-1} \left( \frac{d}{dx} \right), \qquad U_\delta\, q\, U_\delta^{-1} = \delta q. \tag{8.24}$$

Set $\delta = m^{-1/(2r+2)}$. Then the operator

$$H_m = m^{-r/(r+1)} R, \qquad R \overset{\mathrm{def}}{=} R_0 + m^{1/(r+1)} R_1, \tag{8.25}$$

is unitarily equivalent to $H_l$ given by (2.2). Here

$$R_0 = -\frac{1}{2} \left( \frac{d}{dx} \right)^{2} + b_r q^{2r},$$

and

$$R_1 = \frac{1}{2}(1 + a)\, m^{\frac{r-2}{r+1}}\, q^2 + \sum_{s=2}^{r-1} m^{\frac{r-s-1}{r+1}}\, b_s q^{2s}.$$

Let $\Delta_R$ and $\Delta_0$ be defined by (8.21) but with the eigenvalues of the operators $R$ and $R_0$ respectively. Then

$$\Delta = m^{-\frac{r}{r+1}} \Delta_R. \tag{8.26}$$

One observes that the operator $R$ is a perturbation of $R_0$, which is analytic with respect to the variable $\lambda = m^{1/(r+1)}$ at the point $\lambda = 0$. Thus

$$\lim_{m\searrow 0} \Delta_R = \Delta_0.$$

Taking into account (8.26) one gets (8.23) with $\kappa = \Delta_0^2$.

Due to the $O(D)$-symmetry of the model the following function (cf. (7.29))

$$K_{ll'}(\tau, \tau') = \int_{\mathcal{X}_{\beta,\Lambda}} \omega_l^{(\alpha)}(\tau)\, \omega_{l'}^{(\alpha)}(\tau')\, \mu_{\beta,\Lambda}(d\omega_\Lambda),$$
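does not depend on $\alpha = 1, \dots, D$. Let $K_\Lambda$ be defined by (7.30) with the above function $K_{ll'}$. Further, set
\[ K_\Lambda(\nu) = \int_{I_\beta} \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha)},Q_\Lambda^{(\alpha)}}(0, \tau) \cos(2\pi\nu\tau/\beta)\, d\tau , \qquad \nu \in \mathbb{Z} . \tag{8.27} \]
Then $K_\Lambda = K_\Lambda(0)$ and
\[ \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha)},Q_\Lambda^{(\alpha)}}(\tau, \tau') = \frac{1}{\beta} \sum_{\nu \in \mathbb{Z}} K_\Lambda(\nu) \cos\left[ \frac{2\pi\nu}{\beta} (\tau' - \tau) \right] . \tag{8.28} \]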
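Lemma 8.1. The following estimate
\[ 0 \le K_\Lambda(\nu) \le \frac{\beta^2}{4m\pi^2\nu^2} , \tag{8.29} \]
holds for all $\nu \in \mathbb{Z} \setminus \{0\}$.

Proof. By (2.12), (2.22)
\[ \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha)},Q_\Lambda^{(\alpha)}}(0, \tau) = \frac{1}{Z_{\beta,\Lambda}} \mathrm{trace}\left\{ Q_\Lambda^{(\alpha)} \exp[-\tau H_\Lambda]\, Q_\Lambda^{(\alpha)} \exp[-(\beta - \tau) H_\Lambda] \right\} . \]
The Hamiltonian $H_\Lambda$ has a discrete spectrum consisting of positive eigenvalues $E_s$, $s \in \mathbb{N}$ (see (2.17)). We set $Q_{ss'} = (Q^{(\alpha)} \Psi_s, \Psi_{s'})_{L^2(\mathbb{R}^{D|\Lambda|})}$. This yields in (8.27)
\[ K_\Lambda(\nu) = \frac{1}{Z_{\beta,\Lambda}} \sum_{s,s' \in \mathbb{N}} Q_{ss'}^2\, \frac{E_s - E_{s'}}{(E_s - E_{s'})^2 + (2\pi\nu/\beta)^2}\, [\exp(-\beta E_{s'}) - \exp(-\beta E_s)] . \tag{8.30} \]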
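Formula (8.30) rests on one elementary integral: for each pair of levels, the weight $e^{-\tau E_s} e^{-(\beta - \tau) E_{s'}}$ coming from the trace is integrated against $\cos(2\pi\nu\tau/\beta)$ over $I_\beta = [0, \beta]$. The following sketch, with hypothetical function names and arbitrary test values, compares a numerical quadrature of that integral with the closed form that multiplies $Q_{ss'}^2 / Z_{\beta,\Lambda}$ in (8.30). It is a sanity check of the algebra under these assumptions, not part of the proof.

```python
import math


def lhs_integral(Es, Et, beta, nu, steps=2000):
    """Composite Simpson approximation (steps must be even) of
    int_0^beta exp(-tau*Es) * exp(-(beta - tau)*Et) * cos(2*pi*nu*tau/beta) dtau."""
    def g(tau):
        return (math.exp(-tau * Es) * math.exp(-(beta - tau) * Et)
                * math.cos(2 * math.pi * nu * tau / beta))

    h = beta / steps
    total = g(0.0) + g(beta)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3


def rhs_closed_form(Es, Et, beta, nu):
    """Level-pair factor of (8.30); requires Es != Et or nu != 0."""
    a = Es - Et
    w = 2 * math.pi * nu / beta
    return a / (a * a + w * w) * (math.exp(-beta * Et) - math.exp(-beta * Es))


if __name__ == "__main__":
    print(lhs_integral(1.7, 0.4, 2.0, 3), rhs_closed_form(1.7, 0.4, 2.0, 3))
```

The closed form is symmetric under exchanging the two levels and is nonnegative, which is the observation used next to conclude $K_\Lambda(\nu) \ge 0$.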
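Thus $K_\Lambda(\nu) \ge 0$. Further, for $\nu \ne 0$,
\[ K_\Lambda(\nu) \le \frac{\beta^2}{(2\pi\nu)^2 Z_{\beta,\Lambda}} \sum_{s,s' \in \mathbb{N}} Q_{ss'}^2\, [E_s - E_{s'}]\, [\exp(-\beta E_{s'}) - \exp(-\beta E_s)] = \frac{\beta^2}{(2\pi\nu)^2}\, \gamma_{\beta,\Lambda}\left\{ \left[ Q_\Lambda^{(\alpha)}, \left[ H_\Lambda, Q_\Lambda^{(\alpha)} \right] \right] \right\} = \frac{\beta^2}{4m\pi^2\nu^2} , \tag{8.31} \]
where $[\cdot, \cdot]$ stands for the commutator. As a corollary of (8.29) one gets from (8.28)
\[ \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha)},Q_\Lambda^{(\alpha)}}(\tau, \tau') \le \Gamma^{\beta,\Lambda}_{Q_\Lambda^{(\alpha)},Q_\Lambda^{(\alpha)}}(0, 0) , \qquad \forall \tau, \tau' \in I_\beta . \tag{8.32} \]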
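Below we will use the scalar domination estimate (7.1). To this end we compare the $D$-dimensional model we consider with the corresponding scalar model. Let us set
\[ \tilde\Gamma^{\beta,\Lambda}_{2n}(\tau_1, \dots, \tau_{2n}) = \tilde\Gamma^{\beta,\Lambda}_{\tilde Q_\Lambda,\dots,\tilde Q_\Lambda}(\tau_1, \dots, \tau_{2n}) , \tag{8.33} \]
where $\tilde Q_\Lambda$ is defined by (8.19) but for the one-dimensional model. For this model, the Gaussian domination inequality (6.12) and the estimate (8.32) imply that the estimate
\[ 0 \le \tilde\Gamma^{\beta,\Lambda}_{2n}(\tau_1, \dots, \tau_{2n}) \le \frac{(2n)!}{2^n n!} \left[ \tilde\Gamma^{\beta,\Lambda}_2(0, 0) \right]^n \tag{8.34} \]
holds for all $n \in \mathbb{N}$. Let $\tilde K_\Lambda$ be defined by (8.27) with $\nu = 0$ and with $\tilde\Gamma$ instead of $\Gamma$ (i.e., it is $K_\Lambda$ for the one-dimensional model). As above, $\tilde K$ will stand for $\tilde K_\Lambda$ with a one-point box $\Lambda$. Since the estimates (8.29) are valid for all $D$, they hold also for $\tilde K_\Lambda$. Moreover, the scalar domination inequality (7.1) yields
\[ K_\Lambda \le \tilde K_\Lambda . \tag{8.35} \]

Lemma 8.2. Let $\Delta$ be defined by (8.21). Then
\[ \tilde K \le \frac{1}{m\Delta^2} . \tag{8.36} \]

Proof. By (8.30)
\[ \tilde K = \frac{1}{\tilde Z_\beta} \sum_{s,s' \in \mathbb{N}} q_{ss'}^2\, \frac{(\epsilon_s - \epsilon_{s'})\, [e^{-\beta\epsilon_{s'}} - e^{-\beta\epsilon_s}]}{(\epsilon_s - \epsilon_{s'})^2} \le \frac{1}{\Delta^2} \cdot \frac{1}{\tilde Z_\beta} \sum_{s,s' \in \mathbb{N}} q_{ss'}^2\, (\epsilon_s - \epsilon_{s'})\, [e^{-\beta\epsilon_{s'}} - e^{-\beta\epsilon_s}] = \frac{1}{m\Delta^2} , \]
where $\tilde Z_\beta = \mathrm{trace} \exp[-\beta \tilde H_l]$, and $\tilde H_l$ is the one-particle Hamiltonian (2.2) for a one-dimensional oscillator.

Corollary 8.1. Let (8.22) hold. Then the following estimate
\[ K_\Lambda \le \tilde K_\Lambda \le \frac{1}{m\Delta^2 - d} , \tag{8.37} \]
holds for all $\beta$, $\Lambda$, and $D$.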
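Lemma 8.3. Let (8.22) hold. Then for every $\beta > 0$, the sequence $\{\tilde\Gamma^{\beta,\Lambda}_2(0, 0) \mid \Lambda \in \mathcal{L}\}$ is bounded.

Proof. By (8.28), (8.33),
\[ \tilde\Gamma^{\beta,\Lambda}_2(0, 0) = \frac{1}{\beta} \sum_{\nu \in \mathbb{Z}} \tilde K_\Lambda(\nu) = \frac{1}{\beta} \left[ \tilde K_\Lambda + 2 \sum_{\nu=1}^{+\infty} \tilde K_\Lambda(\nu) \right] , \]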
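hence by (8.29), which holds also for $D = 1$, and by (8.37)
\[ \tilde\Gamma^{\beta,\Lambda}_2(0, 0) \le \frac{1}{\beta} \tilde K_\Lambda + \frac{\beta}{12m} \le \frac{\beta^{-1}}{m\Delta^2 - d} + \frac{\beta}{12m} \overset{\mathrm{def}}{=} \Gamma_\beta . \tag{8.38} \]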
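The constant $\beta/(12m)$ in (8.38) comes from summing the bound (8.29) over the nonzero modes. The only outside ingredient is the classical value $\sum_{\nu \ge 1} \nu^{-2} = \pi^2/6$. Written out, with $\tilde K_\Lambda(\nu)$ the one-dimensional analogue of $K_\Lambda(\nu)$:

```latex
\begin{align*}
\frac{2}{\beta} \sum_{\nu=1}^{\infty} \tilde K_\Lambda(\nu)
  &\le \frac{2}{\beta} \sum_{\nu=1}^{\infty} \frac{\beta^2}{4 m \pi^2 \nu^2}
   = \frac{\beta}{2 m \pi^2} \sum_{\nu=1}^{\infty} \frac{1}{\nu^2} \\
  &= \frac{\beta}{2 m \pi^2} \cdot \frac{\pi^2}{6}
   = \frac{\beta}{12 m} .
\end{align*}
```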
Thus, the stated property follows from the boundedness of the sequence $\{\tilde K_\Lambda \mid \Lambda \in \mathcal{L}\}$, which in turn follows from (8.37).

Proof of Theorem 8.2. To estimate the Green functions
\[ \Gamma^{\beta,\Lambda}_{Q^{(\alpha_1)},\dots,Q^{(\alpha_{2n})}}(\tau_1, \dots, \tau_{2n}) , \qquad \alpha_1, \dots, \alpha_{2n} = 1, \dots, D , \tag{8.39} \]
we use the scalar domination inequality (7.1) and the Gaussian upper bound (8.34). We recall that one may apply (7.1) only to functions with coinciding $\alpha_j$. Let us gather the indices $\alpha_j$ in (8.39) into groups $g_k$, $k = 1, 2, \dots, \delta \le D$, numbered in such a way that $|g_k| \ge |g_{k+1}|$. Set $|g_k| = s_k$; then $s_1 + \cdots + s_\delta = 2n$. Hence
\[ \Gamma^{\beta,\Lambda}_{Q^{(\alpha_1)},\dots,Q^{(\alpha_{2n})}}(\tau_1, \dots, \tau_{2n}) = \langle X_1 \cdots X_\delta \rangle_{\mu_{\beta,\Lambda}} , \tag{8.40} \]
where
\[ X_k \overset{\mathrm{def}}{=} \prod_{j:\, \alpha_j \in g_k} \left( \frac{1}{\sqrt{|\Lambda|}} \sum_{l \in \Lambda} \omega_l^{(\alpha_j)}(\tau_j) \right) . \tag{8.41} \]
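Now we apply the Schwarz inequality repeatedly and obtain
\[ |\langle X_1 \cdots X_\delta \rangle_{\mu_{\beta,\Lambda}}| \le \prod_{k=1}^{\delta-1} \left[ \langle X_k^{2^k} \rangle_{\mu_{\beta,\Lambda}} \right]^{2^{-k}} \cdot \left[ \langle X_\delta^{2^{\delta-1}} \rangle_{\mu_{\beta,\Lambda}} \right]^{2^{-(\delta-1)}} . \tag{8.42} \]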
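The exponents in (8.42) are $2^{-1}, 2^{-2}, \dots, 2^{-(\delta-1)}$ and once more $2^{-(\delta-1)}$, which add up to one, so the bound is an instance of the generalized Hölder inequality. As an illustration, the sketch below evaluates both sides on a finite probability space standing in for $\mu_{\beta,\Lambda}$. The sample space, the random data, and the function names are assumptions made for the example only.

```python
import random


def mean(values, weights):
    """Expectation of a function given by its values on a finite sample space."""
    return sum(v * w for v, w in zip(values, weights))


def schwarz_chain_sides(xs, weights):
    """Return (lhs, rhs) of the repeated Schwarz bound for X_1, ..., X_delta, delta >= 2.

    xs[k-1] lists the values of X_k on the sample points; weights is a
    probability vector on the same points.
    """
    delta = len(xs)
    product = [1.0] * len(weights)
    for x in xs:
        product = [p * v for p, v in zip(product, x)]
    lhs = abs(mean(product, weights))
    rhs = 1.0
    for k in range(1, delta + 1):
        # power 2^k for k < delta, and 2^(delta-1) for the last factor
        p = 2 ** k if k < delta else 2 ** (delta - 1)
        rhs *= mean([v ** p for v in xs[k - 1]], weights) ** (1.0 / p)
    return lhs, rhs


if __name__ == "__main__":
    random.seed(0)
    raw = [random.random() for _ in range(6)]
    weights = [w / sum(raw) for w in raw]
    xs = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(3)]
    print(schwarz_chain_sides(xs, weights))
```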
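The Green function $\langle X_k^{2^k} \rangle_{\mu_{\beta,\Lambda}}$ contains $Q_\Lambda^{(\alpha)}$ with the same $\alpha$, thus we may employ the scalar domination inequality (7.1) and the Gaussian upper bound (8.34). This yields
\[ \langle X_k^{2^k} \rangle_{\mu_{\beta,\Lambda}} \le \Theta_k(s_k) \left[ \tilde\Gamma^{\beta,\Lambda}_2(0, 0) \right]^{2^{k-1} s_k} , \tag{8.43} \]
\[ \Theta_k(s) \overset{\mathrm{def}}{=} 1 \cdot 3 \cdot 5 \cdots (2^k s - 1) = (2^k s - 1)!! . \]
In the Appendix we show that, for all $n \in \mathbb{Z}_+$, all $D \in \mathbb{N}$, and for all possible combinations of $\alpha_1, \dots, \alpha_{2n}$, the following estimate holds
\[ \frac{1}{(2n)!} \prod_{k=1}^{\delta-1} [\Theta_k(s_k)]^{2^{-k}}\, [\Theta_\delta(s_\delta)]^{2^{-(\delta-1)}} \le c_D^n , \tag{8.44} \]
where
\[ c_1 = 1 , \qquad c_D = 2\, [(2^{D-2})!]^{2^{2-D}} , \quad D \ge 2 . \tag{8.45} \]
Thus
\[ \prod_{k=1}^{\delta-1} \left[ \langle X_k^{2^k} \rangle_{\mu_{\beta,\Lambda}} \right]^{2^{-k}} \cdot \left[ \langle X_\delta^{2^{\delta-1}} \rangle_{\mu_{\beta,\Lambda}} \right]^{2^{-(\delta-1)}} \le (2n)!\, [c_D \Gamma_\beta]^n , \tag{8.46} \]
where we have taken into account (8.38). Applying this estimate in (8.42) one gets the boundedness to be proven.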
8.3. Uniqueness of Gibbs states

In this subsection we follow [8] and consider the scalar version of the model (2.1)–(2.5) with the potential $V$ obeying (V2) in Sec. 2.1. As was proved in [15] (see also [12] and the references therein), the class of tempered Gibbs measures $\mathcal{G}_\beta$ (see Definition 2.2) for this model is nonempty. Moreover, by Theorem 8.1, the model has a critical point, which implies that, for one and the same value of the model parameters, $\mathcal{G}_\beta$ contains more than one element. Having in mind the suppression of the critical points proved above, one may expect the uniqueness of tempered Gibbs measures to occur for small values of $m$. In fact, we prove this in the current subsection.

Theorem 8.4. For the model with the Hamiltonian described by (2.1)–(2.5) with the potential obeying (V2), given in Sec. 2.1, for every $\beta$, there exists a positive $m_* = m_*(\beta)$ such that for all values of the mass $m \in (0, m_*)$, the class of tempered Gibbs measures $\mathcal{G}_\beta$ consists of exactly one element.

It should be pointed out that by Theorems 8.2 and 8.3, the suppression of abnormal fluctuations occurs for $m < m_\star$, with $m_\star > 0$ independent of $\beta$, whereas the above $m_*(\beta) \to 0$ as $\beta \to +\infty$, as follows from (8.78), (8.79) below. To prove Theorem 8.4 we need some tools, which we introduce now.

Let $(\mathcal{X}, \rho)$ be a complete separable metric space and $\mathcal{B}(\mathcal{X})$ be the Borel algebra of its subsets. Let also $\mathcal{M}$ be the set of all probability measures on $(\mathcal{X}, \mathcal{B}(\mathcal{X}))$, and
\[ \mathcal{M}_1 \overset{\mathrm{def}}{=} \left\{ \mu \in \mathcal{M} \,\middle|\, \int_{\mathcal{X}} \rho(y, y_0)\, \mu(dy) < \infty \right\} , \tag{8.47} \]
for some $y_0 \in \mathcal{X}$. Further, $\mathrm{Lip}(\mathcal{X})$ will stand for the set of Lipschitz functions $f : \mathcal{X} \to \mathbb{R}$, for which we write
\[ [f]_{\mathrm{Lip}} = \sup\left\{ \frac{|f(x) - f(y)|}{\rho(x, y)} : x, y \in \mathcal{X},\ x \ne y \right\} , \tag{8.48} \]
\[ \mathrm{Lip}_1(\mathcal{X}) = \{ f \in \mathrm{Lip}(\mathcal{X}) \mid [f]_{\mathrm{Lip}} \le 1 \} . \tag{8.49} \]
Given $\mu_1, \mu_2 \in \mathcal{M}_1$, we set
\[ R(\mu_1, \mu_2) \overset{\mathrm{def}}{=} \sup\left\{ \left| \int_{\mathcal{X}} f(x)\, \mu_1(dx) - \int_{\mathcal{X}} f(x)\, \mu_2(dx) \right| : f \in \mathrm{Lip}_1(\mathcal{X}) \right\} . \tag{8.50} \]
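For orientation, the distance (8.50) can be computed explicitly in the simplest setting. The sketch below assumes $\mathcal{X} = \mathbb{R}$ with $\rho(x, y) = |x - y|$ and finitely supported measures, which is not the path-space setting of this subsection. In that special case the supremum over $\mathrm{Lip}_1$ equals the integral of the absolute difference of the two distribution functions (the one-dimensional Kantorovich–Rubinstein identity). The function name and the dictionary representation are hypothetical.

```python
def kantorovich_distance_1d(mu1, mu2):
    """R(mu1, mu2) for two finitely supported probability measures on the real line.

    mu1, mu2: dicts mapping a point (float) to its probability mass.
    Uses the one-dimensional identity R = integral of |F1 - F2|, where F1, F2
    are the cumulative distribution functions.
    """
    points = sorted(set(mu1) | set(mu2))
    cdf_gap = 0.0   # F1 - F2 on the interval currently being integrated
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_gap += mu1.get(left, 0.0) - mu2.get(left, 0.0)
        total += abs(cdf_gap) * (right - left)
    return total


if __name__ == "__main__":
    # Two point masses: the distance is the distance between the points.
    print(kantorovich_distance_1d({0.0: 1.0}, {3.0: 1.0}))
```

For two point masses at $a$ and $b$ the function $f(x) = x$ already attains the supremum, which gives $R = |a - b|$. This is the sense in which $R$ measures how far one measure has to be moved to reach the other.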
A key role in the proof of Theorem 8.4 will be played by Dobrushin's matrix. It is defined by the conditional Gibbs measures $\mu_{\beta,\Lambda}(\cdot|\zeta)$, given by (2.65)–(2.68), with $\zeta \in \Omega^t_\beta$ and a one-point box $\Lambda = \{l\}$. To simplify notation we set
\[ \zeta_{\{l\}^c} = \zeta_{l^c} , \qquad \mu_{\beta,\{l\}}(\cdot|\zeta) = \mu_l(\cdot|\zeta) . \tag{8.51} \]
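Then the elements of Dobrushin's matrix $(C_{ll'})_{l,l' \in \mathbb{L}}$ are
\[ C_{ll'} = \sup\left\{ \frac{R(\mu_l(\cdot|\xi), \mu_l(\cdot|\eta))}{\|\xi_{l'} - \eta_{l'}\|_\beta} : \xi_{{l'}^c} = \eta_{{l'}^c} \right\} . \tag{8.52} \]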
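They will be used to check Dobrushin's condition [34, 35, 43, 58].

Proposition 8.1 (Dobrushin's Uniqueness Condition). Let
\[ \sup\left\{ \sum_{l' \in \mathbb{L} \setminus \{l\}} C_{ll'} : l \in \mathbb{L} \right\} < 1 . \tag{8.53} \]
Then there exists exactly one tempered Gibbs measure.

Taking into account (D2) in Sec. 2.1, one has from (2.65)–(2.68)
\[ \mu_l(d\omega|\xi) = \frac{1}{Z_l(\xi)} \exp\left\{ \langle \omega, \varphi_l(\xi) \rangle_\beta - \int_{I_\beta} V(\omega(t))\, dt \right\} \chi_\beta(d\omega) , \tag{8.54} \]
where
\[ \varphi_l(\xi) = -\sum_{\lambda \in \{l\}^c} d_{l\lambda}\, \xi_\lambda , \tag{8.55} \]
and $Z_l(\xi)$ is the normalization constant. Given $x \in X_\beta$, set
\[ \mu^x(d\omega) = \frac{1}{Z_x} \exp\left\{ \langle \omega, x \rangle_\beta - \int_{I_\beta} V(\omega(t))\, dt \right\} \chi_\beta(d\omega) , \tag{8.56} \]
and
\[ C \overset{\mathrm{def}}{=} \sup\left\{ \frac{R(\mu^x, \mu^y)}{\|x - y\|_\beta} : x, y \in X_\beta,\ x \ne y \right\} . \tag{8.57} \]
For $\xi \ne \eta$ such that $\xi_{{l'}^c} = \eta_{{l'}^c}$, one has $\|\varphi_l(\xi) - \varphi_l(\eta)\|_\beta = -d_{ll'} \|\xi_{l'} - \eta_{l'}\|_\beta$. Then by (8.52), (8.54)–(8.57)
\[ \sup\left\{ \sum_{l' \in \mathbb{L} \setminus \{l\}} C_{ll'} : l \in \mathbb{L} \right\} \le dC , \tag{8.58} \]
where $d$ was defined by (7.36). Then the condition (8.53) would be satisfied if
\[ C < \frac{1}{d} . \tag{8.59} \]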
Keeping this in mind, let us estimate $R(\mu^x, \mu^y)$. To this end we study the variation of the following function
\[ X_\beta \ni x \mapsto \langle f \rangle_{\mu^x} = \int_{X_\beta} f(\omega)\, \mu^x(d\omega) \in \mathbb{R} , \tag{8.60} \]
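with a fixed $f \in \mathrm{Lip}_1(X_\beta)$. This function is Fréchet differentiable [12]; its derivative in the direction $\psi \in X_\beta$ has the following form
\[ \langle \nabla_x \langle f \rangle_{\mu^x}, \psi \rangle_\beta = \langle fg \rangle_{\mu^x} - \langle f \rangle_{\mu^x} \langle g \rangle_{\mu^x} = \mathrm{Cov}_{\mu^x}(f, g) , \qquad g(\omega) \overset{\mathrm{def}}{=} \langle \omega, \psi \rangle_\beta . \tag{8.61} \]
By the Schwarz inequality one has
\[ |\langle \nabla_x \langle f \rangle_{\mu^x}, \psi \rangle_\beta| \le \sqrt{\mathrm{Var}_{\mu^x} f} \cdot \sqrt{\mathrm{Var}_{\mu^x} g} , \tag{8.62} \]
where
\[ \mathrm{Var}_{\mu^x} f = \frac{1}{2} \int_{X_\beta} \int_{X_\beta} [f(\omega) - f(\omega')]^2\, \mu^x(d\omega)\, \mu^x(d\omega') , \tag{8.63} \]
\[ \mathrm{Var}_{\mu^x} g = \frac{1}{2} \int_{X_\beta} \int_{X_\beta} \langle \omega - \omega', \psi \rangle_\beta^2\, \mu^x(d\omega)\, \mu^x(d\omega') . \tag{8.64} \]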
The idea of the proof of Theorem 8.4 may be outlined as follows. Suppose that we have estimated, uniformly for all $x \in X_\beta$, the first variance by a positive continuous function of $\beta$, of the parameters of the potential $V$ (2.3), (2.5), and of the mass $m$. Let also the second variance be bounded by a positive function of the same parameters multiplied by $\|\psi\|_\beta^2$. Then the mean-value theorem together with (8.57) would imply that the condition (8.59) is satisfied provided the product of the mentioned bounds is sufficiently small.

One observes that (8.64) defines a quadratic form on $X_\beta$,
\[ \mathrm{Var}_{\mu^x} g = \langle T^x \psi, \psi \rangle_\beta , \]
with the operator $T^x$ given as follows
\[ (T^x \psi)(\tau) = \int_{I_\beta} T^x(\tau, \tau')\, \psi(\tau')\, d\tau' , \qquad \tau \in I_\beta . \tag{8.65} \]
The kernel of this integral operator is
\[ T^x(\tau, \tau') = \frac{1}{2} \int_{X_\beta} \int_{X_\beta} [\omega(\tau) - \omega'(\tau)] \cdot [\omega(\tau') - \omega'(\tau')]\, \mu^x(d\omega)\, \mu^x(d\omega') . \tag{8.66} \]
Comparing this kernel with the function given by (7.16) with $\zeta \in \Omega^t_\beta$, and taking into account (8.56) and (2.66)–(2.68), one concludes that
\[ T^x(\tau, \tau') = K^\zeta_{ll}(\tau, \tau') , \qquad x = -\sum_{l' \in \mathbb{L}} d_{ll'}\, \zeta_{l'} . \tag{8.67} \]
This yields, in particular, that $T^x(\tau, \tau')$ is a continuous nonnegative function of $\tau, \tau' \in I_\beta$ (see Theorem 4.2). Clearly, for every $x \in X_\beta$, the operator $T^x$ is symmetric and positive. Moreover,
\[ \mathrm{trace}(T^x) = \frac{1}{2} \int_{X_\beta} \int_{X_\beta} \|\omega - \omega'\|_\beta^2\, \mu^x(d\omega)\, \mu^x(d\omega') < \infty , \tag{8.68} \]
which follows from (2.37). Let $\hat K : X_\beta \to X_\beta$ stand for the integral operator with the kernel $\hat K(\tau, \tau')$ defined by (7.17) with $l_0 = l$. Then this operator is also positive and trace class; its trace may be computed as above with the help of the measure $\hat\mu^{(0)}_{\beta,\{l\}}$. For a bounded linear operator $A : X_\beta \to X_\beta$, let $\sigma(A)$ be its pure point spectrum and $\|A\|$ stand for its operator norm. For a positive compact operator, one has
\[ \|A\| = \max \sigma(A) . \tag{8.69} \]
On the other hand, for such an operator (see e.g. [70, p. 216])
\[ \|A\| = \sup\left\{ \frac{\langle A\psi, \psi \rangle_\beta}{\|\psi\|_\beta^2} : \psi \in X_\beta \setminus \{0\} \right\} . \tag{8.70} \]
The construction of the above-mentioned bounds is based upon the following lemmas, which will be proved at the end of this subsection.

Lemma 8.4. For every $x, \psi \in X_\beta$, one has
\[ \langle T^x \psi, \psi \rangle_\beta \le \|\hat K\|\, \|\psi\|_\beta^2 . \tag{8.71} \]

Lemma 8.5. For every $x \in X_\beta$, one has
\[ \mathrm{trace}(T^x) \le \mathrm{trace}(\hat K) . \tag{8.72} \]
Lemma 8.6. The following estimate holds
\[ \max \sigma(\hat K) \le \frac{1}{m\hat\Delta^2} , \tag{8.73} \]
where $\hat\Delta$ is defined by (8.21) but with the eigenvalues of the one-particle Hamiltonian (2.2) with the potential $\hat V$ (7.13) instead of $V$ (2.3), (2.5).

Lemma 8.7. Let $r$ in (2.5) take the value $r = 2$. Then, for all $x \in X_\beta$ and any $f \in \mathrm{Lip}_1(X_\beta)$, one has
\[ \mathrm{Var}_{\mu^x} f \le h e^{\beta\delta_0} , \qquad \delta_0 = \frac{25}{288} \left( \frac{a}{\sqrt{b_2}} \right)^2 , \tag{8.74} \]
where the constant $h$ depends only on the interaction parameter $d$.

The proof of the above statement may be done by means of the logarithmic Sobolev inequality, just as it was done in [12]. Another estimate of the variance of $f$ is linear in $\beta$. We will use it for $r > 2$.

Lemma 8.8. There exists a parameter $h_0$, independent of $m$ and $\beta$, such that the estimate
\[ \mathrm{Var}_{\mu^x} f \le \beta h_0 m^{-1/(r+1)} , \tag{8.75} \]
holds for all $x \in X_\beta$ and any $f \in \mathrm{Lip}_1(X_\beta)$.
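Proof of Theorem 8.4. First we estimate $\mathrm{Var}_{\mu^x} g$ given by (8.64). By Lemma 8.4, (8.69), and Lemma 8.6 one has
\[ \mathrm{Var}_{\mu^x} g = \langle T^x \psi, \psi \rangle_\beta \le \|\hat K\|\, \|\psi\|_\beta^2 = \max \sigma(\hat K)\, \|\psi\|_\beta^2 \le \frac{\|\psi\|_\beta^2}{m\hat\Delta^2} . \]
By Theorem 8.3, one may find $m_0$ and $\kappa_0$ such that, for $m \in (0, m_0)$, the following estimate holds
\[ \frac{1}{m\hat\Delta^2} \le \kappa_0 m^{(r-1)/(r+1)} . \tag{8.76} \]
Then one has
\[ \mathrm{Var}_{\mu^x} g \le \kappa_0 m^{(r-1)/(r+1)} \|\psi\|_\beta^2 , \tag{8.77} \]
which holds for $m \in (0, m_0)$. For $r > 2$, one may use (8.75), which yields the following estimate of the distance (8.50)
\[ R(\mu^x, \mu^y) \le \|x - y\|_\beta \sqrt{\beta h_0 \kappa_0} \cdot m^{\frac{r-2}{2(r+1)}} , \]
holding for all $m \in (0, m_0)$. Employing this estimate in (8.57) one obtains that the uniqueness condition (8.59) holds true if
\[ m < m_*(\beta) = \min\left\{ m_0 ,\ [\beta h_0 \kappa_0 d^2]^{-\frac{r+1}{r-2}} \right\} . \tag{8.78} \]
For $r = 2$, we use (8.74) and obtain
\[ R(\mu^x, \mu^y) \le \|x - y\|_\beta \sqrt{h \kappa_0}\, e^{\beta\delta_0/2}\, m^{1/6} , \]
which yields that the uniqueness condition holds in this case if
\[ m < m_*(\beta) = \min\left\{ m_0 ,\ \frac{e^{-3\beta\delta_0}}{[h \kappa_0 d^2]^3} \right\} . \tag{8.79} \]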
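The thresholds (8.78) and (8.79) are obtained by solving "bound on $C$ equals $1/d$" for $m$, where the bound on $C$ is the square root of the product of the two variance estimates: (8.75) with (8.77) for $r > 2$, and (8.74) with (8.77) for $r = 2$. The sketch below encodes this algebra so that it can be checked numerically. All constants passed in are placeholders and not values from the paper, and the function names are hypothetical.

```python
import math


def m_star_r_gt_2(beta, r, h0, kappa0, d, m0):
    """Mass threshold (8.78), valid for r > 2."""
    return min(m0, (beta * h0 * kappa0 * d ** 2) ** (-(r + 1) / (r - 2)))


def m_star_r_eq_2(beta, h, kappa0, d, delta0, m0):
    """Mass threshold (8.79), for r = 2."""
    return min(m0, math.exp(-3 * beta * delta0) / (h * kappa0 * d ** 2) ** 3)


def lipschitz_bound_r_gt_2(m, beta, r, h0, kappa0):
    """sqrt of the product of the bounds (8.75) and (8.77), per unit norm of psi."""
    return math.sqrt(beta * h0 * kappa0) * m ** ((r - 2) / (2 * (r + 1)))


def lipschitz_bound_r_eq_2(m, beta, h, kappa0, delta0):
    """sqrt of the product of the bounds (8.74) and (8.77) with r = 2."""
    return math.sqrt(h * kappa0) * math.exp(beta * delta0 / 2) * m ** (1 / 6)


if __name__ == "__main__":
    m = m_star_r_gt_2(beta=2.0, r=4, h0=1.5, kappa0=0.7, d=3.0, m0=math.inf)
    # At the threshold the bound on C times d equals one, up to rounding.
    print(lipschitz_bound_r_gt_2(m, 2.0, 4, 1.5, 0.7) * 3.0)
```

In both cases the threshold decreases as $\beta$ grows, in line with the remark that $m_*(\beta) \to 0$ as $\beta \to +\infty$.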
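Proof of Lemma 8.4. Here we use (8.67) and the zero boundary domination estimate (7.18). Then, taking into account that the kernel $T^x(\tau, \tau')$ is nonnegative, one obtains
\[ \langle T^x \psi, \psi \rangle_\beta = |\langle T^x \psi, \psi \rangle_\beta| \le \int_{I_\beta} \int_{I_\beta} T^x(\tau, \tau')\, |\psi(\tau)|\, |\psi(\tau')|\, d\tau\, d\tau' \le \int_{I_\beta} \int_{I_\beta} \hat K(\tau, \tau')\, |\psi(\tau)|\, |\psi(\tau')|\, d\tau\, d\tau' \le \|\hat K\|\, \| |\psi| \|_\beta^2 = \|\hat K\|\, \|\psi\|_\beta^2 . \]

The proof of Lemma 8.5 immediately follows from the estimate (7.18).

Proof of Lemma 8.6. By (8.27), (8.30), and (8.69) one has
\[ \max \sigma(\hat K) = \|\hat K\| = \int_{I_\beta} \hat K(0, \tau)\, d\tau = \frac{1}{\hat Z_\beta} \sum_{s,s' \in \mathbb{N}} q_{ss'}^2\, (\hat\epsilon_s - \hat\epsilon_{s'})\, \frac{e^{-\beta\hat\epsilon_{s'}} - e^{-\beta\hat\epsilon_s}}{(\hat\epsilon_s - \hat\epsilon_{s'})^2} \le \frac{1}{\hat\Delta^2} \cdot \frac{1}{\hat Z_\beta} \sum_{s,s' \in \mathbb{N}} q_{ss'}^2\, (\hat\epsilon_s - \hat\epsilon_{s'}) \{ e^{-\beta\hat\epsilon_{s'}} - e^{-\beta\hat\epsilon_s} \} = \frac{1}{m\hat\Delta^2} . \]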
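Proof of Lemma 8.8. For a Lipschitz function $f$, one obtains by means of (8.48), (8.49), (8.68), and Lemma 8.5
\[ \mathrm{Var}_{\mu^x} f \le \frac{1}{2} \int_{X_\beta} \int_{X_\beta} \|\omega - \omega'\|_\beta^2\, \mu^x(d\omega)\, \mu^x(d\omega') = \mathrm{trace}(T^x) \le \mathrm{trace}(\hat K) = \int_{I_\beta} \hat K(\tau, \tau)\, d\tau = \beta \hat K(0, 0) . \tag{8.80} \]
Further, as in (8.31) one has
\[ \hat K(0, 0) = \frac{1}{\hat Z_\beta} \mathrm{trace}[q^2 \exp\{-\beta \hat H_l\}] \overset{\mathrm{def}}{=} \langle q^2 \rangle . \]
It turns out that $\max \sigma(\hat K)$ may be expressed in terms of the Duhamel two-point functions [38] and hence estimated from below as follows
\[ \beta \langle q^2 \rangle\, f\!\left( \frac{\beta}{4m \langle q^2 \rangle} \right) \le \max \sigma(\hat K) , \tag{8.81} \]
where the function $f : (0, +\infty) \to (0, +\infty)$ was introduced and estimated in [38]. It has the following bound
\[ \frac{1}{t} (1 - e^{-t}) \le f(t) . \tag{8.82} \]
Then by (8.73) and (8.76) one gets in (8.81)
\[ \langle q^2 \rangle \le \frac{1}{2} \sqrt{\kappa_0}\, m^{-1/(r+1)} , \]
which holds for $m \in (0, m_0)$. Applying this estimate in (8.80) one obtains (8.75).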
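Appendix

Here we prove the estimate (8.44). For $s \in \mathbb{N}$ and $\mathbb{N} \ni k \ge 2$, we write
\[ \Theta_k(s) = (2s - 1)!! \prod_{j=2}^{2^{k-1}} \Upsilon_{2j}(s) \le 2^s s!\, 2^{s(2^{k-1}-1)}\, [(2^{k-1})!]^s\, s^{s(2^{k-1}-1)} , \tag{A.1} \]
where
\[ \Upsilon_{2j}(s) \overset{\mathrm{def}}{=} \frac{(2js - 1)!!}{(2(j-1)s - 1)!!} \le (2js)^s . \]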
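The estimate (A.1) combines three elementary facts: the double factorial $(2^k s - 1)!!$ telescopes into ratios of consecutive blocks, each ratio is a product of $s$ odd numbers not exceeding $2js$ (writing $j$ for the product index), and $(2s - 1)!! \le 2^s s!$. The following sketch verifies these statements in exact integer arithmetic for small $k$ and $s$. It is a numerical check with hypothetical function names, not a substitute for the argument.

```python
from math import factorial


def double_factorial(n):
    """n!! for odd n >= -1, with the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def theta(k, s):
    """Theta_k(s) = (2^k s - 1)!!, as in (8.43)."""
    return double_factorial(2 ** k * s - 1)


def upsilon(j, s):
    """(2js - 1)!! / (2(j-1)s - 1)!!, a product of s consecutive odd numbers."""
    return double_factorial(2 * j * s - 1) // double_factorial(2 * (j - 1) * s - 1)


def telescoped(k, s):
    """(2s - 1)!! times the product of the ratios for j = 2, ..., 2^(k-1)."""
    value = double_factorial(2 * s - 1)
    for j in range(2, 2 ** (k - 1) + 1):
        value *= upsilon(j, s)
    return value


def a1_upper_bound(k, s):
    """Right-hand side of (A.1)."""
    e = s * (2 ** (k - 1) - 1)
    return 2 ** s * factorial(s) * 2 ** e * factorial(2 ** (k - 1)) ** s * s ** e


if __name__ == "__main__":
    for k in range(2, 5):
        print(k, [theta(k, s) <= a1_upper_bound(k, s) for s in range(1, 6)])
```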
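Let $M_n$ stand for the left-hand side of (8.44). One may write
\[ M_n = K_{1,0} \cdot K_{2,0} \cdots K_{\delta-1,0} \cdot L_0 , \tag{A.2} \]
where
\[ K_{k,0} = [\Theta_k(s_k)/(2n)!]^{2^{-k}} , \qquad L_0 = [\Theta_\delta(s_\delta)/(2n)!]^{2^{-(\delta-1)}} . \tag{A.3} \]
From now on we fix $n$. For $\mathbb{N} \ni s < 2n$, we write
\[ [s] \overset{\mathrm{def}}{=} (s + 1) \cdots (2n) = \frac{(2n)!}{s!} . \tag{A.4} \]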
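Then one has
\[ K_{1,0} = [\Theta_1(s_1)/(2n)!]^{1/2} = \left[ \frac{(2s_1 - 1)!!}{2^{s_1} s_1!} \cdot \frac{2^{s_1}}{[s_1]} \right]^{1/2} \le 2^{s_1/2}\, [s_1]^{-1/2} . \]
Applying this estimate in (A.2) we obtain
\[ M_n \le 2^{s_1/2} K_{2,1} \cdot K_{3,1} \cdots K_{\delta-1,1} L_1 , \tag{A.5} \]
where
\[ K_{k,1} \overset{\mathrm{def}}{=} K_{k,0} \cdot [s_1]^{-2^{-k}} , \qquad L_1 \overset{\mathrm{def}}{=} L_0 \cdot [s_1]^{-2^{-(\delta-1)}} . \tag{A.6} \]
Let us estimate $K_{2,1}$ as follows
\[ K_{2,1} = \left[ \frac{(2s_2 - 1)!!}{2^{s_2} (s_2)!} \cdot 2^{s_2} \cdot \frac{\Upsilon_4(s_2)}{[s_1][s_2]} \right]^{1/4} \le 2^{s_2/2}\, (2!)^{s_2/4} \left[ \frac{(s_2)^{s_2}}{[s_1][s_2]} \right]^{1/4} . \tag{A.7} \]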
All multipliers in the products $[s_1]$, $[s_2]$ are greater than $s_2$ (we recall that $s_k \ge s_{k+1}$ for all $k = 1, 2, \dots, \delta$, and $s_1 + s_2 + \cdots + s_\delta = 2n$). Therefore, one may find numbers $\sigma_2 > s_1$, $\sigma_3 > s_2$, both less than $2n$, such that $\sigma_2 \ge \sigma_3$ and $\sigma_2 + \sigma_3 = s_1 + 2s_2$. Then one gets
\[ \frac{(s_2)^{s_2}}{[s_1][s_2]} = \frac{(s_2)^{s_2}}{(s_1 + 1) \cdots \sigma_2 \cdot (s_2 + 1) \cdots \sigma_3} \cdot \frac{1}{[\sigma_2][\sigma_3]} \le \frac{1}{[\sigma_2][\sigma_3]} . \]
Here we have taken into account that the number of multipliers in the product $(s_1 + 1) \cdots \sigma_2 \cdot (s_2 + 1) \cdots \sigma_3$ is $\sigma_2 - s_1 + \sigma_3 - s_2 = s_2$ and that every such multiplier is greater than $s_2$. This yields in (A.7)
\[ K_{2,1} \le 2^{s_2/2}\, (2!)^{s_2/4}\, \{[\sigma_2] \cdot [\sigma_3]\}^{-1/4} . \]
Applying this estimate in (A.5) we get
\[ M_n \le 2^{(s_1+s_2)/2}\, (2!)^{s_2/4}\, K_{3,2} K_{4,2} \cdots K_{\delta-1,2} L_2 , \tag{A.8} \]
where we have set $\sigma_1 = s_1$ and
\[ K_{k,2} = K_{k,0}\, \{[\sigma_1] \cdot [\sigma_2] \cdot [\sigma_3]\}^{-2^{-k}} , \qquad L_2 = L_0\, \{[\sigma_1] \cdot [\sigma_2] \cdot [\sigma_3]\}^{-2^{-(\delta-1)}} . \]
Proceeding in this way one obtains
\[ M_n \le 2^{(s_1+\cdots+s_k)/2}\, (2!)^{s_2/4}\, (4!)^{s_3/8} \cdots [(2^{k-1})!]^{s_k/2^k} \times K_{k+1,k} \cdots K_{\delta-1,k} L_k , \tag{A.9} \]
where $k = 2, 3, \dots, \delta - 1$ and for $j = 2, 3, \dots, k + 1$,
\[ K_{j,k} = K_{j,0}\, \{[\sigma_1][\sigma_2] \cdots [\sigma_{2^k-1}]\}^{-2^{-j}} , \qquad L_k = L_0\, \{[\sigma_1][\sigma_2] \cdots [\sigma_{2^k-1}]\}^{-2^{-(\delta-1)}} , \]
\[ \sigma_{2^{l-1}} + \sigma_{2^{l-1}+1} + \cdots + \sigma_{2^l-1} = 2^{l-1} s_l + 2^{l-2} (s_{l-1} + \cdots + s_1) , \]
\[ \sigma_1 + \sigma_2 + \cdots + \sigma_{2^l-1} = 2^{l-1} (s_l + s_{l-1} + \cdots + s_1) , \qquad \sigma_{l+1} \le \sigma_l < 2n . \]
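Finally, we arrive at
\[ M_n \le 2^{(s_1+\cdots+s_\delta)/2}\, (2!)^{s_2/4}\, (4!)^{s_3/8} \cdots [(2^{\delta-1})!]^{(s_{\delta-1}+s_\delta)/2^{\delta-1}} = 2^n\, (2!)^{s_2/4}\, (4!)^{s_3/8} \cdots [(2^{\delta-1})!]^{(s_{\delta-1}+s_\delta)/2^{\delta-1}} . \]
Taking into account that $[(2^j)!]^{s_j/2^j} \le [(2^{j+1})!]^{s_{j+1}/2^{j+1}}$, $j \in \mathbb{N}$, and that $\delta \le D$, one obtains (8.44).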
Acknowledgments

Yuri Kozitsky is grateful for the kind hospitality extended to him at Bielefeld and Bonn Universities. His research was supported in part by the Polish Scientific Research Committee through the grant KBN 2 PO3A 02915, which is also acknowledged. Further financial support by the DFG through the Projects 436 UKR 113/53, 436 POL 113/98/0-1 “Probability Measures”, by the BiBoS Research Center and by INTAS 99-0559 is gratefully acknowledged.
References

[1] S. Albeverio and R. Høegh-Krohn, “Homogeneous random fields and quantum statistical mechanics”, J. Funct. Anal. 19 (1975) 242–279.
[2] S. Albeverio and R. Høegh-Krohn, Mathematical Theory of Feynman Path Integrals, Lecture Notes in Math. 523, Springer, Berlin, 1976.
[3] S. Albeverio, A. Daletskii, Yu. Kondratiev and M. Röckner, “Fluctuations and their Glauber dynamics in lattice systems”, J. Funct. Anal. 166 (1999) 148–167.
[4] S. Albeverio, Yu. Kondratiev, A. Kozak and Yu. Kozitsky, “A system of quantum anharmonic oscillators with hierarchical structure: critical point convergence”, preprint BiBoS, Bielefeld, 2002.
[5] S. Albeverio, Yu. Kondratiev and Yu. Kozitsky, “Absence of critical points for a class of quantum hierarchical models”, Commun. Math. Phys. 187 (1997) 1–18.
[6] S. Albeverio, Yu. Kondratiev and Yu. Kozitsky, “Suppression of critical fluctuations by strong quantum effects in quantum lattice systems”, Commun. Math. Phys. 194 (1998) 493–512.
[7] S. Albeverio, Yu. Kondratiev and Yu. Kozitsky, “Classical limits of Euclidean Gibbs states for quantum lattice models”, Lett. Math. Phys. 48 (1999) 221–233.
[8] S. Albeverio, Yu. Kondratiev, Yu. Kozitsky and M. Röckner, “Uniqueness for Gibbs measures of quantum lattices in small mass regime”, Ann. Inst. H. Poincaré 37 (2001) 43–69.
[9] S. Albeverio, Yu. G. Kondratiev, R. A. Minlos and A. L. Rebenko, “Small mass behaviour of quantum Gibbs states for lattice models with unbounded spins”, J. Stat. Phys. 92 (1998) 1153–1172.
[10] S. Albeverio, Yu. G. Kondratiev, R. A. Minlos and G. V. Shchepan’uk, “Uniqueness problem for quantum lattice systems with compact spins”, Lett. Math. Phys. 52 (2000) 185–195.
[11] S. Albeverio, A. Yu. Kondratiev and A. L. Rebenko, “Peierls argument and long-range order behavior for quantum lattice systems with unbounded spins”, J. Stat. Phys. 92 (1998) 1153–1172.
[12] S. Albeverio, Yu. Kondratiev, M. Röckner and T. V. Tsikalenko, “Uniqueness of Gibbs states for quantum lattice systems”, Probab. Theory Relat. Fields 108 (1997) 193–218.
[13] S. Albeverio, Yu. Kondratiev, M. Röckner and T. V. Tsikalenko, “Uniqueness of Gibbs states on loop lattices”, C.R. Acad. Sci. Paris, Probabilités/Probability Theory, Série 1, 342 (1997) 1401–1406.
[14] S. Albeverio, Yu. Kondratiev, M. Röckner and T. V. Tsikalenko, “Dobrushin’s uniqueness for quantum lattice systems with nonlocal interaction”, Commun. Math. Phys. 189 (1997) 621–630.
[15] S. Albeverio, Yu. Kondratiev, M. Röckner and T. V. Tsikalenko, “Glauber dynamics for quantum lattice systems”, Rev. Math. Phys. 13 (2001) 51–124.
[16] N. Angelescu, A. Verbeure and V. A. Zagrebnov, “Quantum n-vector anharmonic crystal. I. 1/n-expansion”, Commun. Math. Phys. 205 (1999) 81–95.
[17] V. S. Barbulyak and Yu. G. Kondratiev, “Functional integrals and quantum lattice systems: I. Existence of Gibbs states”, Rep. Nat. Acad. Sci. of Ukraine, No. 9 (1991) 38–40.
[18] V. S. Barbulyak and Yu. G. Kondratiev, “Functional integrals and quantum lattice systems: II. Periodic Gibbs states”, Rep. Nat. Acad. Sci. of Ukraine, No. 8 (1991) 31–34.
[19] V. S. Barbulyak and Yu. G. Kondratiev, “Functional integrals and quantum lattice systems: III. Phase transitions”, Rep. Nat. Acad. Sci. of Ukraine, No. 10 (1991) 19–21.
[20] V. S. Barbulyak and Yu. G. Kondratiev, “The quasi-classical limit for the Schrödinger operator and phase transitions in quantum statistical physics”, Funct. Anal. Appl. 26(2) (1992) 61–64.
[21] V. S. Barbulyak and Yu. G. Kondratiev, “A criterion for the existence of periodic Gibbs states of quantum lattice systems”, Selecta Math. (N.S.) 12 (1993) 25–35.
[22] M. T. Barlow and M. Yor, “Semi-martingale inequalities via the Garsia–Rodemich–Rumsey lemma and applications to local times”, J. Funct. Anal. 49 (1982) 198–229.
[23] Yu. M. Berezansky and Yu. G. Kondratiev, Spectral Methods in Infinite Dimensional Analysis, Kluwer Academic Publishers, Dordrecht Boston London, 1994.
[24] F. A. Berezin and M. A. Shubin, The Schrödinger Equation, Kluwer Academic Publishers, Dordrecht Boston London, 1991.
[25] P. Billingsley, Convergence of Probability Measures, John Wiley & Sons, Inc., New York, 1968.
[26] T. Bodineau and B. Helffer, “The log-Sobolev inequality for unbounded spin systems”, J. Funct. Anal. 166 (1999) 168–178.
[27] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, I, II, Springer, New York, 1981.
[28] J. Bricmont, “The Gaussian inequality for multicomponent rotators”, J. Stat. Phys. 17 (1997) 289–300.
[29] M. Broidoi, B. Momont and A. Verbeure, “Lie algebra of anomalously scaled fluctuations”, J. Math. Phys. 36 (1995) 6746–6757.
[30] A. D. Bruce and R. A. Cowley, Structural Phase Transitions, Taylor and Francis Ltd., 1981.
[31] Ph. Choquard, The Anharmonic Crystal, W. A. Benjamin, New York, 1967.
[32] J. M. Combes, P. Duclos and R. Seiler, “Krein’s formula and one-dimensional multiple well”, J. Funct. Anal. 52 (1983) 257–301.
[33] J.-D. Deuschel and D. W. Stroock, Large Deviations, Academic Press, Inc., London, 1989.
[34] R. L. Dobrushin, “Prescribing a system of random variables by conditional distributions”, Theory Prob. Appl. 15 (1970) 458–486.
[35] R. L. Dobrushin and S. B. Shlosman, “Constructive criterion for the uniqueness of Gibbs field”, in Statistical Physics and Dynamical Systems. Rigorous Results, Birkhäuser, Basel, 1985, pp. 347–370.
[36] W. Driessler, L. Landau and J. F. Perez, “Estimates of critical lengths and critical temperatures for classical and quantum lattice systems”, J. Stat. Phys. 20 (1979) 123–162.
[37] F. Dunlop, “Correlation inequalities for multicomponent rotators”, Commun. Math. Phys. 49 (1976) 247–256.
[38] F. J. Dyson, E. H. Lieb and B. Simon, “Phase transitions in quantum spin systems with isotropic and nonisotropic interactions”, J. Stat. Phys. 18 (1978) 335–383.
[39] M. Fannes and A. Verbeure, “Correlation inequalities and equilibrium states”, Commun. Math. Phys. 55 (1977) 125–131.
[40] R. Fernandez, J. Fröhlich and A. Sokal, Random Walks, Critical Phenomena and Triviality in Quantum Field Theory, Springer, 1992.
[41] J. K. Freericks, Mark Jarrell and G. D. Mahan, “The anharmonic electron-phonon problem”, Phys. Rev. Lett. 77 (1996) 4588–4591.
[42] J. Fröhlich, B. Simon and T. Spencer, “Infrared bounds, phase transitions and continuous symmetry breaking”, Commun. Math. Phys. 50 (1976) 79–85.
[43] H. O. Georgii, Gibbs Measures and Phase Transitions. Vol 9, Walter de Gruyter, Springer, 1988.
[44] S. A. Globa, “A class of quantum lattice models and their Gibbs states”, Ukrainian Math. J. 40 (1988) 787–792.
[45] S. A. Globa and Yu. G. Kondratiev, “The construction of Gibbs states of quantum lattice systems”, Selecta Math. Sov. 9 (1990) 297–307.
[46] D. Goderis, A. Verbeure and P. Vets, “Dynamics of fluctuations for quantum lattice systems”, Commun. Math. Phys. 128 (1990) 533–549.
[47] F. Guerra, L. Rosen and B. Simon, “Boundary conditions for the $P(\varphi)_2$ Euclidean field theory”, Ann. Inst. H. Poincaré 15 (1976) 231–234.
[48] B. Helffer, “Splitting in large dimensions and infrared estimates. II. Moment inequalities”, J. Math. Phys. 39 (1998) 760–776.
[49] R. Høegh-Krohn, “Relativistic quantum statistical mechanics in two-dimensional space-time”, Commun. Math. Phys. 38 (1974) 195–224.
[50] A. Inoue, Tomita–Takesaki Theory in Algebras of Unbounded Operators, Lecture Notes in Math. 1699, Springer-Verlag, 1998.
[51] A. Inoue, “A survey of Tomita–Takesaki theory in algebras of unbounded operators. II. Physical applications”, Fukuoka Univ. Sci. Rep. 30 (2000) 49–66.
[52] A. Klein and L. Landau, “Stochastic processes associated with KMS states”, J. Funct. Anal. 42 (1981) 368–428.
[53] Ju. G. Kondratiev, “Phase transitions in quantum models of ferroelectrics”, in Stochastic Processes, Physics, and Geometry II, World Scientific, 1994, pp. 465–475.
[54] Yu. Kozitsky, “Quantum effects in a lattice model of anharmonic vector oscillators”, Lett. Math. Phys. 51 (2000) 71–81.
[55] Yu. Kozitsky, “Scalar domination and normal fluctuation in N-vector quantum anharmonic crystals”, Lett. Math. Phys. 53 (2000) 289–303.
[56] Yu. Kozitsky, “Quantum effects in lattice models of vector anharmonic oscillators”, in Stochastic Processes, Physics and Geometry: New Interplays, II (Leipzig, 1999), CMS Conf. Proc., 29, Amer. Math. Soc., Providence, RI, 2000, pp. 403–411.
[57] Yu. Kozitsky, “Gibbs states of a lattice system of quantum anharmonic oscillators”, in Noncommutative Structures in Mathematics and Physics (Kiev 2000, S. Duplij and J. Wess, eds.), NATO Sci. Ser. II Math. Phys. Chem. 22, Kluwer Acad. Publ., Dordrecht, 2001, pp. 415–425.
[58] H. Künsch, “Decay of correlations under Dobrushin’s uniqueness condition and its applications”, Commun. Math. Phys. 84 (1982) 207–222.
[59] V. A. Malyshev and R. A. Minlos, Linear Infinite Particle Operators, Amer. Math. Soc., 1995.
[60] R. A. Minlos, A. Verbeure and V. A. Zagrebnov, “A quantum crystal model in the light-mass limit: Gibbs states”, Rev. Math. Phys. 12 (2000) 981–1032.
[61] E. Nelson, “Feynman integrals and the Schrödinger equation”, J. Math. Phys. 5 (1964) 332–343.
[62] C. M. Newman, “Normal fluctuations and the FKG inequalities”, Commun. Math. Phys. 74 (1980) 119–128.
[63] E. Olivieri, P. Picco and Yu. M. Suhov, “On the Gibbs states for one-dimensional lattice boson systems with a long-range interaction”, J. Stat. Phys. 70 (1993) 985–1028.
[64] Y. M. Park and H. H. Yoo, “A characterization of Gibbs states of lattice boson systems”, J. Stat. Phys. 75 (1994) 215–239.
[65] Y. M. Park and H. H. Yoo, “Uniqueness and clustering properties of Gibbs states for classical and quantum unbounded spin systems”, J. Stat. Phys. 80 (1995) 223–271.
[66] K. R. Parthasarathy, Probability Measures on Metric Spaces, Academic Press, New York, 1967.
[67] L. A. Pastur and B. A. Khoruzhenko, “Phase transitions in quantum models of rotators and ferroelectrics”, Theoret. Math. Phys. 73 (1987) 111–124.
[68] R. T. Powers, “Self-adjoint algebras of unbounded operators”, Commun. Math. Phys. 21 (1971) 85–124.
[69] A. Procacci and B. Scoppola, “On decay of correlations for unbounded spin systems with arbitrary boundary conditions”, J. Stat. Phys. 105 (2001) 453–482.
[70] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, Academic Press, New York, London, 1972.
[71] T. Schneider, H. Beck and E. Stoll, “Quantum effects in an n-component vector model for structural phase transitions”, Phys. Rev. B13 (1976) 1123–1130.
[72] G. L. Sewell, “Unbounded local observables in quantum statistical mechanics”, J. Math. Phys. 11 (1970) 1868–1884.
[73] G. L. Sewell, Quantum Theory of Collective Phenomena, Clarendon Press, Oxford, 1986.
[74] B. V. Shabat, Introduction to Complex Analysis. II: Functions of Several Variables, Nauka, Moscow, 1985.
[75] S. Shlosman, “Signs of the Ising model Ursell functions”, Commun. Math. Phys. 102 (1986) 679–686.
[76] B. Simon, The $P(\varphi)_2$ Euclidean (Quantum) Field Theory, Princeton Univ. Press, Princeton, 1974.
[77] B. Simon, Functional Integration and Quantum Physics, Academic Press, 1979.
[78] B. Simon, “Instantons, double wells and large deviations”, Bulletin of the Amer. Math. Soc. (New Series) 8 (1983) 323–326.
[79] B. Simon, “Semiclassical analysis of low lying eigenvalues, II. Tunneling”, Ann. Math. 120 (1984) 89–118.
[80] B. Simon, “Schrödinger operators in the twentieth century”, J. Math. Phys. 41 (2000) 3523–3555.
[81] S. Stamenković, “Unified model description of order-disorder and displacive structural phase transitions”, Condensed Matter Physics (Lviv) 1(14) (1998) 257–309.
[82] I. V. Stasyuk, “Local anharmonic effects in high-Tc superconductors. Pseudospin-electron model”, Condensed Matter Physics (Lviv) 2(19) (1999) 435–446.
[83] I. V. Stasyuk, “Approximate analytical dynamical mean-field approach to strongly correlated electron systems”, Condensed Matter Physics (Lviv) 3(22) (2000) 437–456.
[84] G. Sylvester, “Inequalities for continuous-spin Ising ferromagnets”, J. Stat. Phys. 15 (1976) 327–341.
[85] J. E. Tibballs, R. J. Nelmes and G. J. McIntyre, “The crystal structure of tetragonal KH$_2$PO$_4$ and KD$_2$PO$_4$ as a function of temperature and pressure”, J. Phys. C: Solid State Phys. 15 (1982) 37–58.
[86] A. Verbeure and V. A. Zagrebnov, “Phase transitions and algebra of fluctuation operators in exactly soluble model of a quantum anharmonic crystal”, J. Stat. Phys. 69 (1992) 37–55.
[87] A. Verbeure and V. A. Zagrebnov, “No-go theorem for quantum structural phase transition”, J. Phys. A: Math. Gen. 28 (1995) 5415–5421.
REVIEWS IN MATHEMATICAL PHYSICS Author Index (2002)
Adachi, T., On spectral and scattering theory for N-body Schrödinger operators in a constant magnetic field Albeverio, S., Kondratiev, Y., Kozitsky, Y. & Röckner, M., Euclidean Gibbs states of quantum lattice systems Banach, Z. & Larecki, W., Evolution of central moments for a general-relativistic Boltzmann equation: the closure by entropy maximization Bartuccelli, M. & Gentile, G., Lindstedt series for perturbations of isochronous systems: a review of the general theory Battye, R.A. & Sutcliffe, P.M., Skyrmions, fullerenes and rational maps Benci, V. & Fortunato, D., Solitary waves of the nonlinear Klein–Gordon equation coupled with the maxwell equations Betz, V., Hiroshima, F., Lörinczi, J., Minlos, R.A. & Spohn, H., Ground state properties of the Nelson hamiltonian: a Gibbs measure-based approach Birke, L. & Fröhlich, J., KMS, ETC. Boas, F.-M., see Dütsch Brunetti, R. & Fredenhagen, K., Remarks on time-energy uncertainty relations Brunetti, R., Guido, D. & Longo, R., Modular localization and Wigner particles Cammarota, C., Positive and negative correlations for conditional Ising distributions Chung, D.M., Ji, U.C. & Obata, N., Quantum stochastic analysis via white noise operators in weighted Fock space Degiovanni, L. & Magnano, G., Tri-hamiltonian vector fields,
2(2002)199
spectral curves and separation coordinates Doplicher, S. & Piacitelli, G., Any compact group is a gauge group Doplicher, S., Longo, R., Roberts, J.E. & Zsidó, L., A remark on quantum group actions and nuclearity
Duclos, P., Lev, O., Šťovíček, P. & Vittot, M., Weakly regular Floquet hamiltonians with pure point spectrum Dütsch, M. & Boas, F.-M., The master ward identity Elgart, A. & Schenker, J.H., A strong operator topology adiabatic theorem Evans, D.E., Fusion rules of modular invariants Ferrara, S. & Lledó, M.A., Considerations on super Poincaré algebras and their extensions to simple superalgebras Fortunato, D., see Benci Fredenhagen, K., see Brunetti Fröhlich, J., see Birke Gawędzki, K. & Reis, N., WZW branes and gerbes Gentile, G., see Bartuccelli Gérard, C., On the scattering theory of massless Nelson models Guerrini, L., Formal and analytic deformations from Witt to Virasoro Guido, D., see Brunetti Hamachi, K., Quantum moment maps and invariants for G-invariant star products Hiroshima, F., see Betz
12(2002)1335
5(2002)469
2(2002)121
1(2002)29
4(2002)409
2(2002)173
7&8(2002)829 9(2002)977 7&8(2002)897
7&8(2002)759
10(2002)1099
3(2002)241
10(2002)1115
1403
RMP-2002.p65
1403
12/12/02, 10:00 AM
7&8(2002)873
7&8(2002)787
6(2002)531
9(2002)977 6(2002)569
7&8(2002)709
6(2002)519
4(2002)409 7&8(2002)897 7&8(2002)829 12(2002)1281 2(2002)121 11(2002)1165
3(2002)303
7&8(2002)759 6(2002)601
2(2002)173
1404
AUTHOR INDEX
Izumi, M. & Kosaki, H., On a subfactor analogue of the second cohomology Jaffe, A., Derivatives with twists Ji, U.C., see Chung Junker, W., Erratum to “Hadamard states, adiabatic vacua and the construction of physical states for scalar quantum fields on curved spacetime” in Rev. Math. Phys. 8(1996) 1091–1159 Kellendonk, J., Richter, T. & Schulz-Baldes, H., Edge current channels and Chern numbers in the integer quantum Hall effect Kishimoto, A., Approximately inner flows on separable C*-Algebras Kondratiev, Y., see Albeverio Kosaki, H., see Izumi Kozitsky, Y., see Albeverio Kreuzer, M. & Skarke, H., Reflexive polyhedra, weights and toric Calabi–Yau fibrations Kuksin, S.B., Ergodic theorems for 2D statistical hydrodynamics Larecki, W., see Banach Lev, O., see Duclos Lieb, E.H. & Pedersen, G.K., Convex multivariable trace functions Lledó, M.A., see Ferrara Longo, R., see Brunetti Longo, R., see Doplicher Lörinczi, J., see Betz Lytvynov, E., Fermion and boson random point processes as particle distributions of infinite free Fermi and Bose gases of finite density
RMP-2002.p65
1404
7&8(2002)733
7&8(2002)887 3(2002)241 5(2002)511
1(2002)87
7&8(2002)649
12(2002)1335 7&8(2002)733 12(2002)1335 4(2002)343
6(2002)585
5(2002)469 6(2002)531 7&8(2002)631
6(2002)519 7&8(2002)759 7&8(2002)787 2(2002)173 10(2002)1073
Magnano, G., see Degiovanni Matsui, T., Bosonic central limit theorem for the one-dimensional XY model Minlos, R.A., see Betz Mohri, K., Exceptional string: instanton expansions and Seiberg–Witten curve Molev, A.I. & Ragoucy, E., Representations of reflection algebras Nakano, F., Absence of transport in Anderson localization Obata, N., see Chung Pedersen, G.K., see Lieb Piacitelli, G., see Doplicher Ragoucy, E., see Molev Raikov, G.D. & Warzel, S., Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials Reis, N., see Gawe¸dzki Richter, T., see Kellendonk Roberts, J.E., see Doplicher Röckner, M., see Albeverio Ruelle, D., How should one define entropy production for nonequilibrium quantum spin systems? Schenker, J.H., see Elgart Schulz-Baldes, H., see Kellendonk Skarke, H., see Kreuzer Sobolev, A.V. & Solomyak, M., Schrödinger operators on homogeneous metric trees: spectrum in gaps
12/12/02, 10:00 AM
10(2002)1115 7&8(2002)675
2(2002)173 9(2002)913
3(2002)317
4(2002)375
3(2002)241 7&8(2002)631 7&8(2002)873 3(2002)317 10(2002)1051
12(2002)1281 1(2002)87 7&8(2002)787 12(2002)1335 7&8(2002)701
6(2002)569 1(2002)87 4(2002)343 5(2002)421
AUTHOR INDEX
Solomyak, M., see Sobolev Spohn, H., see Teufel Spohn, H., see Betz Òto4 ví
ek, P., see Duclos Sutcliffe, P.M., see Battye Teufel, S. & Spohn, H., Semiclassical motion of dressed electrons Vittot, M., see Duclos
RMP-2002.p65
1405
5(2002)421 1(2002)1 2(2002)173 6(2002)531 1(2002)29 1(2002)1
6(2002)531
1405
Warzel, S., see Raikov Woronowicz, S.L. & Zakrzewski, S., Quantum ‘AX + B’ group Zakrzewski, S., see Woronowicz Zenk, H., Anderson localization for a multidimensional model including long range potentials and displacements Zsidó, L., see Doplicher
12/12/02, 10:00 AM
10(2002)1051 7&8(2002)797
7&8(2002)797 3(2002)273
7&8(2002)787